Migrating gradual types

Abstract Gradual typing allows programs to enjoy the benefits of both static typing and dynamic typing. While it is often desirable to migrate a program from more dynamically typed to more statically typed or vice versa, gradual typing itself does not provide a way to facilitate this migration. This places the burden on programmers who have to manually add or remove type annotations. Besides the general challenge of adding type annotations to dynamically typed code, there are subtle interactions between these annotations in gradually typed code that exacerbate the situation. For example, to migrate a program to be as static as possible, in general, all possible combinations of adding or removing type annotations from parameters must be tried out and compared. In this paper, we address this problem by developing migrational typing, which efficiently types all possible ways of replacing dynamic types with fully static types for a gradually typed program. The typing result supports automatically migrating a program to be as static as possible or introducing the least number of dynamic types necessary to remove a type error. The approach can be extended to support user-defined criteria about which annotations to modify. We have implemented migrational typing and evaluated it on large programs. The results show that migrational typing scales linearly with the size of the program and takes only 2–4 times longer than plain gradual typing.


Introduction
Gradual typing promises to combine the benefits of static and dynamic typing in a single language. In the original formulation by Siek & Taha (2006), the goal is to bring the documentation and safety of static typing to a dynamically typed language. In their formalization, function parameters have dynamic types by default but can be explicitly annotated with static types. The resulting type system provides the same safety guarantees as static typing for expressions using type-annotated variables, yet allows the flexibility of dynamic typing for expressions with unannotated variables.
In gradual typing research, it is quite common to start with simply typed lambda calculus and extend it with annotations for dynamic types (Siek & Vachharajani, 2008;Rastogi et al., 2012;Garcia & Cimini, 2015). A function parameter can be annotated with (the type of dynamic code) when dynamically typed behavior is needed or when the programmer is unsure whether all definitions are type correct but wants to test the runtime behavior.

Challenges applying gradual typing
By integrating static and dynamic typing, gradual typing not only enjoys the benefits of both typing disciplines but also suffers from their respective shortcomings. For example, statically typed parts of the code have more restricted expressiveness and may contain static type errors that yield cryptic error messages (Tobin-Hochstadt et al., 2017), while dynamically typed parts of the code may contain dynamic type errors that are not captured until after the software is deployed. More interestingly, combining statically and dynamically typed code together can raise new challenges; for example, Takikawa et al. (2016) address the challenge of performance degradation in sound gradual typing at the boundaries between statically typed and dynamically typed code. This work, extending Campora et al. (2018a), investigates the problem of migrating gradual programs to be as static as possible without introducing type errors.
To fully realize the benefits of gradual typing, we need the ability to navigate along a program's dynamic-static typing spectrum, in order to make it more static or more dynamic when and where the respective strengths of each are desired. Answering the following three questions will help harness the full power of gradual typing. 1 Q1. Can we make a gradually typed program as static as possible while maintaining its well-typedness to keep it executable? Q2. Can we introduce as few dynamic types as possible to migrate an ill-typed program to a type correct one while still enjoying the benefits of static typing for the welltyped parts? Q3. Can we address the previous questions while keeping some user-indicated parts static or dynamic? Such parts may be indicated, for example, to reduce the granularity of boundaries between static and dynamic code during execution, in order to maintain performance.
The answers to these questions are not obvious. Furthermore, if the answers are yes, it is not clear whether we can implement the operations suggested by the questions efficiently.
In the first part (up until Section 7), we develop machinery for addressing the question Q1. We develop solutions for Questions Q2 and Q3 in Sections 8 and 9.3, respectively. We illustrate the challenges regarding Q1 by considering the following program written in the calculus by Garcia & Cimini (2015) extended with Haskell functions and notations, where parameters annotated with have dynamic types and those without annotations are inferred to have static types. In the rest of the paper, we say these parameters are dynamic and static, respectively. This program is adapted from van Keeken (2006) for formatting rows of a table according to a given width by trimming long rows and padding short rows with empty spaces. rowAtI headOrFoot (fixed:: ) (widthFunc:: ) ( The local variable width represents the width of the table and is computed by the argument widthFunc, either by applying it to fixed if fixed is true, or to widest, the size of largest row in the table. The argument border is added to the beginning and end of each row and is also used to generate the header or footer row when the Boolean argument headOrFoot is true. If we bind the variable tbl to a list of strings, we can then call rowAtI in many ways, such as rowAtI False True (const 3) tbl "_" 0, rowAtI False False id tbl "_" 1, and rowAtI True False id tbl '_' 0.
After some testing, suppose we want to migrate rowAtI to a version that is as static as possible by removing annotations. Removing annotations turns out to be much trickier than we may expect. First, if we remove all annotations, then type inference fails for rowAtI, since it contains multiple static type errors, for example, the then branch requires border to have type Char while the else branch requires it to have type [Char]. Second, if we remove annotations in a left-to-right order, we will encounter a type error as soon as the annotation for widthFunc is removed. (In this paper, we follow the spirit of Garcia & Cimini, 2015 to infer static types only.) However, this does not necessarily indicate that the error was solely caused by widthFunc being statically typed. In fact, the type error involving widthFunc is due to the interaction with fixed when computing the value of width. At this point, we can restore the well-typedness of rowAtI by either re-annotating fixed or widthFunc with . Unfortunately, we cannot easily gauge which annotation is better for typing the rest of the function. If we choose to re-annotate fixed, we will encounter another type error when the annotation for border is removed. Does this type error go away if we instead mark fixed as static and widthFunc as dynamic? The easiest way to tell is by trying it out.
The example illustrates that parameters give rise to complicated typing interactions. The type error caused by making one parameter static may be avoided by making another parameter dynamic, or the type error caused by making two parameters static can be fixed by making another dynamic, and so on. In general, we must examine all possible combinations of static versus dynamic parameters to identify a program that is both well-typed and as static as possible. We refer to all of the potential programs produced by adding or removing annotations as a migration space. The act of moving from one potential program to another by changing types is known as a migration. We say a program in the migration space has a most static type if removing any from the program will make it ill-typed. We call a migration that yields a program with a most static type a most static migration. Due to the nature of type interactions, the most static type, and thus the most static migration, is not unique. Since every parameter can be either static or dynamic, the size of the migration space is exponential in the number of parameters for all functions in the program. For the program consisting of only rowAtI, which has six parameters, we would need to try out all 2 6 = 64 combinations to identify the most static migrations.
The challenges posed by migration between more and less static programs may prevent programmers from fully realizing the potential of gradual type systems. As evidence for this, the CircleCI project recently abandoned Typed Clojure mainly because the cost of adding type annotations to Clojure programs was perceived to exceed the benefits. 2 Similarly, Tobin-Hochstadt et al. (2017) reported that migration of Racket modules to Typed Racked requires too much effort.

Migrating gradual types
In this paper, we address Q1 by (1) developing a type system that efficiently types the entire migration space and (2) designing a method to traverse the result of typing the migration space, calculating which annotations can be removed. In this paper, we mainly consider the removal of annotations to support migrating to a more statically typed program; that is, we make types more precise (Siek & Taha, 2006). However, in Section 8, we describe how a dual approach can be developed to support the addition of annotations (addressing Q2). Also, in Section 9, we describe how the approach can be extended to support further migration scenarios (addressing Q3). In this work, our development focuses on the ITGL calculus. We leave the migration problem in presence of other dynamic and static language features to future work.
As demonstrated in Section 1.1, in general, finding the most static migration requires exploring the entire migration space, which is exponential in size. This rules out a simple brute-force approach that type checks each possibility and compares the results to find the best one.
To illustrate how we can improve on a brute-force search, let us focus on a single parameter, say i in the rowAtI function from Section 1.1. To decide whether we can remove the annotation, we need to type two programs: one where i is static and one where i is dynamic. Observe that the two typing processes differ only slightly. Of the three let-bound variables, only the typing of the second (row) is affected by whether i is static or dynamic. The typing of the other two let-bound variables is identical in both cases. Moreover, since the type of row is determined to be the same regardless of whether i is static or dynamic, the typing of the body of the let-expression is also identical.
This observation suggests that we should reuse typing results while exploring the migration space to determine which annotations can be removed. A systematic way to support this reuse is provided by variational typing (Chen et al., 2012(Chen et al., , 2014. In this paper, we develop a type system that integrates gradual types (Siek & Taha, 2006) and variational types (Chen et al., 2014) to support reuse when typing the migration space. This type system supports efficiently typing the entire migration space, in roughly linear time, even in the presence of type errors.
After typing the migration space, we want to find the point in that space that is most static. Although the number of results to be considered is large, this step can be made efficient by exploiting several relationships between the resulting types. To illustrate these − − + + + 7 + + + − + 8 + + − − − Fig. 1. Types for a sample of the migration space for the rowAtI function. The second column contains a sequence of + and − symbols, indicating whether the annotation is kept or removed, respectively, for each of the five parameters annotated with in rowAtI. For example, for program 2, all parameters except fixed keep their annotations. The entries denote that the corresponding program is ill-typed.
relationships, we list a subset of the migration space for the rowAtI example and their corresponding types in Figure 1. The first observation is that some parameters, whether they are static or dynamic, do not affect the type correctness of the program. In the example, the 3rd and 5th parameters (table and i, respectively) are examples of such parameters. Given this knowledge and the fact that program 2 is well-typed, we can deduce that program 3 is also well-typed since they differ only in the annotations of the 3rd and 5th parameters. Similarly, given that program 8 is type incorrect, we can deduce that program 7 is also type incorrect for the same reason.
The second observation is that if a program is well-typed after removing annotations from a set of parameters P, then (1) removing annotations from a subset of P will also yield a well-typed program (this corresponds to the static gradual guarantees of Siek et al., 2015), and (2) the program with all annotations removed from P is the most statically typed of these programs. For example, program 3 has a more static type than program 2, which in turn has a more static type than program 1. Similarly, this relation holds for the sequence of programs 5, 4, and 1. Note that the number of removed annotations does not provide the same ordering. For example, program 3 removes more annotations than program 4, but program 4 has a more static type.
The third observation is that if removing all annotations for a set of parameters causes a type error, then removing the annotations for any superset of those parameters must also cause a type error. For example, given that making the 4th parameter (border) static in program 7 causes a type error, we can deduce that additionally making the 3rd (table) and 5th (i) parameters static in program 8 will also cause a type error.
These three observations enable an efficient method for finding the most static program. For rowAtI, we immediately discover that programs 3 and 5 are most static (neither one is more static than the other). In this case, we can either pick one of the results or have a programmer specify the preferable program. In Section 5, we show that these three observations hold for arbitrary programs, which allows us to develop an efficient method for finding desired programs in general.

Relations with other work in program migration
The work by Migeed & Palsberg (2019) also studied the problem of program migration. However, there are many significant difference between our work and theirs.

Differences in techniques
There is a fundamental difference in finding the migrations in these two approaches. For a given program, their approach finds migrations in the following steps. First, it generates a set of programs where each program replaces a in the current program with a Int, Bool, or → . Second, it uses the type checking algorithm from Garcia & Cimini (2015) to type check the each program from the set. If a program does not type check, then it is not a migration of the original program. Otherwise, it is a migration, and the whole migration process is continued from the current program. The two-step process stops when no more programs type check. After this process finishes, all programs that type check are considered as possible migrations of the original program. Figure 2 left illustrates the migration process of Migeed & Palsberg (2019) for the expression λx : .x x. In the first step, three programs are generated, each replacing the with a more precise type. The programs λx :Int.x x and λx :Bool.x x do not type check. Therefore, they are not migrations of λx : .x x. In contrast, the program λx : .x x type checks and is a migration. Moreover, program migrations are searched starting from λx : → .x x.
Putting aside variational typing, our approach can be viewed as generating all the programs that are obtained by removing all combinations of the s in the program. After that, we use the type inference algorithm from Garcia & Cimini (2015) to check the type correctness and infer the type of each program. All programs that are type correct are migrations of the original programs. Figure 2 right shows all programs generated in our approach.
Since there is only one in the expression, there are only two possible expressions that we need to investigate for migrations: the original expression and the one that removes the .
To give a more straight view about what the whole search space looks like, we present in Figure 3 all the programs that are generated for finding migrations for rowAtI. Since rowAtI contains five s, the total number of programs we need to investigate is 32. The figure uses a sequence of five + or − characters to denote each generated program. If the ith character is a +, then the ith is kept. Otherwise, it is removed.
As argued in Section 1.1, in general it is necessary to explore all the generated programs to find the programs that remove as many s as possible. Our main goal in this paper is to use variational typing to make the exploration process efficient. Fig. 3. Programs explored for finding migrations for rowAtI in our approach. These programs (configurations) constitute the full migration lattice (Takikawa et al., 2016) for the program rowAtI. Each configuration is identified by a sequence of "+/-" signs, with "+" ("-") indicating that the corresponding is kept (removed). A configuration with strictly more "-"s is more precise. We present several lines relating program precision and omit most of them for clarity.
In summary, the main technical difference is that while Migeed & Palsberg (2019) intertwine program generation and type checking to find migrations, our approach can be viewed as an efficient way of first generating all programs and then using type inference to find all migrations.

Differences in behaviors
The differences in techniques lead to several significant behavioral differences in these two approaches, discussed below.
First, the migration space could be infinite in Migeed & Palsberg (2019) but it is always finite in our approach. The main reason is that in their approach if a program in the migration space type checks, then programs with more precise type annotations will be generated, which may be well-typed, yielding more programs being generated. One such example is in Figure 2. Replacing the original with → makes the expressions type checks, and replacing any with → will also type check. This process may be repeated infinitely. In Figure 2, we use dashed lines to indicate such infiniteness.
Instead, our approach generates exactly 2 n programs, where n is the number of s in the expression. For example, for the expression λx : .x x, our approach generates two expressions (including the original one), as can be seen from Figure 2.
Second, as Migeed & Palsberg (2019) use type checking from Garcia & Cimini (2015) while our approach uses type inference from Garcia & Cimini (2015) and it is well-known that type inference is often incomplete, their approach can lead to more precise program migrations than ours for certain programs. For example, for the expression λx : .x x, their approach will generate a program λx : → .x x. As this program type checks, it is a valid migration. However, in our approach, we will check the expression λx.x x, obtained by removing the from the expression. For this expression, type inference generates two constraints: β = β 1 → β 2 and β 1 ∼ β, where β, β 1 , and β 2 are three type variables. The unification algorithm in Garcia & Cimini (2015) fails to solve these two constraints due to occurs check. Consequently, type inference fails for this expression. As our type inference is a variational version of the one in Garcia & Cimini (2015), we also fail to infer a type for λx.x x. As a result, no improvement is possible in our approach for λx : .x x. In Section 9.2, we present an extension to our approach that could infer more precise types, including finding a migration for the expression λx : .x x.
Their work uses the term "maximal migration" to denote a migration that cannot be made more precise (any such effort leads to ill-typed programs). For certain programs, no maximal migrations exist. The expression λx : .x x is one such example. The reason is that a in any migration can be replaced by a → , thus more precise, without making the program ill-typed. In our work, we use the term "most static migration" to refer to migrations where no more s could be removed and replaced with fully static types. For λx : .x x, the most static migration is itself (our extension in Section 9.2 finds more static migrations). In our approach, most static migrations always exist because among a finite number of migrations we can always find migrations that remove most s. In case no s can be removed and replaced with fully static types, the original expression is considered as the most static migration. Maximal migrations and most static migrations may coincide. For example, the programs in Figure 3 that are in blue and in fourth column are maximal and most static migrations.

Additions in the journal version and contributions
This paper extends Campora et al. (2018a) with the following additions.
• In Section 1.3, we discuss in depth the relation between our work and the work by Migeed & Palsberg (2019). • In Section 8, we present a solution to fixing static type errors by introducing as few dynamic types as possible (question Q2), a dual problem to removing as many as dynamic types (question Q1) . • In Section 9.2, we present an extension to our constraint solving algorithm that enables us to find more precise migrations that the approach in Campora et al. (2018a) was not able to. • In addition to the migration questions Q1 and Q2, we consider many other migration scenarios, such as finding the migrations that migrate the greatest number of parameters. We present the approaches to support them in Section 9.3. These approaches reuse or slightly adapt the machinery for supporting Q1, which demonstrates the potential of our approach for developing more complex migration scenarios.
• In Section 10, we expand our evaluation by converting programs in Grift Kuhlenschmidt et al. (2019) to our language and measure their performances. • We updated related work to discuss the relation with the latest work on gradual typing, including Migeed & Palsberg (2019), Campora et al. (2018b), andPhipps-Costin et al. (2021).
We defer the proofs of this paper to Campora et al. (2022). Overall, this paper makes the following contributions.
1. In Section 1.1, we identify three questions, Q1 through Q3, for migrating gradual program to fully harness the benefits of gradual typing. 2. In Section 4, we present a type system that integrates gradual types (Siek & Taha, 2006), variational types (Chen et al., 2014), and error-tolerant typing (Chen et al., 2012). The type system is correct and efficiently types the whole migration space. We detail the proofs for important cases of the theorems and lemmas that are introduced. 3. In Section 5, we investigate the relationship between different candidate migrations and develop a method for computing the most static migrations. 4. In Sections 6 and 7, we generate and solve constraints to provide type inference for migrational typing and prove that the constraint solving algorithm is correct. 5. In Section 8, we develop a dual to migrational typing to address the migration question Q2. 6. In Section 9, we describe extensions to support additional common language features. We also discuss other migration scenarios and solutions supporting them. 7. In Section 10, we study the performance of our implementation by applying it to synthesized programs. The result shows that our approach scales linearly with program size.
To improve readability, the following table summarizes where important terms and operations are introduced. In the "F | P" column, F i and P i are shorthands for Figure  i and  In this section, we briefly introduce two areas of previous work that our type system for migrating gradual types builds on. In Section 2.1, we present a simple gradually typed language that represents the starting point for our work. This language is adapted from Garcia & Cimini (2015), but includes some minor differences to set up the presentation in Section 4. In Section 2.2, we introduce the concept of variational typing (Chen et al., 2014), which is the key technique that allows us to efficiently type the entire migration space.

Gradual typing
Gradual typing allows the interoperability of statically typed and dynamically typed code. The original formalization by Siek & Taha (2006) defined gradual typing for a simply typed lambda calculus extended with dynamic types. Siek & Vachharajani (2008) and Garcia & Cimini (2015) further investigated gradual typing in the presence of type inference.
In this paper, we consider the migration of programs in implicitly typed gradual languages. In Figure 4, we present the syntax and type system of one such language, ITGL, which is adapted from Garcia & Cimini (2015) and forms the basis for this work. In the syntax, c ranges over constant values, x over variables, γ over constant types, and α over type variables. There are two cases for abstraction expressions, one where the parameter is annotated by and one where it is not. The rest of the cases are standard. The type system will be explained below.
The presentation of ITGL in Figure 4 differs from the original in Garcia & Cimini (2015) in two ways. First, our syntax is more restrictive: we omit a case for explicit type ascription of expressions, and we do not allow arbitrary type annotations on abstraction parameters. We also do not consider let-polymorphism here. These restrictions are made to simplify our formalization later, but we show in Section 9 how they can be lifted. Second, the typing rules are parameterized by a statifier, ω, which is used in the full migrational type system later (Section 4). A statifier is a mapping that maps parameter names that have s to static types, making an expression to have a more static type. The statifier specifies what static types to assign to parameters whose annotations will be removed. For simplicity, we assume parameters have unique names. In the type system as defined in Figure 4, ω is always empty, corresponding to the type system in Garcia & Cimini (2015). In the type system for ITGL in Figure 4, the typing rules for constants and variables are standard. There are two rules for abstractions, ABS for unannotated parameters which must have static types, and ABSDYN for annotated parameters which may have dynamic types. In ABSDYN, we use or (ω(x), ) to return ω(x) if x ∈ dom (ω) or otherwise. Therefore, if ω is empty, then or (ω(x), ) will always be .
Note that a statifier maps parameters to fully static types only, as can be seen from the definition of ω in Figure 4. As such, mappings such as x → → Int or y → → do not belong to ω. This follows the spirit of Garcia & Cimini (2015) that inferred types should be fully static. Consequently, we cannot find an ω to make the expression λx : .x x well-typed, even though the expression λx : → .x x is.
GC if e 1 then e 2 else e 3 : G 2 G 3 Gradual type consistency: Auxiliary definitions: Typing applications is tricky, since dynamically typed arguments can be passed to functions with statically typed parameters and vice versa. For example, assuming the function, succ, has static type Int → Int, both of the following programs in our Haskell-like notation should be accepted by gradual typing. The APP rule accommodates this with the help of a consistency relation, ∼, that dictates when two unequal types are compatible with each other. An application is well-typed if the domain of the LHS (i.e. the parameter type) is consistent with the RHS, and the type of the application is the codomain of LHS. The auxiliary functions dom and cod return the domain and codomain of a function type, respectively, or for a dynamic type (reflecting the fact that is equivalent to → ).
The gradual type consistency relation is defined in Figure 4 by four rules: C1 defines that consistency is reflexive, C2 and C3 define that a dynamic type is consistent with any type, and C4 defines that two functions types are consistent if their respective argument and return types are consistent. As a result, Int → Int ∼ Int → but not Int → Int ∼ Bool → , since the argument types are not consistent in the latter case. Note that the consistency relation is not transitive. Due to C2 and C3, transitivity would lead every static type to be consistent with every other static type, which is clearly undesirable.
Typing conditional expressions relies on the meet operation, , on gradual types. Intuitively, meet chooses the more static of two base types when one is . For two equal static types, meet is idempotent. For two function types, meet is applied recursively to their respective argument and return types. The meet operation helps assign types to conditionals when the two branches might not have an identical type but still have consistent types. Intuitively, meet favors the type of the more static branch of the conditional expression.

Variational typing
Variational typing (Chen et al., 2012(Chen et al., , 2014 enables efficiently inferring types for variational programs. A variational program represents many different variant programs that share some parts among each other and which can each be generated through a static process of selection. The theoretical foundation for variational typing is the choice calculus (Erwig & Walkingshaw, 2011), a formal language for representing variational programs. The essence of the choice calculus is that static variability in programs can be locally captured in variation points called choices, as demonstrated by the following example. This program contains a choice named A with two alternatives, succ and even. We write e d.i to indicate the selection of the ith alternative of each choice named d in e. So, vfun A.1 yields the program succ 1 and vfun A.2 yields even 1. We call d.i a selector and use s to range over selectors. We call d. A variational expression may contain several choices. Choices with the same name are synchronized and independent otherwise. For example, the variational expression A succ, even A 2, 3 has two variants, succ 2 and even 3, obtained by the decisions {A.1} and {A.2}, respectively. The program succ 3 cannot be obtained through selection and so is not a variant of this expression. On the other hand, the variational expression A succ, even B 2, 3 has four variants, and we can obtain the variant succ 3 with the decision {A.1, B.2}.
In general, an expression with n distinct choice names can be configured in 2 n different ways. Since variational programs can easily contain hundreds or thousands of independent choice names (Apel et al., 2016), checking the type correctness of all variants is intractable by a brute-force strategy of generating all of the variants and typing each one individually (Thüm et al., 2014). Variational typing solves this problem by sharing the typing process across all variants, which is achieved by defining and reasoning about variational types.
Variational types are types extended with choices. We define variational types in Figure 5. They include constant types (γ ), such as Int and Bool, type variables (α), function types, and choices over two alternatives.
All concepts and operations on variational expressions carry over to variational types. For example, Figure 5 defines selections on types. Selecting constant types (and type variables) with any selector yield themselves. For a function type, selection is recursively applied on the parameter type and return type. Selecting a choice type (d V 1 , V 2 ) with a selector that has the same choice name (d.i) will yield the ith alternative. The selection is recursively applied to the alternative to eliminate all choices with the same name. For example, if we do not recursively select, A A Int, Bool , Bool A.1 yields A Int, Bool while Int is the expected result, which could be achieved by recursively selecting A Int, Bool with A.1. Selecting a choice type (d V 1 , V 2 ) with a selector (d 1 .i) that has a different choice name will apply the selection to both alternatives. Finally, selecting a type with a decision (s : δ) is recursively defined as first selecting the type with s and then selecting the resulting type with the decision δ.
It is natural to assign variational types to variational expressions. For example, A succ, even has type A Int → Int, Int → Bool . Similar to gradual typing, typing applications in the presence of variation is complicated by the fact that "compatible" types may not be syntactically equal. In particular, 1. the LHS is traditionally expected to be a function type but in variational typing may be a (nested) choice of function types, and 2. when checking whether the type of the argument matches the type of the parameter, we must take into account that either or both may be variational. For example, the type of the function on the LHS of vfun is A Int → Int, Int → Bool , which is not a function type directly, but both variants of vfun, succ 1 and even 1, are well-typed.
Typing applications is supported in variational typing through the definition of a type equivalence relation (Chen et al., 2014), which is presented in Figure 5. Essentially, type equivalence specifies when a type can be transformed into another without affecting its semantics. The semantics of a variational type maps decisions to the variant plain types obtained by selecting from the type using the decision. For example, A Int → Int, Int → Bool , A Int, Int → A Int, Bool , and Int → A Int, Bool are all equivalent because selecting from each of them with {A.1} yields the same type Int → Int and selecting from each of them with {A.2} yields the same type Int → Bool. As a result, we can say that vfun has the type Int → A Int, Bool , which is a function type with the argument type Int matching the type of 1. We can thus assign the type V vfun = A Int, Bool to vfun.
An important result of variational typing is that choice elimination preserves typing. More specifically, if e has the type V , then e δ has the type V δ for any decision δ. For example, vfun A.1 yields succ 1, which has the type Int, the same as V vfun A.1 . An implication of this result is that the type of any variant can be easily obtained by making an appropriate selection into the result type of the variational program. Another important result of variational typing is that it is significantly faster than the brute-force approach.

Road map to migrating gradual types
In Section 1.1, we argued that the complexity of the tasks implied by the questions Q1-Q3, involving the migration of gradual programs, is exponential. In Section 2.2, we have shown that variational typing can efficiently type a set of similar programs. A main idea of this paper is to reduce the problem of typing the migration space to variational typing. Specifically, we assign each parameter with a annotation a choice type whose the first alternative is a and whose second alternative is a static type (In Section 9.1, we deal with parameter types that are partially static, such as Int → ). Consider, for example, the following function widthV that represents the variationally typed version of the function width (also shown below) for computing the table width in rowAtI. width (fixed:: )(widthFunc:: )=if fixed then widthFunc fixed else widthFunc 5 widthV (fixed::A , Bool ) (widthFunc::B , Int → Int ) = if fixed then widthFunc fixed else widthFunc 5 The function widthV encodes all four possible migrations of width. If V widthV is the type of widthV, then V widthV {A.1,B.1} is the type for width with no annotations removed, V widthV {A.2,B.1} is the type that replaces with Bool for fixed and keeps for widthFunc, V widthV {A.1,B.2} is the type that keeps for fixed but replaces with Int → Int for widthFunc, and V widthV {A.2,B.2} is the type that removes both annotations. In order to successfully employ variational typing to improve the performance of migrational typing, several technical challenges must be addressed. Figure 6 presents challenges and relevant theorems. The challenge C2 (error tolerance) does not have any theorems associated with it so we omit it from the figure.
C1. We refer to this challenge type compatibility. In the presence of dynamic and variational types, we need to combine the type equivalence relation between variational types (marked as V ≡ in Figure 6) and the consistency relation between gradual types(marked as G∼ in the figure), which we refer to as the compatibility relation (marked as M≈ in the figure). After introducing the syntax of the migrational type system in Section 4.1, we address this problem in Section 4.2. Theorems 1 through 3 prove that the combination is correct. C2. We refer to this challenge error tolerance. In general, some variants of the variational program that encodes the migration space may contain type errors. We need the typing process to continue even in the presence of type errors to determine the types of all variants. In Section 4.3, we address this problem and give a declarative specification of our type system. C3. We refer to this challenge best typing. In the brute-force approach, we need to generate all expressions (e 1 , e 2 , . . . in Figure 6) from the given expression (e in the figure) by removing all combinations of s. These expressions will need to be typed using the type system GC introduced in Figure 4. Our type system (presented in Section 4.4) types the expression e directly once without generating other programs (the judgment π ; e : M | in Figure 6). We thus need to show that our type system, by typing only one expression, essentially types all possible expressions that could be generated. Theorems 4 and 5 prove that this is indeed the case.
In widthV, we explicitly assigned static types to each parameter. One may wonder whether these are the best types to assign. Maybe other static types could improve the typing result and produce more general types or fewer type errors. Theorem 6 in Section 4.5 proves that in our type system, there exists a best typing derivation that contains the fewest errors and yields most static and general result types. C4. We refer to this challenge migration extraction. In brute-force approach, we need to compare typing results for all generated expressions to determine the most static migrations. While we could type just the original expression once with the best migrational typing, we need to find out the most static migrations from the typing result. This may also require the comparison of an exponential number of result types for the migration space. Fortunately, Theorems 7 through 10 prove that an efficient algorithm exists for finding most static migrations. In Section 5.2, we develop such an algorithm. C5. We refer to this challenge type inference. In challenge C3 (best typing) we claimed that a best migrational typing exists, but how do we find it? We answer this question by solving the type inference problem in Sections 6 (constraint generation C in Figure 6) and 7 (constraint solving U in Figure 6). Theorems 11 through 15 prove desired properties of type inference.

Migrational type system
This section addresses the challenges C1 (type compatibility)-C3 (best typing) from Section 3 to support efficient migrational typing. After introducing the syntax of types and expressions in Section 4.1, the compatibility relation is defined in Section 4.2, addressing C1 (type compatibility). A pattern-constrained typing relation is introduced in Section 4.3 and defined via typing rules in Section 4.4, addressing C2 (error tolerance). Finally, the properties of this type system are discussed in Section 4.5, addressing C3 (best typing).

Syntax
The syntax of expressions, types, and environments is given in Figure 7. The metavariables we use to range over the relevant symbol domains are listed at the top of the figure. For type variables, we typically use β to denote the result type of a function application during constraint generation and κ to denote fresh type variables generated during constraint generation and solving (see Sections 6 and 7). For choice names, we typically use A and B to denote arbitrary specific choices in examples and d as a generic metavariable to range over choices names in definitions. The syntax of expressions, static types, and gradual types is repeated from Section 2.1. To this, we add variational types, which are static types extended with choices, and migrational types, which are gradual types extended with choices. Note that each top-level parameter is assigned a restricted form of migrational type, which is either a fully static type, a , or a choice of restricted migrational types; however, the more general syntax defined in Figure 7 is needed during the typing process. In Section 9.1, we extend our framework to allow an arbitrary mix of and static types for top-level parameters. We also define type context to facilitate our presentations of both the type system and proofs.
The type system relies on three kinds of environments: a type environment maps variables to migrational types, a substitution maps type variables to variational types, and a variational statifier maps variables to variational types. As described in Section 2.1, a statifier ω records one way of making a program more static (by removing some subset of annotations). A variational statifier instead compactly encodes all possible statifiers for an expression. Since we want migration in our formalization to assign static types to parameters whose annotations are removed, maps parameters to variational types, but not migrational types.
Substitutions map type variables to variational types rather than migrational types since substituting dynamic types is unsound. For example, suppose we have f → α → α → α → α and x → in . Now, when typing the application f x, we will substitute {α → }, yielding → → as the type of f x. However, this implies that f x 2 True is well-typed, even though this violates the initial static type of f. The idea of substituting type variables with variational types but not migrational types is reminiscent of Guha et al. (2007), where only certain contracts could be used to instantiate parametric contract variables. Type substitution, written as θ (M), is defined in the conventional way.

Type compatibility
In the rest of this section, we use the widthV example from Section 3 to motivate the technical development of the migration type system and investigate the properties of the type system. The motivating goal is to type the condition fixed and the application widthFunc 5 in widthV.
According to the annotation of widthV, the parameter fixed has type A , Bool . Since fixed is used as a condition, it should have type Bool. Since both alternatives of the choice are consistent with Bool, this use should be considered well-typed. The variable widthFunc has type B , Int → Int , which can be considered equivalent to B , Int → B , Int . (In Section 4.4, we show how to achieve this formally with dom and cod.) The constant 5 has type Int. Since both alternatives of B , Int are consistent with Int, widthFunc 5 should also be considered well-typed.
These two examples demonstrate that we need a notion of compatibility between two migrational types to express that all of their variants are consistent. Intuitively, the compatibility relation incorporates both type equivalence for variational types (Chen et al., 2014) and type consistency for gradual types (Siek & Taha, 2006). The definition of compatibility (M 1 ≈ M 2 ) is given in Figure 8. The relation is reflexive (MT-REFL) and symmetric (MT-SYM). The relation is transitive (MT-VTTRANS) in the case that no s are present, which we indicate by using the metavariable for variational types (V ).
The rules MT-IDEMP and MT-DEADELIM specify compatibility under choice type simplification. Rule MT-IDEMP states that a choice with identical alternatives is compatible To illustrate compatibility, we show A Int, ≈ B , Int . This should hold, since both choice types only produce Int or , which are consistent with each other and themselves. We can start by A Int, Int ≈ Int via MT-IDEMP and Int ≈ B Int, Int via MT-IDEMP and MT-SYM. We can then use MT-VTTRANS to derive A Int, Int ≈ B Int, Int . After that, we can apply MT-DYNINTRO to replace the first Int in B with a , apply MT-SYM, and apply another MT-DYNINTRO to replace the second Int in the choice A with a , yielding B , Int ≈ A Int, . By applying MT-SYM one more time, we can derive the original goal.
With ≈, we can formalize the application rule as follows.
Based on this rule and ≈, we can calculate the type B , Int for widthFunc 5.
We demonstrate the correctness of ≈ by establishing its connection with type equivalence (≡) from Chen et al. (2014) and type consistency (∼) from Siek & Taha (2006) through the following theorems. In the theorems, we write M δ ∈ V and M δ ∈ G to denote that M δ yields a variational type (no ) and a gradual type (no variations), respectively. The first two theorems state the soundness of ≈; the third theorem states its completeness.

Pattern-constrained judgments
The goal in this subsection is to type the application widthFunc fixed in widthV, thus solving challenge C2 (error tolerance) for migrational typing. According to the type annotation of widthV, widthFunc has type B , Int → Int , and fixed has type A , Bool . Since it is impossible to derive B , Int ≈ A , Bool (where the former is the domain of the function type and the latter is the type of the argument), the application rule from Section 4.2 fails to assign a type to widthFunc fixed. If we terminate the typing process, we will not be able to compute any type for widthV, failing to provide support for program migration.
While the compatibility check between A , Int and B , Bool fails, we observe that , the first alternative of A, is compatible with B , Bool and Int, the second alternative of A, is compatible with , the first alternative of B. This suggests that we should describe compatibility at a more fine-grained level than simply saying whether or not two migrational types are compatible. We employ the idea of typing patterns (π ) (Chen et al., 2012) to formalize this idea (see Figure 9). The patterns and ⊥ denote that the compatibility check succeeds and fails, respectively, and the choice pattern d π 1 , π 2 describes the success or failure of compatibility checking within the context of choice d.
In Figure 9, we also define selection on patterns, which is similar to selection on types ( V δ ) in Figure 5. On page 13, we gave a detailed explanation on selection on types, and we skip the explanation of selection on patterns here.
We can now express the partial compatibility between A , Int and B , Bool by the typing pattern A , B , ⊥ . It is also possible to give some pattern that has an identical effect, such as the pattern B , A , ⊥ .
In Figure 9, we define M 1 ≈ π M 2 such that M 1 and M 2 are compatible for all variants of π that are . In contrast, there is no requirement between M 1 and M 2 at other places. For example, Int ≈ A ⊥, A Bool, Int , since Int ≈ Int at A.2 (and since we do not care that Int and Bool are incompatible at A.1).
The idea of constraining compatibility with patterns is quite powerful. We can even generalize it to typing judgments. Specifically, the typing relation π ; e : M holds if δ e δ : M δ for all δ such that π δ = . The advantage is that we do not need to worry about the typing in variants where π has ⊥s. That also means that we should not use (or trust) the typing result at variants where π has ⊥s. We formally define this relation in Figure 9. For example, since 1 : Int we have A , ⊥ ; A 1, True : Int, even though True does not have the type Int. We can also generalize this idea to other operations, such as dom and cod, again defined in Figure 9. 9. Patterns and pattern-constrained relations and operations. . op can be any unary or binary operation on types. The is defined stipulations in the premise mean that the operations are defined on their input types, as specified in Figure 4. The is defined in the conclusion indicates that the operation can be safely carried out on the migrational type when constricted by π .
As shown in the rule PATUNARY, we can also use patterns to constrain unary functions so that they need to be defined for where only the pattern have . In the rule, op could be instantiated to any unary functions, such as dom and cod. We use the following function dom to illustrate this idea.
The function dom is defined for three cases and is undefined for all other inputs. For example dom (Int → Bool) = Int but dom (Int) is undefined. How about dom (A Int → Bool, Int )? We can observe that it is defined for the first alternative but not the second alternative. In such case, we can constrain dom with a pattern to indicate that the function does not need to be defined for all alternatives of variations. For our example, we can use the pattern A , ⊥ to convey that we only need the first alternative of A to be defined (because the pattern there is a ) while ignore whether the second alternative is defined or not (because the pattern there is a ⊥). With this idea, dom A ,⊥ (A Int → Bool, Int ) is defined in both alternatives of A. Moreover, for the second alternative, we can say the result dom is any type because ⊥ in that alternative indicates that the typing result will be discarded. Only typing results in variants where typing pattern has are valid and considered. Similarly, we can define cod π if we have a function cod, which we define in Figure 10. The rule PATBINARY allows us to constrain binary operations or functions in the same way.
Based on the idea of pattern-constrained judgments, we can define the following rule for typing function applications (where dom is defined above and cod will be defined in Figure 10): π ; e 1 : M 1 π ; e 2 : M 2 dom π (M 1 ) ≈ π M 2 π ; e 1 e 2 : cod π (M 1 ) With this new rule, which accounts for migrational types with type errors, we can revisit the problem of typing widthFunc fixed. Let π = A , B , ⊥ . Since 10. Typing rules. The operations dom, cod, and are undefined for cases that are not listed here. The process for obtaining dom π from dom is detailed in Section 4.3. The operations cod π and π can be obtained similarly through Figure 9.
fixed : B , Bool . Next, dom π (M) = A , Int . As we have seen earlier, A , Int ≈ π B , Bool . Thus, all the premises of the application rule are satisfied, and we can derive π ; widthFunc fixed : A , Int . Based on the result pattern, we should not trust the typing information at the variant {A.2, B.2} since π {A.2,B.2} = ⊥.
While pattern-constrained judgments simplify the presentation, we still face the challenge of finding appropriate patterns, which are inputs to the typing relation. However, the pattern is determined by the typing constraints among the subexpressions. For example, the type of the argument must match the argument type of the function. The reason we use A , B , ⊥ in typing widthFunc fixed is that the application is ill-typed at {A.2, B.2}. Therefore, in a language with type inference, the pattern will be computed during the inference process (Sections 6 and 7).

Typing rules
The typing rules are shown in Figure 10. They are based on the compatibility relation (Section 4.2) and pattern-constrained judgments (Section 4.3). The typing judgment has the form π ; e : M | and expresses that e has type M under environment constrained by the pattern π . The mapping collects the types that will be assigned to parameters if their s are removed. We assume that parameter names from different functions are uniquely identified in the domain of . The goal of is to connect the typing rules here with those from Figure 4. We discuss this aspect in more detail in Section 4.5 where we investigate the properties of the type system.
The rules for constants (CON) and variables (VAR) are straightforward. They hold for arbitrary patterns π because constants and bound variables are always well-typed. Moreover, since the types remain unchanged, is always ∅. The rule ABS for an abstraction whose parameter is not annotated with is conventional. In rule ABSDYN for an abstraction whose parameter is annotated with , we assign the parameter a choice type where the first alternative is implying that the is kept and the second alternative can be any type for the body to be well-typed. As a result, when variations are first introduced, their first alternatives are s. This change information is recorded by extending the returned from typing the body of the abstraction. The APP rule for applications is similar to the one in Section 4.3 except that we must combine the variational statifiers from typing the two subexpressions. The operations dom π and cod π can be obtained from dom and cod respectively using the idea of pattern-constrained operations discussed in Section 4.3.
The rule IF types conditionals; it relies on an extended version of the meet operation ( ) from Figure 4 that also handles choices. The definition π can be obtained from Figure 9 by instantiating the op in rule PATBINARY with . In Section 4.3, we gave a detailed example of deriving dom π from dom and π can be derived from similarly.
The WEAKEN rule states that if a typing pattern can be used to derive a typing, then we can use a less-defined pattern to derive the same typing. The operation = π 1 in the premise specifies that its arguments must be the same for places where π 1 has s. A typing pattern π 1 is less defined than π 2 if it contains ⊥ values at least everywhere π 2 does. The purpose of WEAKEN is to make the typing process compositional. Without this rule, the whole typing derivation must use the same π . With this rule, we can use different patterns for typing the children of a construct but adjust them to use the same pattern when typing the construct itself. To illustrate, consider typing an application e 1 e 2 . It is likely that e 1 and e 2 will contain errors at different variants, and thus, the typing patterns for typing them will be different. Without WEAKEN, we should use a single pattern for typing these two subexpressions. With WEAKEN, we can use different patterns for typing subexpressions, and before typing the application itself we can apply WEAKEN to the typing derivation for either or both e 1 and e 2 to make their patterns the same. After that, we can apply the APP rule.
The less-defined relation on patterns, written as π 1 ≤ π 2 , is formally defined in Figure 10. The rules PAT-OK and PAT-ERR define that any pattern is less defined than and more defined than ⊥. The rule PAT-TRANS defines that the relation is transitive. The last three rules handle variational patterns. The rule PAT-SINCHC states that a pattern is less defined than a variational pattern if it is less defined than both alternatives of the variational pattern. The rule PAT-CHCSIN states that a variational pattern is less defined than a pattern if both alternatives are. Finally, the rule PAT-CHCCHC says that two variational patterns satisfy the less-defined relation if their corresponding alternatives do.

Properties
This subsection investigates the properties of the type system. Since the goal of migrational typing in Figure 10 is to type all possible programs that remove s for a given program at once, we want to investigate whether migrational typing does it currently for individual programs and whether it indeed types all programs that remove s. To this end, we consider the relationship of the rules for migrational typing in Figure 10 and the original rules for gradual typing in Figure 4. We also consider the relation between different typing derivations π ; e : M | when different π s and Ms are used for the same and e, which addresses challenge C3 (best typing) from Section 3.
We start by introducing some notation. We say a decision δ is complete for an expression e if it contains d.1 or d.2 for each d created while typing e. For π , a decision δ is complete if π δ yields or ⊥. Note that a complete decision for π may not be complete for the expression since patterns compactly represent where typing succeeds and where it fails. For instance, while typing rowAtI, we created five choices A, B, D, E, and F for the dynamic parameters from left to right, respectively. Thus, each complete decision for rowAtI contains five selectors. One typing pattern for rowAtI is: 2} are complete decisions for π a but not for rowAtI. In the case that the whole migration space for an expression is well-typed, then the pattern is simply and the complete decision is { }. We use the notation δ| 2 to collect all of choice names d such that d.2 ∈ δ.
The notions of decisions (δ), variational statifier ( ), and statifier (ω) are closely related. Specifically, during typing, for each dynamic parameter x, includes a mapping x → V , where V is the type that will be assigned to the parameter once its annotation is removed. Therefore, given and δ, we can generate a statifier as follows, where chc(x) returns the name of the choice created for x.
The notation G 1 G 2 means that G 2 is more static than G 1 ; it is defined as follows.
We further say that G 2 is better than G 1 , written as G 1 G 2 , if G 1 G 2 or G 1 = θ 2 (G 2 ) for some θ 2 . Intuitively, G 1 G 2 if G 2 is equally or more static than G 1 or they are equally static and for any static part in G 1 , G 2 has the same static type or a type variable. For example, we have → α Int → Int and Int → Int Int → α.
We next demonstrate the correctness of our type system by showing that, at the places where the typing pattern is valid, it assigns the same types to all the programs in the migration space as the brute-force approach does.
This theorem states that, for any removal of annotations, the typing result encoded in migrational typing is the same as by typing the program with ITGL.
Conversely, any removal of that yields a well-typed program is encoded in some typing derivation in migrational typing, as expressed in the following theorem.
We can observe that for a given expression, there may be multiple typing derivations based on the typing rules in Figure 10. The reason is that, for example, the variational types used for typing the same ABSDYN in different typings could be different. Particularly, we want to know if there exists a best typing derivation that is more static and more defined (the corresponding typing pattern contains ⊥ in fewest variants) than all other derivations. Fortunately, this is indeed the case (Lemma 2). We next investigate the relation between different typings. In Lemma 1, we will show that different typings can be combined to make the result as correct as possible (that is, to minimize ⊥s in the result pattern). In Lemma 2, we show different typing can be combined to be made as good as possible (that is, to make types more static and more general). Note that the typing process records all dynamic parameters and corresponding variational types in . As a result, the domain of s in different typings are the same. However, the ranges could be different because different typings may use different V s in ABSDYN.
The following lemma states that we can always find a better (in the sense of the better relation defined at the beginning of this section, in Page 24) variational statifier and typing for any expression.

Lemma 2. If π ;
e : M 1 | 1 and π ; e : M 2 | 2 , then there is some typing The properties captured by the previous two lemmas can be combined to show that for any expression there exists a typing that has the most defined pattern and the most static and general result type. We refer to this typing as the most general static migrational typing, abbreviated as the MGSM typing.
Theorem 6 (MGSM Typing). For any e and , there is a MGSM typing π ; e : M | such that for any π 1 ; e : Proof of Theorem 6 The proof of the best typing is a direct consequence of Lemmas 1 and 2, meaning that we can produce a most precise and general typing and then give a most defined pattern to it.
To illustrate the use of Theorem 6, the MGSM typing for width is Theorem 6 implies that while an infinite number of typings may be derived (due to the ⊥ pattern), we need only care about the MGSM typing since it encodes all the typings for the whole migration space. Sections 6 and 7 investigate the problem of computing the MGSM typing.

Finding the best migration
This section addresses challenge C4 (migration extraction) from Section 3, that is, given the MGSM typing, how can we find the most static migrations? We address it by investigating the relationship between different migrations in Section 5.1 and developing an algorithm for extracting the most static migration from the typing pattern of an MGSM typing in Section 5.2. We use the term eliminator to refer to complete decisions. We say that an eliminator δ 2 is stricter than an eliminator δ 1 , written δ 1 δ 2 , if δ 2 does not select the left alternative (corresponding to ) in more choices than δ 1 . Formally, We say an eliminator δ is valid if π δ = where π should be clear from the context. We will use δ v to denote valid eliminators. For example, let , and δ v c are valid, while δ d is not, with respect to π b from Section 4.5.

Relationships between migrations
Since every migration can be identified by an eliminator for the MGSM typing, and since stricter eliminators correspond to more static migrations, the problem of computing the most static migrations can be reduced to the problem of finding the strictest valid eliminators.
Instead of considering all valid eliminators for an expression (which is exponential in the number of dynamic parameters), we instead consider the valid eliminators of the typing pattern for the MGSM typing of the expression. The reason is that typing patterns are usually small, yielding fewer eliminators that we have to consider (in fact, later results will show that we do not have to consider even all of these). For example, the pattern π a from Section 4.5 for rowAtI has only 5 eliminators while the expression itself has 32. As another example, from the pattern π b , defined at the end of Section 4.5 (page 25), we can see that δ v ab = {A.1} compactly represents δ v a and δ v b for width. Our first question is whether any eliminator that is stricter than an invalid eliminator could be valid. This question seems irrelevant for this example because the invalid eliminator δ d is already the strictest for π b . However, this is not the case in general, and knowing the answer to this question helps us to prune the search space. Theorem 7 (Error Irrecoverability). Let π ; e : M | be an MGSM typing for e and . If π δ = ⊥, then ∀δ 1 .δ δ 1 ⇒ π δ 1 = ⊥.
This theorem implies that we can simply ignore invalid eliminators and focus on valid ones, since all invalid eliminators lead to ill-typed expressions.
A valid eliminator for the typing pattern corresponds to potentially many valid eliminators for the expression. We say that a valid pattern eliminator δ 1 covers a valid expression eliminator δ 2 if δ 1 ⊆ δ 2 . Among all the expression eliminators covered by a pattern eliminator, one is the strictest. For example, the eliminator δ v ab for pattern π b covers the eliminators δ v a and δ v b for typing width, and δ v b is the strictest. As another example, the valid eliminator δ v ae = {A.1, E.1} for pattern π a covers eight valid eliminators (two options for each of the three choice names that do not appear in the pattern) for typing rowAtI, and  {A.1, E.1, B.2, D.2, F.2} is the strictest among them.
Among all expression eliminators covered by a pattern eliminator, stricter ones yield better result types. This is expressed by the following theorem.

Theorem 8 (Strict eliminators select better result types). If π ;
e : M | is the MGSM typing for e and , then δ v Proof Based on Theorem 4, we have statifierForDesc based on the definition of statifier generation (Section 4.5). As a result, more precise types are given to variables in a well-typed manner and the gradual guarantee (Siek et al., 2015) gives us M δ v 1 M δ v 2 . As an example illustrating Theorem 8, consider δ v a , δ v b , and M b , introduced in Section 4.5. We can verify that both δ v Theorem 8 provides a way to order the eliminators covered by a single pattern eliminator, but how about ordering different valid eliminators of the typing pattern? Considering pattern π b , neither of the valid eliminators δ v b or δ v c is stricter than the other. Similarly, for pattern π a , neither of the valid eliminators is stricter than the other. In fact, this property holds not only for these two examples but also for a class of typing patterns that are in pattern normal form. We say a pattern is in normal form if it does not contain idempotent choices (choices with identical alternatives) and does not nest a choice in another choice with the same name (no dead alternatives). We capture this property in the following theorem.
Theorem 9 (Eliminator Incomparability). Let π ; e : M | be MGSM typing for e and and π is in normal form, then δ. If δ is a valid eliminator, then we can again show that d 2 is idempotent, a contradiction that π is in normal form.
We could swap the assignments to δ v 1 and δ v 2 , but this will yield the same proof result. It follows from the theorem that for any two valid eliminators δ v 1 and δ v 2 for π 1 , δ v 1 δ v 2 and δ v 2 δ v 1 . Two eliminators that are incomparable with respect to will remove s for different parameters for the same expression, leading to types that are incomparable by (defined in Section 4), and thus incomparable by . For example, since Combining Theorems 8 and 9 yields the following result about finding most static migrations. We develop an algorithm for extracting such migrations in Section 5.2.
Theorem 10 (Uniqueness of most static migrations). Let π ; e : M | be the MGSM typing for e and , and π is in normal form. Then, the number of most static migrations for e equals the number of valid eliminators for π .
Proof of Theorem 10 The proof follows directly from Theorems 9 and 8. Theorem 9 implies that complete decisions are not comparable, and no other complete decisions are better than them. Theorem 8 implies that tighter selectors yields more precise types. By definition, each complete decision yields a most static migration, since no types better than those produced by complete decisions can be assigned to the expression.
It follows from the theorem that e has a unique most static migration if π 1 has only one valid eliminator.

Extracting most static migrations
The most static migrations for a program are identified by valid eliminators that describe whether to pick the annotation or the inferred type for each parameter. We compute this set of eliminators from an MGSM typing in three steps: (1) simplify the typing pattern to its normal form, (2) collect the valid eliminators for the normal form, and (3) expand each valid eliminator into a strictest eliminator for the corresponding expression.
Simplifying a typing pattern to its normal form has two advantages. First, the valid eliminators are fewer and smaller. Second, we can use the result of Theorem 10 to find most static migrations. We use the following rules to simplify patterns to normal forms. d π , π π d π 1 , π 2 d π 1 d.1 , π 2 d.2 π 1 π 2 π [π 1 ] π [π 2 ] The first two rules remove idempotent choices and dead alternatives. The third rule enables simplifying parts of a larger pattern. For example, we can use the third and the first rule to simplify the pattern π c = A E B , , ⊥ , B E , ⊥ , ⊥ to pattern π a from Section 4.5.
We use the function ve (π ) to build the set of valid eliminators for a pattern π in normal form.
To illustrate the definition of ve, we consider the calculation process for the pattern Each expanded valid eliminator is a best eliminator that specifies how to migrate the program. For example, the first best eliminator for rowAtI above removes the annotation for widthFunc, table, and i, while the other best eliminator removes the annotation for fixed, table, and i.
Formally, given an expression e and its MGSM typing π ; e : M | , then for any expanded valid eliminator δ v , we can generate the most static migration using statifierForDesc ( , δ v ), defined in Page 23.
Overall, these three steps provide a simple way to extract the most static migration from an MGSM typing. In Section 10, we show that these steps lead to an efficient implementation. Usually, the normal form of a typing pattern is small and has only a few valid eliminators. For example, if the program is still well-typed after removing all annotations, then the pattern will be , which has only one valid eliminator (the empty set). Similarly, if the program is ill-typed if any annotation is removed, then there is again just one valid eliminator.
Since normal forms are ideal, we will show in Section 7 how we can efficiently maintain patterns to be in normal form throughout the type inference process.

Constraint generation
The constraint generation rules are presented in Figure 11. The judgment C e : M | C states that under , the expression e has type M when the constraint C is solved. Accordingly, e and are inputs, while M and C are outputs. Note that we now omit the statifier in constraint judgments since it is not needed for type inference. We also omit π since π is an input in the declarative typing but will be computed through solving constraints generated here. Constraint solving will be discussed in Section 7. The syntax of constraints are as follows: The first form represents type compatibility constraints. Often it is the case that two types are only partially compatible. Note, when M 1 ≈ ? M 2 is solved, it is not necessary that M 1 and M 2 are compatible everywhere. As a result, constraint solving result includes a typing pattern, which indicates where M 1 and M 2 are indeed compatible. The constraint C 1 ∧ C 2 defines the conjunction of two constraints C 1 and C 2 , while the constraint d C 1 , C 2 defines a choice between two constraints. The constraint ε represents an empty constraint. This is needed to represent a judgment where no constraints are generated.
Finally, the constraint Fail represents a constraint that, when solved, always leads to a failure. Such a constraint is needed when, for example, dom (Int) is calculated during the constraint generation process. As Int is not a function type, dom (Int) will always fail. We generate a Fail to communicate this failure to the constraint solver. The constraint Fail was absent from the original paper (Campora et al., 2018a). Without it, that work outputs a typing pattern and returns a ⊥ as the typing pattern to denote that certain constraint will definitely fail to solve.
A drawback of that approach is that both constraint generation and constraint solving output typing patterns, and these patterns have to be combined into a single pattern, which is one part of type inference result. That work used the notion of "pattern placeholders," which are introduced during constraint generation and will be plugged in with concrete patterns during constraint solving. The introduction of Fail simplifies the handling of patterns. Specifically, only constraint solving outputs a pattern, and we do not need the notion of "pattern placeholders." Also, the typing pattern has no longer to be part of the constraint generation judgment. Moreover, with Fail we have simplified the judgments and definitions of several auxiliary functions (Figures 11 and 12) in this version.
We now walk through each constraint generation rule. The rule CONC, generating constraints for constants, has a very similar form to CON in Figure 10. The rule VARC for variable references is similar to VAR and, like CONC, generates the empty constraint.
The rule ABSDYNC generates constraints for abstractions with dynamic parameters. It helps facilitate migration by creating a fresh choice type with a left alternative containing and a right alternative containing a fresh type variable. The type variable is used to infer a new static type for the parameter, if possible. The rules APPC and IFC are more involved because constraints from premises have to be combined. The rules APPC and IFC use many auxiliary functions to generate constraints. The functions, defined in Figure 12, take the form: domCst (M 1 , M 2 ) → C, codCst (M 1 ) → (M 2 , C), and M 1 M 2 → (M 3 , C), where the objects to the left of → are inputs and those to the right are outputs. Essentially, they implement the dom, cod, and operations defined for the declarative type system in Figure 10. Note, in these functions κ denote fresh type variables. We will use such variables in this and next sections.
We illustrate domCst by considering the example domCst (A , α , Int). Since the first argument is a choice type, domCst proceeds to recursively call on each alternative of A, leading to two subproblems domCst ( , Int) and domCst (α, Int). The first subproblem is handled by the case for , which immediately returns ε, meaning that no further constraints need to be solved. The second subproblem is handled by the case of domCst for type variables. Since dom always expects a function type, the constraint α ≈ ? Int → κ 2 is generated. The constraints for subproblems are combined together with the choice A, yielding the final constraint A ε, α ≈ ? Int → κ 2 .
The following soundness (Theorem 11) and completeness (Theorem 12) theorems state that the constraint generation rules correspond to the declarative typing rules presented in Figure 10. In particular, Theorem 12 implies that constraint generation finds the MGSM typing. Following the spirit of Vytiniotis et al. (2011), we use the idea of sound and most general solutions (θ ) for constraints (C) in the following theorems (Vytiniotis et al. (2011) used the term guess-free). (θ, π ) is sound for a constraint of the form M 1 ≈ ? M 2 if θ (M 1 ) ≈ π θ (M 2 ), is sound for a constraint C 1 ∧ C 2 or d C 1 , C 2 if it is sound for both C 1 and C 2 , is sound for Fail if π is ⊥, and is always sound for ε. In Section 7, we provide a unification algorithm that generates solutions with these desired properties.

Two constraint generation examples
The following table lists the constraint generation process for the expression λx : .succ (x True). In each row, we list the subexpression visited, the type of that subexpression, and the constraint generated. Assume the fresh choice and variable generated for the parameter are A and α, respectively.
The constraints C 1 and C 2 are generated from the third and fourth premises of APPC for typing x True, respectively. The constraint C 4 is generated from the fourth premise of APPC for handling the application succ (x True).
Continuing from the fifth row of the table above, the following table lists additional constraints that will be generated from the expression λx : .x (succ (x True)).

Unification
This section presents a unification algorithm for solving the constraints generated in Section 6, thus completing the road map presented in Section 3.

Solving compatibility constraints
We first motivate the structure and design of the algorithm with the following examples.
Our solver must adhere to certain rules to ensure the correctness of type inference, including: (I) is compatible with any type (Section 2.1). (II) Type variables are only substituted by static types (Section 4). (III) The typing pattern produced must be as defined as possible (Section 4).
Problem (i) helps illustrate rule (II). Intuitively, α should be substituted by a function type whose codomain is Int, but what should the domain be? Essentially, the domain should be an unconstrained type variable so that it can unify with a static type later, if necessary. As a result, we generate the substitutions {κ 2 → Int} • {α → κ 1 → κ 2 }. Since κ 1 is a fresh type variable that is not mapped to anything, it is unconstrained. In contrast, κ 2 is mapped to Int. This substitution satisfies both rules (I) and (II).
Problem (ii) demonstrates the need for error tolerance in solving constraints. The natural way to solve a choice constraint is to decompose it into two constraints. Doing this on constraint (ii) yields two subconstraints, ≈ ? Int and Bool ≈ ? Int, where π = A π 1 , π 2 . According to rule (I), the first constraint is solved successfully and π 1 is updated to . The second constraint, however, fails to solve, since Bool cannot be made compatible with Int, so we update π 2 to ⊥. Consequently, we update π to A , ⊥ to reflect that constraint solving fails in A.2. Choosing instead ⊥ for π would yield a consistent result but would violate rule (III). Figure 13 presents a unification algorithm U , which takes a constraint and produces a substitution θ and a pattern π . The algorithm can be understood as extending Robinson's unification algorithm (Robinson, 1965) to handle variational types and dynamic types and to support error tolerance. To support error tolerance, the unification not only returns a substitution but also a typing pattern. The unification is successful at variants where the pattern has and is failed at variants where the pattern has ⊥. In the algorithm, cases (a) and (a * ) deal with dynamic types, cases (c), (d), and (d * ) deal with variations. Cases (g) through ( j) deal with non-compatibility constraints. Other cases of the algorithm resemble their counterparts in Robinson's algorithm but still need to account for occurrences of s and variations.
We now briefly walk through each case of U . Some cases of U have dual cases, and names of such cases differ by a . Essentially, the starred version delegates the real solving task to the case without a . Case (a) handles the trivial constraints involving . Such constraints are simply discarded without generating any mapping. We return as the pattern, since is compatible with any type. More importantly for α ≈ ? , case (a) takes priority over (b), ensuring that the substitution {α → } is not generated.
Case (b) unifies a type variable α with a migrational type M. This case includes many subcases. First, if M does not contain and α does not occur in M, then α is directly mapped to M. For example, given α ≈ ? A Int, Bool , the substitution {α → A Int, Bool } is returned, and π is updated to . Second, if M contains variation, the result is computed via case (d). For example, the problem α ≈ ? A , Int is transformed into A α, α ≈ ? A , Int .
Next, if M is a function type that contains and α does not occur in M, then we transform α into a function type by using fresh type variables and delegate the solving to case (f). The problem (i) in Section 7.1 falls in this case. This case essentially solves two constraints, and we will have two typing patterns (π 1 and π 2 in the algorithm). We need to combine them into one. The resulting pattern must be restricted enough to create a valid solving result but well defined enough to give useful information about where constraint solving succeeds. The operation can be viewed as a meet operation over the less defined partial order on typing patterns in Figure 10. It creates the greatest lower bound of two patterns, ensuring that the most defined pattern is used for solving the constraint. π = π d π 1 , π 2 d π 3 , π 4 = d π 1 π 3 , π 2 π 4 ⊥ π = ⊥ d π 1 , π 2 π = d π 1 π , π 2 π Back to case (b), if all previous subcases fail, ⊥ is returned, indicating that the constraint failed to solve.
Case (c) handles constraints involving two choice types that share an outer choice name. It decomposes the constraint into two smaller problems and solves them individually. For instance, consider the constraint A , α ≈ ? A Int, Bool . This constraint will be decomposed into ≈ ? Int and α ≈ ? Bool, which will be solved by (a) and (b), respectively. Case (d) unifies a choice type with another type not handled by case (c). This case employs a similar implementation idea as case (c) does. For example, for A , Int ≈ ? Int, the two smaller constraints to be solved are ≈ ? Int and Int ≈ ? Int. Case (e) unifies two static types and is delegated to Robinson's unification algorithm (Robinson, 1965). Case (f) unifies two function types by unifying their respective argument and return types. Cases (g), (h), (i), and ( j) deal with non-compatibility constraints.
To keep patterns in normal form, we also perform the following optimizations to prevent idempotent choices patterns from being created. In cases (c) and (f), when creating the choice pattern d π 1 , π 2 , we check if π 1 and π 2 are the same; if so, the choice pattern is replaced by π 1 . In the last two cases of in Section 6, we perform the same optimization. After this, the algorithm maintains patterns in normal forms, since the generated constraints do not contain dead alternatives and since the case (d) of U prevents dead alternatives from being introduced.
Unification examples In Section 6, we generated two constraints for the expressions λx : .succ (x True) and λx : .x (succ (x True)). We use these two constraints to illustrate the unification process.
The first constraint is A ε, C 1 ∧ C 2 ∧ C 4 . For this constraint, case (h) applies, which breaks the variational constraint into two smaller constraints in each alternative and then combine the results from alternatives. The left alternative has the constraint ε, which will be solved by case (g) with the solution (θ l , ), where θ l = ∅. The right alternative has the constraint C 1 ∧ C 2 ∧ C 4 . We will repeatedly use case (i) to handle each subconstraint C 1 through C 4 . Since there are no s and variations in these constraints, they degenerate to conventional type equality constraints. We can use robinson's unification algorithm to solve them. The unifier is The typing pattern for solving them is as the solving for each constraint returns .
After we have the solutions for both alternatives, we will now combine them together. First, the combined typing pattern is A , , which simplifies to , meaning that the type inference succeeds everywhere. Next, we combine unifiers with the function merge defined earlier in this subsection. Note, since θ l is ∅, the second case of merge will handle each mapping in θ r . For example, as α → Bool → Int ∈ θ r , then the merged substitution includes α → A κ 8 , Bool → Int , where κ 8 is s fresh type variable. Here we use a fresh type variable in the first alternative to denote that the first alternative for α is not constrained yet, allowing future unification with any type, if necessary. Overall, let θ m be the substitution after merging θ l and θ r , then Substituting the result type A , κ 2 → Int with θ m yields the type A , A κ 8 , Bool → Int → Int, which simplifies to the type A , Bool → Int → Int after we eliminate the unreachable alternative κ 8 . Since the combined typing pattern is and selecting with {A.2} yields , it means that we can migrate x, the parameter associated with the choice A. Moreover, based on the result type of A , Bool → Int → Int, we know the migrated expression has the type (Bool → Int) → Int. Now we solve the constraint A ε, C 1 ∧ C 2 ∧ C 4 ∧ C 5 ∧ C 6 generated for the expression λx : .x (succ (x True)). We proceed similarly as before. In particular, constraint solving C 1 through C 4 yields the unifier θ r mentioned above. We then need to solve C 5 and C 6 from θ r . When solving C 6 , we need to unify Bool → Int with Int → κ 8 , which fails. The pattern returned is thus ⊥. Therefore, the pattern for solving the whole constraint is A , ⊥ . Based on the pattern, we know that we cannot migrate x.
Note, even though our approach cannot migrate x, types more precise than could actually be assigned to x, such as → Int. The reason we cannot find this migration is that λx.x (succ (x True)) is not well-typed under type inference by Garcia & Cimini (2015), and our type inference can be considered as the variational version of theirs. We provide an extension to the unification algorithm U to infer more precise types in Section 9.2.

Properties
We now investigate the properties of U . First, U is terminating.
Theorem 13 (Termination). Given C, U (C) terminates.  M 22 . Note that π in the constraint is just a placeholder and will be updated when the constraint solving finishes.
The idea of the proof is to go through all possible constructs of the type M and show that U covers all possibilities. To establish that most general unifiers exist, we get the results directly from the induction hypothesis (and compose the mgus of the subterms) or use proof by contradiction. As the proof is standard and lengthy, we omit it here.

Introducing dynamism for fixing static type errors
Fixing static type errors by introducing s could be useful under several scenarios. First, when migrating a program, the user may have added static types that cause type errors. To pass static type checking of gradual typing, some added type annotations should be removed. Second, the addition of dynamic types can be used to silence type errors and defer the reporting of type errors to runtime (Bayne et al., 2011;Vytiniotis et al., 2012). This idea is particularly intriguing for fixing static type errors as type error messages generated by compilers are often opaque and difficult to understand (Marceau et al., 2011a,b;Pavlinovic et al., 2014;Loncaric et al., 2016;Serrano & Hage, 2016;Munson & Schilling, 2016). For example, the work by Bayne et al. (2011) shows that obtaining even partial result of ill-typed programs helps programmers to understand type errors and accelerate program development. Our recent work indicates that gradual typing leads to more concrete feedback than deferred type errors for ill-typed programs (Chen & Campora, 2019). In particular, in some situations while deferred type errors dump compile-time error messages, gradual typing returns values to the programmer.
A simple approach for removing type errors is adding annotations to all parameters, which are static by default. However, this approach is undesirable for several reasons. First, adding a annotation to every single parameter is laborious to programmers. Second, adding all s hurts the efforts of migrating programs to be static. Third, the program is likely to lose useful type information in many locations.
For this reason, our goal here is to develop a solution to question Q2. Specifically, for a statically ill-typed program, we aim to find a minimum set of parameters such that replacing them with s removes the type error. It turns out that introducing as few dynamic types as possible for answering Q2 is equally tricky as removing as many dynamic types as possible. To illustrate, consider the following program rowAtISt, which shares the body with rowAtI but removes s from all its parameters. This function is ill-typed since, for example, the then branch for computing width requires widthFunc to have the type Bool → Int and the else branch requires it to have the type The difficulties in adding s are similar to the ones espoused for removing s in Section 1.1. There is an exponential number of ways s can be added to the program; adding s to all parameters introduces more dynamism than desired. Some dynamism can be avoided by adding annotations in a left-to-right manner, but this is inefficient and can still add unnecessary dynamism. For example, following this process on rowAtISt leads to a migration that add s from headOrFoot to border, since only then rowAtISt becomes well-typed. In fact, however, the dynamism on, for example, table is unnecessary. If the programmer wants to remove such unnecessary dynamism, they encounter the exact same difficulties detailed in Section 1.1. The similarity in difficulties inspires our solution to introducing dynamism, which is detailed in the next subsection.

Duality to removing dynamism
The program rowAtISt can be thought of as one of the programs in the migration space of rowAtI in Figure 1. In fact, it is the bottom-most program in the figure had we listed out the full migration space there. Recall that programs 3 and 5 were the most static migrations for program 1. While introducing s for rowAtISt, programs 3 and 5 are likewise the programs we desire since they keep as many static types as possible and are still well-typed.
We can envision organizing the whole migration space into a lattice where more dynamic programs are in the upper portions of the lattice (Takikawa et al., 2016). The process of removing dynamism to make the program static keeps going down the lattice before a type error appears. The process of introducing dynamism to fix type errors keeps going up the lattice until type errors disappear. Overall, these two processes are dual. This fact inspires our formal development to realize the process of introducing dynamism, which we shall see next.
Typing rules. In removing dynamism, we introduce variations for parameters whose type annotations are s and not to others. Based on the duality, we should now introduce variations to parameters without annotations and not to others. Specifically, we define a new type system using the judgment form π ; D e : M | . This judgment has the same meaning as the one in Figure 10 and shares the same rules as that one except for ABS and ABSDYN, for which typing rules are as follows.
D λx : .e : → M | These two rules are dual to the corresponding ones in Figure 10. For an abstraction with a static type, the type error may be removed by changing its parameter to have the dynamic type. We express this by creating a fresh variation with its first alternative being , as can be seen in the ABS rule. The rule then records the changes in the variational statifier. For ABSDYN, no changes will be made for the parameter type, and thus no variations are created in the rule, since our goal is to fix static type errors and not to migrate programs towards using more static typing.
Using the given typing rules, we can derive the following type for rowAtISt, assuming the variation names for parameters from left to right are A, B, D, E, F, G.
The typing pattern for it is: Connection to ITGL. Each variational statifier (in this context perhaps it should be renamed to dynamifier) generated by the D type system now collects parameters for which annotations are added (instead of removed as was done previously). From the variational statifier, we can generate a statifier for each given decision as follows.
The generated statifier coerces certain parameters to have type s and leaves others to their original types. We can define a type system similar to the type system in Figure 4 that types gradual expressions under updates from statifiers. The new type system is the same as the one in Figure 4 except for the rules ABS and ABSDYN, which are presented below.
In ABS, a parameter with a static type is maybe assigned a if the ω specifies so. For functions with parameters, handled by ABSDYN, the typing rule does not update their types.
Finding error fixes. The D typing relation indeed finds correct and complete fixes to type errors, as captured in the following theorems, which serve a similar goal as Theorems 4 through 6 served in the type system of removing dynamism. The proofs of these theorems thus follow those closely and are omitted here. The previous theorem indicates that we can use migrational typing to fix errors but does not state that the fixes are minimal. The following theorem states that we can find a most general, least dynamic fix for a program. We call this the MGDM typing.
From the typing pattern π in MGDM, we can reuse the machinery to find the best migration in Section 5.2 for finding migrations that fix type errors by introducing fewest s to parameters. For example, the π for the MGDM of rowAtISt is π d given earlier. This pattern indicates that either fixed and border should have s to remove the type error, or widthFunc and border should have s.

Discussion
This section demonstrates that migrational typing is flexible and can be easily adapted to solve another interesting program migration problem. The fundamental reason is that migrational typing provides an efficient method to explore the typing of the full migration space and extract the desired migrations from that space, which naturally lends itself to solving other migration problems.
It is interesting to see if we can fix type errors and migrate programs to utilizing more static typing simultaneously. Essentially, such a process first adds annotations to remove the type error and then inspects to see if other annotations can be safely removed after the error is fixed. Note that typing rules in Figure 10 introduce variations for parameters with s and those in this section introduce variations for parameters that have no s. This suggests that the type system that simultaneously fixes type errors and migrates programs should create variations for all parameters. Specifically, the ABSDYN rule should be the same as the one in Figure 10 while ABS be the same to the one in D . After that, we can use the method descried in Section 5.2 to extract the migration that removes type errors as well as migrate the program to be as static as possible.
The simplicity of the type system for this purpose echoes our early observation about the flexibility and adaptability of migrational typing.

Extensions
In this section, we consider how to support additional language features in our migrational type system. First, we show that our migrational type system is flexible and can support extensions that make the source language more expressive for programmers. Then, we cover other uses of migrational typing, for example allowing programmers to indicate which regions they want to remain dynamic or static.

Other language features
Our version of ITGL, given in Figure 10, restricts parameters to be either unannotated or annotated by . The formulation of gradual typing by Garcia & Cimini (2015) allows arbitrary gradual type annotations on parameters and also supports type ascription, that is, asserting by e :: G that expression e has type G.
We can extend our type system to support arbitrary gradual type annotations as follows. Given an abstraction λx : G.e, if G = or G is fully static, type the abstraction as usual; if G is a complex type containing types, replace G by a choice whose first alternative is G and whose second alternative replaces all dynamic parts by arbitrary types. For example, if G = Int → → , then the type of the parameter is d Int → → , Int → V 1 → V 2 , where d is fresh. To generate the corresponding constraint (Section 6), we replace V 1 and V 2 by fresh type variables. Note that this extension still tries to assigns full static types for s. As such, this extension will not be find a migration for λx : .x (succ (x True)), as shown in Section 1.3. The extension in Section 9.2 is able to infer partial static types.
We can extend our type system to support type ascription with the following typing rule.
The second premise ensures that the static parts of the ascribed type G are copied to the second alternative of the choice. The third premise ensures that the type of the expression M is compatible with the ascribed type and also a corresponding type V with all types removed. We can update the structure of to accommodate this rule by defining its domain to be program locations rather than parameter names. We use e here as shorthand for the location of e. Finally, we can also add support for let-polymorphism. The approach is straightforward, but the notations become heavier. We use α to denote a list of type variables and {α → V } to denote a set that includes α 1 → V 1 , . . . , α n → V n . The function vars(·) returns the free type variables in its argument. The typing rules are standard except that when typing variable references (VAR) we can only instantiate type schemas with variational types (V ) and not migrational types (M).
In support of all of these extensions, the other machinery of our approach, including constraint generation, unification, and extracting the most static migration, can be reused.

Inferring more precise types
The example in Section 7.1 shows that our approach fails to find a migration for the expression λx : .x (succ (x True)), even though λx : → Int.x (succ (x True)) can be a more precise migration. Recall from Section 6 that during constraint generation we assigned the variational type A , α to the parameter type x and the generated constraint To investigate why our approach cannot find a migration and how we can potentially improve this situation, we list the constraint solving process for the constraint C 1 ∧ C 2 ∧ C 4 ∧ C 5 ∧ C 6 below. The first column lists the constraint being solved, and the latter two columns list the unifier and pattern from solving the constraint. Constraint Solution Pattern Ignored, does not affect result α ≈ ? Int → κ 8 {α → Bool → Int, κ 1 → Bool, κ 2 → Int} ⊥ The constraint solving fails when we need to solve the constraint α ≈ ? Int → κ 8 , since our solution before that point contains α → Bool → Int. When constraint solving fails, the returned pattern is ⊥, and the content of the unifier will no longer be used. As a result, we leave the content of the unifier as the same after solving α ≈ ? Int → κ 8 .
The main reason our approach fails to find a migration is that, as we were solving the first constraint α ≈ ? κ 1 → κ 2 , we made three requirements: (1) the type that α maps to is constructed by the → type constructor, (2) the parameter type of → be a static type, and (3) the return type of → be a static type. However, in x (succ (x True)), the body of the function, x, is used as functions and applied to both Bool and Int values. As a result, no static type could be assigned to x. We can address this problem by relaxing the three requirements for α. To address this problem, we observe that α denotes the type for x when the for x is removed, and we are finding a more precise migration than . Thus, instead of constraining α with all the three requirements at once, we can relax the latter two requirements and require α be unified with a type whose type constructor is → only. From now on, we call type variables that are introduced to replace s for dynamic parameters migration type variables. Migration type variables appear in the right alternatives of choices when choices are first created. We will use α to range over migration type variables.
Overall, the idea of our solution is that when a migration variable is unified against a function type, we require only that the migration variable be mapped to a function type but allow the parameter type and return type to remain a . The typing that happens later decides whether the parameter type and/or return type could be made precise than a . As a result, a parameter can now be migrated to a function type whose parameter or return type remains a .
One technical challenge is that for the parameter type and return type, we need to explore two possibilities: the and a more precise type. Our machinery with variational typing provides a nice solution. Specifically, when a migration variable α is unified with a function type M 1 → M 2 , we refine α to a function type A 1 , α 1 → A 2 , α 2 (We refer to this process as refinement) and unify this function type against M 1 → M 2 . Here, A 1 , α 1 , A 2 , and α 2 are fresh and α 1 and α 2 are migration variables, which could be further refined to function types whose parameter and return types are s. The function type A 1 , α 1 → A 2 , α 2 encodes four possibilities: both the parameter type and the return type could be or a more precise type.
Following this idea, the constraint solving process for the constraints C 1 through C 7 is updated to the following. In the "Solution" column below, we omitted the mappings α 1 → κ 1 and α 2 → κ 2 to save space.
From Section 6 (page 32), we know that the type of λx : .x (succ (x True)) is A , α → A , κ 6 . Plugging in the solution for α from the unifier above, the type for λx : .x (succ (x True)) is M dp =A , A 1 , Bool → A 2 , Int → A , A 2 κ 10 , Int . Moreover, the pattern for the whole function is A , A 1 , ⊥ . Note, A 2 does not appear 14. An extension to the unification algorithm in Figure 13.
in the result pattern because whether we choose or Int for the return type of the function type for α, the well-typedness of the expression remains the same. Applying the operations ve and expand, defined in Section 5.2, to the pattern A , A 1 , ⊥ , we know that the best migration for this expression corresponds to the valid eliminator {A.2, A 1 .1, A 2 .2}. Selecting M dp with {A.2, A 1 .1, A 2 .2} yields the type ( → Int) → Int, the type of λx : .x (succ (x True)) after migrating the parameter x. This means that our extension could indeed find a more precise migration for λx : .x (succ (x True)). Figure 14 presents an extension to the unification algorithm that implements our idea from above. We briefly go through the cases. First, the cases (bR) and (bR * ) replace cases (b) and (b * ) in Figure 13, by renaming the type variables α to β. Note that from now on, we use α to denote migration variables and β to denote all other variables. The cases (b1) through (b4) handle unification between a migration variable and itself, a constant type, a non-migration type variable, and a variational type.

An extension to the unification algorithm
Case (b5) handles the unification between a migration variable and a function type. This case uses an auxiliary function AllLvsDynMvs to determine if the leaves of a given input type are all s or migration type variables. For example, all AllLvsDynMvs (α 1 → α), AllLvsDynMvs (α 2 ), and AllLvsDynMvs (( → α) → α 2 ) are true, while AllLvsDynMvs (α 1 → Int) and AllLvsDynMvs ((α 1 → Bool) → α 2 ) are false. This function helps avoid non-termination in our extension. To illustrate, consider the constraint α ≈ ? α → β. Such a constraint arises when typing a self application, such as in the expression λx : .x x. This constraint fails to solve using the constraint solving algorithm in Figure 13 due to the occurs check.
There are two potential ways to address this problem. The first is that we use a heuristic, such as allowing a single migration variable be refined by up to a certain number of times only. Any further refinement attempt on the same migration variable would be rejected and treated as a unification failure. The second is to detect the unification that unifies a migration variable (α) against a function type that contains the migration variable (α) and all other leaves are other migration variables or s. Such a unification does not reflect any program structure information but is resulted from refining a unification variable to a function type, since constraint generation ( Figure 11) does not generate such a constraint. If such a unification problem is detected, we can terminate the unification with a failure.
In this extension, we use the second way to address this problem. Concretely, we capture it in the first subcase of case (b5). In the second subcase, α does not occur in the function type and all leaves are migration variables, then we directly map α to the function type. In the third subcase, the function type contains some s. We need to refine α to a function type, but without creating new variations. The last subcase implements the idea of refining a migration variable into a function type whose both parameter and return types are variations.
With this extension, let's now turn to finding migrations for the term λx : .x x. First, we generate the constraint A , α ≈ ? A , α → β and the type for the term is A , α →β. This constraint will be solved using case (d) of Figure 13, which will solve two constraints originated from the two alternatives of A. For the left alternative, the constraint is ≈ ? → β, which will be solved by case (a) of Figure 13 with the solution (∅, ). For the right alternative, the constraint is α ≈ ? α → β. This constraint will be handled by the fourth subcase of case (b5) in Figure 14, and it will be transformed to With a few steps, this problem can be solved and the solution is {α → A 1 , α 1 → A 2 , β , α 2 → β} and the pattern is A 1 , ⊥ . Substituting the type of the term with this solution yields A , A 1 , α 1 → A 2 , β →β and the overall pattern is A , A 1 , ⊥ . From this pattern, we can use ve and expand defined in Section 5.2 to calculate the strictest valid eliminator {A.2, A 1 .1, A 2 .2}. Selecting the type A , A 1 , α 1 → A 2 , β →β with this eliminator leads to the type ( → β) → β, which is a most static migration for λx : .x x. This shows that with the extended constraint solving algorithm, we could find a more precise migration for λx : .x x that we could not find earlier.

Further migration scenarios
Sections 4 and 5 provide a type system and a method for finding all best migrations. In practice, there may be different migration requirements. In this subsection, we explore a few of them and show how to support them with machinery developed in earlier sections. Specifically, we consider the following migration scenarios.
i. Can the programmer control which parameters must or must not be migrated? ii. If migrating a set of indicated parameters yields a type error, can we still migrate a subset of these parameters? iii. Given a set of parameters, can we find which parameters cannot be migrated in unison? iv. Can we find the migrations that migrate the greatest number of parameters?
We use the program rowAtI to illustrate these scenarios and the development of corresponding machinery. Recall that the variations introduced for the parameters fixed, widthFunc, table, border, and i are A, B, D, E, and F, respectively. The typing pattern for this program is shown in Section 4.5 and is reproduced here for readability.
We next go through each scenario.
Scenario i: We begin with a concrete case. Assume that the programmer requires that table must be migrated and widthFunc must not be migrated. We can build a decision δ r for refining the pattern π a based on this requirement. To express that table must be migrated, we extend δ r with D.2, as D is the variation introduced for table. For widthFunc to be not migrated, we extend δ r with B.1, making δ r = {B.1, D.2}. After that, we refine π a with δ r , yielding the new pattern A E , ⊥ , E , ⊥ , which could be simplified to E , ⊥ . We can now apply the method developed in Section 5 to the pattern E , ⊥ to find the best migrations for rowAtI while honoring the requirements. Based on the pattern E , ⊥ , the migration result is that border, the parameter corresponds to E, cannot be migrated, and all other parameters can be migrated. Overall, the migration is that we can migrate fixed, i, and table.
In general, for a program and its typing pattern π generated from MGSM, we follow the following steps to handle this scenario.
1. For each parameter that must be migrated, we extend δ r with d.2, where d is the variation introduced for the parameter. 2. For each parameter that must not be migrated, we extend δ r with d.1, where d is the variation introduced for the parameter. 3. We refine the pattern π with δ r . 4. With the resulting pattern from the last step, we use the method for finding most static migrations outlined in Section 5.2 to find desired migrations.
Scenario ii: Assume that the programmer requires to migrate all fixed, widthFunc, and table. According to the process of calculating δ r given earlier, δ r = {A.2, B.2, D.2}. We observe that π a δ r = ⊥, indicating that not all these parameters can be migrated at the same time. However, the ⊥ does not indicate that none of the parameters can be migrated.
To figure out if a parameter within the specified set could be migrated, we could list all decisions yielding best migrations and check if the parameter appears in any set. For example, based on Section 5.2, the decisions corresponding to best migrations for rowAtI are {A.2, B.1, D.2, E.1, F.2} and {A.1, B.2, D.2, E.1, F.2}. From the first set, we could decide that fixed (since fixed corresponds to A and A.2 belongs to the set) and table of the desired set could be migrated. From the second set, we could decide that widthFunc and table could be migrated. In this case, we have two different such sets. In other cases, we may have only one such set. For example, if the programmer indicated that they wanted to migrate fixed and border, then the unique migration corresponds to the decision is {A.2, B.1, D.2, E.1, F.2}, indicating that only fixed within the two parameters could be migrated.
Scenario iii: During program migration, it is quite common that migrating one parameter may preclude the migration of others. For example, in rowAtI, we could not migrate widthFunc if we have migrated fixed and vice versa. Therefore, presenting the unison parameters that could no longer be migrated can be useful to programmers.
Assume that the programmer has migrated fixed and that we want to calculate the impact it has on other parameters. We must now consider two cases. The first case migrates fixed, and the decision is δ r = {A.2}. The second case does not migrate fixed, and the decision is δ ¬r = {A.1}. Let π r and π ¬r denote the typing patterns resulted from selecting π a with δ r and δ ¬r , respectively, we have In the first case, from π r , we have two decisions that lead to ⊥: {B.1, E.2} and {B.2}. In the second case, from π ¬r , only one decision leads to ⊥: {E.2}. By comparing the decisions in these two cases, we observe that both cases contain E.2. This implies that migrating border, the parameter corresponding to E, always causes an error, meaning that fixed being migrated was irrelevant to the reason border cannot be migrated. On the other hand, only a decision in the first case contains B.2 while none in the second case contains it. This implies that the reason widthFunc cannot be migrated is because fixed was migrated. Consequently, the parameter that cannot be migrated in unison with fixed is widthFunc.
Given an expression e and π for its MGSM typing, and assume the parameter x is migrated and the introduced variation for x is d, the following steps list the process of finding parameters that cannot be migrated due to the migration of x.
1. Let π r = π d.1 and π ¬r = π d.2 . 2. Collect the decisions that produce ⊥ when selecting π with π r . 3. Collect the decisions that produce ⊥ when selecting π with π ¬r . 4. For any d , if d .2 appears in some decisions from step (3) but not from any of decision in step (2), then the parameter that corresponds to d cannot be migrated in unison with x.
Scenario iv: This scenario aims to find out the migrations that migrate the greatest number of parameters, which we refer to as maximal migrations. For example, if one most static migration migrates two parameters while another migrates four, then the latter is a maximal migration if no other migrations migrate more than four parameters. In some situation, maximal migrations are not unique. For example, two most static migrations for rowAtI migrate three parameters and both are maximal.
Given an expression and its typing pattern π for its MGSM, a simple process to find maximal migrations is generate all best migrations from π and filter out the migrations that migrate the greatest number of parameters.
This process is straightforward and necessitates no changes to our existing machinery, but is computationally expensive. We can improve the efficiency by slightly adapting the ve function for collecting best migrations from Section 5.2. Specifically, for each internal node of the typing pattern, we compare the cardinality of the decisions from the left and right subtrees and discard the decisions that have more left selectors, which are selectors of the form d.1 for some d (see Section 2.2). We express this idea in the following function mve.
In the definition, δ| 1 (introduced in Section 4.5) returns all left selectors in δ. The notation lmve[0] returns any member from the set lmve. This is valid because all of the members in lmve include the same number of left selectors, and so do those in rmve. The set D (introduced in Section 5.2) contains all variations introduced in typing e. Note, given a decision δ, if d.1 / ∈ δ then the parameter corresponding to d cannot be migrated. Therefore, |D| − |lmve[0]| 1 | gives the number of parameters that can be migrated in lmve [0]. mve is always more efficient than ve since the former keeps the set of decisions that yield maximal migrations only while the latter keeps all best migrations. In particular, if there is a unique maximal migration, then mve returns only one decision.
Discussion. Supporting these scenarios by reusing or slightly adapting existing machinery demonstrates the generality of our approach. We can also support variations or combinations of scenarios we looked at with ease. For example, a combination of scenarios i and iv could be supported by following the first three steps outlined in Scenario i and then applying the mve function to the resulting pattern. As another example, we may be interested in the scenario of finding the maximal migration within a given set of parameters. To support this scenario, we first select the typing pattern of the MGSM typing with selectors of the form d.1, where d corresponds to a parameter that does not belong to the given set. The selection result is a pattern, to which we apply mve to find the maximal migration within that parameter set. Overall, the generality of our approach demonstrates that it could be a useful foundation for developing more complex and significant migration supports in practice.

Evaluation
This section evaluates the performance of migrational typing. For this purpose, we have implemented a prototype in Haskell. The prototype implements the techniques developed in this paper. Besides the features presented in Sections 4.1 and 9.1, the prototype also supports recursive functions, a built-in list type, a built-in Vector type, and a tuple type, which are needed to encode the examples described below.
To evaluate the performance of our idea in practice, we have converted programs in Grift to the language supported by our prototype. We used all the programs from Kuhlenschmidt et al. (2019) except the program sieve, which uses recursive types that are not supported in our prototype. Since these converted programs are all well-typed, we seed errors in the programs by randomly applying between 2 and 25 changes in each. Each change replaces one leaf of the AST (a variable reference or constant) with another leaf. These programs are summarized in columns 2-5 of Figure 15, showing size in lines of non-blank code, number of functions, number of dynamic parameters, and the number of leaves that were changed.
For each evaluated program, we compared the runtime of migrational typing with standard gradual typing and with a brute-force strategy for most static migration for the program, shown in columns 7 through 9 of the table. In standard gradual typing, we run our implementation without creating any variations. We also report the number of most static migrations in column "# Best," computed using the method in Section 5.2. The time for gradual typing can be considered a baseline-this is the time to simply type the given program. The time for the brute-force strategy represents a naive approach to migrational typing, generating 2 n variants of a program with n dynamic parameters, and gradually typing all of them. In Section 1.1, we discussed that an exploration of all programs are needed Fig. 16. Relations between ratios of typed parameters and migrational typing times for the nbody benchmark.
to find best migrations. We omit the time for computing the most static migrations from the figure because the time is always within 0.04 s.
We observe that the brute-force approach, as expected, is exponentially slower than gradual typing, and it successfully types only the programs that have fewer than 10 parameters. On the other hand, migrational typing scales linearly with the size of the program and exhibits only a 2-3.5 times overhead over gradual typing.
We have also investigated the impact of the ratio of typed parameters on migrational typing time, and we presented the results in Figure 16. Note that the x-axis cuts off at 93% because, as we made random changes to the program, not all parameters can be given static types. In general, a higher ratio of typed parameters leads to fewer variations being created and thus takes shorter time for migrational typing to finish. Tansey & Tilevich (2008) studied the problem of automatically upgrading annotations (such as types and access modifiers in Java) in legacy applications in response to the upgrading of, for example, testing frameworks and libraries. This is similar to our work in that it tackles the problem of migrating programs to a new version by changing annotations in the program. Their methodology is quite different; however, in that it needs two example programs illustrating how annotations change between framework versions, so that their inference rules can learn the changes made in the examples. In contrast, our approach only needs to reason about how type annotations affect the typing of the program, so migrating annotations requires only information attainable through the type system. Moreover, the kind of migrations are orthogonal. Their goal is to upgrade an entire codebase automatically to use a new framework, which means that they have one endpoint. Migrational typing presents all of the ways a programmer might want to change the types of their program by adjusting annotations, meaning that there are multiple endpoints.

Annotation upgrading and migratory typing
Migratory typing (Tobin-Hochstadt et al., 2017) provides another approach to migrating dynamically typed code to statically typed code by creating a statically typed sister language that interfaces seamlessly with the dynamically typed language. In general the focus of this work is about designing such a sister language such that types can be assigned to existing programs in the dynamic language with minimal refactoring. While programmers have to manually add type annotations to make programs more static in migratory typing, migrational typing supports systematically typing the whole migration space and automatically finding the best migrations.
This means that a large focus of migratory typing is orthogonal to our work in that we assume we are working within a given gradual language and that we do not have to design a static sister language to a dynamic language. On the other hand, if we were given a static language and gradualized it via the idea of Garcia et al. (2016), Cimini & Siek (2016 we conjecture we could design a migration tool for gradualized languages that supported unification based type inference.

Gradual typing migration
As discussed in Section 1.3, this work is closely related to the work by Migeed & Palsberg (2019) on finding maximal migrations for gradual programs. There are several similarities in their work and ours. For example, they consider a set of possible migrations for a gradually typed programs and try to find all of the maximal migrations. These maximal migrations are migrations that cannot add any more type information to the program without causing a static type error, which are similar to our most static migrations. They show that the process of finding maximal migrations is NP hard.
Their work has some notable differences with our work, however. Mainly, the language they consider is a version of GTLC (Siek et al., 2015) with the ability to add Bool and Int annotations. In contrast, we start with ITGL, a gradualized version of the Hindley-Milner language, which has a principal type inference that works on unannotated terms. Essentially, while both work aims to find maximal migrations, they use different techniques and criteria. In their work, they continuously generate more precise programs by replacing a with a more precise type and tests the well-typedness of the generated program. They find a maximal migration if no more s exist or no more could be replaced with any more precise type. A migration in our work is maximal if no further can be eliminated with respect to ITGL Garcia & Cimini (2015) constraint solving. As a result, their approach may find types that are rejected by the ITGL inference that we adapt. For example, for λx : .x (succ (x True)), their approach infers that x can be given the type → Int, whereas our approach respects ITGL, which considers the use of x to be ill-typed (Our extension in Section 9.2 does infer that x may be migrated to the type → Int).
Finally, we have evaluated the efficiency of our approach on large programs, and we observed that finding all best migrations in our approach is usually within a factor of 2 of typing each possible migration. The efficiency in their approach is unclear. It would be interesting as future work to see if our machinery could be exploited to improve the efficiency of their work. Phipps-Costin et al. (2021) developed a framework named TypeWhich for migrating gradual types. While both our work and the work by Migeed & Palsberg (2019) aim at maximizing type precision during migration, TypeWhich allows users to consider not only type precision, but also type safety (such that migration does not introduce runtime errors) and type compatibility (such that migration does not break the interoperability between migrated and un-migrated code). As such, some migrations in our work and that by Migeed & Palsberg (2019) may introduce dynamic runtime errors, but not in the safety or compatibility mode of TypeWhich. The latter two modes are particularly useful because migrations are often not done for the whole project and the migration process should not break code interactions.
In addition, our work and TypeWhich differ in many aspects. First, our work can find all best migrations for a given program whereas TypeWhich finds just one best migration by default. Further migrations may be returned by TypeWhich through model extraction by the underlying SMT solver. While returning the first model is efficient, the complexity of extracting all migrations is unclear.
Consider, for example, the following expression.
width fixed widthFunc = 2 + (if fixed then widthFunc fixed else widthFunc 33) TypeWhich displays the following migration first for this function when prioritizing type precision.
width (fixed:Int) (widthFunc:Int -> Int) = 2 + (if (fixed: ) then widthFunc fixed else widthFunc 33) Our work finds two best migrations for the function width, and neither is more precise than the other. In the first migration, the type for fixed remains to be whereas the type for widthFunc is Int-> Int, as shown below.
width (fixed: ) (widthFunc:Int -> Int) = 2 + (if fixed then widthFunc fixed else widthFunc 33) In the second migration, the type for fixed is migrated to Bool and the type for widthFunc is migrated to -> Int (without the extension in Section 9.2 the type for widthFunc will remain ). The migrated program is shown below.
width (fixed:Bool) (widthFunc: -> Int) = 2 + (if fixed then widthFunc fixed else widthFunc 33) For programs that cannot be fully statically typed, it is likely that hundreds of best migrations exist. Our approach finds all of them in time linear to the size of the program. Since our approach may find a large number of best migrations simultaneously, it is helpful to allow users to specify preferences about where migrations are desired. We support them through extensions in Section 9.3. Since TypeWhich finds best migrations sequentially, developing such supports could be harder.
Second, by design, TypeWhich may ascribe a type to a subexpression even though the subexpression has a static type during static type checking. This design allows more parameters to be migrated when precision is maximized. For example, in the migration for width above, TypeWhich ascribed to fixed that has the type Int for the condition so that fixed can be used where a Bool is needed. Without the ascription, the migrated program is statically ill-typed. Our approach does not use ascription for maximizing migrations. Third, our approach supports polymorphism through let (Section 9.1) while TypeWhich does not. Henglein & Rehof (1995) developed an approach for embedding Scheme programs in ML by inserting coercions into subexpressions whose type correctness cannot be statically verified. Their approach uses type inference to reduce coercions that will be inserted, making it behave similarly to TypeWhich that prioritizes type safety. Technically, their approach is more involved. Given a program, it collects typing constraints, builds a simple value flow graph, and decides where coercions are needed.

Relation to gradual typing
Work on gradual typing can be broadly defined along three dimensions. The first investigates the integration of gradual typing with advanced typing features, such as objects (Siek & Taha, 2007), ownership types (Sergey & Clarke, 2012), refinement types (Wadler & Findler, 2009;Lehmann & Tanter, 2017;Jafery & Dunfield, 2017;Williams et al., 2018), session types (Igarashi et al., 2017), and union and intersection types (Siek & Tobin-Hochstadt, 2016;Castagna & Lanvin, 2017;Toro & Tanter, 2017;Castagna et al., 2019). From this perspective, our type system studies the combination of variational types with gradual types. Gradual languages with type inference (Siek & Vachharajani, 2008;Rastogi et al., 2012;Garcia & Cimini, 2015) were a large influence on migrational typing. While ITGL was used as the basis for formalizing our type system, we expect that our approach can be extended to handle other features in this line of work. The reason is that the idea and manipulation of variations is orthogonal to other type system features. In particular, the idea of type compatibility in Section 4.2 and the handling of type errors in Section 4.3 can be easily extended.
The second dimension studies runtime error localization and performance issues with sound gradual typing. The blame calculus (Tobin-Hochstadt & Felleisen, 2006;Wadler & Findler, 2007, 2009) adapts the contract system notion of blame so that less precise parts of a program are blamed when cast errors occur. Ahmed et al. (2011Ahmed et al. ( , 2017 extended that work to further handle polymorphic types. Since those works, there has been a number of papers involving parametricity in the gradually typed setting (Toro et al., 2019;New et al., 2019). Takikawa et al. (2016) showed that sound gradually typed languages suffer from performance issues as more interactions between static code and dynamic code leads to frequent value casts. Confined Gradual Typing (Allende et al., 2014) provides constructs to control the flow of values between static and dynamic code, mitigating performance issues and making gradual typing more predictable.
The final dimension studies the production of gradual type systems from specifications of static type systems. For example, Garcia et al. (2016) presented a way to create gradual type systems from static ones using techniques from abstract interpretation. The Gradualizer (Cimini & Siek, 2016 can produce a gradual type system and dynamic semantics for a statically typed language given its formal semantics. It is thus interesting to investigate how these approaches interact with variations in the future. Siek et al. (2015) discussed the criteria for gradual typing. We employed the criteria of the underlying ITGL to prove Theorem 7.

Type inference
The goal of gradual typing is to find out what parameters can be given static types. As such, gradual typing is closely related to the idea of type inference.
Gradual type inference with flow-based typing (Rastogi et al., 2012) has been explored to make programs in dynamic object-oriented languages more performant. Since our work is formalized on ITGL, our work inherits the relations between ITGL and flow-based inference (Garcia & Cimini, 2015). Additionally, while flow-based inference ensures that inferred type annotations do not cause runtime errors, our current formalization does not have this property as our approach is not flow-directed.
The inference in flow (Chaudhuri et al., 2017) is also flow-based and was specifically designed to not produce false positives for idioms that are commonly used in JavaScript. It is possible that migrational typing can help the inference process for languages like JavaScript by using variations to reason about idioms in messy scenarios. A flow-based inference was also employed over Reticulated Python's cast inserted transient translation. The inference was used to optimize program performance, removing unnecessary casts where the inference indicated that it was safe.
A few type systems, such as Guha et al. (2011), Chugh et al. (2012, Pearce (2013), support flow-based reasoning but do not perform type inference.
SimTyper, developed by Kazerounian et al. (2021), aims to infer usable types for Ruby. Unlike most type inference algorithms, the goal of SimTyper is not to infer most general (precise) types, which could be verbose and hard to use in presence of subtyping, structural types, overloading, and other dynamic language features. Instead. the goal of SimTyper is to infer usable types that programmers often write. SimTyper is built on InferDL Kazerounian et al. (2020), a heuristics-based type inference algorithm, and a type equality prediction method based on machine learning. Essentially, when SimTyper discovers an overly general, complicated type, it uses the type equality predictor to find a type that is more concise and is equal. SimTyper then uses that more concise type to replace the complicated one and check if that replacement violates any typing constraint. It accepts the concise type if no violations detected and rejects the type and look for another concise type otherwise. Wei et al. (2020) developed LambdaNet for inferring types for TypeScript. Given a program, LambdaNet first transforms it to a type dependency graph, where nodes are type variables for subexpressions in the programs and hyperedges express constraints (such as the subtyping relation or type equality). Hyperedges may also provide hints to type inference, such as variables giving rise to the connected type variables have similar names. All type variables are then converted to vectors of numbers (known as embedding in machine learning) and, LambdaNet uses a set of rules to propagate type information across the dependency graphs. These rules manipulate the embedding in each node. As with deep learning (Neocleous & Schizas, 2002), the intuitions behind such rules are unclear. Finally, after propagation completes, inferred types are readout from embeddings.

Variational typing and others
This work reuses much machinery from variational typing (Chen et al., 2012(Chen et al., , 2014 to support reuse when typing the whole migration space. Thus, migrational typing can be viewed as an application of variational typing. Variational typing has been employed to improve type inference of generalized algebraic data types , which uses variation types to represent potentially many types for a single expression. Variational typing has also been used to improve error locating in functional programs using counter-factual typing (CFT) (Chen & Erwig, 2014a,b). Both migrational typing and CFT use variational types to efficiently explore a large number of hypothetical situations. A technical difference between CFT and migrational typing is that CFT tries to find a minimal change that would make an ill-typed program type correct. In contrast, migrational typing tries to remove annotations from as many parameters as possible. The process of extracting the maximum change for migrational typing (as described in Section 5.2) is well defined while finding the minimum change in CFT has to rely on heuristics due to the nature of type error debugging. Another difference is that migrational typing considers the interaction between variational types and gradual types. The idea of using pattern-constrained judgments in Section 4.3 yields a declarative specification for handling type errors, while previous applications of variational typing have had to explicitly track the introduction and propagation of type errors.
The variational cost analysis by Campora et al. (2018b) provided an approach that harmonizes type safety and gradual typing performance. The motivation of that work was that migrating programs will likely slowdown program performance. The solution in that work was constructing a "cost lattice" that estimates the runtime overhead induced by type annotations and comparing costs of different migrations. The solution supports different migration scenarios while adding type annotations, for example finding the migrations that yield the best performance. Technically, that work adapted cost analysis for functional programs (Danner et al., 2015) to a gradually typed language. That work also used the machinery of variational typing to reusing typing and cost analysis to efficiently build the cost lattice.
It is possible that type annotations added by programmers during migrations may cause runtime type errors. Campora & Chen (2020) presented a static type system for detecting runtime type errors, finding out the s that prevent the runtime type errors from being detected by the static type system, and suggesting fixes to remove dynamic runtime type errors.
Typing patterns in our work have a close resemblance to BDD (Binary Decision Diagrams) of Boolean formulas (Akers, 1978;Bryant, 1992). For example, choices in patterns correspond to internal nodes in BDD, ⊥ and correspond to leaves 0 and 1 in BDDs, respectively, and selecting the right alternative of a choice corresponds to following the high edge of an internal node. Moreover, the idea of pattern normal forms, introduced before Theorem 9, are similar to reduced BDDs. Variable ordering has a significant impact on the size of a BDD. The number of nodes of a BDD may be linear to the number of variables under one ordering but it could be exponential under another. Similarly, the ordering of choice names impact the size of a typing pattern. For example, the pattern A ⊥, B , C ⊥, has three internal nodes and four leaves, while an equivalent pattern C A ⊥, B , ⊥ , A ⊥, has four internal nodes and five leaves. Due to the reasons below, we conjecture that the ordering problem in our work is not as critical as in BDDs. First, the ordering problem becomes more conspicuous when the leaves mix ⊥s and s. Instead, due to the fact that left alternatives of choices have s when they are created and s unify with any types, left subtrees of patterns tend to have s. Section 5.1 gives a formal account of this. For such patterns, the impact of ordering on sizes decreases. For example, A , B , C , ⊥ has seven nodes, and C , A , B , ⊥ , an equivalent pattern but with different ordering, also has seven nodes. Second, as explained in Section 5.2 (the last second paragraph), typing patterns are usually small, this makes the ordering less important, as even a suboptimal ordering will not cause the pattern to have too many nodes.

Conclusion
We have presented migrational typing, a type system that allows programs in an implicitly typed gradual language to be assigned a new type based on the possible removals of dynamic type annotations in the original program. Migrational typing solves an important unaddressed problem in gradual typing, namely having a safe and efficient way to move around in the possible dynamic-static typing space for a program. It achieves this by conceptually typing the whole migration space, marking where type errors occur so that it can safely present the possible migrations for the program. We have shown that the system can infer the most static possible types that can be assigned to a program and that this process can be constrained according to user-defined criteria. Moreover, the migrational type system is sound and complete with respect to removing dynamic annotations in ITGL, and its constraint generation and unification algorithms are sound and complete.
We have also shown that this approach is scalable, performing nearly exponentially better than the brute-force approach of generating and typing the migration space separately. Later, we showed that migrational typing can be adapted to statically reason about the number of dynamic casts that will be generated by different points in the migration space so that we can support migration scenarios that consider programmers' typing goals and performance goals (Campora et al., 2018b). In future work, we plan to see if we can adapt migrational typing to work with a non-unification based inference. This will allow it to analyze gradual languages with object-oriented features like Reticulated Python or TypeScript with greater ease. We also plan to explore whether migrational typing can be adapted to provided an analysis of the runtime safety of casts in gradual programs.