1. Introduction
We consider solutions
$q\colon {{\mathbb {R}}}\times {{\mathbb {R}}}\rightarrow \mathbb {C}$
of the nonlinear Schrödinger equation

and the (complex Hirota) modified Korteweg–de Vries equation

with initial data
$q(0)\in H^s({{\mathbb {R}}})$
. The upper choice of signs yields the defocusing cases of these equations, while the lower signs correspond to the focusing cases. In this paper, the symbols
$\pm $
and
$\mp $
will only be used in the context of this dichotomy. By restricting (mKdV) to the case of real initial data, we recover the classical mKdV equation of Miura [Reference Miura46]:

To treat both the defocusing and focusing versions of (NLS) and (mKdV) within the same framework, throughout this paper, we adopt the notation

With this convention, both (NLS) and (mKdV) are Hamiltonian equations with respect to the following Poisson structure on Schwartz space: Given
$F,G\colon \mathcal {S}\rightarrow \mathbb {C}$
,

where our notation for functional derivatives is the classical one (see (2.2)). Correspondingly, any Hamiltonian
$H\colon \mathcal {S}\rightarrow {{\mathbb {R}}}$
generates a flow, which we denote by
$e^{tJ\nabla H}$
, via the equation

In particular, since Hamiltonians are real-valued, the relations
$q=\pm \bar r$
are preserved by any such flow.
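For concreteness, the equations under discussion can be written as follows in one commonly used normalization. The coefficients $2$ and $6$ are assumptions made here for illustration (chosen so that the upper sign is defocusing and so that real-valued data give the classical form of Miura's equation); other normalizations differ by a rescaling.

```latex
\begin{align*}
i\,\partial_t q &= -q'' \pm 2|q|^2 q, \tag{NLS}\\
\partial_t q &= -q''' \pm 6|q|^2 q', \tag{mKdV}\\
\partial_t q &= -q''' \pm 6 q^2 q', \qquad q \text{ real-valued}, \tag{mKdV$_{\mathbb{R}}$}\\
r &:= \pm \bar q .
\end{align*}
```

In this normalization, the cubic terms become $2q^2r$ and $6qrq'$, respectively, so that both sign choices are covered by a single formula.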
With these conventions, the equations (NLS) and (mKdV) are the Hamiltonian flows associated to

respectively. Two other important Hamiltonians are the mass and momentum,

which generate phase rotations and spatial translations, respectively. While our names for the basic conserved quantities agree with the usual parlance in the defocusing case, their signs are reversed in the focusing case; in particular, the mass becomes negative definite. However, this sign change is offset by a corresponding sign change in the Poisson structure, so the dynamics remain those given in (NLS) and (mKdV).
All four functions M, P,
$H_{\mathrm {NLS}}$
, and
$H_{\mathrm {mKdV}}$
Poisson commute. While commutation with M and P merely represents gauge and translation invariance, the commutativity of
$H_{\mathrm {NLS}}$
and
$H_{\mathrm {mKdV}}$
is surprising and a first sign of a very profound property of these equations: they are completely integrable.
One expression of this complete integrability is the existence of an infinite family of commuting flows. Taken together, these form the AKNS–ZS hierarchy. This name honors the authors of the seminal papers [Reference Ablowitz, Kaup, Newell and Segur1, Reference Zakharov and Shabat56]. For an authoritative introduction to this hierarchy, with particular attention to the Hamiltonian structure, we recommend [Reference Faddeev and Takhtajan16].
The odd and even numbered Hamiltonian flows in the AKNS–ZS hierarchy behave differently under
$(q,r)\mapsto (\bar q,\bar r)$
. In particular, conjugation acts as a time-reversal operator for M and
$H_{\mathrm {NLS}}$
but leaves the P and
$H_{\mathrm {mKdV}}$
flows unchanged. This leads to a number of significant differences in our treatment of (NLS) and (mKdV).
As we will discuss more fully below, it has been known for a long time that both (NLS) and (mKdV) are globally well-posed for sufficiently regular initial data. In fact, the question of what constitutes sufficiently regular initial data has occupied several generations of researchers. We are now able to give a definitive answer:
Theorem 1.1 (Global well-posedness of the NLS and mKdV).
Let
$s>-\frac 12$
. Then the equations (NLS) and (mKdV) are globally well-posed for all initial data in
$H^s({{\mathbb {R}}})$
in the sense that the solution map
$\Phi $
extends uniquely from Schwartz space to a jointly continuous map
$\Phi \colon {{\mathbb {R}}}\times H^s({{\mathbb {R}}})\rightarrow H^s({{\mathbb {R}}})$
.
Here, we are evidently taking the well-posedness of (NLS) and (mKdV) on Schwartz space for granted. This has been known for a long time [Reference Kato30, Reference Tsutsumi54].
The threshold
$s=-\tfrac 12$
appearing in Theorem 1.1 is both sharp and necessarily excluded. It is also the scaling-critical regularity. Indeed, each evolution in the AKNS–ZS hierarchy admits a scaling symmetry of the form

where m denotes the ordinal position of the Hamiltonian. For example,
$m=0$
for M, while (NLS) corresponds to
$m=2$
and (mKdV) to
$m=3$
.
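To see why $s=-\tfrac12$ is scaling-critical, one can compute how the homogeneous Sobolev norm transforms. The sketch below assumes that the scaling symmetry takes the form $q_\lambda(t,x)=\lambda\,q(\lambda^m t,\lambda x)$; this is the usual form for this hierarchy, but it is stated here as an assumption.

```latex
\begin{align*}
q_\lambda(t,x) &= \lambda\, q(\lambda^m t, \lambda x), \qquad \lambda>0,\\
\widehat{q_\lambda}(t,\xi) &= \widehat{q}(\lambda^m t, \xi/\lambda),\\
\|q_\lambda(t)\|_{\dot H^s}^2
  &= \int |\xi|^{2s}\,\bigl|\widehat q(\lambda^m t,\xi/\lambda)\bigr|^2\,d\xi
   = \lambda^{2s+1}\,\|q(\lambda^m t)\|_{\dot H^s}^2 .
\end{align*}
```

The exponent $2s+1$ vanishes exactly when $s=-\tfrac12$, independently of $m$, which is why the same critical regularity is shared by every member of the hierarchy.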
While a great many dispersive equations have recently been shown to be well-posed at the scaling-critical regularity, this fails for (NLS) and (mKdV). In fact, one has instantaneous norm inflation: For every
$s\leq -\frac 12$
and
$\varepsilon>0$
, there is a Schwartz solution
$q(t)$
to (NLS) satisfying

This was shown for (NLS) in [Reference Christ, Colliander and Tao11, Reference Kishimoto40, Reference Oh48]. In Appendix A, we revisit this work, giving a simplified presentation and showing that the same norm inflation holds also for (mKdV), as well as other members of the hierarchy. This ill-posedness effect does not seem to have been noticed before.
This norm inflation argument does not extend to (mKdV$_{\mathbb {R}}$). Nevertheless, in the appendix, we show (seemingly for the first time) that a slightly weaker form of ill-posedness holds in the focusing case (see Proposition A.3). Previously, [Reference Birnir, Ponce and Svanstedt2] showed that the data-to-solution map cannot be extended continuously to the delta-function initial data in the focusing case. The analogous assertion for NLS (both focusing and defocusing) was proved in [Reference Kenig, Ponce and Vega35].
Let us turn our attention to the existing well-posedness theory. The advent of Strichartz estimates [Reference Strichartz53] had a transformative effect on the study of nonlinear dispersive equations. These estimates provide an elegant and efficient expression of the dispersive effect and allowed researchers to pass beyond the regularity required to make sense of the nonlinearity pointwise in time. In [Reference Tsutsumi55], Tsutsumi used this new tool to prove global well-posedness of (NLS) in
$L^2({{\mathbb {R}}})$
.
We know of no further progress in the scale of
$H^s$
spaces since that time. Here is one reason: No ingenious harmonic analysis estimate, nor clever choice of metric, can reduce matters to a contraction mapping argument. Such constructions lead to solutions that depend analytically on the initial data; however, in [Reference Christ, Colliander and Tao10, Reference Christ, Colliander and Tao11, Reference Kenig, Ponce and Vega35], it is shown that the data-to-solution map cannot even be uniformly continuous on bounded subsets of
$H^{s}({{\mathbb {R}}})$
when
$s<0$
.
Due to the derivative in the nonlinearity, Strichartz estimates alone do not suffice to understand the behavior of (mKdV). By bringing in local-smoothing and maximal-function estimates, Kenig–Ponce–Vega [Reference Kenig, Ponce and Vega33] were able to prove that (mKdV) is locally well-posed in
$H^s({{\mathbb {R}}})$
for all
$s\geq \frac 14$
. The solution they construct depends analytically on the initial data. Moreover, the threshold
$s=\tfrac 14$
is sharp if one seeks solutions that depend uniformly continuously on the initial data. This was shown in [Reference Christ, Colliander and Tao10, Reference Kenig, Ponce and Vega35]. In the case of (NLS), the critical threshold for analytic well-posedness coincides with an exact conservation law, namely, that of
$M(q)$
. Thus, Tsutsumi’s result is automatically global in time [Reference Tsutsumi55]. Due to the absence of any obvious conservation law at regularity
$s=\frac 14$
, it was unclear at that time whether the Kenig–Ponce–Vega solutions to (mKdV) are, in fact, global in time. This was subsequently shown for (mKdV$_{\mathbb {R}}$) through the construction of suitable almost conserved quantities. For
$s>\frac 14$
, this was proved by Colliander–Keel–Staffilani–Takaoka–Tao [Reference Colliander, Keel, Staffilani, Takaoka and Tao15] with the endpoint added later by Guo and Kishimoto [Reference Guo24, Reference Kishimoto39].
With the exact threshold for analytic (or even uniformly continuous) dependence settled, the question immediately arises as to what happens at lower regularity: What lies in the sizable gap remaining between these well-posedness results and the known breakdown of continuity at
$s=-\frac 12$
? This gap corresponds to regularities
$-\frac 12<s<0$
for (NLS) and
$-\frac 12<s<\frac 14$
for (mKdV).
For typical Schrödinger equations in
${{\mathbb {R}}}^d$
with polynomial nonlinearities, there is no gap between analytic local well-posedness and the onset of ill-posedness [Reference Christ, Colliander and Tao11]. Thus, it is all the more remarkable to discover a region of nonperturbative well-posedness in this setting. This phenomenon appears to be a distinctive feature of completely integrable systems, and investigating it necessitates methods that take advantage of this integrability.
A natural first step toward understanding solutions in this delicate region is to seek a priori
$H^s$
bounds. While boundedness of solutions would obviously follow from well-posedness, proving boundedness is typically a first step. It is also the principal challenge in the construction of weak solutions. On the other hand, showing impossibility of such bounds would give ill-posedness.
Early successes in this direction include [Reference Christ, Colliander and Tao13, Reference Koch and Tataru41, Reference Koch and Tataru42] for (NLS) and [Reference Christ, Holmer and Tataru14] for (mKdV). Recently, the definitive result in this direction was obtained in [Reference Killip, Vişan and Zhang38, Reference Koch and Tataru43], where exact conservation laws were constructed that control the
$H^s$
norm of solutions all the way down to
$s>-\tfrac 12$
. Given the norm inflation discussed earlier, one cannot go any lower. The macroscopic conservation laws constructed in [Reference Killip, Vişan and Zhang38, Reference Koch and Tataru43] interact with the scaling symmetry in a useful way; indeed, this was already employed in [Reference Killip, Vişan and Zhang38] to connect differing regularities and to obtain bounds in Besov spaces. Another important consequence of this interaction is that when
$s<0$
, it guarantees equicontinuity of orbits (cf. Definition 4.5 and Proposition 4.6 below). This seems to have been first noted explicitly in [Reference Killip and Vişan37] and will play several important roles in what follows.
One example of the significance of equicontinuity is that it connects well-posedness at different regularities: If
$\sigma>s$
, then existence and uniqueness of solutions with initial data in
$H^s$
automatically guarantees the same for initial data in
$H^\sigma $
. That the
$H^s$
-solution remains in
$H^\sigma $
at later times follows from the existence of a priori bounds. However, continuity of the data-to-solution map in
$H^\sigma $
requires more; convergence at low regularity together with boundedness at higher regularity does not guarantee convergence at the higher regularity. Equicontinuity in
$H^\sigma $
is the simple necessary and sufficient condition for convergence in
$H^\sigma $
under these circumstances.
There are two further aspects of the history we wish to discuss before describing the methods we employ: well-posedness results outside the scale of
$H^s$
spaces and for these partial differential equations (PDEs) posed on the torus.
By working in Fourier–Lebesgue and modulation spaces, several researchers succeeded in studying well-posedness questions outside the scale of
$H^s$
spaces. For (NLS), for example, analytic local well-posedness was shown in almost-critical spaces by Grünrock [Reference Grünrock19] and Guo [Reference Guo23]. For (mKdV), analogous almost-critical results in Fourier–Lebesgue spaces were obtained in [Reference Grünrock18, Reference Grünrock and Vega22]. The threshold for analytic well-posedness of (mKdV) in modulation spaces was determined in [Reference Chen and Guo8, Reference Oh and Wang50]; however, this still does not coincide with scaling criticality.
Each of the three types of spaces (Fourier–Lebesgue, modulation, and Sobolev) has a very different character; nevertheless, each of the spaces just described can be enveloped by
$H^s$
provided one takes
$s>-\frac 12$
sufficiently close to
$-\frac 12$
. Conversely, both Fourier–Lebesgue and modulation spaces suppress high frequencies more strongly than negative regularity
$H^s$
spaces; this substantially reduces the dangers of high-high-low interactions, which are the dominant source of instability in these models.
We are not aware of any global well-posedness results in Fourier–Lebesgue spaces close to criticality. However, by ingeniously exploiting the way Galilei boosts interact with the conservation laws constructed in [Reference Killip, Vişan and Zhang38], Oh and Wang [Reference Oh and Wang49] obtained global bounds in modulation spaces, which then yield global well-posedness in these spaces.
In order to construct solutions via a contraction mapping argument, one must employ an array of subtle norms expressing the dispersive effect. The question arises whether there might be other solutions that are continuous in
$H^s$
but lie outside the auxiliary space. This is the question of unconditional uniqueness, pioneered by Kato [Reference Kato31, Reference Kato32]. For the latest advances in this direction, see [Reference Guo, Kwon and Oh25, Reference Kwon, Oh and Yoon44].
We now give a quick review of what is known for (NLS) and (mKdV) posed on the circle (i.e., for periodic initial data). In the Euclidean setting, dispersion causes solutions to spread out. This is impossible on the circle: there is nowhere to spread to. Nevertheless, Bourgain [Reference Bourgain3, Reference Bourgain4] proved that select Strichartz estimates do hold (expressing a form of decoherence). As an application, these new estimates were used to prove global well-posedness of (NLS) in
$L^2(\mathbb {T})$
and local well-posedness of (mKdV) in
$H^{1/2}(\mathbb {T})$
. Global well-posedness of (mKdV$_{\mathbb {R}}$) in
$H^{1/2}(\mathbb {T})$
was subsequently proved in [Reference Colliander, Keel, Staffilani, Takaoka and Tao15]. Moreover, [Reference Christ, Colliander and Tao10] showed that these results match the threshold for analytic (or uniformly continuous) dependence on the initial data.
For (NLS) on the circle, this
$L^2$
threshold also marks the boundary for even continuous dependence on the initial data. This was shown in [Reference Burq, Gérard and Tzvetkov6, Reference Christ, Colliander and Tao12, Reference Guo and Oh26] and represents a sharp distinction from the line case. This “premature” breakdown of well-posedness is now understood as arising from an infinite phase rotation, which, in turn, suggests a suitable renormalization, namely, Wick ordering the nonlinearity. This point of view has been confirmed in [Reference Christ9, Reference Grünrock and Herr21, Reference Oh and Wang49], where Wick-ordered NLS is shown to be globally well-posed in (almost-critical) Fourier–Lebesgue spaces where the traditional (NLS) is ill-posed.
For (mKdV$_{\mathbb {R}}$) on the circle,
$H^{1/2}$
is not the threshold for continuous dependence. In [Reference Kappeler and Topalov29], Kappeler and Topalov proved well-posedness in
$L^2(\mathbb {T})$
; this was shown to be sharp by Molinet [Reference Molinet47]. By renormalizing the nonlinearity (to remove an infinite transport term), well-posedness was then shown in [Reference Kappeler and Molnar28] for a larger Fourier–Lebesgue class of initial data (see also [Reference Schippa52]). The recent work [Reference Chapouto7] dramatically clarifies the situation regarding the full complex equation (mKdV): It is shown that
$H^{1/2}$
is the threshold for continuous dependence in this setting; moreover, it is shown that to go below this threshold (even in Fourier–Lebesgue spaces), a second renormalization is required.
Given the known thresholds for continuous dependence on the circle, the proof of Theorem 1.1 must employ some property of our equations that distinguishes the line and the circle cases! This will be the local smoothing effect, that is, a gain of regularity locally in space on average in time. This constitutes a significant point of departure from [Reference Killip and Vişan37], where the arguments developed do not distinguish between the two geometries.
The local smoothing estimates that are relevant to us involve fractional numbers of derivatives. Correspondingly, some prudence is required in selecting the proper way to localize in space. We do so by choosing a fixed family of Schwartz cut-off functions

whose particular properties will allow it to be used throughout the analysis. Corresponding to this cut-off, we define local smoothing norms by

In Lemma 2.2, we will see that this norm is strong enough to control any other choice of Schwartz-class cut-off function.
The restriction of time to the interval
$[-1,1]$
in (1.6) was a rather arbitrary choice; however, we see little advantage to introducing additional time parameters. Results for alternate time intervals (or indeed other spatial intervals) can be achieved by a simple covering argument, using time- and space-translation invariance.
We are now ready to state the local smoothing estimates we prove for the solutions constructed in Theorem 1.1. As the gain in regularity differs between the two evolutions, it is easier to state our results separately:
Theorem 1.2 (Local smoothing: NLS).
Fix
$-\frac 12<s<0$
. Given initial data
$q_0\in H^s({{\mathbb {R}}})$
, the corresponding solution
$q(t)$
to NLS constructed in Theorem 1.1 satisfies

moreover,
$q_0\mapsto q(t)$
is a continuous mapping from
$H^s$
to
$X^{s+\frac 12}$
.
Theorem 1.3 (Local smoothing: mKdV).
Fix
$-\frac 12<s<\frac 12$
. The solution
$q(t)$
to mKdV with initial data
$q_0\in H^s({{\mathbb {R}}})$
constructed in Theorem 1.1 satisfies

moreover,
$q_0\mapsto q(t)$
is a continuous mapping from
$H^s$
to
$X^{s+1}$
.
Estimates of this type are well-known for the underlying linear equations and readily proven either by Fourier-analytic techniques, or by explicit monotonicity identities. In the special cases where one has a suitable microscopic conservation law, the latter technique can be adapted to nonlinear problems. Indeed, the original local smoothing effect was the case
$s=0$
of (1.8), which was proven in [Reference Kato30] by employing the microscopic conservation law

satisfied by solutions of (mKdV). The analogous microscopic conservation law for (NLS) is

which yields (1.7) with
$s=\frac 12$
.
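The monotonicity argument is easiest to see for the underlying linear equation. As a sketch (assuming the linear flow $\partial_t q=-q'''$ with real-valued Schwartz $q$, and a hypothetical smooth, bounded, increasing weight $\phi$ with bounded derivatives; the nonlinear laws carry additional terms not shown here), one has:

```latex
\begin{align*}
\partial_t\bigl(q^2\bigr) + \partial_x\bigl(2qq'' - (q')^2\bigr) &= 0,\\
\frac{d}{dt}\int \phi\, q^2\,dx &= -3\int \phi'\,(q')^2\,dx + \int \phi'''\, q^2\,dx,\\
3\int_{-1}^{1}\!\!\int \phi'\,(q')^2\,dx\,dt
  &\le \bigl(2\|\phi\|_{L^\infty} + 2\|\phi'''\|_{L^\infty}\bigr)\,\|q(0)\|_{L^2}^2 .
\end{align*}
```

The second line follows from the first by integrating by parts against $\phi$; the third integrates in time and uses conservation of the $L^2$ norm. Since $\phi'>0$, the left-hand side controls one derivative of $q$ locally in space from $L^2$ data, which is the linear analogue of the gain of one derivative at $s=0$.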
When the sought-after regularity does not match a known conservation law, local smoothing results for nonlinear PDE have traditionally been proven perturbatively, building on the corresponding estimates for the underlying linear equation. In particular, the arguments of [Reference Tsutsumi55] can be used to show that (1.7) continues to hold for
$s\geq 0$
. That (1.8) continues to hold for
$s\geq \frac 14$
was proved in [Reference Kenig, Ponce and Vega33]; indeed, there the local smoothing effect was crucial to even constructing solutions.
Due to the breakdown in uniform continuity of the data-to-solution map at low regularity, we cannot expect the nonlinear flow to be well modeled by a linear flow, and so some truly nonlinear technique is needed to prove Theorems 1.2 and 1.3. It is the discovery of a new one-parameter family of microscopic conservation laws for these equations that will allow us to achieve such low regularity. As local smoothing is a linear effect, it is surprising that the loss of uniform continuity is not accompanied by any lessening of this effect — the estimates we obtain exhibit the same derivative gain as seen for the linear equation.
As we shall see, the proof of Theorem 1.1 relies crucially on the local smoothing effect (though in a rather stronger form than presented in Theorems 1.2 and 1.3). With this in mind, it is natural to begin our discussion of the methods employed in this paper by describing how local smoothing is to be proved.
Local smoothing estimates also allow us to make better sense of the nonlinearity. Note that Theorem 1.1 already allows us to make sense of the nonlinearity taken holistically: If
$q_n$
are Schwartz solutions converging to q in
$L^\infty _t H^s$
, then directly from the equation, we see that the corresponding sequence of nonlinearities converges, for example, as spacetime distributions. By contrast, one may seek to make sense of the individual factors in the nonlinearity in a way that allows them to be multiplied; this is where local smoothing helps.
For example, our results show that for any
$s>-1/2$
, solutions of (mKdV$_{\mathbb {R}}$) with initial data in
$H^s({{\mathbb {R}}})$
belong to
$L^3_{t,x}$
on all compact regions of spacetime. Analogously, we see that solutions to (NLS) are locally
$L^3_{t,x}$
whenever
$s\geq -1/6$
.
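The numerology behind these thresholds can be checked by interpolation. In the sketch below, $g$ denotes the local smoothing gain ($g=\tfrac12$ for (NLS) and $g=1$ for (mKdV), as in Theorems 1.2 and 1.3), $f$ stands for a spatially localized solution, and the effect of the cut-off is ignored; these simplifications are assumptions of the sketch rather than part of the argument in the paper.

```latex
\begin{align*}
\|f(t)\|_{H^{s+\frac{2g}{3}}} &\le \|f(t)\|_{H^{s}}^{1/3}\,\|f(t)\|_{H^{s+g}}^{2/3},\\
\int_{-1}^{1}\|f(t)\|_{H^{s+\frac{2g}{3}}}^{3}\,dt
  &\le \|f\|_{L^\infty_t H^{s}}\,\|f\|_{L^2_t H^{s+g}}^{2},\\
H^{1/6}(\mathbb{R}) \hookrightarrow L^{3}(\mathbb{R})
  \quad&\Longrightarrow\quad
  f\in L^3_{t,x} \ \text{ provided } \ s+\tfrac{2g}{3}\ge \tfrac16 .
\end{align*}
```

For $g=\tfrac12$ the condition reads $s\ge-\tfrac16$, while for $g=1$ it reads $s\ge-\tfrac12$ and so holds throughout the range $s>-\tfrac12$.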
1.1. Outline of the proof
As we have mentioned earlier, (NLS) and (mKdV) belong to an infinite hierarchy of evolution equations whose Hamiltonians Poisson commute. Among PDEs, this phenomenology was first discovered in the case of the Korteweg–de Vries equation [Reference Gardner, Greene, Kruskal and Miura17]. And it was these discoveries that Lax [Reference Lax45] elegantly codified by introducing the Lax pair formalism (the monograph [Reference Faddeev and Takhtajan16] employs a parallel approach based around the zero-curvature condition).
As noted above, Lax pairs for (NLS) and (mKdV) were introduced in [Reference Ablowitz, Kaup, Newell and Segur1, Reference Zakharov and Shabat56]. Several different (but equivalent) choices of these operators exist in the literature. Our convention will be to use Lax operators

Here,
$\varkappa $
denotes the spectral parameter (which will always be real in this paper). The second member of the Lax pair (traditionally denoted P) can be taken to be

for (NLS) and (mKdV), respectively.
The Lax equation
$\partial _t L = [P,L]$
guarantees that the Lax operators at different times are conjugate. In the setting of finite matrices, this would guarantee that the characteristic polynomial of L is independent of time. In the case of (1.9), renormalization is required — indeed, L is not even bounded, let alone trace-class. Such a renormalization was presented in [Reference Killip, Vişan and Zhang38] based on the renormalized Fredholm determinant
$\operatorname{\mathrm{det}}_2(1+A) = \operatorname{\mathrm{det}} (1+A) e^{-\operatorname {\mathrm {tr}}(A)}$
. Concretely, it was shown in [Reference Killip, Vişan and Zhang38] that

is well-defined, conserved for Schwartz solutions, and coercive. This was the origin of the coercive macroscopic conservation laws constructed in that paper. The regularities of these laws were adjusted by integrating against a suitable measure in
$\varkappa $
.
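The finite-matrix statements in this paragraph can be checked numerically. The following is a minimal sketch, not the operators of this paper: it integrates a Lax equation $\partial_t L=[P,L]$ for a symmetric matrix, using the illustrative (assumed) choice $P(L)=L_+-L_-$ of strictly upper minus strictly lower triangular parts, and confirms that the spectrum does not move. It also compares $\log\det_2(1+A)$, computed from the definition, with the series $\sum_{\ell\ge 2}(-1)^{\ell+1}\operatorname{tr}(A^\ell)/\ell$, in which the trace term has been removed. All function names are hypothetical.

```python
import numpy as np


def toda_generator(L):
    """Antisymmetric matrix P(L): strictly upper part minus strictly lower part."""
    return np.triu(L, 1) - np.tril(L, -1)


def lax_rhs(L):
    """Right-hand side of the Lax equation dL/dt = [P, L] = P L - L P."""
    P = toda_generator(L)
    return P @ L - L @ P


def evolve_lax(L0, t_final=1.0, steps=2000):
    """Integrate the Lax equation with the classical fourth-order Runge-Kutta scheme."""
    L = L0.copy()
    dt = t_final / steps
    for _ in range(steps):
        k1 = lax_rhs(L)
        k2 = lax_rhs(L + 0.5 * dt * k1)
        k3 = lax_rhs(L + 0.5 * dt * k2)
        k4 = lax_rhs(L + dt * k3)
        L = L + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return L


def log_det2(A):
    """log det_2(1 + A), from the definition det(1 + A) * exp(-tr A)."""
    _, logdet = np.linalg.slogdet(np.eye(A.shape[0]) + A)
    return logdet - np.trace(A)


def log_det2_series(A, terms=80):
    """Same quantity from the series sum_{l >= 2} (-1)^(l+1) tr(A^l) / l."""
    total, power = 0.0, A.copy()
    for ell in range(2, terms + 1):
        power = power @ A  # power now holds A^ell
        total += (-1) ** (ell + 1) * np.trace(power) / ell
    return total


rng = np.random.default_rng(0)
G = rng.standard_normal((5, 5))

# Isospectrality: eigenvalues of L(t) should agree with those of L(0).
L0 = (G + G.T) / 2
L1 = evolve_lax(L0)
spectral_drift = np.max(np.abs(np.linalg.eigvalsh(L1) - np.linalg.eigvalsh(L0)))

# Renormalized determinant: rescale so the spectral norm is 0.3 and the series converges.
A = 0.3 * G / np.linalg.norm(G, 2)
series_gap = abs(log_det2(A) - log_det2_series(A))

print(spectral_drift, series_gap)
```

Both printed numbers should be small: the first up to the time-stepping error, the second up to rounding. In infinite dimensions the trace term need not make sense, which is the reason for discarding it in the renormalized determinant.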
Unfortunately, such macroscopic conservation laws are of no use in proving local smoothing. We need not only microscopic conservation laws but coercive microscopic conservation laws. In Section 4, we present our discovery of just such a density
$\rho $
and its attendant currents j. We feel that this is an important contribution to the much-studied algebraic theory of these hierarchies. Moreover, it is the driver of all that follows.
We do not have a systematic way of finding microscopic conservation laws attendant to the conservation of the perturbation determinant. If we compare the answer for KdV from [Reference Killip and Vişan37] with that developed in this paper, it is tempting to predict that it should always be a rational function of components of the diagonal Green’s function. However, we have also found the corresponding quantity for the Toda lattice [Reference Harrop-Griffiths, Killip and Vişan27], and, in that case, it is a transcendental function of entries in the Green’s matrix. On the other hand, the closely related one-parameter family of macroscopic conservation laws

are easily seen to admit a microscopic representation based on the diagonal of the Green’s function. The associated density
$\gamma $
turns out to be far inferior for what we need to do here. Indeed, in Lemma 4.9, we will show that, unfortunately, the current corresponding to
$\gamma $
is not adequately coercive. This undermines its utility for proving local smoothing. In principle, one could recover a
$\rho $
-like object by integrating
$\gamma $
in energy. (Of course, this need only agree with
$\rho $
up to a mean-zero function.) In fact, we pursued this approach for a long time while still seeking the true form of
$\rho $
. We can attest that this approach is extremely painful and dramatically increases the number of subtle cancellations that need to be exhibited later in the argument.
The proof of local smoothing is far and away the most lengthy and complicated part of the paper, comprising the entirety of Section 5 and employing crucially all of the preceding analysis. One reason is that we actually need a two-parameter family of estimates that go far beyond the simple a priori bounds (1.7) and (1.8). The role of the first of these two parameters is easy to explain at this time: it acts as a frequency threshold in the local smoothing norm. This refinement will allow us to prove that the high-frequency contribution to the local-smoothing norm is controlled (in a very quantitative way) by the high-frequency portion of the initial data. This is the essential ingredient in the continuity claims made in Theorems 1.2 and 1.3. (The basic question of whether such continuity holds for Kato’s original estimate [Reference Kato30] seems to have been open up until now.)
This extra frequency parameter also plays a major role in Section 6, where it is used to show that an
$H^s$
-precompact set of Schwartz-class initial data leads to a collection of solutions that is
$H^s$
-precompact at later times. In view of the equicontinuity of orbits mentioned earlier, this is a question of tightness.
As local smoothing estimates control the flow of the
$H^s$
norm through compact regions of spacetime, it is natural to attempt to employ them to prove tightness in
$H^s$
. However, it is precisely the fact that the transport of
$H^s$
norm cannot exceed the total
$H^s$
norm available that is used to prove Theorems 1.2 and 1.3; thus, these results do not provide sufficient control to yield tightness! Our tightness result relies crucially on the extra frequency parameter to demonstrate that there is little local smoothing norm residing at high frequencies and, consequently, little high-speed transport of
$H^s$
-norm.
The compactness result just enunciated guarantees the existence of weak solutions. To obtain well-posedness, we must verify uniqueness (i.e., that different subsequences do not lead to different solutions), as well as continuous dependence on the initial data. To achieve that, we will rely crucially on ideas introduced in [Reference Killip and Vişan37] and further developed in [Reference Bringmann, Killip and Visan5, Reference Killip, Murphy and Visan36].
While these papers provide a useful precedent on overall strategy, they provide no guidance on how to implement it. The first triumph of this paper is to construct the algebraic and analytic framework needed for this type of analysis in the AKNS–ZS hierarchy. We will see that even though the two equations belong to the same hierarchy, the fundamental monotonicity laws for (NLS) and (mKdV) are different; moreover, neither equation provides significant guidance in finding the numerous cancellations necessary to treat the other.
The first step in this strategy is the introduction of regularized Hamiltonians indexed by a scalar parameter
$\kappa $
. The flows induced by these Hamiltonians should (a) be readily seen to be well-posed, (b) commute with the full flows, and (c) converge to the full flows as
$\kappa \to \infty $
. Such flows are introduced in Section 4 where they are easily proven to have properties (a) and (b). That they enjoy property (c) in the desired topology, however, is highly nontrivial. This is the subject of Section 7, which is the climax of this paper.
Due to their commutativity, the problem of controlling the difference between the full and regularized flows can be reduced to controlling the evolution under the difference Hamiltonian (that is, the difference of the full and regularized Hamiltonians). In fact, this is the key insight of the commuting flow paradigm introduced in [Reference Killip and Vişan37]: instead of needing to estimate the distance between two solutions (which is rendered intractable by the breakdown of uniformly continuous dependence), one need only study a single evolution, albeit under a much more complicated flow.
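In symbols, this reduction reads as follows. Here $H_\kappa$ is used as a hypothetical name for the regularized Hamiltonian (the notation is an assumption), and $e^{tJ\nabla H}$ is the flow notation introduced earlier; the factorization uses only that the two flows commute.

```latex
\begin{align*}
e^{tJ\nabla H} &= e^{tJ\nabla (H-H_\kappa)}\circ e^{tJ\nabla H_\kappa},\\
e^{tJ\nabla H}q - e^{tJ\nabla H_\kappa}q
  &= \bigl[e^{tJ\nabla (H-H_\kappa)} - \mathrm{Id}\bigr]\bigl(e^{tJ\nabla H_\kappa}q\bigr).
\end{align*}
```

It therefore suffices to show that the difference flow is close to the identity along the orbit of the regularized flow, rather than comparing two solutions of the original equation directly.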
The difference flow retains all the bad behavior of the original PDE; indeed, the regularized flows are (by construction) relatively harmless. All obstacles that prevented previous researchers from successfully analyzing solutions in this nonperturbative regime are retained. To succeed, we will need to rely on a number of new insights; these include the new two-parameter local smoothing estimates, a novel change of unknown, and the demonstration of myriad cancellations between the full flow and its regularized counterpart.
The necessity of employing a (diffeomorphic) change of variables is common also to [Reference Bringmann, Killip and Visan5, Reference Killip and Vişan37]. In those works, the new variable is the diagonal Green’s function. The fact that this originates from a microscopic conservation law places one derivative in a favorable position. Alas, all conservation laws for the NLS/mKdV hierarchy are at least quadratic in q, and so none can offer a diffeomorphic change of variables.
In place of the diagonal Green’s function that proved so successful in the treatment of the KdV hierarchy, we adopt an off-diagonal entry
$g_{12}(x)$
of the Green’s function as our new variable. Among its merits are the following: it has a relatively accessible time evolution; as an integral part of the definition of
$\rho $
, it is something for which we need to develop extensive estimates anyway; the mapping
$q\mapsto g_{12}$
is a diffeomorphism; and, lastly, it gains one degree of regularity, which aids in estimating nonlinear terms.
Nevertheless, this change of variables comes with significant shortcomings. Foremost, it is not possible to control the evolution of
$g_{12}$
without employing local smoothing (or some other manifestation of the underlying geometry). For, otherwise, one would obtain results for the circle that are known to be false!
At this moment, it is important to remember that we are discussing the difference flow and that our ambition is to prove that it converges to the identity as
$\kappa \to \infty $
. Concomitant with this, the local smoothing effect deteriorates rapidly as
$\kappa \to \infty $
. This inherent deterioration in the local smoothing estimates means that in order to treat all regularities
$s>-\frac 12$
, we must discover every cancellation available between the full and regularized flows. This, in turn, necessitates the carefully premeditated decomposition of error terms in Section 7 and the stringent estimation of paraproducts in Section 5.
Due to the need for local smoothing estimates, we will only be able to verify convergence locally in space. The tightness results of Section 6 are therefore essential for overcoming this deficiency.
In Section 8, we prove Theorem 1.1. The tools we develop in the first seven sections allow us to prove Theorem 1.1 in the range
$-\frac 12<s<0$
. This suffices for (NLS) but leaves the gap
$[0,\frac 14)$
for (mKdV). To close this gap, we construct suitable macroscopic conservation laws for both equations that allow us to prove the equicontinuity of orbits in
$H^s$
for
$0\leq s<\frac 12$
and so deduce well-posedness from that at lower regularity. This is interesting even for (NLS), where, for example, global in time equicontinuity of orbits in
$L^2$
does not seem to have been shown previously (nor is it trivially derivable from the standard techniques).
Section 9 is devoted to proving Theorems 1.2 and 1.3. All the ingredients we need for the range
$-\frac 12<s<0$
are presented already in Section 5. Thus, the majority of Section 9 is devoted to proving local smoothing for (mKdV) over the range
$0\leq s<\frac 12$
by using a new underlying microscopic conservation law.
In closing, let us quickly recapitulate the structure of the paper that follows. Section 2 discusses myriad preliminaries: settling notation, verifying basic properties of the local smoothing spaces, and proving a variety of commutator estimates. In Section 3, we discuss the (matrix-valued) Green’s function of the Lax operator, with particular emphasis on the confluence of the two spatial coordinates. Section 4 introduces the conserved density
$\rho $
and derives equations for the time evolution of this and other important quantities. Section 5 proves local smoothing estimates, not only for (NLS) and (mKdV), but also for the associated difference flows. It is essential for what follows that these local smoothing estimates contain an additional frequency cut-off parameter. The freedom to vary this parameter plays a crucial role, for example, in Section 6, where these local smoothing estimates are used to control the transport of
$H^s$
-norm. Section 7 uses local smoothing to demonstrate the convergence of the regularized flows to the full PDEs by proving that the difference flow approximates the identity. In Section 8, we prove Theorem 1.1. Section 9 addresses Theorems 1.2 and 1.3. Appendix A gives a new presentation of existing ill-posedness results for (NLS), extending them to other members of the hierarchy, including (mKdV).
2. Some notation and preliminary estimates
For the remainder of the paper, we constrain
$-\frac 12<s<0$
and all implicit constants are permitted to depend on s. In view of the scaling (1.3), it will suffice to prove all our theorems under a small-data hypothesis. For this purpose, we introduce the notation

We use angle brackets to represent the pairing:

In addition to being the natural inner product on (complex)
$L^2({{\mathbb {R}}})$
, this also informs our notions of dual space (the dual of
$H^s({{\mathbb {R}}})$
is
$H^{-s}({{\mathbb {R}}})$
) and of functional derivatives: If
$F:\mathcal {S}\to \mathbb {C}$
is
$C^1$
, then

For real-valued F, the functions
$\tfrac {\delta F}{\delta q}$
and
$\tfrac {\delta F}{\delta \bar q}=\pm \tfrac {\delta F}{\delta r}$
are complex conjugates. These are functional analogues of the (Wirtinger) directional derivatives of complex analysis — q and
$\bar q$
are not independent variables!
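To see this conjugacy in a case that can be checked by hand, consider the illustrative functional $F(q)=\int |q|^2\,dx$ (our choice of example). The sketch below assumes the convention that the first variation is written as $dF|_q(f)=\int \tfrac{\delta F}{\delta q} f + \tfrac{\delta F}{\delta \bar q}\bar f\,dx$; the normalization in (2.2) may differ, but the conclusion that the two derivatives are conjugate does not depend on it.

```latex
\begin{align*}
\frac{d}{d\varepsilon}\Big|_{\varepsilon=0} F(q+\varepsilon f)
  &= \frac{d}{d\varepsilon}\Big|_{\varepsilon=0}
     \int (q+\varepsilon f)\,\overline{(q+\varepsilon f)}\,dx
   = \int \bar q\, f + q\,\bar f\,dx
   \qquad (\varepsilon\in\mathbb{R}),\\
\text{so that}\quad
\frac{\delta F}{\delta q} &= \bar q,
\qquad
\frac{\delta F}{\delta \bar q} = q = \overline{\Bigl(\frac{\delta F}{\delta q}\Bigr)} .
\end{align*}
```

Although q and $\bar q$ are treated formally as separate slots when differentiating, the two resulting functions determine one another, as the example shows.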
We write
$\mathfrak I_p$
for the
$\ell ^p$
Schatten class over the Hilbert space
$L^2({{\mathbb {R}}})$
. For most of our analysis, the Hilbert–Schmidt class
$\mathfrak I_2$
will suffice.
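For orientation, we recall the standard facts behind this notation. The symbols $s_n(T)$ (singular values of a compact operator T) and $K$ (an integral kernel) are introduced here only for illustration.

```latex
\begin{align*}
\|T\|_{\mathfrak I_p}^p &= \operatorname{tr}\bigl(|T|^p\bigr) = \sum_{n} s_n(T)^p,
  \qquad 1\leq p<\infty,\\
\|T\|_{\mathrm{op}} &\leq \|T\|_{\mathfrak I_q} \leq \|T\|_{\mathfrak I_p}
  \qquad\text{whenever } 1\leq p\leq q<\infty,\\
\|T\|_{\mathfrak I_2}^2 &= \iint |K(x,y)|^2\,dx\,dy
  \qquad\text{if } [Tf](x)=\int K(x,y)\,f(y)\,dy .
\end{align*}
```

The last identity is what makes the Hilbert–Schmidt class convenient: membership in $\mathfrak I_2$ is equivalent to having an integral kernel in $L^2({{\mathbb {R}}}^2)$.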
Commensurate with our choice of time interval in (1.6), all spacetime norms will also be taken over this time interval (unless the contrary is indicated explicitly). Thus, for any Banach space Z and
$1\leq p\leq \infty $
, we define

Our convention for the Fourier transform is

We shall repeatedly employ a “continuum partition of unity” device based on the cut-off
$\psi _h^{12}$
. Specifically, as

in
$H^\sigma ({{\mathbb {R}}})$
sense, for any
$f\in H^\sigma ({{\mathbb {R}}})$
and any
$\sigma \in {{\mathbb {R}}}$
.
2.1. Sobolev spaces
For real
$|\kappa |\geq 1$
and
$\sigma \in {{\mathbb {R}}}$
, we define the norm

and write
$H^\sigma := H^\sigma _1$
.
For
$-\frac 12<s<0$
, elementary considerations yield

Consequently, we have the following algebra property:

Arguing by duality and using the fractional product rule, Sobolev embedding, and (2.4), we may bound

Lemma 2.1. If
$s'<s$
,
$|\kappa |\geq 1$
, and
$q\in H^s$
, then

Proof. By scaling, it suffices to consider the case
$\kappa = 1$
. We may then write

By considering the cases
$|\xi |\leq 2$
and
$|\xi |>2$
separately, we may bound

and the estimate (2.7) then follows from the Fubini–Tonelli theorem.
2.2. Local smoothing spaces
It will be important to consider a one-parameter family of local smoothing norms, generalizing that presented in the Introduction. To this end, given
$\kappa \geq 1$
and
$\sigma \in {{\mathbb {R}}}$
, we define the local smoothing space

so that
$X^\sigma _1 = X^\sigma $
, where we write
$\tfrac {\psi _h^6q}{\sqrt {4\kappa ^2 - \partial ^2}} = (4\kappa ^2 - \partial ^2)^{-\frac 12}(\psi _h^6q)$
. At this moment, placing the inverse differential operators under their arguments (rather than in front of them) may seem clumsy; however, the mere act of writing out (3.23) in traditional form will quickly convince the reader of the virtue of this approach.
To ease dimensional analysis, the
$X^\sigma _\kappa $
spaces have been defined to scale the same as
$H^\sigma $
spaces.
Our next lemma allows us to understand the effect of changing the localizing function
$\psi ^6$
or the regularity
$\sigma $
in the definition of the local smoothing norm:
Lemma 2.2. Given
$\kappa \geq 1$
,
$\sigma \in {{\mathbb {R}}}$
, and
$\phi \in \mathcal {S}$
,

Moreover, if
$s-1\leq \sigma '\leq \sigma $
, then

Proof. We begin by discussing (2.8). Let
$T_h:L^2\to L^2$
denote the operator with integral kernel

By applying Schur’s test, we find that

Moreover, this bound holds uniformly in
$\kappa $
. Thus, by employing (2.3), we find

which settles (2.8).
Turning to (2.9), and setting
$N=\kappa ^{\frac 1{1 + \sigma - s}}$
, we have

Taking the supremum over h, we obtain the estimate (2.9).
Next, we record several commutator-type estimates that we will use in the later sections.
Lemma 2.3. Fix
$\kappa \geq 1$
. Then


uniformly for
$h\in {{\mathbb {R}}}$
. Moreover, for
$\ell =2,3,4$
and
$2+s\leq \sigma +\ell \leq 4+s$
,

uniformly for
$h\in {{\mathbb {R}}}$
.
Proof. The estimate (2.10) follows from the observation that

The lower bound on
$\sigma $
expresses that the maximum possible decay in
$\kappa $
is
$\kappa ^{-4-s}$
.
To handle
$\ell =1,2,3$
, we also use the fact that

from which we see that the maximum possible decay in
$\kappa $
is
$\kappa ^{-2-s}$
.
We now turn to (2.12) and write

Using (2.8), this readily yields

We also have the following estimates:
Lemma 2.4. Let
$\sigma>0$
,
$\kappa \geq 1$
, and
$f,g\in \mathcal C([-1,1];\mathcal {S})$
. If
$|\varkappa |\geq 1$
, then

Further, we have the product estimates



All estimates are uniform in
$\kappa $
and
$\varkappa $
.
Proof. By translation invariance, it suffices to prove the estimates for a fixed choice of
$\psi _h$
on the left-hand side. For simplicity, we take
$h=0$
.
We start with (2.14). By Plancherel, we have

On the other hand,
$(2\varkappa - \partial )(\psi ^6f) = \psi ^6 (2\varkappa - \partial )f - (\psi ^6)' f$
. Thus, the first inequality in (2.14) follows from (2.8); the second is elementary.
For the product estimates (2.15) and (2.16), we first decompose dyadically to obtain

For the high-low interactions, where
$N_2\ll N_1\approx N$
, we use Bernstein’s inequality at low frequency to bound

After summing in
$N,N_1,N_2$
, we obtain a contribution to
$\mathrm {RHS}$
(2.18) that is

For the high-high interactions where
$N\lesssim N_1\approx N_2$
, we use Bernstein’s inequality at the output frequency to bound

After summation, we again obtain a contribution to
$\mathrm {RHS}$
(2.18) that is

For the low-high interactions, where
$N_1\ll N_2\approx N$
, we proceed similarly to the case of the high-low interactions, using Bernstein’s inequality at low frequency to bound

In this case, we obtain a contribution to
$\mathrm {RHS}$
(2.18) that is

This completes the proof of (2.15). Alternatively, we may bound

to obtain a contribution to
$\mathrm {RHS}$
(2.18) of

which completes the proof of (2.16).
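For the reader's convenience, we recall the one-dimensional form of Bernstein's inequality used repeatedly in the proof above. The statement below is the standard one; the notation $f_N$ for a function with Fourier support in $\{|\xi |\leq 2N\}$ is ours and is not meant to match any particular Littlewood–Paley convention in the text.

```latex
\begin{align*}
\|f_N\|_{L^q({\mathbb R})} &\lesssim N^{\frac1p-\frac1q}\,\|f_N\|_{L^p({\mathbb R})},
  \qquad 1\leq p\leq q\leq\infty,\\
\|\partial f_N\|_{L^p({\mathbb R})} &\lesssim N\,\|f_N\|_{L^p({\mathbb R})},
  \qquad 1\leq p\leq\infty,\\
\text{for example}\quad
\|f_N\|_{L^\infty({\mathbb R})} &\lesssim N^{\frac12}\,\|f_N\|_{L^2({\mathbb R})} .
\end{align*}
```

Applying the inequality "at low frequency" means placing the low-frequency factor in $L^\infty $ at the cost of a power of the smaller dyadic scale, while applying it "at the output frequency" means paying a power of N on the product itself.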
2.3. Operator estimates
For
$0<\sigma <1$
and
$|\kappa |\geq 1$
, we define the operator
$(\kappa \mp \partial )^{-\sigma }$
using the Fourier multiplier
$(\kappa \mp i\xi )^{-\sigma }$
, where, for
$\arg z\in (-\pi ,\pi ]$
, we define

We observe that with this convention, for all
$|\kappa |\geq 1$
, we have

and readily obtain the estimate

We will make frequent use of the following Hilbert–Schmidt estimates:
Lemma 2.5. For all
$q\in H^s_\kappa ({{\mathbb {R}}})$
,



Moreover, for any real
$|\kappa |\geq 1$
,

Proof. By scaling, it suffices to consider
$\kappa = 1$
. By Plancherel’s theorem, we have

For the particular choices of
$\alpha $
and
$\beta $
relevant to (2.20) and (2.21), we have

from which we obtain (2.20) and (2.21). The estimate (2.22) can be proved in a parallel manner (see [Reference Killip, Vişan and Zhang38, Lemma 4.1]).
Arguing by duality, the key observation to prove (2.23) is that

which combines the duality of
$H^\sigma _\kappa $
and
$H^{-\sigma }_\kappa $
with the algebra property (2.5).
Our next two lemmas are devoted to similar bounds, but employing the local smoothing norm on the right-hand side. The former employs the local smoothing norm pertinent to (NLS), while the latter is relevant to (mKdV).
By introducing spatial localization, we obtain the following improvements:
Lemma 2.6. We have the estimates


uniformly for
$|\varkappa |\geq \kappa ^{\frac 23}\geq 1$
,
$q\in \mathcal C([-1,1];H^s)\cap X^{s+\frac 12}_\kappa $
, and
$h\in {{\mathbb {R}}}$
.
Proof. By translation invariance, it suffices to consider the case
$h=0$
. Given a dyadic number
$N\geq 1$
, we define

presaging the notation (3.5). Employing (2.22), we may bound

The estimate (2.24) now follows by taking a square root and summing over
$N\in 2^{\mathbb {N}}$
.
From Bernstein’s inequality, we have

which combined with the first part of (2.26) yields

Thus, we may prove (2.25) via first interpolating between (2.26) and (2.27), and then summing over
$N\in 2^{\mathbb {N}}$
. This is most easily accomplished by breaking the sum at
$\kappa ^{\frac 23}$
and
$|\varkappa |$
.
Lemma 2.7. Fix
$2\leq p <\infty $
. Then

uniformly for
$|\varkappa |\geq \kappa ^{\frac 12}\geq 1$
,
$q\in \mathcal C([-1,1];B_\delta )\cap X^{s+1}_\kappa $
, and
$h\in {{\mathbb {R}}}$
. Moreover, the factor
$(1+\tfrac {\kappa ^2}{\varkappa ^2})$
may be deleted if
$p\leq 5$
.
Proof. We mimic the proof of Lemma 2.6, replacing (2.26) with

and reusing (2.27). We simply interpolate and then sum. Note that the logarithmic factor is only necessary when
$p(\frac 12-s)\in \{3,5\}$
. When
$p\leq 5$
, the extra factor can be neglected due to the other summand and the constraint
$\smash {|\varkappa |\geq \kappa ^{\frac 12}}$
.
In order to apply Lemmas 2.6 and 2.7, we will need to bring some power of the localizing function
$\psi $
adjacent to copies of q and r. This is the role of the following:
Lemma 2.8 (Multiplicative commutators).
For
$|\varkappa |,|\kappa |\geq 1$
,
$\sigma \in {{\mathbb {R}}}$
, and any integer
$|\ell |\leq 12$
, we have the following estimate uniformly for
$h\in {{\mathbb {R}}}$
and
$u\in \mathcal {S}$
,

Further, if
$N\geq 1$
is a dyadic integer,
$1\leq p\leq \infty $
, and
$n\geq 0$
, we have

Proof. By translation invariance, it suffices to consider the case
$h=0$
.
Using Schur’s test and the explicit kernel (3.3), we find

We will need this shortly. It is important, here, that the exponential decay of the convolution kernel is faster than that of the function
$\psi ^\ell $
. This is a reason both for the large constant
$99$
appearing in (1.5) and for requiring a bound on the size of
$\ell $
.
We first consider the estimate (2.29). By duality, it suffices to consider the case
$\sigma \geq 0$
. For
$z\in \mathbb {C}$
, we write

with the intention of using complex interpolation to prove
$\|B_\ell (\sigma )\|\lesssim _{\sigma ,\ell } 1$
, which implies (2.29). As imaginary powers of
$\kappa ^2 - \partial ^2$
are unitary, we find

for any integer
$m\geq \sigma $
. For concreteness, we choose the least such integer.
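The interpolation step relies on the three lines principle for analytic families of operators. As a reminder, we sketch the standard statement, under the assumption that the family $z\mapsto B(z)$ is analytic in the open strip, continuous on its closure, and bounded there in operator norm; the constants $M_0$ and $M_m$ are our notation.

```latex
\begin{align*}
&\text{If}\quad \sup_{t\in{\mathbb R}}\|B(it)\|\leq M_0
  \quad\text{and}\quad \sup_{t\in{\mathbb R}}\|B(m+it)\|\leq M_m,\\
&\text{then}\quad \|B(\theta m)\|\leq M_0^{1-\theta}\,M_m^{\theta}
  \quad\text{for all } 0\leq\theta\leq 1 .
\end{align*}
```

Taking $\theta =\sigma /m$ reduces the bound at the real exponent $\sigma $ to bounds at exponent 0 and at the integer exponent m, up to imaginary powers.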
Combining
$|\psi '|\lesssim \psi $
and (2.31) with the rewriting

Turning our attention now to
$B_\ell (m)$
, we notice that

moreover, we may expand
$\tilde B_\ell (m)$
as

where the sum is over all decompositions
$m=m_1+m_2+m_3+m_4$
using nonnegative integers. The key observation that finishes the proof is that each operator in square brackets is bounded; indeed, we have

for any integers
$\ell $
and
$n\geq 0$
.
The proof of (2.30) employs similar ideas: We first write

which shows that we need only prove

This is easily verified, by commuting the derivatives and employing (2.31) and (2.32).
3. The diagonal Green’s functions
The role of this section is to introduce three central characters in the analysis, namely,
$g_{12}$
,
$g_{21}$
, and
$\gamma $
, and to develop some basic estimates for them. What unifies these objects is that they all arise from the Green’s function associated to the Lax operator
$L(\kappa )$
introduced in (1.9). Recall

We shall only consider
$\kappa \in {{\mathbb {R}}}$
with
$|\kappa |\geq 1$
. Note that

Evidently, both identities hold for
$L_0$
, since then
$q=r=0$
.
We will be constructing the Green’s function, which is matrix valued, perturbatively from the case
$q=r=0$
. By direct computation, one finds that

admits the integral kernel

For
$\kappa <0$
, we may use
$G_0(x,y;-\kappa ) =-G_0(y,x;\kappa )$
, which follows from (3.2).
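The structure of such kernels is easiest to see in a scalar model. The sketch below is not a reproduction of (3.3); it only records the elementary inverses of the first-order operators $\kappa \mp \partial $ for $\kappa>0$, which one may verify by differentiating under the integral sign.

```latex
\begin{align*}
\bigl[(\kappa-\partial)^{-1}f\bigr](x)
  &= \int_x^\infty e^{-\kappa(y-x)}f(y)\,dy,
  &&\text{kernel } e^{-\kappa|x-y|}\,\mathbf 1_{\{x<y\}},\\
\bigl[(\kappa+\partial)^{-1}f\bigr](x)
  &= \int_{-\infty}^x e^{-\kappa(x-y)}f(y)\,dy,
  &&\text{kernel } e^{-\kappa|x-y|}\,\mathbf 1_{\{y<x\}} .
\end{align*}
% Check for the first line: u(x) = e^{\kappa x}\int_x^\infty e^{-\kappa y}f(y)\,dy
% satisfies u' = \kappa u - f, hence (\kappa-\partial)u = f.
```

Each kernel decays exponentially away from the diagonal and jumps by one unit across $x=y$; this is the scalar analogue of the jump discontinuity that prevents a naive restriction of a Green's function to the diagonal.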
Formally, at least, the resolvent identity indicates that
$R(\kappa ):=L(\kappa )^{-1}$
can be expressed as

Here, and below, fractional powers of
$R_0$
are defined via (2.19). This series forms the foundation of everything in this section; its convergence will be verified shortly as part of proving Proposition 3.1. With a view to this, we adopt the following notations:

whose significance is that

These operators also satisfy

as is easily deduced from either (2.20) or (2.22).
Proposition 3.1 (Existence of the Green’s function).
There exists
$\delta>0$
so that
$L(\kappa )$
is invertible, as an operator on
$L^2({{\mathbb {R}}})$
, for all
$q\in B_\delta $
and all real
$|\kappa |\geq 1$
. The inverse
$R(\kappa ):=L(\kappa )^{-1}$
admits an integral kernel
$G(x,y;\kappa )$
so that

is a continuous mapping from
$H^s_\kappa ({{\mathbb {R}}})$
to the space of Hilbert–Schmidt operators from
$H^{-\frac 34 - \frac s2}_\kappa $
to
$H^{\frac 34 + \frac s2}_\kappa $
. Moreover,
$G-G_0$
is continuous as a function of
$(x,y)\in {{\mathbb {R}}}^2$
. Lastly,


in the sense of distributions.
Proof. From (3.7), we have

uniformly for
$|\kappa |\geq 1$
. Thus, for
$\delta>0$
sufficiently small, the series (3.4) converges in operator norm uniformly for
$|\kappa |\geq 1$
. It is elementary to then verify that the sum acts as a (two-sided) inverse to
$L(\kappa )$
.
This argument also yields that
$R-R_0 \in \mathfrak I_2$
. In particular, it admits an integral kernel in
$L^2({{\mathbb {R}}}^2)$
. To prove (3.8) is continuous, we only need to verify that the series defining
$R-R_0$
converges in the sense of Hilbert–Schmidt operators from
$H^{-\frac 34 - \frac s2}_\kappa $
to
$H^{\frac 34 + \frac s2}_\kappa $
. This follows readily from (2.20).
The continuity of
$G-G_0$
as a function of
$(x,y)$
follows from the Hilbert–Schmidt bound on (3.8) because
$\frac 34 + \frac s2>\frac 12$
.
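For completeness, we sketch why the threshold $\frac 12$ is the relevant one. We write $\langle \xi \rangle =(1+\xi ^2)^{\frac 12}$ and $\sigma =\frac 34+\frac s2$, and let K denote a kernel whose associated operator is Hilbert–Schmidt from $H^{-\sigma }$ to $H^{\sigma }$; this notation, the use of the weight $\langle \xi \rangle $ in place of the $\kappa $-dependent one, and the suppression of constants depending on the Fourier convention are our simplifications.

```latex
\begin{align*}
\iint |\widehat K(\xi,\eta)|\,d\xi\,d\eta
 &\leq \Bigl(\iint \langle\xi\rangle^{-2\sigma}\langle\eta\rangle^{-2\sigma}\,d\xi\,d\eta\Bigr)^{\frac12}
       \Bigl(\iint \langle\xi\rangle^{2\sigma}\langle\eta\rangle^{2\sigma}
             |\widehat K(\xi,\eta)|^2\,d\xi\,d\eta\Bigr)^{\frac12},\\
\int_{\mathbb R}\langle\xi\rangle^{-2\sigma}\,d\xi &<\infty
 \iff \sigma>\tfrac12,
\qquad\text{and}\qquad
-\tfrac12<s<0 \implies \tfrac12<\tfrac34+\tfrac s2<\tfrac34 .
\end{align*}
```

Thus $\widehat K\in L^1({{\mathbb {R}}}^2)$, and so K is a bounded continuous function of $(x,y)$.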
For regular q, the identities (3.9) and (3.10) precisely express the fact that G is an integral kernel for
$R(\kappa )$
. The issue of how to make sense of them for irregular q is settled by (3.8).
From the jump discontinuities evident in (3.3), we see that one cannot expect to restrict
$G(x,y;\kappa )$
to the
$x=y$
diagonal in a meaningful way. However, as we have just shown,
$G-G_0$
is continuous. This allows us to unambiguously define the continuous functions

Here, subscripts indicate matrix entries. While the inclusion of the factor
$\operatorname {\mathrm {sgn}}(\kappa )$
may seem unnecessary, it has the aesthetic virtue of eliminating corresponding factors in many subsequent formulas, such as (3.12)–(3.14) below.
If
$q\in B_\delta \cap \mathcal {S}$
, we may use the identities (3.9) and (3.10) for G to obtain



in the sense of distributions. Combining (3.11), (3.12), and (3.13) yields the further identity

which recurs several times in our analysis. From (3.2), we also have

From the series representation (3.4) of the resolvent, we naturally can deduce corresponding series representations of
$g_{12}$
,
$g_{21}$
, and
$\gamma $
. These are effectively power-series in terms of q and r, albeit with each term being a paraproduct, rather than a monomial. In what follows, we shall often need to discuss individual terms in these series so, being sensitive to the order of such terms in q and r, we adopt the following notations:


with
$g_{12}^{[2m]}(\kappa )=g_{21}^{[2m]}(\kappa ):= 0$
, and similarly,
$\gamma ^{[2m+1]}(\kappa ):=0$
and

In this way, we see that

In particular, we note that the expansion of
$g_{12}$
contains only terms with q appearing once more than r, while the expansion of
$\gamma $
contains only terms of even order, with q and r appearing equally. Analogous to our notation for individual terms, we write tails of these series as

We also extend these “square bracket” notations to algebraic combinations of these series (see, for example, (3.38)).
For small indices, it is possible to find explicit representations of the individual paraproducts via the explicit form of
$G_0$
; however, this quickly becomes overwhelming. A more systematic approach can be based on iteration of the identities

which follow from (3.12), (3.13), and (3.31), respectively. Pursuing either method, one is led to


as well as


Here, dots emphasize occurrences of pointwise multiplication.
With these preliminaries out of the way, we are now ready to present some basic estimates on
$g_{12}$
,
$g_{21}$
, and
$\gamma $
. Propositions 3.2 and 3.3 focus on properties that hold pointwise in time; later in Lemma 3.4 and Corollary 3.5, we employ local smoothing spaces.
Proposition 3.2 (Properties of
$g_{12}$
and
$g_{21}$
).
There exists
$\delta>0$
so that for all real
$|\kappa |\geq 1$
, the maps
$q\mapsto g_{12}(\kappa )$
and
$q\mapsto g_{21}(\kappa )$
are (real analytic) diffeomorphisms of
$B_\delta $
into
$H^{s+1}$
satisfying the estimates

Further, the remainders satisfy the estimate

uniformly in
$\kappa $
. Finally, if q is Schwartz, then so are
$g_{12}(\kappa )$
and
$g_{21}(\kappa )$
.
Proof. It suffices to consider the case
$\kappa \geq 1$
, as the case
$\kappa \leq -1$
is similar; moreover, by (3.15), it suffices to consider
$g_{12}(\kappa )$
. Recalling (3.20), we obtain

To bound the remaining terms in the series, we employ duality and Lemma 2.5:

provided
$\delta>0$
is sufficiently small. This proves (3.25) and completes the proof of (3.24).
We wish to apply the inverse function theorem to obtain the diffeomorphism property. At the linearized level, we already have

which is an isomorphism, as noted already in (3.26). At the nonlinear level, we apply the resolvent identity, which shows that for any test function
$f\in \mathcal {S}$
, we have

Repeating the analysis used to prove (3.25), we find

and so deduce that the diffeomorphism property holds for
$\delta>0$
sufficiently small, which can be chosen independent of
$|\kappa |\geq 1$
.
Next, we seek to show
$g_{12} \in \mathcal {S}$
whenever
$q\in B_\delta \cap \mathcal {S}$
, beginning with a consideration of derivatives. For any
$h\in {{\mathbb {R}}}$
, we have

In particular, differentiating n times with respect to h and evaluating at
$h=0$
, we may use duality to bound

where the constant
$C = C(s)> 0$
may be chosen independent of
$\kappa $
. To handle spatial weights, we observe that

In particular, by duality, we may bound

Combining these, we see that if
$q\in B_\delta \cap \mathcal {S}$
, then
$g_{12}(\kappa )\in \mathcal {S}$
.
Proposition 3.3 (Properties of
$\gamma $
).
There exists
$\delta>0$
so that for all real
$|\kappa |\geq 1$
, the map
$q\mapsto \gamma (\kappa )$
is bounded from
$B_\delta $
to
$L^1\cap H^{s+1}$
, and we have the estimates




uniformly in
$\kappa $
. Further, we have the quadratic identity

and if q is Schwartz, then so is
$\gamma (\kappa )$
.
Proof. Once again, it suffices to consider the case
$\kappa \geq 1$
. Using (2.5) and (3.22), we obtain

To handle
$\gamma ^{[\geq 4]}$
, we use the series representation (3.19) and the same duality argument used to prove (3.25). The estimate (3.28) then follows from (3.27) via (2.4).
Setting
$\varkappa =\kappa $
in (3.14), we find that

From (3.24) and (3.27), we see that the term in braces vanishes as
$|x|\to \infty $
. Thus, the quadratic identity (3.31) follows by integration.
By using this quadratic identity, we may write

By Proposition 3.2 and (3.27), we have

Thus

which yields the estimate (3.30). The estimate (3.29) then follows from applying the Cauchy–Schwarz inequality to (3.22).
If
$q\in B_\delta \cap \mathcal {S}$
, then from Proposition 3.2 and the quadratic identity (3.31), we see that
$\gamma + \frac 12\gamma ^2\in \mathcal {S}$
. As
$H^{s+1}$
is an algebra, we may then bound

so using the estimate (3.27), we see that
$\gamma (\kappa )\in \mathcal {S}$
, provided
$0<\delta \ll 1$
is sufficiently small.
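An alternative way to view this last step, offered only as a sketch, is to invert the quadratic relation explicitly. Writing $w:=\gamma +\frac 12\gamma ^2$ (our notation) and assuming that $\gamma $ is small enough that $1+\gamma $ stays close to 1, one may solve for $\gamma $ as follows:

```latex
\begin{align*}
(1+\gamma)^2 &= 1+2w
\quad\Longrightarrow\quad
\gamma = \sqrt{1+2w}-1
       = \sum_{k\geq1}\binom{1/2}{k}(2w)^k
       = w-\tfrac12 w^2+\tfrac12 w^3-\cdots .
\end{align*}
```

Here the square root is the branch equal to 1 at $w=0$, and the series converges for $|2w|<1$. Since Schwartz class is closed under products, each term of the series is Schwartz whenever w is, which is consistent with the conclusion reached above.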
Next, we consider local smoothing estimates for
$g_{12} = g_{12}(\varkappa )$
and
$\gamma = \gamma (\varkappa )$
. We consider both (NLS) and (mKdV) here, and so must allow two values for
$\sigma $
, namely,
$s+\frac 12$
and
$s+1$
. In fact, the proof below works uniformly on the interval
$[s+\frac 12,s+1]$
.
Lemma 3.4 (Local smoothing estimates for
$g_{12}$
,
$\gamma $
).
Let
$\sigma \in \{ s+\frac 12,s+1\}$
. Then there exists
$\delta>0$
, so that for all real
$|\kappa |\geq 1$
,
$|\varkappa |\geq 1$
, and
$q\in \mathcal C([-1,1];B_\delta )\cap X^\sigma _\kappa $
, the functions
$g_{12} = g_{12}(\varkappa )$
and
$\gamma = \gamma (\varkappa )$
satisfy the estimates



where the implicit constants are independent of
$\kappa ,\varkappa $
.
Proof. Applying the product estimate (2.15) with the quadratic identity (3.31) and the symmetry relation (3.15), we may bound

In view of (3.27), taking
$0<\delta \ll 1$
sufficiently small (independently of
$\varkappa $
) and using (3.24), we get

As a consequence, the estimate (3.35) follows from the estimate (3.33).
To prove the estimate (3.33), we first apply the estimate (2.14) to obtain

From the identity (3.12) for
$g_{12}$
, we see that
$g_{12}^{[\geq 3]} = -(2\varkappa - \partial )^{-1} (q\gamma )$
. Thus, employing (2.14), we find

To continue, we use (2.16) together with (3.27) and (3.36) for
$\gamma $
to obtain

Using (2.6) and (3.27), we may bound

As a consequence,

Combining this with (3.37) and choosing
$0<\delta \ll 1$
sufficiently small (independently of
$\kappa ,\varkappa $
), we obtain (3.33) and so also (3.34).
Due to the structure of our microscopic conservation law, the functions
$g_{12}$
and
$\gamma $
will frequently occur in the combination
$\frac {g_{12}(\varkappa )}{2 + \gamma (\varkappa )}$
. Naturally, this may also be written as a power series in q and r, and we adapt our square brackets notation accordingly:

where the leading order terms are given by

and the remainders by


Our earlier results yield the following information about these quantities:
Corollary 3.5. Let
$\sigma \in \{ s+\frac 12,s+1\}$
. Then there exists
$\delta>0$
so that for all real
$|\kappa |\geq 1$
and
$|\varkappa |\geq 1$
, we have the estimates


for any
$q\in B_\delta $
. Moreover, for
$q\in \mathcal C([-1,1];B_\delta )\cap X^\sigma _\kappa $
,



where
$g_{12}=g_{12}(\varkappa )$
and
$\gamma =\gamma (\varkappa )$
.
Proof. From (3.20) and (3.38), we see that

Thus, (3.41) will follow once we prove (3.42). Moreover, using also (3.12), we find

and thence

where the second step was an application of (2.6) and (3.27). To handle the remaining rational functions, we expand as series and employ the algebra property (2.5), together with (3.24) and (3.27). This yields (3.42) for
$\delta>0$
sufficiently small.
Next, we prove (3.44), since (3.43) follows from this, (3.38), and (2.14).
In order to prove (3.44), we first employ (3.39). The requisite estimate for the first term was given already in Lemma 3.4. The second summand can be treated by combining that lemma with the algebra property (2.17).
It remains to prove (3.45). Recalling the expansion (3.40), the last two terms are easily controlled using (3.34), (3.35), (3.43), and Lemma 2.4. To control the first two terms, we use (3.12) and (3.32).
4. Conservation laws and dynamics
At a formal level, the logarithmic perturbation determinant
$\log \operatorname{\mathrm{det}} (L_0^{-1}L)$
(multiplied by
$\operatorname {\mathrm {sgn}}(\kappa )$
) is given by

For
$\ell>1$
, the trace is well-defined because the operator is trace class. For
$\ell =1$
, this fails; however, in view of (3.6), it is natural to regard the trace as being zero in this case. In fact, (3.6) implies that only the even
$\ell $
contribute to this sum.
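The formal computation behind these remarks can be sketched as follows. We write $T:=R_0(L-L_0)$ (our notation), so that $L_0^{-1}L=1+T$. The parity statement in the last line assumes, as an illustration rather than as a restatement of (3.6), that $R_0$ is diagonal and $L-L_0$ is off-diagonal as $2\times 2$ matrices of operators.

```latex
\begin{align*}
\log\det(1+T) &= \operatorname{tr}\log(1+T)
  = \sum_{\ell\geq1}\frac{(-1)^{\ell-1}}{\ell}\operatorname{tr}\bigl(T^\ell\bigr)
  \qquad\text{(formally)},\\
T=\begin{bmatrix}0&T_{12}\\ T_{21}&0\end{bmatrix}
 &\implies
 T^{2m+1}\text{ is off-diagonal}
 \implies \operatorname{tr}\bigl(T^{2m+1}\bigr)=0 .
\end{align*}
```

Under this assumption, only the even powers survive, which is the sense in which the $\ell =1$ term is naturally assigned the value zero.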
With this in mind, we adopt the following as our rigorous definition of A:

We will prove the convergence of this series in Lemma 4.1 below, as well as deriving several other basic properties.
The quantity A is readily seen to be closely related to the quantity
$\alpha (\kappa ;q)$
that formed the center point of the analysis in [Reference Killip, Vişan and Zhang38]. Concretely, for
$\kappa \geq 1$
,

(see (4.3) below). In that paper, it was shown that
$\alpha (q)$
is preserved under the NLS and mKdV flows. In fact, the argument given there even shows that
$A(\kappa ;q)$
is conserved. However, for our purposes here, we need several stronger assertions of a similar flavor.
First, we need that
$A(\kappa ;q)$
is conserved under all flows generated by the real and imaginary parts of
$A(\varkappa ;q)$
for general
$\varkappa $
. This is proved in Lemma 4.3 below, and will yield the conservation of
$\alpha $
under our regularized Hamiltonians. This allows us to obtain a priori bounds for these regularized flows.
Second, we rely on our discovery of a microscopic expression of the conservation of A; this will be essential in our development of local smoothing estimates. The relevant density
$\rho $
is introduced in Lemma 4.1 (see (4.6)). The corresponding currents (for various flows) are collected in Corollary 4.14, building on a number of intermediate results.
Lemma 4.1 (Properties of A).
There exists
$\delta>0$
so that for all
$q\in B_\delta $
and real
$|\kappa |\geq 1$
, the series (4.1) defining A converges absolutely. Moreover,




Proof. First, we observe that the series (4.1) converges absolutely and uniformly for
$|\kappa |\geq 1$
and
$q\in B_\delta $
, provided
$0<\delta \ll 1$
. This follows from the estimate (3.7). In the same way, convergence holds for the term-wise derivative of the series (4.1) with respect to
$\kappa $
. The terms appearing are exactly those from (3.18) and (3.19), and so we may deduce that

This proves the first assertion in (4.5) as well as justifying (1.10). The second assertion of (4.5) then follows, since (3.7) guarantees that
$A(\kappa )\to 0$
uniformly on
$B_\delta $
as
$|\kappa |\to \infty $
.
The conjugation symmetry (4.3) follows immediately from (3.15) and (4.5).
Differentiating the series (4.1) with respect to r yields the series (3.19) for
$g_{12}$
with an additional minus sign, thus giving the second assertion in (4.4). The first assertion follows in a parallel manner, or by invoking conjugation symmetry. The third part of (4.4) follows from the first two parts via (3.11).
We now turn our attention to (4.6). First, we must clarify what we mean by
$\int \rho $
. When
$q\in \mathcal {S}$
, then
$\rho $
also belongs to Schwartz class (for
$\delta $
small enough), and so the integral can be taken in the classical sense. For
$q\in H^s$
, however, we interpret this integral via the duality between
$H^s$
and
$H^{-s}$
, noting that

(see Corollary 3.5). By density and continuity, it suffices to verify (4.6) for
$q\in \mathcal {S}$
.
Differentiating (3.12), (3.13), and (3.31) with respect to
$\kappa $
and then combining these with the original versions shows

Using also (3.11), we obtain

These identities then combine to show

which can then be integrated in x to yield

The veracity of (4.6) then follows by observing that both sides of (4.6) vanish in the limit
$|\kappa |\to \infty $
.
Next, we show that our basic Hamiltonians arise as coefficients in the asymptotic expansion of
$A(\kappa )$
as
$\kappa \to \infty $
. This will also be important for introducing our renormalized flows later on.
Lemma 4.2. For
$q\in B_\delta \cap \mathcal {S}$
,

as an asymptotic series on Schwartz class.
Proof. While the first few terms can readily be discovered by brute force, we follow a systematic method based on the biHamiltonian relations

which, in view of (4.4), are merely a recapitulation of (3.12) and (3.13).
By iterating (4.8), we find

which can then be integrated to recover the series for A; indeed,

In following this algorithm, we have found it convenient to successively update the asymptotic expansion of
$\gamma $
using (3.31), rather than compute
$\partial ^{-1}(r\tfrac {\delta A}{\delta r} - q\tfrac {\delta A}{\delta q})$
by laboriously finding complete derivatives. We record here the key result:

This technique is easily automated on a computer algebra system, which we have done as a check on our hand computations.
Although the mechanical interpretation of the Poisson bracket (1.1) originates in real-valued observables F and G, the definition makes sense for complex-valued functions as well. In view of the conjugation symmetry (4.3), the following guarantees the commutation of both the real and imaginary parts of A:
Lemma 4.3 (Poisson brackets).
There exists
$\delta>0$
so that for all real
$|\kappa |,|\varkappa |\geq 1$
and
$q\in B_\delta \cap \mathcal {S}$
, we have

Proof. If
$\kappa = \varkappa $
, there is nothing to prove. Suppose now that
$\kappa \neq \varkappa $
. From (4.4) and then (3.14), we deduce that

As shown already in [Reference Killip, Vişan and Zhang38], the conservation of
$A(\kappa )$
leads to global in time control on the
$H^s$
norm. Rather than simply recapitulate that argument, which was based on the series (4.1), we will present a proof that brings the density
$\rho $
to center stage. This approach will be essential later, when we introduce localizations (see Lemmas 5.2 and 6.3).
Proposition 4.4 (A priori bound).
There exists
$\delta>0$
so that for all
$q\in B_\delta $
and
$\kappa \geq 1$
, we have

Choosing
$\delta>0$
even smaller if necessary, we deduce the a priori estimate

for any Hamiltonian flow that is continuous on Schwartz class and preserves
$A(\varkappa )$
for all
$|\varkappa |\geq 1$
.
Proof. We first decompose
$\rho (\varkappa )= \rho ^{[2]} (\varkappa )+ \rho ^{[\geq 4]}(\varkappa )$
with


Inspired by (4.2), we compute

and so, invoking (2.7), deduce that

On the other hand, interpolating the bounds in (3.42), we find

and consequently,

Thus, (4.11) follows by choosing
$\delta>0$
sufficiently small.
To deduce (4.12), we exploit continuity in time.
Proposition 4.4 is the key to proving equicontinuity of orbits. The proper extension of the notion of equicontinuity from the setting of the Arzelà–Ascoli theorem to Sobolev spaces was discussed already by M. Riesz [Reference Riesz51].
Definition 4.5 (Equicontinuity).
A set
$Q\subset H^s$
is said to be equicontinuous if

Beyond boundedness and equicontinuity, the other key ingredient needed for compactness is tightness (see Definition 6.1).
Proposition 4.6 (Equicontinuity of orbits).
Suppose that
$Q\subset B_\delta \cap \mathcal {S}$
is equicontinuous in
$H^s$
. Let
$H_1,H_2$
be Hamiltonians with flows that are continuous on Schwartz class and preserve
$A(\varkappa )$
for all
$|\varkappa |\geq 1$
. Then the set

is equicontinuous in
$H^s$
.
Proof. By Plancherel (cf. [Reference Killip and Vişan37, Section 4]), it is easy to show that a bounded set
$Q\subset H^s$
is equicontinuous if and only if

The result then follows directly from the estimate (4.12).
Next, we address the question of how
$\gamma $
,
$g_{12}$
, and
$g_{21}$
evolve when taking
$A(\kappa )$
as the Hamiltonian. As a complex-valued function,
$A(\kappa )$
cannot be a true Hamiltonian. Nevertheless, there is a natural vector field associated to it by Hamilton’s equations. We caution the reader that this vector field does not respect the relation
$r = \pm \bar q$
. Ultimately, we would like to restrict to the real and imaginary parts of
$A(\kappa )$
; however, it is convenient to temporarily retain this illusory complex Hamiltonian and recover the real and imaginary parts later using (4.3). This context is important for our next two results: the evolution equations we derive for the
$A(\kappa )$
vector field really represent a complex linear combination of the vector fields associated to the real and imaginary parts (taken separately).
Proposition 4.7 (Lax representation).
For distinct
$\kappa ,\varkappa \in {{\mathbb {R}}}\setminus (-1,1)$
,

Equivalently, under the
$A(\kappa )$
vector field,
$U :=[\begin {smallmatrix} 1&0\\0&-1 \end {smallmatrix}] L(\varkappa )$
obeys

Corollary 4.8. Fix distinct
$\kappa ,\varkappa \in {{\mathbb {R}}}\setminus (-1,1)$
. Then under the
$A(\kappa )$
vector field,

Moreover,

and
$\partial _t \gamma (\varkappa ) + \partial _x j_\gamma (\varkappa ,\kappa ) =0$
, where
