Disquisitiones Mathematicae
ERT13: Roth theorem
This post is nothing more than a complement to ERT12. I preferred to separate it to highlight its own importance, especially in the historical development of the subject. With the help of the tools developed in ERT12, we prove the
Theorem 1 (Roth, 1953) If $A\subset\mathbb{Z}$ has positive upper-Banach density, then it contains arithmetic progressions of length $3$.
His proof relied on a Fourier-analytic energy increment argument for functions: one decomposes a function $f$ as $f=g+b$, where $g$ is good and $b$ is bad in a specific sense. If the effect of $b$ is large, it is possible to break it into good and bad parts again, and so on. In each step, the “energy” of $g$ increases by a fixed amount. Being bounded, the process must stop after a finite number of steps. At the end, $g$ controls the behavior of $f$, and for it the result is straightforward. This follows the same philosophy as the Calderón-Zygmund theory of singular integrals in harmonic analysis. See this article of Alexander Arbieto, Carlos Matheus and Carlos Gustavo Moreira for further details.
This represented the first affirmative result in the direction of Szemerédi's theorem, at that time known as the Erdős-Turán conjecture. Only 16 years later, in 1969, Szemerédi proved the existence of arithmetic progressions of length 4 and, finally, in 1975 he proved the conjecture in full generality.
By the Furstenberg correspondence principle (see ERT4), Theorem 1 will follow from
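To make the statement of Theorem 1 concrete, here is a minimal Python sketch. It is not part of Roth's argument, and the function names `find_3ap` and `greedy_3ap_free` are hypothetical, chosen only for illustration. It searches a finite set of integers for a 3-term arithmetic progression by brute force and builds a progression-free set greedily.

```python
def find_3ap(numbers):
    """Return some (a, b, c) with a < b < c and b - a == c - b, or None."""
    members = set(numbers)
    ordered = sorted(members)
    for a in ordered:
        for b in ordered:
            if b > a and (2 * b - a) in members:
                return (a, b, 2 * b - a)
    return None


def greedy_3ap_free(limit):
    """Scan 0, 1, ..., limit - 1 and keep x unless it completes a progression."""
    chosen = []
    members = set()
    for x in range(limit):
        # x would be the largest term of a progression a < b < x with a = 2b - x
        if not any((2 * b - x) in members for b in chosen):
            chosen.append(x)
            members.add(x)
    return chosen


free_set = greedy_3ap_free(14)
print(free_set)                    # a set with no 3-term progression
print(find_3ap(free_set))          # None
print(find_3ap(free_set + [14]))   # adding 14 creates a progression
```

The greedy set consists of the integers whose base-3 expansion uses only the digits 0 and 1, so it has $2^k$ elements below $3^k$ and its density tends to zero, which is consistent with Theorem 1.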
Theorem 2 (Roth, quantitative version) Let be such that . Then
In order to prove this, we apply Koopman-von Neumann theorem obtained in ERT12 to write
where the other terms are the sum of 7 expressions of the type
where at least one of is weak mixing. It turns out that the presence of at least one weak mixing function makes the above limit vanish. This is the content of the
Proposition 3 If at least one of is weak mixing, then
Proof: We assume is ergodic. The general result follows by ergodic decomposition. We will prove that if or is weak mixing, then
We can also assume this because
and so the positions of can be switched. To prove (2), we make use of van der Corput trick. Defining , we have
and then
where in the last equality we used the ergodicity of . If ,
which guarantees that
Proposition 3 proves Theorem 2. In fact, by (1),
The component is non-negative, by Proposition 4 of ERT1. In addition,
because , being weak mixing, is orthogonal to the constant functions, that is, has zero mean. This guarantees the non-negativity of the inferior limit. To prove it is positive, we proceed as in ERT10: given , the norm for a syndetic set of iterates . If is sufficiently small, to each of these iterates the integral is positive. This concludes the proof of Theorem 2.
Previous posts: ERT0, ERT1, ERT2, ERT3, ERT4, ERT5, ERT6, ERT7, ERT8, ERT9, ERT10, ERT11, ERT12.
Symposium “Abel Prize 2010”
Since the end of March (more precisely, March 30), I have been participating in the “Dynamics and PDEs” trimester at Institut Mittag-Leffler. During this trimester (ending on June 15), I have had several opportunities to attend interesting talks and minicourses. In fact, there are 4 talks (of 1 hour) each Tuesday and Thursday, and J. C. Yoccoz is giving his minicourse (1 hour every Wednesday) on his 220-page paper with J. Palis (about Nonuniformly Hyperbolic Horseshoes) during “normal” weeks, while special thematic weeks (ranging over abstract Ergodic Theory, interval exchange maps, non-uniformly hyperbolic dynamics and KAM theorems for PDEs) had 4 talks per day. In particular, I got a lot of new ideas for future posts, although it will take some time to publish them. Indeed, the excellent scientific environment provided by Institut Mittag-Leffler stimulated me (and my coauthors) to write down our ongoing projects in a more systematic way during the time available between the talks, so that my time for new posts was somewhat reduced (besides that, I should confess that it is difficult to resist taking a bicycle to visit Stockholm during summer weekends).
In any case, I would like to start the series of posts related to my stay in Stockholm with the symposium “Abel Prize 2010” held at the Royal Swedish Academy of Sciences. This symposium occurred on May 31 (Monday) and there were 2 non-technical talks (of 5 and 15 min. describing the Abel Prize) and the following 4 mathematical talks:
- Applications of Tate’s work to cryptography by J. Håstad (KTH, Stockholm);
- The arithmetic of elliptic curves by the Abel Prize 2010 laureate John Tate;
- Point count statistics for families of curves over a fixed finite field by P. Kurlberg (KTH, Stockholm);
- Detecting elements in the Grothendieck ring of varieties by T. Ekedahl (Stockholm University).
As one could expect from this kind of symposium, it was mostly accessible to a non-specialist (like me). In fact, I attended the first 3 talks and they were really enjoyable: the speakers went directly to the heart of the matter with the minimum possible technicalities. In particular, I decided to take the advice of my friend David Damanik to write down a sketch of these lectures. Of course, the curious reader may ask me why I skipped the last talk, and the reason is very simple: at the beginning of the symposium, they provided lecture notes for the fourth talk, and the most basic definition (appearing on the first page of these notes) was the concept of Grothendieck-Kontsevich universal group of varieties (in a not-so-simple-to-follow language from category theory); after seeing that the 3 previous talks started with much humbler concepts (such as elliptic and hyperelliptic curves), I thought that this 4th talk would not be suited for a dynamicist (in other words, the advertisement for the 4th talk made at the beginning of the symposium had the opposite effect on me).
Concerning the talks of J. Håstad and P. Kurlberg, let me make a few comments on them before passing to the main focus of this post, namely, J. Tate’s lecture.
Firstly, J. Håstad started his lecture by reviewing some basic facts about cryptography: in particular, he explained some well-known basic principles of public-key cryptography via the standard Alice and Bob example. After that, he mentioned that N. Koblitz and V. Miller (independently) proposed the use of elliptic curves to perform more efficient public-key cryptography (in the sense that the size of the key is smaller [say 70 digits] when compared with previous methods [whose keys have, say, 300 digits]). This subject is nowadays known as elliptic curve cryptography. Here, the advantage of the algebraic (Abelian group) structure of elliptic curves over finite cyclic groups (for instance) for public-key cryptography is related to the infeasibility of solving the so-called discrete logarithm problem: in fact, as explained in this Wikipedia article here, when trying to communicate a message (encrypted as an element of an Abelian group), we usually lock it by taking powers and sending this data through insecure networks; thus, the key-exchange protocol is more secure when the discrete logarithm problem (i.e., given $g$ and $h$, find $x$ such that $g^x=h$) on the given Abelian group is hard; since the discrete logarithm problem is notably harder over elliptic curves than over finite cyclic groups, the use of elliptic curves in such cryptographic tasks is more than justified. Finally, by the end of his lecture, J. Håstad explained how one can use the so-called Weil pairing and its properties to make slight improvements in elliptic curve based protocols.
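The Alice and Bob key exchange and the role of the discrete logarithm can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: the group is the multiplicative group modulo the tiny prime 23 with base 5 (values chosen by me, not taken from the lecture), and the function names are hypothetical. Real protocols use far larger groups or elliptic curves.

```python
def public_value(base, secret, modulus):
    """Lock a secret exponent by taking a power in the group (Z/pZ)^*."""
    return pow(base, secret, modulus)


def discrete_log_bruteforce(base, target, modulus):
    """Find x with base**x == target (mod modulus) by trying every exponent."""
    value = 1
    for x in range(modulus - 1):
        if value == target:
            return x
        value = (value * base) % modulus
    return None


p, g = 23, 5                 # toy parameters (assumption), 5 generates (Z/23Z)^*
alice_secret, bob_secret = 6, 15

alice_public = public_value(g, alice_secret, p)   # sent over the insecure network
bob_public = public_value(g, bob_secret, p)       # sent over the insecure network

# Each side combines its own secret with the other side's public value.
alice_key = pow(bob_public, alice_secret, p)
bob_key = pow(alice_public, bob_secret, p)

# An eavesdropper must solve a discrete logarithm to recover a secret.
recovered = discrete_log_bruteforce(g, alice_public, p)
print(alice_key, bob_key, recovered)
```

The brute-force search takes a number of steps proportional to the group order, which is why the protocol is only as secure as the discrete logarithm problem is hard in the chosen group.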
Secondly, P. Kurlberg gave a nice lecture (based on his paper with Z. Rudnick) about the problem of counting points on hyperelliptic curves over finite fields. More precisely, let be a finite field (of odd cardinality ) and consider square-free monic polynomials of degree . Since is assumed to be square-free, we have that
is a smooth projective hyperelliptic curve of genus or (depending on the parity of ). We denote by the number of -points (i.e., points whose coordinates belong to ). The leitmotiv of Kurlberg’s talk was the limit average behavior of when the genus and/or the cardinality of grow. In order to attack this problem, he recalled the cute approach of comparing our problem with an appropriate random matrix model. Roughly speaking, we write , where are the eigenvalues of the Frobenius action on certain cohomology groups. During his proof of the Riemann hypothesis over finite fields, A. Weil showed that for any (compare it with Hasse’s bound over elliptic curves). This allows us to write
where is a unitary matrix (which is well-defined modulo conjugation) and stands for the trace of the matrix . Therefore, one can hope to apply some techniques from random matrix theory (see, e.g., these posts by Terence Tao for an excellent introduction to the subject) to control and, a fortiori, , at least when (or, more precisely, its conjugacy class) becomes “equidistributed”. Using this point of view, N. Katz and P. Sarnak used Deligne’s equidistribution theorem to show that, for a fixed genus , we have that, when , the limiting distribution of is the Haar measure on (unitary symplectic group). On the other hand, P. Diaconis and M. Shahshahani showed that the limiting distribution of is a Gaussian distribution with zero mean and variance 1. Therefore, the limiting distribution of
is a Gaussian distribution (of zero mean and variance 1) when and (in this order). Of course, one can ask what happens when we let and grow at the same time. In this direction, P. Kurlberg and Z. Rudnick showed (in their loc. cit. paper) that one still gets a Gaussian distribution with zero mean and variance 1. Also, by the end of his lecture, he mentioned the problem of understanding the limiting distribution of (where is a smooth curve [not necessarily hyperelliptic]) when is fixed but the genus grows. In this situation, as P. Kurlberg pointed out, a naive approach using random matrix theory can’t work: in fact, since our original problem concerns point counting, we have a trivial constraint
which is clearly not taken into account by the Gaussian distribution (as the previous inequality is violated for any close to the identity [when ]). In this context, P. Kurlberg mentioned a recent joint paper with E. Wigman where they constructed specific families of curves of increasing genus over a fixed finite field whose limiting distribution is Gaussian.
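For readers who like to experiment, the point count discussed above can be computed by brute force for small primes. The sketch below makes several assumptions that are mine and not from the talk: the field is a prime field $\mathbb{F}_p$ with $p$ odd, the polynomial is monic, square-free and of odd degree $2g+1$ (so the smooth projective model has exactly one point at infinity), and the example polynomial $x^5-x$ over $\mathbb{F}_7$ and all function names are purely illustrative. It compares the count with the Weil bound $|\#C(\mathbb{F}_p)-(p+1)|\leq 2g\sqrt{p}$.

```python
import math


def eval_poly(coeffs, x, p):
    """Evaluate a polynomial (coefficients from degree 0 upwards) at x modulo p."""
    value = 0
    for c in reversed(coeffs):
        value = (value * x + c) % p
    return value


def count_affine_points(coeffs, p):
    """Count pairs (x, y) in F_p x F_p with y^2 = f(x), using Euler's criterion."""
    total = 0
    for x in range(p):
        v = eval_poly(coeffs, x, p)
        if v == 0:
            total += 1                       # only y = 0
        elif pow(v, (p - 1) // 2, p) == 1:
            total += 2                       # v is a non-zero square: two roots
    return total


def weil_deviation(coeffs, p):
    """Return (#C(F_p) - (p + 1), 2 g sqrt(p)) for odd degree 2g + 1."""
    degree = len(coeffs) - 1
    genus = (degree - 1) // 2
    n_points = count_affine_points(coeffs, p) + 1   # one point at infinity
    return n_points - (p + 1), 2 * genus * math.sqrt(p)


# f(x) = x^5 - x over F_7, a genus 2 example chosen for illustration
deviation, bound = weil_deviation([0, -1, 0, 0, 0, 1], 7)
print(deviation, bound)
```

Since the number of points is a non-negative integer, the deviation can never be smaller than $-(p+1)$, which is the trivial constraint that a Gaussian model ignores when $p$ is fixed and the genus grows.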
Finally, after all these preliminaries, let’s start discussing Tate’s lecture.
Remark: Besides my notes, I used also some nice pictures (taken by my wife Aline G. Cerqueira) to illustrate today’s post.
-Elliptic curves-
Let be a field, e.g., or . An elliptic curve is a smooth projective curve of genus 1 (i.e., topologically a torus) defined over with a -rational point . Any elliptic curve admits an algebraic (plane) curve model with non-vanishing discriminant (this last condition is the algebraic incarnation of the smoothness assumption on our elliptic curve). For some introductory material on elliptic curves (and some references), see these links here and here.
We denote by $E(K)$ the set of $K$-rational points of $E$. It is well-known that $E(K)$ is an Abelian group: from the naive point of view, we declare that $P+Q+R=0$ whenever $P, Q, R$ are collinear (this makes sense because a line intersects the zero set of a cubic equation in 3 points [counting multiplicities]), and from the advanced point of view, the group structure comes from the map from $E(K)$ to the group of divisor classes of degree 0. See the photo of Tate’s slide below and this link for nice illustrations of the naive point of view, and this post (from the nice blog “Rigorous Trivialities”) for more comments on the advanced point of view.
Abelian group structure on E(K)
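The naive (chord and tangent) description of the group law translates directly into formulas. Here is a minimal sketch with exact rational arithmetic, assuming a curve in the short form $y^2=x^3+Ax+B$ with the point at infinity represented by `None`. The specific curve $y^2=x^3+17$ and the sample points are my own choice for illustration and are not claimed to be the example on Tate's slide.

```python
from fractions import Fraction

A, B = 0, 17   # hypothetical example curve y^2 = x^3 + 17


def on_curve(P):
    """Check that P is the point at infinity (None) or satisfies the equation."""
    if P is None:
        return True
    x, y = P
    return y * y == x ** 3 + A * x + B


def add(P, Q):
    """Chord-and-tangent addition: P + Q is the reflection of the third
    intersection point of the line through P and Q with the curve."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 == -y2:
        return None                               # vertical line: P + Q = O
    if P == Q:
        slope = Fraction(3 * x1 * x1 + A) / (2 * y1)   # tangent line
    else:
        slope = Fraction(y2 - y1) / (x2 - x1)          # chord
    x3 = slope * slope - x1 - x2
    y3 = slope * (x1 - x3) - y1
    return (x3, y3)


P = (Fraction(-2), Fraction(3))
Q = (Fraction(-1), Fraction(4))
print(add(P, Q), add(P, P))
print(on_curve(add(P, Q)), on_curve(add(P, P)))
```

In this example the line through $P$ and $Q$ meets the curve again at $(4,9)$, and the sum $P+Q$ is its reflection $(4,-9)$, so that the three collinear points add up to zero.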
Below, we find a photo showing J. Tate explaining the example of the elliptic curve : here, 14 integral points are indicated, namely, where and , together with the discriminant and the fact that the (Abelian) group of -rational points is isomorphic to in the present case. Also, J. Tate introduces a height function
where , ( and coprime), so that because .
An example
-Mordell-Weil theorem-
Once we know that $E(K)$ is an Abelian group, one may ask what kind of Abelian group $E(K)$ can be. The answer is provided by the Mordell-Weil theorem:
Theorem 1 (Mordell-Weil) Let $K$ be a number field. Then, $E(K)$ is a finitely generated Abelian group.
Remark 1 This theorem was proved by L. Mordell in the case $K=\mathbb{Q}$ and by A. Weil in the general case.
Remark 2 In the sequel, we’ll present Mordell’s “accidental” proof. As pointed out by J. Tate (compare with the photo below), he says that Mordell’s proof was “accidental” because when he asked (personally) L. Mordell about how the idea of the proof came out, Mordell said that he was trying to prove other results when he realized that his arguments gave a proof of this theorem.
Mordell-Weil theorem
Proof: The argument can be divided into two parts:
- firstly, one shows that is a finite group;
- secondly, one constructs a height function verifying the following properties
- (a) for every , the set is finite;
- (b) there exists a constant (depending on the elliptic curve ) such that for all ;
- (c) for some constant depending on the elliptic curve and the point .
The first part of the argument (claiming that is finite) is known as weak Mordell-Weil theorem. Since its proof is beyond the scope of this post, we recommend to the interested reader this link here for a proof using group cohomology and this .ps file here for a proof using some commutative algebra (and number theory).
The second part of the argument involves the construction of appropriate height functions: while this is not hard ([at least when ] since an adequate modification of the height function introduced above does the job), we’ll assume its existence because it is not the main point of Mordell’s proof (in the sense that any height function with the previous properties is sufficient to perform the argument, as we’re going to see). We refer the reader to the loc. cit. .ps file for further details on the construction of these height functions.
From this point, we can derive the Mordell-Weil theorem as follows. From the weak Mordell-Weil theorem, we can select a finite set of representatives of the (finite) set . By definition, given a point , there exists such that , i.e., for some . Using the properties of the height function, we see that
so that
where . For our future purposes, we introduce . We claim that the previous estimate implies that is generated by the finite set
of -rational points with height at most (we’re using here the property (a) of ). Indeed, this fact is easy to derive intuitively (via a modification of Fermat’s infinite descent argument): if we start with a point of height , we can write it as where and we saw that any can be written as where , which is, roughly speaking, half of the size of ; hence, we can iterate this procedure finitely many times (i.e., ) to write as a finite combination of elements of heights at most (since the height decreases by half at each iteration). More formally, given a point , we take an integer such that (e.g., ), and we write with and as above. Since , we see that . By iterating this process, we see that, after steps, we can write as a sum of points of heights . Thus, by taking , we get that is the sum of points of heights , as claimed.
Remark 3 Although the previous argument allows one to bound the number of elements of the finite set used to generate a given point , it is not effective because, for instance, there is no efficient method (to the best of my knowledge) to find explicit representatives of .
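The mechanism of the descent can be seen in a deliberately simple toy model, which is only an analogy and not the elliptic curve argument itself: take the group $\mathbb{Z}$ with the "height" $|n|$, the quotient $\mathbb{Z}/2\mathbb{Z}$ with representatives $0$ and $1$, and repeatedly write $P=Q+2P'$. All names below are hypothetical.

```python
def descend(P, height_bound=1):
    """Write P = Q_0 + 2 Q_1 + 4 Q_2 + ... + 2^k P_k with each Q_i in {0, 1}
    and |P_k| <= height_bound. Returns ([Q_0, ..., Q_{k-1}], P_k)."""
    representatives = []
    while abs(P) > height_bound:
        Q = P % 2              # representative of the class of P in Z/2Z
        representatives.append(Q)
        P = (P - Q) // 2       # P = Q + 2 P', so |P'| <= (|P| + 1) / 2
    return representatives, P


def rebuild(representatives, core):
    """Invert the descent: recombine the representatives and the small core."""
    total = core
    for Q in reversed(representatives):
        total = Q + 2 * total
    return total


steps, core = descend(-5)
print(steps, core, rebuild(steps, core))
```

Each step roughly halves the height, so the loop stops after about $\log_2 |P|$ iterations, and every integer ends up expressed through the finitely many elements of height at most 1.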
A direct consequence of Mordell-Weil theorem and the fundamental theorem of finitely generated Abelian groups is:
Corollary 2 is isomorphic to where is a finite (Abelian) group and .
In the literature, is called the torsion subgroup and is the rank of . For example, we saw that in the case of the elliptic curve , so that its torsion group is trivial and its rank is 1.
In the photo below, we see J. Tate showing an example of N. Elkies (discovered in 2006) of an elliptic curve with trivial torsion group and rank (although the precise value of the rank is not known). Also, a list of (the coordinates of) 28 rationally independent points is presented.
N. Elkies example (2006)
-Birch-Swinnerton-Dyer conjecture-
The previous proof of the Mordell-Weil theorem hints at a natural way to investigate finer properties of . In fact, as explained by J. Tate in the photo below, one can use some group cohomology to induce some short exact sequence (starting from ) leading to the Selmer () and Shafarevich groups. See this link for more details.
Selmer and Shafarevich groups
As J. Tate pointed out, although Selmer groups are understood (in the sense that they’re finite and computable by the method of descent), it is a hard open problem to decide whether the Shafarevich group is finite!
After this, we can start doing some number theory with elliptic curves in the following way: loosely speaking, given an elliptic curve , we can use the quantities to construct zeta functions
From the expressions of these zeta functions as rational functions of , we can produce numbers (for each prime ), which in turn can be put together to define an L-series (called Hasse-Weil zeta function) via an Euler product (type) expression. See this Wikipedia article on elliptic curves for more discussion and references.
It is known that converges absolutely when (essentially in view of Hasse’s theorem). Furthermore, after the celebrated works of A. Wiles and R. Taylor (among others), we know that this L-series is an entire function of the complex plane satisfying a functional equation relating to : technically speaking, this was derived from the proof of the so-called Shimura-Taniyama conjecture asserting that elliptic curves are completely related to modular forms (some objects with nice L-series attached to them). Another famous consequence of this relationship between elliptic curves and modular forms is Fermat’s last theorem: after Frey, Serre and Ribet, we have that the existence of a solution of (with prime) would imply that the elliptic curve has too little ramification to be related to modular forms. See this photo of a slide of J. Tate where these facts are summarized.
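As a small numerical companion, the local data entering such an Euler product can be computed by counting points. The sketch below assumes that the numbers attached to each prime are the usual traces $a_p=p+1-\#E(\mathbb{F}_p)$ (the formula itself is not displayed in the text above), works with a curve $y^2=x^3+ax+b$ at primes $p\geq 5$ of good reduction, and uses $y^2=x^3+17$ purely as a hypothetical example. It also checks Hasse's bound $|a_p|\leq 2\sqrt{p}$.

```python
def count_points(a, b, p):
    """#E(F_p) for y^2 = x^3 + a x + b, including the point at infinity."""
    total = 1                                  # the point at infinity
    for x in range(p):
        v = (x * x * x + a * x + b) % p
        if v == 0:
            total += 1
        elif pow(v, (p - 1) // 2, p) == 1:     # Euler's criterion
            total += 2
    return total


def is_prime(n):
    """Trial division, sufficient for the small primes used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def frobenius_traces(a, b, limit):
    """Map each good prime 5 <= p < limit to a_p = p + 1 - #E(F_p)."""
    traces = {}
    for p in range(5, limit):
        if is_prime(p) and (4 * a ** 3 + 27 * b ** 2) % p != 0:
            traces[p] = p + 1 - count_points(a, b, p)
    return traces


traces = frobenius_traces(0, 17, 100)
print(traces)
print(all(t * t <= 4 * p for p, t in traces.items()))   # Hasse's bound
```

Primes dividing $4a^3+27b^2$ are skipped because the reduced curve is singular there and the local factor has a different shape.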
Consequences of the modularity of elliptic curves
Comments on some possible extensions of the results
As we can see in the previous picture, J. Tate also states the Birch and Swinnerton-Dyer conjecture giving a precise prediction of the behaviour of the L-series (zeta function) near : its expansion (in terms of ) starts with where is the rank of and is an explicit constant depending on the cardinalities of the Shafarevich group and the torsion subgroup (besides some “local” factors ). He also stated his theorem (with Artin) saying that the Birch-Swinnerton-Dyer conjecture is true over function fields if and only if the Shafarevich group is finite, and the results of Gross-Zagier and Kolyvagin saying that the Birch-Swinnerton-Dyer conjecture over is true if has a zero of order . Concerning these results, J. Tate thinks that they will be extended to totally real fields , but we’re still not capable of attacking the cases of higher rank () elliptic curves (or not totally real).
Closing his lecture, J. Tate reported on three recent results. The first one is due to Manjul Bhargava. Given an elliptic curve , we consider its algebraic curve model . This permits us to order them using the height function (the exponents of and are chosen in view of the formula of the discriminant of the elliptic curve).
Theorem 3 Using the previous ordering on elliptic curves, we have:
- the average rank is ;
- a positive proportion of elliptic curves have , so that, by the results of Gross-Zagier and Kolyvagin, the Shafarevich group is finite, the Birch-Swinnerton-Dyer conjecture is true and the rank is zero for a positive proportion of elliptic curves (over );
- If the Shafarevich group is finite for every elliptic curve , then a positive proportion of elliptic curves have rank equal to 1.
In the picture below, J. Tate stresses that the first item (on the average rank) is an unconditional result (in the sense that it doesn’t depend on any conjecture such as the generalized Riemann hypothesis or Birch-Swinnerton-Dyer conjecture). Also, he pointed out other interesting results of M. Bhargava (such as the fact that the average size of the Selmer group is 3).
Potpourri of results of M. Bhargava
The second and third results concern elliptic curves and Hilbert’s 10th problem (about the existence of algorithms capable of solving Diophantine equations). More precisely, after the works of B. Poonen, A. Shlapentokh and K. Eisenträger, we have:
Theorem 4 Suppose that, for every cyclic extension of prime degree of number fields, we can find an elliptic curve such that the (i.e., there are elliptic curves whose rank doesn’t increase with the extension ). Then, Hilbert’s 10th problem has a negative solution over the ring of integers of number fields.
While at first glance the hypothesis of this theorem may seem strange, it turns out that, after the work of B. Mazur and K. Rubin (accepted for publication in Inventiones Mathematicae), we have an explicit criterion for the verification of this hypothesis:
Theorem 5 If the Shafarevich group is finite, then the hypothesis of the previous theorem is always satisfied.
In other words, these two results together say that the conjecture of the finiteness of the Shafarevich group implies a negative answer to Hilbert’s 10th problem over the ring of integers of number fields.
Elliptic curves and Hilbert's tenth problem
ERT12: Kronecker factor – coexistence of compact and weak mixing behaviour
We have studied two behaviors that are, in the spectral sense, disjoint. It turns out that they are also complementary, meaning that every measure-preserving system can be decomposed into a component which behaves in a compact fashion and a complement which behaves in a weakly mixing way. The compact component defines a -algebra known as the Kronecker factor, which encapsulates all the linear structure of the system. Actually, the next post will prove this component is enough to obtain the case of Furstenberg theorem: this is known as Roth's theorem, after Roth, who first proved the existence of arithmetic progressions of length 3 in subsets of with positive density.
As we saw in ERT8 and ERT10, weak mixing and compact systems are identified by verifying numerical and topological properties of the sequences . Here, stands for the -action defined by the mps and also for the Koopman-von Neumann operator . More specifically, is weak mixing iff
for any such that , and compact iff is pre-compact in for any .
Yet considering the spectral point of view, let be a Hilbert space and a unitary operator. Motivated by the above situation, let us make the following
Definition 1 An element is weak mixing if
and almost periodic if the set is pre-compact in .
Consider the sets
The main theorem of this post, due to Koopman and von Neumann, establishes the equality . In order to do this, we characterize and in different manners.
Proposition 2 is equal to the closure of the union of eigenspaces:
Proof: The inclusion is clear: if , the closure of is equal to , where is a subgroup of . For the reverse inclusion, consider almost periodic and let . The restriction map is a compact operator. Being also self-adjoint, the spectral theorem implies the existence of an orthonormal basis of eigenvectors of . In particular, is in the closure of the subspace generated by them, which concludes the proof.
The alternative characterization of uses the van der Corput trick (see ERT9). Let us remember it.
Theorem 3 (van der Corput trick) If is a bounded sequence in a Hilbert space and if
then .
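A quick numerical experiment illustrates the kind of statement the van der Corput trick gives: when the correlations between a sequence and its shifts average to zero, so do the averages of the sequence itself. The sketch below is only an illustration under assumptions of mine: the Hilbert space is $\mathbb{C}$, the sequence is $u_n=e^{2\pi i n^2\alpha}$ with $\alpha=\sqrt{2}$, and the function names are hypothetical.

```python
import cmath
import math

ALPHA = math.sqrt(2)   # an irrational frequency chosen for illustration


def u(n, alpha=ALPHA):
    """The unimodular sequence u_n = exp(2 pi i n^2 alpha)."""
    return cmath.exp(2j * math.pi * ((n * n * alpha) % 1.0))


def average(N, alpha=ALPHA):
    """(1/N) * sum of u_n for n = 1..N."""
    return sum(u(n, alpha) for n in range(1, N + 1)) / N


def correlation(N, h, alpha=ALPHA):
    """(1/N) * sum of u_{n+h} * conj(u_n) for n = 1..N."""
    return sum(u(n + h, alpha) * u(n, alpha).conjugate()
               for n in range(1, N + 1)) / N


N = 20000
print([abs(correlation(N, h)) for h in (1, 2, 3)])   # shifted correlations
print(abs(average(N)))                               # average of the sequence
```

Here the product $u_{n+h}\overline{u_n}=e^{2\pi i(2nh+h^2)\alpha}$ is a geometric sequence in $n$, so its averages tend to zero for every $h\geq 1$, which is the hypothesis needed to conclude that the averages of $u_n$ tend to zero as well.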
Proposition 4 .
Proof: By Lemma 4 of ERT8,
This last expression may be rewritten as
so it is enough to prove that
We use the van der Corput trick for this purpose: define . Then
whose absolute value is bounded by and so
which goes to zero by assumption. This proves (1) and concludes the proof.
We now proceed to the announced result.
Theorem 5 (Koopman-von Neumann) .
Proof: First, note that if and , say , then
and so . This proves the inclusion .
For the reverse inclusion, let . We want to show that is weak mixing, that is, that
Consider, for each , the Hilbert-Schmidt operator (see the Appendix) defined as
and the operator . The claim will follow if we manage to prove that . We have
1. is a well-defined linear operator: what we can guarantee is that, for every , exists. In fact, is a unitary linear operator on the tensor product , whose inner-product is
,
and so, by von Neumann theorem, the limit of
exists. By Riesz representation theorem, there exists a unique for which
The linearity of follows from the linearity of the inner-product.
2. is a compact operator: we remark that, being defined as the weak limit of , is not compact by definition. This is the reason we consider the Hilbert-Schmidt operators.
For each , the norm of is at most . Then is the weak limit of a sequence of bounded Hilbert-Schmidt operators. By the Appendix, is compact.
3. commutes with : this follows from the “almost” commutativity of and . In fact,
and this last expression goes to zero as .
4. For any , is almost periodic: in fact,
is pre-compact, by the compactness of .
By hypothesis, is orthogonal to every almost periodic element. In particular, , completing the proof.
Going back to the ergodic theoretical setup, we obtain
Corollary 6 The mps is weak mixing iff and compact iff .
Let be the -algebra generated by the eigenfunctions of (it is the smallest -algebra for which every eigenfunction is measurable).
Definition 7 is called the Kronecker factor of .
One can show that is the inverse limit of a sequence of -algebras, each of them being isomorphic to a rotation on an abelian group. Such an explicit characterization allows one to prove the existence of double ergodic averages, but this is a topic for later posts.
1. Appendix: Hilbert-Schmidt operators
Definition 8 A continuous linear operator is Hilbert-Schmidt if there exists an orthonormal basis such that
In this case, the norm of is defined as . The main properties we used above are:
HS1. Every Hilbert-Schmidt operator is compact.
HS2. The limit, in either the norm, strong or weak topology, of a sequence of bounded Hilbert-Schmidt operators is Hilbert-Schmidt.
The interested reader is invited to read this post of Terence Tao.
Previous posts: ERT0, ERT1, ERT2, ERT3, ERT4, ERT5, ERT6, ERT7, ERT8, ERT9, ERT10, ERT11.
Z^d-actions with prescribed topological and ergodic properties
Last Monday, I gave a talk at the Bicentennial Workshop of Dynamical Systems, which was held in the beautiful city of San Pedro de Atacama, Chile. This was my first talk at a conference and was based on the work I did while I was at The Ohio State University last year, under the supervision of Vitaly Bergelson. I would like to thank Vitaly and the Department of Mathematics of OSU for their great hospitality and for providing excellent working conditions, and also Carlos Matheus for, once more, inviting me to write in this blog, which is among the 50 best blogs for Math majors (see the ranking here).
The theorem is a topological counterpart to Bourgain's theorem on the a.e. existence of ergodic averages along polynomials (see ERT3) for -actions. It extends a previous work of one of Bergelson's students, Ronnie Pavlov, who proved it for -actions.
The idea is to construct an increasing sequence of finite alphabets/words with controlled combinatorial and statistical properties in such a way that we have freedom of combinatorics at a set which is negligible (in the sense of zero upper-Banach density) inside .
As the slides are self-contained, it would be a waste of time for me and a test of patience for the reader to repeat everything, so I'm just making them available here. I hope you learn something from them!
Axiom A versus Newhouse phenomena for Benedicks-Carleson toy models
A few days ago, Carlos Gustavo (Gugu) Moreira, Enrique Pujals and I uploaded to arXiv our paper Axiom A versus Newhouse phenomena for Benedicks-Carleson toy models.
In a previous post, I gave a rough outline (and a link to a short announcement already published by Oberwolfach Reports) of this paper. In particular, we discussed Smale’s conjecture and the main known obstruction to the validity of its analogue among diffeomorphisms of surfaces, namely, Newhouse phenomena (a local mechanism to create robust homoclinic tangencies using -stable intersections of dynamically defined Cantor sets).
However, after taking a look at this previous post, I found out that, although we had three sources of motivation for this project, I have mentioned only two of them. More precisely, our strategy can be summarized in a few words as:
- Try to extend (to the two-dimensional setting) a “modern” proof nicely sketched in the excellent book of de Melo and van Strien of a theorem of M. Jakobson on the -denseness of Axiom A among unimodal maps of the interval (i.e., Smale’s conjecture in one dimension);
- During this process, apply Gugu’s theorem about the non-existence of -stable intersections of dynamical Cantor sets to get rid of the eventual tangencies (our potential enemies [in view of Newhouse phenomena]) obstructing Axiom A;
- Because it isn’t clear how to perform the previous scheme starting with arbitrary diffeomorphisms of surfaces (since a priori robust tangencies aren’t the sole enemies), we restrict our considerations to the so-called Benedicks-Carleson toy models: they are a simplified version of Henon dynamics with sufficiently simple geometry (so that our ideas could be tested without entering a huge number of technical troubles) but sharing part of the dynamical complications of (true) Henon maps (e.g., they exhibit some sort of Newhouse phenomena). See our paper for the precise definition of these toy models.
Concerning these three items, the quoted post touches upon the last two topics. Therefore, besides giving some publicity to the preprint, I decided to take the opportunity to talk about Jakobson’s theorem and how it connects to the strategy of our paper.
--denseness of Axiom A among unimodal maps-
Theorem 1 (Jakobson) Axiom A is -dense among the set of unimodal maps of the interval .
Remark 1 Recall that a map of the interval is called unimodal if has exactly one critical point .
The original argument of Jakobson is a little bit long because he deals with multimodal maps (i.e., interval maps with several critical points) and his criterion of uniform hyperbolicity is somewhat technical. For these reasons, as we already pointed out, we will adopt here a modern strategy outlined in the excellent book of W. de Melo and S. van Strien: using Mañé’s criterion of hyperbolicity for interval maps (certainly not available at the time of writing of Jakobson’s paper), we know that uniform hyperbolicity holds for the set of points whose orbits stay “away” from the critical point. Thus, it remains only to analyze the dynamics near the critical point. At this point, the idea is the following: after a -small perturbation, it is possible to put the orbit of the critical point inside the basin of attraction of a sink (i.e., an attracting periodic point). Once we get this fact, the proof of Jakobson’s theorem is complete, since it shows that the points of the non-wandering set other than the sinks must stay far away from the critical point. Indeed, if the orbit of such a point came very close to the critical point, it would fall into the basin of a sink (since we are assuming that the critical point is absorbed by a sink). In particular, this point would be wandering, a contradiction. By Mañé’s criterion, it follows that the non-wandering set is the union of finitely many sinks and a compact invariant hyperbolic set (i.e., the map is Axiom A).
Remark 2 Usually, Axiom A requires that the periodic points are dense in the non-wandering set and the non-wandering set is hyperbolic. However, since the endomorphisms considered above are not invertible, generally speaking the non-wandering set is not invariant by backward iteration, so that a slight modification of the Axiom A is needed.
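The scenario at the heart of this outline, a critical point whose orbit is absorbed by a sink, is easy to observe numerically. The following sketch uses the logistic family $x\mapsto ax(1-x)$ on $[0,1]$ with $a=3.2$, which is my own choice of a standard unimodal example and not a map taken from the text. It follows the orbit of the critical point $1/2$ and measures the multiplier of the periodic orbit it converges to.

```python
def logistic(x, a=3.2):
    """A unimodal map of [0, 1] with critical point 1/2 (illustrative choice)."""
    return a * x * (1.0 - x)


def logistic_derivative(x, a=3.2):
    return a * (1.0 - 2.0 * x)


def critical_orbit_limit(a=3.2, transient=2000):
    """Iterate the critical point and return two consecutive late iterates."""
    x = 0.5
    for _ in range(transient):
        x = logistic(x, a)
    return x, logistic(x, a)


x1, x2 = critical_orbit_limit()
# Multiplier of the period-2 orbit {x1, x2}: derivative of the second iterate.
multiplier = logistic_derivative(x1) * logistic_derivative(x2)
print(x1, x2, multiplier)
```

For this parameter the critical orbit settles on a periodic orbit of period 2 whose multiplier has absolute value less than 1, so the critical point lies in the basin of a sink, as in the outline above.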
Bearing this plan in mind, let us begin the proof of Jakobson’s -density result. We begin with a fundamental criterion of hyperbolicity due to Mañé:
Theorem 2 (Mañé) Let be a endomorphism of the interval and be an open neighborhood of the set of critical points of . Denote by the union of the basins of attractions of the sinks of . It holds:
- any periodic orbit of inside with sufficiently high period is a source;
- if every periodic point of inside is a source, then there are some constants and such that for any .
Corollary 3 (Mañé) Let be a endomorphism of the interval and be a compact -invariant set. Assume that all periodic points of are sources and does not contain critical points of . Then, is hyperbolic, i.e., there are constants and such that for all and .
Remark 3 The reader should pay attention to the regularity hypothesis in the statements above (which is a necessary assumption).
Keeping these results in our toolbox, we are ready to begin the proof of the density of Axiom A among unimodal maps. Let be a unimodal map of the interval . Without loss of generality, we can suppose that:
- is
- is the critical point of and
- is Kupka-Smale (all periodic points are hyperbolic).
For the sake of simplicity, we denote by the set of endomorphisms of verifying the three conditions above.
Proposition 4 For a typical (Baire generic) , the critical point is recurrent or it falls into the basin of some sink.
Proof: For each , let be the set of such that either for some or falls into the basin of a sink of . We claim that is open and dense (for every ). Indeed, since is clearly open, our task is reduced to showing that is dense.
Given and , consider
Note that is a compact invariant set without critical points and sinks. Since is Kupka-Smale, every periodic point inside must be a source, so that Corollary 3 implies that is hyperbolic, i.e., for some , , we have for all . In particular, is a Cantor set (i.e., a compact set with empty interior). In fact, if the interior of were non-empty, say where is a non-trivial interval, it would follow that so that the condition of hyperbolicity implies for all . Thus, when , a contradiction with and . On the other hand, since , it follows that .
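The empty-interior step can be spelled out as follows. This is only a sketch in notation of our own choosing, since the symbols of the original formulas are not reproduced here: $\Lambda$ denotes the hyperbolic set, $C>0$ and $\lambda>1$ the hyperbolicity constants, $I$ the interval on which $f$ is defined, and $J\subset\Lambda$ a hypothetical non-trivial interval.

```latex
% Assumed notation (ours): |Df^n(x)| \ge C\lambda^n for x \in \Lambda, and J \subset \Lambda an interval.
% Since \Lambda is invariant and contains no critical points, f^n is monotone on J, hence
\[
  |f^n(J)| \;=\; \int_J |Df^n(x)|\,dx \;\ge\; C\lambda^n\,|J| \;\longrightarrow\; \infty
  \qquad (n\to\infty),
\]
% whereas f^n(J) \subset \Lambda \subset I forces |f^n(J)| \le |I| for every n.
```

The two bounds are incompatible for large $n$, which is the contradiction used in the proof.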
At this point, we use the following very simple idea: because is a Cantor set, the condition cannot be generic (so that it can be destroyed by small perturbation). More precisely, we consider a -small perturbation of supported on such that . We claim that . Indeed, since coincides with outside , we get
Thus, if , we have
Hence, using that by construction, we obtain , that is, is attracted by a sink of . However, since the orbit of under (and consequently ) never touches the interval , one sees that the orbit of the sink of attracting belongs to . Therefore, the sink of attracting is also a sink of . In other words, , i.e., , a contradiction. This shows that is open and dense for any .
Finally, we complete the proof of the proposition by taking the residual set of . Clearly, any verifies the statement of the proposition. This finishes the proof.
Remark 4 The argument used in the proof of this proposition does not rely on the -topology so that a similar statement holds with respect to the -topology for any .
Up to now, we have not used the -topology in our discussion. However, the next proposition strongly relies on the -topology and it is the main obstruction for an extension of Jakobson’s result to the -topology, for instance.
Proposition 5 (Flatness perturbation) Let such that is recurrent, i.e., . Then, there exists an arbitrarily small -perturbation of such that belongs to the basin of attraction of a sink of , i.e., .
Proof: Since the management of this argument is a little bit hard (while the idea behind it is quite clear), we’ll just sketch the proof of the proposition. We start by selecting a large integer such that is very close to and for all .
Denote by . Since , we can make arbitrarily small and we can select such that . For the sake of convenience, we take minimal for the property . We have two possibilities:
- (a) is closer to than , i.e., ;
- (b) is closer to than , i.e., .
In the first case (a), we apply a “Closing Lemma” argument: take to be the middle point between and and a small open neighborhood of with for all . See the figure below.
Case (a): is closer to 0 than .
In this situation, we modify inside as follows: denoting by , take -close to so that , has a unique critical point at and outside . It follows that is unimodal with critical point such that is periodic (the fact that the recurrent point becomes periodic justifies the name “Closing Lemma” for this argument). By a -perturbation of , we obtain verifying the conclusion of the proposition (in fact, it turns out that here the critical point itself is a super-attracting sink).
Finally, in the second case (b), we introduce the “flatness perturbation”: take to be the middle point between and and a small open neighborhood of such that for all with . See the figure below.
Case (b): is closer to than 0.
In this context, we modify as follows. Take -close to so that is constant on and . Observe that this perturbation can be done (only) in the -topology since we know that is a critical point of and are fairly close to . Next, note that is a periodic point of with period . Moreover, is a super-attracting sink of (because is constant on and thus ). This allows us to make a -perturbation of so that is unimodal with derivative almost zero (but never vanishing) on and possesses a sink whose basin contains . See the figure below.
Flatness perturbation -- Case (b).
Consequently, is -close to and . This completes the proof.
Once these two propositions are proved, Jakobson’s theorem follows directly. Indeed, given a endomorphism of the interval , we can assume that (as discussed before). Next, we approximate by a “typical” so that Proposition 4 says that either or .
If , say the critical point falls by iteration into the basin of a sink of , we see that there exists such that the neighborhood falls into the basin of attraction of the sink . On the other hand, since , we know that is the union of a uniformly expanding hyperbolic set and a finite number of sinks. In particular, we get that is an Axiom A endomorphism arbitrarily -close to .
If (i.e., the critical point is recurrent), we apply the proposition 5 to get an endomorphism arbitrarily -close to so that . Therefore, the discussion of the previous paragraph gives us some Axiom A endomorphism arbitrarily -close to .
Thus, in any case, can be -approximated by an Axiom A endomorphism. This completes the proof of Jakobson’s theorem.
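To make the phrase “the critical point falls into the basin of a sink” concrete, here is a small numerical sketch. It is not part of the proof: we use the quadratic family $x \mapsto r x(1-x)$ on $[0,1]$ (an assumption of ours, chosen only because its critical point $1/2$ is explicit), and the function names are hypothetical. The code follows the critical orbit, looks for an attracting periodic orbit on which it settles, and returns the period together with the multiplier (the derivative of the return map along the orbit).

```python
def logistic(r, x):
    """Quadratic map f_r(x) = r x (1 - x) on [0, 1]; its critical point is 1/2."""
    return r * x * (1.0 - x)


def logistic_derivative(r, x):
    """Derivative of f_r at x."""
    return r * (1.0 - 2.0 * x)


def critical_orbit_sink(r, burn_in=5000, max_period=64, tol=1e-9):
    """Follow the critical orbit and look for a sink that attracts it.

    Returns (period, multiplier) if, after burn_in iterates, the critical
    orbit is numerically periodic with |multiplier| < 1; otherwise None.
    """
    x = 0.5
    for _ in range(burn_in):
        x = logistic(r, x)
    y = x
    multiplier = 1.0
    for period in range(1, max_period + 1):
        multiplier *= logistic_derivative(r, y)  # chain rule along the orbit
        y = logistic(r, y)
        if abs(y - x) < tol and abs(multiplier) < 1.0:
            return period, multiplier
    return None


# r = 3.2: the critical point is attracted by a sink of period 2.
print(critical_orbit_sink(3.2))
# r = 4: the critical orbit lands on the repelling fixed point 0, so no sink is found.
print(critical_orbit_sink(4.0))
```

For $r = 3.2$ the multiplier of the period-2 orbit can be computed by hand as $4 + 2r - r^2 = 0.16$, which is what the sketch recovers numerically. A `None` answer only means that no sink was detected within the chosen numerical limits.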
- A few comments on -denseness of Axiom A for Benedicks-Carleson toy models-
As we saw in the previous section, the “modern” proof of Jakobson’s theorem has two ingredients: Mañé’s hyperbolicity criterion and adequate perturbations to force critical points to fall into the basins of sinks. Of course, in the two-dimensional setting of Benedicks-Carleson toy models (corresponding to our article), we can’t directly use Mañé’s hyperbolicity criterion and we have a Cantor set of critical points (which we must handle simultaneously). In particular, we replace Mañé’s criterion by its two-dimensional version, namely, the theorem B of E. Pujals and M. Sambarino. This theorem is a fundamental tool used by Pujals and Sambarino in their solution of the Palis conjecture for -surface diffeomorphisms (stating that Axiom A and homoclinic tangencies are -dense among surface diffeomorphisms). In our context, the theorem B of Pujals-Sambarino implies that any maximal invariant set (of Benedicks-Carleson toy models) far away from the critical set is hyperbolic.
Using this fact, it remains to deal with the critical points. Recall that, in the one-dimensional setting, it was very easy to show that the unique critical point is typically recurrent: in fact, since the set of orbits staying away from the critical point forms a Cantor set (by Mañé’s criterion), we can make small perturbations so that the orbit of the critical point avoids this Cantor set. However, in the two-dimensional setting, we have to ensure that the orbits associated to a Cantor set of critical points avoid the stable manifold of a hyperbolic set. Of course, this is a subtle problem because the perturbations removing some critical points from the stable manifold of a hyperbolic set may create other critical points belonging to this stable manifold. The attentive reader sees that this kind of problem resembles Newhouse phenomena (and stable intersections of Cantor sets) and, actually, this is the case: morally speaking, critical points belonging to stable manifolds of hyperbolic sets correspond to certain (heteroclinic) tangencies. Thus, we apply Gugu’s theorem (on the non-existence of -stable intersections of Cantor sets) to get recurrence of critical points. Finally, using the simple geometry of these models, we combine the recurrence of critical points with an appropriate “flatness” perturbation argument to ensure that the Cantor set of critical points -typically falls into the basins of a finite number of sinks.
In any case, this completes our discussion. See you!
ERT11: Conjugation, equivalence and similarity of measure-preserving systems
In order to characterize compact systems, we discuss the notions of conjugacy between measure-preserving systems. We will do much more than needed for this matter, as I think these notions are important for any ergodic theorist.
Given two mps and , we want to investigate when they are compatible in some sense. There are three main notions in this respect:
- Similarity.
- Conjugacy between boolean algebras.
- Spectral equivalence.
The main reference of this lecture is Lectures on Ergodic Theory, by Paul Halmos.
1. Similarity
Definition 1 Two mps and are similar if there exists an invertible bimeasurable transformation such that
In this case, is called an isomorphism.
Notice that only needs to be defined at almost every point of . The main invariant of similarity is metric entropy.
Theorem 2 If and are similar, then the entropy of with respect to is equal to the entropy of with respect to .
The above result gave the first proof that the two-symbol and three-symbol shifts, both endowed with the natural measures, are not similar. Similarity is the most usual notion we will discuss. Because of this, many examples can be found in any book of ergodic theory.
2. Conjugacy between boolean algebras
In many situations, the difficulty in constructing isomorphic bijections from isomorphisms is to deal with sets of measure zero. Because of this, we discard them by considering the set
of all subsets of measure zero. This is an ideal of in the following sense:
- 1. If and , then .
- 2. If , then .
These properties allow the definition of the quotients
is a set of equivalence classes defined by the equivalence relation
and, if is the equivalence class of ,
defines a function on . Obviously, these do not depend on the representative of the class.
Definition 3 is the boolean algebra and the measure algebra of the probability space .
is a boolean algebra under the natural boolean operations for sets: union, intersection and complement. It is clear that
that is, is the only element of zero measure and, by the same reason, is the only element of total measure.
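For the reader’s convenience, here is one standard way to write the objects of this section. The symbols are our own labels and may differ from those used in the formulas above: $(X,\mathcal{A},\mu)$ is the probability space, $\mathcal{N}$ the ideal of null sets, $[A]$ the class of $A$, and $\tilde{\mathcal{A}}$, $\tilde{\mu}$ the quotient algebra and the induced measure.

```latex
\[
  \mathcal{N} = \{A \in \mathcal{A} : \mu(A) = 0\}, \qquad
  A \sim B \iff A \,\triangle\, B \in \mathcal{N},
\]
\[
  \tilde{\mathcal{A}} = \mathcal{A}/\mathcal{N}, \qquad
  \tilde{\mu}([A]) = \mu(A), \qquad
  \tilde{\mu}([A]) = 0 \iff [A] = [\emptyset].
\]
% Independence of the representative: if A \sim B, then
% |\mu(A) - \mu(B)| \le \mu(A \triangle B) = 0.
```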
Definition 4 Two probability spaces and are conjugate if there exists an invertible bimeasurable map , called boolean isomorphism, such that
- is an isomorphism in the boolean algebra category: it preserves union, intersection and complement.
- .
A measure-preserving transformation on induces in a natural way a mapping of into itself: the image of an equivalence class under this mapping is defined by selecting a representative and forming the equivalence class of ,
The measure-preserving character of implies that the image class is unambiguously determined by this process and that the measure of the image class is the same as the measure of the original one. Because preserves , this is well-defined. Observe that is not a map sending points of to . Instead, it sends classes of subsets of to classes of subsets of .
Definition 5 Two mps and are conjugate if there exists an invertible bimeasurable map such that
is also called a boolean isomorphism, being implicit and .
Exercise 1 Prove that similarity implies conjugacy.
Conjugacy does not imply similarity in general. There exist some highly pathological measure spaces that are in a certain vague sense absolutely non-measurable. One way that this pathology shows up is that these measure spaces do not possess sufficiently many measure-preserving transformations to induce all the desired boolean isomorphisms, that is, the boolean algebra is richer than the original space. On the other hand, conjugacy does imply similarity in all decent measure spaces. By decent I mean Lebesgue spaces, which is a usual assumption in ergodic theory. From now on, we only consider Lebesgue spaces, so that we have the
Theorem 6 (von Neumann) Conjugacy implies similarity.
The proof of the above theorem may be found in the book Ergodic Theory and Differentiable Dynamics by Ricardo Mañé.
3. Spectral equivalence
Definition 7 Two mps and are spectrally equivalent if there is a unitary invertible operator such that
Proposition 8 Conjugacy implies spectral equivalence.
Proof: Let and be two conjugate mps and the conjugacy. The map defines an isometry of the characteristic functions of into . Because the characteristic functions generate a dense set in , the linear extension of the above association defines an isometry , which is surjective because is so.
If is an isomorphism between the mps and , then the induced unitary map from to preserves more than the norm and the linear structure of the spaces. The more comes from the fact that the elements of are not merely abstract vectors: being functions, they also have multiplicative properties. If are bounded, then also belongs to , and , are bounded as well. In this case, sends bounded functions to bounded functions and preserves their product. These properties turn out to be sufficient when going from spectral equivalence to similarity.
Proposition 9 (Multiplicative theorem) Let and be mps and a surjective unitary linear map such that
- (a) and send bounded functions to bounded functions;
- (b) , for every bounded functions .
Then and are conjugate.
Proof: Let be the characteristic function of . Because , and so is the characteristic function of a set . Also, if , then , such that . That is clear, because is unitary. is surjective because (as is multiplicative), is a characteristic function, for every .
preserves intersection because it is multiplicative. Also, if are the characteristic functions of , then is the characteristic function of , which shows that preserves union.
Exercise 2 In the notation of the above proposition, prove that is multiplicative.
Ok. If we have such in hand, the systems are conjugate. But when does exist?
Definition 10 The spectrum of is the set
If conjugates and and , say , then
that is, . This means spectral equivalence implies the equality . In the case of ergodic compact systems (discrete spectrum), this is a sufficient condition.
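The computation behind this invariance is short enough to record. In the sketch below the notation is assumed rather than taken from the formulas above: $U_T$ and $U_S$ are the Koopman operators of the two systems, $U$ is the unitary operator intertwining them, and the spectrum $\sigma(\cdot)$ is understood as the set of eigenvalues, which seems to be the meaning intended in the discrete spectrum setting.

```latex
% Assumed notation: \sigma(T) = \{\lambda \in \mathbb{C} : U_T f = \lambda f \text{ for some } f \neq 0\}.
\[
  U\,U_T = U_S\,U, \quad U_T f = \lambda f
  \;\Longrightarrow\;
  U_S(Uf) = U(U_T f) = \lambda\,Uf, \qquad Uf \neq 0 .
\]
% Hence \sigma(T) \subset \sigma(S); applying the same argument to U^{-1} gives \sigma(S) \subset \sigma(T).
```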
Exercise 3 Suppose and are unitary operators with discrete spectrum, where , are Hilbert spaces. Prove that and are equivalent iff .
Theorem 11 (Discrete spectrum theorem) If and are ergodic transformations with discrete spectrum, the following assertions are equivalent:
- and are similar;
- and are conjugate;
- and are spectrally equivalent;
- .
Proof: (a) ⇒ (b) ⇒ (c) ⇒ (d) are clear. (b) ⇒ (a) is Theorem 6. (d) ⇒ (c) is the previous exercise. Let us prove that (c) implies (b) by checking the conditions of Proposition 9. For each , let , be the unitary eigenfunctions of , associated to , respectively. Because and are ergodic, they are unique up to scalar multiple. We make the following
Assumption. .
Observe that, in general, this is not the case. Actually, what we have is that and are both eigenfunctions of so that . The assumption says we can assume . The reader can check this on page 46 of Lectures on Ergodic Theory.
Let be the linear map such that
By definition, . Also,
and so, by linearity and a denseness argument is multiplicative.
4. The representation theorem
We finally arrive at the expected result.
Theorem 12 (Representation theorem) An ergodic and compact mps is similar to a rotation on a compact abelian group.
Proof: The idea is to construct an ergodic and compact rotation with the same spectrum as the original one. The discrete spectrum theorem will guarantee the conclusion. Let be the mps and . Also, let be the character group of . Consider the element such that
It is clear that . It defines a rotation , , that preserves the Haar measure of . is discrete because of the properties of the characters of . In fact, they form an orthonormal basis of and, if is one of them, then
that is, is an eigenfunction whose eigenvalue is . By Pontryagin duality, the dual group is canonically isomorphic to . Such correspondence shows that the spectrum of is exactly . Moreover, by the same reason, every eigenvalue is simple, so that is ergodic. This concludes the proof.
Previous posts: ERT0, ERT1, ERT2, ERT3, ERT4, ERT5, ERT6, ERT7, ERT8, ERT9, ERT10.
ERT10: Compact Systems
We now turn attention to the structured situation. Consider a measure-preserving system . Remember that, by ERT8, this is the case when the Koopman-von Neumann operator has a basis of eigenvectors. Assume that is the multiset of eigenvalues and is the eigenvector associated to .
Definition 1 We say that is a compact system if
Let’s investigate an example: given , consider the rotation
If , , then
that is, each is an eigenvector. We know, by Fourier analysis, that generates the space , where is the Lebesgue measure on . This implies that is compact. In this setting, it is straightforward to prove multiple recurrence. Assume that is an interval and fix . We want to show the existence of such that
This is easy: given , take such that
for every . This implies that
whenever . If is such that ,
which implies that
Also, note that the set of satisfying is syndetic, so that the above argument actually proves that
The same argument holds for any with positive measure. In fact, by Lebesgue density theorem, given , there exists such that
which is the main inequality used in the above argument. We get our first result.
for any .
Note that the main tool is that, for various , the functions
are simultaneously close to each other. This is what we are looking for: structure (given by the algebraic structure of ) implying “almost periodicity”, as will be explained in the next section.
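The simultaneous closeness can be observed numerically. The sketch below is only an illustration with parameters chosen by us (the angle $\sqrt{2}$, $k = 3$, $\varepsilon = 0.05$, and the helper names are all hypothetical): it lists the times $n$ at which $n\alpha$ is within $\varepsilon/k$ of an integer, so that $n\alpha, 2n\alpha, \dots, kn\alpha$ are all within $\varepsilon$ of an integer, and it records the gaps between consecutive such times.

```python
import math


def dist_to_integers(t):
    """Distance from the real number t to the nearest integer."""
    return abs(t - round(t))


def simultaneous_return_times(alpha, k, eps, n_max):
    """Times n <= n_max with ||n alpha|| < eps / k.

    For such n, the triangle inequality gives ||j n alpha|| <= j ||n alpha|| < eps
    for every j = 1, ..., k.
    """
    delta = eps / k
    return [n for n in range(1, n_max + 1) if dist_to_integers(n * alpha) < delta]


alpha = math.sqrt(2.0)  # an irrational rotation angle (our choice)
k, eps = 3, 0.05
times = simultaneous_return_times(alpha, k, eps, 2000)
gaps = [b - a for a, b in zip(times, times[1:])]

print("first return times:", times[:8])
print("largest gap observed:", max(gaps))
```

The bounded gaps are the numerical shadow of syndeticity: the good times $n$ never become sparse.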
1. Almost periodicity
Consider a mps .
Definition 3 A function is almost periodic if the set is pre-compact in .
In other words, if is a compact subset of , considered with the norm topology.
We will see below that the systems for which every element of is almost periodic are exactly the compact ones.
At first impression, this does not seem to be the definition we want, but the following proposition clarifies the apparent disconnect.
Proposition 4 Given a mps and , the following are equivalent:
- is almost periodic.
- The restriction of to is a minimal homeomorphism of a compact metric space.
- For every , the set is syndetic.
Proof: (a) ⇒ (b). Note that the referred restriction is a transitive isometry, by definition. The assertion then follows from the
Exercise 1 Let be a compact metric space and an isometry on . Then is transitive iff it is minimal.
(b) ⇒ (c). It follows from
Exercise 2 Consider a homeomorphism of the metric space . Then is minimal iff, for any open and , the set
is syndetic. If in addition is an isometry, the above condition is equivalent to
being syndetic for any and .
(c) ⇒ (a). Given , the syndeticity of guarantees that the closure of is covered by finitely many balls . This proves the compactness of .
Using (c), we obtain
Proposition 5 Let be a mps and almost periodic. Then, for any and , the set
is syndetic.
Proof: Suppose . Then for every . Hence,
Corollary 6 (Multiple recurrence for almost periodic functions) Let be a non-negative almost periodic function such that . Then, for every ,
The next theorem will be proved in ERT11.
Theorem 7 A mps is compact if and only if every is almost periodic.
The if part is easily obtained by means of simultaneous diophantine approximations. For the converse, we need an algebraic characterization of compact systems, to be established in the next lecture.
Corollary 6 says that compact systems are multiply recurrent in the sense of (1). In particular, considering , we get
Corollary 8 (Multiple Poincaré Recurrence for compact mps) If is compact and , , then
for every .
The next post will characterize compact systems. They are the Kronecker systems: rotations in abelian groups. This is why the example of studied above deserves attention.
Previous posts: ERT0, ERT1, ERT2, ERT3, ERT4, ERT5, ERT6, ERT7, ERT8, ERT9.
“Vers une vision globale des systèmes dynamiques”
Today I gave a general audience talk (whose title was the same as that of this post) about Dynamical Systems for the Matinée ChADoC at Collège de France. The basic idea was to bring together 6 PhD students and postdocs from several areas of knowledge (Chemistry, Anthropology, Biology, Indian History, Mathematics and French Literature) to give short talks (i.e., 25 minutes) about their respective research areas.
Remark. For those of you who are curious about the origin of the word ChADoC, I should say that its literal meaning is “Chercheurs Associés et Doctorants” (Associated Researchers and PhDs in a free translation), but actually it is a play on the cartoon “Shadok” (in the “theatre of the absurd” style), which was a great success in France (during the sixties/seventies).
I believe that this matinée was a great opportunity to learn about nanotechnology and its applications, animal sacrifice rituals in Mexico, homeoproteins, Skandasvamin and the Indian history, and Marcel Proust’s work.
Finally, the slides of my talk (in French) are here. Of course, since it is the first time I try to write something in French, please don’t take my mistakes too seriously!
The local linearization problem of generalized interval exchange maps
Two weeks ago Stefano Marmi posted on arXiv his joint paper with Pierre Moussa and Jean-Christophe Yoccoz about a local conjugation theorem for generalized interval exchange transformations (see this link for the preprint). Morally speaking, the main goal of the paper is the extension of the theory of smooth linearization of circle diffeomorphisms (of V. Arnold, M. Herman, H. Rüssmann, J.-C. Yoccoz, etc.) to the case of interval exchange transformations (i.e.t. for short), i.e., they show that, for almost every standard i.e.t. (i.e., is locally a translation), the local -conjugacy class of amongst generalized i.e.t. close to with trivial conjugacy invariant (i.e., no non-trivial obstructions to conjugation at the level of the first derivatives) is a submanifold of explicitly computable codimension.
The basic idea of S. Marmi, P. Moussa and J.-C. Yoccoz is the following: usually, the conjugation problem can be understood from the corresponding cohomological equation, i.e., the linearized version of the conjugation equation; in this respect, the cohomological equations related to i.e.t.’s were already studied (by e.g. G. Forni and S. Marmi, P. Moussa, J.-C. Yoccoz), so that we dispose of nice criteria for its solvability; in particular, if we can convert a solution of the linearized (cohomological) equation into a solution of the conjugacy equation, we are done.
Of course, the previous strategy should be worked out in detail: for instance, the results of Marmi, Moussa and Yoccoz about the cohomological equations of i.e.t.’s don’t apply directly to the situation at hand and, even after obtaining the necessary results, it is not clear how to convert solutions of cohomological equations into the desired conjugations. Because of the limitation of space, I’ll focus today on the discussion of the second problem (namely, the conversion of solutions of cohomological equations into true conjugations) in the simpler context of circle diffeomorphisms. More precisely, we are going to study M. Herman’s simple proof of a local conjugacy theorem for circle diffeomorphisms using the Schwarzian derivative trick. The reason why we’re restricting our discussion to this particular topic is two-fold: besides the fact that Herman’s trick gives a simple method to exhibit conjugations once we can solve cohomological equations, it is so nice that it can be generalized to the case of i.e.t. (and this is one of the crucial remarks of Marmi, Moussa and Yoccoz). My basic references for the material below are M. Herman’s original article, Appendix B of the Marmi, Moussa, Yoccoz preprint and my notes of Yoccoz’s 2009-2010 course (at Collège de France) on this paper.
Let’s warm up with the following local conjugacy problem on the circle : given a smooth diffeomorphism close to the rotation of irrational angle , we want to know when is smoothly conjugated to , i.e., we’re searching for a circle diffeomorphism satisfying the conjugacy equation. Of course, the nonlinear nature of the conjugacy equation indicates that we shouldn’t try to attack it directly: in fact, we’ll pursue the standard strategy of linearizing the conjugacy equation in order to get an idea of how to solve the original problem. More precisely, we consider the following ansatz: since is close to , we’ll restrict ourselves to smooth conjugacies close to the identity. In other words, we write and . In this case, becomes , i.e.,
If we think of and as perturbation terms, the first order approximation of is , so that the linearized version of previous equation is the so-called cohomological equation
Here is our initial data and we’re searching for a solution of this linear equation. The discussion of solutions of cohomological equations is a recurrent theme in Dynamical Systems (with several applications such as Furstenberg’s example of a minimal non-ergodic area-preserving analytic diffeomorphism of the two-dimensional torus) and the curious reader can look at Hasselblatt and Katok’s book (and references therein) for more details.
In the present case, the cohomological equation can be solved in Sobolev scale by Fourier analysis. By taking the Fourier transform in the cohomological equation, we obtain
By comparison of the Fourier coefficients, we get
for every . Observe that the zero-th Fourier mode gives the normalization condition (which is a necessary condition for the solvability of the cohomological equation). For the sake of simplicity, we take . Observe that there is no loss of generality here since (where is a constant) solves the cohomological equation whenever is a solution of this equation.
From the previous formula, we see that the Sobolev regularity of depends on the sizes of the so-called small divisors , i.e., on the Diophantine properties of .
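The Fourier recipe can be tested on a trigonometric polynomial. The sketch below is ours and makes an assumption about conventions: we write the cohomological equation as $\psi(x+\alpha) - \psi(x) = \varphi(x)$, so that $\hat{\psi}(n) = \hat{\varphi}(n)/(e^{2\pi i n\alpha} - 1)$ for $n \neq 0$; the post’s own sign and normalization may differ. The function names, the golden mean angle and the sample data $\varphi(x) = \cos(2\pi x) + \tfrac12\sin(6\pi x)$ are hypothetical choices for the illustration.

```python
import cmath
import math


def evaluate(coeffs, x):
    """Evaluate the trigonometric polynomial sum_n c_n exp(2 pi i n x)."""
    return sum(c * cmath.exp(2j * math.pi * n * x) for n, c in coeffs.items())


def solve_cohomological(phi_hat, alpha):
    """Fourier coefficients of the zero-mean psi with psi(x + alpha) - psi(x) = phi(x).

    phi_hat maps each frequency n to the n-th Fourier coefficient of phi.
    The zero-th coefficient of phi must vanish (solvability condition).
    """
    if abs(phi_hat.get(0, 0.0)) > 1e-14:
        raise ValueError("phi must have zero mean")
    psi_hat = {}
    for n, c in phi_hat.items():
        if n == 0:
            continue
        small_divisor = cmath.exp(2j * math.pi * n * alpha) - 1.0
        psi_hat[n] = c / small_divisor
    return psi_hat


alpha = (math.sqrt(5.0) - 1.0) / 2.0  # golden mean rotation angle (our choice)
# phi(x) = cos(2 pi x) + 0.5 sin(6 pi x), written through its Fourier coefficients
phi_hat = {1: 0.5, -1: 0.5, 3: -0.25j, -3: 0.25j}
psi_hat = solve_cohomological(phi_hat, alpha)

sample_points = [j / 50.0 for j in range(50)]
residual = max(
    abs(evaluate(psi_hat, x + alpha) - evaluate(psi_hat, x) - evaluate(phi_hat, x))
    for x in sample_points
)
print("max residual of the cohomological equation:", residual)
```

For a trigonometric polynomial only finitely many small divisors appear, so there is no loss of regularity to see here. The difficulty discussed in the text only shows up when infinitely many Fourier modes are present.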
More precisely, we consider the Sobolev spaces
where for the sake of definiteness, and the Diophantine conditions
where and .
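To get a feeling for Diophantine conditions, one can estimate the best constant numerically. In the sketch below we assume one common convention, namely $\|q\alpha\| \geq \gamma\, q^{-(1+\tau)}$ for all $q \geq 1$, where $\|\cdot\|$ is the distance to the nearest integer; this may differ from the normalization used in the post, and the function names are hypothetical. We compare the golden mean (which is as badly approximable as possible) with a number lying very close to $1/3$.

```python
import math


def dist_to_integers(t):
    """Distance from the real number t to the nearest integer."""
    return abs(t - round(t))


def diophantine_constant(alpha, tau, q_max):
    """Empirical gamma: the minimum of q^(1 + tau) * ||q alpha|| over 1 <= q <= q_max."""
    return min(
        q ** (1.0 + tau) * dist_to_integers(q * alpha) for q in range(1, q_max + 1)
    )


golden = (math.sqrt(5.0) - 1.0) / 2.0
near_rational = 1.0 / 3.0 + 1e-9  # extremely well approximated by 1/3

gamma_golden = diophantine_constant(golden, 0.0, 5000)
gamma_near_rational = diophantine_constant(near_rational, 0.0, 5000)

print("golden mean:  ", gamma_golden)
print("near-rational:", gamma_near_rational)
```

The golden mean keeps its small divisors uniformly under control even with $\tau = 0$, while the second angle produces a tiny constant already at $q = 3$. Of course, a finite computation only suggests, and does not prove, a Diophantine condition.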
Since , we can derive from (1) the following proposition:
Proposition 1 Let and . Suppose that . Then, the solution of the cohomological equation obtained from (1) verifies and
Before proceeding further, let’s make a few remarks about this proposition.
Remark 1 In other words, we are able to solve the cohomological equation (in Sobolev scale) with a controlled loss of derivative depending on the strength of the Diophantine properties of (i.e., we start with and we end up with ). This loss of derivatives phenomenon is well-studied in Dynamical Systems and it can’t be avoided in general.
Remark 2 While the Sobolev scales are useful for many purposes (e.g., in Harmonic Analysis and PDEs), they are less handy when dealing with Dynamical Systems for the following simple reason. Generally speaking, the nonlinear terms of several important PDEs have a polynomial nature (e.g., they are obtained by taking powers of our functions), so that Sobolev spaces can be handy because e.g. they form an algebra with respect to the multiplication when the regularity index is sufficiently high. However, the nonlinear terms of important equations related to Dynamical Systems (e.g., the conjugation equation) are obtained by composition of our functions, and unfortunately Sobolev spaces are badly behaved with respect to composition. For this reason, it is pretty common to find the Sobolev scale in PDE problems and the (and/or Hölder) scale in Dynamics problems. For instance, a major problem related to this difficulty is the extension of Dolgopyat-type estimates (for exponential mixing) from the hyperbolic case (where Sobolev-like scales can be used) to the case: indeed, although this seems a technical (minor) regularity problem, it is one of the main obstacles to studying the rate of mixing of the Lorenz attractor.
In the light of the previous remark, we state the following version of the previous proposition to the Hölder scale:
Theorem 2 (Rüssmann, Herman, …) Let and . Suppose that and . Then, the solution of (1) verifies and .
The proof of this result is similar in spirit to the Sobolev case: one considers the Littlewood-Paley decomposition and applies Hadamard’s interpolation inequalities to handle the Hölder norms. The details can be found in the fourth chapter of M. Herman’s article.
At this stage, our understanding of the cohomological (i.e., linearized conjugacy) equation on the circle is sufficiently developed and we can pass to the study of the initial nonlinear (conjugacy) problem.
-Herman’s Schwarzian derivative trick-
The main result of this post is:
Theorem 3 (M. Herman) Let be a circle diffeomorphism -close to the irrational rotation . Suppose that with . Then, for a unique circle diffeomorphism -close to and close to . Furthermore, the map is .
Remark 3 By direct inspection of the statement, the reader can see that it is not optimal in several senses: we lose derivatives to solve the conjugacy equation, we don’t treat all Diophantine conditions (since we assume ) although this covers a full Lebesgue measure set of angles , etc. However, the relevance of this result lies in its flexible proof.
Remark 4 At first sight, the appearance of the extra rotation seems strange, but it is necessary to adjust the rotation number of : in fact, we know (from H. Poincaré’s work) that, if is conjugated to a rotation , then its rotation number must be . In other words, we can’t hope to find a conjugation between a diffeomorphism close to unless we use to match the rotation numbers.
The basic idea of M. Herman consists of a slight change of linearization operator: instead of taking the usual derivative to analyze the conjugacy equation, he “linearizes” it with the mildly nonlinear Schwarzian derivative (which behaves well under composition). We recall that the Schwarzian derivative of is
Amongst its main properties, we can quote:
- (a) if and only if with and ;
- (b) .
The geometrical meaning of the Schwarzian derivative is explained by (a): it measures how far is from being a fractional linear transformation. Also, the fact that the Schwarzian derivative is adapted to Dynamics problems is explained by (b): it interacts well with the composition operation.
Coming back to Herman’s theorem, let’s analyze the conjugation equation with the aid of the Schwarzian derivative: we rewrite this equation as and we apply the Schwarzian derivative to get:
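For reference, the standard formulas behind the definition and the properties (a) and (b) are recalled below. The letters $f$, $g$, $a$, $b$, $c$, $d$ are generic labels of ours; they are the classical textbook statements and not a transcription of the post’s displays.

```latex
\[
  Sf \;=\; \frac{f'''}{f'} - \frac{3}{2}\left(\frac{f''}{f'}\right)^{2}
     \;=\; \left(\frac{f''}{f'}\right)' - \frac{1}{2}\left(\frac{f''}{f'}\right)^{2}
     \;=\; (\log f')'' - \frac{1}{2}\bigl((\log f')'\bigr)^{2},
\]
\[
  \text{(a)}\quad Sf \equiv 0 \iff f(x) = \frac{ax+b}{cx+d}, \quad ad - bc \neq 0,
  \qquad\qquad
  \text{(b)}\quad S(f\circ g) = (Sf\circ g)\,(g')^{2} + Sg .
\]
```

In particular, a rotation $R$ of the circle has $R' \equiv 1$ and $SR \equiv 0$, so (b) gives $S(h\circ R) = (Sh)\circ R$ and $S(R\circ h) = Sh$. This is the reason why composing with rotations costs nothing at the level of Schwarzian derivatives.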
i.e., we obtain the following “cohomological equation”:
This linear difference equation on resembles (1) except for the fact that the left-hand side depends on . Nevertheless, this suggests a fixed-point approach to find our solution : we introduce the operator
and we seek a solution of .
As we learn in ODE courses, we need good (Banach) functional spaces to perform fixed-point arguments. In this direction, we consider the spaces of circle diffeomorphisms with and of functions on with zero mean (in addition to the spaces of circle diffeomorphisms and of functions on ).
We begin with two simple exercises:
Exercise 1 Show that , , is a map whose differential at is
Exercise 2 Show that (defined by ) is a map and is a diffeomorphism near the identity. (Hint: is because is a function. Furthermore, the derivative of at the identity is , so that the inverse function theorem guarantees that is a local diffeomorphism near the identity).
In the sequel, the local inverse of near identity is denoted by (the letter P stands for “primitive”) and its derivative at is denoted by .
To reinforce our arsenal of operators, we use Theorem 2 to construct such that, for every ,
that is, is the solution of the cohomological equation with initial data . Observe that Theorem 2 ensures that is a bounded operator because, by hypothesis, , .
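To make the role of the cohomological equation more concrete, here is a minimal numerical sketch. It assumes the equation has the usual form $u(x+\alpha) - u(x) = v(x)$ on the circle, with $v$ a zero-mean trigonometric polynomial, and solves it coefficient by coefficient in Fourier space. The function names `solve_cohomological` and `evaluate`, the choice of $\alpha$ and the sample $v$ are all hypothetical and only serve as an illustration.

```python
import cmath
import math


def solve_cohomological(v_hat, alpha):
    """Fourier coefficients of u solving u(x + alpha) - u(x) = v(x).

    v_hat maps a nonzero integer frequency k to the coefficient of
    exp(2*pi*i*k*x) in v, so v has zero mean. The returned solution is
    normalised to have zero mean as well.
    """
    u_hat = {}
    for k, c in v_hat.items():
        if k == 0:
            raise ValueError("v must have zero mean")
        # Small divisor: close to 0 when k*alpha is close to an integer.
        divisor = cmath.exp(2j * math.pi * k * alpha) - 1.0
        u_hat[k] = c / divisor
    return u_hat


def evaluate(coeffs, x):
    """Value at x of the trigonometric polynomial with the given coefficients."""
    return sum(c * cmath.exp(2j * math.pi * k * x) for k, c in coeffs.items())


# Assumed example data: the golden-mean angle and a small zero-mean v.
alpha = (math.sqrt(5) - 1) / 2
v_hat = {1: 0.5, -1: 0.5, 3: -0.25j, -3: 0.25j}
u_hat = solve_cohomological(v_hat, alpha)

# Largest defect of the equation over a grid of sample points.
residual = max(
    abs(evaluate(u_hat, j / 50 + alpha) - evaluate(u_hat, j / 50) - evaluate(v_hat, j / 50))
    for j in range(50)
)
```

The divisors $e^{2\pi i k\alpha} - 1$ are exactly the small divisors that a Diophantine condition on $\alpha$ keeps under control, which is why the solution operator is bounded only at the cost of some derivatives.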
In this notation, a fixed point of the operator solves the cohomological equation (2) modulo the constant , i.e., verifies
that is,
By choosing conveniently, we have the normalization and (to kill off the averages). On the other hand, the equation (3) says that
It follows from Exercise 2 that , i.e.,
where .
Hence, the proof of the main theorem will be complete once we can find fixed points of the operator . Keeping this goal in mind, we note that the differential of at is
Because this linear map is a (super) contraction on the variable , it follows from the implicit function theorem that has a unique fixed point close to the identity for every sufficiently close to (and the map is ). This ends the post.
Jacob Palis 70th birthday conference
During the last 10 days, I was attending J. Palis’ 70th birthday conference held at Atlantico Hotel, Buzios, Brazil. As one could expect from a conference of this size (~220 participants), it was a nice opportunity to talk to coauthors/friends and to learn several interesting tools. Also, since Buzios has plenty of superb beaches (very close to the hotel) and we had 2 hours of lunch time break, the conference atmosphere was an interesting mixture of intense and relaxing moments. Furthermore, I guess that the fact that the young and senior participants were together in the same place provided a great opportunity to the young generations to meet (and talk directly) with the senior generations, so that they could start new collaborations. In particular, I strongly believe that Jacob Palis and the organizers are very proud of this beautiful conference (which was extremely successful in many ways).
Concerning the (intense) conference program (with 5 plenary talks and 2 parallel sessions per day), you can find the details here, but I should say in advance that I’m planning to write some posts related to some talks (e.g., J.C. Yoccoz, A. Zorich, A. Avila, etc.).
In any case, this post is intended to accomplish two goals: firstly, I’m making the slides of Enrique Pujals’ talk (on Palis’ work) available here (with his permission, of course), and secondly, I’m fulfilling my (public) promise of putting the slides of my talk here. Remark: Please notice that these pdf files contain photos, so that their sizes are relatively big (~46MB and ~34MB resp.)
About Enrique Pujals’ talk, let me mention that it was a very touching moment: during the preparation of his slides, Enrique had the nice idea of inviting some of Jacob’s mathematical sons and nephews to draw some pictures in order to show how Palis’ influence appears in several contexts; in particular, each slide of Enrique’s talk was a small tribute from some son/nephew of Jacob (whose signature always appears in the corresponding slide). Also, Enrique put a lot of effort into making the exposition as funny as possible (e.g., when he showed a picture of Gugu’s son illustrating Newhouse phenomena with his hands, Enrique said that Jacob is still recruiting young people to Dynamical Systems). Finally, the talk ended with Enrique embracing Jacob and saying to him that student isn’t a word meant to refer to a past event, but to a present one.
About my talk, I have only to thank the organizers for the immense honor of giving the final talk of the conference: in fact, I was extremely touched (and nervous) by this invitation, especially because 10 years ago (during Palis’ 60th birthday conference), I was a 1st year graduate student trying to choose my primary research area (and of course my choice was particularly influenced by J. Palis). During the talk, I started by telling the audience about my first meeting with Palis. I guess that the first time Palis heard of me was during a visit of Jean-Pierre Serre. In fact, I was so excited to learn that J.P. Serre was at IMPA that I interrupted him in the middle of a calculation (in his office) to ask in Portuguese to take a photo with him. Of course, Serre didn’t understand my request and told Jacob (IMPA’s director in 1997) about a strange boy who went into his office like a crazy person. After this initial bad impression, during Palis’ 60th birthday conference (in 2000), I talked to him asking for some advice concerning my research career. Instead of giving me a long speech, Palis saw that I was planning to read his joint book with Floris Takens and he told me to read the book to decide if I liked the subject or not. I took his advice and asked for an autograph in this book. Again, instead of writing a long dedication, he simply put one decisive word of encouragement: Success (this appears as the second slide of my talk). After that, I talked about two mathematical results (joint with J.C. Yoccoz, and G. Forni and A. Zorich) and I closed the exposition with a selection of key (serious and funny) moments of the conference. I hope you will enjoy it!
ERT9: Weak Mixing implies Weak Mixing of all orders
Remember one of the characterizations we got in ERT8 of weak mixing: a mps is weak mixing if and only if, for any ,
that is, if and only if the sequence of functions converges in the -norm to .
For our interests, the notion of weak mixing will be useful only if the above property extends to multiple functions, because, as we saw in ERT4, the convergence of these sequences corresponds to multiple recurrence. This actually holds, as we will see below, and is called weak mixing implies weak mixing of all orders. This property will follow from two main ingredients:
- The product characterization of weak mixing can be extended to multiple products.
- The van der Corput trick allows an inductive argument.
It is worth mentioning that this is the second time the van der Corput trick appears in these lectures (the first one was in ERT2). The reader not used to it might think it is just a technical step in the proof, but it is actually an ingredient present in many situations of Ergodic Ramsey Theory when dealing with random components of mps. It has many versions, each of them suited to a particular notion of multiple recurrence. We first discuss the one we need.
1. The van der Corput trick
Theorem 1 (van der Corput trick) If is a bounded sequence in a Hilbert space and if
then .
Proof: Take such that . Notice that, for a fixed ,
goes to zero as and so the assertion is equivalent to
By the triangle inequality,
which goes to zero as .
The significance of this inequality is that it replaces the task of bounding a sum of coefficients by that of bounding a sum of “differentiated” coefficients . This trick is thus useful in “polynomial” type situations when the differentiated coefficients are often simpler (have smaller order) than the original coefficients.
The above theorem is written in a modern fashion. The original van der Corput trick is actually known as van der Corput difference theorem and comes from the theory of uniform distributions.
Definition 2 We say that a sequence is uniformly distributed if, for every interval ,
Exercise 1 Given a sequence , prove that the following assertions are equivalent.
- is uniformly distributed.
- For every continuous , where is the Lebesgue measure.
- (Weyl criterion) For any ,
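For reference, the definition of uniform distribution and the Weyl criterion are usually stated as below. This is a sketch in assumed notation: $(x_n)_{n\ge 1}$ is a sequence in $[0,1)$, $I$ an interval and $|I|$ its length; the indexing may differ from the one used in this post.

```latex
% Uniform distribution of (x_n) in [0,1), for every interval I of [0,1):
\[
\lim_{N\to\infty}\frac{1}{N}\,\#\{1\le n\le N : x_n\in I\} = |I|.
\]
% Weyl criterion, for every nonzero integer k:
\[
\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} e^{2\pi i k x_n} = 0 .
\]
```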
Theorem 3 (van der Corput difference theorem) A sequence is uniformly distributed if is uniformly distributed for every .
Proof: By Weyl criterion and the assumption, . Fix , and define . Then
converges to zero as . By van der Corput trick,
converges to zero which, again by Weyl criterion, proves that is uniformly distributed.
The previous result implies Weyl equidistribution theorem, which is a generalization of Kronecker theorem on the uniform distribution of , for irrational .
Theorem 4 (Weyl) If is a polynomial with at least one of its coefficients irrational, then , , is uniformly distributed.
Proof: Follows from Kronecker theorem by successive applications of Theorem 3.
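The statements above are easy to test numerically. The sketch below compares the sequences $n\alpha$ (Kronecker) and $n^2\alpha$ (Weyl) modulo $1$ for $\alpha=\sqrt{2}$, looking both at the frequency of visits to an interval and at the exponential averages from the Weyl criterion. The helper names, the value of $N$ and the test interval are assumptions made for this illustration, and a finite computation in floating point is of course only a sanity check, not a proof.

```python
import cmath
import math


def frac(x):
    """Fractional part of x."""
    return x - math.floor(x)


def interval_frequency(seq, a, b):
    """Fraction of the terms of seq that lie in [a, b)."""
    return sum(1 for x in seq if a <= x < b) / len(seq)


def weyl_sum(seq, k):
    """Modulus of the average of exp(2*pi*i*k*x) over x in seq."""
    return abs(sum(cmath.exp(2j * math.pi * k * x) for x in seq)) / len(seq)


alpha = math.sqrt(2)
N = 20000

# n*alpha mod 1 (Kronecker) and n^2*alpha mod 1 (Weyl, degree 2).
linear = [frac(n * alpha) for n in range(1, N + 1)]
quadratic = [frac(n * n * alpha) for n in range(1, N + 1)]

# Both frequencies should be close to the length 0.3 of the interval.
freq_linear = interval_frequency(linear, 0.2, 0.5)
freq_quadratic = interval_frequency(quadratic, 0.2, 0.5)

# Both exponential averages should be small.
sum_linear = weyl_sum(linear, 1)
sum_quadratic = weyl_sum(quadratic, 1)
```

Note that the differences $(n+h)^2\alpha - n^2\alpha = 2hn\alpha + h^2\alpha$ are linear in $n$, which is exactly how Theorem 3 reduces the quadratic case to Kronecker theorem.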
2. Weak mixing implies weak mixing of all orders
Remember that is weak mixing if and only if is ergodic whenever is ergodic. For each , let .
Proposition 5 If is weak mixing, then is ergodic, for all integers . In fact, it is weak mixing.
Proof: Note that is weak mixing, for any . To prove this, consider and of zero density such that
Let . This set has zero density, as
and the last fraction converges to zero.
We proceed by induction on . The case was proved in the last paragraph. Suppose that, for all integers , is ergodic. If ,
is the product of an ergodic and a weak mixing system, which proves our assertion.
The main result of this section is
Theorem 6 Let be weak mixing. Then, for any ,
in the -norm.
In other words, this means that, for any ,
The proof will consist in an inductive argument, with the use of van der Corput trick to reduce the case to and so on. By linearity of the expression, we assume that . In this setting, we want to prove that
where represents the norm of .
Proof: The case follows from the ergodicity of :
Suppose the result is true for and take . Define
Given , let . Then
By hypothesis,
Rewrite the above expression as
where and is given by
As is ergodic,
By the van der Corput trick,
which concludes the proof.
Corollary 7 (Multiple Poincaré Recurrence for weak mixing mps) If is weak mixing and , then
for every .
Proof: Just consider in (1).
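For orientation, the conclusions of Theorem 6 and Corollary 7 are usually written as follows. This is a sketch in assumed notation: $(X,\mathcal{B},\mu,T)$ is the weak mixing mps, $f_1,\dots,f_k\in L^\infty(\mu)$ and $A\in\mathcal{B}$; the indexing and normalization may differ from the ones used in this post.

```latex
% Weak mixing of all orders, with convergence in the L^2-norm:
\[
\lim_{N\to\infty}\frac{1}{N}\sum_{n=0}^{N-1}
(f_1\circ T^{n})\,(f_2\circ T^{2n})\cdots(f_k\circ T^{kn})
= \int f_1\,d\mu\;\int f_2\,d\mu\;\cdots\int f_k\,d\mu
\quad\text{in } L^2(\mu).
\]
% Multiple recurrence for a set A, taking all functions equal to the
% indicator of A and integrating against it:
\[
\lim_{N\to\infty}\frac{1}{N}\sum_{n=0}^{N-1}
\mu\bigl(A\cap T^{-n}A\cap\cdots\cap T^{-kn}A\bigr) = \mu(A)^{k+1}.
\]
```

In particular, the limit is positive as soon as $\mu(A)>0$, which is the multiple recurrence statement.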
Previous posts: ERT0, ERT1, ERT2, ERT3, ERT4, ERT5, ERT6, ERT7, ERT8.
ERT8: Weak Mixing Systems
1. The dichotomy between structure and randomness
The main tool used in ERT1 and ERT2 was: given a mps , we decomposed the space into two pieces: one structured, formed by the fixed (or periodic, depending on the case) functions, and the other random, formed by the functions for which the Cesàro averages converge to zero. This represents an example of the dichotomy that surrounds Ergodic Ramsey Theory: structure vs. randomness. This is actually briefly discussed at the end of ERT2 and in a broader way in this paper of Terence Tao.
This idea is also used in other branches of Mathematics, especially in combinatorics, harmonic analysis, number theory, etc. We can cite many situations:
- Theorems about the existence of ergodic averages.
- Szemerédi regularity lemma.
- Roth theorem on the existence of arithmetic progressions of length three in sets of positive density.
- Gowers norms.
- All proofs (to my knowledge) of Szemerédi theorem.
- Green-Tao theorem on the existence of arithmetic progressions in the primes.
- The Julia set of a quadratic polynomial is either connected or a Cantor set.
- Mañé-Bochi ’02: a $C^1$-generic conservative diffeomorphism of a surface either has all Lyapunov exponents zero almost everywhere or is Anosov.
- Avila-Moreira ’03: For almost every , the quadratic function is either regular (has a periodic attractor) or stochastic (has an invariant absolutely continuous probability with positive Lyapunov exponent).
- Avila-Forni ’07: almost every interval exchange transformation is either an irrational rotation or weak mixing.
- Avila ’10: for Schrödinger operators with a one-frequency and typical real analytic potential, the spectrum is either subcritical or supercritical.
Each situation has a notion of structure/randomness. The one we are interested in is multiple recurrence of mps. Let us see this from the spectral theory point of view.
Given a mps , denote also by the Koopman-von Neumann operator, defined by
When necessary, we use the notation to denote this operator. Many of its spectral properties are related to ergodic properties of . We investigate the eigenvalues/eigenfunctions of . If the eigenfunctions form a basis of , then is determined. In fact, let be the multiset (repeated with multiplicity) of eigenvalues of and, for each , the eigenfunction associated to . If
then
In this case, we say that has pure point spectrum and is a compact system. This constitutes the structured notion we were looking for.
In contrast, when has no eigenvalues other than and it is simple, we say that has continuous spectrum and is a weak mixing system. It forms the random part.
As pure point/continuous spectrum are opposite notions, there is a hope that every mps can be decomposed into two parts: one compact and the other weak mixing. This is not true at all. Instead, can be decomposed into several parts in such a way that every part is an extension of the previous one and it is a compact or weak mixing extension of the smaller one. In other words, the dynamics of is broken into many parts in which every breaking is obtained from the previous one by adding one of the two dynamical prototypes we discussed above.
Formally speaking, given two mps and , we say that is an extension of if there is a surjective measurable map such that
We denote this by and is called a factor of .
Theorem 1 (Furstenberg structural theorem) Given a mps , there exists an ordinal and a family of factors of , for every , such that:
- is a single point.
- is a compact extension of for every successor ordinal .
- is the inverse limit of , for every limit ordinal .
- is a weak mixing extension of .
Above, inverse limit is in the sense that . This result will be discussed in Lebesgue-full detail in the last post. In order to understand it, we need to study four concepts:
- Weak mixing systems.
- Compact systems.
- Weak mixing extensions.
- Compact extensions.
These will be the topics of this and the next 2 or 3 lectures.
2. Weak mixing systems
The definition used below is different from the one we assumed above, but don’t worry: they will be shown to coincide. Actually, we will obtain various equivalent definitions of weak mixing.
Definition 2 A mps is weak mixing if, for every ,
Exercise 1 Consider a bounded sequence of nonnegative real numbers. Prove:
- If , then Conclude that strong mixing implies weak mixing.
- If , then Conclude that weak mixing implies ergodicity.
Exercise 2 is weak mixing if and only if
for every .
Proposition 3 is weak mixing if and only if
for every such that .
We leave the proof to the reader; it may be found in the book Topics in ergodic theory of W. Parry. The notion of weak mixing means that, in some sense, almost all the system behaves in a strong mixing way. This is what the following lemma says.
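To fix ideas, the three notions compared in Exercise 1 are usually written as follows for measurable sets. This is a sketch in assumed notation: $(X,\mathcal{B},\mu,T)$ is the mps and $A,B\in\mathcal{B}$; the formulation with functions is analogous.

```latex
% Assumed notation: (X, \mathcal{B}, \mu, T) is the mps, A, B \in \mathcal{B}.
\begin{align*}
\text{ergodic:} \quad
 & \lim_{N\to\infty}\frac{1}{N}\sum_{n=0}^{N-1}
   \bigl(\mu(A\cap T^{-n}B)-\mu(A)\mu(B)\bigr)=0,\\
\text{weak mixing:} \quad
 & \lim_{N\to\infty}\frac{1}{N}\sum_{n=0}^{N-1}
   \bigl|\mu(A\cap T^{-n}B)-\mu(A)\mu(B)\bigr|=0,\\
\text{strong mixing:} \quad
 & \lim_{n\to\infty}\mu(A\cap T^{-n}B)=\mu(A)\mu(B).
\end{align*}
```

The only difference between the first two lines is the absolute value inside the average, and the third line is convergence without any averaging.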
Lemma 4 Consider a bounded sequence of nonnegative real numbers. Then
if and only if there exists a set of zero density such that
Proof: () Define, for each , the set
is an ascending chain of subsets of . They are the sets that may give problems in the convergence of to zero. Each of them has zero density, because
and then
In this way, take an increasing sequence of integers such that . Define and
By definition,
It remains to prove that has zero density. Consider an integer , let us say, with . As ,
and then
which, by (2), implies that
Then, has zero density.
() Let such that , for every . Given , we want to prove that
for every large enough. By hypothesis, there is for which
and
Then, for ,
Taking such that
we get
which concludes the proof.
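A toy example may help to see what Lemma 4 says. Take the bounded nonnegative sequence that equals $1$ at the perfect squares and $0$ elsewhere: it does not converge to zero, but its Cesàro averages do, and it vanishes identically outside the set of squares, which has zero density. The sketch below checks this for a finite range; the sequence and the helper names are our own choices for this illustration and do not come from the proof.

```python
import math


def is_square(n):
    """True if n is a perfect square."""
    r = math.isqrt(n)
    return r * r == n


def a(n):
    """Bounded nonnegative sequence: 1 at the perfect squares, 0 elsewhere."""
    return 1.0 if is_square(n) else 0.0


def cesaro_average(seq, N):
    """Average of seq(1), ..., seq(N)."""
    return sum(seq(n) for n in range(1, N + 1)) / N


def density_up_to(member, N):
    """Proportion of the integers 1, ..., N that belong to the set."""
    return sum(1 for n in range(1, N + 1) if member(n)) / N


N = 10000
average = cesaro_average(a, N)          # about sqrt(N) / N
density = density_up_to(is_square, N)   # the exceptional set is sparse
off_set_max = max(a(n) for n in range(1, N + 1) if not is_square(n))
```

Here the exceptional set of the lemma can be taken to be the set of squares itself; in the general proof it has to be built from the ascending chain of sets described above.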
Taking and , Lemma 4 implies that is weak mixing if and only if, for every , there exists of zero density such that
By approximation, (4) is equivalent to the existence, for every , of a set of zero density such that
Another characterization comes from Proposition 3: is weak mixing if and only if
for every such that . In fact, by Lemma 4, converges in the Cesàro sense to zero if and only if the same happens to .
Weak mixing, at first impression, seems an artificial notion, obtained by the relaxation of strong mixing. This is not the case: first because it also has the natural spectral characterization discussed in section 1 and second, as was already discussed above, weak mixing is an important part of every mps. In other contexts, it is abundant. For example, Avila and Forni proved that almost every interval exchange transformation is either an irrational rotation or weak mixing, in contrast to an older result of Katok which proved that these are never strong mixing.
3. Product characterization of weak mixing
Consider two mps and .
Definition 5 The product mps of and is the quadruple
where is the -algebra generated by and is the probability measure on defined by
Theorem 6 Given a mps , the following are equivalent.
- is weak mixing.
- is weak mixing.
- is ergodic, for every ergodic mps .
- is ergodic.
Proof: (i) (ii). It is enough to check (4) for a generating algebra of . Let . By assumption, there exist of zero density such that
and
The set has zero density and satisfies
proving that is weak mixing.
(ii) (i). Given , there exists of zero density such that
that is,
(i) (iii). Follows from the exercise below.
Exercise 3 Consider two bounded sequences and of real numbers. If and converge in the Cesàro sense to and , respectively, then converges in the Cesàro sense to .
(iii) (iv). If is the trivial mps with consisting of a single point, we conclude that is ergodic. Then, taking , it follows that is ergodic.
(iv) (i). First, note that is ergodic. Given ,
converges in the Cesàro sense to
proving the assertion. Then
which converges to
4. Spectral characterization of weak mixing
We now characterize weak mixing in terms of spectral properties. At this point, it is interesting to introduce the
Theorem 7 If is a unitary operator on the Hilbert space and , then there is a unique finite Borel measure on the circle such that
When has continuous spectrum, is a continuous measure (it has no atoms), for every such that . In this case, Fubini’s theorem guarantees that gives zero measure to the diagonal . This in turn implies the
Theorem 8 is weak mixing if and only if has continuous spectrum.
Proof: () Suppose is an eigenfunction associated to . The function defined by is an eigenfunction of associated to . By Theorem 6, is constant and the same happens to .
() Let us check (6). Take such that . Using Theorem 7,
Decompose , where is the diagonal. For , the summand
converges to zero as uniformly in . Since assigns zero measure to , we’re done.
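The contrast behind Theorem 8 can be seen in two classical examples on the circle $[0,1)$ with Lebesgue measure. For an irrational rotation, $f(x)=e^{2\pi i x}$ is an eigenfunction of the Koopman operator, so the correlations $\langle U^n f, f\rangle$ have modulus $1$ and their Cesàro averages cannot go to zero. For the doubling map $x\mapsto 2x \bmod 1$ (which is strong mixing, hence weak mixing), the same $f$ has zero correlations for every $n\ge 1$. The sketch below computes both by a Riemann sum; the helper names, the grid size `M` and the choice of $f$ are assumptions made for this illustration, and checking a single function is of course not a proof of weak mixing.

```python
import cmath
import math


def character(k):
    """The function x -> exp(2*pi*i*k*x) on the circle [0, 1)."""
    return lambda x: cmath.exp(2j * math.pi * k * x)


def correlation(iterate, n, f, M=4096):
    """Riemann-sum approximation of <f o T^n, f> with M grid points."""
    total = sum(f(iterate(j / M, n)) * f(j / M).conjugate() for j in range(M))
    return total / M


def cesaro_abs_correlation(iterate, f, N):
    """Average over n = 1, ..., N of |<f o T^n, f>|."""
    return sum(abs(correlation(iterate, n, f)) for n in range(1, N + 1)) / N


alpha = math.sqrt(2) - 1


def rotation(x, n):
    """n-th iterate of the rotation by alpha."""
    return (x + n * alpha) % 1.0


def doubling(x, n):
    """n-th iterate of the doubling map."""
    return (x * 2 ** n) % 1.0


f = character(1)
avg_rotation = cesaro_abs_correlation(rotation, f, 10)
avg_doubling = cesaro_abs_correlation(doubling, f, 10)
```

The rotation is the prototype of a compact system (pure point spectrum), so this pair of examples is also a small instance of the structure vs. randomness dichotomy of section 1.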
5. Conditions for weak mixing
In this section we summarize all the conditions obtained above for a mps to be weak mixing.
- For any ,
- For any ,
- For any such that ,
- For any such that ,
- For any , there exists of zero density such that
- For any , there exists of zero density such that
- is ergodic.
- is ergodic, for every ergodic.
- is weak mixing.
- has continuous spectrum.
Previous posts: ERT0, ERT1, ERT2, ERT3, ERT4, ERT5, ERT6, ERT7.
