**Disclaimer.** All errors, mistakes or misattributions are my entire responsibility.

**1. Introduction**

Given a Riemannian $n$-dimensional manifold $M$, one can often study its geometry by analyzing suitable smooth real functions $f$ on $M$ (such as the scalar curvature). One of the techniques used to extract information in this way is the following observation (“baby maximum principle”): if $f$ has a local maximum at a point $x_0 \in M$, then we have

- a *first order* piece of information: the gradient of $f$ at $x_0$ vanishes; and
- a *second order* piece of information: the Hessian of $f$ at $x_0$ has a sign (namely, it is negative semi-definite).

In order to extract *more* information from this technique, one can appeal to the so-called *doubling of variables method*: instead of studying a function $f$ on $M$ directly, one investigates the local maxima of a “well-chosen” function $Z$ on the *double* of variables (e.g., on $M \times M$). In this way, we have *new* constraints because the gradient and Hessian of $Z$ depend on more variables than those of $f$.

This idea of doubling the variables goes back to Kruzkov who used it to estimate the modulus of continuity of the derivative of solutions of a non-linear parabolic PDE (in one space dimension). In this post we shall see how this idea was ingeniously employed by Andrews and Clutterbuck (2011) and Brendle (2013) in two recent important works.

We start with the statement of Andrews-Clutterbuck theorem:

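Theorem 1 (Andrews-Clutterbuck). Let $\Omega \subset \mathbb{R}^n$ be a convex domain of diameter $D$. Consider the Schrödinger operator $L = -\Delta + V$, where $\Delta$ is the Laplacian operator and $V$ is the operator induced by the multiplication by a convex function $V: \Omega \to \mathbb{R}$. Recall that the spectrum of $L$ with respect to the Dirichlet condition on the boundary $\partial\Omega$ consists of a discrete set of eigenvalues of the form:

$$\lambda_1 < \lambda_2 \leq \lambda_3 \leq \dots$$

In this setting, the *fundamental gap* $\lambda_2 - \lambda_1$ of $L$ is bounded from below by

$$\lambda_2 - \lambda_1 \geq \frac{3\pi^2}{D^2}.$$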

Remark 1. This theorem is sharp: $\lambda_2 - \lambda_1 = 3\pi^2/D^2$ when $n = 1$ and $V \equiv 0$ (by Fourier analysis). In other terms, the Andrews-Clutterbuck theorem is an *optimal* comparison theorem between the fundamental gap of general Schrödinger operators and that of the one-dimensional case.
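The sharpness in the one-dimensional case can be checked numerically. The following is only a sketch: it assumes the free Laplacian on an interval of length `D` with Dirichlet conditions, replaces it by the standard second-difference matrix (whose eigenvalues are known in closed form), and compares the resulting gap with the constant $3\pi^2/D^2$ of the Andrews-Clutterbuck bound. The function names are illustrative and not taken from any paper.

```python
import math


def dirichlet_gap_1d(diameter, n_interior):
    """Gap between the two smallest eigenvalues of the finite-difference
    Dirichlet Laplacian -d^2/dx^2 on an interval of the given length.

    The n x n second-difference matrix (2 on the diagonal, -1 next to it)
    has the closed-form eigenvalues 2 - 2*cos(k*pi/(n+1)), k = 1..n.
    """
    h = diameter / (n_interior + 1)  # grid spacing

    def eigenvalue(k):
        return (2.0 - 2.0 * math.cos(k * math.pi / (n_interior + 1))) / h**2

    return eigenvalue(2) - eigenvalue(1)


def sharp_bound(diameter):
    """The constant 3*pi^2/D^2 of the fundamental gap theorem."""
    return 3.0 * math.pi**2 / diameter**2


if __name__ == "__main__":
    D = 2.0
    for n in (10, 100, 1000):
        # The discrete gap approaches the sharp constant as the grid is refined.
        print(n, dirichlet_gap_1d(D, n), sharp_bound(D))
```

As the grid is refined, the discrete gap increases towards the continuous value $(2\pi/D)^2 - (\pi/D)^2 = 3\pi^2/D^2$.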

Next, we state Brendle’s theorem:

Theorem 2 (Brendle). An embedded minimal torus inside the round sphere $S^3$ is isometric to the Clifford torus $S^1(1/\sqrt{2}) \times S^1(1/\sqrt{2})$.

The sketches of proof of these results are presented in the next two Sections. For now, let us close this introductory section by explaining some of the motivations of these theorems.

**1.1. The context of Andrews-Clutterbuck theorem**

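The interest of the fundamental gap comes from the fact that it helps in the description of the long-term behavior of non-negative non-trivial solutions of the heat equation

$$\frac{\partial u}{\partial t} = \Delta u - V u$$

with $u = 0$ on $\partial\Omega$. More precisely, one has that

$$u(t,x) = c\, e^{-\lambda_1 t} \phi_1(x) + O(e^{-\lambda_2 t})$$

where

- $c$ is an adequate constant,
- $\phi_1$ is the *ground state* of $L$, i.e., $L\phi_1 = \lambda_1 \phi_1$, $\phi_1 > 0$ on $\Omega$, $\phi_1 = 0$ on $\partial\Omega$ and $\phi_1$ is normalized so that $\int_\Omega \phi_1^2 = 1$, and
- $O(e^{-\lambda_2 t})$ denotes (as usual) a quantity bounded from above by $C e^{-\lambda_2 t}$ for some constant $C$ and all sufficiently large $t$.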

The theorem of Andrews-Clutterbuck answers positively a conjecture of Yau and Ashbaugh-Benguria. This conjecture was based on a series of works in Mathematics and Physics: from the mathematical side, van den Berg observed during his study of the behavior of spectral functions in big convex domains (modeling Bose-Einstein condensation) that $\lambda_2 - \lambda_1 \geq 3\pi^2/D^2$ for the free Laplacian ($V \equiv 0$) on several convex domains. After that, Singer-Wong-Yau-Yau proved that

$$\lambda_2 - \lambda_1 \geq \frac{\pi^2}{4D^2}$$

and Yu-Zhong improved this result by showing that

$$\lambda_2 - \lambda_1 \geq \frac{\pi^2}{D^2}.$$

Furthermore, some particular cases of the Andrews-Clutterbuck theorem were previously known: for instance, Lavine proved the one-dimensional case $n = 1$, and other authors studied the cases of convex domains with some (axial and/or rotational) symmetries in higher dimensions.

**1.2. The context of Brendle theorem**

The theorem of Brendle answers affirmatively a conjecture of Lawson.

Lawson arrived at this conjecture after proving (in this paper here) that *every* compact oriented surface without boundary can be *minimally embedded* in $S^3$.

Remark 2. The analog of Lawson’s theorem is completely false in $\mathbb{R}^3$: using the maximum principle, one can show that there are *no* immersed compact minimal surfaces in $\mathbb{R}^3$.

Moreover, Lawson (in the same paper loc. cit.) showed that, if the genus of the surface is *not* prime, then it admits two *non-isometric* minimal embeddings in $S^3$.

On the other hand, Lawson’s construction in the case of genus $1$ produces *only* the Clifford torus (up to isometries). Nevertheless, Lawson proved (in this paper here) that, for any embedded minimal torus in $S^3$, there exists a *diffeomorphism* of $S^3$ taking it to the Clifford torus: in other terms, there is *no* knotted minimal torus in $S^3$!

In this context, Lawson was led to conjecture that this diffeomorphism could be taken to be an *isometry*, an assertion that was confirmed by Brendle.

**2. Proof of Andrews-Clutterbuck theorem**

One of the key points of the Andrews-Clutterbuck argument is an *improvement* of a theorem of Brascamp-Lieb. More precisely, the Brascamp-Lieb theorem ensures, in the context of Theorem 1, the log-concavity of the ground state of the Schrödinger operator (i.e., the logarithm of the ground state is a concave function). In this setting, a fundamental ingredient in the Andrews-Clutterbuck proof of Theorem 1 is a *quantitative* statement about this log-concavity.

Before discussing Andrews-Clutterbuck’s improvement of Brascamp-Lieb theorem, let us quickly review Korevaar’s proof of Brascamp-Lieb theorem as an excuse to introduce a first concrete instance of the doubling of variables method.

**2.1. A sketch of Korevaar’s proof of Brascamp-Lieb theorem**

We want to show that is log-concave. For this sake, we can assume that the domain and the potential are *strictly convex*. Indeed, this is so because and are convex, so that they can be approximated by strictly convex objects, and, furthermore, it can be shown that the ground state varies *continuously* under deformations of and .

By definition, is concave if and only if the function

on the double of variables is non-positive.

We divide the proof of the fact that for all into two parts.

First, we claim that . In fact:

- If with , then because (i.e., ) on and on . Here, we used that .
- If with , one exploits the strict convexity of to say that, near , the ground state “looks like” the distance to the boundary , so that is a concave function near .

Next, once we dispose of the fact that , the proof of the log-concavity of will be complete if we show that at any local maximum .

In this direction, we use the baby maximum principle. If is a local maximum of , then vanishes to the first order at , i.e., . Thus, if denoting by , we deduce from the definition of and the equation that

Moreover, by varying in the direction of a small vector , we get a function

possessing a local maximum at . Therefore, the Laplacian of this function at is non-positive, i.e.,

Now, a simple calculation reveals that the Laplacian of satisfies the equation

because is the ground state of (i.e., ). Combining this equation with (1) and (2), we conclude that

Since is strictly convex, this inequality implies that , and, *a fortiori*, , as we wanted to prove. This completes the sketch of Korevaar’s proof of Brascamp-Lieb theorem.
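To make the doubled function concrete, here is a small numerical sketch in the simplest possible case. It assumes the one-dimensional free Laplacian on an interval of length `diameter`, whose ground state is $\sin(\pi x / D)$, and it uses the midpoint defect $\tfrac12(v(x) + v(y)) - v(\tfrac{x+y}{2})$ with $v$ the logarithm of the ground state. This midpoint defect is a standard way of testing concavity on the double of variables; it is an assumption of this sketch and not necessarily the exact function used in Korevaar's argument.

```python
import math


def ground_state(x, diameter=1.0):
    """Ground state of -d^2/dx^2 on (0, diameter) with Dirichlet conditions."""
    return math.sin(math.pi * x / diameter)


def concavity_defect(x, y, diameter=1.0):
    """Midpoint defect of v = log(ground state) on the doubled variables (x, y).

    v is (midpoint) concave exactly when this quantity is <= 0 everywhere.
    """
    def v(t):
        return math.log(ground_state(t, diameter))

    return 0.5 * (v(x) + v(y)) - v(0.5 * (x + y))


def max_defect(n=200, diameter=1.0):
    """Maximum of the defect over an n x n grid of interior points."""
    pts = [diameter * (i + 0.5) / n for i in range(n)]
    return max(concavity_defect(x, y, diameter) for x in pts for y in pts)


if __name__ == "__main__":
    # The maximum is attained on the diagonal x = y, where the defect vanishes.
    print(max_defect())
```

In this toy case the maximum over the grid is attained on the diagonal, in agreement with the log-concavity of the ground state.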

**2.2. An improvement of Brascamp-Lieb’s theorem**

The improvement of Andrews-Clutterbuck of the Brascamp-Lieb theorem consists of the following estimate of the modulus of continuity of the derivative of :

This estimate provides important new information beyond the statement of Brascamp-Lieb theorem: for example, when , the right-hand side of the inequality goes to (which is much better than simply knowing that it is non-positive).

The proof of this estimate is somewhat complicated: it involves a combination of the doubling of variables method, a comparison argument with the one-dimensional case and the study of parabolic PDEs.

For this reason, by following Carron’s talk, we will skip the proof of this estimate, and we will now discuss how this estimate can be used to get a lower bound on the fundamental gap in Theorem 1. In this direction, we will follow an approach proposed by Lei Ni (which is not exactly the original argument of Andrews-Clutterbuck).

**2.3. End of the (sketch of) proof of Theorem 1**

We consider the eigenfunction with , , , and on .

The quotient is closely related to the fundamental gap : more precisely, the function verifies

and on (where is the unit outward normal to ).

The previous method of Singer-Wong-Yau-Yau consisted of studying the first two derivatives of the function

at its local maximum points, extracting an inequality, and obtaining a (non-optimal) lower bound on by integration of this inequality (together with the fact that satisfies (4)).

The approach proposed by Andrews-Clutterbuck consists in studying the oscillations of , i.e., we compare with the one-dimensional model (on an interval) by means of the function:

where and . (This choice of avoids “boundary effects”).

We want to use the doubling of variables method, i.e., the baby maximum principle applied to . For this sake, we introduce the quantity:

One can show that the strict convexity of implies that is attained on *or* in the interior of .

Remark 3. We have not defined on , but a first order expansion says that it is natural to set

In both cases, the study of the first two derivatives of at a maximum point *and* the improvement (3) of Brascamp-Lieb theorem imply that

Of course, since is arbitrary, this proves that . Hence, the proof of Theorem 1 is complete once we prove (5).

For the sake of exposition, we show the validity of (5) *only* in the case that is attained at , i.e., for some : indeed, the other case is similar (in some sense) to this one.

If is attained at , then is a maximum for , so that

Of course, this inequality is a good starting point to study (via (4)), but it is only a *partial* information obtained by letting vary *only* along !

If we vary along the *transverse* direction by considering where is a small vector, we obtain from (and the baby maximum principle) that

which is certainly a *better* estimate than the previous one.

In other words, we got extra (better) information on thanks to the doubling of variables method applied to !

By differentiating the equation (4), and then applying (6) to the resulting PDE, we deduce that

Now, we notice that the Andrews-Clutterbuck improvement (3) of Brascamp-Lieb theorem says (among other things) that . By plugging this into the previous inequality, we conclude that

Since is a non-constant function (and ), we have from this inequality that

This proves (5) when is attained at , as desired.

**3. Proof of Brendle theorem**

Let be a minimal torus inside the round $3$-sphere $S^3$. Denote by a choice of unit normal to .

The second fundamental form is the Hessian at of the function (from to ) whose graph over is (locally) equal to . In particular, is a symmetric quadratic form, and, hence, can be diagonalized. The (real) eigenvalues of are called principal curvatures of at .

By definition, is minimal if and only if the trace of vanishes (for all ). In other words, the eigenvalues of are when is minimal.

For later use, we recall the following three facts:

- Lawson proved that when is a minimal torus in . (Of course, this result strongly uses that has genus $1$, and, indeed, it is completely *false* for other genera.)
- The minimality of imposes a constraint on known as *Simons’ formula*. In our setting, this means that the principal curvature verifies the following PDE:
- Lawson also showed that if is constant, then the minimal torus is isometric to Clifford’s torus .

The last item above says that our objective is very clear: in order to prove Brendle’s theorem 2, we have to show that is constant.

By Gauss-Bonnet theorem, we have that , i.e., equals on average. From this point, a natural strategy would be to combine this information with Simons’ formula (and some maximum principles) to show that . Unfortunately, this idea does *not* work mainly because of the (negative) sign of the term in Simons’ formula.

At this point, Brendle introduces the function

(Note that, since , , and, thus, .)

The *geometrical* meaning of is the following. The quantity is the biggest radius such that stays outside a ball of radius tangent to at .

In other terms, stays outside of the osculating balls of radius , and , and are mutually tangent at . From this fact, it is possible to check that

This means that the *global* information (curvature of osculating balls) controls the *local* information (principal curvature).

We *claim* that the inequality implies that satisfies the following version of Simons’ formula

in the sense of viscosity. For the sake of exposition, let us prove that satisfies this inequality when : the general case ( is a viscosity solution when is not smooth) follows by a simple modification of the argument below.

Up to changing our choice of unit normal , we can write . Let us apply the doubling of variables method by considering the function

Given a point , we have two possibilities: either or .

In the first case (), since , one has (from the baby maximum principle) that and . By plugging this into Simons formula (7), we deduce that

In the second case , we have that there exists , such that . Since , we get that . Geometrically, this condition means that stays outside the ball and is tangent to at and . In particular, this implies that the tangent planes and are symmetric with respect to the perpendicular bisector hyperplane of the segment between and . By exploiting this symmetry, Brendle chose good coordinate systems on and leading him to the following inequality

with after *seven* pages of calculations in his paper! Since , we have the good sign to conclude from this estimate that

as it was claimed.

Once we know that is a (viscosity) solution of (8), we can use the maximum principles, the inequality and the Simons formula (7) for to obtain that

where is a constant.

We *claim* that this implies that is constant, so that the proof of Brendle theorem would be complete (by Lawson’s result quoted right after Simons formula (7)). Indeed, we have again two cases: either or .

In the first situation, since the osculating balls with have the same principal curvature of at , a third order expansion of (the graph of) at reveals that for all , so that is constant.

In the second situation with , we look again at the inequality

shown above (under the assumption , which is our current situation at *all* since , ).

Because satisfies Simons formula (7) and with a constant, we get that the first term of the previous inequality vanishes. In particular, we deduce that

This means that for all (since and ), so that is constant, as it was claimed.

This completes the proof of Brendle’s theorem and, consequently, the discussion of this post.


This article is motivated by Simion Filip’s recent work on the classification of possible monodromy groups for the Kontsevich-Zorich cocycle.

Very roughly speaking, the basic idea of this classification is the following. Consider the Kontsevich-Zorich cocycle on the Hodge bundle over the support of an ergodic $SL(2,\mathbb{R})$-invariant probability measure on (a connected component of) a stratum of the moduli spaces of translation surfaces. Recall that, in a certain sense, the Kontsevich-Zorich cocycle is a sort of “foliated monodromy representation” obtained by using the Gauss-Manin connection on the Hodge bundle while essentially moving *only* along $SL(2,\mathbb{R})$-orbits on moduli spaces of translation surfaces.

By extending a previous work of Martin Möller (for the Kontsevich-Zorich cocycle over Teichmüller curves), Simion Filip showed (in this paper here) that a version of the so-called Deligne’s semisimplicity theorem holds for the Kontsevich-Zorich cocycle: in plain terms, this means that the Kontsevich-Zorich cocycle can be completely decomposed into ($SL(2,\mathbb{R})$-)irreducible pieces, and, furthermore, each piece respects the Hodge structure coming from the Hodge bundle. In other terms, the Kontsevich-Zorich cocycle is always diagonalizable by blocks and its restriction to each block is related to a variation of Hodge structures of weight $1$.

The previous paragraph might seem abstract at first sight, but, as it turns out, it imposes *geometrical constraints* on the possible groups of matrices obtained by restriction of the Kontsevich-Zorich cocycle to an irreducible piece. More precisely, by exploiting the known tables (see § 3.2 of Filip’s paper) for monodromy representations coming from variations of Hodge structures of weight $1$ over quasiprojective varieties, Simion Filip classified (up to compact and finite-index factors) the possible Zariski closures of the groups of matrices associated to restrictions of the Kontsevich-Zorich cocycle to an irreducible piece. In particular, there are *at most* five types of possible Zariski closures for blocks of the Kontsevich-Zorich cocycle (cf. Theorems 1.1 and 1.2 in Simion Filip’s paper):

- (i) the symplectic group in its standard representation;
- (ii) the (generalized) unitary group in its standard representation;
- (iii) in an exterior power representation;
- (iv) the quaternionic orthogonal group (sometimes called , or ) of matrices on respecting a quaternionic structure and an Hermitian (complex) form of signature in its standard representation;
- (v) the indefinite orthogonal group in a spin representation.

Moreover, each of these items can be realized as an *abstract* variation of Hodge structures of weight over *abstract* curves and/or Abelian varieties.

Here, it is worth stressing that Filip’s classification of the possible blocks of the Kontsevich-Zorich cocycle comes from a *general* study of variations of Hodge structures of weight $1$. Thus, it is *not* clear whether all items above can actually be *realized* as a block of the Kontsevich-Zorich cocycle over the closure of some $SL(2,\mathbb{R})$-orbit in the moduli spaces of translation surfaces.

In fact, it was previously known in the literature that (all groups listed in) the items (i) and (ii) appear as blocks of the Kontsevich-Zorich cocycle (over closures of $SL(2,\mathbb{R})$-orbits of translation surfaces given by certain cyclic cover constructions). On the other hand, it is not obvious that the other three items occur in the context of the Kontsevich-Zorich cocycle, and, indeed, this *realizability question* was explicitly posed by Simion Filip in Question 5.5 of his paper (see also § B.2 in Appendix B of this recent paper of Delecroix-Zorich).

In our paper, Filip, Forni and I give a partial answer to this question by showing that the case $SO^*(6)$ of item (iv) is realizable as a block of the Kontsevich-Zorich cocycle.

Remark 1. Thanks to an exceptional isomorphism between the real Lie algebra $\mathfrak{so}^*(6)$ in its standard representation and the second exterior power representation of the real Lie algebra $\mathfrak{su}(3,1)$, this also means that the case of $\wedge^2\, SU(3,1)$ of item (iii) is realized.

Remark 2. We think that the examples constructed in this paper by Yoccoz, Zmiaikou and myself of regular origamis associated to the groups of Lie type *might* lead to the realizability of *all* groups in item (iv). In fact, what prevents Filip, Forni and me from showing that this is the case is the absence of a systematic method to show that the natural candidates to blocks of the Kontsevich-Zorich cocycle over these examples are actually irreducible pieces.

In the remainder of this post, we will briefly explain our construction of an example of a closed $SL(2,\mathbb{R})$-orbit such that the Kontsevich-Zorich cocycle over this orbit has a block where it acts through a Zariski dense subgroup of $SO^*(6)$ (modulo compact and finite-index factors).

**1. A quaternionic cover of an $L$-shaped origami**

The starting point of our joint paper with Filip and Forni is the following. The group $SO^*(2n)$ is related to quaternionic structures on vector spaces. In particular, it is natural to look for translation surfaces possessing an automorphism (symmetry) group admitting representations of quaternionic type.

Note that automorphism groups of translation surfaces (of genus $\geq 2$) are always finite (e.g., by Hurwitz’s automorphism theorem) and the simplest finite group with representations of quaternionic type is the quaternion group

$$Q = \{\pm 1, \pm i, \pm j, \pm k\}$$

where $i^2 = j^2 = k^2 = -1$, $ij = k = -ji$, and $-1$ is central.
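As a sanity check on these relations, the quaternion group can be generated by a few lines of code. This is only an illustrative sketch: elements are encoded as integer 4-tuples `(a, b, c, d)` standing for `a + b*i + c*j + d*k`, and the helper names (`qmul`, `generate`, `center`) are hypothetical.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


ONE = (1, 0, 0, 0)
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)


def generate(generators):
    """Closure of the generators under right multiplication, starting at 1."""
    elements = {ONE}
    frontier = [ONE]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = qmul(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


def center(group):
    """Elements commuting with every element of the group."""
    return {z for z in group if all(qmul(z, g) == qmul(g, z) for g in group)}


if __name__ == "__main__":
    Q = generate([I, J])
    print(len(Q))             # order of the group generated by i and j
    print(sorted(center(Q)))  # the center consists of +1 and -1
```

The group generated by `I` and `J` has eight elements, and its center is reduced to $\pm 1$.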

Therefore, this indicates that we should look for translation surfaces whose group of automorphisms is isomorphic to . A concrete way of building such translation surfaces is to consider ramified covers of “simple translation surfaces” such that the group of deck transformations of is isomorphic to .

The first natural attempt is to take the flat torus, and define as the translation surface obtained as follows. We let , , be copies of the flat torus . Then, we glue by translation the rightmost vertical, resp. topmost horizontal side, of with the leftmost vertical, resp. bottommost horizontal side, of , resp. for each . In this way, we obtain a translation surface tiled by eight squares , , such that the natural projection is a ramified cover (branched only at the origin of ) whose group of automorphisms is isomorphic to (namely, an element acts by translating to for all ).

The translation surface constructed above is a square-tiled surface (origami) that we already met in this blog: it is the so-called Eierlegende Wollmilchsau.
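The combinatorics of this square-tiled surface are easy to verify by machine. The sketch below assumes the gluing rule described above in the following form: the squares are indexed by the elements `g` of the quaternion group, the right neighbor of the square `g` is `g*i` and its top neighbor is `g*j`. The conical points then correspond to the cycles of the commutator "right, up, left, down", a cycle of length `m` giving a cone angle of `2*pi*m`. All names in the code are illustrative.

```python
UNITS = "1ijk"
# TABLE[a][b] = (sign, unit) of the product a*b of two positive units.
TABLE = {
    "1": {"1": (1, "1"), "i": (1, "i"), "j": (1, "j"), "k": (1, "k")},
    "i": {"1": (1, "i"), "i": (-1, "1"), "j": (1, "k"), "k": (-1, "j")},
    "j": {"1": (1, "j"), "i": (-1, "k"), "j": (-1, "1"), "k": (1, "i")},
    "k": {"1": (1, "k"), "i": (1, "j"), "j": (-1, "i"), "k": (-1, "1")},
}


def mul(g, h):
    """Product in the quaternion group; elements are (sign, unit) pairs."""
    (s1, a), (s2, b) = g, h
    s3, c = TABLE[a][b]
    return (s1 * s2 * s3, c)


Q8 = [(s, u) for s in (1, -1) for u in UNITS]


def origami_from_group(group, right_gen, up_gen):
    """Right and up neighbor maps given by right multiplication."""
    right = {g: mul(g, right_gen) for g in group}
    up = {g: mul(g, up_gen) for g in group}
    return right, up


def corner_cycle_lengths(right, up):
    """Cycle lengths of the commutator: walk right, up, left, down."""
    left = {v: k for k, v in right.items()}
    down = {v: k for k, v in up.items()}
    step = {g: down[left[up[right[g]]]] for g in right}
    seen, lengths = set(), []
    for g in step:
        if g in seen:
            continue
        n, h = 0, g
        while h not in seen:
            seen.add(h)
            h = step[h]
            n += 1
        lengths.append(n)
    return sorted(lengths)


def genus(n_squares, cycle_lengths):
    """Genus from the Euler characteristic V - E + F, with E = 2F."""
    chi = len(cycle_lengths) - 2 * n_squares + n_squares
    return (2 - chi) // 2


if __name__ == "__main__":
    right, up = origami_from_group(Q8, (1, "i"), (1, "j"))
    cycles = corner_cycle_lengths(right, up)
    print(cycles, genus(len(Q8), cycles))
```

Under this gluing rule the commutator acts as multiplication by $-1$, so one finds four conical points of angle $4\pi$ on a surface of genus $3$. Since left and right multiplications commute, left multiplication by any element of the group preserves the gluings, which is how the group acts by automorphisms.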

Unfortunately, the Eierlegende Wollmilchsau is *not* a good example for our purposes. Indeed, it is known that the Kontsevich-Zorich cocycle over the -orbit of the Eierlegende Wollmilchsau acts through a *finite* group of matrices (see, e.g., this paper here). In particular, this provides *no* meaningful information from the point of view of realizing the items in Filip’s list of possible monodromy groups because in his list one always *ignores* compact and/or finite-index factors.

This indicates that we should look for other translation surfaces than the flat torus.

In this direction, Filip, Forni and I took the base surface to be the simplest $L$-shaped square-tiled surface in genus $2$ described in this picture here

where any two sides with the same labels are identified by translation.

Next, we take copies , , of this $L$-shaped square-tiled surface , and we glue by translations the corresponding vertical, resp. horizontal, sides of and , resp. . Alternatively, we label the sides of as indicated in the figure below (where is called )

and we glue by translations the pairs of sides with the same labels.

In this way, we obtain a translation surface (called in our joint paper with Filip and Forni) such that the natural projection is a ramified cover branched only at the unique conical singularity of . Also, the automorphism group of is isomorphic to and each acts on by translating each to for all .

A direct inspection reveals that is a genus surface with four conical singularities whose cone angles are . In this setting, the Kontsevich-Zorich cocycle (over ) is simply the action on of the group of *affine homeomorphisms* of .

Similarly to the investigation of Delecroix, Hubert and Lelièvre of the so-called wind-tree models, the translation surface has a rich group of symmetries allowing us to decompose the Kontsevich-Zorich cocycle.

More precisely, by taking the quotient of by the center of its automorphism group , we obtain a translation surface of genus with four conical singularities whose cone angles are . Moreover, by taking the quotient of by the subgroups , and of its automorphism group , we obtain three genus surfaces , and each having two conical singularities whose cone angles are . In summary, we have intermediate covers , and for .

Using these intermediate covers together with the fact that has a finite-index subgroup whose elements commute with the automorphisms of (i.e., up to finite-index, the Kontsevich-Zorich cocycle commutes with the action of on ), we can determine the natural candidates for blocks of the Kontsevich-Zorich cocycle over , namely,

where is the subspace generated by comes from (is isomorphic to ), comes from for each , and is the symplectic orthogonal of the direct sum of the other subspaces.

These subspaces have the structure of -modules, and, by a quick comparison with the character table of , one can show that , , , and (resp.) are the *isotypical components* of the trivial, -kernel, -kernel, -kernel and the unique four-dimensional faithful irreducible representation of (resp.): for example, is the isotypical component of because acts as on and the character of in is while the other characters in take the value .

Furthermore, is -dimensional because and have genera and (so that and have dimensions and ), and is the symplectic orthogonal of the symplectic subspace . Hence, as a -module.

Note that acts via symplectic automorphisms of the -module (because the actions of and the automorphism group on commute), and carries a quaternionic structure. In particular, we are *almost* in position to apply Filip’s classification results to determine the group of matrices through which acts on .

Indeed, *if* we have that acts *irreducibly* on , then Filip’s list of possible groups says that acts through a (virtually Zariski dense) subgroup of (because preserves a quaternionic structure on ).

However, there is *no* reason for the action of the affine homeomorphisms on an isotypical component of the automorphism group to be irreducible in general (as far as I know). Nevertheless, the semisimplicity theorems of Möller and Filip mentioned in the introduction tell us that can split into irreducible pieces in one of the following three ways:

- (a) is irreducible, i.e., it does not decompose further;
- (b) where and are irreducible pieces;
- (c) where , , are irreducible pieces isomorphic to .

By applying Filip’s classification to each of these items, we find that (up to compact and finite-index factors) there are just three cases:

- (a’) if is -irreducible, then acts through a Zariski-dense subgroup of ;
- (b’) if with and irreducible pieces, then acts through a subgroup of ;
- (c’) if with , , irreducible pieces isomorphic to , then acts through a subgroup of .

We claim that the situations (b’) and (c’) can’t occur, so that we are in situation (a’).

We start by ruling out the case (c’). In this situation, the nature of would force *all* Lyapunov exponents of on to vanish. On the other hand, the formulas of Eskin-Kontsevich-Zorich for the sum of non-negative Lyapunov exponents for the square-tiled surfaces , , , and (together with the facts that , ) allow us to show that

where is the sum of non-negative Lyapunov exponents of on . This means that , and, thus, there must be some non-zero Lyapunov exponent of in . In particular, we cannot be in situation (c’).

Remark 3. At this point, we know that we are in situation (a’) or (b’). Hence, we already have at this stage that the Kontsevich-Zorich cocycle over has an irreducible piece where it acts through a Zariski dense subgroup of or . Of course, this suffices to deduce that we can realize a non-trivial case ( or ) of item (iv) in Filip’s list.

Let us now close this post by *sketching* the computation (done in Section 6 of our joint paper with Filip and Forni) permitting to rule out the situation in (b’).

The basic idea is very simple: if we had a decomposition with and , then the *sole* possibility for the subspace is to be the *central* subspace of *any* matrix (of the action on ) of with “simple spectrum” in the quaternionic sense (i.e., the matrix has an unstable [modulus $> 1$] eigenvalue, a central [modulus $= 1$] eigenvalue, and a stable [modulus $< 1$] eigenvalue, all of them with multiplicity four). Therefore, we can *contradict* the existence of once we exhibit two matrices of with “simple spectrum” whose central spaces are *distinct*.

Here, we do not have an abstract method to produce two matrices with the properties above, so that we are obliged to compute by hand some matrices of . As the reader can imagine, this calculation is straightforward but somewhat tedious, and, for this reason, we are not going to repeat it here: instead, we refer the curious reader to Section 6 of our joint paper with Filip and Forni for the details.


The first Bourbaki seminar of 2015 had the following four talks:

- David Harari discussed recent results of Harpaz and Wittenberg on the existence of rational points and zero-cycles on fibrations;
- Luigi Ambrosio discussed the works of Almgren, and DeLellis and Spadaro on the regularity of area-minimizing integral currents;
- Gilles Carron talked about two new applications of the doubling of variables method by Andrews and Clutterbuck, and Brendle;
- Philippe Eyssidieux talked about the works of Chen, Donaldson and Sun, and Tian on the construction of Kähler-Einstein metrics on Fano manifolds.

Today, I would like to discuss David Harari’s talk entitled “*Zero cycles and rational points on fibrations in rationally connected varieties (after Harpaz and Wittenberg)*”. Here, I will try to follow the first 38m50s of the video of Harari’s talk (in French) and sometimes his lecture notes (also in French). Of course, it goes without saying that any errors/mistakes are my full responsibility.

**1. Introduction**

One of the basic old problems in Number Theory is to determine whether a system of polynomial equations

$$f_1(x_0, \dots, x_n) = \dots = f_r(x_0, \dots, x_n) = 0 \qquad (1)$$

associated to homogeneous polynomials $f_1, \dots, f_r \in k[x_0, \dots, x_n]$ with coefficients in a number field $k$ has non-trivial solutions.

Equivalently, denoting by $X$ the algebraic variety defined by the system (1), we want to know whether the set $X(k)$ of points of $X$ whose coordinates belong to $k$ is not empty. In the literature, $X(k)$ is called the set of $k$-rational points of $X$.

It is not easy to answer this problem in general. Nevertheless, we have the following *necessary condition*: if $X(k) \neq \emptyset$, then $X(k_v) \neq \emptyset$ for every completion $k_v$ of $k$ with respect to a place $v$ of $k$ (i.e., $v$ is an equivalence class of absolute values). In other words, we have that $X(k) = \emptyset$ whenever there is a *local obstruction* in the sense that $X(k_v) = \emptyset$ for some place $v$ of $k$.

This necessary condition based on local obstructions *is* helpful because it is often *easy* to verify *algorithmically* whether $X(k_v) \neq \emptyset$. For example, when $k = \mathbb{Q}$, its completions are either $\mathbb{Q}_p$ (for the place of $p$-adic absolute values, $p$ prime) or $\mathbb{R}$ (for the “place at infinity”), and, in this situation, we can check that $X(\mathbb{Q}_p) \neq \emptyset$ with the help of Hensel’s lemma ($p$-adic analog of Newton’s method).
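The analogy between Hensel's lemma and Newton's method can be made concrete in a few lines. The sketch below assumes a one-variable polynomial `f` with derivative `df` and a simple root modulo a prime `p`, and lifts it one power of `p` at a time by a Newton step. The function name `hensel_lift` is illustrative, and the modular inverse via `pow(x, -1, m)` requires Python 3.8 or later.

```python
def hensel_lift(f, df, root, p, k):
    """Lift a simple root of f modulo p to a root modulo p**k.

    Each Newton step root - f(root)/f'(root), computed modulo the next
    power of p, gains at least one p-adic digit.
    """
    if f(root) % p != 0 or df(root) % p == 0:
        raise ValueError("need f(root) = 0 and f'(root) != 0 modulo p")
    modulus = p
    for _ in range(k - 1):
        modulus *= p
        # f'(root) is a unit modulo p, hence invertible modulo any power of p.
        inverse = pow(df(root), -1, modulus)
        root = (root - f(root) * inverse) % modulus
    return root


if __name__ == "__main__":
    # x^2 = 2 has the solution 3 modulo 7; lift it to a solution modulo 7^5.
    r = hensel_lift(lambda x: x * x - 2, lambda x: 2 * x, 3, 7, 5)
    print(r, (r * r - 2) % 7**5)
```

Running the lift for all `k` produces the digits of a square root of $2$ in $\mathbb{Q}_7$, so the equation $x^2 = 2$ has no local obstruction at the prime $7$.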

It is known that this necessary condition is *sufficient* in certain special cases. For instance, the classical Hasse-Minkowski theorem (from 1924) states that $X(k) \neq \emptyset$ if and only if $X(k_v) \neq \emptyset$ for all places $v$ of $k$ when $X$ is a quadric, i.e., $X$ is defined by just one polynomial equation of degree $2$.

Partly motivated by this, we introduce the following definition:

Definition 1. $X$ satisfies Hasse’s principle (also called the local-global principle) whenever $X(k) \neq \emptyset$ if and only if $X(k_v) \neq \emptyset$ for all places $v$ of $k$.

As it turns out, Hasse’s principle is *false* in general: Swinnerton-Dyer constructed in 1962 some counterexamples among cubic surfaces, and Iskovskih constructed in 1970 a counterexample among the surfaces fibered in conics (given by intersections of two projective quadrics).

Of course, given that it is not hard to determine algorithmically when $X(k_v) \neq \emptyset$ (with the help of Hensel’s lemma and/or Newton’s method), it is somewhat sad that Hasse’s principle fails in general.

In view of this state of affairs, we can try to *generalize* the problem of determining whether $X(k) \neq \emptyset$ by replacing “rational points” by slightly more general objects (which then would be easier to find). In this direction, we have the following notion.

Definition 2. A *zero-cycle* is a formal linear combination $\sum_{P} n_P\, P$ where:

- $n_P \in \mathbb{Z}$ vanishes for all but finitely many $P$, and
- if $n_P \neq 0$, then $P$ is a closed point of $X$ in the sense of Algebraic Geometry, i.e., $P$ is a point defined over (its coordinates belong to) a finite extension $k(P)$ of $k$.

The *degree* of a zero-cycle is $\sum_{P} n_P\, [k(P):k]$.

Note that, by definition, a rational point is a zero-cycle of degree $1$. Thus, we can ask the following more general question:

*Does $X$ possess a zero-cycle of degree $1$ if $X$ has such cycles over all $k_v$?*

Remark 1. It follows from Bézout’s theorem that $X$ has a zero-cycle of degree $1$ if and only if $X$ has points defined over finite extensions of $k$ whose degrees are coprime.
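The arithmetic behind this remark is the extended Euclidean algorithm: if closed points of degrees $d_1, \dots, d_m$ with $\gcd(d_1, \dots, d_m) = 1$ are available, then integer multiplicities $n_i$ with $\sum n_i d_i = 1$ can be computed explicitly. The following sketch only handles this integer bookkeeping (it does not construct any points), and its function names are illustrative.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0


def degree_combination(degrees):
    """Integer multiplicities n_i with sum(n_i * d_i) == gcd(degrees)."""
    g, coeffs = degrees[0], [1]
    for d in degrees[1:]:
        g, x, y = extended_gcd(g, d)
        # Rescale the multiplicities found so far and append the new one.
        coeffs = [c * x for c in coeffs] + [y]
    return g, coeffs


if __name__ == "__main__":
    # Points of degrees 6, 10 and 15: no single degree is 1, but the gcd is.
    g, n = degree_combination([6, 10, 15])
    print(g, n, sum(ni * di for ni, di in zip(n, [6, 10, 15])))
```

Note that some multiplicities are necessarily negative here, which is why zero-cycles are defined as formal combinations with integer (and not just non-negative) coefficients.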

Remark 2. A little curiosity about Bézout: as I discovered after moving from Paris to Avon, Bézout spent the last years of his life in Avon and the city gave his name to a street (not far from my apartment) in his honor.

Once more, the answer to this question is *no*: for example, it is known that there are counterexamples among surfaces fibered in conics.

Given this scenario, our goal is to explain how to refine the local-global principle with additional *cohomological* conditions (related to the so-called Brauer groups) introduced by Manin ensuring the existence of zero-cycles and/or rational points in certain situations.

**2. Manin’s condition and Conjecture ($E_1$)**

From now on, let us assume for the sake of simplicity that the projective variety is:

- *smooth* (or *non-singular*), i.e., the Jacobian matrix associated to the polynomials in (1) has maximal rank at all points of , and
- *geometrically integral* in the sense that if we pass from to an algebraic closure , then does not break into several irreducible components.

In 1970, Manin had the idea of introducing a *coupling*

between the space of local points over all places of and the Brauer group of . This (Brauer-Manin) coupling is defined as follows. We take a family of local points and an element , and we associate to them the following quantity:

Of course, we have to explain what this means.

The Brauer group is an (étale) cohomology group ( where is the multiplicative group) generalizing the notion of Brauer group of a field (whose elements are equivalence classes of central simple algebras of finite rank over the given field).

In general, is a subgroup of the Brauer group of the function field of X. Moreover, is functorial: we can evaluate and, by the results from class field theory, we can embed in (and this embedding is actually an isomorphism when is a finite place).

Therefore, we can consider the sum of the elements . The fact that this sum has only finitely many non-trivial terms (i.e., for all but finitely many ‘s) is a consequence of the projectivity of .
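For reference, the construction just described is usually written as follows in the literature. The notation is an assumption on my part and may differ from the one used in the talk: $k$ is the number field, $k_v$ its completion at a place $v$, $X(\mathbb{A}_k)=\prod_v X(k_v)$ (for projective $X$) is the space of families of local points, and $\mathrm{inv}_v:\mathrm{Br}(k_v)\to\mathbb{Q}/\mathbb{Z}$ is the local invariant from class field theory.

```latex
\[
  X(\mathbb{A}_k) \times \mathrm{Br}(X) \longrightarrow \mathbb{Q}/\mathbb{Z},
  \qquad
  \big( (P_v)_v , \alpha \big) \longmapsto \sum_{v} \mathrm{inv}_v\big( \alpha(P_v) \big),
\]
\[
  0 \longrightarrow \mathrm{Br}(k) \longrightarrow \bigoplus_{v} \mathrm{Br}(k_v)
  \xrightarrow{\ \sum_v \mathrm{inv}_v\ } \mathbb{Q}/\mathbb{Z} \longrightarrow 0 .
\]
```

The second line is the classical exact sequence of global class field theory. It shows that the sum of local invariants vanishes on elements coming from $\mathrm{Br}(k)$, which is the reason why a family of local points coming from a single rational point pairs trivially with every element of $\mathrm{Br}(X)$.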

At this point, it is natural to ask why Manin introduced this coupling and also what is its relevance for our purposes of studying rational points.

In order to explain this, let us set up some notation. The set of adelic points over is . The kernel to the left of the coupling (2) is a subset of denoted by . In other terms, whenever for all .

Using the global reciprocity law, one can show that (where the closure of is taken with respect to the product topology of the -adic topologies on ).

This gives a new necessary condition called *Manin’s condition* for the existence of rational points in : if , then .

Of course, one of the main points of Manin’s condition is that, even though seems complicated, it can be computed in practice for many examples.

The perspective provided by Manin’s condition led Colliot-Thélène to make the following conjecture (previously formulated by Sansuc in the setting of rational surfaces):

**Conjecture (Colliot-Thélène).** If is a rationally connected variety, then .

**Remark 3.** It is known (since 2000) that, in general, does *not* imply that . Thus, it is necessary to impose some geometrical conditions on in the formulation of any conjecture in the spirit of Colliot-Thélène’s conjecture.

Two examples where this conjecture is known to be true are:

- intersections of two quadrics in the projective space if (by the results of Colliot-Thélène, Sansuc and Swinnerton-Dyer from 1987);
- smooth compactifications of homogeneous spaces of linear algebraic groups with connected stabilizers (by the results of Borovoi).

**Remark 4.** At this point, Serre asked Harari whether the particular choice of smooth compactification was important in the second item above. Harari replied that, even though this is not obvious, our whole discussion so far is birationally invariant, and this implies that it is not important what smooth compactification was taken in the statement of Borovoi’s theorem.

Logically, one can expect that this discussion of rational points has a counterpart for zero-cycles. In fact, the Brauer-Manin coupling (2) extends by linearity to the Chow groups of zero-cycles modulo rational equivalence, so that we have a coupling

where . By analogy with Colliot-Thélène’s conjecture, this leads us to suspect that there might be a relation between the kernel (to the left) of this coupling and the zero-cycles of X over . In this direction, we have the following conjecture:

**Conjecture ($E_1$) (Colliot-Thélène, Kato, Saito).** If there exists a family of (local) zero-cycles orthogonal to (with respect to the coupling (3)) with , then has a (global) zero-cycle of degree over .

**Remark 5.** Note that this time we made no geometric assumption on .

**Remark 6.** There is a refined version of this conjecture (called *Conjecture (E)*) where one describes the image of in under the natural map induced by the Brauer-Manin coupling (3).

**Remark 7.** Concerning the nomenclature, Harari said that “Conjecture ($E_1$)” probably means “*Conjecture* of *Existence* of zero-cycles of degree *one*”, and, then, when this conjecture was refined, the subscript was removed, leading to “Conjecture (E)”.

Again, let us give some examples where Conjecture ($E_1$) is known to be true:

- curves whose Jacobian has a finite Tate-Shafarevich group (by the results of Saito in 1989);
- surfaces fibered in conics over (by the results of Salberger in 1988);
- smooth compactifications of homogeneous spaces of linear algebraic groups with connected stabilizers (by the results of Liang in 2013).

These examples indicate that one can prove non-trivial results about the existence of rational points and/or zero-cycles when has an extra structure. As we are going to see now, Harpaz and Wittenberg obtained important results in this direction when is fibered over a curve.

**3. Statement of Harpaz-Wittenberg theorems**

Suppose that is fibered over a curve (say ) whose Tate-Shafarevich group is finite or simply take . Assume that the generic fiber , , is rationally connected.

Before stating the results of Harpaz-Wittenberg, we need the following notion introduced by Skorobogatov:

**Definition 3.** A -variety is *split* if it contains an irreducible component of multiplicity which is geometrically integral.

**Remark 8.** For , is split: it decomposes as , and both and are defined over . On the other hand, is not split.

Coming back to as above, it is worth mentioning that, from the point of view of the so-called *fibration method* of construction of rational points and zero-cycles, the non-split fibers of are the *bad* fibers. The reason for this fact is that if is split, then one can show (with the aid of the Lang-Weil estimate and Hensel’s lemma) that for almost all places .

In this setting, the question that we want to discuss is the following. Can we prove the conjectures above for assuming their validity for the *fibers* of ?

The first result of Harpaz-Wittenberg provides an affirmative answer to this question in the context of Conjecture ($E_1$):

**Theorem 4 (Harpaz-Wittenberg (2014)).** Let be as above. Suppose that all smooth fibers satisfy Conjecture ($E_1$). Then, Conjecture ($E_1$) holds for .

**Remark 9.** One can replace “all smooth fibers” by “many smooth fibers” in the previous statement (where “many” means a non-empty Zariski open subset of fibers, for example). Also, we can replace “Conjecture ($E_1$)” by “Conjecture (E)” in this statement.

The next two examples illustrate the range of applicability of this theorem:

- If has a generic fiber birational to a homogeneous space of a linear algebraic group with connected stabilizer, then Conjecture ($E_1$) holds for .
- Consider the equation , where is a finite extension of , is the corresponding norm, is a basis of , are the variables, and is a non-zero polynomial in . Let be a smooth projective model associated to this “normic equation” (such a model always exists by Hironaka’s theorem on resolution of singularities). Then, the generic fibers of are birational to a homogeneous space of a linear algebraic group with connected stabilizer, so that Conjecture ($E_1$) holds for .

Let us now compare this result with other theorems previously known in the literature.

Colliot-Thélène, Skorobogatov and Swinnerton-Dyer (in 1998) and Liang (in 2013) proved some cases of the Harpaz-Wittenberg theorem under much more restrictive hypotheses on the Brauer group of the generic fiber and/or on the bad fibers of . Indeed, Colliot-Thélène-Skorobogatov-Swinnerton-Dyer imposed in their work that is trivial and the bad fibers are split over a finite extension of with Abelian Galois group, while Liang makes no assumption on but he imposes that there exists at most one bad fiber.

In particular, one of the great advantages of the Harpaz-Wittenberg theorem is that there is no need to make any assumptions on or on the bad fibers, so that it can be applied to a whole new class of examples.

Concerning Colliot-Thélène’s conjecture on rational points, there are analogs of the Harpaz-Wittenberg theorem (with “Conjecture ($E_1$)” replaced by Colliot-Thélène’s conjecture) under certain restrictive hypotheses. For example, in the same work mentioned above, Colliot-Thélène-Skorobogatov-Swinnerton-Dyer proved such an analog assuming the validity of the so-called Schinzel’s hypothesis (a broad conjectural generalization of Dirichlet’s theorem on arithmetic progressions), while Harari (in 1997) proved such an analog under the assumption that there is at most one bad fiber.

Harpaz and Wittenberg have an analog of their theorem in the context of Colliot-Thélène’s conjecture, but, as it turns out, they were not able to completely remove the restrictive hypotheses mentioned in the previous paragraph. More precisely, they showed the following result.

**Theorem 5 (Harpaz-Wittenberg).** Suppose that (as above) has all of its bad (non-split) fibers over -points. Then, the validity of Colliot-Thélène’s conjecture for the fibers of implies that satisfies Colliot-Thélène’s conjecture.

Once more, a basic example illustrating the Harpaz-Wittenberg theorem comes from normic equations such that the polynomial splits over . In fact, the statement of the Harpaz-Wittenberg theorem in the case of these examples was previously established by Browning and Matthiesen (in 2013) using methods from Analytic Number Theory (inspired by the works of Green-Tao-Ziegler).

At this point, Harari started the discussion of the elements of the proofs of the theorems of Harpaz-Wittenberg. However, I will not pursue this discussion here as I do not feel confident to comment on this part of Harari’s talk.

Instead, we will close this post here: the curious reader can consult the lecture notes of Harari and/or the video of Harari’s talk (from 38h50s on) for more details.


*[Update (November 20, 2014): Some phrases near the statement of Theorem 3 below were edited to correct an inaccuracy pointed out to me by Giovanni.]*

Let be a polygon with sides and denote by its interior angles.

The billiard flow associated to is the following dynamical system. A point-particle in follows a linear trajectory with unit speed until it hits the boundary of . At such an instant, the point-particle is reflected by the boundary of (according to the usual laws of a specular reflection) and then it follows a new linear trajectory with unit speed. (Of course, this definition makes no sense at the corners of , and, for this reason, we leave the billiard flow undefined at any orbit going straight into a corner.)
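The reflection rule can be made explicit with a short Python sketch: when the particle hits a side, the component of its velocity normal to that side changes sign while the tangential component is kept, so the speed stays equal to one. The function name `reflect` and the representation of a side by its two endpoints are hypothetical choices made for this illustration.

```python
import math


def reflect(velocity, wall_start, wall_end):
    """Specular reflection of a velocity vector off the side through two points."""
    tx = wall_end[0] - wall_start[0]
    ty = wall_end[1] - wall_start[1]
    length = math.hypot(tx, ty)
    # Unit normal to the side (its sign does not matter for the formula below).
    nx, ny = -ty / length, tx / length
    dot = velocity[0] * nx + velocity[1] * ny
    # v' = v - 2 (v . n) n: flip the normal component, keep the tangential one.
    return (velocity[0] - 2 * dot * nx, velocity[1] - 2 * dot * ny)
```

For example, a particle moving with velocity `(1, -1)` that hits the horizontal side from `(0, 0)` to `(1, 0)` leaves with velocity `(1, 1)`.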

The phase space of the billiard flow is naturally identified with the three-dimensional manifold : indeed, we need an element of to describe the position of the particle and an element of the unit circle to describe the velocity vector of the particle.

Alternatively, the billiard flow associated to can be interpreted as the geodesic flow on a sphere with a flat metric and conical singularities (whose cone angles are ) with non-trivial holonomy (see Section 2 of Zorich’s survey): roughly speaking, one obtains this flat sphere with conical singularities by taking two copies of (one on the top of the other), gluing them along the boundaries, and by thinking of a billiard flow trajectory on as a straight line path going from one copy of to the other at each reflection.

This interpretation shows us that billiard flows on polygons are a particular case of geodesic flows on the unit tangent bundle of compact flat surfaces whose subsets of conical singularities were removed.

**Remark 1.** In the case of a *rational* polygon (i.e., are rational multiples of ), it is often a better idea (see this survey of Masur and Tabachnikov) to take *several* copies of obtained by applying the *finite* group generated by the reflections through the sides of and then glue by translation the pairs of parallel sides of the resulting figure. In this way, one obtains that the billiard flow associated to is equivalent to the translation (straight line) flow on a translation surface (an object that has trivial holonomy and, hence, is better behaved than a flat metric on with conical singularities) and this partly explains why the Ergodic Theory of billiards on rational polygons is well-developed. However, let us not insist on this point here because in what follows we will be mostly interested in billiard flows on *irrational* polygons.
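As a small illustration of this unfolding construction, recall the standard fact (not stated in the talk, and assumed here for simply connected polygons) that if the interior angles are `pi * p_i / q_i` in lowest terms, then the linear parts of the reflections generate a dihedral group of order `2N`, where `N` is the least common multiple of the denominators `q_i`. The Python sketch below counts the copies of the polygon needed to build the translation surface; the function name `unfolding_copies` is hypothetical.

```python
from fractions import Fraction
from math import gcd


def unfolding_copies(angles_over_pi):
    """Number of copies of a rational polygon used in the unfolding.

    Each entry is the interior angle divided by pi, given as a Fraction p/q.
    """
    n = 1
    for angle in angles_over_pi:
        q = Fraction(angle).denominator   # denominator in lowest terms
        n = n * q // gcd(n, q)            # running least common multiple
    return 2 * n
```

For the square (all angles equal to `pi/2`) this gives 4 copies, which glue into a flat torus, while the right isosceles triangle requires 8 copies.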

A basic problem concerning the dynamics of billiard flows on polygons, or, more generally, geodesic flows on flat surfaces with conical singularities, is to determine whether such a dynamical system is ergodic.

In view of Remark 1, we can safely skip the case of rational polygons: indeed, in this setting one can use the relationship to translation surfaces to give a satisfactory answer to this problem (see the survey of Masur and Tabachnikov for more explanations). So, from now on, we will focus on billiard flows associated to non-rational polygons.

Kerckhoff, Masur and Smillie proved in 1986 that the billiard flow is ergodic for a -dense subset of polygons. Their idea is to consider the -dense subset of “Liouville polygons” admitting *fast* approximations by rational polygons (i.e., the subset of polygons whose interior angles admit fast approximations by rational multiples of ). Because the ergodicity of the billiard flow on rational polygons is well-understood, one can hope to “transfer” this information from rational polygons to any “Liouville polygon”.

**Remark 2.** The -dense subset of polygons constructed by Kerckhoff, Masur and Smillie has zero measure: indeed, this happens because they require the angles to be “Liouville” (i.e., admit fast approximations by rational multiples of ), and, as is well known, the subset of Liouville numbers has zero Lebesgue measure.

A curious feature of the argument of Kerckhoff, Masur and Smillie is that it is hard to extract any sort of *quantitative* criterion. More precisely, it is difficult to quantify how fast the quantities must be approximated by rationals in order to ensure the ergodicity of the billiard flow on the corresponding polygon. This happens because the genera of translation surfaces associated to the rational polygons approximating usually tend to infinity and it is a non-trivial problem to control the ergodic properties of translation flows on families of translation surfaces whose genera tend to infinity.

Nevertheless, Vorobets obtained in 1997 (by other methods) a quantitative version of the theorem of Kerckhoff, Masur and Smillie by showing the ergodicity of the billiard flow on a polygon whose interior angles verify the following *fast approximation property*: there exist arbitrarily large natural numbers such that

for some rational numbers , , with denominators , .

In summary, the works of Kerckhoff-Masur-Smillie and Vorobets allow one to solve the problem of ergodicity of the billiard flow on *Liouville* polygons.

Of course, this scenario motivates the question of ergodicity of billiard flows on *Diophantine* polygons (i.e., the “complement” of Liouville polygons consisting of those which are badly approximated by rational polygons).

In his talk, Giovanni announced a new criterion for the ergodicity of the billiard flow on polygons (and, more generally, the geodesic flow on a flat surface with conical singularities) with potential applications to a whole class (of full measure) of Diophantine polygons.

Before stating Giovanni’s results, let us introduce some notation. Consider a flat surface with a finite subset of conical singularities (e.g., obtained by reinterpretation of the billiard flow on a polygon). The infinitesimal structure of the unit tangent bundle is described by vector fields:

- is the generator of the geodesic flow;
- is the “perpendicular geodesic flow”;
- is the generator of the rotation on the circle fibers of .

These vector fields satisfy the following commutation relations:

- (because is a flat surface, and, hence, has zero curvature);
- ;
- .
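To see where such relations come from, it may help to write the three vector fields in local flat coordinates $(x,y)$ on the surface and an angular coordinate $\theta$ on the circle fibers. The symbols $X$, $Y$, $\Theta$ and the sign conventions below are my own labels for the generator of the geodesic flow, the perpendicular geodesic flow and the rotation, respectively, and they may differ from the conventions of the talk.

```latex
\[
  X = \cos\theta\,\partial_x + \sin\theta\,\partial_y, \qquad
  Y = -\sin\theta\,\partial_x + \cos\theta\,\partial_y, \qquad
  \Theta = \partial_\theta ,
\]
\[
  [X, Y] = 0, \qquad [\Theta, X] = Y, \qquad [\Theta, Y] = -X .
\]
```

Indeed, the coefficients of $X$ and $Y$ depend only on $\theta$ and neither field has a $\partial_\theta$ component, so $[X,Y]=0$, while bracketing with $\Theta=\partial_\theta$ simply differentiates the coefficients in $\theta$.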

Note that the knowledge of allows us to recover the natural Riemannian metric on induced by the flat structure on : indeed, is completely determined by the fact that is an orthonormal frame.

By analogy with the case of rational polygons (see this survey of Masur), we would like to apply *renormalization* methods to get an ergodicity criterion for the geodesic flow on based on the properties of the renormalization dynamics.

Logically, a naive implementation of this idea does not work: the Teichmüller geodesic flow on the moduli space of flat surfaces with *arbitrary* conical singularities has *poor* dynamical behavior (in comparison with the case of rational polygons) because these moduli spaces are usually very big and, for example, this is a serious obstruction to any *recurrence* property of the corresponding Teichmüller flow (which is a key ingredient in the so-called Masur’s ergodicity criterion).

Nevertheless, Giovanni noticed that one can still implement this renormalization method by introducing the following *deformations* of (playing the role of “fake Teichmüller geodesic flow”):

- ;
- ;
- ,

for . By declaring that the vector fields form an orthonormal frame, we obtain a Riemannian metric on .

**Remark 3.** Note that , and satisfy the following commutation relations:

Furthermore, the volume of is . In particular, as , we see that and , i.e., is very close to a Heisenberg group as (i.e., its geometry becomes nilpotent in the limit). In particular, we see that the deformations of do *not* exhibit any sort of recurrence property (in whatever moduli space they live).

**Remark 4.** In the definition of , resp. , the scaling factors of , resp. , for , resp. are motivated by direct analogy with the Teichmüller geodesic flow. On the other hand, the scaling factor for is more subtle to explain: Giovanni said that he found this scaling (which is convenient for his ergodicity criterion of billiards on polygons) from an analytical argument (see Remark 9 below). Also, Giovanni observed that, a posteriori, this scaling is “justified” from the dynamical point of view because the orbits of the geodesic flow of stay fairly close (i.e., they do not “diverge”) after applying the deformation , and, in particular, one has nice “rectangles” of heights and width (and, as it turns out, the presence of such nice rectangles is an important ingredient in Masur’s ergodicity criterion for rational polygons). However, he insisted that this “dynamical justification” was not the initial motivation to define (but rather the arguments from Analysis sketched below).

In this setting, Giovanni’s ergodicity criterion for geodesic flows on flat surfaces (such as billiard flows on polygons) is:

**Theorem 1 (Forni).** Let be a flat surface with a finite subset of conical singularities. Suppose that there exist a subset with positive lower density (i.e., ) and a real number such that for each and one can find a connected subset with the following properties:

- (i) for all , where denotes the Cheeger constant of with respect to (see below for the definitions);
- (ii) uniformly on .

Then, the geodesic flow on is ergodic.

**Remark 5.** Recall that the Cheeger constant of a domain with respect to a Riemannian metric on is

where and are the connected components of .

Intuitively, Giovanni’s ergodicity criterion can be thought of as saying that if we can find a suitable subset of *good* renormalization times in the sense that the complement of “adequate small neighborhoods” of the subset of conical singularities has *bounded geometry* (i.e., a controlled Cheeger constant, cf. the condition (i) above) and almost full volume (cf. the condition (ii) above), then we can exploit these renormalization times to conclude the ergodicity of the geodesic flow.

**Remark 6.** For the sake of comparison with the case of rational polygons/translation surfaces, let us observe that for a translation surface (with flat metric ) one has

where is a constant depending only on the genus of and denotes the systole of (that is, the length of the shortest saddle connection). In particular, since the systole of a translation surface on a compact region of the moduli space admits a uniform lower bound, the analog of the condition (i) in Giovanni’s ergodicity criterion in the setting of translation surfaces is satisfied by most translation surfaces thanks to the recurrence properties of the Teichmüller geodesic flow (that is, of the deformation , and ).

**Remark 7.** Still for the sake of comparison, it is worth observing that after more recent works of Cheung-Eskin and Treviño we know that the ergodicity criterion can be substantially improved in the context of translation surfaces: indeed, one can ensure the ergodicity (and even unique ergodicity) of the flow generated by whenever the systole of the flat metric associated to the Teichmüller deformation , (and ) verifies the non-integrability condition

(Note that this non-integrability condition is automatic for recurrent Teichmüller deformations as for such deformations the quantity admits uniform lower bounds on a countable family of disjoint subintervals of definite sizes.) Evidently, these results of Cheung-Eskin and Treviño motivate the following question: is it possible to weaken the condition (i) in Theorem 1 in order to allow Cheeger constants that could approach slowly (maybe in a similar spirit of the non-integrability condition above)? In fact, I asked this question to Giovanni after his talk and he pointed out that it is not very clear that this is possible with his current argument because of the subtle nature of the proof of the estimate (1) appearing below (especially the estimate of the term ).

Before discussing some elements of the proof of Theorem 1, let us quickly comment on the potential applications of Giovanni’s ergodicity criterion. At first sight, it is not obvious at all how to decide whether a given polygon with interior angles (or, more generally, a flat surface with conical singularities with cone angles ) verifies the requirements of Theorem 1 (especially the condition (i)).

In this direction, even though Giovanni said that he has not fully checked his arguments yet, Giovanni is *confident* that the following Diophantine conditions on are sufficient to apply his ergodicity criterion.

**Theorem 2 (Forni (in progress)).** Let be a polygon with sides and interior angles . Denote by (see Remark 8 below for the reason why we exclude ). Suppose that satisfies the following Diophantine conditions:

- (1) there exists a constant such that for all
- (2) there exists a constant such that for all (non-trivial) integer vectors one has

Then, the conditions (i) and (ii) in Theorem 1 hold, and, a fortiori, the billiard flow on is ergodic.

**Remark 8.** The sum of the interior angles of is a fixed rational multiple of . For this reason, it is natural to impose Diophantine conditions on rather than .

Even though we are not going to sketch the proof of Theorem 2 today, let us now make two comments on the Diophantine conditions (1) and (2).

First, these conditions do not seem totally independent (even though it is not easy to figure out their relationship): for example, for , the condition (2) becomes , that is, for all , and this latter condition resembles the condition (1).

Secondly, the condition (1) is a full Lebesgue measure condition on *only* for . In other terms, one can use Theorem 2 to deduce the ergodicity of the billiard flow on almost every polygon with sides, but the analogous statement for the case of *triangles* remains open.

Closing this post, let us give a brief sketch of the proof of Giovanni’s ergodicity criterion (Theorem 1).

The argument starts in the same way as in Giovanni’s proof of the spectral gap property (“”) for the Lyapunov exponents of the Kontsevich-Zorich cocycle via variational formulas for the Hodge norm (in Section 2 of this paper here). More concretely, we consider the foliated Cauchy-Riemann operators

associated to the deformation . (We said “foliated” because the distribution is integrable in and are the usual and along the leaves of this foliation.)

Next, given a -function , we consider its decomposition

in terms of the image and the kernel of the Cauchy-Riemann operators . (Here, there is a subtle point: contrary to the case of translation surfaces, it is *not* known that the image of is closed; in particular, one should replace and by adequate elements in the closure of the images of , but we will skip this technical detail by pretending that the decomposition above can always be made.)

Recall that, under the assumptions of Theorem 1, our task is to show that the geodesic flow is ergodic, that is, we want to show that any real -function with (i.e., is invariant) is actually constant.

For this sake, by mimicking the proof of Lemma 2.1′ of his paper, Giovanni shows the following variational formula:

From this formula, we can deduce that as , (where is the subset of positive lower density of “good renormalization times”, cf. the statement of Theorem 1). Indeed, since is obtained by orthogonal projection of with respect to the (closure of the) image of , we have that is uniformly bounded for all . By plugging this information into the variational formula above, we obtain that

for all and the claim that as , follows.

In other terms, we have just shown that converges (in ) to as , .

Next, we observe that the functions are harmonic (since are meromorphic, resp. anti-meromorphic), and, thus, we can apply Cauchy’s estimate to obtain that

where is the gradient in the metric associated to the deformation , and is a -neighborhood of in (that is, is essentially equal to the subset by condition (ii) of Theorem 1).

Using the facts that has “bounded geometry” (by condition (i) of Theorem 1), and as , , we ~~get that is constant along the leaves of the foliation associated to~~ see that one is getting closer to showing that is constant.

Nevertheless, the information obtained in the previous paragraph is not quite sufficient to conclude that is constant because the leaves of the foliation associated to (sometimes called *Loch Ness monsters* in the flat surfaces literature, see, e.g., this paper here) might not have bounded geometry. For this reason, Giovanni also needs ~~At this point, it remains only~~ to control the behavior of in the -direction. Here, after replacing by an adequate truncation of its Fourier series in the -direction still called by a slight abuse of notation, Giovanni told us (without giving the proof because he ran out of time) that a computation based on arguments from Harmonic Analysis reveals that

Because , the bounded geometry condition (i) in Theorem 1 allows us to conclude that ~~is also constant along the -direction. Therefore, we deduce that~~ is constant on , and, hence, the geodesic flow (generated by ) is ergodic (so that the sketch of proof of Theorem 1 is complete).

**Remark 9.** As we mentioned in Remark 4 above, Giovanni’s choice of deformation in the -direction was purely guided by the arguments from Harmonic Analysis in the proof of Theorem 3 which “impose” the factor of in his control of the growth of .


In this blog post, I will transcribe my notes for Giovanni’s talk. Of course, all mistakes in this post are my entire responsibility. Also, I apologize in advance for any wrong statements in what follows: indeed, I arrived at the seminar room about 10 minutes after Giovanni’s talk had started; furthermore, since the seminar room was crowded (about 30 to 40 mathematicians were attending the talk), I was forced to sit in the back of the room and consequently sometimes I could not properly hear Giovanni’s explanations.

The main actor in Giovanni’s talk was the *classical horocycle flow* . By definition, is the flow induced by the action of the -parameter subgroup on the unit cotangent bundle of a hyperbolic surface of finite area (i.e., is a lattice of ).
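For readers who like to experiment, here is a minimal Python sketch of the matrices behind this definition. It assumes the usual conventions (which may differ from those of the talk): the horocycle flow corresponds to the upper triangular unipotent matrices `[[1, t], [0, 1]]` and the geodesic flow to the diagonal matrices `diag(e^{s/2}, e^{-s/2})`. The helper names are hypothetical. The sketch checks the group law of the horocycle subgroup and the renormalization relation saying that conjugation by the geodesic flow rescales the horocycle time by `e^s`.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def horocycle(t):
    """Unipotent matrix [[1, t], [0, 1]] (assumed convention)."""
    return ((1.0, t), (0.0, 1.0))


def geodesic(s):
    """Diagonal matrix diag(e^{s/2}, e^{-s/2}) (assumed convention)."""
    return ((math.exp(s / 2), 0.0), (0.0, math.exp(-s / 2)))


def close(a, b, tol=1e-9):
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(2) for j in range(2))


# Group law: h_2 h_3 = h_5.
assert close(matmul(horocycle(2.0), horocycle(3.0)), horocycle(5.0))

# Renormalization: g_s h_t g_{-s} = h_{e^s t}.
s, t = 1.5, 0.7
conjugated = matmul(matmul(geodesic(s), horocycle(t)), geodesic(-s))
assert close(conjugated, horocycle(math.exp(s) * t))
```

This rescaling relation is the basic reason why the geodesic flow acts as a renormalization dynamics for the horocycle flow.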

The optimal speed of ergodicity (rate of convergence of Birkhoff averages) for classical horocycle flows was the subject of several papers in the literature of Dynamical Systems: for example, after the works of Zagier, Sarnak, Burger, Ratner, Flaminio-Forni, Strömbergsson, etc., we know that the rate of ergodicity is intimately related to the eigenvalues of the Laplacian (“size of the spectral gap”) of the corresponding hyperbolic surface (and, furthermore, this is related to the Riemann hypothesis in the case ).

The bulk of Giovanni’s talk was the discussion of the analogous problem for horocycle *maps*, that is, the question of determining the optimal such that the iterates of the time map of the horocycle flow verify

The basic motivations behind this question are potential applications to “sparse equidistribution problems” (some of them coming from Number Theory) such as:

- The following *particular case* of Sarnak’s conjecture on the randomness of the Möbius function: for *all* and , one has

  In other words, the non-conventional ergodic averages of the horocycle flow along prime numbers at *every* point converge to the spatial average.
- N. Shah’s conjecture: for *each* , one has

  for *all* and whenever with cocompact (i.e., the hyperbolic surface is compact). In other terms, the non-conventional ergodic averages of the horocycle flow along a *polynomial sequence* of times of the form , , at *every* point converge to the spatial average.

Also, Giovanni expects that the tools developed to obtain an estimate of the form could help in deriving *quantitative versions* of Ratner’s equidistribution results in more general contexts than the classical horocycle flows.

Before stating some of the main results of Flaminio, Forni and Tanis, let us just mention that:

- Sarnak and Ubis gave in 2011 the following evidence towards the particular case of Sarnak’s conjecture stated above: every weak-$*$ limit of the sequence of probability measures
is an absolutely continuous measure (with respect to ) whose density is bounded by . (Here, is the usual Dirac mass at );

- Very roughly speaking, evidence in favor of Shah’s conjecture for *very small* is the fact that behaves like a linear function (with a mildly large factor ) for , so that Shah’s conjecture should not be very far from the corresponding statement of equidistribution for linear sequences of times . As it turns out, Flaminio, Forni and Tanis were able to convert this heuristic argument into a proof of Shah’s conjecture for very small: indeed, they are confident that Shah’s conjecture is settled for and they hope to push their methods to get the same results for . Here, a key ingredient is Theorem 1 below where Flaminio, Forni and Tanis establish a precise control on the quantity .

After this brief introduction to horocycle maps, we are ready to state the main result of this post:

**Theorem 1 (Flaminio-Forni-Tanis).** Let be a cocompact subgroup of and fix . For all and , one has

where is the smallest eigenvalue of the Laplacian on the hyperbolic surface , is the following quantity (related to the so-called *spectral gap*):

is an adequate Sobolev norm (say, , i.e., depends on the first twelve derivatives of ), is a “universal” constant (depending on only) and is a constant depending on and .

The right-hand side of (1) says that the quantity is controlled by a “spectral term” and by a “uniform term” . In particular, this quantity is controlled exclusively by the uniform part when the spectral gap is sufficiently large (i.e., ).

**Remark 1.** The proof of Theorem 1 shows that the “spectral term” can also be eliminated if is a coboundary. In other words, if where is a bounded function and is the infinitesimal generator (vector field) of the horocycle flow , then one has:

The first step in the proof of Theorem 1 is to take Fourier transform in the time variable of the expression

By doing so, we are naturally led to the study of the following *twisted* ergodic averages:

where (“”) and has zero average. Then, Flaminio, Forni and Tanis use this to show that Theorem 1 follows from the following result about twisted ergodic averages:

**Theorem 2 (Flaminio-Forni-Tanis).** In the setting of Theorem 1, one has

for any and .

**Remark 2.** The proof of this theorem provides us with a constant as , and, in fact, in this regime. This is *expected* because it is known that the speed of mixing of the horocycle flow gets slower as the size of the spectral gap gets close to zero, and, thus, one cannot hope for a “uniform control” without letting the constant explode as .

**Remark 3.** The proof of this theorem also provides us with a constant as , but Flaminio, Forni and Tanis have some hope to improve this (as there is no a priori reason to expect this kind of behavior in this regime).

Before giving a sketch of the proof of Theorem 2, let us recall that:

- Venkatesh obtained in 2010 the following bound:
where , and, more recently,

- Tanis and Visha announced (in 2014) that:

Observe that Venkatesh’s bound has the advantage that the implied constant is “universal” while this constant depends on in the case of Flaminio-Forni-Tanis and Tanis-Visha.

On the other hand, Flaminio-Forni-Tanis and Tanis-Visha obtain uniform exponents ( and resp.) on at the cost of sacrificing the uniformity on the constant (an expected fact, see Remark 2 above) contrary to Venkatesh’s bounds where the exponent on depends on the spectral gap.

Also, the control of Flaminio-Forni-Tanis of the constant on the regime (by , see Remark 2 above) is worse than Tanis-Visha’s control , but the exponent of obtained by Flaminio-Forni-Tanis (of ) is better than Tanis-Visha’s exponent (of ).

For the sake of comparison of the techniques employed by Venkatesh and Flaminio-Forni-Tanis, let us now quickly present a sketch of proof of Venkatesh’s bound. In a nutshell, Venkatesh’s method is based on the speed of ergodicity and mixing of the horocycle flow.

More concretely, let be a parameter to be chosen later and set .

A direct computation reveals that there is no harm in replacing by in our way to estimate twisted ergodic averages because

where

Next, by the Cauchy-Schwarz inequality, we have that . Moreover, the results of Burger and Flaminio-Forni on the speed of ergodicity of horocycle flows say that

with precise estimates in the error terms. In particular, our task becomes to understand how the quantity

approaches zero. Here, after “unfolding” this integral (using the definition of ), one can check that, for each , the resulting expression can be controlled in terms of Ratner’s results showing that the speed of mixing of horocycle flows is given by the size of the spectral gap. Finally, Venkatesh gets the bound described above by optimizing the choice of the parameter . See Section 3 of Venkatesh’s paper for more details.

For their part, Flaminio-Forni-Tanis use a different route to prove Theorem 2, namely, they employ *renormalization methods* to study twisted ergodic averages for horocycle flows.

This method is inspired by the renormalization method for *classical* ergodic averages of horocycle flows, where one exploits the facts that the *geodesic flow* *dilates* the orbits of the horocycle flow in the sense that , and the geodesic flow is *exponentially mixing* (with precise estimates; see, e.g., the works of Dolgopyat, Liverani, Baladi-Liverani, etc.). Also, this method is similar in spirit to the techniques used by Forni to study deviations of ergodic averages of interval exchange transformations and translation flows.
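The dilation property can be checked concretely at the level of matrices. The sketch below assumes one common convention, which may differ from the one intended above by a sign or a rescaling of time: the geodesic flow is modeled by the diagonal matrices `geodesic(t)`, the horocycle flow by the upper triangular unipotent matrices `horocycle(s)`, and the function names are hypothetical. Under these assumptions, moving the geodesic matrix across a horocycle matrix rescales the horocycle time by a factor of e^t.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def geodesic(t):
    """Assumed model of the geodesic flow: diag(e^{t/2}, e^{-t/2})."""
    return [[math.exp(t / 2), 0.0], [0.0, math.exp(-t / 2)]]

def horocycle(s):
    """Assumed model of the horocycle flow: upper triangular unipotent matrix."""
    return [[1.0, s], [0.0, 1.0]]

def close(a, b, tol=1e-9):
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(2) for j in range(2))

def dilation_holds(t, s):
    """Check the relation geodesic(t) * horocycle(s) == horocycle(s * e^t) * geodesic(t)."""
    left = matmul(geodesic(t), horocycle(s))
    right = matmul(horocycle(s * math.exp(t)), geodesic(t))
    return close(left, right)

print(dilation_holds(1.5, 0.3))
```

In this model, a horocycle arc of length s is carried by the geodesic matrix to a horocycle arc of length s·e^t, which is the mechanism that the renormalization argument exploits.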

The basic idea to apply the renormalization method sketched above in the context of twisted ergodic averages

is to reinterpret them as classical ergodic averages

of a *new* flow associated to the vector field on where and depends on .

Unfortunately, a straightforward application of the corresponding geodesic flow to renormalize does not seem to work well: the orbits of are *low* (1-)*dimensional objects* inside the unstable manifolds of the geodesic flow, and, thus, their equidistribution properties are harder to obtain (in comparison with the setting of classical ergodic averages of classical horocycle flows).

Nevertheless, Flaminio-Forni-Tanis noticed that this renormalization scheme works after replacing the geodesic flow by an adequate *scaling* (playing the role of a “fake geodesic flow”). More precisely, the idea of Flaminio, Forni and Tanis is to find a scaling that dilates the orbits of (in a similar way that the geodesic flow dilates the orbits of the horocycle flow) which is well-behaved enough to allow equidistribution estimates.

In this direction, Flaminio, Forni and Tanis start by showing that the coboundaries (i.e., the functions of the form ) are characterized by a countable family of invariant distributions in the sense that is a coboundary if and only if for all . (Compare with these works of Flaminio-Forni on horocycle flows, and Forni on translation flows).

After that, they use this family of invariant distributions to build up an adequate scaling: the main point is that the scaling must be such that the sizes of the distributions get smaller; in this way, we can control the ergodic average of an arbitrary function because after scaling it becomes closer to a coboundary and the ergodic averages of a coboundary are easy to control (for example, they stay bounded if with bounded). Here, they introduce the following scaling (on vector fields):

where , is the infinitesimal generator of the geodesic flow , is the infinitesimal generator of the stable horocycle subgroup of , and is the infinitesimal generator of the rotation group .

Denoting by the induced metric associated to (i.e., is the metric obtained by making these vector fields into an orthonormal frame), one has

so that the invariant distributions get effectively small when .

Furthermore, the crucial point about this scaling — making it into a helpful tool in the proof of Theorem 2 — is that Flaminio-Forni-Tanis can show that the *geometry* of stays uniformly bounded as .

Of course, this is a very important point in this argument because the implied constants above (showing up in the estimates of ergodic averages) are related to the *best constants* in Sobolev inequalities (among several other things) and, hence, they stay uniformly bounded whenever the geometry of is under control. (For the sake of comparison, let us mention that this “bounded geometry” property (after scalings) in the context of translation flows corresponds to the recurrence properties of the Teichmüller geodesic flow on the moduli space of translation surfaces, see these papers of Forni here and here on this subject).

In summary, the key idea to obtain Theorem 2 is the introduction of a scaling of (“mimicking” the action of the geodesic flow on horocycle flow orbits or the Teichmüller geodesic flow on translation surfaces) making all -invariant distributions small (i.e., making all functions into almost coboundaries) in such a way that the underlying geometry of stays bounded. This completes our sketch of proof of Theorem 2.

We conclude this post with the following remark.

**Remark 4.** The factor in (2) “explains” the behavior in Remark 2 above.


The first talk of this seminar in this new format was given by Alba Málaga, and the next two talks (on next November 12th and December 10th) will be given by Giovanni Forni (on the ergodicity for billiards in irrational polygons) and James Tanis (on equidistribution for horocycle maps): the details can be found here.

In this blog post, we will discuss Alba’s talk about some of the results in her PhD thesis (under the supervision of J.-C. Yoccoz) concerning a family of maps preserving the measure of (as hinted by the title of this post). Of course, any mistakes/errors in what follows are my entire responsibility.

In her PhD thesis, Alba studies the following family of dynamical systems (“cylinder flows”).

The phase space is where is the unit circle. We call the circle of level in the phase space.

The parameter space is .

Given a parameter , we can define a transformation of the phase space by rotating the elements of the circle of level by , and then by putting them at the level (one level up) or (one level down) depending on whether they fall in the first or second half of the circle of level . In other terms,

where is the rotation by on the unit circle .

Note that we have left *undefined* at the points such that or . Of course, one can complete the definition of by sending each of the points in this countable family to a level up or down in an arbitrary way. However, we prefer *not* to do so because this countable family of points will play no role in our discussion of typical orbits of . Instead, we will think of the set of points where is undefined as a (very mild) *singular set*.
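For readers who like to experiment, the definition above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `step` and `orbit` are hypothetical, the parameter is passed as a function `alpha(n)`, the circle is modeled as [0, 1) with exact rational arithmetic, the "first half" is taken to be [0, 1/2), and the singular points are (arbitrarily) assigned a level by this half-open convention.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def step(alpha, n, x):
    """One iterate of the cylinder map: rotate by alpha(n), then move up or down."""
    y = (x + alpha(n)) % 1      # rotation by alpha_n on the circle [0, 1)
    if y < HALF:                # first half of the circle: one level up
        return n + 1, y
    return n - 1, y             # second half of the circle: one level down

def orbit(alpha, n, x, steps):
    """Return the initial point (n, x) followed by its first `steps` iterates."""
    points = [(n, x)]
    for _ in range(steps):
        n, x = step(alpha, n, x)
        points.append((n, x))
    return points

# Hypothetical parameter, chosen only for illustration:
# alpha_n = 1/2 at even levels and alpha_n = 1/5 at odd levels.
def alpha(n):
    return HALF if n % 2 == 0 else Fraction(1, 5)

levels = [n for n, _ in orbit(alpha, 0, Fraction(1, 7), 6)]
print(levels)
```

Using `Fraction` instead of floating point numbers avoids rounding errors when deciding on which side of 1/2 a point falls, which matters for long orbits.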

Alba’s initial motivation for studying this family comes from billiards in irrational polygons. Indeed, our current knowledge of the dynamics of billiard maps on irrational polygons (i.e., polygons whose angles are not all rational multiples of ) is very poor, and, as Alba explained very well in her talk (with the aid of computer-made figures), she has a good *heuristic* argument suggesting that the billiard map on an irrational lozenge obtained by a small perturbation of a unit square can be thought of as a small perturbation of some members of the family . However, we will not pursue this direction further today and we will focus exclusively on the features of from now on.

It is an easy exercise to check that, for any parameter , the corresponding dynamical system preserves the *infinite* product measure , where is the counting measure on and is the Lebesgue measure on .

In this setting, Alba’s thesis is concerned with the dynamics of for a *typical* parameter (in both Baire-category and measure-theoretical senses).

Before stating some of Alba’s results, let us quickly discuss the dynamical behavior of for some *particular* choices of the parameter .

**Example 1.** Consider the constant sequence . By definition, acts by a translation by on the -coordinate of all points of the phase space. In particular, the second iterate of any point has the form where . Furthermore, the function is not difficult to compute: since , we see that if , resp. , then resp. and, hence, . In other words, for all , and, thus, is a *periodic* transformation (of period two).

**Example 2.** Consider the constant sequence . Similarly to the previous example, acts periodically (with period ) on the -coordinate in the sense that where . Again, the function is not difficult to compute: by dividing the unit circle into the six intervals , , one can easily check that

In particular, we see that systematically moves the copy of an interval with even, resp. odd, at the circle of level to the corresponding copy , resp. , of the interval at level , resp. . In other words, has *wandering domains* (i.e., domains which are disjoint from all their non-trivial iterates under the map) of positive -measure and, hence, is not conservative in the sense that it does not satisfy Poincaré’s recurrence theorem with respect to the infinite invariant measure : for example, for each , sends the subset of -measure always “upstairs” to its copy at the -th level, so that the orbits of points in escape to (one of the “ends”) in the phase space .

**Remark 1.** The reader can easily generalize the previous two examples to obtain that the transformation associated to the constant sequence with (a rational number written in lowest terms) is periodic or it has wandering domains of positive measure depending on whether the denominator is even or odd.
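The dichotomy in Examples 1 and 2 and Remark 1 can be tested numerically. The sketch below is only an illustration under assumptions: the names `step`, `drift` and `midpoints` are hypothetical, the circle is modeled as [0, 1) with the first half equal to [0, 1/2), and the sample points are chosen away from the singular set. For a constant sequence equal to p/q, it computes the net change of level after q iterates.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def step(alpha, n, x):
    """One iterate for a constant sequence alpha_n = alpha (assumed convention)."""
    y = (x + alpha) % 1
    return (n + 1, y) if y < HALF else (n - 1, y)

def drift(alpha, x):
    """Net change of level after q iterates, where alpha = p/q in lowest terms."""
    n, y = 0, x
    for _ in range(alpha.denominator):
        n, y = step(alpha, n, y)
    assert y == x  # q rotations by p/q bring the circle coordinate back
    return n

def midpoints(q):
    """Midpoints of the 2q intervals [j/(2q), (j+1)/(2q)), j = 0, ..., 2q - 1."""
    return [Fraction(2 * j + 1, 4 * q) for j in range(2 * q)]

for alpha in (Fraction(1, 2), Fraction(1, 3), Fraction(1, 4), Fraction(2, 5)):
    print(alpha, [drift(alpha, x) for x in midpoints(alpha.denominator)])
```

Under these conventions, the drift vanishes at all sample points for the denominators 2 and 4 (periodic behavior), while it equals +1 or -1 for the denominators 3 and 5 (points drifting systematically up or down), in line with Remark 1.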

**Example 3.** By a theorem of Conze and Keane, the transformation associated to a constant sequence with is ergodic (but not minimal).

Today, we will give sketches of the proofs of the following two results:

**Theorem 1 (Málaga).** For almost every parameter (with respect to the standard product Lebesgue measure), the transformation is conservative, i.e., has no wandering domains of positive -measure.

**Theorem 2 (Málaga).** For a Baire-generic parameter (with respect to the standard product topology), the transformation is conservative, ergodic, and minimal.

**1. Conservativity of for typical parameters **

Let . We say that the circle of level is a *mirror* for if . This nomenclature is justified by the fact that any orbit of hitting a mirror level ends up bouncing back:

- by definition, a point sent upstairs by to the point at the level satisfies ; thus, the fact that at a mirror level forces the next iterate to come back to the level , i.e.,
because if and only if ; in other terms, a point sent upstairs by to a mirror level is reflected downstairs to the initial level in the next iteration;

- similarly, a point sent downstairs by to a mirror level is reflected upstairs to the level in the next iteration.

Inspired by the notion of mirror levels, let us introduce the set of parameters:

The next proposition follows directly from the definitions of the standard product topology and Lebesgue measure on , and it is left as an exercise to the reader:

Here, we recall that a subset is a subset containing a countable intersection of open subsets, and, by definition, we say that a given property holds for a Baire-generic parameter whenever this property is verified for all parameters in a certain dense subset.

In particular, Proposition 3 implies that the first part of Theorem 2 and the entire statement of Theorem 1 are both immediate consequences of the following result:

Let us now complete our outline of the proof of Theorem 1 (and the first part of Theorem 2) by giving the basic idea behind the proof of Proposition 4 (while sweeping the details under the rug).

The main point is the following simple variant of the notion of mirror level. Suppose that is a level such that is very close to . Of course, the level is not a (perfect) mirror if , but a direct computation reveals that it is an *almost mirror*: the set of points passing through the level under iteration by (i.e., not bouncing back to the previous level) has measure because this set is the union of two intervals of sizes (containing in their boundaries) at the circles of levels and .
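The "direct computation" behind almost mirrors can be mimicked numerically. The sketch below is an illustration under assumptions: the names `passes_through` and `passing_measure` are hypothetical, the circle is [0, 1) with first half [0, 1/2), the rotation at the level under consideration is taken to be 1/2 + eps, and the measure is estimated by counting points of a uniform grid. It measures the non-reflected set at the almost mirror level itself rather than at the two neighboring levels, which gives the same total since rotations preserve Lebesgue measure.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def passes_through(alpha_n, y):
    """True if a point entering level n at position y leaves on the far side."""
    came_from_below = y < HALF      # points sent upstairs land in [0, 1/2)
    z = (y + alpha_n) % 1           # rotation at level n
    goes_up = z < HALF
    return goes_up == came_from_below   # keeps moving in the same direction

def passing_measure(eps, grid=1000):
    """Proportion of grid points at level n that are not reflected back."""
    alpha_n = HALF + eps
    points = [Fraction(2 * k + 1, 2 * grid) for k in range(grid)]
    return Fraction(sum(passes_through(alpha_n, y) for y in points), grid)

print(passing_measure(Fraction(0)))        # perfect mirror
print(passing_measure(Fraction(1, 10)))    # almost mirror with eps = 1/10
```

With these conventions, a perfect mirror lets nothing through, while an almost mirror with defect eps lets through two intervals of length eps each, one for each direction of travel.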

Using the notion of almost mirror levels, a rough outline of the proof of Proposition 4 goes as follows. By contradiction, assume that is not conservative, i.e., there exists a (wandering) set of positive -measure whose iterates under are mutually disjoint. By the Lebesgue density theorem, we can essentially think of as an interval of positive Lebesgue measure (around a density point of ) at a circle of fixed level (say ).

Note that the iterates of escape to infinity (by going to or on the phase space ): this happens because the iterates of are all mutually disjoint and their -measures are equal to (since preserves the measure ), and the -measure of any “box” consisting of the union of circles of level with is finite for all . So, the number of iterates of trapped inside a given box is finite since it verifies
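In symbols, the counting bound described in the previous paragraph can be sketched as follows. The notation is assumed here for illustration: $W$ is the wandering set, $B_N$ is the box of levels $|n|\leq N$, $\mu$ is the invariant measure, $F$ is the map, and the Lebesgue measure of each circle is normalized to $1$.

```latex
\#\{k\in\mathbb{Z} : F^k(W)\subset B_N\}\cdot\mu(W)
  \;=\; \mu\Big(\bigsqcup_{k\,:\,F^k(W)\subset B_N} F^k(W)\Big)
  \;\leq\; \mu(B_N) \;=\; 2N+1,
\qquad\text{so that}\qquad
\#\{k\in\mathbb{Z} : F^k(W)\subset B_N\} \;\leq\; \frac{2N+1}{\mu(W)} \;<\; \infty .
```

The equality uses that the iterates of $W$ are mutually disjoint and all have the same measure $\mu(W)>0$.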

On the other hand, by definition of , given , there are arbitrarily large numbers such that and , i.e., the levels and are -almost mirrors.

In particular, since can be chosen arbitrarily small, the set of points whose iterate is not reflected back by a -almost mirror has tiny -measure (equal to ), and preserves the measure , one can show that the -measure of would be arbitrarily small, i.e., : this occurs because the -iterates of the wandering domain go to either or on the phase space , and, on their way to infinity, they must pass through all -almost mirrors with arbitrarily small located at either the levels or the levels . Of course, this contradicts our assumption that is a wandering domain of with , so that our brief sketch of the proof of Proposition 4 is complete.

**2. Ergodicity of for (Baire) generic parameters **

Recall from the previous section that Theorem 1 and the first part of Theorem 2 were direct consequences of Propositions 3 and 4. Thus, it remains only to show that is ergodic and minimal for a Baire-generic parameter .

Since the argument to show the minimality of is very similar to the one proving the ergodicity of (both arguments are based on results of minimality and unique ergodicity for interval exchange transformations and translation flows on translation surfaces; cf. Remark 2), from now on we will focus exclusively on the ergodicity of for Baire-generic parameters .

By Proposition 3, our task is reduced to showing that

The fact that the condition “ is ergodic” leads to a subset is *almost* due to Oxtoby-Ulam: indeed, Oxtoby-Ulam observed that the ergodicity condition (written in terms of Birkhoff averages) usually leads to sets in the setting of *probability measure* preserving transformations; of course, preserves an *infinite* measure, but, as we shall see in a moment, Oxtoby-Ulam’s argument can be adapted to this context.

Let , so that is conservative. In this case, we have a *well-defined* countable family of first-return maps of the orbits of to the circle of level . Note that each preserves the natural Lebesgue (probability) measure on .

We claim that the ergodicity of is detected by the countable family of probability measure preserving transformations, i.e., is ergodic if and only if is ergodic for all . In fact, if is ergodic, then each must be ergodic (otherwise the -iterates of a non-trivial -invariant subset of would give a subset of contradicting the ergodicity of ), and, conversely, if is ergodic for all , then a -invariant set with positive measure must be trivial (as it intersects a circle of some level in a set of positive measure, the ergodicity of implies that actually it intersects the circle of level in a subset of full measure; hence, by iterating under and using the ergodicity of for all , we conclude that it intersects all circles of all levels in subsets of full measure).

Now, we will combine this claim with Oxtoby-Ulam’s argument to show that the set is a subset. To this end, we select a dense subset of continuous functions on , and we observe that the claim above (and the definition of ergodicity in terms of Birkhoff averages) implies that if and only if where

(and, by abuse of notation, we think of the function as defined on when writing ). Since the parameter sets , and, a fortiori, , can be shown to be relatively open in for each fixed , , and (the details are left as an exercise to the reader), we deduce that is a -subset.

At this point, it remains only to show that is a *dense* subset of in order to complete the proof of Proposition 5. By Baire’s theorem, it suffices to prove that is a dense subset for each fixed .

Given and a neighborhood of , we want to find .

Because the basis of neighborhoods of in the standard product topology of is generated by open sets of the form

where and is a *finite* set, we can take large enough so that and contains an element with .

By definition, the circles at the levels and are mirrors for , so that has part of its dynamics confined into the box . Furthermore, the reader can verify that the restriction of can be interpreted as an interval exchange transformation related to the vertical translation flow on a (compact) translation surface obtained by gluing the pieces of the boundaries of the cylinders according to the formulas defining . Moreover, a “translation”

of the parameter by a small quantity leads to a transformation associated to the translation flow on in an almost vertical direction.

By a theorem of Kerckhoff, Masur and Smillie, the translation flow on the compact translation surface is uniquely ergodic in almost all directions. Using this result, we can choose so small that is a parameter such that is uniquely ergodic. From this fact, it follows that also belongs to (and, actually, ), as desired.

**Remark 2.** The same argument above works by replacing “uniquely ergodic” by “minimal”.

**Remark 3.** A “counter-intuitive” feature of the previous argument (which is somewhat common in Baire genericity type arguments) is that the denseness of (and its analog consisting of minimal dynamics parameters) actually uses *non-ergodic* (and *non-minimal*) transformations whose dynamics has a piece blocked inside the box between the mirror levels . Of course, the main point is that is “ergodic (and minimal) on a large portion” of the phase space, and this kind of “partial information” is usually sufficient to run a denseness proof based on Baire’s theorem.


However, before entering into the mathematical discussion strictly speaking, let me take the opportunity to dedicate this blog post to the memory of two Russian mathematicians who passed away earlier this month: Dmitri Anosov and Nikolai Chernov. Among their several well-known contributions in Dynamical Systems, we can quote:

- Anosov’s proof of the ergodicity of a large class of -volume preserving hyperbolic systems (nowadays called Anosov diffeomorphisms);
- Chernov’s proof of subexponential mixing for a large class of Anosov flows;
- …

Of course, the list of contributions of Anosov and Chernov to Dynamical Systems is vast: each of them wrote more than 90 research articles and books about the features of systems with some hyperbolicity (such as geodesic flows on negatively curved manifolds and chaotic billiards) among other topics.

In particular, it is out of the scope of this post to provide detailed descriptions of the works of these two very influential dynamicists.

On the other hand, as a form of “small compensation”, let me say that the second section of this post (about rates of the WP flow on the modular surface) briefly discusses some of the ideas advanced by these two mathematicians.

Concerning the rates of mixing of the WP flow, let us recall that, by Burns-Masur-Wilkinson theorem (cf. Theorem 1 in the first post of this series), the WP flow on is *mixing* with respect to the Liouville measure whenever .

By definition of the mixing property, this means that the correlation function converges to as for any given -integrable observables and . (See, e.g., the section “ formulation” in this Wikipedia article about the mixing property.)

Given this scenario, it is natural to ask how *fast* the correlation function converges to zero. In general, the correlation function can decay to (as a function of ) in a very slow way depending on the choice of the observables (see, e.g., this blog post of Climenhaga for some concrete examples). Nevertheless, it is often the case (for mixing flows with some hyperbolicity) that the correlation function decays to with a *definite* (e.g., polynomial, exponential, etc.) speed when restricting the observables to appropriate spaces of “reasonably smooth” functions.

In other words, given a mixing flow (with some hyperbolicity), it is usually possible to choose appropriate functional (e.g., Hölder, , Sobolev, etc.) spaces and such that

- for some constants , and for all (*polynomial decay*), or
- for some constants , and for all (*exponential decay*).
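For concreteness, here is how these two regimes are usually written down. This is only the standard textbook form, stated under assumptions: the symbols $\varphi_t$ (the flow), $\mu$ (the invariant probability measure), $C_t(f,g)$ (the correlation function), the constants $C, c, n > 0$ and the norms $\|\cdot\|_X$, $\|\cdot\|_Y$ are notations chosen here for illustration.

```latex
C_t(f,g) := \int f\cdot (g\circ\varphi_t)\, d\mu \;-\; \int f\, d\mu \int g\, d\mu ,
\qquad
|C_t(f,g)| \leq C\, t^{-n}\, \|f\|_X\, \|g\|_Y
\quad\text{(polynomial decay)},
\qquad
|C_t(f,g)| \leq C\, e^{-c\,t}\, \|f\|_X\, \|g\|_Y
\quad\text{(exponential decay)},
```

for all $t\geq 1$, all $f\in X$ and all $g\in Y$.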

Evidently, the “precise” rate of mixing of the flow (i.e., the sharp values of the constants , and/or above) depends on the choice of the functional spaces and (as they might change if we replace observables by observables say). On the other hand, the *qualitative* speed of decay of , that is, the fact that decays polynomially or exponentially as whenever and are “reasonably smooth”, remains *unchanged* if we select and from a well-behaved scale of functional spaces (like spaces, , or spaces, ). In particular, this partly explains why in the Dynamical Systems literature one simply says that a given mixing flow has “polynomial decay” or “exponential decay”: usually we are interested in the qualitative behavior of the correlation function for reasonably smooth observables, but the particular choice of functional spaces and is normally treated as a “technical detail”.

After this brief description of the notion of rate of mixing (speed of decay of correlation functions), we are ready to state the main result of this post.

**Theorem 1 (Burns-Masur-M.-Wilkinson).** The rate of mixing of the WP flow on is:

- at most polynomial when ;
- rapid (faster than any polynomial) when .

**Remark 1.** This result was announced as Theorem 2 in the first post of this series and also in this preprint here. Since then, Burns, Masur, Wilkinson and myself found some evidence indicating that the Weil-Petersson geodesic flow on is actually exponentially mixing when . The details will hopefully appear in the forthcoming paper (currently still in preparation).

**Remark 2.** An open problem left by Theorem 1 is to determine the rate of mixing of the WP flow on for . Indeed, while this theorem provides a polynomial upper bound for the rate of mixing in this setting, it does not rule out the possibility that the actual rate of mixing of the WP flow is sub-polynomial (even for reasonably smooth observables). Heuristically speaking, we believe that the sectional curvatures of the WP metric control the time spent by WP geodesics near the boundary of . In particular, it seems that the problem of determining the rate of mixing of the WP flow (when ) is somewhat related to the issue of finding suitable (polynomial?) bounds for how close to zero the sectional curvatures of the WP metric can be (in terms of the distance to the boundary of ). Unfortunately, the best available bounds for the sectional curvatures of the WP metric (due to Wolpert) do not rule out the possibility that some of these quantities get extremely close to zero (see Remark 4 of this post here).

The difference in the rates of mixing of the WP flow on when or in Theorem 1 reflects the following simple (yet important) feature of the WP metric near the boundary of the Deligne-Mumford compactification of .

In the case , e.g., , the moduli space equipped with the WP metric looks like the surface of revolution of the profile near the cusp at infinity (see Remark 6 of this post here). In particular, even though a -neighborhood of the cusp is “polynomially large” (with area ), the Gaussian curvature approaches only near the cusp and, as it turns out, this strong negative curvature near the cusp forces all geodesics not pointing directly towards the cusp to come back to the compact part in bounded (say ) time. In other words, the excursions of infinite WP geodesics on near the cusp are so quick that the WP flow on is “close” to a classical Anosov geodesic flow on a negatively curved compact surface. In particular, it is not entirely surprising that the WP flow on is rapidly mixing.

On the other hand, in the case , the WP metric on has *some* sectional curvatures close to *zero* near the boundary of the Deligne-Mumford compactification of (see Theorem 3 and Remark 5 of this post here). By exploiting this feature of the WP metric on for (that has no counterpart for or ), we will build a *non-negligible* set of WP geodesics spending a *long* time near the boundary of before eventually getting into the compact part. In this way, we will deduce that the WP flow on takes a fair (polynomial) amount of time to mix certain parts of the boundary of with fixed compact subsets of .

In the remainder of this post, we will give some details of the proof of Theorem 1. In the next section, we give a fairly complete proof (assuming the results in this previous post, of course) of the polynomial upper bound on the rate of mixing of the WP flow on when . After that, in the final section, we provide a *sketch* of the proof of the rapid mixing property of the WP flow on . In fact, we decided (for pedagogical reasons) to explain some key points of the rapid mixing property *only* in the *toy model* case of a negatively curved surface with one cusp corresponding *exactly* to a surface of revolution of a profile , . In this way, since the WP metric near the cusp of can be thought of as a “perturbation” of the surface of revolution of (thanks to Wolpert’s asymptotic formulas), the reader hopefully will get a flavor of the main ideas behind the proof of rapid mixing of the WP flow on without getting into the (somewhat boring) technical details needed to check that the arguments used in the toy model case are “sufficiently robust” so that they can be “carried over” to the “perturbative setting” of the WP flow on .

**1. Rates of mixing of the WP flow on . I **

In this section, our notations are the same as in this previous post here.

Given , let us consider the portion of consisting of such that a non-separating (homotopically non-trivial, non-peripheral) simple closed curve has hyperbolic length . The following picture illustrates this portion of as a -neighborhood of the stratum of the boundary of the Deligne-Mumford compactification where gets pinched (i.e., becomes zero).

Note that the stratum is non-trivial (that is, not reduced to a single point) when . Indeed, by pinching as above and by disconnecting the resulting node, we obtain Riemann surfaces of genus with punctures whose moduli space is isomorphic to . It follows that is a complex orbifold of dimension , and, a fortiori, is not trivial. Evidently, this argument breaks down when : for example, by pinching a curve as above in a once-punctured torus and by removing the resulting node, we obtain thrice punctured spheres (whose moduli space is trivial). In particular, our Figure 1 concerns *exclusively* the case .

We want to locate certain regions near taking a long time to mix with the compact part of . To this end, we will exploit the geometry of the WP metric near , e.g., the fact provided by Wolpert’s formulas (cf. Theorem 3 in this post) that some sectional curvatures of the WP metric approach zero, to build nice sets of unit vectors traveling in an “almost parallel” way to for a significant amount of time.

More precisely, we consider the vectors and (where is the complex structure). By definition, they span a complex line . Intuitively, the complex line points in the normal direction to a “copy” of inside a level set of the function as indicated in the following picture:

Using the complex line , we formalize the notion of “almost parallel” vector to . Indeed, given , let us denote by the quantity (where is the WP metric). By definition, measures the size of the projection of the unit vector in the complex line . In particular, we can think of as “almost parallel” to whenever the quantity is very close to zero.

In this setting, we will show that unit vectors almost parallel to whose footprints are close to always generate geodesics staying near for a long time. More concretely, given , let us define the set

where and is the footprint of the unit vector . Equivalently, is the disjoint union of the pieces of spheres attached to points with . The following figure summarizes the geometry of :

We would like to prove that a geodesic originating at any stays in a -neighborhood of for an interval of time of size of order , so that the WP geodesic flow does *not* mix with any fixed ball in the compact part of of Riemann surfaces with systole :

In this direction, we will need the following lemma from the third post of this series (cf. Lemma 13 in this post here).

From this lemma, it is not hard to estimate the amount of time spent by a geodesic near for an arbitrary :

**Lemma 3.** There exists a constant (depending only on and ) such that

for all and .

*Proof:* By definition, implies that . Thus, it makes sense to consider the maximal interval of time such that for all .

By Lemma 2, we have that , i.e., for some constant depending only on and . In particular, for all . From this estimate, we deduce that

for all . Since the fact that implies that , the previous inequality tells us that

for all .

Next, we observe that, by definition, . Hence,

By putting together the previous two inequalities and the fact that (as ), we conclude that

Since was chosen so that is the maximal interval with for all , we have that . Therefore, the previous estimate can be rewritten as

Because , it follows from this inequality that where .

In other words, we showed that , and, *a fortiori*, for all . This completes the proof of the lemma.

Once we have Lemma 3 in our toolbox, it is not hard to infer some upper bounds on the rate of mixing of the WP flow on when .

**Proposition 4.** Suppose that the WP flow on has a rate of mixing of the form

for some constants , , for all , and for all choices of -observables and . Then, , i.e., the rate of mixing of the WP flow is at most polynomial.

*Proof:* Let us fix once and for all an open ball (with respect to the WP metric) contained in the compact part of : this means that there exists such that the systoles of all Riemann surfaces in are .

Take a function supported on the set of unit vectors with footprints on with values such that and : such a function can be easily constructed by smoothing the characteristic function of with the aid of bump functions. Next, for each , take a function supported on the set with values such that and : such a function can also be constructed by smoothing the characteristic function of after taking into account the description of the WP metric near given by Theorems 2 and 3 in this post here and the definition of (in terms of the conditions and ). Furthermore, this description of the WP metric near combined with the asymptotic expansion where and is a twist parameter (see the proof of Lemma 4 of this post here) says that : indeed, the condition on footprints of unit tangent vectors in provides a set of volume (cf. the proof of Lemma 4 of the aforementioned post for details) and the condition on unit tangent vectors in with a fixed footprint provides a set of volume comparable to the Euclidean area of the Euclidean ball (cf. Theorem 2 in this post here), so that

In summary, for each , we have a function supported on with , and for some constant depending only on and .

Our plan is to use the observables and to give some upper bounds on the mixing rate of the WP flow . For this sake, suppose that there are constants and such that

for all and .

By Lemma 3, there exists a constant such that whenever . Indeed, since is a symmetric set (i.e., if and only if ), it follows from Lemma 3 that all Riemann surfaces in the footprints of have a systole . Because we took in such a way that all Riemann surfaces in have systole , we obtain , that is, , as it was claimed.

Now, let us observe that the function is supported on because is supported on and is supported on . By putting together this fact and the claim in the previous paragraph (that for ), we deduce that whenever . Thus,

By plugging this identity into the polynomial decay of correlations estimate , we get

whenever and .

We claim that the previous estimate implies that . In fact, recall that our choices were made so that where is a fixed ball, , for some constant and . Hence, by combining these facts and the previous mixing rate estimate, we get that

that is, , for some constant and for all sufficiently small (so that and ). It follows that , as we claimed. This completes the proof of the proposition.

**Remark 3.** In the statement of the previous proposition, the choice of -norms to measure the rate of mixing of the WP flow is not very important. Indeed, an inspection of the construction of the functions in the argument above reveals that for any , . In particular, the proof of the previous proposition is sufficiently robust to show also that a rate of mixing of the form

for some constants , , for all , and for all choices of -observables and holds only if . In other words, even if we replace -norms by (stronger, smoother) -norms in our measurements of rates of mixing of the WP flow (on for ), our discussions so far will always give polynomial upper bounds for the decay of correlations.

At this point, our discussion of the proof of the first item of Theorem 1 is complete (thanks to Proposition 4 and Remark 3). So, we will now move on to the next section, where we give some of the key ideas in the proof of the second item of Theorem 1.

**2. Rates of mixing of the WP flow on . II **

Let us consider the WP flow on when , that is, when or .

Actually, we will restrict our attention to the case because the remaining case is very similar to .

Indeed, the moduli space of four-times punctured spheres is a *finite* cover of the moduli space : this can be seen by sending each four-punctured sphere to the elliptic curve , so that becomes naturally isomorphic to where is a congruence subgroup of of level with index . Since all arguments towards rapid mixing of geodesic flows in this section still work after taking finite covers, it suffices to prove the second item of Theorem 1 for the WP flow on .

The rate of mixing of a geodesic flow on the unit tangent bundle of a negatively curved *compact* surface is known to be *fast*: indeed, Chernov used his technique of “Markov approximations” to show *stretched exponential* decay of correlations, and Dolgopyat added a new crucial ingredient (“Dolgopyat’s estimate”) to Chernov’s work to prove *exponential* decay of correlations.

Evidently, these works of Chernov and Dolgopyat cannot be applied to the WP flow on because of the non-compactness of due to the presence of a (single) cusp (at infinity). Nevertheless, this suggests that we should be able to determine the rate of mixing of the WP flow on provided we have enough control of the geometry of the WP metric near the cusp.

Fortunately, as we mentioned in Example 5 of this post here, Wolpert showed that the WP metric on has an *asymptotic* expansion at a point . Thus, the WP metric on neighborhoods (with ) of the cusp at infinity of becomes closer (as ) to the metric of the surface of revolution of the profile on neighborhoods of the cusp at (as ).

Partly motivated by the scenario of the previous paragraph, from now on we will *pretend* that the WP metric on looks *exactly* like the metric at all points for some . In other words, instead of studying the WP flow on , we will focus on the rates of mixing of the following *toy model*: the geodesic flow on a negatively curved surface with a single cusp possessing a neighborhood where the metric is isometric to the surface of revolution of a profile for a fixed real number .

**Remark 4.** The surface of revolution modeling the WP metric on is obtained by rotating the profile . In other words, we see that the study of rates of mixing of the surface of revolution approximating the WP metric on is a “borderline case” in our subsequent discussion.

Here, our main motivations to replace the WP flow on by the toy model described above are:

- all important ideas for the study of rates of mixing of are also present in the case of the toy model, and
- even though the WP metric on is a perturbation of a surface of revolution, the verification of the fact that the arguments used to estimate the decay of correlations of the geodesic flow on the toy model surfaces are robust enough so that they can be carried over to the WP metric situation is somewhat boring: basically, besides performing a slight modification of the proofs to include the borderline case , one has to introduce “error terms” in the whole discussion below and, after that, one has to check that these error terms do not change the qualitative nature of all estimates.

In summary, the remainder of this section will contain a proof of the following “toy model version” of the second item of Theorem 1.

**Theorem 5.** Let be a compact surface and fix . Suppose that is equipped with a negatively curved Riemannian metric such that the restriction of to a neighborhood of is isometric to a surface of revolution of a profile (for some choices of and ). Then, the geodesic flow (associated to ) on is rapid (faster than polynomial) mixing in the sense that, for all , one can choose an adequate Banach space of “reasonably smooth” observables and a constant so that

for all .

**Remark 5.** The arguments below show that the statement above also holds when is equipped with a negatively curved metric that is isometric to a surface of revolution , , near for each .

**Remark 6.** The Riemannian metric is incomplete because the surface of revolution of is incomplete when (as the reader can check via a simple calculation).

Recall that, in the setting of Theorem 5, we want to understand the dynamics of the excursions of the geodesic flow near the cusp (in order to get rapid mixing). For this sake, we describe these excursions by rewriting the geodesic flow (near ) as a *suspension flow*.

** 2.1. Excursions near the cusp and suspension flows **

Consider a small neighborhood in of where the metric is isometric to the surface of revolution of the profile , i.e.,

Next, take a small parameter and consider the parallel . We parametrize unit tangent vectors to the surface of revolution with footprints in as follows.

Given , we denote by the unique unit tangent vector pointing towards the cusp at . Equivalently, is the unit vector tangent to the meridian at time , or, alternatively, where is the distance function from the cusp to a point . Also, we let be the unit vector obtained by rotating by in the counterclockwise sense (i.e., by applying the natural almost complex structure ).

In this setting, a unit vector pointing towards the cusp is completely determined by a real number such that and , i.e.,

The *qualitative* behavior of the excursion of a geodesic starting at can be easily determined in terms of the parameter thanks to the classical results in Differential Geometry about surfaces of revolution. Indeed, it is well-known (see, e.g., Do Carmo’s book) that such a geodesic satisfies

and

for a certain constant , and, furthermore, these relations imply the famous Clairaut’s relation:

where is the parameter attached to (i.e., ). In particular, except for the geodesic going directly to the cusp (i.e., the geodesic starting at associated to ), all geodesics (starting at with ) behave qualitatively in a simple way. In the first part of its excursion towards the cusp, the angle increases (resp. decreases) from to (resp. from to ) while the value of diminishes in order to keep up with Clairaut’s relation. Then, the geodesic reaches its closest position to the cusp at time : here, (i.e., is tangent to the parallel containing ) and, hence,

Finally, in the second part , does the “opposite” from the first part: the angle goes from to and increases from back to . The following picture summarizes the discussion of this paragraph:

**Remark 7.** Note that the time taken by the geodesic to go from the parallel to and then from back to is *independent* of the basepoint . Indeed, this is a direct consequence of the rotational symmetry of our surface. Alternatively, this can be easily seen from the formula

deduced by integration of the ODE satisfied by . Observe that this formula also shows that is uniformly bounded, i.e., for all . Geometrically, this means that all geodesics starting at must return to in bounded time unless they go directly into the cusp.
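The qualitative picture above (Clairaut's relation and the turning point of the excursion) can be checked numerically. The following Python sketch is only an illustration under explicit assumptions: it takes the surface obtained by rotating a profile `y = x**r` around the `x`-axis, and the exponent `r`, the starting abscissa `x0`, the initial angle `phi0` and all function names are hypothetical choices made for this example. It integrates the geodesic equations with a Runge-Kutta scheme and records the quantity `f(x)**2 * theta'`, which Clairaut's relation predicts to be constant along a unit-speed geodesic.

```python
import math

def profile(x, r):
    """Assumed profile f(x) = x**r together with f'(x) and f''(x)."""
    return x**r, r * x**(r - 1), r * (r - 1) * x**(r - 2)

def geodesic_rhs(state, r):
    """Geodesic equations in coordinates (x, theta) for the metric
    (1 + f'(x)^2) dx^2 + f(x)^2 dtheta^2 of the surface of revolution."""
    x, theta, vx, vtheta = state
    f, df, d2f = profile(x, r)
    ax = (f * df * vtheta**2 - df * d2f * vx**2) / (1.0 + df**2)
    atheta = -2.0 * df * vx * vtheta / f
    return (vx, vtheta, ax, atheta)

def rk4_step(state, dt, r):
    """One classical Runge-Kutta step for the geodesic equations."""
    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state, r)
    k2 = geodesic_rhs(shift(state, k1, dt / 2), r)
    k3 = geodesic_rhs(shift(state, k2, dt / 2), r)
    k4 = geodesic_rhs(shift(state, k3, dt), r)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def excursion(x0, phi0, r, dt=1e-4, max_time=50.0):
    """Follow the unit-speed geodesic leaving the parallel {x = x0} towards the
    cusp at a nonzero angle phi0 with the meridian until it returns to {x = x0}.
    Returns (return time, smallest x reached, list of Clairaut values)."""
    f, df, _ = profile(x0, r)
    state = (x0, 0.0, -math.cos(phi0) / math.sqrt(1.0 + df**2), math.sin(phi0) / f)
    time, x_min, clairaut = 0.0, x0, []
    while time < max_time:
        state = rk4_step(state, dt, r)
        time += dt
        x, _, _, vtheta = state
        x_min = min(x_min, x)
        clairaut.append(profile(x, r)[0]**2 * vtheta)
        if x >= x0:
            return time, x_min, clairaut
    raise RuntimeError("geodesic did not return to the initial parallel")
```

For instance, `excursion(0.5, 0.3, 3)` returns a finite return time, and under the assumed profile the smallest abscissa reached should agree with the value `x0 * sin(phi0)**(1/r)` obtained by setting the angle to a right angle in Clairaut's relation.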

This description of the excursions of geodesics near the cusp allows us to build a *suspension-flow* model of the geodesic flow near . Indeed, let us consider the *cross-section* . As we saw above, an element of the surface is parametrized by two angular coordinates and : the value of determines a point and the value of determines a unit tangent vector making angle with . The subset of consisting of those elements with angular coordinate corresponds to the unit vectors with footprint in pointing towards the cusp at . The equation determines a circle inside corresponding to geodesics going straight into the cusp, and, furthermore, we have a natural “first-return map” defined by where is the geodesic starting at at time .

In this setting, the orbits , are modeled by the “suspension flow” if , over the *base map* with *roof* function , .
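To make the notion of suspension flow concrete, here is a minimal Python sketch of the general construction: a point is a pair `(x, s)` with `0 <= s < roof(x)`, time moves `s` upwards at unit speed, and upon hitting the roof the point is reinjected at height zero above the image of `x` under the base map. The base map `rotation` and the roof function `roof` below are hypothetical placeholders chosen for the example; they are not the first-return map and return time of the geodesic flow discussed in the text.

```python
import math

def suspension_flow(point, t, base_map, roof):
    """Flow the point (x, s), with 0 <= s < roof(x), forward for a time t >= 0."""
    x, s = point
    s += t
    # Each time the height reaches the roof, jump to the base point base_map(x).
    while s >= roof(x):
        s -= roof(x)
        x = base_map(x)
    return x, s

def rotation(x):
    """Hypothetical base map: an irrational rotation of the circle R/Z."""
    return (x + 0.3819660112501051) % 1.0

def roof(x):
    """Hypothetical roof function, bounded between 1 and 1.5."""
    return 1.0 + 0.5 * math.sin(math.pi * x) ** 2
```

With these choices, flowing for time `a` and then for time `b` gives (up to rounding) the same point as flowing for time `a + b`, which is the group property one expects from a flow.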

**Remark 8.** Technically speaking, one needs to “complete” the definition of and by including the dynamics of the geodesic flow on the compact part of in order to properly write the geodesic flow on as a suspension flow. Nevertheless, since the major technical difficulty in the proof of Theorem 5 comes from the presence of the cusp, we will ignore the excursions of geodesics in the compact part and we will pretend that the (partially defined) flow is a “genuine” suspension flow model.

** 2.2. Rapid mixing of contact suspension flows **

One of the advantages of thinking of the geodesic flow on as a suspension flow comes from the fact that several authors have previously studied the interplay between the rates of mixing of this class of flows and the features of and : see, e.g., these papers of Avila-Gouëzel-Yoccoz and Melbourne for some results in this direction (and also for a precise definition of suspension flows).

For our current purposes, it is worth recalling that Bálint and Melbourne (cf. Theorem 2.1 [and Remarks 2.3 and 2.5] of this paper here) proved the rapid mixing property for *contact* suspension flows whose base map is modeled by a *Young tower* with *exponential tails* and whose roof function is bounded and *uniformly piecewise Hölder continuous* on each subset of the basis of the Young tower. In particular, the proof of Theorem 5 is complete once we prove that the base map is modeled by Young towers and the roof function is bounded and uniformly piecewise Hölder continuous on each element of the basis of the Young tower (whatever this means).

As it turns out, the theory of Young towers (introduced by Young in these papers here and here) is a *double-edged sword*: while it provides an adequate setup for the study of statistical properties of systems with some hyperbolicity *once* the so-called *Young towers* were built, it has the *drawback* that the construction of Young towers (satisfying all five natural but technical axioms in Young’s definition) is usually a delicate issue: indeed, one has to find a countable Markov partition of a positive measure subset (working as the basis of the Young tower) so that the return maps associated to this Markov partition verify several hyperbolicity and distortion controls, and it is not always clear where one could possibly find such a Markov partition for a given dynamical system.

Fortunately, Chernov and Zhang gave a list of *sufficient* geometric properties for a *two-dimensional* map like to be modeled by Young towers with exponential tails: in fact, Theorem 10 in Chernov-Zhang paper is a sort of “black-box” producing Young towers with exponential tails whenever seven *geometrical* conditions are fulfilled. For the sake of exposition, we will not attempt to check all seven conditions for : instead, we will focus on two main conditions called *distortion bounds* and *one-step growth condition*.

Before we discuss the distortion bounds and the one-step growth condition, we need to recall the concept of *homogeneity strips* (originally introduced by Bunimovich-Chernov-Sinai). In our setting, we take and (to be chosen later) and we make a partition of a neighborhood of the *singular set* (of geodesics going straight into the cusp) into countably many strips:

for all , . (Actually, has two connected components, but we will slightly abuse notation by denoting these connected components by .)

Intuitively, the partition into polynomial scales in the parameter is useful in our context because the relevant quantities (such as Gaussian curvature, first and second derivatives, etc.) for the study of the geodesic flow of the surface of revolution blow up with a polynomial speed as the excursions of geodesics get closer to the cusp (that is, as ). Thus, the important quantities for the analysis of the geodesic flow near the cusp become “almost constant” when restricted to one of the homogeneity strips .
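As an illustration of a partition into polynomial scales, the following Python sketch locates the strip containing a given angle. It assumes the usual Bunimovich-Chernov-Sinai convention, namely that the strip of index `k >= k0` consists of the angles `phi` with `(k+1)**(-nu) < |phi| <= k**(-nu)`; this convention, the treatment of the endpoints, and the function name `strip_index` are assumptions made for the example rather than the exact definition used above.

```python
import math

def strip_index(phi, nu, k0):
    """Return the integer k >= k0 with (k+1)**(-nu) < |phi| <= k**(-nu),
    or None when phi is zero (singular set) or lies outside all strips."""
    a = abs(phi)
    if a == 0.0 or a > k0 ** (-nu):
        return None
    # The defining inequalities are equivalent to k <= a**(-1/nu) < k + 1.
    k = max(k0, int(math.floor(a ** (-1.0 / nu))))
    # Guard against floating-point rounding near the boundaries of the strips.
    while a <= (k + 1) ** (-nu):
        k += 1
    while k > k0 and a > k ** (-nu):
        k -= 1
    return k
```

Under this convention, the strip of index `k` has width comparable to `nu * k**(-nu - 1)`, so the strips accumulate polynomially fast on the singular set.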

Also, another advantage of the homogeneity strips is the fact that they give a rough control of the elements of the countable Markov partition at the basis of the Young tower produced by Chernov-Zhang: indeed, the arguments of Chernov-Zhang show that each element of the basis of their Young tower is completely contained in a homogeneity strip. In particular, the verification of the uniform piecewise Hölder continuity of the roof function follows once we prove that the restriction of the roof function to each homogeneity strip is uniformly Hölder continuous (in the sense that, for some , the Hölder norms are bounded by a constant *independent* of ).

Coming back to the one-step growth and distortion bounds, let us content ourselves with formulating simpler *versions* of them (while referring to Sections 4 and 5 of the Chernov-Zhang paper for precise definitions): indeed, the actual definitions of these notions involve the properties of the derivative along unstable manifolds, and, in our current setting, we have just a *partially defined* map , so that we cannot talk about future iterates and unstable manifolds unless we “complete” the definition of .

Nevertheless, even if is only partially defined, we still can give crude analogs to unstable directions for by noticing that the vector field on (whose leaves are ) morally works like an unstable direction: in fact, this vector field is transverse to the singular set which is a sort of “stable set” because all trajectories of the geodesic flow starting at converge in the future to the same point, namely, the cusp at . In terms of the “unstable direction” , we define the *expansion factor* of at a point as , that is, the amount of expansion of the “unstable” vector field under . Note that, from the definitions, the expansion factor depends only on the -coordinate of . So, from now on, we will think of expansion factors as a function of .

In terms of expansion factors, the (variant of the) distortion bound condition is

where satisfies , and the (variant of the) one-step growth condition is

**Remark 9.** The one-step growth condition above is very close to the original version in the Chernov-Zhang paper (compare (3) with Equation (5.5) in the Chernov-Zhang article). On the other hand, the distortion bound condition (2) differs slightly from its original version in Equation (4.1) in the Chernov-Zhang paper. Nevertheless, they can be related as follows. The original distortion condition essentially amounts to giving estimates (where is a smooth function such that as ) whenever and belong to the same *homogeneous unstable manifold* (i.e., a piece of unstable manifold such that never intersects the boundaries of the homogeneity strips for all and ; the existence of homogeneous unstable manifolds through almost every point is guaranteed by a Borel-Cantelli type argument described in Appendix 2 of this paper of Bunimovich-Chernov-Sinai here). Here, one sees that

for some . Using the facts that decays exponentially fast (as and are in the same unstable manifold ) and is always contained in a homogeneity strip (as is a homogeneous unstable manifold), one can check that the estimate in (2) implies the desired uniform bound on the previous expression in terms of a smooth function such that as . In other words, the estimate (2) can be shown to imply the original version of distortion bounds, so that we can safely concentrate on the proof of (2).

At this point, we can summarize the discussion so far as follows. By the Bálint-Melbourne criterion for rapid mixing for contact suspension flows and the Chernov-Zhang criterion for the existence of Young towers with exponential tails for the map , we have “reduced” the proof of Theorem 5 to the following statements:

**Proposition 6.** Given and , one has the following “uniform Hölder estimate”

whenever is sufficiently small (depending on , and ).

**Proposition 7.** The expansion factor function satisfies:

- given , we can choose large (and sufficiently small) so that
where ;

- given , we can choose and such that and
for some (sufficiently large) constant and for all .

The proofs of these two propositions are given in the next two subsections and they are based on the study of perpendicular unstable Jacobi fields related to the variations of geodesics of the form , .

** 2.3. The derivative of the roof function **

From now on, we fix (e.g., ) and, for the sake of simplicity, we will denote a geodesic corresponding to an initial vector by . Of course, there is no loss of generality here because of the rotational symmetry of the surface . Also, we will suppose that as the case is symmetric.

Note that the roof function is defined by the condition , or, equivalently,

where denotes the distance from a point to the cusp at and is the distance from to . By taking the derivative with respect to at and by recalling that , we obtain that

where . Since where , we have , and, *a fortiori*,

Let us compute the two inner products above. By definition of the parameter and the symmetry of the revolution surface , we have . Also, if we denote by the perpendicular (“unstable”) Jacobi field along the geodesic associated to the variation of (cf. Section 2 of this previous post here) with initial conditions and , then

From the computation of the inner products above and the fact that they add up to zero, we deduce that , that is,

In other terms, the previous equation says that the derivative can be controlled via the quantity measuring the growth of the perpendicular Jacobi field at the return time . Here, it is worth recalling that Jacobi fields are driven by Jacobi’s equation:

where is the Gaussian curvature of the surface of revolution at the point . Also, it is useful to keep in mind that Jacobi’s equation implies that the quantity satisfies Riccati’s equation

where .
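The passage from Jacobi's equation to Riccati's equation can be illustrated numerically. The sketch below assumes the standard normalizations, namely Jacobi's equation `j'' + K j = 0` for the size `j` of a perpendicular Jacobi field and Riccati's equation `u' = -K - u**2` for `u = j'/j`; the constant curvature `K = -1`, the initial data `j(0) = 1`, `j'(0) = 0`, and the function names are choices made for this toy example only. In this model case one has `j(t) = cosh(t)` and `u(t) = tanh(t)`, which gives an exact solution to compare with.

```python
import math

def rk4(rhs, y, t, dt):
    """One classical Runge-Kutta step for the system y' = rhs(t, y)."""
    k1 = rhs(t, y)
    k2 = rhs(t + dt / 2, [a + dt / 2 * b for a, b in zip(y, k1)])
    k3 = rhs(t + dt / 2, [a + dt / 2 * b for a, b in zip(y, k2)])
    k4 = rhs(t + dt, [a + dt * b for a, b in zip(y, k3)])
    return [a + dt / 6 * (p + 2 * q + 2 * r + s)
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]

def jacobi_and_riccati(curvature, j0, dj0, t_end, steps=2000):
    """Integrate j'' + K(t) j = 0 and u' = -K(t) - u^2 with u(0) = j'(0) / j(0).
    Returns ([j(t_end), j'(t_end)], u(t_end))."""
    dt = t_end / steps
    jac, ric, t = [j0, dj0], [dj0 / j0], 0.0
    for _ in range(steps):
        jac = rk4(lambda s, y: [y[1], -curvature(s) * y[0]], jac, t, dt)
        ric = rk4(lambda s, y: [-curvature(s) - y[0] ** 2], ric, t, dt)
        t += dt
    return jac, ric[0]
```

Calling `jacobi_and_riccati(lambda t: -1.0, 1.0, 0.0, 2.0)` integrates both equations independently, so comparing `jac[1] / jac[0]` with the returned value of `u` checks that the logarithmic derivative of a Jacobi field does solve Riccati's equation.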

In the context of the surface of revolution , these equations are important tools because we have the following explicit formula for the Gaussian curvature at a point :

In particular, verifies .
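For readers who want to experiment with the curvature, here is a small Python sketch based on the classical formula `K = -f'' / (f * (1 + f'**2)**2)` for the Gaussian curvature of the surface obtained by rotating a graph `y = f(x)` around the `x`-axis. The choice of a power profile `f(x) = x**r` is an assumption made for this illustration, and the function names are hypothetical. For such a profile the formula gives `K(x) = -r*(r-1) / (x**2 * (1 + r**2 * x**(2*r-2))**2)`, so that `x**2 * K(x)` tends to `-r*(r-1)` as `x` tends to zero.

```python
def gaussian_curvature(f, df, d2f):
    """Gaussian curvature of the surface of revolution of y = f(x) at a point
    where the profile and its first two derivatives take the values f, df, d2f."""
    return -d2f / (f * (1.0 + df ** 2) ** 2)

def curvature_power_profile(x, r):
    """Curvature at abscissa x > 0 for the assumed profile f(x) = x**r."""
    return gaussian_curvature(x ** r, r * x ** (r - 1), r * (r - 1) * x ** (r - 2))
```

In particular, under this assumption the curvature is negative for `r > 1` and blows up quadratically in `1/x` near the cusp, which is the kind of polynomial blow-up that the homogeneity strips are designed to handle.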

Next, we take and we consider the following auxiliary function:

By definition, . Furthermore,

Since the equation (describing the motion of geodesics on ) implies that , we deduce from the previous inequality that

This estimate allows us to control the solution of Riccati’s equation along the following lines. The initial data of the Jacobi field is and . Hence,

In particular, there exists a well-defined maximal interval where for all . By plugging this estimate into Jacobi’s equation, we get that

for each .

By integrating this inequality (and using the initial condition ), we obtain that

Therefore,

If , we deduce that (as ). Otherwise, and . Since satisfies Riccati’s equation, we deduce from (5) that

at each time where . It follows that for all . Hence,

and, *a fortiori*,

In other words, we proved that

Now, the quantity can be estimated as follows. By differentiating Clairaut’s relation , we get

Since (as we are interested in small angles , large) and (thanks to the relation and the fact that and, thus, for small), we conclude that

for . Here, we used the fact that for . Therefore,

since . Also, the symmetry of the surface implies and, hence,

In summary, we have shown that , i.e.,

By putting together (4), (6) and (8), we conclude that

for some constant depending on and .

At this stage, we are ready to complete the proof of Proposition 6.

*Proof:* Let us estimate the Hölder constant . For this sake, we fix and we write

for some between and . Since and , it follows from (9) that

Because and are arbitrary points in , we have that

where is an appropriate constant.

Now, our assumption implies that we can choose sufficiently small so that . By doing so, we see from the previous estimate that

whenever , i.e., , is sufficiently small. This proves Proposition 6.

** 2.4. Some estimates for the expansion factors **

Similarly to the previous subsection, the proof of Proposition 7 uses the properties of Jacobi’s and Riccati’s equation to study

where is the scalar function (with and ) measuring the size of the perpendicular “unstable” Jacobi field along .

We begin by giving a lower bound on . Given , let us choose small so that

Of course, this choice of is possible because . Next, we consider the auxiliary function:

By definition, . Furthermore,

In particular,

Since (cf. the paragraph before (5)), we deduce from the previous estimate that

This inequality implies that the solution of Riccati’s equation satisfies for all . Indeed, the initial condition , says that and the inequality above tells us that

at any time where .

By integrating the estimate over the interval , we obtain that

i.e.,

For the sake of concreteness, let us set and let us restrict our attention to geodesics whose initial angle with the meridians of is sufficiently small so that . In this way, we have that (thanks to Jacobi’s equation and our initial conditions and ). Thus, the inequality above becomes

Next, we observe that can be bounded from below in a similar way to our derivation of a bound from above to in the previous subsection: in fact, by repeating the arguments appearing after (7) above, one can show that

and

where is an adequate (small) constant depending on , and .

By putting together the estimates above, we deduce that

where .

This inequality shows that

Thus, if , then we can choose small (with ) and large so that (our variant of) the one-step growth condition (3) holds. This proves the first part of Proposition 7.

Finally, we give an indication of the proof of the second part of Proposition 7 (i.e., the distortion bound (2)). We start by writing

and by noticing that

Next, we take the derivative with respect to of the previous expression. Here, we obtain several terms involving some quantities already estimated above via Jacobi’s and Riccati’s equation (such as , , etc.), but also a new quantity appears, namely, , i.e., the derivative with respect to of the family of solutions of Riccati’s equation along . Here, the “trick” to give bounds on is to differentiate Riccati’s equation

with respect to in order to get an ODE (in the time variable ) satisfied by . In this way, it is possible to see that one has reasonable bounds on as soon as one controls the derivative of the square root of the absolute value of the Gaussian curvature. Here, can be bounded by recalling that we have an explicit formula

for the Gaussian curvature. By following these lines, one can prove that, for a given , the distortion bound

holds whenever is taken sufficiently small. In other words, by taking , we have .

Note that the estimate in the previous paragraph gives the desired distortion bounds (2) once we show that can be selected such that . In order to check this, it suffices to recall that can be taken arbitrarily small (cf. the proof of the first part of Proposition 7), i.e., . So,

and

Since for , it follows that for adequate choices of and . This completes our sketch of proof of the second part of Proposition 7.


**Theorem 1 (Burns-Masur-Wilkinson).** Let be the quotient of a contractible, negatively curved, possibly incomplete, Riemannian manifold by a subgroup of isometries of acting freely and properly discontinuously. Denote by the metric completion of and the boundary of . Suppose that:

- (I) the universal cover of is *geodesically convex*, i.e., for every , there exists a unique geodesic segment in connecting and .
- (II) the metric completion of is *compact*.
- (III) the boundary is *volumetrically cusplike*, i.e., for some constants and , the volume of a -neighborhood of the boundary satisfies for every .
- (IV) has *polynomially controlled curvature*, i.e., there are constants and such that the curvature tensor of and its first two derivatives satisfy the following polynomial bound for every .
- (V) has *polynomially controlled injectivity radius*, i.e., there are constants and such that for every (where denotes the injectivity radius at ).
- (VI) The *first derivative of the geodesic flow* is *polynomially controlled*, i.e., there are constants and such that, for every infinite geodesic on and every :

Then, the Liouville (volume) measure of is finite, the geodesic flow on the unit cotangent bundle of is defined at -almost every point for all time , and the geodesic flow is *non-uniformly hyperbolic* (in the sense of Pesin’s theory) and *ergodic*.

Actually, the geodesic flow is Bernoulli and, furthermore, its metric entropy is positive, finite and is given by Pesin’s entropy formula (i.e., it is equal to the sum of positive Lyapunov exponents of counted with multiplicities).

More precisely, we proved in the previous post of this series that a geodesic flow satisfying the assumptions (II), (III) and (VI) above is non-uniformly hyperbolic with respect to the volume probability measure, and, furthermore, we identified the Oseledets stable and unstable subspaces (cf. the last theorem of this post):

**Theorem 2.** Under the assumptions (II), (III) and (VI) in Theorem 1 above, the geodesic flow is *non-uniformly hyperbolic*: more concretely, there exists a subset of full -measure such that the -invariant splitting

into the flow direction and the spaces and of *stable and unstable Jacobi fields* along have the property that

for all and .

Today, we want to exploit the non-uniform hyperbolicity of (and the assumptions (I) to (VI) above) in order to deduce the ergodicity of via Hopf’s argument.

For this sake, we organize this post as follows. In the first section, we discuss the geometry of stable and unstable manifolds of : in particular, we will see that these invariant manifolds form *global* laminations with useful *absolute continuity* properties. After that, we describe Hopf’s argument in the second section: from the nice properties of the invariant laminations, we deduce that Birkhoff averages are constant almost everywhere, and, hence, is ergodic. Finally, we conclude this post with a remark (inspired by conversations with Y. Coudène and B. Hasselblatt in November 2013) about the deduction of the *mixing* property for from Hopf’s argument.

**1. Stable manifolds of certain geodesic flows **

** 1.1. Local (Pesin) stable manifolds for certain geodesic flows **

We begin by noticing that a geodesic flow satisfying the assumptions (I) to (VI) of Theorem 1 has “nice” local (Pesin) stable and unstable manifolds through almost every point.

The reader with some experience with non-uniformly hyperbolic systems might think that this is an immediate consequence of the so-called Pesin’s theory. However, this is *not* the case in our setting because the phase space of is *not* assumed to be compact. In other words, we are facing a dilemma: while the non-compactness of is an important point for the applications of Theorem 1 (to moduli spaces equipped with WP metrics), it forbids a naive utilization of Pesin’s theory because of the competition between the dynamical behaviors of in compact regions of and near “infinity” .

Fortunately, Katok and Strelcyn (with the aid of Ledrappier and Przytycki) developed a *generalization* of Pesin’s theory where any “well-behaved” dynamics on non-compact phase space is allowed. Furthermore, Katok-Strelcyn successfully applied their version of Pesin’s theory to the study of dynamical billiards.

Very roughly speaking, Katok-Strelcyn say that if the dynamics of the non-uniformly hyperbolic system “blows up at most polynomially” at infinity , then the hyperbolic (exponential) behavior of is strong enough so that Pesin’s theory can be applied (because is “essentially compact” for practical purposes).

Evidently, this is much easier said than done, and, unfortunately, the discussion of the details of Katok-Strelcyn’s generalization of Pesin’s theory is out of the scope of this post. In particular, we will content ourselves with just mentioning that the conditions (I) to (VI) in Theorem 1 were set up by Burns-Masur-Wilkinson in such a way that a geodesic flow satisfying (I) to (VI) also verifies all the requirements to apply Katok-Strelcyn’s work. Here, even though this is philosophically natural, it is worth pointing out that the deduction of the conditions to use Katok-Strelcyn’s technology from (I) to (VI) is *far from trivial*: indeed, Burns-Masur-Wilkinson do this after studying (in Appendices A and B of their paper) several properties of the Sasaki metric and properties of .

In summary, Burns-Masur-Wilkinson use (I) to (VI) to ensure that Katok-Strelcyn’s generalization of Pesin’s theory applies in the setting of Theorem 1. As a by-product, they deduce the following statement about the existence and absolute continuity of local (Pesin) stable manifolds (cf. Proposition 3.10 of Burns-Masur-Wilkinson paper).

Theorem 3 (“Pesin stable manifold theorem”) Let be the geodesic flow on the unit tangent bundle of a -dimensional Riemannian manifold satisfying the conditions (I) to (VI) of Theorem 1. Denote by the subset of full volume provided by Theorem 2 where is non-uniformly hyperbolic. Then, there exists a subset of full volume, a measurable function , and a measurable family

of smooth () embedded disks with the following properties. For all :

- , i.e., is tangent to ;
- for all , i.e., is topologically contracted in forward time by ;
- if and only if and , i.e., is a local stable manifold (in the sense that it is dynamically characterized as the set of close to whose forward -orbit approaches the forward -orbit of ).

Moreover, the family is absolutely continuous in the sense that the following “Fubini-like statements” hold.

- given a subset of zero volume, one has that the set has zero measure in (with respect to the induced -dimensional Lebesgue measure on ) for almost every ;
- given a -embedded -dimensional open disk and a subset of zero measure (for the induced Lebesgue measure of ), the set
(obtained by saturating by the local stable manifolds passing through it) has zero volume in .

Finally, the analogous assertions about unstable manifolds are also true.

**1.2. Global stable manifolds of certain geodesic flows**

The Pesin stable and unstable laminations provided by Theorem 3 are *not* sufficient to run Hopf’s argument: as it was explained in the first post of this series, the local stable manifolds could be a priori very *short* (because their radii vary only *measurably* with and so one does not expect uniform lower bounds on ).

Hence, it is important (for our purposes of using Hopf’s argument) to compare Pesin’s local stable manifolds with *global* objects. Here, the key point is to observe that Theorem 2 says that the tangent space of at is exactly the vector space of *stable Jacobi fields* along the geodesic and, as we will recall in a moment, stable Jacobi fields are naturally related to global objects called *stable horospheres*.

**1.2.1. Stable Jacobi fields and stable horospheres**

Let be a Riemannian manifold. Given a unit tangent vector generating a geodesic ray such that the sectional curvatures of are negative along and , let us denote by the *stable Jacobi field* associated to : by definition, this is the Jacobi field

where is the Jacobi field satisfying and .

In terms of the description of Jacobi fields via variations of geodesics, the stable Jacobi fields along are obtained by varying through geodesics such that for all (that is, stays always close to in *forward time*). These geodesics are *orthogonal* to a family of immersed hypersurfaces of whose lifts to the universal cover of are the so-called *stable horospheres*.

The stable horospheres can be constructed “by hand” with the aid of Busemann functions as follows.

Let be the quotient of a contractible, negatively curved, Riemannian manifold by a subgroup of isometries of acting freely and properly discontinuously and suppose that the universal cover of is geodesically convex (i.e., satisfies item (I) of Theorem 1).

In this situation, it is possible to show (see, e.g., Proposition 3.5 in the Burns-Masur-Wilkinson paper) that given a unit vector generating an infinite geodesic ray , the functions given by

converge (uniformly on compact sets) as to a convex function

called the *stable Busemann function* such that and, for every , the unit vector defines an infinite geodesic ray with

for all . In particular, the geodesics give variations of leading to stable Jacobi fields.

For each , the level set is a connected, complete, codimension submanifold of called *stable horosphere of level *. By definition, the geodesics are orthogonal to the -parameter family of stable horospheres (because stable horospheres are level sets of and the geodesics point in the direction of the gradient).
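As a sanity check of this construction, here is a minimal numerical sketch in the hyperbolic upper half-plane. It assumes (as a working convention for this illustration) that the approximating functions are b_t(p) = d(p, γ(t)) - t along the vertical geodesic γ(t) = (0, e^t). Under this assumption the limit should be -log(y), so that the horospheres in question are the horizontal lines. All function names below are hypothetical.

```python
import math

def hyperbolic_distance(p, q):
    """Distance in the upper half-plane model (curvature -1)."""
    (x1, y1), (x2, y2) = p, q
    return math.acosh(1.0 + ((x2 - x1) ** 2 + (y2 - y1) ** 2) / (2.0 * y1 * y2))

def busemann_approx(p, t):
    """Assumed approximant b_t(p) = d(p, gamma(t)) - t with gamma(t) = (0, e^t)."""
    return hyperbolic_distance(p, (0.0, math.exp(t))) - t

def busemann_limit(p):
    """Expected limit for this particular geodesic: -log(y)."""
    return -math.log(p[1])

p = (0.7, 2.5)
for t in (2.0, 5.0, 10.0, 20.0):
    # The two printed values should get closer as t grows.
    print(t, busemann_approx(p, t), busemann_limit(p))
```

In this toy case the limit depends only on the height y, which matches the picture of horizontal horocycles orthogonal to the vertical geodesics.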

The submanifold

of consisting of unit vectors that are orthogonal to the stable horosphere of level is called the *(global) stable manifold* of . This nomenclature is justified by the following property (corresponding to Proposition 3.6 in the Burns-Masur-Wilkinson paper). In the context of Theorem 1, suppose that the infinite geodesic ray projects to a *forward recurrent* geodesic on (i.e., *after* projection to , the unit vector becomes an accumulation point of the set ). Then, for any , the unit vector is tangent to an infinite geodesic ray such that

Furthermore, as . In particular, (stable manifolds are -invariant) and for all (stable manifolds are dynamically characterized by future orbits getting close together).

Remark 1. As usual, by reversing the time (via the symmetry ), one can define unstable Jacobi fields, unstable Busemann functions and unstable horospheres.

The following picture (that we already encountered in the last post while discussing Jacobi fields) illustrates the stable and unstable horospheres associated to the vertical geodesic in the hyperbolic plane passing through .

**1.2.2. Geometry of the stable and unstable horospheres**

In this subsection, we make a couple of comments on the geometry of stable and unstable horospheres. More precisely, besides explaining the computation of their second fundamental forms from matrix Riccati equations, we will see that the stable and unstable horospheres are mutually transverse in a quantitative way. Of course, this transversality property of horospheres is another important point in Hopf’s argument (as it allows one to control the angle between stable and unstable manifolds).

Let be a geodesic ray such that the sectional curvatures of along are negative. For each , let us denote by the unstable Jacobi field along with (as usual).

Consider the -parameter family of matrices (linear operators) defined by the formula

As we mentioned in this post here, are symmetric, positive-definite operators satisfying the matrix Riccati equation

(i.e., for all ).

It is possible to show (cf. Eberlein’s survey) that the operator is precisely the second fundamental form at of the unstable horosphere of level .
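To make the Riccati picture concrete, here is a minimal sketch in the simplest possible situation: the hyperbolic plane, where the operator reduces to a scalar u. It assumes that the Riccati equation takes the standard form u' + u^2 + K = 0 with K = -1, and that the unstable Jacobi field is approximated by fields vanishing at time -T, namely J(t) = sinh(t + T)/sinh(T). These formulas and the function names are assumptions of the sketch.

```python
import math

K = -1.0  # assumed constant sectional curvature (hyperbolic plane)

def u_approx(t, T):
    """J'(t)/J(t) for J(t) = sinh(t + T)/sinh(T), i.e. coth(t + T)."""
    return 1.0 / math.tanh(t + T)

def riccati_residual(t, T, h=1e-5):
    """Central finite-difference check of u' + u^2 + K = 0."""
    du = (u_approx(t + h, T) - u_approx(t - h, T)) / (2.0 * h)
    return du + u_approx(t, T) ** 2 + K

for T in (1.0, 5.0, 20.0):
    # The first value tends to 1 as T grows; the residual stays near 0.
    print(T, u_approx(0.0, T), riccati_residual(0.0, T))
```

The limiting value 1 is positive, in line with the positive-definiteness mentioned above, and agrees with the classical fact that horocycles in the hyperbolic plane have geodesic curvature 1.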

By reversing the time, we have an analogous operator related to stable horospheres.

Note that, by definition, the stable and unstable subspaces and at a unit vector defining an *infinite* geodesic ray are

In other terms, we have a -invariant splitting

over the set

(where ).

Let us now show that this splitting is locally uniform over .

Proposition 4. There exists a continuous function such that the continuous family of conefields

and

meeting only at the origin have the property that

for all .

*Proof:* Our task consists in showing that the functions

of are locally uniformly bounded away from zero.

By symmetry, it suffices to prove that is locally uniformly bounded from below. For the sake of reaching a contradiction, suppose this is not the case. This means that there are sequences , with such that , and .

For each , let be the stable Jacobi fields along induced by , and denote by the (limit) Jacobi field along induced by .

On one hand, for each , the square of the norm of the stable Jacobi field is a decreasing function of . In particular, since , we deduce that is a non-increasing function of .

On the other hand, is a strictly convex function of (because is a perpendicular Jacobi field, cf. Eberlein’s survey).

By putting these two facts together, we see that the function has no critical points. However, . This contradiction proves the desired proposition.

**1.2.3. Absolute continuity of global stable manifolds**

Once we have related Pesin’s stable and unstable manifolds (local objects) to stable and unstable horospheres (global objects), it is not entirely surprising that the absolute continuity properties of Pesin stable manifolds (described in Theorem 3 above) can be “transferred” to horospherical laminations:

Proposition 5. Let be the geodesic flow on the unit tangent bundle of a -dimensional Riemannian manifold satisfying the conditions (I) to (VI) of Theorem 1. Denote by the subset of the unit tangent bundle of the universal cover of consisting of unit vectors projecting into a forward and backward recurrent geodesic in . Then, there exists a subset of full volume such that the stable Busemann functions are for all . Moreover, the leaves of the stable lamination are -submanifolds of diffeomorphic to . Furthermore, the stable horospherical lamination

obtained by taking the family of manifolds through the vectors in the projection of to (via ) has the following absolute continuity properties:

- if has zero -volume, then for -almost every and any , the set has zero -dimensional volume in ;
- if is a smooth, embedded, -dimensional open disk and has zero -dimensional volume in , then for any one has where
is the set obtained by saturating with the leaves of the lamination .

Finally, a similar statement holds for the corresponding unstable lamination.

Logically, the statement of this proposition is very close to Theorem 3 about the absolute continuity of Pesin stable manifolds, but the crucial point is that we have now an absolutely continuous stable lamination whose leaves have radii essentially equal to . In other words, the leaves of the stable lamination have a size controlled by the injectivity radius of , a global smooth function, instead of the *a priori* merely measurable function giving the radii of leaves of Pesin’s stable lamination .

The proof of Proposition 5 is not very difficult: it uses the absolute continuity properties of Pesin’s lamination in Theorem 3 and the “contraction of stable horospheres” (i.e., the fact that the forward dynamics of eventually contracts inside ), and it occupies two pages in the Burns-Masur-Wilkinson paper (cf. the proof of their Proposition 3.11). However, we will skip this point in favor of discussing Hopf’s argument right now.

**2. Proof of Theorem 1 via Hopf’s argument**

Let be a geodesic flow satisfying the assumptions (I) to (VI) of Theorem 1. We want to show that is ergodic with respect to the volume measure (with normalized total mass).

By Birkhoff’s ergodic theorem, given a continuous function with compact support, the Birkhoff ergodic averages

converge as to the same limit for -almost every .

By definition of ergodicity, our task consists in showing that the function is constant -almost everywhere.
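For readers who like to experiment, here is a toy illustration of what "the Birkhoff averages are almost everywhere constant" means. It does not involve geodesic flows at all: it uses an irrational rotation of the circle (a standard ergodic example, swapped in here for simplicity) and a hypothetical observable whose space average is 0.5. Forward and backward time averages are both computed, since both play a role in Hopf's argument.

```python
import math

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0  # irrational rotation number (toy choice)

def observable(x):
    """A continuous function on the circle R/Z with integral 0.5."""
    return 0.5 + 0.5 * math.cos(2.0 * math.pi * x)

def birkhoff_average(x, n, direction=1):
    """Average of the observable along n steps of the forward (direction=+1)
    or backward (direction=-1) orbit of x under the rotation by ALPHA."""
    total = 0.0
    for k in range(n):
        total += observable((x + direction * k * ALPHA) % 1.0)
    return total / n

for x in (0.1, 0.6):
    print(x, birkhoff_average(x, 20000, 1), birkhoff_average(x, 20000, -1))
```

In this sketch all four printed averages are close to the space average 0.5, regardless of the starting point.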

For this sake, let us define the measurable functions

and

Note that, by Birkhoff’s ergodic theorem, there exists a subset of full -measure such that

Moreover, from their definitions, note that the functions , and are -invariant.

The initial observation in Hopf’s argument is the fact that the function , resp. , is *constant* along the stable manifolds , resp. unstable manifolds . In fact, this follows easily from the uniform continuity of the (compactly supported, continuous) function and the fact that as (resp. ) whenever (resp. ).

The basic strategy of Hopf’s argument can be summarized as follows. We want to combine this initial observation with the absolute continuity properties of the stable and unstable horospherical laminations to deduce that is “*locally ergodic*” in the sense that *every* possesses a neighborhood such that the restriction to is -almost everywhere constant.

Of course, since is connected, this local ergodicity property implies that the function is constant -almost everywhere, and, *a fortiori*, is ergodic with respect to . In other terms, our task is reduced to prove the local ergodicity property stated in the previous paragraph.

In this direction, we fix once and for all , we set

and we denote by the -neighborhood of .

Let be the full -volume subset constructed in Proposition 5. For each , we consider the stable leaf , we take its iterates under for , and we saturate the resulting subset with the leaves of the unstable horospherical lamination to obtain the subset

The construction of is illustrated in the figure below: the subset is marked in blue and some leaves of passing through points of are marked in red.

The local ergodicity property stated above is an immediate consequence of the following two claims:

- (a) the restriction of the function to is almost everywhere constant for almost every choice of ;
- (b) is essentially open for almost every near in the sense that there exists a neighborhood of such that has full volume in for almost every choice of .

We establish the first claim (a) by exploiting the initial observation that Birkhoff averages are constant along stable and unstable manifolds and the absolute continuity properties of the stable and unstable horospherical laminations.

More precisely, let us consider again the full volume subset of where (provided by Birkhoff’s ergodic theorem).

By the absolute continuity property of (cf. the first item of conclusion of Proposition 5), for almost every , the intersection has full volume in . We claim that is almost everywhere constant for any such .

In fact, takes a constant value on . Moreover, since on , we also have that takes the constant value on . By combining this fact with the -invariance of , we deduce that takes the constant value on . Furthermore, by putting together this fact with the initial observation that is constant along unstable manifolds , we obtain that takes the constant value on .

Note that, by assumption, is a full volume subset of . Since is a -flow, it follows that is a full volume subset of the -dimensional smooth submanifold . Therefore, from the absolute continuity property of (cf. the second item of conclusion of Proposition 5), we conclude that is a full volume subset of . In particular, we have that takes the constant value on the full volume subset of . Because on , we get that takes the constant value on the full volume subset of , i.e., is almost everywhere constant. This completes the proof of the claim (a).

Remark 2. The reader is encouraged to interpret this argument in the light of Figure 2 in order to get a clear picture of the roles of the subsets , and .

We establish now the second claim (b) from the absolute continuity properties of the horospherical laminations and the local uniform transversality of the stable and unstable manifolds.

More concretely, from the absolute continuity property in the first item of the conclusion of Proposition 5, we have that the stable disk , resp. unstable disk , is almost everywhere tangent to the stable direction , resp. unstable direction , for almost every . Since the stable and unstable directions and are contained in the *continuous* families of cones and from Proposition 4, we have that , resp. , is *everywhere* tangent to , resp. for almost every .

In particular, from the -invariance of the stable lamination , we see that the -dimensional disk is everywhere tangent to for almost every . Since the continuous conefields and meet only at the origin (cf. Proposition 4), that is, they are locally uniformly transverse, we conclude that there exists a neighborhood of such that

for almost any . In other words, intersects in a full volume subset. This completes the proof of claim (b).

This concludes our discussion of Hopf’s argument (namely, the derivation of claims (a) and (b)) for the ergodicity of .

Closing this post, let us say a few words about the mixing and Bernoulli properties in the statement of Theorem 1. In the Burns-Masur-Wilkinson paper, these properties are deduced from general results of Katok saying that if a *contact* flow is non-uniformly hyperbolic and ergodic, then it is Bernoulli (and, in particular, mixing).

Nevertheless, as it was brought to my attention by B. Hasselblatt and Y. Coudène, the Hopf argument above can be slightly adapted in certain contexts to give mixing and/or mixing of all orders. For example, concerning the mixing property, Y. Coudène, B. Hasselblatt and S. Troubetzkoy showed (in Theorem 3.3) in this recent preprint here that if any -function saturated by stable and unstable sets (in the sense that there is a full measure subset such that whenever and or ) is almost everywhere constant, then the dynamical system is mixing. Also, they have a similar criterion for multiple mixing, and, furthermore, they discuss a couple of non-trivial examples of applications of their criteria.

In the context of Theorem 1, we can deduce the mixing property for from the result of Coudène-Hasselblatt-Troubetzkoy. Indeed, the argument used in the proof of the claim (a) above (during the discussion of Hopf’s argument) also shows that any -function saturated by stable and unstable sets (such as ) is almost everywhere constant, so that Coudène-Hasselblatt-Troubetzkoy mixing criterion “à la Hopf” applies in this setting.


Of course, there are several ways to get around this little technical subtlety (from the dynamical point of view) in the definition of Kontsevich-Zorich cocycle and this is the main purpose of this post. Evidently, the content of this post is well-known (especially among experts), but I hope that this post will benefit the reader with some background in Dynamical Systems wishing to know the answer to the following question:

*Does the Kontsevich-Zorich cocycle (as it is classically defined) qualify as a genuine example of a linear cocycle in the usual sense in Dynamical Systems?*

**Disclaimer.** Even though this post benefited from my conversations with Jean-Christophe Yoccoz, all errors and mistakes below are my sole responsibility.

**1. The Kontsevich-Zorich cocycle**

The basic references for this section are G. Forni’s paper, J.-C. Yoccoz’s survey and/or this blog post here (where the reader can find some figures illustrating the notions discussed below).

Let be a fixed compact orientable topological surface of genus , let be a non-empty finite set of points of cardinality and let be a list of “ramification indices” such that .

Recall that a *translation surface structure* on is a maximal atlas of charts on such that all changes of coordinates are translations in and, for each , there are neighborhoods , and a ramified cover of degree such that every injective restriction of is a chart of the maximal atlas .

Remark 1. Equivalently, we can think of translation structures as the data of a Riemann surface structure on together with an Abelian differential (holomorphic one-form) possessing zeroes of orders at for . However, for the purposes of this post, we will not need this alternative point of view.

Remark 2. Since the usual Euclidean area form on is translation invariant, it makes sense to talk about the total area of a translation structure. From now on, we will always implicitly assume that our translation structures are normalized, i.e., they have unit total area. Here, it is worth pointing out that this normalization is not important for the definition of Teichmüller and moduli spaces, but it is important for the discussion of the dynamics of the Teichmüller flow on moduli spaces.

We denote by the group of orientation-preserving homeomorphisms of fixing (pointwise), by the connected component in of the identity element, and by the *mapping class group* (sometimes also called modular group).

Note that the group acts (by *pre-composition*) on the set of translation surfaces: given and a translation surface structure on , we get a translation structure by defining .

In this setting, the *Teichmüller space* is the quotient of the set of translation structures on by the action of and the *moduli space* is the quotient of the set of translation structures on by the action of . By definition, the moduli space is the quotient of Teichmüller space by the action of the mapping class group .

Remark 3. The Teichmüller space is a manifold, but the moduli space is an orbifold (not a manifold) in general. We will come back to this point later in this post.

The group acts (by *post-composition*) on the Teichmüller space : given and a translation structure , we define (note that this action is well-defined because the conjugation of a translation in by the linear action of the matrix is still a translation). Furthermore, since acts by pre-composition and acts by post-composition, these actions commute and, hence, the action of descends to the moduli space .

The action of the diagonal subgroup of on Teichmüller and moduli spaces is called the *Teichmüller flow*.

The dynamics of the Teichmüller flow and/or -action on moduli spaces of (normalized) translation surfaces is a rich subject with interesting applications to the Ergodic Theory of some parabolic systems (such as interval exchange transformations and billiards in rational tables): see, for example, these posts here and here for more details.

A main character in the investigation of the -action on moduli spaces of (normalized) translation surfaces is the so-called *Kontsevich-Zorich cocycle*. Very roughly speaking, this cocycle was introduced by Kontsevich and Zorich as a practical way to extract the “interesting part” of the derivative cocycle of the Teichmüller flow.

Formally, the Kontsevich-Zorich (KZ) cocycle is usually defined as follows (compare with Forni’s paper). Let be the *vector bundle* over Teichmüller space whose fibers are the absolute homology group with real coefficients. One usually refers to as the *Hodge bundle* over .

Remark 4. The reader with some background in Complex Geometry might have thought that this notion is very similar to the Hodge bundle over Teichmüller and moduli spaces of algebraic curves (Riemann surfaces) obtained by attaching the space of holomorphic -forms to a Riemann surface . In fact, this is no coincidence and the nomenclature “Hodge bundle” for is a “popular” abuse of notation in the literature about the Teichmüller flow. Actually, this abuse of notation goes beyond this: one could also construct (trivial) bundles over Teichmüller spaces by taking the fibers to be the absolute homology group with complex coefficients or the absolute cohomology group or with real or complex coefficients. These variants are closely related to each other (because and the first absolute homology and cohomology groups of a surface are dual [by Poincaré duality]) and they are also called Hodge bundle in the literature (depending on the author’s taste).

The vector bundle is *well-defined* and *trivial*, i.e., : in a nutshell, this is a consequence of the fact that homeomorphisms that are isotopic to the identity (such as the elements of ) act trivially on homology.

By taking the quotient of by the natural action of the mapping class group on *both factors*, we get the so-called *Hodge bundle*

over the moduli space .

In this context, the (trivial) cocycle

over the Teichmüller flow on Teichmüller space given by

for and descends to the so-called *Kontsevich-Zorich cocycle* on the Hodge bundle over moduli space (by taking the quotient by the action of ). Here, it is worth observing that the Kontsevich-Zorich cocycle is well-defined (i.e., we can take this quotient) because of the fact that acts by pre-composition and acts by post-composition on Teichmüller spaces (so that these actions commute).

Remark 5. The Kontsevich-Zorich cocycle could also be defined more generally by taking the quotient of the trivial cocycle over the action of the full group (and not only ) on Teichmüller space.

**2. Is the KZ cocycle a linear cocycle?**

The reader with some familiarity with Dynamical Systems might have noticed some similarities between the notions of Kontsevich-Zorich cocycle and a linear cocycle over a (discrete or continuous time) dynamical system.

In fact, let us recall that a *linear cocycle* , , over a flow , , is a flow on a vector bundle such that (i.e., projects onto ) and is a vector bundle automorphism, i.e., for all , the restriction of to is a linear map from the fiber to the fiber .

Example 1. The trivial cocycle on the trivial bundle over a flow is . In particular, the cocycle on defined above is an example of a trivial cocycle.

Example 2. The derivative maps of smooth flows on smooth manifolds form an important class of examples of linear cocycles.
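To fix ideas, here is a minimal discrete-time sketch of the defining property of a linear cocycle, namely the identity A^(m+n)(x) = A^(m)(f^n(x)) A^(n)(x). The base dynamics (a circle rotation) and the matrix-valued function `generator` are hypothetical choices made only for this illustration; they have nothing to do with the Teichmüller flow.

```python
import math

ALPHA = math.sqrt(2.0) - 1.0  # hypothetical base dynamics: x -> x + ALPHA mod 1

def base_map(x):
    return (x + ALPHA) % 1.0

def generator(x):
    """Hypothetical 2x2 matrix A(x) of determinant 1 attached to the point x."""
    s = math.cos(2.0 * math.pi * x)
    return ((1.0 + s, s), (1.0, 1.0))

def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def iterate(x, n):
    """n-th iterate of the base map."""
    for _ in range(n):
        x = base_map(x)
    return x

def cocycle(x, n):
    """A^(n)(x) = A(f^(n-1) x) ... A(f x) A(x)."""
    product = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(n):
        product = matmul(generator(x), product)
        x = base_map(x)
    return product

x, m, n = 0.3, 4, 3
lhs = cocycle(x, m + n)
rhs = matmul(cocycle(iterate(x, n), m), cocycle(x, n))
print(lhs)
print(rhs)  # should agree with lhs up to rounding
```

The point of the sketch is that the fiber maps are honest linear maps between vector spaces, which is exactly the feature whose failure is discussed next.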

Given that the Kontsevich-Zorich cocycle on moduli spaces projects to the Teichmüller flow on moduli spaces and it acts on the fibers of via the (symplectic) action on homology of the elements of the mapping class group , one might be tempted to qualify the Kontsevich-Zorich cocycle as a linear cocycle.

*However*, a closer inspection of the definitions reveals that:

The Kontsevich-Zorich cocycle is not *always* a linear cocycle!

Actually, the fact that the KZ cocycle is not a linear cocycle in general is *not* its fault: in order to talk about *linear* cocycles one needs *vector bundles*, and, as it turns out, the fibers of the Hodge bundle over moduli space are *not* vector spaces over the orbifold points of moduli spaces.

More precisely, we see from the definition that the fiber of at a translation surface is the quotient where is the group of automorphisms of , that is, the group of homeomorphisms of fixing pointwise whose local expressions in the charts are translations of .

Note that is a finite group: for instance, any element of is holomorphic (with respect to the Riemann surface structure underlying ) and, hence, by Hurwitz’s automorphisms theorem, we have that has cardinality . (Actually, even though Hurwitz’s theorem is sharp, this estimate of is not optimal: see, e.g., this paper of Schlage-Puchta and Weitze-Schmithuesen)

Therefore, the fiber of at is not very far from a vector space: it differs from by the quotient by (the action on homology of) the finite group (of “symplectic matrices”).

Nevertheless, when the translation surface is an *orbifold* point of moduli space (i.e., when is non-trivial), the fiber is not necessarily a vector space. (A simple concrete example of such a situation is the cone obtained from the quotient of by the finite group generated by the rotation of angle )
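The failure of the linear structure on such a cone can be checked by hand, or with the following minimal sketch. It uses the rotation by π/2 (an illustrative choice of generator, made here so that integer vectors can be used) and shows that "adding" two points of the quotient depends on the chosen representatives. The helper names are hypothetical.

```python
def rotate(v):
    """Rotation by pi/2 (illustrative choice) acting on integer vectors."""
    x, y = v
    return (-y, x)

def orbit(v):
    """The point of the cone R^2 / <rotation> represented by v,
    encoded as the set of all its representatives."""
    points = [v]
    for _ in range(3):
        points.append(rotate(points[-1]))
    return frozenset(points)

def candidate_sums(v, w):
    """All classes obtained by adding representatives of the classes of v and w."""
    return {
        orbit((a[0] + b[0], a[1] + b[1]))
        for a in orbit(v)
        for b in orbit(w)
    }

sums = candidate_sums((1, 0), (1, 0))
print(len(sums))  # more than one class: the sum is not well-defined
```

Since the class of (1, 0) "plus itself" can be the class of (2, 0), of (1, 1) or of (0, 0) depending on the representatives, the quotient carries no induced vector space structure.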

In summary, KZ cocycle is not always a linear cocycle because the Hodge bundle over moduli space is not always a vector bundle.

In other terms, the moduli space is an orbifold (but not a manifold in general), the Hodge bundle is an *orbifold vector bundle* (in general) and, thus, KZ cocycle is an *orbifold linear cocycle* (in general).

Example 3. One of my favorite examples of a translation surface with a non-trivial group of automorphisms is the so-called *Eierlegende Wollmilchsau*. A concrete description of the Eierlegende Wollmilchsau is the following. We consider the quaternion group , we take a unit square in for each , and we glue (by translation) the vertical rightmost side of to the vertical leftmost side of and we glue (by translation) the horizontal top side of to the horizontal bottom side of . In this way, one obtains a translation surface where has genus , consists of four points and .

A simple argument (see, e.g., this paper here) shows that the group of automorphisms of the Eierlegende Wollmilchsau is isomorphic to the quaternion group .
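The combinatorics of this example can be explored with a short script. The sketch below assumes the usual gluing convention for this surface: the right neighbor of the square labeled g is the square labeled g·i and its top neighbor is the square labeled g·j. It then uses the standard fact that the vertices of a square-tiled surface correspond to the cycles of the commutator of the "right" and "up" permutations, a cycle of length k giving a cone angle 2πk. The function names are hypothetical.

```python
def qmul(p, q):
    """Hamilton product of quaternions written as integer 4-tuples
    (a, b, c, d) = a + b*i + c*j + d*k."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

I, J = (0, 1, 0, 0), (0, 0, 1, 0)
I_INV, J_INV = (0, -1, 0, 0), (0, 0, -1, 0)

# The eight elements +-1, +-i, +-j, +-k, one unit square for each.
GROUP = [tuple(s if n == m else 0 for n in range(4)) for m in range(4) for s in (1, -1)]

def commutator(g):
    """Go right, up, left, down starting from the square g (assumed gluing rule)."""
    for step in (I, J, I_INV, J_INV):
        g = qmul(g, step)
    return g

def corner_cycles():
    """Cycles of the commutator permutation: one cycle per vertex of the surface."""
    seen, cycles = set(), []
    for g in GROUP:
        if g in seen:
            continue
        cycle, h = [], g
        while h not in seen:
            seen.add(h)
            cycle.append(h)
            h = commutator(h)
        cycles.append(cycle)
    return cycles

cycles = corner_cycles()
vertices = len(cycles)
edges = 2 * len(GROUP)  # each square contributes its bottom and left sides
faces = len(GROUP)
genus = (2 - (vertices - edges + faces)) // 2
print(vertices, [len(c) for c in cycles], genus)
```

Under these assumptions the script finds four vertices, each of cone angle 4π, which is consistent with the four marked points mentioned above.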

Example 4. Some moduli spaces are manifolds and the corresponding Hodge bundles are vector bundles. For instance, the so-called minimal stratum of translation surfaces on a genus surface with a single marked point is a manifold because it can be shown (see, e.g., Proposition 2.4 in this paper here) that the automorphism group of any translation surface is trivial.

**3. Dynamics of the KZ cocycle?**

From the point of view of Topology and Algebraic Geometry, the “orbifoldic” nature of KZ cocycle is not surprising. Indeed, this kind of object is very common when studying monodromy representations and, also, one can overcome the “orbifoldic” nature of KZ cocycle by taking *covers* of moduli spaces in order to “kill” orbifold points. In particular, an orbifold linear cocycle is as good as a linear cocycle for topological considerations.

On the other hand, for *dynamical* considerations, the classical definition of KZ cocycle as an orbifold linear cocycle deserves further discussion.

For example, given an ergodic Teichmüller flow invariant probability measure on , it is desirable to apply Oseledets theorem to KZ cocycle in order to talk about its Lyapunov exponents and/or Oseledets subspaces. However, the Oseledets theorem deals only with linear cocycles and, thus, the fact that the KZ cocycle is merely an orbifold linear cocycle, or, more precisely, the fibers of Hodge bundle are not vector spaces, imposes some technical difficulties.

Remark 6. For ergodic-theoretical purposes, the technical issue pointed out above only shows up when -almost every translation surface is an orbifold point. In particular, the discussion in this section does not concern the so-called *Masur-Veech probability measures* on moduli spaces (because their generic points have trivial group of automorphisms). This explains why the orbifoldic nature of KZ cocycle is never discussed in earlier papers in the literature (such as Forni’s paper): in those papers, the authors were concerned exclusively with the behavior of almost every trajectory with respect to Masur-Veech measures.

Remark 7. The orbifoldic nature is discussed (in an implicit way at least) in the literature on Veech surfaces. Recall that a Veech surface is a translation surface whose -orbit in moduli space is closed. As it turns out, the stabilizer in of a Veech surface is a lattice, so that its -orbit is naturally isomorphic to the unit cotangent bundle of the finite area hyperbolic surface . If the Veech surface has a non-trivial group of automorphisms, the Hodge bundle over its -orbit (and hence the corresponding KZ cocycle) is orbifoldic, but one can get around this by studying the group of so-called *affine diffeomorphisms*, a sort of “finite cover” of in view of a natural exact sequence . See, e.g., this survey paper of P. Hubert and T. Schmidt (or our joint paper with J.-C. Yoccoz).

Fortunately, for the sake of the *definition* of Lyapunov exponents of KZ cocycle with respect to an ergodic Teichmüller flow invariant probability measure , the possible ambiguity coming from the fact that the KZ cocycle is a “linear cocycle up to ” is *irrelevant* (cf. Section 4.3 of our paper with J.-C. Yoccoz and D. Zmiaikou). In fact, the Lyapunov exponents are defined by measuring the exponential rate of growth

of vectors (along typical trajectories), and the ambiguity caused by the fact that is a well-defined linear operator only up to the matrices in (action on homology of) does not change these rates because is a *finite* group and, hence, the possible values of (after composing with the elements of ) are uniformly related to each other by universal multiplicative constants (whose effects disappear when considering the expression ). In other terms, the Lyapunov exponents of orbifold linear cocycles *are* well-defined!
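Here is a toy numerical illustration of why a finite ambiguity does not affect exponential growth rates. It is only a sketch under simplifying assumptions: the "cocycle" is replaced by the powers of a single hyperbolic matrix A in SL(2, Z), and the finite group is the cyclic group of order 6 generated by a hypothetical matrix M in SL(2, Z). The quantity computed is (1/n) log |g A^n v| for every g in the group.

```python
import math

A = ((2, 1), (1, 1))   # hyperbolic matrix in SL(2, Z) (toy stand-in for the cocycle)
M = ((0, -1), (1, 1))  # element of order 6 in SL(2, Z) (toy finite ambiguity)

def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def apply(a, v):
    return (a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1])

def finite_group(generator):
    """Cyclic group generated by a finite-order integer matrix."""
    identity = ((1, 0), (0, 1))
    elements, g = [identity], generator
    while g != identity:
        elements.append(g)
        g = matmul(generator, g)
    return elements

def growth_rates(n, v=(1, 0)):
    """(1/n) log |g A^n v| for every g in the finite group (exact integer arithmetic
    for A^n v, floating point only for the norm and the logarithm)."""
    w = v
    for _ in range(n):
        w = apply(A, w)
    return [math.log(math.hypot(*apply(g, w))) / n for g in finite_group(M)]

rates = growth_rates(200)
print(rates)
print(math.log((3.0 + math.sqrt(5.0)) / 2.0))  # log of the top eigenvalue of A
```

All the rates differ by at most a constant divided by n, so they share the same limit, which is the mechanism invoked above.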

Unfortunately, there is no “cheap” solution (similar to the previous paragraph) for the definition of *Oseledets subspaces* of KZ cocycle: one needs linear structures on the fibers of the Hodge bundle to talk about them. Logically, this is an annoying situation because Oseledets subspaces are useful: for example, the analysis of these subspaces plays a major role in the recent breakthrough paper of A. Eskin and M. Mirzakhani about Ratner-like theorems for the -action on moduli spaces.

As it was already mentioned in the beginning of this section, the way out of this dilemma is to pick a cover of moduli space where all orbifold points disappear and to lift the Hodge bundle to this cover.

Of course, there are *several ways* of picking such a cover, but the whole point of this post is that certain covers are better than others depending on our purposes.

For example, the TeichmÃ¼ller space is a cover of having no orbifold points (because an automorphism of a translation surface of genus that is isotopic to the identity is the identity), and the Hodge bundle is a vector bundle. Nevertheless, the TeichmÃ¼ller space of (normalized) translation structures is not *dynamically* interesting: for example, there are no *finite* TeichmÃ¼ller invariant measure (and, thus, we can not use the standard tools from Ergodic Theory). This situation is very similar to the case of geodesic flows on the unit cotangent bundle of a finite-area hyperbolic surfaces (if we think of as moduli space, as TeichmÃ¼ller space and the geodesic flow as TeichmÃ¼ller flow): even though there are plenty of finite geodesic flow invariant measures on , there are no finite geodesic flow invariant measures on the cover (and, in fact, the dynamics of the geodesic flow on unit cotangent of the hyperbolic plane is rather boring). In summary, despite the fact that the Hodge bundle is an honest vector bundle over TeichmÃ¼ller space , we can not use it to define Oseledets subspaces (or, in general, do non-trivial dynamics) because the TeichmÃ¼ller flow on is not dynamically rich.

The “failure” (from the dynamical point of view) in the previous paragraph suggests that we should try picking *finite* covers of moduli spaces (having no orbifold points). Indeed, the lift of Teichmüller flow invariant probability measures leads to finite measures in that case and we are in a good position to discuss Ergodic Theory.

In this direction, J.-C. Yoccoz, D. Zmiaikou and I considered the cover of obtained by marking a horizontal separatrix issuing from a conical singularity of the translation surface. This is a finite cover because there are exactly outgoing horizontal separatrices at a conical singularity with total angle . Furthermore, has no orbifold points: indeed, any automorphism of a translation surface that fixes a horizontal separatrix issuing from a conical singularity is the identity. Moreover, the diagonal subgroup (Teichmüller flow) still acts on because the matrices respect the horizontal direction (and, thus, horizontal separatrices).

In particular, we can talk about the Oseledets subspaces of the KZ cocycle over Teichmüller flow at the level of the lift of the Hodge bundle to (because this “lifted KZ cocycle” over Teichmüller flow is a genuine linear cocycle over a probability measure preserving flow and hence we can apply Oseledets theorem).

For the purposes of our joint paper with J.-C. Yoccoz and D. Zmiaikou, the lift of the KZ cocycle to the Hodge bundle over the finite cover of moduli space was adequate (and this is why we decided to stick to it in our paper).

However, we should confess that we were not completely happy with because does *not* act on (even though its diagonal subgroup does act!). Therefore, the more general version of the KZ cocycle over the -action on moduli spaces in Remark 5 above (also very important for the applications of the Teichmüller flow) can *not* be lifted to the Hodge bundle over .

Actually, the reason why does not act on is very simple: the action of the rotation subgroup where is ill-defined. In order to see this, let us consider the following translation surface of genus with a marked (in blue) horizontal separatrix issuing from its unique conical singularity :

Let us try to define the action of on by letting vary from to . Starting from , let us slowly apply the rotation matrices to the translation surface for small positive values of :

In this way, we get a new translation surface and the horizontal separatrix is sent into a *non-horizontal separatrix* . Thus, and, *a fortiori*, the natural “reflex” of posing fails.

Evidently, for any small positive angle , the non-horizontal separatrix is very close to the horizontal separatrix (obtained by “rotating” by an angle inside ) indicated below

If we do this, then, by letting vary from to , we would be forced to impose where is the “previous” horizontal separatrix issuing from the singularity in the “natural cyclic order” (obtained by rotating around the singularity) indicated (in red) below. However, since the horizontal separatrices and are *distinct*, we have that and are *distinct* points of , a contradiction with the fact that the rotation matrix must act by the identity map on .

A simple inspection of the argument above shows that the *finite* cover of (obtained by “replacing” the rotation (circle) group by its -fold cover , but keeping the “same” diagonal subgroup ) does act on ! In other words, “almost acts” on . Nevertheless, since the (non-trivial) finite cover is *not* an algebraic group (unlike itself), the natural action on the Hodge bundle over is not so useful from the point of view of Dynamical Systems.

In summary, despite the usefulness of for the study of the KZ cocycle over the Teichmüller flow on moduli spaces, it is desirable to find an alternative finite cover of moduli spaces having no orbifold points where still acts.

One solution to this problem is to take the quotient of Teichmüller space by a *torsion-free* finite index subgroup of : indeed, is a finite cover of moduli space because has finite index, has no orbifold points because is torsion-free and acts on because acts by *post-composition* and acts by *pre-composition* on .

Here, a result of J.-P. Serre (see also the book of B. Farb and D. Margalit for a “modern” exposition of Serre’s result [in the context of moduli spaces of Riemann surfaces]) produces many of those subgroups with the desired properties: in fact, given any integer , Serre showed that the subgroup

consisting of elements of the mapping class group acting trivially on the homology of with coefficients in is torsion-free. (This finite cover of moduli space was already mentioned in this blog: see this post here.)

In summary,

The lift of the KZ cocycle to the Hodge bundle over defines a linear cocycle over the -action on moduli spaces that we could (should?) call KZ cocycle in *dynamical* discussions (where certain specific notions such as Oseledets subspaces are needed).

**Remark 8.** For the sake of comparison, the finite cover might be somewhat big relative to . In fact, is a cover of degree in general, while, from the fact that the action on homology of the mapping class group surjects into , we have that in general is a cover of degree (a quantity of the form for prime).


The plan for this post is the following. After quickly reviewing in Section 1 below some basic features of the geometry of tangent bundles of Riemannian manifolds, we will estimate the first derivative of geodesic flows on certain negatively curved manifolds in terms of their sectional curvatures (as promised last time). Finally, we will complete today’s discussion by proving the first part of Burns-Masur-Wilkinson ergodicity criterion (i.e., we will show that any geodesic flow verifying the assumptions of Burns-Masur-Wilkinson is non-uniformly hyperbolic in the sense of Pesin’s theory), while leaving the second part of Burns-Masur-Wilkinson ergodicity criterion (i.e., the verification of ergodicity via Hopf’s argument) for the next post of this series.

**1. Geometry of tangent bundles**

**1.1. Riemannian metrics, Levi-Civita connections and Riemannian curvature tensors**

Let be a Riemannian manifold and denote by its Riemannian metric of .

Let be the associated Levi-Civita connection, i.e., the unique connection (“notion of parallel transport”) that is symmetric and compatible with the Riemannian metric . Given a curve on , the covariant derivative along is

(and it should *not* be confused with ). Sometimes we will also denote the covariant derivative simply by when the curve is implicitly specified: for example, given a vector field along a curve (of footprints), we write where is an extension of to .

In this setting, recall that a curve is a geodesic if and only if for all .

Since the equation is a first order ODE (in the variables ), we have that geodesics are determined by the initial vector . Furthermore, any geodesic has constant speed, i.e., the quantity measuring the square of the size (norm) of the tangent vector is constant along : in fact, using the compatibility between and , one gets

for all .
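
For the reader's convenience, here is a sketch of this constant-speed computation in standard notation. The symbols are assumptions, since the displayed formulas are not reproduced in this copy of the text: $\gamma$ denotes the geodesic, $\frac{D}{dt}$ the covariant derivative along $\gamma$, and $\langle\cdot,\cdot\rangle$ the Riemannian metric.

```latex
\begin{align*}
% geodesic equation (assumed notation)
\frac{D}{dt}\dot{\gamma}(t) &= 0 \quad \text{for all } t, \\
% constant speed, via compatibility of the connection with the metric
\frac{d}{dt}\langle \dot{\gamma}(t), \dot{\gamma}(t) \rangle
  &= 2\,\Big\langle \frac{D}{dt}\dot{\gamma}(t), \dot{\gamma}(t) \Big\rangle = 0 .
\end{align*}
```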

The lack of commutativity of the Levi-Civita connection is measured by the Riemannian curvature tensor

In terms of the Riemannian curvature tensor , the sectional curvature of a -plane spanned by two vectors and is
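
As a reminder, one common convention for these two objects is sketched below. The sign convention and the letters $A$, $B$, $C$ are assumptions and may differ from the ones used in the original formulas; this convention is chosen because it is compatible with Jacobi's equation written in the form $J'' + R(J,\dot\gamma)\dot\gamma = 0$.

```latex
\begin{align*}
% Riemannian curvature tensor (one standard sign convention)
R(A,B)C &= \nabla_A \nabla_B C - \nabla_B \nabla_A C - \nabla_{[A,B]} C, \\
% sectional curvature of the 2-plane spanned by A and B
K(A,B) &= \frac{\langle R(A,B)B, A \rangle}
               {\|A\|^2 \, \|B\|^2 - \langle A, B \rangle^2} .
\end{align*}
```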

**1.2. The tangent bundle to a tangent bundle**

The tangent bundle of the tangent bundle of is a bundle over in three natural ways:

- (a) where is the natural projection;
- (b) where is the natural projection;
- (c) where is defined as follows: given tangent at to a curve , we set where is the curve of footprints of the vectors ;

In this context, the vertical, resp. horizontal, subbundle of is , resp. . The vertical, resp. horizontal, subbundle is naturally identified with via , resp. . The vertical subbundle is transverse to the horizontal subbundle and the fiber of at can be identified via the map .

Geometrically, the roles of the vertical and horizontal subbundles are easier to understand in the following way. Given an element of tangent to a curve with , let be the curve of footprints of in . In this setting, the identification of with a pair of vectors via the horizontal and vertical subbundles simply amounts to taking

In other terms, the component of in the horizontal subbundle measures how fast is moving in while the component of in the vertical subbundle measures how fast is moving in the fibers of .
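
In symbols, this identification can be sketched as follows. The names are assumptions introduced for illustration: $V(t)$ is a curve in the tangent bundle $TM$ with $V(0) = v$, $c(t)$ is its curve of footprints in $M$, and $\xi$ is the tangent vector to $V$ at $t = 0$.

```latex
\[
% horizontal component: velocity of the footprints;
% vertical component: covariant derivative of V along its footprints
\xi \;\longleftrightarrow\;
\Big( \dot{c}(0), \; \frac{D V}{dt}(0) \Big)
\in T_{c(0)}M \times T_{c(0)}M .
\]
```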

This way of thinking as a bundle over leads to the following natural Riemannian metric on : given , we define

This metric is called the *Sasaki metric* and the geometry of with respect to this Riemannian metric will be useful in our study of geodesic flows.
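
With the identification of a tangent vector to $TM$ with a pair (horizontal component, vertical component), the usual definition of the Sasaki metric reads as follows. The notation $\xi_i = (v_i, w_i)$ is an assumption made for this sketch.

```latex
\[
% Sasaki metric: sum of the base metric on horizontal and vertical parts
\langle \xi_1, \xi_2 \rangle_{\mathrm{Sas}}
  = \langle v_1, v_2 \rangle + \langle w_1, w_2 \rangle ,
\qquad \xi_i = (v_i, w_i), \; i = 1, 2 .
\]
```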

**Remark 1.** The Sasaki metric is induced by the symplectic form

in the sense that

where . The symplectic form is the pullback of the canonical symplectic form on the cotangent bundle by the map associating to the linear functional .
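
One standard way to write the relation in Remark 1 is sketched below. The sign conventions for $\omega$ and for the map $J$ are assumptions (other sources may use the opposite signs); the last line checks that they are consistent with the Sasaki metric being the sum of the metrics on horizontal and vertical components.

```latex
\begin{align*}
\omega\big( (v_1, w_1), (v_2, w_2) \big)
  &= \langle v_1, w_2 \rangle - \langle w_1, v_2 \rangle, \\
J(v, w) &= (-w, v), \\
% consistency check with the Sasaki metric
\omega\big( (v_1, w_1), J(v_2, w_2) \big)
  &= \langle v_1, v_2 \rangle + \langle w_1, w_2 \rangle
   = \langle (v_1, w_1), (v_2, w_2) \rangle_{\mathrm{Sas}} .
\end{align*}
```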

For the reader’s convenience, let us mention the following three useful facts about the Sasaki metric:

- Sasaki showed that the fibers of the tangent bundle are totally geodesic submanifolds of equipped with the Sasaki metric;
- a parallel vector field on viewed as a curve on is a geodesic for the Sasaki metric that is always orthogonal to the fibers of ;
- by the Toponogov comparison theorem, for close to , one has
where is the vector obtained by parallel transporting along the geodesic connecting to and is the distance associated to the Sasaki metric; here, how close must be to depends only on the sectional curvatures of the Sasaki metric in a neighborhood of .

**2. First derivative of geodesic flows and Jacobi fields**

**2.1. Computation of the first derivative of geodesic flows**

Let be the geodesic flow associated to a Riemannian manifold . By definition, given a tangent vector , we define where is the unique geodesic of with . Here, it is worth pointing out that the geodesic flow is always locally well-defined but it might be globally ill-defined. Moreover, the geodesic flow preserves the Liouville measure (i.e., the volume form on induced by Riemannian metric of ).

We want to describe and, from the definition of first derivative, this amounts to study (-parameter) *variations of geodesics*.

More precisely, let be a (smooth) map such that, for each , is a geodesic of . Intuitively, is a one-parameter variation of the geodesic .

Define the vector field along the geodesic . It is well-known that satisfies the Jacobi equation

where is the covariant derivative (along ) and is the Riemannian curvature tensor. In other terms, is a *Jacobi field*, i.e., a vector field satisfying Jacobi’s equation.

Observe that Jacobi’s equation is a second order linear ODE. In particular, a Jacobi field is determined by the initial data .
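
For completeness, here is a sketch of the standard derivation of Jacobi's equation from a variation of geodesics. The notation is assumed: $\alpha(s,t)$ is the variation, $\gamma(t) = \alpha(0,t)$, $J(t) = \partial_s \alpha(0,t)$, primes denote covariant derivatives along $\gamma$, and $R$ follows the convention $R(A,B)C = \nabla_A\nabla_B C - \nabla_B\nabla_A C - \nabla_{[A,B]}C$.

```latex
\begin{align*}
\frac{D}{dt}\frac{D}{dt}\,\partial_s \alpha
  &= \frac{D}{dt}\frac{D}{ds}\,\partial_t \alpha
     && \text{(symmetry of the connection)} \\
  &= \frac{D}{ds}\frac{D}{dt}\,\partial_t \alpha
     + R(\partial_t \alpha, \partial_s \alpha)\,\partial_t \alpha
     && \text{(definition of } R\text{)} \\
  &= R(\partial_t \alpha, \partial_s \alpha)\,\partial_t \alpha
     && \text{(each } t \mapsto \alpha(s,t) \text{ is a geodesic)} .
\end{align*}
% Evaluating at s = 0 and using the antisymmetry R(A,B) = -R(B,A):
\[
J''(t) + R\big(J(t), \dot{\gamma}(t)\big)\,\dot{\gamma}(t) = 0 .
\]
```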

The pair corresponds to the tangent vector at to the curve in (under the identification described above [in terms of vertical and horizontal subbundles]). Indeed, the curve of footprints of is , so that the tangent vector at of is represented by

Here, the symmetry of the Levi-Civita connection was used.

Similarly, the pair represents the tangent vector at to the curve . Therefore, represents

In summary, Jacobi fields are intimately related to the first derivatives of geodesic flows:

**Proposition 1.** The image of the tangent vector under the derivative of the geodesic flow is the tangent vector where is the (unique) Jacobi field with initial data along the (unique) geodesic with .

**2.2. Perpendicular Jacobi fields and invariant subbundles**

A concrete example of Jacobi field along a geodesic is : indeed, in this context, and , so that Jacobi’s equation is trivially verified. Geometrically, this Jacobi field corresponds to a trivial variation of the geodesic where the initial point moves along and/or the speed of the parametrization of changes, i.e., .

In general, a Jacobi field along a geodesic that is tangent to has the form for some : in fact, for with , one has , so that Jacobi’s equation reduces to , i.e., for all .
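
The computation behind this claim can be sketched as follows, writing (as an assumed notation) a tangent field as $J(t) = f(t)\,\dot\gamma(t)$ for a scalar function $f$.

```latex
\begin{align*}
J''(t) &= f''(t)\,\dot{\gamma}(t)
  && \text{(since } \tfrac{D}{dt}\dot{\gamma} = 0\text{)}, \\
R\big(J, \dot{\gamma}\big)\dot{\gamma}
  &= f(t)\, R(\dot{\gamma}, \dot{\gamma})\dot{\gamma} = 0
  && \text{(antisymmetry of } R\text{)}, \\
\text{so } J'' + R(J,\dot{\gamma})\dot{\gamma} = 0
  &\iff f''(t) = 0 \iff f(t) = a\,t + b .
\end{align*}
```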

Hence, a Jacobi field along a geodesic is interesting only when it is not completely tangent to the geodesic, or, equivalently, when it has some non-trivial component in the perpendicular direction to the geodesic.

A Jacobi field along a geodesic has the following geometrical properties:

- the component of makes a constant angle with , i.e., the quantity is constant;
- if both components of are orthogonal to at some point, then they stay orthogonal all along , i.e., if and for some , then for all ;
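
Both properties follow from a short computation, sketched here in assumed notation ($J$ a Jacobi field along the geodesic $\gamma$, primes denoting covariant derivatives along $\gamma$).

```latex
\begin{align*}
\frac{d}{dt}\langle J', \dot{\gamma} \rangle
  &= \langle J'', \dot{\gamma} \rangle
   = -\langle R(J, \dot{\gamma})\dot{\gamma}, \dot{\gamma} \rangle = 0
  && \text{(symmetries of } R\text{)}, \\
\frac{d}{dt}\langle J, \dot{\gamma} \rangle
  &= \langle J', \dot{\gamma} \rangle = \text{const} .
\end{align*}
% Hence <J, \dot\gamma> is an affine function of t with slope <J', \dot\gamma>.
% If both <J, \dot\gamma> and <J', \dot\gamma> vanish at some time t_0,
% then both vanish for all t.
```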

We say that a Jacobi field along a geodesic is a *perpendicular Jacobi field* whenever both components of are orthogonal to .

From the properties of Jacobi fields discussed above, we see that any Jacobi field along a geodesic has a decomposition

where is a perpendicular Jacobi field and is a Jacobi field tangent to .
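
Concretely, for a unit-speed geodesic the decomposition can be written as below. The symbols $J^{\perp}$ and $J^{\parallel}$ are assumed names for the perpendicular and tangent parts.

```latex
\begin{align*}
J^{\parallel}(t)
  &= \Big( \langle J(0), \dot{\gamma}(0) \rangle
      + t\, \langle J'(0), \dot{\gamma}(0) \rangle \Big)\, \dot{\gamma}(t), \\
J^{\perp}(t) &= J(t) - J^{\parallel}(t) .
\end{align*}
% J^{\parallel} is a tangent Jacobi field (its coefficient is affine in t), and
% <J^{\perp}, \dot\gamma> = <(J^{\perp})', \dot\gamma> = 0 for all t because
% <J, \dot\gamma> is affine in t with constant slope <J', \dot\gamma>.
```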

After this little digression on Jacobi fields, let us use them to introduce relevant invariant subbundles under the first derivative of a geodesic flow .

We begin by recalling that the norm of a tangent vector stays constant along its -orbit, i.e., preserves the energy hypersurfaces (for each ). In particular, the first derivative of the geodesic flow preserves the tangent bundle (to the unit tangent bundle of M).

We claim that, under the identification for , the fiber (of the subbundle of ) corresponds to the set of pairs with .

In fact, note that an element of is tangent at to a variation of geodesics parametrized by arc-length, i.e.,

for all , such that the geodesic satisfies and the Jacobi field corresponding to verifies .

The desired property now follows from the following calculation:

The invariant subbundle itself admits a decomposition into two invariant subbundles, namely,

where is the vector field generating the geodesic flow and is the orthogonal complement of . In fact, under the identification for , the vector is and the elements of have the form with , . In particular, the -invariance of follows from the fact (mentioned above) that a Jacobi field satisfying and for some is a perpendicular Jacobi field (i.e., and for all ).

In summary, the action of on has two complementary invariant subbundles, namely, the span of the vector field generating the geodesic flow and its orthogonal consisting of perpendicular Jacobi fields. Since acts isometrically in the direction of , our task is reduced to studying the action of on perpendicular Jacobi fields.

**2.3. Matrix Jacobi and Riccati equations**

We want to describe the matrix of acting on the vector space of perpendicular Jacobi fields. For this sake, let be an orthonormal basis for the tangent space of , and denote by the parallel transport of this orthonormal basis along the geodesic .

Define the matrix whose entries are

where is the Riemannian curvature tensor.

Note that any Jacobi field along can be written as . In this setting, the Jacobi equation becomes

and, as usual, a solution is determined by the values and .

We can write solutions of the Jacobi equation above in a practical way by considering a matrix solution of the matrix Jacobi equation:

If is non-singular, the matrix

satisfies the matrix Riccati equation

**Remark 2.** The matrix is symmetric if and only if one has

for any two columns and of . Here, is the standard symplectic form of .
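
The passage from the matrix Jacobi equation to the matrix Riccati equation is a one-line computation, sketched here in assumed notation: $\mathcal{J}(t)$ is a non-singular matrix solution of $\mathcal{J}'' + K\mathcal{J} = 0$ and $U = \mathcal{J}'\mathcal{J}^{-1}$.

```latex
\begin{align*}
U' &= \mathcal{J}''\mathcal{J}^{-1}
      + \mathcal{J}'\,\frac{d}{dt}\big(\mathcal{J}^{-1}\big)
    = \mathcal{J}''\mathcal{J}^{-1}
      - \mathcal{J}'\mathcal{J}^{-1}\mathcal{J}'\mathcal{J}^{-1} \\
   &= -K\,\mathcal{J}\mathcal{J}^{-1} - U^2 = -K - U^2,
   \qquad \text{i.e.} \qquad U' + U^2 + K = 0 .
\end{align*}
% Symmetry criterion (matrix form of the condition in Remark 2):
\[
U = U^{T}
\iff \mathcal{J}^{T}\mathcal{J}' = (\mathcal{J}')^{T}\mathcal{J} .
\]
```

The symmetry criterion is obtained by multiplying $\mathcal{J}'\mathcal{J}^{-1} = (\mathcal{J}^{-1})^{T}(\mathcal{J}')^{T}$ on the left by $\mathcal{J}^{T}$ and on the right by $\mathcal{J}$.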

**2.4. An estimate for the first derivative of a geodesic flow**

After these preliminaries on the geometry of tangent bundles, geodesic flows and Jacobi fields, we are ready to prove the following result stated as Theorem 11 in our previous post (but whose proof was postponed for this post).

**Theorem 2.** Let be a negatively curved manifold. Let and consider a geodesic. Suppose that is a Lipschitz function such that, for each , the sectional curvature of any plane containing is greater than or equal to , and denote by the solution of Riccati’s equation

with initial data . Then, the first derivative of the geodesic flow at time satisfies the estimate

From our discussion so far, the task of estimating the norm is equivalent to providing bounds for in terms of where is a perpendicular Jacobi field along (cf. Proposition 1 and Subsection 2).
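
To get a feeling for the comparison function in Theorem 2, here is a sketch of the simplest case. It rests on two assumptions that cannot be checked against this copy of the text: that the scalar Riccati equation has the form $u' + u^2 = \delta^2$ with $u(0) = 0$, and that the curvature bound is $-\delta^2$ with $\delta > 0$ *constant*.

```latex
\begin{align*}
u(t) &= \delta \tanh(\delta t), \\
u'(t) &= \delta^2 \big( 1 - \tanh^2(\delta t) \big) = \delta^2 - u(t)^2,
  \qquad u(0) = 0, \\
\int_0^t u(s)\, ds &= \log \cosh(\delta t) .
\end{align*}
```

In particular, in this model case $u$ stays bounded by $\delta$, which is the kind of control on the growth of Jacobi fields that the two lemmas below make precise.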

We begin by estimating these quantities for two special subclasses of perpendicular Jacobi fields defined as follows. Let and be the (fundamental) solutions of the matrix Jacobi equation

with initial data and . Note that, by definition, all Jacobi fields with , resp. all Jacobi fields with , have the form , resp. , i.e., they are obtained by applying the matrices , resp. , to a vector , resp. . In this setting, the “other” component , resp. (of the Jacobi field , resp. , viewed as a tangent vector to ) can be recovered by applying the matrix , resp. , to , resp. .

**Remark 3.** Very roughly speaking, the idea behind the choice of the subclasses and is that are Jacobi fields belonging to a certain “stable cone” and are Jacobi fields belonging to a certain “unstable cone” (compare with the discussion in the next Section).

Our first lemma says that the tangent vectors associated to Jacobi fields as above do *not* grow in forward time.

**Lemma 3.** Let be a perpendicular Jacobi field along such that . Then,

In particular,

*Proof:* One of the consequences of negative sectional curvatures along is the fact that the functions and are strictly convex for any perpendicular Jacobi field (see, e.g., Eberlein’s survey).

In our context, this implies that is a (strictly) convex function decreasing from to in the interval . Therefore,

Since and for close to (because ), we deduce that

This completes the proof of the lemma.

Our second lemma says that the growth in forward time of tangent vectors associated to Jacobi fields as above is reasonably controlled in terms of the solution of Riccati’s equation with (where is the Lipschitz function controlling some sectional curvatures of ).

**Lemma 4.** Let be a perpendicular Jacobi field along with . Then,

*Proof:* By definition, . Thus, and, *a fortiori*,

On the other hand, since , we see that and, hence,

These inequalities show that the proof of the lemma is complete once we can prove that for all .

In this direction, let us observe that the matrix is symmetric because it verifies (in a trivial way) the condition of Remark 2. Therefore, the norm of is given by the expression

where ranges over all unit vectors. In particular, our task is reduced to showing that

for all unit vectors , where .

From the matrix Riccati equation, we see that