However, before entering into the mathematical discussion proper, let me take the opportunity to dedicate this blog post to the memory of two Russian mathematicians who passed away earlier this month: Dmitri Anosov and Nikolai Chernov. Among their many well-known contributions to Dynamical Systems, we can cite:

- Anosov’s proof of the ergodicity of a large class of volume-preserving hyperbolic systems (nowadays called Anosov diffeomorphisms);
- Chernov’s proof of subexponential mixing for a large class of Anosov flows;
- …

Of course, the list of contributions of Anosov and Chernov to Dynamical Systems is vast: each of them wrote more than 90 research articles and books about the features of systems with some hyperbolicity (such as geodesic flows on negatively curved manifolds and chaotic billiards) among other topics.

In particular, it is out of the scope of this post to provide detailed descriptions of the works of these two very influential dynamicists.

On the other hand, as a form of “small compensation”, let me say that the second section of this post (about rates of mixing of the WP flow on the modular surface) briefly discusses some of the ideas advanced by these two mathematicians.

Concerning the rates of mixing of the WP flow, let us recall that, by the Burns-Masur-Wilkinson theorem (cf. Theorem 1 in the first post of this series), the WP flow on $T^1\mathcal{M}_{g,n}$ is *mixing* with respect to the Liouville measure whenever $3g-3+n\geq 1$.

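By definition of the mixing property, this means that the correlation function $C_t(f,g) := \int f\cdot (g\circ\varphi_t)\, d\mu - \left(\int f\, d\mu\right)\left(\int g\, d\mu\right)$ (with $\varphi_t$ the WP flow and $\mu$ the Liouville measure) converges to $0$ as $t\to\infty$ for any given $L^2$-integrable observables $f$ and $g$. (See, e.g., the section “$L^2$ formulation” in this Wikipedia article about the mixing property.)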

Given this scenario, it is natural to ask how *fast* the correlation function converges to zero. In general, the correlation function can decay to $0$ (as a function of the time $t$) in a very slow way depending on the choice of the observables (see, e.g., this blog post of Climenhaga for some concrete examples). Nevertheless, it is often the case (for mixing flows with some hyperbolicity) that the correlation function decays to $0$ with a *definite* (e.g., polynomial, exponential, etc.) speed when restricting the observables to appropriate spaces of “reasonably smooth” functions.

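In other words, given a mixing flow (with some hyperbolicity), it is usually possible to choose appropriate functional (e.g., Hölder, $C^r$, Sobolev, etc.) spaces $X$ and $Y$ such that the correlation function $C_t(f,g)$ at time $t$ of any observables $f\in X$ and $g\in Y$ satisfies

- $|C_t(f,g)|\leq C\,\|f\|_X\,\|g\|_Y\, t^{-n}$ for some constants $C>0$, $n\in\mathbb{N}$ and for all $t\geq 1$ (*polynomial decay*), or
- $|C_t(f,g)|\leq C\,\|f\|_X\,\|g\|_Y\, e^{-ct}$ for some constants $C>0$, $c>0$ and for all $t\geq 1$ (*exponential decay*).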
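To make the notion of exponential decay concrete, here is a minimal numerical sketch on a toy example that is *not* taken from this post: the doubling map $x\mapsto 2x \bmod 1$ on $[0,1]$ with Lebesgue measure (a discrete-time system rather than a flow) and the smooth observables $f(x)=g(x)=x-1/2$. For this particular choice, a transfer operator computation gives correlations equal to $2^{-n}/12$, and the hypothetical helper `correlation` below recovers this value by midpoint quadrature.

```python
# Toy illustration (assumption: doubling map T(x) = 2x mod 1, not the WP flow).
N = 1 << 16  # number of midpoint quadrature nodes (a power of 2)

def correlation(n, f=lambda x: x - 0.5, g=lambda x: x - 0.5):
    """Approximate int f*(g o T^n) dx - (int f dx)*(int g dx) for T(x) = 2x mod 1."""
    sum_fg = sum_f = sum_g = 0.0
    for k in range(N):
        x = (k + 0.5) / N            # midpoint of the k-th cell
        tx = (2 ** n * x) % 1.0      # T^n(x); exact in floating point for dyadic x
        sum_fg += f(x) * g(tx)
        sum_f += f(x)
        sum_g += g(x)                # Lebesgue measure is T-invariant
    return sum_fg / N - (sum_f / N) * (sum_g / N)

if __name__ == "__main__":
    for n in range(0, 9, 2):
        # numerical value next to the exact value 2^{-n}/12
        print(n, correlation(n), 2.0 ** (-n) / 12.0)
```

The point of the example is only that the decay is visibly geometric in $n$ for a smooth observable; for merely integrable observables no such uniform rate is available.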

Evidently, the “precise” rate of mixing of the flow (i.e., the sharp values of the constants above) depends on the choice of the functional spaces (as it might change if we replace the observables by smoother ones, say). On the other hand, the *qualitative* speed of decay of the correlation function, that is, the fact that it decays polynomially or exponentially as time goes to infinity whenever the observables are “reasonably smooth”, remains *unchanged* if we select the functional spaces from a well-behaved scale (like Hölder or Sobolev spaces). In particular, this partly explains why in the Dynamical Systems literature one simply says that a given mixing flow has “polynomial decay” or “exponential decay”: usually we are interested in the qualitative behavior of the correlation function for reasonably smooth observables, but the particular choice of functional spaces is normally treated as a “technical detail”.

After this brief description of the notion of rate of mixing (speed of decay of correlation functions), we are ready to state the main result of this post.

Theorem 1 (Burns-Masur-M.-Wilkinson) The rate of mixing of the WP flow on $T^1\mathcal{M}_{g,n}$ is:

- at most polynomial when $3g-3+n>1$;
- rapid (faster than any polynomial) when $3g-3+n=1$.

Remark 1 This result was announced as Theorem 2 in the first post of this series and also in this preprint here. Since then, Burns, Masur, Wilkinson and I found some evidence indicating that the Weil-Petersson geodesic flow on $T^1\mathcal{M}_{g,n}$ is actually exponentially mixing when $3g-3+n=1$. The details will hopefully appear in the forthcoming paper (currently still in preparation).

Remark 2 An open problem left by Theorem 1 is to determine the rate of mixing of the WP flow on $T^1\mathcal{M}_{g,n}$ for $3g-3+n>1$. Indeed, while this theorem provides a polynomial upper bound for the rate of mixing in this setting, it does not rule out the possibility that the actual rate of mixing of the WP flow is sub-polynomial (even for reasonably smooth observables). Heuristically speaking, we believe that the sectional curvatures of the WP metric control the time spent by WP geodesics near the boundary of $\mathcal{M}_{g,n}$. In particular, it seems that the problem of determining the rate of mixing of the WP flow (when $3g-3+n>1$) is somewhat related to the issue of finding suitable (polynomial?) bounds for how close to zero the sectional curvatures of the WP metric can be (in terms of the distance to the boundary of $\mathcal{M}_{g,n}$). Unfortunately, the best available bounds for the sectional curvatures of the WP metric (due to Wolpert) do not rule out the possibility that some of these quantities get extremely close to zero (see Remark 4 of this post here).

The difference in the rates of mixing of the WP flow on $T^1\mathcal{M}_{g,n}$ when $3g-3+n=1$ or $3g-3+n>1$ in Theorem 1 reflects the following simple (yet important) feature of the WP metric near the boundary of the Deligne-Mumford compactification of $\mathcal{M}_{g,n}$.

In the case $3g-3+n=1$, e.g., $(g,n)=(1,1)$, the moduli space $\mathcal{M}_{1,1}$ equipped with the WP metric looks like the surface of revolution of the profile $\{v=u^3\}$ near the cusp at infinity (see Remark 6 of this post here). In particular, even though an $\varepsilon$-neighborhood of the cusp is “polynomially large” (with area $\approx\varepsilon^4$), the Gaussian curvature approaches $-\infty$ only near the cusp and, as it turns out, this strong negative curvature near the cusp forces all geodesics not pointing directly towards the cusp to come back to the compact part in bounded time. In other words, the excursions of infinite WP geodesics on $\mathcal{M}_{1,1}$ near the cusp are so quick that the WP flow on $T^1\mathcal{M}_{1,1}$ is “close” to a classical Anosov geodesic flow on a negatively curved compact surface. In particular, it is not entirely surprising that the WP flow on $T^1\mathcal{M}_{1,1}$ is rapidly mixing.

On the other hand, in the case $3g-3+n>1$, the WP metric on $\mathcal{M}_{g,n}$ has *some* sectional curvatures close to *zero* near the boundary of the Deligne-Mumford compactification of $\mathcal{M}_{g,n}$ (see Theorem 3 and Remark 5 of this post here). By exploiting this feature of the WP metric on $\mathcal{M}_{g,n}$ for $3g-3+n>1$ (that has no counterpart for $\mathcal{M}_{1,1}$ or $\mathcal{M}_{0,4}$), we will build a *non-negligible* set of WP geodesics spending a *long* time near the boundary of $\mathcal{M}_{g,n}$ before eventually getting into the compact part. In this way, we will deduce that the WP flow on $T^1\mathcal{M}_{g,n}$ takes a fair (polynomial) amount of time to mix certain parts of the boundary of $\mathcal{M}_{g,n}$ with fixed compact subsets of $\mathcal{M}_{g,n}$.

In the remainder of this post, we will give some details of the proof of Theorem 1. In the next section, we give a fairly complete proof (assuming the results in this previous post, of course) of the polynomial upper bound on the rate of mixing of the WP flow on when . After that, in the final section, we provide a *sketch* of the proof of the rapid mixing property of the WP flow on . In fact, we decided (for pedagogical reasons) to explain some key points of the rapid mixing property *only* in the *toy model* case of a negatively curved surface with one cusp corresponding *exactly* to a surface of revolution of a profile , . In this way, since the WP metric near the cusp of can be thought of as a “perturbation” of the surface of revolution of (thanks to Wolpert’s asymptotic formulas), the reader will hopefully get a flavor of the main ideas behind the proof of rapid mixing of the WP flow on without getting into the (somewhat boring) technical details needed to check that the arguments used in the toy model case are “sufficiently robust” so that they can be “carried over” to the “perturbative setting” of the WP flow on .

**1. Rates of mixing of the WP flow on $T^1\mathcal{M}_{g,n}$. I**

In this section, our notations are the same as in this previous post here.

Given , let us consider the portion of consisting of such that a non-separating (homotopically non-trivial, non-peripheral) simple closed curve has hyperbolic length . The following picture illustrates this portion of as a -neighborhood of the stratum of the boundary of the Deligne-Mumford compactification where gets pinched (i.e., becomes zero).

Note that the stratum is non-trivial (that is, not reduced to a single point) when . Indeed, by pinching as above and by disconnecting the resulting node, we obtain Riemann surfaces of genus with punctures whose moduli space is isomorphic to . It follows that is a complex orbifold of dimension , and, a fortiori, is not trivial. Evidently, this argument breaks down when : for example, by pinching a curve as above in a once-punctured torus and by removing the resulting node, we obtain thrice punctured spheres (whose moduli space is trivial). In particular, our Figure 1 concerns *exclusively* the case .

We want to locate certain regions near taking a long time to mix with the compact part of . For this sake, we will exploit the geometry of the WP metric near — e.g., the fact provided by Wolpert’s formulas (cf. Theorem 3 in this post) that some sectional curvatures of the WP metric approach zero — to build nice sets of unit vectors traveling in an “almost parallel” way to for a significant amount of time.

More precisely, we consider the vectors and (where is the complex structure). By definition, they span a complex line . Intuitively, the complex line points in the normal direction to a “copy” of inside a level set of the function as indicated in the following picture:

Using the complex line , we formalize the notion of “almost parallel” vector to . Indeed, given , let us denote by the quantity (where is the WP metric). By definition, measures the size of the projection of the unit vector in the complex line . In particular, we can think of as “almost parallel” to whenever the quantity is very close to zero.

In this setting, we will show that unit vectors almost parallel to whose footprints are close to always generate geodesics staying near for a long time. More concretely, given , let us define the set

where and is the footprint of the unit vector . Equivalently, is the disjoint union of the pieces of spheres attached to points with . The following figure summarizes the geometry of :

We would like to prove that a geodesic originating at any stays in a -neighborhood of for an interval of time of size of order , so that the WP geodesic flow does *not* mix with any fixed ball in the compact part of of Riemann surfaces with systole :

In this direction, we will need the following lemma from the third post of this series (cf. Lemma 13 in this post here).

From this lemma, it is not hard to estimate the amount of time spent by a geodesic near for an arbitrary :

Lemma 3 There exists a constant (depending only on and ) such that

for all and .

*Proof:* By definition, implies that . Thus, it makes sense to consider the maximal interval of time such that for all .

By Lemma 2, we have that , i.e., for some constant depending only on and . In particular, for all . From this estimate, we deduce that

for all . Since the fact that implies that , the previous inequality tells us that

for all .

Next, we observe that, by definition, . Hence,

By putting together the previous two inequalities and the fact that (as ), we conclude that

Since was chosen so that is the maximal interval with for all , we have that . Therefore, the previous estimate can be rewritten as

Because , it follows from this inequality that where .

In other words, we showed that , and, *a fortiori*, for all . This completes the proof of the lemma.

Once we have Lemma 3 in our toolbox, it is not hard to infer some upper bounds on the rate of mixing of the WP flow on when .

Proposition 4 Suppose that the WP flow on has a rate of mixing of the form

for some constants , , for all , and for all choices of -observables and . Then, , i.e., the rate of mixing of the WP flow is at most polynomial.

*Proof:* Let us fix once and for all an open ball (with respect to the WP metric) contained in the compact part of : this means that there exists such that the systoles of all Riemann surfaces in are .

Take a function supported on the set of unit vectors with footprints on with values such that and : such a function can be easily constructed by smoothing the characteristic function of with the aid of bump functions. Next, for each , take a function supported on the set with values such that and : such a function can also be constructed by smoothing the characteristic function of after taking into account the description of the WP metric near given by Theorems 2 and 3 in this post here and the definition of (in terms of the conditions and ). Furthermore, this description of the WP metric near combined with the asymptotic expansion where and is a twist parameter (see the proof of Lemma 4 of this post here) says that : indeed, the condition on footprints of unit tangent vectors in provides a set of volume (cf. the proof of Lemma 4 of the aforementioned post for details) and the condition on unit tangent vectors in with a fixed footprint provides a set of volume comparable to the Euclidean area of the Euclidean ball (cf. Theorem 2 in this post here), so that

In summary, for each , we have a function supported on with , and for some constant depending only on and .

Our plan is to use the observables and to give some upper bounds on the mixing rate of the WP flow . For this sake, suppose that there are constants and such that

for all and .

By Lemma 3, there exists a constant such that whenever . Indeed, since is a symmetric set (i.e., if and only if ), it follows from Lemma 3 that all Riemann surfaces in the footprints of have a systole . Because we took in such a way that all Riemann surfaces in have systole , we obtain , that is, , as it was claimed.

Now, let us observe that the function is supported on because is supported on and is supported on . By putting together this fact and the claim in the previous paragraph (that for ), we deduce that whenever . Thus,

By plugging this identity into the polynomial decay of correlations estimate , we get

whenever and .

We claim that the previous estimate implies that . In fact, recall that our choices were made so that where is a fixed ball, , for some constant and . Hence, by combining these facts and the previous mixing rate estimate, we get that

that is, , for some constant and for all sufficiently small (so that and ). It follows that , as we claimed. This completes the proof of the proposition.
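The exponent count at the end of this proof can be summarized schematically as follows. All names below are placeholders introduced only for this sketch and are not the specific values appearing in the estimates above: write $f\geq 0$ for the observable supported over the fixed ball, $g_\varepsilon\geq 0$ for the observable supported near the boundary, $\gamma$ for the hypothetical polynomial rate, and assume $\int g_\varepsilon\, d\mu\geq c_1\varepsilon^{a}$, $\|g_\varepsilon\|_{C^1}\leq c_2\varepsilon^{-b}$, and that the supports of the two observables remain disjoint under the flow up to a time $t_\varepsilon=c_3\varepsilon^{-\kappa}$ (for some $a, b, \kappa>0$).

```latex
\begin{aligned}
c_1\,\varepsilon^{a}\int f\,d\mu
  &\leq \Big(\int f\,d\mu\Big)\Big(\int g_\varepsilon\,d\mu\Big)
   = |C_{t_\varepsilon}(f,g_\varepsilon)| \\
  &\leq C\,t_\varepsilon^{-\gamma}\,\|f\|_{C^1}\,\|g_\varepsilon\|_{C^1}
   \leq C\,c_2\,c_3^{-\gamma}\,\|f\|_{C^1}\;\varepsilon^{\kappa\gamma-b}.
\end{aligned}
% Letting \varepsilon \to 0 forces a \geq \kappa\gamma - b, i.e. \gamma \leq (a+b)/\kappa.
```

In words: the correlation at time $t_\varepsilon$ equals the product of the integrals (because the mixed term vanishes), so a polynomially small lower bound on this product competes with the assumed decay rate, and the comparison of exponents as $\varepsilon\to 0$ bounds $\gamma$ from above.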

Remark 3 In the statement of the previous proposition, the choice of -norms to measure the rate of mixing of the WP flow is not very important. Indeed, an inspection of the construction of the functions in the argument above reveals that for any , . In particular, the proof of the previous proposition is sufficiently robust to show also that a rate of mixing of the form

for some constants , , for all , and for all choices of -observables and holds only if . In other words, even if we replace -norms by (stronger, smoother) -norms in our measurements of rates of mixing of the WP flow (on for ), our discussions so far will always give polynomial upper bounds for the decay of correlations.

At this point, our discussion of the proof of the first item of Theorem 1 is complete (thanks to Proposition 4 and Remark 3). So, we now move on to the next section, where we give some of the key ideas in the proof of the second item of Theorem 1.

**2. Rates of mixing of the WP flow on $T^1\mathcal{M}_{g,n}$. II**

Let us consider the WP flow on $T^1\mathcal{M}_{g,n}$ when $3g-3+n=1$, that is, when $(g,n)=(0,4)$ or $(1,1)$.

Actually, we will restrict our attention to the case $(g,n)=(1,1)$ because the remaining case $(g,n)=(0,4)$ is very similar to $(g,n)=(1,1)$.

Indeed, the moduli space $\mathcal{M}_{0,4}$ of four-times punctured spheres is a *finite* cover of the moduli space $\mathcal{M}_{1,1}$: this can be seen by sending each four-punctured sphere to the elliptic curve , so that becomes naturally isomorphic to where is a congruence subgroup of of level with index . Since all arguments towards rapid mixing of geodesic flows in this section still work after taking finite covers, it suffices to prove the second item of Theorem 1 for the WP flow on $T^1\mathcal{M}_{1,1}$.

The rate of mixing of a geodesic flow on the unit tangent bundle of a negatively curved *compact* surface is known to be *fast*: indeed, Chernov used his technique of “Markov approximations” to show *stretched exponential* decay of correlations, and Dolgopyat added a new crucial ingredient (“Dolgopyat’s estimate”) to Chernov’s work to prove *exponential* decay of correlations.

Evidently, these works of Chernov and Dolgopyat cannot be applied to the WP flow on because of the non-compactness of due to the presence of a (single) cusp (at infinity). Nevertheless, this suggests that we should be able to determine the rate of mixing of the WP flow on provided we have enough control of the geometry of the WP metric near the cusp.

Fortunately, as we mentioned in Example 5 of this post here, Wolpert showed that the WP metric on has an *asymptotic* expansion at a point . Thus, the WP metric on neighborhoods (with ) of the cusp at infinity of becomes closer (as ) to the metric of the surface of revolution of the profile on neighborhoods of the cusp at (as ).

Partly motivated by the scenario of the previous paragraph, from now on we will *pretend* that the WP metric on looks *exactly* like the metric at all points for some . In other words, instead of studying the WP flow on , we will focus on the rates of mixing of the following *toy model*: the geodesic flow on a negatively curved surface with a single cusp possessing a neighborhood where the metric is isometric to the surface of revolution of a profile for a fixed real number .

Remark 4 The surface of revolution modeling the WP metric on is obtained by rotating the profile . In other words, we see that the study of rates of mixing of the surface of revolution approximating the WP metric on is a “borderline case” in our subsequent discussion.

Here, our main motivations to replace the WP flow on by the toy model described above are:

- all important ideas for the study of rates of mixing of are also present in the case of the toy model, and
- even though the WP metric on is a perturbation of a surface of revolution, the verification of the fact that the arguments used to estimate the decay of correlations of the geodesic flow on the toy model surfaces are robust enough so that they can be carried over to the WP metric situation is somewhat boring: basically, besides performing a slight modification of the proofs to include the borderline case , one has to introduce “error terms” in the whole discussion below and, after that, one has to check that these error terms do not change the qualitative nature of all estimates.

In summary, the remainder of this section will contain a proof of the following “toy model version” of the second item of Theorem 1.

Theorem 5 Let be a compact surface and fix . Suppose that is equipped with a negatively curved Riemannian metric such that the restriction of to a neighborhood of is isometric to a surface of revolution of a profile (for some choices of and ). Then, the geodesic flow (associated to ) on is rapidly (faster than polynomially) mixing in the sense that, for all , one can choose an adequate Banach space of “reasonably smooth” observables and a constant so that

for all .

Remark 5 The arguments below show that the statement above also holds when is equipped with a negatively curved metric that is isometric to a surface of revolution , , near for each .

Remark 6 The Riemannian metric is incomplete because the surface of revolution of is incomplete when (as the reader can check via a simple calculation).

Recall that, in the setting of Theorem 5, we want to understand the dynamics of the excursions of the geodesic flow near the cusp (in order to get rapid mixing). For this sake, we describe these excursions by rewriting the geodesic flow (near ) as a *suspension flow*.

**2.1. Excursions near the cusp and suspension flows**

Consider a small neighborhood in of where the metric is isometric to the surface of revolution of the profile , i.e.,

Next, take a small parameter and consider the parallel . We parametrize unit tangent vectors to the surface of revolution with footprints in as follows.

Given , we denote by the unique unit tangent vector pointing towards the cusp at . Equivalently, is the unit vector tangent to the meridian at time , or, alternatively, where is the distance function from the cusp to a point . Also, we let be the unit vector obtained by rotating by in the counterclockwise sense (i.e., by applying the natural almost complex structure ).

In this setting, a unit vector pointing towards the cusp is completely determined by a real number such that and , i.e.,

The *qualitative* behavior of the excursion of a geodesic starting at can be easily determined in terms of the parameter thanks to the classical results in Differential Geometry about surfaces of revolution. Indeed, it is well-known (see, e.g., Do Carmo’s book) that such a geodesic satisfies

and

for a certain constant , and, furthermore, these relations imply Clairaut’s famous relation:

where is the parameter attached to (i.e., ). In particular, except for the geodesic going directly to the cusp (i.e., the geodesic starting at associated to ), all geodesics (starting at with ) behave qualitatively in a simple way. In the first part of its excursion towards the cusp, the angle increases (resp. decreases) from to (resp. from to ) while the value of diminishes in order to keep up with Clairaut’s relation. Then, the geodesic reaches its closest position to the cusp at time : here, (i.e., is tangent to the parallel containing ) and, hence,

Finally, in the second part , does the “opposite” from the first part: the angle goes from to and increases from back to . The following picture summarizes the discussion of this paragraph:

Remark 7 Note that the time taken by the geodesic to go from the parallel to and then from back to is *independent* of the basepoint . Indeed, this is a direct consequence of the rotational symmetry of our surface. Alternatively, this can be easily seen from the formula

deduced by integration of the ODE satisfied by . Observe that this formula also shows that is uniformly bounded, i.e., for all . Geometrically, this means that all geodesics starting at must return to in bounded time unless they go directly into the cusp.
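The qualitative picture above (Clairaut's relation, a single closest approach to the cusp, and a bounded return time) can be checked numerically. The following is a minimal sketch under assumptions that are not taken from the post: the profile is taken to be $v=u^R$ with the hypothetical values `R = 4.0` and `U0 = 0.5`, the surface is parametrized by $(u,\theta)\mapsto(u, u^R\cos\theta, u^R\sin\theta)$ with metric $E(u)\,du^2+G(u)\,d\theta^2$, and the geodesic equations are integrated with a hand-written RK4 scheme. In these coordinates Clairaut's relation says that $G(u)\,\theta'$ is constant along unit-speed geodesics.

```python
import math

# Assumptions for this sketch (not values from the post):
R = 4.0    # exponent of the hypothetical profile v = u**R
U0 = 0.5   # the excursion starts and ends on the parallel {u = U0}

def E(u):  return 1.0 + (R * u ** (R - 1)) ** 2      # metric: E du^2 + G dtheta^2
def dE(u): return 2.0 * R * R * (R - 1) * u ** (2 * R - 3)
def G(u):  return u ** (2 * R)
def dG(u): return 2.0 * R * u ** (2 * R - 1)

def rhs(state):
    """Geodesic equations for the metric E(u) du^2 + G(u) dtheta^2."""
    u, theta, du, dtheta = state
    ddu = (dG(u) * dtheta ** 2 - dE(u) * du ** 2) / (2.0 * E(u))
    ddtheta = -dG(u) / G(u) * du * dtheta
    return (du, dtheta, ddu, ddtheta)

def rk4_step(state, h):
    def shift(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shift(state, k1, h / 2))
    k3 = rhs(shift(state, k2, h / 2))
    k4 = rhs(shift(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def excursion(psi0, h=2e-4):
    """Follow the unit-speed geodesic leaving {u = U0} at angle psi0 (0 < psi0 < pi/2)
    with the meridian pointing to the cusp.
    Returns (return_time, u_min, clairaut_drift)."""
    state = (U0, 0.0,
             -math.cos(psi0) / math.sqrt(E(U0)),
             math.sin(psi0) / math.sqrt(G(U0)))
    c0 = G(U0) * state[3]            # Clairaut constant: radius * sin(angle)
    t, u_min, drift = 0.0, U0, 0.0
    while True:
        new = rk4_step(state, h)
        u_min = min(u_min, new[0])
        drift = max(drift, abs(G(new[0]) * new[3] - c0))
        if new[0] >= U0:             # back on the parallel {u = U0}
            frac = (U0 - state[0]) / (new[0] - state[0])  # linear interpolation
            return t + frac * h, u_min, drift
        state, t = new, t + h

if __name__ == "__main__":
    for psi in (0.3, 0.8, 1.3):
        print(psi, excursion(psi))
```

For this assumed profile, the closest approach predicted by Clairaut's relation is $u_{\min}=U_0\,(\sin\psi_0)^{1/R}$, which the computed `u_min` should match, and the return times stay bounded as $\psi_0$ varies.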

This description of the excursions of geodesics near the cusp allows us to build a *suspension-flow* model of the geodesic flow near . Indeed, let us consider the *cross-section* . As we saw above, an element of the surface is parametrized by two angular coordinates and : the value of determines a point and the value of determines a unit tangent vector making angle with . The subset of consisting of those elements with angular coordinate corresponds to the unit vectors with footprint in pointing towards the cusp at . The equation determines a circle inside corresponding to geodesics going straight into the cusp, and, furthermore, we have a natural “first-return map” defined by where is the geodesic starting at at time .

In this setting, the orbits , are modeled by the “suspension flow” if , over the *base map* with *roof* function , .

Remark 8 Technically speaking, one needs to “complete” the definition of and by including the dynamics of the geodesic flow on the compact part of in order to properly write the geodesic flow on as a suspension flow. Nevertheless, since the major technical difficulty in the proof of Theorem 5 comes from the presence of the cusp, we will ignore the excursions of geodesics in the compact part and we will pretend that the (partially defined) flow is a “genuine” suspension flow model.
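For readers who prefer code to definitions, here is a minimal sketch of a generic suspension flow over a base map with a positive roof function. The function name `suspension_flow` and the toy base map and roof used in the example are assumptions made for illustration; they are not the first-return map and return time of the geodesic flow discussed above.

```python
def suspension_flow(base_map, roof, x, s, t):
    """Flow the point (x, s) for time t >= 0 in the suspension over base_map.

    Points are pairs (x, s) with 0 <= s < roof(x), and (x, roof(x)) is
    identified with (base_map(x), 0). The roof function must be positive."""
    s += t
    while s >= roof(x):
        s -= roof(x)        # we hit the roof: spend roof(x) units of time ...
        x = base_map(x)     # ... and restart from the base at base_map(x)
    return x, s

if __name__ == "__main__":
    rotation = lambda x: (x + 0.25) % 1.0   # toy base map (assumption)
    roof = lambda x: 1.0 + x                # toy roof function (assumption)
    print(suspension_flow(rotation, roof, 0.0, 0.0, 2.5))
```

In the toy example the orbit of $(0,0)$ hits the roof twice before time $2.5$ (after times $1$ and $1+1.25$), so the flow ends at the base point $0.5$ with height $0.25$.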

**2.2. Rapid mixing of contact suspension flows**

One of the advantages of thinking of the geodesic flow on as a suspension flow comes from the fact that several authors have previously studied the interplay between the rates of mixing of this class of flows and the features of and : see, e.g., these papers of Avila-Gouëzel-Yoccoz and Melbourne for some results in this direction (and also for a precise definition of suspension flows).

For our current purposes, it is worth recalling that Bálint and Melbourne (cf. Theorem 2.1 [and Remarks 2.3 and 2.5] of this paper here) proved the rapid mixing property for *contact* suspension flows whose base map is modeled by a *Young tower* with *exponential tails* and whose roof function is bounded and *uniformly piecewise Hölder continuous* on each subset of the basis of the Young tower. In particular, the proof of Theorem 5 is complete once we prove that the base map is modeled by Young towers and the roof function is bounded and uniformly piecewise Hölder continuous on each element of the basis of the Young tower (whatever this means).

As it turns out, the theory of Young towers (introduced by Young in these papers here and here) is a *double-edged sword*: while it provides an adequate setup for the study of statistical properties of systems with some hyperbolicity *once* the so-called *Young towers* were built, it has the *drawback* that the construction of Young towers (satisfying all five natural but technical axioms in Young’s definition) is usually a delicate issue: indeed, one has to find a countable Markov partition of a positive measure subset (working as the basis of the Young tower) so that the return maps associated to this Markov partition verify several hyperbolicity and distortion controls, and it is not always clear where one could possibly find such a Markov partition for a given dynamical system.

Fortunately, Chernov and Zhang gave a list of *sufficient* geometric properties for a *two-dimensional* map like to be modeled by Young towers with exponential tails: in fact, Theorem 10 in the Chernov-Zhang paper is a sort of “black-box” producing Young towers with exponential tails whenever seven *geometrical* conditions are fulfilled. For the sake of exposition, we will not attempt to check all seven conditions for : instead, we will focus on two main conditions called *distortion bounds* and *one-step growth condition*.

Before we discuss the distortion bounds and the one-step growth condition, we need to recall the concept of *homogeneity strips* (originally introduced by Bunimovich-Chernov-Sinai). In our setting, we take and (to be chosen later) and we make a partition of a neighborhood of the *singular set* (of geodesics going straight into the cusp) into countably many strips:

for all , . (Actually, has two connected components, but we will slightly abuse notation by denoting these connected components by .)

Intuitively, the partition into polynomial scales in the parameter is useful in our context because the relevant quantities (such as Gaussian curvature, first and second derivatives, etc.) for the study of the geodesic flow of the surface of revolution blow up with a polynomial speed as the excursions of geodesics get closer to the cusp (that is, as ). Thus, the important quantities for the analysis of the geodesic flow near the cusp become “almost constant” when restricted to one of the homogeneity strips .
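As a small illustration of partitions into polynomial scales, the sketch below locates the strip containing a given angular parameter. Since the displayed definition of the strips is not reproduced here, the code *assumes* strips of the standard Bunimovich-Chernov-Sinai form $\{(k+1)^{-\nu}\leq|\psi|<k^{-\nu}\}$ for $k\geq k_0$; this form, the function name `strip_index`, and the parameter names `nu` and `k0` are assumptions of the sketch.

```python
import math

def strip_index(psi, nu, k0):
    """Return the index k >= k0 with (k+1)**(-nu) <= |psi| < k**(-nu),
    or None if |psi| lies outside the union of the strips
    (that is, psi == 0 or |psi| >= k0**(-nu))."""
    a = abs(psi)
    if a == 0.0 or a >= k0 ** (-nu):
        return None
    k = max(k0, math.ceil(a ** (-1.0 / nu)) - 1)
    while a >= k ** (-nu):        # rounding put us one strip too deep
        k -= 1
    while a < (k + 1) ** (-nu):   # rounding left us one strip too shallow
        k += 1
    return k

if __name__ == "__main__":
    for psi in (0.1, -0.02, 0.3, 0.0):
        print(psi, strip_index(psi, nu=2, k0=2))
```

The strips shrink polynomially in $k$, so quantities that blow up polynomially as $\psi\to 0$ vary by a bounded factor inside each strip.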

Also, another advantage of the homogeneity strips is the fact that they give a rough control of the elements of the countable Markov partition at the basis of the Young tower produced by Chernov-Zhang: indeed, the arguments of Chernov-Zhang show that each element of the basis of their Young tower is completely contained in a homogeneity strip. In particular, the verification of the uniform piecewise Hölder continuity of the roof function follows once we prove that the restriction of the roof function to each homogeneity strip is uniformly Hölder continuous (in the sense that, for some , the Hölder norms are bounded by a constant *independent* of ).

Coming back to the one-step growth and distortion bounds, let us content ourselves with formulating simpler *versions* of them (while referring to Sections 4 and 5 of the Chernov-Zhang paper for precise definitions): indeed, the actual definitions of these notions involve the properties of the derivative along unstable manifolds, and, in our current setting, we have just a *partially defined* map , so that we cannot talk about future iterates and unstable manifolds unless we “complete” the definition of .

Nevertheless, even if is only partially defined, we still can give crude analogs to unstable directions for by noticing that the vector field on (whose leaves are ) morally works like an unstable direction: in fact, this vector field is transverse to the singular set which is a sort of “stable set” because all trajectories of the geodesic flow starting at converge in the future to the same point, namely, the cusp at . In terms of the “unstable direction” , we define the *expansion factor* of at a point as , that is, the amount of expansion of the “unstable” vector field under . Note that, from the definitions, the expansion factor depends only on the -coordinate of . So, from now on, we will think of expansion factors as a function of .

In terms of expansion factors, the (variant of the) distortion bound condition is

where satisfies , and the (variant of the) one-step growth condition is

Remark 9 The one-step growth condition above is very close to the original version in the Chernov-Zhang paper (compare (3) with Equation (5.5) in the Chernov-Zhang article). On the other hand, the distortion bound condition (2) differs slightly from its original version in Equation (4.1) in the Chernov-Zhang paper. Nevertheless, they can be related as follows. The original distortion condition essentially amounts to giving estimates (where is a smooth function such that as ) whenever and belong to the same *homogeneous unstable manifold* (i.e., a piece of unstable manifold such that never intersects the boundaries of the homogeneity strips for all and ; the existence of homogeneous unstable manifolds through almost every point is guaranteed by a Borel-Cantelli type argument described in Appendix 2 of this paper of Bunimovich-Chernov-Sinai here). Here, one sees that

for some . Using the facts that decays exponentially fast (as and are in the same unstable manifold ) and is always contained in a homogeneity strip (as is a homogeneous unstable manifold), one can check that the estimate in (2) implies the desired uniform bound on the previous expression in terms of a smooth function such that as . In other words, the estimate (2) can be shown to imply the original version of distortion bounds, so that we can safely concentrate on the proof of (2).

At this point, we can summarize the discussion so far as follows. By the Bálint-Melbourne criterion for rapid mixing for contact suspension flows and the Chernov-Zhang criterion for the existence of Young towers with exponential tails for the map , we have “reduced” the proof of Theorem 5 to the following statements:

Proposition 6 Given and , one has the following “uniform Hölder estimate”

whenever is sufficiently small (depending on , and ).

Proposition 7 The expansion factor function satisfies:

- given , we can choose large (and sufficiently small) so that
where ;

- given , we can choose and such that and
for some (sufficiently large) constant and for all .

The proofs of these two propositions are given in the next two subsections and they are based on the study of perpendicular unstable Jacobi fields related to the variations of geodesics of the form , .

**2.3. The derivative of the roof function**

From now on, we fix (e.g., ) and, for the sake of simplicity, we will denote a geodesic corresponding to an initial vector by . Of course, there is no loss of generality here because of the rotational symmetry of the surface . Also, we will suppose that as the case is symmetric.

Note that the roof function is defined by the condition , or, equivalently,

where denotes the distance from a point to the cusp at and is the distance from to . By taking the derivative with respect to at and by recalling that , we obtain that

where . Since where , we have , and, *a fortiori*,

Let us compute the two inner products above. By definition of the parameter and the symmetry of the revolution surface , we have . Also, if we denote by the perpendicular (“unstable”) Jacobi field along the geodesic associated to the variation of (cf. Section 2 of this previous post here) with initial conditions and , then

From the computation of the inner products above and the fact that they add up to zero, we deduce that , that is,

In other terms, the previous equation says that the derivative can be controlled via the quantity measuring the growth of the perpendicular Jacobi field at the return time . Here, it is worth recalling that Jacobi fields are driven by Jacobi’s equation:

where is the Gaussian curvature of the surface of revolution at the point . Also, it is useful to keep in mind that Jacobi’s equation implies that the quantity satisfies Riccati’s equation

where .
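For the reader's convenience, here is the one-line computation behind the passage from Jacobi's equation to Riccati's equation. The symbols $j$ (size of the perpendicular Jacobi field), $K$ (Gaussian curvature along the geodesic) and $u = j'/j$ are notation assumed for this sketch, together with the standard sign convention $j'' + Kj = 0$; they are not quoted from the displayed formulas above.

```latex
u = \frac{j'}{j}
\quad\Longrightarrow\quad
u' = \frac{j''}{j} - \left(\frac{j'}{j}\right)^{2} = -K - u^{2},
\qquad \text{since } j'' = -K\, j .
```

In negative curvature this reads $u' = |K| - u^{2}$, which suggests why estimates on $\sqrt{|K|}$ are the natural input for controlling $u$.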

In the context of the surface of revolution , these equations are important tools because we have the following explicit formula for the Gaussian curvature at a point :

In particular, verifies .

Next, we take and we consider the following auxiliary function:

By definition, . Furthermore,

Since the equation (describing the motion of geodesics on ) implies that , we deduce from the previous inequality that

This estimate allows us to control the solution of Riccati’s equation along the following lines. The initial data of the Jacobi field is and . Hence,

In particular, there exists a well-defined maximal interval where for all . By plugging this estimate into Jacobi’s equation, we get that

for each .

By integrating this inequality (and using the initial condition ), we obtain that

Therefore,

If , we deduce that (as ). Otherwise, and . Since satisfies Riccati’s equation, we deduce from (5) that

at each time where . It follows that for all . Hence,

and, *a fortiori*,

In other words, we proved that

Now, the quantity can be estimated as follows. By differentiating Clairaut’s relation , we get

Since (as we are interested in small angles , large) and (thanks to the relation and the fact that and, thus, for small), we conclude that

for . Here, we used the fact that for . Therefore,

since . Also, the symmetry of the surface implies and, hence,

In summary, we have shown that , i.e.,

By putting together (4), (6) and (8), we conclude that

for some constant depending on and .

At this stage, we are ready to complete the proof of Proposition 6.

*Proof:* Let us estimate the Hölder constant . For this sake, we fix and we write

for some between and . Since and , it follows from (9) that

Because and are arbitrary points in , we have that

where is an appropriate constant.

Now, our assumption implies that we can choose sufficiently small so that . By doing so, we see from the previous estimate that

whenever , i.e., , is sufficiently small. This proves Proposition 6.

** 2.4. Some estimates for the expansion factors **

Similarly to the previous subsection, the proof of Proposition 7 uses the properties of Jacobi’s and Riccati’s equation to study

where is the scalar function (with and ) measuring the size of the perpendicular “unstable” Jacobi field along .

We begin by giving a lower bound on . Given , let us choose small so that

Of course, this choice of is possible because . Next, we consider the auxiliary function:

By definition, . Furthermore,

In particular,

Since (cf. the paragraph before (5)), we deduce from the previous estimate that

This inequality implies that the solution of Riccati’s equation satisfies for all . Indeed, the initial condition , says that and the inequality above tells us that

at any time where .

By integrating the estimate over the interval , we obtain that

i.e.,

For the sake of concreteness, let us set and let us restrict our attention to geodesics whose initial angle with the meridians of is sufficiently small so that . In this way, we have that (thanks to Jacobi’s equation and our initial conditions and ). Hence, the inequality above becomes

Next, we observe that can be bounded from below in a similar way to our derivation of a bound from above to in the previous subsection: in fact, by repeating the arguments appearing after (7) above, one can show that

and

where is an adequate (small) constant depending on , and .

By putting together the estimates above, we deduce that

where .

This inequality shows that

Thus, if , then we can choose small (with ) and large so that (our variant of) the one-step growth condition (3) holds. This proves the first part of Proposition 7.

Finally, we give an indication of the proof of the second part of Proposition 7 (i.e., the distortion bound (2)). We start by writing

and by noticing that

Next, we take the derivative with respect to of the previous expression. Here, we obtain several terms involving some quantities already estimated above via Jacobi’s and Riccati’s equation (such as , , etc.), but also a new quantity appears, namely, , i.e., the derivative with respect to of the family of solutions of Riccati’s equation along . Here, the “trick” to give bounds on is to differentiate Riccati’s equation

with respect to in order to get an ODE (in the time variable ) satisfied by . In this way, it is possible to see that one has reasonable bounds on as soon as the derivative of the square root of the absolute value of the Gaussian curvature. Here, can be bounded by recalling that we have an explicit formula

for the Gaussian curvature. By following these lines, one can prove that, for a given , the distortion bound

holds whenever is taken sufficiently small. In other words, by taking , we have .

Note that the estimate in the previous paragraph gives the desired distortion bounds (2) once we show that can be selected such that . In order to check this, it suffices to recall that can be taken arbitrarily small (cf. the proof of the first part of Proposition 7), i.e., . So,

and

Since for , it follows that for adequate choices of and . This completes our sketch of proof of the second part of Proposition 7.


Theorem 1 (Burns-Masur-Wilkinson) Let be the quotient of a contractible, negatively curved, possibly incomplete, Riemannian manifold by a subgroup of isometries of acting freely and properly discontinuously. Denote by the metric completion of and the boundary of . Suppose that:

- (I) the universal cover of is *geodesically convex*, i.e., for every , there exists a unique geodesic segment in connecting and .
- (II) the metric completion of is *compact*.
- (III) the boundary is *volumetrically cusplike*, i.e., for some constants and , the volume of a -neighborhood of the boundary satisfies for every .
- (IV) has *polynomially controlled curvature*, i.e., there are constants and such that the curvature tensor of and its first two derivatives satisfy the following polynomial bound for every .
- (V) has *polynomially controlled injectivity radius*, i.e., there are constants and such that for every (where denotes the injectivity radius at ).
- (VI) The *first derivative of the geodesic flow* is *polynomially controlled*, i.e., there are constants and such that, for every infinite geodesic on and every :

Then, the Liouville (volume) measure of is finite, the geodesic flow on the unit cotangent bundle of is defined at -almost every point for all time , and the geodesic flow is *non-uniformly hyperbolic* (in the sense of Pesin’s theory) and *ergodic*.

Actually, the geodesic flow is Bernoulli and, furthermore, its metric entropy is positive, finite and is given by Pesin’s entropy formula (i.e., it is equal to the sum of positive Lyapunov exponents of counted with multiplicities).

More precisely, we proved in the previous post of this series that a geodesic flow satisfying the assumptions (II), (III) and (VI) above is non-uniformly hyperbolic with respect to the volume probability measure, and, furthermore, we identified the Oseledets stable and unstable subspaces (cf. the last theorem of this post):

Theorem 2 Under the assumptions (II), (III) and (VI) in Theorem 1 above, the geodesic flow is *non-uniformly hyperbolic*: more concretely, there exists a subset of full -measure such that the -invariant splitting

into the flow direction and the spaces and of *stable and unstable Jacobi fields* along have the property that

for all and .

Today, we want to exploit the non-uniform hyperbolicity of (and the assumptions (I) to (VI) above) in order to deduce the ergodicity of via Hopf’s argument.

For this sake, we organize this post as follows. In the first section, we discuss the geometry of stable and unstable manifolds of : in particular, we will see that these invariant manifolds form *global* laminations with useful *absolute continuity* properties. After that, we describe Hopf’s argument in the second section: from the nice properties of the invariant laminations, we deduce that Birkhoff averages are constant almost everywhere, and, hence, is ergodic. Finally, we conclude this post with a remark (inspired by conversations with Y. Coudène and B. Hasselblatt last November 2013) about the deduction of the *mixing* property for from Hopf’s argument.

**1. Stable manifolds of certain geodesic flows **

** 1.1. Local (Pesin) stable manifolds for certain geodesic flows **

We begin by noticing that a geodesic flow satisfying the assumptions (I) to (VI) of Theorem 1 has “nice” local (Pesin) stable and unstable manifolds through almost every point.

The reader with some experience with non-uniformly hyperbolic systems might think that this is an immediate consequence of the so-called Pesin’s theory. However, this is *not* the case in our setting because the phase space of is *not* assumed to be compact. In other words, we are facing a dilemma: while the non-compactness of is an important point for the applications of Theorem 1 (to moduli spaces equipped with WP metrics), it forbids a naive utilization of Pesin’s theory because of the competition between the dynamical behaviors of in compact regions of and near “infinity” .

Fortunately, Katok and Strelcyn (with the aid of Ledrappier and Przytycki) developed a *generalization* of Pesin’s theory where any “well-behaved” dynamics on non-compact phase space is allowed. Furthermore, Katok-Strelcyn successfully applied their version of Pesin’s theory to the study of dynamical billiards.

Very roughly speaking, Katok-Strelcyn say that if the dynamics of the non-uniformly hyperbolic system “blows up at most polynomially” at infinity , then the hyperbolic (exponential) behavior of is strong enough so that Pesin’s theory can be applied (because is “essentially compact” for practical purposes).

Evidently, this is much easier said than done, and, unfortunately, the discussion of the details of Katok-Strelcyn’s generalization of Pesin’s theory is out of the scope of this post. In particular, we will content ourselves with just mentioning that the conditions (I) to (VI) in Theorem 1 were set up by Burns-Masur-Wilkinson in such a way that a geodesic flow satisfying (I) to (VI) also verifies all the requirements to apply Katok-Strelcyn’s work. Here, even though this is philosophically natural, it is worth pointing out that the deduction of the conditions to use Katok-Strelcyn’s technology from (I) to (VI) is *far from trivial*: indeed, Burns-Masur-Wilkinson do this after studying (in Appendices A and B of their paper) several properties of the Sasaki metric and properties of .

In summary, Burns-Masur-Wilkinson use (I) to (VI) to ensure that Katok-Strelcyn’s generalization of Pesin’s theory applies in the setting of Theorem 1. As a by-product, they deduce the following statement about the existence and absolute continuity of local (Pesin) stable manifolds (cf. Proposition 3.10 of Burns-Masur-Wilkinson paper).

Theorem 3 (“Pesin stable manifold theorem”) Let be the geodesic flow on the unit tangent bundle of a -dimensional Riemannian manifold satisfying the conditions (I) to (VI) of Theorem 1. Denote by the subset of full volume provided by Theorem 2 where is non-uniformly hyperbolic. Then, there exists a subset of full volume, a *measurable* function , and a measurable family

of smooth () embedded disks with the following properties. For all :

- , i.e., is tangent to ;
- for all , i.e., is topologically contracted in forward time by ;
- if and only if and , i.e., is a local stable manifold (in the sense that it is dynamically characterized as the set of close to whose forward -orbit approaches the forward -orbit of ).

Moreover, the family is absolutely continuous in the sense that the following “Fubini-like statements” hold.

- given a subset of zero volume, one has that the set has zero measure in (with respect to the induced -dimensional Lebesgue measure on ) for almost every ;
- given a -embedded -dimensional open disk and a subset of zero measure (for the induced Lebesgue measure of ), the set
(obtained by saturating by the local stable manifolds passing through it) has zero volume in .

Finally, the analogous assertions about unstable manifolds are also true.

** 1.2. Global stable manifolds of certain geodesic flows **

The Pesin stable and unstable laminations provided by Theorem 3 are *not* sufficient to run Hopf’s argument: as it was explained in the first post of this series, the local stable manifolds could be a priori very *short* (because their radii vary only *measurably* with and so one does not expect uniform lower bounds on ).

Hence, it is important (for our purposes of using Hopf’s argument) to compare Pesin’s local stable manifolds with *global* objects. Here, the key point is to observe that Theorem 2 says that the tangent space of at is exactly the vector space of *stable Jacobi fields* along the geodesic and, as we will recall in a moment, stable Jacobi fields are naturally related to global objects called *stable horospheres*.

**1.2.1. Stable Jacobi fields and stable horospheres**

Let be a Riemannian manifold. Given a unit tangent vector generating a geodesic ray such that the sectional curvatures of are negative along and , let us denote by the *stable Jacobi field* associated to : by definition, this is the Jacobi field

where is the Jacobi field satisfying and .

In terms of the description of Jacobi fields via variations of geodesics, the stable Jacobi fields along are obtained by varying through geodesics such that for all (that is, stays always close to in *forward time*). These geodesics are *orthogonal* to a family of immersed hypersurfaces of whose lifts to the universal cover of are the so-called *stable horospheres*.

The stable horospheres can be constructed “by hand” with the aid of Busemann functions as follows.

Let be the quotient of a contractible, negatively curved, Riemannian manifold by a subgroup of isometries of acting freely and properly discontinuously and suppose that the universal cover of is geodesically convex (i.e., satisfies item (I) of Theorem 1).

In this situation, it is possible to show (see, e.g., Proposition 3.5 in the Burns-Masur-Wilkinson paper) that given a unit vector generating an infinite geodesic ray , the functions given by

converge (uniformly on compact sets) as to a convex function

called *stable Busemann function* such that and, for every , the unit vector defines an infinite geodesic ray with

for all . In particular, the geodesics give variations of leading to stable Jacobi fields.
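To make the construction concrete, here is a minimal numerical sketch in the simplest model case, which is an assumption of this example and not the general setting of the post: the hyperbolic upper half-plane with the vertical geodesic $\gamma(t) = i e^{t}$. There the functions $x \mapsto d(x, \gamma(t)) - t$ converge to $-\log(\operatorname{Im} x)$ as $t \to \infty$, so the level sets (horocycles) are horizontal lines. The function names below are hypothetical.

```python
import math

def hyperbolic_distance(z, w):
    """Distance between two points of the upper half-plane model."""
    dx = z.real - w.real
    dy = z.imag - w.imag
    return math.acosh(1.0 + (dx * dx + dy * dy) / (2.0 * z.imag * w.imag))

def busemann_approx(z, t):
    """Finite-time approximation d(z, gamma(t)) - t along gamma(t) = i*e^t."""
    return hyperbolic_distance(z, complex(0.0, math.exp(t))) - t

def busemann_limit(z):
    """Closed-form limit for the vertical geodesic through i."""
    return -math.log(z.imag)

if __name__ == "__main__":
    z = complex(0.3, 2.0)
    for t in (1.0, 5.0, 20.0):
        print(t, busemann_approx(z, t), busemann_limit(z))
```

The limit vanishes at the base point $i$ and depends only on the height of the point, in agreement with the picture of horocycles as horizontal lines.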

For each , the level set is a connected, complete, codimension submanifold of called *stable horosphere of level *. By definition, the geodesics are orthogonal to the -parameter family of stable horospheres (because stable horospheres are level sets of and the geodesics point in the direction of the gradient).

The submanifold

of consisting of unit vectors that are orthogonal to the stable horosphere of level is called the *(global) stable manifold* of . This nomenclature is justified by the following property (corresponding to Proposition 3.6 in the Burns-Masur-Wilkinson paper). In the context of Theorem 1, suppose that the infinite geodesic ray projects to a *forward recurrent* geodesic on (i.e., *after* projection to , the unit vector becomes an accumulation point of the set ). Then, for any , the unit vector is tangent to an infinite geodesic ray such that

Furthermore, as . In particular, (stable manifolds are -invariant) and for all (stable manifolds are dynamically characterized by future orbits getting close together).

Remark 1 As usual, by reversing the time (via the symmetry ), one can define unstable Jacobi fields, unstable Busemann functions and unstable horospheres.

The following picture (that we already encountered in the last post while discussing Jacobi fields) illustrates the stable and unstable horospheres associated to the vertical geodesic in the hyperbolic plane passing through .

**1.2.2. Geometry of the stable and unstable horospheres**

In this subsection, we make a couple of comments on the geometry of stable and unstable horospheres. More precisely, besides explaining the computation of their second fundamental forms from matrix Riccati equations, we will see that the stable and unstable horospheres are mutually transverse in a quantitative way. Of course, this transversality property of horospheres is another important point in Hopf’s argument (as it allows to control the angle between stable and unstable manifolds).

Let be a geodesic ray such that the sectional curvatures of along are negative. For each , let us denote by the unstable Jacobi field along with (as usual).

Consider the -parameter family of matrices (linear operators) defined by the formula

As we mentioned in this post here, are symmetric, positive-definite operators satisfying the matrix Riccati equation

(i.e., for all ).
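As a sanity check in the simplest case, assume constant curvature $-1$ in dimension two (the hyperbolic plane), where the Riccati equation reduces to the scalar ODE $u' = 1 - u^{2}$. The sketch below, with hypothetical function names and a hand-written Runge-Kutta scheme, integrates it from $u(0) = 0$. The exact solution is $\tanh(t)$, which converges to $1$, the geodesic curvature of horocycles.

```python
import math

def riccati_rhs(u, curvature=-1.0):
    """Right-hand side of the scalar Riccati equation u' = -K - u^2."""
    return -curvature - u * u

def integrate_riccati(u0, t_final, steps=1000, curvature=-1.0):
    """Integrate u' = -K - u^2 on [0, t_final] with the classical RK4 scheme."""
    h = t_final / steps
    u = u0
    for _ in range(steps):
        k1 = riccati_rhs(u, curvature)
        k2 = riccati_rhs(u + 0.5 * h * k1, curvature)
        k3 = riccati_rhs(u + 0.5 * h * k2, curvature)
        k4 = riccati_rhs(u + h * k3, curvature)
        u += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return u

if __name__ == "__main__":
    for t in (1.0, 3.0, 10.0):
        print(t, integrate_riccati(0.0, t), math.tanh(t))
```

The constant-curvature assumption is only for illustration: in the variable-curvature setting of the post no closed formula is available, and one works with comparison estimates instead.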

It is possible to show (cf. Eberlein’s survey) that the operator is precisely the second fundamental form at of the unstable horosphere of level .

By reversing the time, we have an analogous operator related to stable horospheres.

Note that, by definition, the stable and unstable subspaces and at a unit vector defining an *infinite* geodesic ray are

In other terms, we have a -invariant splitting

over the set

(where ).

Let us now show that this splitting is locally uniform over .

Proposition 4 There exists a continuous function such that the continuous family of conefields

and

meeting only at the origin have the property that

for all .

*Proof:* Our task consists in showing that the functions

of are locally uniformly bounded away from zero.

By symmetry, it suffices to prove that is locally uniformly bounded from below. For the sake of reaching a contradiction, suppose this is not the case. This means that there are sequences , with such that , and .

For each , let be the stable Jacobi fields along induced by , and denote by the (limit) Jacobi field along induced by .

On one hand, for each , the square of the norm of the stable Jacobi field is a decreasing function of . In particular, since , we deduce that is a non-increasing function of .

On the other hand, is a strictly convex function of (because is a perpendicular Jacobi field, cf. Eberlein’s survey).

By putting these two facts together, we see that the function has no critical points. However, . This contradiction proves the desired proposition.

**1.2.3. Absolute continuity of global stable manifolds**

Once we have related Pesin’s stable and unstable manifolds (local objects) to stable and unstable horospheres (global objects), it is not entirely surprising that the absolute continuity properties of Pesin stable manifolds (described in Theorem 3 above) can be “transferred” to horospherical laminations:

Proposition 5 Let be the geodesic flow on the unit tangent bundle of a -dimensional Riemannian manifold satisfying the conditions (I) to (VI) of Theorem 1. Denote by the subset of the unit tangent bundle of the universal cover of consisting of unit vectors projecting into a forward and backward recurrent geodesic in . Then, there exists a subset of full volume such that the stable Busemann functions are for all . Moreover, the leaves of the stable lamination are -submanifolds of diffeomorphic to . Furthermore, the stable horospherical lamination

obtained by taking the family of manifolds through the vectors in the projection of to (via ) has the following absolute continuity properties:

- if has zero -volume, then for -almost every and any , the set has zero -dimensional volume in ;
- if is a smooth, embedded, -dimensional open disk and has zero -dimensional volume in , then for any one has where
is the set obtained by saturating with the leaves of the lamination .

Finally, a similar statement holds for the corresponding unstable lamination.

Logically, the statement of this proposition is very close to Theorem 3 about the absolute continuity of Pesin stable manifolds, but the crucial point is that we have now an absolutely continuous stable lamination whose leaves have radii essentially equal to . In other words, the leaves of the stable lamination have a size controlled by the injectivity radius of , a global smooth function, instead of the *a priori* merely measurable function giving the radii of leaves of Pesin’s stable lamination .

The proof of Proposition 5 is not very difficult: it uses the absolute continuity properties of Pesin’s lamination in Theorem 3 and the “contraction of stable horospheres” (i.e., the fact that the forward dynamics of eventually contracts inside ), and it occupies two pages in Burns-Masur-Wilkinson paper (cf. the proof of their Proposition 3.11). However, we will skip this point in favor of discussing Hopf’s argument right now.

**2. Proof of Theorem 1 via Hopf’s argument**

Let be a geodesic flow satisfying the assumptions (I) to (VI) of Theorem 1. We want to show that is ergodic with respect to the volume measure (with normalized total mass).

By Birkhoff’s ergodic theorem, given a continuous function with compact support, the Birkhoff ergodic averages

converge as to the same limit for -almost every .

By definition of ergodicity, our task consists in showing that the function is constant -almost everywhere.
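The statement that time averages are almost everywhere constant can be illustrated on a toy example. The sketch below replaces the geodesic flow by a discrete-time stand-in, an irrational rotation of the circle, which is ergodic for Lebesgue measure; this substitution is an assumption made purely for illustration, and the names are hypothetical. For the observable $\cos(2\pi x)$, whose space average is $0$, the Birkhoff averages approach $0$ from every starting point tested.

```python
import math

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0  # irrational rotation angle (golden mean)

def rotation(x):
    """Circle rotation x -> x + ALPHA (mod 1)."""
    return (x + ALPHA) % 1.0

def observable(x):
    """Continuous observable with zero mean on the circle."""
    return math.cos(2.0 * math.pi * x)

def birkhoff_average(f, transformation, x0, n):
    """Average of f along the first n points of the orbit of x0."""
    total = 0.0
    x = x0
    for _ in range(n):
        total += f(x)
        x = transformation(x)
    return total / n

if __name__ == "__main__":
    for x0 in (0.0, 0.37, 0.9):
        print(x0, birkhoff_average(observable, rotation, x0, 20000))
```

A rotation has no hyperbolicity at all, so this only illustrates the conclusion (constant time averages), not Hopf's mechanism for obtaining it.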

For this sake, let us define the measurable functions

and

Note that, by Birkhoff’s ergodic theorem, there exists a subset of full -measure such that

Moreover, from their definitions, note that the functions , and are -invariant.

The initial observation in Hopf’s argument is the fact that the function , resp. , is *constant* along the stable manifolds , resp. unstable manifolds . In fact, this follows easily from the uniform continuity of the (compactly supported, continuous) function and the fact that as (resp. ) whenever (resp. ).

The basic strategy of Hopf’s argument can be summarized as follows. We want to combine this initial observation with the absolute continuity properties of the stable and unstable horospherical laminations to deduce that is “*locally ergodic*” in the sense that *every* possesses a neighborhood such that the restriction to is -almost everywhere constant.

Of course, since is connected, this local ergodicity property implies that the function is constant -almost everywhere, and, *a fortiori*, is ergodic with respect to . In other terms, our task is reduced to prove the local ergodicity property stated in the previous paragraph.

In this direction, we fix once and for all , we set

and we denote by the -neighborhood of .

Let be the full -volume subset constructed in Proposition 5. For each , we consider the stable leaf , we take its iterates under for , and we saturate the resulting subset with the leaves of the unstable horospherical lamination to obtain the subset

The construction of is illustrated in the figure below: the subset is marked in blue and some leaves of passing through points of are marked in red.

The local ergodicity property stated above is an immediate consequence of the following two claims:

- (a) the restriction of the function to is almost everywhere constant for almost every choice of ;
- (b) is essentially open for almost every near in the sense that there exists a neighborhood of such that has full volume in for almost every choice of .

We establish the first claim (a) by exploiting the initial observation that Birkhoff averages are constant along stable and unstable manifolds and the absolute continuity properties of the stable and unstable horospherical laminations.

More precisely, let us consider again the full volume subset of where (provided by Birkhoff’s ergodic theorem).

By the absolute continuity property of (cf. the first item of the conclusion of Proposition 5), for almost every , the intersection has full volume in . We claim that is almost everywhere constant for any such .

In fact, takes a constant value on . Moreover, since on , we also have that takes the constant value on . By combining this fact with the -invariance of , we deduce that takes the constant value on . Furthermore, by putting together this fact with the initial observation that is constant along unstable manifolds , we obtain that takes the constant value on .

Note that, by assumption, is a full volume subset of . Since is a -flow, it follows that is a full volume subset of the -dimensional smooth submanifold . Therefore, from the absolute continuity property of (cf. the second item of conclusion of Proposition 5), we conclude that is a full volume subset of . In particular, we have that takes the constant value on the full volume subset of . Because on , we get that takes the constant value on the full volume subset of , i.e., is almost everywhere constant. This completes the proof of the claim (a).

Remark 2 The reader is encouraged to interpret this argument in the light of Figure 2 in order to get a clear picture of the roles of the subsets , and .

We establish now the second claim (b) from the absolute continuity properties of the horospherical laminations and the local uniform transversality of the stable and unstable manifolds.

More concretely, from the absolute continuity property in the first item of the conclusion of Proposition 5, we have that the stable disk , resp. unstable disk , is almost everywhere tangent to the stable direction , resp. unstable direction , for almost every . Since the stable and unstable directions and are contained in the *continuous* families of cones and from Proposition 4, we have that , resp. , is *everywhere* tangent to , resp. for almost every .

In particular, from the -invariance of the stable lamination , we see that the -dimensional disk is everywhere tangent to for almost every . Since the continuous conefields and meet only at the origin (cf. Proposition 4), that is, they are locally uniformly transverse, we conclude that there exists a neighborhood of such that

for almost any . In other words, intersects in a full volume subset. This completes the proof of claim (b).

This concludes our discussion of Hopf’s argument (namely, the derivation of claims (a) and (b)) for the ergodicity of .

Closing this post, let us say a few words about the mixing and Bernoulli properties in the statement of Theorem 1. In the Burns-Masur-Wilkinson paper, these properties are deduced from general results of Katok saying that if a *contact* flow is non-uniformly hyperbolic and ergodic, then it is Bernoulli (and, in particular, mixing).

Nevertheless, as it was brought to my attention by B. Hasselblatt and Y. Coudène, the Hopf argument above can be slightly adapted in certain contexts to give mixing and/or mixing of all orders. For example, concerning the mixing property, Y. Coudène, B. Hasselblatt and S. Troubetzkoy showed (in Theorem 3.3) in this recent preprint here that if any -function saturated by stable and unstable sets (in the sense that there is a full measure subset such that whenever and or ) is almost everywhere constant, then the dynamical system is mixing. Also, they have a similar criterion for multiple mixing, and, furthermore, they discuss a couple of non-trivial examples of applications of their criteria.

In the context of Theorem 1, we can deduce the mixing property for from the result of Coudène-Hasselblatt-Troubetzkoy. Indeed, the argument used in the proof of the claim (a) above (during the discussion of Hopf’s argument) also shows that any -function saturated by stable and unstable sets (such as ) is almost everywhere constant, so that Coudène-Hasselblatt-Troubetzkoy mixing criterion “à la Hopf” applies in this setting.


Of course, there are several ways to get around this little technical subtlety (from the dynamical point of view) in the definition of the Kontsevich-Zorich cocycle and this is the main purpose of this post. Evidently, the content of this post is well-known (especially among experts), but I hope that this post will benefit the reader with some background in Dynamical Systems wishing to know the answer to the following question:

*Does the Kontsevich-Zorich cocycle (as it is classically defined) qualify as a genuine example of a linear cocycle in the usual sense in Dynamical Systems?*

**Disclaimer.** Even though this post benefited from my conversations with Jean-Christophe Yoccoz, all errors and mistakes below are my sole responsibility.

**1. The Kontsevich-Zorich cocycle **

The basic references for this section are G. Forni’s paper, J.-C. Yoccoz’s survey and/or this blog post here (where the reader can find some figures illustrating the notions discussed below).

Let be a fixed compact orientable topological surface of genus , let be a non-empty finite set of points of cardinality and let be a list of “ramification indices” such that .

Recall that a *translation surface structure* on is a maximal atlas of charts on such that all changes of coordinates are translations in and, for each , there are neighborhoods , and a ramified cover of degree such that every injective restriction of is a chart of the maximal atlas .

Remark 1 Equivalently, we can think of translation structures as the data of a Riemann surface structure on together with an Abelian differential (holomorphic one-form) possessing zeroes of orders at for . However, for the purposes of this post, we will not need this alternative point of view.

Remark 2 Since the usual Euclidean area form on is translation invariant, it makes sense to talk about the total area of a translation structure. From now on, we will always implicitly assume that our translation structures are normalized, i.e., they have unit total area. Here, it is worth pointing out that this normalization is not important for the definition of Teichmüller and moduli spaces, but it is important for the discussion of the dynamics of the Teichmüller flow on moduli spaces.

We denote by the group of orientation-preserving homeomorphisms of fixing (pointwise), by the connected component in of the identity element, and by the *mapping class group* (sometimes also called modular group).

Note that the group acts (by *pre-composition*) on the set of translation surfaces: given and a translation surface structure on , we get a translation structure by defining .

In this setting, the *Teichmüller space* is the quotient of the set of translation structures on by the action of and the *moduli space* is the quotient of the set of translation structures on by the action of . By definition, the moduli space is the quotient of Teichmüller space by the action of the mapping class group .

Remark 3 The Teichmüller space is a manifold, but the moduli space is an orbifold (not a manifold) in general. We will come back to this point later in this post.

The group acts (by *post-composition*) on the Teichmüller space : given and a translation structure , we define (note that this action is well-defined because the conjugation of a translation in by the linear action of the matrix is still a translation). Furthermore, since acts by pre-composition and acts by post-composition, these actions commute and, hence, the action of descends to the moduli space .
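The two facts used in the previous paragraph can be written out in one line each. In the sketch below, the symbols are notation assumed for illustration only: $\zeta$ stands for a chart of a translation structure, $h$ for a homeomorphism acting by pre-composition, $A$ for a matrix in $SL(2,\mathbb{R})$ acting by post-composition, and $\tau_v$ for the translation by a vector $v \in \mathbb{R}^2$.

```latex
% conjugating a translation by a linear map gives a translation,
% so post-composing all charts with A yields again a translation atlas
A \circ \tau_v \circ A^{-1}(x) = A\,(A^{-1}x + v) = x + Av = \tau_{Av}(x),

% pre- and post-composition commute by associativity of composition
(A \circ \zeta) \circ h = A \circ (\zeta \circ h).
```

The first identity is the well-definedness of the action on charts, and the second is the reason why the action descends to the moduli space.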

The action of the diagonal subgroup of on Teichmüller and moduli spaces is called the *Teichmüller flow*.

The dynamics of the Teichmüller flow and/or -action on moduli spaces of (normalized) translation surfaces is a rich subject with interesting applications to the Ergodic Theory of some parabolic systems (such as interval exchange transformations and billiards in rational tables): see, for example, these posts here and here for more details.

A main character in the investigation of the -action on moduli spaces of (normalized) translation surfaces is the so-called *Kontsevich-Zorich cocycle*. Very roughly speaking, this cocycle was introduced by Kontsevich and Zorich as a practical way to extract the “interesting part” of the derivative cocycle of the Teichmüller flow.

Formally, the Kontsevich-Zorich (KZ) cocycle is usually defined as follows (compare with Forni’s paper). Let be the *vector bundle* over Teichmüller space whose fibers are the absolute homology group with real coefficients. One usually refers to as the *Hodge bundle* over .

Remark 4. The reader with some background in Complex Geometry might have thought that this notion is very similar to the Hodge bundle over Teichmüller and moduli spaces of algebraic curves (Riemann surfaces) obtained by attaching the space of holomorphic -forms to a Riemann surface . This is no coincidence, and the nomenclature “Hodge bundle” for is a “popular” abuse of notation in the literature about the Teichmüller flow. In fact, this abuse of notation goes beyond this: one could also construct (trivial) bundles over Teichmüller spaces by taking the fibers to be the absolute homology group with complex coefficients or the absolute cohomology group or with real or complex coefficients. These variants are closely related to each other (because and the first absolute homology and cohomology groups of a surface are dual [by Poincaré duality]) and they are also called Hodge bundle in the literature (depending on the author’s taste).

The vector bundle is *well-defined* and *trivial*, i.e., : in a nutshell, this is a consequence of the fact that a homeomorphism that is isotopic to the identity (such as the elements of ) acts trivially on homology.

By taking the quotient of by the natural action of the mapping class group on *both factors*, we get the so-called *Hodge bundle*

over the moduli space .

In this context, the (trivial) cocycle

over the Teichmüller flow on Teichmüller space given by

for and descends to the so-called *Kontsevich-Zorich cocycle* on the Hodge bundle over moduli space (by taking the quotient by the action of ). Here, it is worth observing that the Kontsevich-Zorich cocycle is well-defined (i.e., we can take this quotient) because acts by pre-composition and acts by post-composition on Teichmüller spaces (so that these actions commute).

Remark 5. The Kontsevich-Zorich cocycle could also be defined more generally by taking the quotient of the trivial cocycle over the action of the full group (and not only ) on Teichmüller space.

**2. Is the KZ cocycle a linear cocycle? **

The reader with some familiarity with Dynamical Systems might have noticed some similarities between the notions of Kontsevich-Zorich cocycle and of linear cocycle over a (discrete or continuous time) dynamical system.

In fact, let us recall that a *linear cocycle* , , over a flow , , is a flow on a vector bundle such that (i.e., projects onto ) and is a vector bundle automorphism, i.e., for all , the restriction of to is a linear map from the fiber to the fiber .
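To make the definition concrete, here is a minimal sketch of the discrete-time analogue. It assumes a toy base map `f` on a finite set and a toy matrix-valued function `generator`, both hypothetical and unrelated to the KZ cocycle. The sketch builds the iterated products and checks the cocycle identity, which is exactly the statement that the fiber maps compose along the orbit.

```python
IDENTITY = ((1, 0), (0, 1))


def mat_mul(A, B):
    """Product of two 2x2 integer matrices given as tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def f(x):
    """Toy base dynamics: the shift x -> x + 1 on Z/5Z (an assumption)."""
    return (x + 1) % 5


def iterate(n, x):
    """The n-th iterate f^n(x)."""
    for _ in range(n):
        x = f(x)
    return x


def generator(x):
    """Toy matrix of determinant 1 attached to the base point x."""
    return ((1, x), (0, 1)) if x % 2 == 0 else ((1, 0), (x, 1))


def cocycle(n, x):
    """A^(n)(x) = A(f^(n-1) x) ... A(f x) A(x), a linear map between fibers."""
    M = IDENTITY
    for _ in range(n):
        M = mat_mul(generator(x), M)
        x = f(x)
    return M


def cocycle_identity_holds(m, n, x):
    """Check A^(m+n)(x) = A^(m)(f^n x) A^(n)(x)."""
    return cocycle(m + n, x) == mat_mul(cocycle(m, iterate(n, x)), cocycle(n, x))


print(all(cocycle_identity_holds(m, n, x)
          for m in range(4) for n in range(4) for x in range(5)))
```

The continuous-time definition in the text replaces the integer `n` by a real time parameter, but the identity being checked has the same shape.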

Example 1. The trivial cocycle on the trivial bundle over a flow is . In particular, the cocycle on defined above is an example of a trivial cocycle.

Example 2. The derivative maps of smooth flows on smooth manifolds form an important class of examples of linear cocycles.

Given that the Kontsevich-Zorich cocycle on moduli spaces projects to the Teichmüller flow on moduli spaces and it acts on the fibers of via the (symplectic) action on homology of the elements of the mapping class group , one might be tempted to qualify the Kontsevich-Zorich cocycle as a linear cocycle.

*However*, a closer inspection of the definitions reveals that:

The Kontsevich-Zorich cocycle is not *always* a linear cocycle!

Actually, the fact that the KZ cocycle is not a linear cocycle in general is *not* its fault: in order to talk about *linear* cocycles one needs *vector bundles*, and, as it turns out, the fibers of the Hodge bundle over moduli space are *not* vector spaces over the orbifold points of moduli spaces.

More precisely, we see from the definition that the fiber of at a translation surface is the quotient where is the group of automorphisms of , that is, the group of homeomorphisms of fixing pointwise whose local expressions in the charts are translations of .

Note that is a finite group: for instance, any element of is holomorphic (with respect to the Riemann surface structure underlying ) and, hence, by Hurwitz’s automorphisms theorem, we have that has cardinality at most $84(g-1)$, where $g\geq 2$ is the genus of the surface. (Actually, even though Hurwitz’s theorem is sharp, this estimate of is not optimal: see, e.g., this paper of Schlage-Puchta and Weitze-Schmithuesen.)

Therefore, the fiber of at is not very far from a vector space: it differs from by the quotient by (the action on homology of) the finite group (of “symplectic matrices”).

Nevertheless, when the translation surface is an *orbifold* point of moduli space (i.e., when is non-trivial), the fiber is not necessarily a vector space. (A simple concrete example of such a situation is the cone obtained from the quotient of by the finite group generated by the rotation of angle )

In summary, the KZ cocycle is not always a linear cocycle because the Hodge bundle over moduli space is not always a vector bundle.

In other terms, the moduli space is an orbifold (but not a manifold in general), the Hodge bundle is an *orbifold vector bundle* (in general) and, thus, the KZ cocycle is an *orbifold linear cocycle* (in general).

Example 3. One of my favorite examples of translation surface with a non-trivial group of automorphisms is the so-called Eierlegende Wollmilchsau. A concrete description of the Eierlegende Wollmilchsau is the following. We consider the quaternion group , we take a unit square in for each , and we glue (by translation) the vertical rightmost side of to the vertical leftmost side of and we glue (by translation) the horizontal top side of to the horizontal bottom side of . In this way, one obtains a translation surface where has genus , consists of four points and .

A simple argument (see, e.g., this paper here) shows that the group of automorphisms of the Eierlegende Wollmilchsau is isomorphic to the quaternion group .
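The square-tiled description of the Eierlegende Wollmilchsau can be explored by machine. The sketch below rests on assumptions that should be read as such: the eight squares are labelled by the elements of the quaternion group, the right side of the square labelled `g` is glued to the left side of the square labelled `g·i`, and the top side of `g` is glued to the bottom side of `g·j` (this is one standard convention; the symbols elided in the text above may differ). It also uses the usual fact about square-tiled surfaces that the cycles of the commutator of the two gluing permutations correspond to the corners, a cycle of length n giving a point of total angle 2πn.

```python
# Products of the quaternion units: TABLE[(a, b)] = (sign, unit) with a*b = sign*unit.
TABLE = {
    ("1", "1"): (1, "1"), ("1", "i"): (1, "i"), ("1", "j"): (1, "j"), ("1", "k"): (1, "k"),
    ("i", "1"): (1, "i"), ("i", "i"): (-1, "1"), ("i", "j"): (1, "k"), ("i", "k"): (-1, "j"),
    ("j", "1"): (1, "j"), ("j", "i"): (-1, "k"), ("j", "j"): (-1, "1"), ("j", "k"): (1, "i"),
    ("k", "1"): (1, "k"), ("k", "i"): (1, "j"), ("k", "j"): (-1, "i"), ("k", "k"): (-1, "1"),
}


def mul(g, h):
    """Multiply two quaternion group elements written as (sign, unit)."""
    sign, unit = TABLE[(g[1], h[1])]
    return (g[0] * h[0] * sign, unit)


Q = [(s, u) for s in (1, -1) for u in "1ijk"]

# Assumed gluing rule: right neighbour of g is g*i, top neighbour of g is g*j.
right = {g: mul(g, (1, "i")) for g in Q}
top = {g: mul(g, (1, "j")) for g in Q}
right_inv = {h: g for g, h in right.items()}
top_inv = {h: g for g, h in top.items()}

# Commutator: go right, then up, then left, then down.
commutator = {g: top_inv[right_inv[top[right[g]]]] for g in Q}


def cycles(perm):
    """Decompose a permutation (given as a dict) into cycles."""
    seen, result = set(), []
    for start in perm:
        if start in seen:
            continue
        cycle, g = [], start
        while g not in seen:
            seen.add(g)
            cycle.append(g)
            g = perm[g]
        result.append(cycle)
    return result


# A commutator cycle of length n gives a zero of order n - 1 of the Abelian differential.
orders = sorted(len(c) - 1 for c in cycles(commutator) if len(c) > 1)
genus = 1 + sum(orders) // 2  # from the relation: sum of orders = 2 * genus - 2
print(orders, genus)
```

Under these assumptions the commutator is the multiplication by −1, so the sketch finds four singular points of total angle 4π each. Note also that left multiplication by a fixed element of the group commutes with both gluing permutations, which is one way to see the quaternion group acting by automorphisms.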

Example 4. Some moduli spaces are manifolds and the corresponding Hodge bundles are vector bundles. For instance, the so-called minimal stratum of translation surfaces on a genus surface with a single marked point is a manifold because it can be shown (see, e.g., Proposition 2.4 in this paper here) that the automorphism group of any translation surface is trivial.

**3. Dynamics of the KZ cocycle? **

From the point of view of Topology and Algebraic Geometry, the “orbifoldic” nature of KZ cocycle is not surprising. Indeed, this kind of object is very common when studying monodromy representations and, also, one can overcome the “orbifoldic” nature of KZ cocycle by taking *covers* of moduli spaces in order to “kill” orbifold points. In particular, an orbifold linear cocycle is as good as a linear cocycle for topological considerations.

On the other hand, for *dynamical* considerations, the classical definition of KZ cocycle as an orbifold linear cocycle deserves further discussion.

For example, given an ergodic Teichmüller flow invariant probability measure on , it is desirable to apply Oseledets theorem to KZ cocycle in order to talk about its Lyapunov exponents and/or Oseledets subspaces. However, the Oseledets theorem deals only with linear cocycles and, thus, the fact that the KZ cocycle is merely an orbifold linear cocycle, or, more precisely, the fibers of Hodge bundle are not vector spaces, imposes some technical difficulties.

Remark 6. For ergodic-theoretical purposes, the technical issue raised above only shows up when -almost every translation surface is an orbifold point. In particular, the discussion in this section does not concern the so-called Masur-Veech probability measures on moduli spaces (because their generic points have trivial group of automorphisms). This explains why the orbifoldic nature of the KZ cocycle is never discussed in earlier papers in the literature (such as Forni’s paper): in those papers, the authors were concerned exclusively with the behavior of almost every trajectory with respect to Masur-Veech measures.

Remark 7. The orbifoldic nature is discussed (in an implicit way at least) in the literature on Veech surfaces. Recall that a Veech surface is a translation surface whose -orbit in moduli space is closed. As it turns out, the stabilizer in of a Veech surface is a lattice, so that its -orbit is naturally isomorphic to the unit cotangent bundle of the finite area hyperbolic surface . If the Veech surface has a non-trivial group of automorphisms, the Hodge bundle over its -orbit (and hence the corresponding KZ cocycle) is orbifoldic, but one can get around this by studying the group of so-called affine diffeomorphisms, a sort of “finite cover” of in view of a natural exact sequence . See, e.g., this survey paper of P. Hubert and T. Schmidt (or our joint paper with J.-C. Yoccoz).

Fortunately, for the sake of the *definition* of Lyapunov exponents of KZ cocycle with respect to an ergodic Teichmüller flow invariant probability measure , the possible ambiguity coming from the fact that the KZ cocycle is a “linear cocycle up to ” is *irrelevant* (cf. Section 4.3 of our paper with J.-C. Yoccoz and D. Zmiaikou). In fact, the Lyapunov exponents are defined by measuring the exponential rate of growth

of vectors (along typical trajectories), and the ambiguity caused by the fact that is a well-defined linear operator only up to the matrices in (action on homology of) does not change these rates because is a *finite* group and, hence, the possible values of (after composing with the elements of ) are uniformly related to each other by universal multiplicative constants (whose effects disappear when considering the expression ). In other terms, the Lyapunov exponents of orbifold linear cocycles *are* well-defined!
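The argument in the previous paragraph can be written out as follows. The notation is an assumption made only for this sketch: $A_t$ denotes one choice of the linear map along a typical trajectory up to time $t$, and $F$ denotes the finite set of matrices by which this choice may be altered.

```latex
C := \max_{h \in F} \max\{\|h\|, \|h^{-1}\|\} < \infty
\quad \Longrightarrow \quad
C^{-1}\,\|A_t v\| \;\leq\; \|h A_t v\| \;\leq\; C\,\|A_t v\|
\quad \text{for all } h \in F,

\left| \frac{1}{t}\log\|h A_t v\| - \frac{1}{t}\log\|A_t v\| \right|
\;\leq\; \frac{\log C}{t} \;\longrightarrow\; 0
\quad \text{as } t \to \infty.
```

In particular, the limit of $\frac{1}{t}\log\|A_t v\|$, whenever it exists, does not depend on the choice made within the finite ambiguity. The finiteness of $F$ is what makes the constant $C$ finite; for an infinite group of ambiguities this argument would break down.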

Unfortunately, there is no “cheap” solution (similar to the previous paragraph) for the definition of *Oseledets subspaces* of KZ cocycle: one needs linear structures on the fibers of the Hodge bundle to talk about them. Logically, this is an annoying situation because Oseledets subspaces are useful: for example, the analysis of these subspaces plays a major role in the recent breakthrough paper of A. Eskin and M. Mirzakhani about Ratner-like theorems for the -action on moduli spaces.

As it was already mentioned in the beginning of this section, the way out of this dilemma is to pick a cover of moduli space where all orbifold points disappear and to lift the Hodge bundle to this cover.

Of course, there are *several ways* of picking such a cover, but the whole point of this post is that certain covers are better than others depending on our purposes.

For example, the Teichmüller space is a cover of having no orbifold points (because an automorphism of a translation surface of genus that is isotopic to the identity is the identity), and the Hodge bundle is a vector bundle. Nevertheless, the Teichmüller space of (normalized) translation structures is not *dynamically* interesting: for example, there is no *finite* Teichmüller flow invariant measure (and, thus, we cannot use the standard tools from Ergodic Theory). This situation is very similar to the case of geodesic flows on the unit cotangent bundle of a finite-area hyperbolic surface (if we think of as moduli space, as Teichmüller space and the geodesic flow as Teichmüller flow): even though there are plenty of finite geodesic flow invariant measures on , there are no finite geodesic flow invariant measures on the cover (and, in fact, the dynamics of the geodesic flow on the unit cotangent bundle of the hyperbolic plane is rather boring). In summary, despite the fact that the Hodge bundle is an honest vector bundle over Teichmüller space , we cannot use it to define Oseledets subspaces (or, in general, do non-trivial dynamics) because the Teichmüller flow on is not dynamically rich.

The “failure” (from the dynamical point of view) in the previous paragraph suggests that we should try picking *finite* covers of moduli spaces (having no orbifold points). Indeed, the lifts of Teichmüller flow invariant probability measures are finite measures in that case and we are in a good position to discuss Ergodic Theory.

In this direction, J.-C. Yoccoz, D. Zmiaikou and I considered the cover of obtained by marking a horizontal separatrix issuing from a conical singularity of the translation surface. This is a finite cover because there are exactly outgoing horizontal separatrices at a conical singularity with total angle . Furthermore, has no orbifold points: indeed, any automorphism of a translation surface that fixes a horizontal separatrix issuing from a conical singularity is the identity. Moreover, the diagonal subgroup (Teichmüller flow) still acts on because the matrices respect the horizontal direction (and, thus, horizontal separatrices).

In particular, we can talk about the Oseledets subspaces of the KZ cocycle over Teichmüller flow at the level of the lift of the Hodge bundle to (because this “lifted KZ cocycle” over Teichmüller flow is a genuine linear cocycle over a probability measure preserving flow and hence we can apply Oseledets theorem).

For the purposes of our joint paper with J.-C. Yoccoz and D. Zmiaikou, the lift of the KZ cocycle to the Hodge bundle over the finite cover of moduli space was adequate (and this is why we decided to stick to it in our paper).

However, we should confess that we were not completely happy with because does *not* act on (even though its diagonal subgroup does act!). Therefore, the more general version of the KZ cocycle over the -action on moduli spaces in Remark 5 above (also very important for the applications of the Teichmüller flow) can *not* be lifted to the Hodge bundle over .

Actually, the reason why does not act on is very simple: the action of the rotation subgroup where is ill-defined. In order to see this, let us consider the following translation surface of genus with a marked (in blue) horizontal separatrix issuing from its unique conical singularity :

Let us try to define the action of on by letting vary from to . Starting from , let us slowly apply the rotation matrices to the translation surface for small positive values of :

In this way, we get a new translation surface and the horizontal separatrix is sent into a *non-horizontal separatrix* . Thus, and, *a fortiori*, the natural “reflex” of posing fails.

Evidently, for any small positive angle , the non-horizontal separatrix is very close to the horizontal separatrix (obtained by “rotating” by an angle inside ) indicated below

If we do this, then, by letting vary from to , we would be forced to impose where is the “previous” horizontal separatrix issued from the singularity in the “natural cyclic order” (obtained by rotating around the singularity) indicated (in red) below. However, since the horizontal separatrices and are *distinct*, we have that and are *distinct* points of , a contradiction with the fact that the rotation matrix must act by the identity map on .

A simple inspection of the argument above shows that the *finite* cover of (obtained by “replacing” the rotation (circle) group by its -fold cover , but keeping the “same” diagonal subgroup ) does act on ! In other words, “almost acts” on . Nevertheless, since the (non-trivial) finite cover is *not* an algebraic group (unlike itself), the natural action on the Hodge bundle over is not so useful from the point of view of Dynamical Systems.

In summary, despite the usefulness of for the study of the KZ cocycle over the Teichmüller flow on moduli spaces, it is desirable to find an alternative finite cover of moduli spaces having no orbifold points where still acts.

One solution to this problem is to take the quotient of Teichmüller space by a *torsion-free* finite index subgroup of : indeed, is a finite cover of moduli space because has finite index, has no orbifoldic points because is torsion-free and acts on because acts by *post-composition* and acts by *pre-composition* on .

Here, a result of J.-P. Serre (see also the book of B. Farb and D. Margalit for a “modern” exposition of Serre’s result [in the context of moduli spaces of Riemann surfaces]) produces many of those subgroups with the desired properties: in fact, given any integer , Serre showed that the subgroup

consisting of elements of the mapping class group acting trivially on the homology of with coefficients in is torsion-free. (This finite cover of moduli space was already mentioned in this blog: see this post here.)

In summary,

The lift of the KZ cocycle to the Hodge bundle over defines a linear cocycle over the -action on moduli spaces that we could (should?) call KZ cocycle in *dynamical* discussions (where certain specific notions such as Oseledets subspaces are needed).

Remark 8. For the sake of comparison, the finite cover might be somewhat big relative to . In fact, is a cover of degree in general, while, from the fact that the action on homology of the mapping class group surjects into , we have that in general is a cover of degree (a quantity of the form for prime).
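To get a feeling for the size of such covers, one can tabulate the order of the symplectic group over a finite field with $p$ elements, for which the classical formula is $p^{g^2}\prod_{i=1}^{g}(p^{2i}-1)$. The sketch below is only an illustration: whether this is exactly the quantity elided in the remark above is an assumption, and the function names are hypothetical. For $g=1$ the formula is cross-checked against a brute-force count of the $2\times 2$ matrices of determinant 1 modulo $p$.

```python
from itertools import product


def sp_order(g, p):
    """Order of Sp(2g, F_p) for a prime p: p^(g^2) * prod_{i=1..g} (p^(2i) - 1)."""
    order = p ** (g * g)
    for i in range(1, g + 1):
        order *= p ** (2 * i) - 1
    return order


def sl2_order_brute_force(p):
    """Count 2x2 matrices over Z/pZ with determinant 1 (Sp(2) coincides with SL(2))."""
    return sum(
        1
        for a, b, c, d in product(range(p), repeat=4)
        if (a * d - b * c) % p == 1
    )


for g in (1, 2, 3):
    print(g, sp_order(g, 3))
```

The orders grow very quickly with the genus, which is consistent with the remark that this cover can be big compared with the one obtained by marking a horizontal separatrix.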

The plan for this post is the following. After quickly reviewing in Section 1 below some basic features of the geometry of tangent bundles of Riemannian manifolds, we will estimate the first derivative of geodesic flows on certain negatively curved manifolds in terms of their sectional curvatures (as promised last time). Finally, we will complete today’s discussion by proving the first part of the Burns-Masur-Wilkinson ergodicity criterion (i.e., we will show that any geodesic flow verifying the assumptions of Burns-Masur-Wilkinson is non-uniformly hyperbolic in the sense of Pesin’s theory), while leaving the second part of the criterion (i.e., the verification of ergodicity via Hopf’s argument) for the next post of this series.

**1. Geometry of tangent bundles **

** 1.1. Riemannian metrics, Levi-Civita connections and Riemannian curvature tensors **

Let be a Riemannian manifold and denote by its Riemannian metric.

Let be the associated Levi-Civita connection, i.e., the unique connection (“notion of parallel transport”) that is symmetric and compatible with the Riemannian metric . Given a curve on , the covariant derivative along is

(and it should *not* be confused with ). Sometimes we will also denote the covariant derivative simply by when the curve is implicitly specified: for example, given a vector field along a curve (of footprints), we write where is an extension of to .

In this setting, recall that a curve is a geodesic if and only if for all .

Since the equation is a first order ODE (in the variables ), we have that geodesics are determined by the initial vector . Furthermore, any geodesic has constant speed, i.e., the quantity measuring the square of the norm of the tangent vector is constant along : in fact, using the compatibility between and , one gets

for all .
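For the reader’s convenience, the constant speed computation can be sketched as follows. The notation is an assumption of this sketch: $\gamma$ is the geodesic, $\dot\gamma$ its velocity, $\frac{D}{dt}$ the covariant derivative along $\gamma$ and $\langle\cdot,\cdot\rangle$ the Riemannian metric.

```latex
\frac{d}{dt}\,\langle \dot\gamma(t), \dot\gamma(t) \rangle
  \;=\; 2\,\Big\langle \frac{D}{dt}\dot\gamma(t), \dot\gamma(t) \Big\rangle
  \;=\; 0
```

The first equality is the compatibility of the connection with the metric, and the second one is the geodesic equation $\frac{D}{dt}\dot\gamma = 0$.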

The lack of commutativity of the Levi-Civita connection is measured by the Riemannian curvature tensor

In terms of the Riemannian curvature tensor , the sectional curvature of a -plane spanned by two vectors and is
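Since sign conventions for curvature vary from one reference to another, it may help to fix one. The formulas below use a common convention, stated here as an assumption rather than as the convention of this post; it is the one for which round spheres have positive sectional curvature and for which the Jacobi equation reads $J'' + R(J,\dot\gamma)\dot\gamma = 0$.

```latex
R(A,B)C \;=\; \nabla_A \nabla_B C \;-\; \nabla_B \nabla_A C \;-\; \nabla_{[A,B]} C,
\qquad
K(A,B) \;=\; \frac{\langle R(A,B)B,\, A \rangle}{\|A\|^2\,\|B\|^2 - \langle A, B\rangle^2}
```

The denominator is the squared area of the parallelogram spanned by $A$ and $B$, so that $K(A,B)$ depends only on the plane spanned by these two vectors.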

** 1.2. The tangent bundle to a tangent bundle **

The tangent bundle of the tangent bundle of is a bundle over in three natural ways:

- (a) where is the natural projection;
- (b) where is the natural projection;
- (c) where is defined as follows: given tangent at to a curve , we set where is the curve of footprints of the vectors ;

In this context, the vertical, resp. horizontal, subbundle of is , resp. . The vertical, resp. horizontal, subbundle is naturally identified with via , resp. . The vertical subbundle is transverse to the horizontal subbundle and the fiber of at can be identified via the map .

Geometrically, the roles of the vertical and horizontal subbundles are easier to understand in the following way. Given an element of tangent to a curve with , let be the curve of footprints of in . In this setting, the identification of with a pair of vectors via the horizontal and vertical subbundles simply amounts to take

In other terms, the component of in the horizontal subbundle measures how fast is moving in while the component of in the vertical subbundle measures how fast is moving in the fibers of .

This way of thinking as a bundle over leads to the following natural Riemannian metric on : given , we define

This metric is called *Sasaki metric* and the geometry of with respect to this Riemannian metric will be useful in our study of geodesic flows.
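For orientation, the usual expression of the Sasaki metric in horizontal and vertical coordinates is recalled below. The notation is an assumption of this sketch: $D\pi$ is the derivative of the footprint projection $\pi: TM \to M$, $K$ is the connector map associated with the Levi-Civita connection, and a tangent vector $\xi$ is identified with the pair $(D\pi(\xi), K(\xi))$ of its horizontal and vertical components.

```latex
\langle \xi, \eta \rangle_{\mathrm{Sas}}
  \;=\; \langle D\pi(\xi), D\pi(\eta) \rangle \;+\; \langle K(\xi), K(\eta) \rangle,
\qquad
\text{i.e.,} \quad
\langle (v_1, w_1), (v_2, w_2) \rangle_{\mathrm{Sas}}
  \;=\; \langle v_1, v_2 \rangle + \langle w_1, w_2 \rangle
```

With this definition the horizontal and vertical subbundles are orthogonal to each other, and each of them is isometric to the corresponding tangent space of the base manifold.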

Remark 1. The Sasaki metric is induced by the symplectic form

in the sense that

where . The symplectic form is the pullback of the canonical symplectic form on the cotangent bundle by the map associating to the linear functional .

For the reader’s convenience, let us mention the following three useful facts about Sasaki metric:

- Sasaki showed that the fibers of the tangent bundle are totally geodesic submanifolds of equipped with Sasaki metric;
- A parallel vector field on viewed as a curve on is a geodesic for Sasaki metric that is always orthogonal to the fibers of ;
- by the Toponogov comparison theorem, for close to , one has
where is the vector obtained by parallel transporting along the geodesic connecting to and is the distance associated to Sasaki metric; here, how close must be to depends only on the sectional curvatures of Sasaki metric in a neighborhood of ;

**2. First derivative of geodesic flows and Jacobi fields **

** 2.1. Computation of the first derivative of geodesic flows **

Let be the geodesic flow associated to a Riemannian manifold . By definition, given a tangent vector , we define where is the unique geodesic of with . Here, it is worth pointing out that the geodesic flow is always locally well-defined but it might be globally ill-defined. Moreover, the geodesic flow preserves the Liouville measure (i.e., the volume form on induced by the Riemannian metric of ).

We want to describe and, from the definition of first derivative, this amounts to study (-parameter) *variations of geodesics*.

More precisely, let be a (smooth) map such that, for each , is a geodesic of . Intuitively, is a one-parameter variation of the geodesic .

Define the vector field along the geodesic . It is well-known that satisfies the Jacobi equation

where is the covariant derivative (along ) and is the Riemannian curvature tensor. In other terms, is a *Jacobi field*, i.e., a vector field satisfying Jacobi’s equation.

Observe that Jacobi’s equation is a second order linear ODE. In particular, a Jacobi field is determined by the initial data .
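As a sanity check on the last observation, one can integrate the Jacobi equation numerically in the simplest setting. The sketch below assumes a surface, where a perpendicular Jacobi field has the form $j(t)E(t)$ for a parallel unit normal field $E$ and the equation reduces to the scalar ODE $j'' + K(t)\,j = 0$, with $K(t)$ the Gaussian curvature along the geodesic. The function name and the Runge-Kutta scheme are choices made for this illustration only.

```python
import math


def solve_jacobi(K, j0, dj0, T, steps=1000):
    """Integrate j'' + K(t) j = 0 on [0, T] with the classical RK4 scheme.

    K is a function of t; (j0, dj0) is the initial data (j(0), j'(0)).
    Returns the pair (j(T), j'(T)).
    """
    h = T / steps
    t, j, dj = 0.0, j0, dj0

    def rhs(t, j, dj):
        # First order system: (j, j')' = (j', -K(t) j).
        return dj, -K(t) * j

    for _ in range(steps):
        k1 = rhs(t, j, dj)
        k2 = rhs(t + h / 2, j + h / 2 * k1[0], dj + h / 2 * k1[1])
        k3 = rhs(t + h / 2, j + h / 2 * k2[0], dj + h / 2 * k2[1])
        k4 = rhs(t + h, j + h * k3[0], dj + h * k3[1])
        j += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dj += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return j, dj


# Constant curvature -1: the solution with j(0) = 1, j'(0) = 0 is cosh(t).
j, dj = solve_jacobi(lambda t: -1.0, 1.0, 0.0, 1.0)
print(abs(j - math.cosh(1.0)), abs(dj - math.sinh(1.0)))
```

In constant curvature $-1$ the two fundamental solutions are $\cosh t$ and $\sinh t$, which is the model case of the exponential growth of Jacobi fields in negative curvature.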

The pair corresponds to the tangent vector at to the curve in (under the identification described above in terms of the vertical and horizontal subbundles). Indeed, the curve of footprints of is , so that the tangent vector at of is represented by

Here, the symmetry of the Levi-Civita connection was used.

Similarly, the pair represents the tangent vector at to the curve . Therefore, represents

In summary, Jacobi fields are intimately related to the first derivatives of geodesic flows:

Proposition 1. The image of the tangent vector under the derivative of the geodesic flow is the tangent vector where is the (unique) Jacobi field with initial data along the (unique) geodesic with .

** 2.2. Perpendicular Jacobi fields and invariant subbundles **

A concrete example of Jacobi field along a geodesic is : indeed, in this context, and , so that Jacobi’s equation is trivially verified. Geometrically, this Jacobi field corresponds to a trivial variation of the geodesic where the initial point moves along and/or the speed of the parametrization of changes, i.e., .

In general, a Jacobi field along a geodesic that is tangent to has the form for some : in fact, for with , one has , so that Jacobi’s equation reduces to , i.e., for all .

Hence, a Jacobi field along a geodesic is interesting only when it is not completely tangent to the geodesic, or, equivalently, when it has some non-trivial component in the perpendicular direction to the geodesic.

A Jacobi field along a geodesic has the following geometrical properties:

- the component of makes constant angle with , i.e., the quantity is constant;
- if both components of are orthogonal to at some point, then they stay orthogonal all along , i.e., if and for some , then for all ;
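Both properties follow from a short computation, sketched here in notation that is an assumption of this sketch: $J$ is the Jacobi field, $J'$ and $J''$ its covariant derivatives along the geodesic $\gamma$, and the Jacobi equation is taken in the form $J'' + R(J,\dot\gamma)\dot\gamma = 0$.

```latex
\frac{d}{dt}\,\langle J', \dot\gamma \rangle
  \;=\; \langle J'', \dot\gamma \rangle + \Big\langle J', \frac{D}{dt}\dot\gamma \Big\rangle
  \;=\; -\,\langle R(J,\dot\gamma)\dot\gamma, \dot\gamma \rangle
  \;=\; 0,
\qquad
\frac{d}{dt}\,\langle J, \dot\gamma \rangle \;=\; \langle J', \dot\gamma \rangle .
```

The term $\langle R(J,\dot\gamma)\dot\gamma,\dot\gamma\rangle$ vanishes by the skew-symmetry of the curvature tensor in its last two entries. Hence $\langle J',\dot\gamma\rangle$ is constant and $\langle J,\dot\gamma\rangle$ is an affine function of $t$; if both vanish at some time, then both vanish identically.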

We say that a Jacobi field along a geodesic is a *perpendicular Jacobi field* whenever both components of are orthogonal to .

From the properties of Jacobi fields discussed above, we see that any Jacobi field along a geodesic has a decomposition

where is a perpendicular Jacobi field and is a Jacobi field tangent to .

After this little digression on Jacobi fields, let us use them to introduce relevant invariant subbundles under the first derivative of a geodesic flow .

We begin by recalling that the norm of a tangent vector stays constant along its -orbit, i.e., preserves the energy hypersurfaces (for each ). In particular, the first derivative of the geodesic flow preserves the tangent bundle (to the unit tangent bundle of M).

We affirm that, under the identification for , the fiber (of the subbundle of ) corresponds to the set of pairs with .

In fact, note that an element of is tangent at to a variation of geodesics parametrized by arc-length, i.e.,

for all , such that the geodesic satisfies and the Jacobi field corresponding to verifies .

The desired property now follows from the following calculation:
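A sketch of this calculation, in notation that is an assumption of this sketch: $\alpha(s,t)$ is the variation, so that $t\mapsto\alpha(s,t)$ is a unit speed geodesic for each $s$, the initial vector is $v = \partial_t\alpha(0,0)$ and the Jacobi field is $J(t) = \partial_s\alpha(0,t)$.

```latex
0 \;=\; \frac{\partial}{\partial s}\Big|_{s=0}
        \langle \partial_t \alpha(s,0), \partial_t \alpha(s,0) \rangle
  \;=\; 2\,\Big\langle \frac{D}{\partial s}\,\partial_t \alpha\,(0,0),\; v \Big\rangle
  \;=\; 2\,\Big\langle \frac{D}{\partial t}\,\partial_s \alpha\,(0,0),\; v \Big\rangle
  \;=\; 2\,\langle J'(0), v \rangle .
```

The first equality holds because all the geodesics of the variation have unit speed, the second one is the compatibility of the connection with the metric, and the third one is the symmetry of the Levi-Civita connection.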

The invariant subbundle itself admits a decomposition into two invariant subbundles, namely,

where is the vector field generating the geodesic flow and is the orthogonal complement of . In fact, under the identification for , the vector is and the elements of have the form with , . In particular, the -invariance of follows from the fact (mentioned above) that a Jacobi field satisfying and for some is a perpendicular Jacobi field (i.e., and for all ).

In summary, the action of on has two complementary invariant subbundles, namely, the span of the vector field generating the geodesic flow and its orthogonal complement consisting of perpendicular Jacobi fields. Since acts isometrically in the direction of , our task is reduced to studying the action of on perpendicular Jacobi fields.

** 2.3. Matrix Jacobi and Riccati equations **

We want to describe the matrix of acting on the vector space of perpendicular Jacobi fields. For this sake, let be an orthonormal basis for the tangent space of , and denote by the parallel transport of this orthonormal basis along the geodesic .

Define the matrix whose entries are

where is the Riemannian curvature tensor.

Note that any Jacobi field along can be written as . In this setting, Jacobi equation becomes

and, as usual, a solution is determined by the values and .

We can write solutions of the Jacobi equation above in a practical way by considering a matrix solution of the matrix Jacobi equation:

If is non-singular, the matrix

satisfies the matrix Riccati equation

Remark 2. The matrix is symmetric if and only if one has

for any two columns and of . Here, is the standard symplectic form of .

** 2.4. An estimate for the first derivative of a geodesic flow **

After these preliminaries on the geometry of tangent bundles, geodesic flows and Jacobi fields, we are ready to prove the following result stated as Theorem 11 in our previous post (but whose proof was postponed for this post).

Theorem 2. Let be a negatively curved manifold. Let and consider a geodesic. Suppose that is a Lipschitz function such that, for each , the sectional curvature of any plane containing is greater than or equal to , and denote by the solution of Riccati’s equation

with initial data . Then, the first derivative of the geodesic flow at time satisfies the estimate

From our discussion so far, the task of estimating the norm is equivalent to providing bounds for in terms of where is a perpendicular Jacobi field along (cf. Proposition 1 and Subsection 2).
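To see what kind of quantities enter the estimate, one can solve a scalar Riccati equation numerically. The sketch below assumes that the comparison equation has the form $u' + u^2 = \kappa^2$ with $u(0) = 0$ (the displayed equation is elided above, so this form is an assumption), and it also computes $\int_0^T u(s)\,ds$, a quantity that naturally controls the growth of solutions of $j' = u\,j$. Function names and the numerical scheme are choices made for this illustration.

```python
import math


def solve_riccati(kappa, T, steps=2000):
    """Integrate u' = kappa(t)^2 - u^2 with u(0) = 0 using the classical RK4 scheme.

    Returns the pair (u(T), integral of u over [0, T]); the integral is
    accumulated with the trapezoid rule along the way.
    """
    h = T / steps
    t, u, integral = 0.0, 0.0, 0.0

    def rhs(t, u):
        return kappa(t) ** 2 - u ** 2

    for _ in range(steps):
        k1 = rhs(t, u)
        k2 = rhs(t + h / 2, u + h / 2 * k1)
        k3 = rhs(t + h / 2, u + h / 2 * k2)
        k4 = rhs(t + h, u + h * k3)
        u_next = u + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        integral += h / 2 * (u + u_next)
        t, u = t + h, u_next
    return u, integral


# Constant kappa = 2: the exact solution is u(t) = 2 tanh(2 t),
# whose integral over [0, T] is log(cosh(2 T)).
u, integral = solve_riccati(lambda t: 2.0, 1.0)
print(u, 2 * math.tanh(2.0))
print(integral, math.log(math.cosh(2.0)))
```

For constant $\kappa$ the exponential of the integral equals $\cosh(\kappa T)$, which is the growth rate of Jacobi fields in constant curvature $-\kappa^2$; for a general Lipschitz $\kappa$ no closed formula is available and a numerical solution of this kind is the natural substitute.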

We begin by estimating these quantities for two special subclasses of perpendicular Jacobi fields defined as follows. Let and be the (fundamental) solutions of the matrix Jacobi equation

with initial data and . Note that, by definition, all Jacobi fields with , resp. all Jacobi fields with , have the form , resp. , i.e., they are obtained by applying the matrices , resp. , to a vector , resp. . In this setting, the “other” component , resp. (of the Jacobi field , resp. , viewed as a tangent vector to ) can be recovered by applying the matrix , resp. , to , resp. .

Remark 3. Very roughly speaking, the idea behind the choice of the subclasses and is that are Jacobi fields belonging to a certain “stable cone” and are Jacobi fields belonging to a certain “unstable cone” (compare with the discussion in the next Section).

Our first lemma says that the tangent vectors associated to Jacobi fields as above do *not* grow in forward time.

Lemma 3. Let be a perpendicular Jacobi field along such that . Then,

In particular,

*Proof:* One of the consequences of negative sectional curvatures along is the fact that the functions and are strictly convex for any perpendicular Jacobi field (see, e.g., Eberlein’s survey).

In our context, this implies that is a (strictly) convex function decreasing from to in the interval . Therefore,

Since and for close to (because ), we deduce that

This completes the proof of the lemma.

Our second lemma says that the growth in forward time of tangent vectors associated to Jacobi fields as above is reasonably controlled in terms of the solution of Riccati’s equation with (where is the Lipschitz function controlling some sectional curvatures of ).

Lemma 4. Let be a perpendicular Jacobi field along with . Then,

*Proof:* By definition, . Thus, and, *a fortiori*,

On the other hand, since , we see that and, hence,

These inequalities show that the proof of the lemma is complete once we can prove that for all .

In this direction, let us observe that the matrix is symmetric because it verifies (in a trivial way) the condition of Remark 2. Therefore, the norm of is given by the expression

where ranges over all unit vectors. In particular, our task is reduced to showing that

for all unit vectors , where .

From the matrix Riccati equation, we see that

Since the Lipschitz function controls the sectional curvatures (of planes containing ) along and the matrix is symmetric, we can estimate the right-hand side of the previous inequality as

On the other hand, since is a unit vector, the Cauchy-Schwarz inequality implies that . Therefore, the right-hand side of the previous inequality is bounded by

From this differential inequality and the facts that and , we can easily deduce that for all from the standard continuity argument.

Finally, we can complete the proof of the lemma by observing that the symmetric matrix is positive definite for : this follows from the facts that and that satisfies the matrix Riccati equation associated to a negatively curved manifold (cf. Eberlein’s book). Therefore, and for all and all unit vectors , so that

as desired.

Once we know how to control the growth of and for Jacobi fields and as above, the idea to estimate the growth of for an arbitrary perpendicular Jacobi field (thus completing the proof of Theorem 2) is to produce a decomposition of the form

where and the norms of and are controlled in terms of the norm of .

For this sake, define

and we set and .

First, note that the vector is well-defined, i.e., the matrix is invertible. Indeed, we already saw that the matrices and are symmetric (because they satisfy, in a trivial way, the condition of Remark 2) and that the matrix is positive definite (indeed, the facts that satisfies the matrix Riccati equation, the manifold is negatively curved and imply that is positive definite for ; cf. Eberlein’s book). Furthermore, all eigenvalues of the matrix are : in fact, any eigenvalue of has the form for some unit vector , and because is a convex function (see, e.g., Eberlein’s survey) decreasing from to in the interval (with ). Therefore, the matrix is a symmetric matrix whose eigenvalues are and, hence, is an invertible matrix satisfying

Secondly, we claim that the Jacobi fields and give the desired decomposition. In fact, since , and are Jacobi fields, our claim follows from the facts that and .

Finally, let us estimate the (Sasaki) norms of and in terms of . We begin by observing that it suffices to estimate the Sasaki norm of because

On the other hand, the (Sasaki) norm of is not difficult to bound:

Since (cf. the proof of Lemma 4) and , we can estimate the right-hand side of the previous inequality by

By putting together these estimates of the Sasaki norms of and and Lemmas 3 and 4, we deduce that

This completes the proof of Theorem 2.

**3. Hyperbolicity of geodesic flows on certain negatively curved manifolds **

In this section, we will partly fulfill our promise in our previous post by giving the first steps towards the proof of Burns-Masur-Wilkinson ergodicity criterion:

Theorem 5 (Burns-Masur-Wilkinson). Let be the quotient of a contractible, negatively curved, possibly incomplete, Riemannian manifold by a subgroup of isometries of acting freely and properly discontinuously. Denote by the metric completion of and the boundary of . Suppose that:

- (I) the universal cover of is *geodesically convex*, i.e., for every , there exists a unique geodesic segment in connecting and .
- (II) the metric completion of is *compact*.
- (III) the boundary is *volumetrically cusplike*, i.e., for some constants and , the volume of a -neighborhood of the boundary satisfies for every .
- (IV) has *polynomially controlled curvature*, i.e., there are constants and such that the curvature tensor of and its first two derivatives satisfy the following polynomial bound for every .
- (V) has *polynomially controlled injectivity radius*, i.e., there are constants and such that for every (where denotes the injectivity radius at ).
- (VI) The *first derivative of the geodesic flow is polynomially controlled*, i.e., there are constants and such that, for every infinite geodesic on and every :

Then, the Liouville (volume) measure of is finite, the geodesic flow on the unit cotangent bundle of is defined at -almost every point for all time , and the geodesic flow is *non-uniformly hyperbolic* (in the sense of Pesin’s theory) and *ergodic*.

Actually, the geodesic flow is Bernoulli and, furthermore, its metric entropy is positive, finite and is given by Pesin’s entropy formula (i.e., it is equal to the sum of positive Lyapunov exponents of counted with multiplicities).

More precisely, our plan for the rest of this post is to show the non-uniform hyperbolicity of the geodesic flow described in the statement above. Then, we will leave the proof of the ergodicity of (via Hopf’s argument) for the next post of this series.

We start by noticing that has finite -volume: this is an easy consequence of the compactness of (assumption (II)) and the volumetrically cusp-like assumption (III) on .

Next, let us check that the geodesic flow in the statement of Burns-Masur-Wilkinson ergodicity criterion is defined for all time for almost every initial data . For this sake, denote by the natural projection and set

and

By definition,

and, *a fortiori*,

In particular, since the geodesic flow preserves the measure , our task of showing that is defined for all time for almost every initial data is reduced to proving that has zero -measure.

In order to compute the -measure of , let us estimate the -measure of for each along the following lines. Note that

where consists of the unit tangent vectors flowing into for some time between and . By definition, , so that

where . Here, we used the fact that is -invariant (for the first inequality) and the assumption (III) (for the second inequality). It follows that

for all . Hence, has zero -measure and is defined for all time for almost all initial data.

Remark 4. The reader certainly noticed that we do not need the full strength of assumption (III) to deduce the long-term existence of at almost every point: in fact, the weaker condition works equally well. Nevertheless, we will see below that the full strength of assumption (III) is helpful to ensure the existence of Lyapunov exponents for the geodesic flow .

Now, let us show that the geodesic flow is non-uniformly hyperbolic in the sense of Pesin theory, i.e., all (transverse) Lyapunov exponents are non-zero.

We start by verifying that the Lyapunov exponents of are well-defined (at almost every point): by Oseledets multiplicative ergodic theorem, it suffices to check the -integrability of the derivative cocycles and associated to the time- and time-maps and , that is,

By symmetry (or reversibility of the geodesic flow), we have to consider only the -integrability of . We estimate the integral above for by noticing that

Since is compact (by assumption (II)), we need to show only that the series above is convergent and this is not hard to see: on one hand, we already saw that for some (as a consequence of assumption (III)), and, on the other hand, on by assumption (VI), so that

By Oseledets theorem, once we know the -integrability of the derivative cocycle, we have that, for almost every , there are real numbers

called Lyapunov exponents and a -invariant splitting

into Lyapunov subspaces such that, for every ,

In the context of a geodesic flow , recall that the derivative cocycle preserves the decomposition , and acts isometrically along and preserves . This implies that the Lyapunov exponent of along is zero and the derivative cocycle has Lyapunov exponents counted with multiplicity (i.e., we count -times the Lyapunov exponent ) along .

Remark 5. In fact, the derivative cocycle preserves a natural symplectic form on . In particular, the Lyapunov exponents are organized in a symmetric way around the origin in the sense that is a Lyapunov exponent whenever is a Lyapunov exponent.
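At the level of a single matrix, the mechanism behind this symmetry can be sketched as follows. This is only a simplified linear-algebra model rather than the full cocycle argument, and the symbols $M$ (a symplectic matrix), $J$ (the matrix of the symplectic form) and $\lambda$ are notation introduced here for illustration:

```latex
M^{T} J M = J
\quad\Longrightarrow\quad
M^{-1} = J^{-1} M^{T} J ,
\qquad\text{so that}\qquad
\operatorname{spec}(M^{-1}) = \operatorname{spec}(M^{T}) = \operatorname{spec}(M).
```

In other words, $1/\lambda$ is an eigenvalue of $M$ whenever $\lambda$ is, and taking logarithms of absolute values gives a spectrum of "exponents" that is symmetric around the origin.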

By definition, is called *non-uniformly hyperbolic* whenever all Lyapunov exponents along (sometimes called transverse Lyapunov exponents) are non-zero.

In our context (of the statement of Burns-Masur-Wilkinson ergodicity criterion), we will prove the non-uniform hyperbolicity of by exploiting the negative curvature of . More concretely, the negative curvature of implies that:

- for any non-trivial perpendicular Jacobi field , the functions and are strictly convex (thanks to Jacobi’s equation);
- for each geodesic ray and for each , there exists a unique perpendicular Jacobi field along with such that for all .

See, e.g., Eberlein’s book for more explanations. In the literature, is called an *unstable* Jacobi field and it is usually constructed as the limit where is the Jacobi field with and . Similarly, we can define *stable* Jacobi fields along geodesic rays by reversing the time of the geodesic flow. Figure 2 above illustrates stable (“blue”) and unstable (“red”) Jacobi fields along a vertical geodesic in the hyperbolic plane.
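For intuition, the limiting construction of unstable Jacobi fields can be checked by hand in the model case of constant curvature $-1$ (the hyperbolic plane), where the scalar Jacobi equation along a unit-speed geodesic reads $j'' = j$. The sketch below assumes the normalization $j_T(-T) = 0$, $j_T(0) = 1$ (my guess at the convention, chosen for illustration) and compares $j_T$ with the bounded-in-the-past solution $e^t$; the function names are hypothetical.

```python
import math

def jacobi_vanishing_at(T: float, t: float) -> float:
    """Solution of j'' = j with j(-T) = 0 and j(0) = 1, evaluated at time t."""
    return math.sinh(t + T) / math.sinh(T)

def unstable_jacobi(t: float) -> float:
    """Candidate limit: the solution e^t, which stays bounded as t -> -infinity."""
    return math.exp(t)

def gap(T: float, times=(-1.0, 0.0, 1.0)) -> float:
    """Largest distance between j_T and e^t over a few sample times."""
    return max(abs(jacobi_vanishing_at(T, t) - unstable_jacobi(t)) for t in times)

# The gap shrinks as the vanishing time -T is pushed towards -infinity.
for T in (1.0, 5.0, 20.0):
    print(f"T = {T:5.1f}   max |j_T(t) - e^t| = {gap(T):.3e}")
```

A direct computation gives $j_T(t) - e^t = e^{-T}\sinh(t)/\sinh(T)$, so the convergence is exponentially fast in $T$ on compact sets of times in this model case.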

We will discuss stable and unstable Jacobi fields in more detail in the next post of this series (because they describe the stable and unstable manifolds of and Hopf’s argument depends crucially on the features of stable and unstable manifolds). For now, we just need to know that, if is negatively curved and is defined for all time at , then

where ,

and

In other terms, where and are -dimensional subspaces related to stable and unstable Jacobi fields. See, e.g., Eberlein’s book for a proof of this fact.

In this setting, the non-uniform hyperbolicity of is a direct consequence of the following lemma relating stable and unstable Jacobi fields to Lyapunov subspaces:

Lemma 6. There exists a -invariant subset of full -measure such that

*Proof:* Denote by the set of unit vectors such that:

- is defined for all time ;
- the Lyapunov exponents and Lyapunov subspaces are defined for ;
- is *uniformly recurrent* under in the sense that, for any neighborhood of , there exists such that the sets have Lebesgue measure for all sufficiently large.

Note that is -invariant and it has full -measure: our previous discussion showed that the first two conditions hold for almost every and the third condition holds in a full measure subset thanks to Birkhoff’s ergodic theorem.

We claim that satisfies the conclusions of the lemma. In fact, by the reversibility of the geodesic flow , it suffices to show that

for all .

For this sake, given , we fix a neighborhood of and a real number such that if is an unstable Jacobi field along a geodesic with , then

The choice of and is possible because is negatively curved and is an increasing strictly convex function whose second derivative is controlled by Jacobi’s equation.

Since is uniformly recurrent for , we have that

for all . Because , we know that has Lebesgue measure for some and for all sufficiently large. Therefore, for any unstable Jacobi field along , one has

for all sufficiently large. It follows from the definitions that

for any , and, hence,

Similarly, . Because , these inclusions must be equalities and the proof of the lemma is complete.

For later reference, we summarize the results of this section in the following statement:

Theorem 7. Under the assumptions (I) to (VI) in Theorem 5 above, the geodesic flow is non-uniformly hyperbolic: more concretely, there exists a subset of full -measure such that the -invariant splitting

into the flow direction and the spaces and of stable and unstable Jacobi fields along has the property that

for all and .


- a) Vadim Kaloshin and Yuri Lima are organizing a “Summer School on Dynamical Systems” to be held at the University of Maryland from August 17th to 25th, 2014. There will be four minicourses (whose titles are available here) by Dmitry Dolgopyat, Giovanni Forni, Anton Gorodetski and Vadim Kaloshin.
- b) Pascal Hubert, Erwan Lanneau and Anton Zorich are organizing the workshop “Dynamics and Geometry in Teichmueller Spaces” to be held at CIRM, Luminy/Marseille, France, next year (from July 6th to 10th, 2015 to be precise).

The first event is aimed at graduate students interested in learning some recent topics in Dynamics and also undergraduate students with some background in Dynamical Systems wishing to pursue their studies in Dynamics. The details for this event are being uploaded at the summer school webpage and the organizers (Vadim Kaloshin and Yuri Lima) will be happy to provide extra information for all potential participants.

The second event is a research conference around Teichmueller and moduli spaces from both the geometrical and dynamical points of view. The details for this conference are still being defined (as far as I know) and one is encouraged to write to the organizers (Pascal Hubert, Erwan Lanneau and/or Anton Zorich) for more information.


This was the second talk of a new “flat surfaces” seminar organised by himself, Anton Zorich and myself at Institut Henri Poincaré (IHP) in Paris. The details about this seminar (such as current schedule, previous and next talks, abstracts, etc.) can be found at this website here.

For the time being, this seminar is an experiment in the sense that IHP allows us to use their rooms from March to June 2014. Of course, if the experiment is a success (i.e., if it manages to gather a non-trivial number of participants interested in flat surfaces and Teichmueller dynamics), then we plan to continue it.

Below the fold, I will reproduce my notes of Jean-Christophe’s talk about a new result together with Stefano Marmi on the cohomological equation for interval exchange transformations of restricted Roth type. It goes without saying that any errors/mistakes are entirely my responsibility.

**1. Introduction **

A classical method to study the properties of a (“quasi-periodic”) dynamical system consists in finding an adequate *linearization*, i.e., one seeks a (“smooth”) change of coordinates so that the new dynamical system is “linear”/“algebraic” in some sense (e.g., a rigid rotation on a circle, a translation on a torus, etc.).

Of course, given and a “good candidate” for a linear model of , the problem of finding is *non-linear* (because the *conjugation equation* is non-linear in ). For this reason, it is often the case that before attacking the conjugation equation one studies the following *linear* version

called *cohomological equation* for (where is given and we want to solve for ). In fact, the relationship between the cohomological equation and the conjugation equation was already discussed in this blog (see, e.g., this post), where we emphasized Herman’s Schwarzian derivative trick to convert solutions of the cohomological equation into solutions of the conjugation equation in the context of circle diffeomorphisms.

Today, we will discuss exclusively the existence and regularity of solutions of the cohomological equation for interval exchange transformations (but we will not study the conjugation equation).

In order to motivate the main results in this post, let us recall some of the known theorems about the existence and regularity of solutions of the cohomological equation for rotations on the circle of angle (or, equivalently, an interval exchange transformation of two intervals of lengths and ).

Definition 1. We say that an irrational number is of *Roth type* whenever for all there exists such that

for all . Here, means the distance to the closest integer.

Remark 1. The nomenclature “Roth type” is motivated by Roth’s theorem stating that any irrational algebraic number is of Roth type.
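To get a feeling for this kind of Diophantine condition, here is a small numerical sketch. It assumes the usual form of the condition, namely a lower bound of the type $\|q\alpha\| \geq C_\varepsilon\, q^{-1-\varepsilon}$, and it replaces the golden mean by a very accurate rational approximation (a ratio of large Fibonacci numbers) so that all computations are exact. The helper names are hypothetical.

```python
from fractions import Fraction

def continued_fraction(x: Fraction, max_terms: int) -> list[int]:
    """First entries a_0, a_1, ... of the continued fraction expansion of x."""
    terms = []
    for _ in range(max_terms):
        a = x.numerator // x.denominator      # integer part
        terms.append(a)
        frac = x - a
        if frac == 0:                         # expansion of a rational stops
            break
        x = 1 / frac
    return terms

def convergent_denominators(terms: list[int]) -> list[int]:
    """Denominators q_n of the convergents, via q_n = a_n q_{n-1} + q_{n-2}."""
    q_prev, q = 1, 0                          # q_{-2} = 1, q_{-1} = 0
    qs = []
    for a in terms:
        q_prev, q = q, a * q + q_prev
        qs.append(q)
    return qs

def dist_to_nearest_integer(y: Fraction) -> Fraction:
    return abs(y - round(y))

def fibonacci(n: int) -> int:
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Stand-in for the golden mean (sqrt(5) - 1)/2; the error is far below 1e-80.
alpha = Fraction(fibonacci(200), fibonacci(201))
terms = continued_fraction(alpha, 12)
print(terms)                           # all entries after a_0 equal 1
print(convergent_denominators(terms))  # Fibonacci numbers

# For this badly approximable number, q * ||q alpha|| stays bounded below.
worst = min(q * dist_to_nearest_integer(q * alpha) for q in range(1, 2001))
print(float(worst))
```

For the golden mean the product $q\|q\alpha\|$ stays bounded away from zero, which is stronger than the Roth type condition; for a general number of Roth type one only expects a lower bound that degrades like a small power of $q$.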

Proposition 2 (Russmann, Herman, …). Let be of Roth type. Given and (i.e., is a function on with zero mean), there exists a solution of the cohomological equation

for the rotation of angle on with the property that for all .

In other terms, this result says that we can solve the cohomological equation for circle rotations of Roth type with a loss of ()-derivative for all .

Remark 2. The analog of this result in the Sobolev scale (i.e., when and belong to standard Sobolev spaces ) follows from an elementary Fourier analysis (cf. this post). On the other hand, the statement above (in Hölder scale ) requires some extra work, but it is still within the framework of Harmonic Analysis in the sense that one uses Littlewood-Paley decomposition and interpolation inequalities (cf. Herman’s article for more details).
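The elementary Fourier analysis alluded to in this remark can be sketched as follows for trigonometric polynomials. I assume here the sign convention $\psi\circ R_\alpha - \psi = \varphi$ for the cohomological equation (with the opposite convention one simply changes the sign of $\psi$), so that the Fourier coefficients satisfy $\hat\psi(n) = \hat\varphi(n)/(e^{2\pi i n\alpha}-1)$ for $n\neq 0$. The function names are hypothetical.

```python
import cmath
import math

def solve_cohomological(phi_hat: dict[int, complex], alpha: float) -> dict[int, complex]:
    """Fourier coefficients of psi solving psi(x + alpha) - psi(x) = phi(x).

    phi_hat maps a frequency n to the coefficient of exp(2 pi i n x);
    the zero-frequency coefficient (the mean of phi) must vanish.
    """
    psi_hat = {}
    for n, coeff in phi_hat.items():
        if n == 0:
            if abs(coeff) > 1e-12:
                raise ValueError("phi must have zero mean")
            continue
        small_divisor = cmath.exp(2j * math.pi * n * alpha) - 1
        psi_hat[n] = coeff / small_divisor
    return psi_hat

def evaluate(coeffs: dict[int, complex], x: float) -> complex:
    """Value at x of the trigonometric polynomial with the given coefficients."""
    return sum(c * cmath.exp(2j * math.pi * n * x) for n, c in coeffs.items())

alpha = (math.sqrt(5) - 1) / 2                       # golden mean rotation
phi_hat = {1: 0.5, -1: 0.5, 3: -0.25j, -3: 0.25j}    # cos(2 pi x) + 0.5 sin(6 pi x)
psi_hat = solve_cohomological(phi_hat, alpha)

x = 0.3
residual = evaluate(psi_hat, x + alpha) - evaluate(psi_hat, x) - evaluate(phi_hat, x)
print(abs(residual))   # should be at the level of rounding errors
```

The denominators satisfy $|e^{2\pi i n\alpha}-1| = 2|\sin(\pi n\alpha)| \geq 4\|n\alpha\|$, so a Diophantine lower bound on $\|n\alpha\|$ translates into a polynomial loss in the decay of the Fourier coefficients of the solution; this is where the loss of derivatives comes from.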

For the sake of comparison, let us give the following statement (where the boldface terms will be introduced later):

Theorem 3 (Marmi-Yoccoz). Let be an interval exchange transformation of **restricted Roth type**. Given , there exists , a subspace of codimension and a bounded operator such that

for every .

Of course, since a rotation of the circle is an interval exchange transformation in two intervals, the theorem of Marmi-Yoccoz extends the previous proposition (of Russmann, Herman, …) to a larger important class of quasi-periodic dynamical systems.

Remark 3. It is possible to check that a circle rotation of Roth type is an interval exchange transformation of two intervals of restricted Roth type. Furthermore, in this particular context one can also show that .

Remark 4. The theorem of Marmi-Yoccoz applies to almost every interval exchange transformation: in fact, the restricted Roth type condition has full measure in the space of interval exchange transformations (with respect to the natural Lebesgue measure obtained by parametrizing interval exchange transformations with fixed combinatorics via the lengths of the permuted intervals).

On the other hand, the loss of derivative in Marmi-Yoccoz theorem is not very good compared to the previous proposition: in fact, the quantity depends on and the definition of restricted Roth type in a *highly non-trivial way*, so that is usually very small and, *a fortiori*, is not close (in general) to the optimal loss in the previous proposition.

Remark 5. Here, Jean-Christophe said that it is likely that the definition of restricted Roth type must be changed if one desires an optimal loss of derivatives.

Before explaining the terms in boldface in Marmi-Yoccoz theorem, let us recall some previous related results on cohomological equations for interval exchange transformations and translation flows (the “continuous time analogs” of interval exchange transformations).

First, Forni considered in these two papers here (1997) and here (2007) the cohomological equation for translation flows on translation surfaces (i.e., Forni studied the “continuous time analog” of the cohomological equation for interval exchange transformations). Using several tools from Harmonic Analysis (including *weighted Sobolev spaces*) on *compact surfaces*, Forni managed to construct solutions of the cohomological equation for “almost all” (choice of direction for) translation flows in *any* given translation surface with an optimal loss of derivative (in a weighted Sobolev scale).

Secondly, Marmi, Moussa and Yoccoz showed in these two papers here (2005) and here (2012) the existence of *continuous* solutions of the cohomological equation for interval exchange transformations of restricted Roth type. In particular, the new result of Marmi-Yoccoz improves these previous results by asserting the existence of Hölder continuous () solutions for the cohomological equations studied in their previous papers with Moussa.

As the reader can see, the results of Forni and Marmi-Moussa-Yoccoz have both strong and weak points. On one hand, Forni’s result gives solutions to the cohomological equation for “almost all” translation flows with optimal loss of derivative in Sobolev scale, but the “Diophantine condition” (i.e., the subset of full measure of translation flows) in his theorem is not explicit. On the other hand, the results of Marmi-Moussa-Yoccoz and Marmi-Yoccoz give solutions to the cohomological equation for interval exchange transformations with a poor gain of regularity in Hölder scale, but their “Diophantine condition” (restricted Roth type) on the interval exchange transformation is “relatively explicit”. Also, it is not easy to compare “directly” these results: even though there is a “natural” notion of restricted Roth type translation flow (in the sense that the return map of the translation flow to an appropriate transverse section is an interval exchange transformation of restricted Roth type) in this paper of Marmi-Moussa-Yoccoz, it is not clear that a restricted Roth type translation flow fits the “Diophantine condition” of Forni.

Remark 6. Very roughly speaking, one of the (several) difficulties in relating the Diophantine conditions of Forni and Marmi-Moussa-Yoccoz is related to the application of Oseledets theorem for the Kontsevich-Zorich cocycle: indeed, Oseledets theorem provides a non-explicit set of full measure of points such that the Kontsevich-Zorich cocycle along the Teichmüller flow orbit of these points has a particularly nice behavior (see, e.g., the introduction of this paper of Forni for more explanations). Nevertheless, it is worth pointing out that the recent results of Chaika-Eskin give some hope towards relating the Diophantine conditions of Forni and Marmi-Moussa-Yoccoz.

After these comments on Theorem 3 (and some related results), it is time to define the objects involved in the statement of this theorem.

**2. Interval exchange transformations **

Recall that an interval exchange transformation is determined by the following data. Given a finite alphabet with letters, an interval and two partitions into subintervals with

for every , the interval exchange transformation is the piecewise translation sending to . Here, , resp. stands for *top*, resp. *bottom* subintervals, that is, the subintervals of the partition one sees before, resp. after applying . The figure below gives some examples of interval exchange transformations.

We denote by , resp. , the extremities of the subintervals , resp. (), so that and are the extremities of the interval . In particular, , resp. , are the discontinuities of , resp. .
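As a concrete illustration of this definition, here is a minimal sketch of an interval exchange transformation coded by the order of the letters in the top and bottom partitions together with the lengths of the subintervals. The interval is taken to start at $0$, the subintervals are half-open (closed on the left), and the function name `make_iet` is hypothetical.

```python
from fractions import Fraction

def make_iet(top, bottom, lengths):
    """Return the piecewise translation sending the top interval of each
    letter onto the bottom interval of the same letter.

    top, bottom: sequences listing the letters from left to right;
    lengths: dict mapping each letter to the length of its subinterval.
    """
    def left_endpoints(order):
        ends, x = {}, Fraction(0)
        for letter in order:
            ends[letter] = x
            x += lengths[letter]
        return ends

    top_left = left_endpoints(top)
    bottom_left = left_endpoints(bottom)

    def T(x):
        for letter in top:
            if top_left[letter] <= x < top_left[letter] + lengths[letter]:
                return x - top_left[letter] + bottom_left[letter]
        raise ValueError("x lies outside the interval")

    return T

# Two intervals: the exchange of [0, 1 - alpha) and [1 - alpha, 1) is the
# rotation x -> x + alpha mod 1.
alpha = Fraction(2, 5)
rotation = make_iet("AB", "BA", {"A": 1 - alpha, "B": alpha})
print(rotation(Fraction(1, 2)))   # 9/10
print(rotation(Fraction(7, 10)))  # 1/10
```

Exact rational lengths are used only to make the checks exact; of course, the interesting interval exchange transformations (those without connections) have irrational length data.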

Using these notations, we are ready to introduce the first term marked boldface in Theorem 3:

and is the subspace of zero mean functions in . In concrete terms, is the space of piecewise -functions on that are on the intervals admitting natural extensions to the intervals (but these extensions might disagree at the points ‘s).

Next, we introduce the constants and attached to an interval exchange transformation . An interval exchange transformation can be naturally seen (in many ways) as the first return map of a translation flow on a translation surface (by means of *Masur’s suspension construction* or *Veech’s zippered rectangles construction*): the reader can find more details in this post here (for instance). The translation surfaces obtained from in this way have a genus and a number of conical singularities depending only on . Alternatively, one can define and by combinatorial means (in terms of the cycles of the permutation on induced by the way permutes the subintervals and ).

At this point, the sole undefined term in boldface in the statement of Theorem 3 is “restricted Roth type”. In order to define it, we have to introduce the Rauzy-Veech algorithm and the (discrete version of the) Kontsevich-Zorich cocycle (using this survey of Jean-Christophe as a basic reference).

**3. Rauzy-Veech algorithm, Kontsevich-Zorich cocycle and restricted Roth type **

We say that an interval exchange transformation has a *connection* if there are , such that

Since , resp. , is a discontinuity of , resp. (so that the future orbit of , resp. past orbit of is ill-defined), we see that has a connection whenever it has an orbit that is “blocked” (can not be extended) in the future *and* in the past.

Interval exchange transformations *without* connections are very similar to irrational rotations of the circle: by a result of Keane, any without connections has a minimal dynamics.

Starting with without connections, we denote by the first return map of to the subinterval

It is not hard to check that is also an interval exchange transformation permuting a finite collection , of subintervals of naturally indexed by the alphabet . Furthermore, also has no connections. In the literature, the map is called an elementary step of the Rauzy-Veech algorithm.

Of course, the two facts described in the previous paragraph imply that we can *iterate* this procedure: starting with without connections, by successively applying the elementary steps of the Rauzy-Veech algorithm, one obtains a sequence of interval exchange transformations acting on a decreasing sequence of subintervals . Moreover, it is possible to show that the lengths of the intervals tend to zero as .

For later use, let us observe that, by definition, for any , is the first return map of to .
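The elementary step can be made explicit at the combinatorial level. The sketch below follows the standard description found in the literature (which I assume matches the convention of this post): one compares the last top and last bottom subintervals, the longer one ("winner") loses the length of the shorter one ("loser"), and the loser is moved right after the winner in the other row. The code also compares the result with a brute-force first return map. All function names are hypothetical.

```python
from fractions import Fraction

def iet_map(top, bottom, lengths, x):
    """Image of x under the interval exchange coded by (top, bottom, lengths)."""
    pos = Fraction(0)
    for letter in top:
        if pos <= x < pos + lengths[letter]:
            offset = x - pos
            break
        pos += lengths[letter]
    else:
        raise ValueError("x lies outside the interval")
    pos = Fraction(0)
    for other in bottom:
        if other == letter:
            return pos + offset
        pos += lengths[other]

def rauzy_veech_step(top, bottom, lengths):
    """One elementary step: returns the new (top, bottom, lengths)."""
    a_t, a_b = top[-1], bottom[-1]
    new_lengths = dict(lengths)
    if lengths[a_t] > lengths[a_b]:          # last top interval wins
        new_lengths[a_t] -= lengths[a_b]
        i = bottom.index(a_t)
        return top, bottom[:i + 1] + [a_b] + bottom[i + 1:-1], new_lengths
    if lengths[a_b] > lengths[a_t]:          # last bottom interval wins
        new_lengths[a_b] -= lengths[a_t]
        i = top.index(a_b)
        return top[:i + 1] + [a_t] + top[i + 1:-1], bottom, new_lengths
    raise ValueError("equal lengths: the map has a connection")

def first_return(top, bottom, lengths, x, cutoff):
    """Brute-force first return of x to [0, cutoff) under the original map."""
    y = iet_map(top, bottom, lengths, x)
    while y >= cutoff:
        y = iet_map(top, bottom, lengths, y)
    return y

top, bottom = list("ABCD"), list("DCBA")
lengths = {"A": Fraction(3), "B": Fraction(5), "C": Fraction(2), "D": Fraction(7)}
new_top, new_bottom, new_lengths = rauzy_veech_step(top, bottom, lengths)
print(new_top, new_bottom, new_lengths)
```

In this example the last top letter `D` wins, so its length drops from 7 to 4 and the loser `A` is inserted right after `D` in the bottom row, giving the bottom row `D A C B`.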

In terms of the Rauzy-Veech algorithm, the Kontsevich-Zorich cocycle can be described as follows. Given and , we consider the special Birkhoff sum

where is the first return time of (under iterates).

It is possible to check that . In particular, denoting by

we see that

is a linear operator inducing a matrix whose entries have the following dynamical interpretation: is the number of visits of to under -iterates before its return to . The matrices form a linear cocycle (i.e., ) called (discrete) Kontsevich-Zorich cocycle.
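The dynamical interpretation of these entries can be tested by brute force in the simplest case of a rotation, seen as an exchange of the two intervals $A = [0, 1-\alpha)$ and $B = [1-\alpha, 1)$, after a single elementary step. The sketch below counts the visits to $A$ and $B$ before the first return to a subinterval $[0, c)$ and checks that the special Birkhoff sum of a piecewise constant function is the corresponding combination of its values. The specific numbers and the function names are chosen for illustration only.

```python
from fractions import Fraction

def letter_of(x, alpha):
    """Letter of the top subinterval containing x."""
    return "A" if x < 1 - alpha else "B"

def rotate(x, alpha):
    return (x + alpha) % 1

def visits_before_return(x, alpha, cutoff):
    """Number of visits of the orbit of x to A and B before returning to [0, cutoff)."""
    counts = {"A": 0, "B": 0}
    y = x
    while True:
        counts[letter_of(y, alpha)] += 1
        y = rotate(y, alpha)
        if y < cutoff:
            return counts

def special_birkhoff_sum(phi, x, alpha, cutoff):
    """Sum of phi along the orbit of x up to (excluding) its first return to [0, cutoff)."""
    total, y = 0, x
    while True:
        total += phi(y)
        y = rotate(y, alpha)
        if y < cutoff:
            return total

alpha = Fraction(5, 7)      # lengths: |A| = 2/7, |B| = 5/7
cutoff = Fraction(5, 7)     # subinterval after one elementary step (B is longer)

print(visits_before_return(Fraction(1, 7), alpha, cutoff))  # {'A': 1, 'B': 1}
print(visits_before_return(Fraction(3, 7), alpha, cutoff))  # {'A': 0, 'B': 1}

# Piecewise constant observable: 10 on A and 1 on B.
phi = lambda y: 10 if letter_of(y, alpha) == "A" else 1
print(special_birkhoff_sum(phi, Fraction(1, 7), alpha, cutoff))  # 11
```

The two rows of visit counts, $(1, 1)$ and $(0, 1)$, form the matrix of this elementary step in the present example: it differs from the identity by a single off-diagonal entry.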

The restricted Roth type for an interval exchange transformation is defined in terms of the features of the Kontsevich-Zorich cocycle .

More precisely, we define inductively and is the smallest integer such that for all . It is possible to show that this definition leads to a sequence with as .

We say that has *restricted Roth type* whenever the following four conditions are fulfilled.

- (a) *Roth type condition*: for each , one has for all .

Remark 7. The fact that the Roth type condition is satisfied for almost all interval exchange transformations (i.e., for Lebesgue almost all choices of lengths of the intervals ) was checked in this paper of Marmi-Moussa-Yoccoz (see also the paper of Avila-Gouezel-Yoccoz).

Remark 8. For the sake of comparison, in the case of the rotation of angle on the circle (i.e., interval exchange transformation permuting two intervals of lengths and ), one can check that and where are the entries of the continued fraction expansion of and are the denominators of the best rational approximations of . In particular, the Roth type condition is equivalent to

for all , i.e., is of Roth type.

- (b) *Spectral gap*: there exists such that where is the subspace of functions with zero mean.

Remark 9. The spectral gap property is also satisfied by almost all interval exchange transformations thanks to the work of Veech. In fact, this property is closely related to the non-uniform hyperbolicity of the Teichmueller flow (and the constant is the second Lyapunov exponent of the Kontsevich-Zorich cocycle over the Teichmueller flow).

- (d) *Hyperbolicity*: the *stable space* of the Kontsevich-Zorich cocycle has dimension .

Remark 10. The hyperbolicity property is verified for almost all interval exchange transformations thanks to the work of Forni.

- (c) *Coherence property*: denoting by the restriction of the Kontsevich-Zorich cocycle to the stable space , and by the action of the Kontsevich-Zorich cocycle on the “center-unstable spaces” , then for each , one has and

Remark 11. The coherence property is also verified for almost all interval exchange transformations: indeed, this is a consequence of Oseledets theorem applied to the Kontsevich-Zorich cocycle (see, e.g., Marmi-Moussa-Yoccoz paper for more details).

Remark 12. We called “item (d)” the hyperbolicity property and “item (c)” the coherence property just to keep the same notations of this paper of Marmi-Moussa-Yoccoz (and also because Jean-Christophe did the same during the talk :) ).

At this point, all boldfaced terms in Theorem 3 were defined and now it is time to discuss some points of the proof of this result.

**4. Some steps of the proof of Theorem 3 **

Recall that the quantity is the number of marked points of a translation surface obtained by suspension of the interval exchange transformation . Combinatorially, these marked points can be seen as cycles of a permutation on keeping track of the ‘s one sees when turning counterclockwise around the conical singularities in the translation surface . (See, e.g., this survey of Jean-Christophe for more details.)

The permutation allows one to define a *boundary operator* ,

where is the set of cycles of , and , resp. , means , resp. .

It is not hard to see that the boundary operator has the following properties:

- ;
- ;
- the restriction of to is the usual boundary operator in relative homology after appropriate (natural) identifications and .

In this setting, Marmi-Yoccoz deduce Theorem 3 as an immediate consequence of the following more precise statement:

Theorem 4 (Marmi-Yoccoz). Denote by the kernel of the (restriction of the) boundary operator (to ) and consider an arbitrary supplement of in (i.e., ). Let be an interval exchange transformation of restricted Roth type. Then, there are two bounded operators and such that

where and .

Remark 13. This theorem says that we can solve the cohomological equation whenever and (i.e., there is no obstruction coming from the boundary operator and the operator ). In the literature, these conditions (or “obstructions”) are called Forni’s distributions.

Closing today’s post, let us give some steps of the proof of Theorem 4.

The first three steps are the following:

- (1) there exists such that for all with (zero mean) one has
- (2) there exists and a bounded operator such that for all one has where
- (3) by the Gottschalk-Hedlund theorem applied to the homeomorphism of the compact space obtained after blowing up the orbits of the discontinuities of and (and by keeping track of what happens once one undoes the blowup), one can use the previous steps to write for some *continuous* function .

In fact, these three steps were performed in this paper of Marmi-Moussa-Yoccoz from 2012 (where they are explained in detail).

Remark 14. As it turns out, the “non-optimal loss of derivatives” in Marmi-Yoccoz is already present here: in fact, it seems that for an optimal loss of derivatives for the solutions of the cohomological equation in the setting of Marmi-Yoccoz one has to improve the information on the constant appearing in the first step above.

At this stage, it is clear that the proof of Theorem 4 is reduced to showing that the function appearing in Step 3 above is Hölder continuous. Here, the main *novelty* introduced by Marmi-Yoccoz is the following fourth step:

- (4) Given an interval , denote by , for each , and

Then, there exists such that one has the following “almost recurrence relation”

for the sequence of vectors .

At this point, Jean-Christophe was running out of time so that he decided to skip the proof of this almost recurrence relation (based, of course, on the definition of restricted Roth type) in order to explain how this new step allows one to conclude Theorem 4.

First, one uses the fact that is continuous to check that (this is not difficult since we are not asking for moduli of continuity/rate of convergence).

By combining this information with Step 4 above, they can show that there exists such that

From this inequality, it is not difficult to deduce a Hölder modulus of continuity for by “interpolation” of the information at the extremities of .

Finally, the proof of the estimate (1) itself consists of three steps.

One constructs first a vector where is the intersection of with the kernel of and is a natural *Kontsevich-Zorich cocycle invariant* symplectic form on (related to the intersection form on ).

After this, Marmi-Yoccoz introduce the vector .

Then, Marmi-Yoccoz use the coherence property (c) in the definition of restricted Roth type (among several other facts) to show the analog of (1) for . Moreover, they can check that this estimate on can be transferred to . Furthermore, by exploiting the symplecticity of the Kontsevich-Zorich cocycle (among several other facts), Marmi-Yoccoz show that the analog of (1) for the vectors implies the estimate (1) for the vectors , so that the argument is complete.


Given the very interesting program of this conference, it was not surprising that Amphithéâtre Hermite (where the talks were delivered) was always full.

Today, we will discuss one of the talks of this conference, namely, the talk “On the continuity of Lyapunov spectrum for random products” of Alex Eskin about his joint work (in preparation) with Artur Avila and Marcelo Viana.

As usual, all mistakes/errors in this post are entirely my responsibility.

Remark 1. A video of a talk of Artur Avila on the same subject can be found here.

Update [February 11, 2014]: Last Friday, I was lucky enough to get some extra explanations concerning “costs of couplings” directly from Alex. At the end of this post (see the “Epilogue”), I will try to briefly summarize what I could understand from this conversation.

**1. Introduction **

Let be a probability measure on , e.g., where is a (non-trivial) probability vector (i.e., and for all ) and are Dirac masses at .

Consider the random walk on induced by , i.e., let , , and, for each , , put

Remark 2. Of course, the intuition here is that the samples , , are describing a random walk on whenever we perform a random choice of with respect to (or, equivalently, random choices of ‘s with probability distribution ).

In this context, the Oseledets multiplicative ergodic theorem says that:

Theorem 1 (Oseledets) For -almost every , one has

where is a symmetric matrix with eigenvalues . (Here, is the transpose matrix of , and denotes the non-negative symmetric matrix such that .)

The numbers are called Lyapunov exponents.

Geometrically, Oseledets theorem says that the random walk almost surely tracks a geodesic of speed of the symmetric space (where is a maximal compact subgroup of ).

Remark 3. The top Lyapunov exponent can be recovered by the formula

for -a.e. , and the remaining Lyapunov exponents can be recovered by the following standard trick/observation: the top Lyapunov exponent of the action of on the -th exterior power is . For this reason, it is often (but not always!) the case that the results about the top Lyapunov exponent also provide information about all Lyapunov exponents.
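The norm-growth formula of Remark 3 can be tried numerically. The following is a minimal sketch for 2x2 matrices, assuming the usual reading of the formula (the exponential growth rate of the norm of a long random product applied to a generic vector); the function name `top_lyapunov`, the starting vector and the sample matrices are our own illustrative choices, not part of the talk.

```python
import math
import random

def top_lyapunov(matrices, probs, n_steps=20000, seed=0):
    """Estimate the top Lyapunov exponent of an i.i.d. product of 2x2 matrices.

    matrices: list of 2x2 matrices given as ((a, b), (c, d))
    probs:    probability vector of the same length (assumed to sum to 1)
    """
    rng = random.Random(seed)
    # Generic unit starting vector; it is renormalized at every step so that
    # the running product never overflows.
    v = (math.cos(1.0), math.sin(1.0))
    log_growth = 0.0
    for _ in range(n_steps):
        (a, b), (c, d) = rng.choices(matrices, weights=probs, k=1)[0]
        w = (a * v[0] + b * v[1], c * v[0] + d * v[1])
        norm = math.hypot(w[0], w[1])
        log_growth += math.log(norm)  # accumulate the log of the stretching
        v = (w[0] / norm, w[1] / norm)
    return log_growth / n_steps

# A single hyperbolic matrix: the exponent is log 2.
A = ((2.0, 0.0), (0.0, 0.5))
# Its inverse: mixing A and B with equal weights gives exponent 0.
B = ((0.5, 0.0), (0.0, 2.0))
estimate_single = top_lyapunov([A], [1.0], n_steps=2000)
estimate_mixed = top_lyapunov([A, B], [0.5, 0.5])
```

The renormalization at each step is only a numerical device: the sum of the logarithms of the successive norms equals the logarithm of the norm of the full product applied to the starting vector.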

Historically, the first results about the Lyapunov exponents of random products concerned their multiplicities for a *fixed* probability distribution . A prototypical theorem in this direction is the following result of Guivarch-Raugi and Goldsheid-Margulis providing sufficient conditions for the simplicity (multiplicity ) of Lyapunov exponents.

Definition 2. We say that is *not* strongly irreducible whenever there exists a finite collection of subspaces of such that

for all .

Definition 3. We say that is proximal if there exists such that has distinct eigenvalues. (Here, is the Zariski closure of the group generated by .)

Remark 4. If is Zariski dense in , then is strongly irreducible and proximal.

Theorem 4 (Guivarch-Raugi, Goldsheid-Margulis)

- 1) If is strongly irreducible and proximal, then (i.e., the top Lyapunov exponent is simple/has multiplicity );
- 2) If is Zariski dense in , then .

**2. Statement of the main result**

In their work, Avila, Eskin and Viana consider how the Lyapunov exponents change when the probability distribution *varies*. Among the results that they will prove in their forthcoming article is:

Theorem 5 (Avila-Eskin-Viana) Suppose is a *fixed* probability vector, and consider the probability measures

whose support *varies*. Then, for each , the Lyapunov exponent is a continuous function of .

Remark 5. This statement looks innocent, but it is known that Lyapunov exponents do not vary continuously (only upper semi-continuously) “in general”. See, e.g., this article of Bochi (and the references therein) for more details.

**3. Previous works and related results**

The theorem of Avila-Eskin-Viana generalizes to any dimension the work of Bocker-Neto and Viana in dimension :

Theorem 6 (Bocker-Neto-Viana) For a fixed probability vector , the two Lyapunov exponents of

depend continuously on .

On the other hand, if one decides to fix the support and to vary the vector of probabilities, then Peres showed in 1991 that:

Theorem 7 (Peres) Let us fix the support . Then, the simple Lyapunov exponents of

are locally real-analytic functions of . More precisely, given and a probability vector such that the th Lyapunov exponent is simple (i.e., multiplicity ), then the th Lyapunov exponent is a real-analytic function of near .

The formula for -a.e. for the top Lyapunov exponent is not very useful to study how Lyapunov exponents vary with because the notion of “-a.e. ” changes radically with .

A slightly more useful formula was found by Furstenberg:

where and is a -stationary measure (i.e., is invariant in average with respect to , that is, ) on the projective space of lines in .
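For orientation, Furstenberg's formula and the stationarity condition are commonly written as below. The symbols $\mu$ (the distribution on the group $G$), $\nu$ (a suitable $\mu$-stationary measure), $d$ (the dimension) and $\lambda_1$ (the top exponent) are our notational assumptions and may differ from the ones used in the talk.

```latex
\lambda_1(\mu) \;=\; \int_{G} \int_{\mathbb{P}(\mathbb{R}^d)}
   \log \frac{\| g v \|}{\| v \|} \, d\nu([v]) \, d\mu(g),
\qquad
\nu \;=\; \int_{G} g_{*}\nu \, d\mu(g).
```

The integrand depends continuously on the matrix $g$ and on the line $[v]$, which is why the whole difficulty is concentrated in the dependence of $\nu$ on $\mu$.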

Of course, the cocycle depends nicely on and , but the dependence on of the stationary measure in Furstenberg’s formula is not obvious to determine. In particular, one needs to “feed” Furstenberg’s formula with extra information in order to deduce continuity of the top Lyapunov exponent in a given setting.

For example, if one feeds the following remark

Remark 6. If is strongly irreducible and proximal, then the stationary measure on is unique.

to Furstenberg’s formula, then one can deduce:

Proposition 8. Suppose (in the weak-* topology), is proximal and strongly irreducible. Then, .

*Proof:* Denote by the sequence of stationary measures associated to in Furstenberg’s formula. It is not hard to check that any accumulation point of the sequence is -stationary. By the previous remark, has a unique stationary measure on , so that any accumulation point of coincides with the stationary measure in Furstenberg’s formula for . In other words, , and the desired proposition now follows immediately from Furstenberg’s formula.

Remark 7. Le Page showed that the conclusion of the previous proposition can be improved from continuity to real-analyticity. However, in general (without strong irreducibility and proximality of ), one cannot expect anything better than Hölder continuity.

**4. Some ideas of the proof of Avila-Eskin-Viana theorem**

Let us simplify the exposition by considering the following toy case: we are given two sequences of matrices

and

and we want to show that the top Lyapunov exponents of the probabilities

converge to

The projective actions of the matrices and on the projective circle are of “north pole–south pole type”: there are two fixed points and corresponding to the directions of the coordinate axes and of and the points of are either attracted or repelled towards and under the actions of and . In particular, one can infer from this that an arbitrary -stationary measure on has the form

with .
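The “north pole–south pole” picture is easy to observe numerically. Here is a minimal sketch, assuming a hypothetical hyperbolic matrix `A = diag(2, 1/2)` (not one of the matrices of the toy case, whose entries are not reproduced here) and parametrizing lines in the plane by their angle modulo pi.

```python
import math

def projective_action(matrix, theta):
    """Angle (mod pi) of the image under `matrix` of the line at angle theta."""
    (a, b), (c, d) = matrix
    x, y = math.cos(theta), math.sin(theta)
    return math.atan2(c * x + d * y, a * x + b * y) % math.pi

def orbit(matrix, theta, n):
    """First n+1 points of the forward orbit of a line under the projective action."""
    points = [theta % math.pi]
    for _ in range(n):
        points.append(projective_action(matrix, points[-1]))
    return points

# Hypothetical example: the horizontal axis (angle 0) attracts every line
# except the vertical axis (angle pi/2), which is a repelling fixed point.
A = ((2.0, 0.0), (0.0, 0.5))
trajectory = orbit(A, 1.0, 30)
vertical_image = projective_action(A, math.pi / 2)
```

Under these assumptions the tangent of the angle is divided by 4 at each step, so the orbit of any line other than the vertical axis converges exponentially fast to the horizontal axis.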

Therefore, if we denote by the -stationary measures coming from Furstenberg’s formula, then

and our goal is to show that . However, this is not as easy as it seems (in the sense that naive methods don’t work well), and one has to look for appropriate tools.

In this direction, the notion of *Margulis function* comes in handy. Given a probability measure on a group acting on a space , let

be the Markov operator associated to . We say that is a Margulis function if:

- 1)
- 2) on a “negligible set”
- 3) there are constants and such that , i.e., when is large at a point (a step of a -random walk approaches ), the values of at the -images of this point decrease on average (the next step of a -random walk tends to move away from ).
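Item 3) can be illustrated on a toy Markov chain. The sketch below assumes that the inequality has the usual form $A_\mu f \leq c f + b$ with $0 < c < 1$ and $b > 0$ (the displayed inequality is not reproduced above, so this form is an assumption), and it uses a hypothetical random walk on the non-negative integers rather than a group action. It only illustrates the averaged contraction, not items 1) and 2).

```python
def markov_operator(f, maps, probs):
    """Return A f, where (A f)(x) = sum_i probs[i] * f(maps[i](x))."""
    return lambda x: sum(p * f(g(x)) for g, p in zip(maps, probs))

# Hypothetical walk: step down with probability 3/4, step up with probability 1/4.
maps = [lambda x: max(x - 1, 0), lambda x: x + 1]
probs = [0.75, 0.25]

# Candidate function: large far away from the origin.
f = lambda x: 2.0 ** x
Af = markov_operator(f, maps, probs)

# For x >= 1 one gets (A f)(x) = 0.875 * f(x); at x = 0 one needs the constant b.
c, b = 0.875, 0.375
drift_holds = all(Af(x) <= c * f(x) + b for x in range(60))
```

The point of the example is that the inequality is an *averaged* statement: the up-step doubles the value of `f`, but it is chosen rarely enough for the average to contract.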

Coming back to the toy case, it is possible to show that for the function given by

is a Margulis function for .

This type of information is useful to show simplicity of the Lyapunov exponents of , but it does not help us to show the continuity statement or . In fact, the difficulty comes from the fact that is not a Margulis function of because the south pole of is changing location (even though they are close to ), so that a *single* Margulis function is not capable of assigning the value to all of the south poles of without being trivial.

Here, one can try to overcome the technical obstacle of the moving south poles of by considering the diagonal action of on and by introducing the function

for and close to . As it turns out, this function is a good candidate of Margulis function for in the sense that the inequality in item 3) involving the Markov operator is satisfied *near* , and it seems that we are making some progress.

Unfortunately, we made no progress at all with the idea in the previous paragraph: indeed, the technology of Margulis functions requires *globally* defined functions and so far we were able only to exhibit *locally* defined functions (in a neighborhood of ).

At this point, the basic idea of Avila-Eskin-Viana is the introduction of *measure-theoretical* analogs of Margulis functions. In other terms, they want to replace “functions” by “measures” to get objects that are slightly more flexible but still capable of doing the same job as Margulis functions.

The measure-theoretical analog of Margulis functions are called *couplings* with finite *costs*. Concretely, we say that a probability measure on is a coupling of to itself if the projection of to both factors is . Given a coupling of to itself, we define its cost as:

where is an adequate small neighborhood of .
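The definition of a coupling of a measure with itself can be made concrete on a finite set. The following is a minimal sketch in which a coupling is stored as a square matrix of weights; the helper names are ours, and the cost functional is deliberately left out because its formula is not reproduced above.

```python
def marginals(coupling):
    """Row and column marginals of a joint distribution given as a square matrix."""
    n = len(coupling)
    first = [sum(coupling[i][j] for j in range(n)) for i in range(n)]
    second = [sum(coupling[i][j] for i in range(n)) for j in range(n)]
    return first, second

def is_self_coupling(coupling, nu, tol=1e-12):
    """Check that both projections of `coupling` are equal to the measure `nu`."""
    first, second = marginals(coupling)
    return all(abs(a - p) <= tol and abs(b - p) <= tol
               for a, b, p in zip(first, second, nu))

def product_coupling(nu):
    """Independent coupling: the product measure nu x nu."""
    return [[p * q for q in nu] for p in nu]

def diagonal_coupling(nu):
    """Coupling carried by the diagonal: each point is paired with itself."""
    n = len(nu)
    return [[nu[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

nu = [0.5, 0.25, 0.25]
not_a_coupling = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
```

Both the product and the diagonal couplings have the right marginals, which shows that a measure has many couplings with itself; the cost is what singles out the useful ones.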

In this setting, we can see that the task of showing is reduced to finding a large constant and a sequence of couplings of to itself such that

for all . Indeed, this is so because the cost of coupling to itself is and thus has finite cost only when .

At this point, Alex Eskin’s time was essentially up, and he concluded by saying that the main point is that finding couplings with finite costs is *easier* than building globally defined Margulis functions, and the desired couplings with uniformly bounded costs could be found by analyzing the analog of item 3) in the definition of Margulis functions for couplings of to itself with *optimal* (minimal possible) costs.

**5. Epilogue**

Let us try to explain in more detail the discussion in the previous six paragraphs (following my conversation with Alex Eskin [or what I can remember of it...]).

We start by selecting the small neighborhood of so that the limit stationary measure gives mass

to .

Then, we restrict our measures to and we *change* the dynamics so that these restrictions are stationary: formally, we replace the Markov operator by an adequate “local transfer operator” such that is -stationary.

In these terms, the “local version”

of the “usual candidate to Margulis function” seems to be a Margulis function at first sight, but unfortunately it does not satisfy item 3). Indeed, the *pointwise* estimates of the form

with and do not hold *always* because there are *some* couples of points that are pushed *together* towards despite the fact that the *probability* of this event is *small*.

For this reason, Avila-Eskin-Viana replace “functions” by “measures” with the idea that this probabilistic tendency felt by most couples of getting away from is better expressed as estimates for measures than pointwise estimates for functions.

More concretely, by selecting an appropriate subinterval , one can see that the -measure of the set

of elements of pushing a point towards is . From this information, it is not difficult to construct some measures on such that projects to on both factors and . From the measures , one obtains some couplings with finite costs.

However, this is not quite the end of the story: we need couplings whose costs are *uniformly bounded* for all . Here, the trick is to study couplings with *optimal* costs (i.e. with smallest possible costs). In fact, by applying the “dynamics” to , one has the following analogue of item 3) in the definition of Margulis functions:

for some universal constants and (thanks to the probabilistic tendency of most couples of points to get pushed away from ). On the other hand, since has optimal (smallest) cost, we conclude that

that is,

In other terms, the analog for measures of item 3) in the definition of Margulis functions allows one to check that the costs of the sequence of optimal-cost couplings are uniformly bounded by , as desired.
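The arithmetic behind this last step is elementary. The sketch below assumes that the inequalities have the form $x \leq c\,x + b$ with universal constants $0 \leq c < 1$ and $b \geq 0$ (the displayed formulas are not reproduced above, so this is our reading of them); the function names are hypothetical.

```python
def fixed_point_bound(c, b):
    """Largest x >= 0 compatible with x <= c*x + b when 0 <= c < 1."""
    if not 0 <= c < 1:
        raise ValueError("need 0 <= c < 1")
    return b / (1 - c)

def iterate_affine_bound(c, b, x0, n):
    """Iterate x -> c*x + b starting from x0 and return the n+1 values."""
    values = [x0]
    for _ in range(n):
        values.append(c * values[-1] + b)
    return values

bound = fixed_point_bound(0.5, 1.0)
values = iterate_affine_bound(0.5, 1.0, 10.0, 50)
```

A finite quantity that cannot be decreased (an optimal cost) and that satisfies $x \leq c\,x + b$ must therefore satisfy $x \leq b/(1-c)$, and this bound does not depend on the index of the coupling.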


Theorem 1 (Burns-Masur-Wilkinson) Let be the quotient of a contractible, negatively curved, possibly incomplete, Riemannian manifold by a subgroup of isometries of acting freely and properly discontinuously. Denote by the metric completion of and the boundary of . Suppose that:

- (I) the universal cover of is *geodesically convex*, i.e., for every , there exists a unique geodesic segment in connecting and .
- (II) the metric completion of is *compact*.
- (III) the boundary is *volumetrically cusplike*, i.e., for some constants and , the volume of a -neighborhood of the boundary satisfies for every .
- (IV) has *polynomially controlled curvature*, i.e., there are constants and such that the curvature tensor of and its first two derivatives satisfy the following polynomial bound for every .
- (V) has *polynomially controlled injectivity radius*, i.e., there are constants and such that for every (where denotes the injectivity radius at ).
- (VI) the *first derivative of the geodesic flow* is *polynomially controlled*, i.e., there are constants and such that, for every infinite geodesic on and every :

Then, the Liouville (volume) measure of is finite, the geodesic flow on the unit cotangent bundle of is defined at -almost every point for all time , and the geodesic flow is *non-uniformly hyperbolic* (in the sense of Pesin’s theory) and *ergodic*.

Actually, the geodesic flow is Bernoulli and, furthermore, its metric entropy is positive, finite and is given by Pesin’s entropy formula (i.e., it is equal to the sum of positive Lyapunov exponents of counted with multiplicities).

However, since the second post of this series was dedicated to the discussion of items (I), (II) and (III) above for the Weil-Petersson (WP) metric, we think it is natural that this third post provides a discussion of items (IV), (V) and (VI) for the Weil-Petersson metric (thus completing the proof of Burns-Masur-Wilkinson theorem of ergodicity of the Weil-Petersson geodesic flow modulo the proof of their ergodicity criterion).

For this reason, we will continue the discussion of the geometry of the Weil-Petersson metric in this post while leaving the proof of Burns-Masur-Wilkinson ergodicity criterion for the next two posts of this series.

The organization of today’s post is very simple: it is divided into three sections where the items (IV), (V) and (VI) for the Weil-Petersson metric are discussed.

**1. The curvatures of the Weil-Petersson metric**

Item (IV) of the Burns-Masur-Wilkinson ergodicity criterion (Theorem 1) asks for polynomial bounds on the sectional curvatures and their first two derivatives.

In the context of the Weil-Petersson (WP) metric, the desired polynomial bounds on the sectional curvatures follow from the work of Wolpert.

**1.1. Wolpert’s formulas for the curvatures of the WP metric**

This subsection gives a *compte rendu* of some estimates of Wolpert for the behavior of the WP metric near the boundary of the Teichmüller space .

Before stating Wolpert’s formulas, we need an *adapted* system of coordinates (called *combined length basis* in the literature) near the strata , , of , where is the curve complex of (introduced in the previous post).

Denote by the set of pairs (“basis”) where is a simplex of the curve complex and is a collection of simple closed curves such that each is disjoint from all . Here, we allow that two curves *intersect* (i.e., one might have ) and also the case is *not* excluded.

Following the nomenclature introduced by Wolpert, we say that is a *combined length basis* at a point whenever the set of tangent vectors

is a basis of , where is the length parameter in the Fenchel-Nielsen coordinates and .

Remark 1. The length parameters and their square-roots are natural for the study of the WP metric: for instance, Wolpert showed that these functions are convex along WP geodesics (see, e.g., these papers of Wolpert and this paper of Wolf).

The name *combined length basis* comes from the fact that we think of as a combination of a collection of *short* curves (indicating the boundary stratum that one is close to), and a collection of *relative* curves to allowing one to complete the set into a basis of the tangent space to in which one can write nice formulas for the WP metric.

This notion can be “extended” to a stratum of as follows. We say that is a *relative basis* at a point whenever and the length parameters form a *local* system of coordinates for near .

Remark 2. The stratum is (isomorphic to) a product of the Teichmüller spaces of the pieces of . In particular, carries a “WP metric”, namely, the product of the WP metrics on the Teichmüller spaces of the pieces of . In this setting, is a relative basis at if and only if is a basis of .

Remark 3. Contrary to the Fenchel-Nielsen coordinates, the length parameters associated to a relative basis might not be a *global* system of coordinates for . Indeed, this is so because we allow the curves in to intersect non-trivially: geometrically, this means that there are points in where the geodesic representatives of such curves meet orthogonally, and, at such points , the system of coordinates induced by meets a singularity.

The relevance of the concept of combined length basis to the study of the WP metric is explained by the following theorem of Wolpert:

Theorem 2 (Wolpert) For any point , , there exists a relative length basis . Furthermore, the WP metric can be written as

where the implied comparison constant is uniform in a neighborhood of . In particular, there exists a neighborhood of such that is a combined length basis at any .

The statement above is just the beginning of a series of formulas of Wolpert for the WP metric and its sectional curvatures written in terms of the local system of coordinates induced by a combined length basis .

In order to write down the next list of formulas of Wolpert, we need the following notations. Given an arbitrary collection of simple closed curves on , we define

where . Also, given a constant and a basis , we will consider the following (Bers) region of Teichmüller space:

Wolpert provides several estimates for the WP metric and its sectional curvatures in terms of the basis , and , , which are uniform on the regions .

Theorem 3 (Wolpert) Fix . Then, for any , and any and , the following estimates hold uniformly on

- where is the Kronecker delta.
- and, furthermore, extends continuously to the boundary stratum .
- the distance from to the boundary stratum is
- for any vector ,
- and
- extends continuously to the boundary stratum
- the sectional curvature of the complex line (real two-plane) is
- for any quadruple , distinct from a curvature-preserving permutation of , one has
and, moreover, each of the form or introduces a multiplicative factor in the estimate above.

These estimates of Wolpert give a very good understanding of the geometry of the WP metric in terms of combined length bases. For instance, one infers from the last two items above that the sectional curvatures of the WP metric along the complex lines converge to with speed as one approaches the boundary stratum , while the sectional curvatures of the WP metric associated to quadruples of the form with and converge to with speed *at least*.

In particular, these formulas of Wolpert allow one to show “1/3 of item (IV)” for the WP metric, that is,

for all .

Remark 4. Observe that the formulas of Wolpert provide *asymmetric* information on the sectional curvatures of the WP metric: indeed, while we have precise estimates on how these sectional curvatures can approach , the same is not true for the sectional curvatures approaching zero (where one has lower bounds but no upper bounds for the speed of convergence).

Remark 5. From the discussion above, we see that there are sectional curvatures of the WP metric on approaching zero whenever contains two distinct curves. In other words, the WP metric has sectional curvatures approaching zero whenever the genus and the number of punctures of satisfy , i.e., except in the cases of once-punctured tori and four-times punctured spheres . This qualitative difference in the geometry of the WP metric on in the cases and (i.e., or ) will be important in the last post of this series when we discuss the rates of mixing of the WP geodesic flow.

Remark 6. As Wolpert points out in this paper, these estimates permit us to think of the WP metric on the moduli space in a -neighborhood of the cusp at infinity as a -perturbation of the metric of the surface of revolution of the profile modulo multiplicative factors of the form .

Now, we will investigate the remaining “2/3 of item (IV)” for the WP metric, i.e., polynomial bounds for the first two derivatives and of the curvature operator of the WP metric.

**1.2. Bounds for the first two derivatives of WP metric**

As it was *recently* pointed out to us by Wolpert (in a private communication), it is possible to deduce very good bounds for the derivatives of the WP metric (and its curvature tensor) by refining the formulas for the WP metric in some of his works.

Nevertheless, by the time the article of Burns, Masur and Wilkinson was written, it was not clear at all that the delicate calculations of Wolpert for the WP metric could be extended to provide useful information about the derivatives of this metric.

For this reason, Burns, Masur and Wilkinson decided to implement the following alternative strategy.

At first sight, our task is reminiscent of the setting of Cauchy’s inequality in Complex Analysis, where one estimates the derivatives of a holomorphic function in terms of given bounds for the -norm of this function via the Cauchy integral formula. In fact, our current goal is to estimate the first two derivatives of a “function” (actually, the curvature tensor of the WP metric) defined on the complex-analytic manifold knowing that this “function” already has nice bounds (cf. the previous subsection).
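As a reminder of how Cauchy’s inequalities work in one complex variable, here is a minimal numerical sketch. It evaluates the Cauchy integral formula for the $k$-th derivative with the trapezoid rule on a circle and compares the result with the bound $k!\,\sup|f|/r^k$; the function names and the test function (the exponential) are our own choices.

```python
import cmath
import math

def cauchy_derivative(f, z0, r, k, n_points=2048):
    """k-th derivative of f at z0 via the Cauchy integral formula on |z - z0| = r.

    f^{(k)}(z0) = k!/(2 pi i) * (contour integral of f(z)/(z - z0)^{k+1} dz),
    which becomes k! times the average of f(z0 + w)/w^k over the circle |w| = r.
    """
    total = 0j
    for j in range(n_points):
        w = r * cmath.exp(2j * math.pi * j / n_points)
        total += f(z0 + w) / w ** k
    return math.factorial(k) * total / n_points

def cauchy_bound(f, z0, r, k, n_points=2048):
    """Right-hand side k! * sup|f| / r^k of Cauchy's inequality (sup sampled on the circle)."""
    sup = max(abs(f(z0 + r * cmath.exp(2j * math.pi * j / n_points)))
              for j in range(n_points))
    return math.factorial(k) * sup / r ** k

second_derivative = cauchy_derivative(cmath.exp, 0.0, 1.0, 2)
bound = cauchy_bound(cmath.exp, 0.0, 1.0, 2)
```

The estimate only uses the size of the function on a circle around the point, which is exactly the kind of information provided by Wolpert’s bounds, provided that the function is holomorphic.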

However, one can *not* apply the argument described in the previous paragraph *directly* to the curvature tensor of the WP metric because this metric is *only* a real-analytic (but *not* a complex-analytic/holomorphic) object on the complex-analytic manifold .

Fortunately, Burns, Masur and Wilkinson observed that this idea of using the Cauchy inequalities could still work *after* one adds some results of McMullen into the picture. In a nutshell, McMullen showed that the WP metric is closely related to a *holomorphic* object: very roughly speaking, using the so-called Bers simultaneous uniformization theorem, one can think of the Teichmüller space as a *totally real* submanifold of the so-called quasi-Fuchsian locus , and, in this setting, the Weil-Petersson symplectic -form is the restriction to of the differential of a *holomorphic* -form globally defined on the quasi-Fuchsian locus . In particular, it is possible to apply Cauchy’s inequalities to the holomorphic object to get some estimates for the first two derivatives of the WP metric.

Remark 7. A *caricature* of the previous paragraph is the following. We want to estimate the first two derivatives of a real-analytic function (“WP metric”) knowing some bounds for the values of . In principle, we cannot do this by simply applying Cauchy’s estimates to , but in our context we know (“by the results of McMullen”) that the natural embedding of as a totally real submanifold of allows us to think of as the restriction of a holomorphic function and, thus, we can apply Cauchy inequalities to to get some estimates for .
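A standard one-variable example (ours, not taken from the article of Burns, Masur and Wilkinson) shows why bounds on the real locus alone are not enough, and why the size of the holomorphic extension is the relevant quantity:

```latex
f_N(x) = \sin(Nx), \qquad
\sup_{x \in \mathbb{R}} |f_N(x)| = 1, \qquad
f_N'(0) = N, \qquad
\sup_{|z| \le r} |\sin(Nz)| \;\ge\; |\sin(iNr)| = \sinh(Nr).
```

The functions $f_N$ are uniformly bounded on the real line while their derivatives at the origin are unbounded, so no Cauchy-type estimate can hold for real-analytic functions with real bounds only. The derivative is controlled only after one bounds the holomorphic extension $\sin(Nz)$ on a complex disc, where it has size at least $\sinh(Nr)$.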

In what follows, we will explain the “Cauchy inequality” idea of Burns, Masur and Wilkinson in two steps. Firstly, we will describe the embedding of into the quasi-Fuchsian locus and the holomorphic -form of McMullen whose differential restricts to the WP symplectic -form on . After that, we will show how the Cauchy inequalities can be used to give the remaining “2/3 of item (IV)” for the WP metric.

**1.2.1. Quasi-Fuchsian locus and McMullen’s -forms**

Given a hyperbolic Riemann surface , , the *quasi-Fuchsian locus* is defined as

where is the *conjugate* Riemann surface of , i.e., is the quotient of the *lower-half plane* by . The *Fuchsian locus* is the image of under the *anti-diagonal* embedding

Geometrically, we can think of elements as follows. Recall that and are related to and via (extremal) quasiconformal mappings determined by the solutions of Beltrami equations associated to -invariant Beltrami differentials (coefficients) and on and . Now, we observe that and live naturally on the Riemann sphere . Since the real axis/circle at infinity/equator has zero Lebesgue measure, we see that and induce a Beltrami differential on . By solving the corresponding Beltrami equation, we obtain a quasiconformal map on and, by conjugating, we obtain a quasi-Fuchsian subgroup

i.e., a Kleinian subgroup whose domain of discontinuity consists of two connected components and such that and .

The following picture summarizes the discussion of the previous paragraph:

Remark 8. The Jordan curve given by the image of the equator under the quasiconformal map is “wild” in general, e.g., it has Hausdorff dimension (as the picture above tries to represent). In fact, this happens because a typical quasiconformal map is merely Hölder continuous, and, hence, it might send “nice” curves (such as the equator) into curves with “intricate geometries” (see, e.g., the three external links of the Wikipedia article on quasi-Fuchsian groups).

The data of the quasi-Fuchsian subgroup attached to permits us to assign (marked) *projective structures* to and . More precisely, by writing and with and , we are equipping and with projective structures, that is, atlases of charts to whose changes of coordinates are Möbius transformations (i.e., elements of ). Furthermore, by recalling that and come with markings and (because they are points in Teichmüller spaces), we see that the projective structures above are marked.

In summary, we have a natural *quasi-Fuchsian uniformization* map

assigning to the marked projective structures

Here, is the “Teichmüller space of projective structures” on , i.e., the space of “Teichmüller” equivalence classes of marked projective structures where two marked projective structures and are “Teichmüller” equivalent whenever there is a projective isomorphism homotopic to .

Remark 9. The procedure (due to Bers) of attaching a quasi-Fuchsian subgroup to a pair of hyperbolic surfaces and is called Bers simultaneous uniformization because the knowledge of allows one to equip *at the same time* and with natural projective structures.

Note that is a section of the natural projection

obtained by sending each pair of (marked) projective structures , , , to the unique pair of (marked) compatible conformal structures , , .

We will now describe how the (affine) structure of the fibers of the projection and the section can be used to construct McMullen’s primitives/potentials of the Weil-Petersson symplectic form .

Given two projective structures in the same fiber of the projection , one can measure how far apart from each other are and using the so-called Schwarzian derivative.

More precisely, the fact that and induce the same conformal structure means that the charts of atlases associated to them can be thought of as some families of maps and from (small) open subsets to the Riemann sphere , and we can measure the “difference” by computing how “far” from a Möbius transformation (element of ) is .

Here, given a point , one observes that there exists a *unique* Möbius transformation such that and *coincide* at up to *second order* (i.e., and have the same value and the same first and second derivatives at ). Hence, it is natural to measure how far from a Möbius transformation is by understanding the difference between the *third derivatives* of and at , i.e., .

Actually, this is *almost* the definition of the *Schwarzian derivative*: since the derivatives of and map to , in order to recover an object from to *itself*, it is a better idea to “correct” with , i.e., we define the Schwarzian derivative of and at as

Here, the factor shows up for historical reasons (that is, this factor makes coincide with the classical definition of Schwarzian derivative in the literature).
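For a single map $f$, the classical Schwarzian derivative mentioned here is $S f = f'''/f' - \tfrac{3}{2}\,(f''/f')^2$. The following sketch (with hypothetical helper names, and with the derivatives entered by hand rather than computed symbolically) checks the two standard facts that it vanishes on Möbius transformations and that it equals $-6/(1-z^2)^2$ for the Koebe function $z/(1-z)^2$.

```python
def schwarzian(d1, d2, d3):
    """Classical Schwarzian S f = f'''/f' - (3/2) (f''/f')^2 computed from the
    values d1, d2, d3 of the first three derivatives of f at a point (d1 != 0)."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

def mobius_derivatives(a, b, c, d, z):
    """First three derivatives of f(z) = (a z + b)/(c z + d) at z."""
    det = a * d - b * c
    w = c * z + d
    return det / w**2, -2 * c * det / w**3, 6 * c**2 * det / w**4

def koebe_derivatives(z):
    """First three derivatives of the Koebe function k(z) = z/(1 - z)^2 at z."""
    return ((1 + z) / (1 - z)**3,
            (4 + 2 * z) / (1 - z)**4,
            (18 + 6 * z) / (1 - z)**5)

s_mobius = schwarzian(*mobius_derivatives(2, 1, 1, 3, 0.3 + 0.2j))
s_koebe = schwarzian(*koebe_derivatives(0.5))
```

The vanishing on Möbius transformations is the one-map counterpart of the statement that two charts differing by a Möbius transformation define the same projective structure.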

By definition, the Schwarzian derivative is a field of quadratic forms on (since its definition involves taking third order derivatives). In other terms, is a *quadratic differential* on , that is, the “difference” between two projective structures in the same fiber of the projection is given by a quadratic differential . In particular, the fibers are affine spaces modeled by the space of quadratic differentials on .

Remark 10. The reader will find more explanations about the Schwarzian derivative in Section 6.3 of Hubbard’s book.

Remark 11. The idea of “measuring” the distance between projective structures (inducing the same conformal structure) by computing how far they are from Möbius transformations via the Schwarzian derivative is close in some sense to the idea of measuring the distance between two points in Teichmüller space by computing the eccentricities of quasiconformal maps between these points.

Using this affine structure on and the fact that is the cotangent space of at , we see that, for each , the map

defines a (holomorphic) -form on . Note that, by letting vary and by fixing , we have a map given by

Since (so that ) and , we can think of as a (holomorphic) -form on .

For later use, let us notice that the -form is *bounded* with respect to the Teichmüller metric on . Indeed, this is a consequence of *Nehari’s bound* stating that if is a round disc (i.e., the image of the unit disc under a Möbius transformation) equipped with its hyperbolic metric and is an injective complex-analytic map, then

In this setting, McMullen constructed primitives/potentials for the WP symplectic form as follows. The Teichmüller space sits in the quasi-Fuchsian locus as the *Fuchsian locus* where is the *anti-diagonal* embedding

By pulling back the -form under , we obtain a bounded -form

Remark 12. This form is closely related to a classical object in Teichmüller theory called the *Bers embedding*: in our notation, the Bers embedding is

McMullen showed that the bounded -forms are primitives/potentials of the WP symplectic -form , i.e.,

See also Section 7.7 of Hubbard’s book for a nice exposition of this theorem of McMullen. Equivalently, the restriction of the holomorphic -form to the Fuchsian locus (a *totally real* sublocus of ) permits to construct (Teichmüller bounded) primitives for the WP symplectic form on .

At this point, we are ready to implement the “Cauchy estimate” idea of Burns-Masur-Wilkinson to deduce bounds for the first two derivatives of the curvature operator of the WP metric.

**1.2.2. “Cauchy estimate” of after Burns-Masur-Wilkinson**

Following Burns-Masur-Wilkinson, we will need the following local coordinates in :

Proposition 4. There exists a universal constant such that, for any , one has a holomorphic embedding

of the Euclidean unit polydisc (where ) sending to and satisfying

where is the Teichmüller norm and is the Euclidean norm on .

This result is proven in this paper of McMullen.

Also, since the statement of Proposition 4 involves the Teichmüller norm and we are interested in the Weil-Petersson norm , the following comparison (from Lemma 5.4 of Burns-Masur-Wilkinson paper) between and will be helpful:

Lemma 5. There exists a universal constant such that, for any and any cotangent vector , one has

where is the systole of (i.e., the length of the shortest closed simple hyperbolic geodesic on ). In particular, for any and any tangent vector , one has

*Proof:* Given , let us write , where is “normalized” to contain the element where .

Fix a Dirichlet fundamental domain of the action of centered at the point .

By the *collaring theorem* stating that a closed simple hyperbolic geodesic of length has a collar [tubular neighborhood] of radius isometrically embedded in and two of these collars and are disjoint whenever and are disjoint (see, e.g., Theorem 3.8.3 in Hubbard’s book), we have that the union of isometric copies of contains a ball of fixed (universal) radius around any point .
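To get a feeling for the size of the collars, one can use the standard form of the collar lemma, in which a simple closed geodesic of length $\ell$ has an embedded collar of half-width $\operatorname{arcsinh}(1/\sinh(\ell/2))$. This normalization is an assumption on our side: the formula in the statement quoted from Hubbard’s book is not reproduced above and may be written differently.

```python
import math

def collar_half_width(length):
    """Half-width arcsinh(1 / sinh(length / 2)) of the standard collar around a
    simple closed geodesic of the given hyperbolic length (length > 0)."""
    return math.asinh(1.0 / math.sinh(length / 2.0))

# Short geodesics have wide collars: the half-width grows like log(4 / length).
widths = [collar_half_width(length) for length in (0.01, 0.1, 1.0, 5.0)]
```

In particular, the collars become wider as the systole shrinks, which is consistent with the idea that the thin parts of the surface are long tubes around the short geodesics.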

By combining the Cauchy integral formula with the fact stated in the previous paragraph, we see that

Since the hyperbolic metric is bounded away from on , we can use the -norm estimate on above to deduce that

for some constant . This completes the proof of the lemma.

Remark 13. The factor in the previous lemma can be replaced by via a refinement of the argument above. However, we will not prove this here because this refined estimate is not needed for the proof of the main results of Burns-Masur-Wilkinson.

Using the local coordinates from Proposition 4 (and the comparison between Teichmüller and Weil-Petersson norms in the previous lemma), we are ready to use Cauchy’s inequalities to estimate “‘s” of the WP metric. More concretely, denoting by “centered at some ” in Proposition 4, let , and consider the vector fields

on . In this setting, we denote by the “‘s” of the WP metric in the local coordinate and by the inverse of the matrix .

Proposition 6. There exists a universal constant such that, for any , the pullback of the WP metric by the local coordinate “centered at ” in Proposition 4 satisfies the following estimates:

and

for all , and , .

*Proof:* The first inequality

follows from Proposition 4 and Lemma 5. Indeed, by letting , we see from Proposition 4 and Lemma 5 that

Since

we deduce that

i.e., .

For the proof of the second inequality (the estimates of the -derivatives of the metric coefficients), we begin by “rephrasing” the construction of McMullen’s -form in terms of the local coordinate introduced in Proposition 4.

The composition of the local coordinate with the anti-diagonal embedding of the Teichmüller space in the quasi-Fuchsian locus can be rewritten as

where is the anti-diagonal embedding

and the local coordinate given by

In this setting, the pullback by of the holomorphic -form gives a holomorphic -form on . Moreover, since the Euclidean metric on is comparable to the pullback by of the Teichmüller metric (cf. Proposition 4), is bounded in Teichmüller metric and where , we see that

where and is a holomorphic bounded (in the Euclidean norm) -form on .

Let us write in complex coordinates , where are bounded holomorphic functions. Hence,

and, a fortiori,

Since is the Kähler form of the metric , we see that the coefficients of are linear combinations of the -pullbacks of and . Because are (universally) bounded holomorphic functions, we can use Cauchy’s inequalities to see that the derivatives of are (universally) bounded at any with . It follows from the boundedness of the (*non-holomorphic*) anti-diagonal embedding that the -derivatives of the metric coefficients satisfy the desired bound.
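For reference, the form of Cauchy’s inequalities used in this kind of argument can be recorded as follows. This is the standard statement on a polydisc; the symbols $f$, $M$, $r$, $\alpha$ and the choice of a polydisc centered at the origin are notations assumed for this sketch and are not taken from the text above.

```latex
% Cauchy's inequalities on a polydisc (standard statement; notation assumed here).
% Let f be holomorphic on the polydisc
%   \Delta_r = \{ z \in \mathbb{C}^N : |z_j| < r \text{ for all } j \}
% and suppose |f| \le M on \Delta_r. Then, for every multi-index
% \alpha = (\alpha_1, \dots, \alpha_N),
\[
  \left| \frac{\partial^{|\alpha|} f}
              {\partial z_1^{\alpha_1} \cdots \partial z_N^{\alpha_N}}(0) \right|
  \;\le\; \frac{\alpha!\, M}{r^{|\alpha|}},
  \qquad
  \alpha! = \alpha_1! \cdots \alpha_N!, \quad
  |\alpha| = \alpha_1 + \dots + \alpha_N .
\]
```

In particular, a uniform bound on a holomorphic function on a polydisc of fixed size controls all of its derivatives on a smaller concentric polydisc, with constants depending only on the order of the derivative, the bound and the two radii.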

The estimates in Proposition 6 (controlling the WP metric in the local coordinates constructed in Proposition 4) allow us to deduce the remaining “2/3 of item (IV)” for the WP metric:

Theorem 7 (Burns-Masur-Wilkinson). There are constants and such that, for any , the curvature tensor of the WP metric satisfies

*Proof:* Fix and consider the local coordinate provided by Proposition 4. Since and are uniformly bounded, our task is reduced to estimating the first two derivatives of the curvature tensor of the metric at the origin .

Recall that the Christoffel symbols of are

or

in Einstein summation convention, and, in terms of the Christoffel symbols, the coefficients of the curvature tensor are

Therefore, we see that each coefficient of the -derivative is a polynomial function of and the first partial derivatives whose “degree” in the “variables” is (because of the formula ).
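For concreteness, the standard coordinate formulas behind this kind of degree count can be written out as follows. The names $G_{ij}$ for the metric coefficients and $G^{ij}$ for the entries of the inverse matrix are assumed here, the index placement in the curvature tensor follows one common convention (which may differ from the one intended above by a sign or an ordering of indices), and the last identity is only a guess at the kind of formula invoked in the parenthetical remark.

```latex
% Christoffel symbols of a metric G = (G_{ij}), with the sum over m written out:
\[
  \Gamma^{k}_{ij} = \frac{1}{2} \sum_{m} G^{km}
  \left( \partial_j G_{mi} + \partial_i G_{mj} - \partial_m G_{ij} \right)
\]
% The same formula in Einstein summation convention, with commas for partial derivatives:
\[
  \Gamma^{k}_{ij} = \tfrac{1}{2}\, G^{km}
  \left( G_{mi,j} + G_{mj,i} - G_{ij,m} \right)
\]
% Coefficients of the curvature tensor in terms of the Christoffel symbols:
\[
  R^{l}{}_{ijk} = \partial_j \Gamma^{l}_{ik} - \partial_k \Gamma^{l}_{ij}
  + \Gamma^{l}_{js} \Gamma^{s}_{ik} - \Gamma^{l}_{ks} \Gamma^{s}_{ij}
\]
% Derivative of the inverse matrix, from differentiating G^{ip} G_{pj} = \delta^{i}_{j}:
\[
  \partial_a G^{ij} = - G^{ip} \left( \partial_a G_{pq} \right) G^{qj}
\]
```

The last identity shows that each derivative of an inverse entry is again a polynomial in the inverse entries and the derivatives of the metric coefficients, which is why repeated differentiation of the curvature coefficients stays within polynomial expressions of this type.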

By Proposition 6, each has order and the first partial derivatives of at are bounded by a constant depending only on . It follows that

and, consequently,

This completes the proof.

At this point, Theorems 3 and 7 imply the validity of item (IV) of the Burns-Masur-Wilkinson ergodicity criterion (Theorem 1) for the WP metric.

Remark 14. The estimates for the derivatives of the curvature tensor appearing in the proof of Theorem 7 are *not* sharp with respect to the exponent . For instance, the WP metric on the moduli space of once-punctured tori has curvature where is the WP distance between and the boundary , so that one expects that the -derivatives of the curvature behave like (i.e., the exponent above should be ). In a very recent private communication, Wolpert indicated that it is possible to derive *sharp* estimates of the form

for the derivatives of the curvature tensor of the WP metric from his works.

**2. Injectivity radius of the Weil-Petersson metric**

In this short section, we will verify item (V) of the Burns-Masur-Wilkinson ergodicity criterion (Theorem 1) for the WP metric, i.e.,

Theorem 8. There exists a constant such that for all , , one has the following polynomial lower bound on the injectivity radius of the WP metric at :

The proof of this result also relies on the work of Wolpert. More precisely, Wolpert showed in this paper that there exists a constant such that, for any and with ,

where is the Abelian subgroup of the “level ” mapping class group generated by the Dehn twists about the curves .

This reduces the proof of Theorem 8 to the following lemma:

Lemma 9. There exists a universal constant with the following property. For each , there exists such that, for any with

for some non-trivial , one can find so that and for some .

*Proof:* We begin the proof of the lemma by recalling that the mapping class group acts on in a properly discontinuous way with no fixed points. Therefore, for each , there exists such that if for some non-trivial (i.e., some non-trivial element of the mapping class group has an “almost fixed point”), then (i.e., the “almost fixed point” is close to the boundary of ).

Let us now show that, in the setting of the previous paragraph, for some .

In this direction, let be the product of and the maximal orders of all finite order elements of the mapping class groups of “lower complexity” surfaces. By contradiction, let us assume that there exist infinite sequences , , , such that for some and

but for all ,