Posted by: matheuscmss | August 31, 2014

## Dynamics of the Weil-Petersson flow: rates of mixing

For the last installment of this series, our goal is to discuss the rates of mixing of the Weil-Petersson (WP) geodesic flow on the unit tangent bundle ${T^1\mathcal{M}_{g,n}}$ of the moduli space ${\mathcal{M}_{g,n}}$ of Riemann surfaces of genus ${g\geq 0}$ with ${n\geq 0}$ punctures for ${3g-3+n\geq 1}$.

However, before entering into the mathematical discussion strictly speaking, let me take the opportunity to dedicate this blog post to the memory of two Russian mathematicians who passed away earlier this month: Dmitri Anosov and Nikolai Chernov. Among their several well-known contributions in Dynamical Systems, we can quote:

Of course, the list of contributions of Anosov and Chernov to Dynamical Systems is vast: each of them wrote more than 90 research articles and books about the features of systems with some hyperbolicity (such as geodesic flows on negatively curved manifolds and chaotic billiards) among other topics.

In particular, it is out of the scope of this post to provide detailed descriptions of the works of these two very influential dynamicists.

On the other hand, as a form of “small compensation”, let me say that the second section of this post (about rates of mixing of the WP flow on the modular surface) briefly discusses some of the ideas advanced by these two mathematicians.

Concerning the rates of mixing of the WP flow, let us recall that, by the Burns-Masur-Wilkinson theorem (cf. Theorem 1 in the first post of this series), the WP flow ${\varphi_t}$ on ${T^1\mathcal{M}_{g,n}}$ is mixing with respect to the Liouville measure ${\mu}$ whenever ${3g-3+n\geq 1}$.

By definition of the mixing property, this means that the correlation function ${C_t(f,g):=\int f \cdot g\circ\varphi_t d\mu - \left(\int f d\mu\right)\left(\int g d\mu\right)}$ converges to ${0}$ as ${t\rightarrow\infty}$ for any given ${L^2}$-integrable observables ${f}$ and ${g}$. (See, e.g., the section “${L^2}$ formulation” in this Wikipedia article about the mixing property.)

Given this scenario, it is natural to ask how fast the correlation function ${C_t(f,g)}$ converges to zero. In general, the correlation function ${C_t(f,g)}$ can decay to ${0}$ (as a function of ${t\rightarrow\infty}$) in a very slow way depending on the choice of the observables (see, e.g., this blog post of Climenhaga for some concrete examples). Nevertheless, it is often the case (for mixing flows with some hyperbolicity) that the correlation function ${C_t(f,g)}$ decays to ${0}$ with a definite (e.g., polynomial, exponential, etc.) speed when restricting the observables to appropriate spaces of “reasonably smooth” functions.

In other words, given a mixing flow (with some hyperbolicity), it is usually possible to choose appropriate functional (e.g., Hölder, ${C^r}$, Sobolev, etc.) spaces ${X}$ and ${Y}$ such that

• ${|C_t(f,g)|\leq C\|f\|_{X} \|g\|_{Y} t^{-n}}$ for some constants ${C>0}$, ${n\in\mathbb{N}}$ and for all ${t\geq 1}$ (polynomial decay),
• or ${|C_t(f,g)|\leq C\|f\|_{X} \|g\|_{Y} e^{-ct}}$ for some constants ${C>0}$, ${c>0}$ and for all ${t\geq 1}$ (exponential decay).

Evidently, the “precise” rate of mixing of the flow (i.e., the sharp values of the constants ${C>0}$, ${n\in\mathbb{N}}$ and/or ${c>0}$ above) depends on the choice of the functional spaces ${X}$ and ${Y}$ (as they might change if we replace ${C^1}$ observables by ${C^2}$ observables, say). On the other hand, the qualitative speed of decay of ${C_t(f,g)}$, that is, the fact that ${C_t(f,g)}$ decays polynomially or exponentially as ${t\rightarrow\infty}$ whenever ${f}$ and ${g}$ are “reasonably smooth”, remains unchanged if we select ${X}$ and ${Y}$ from a well-behaved scale of functional spaces (like ${C^r}$ spaces, ${r\in\mathbb{N}}$, or ${H^s}$ spaces, ${s>0}$). In particular, this partly explains why in the Dynamical Systems literature one simply says that a given mixing flow ${\varphi_t}$ has “polynomial decay” or “exponential decay”: usually we are interested in the qualitative behavior of the correlation function for reasonably smooth observables, but the particular choice of functional spaces ${X}$ and ${Y}$ is normally treated as a “technical detail”.

After this brief description of the notion of rate of mixing (speed of decay of correlation functions), we are ready to state the main result of this post.

Theorem 1 (Burns-Masur-M.-Wilkinson) The rate of mixing of the WP flow ${\varphi_t}$ on ${T^1\mathcal{M}_{g,n}}$ is:

• at most polynomial when ${3g-3+n>1}$;
• rapid (faster than any polynomial) when ${3g-3+n=1}$.

Remark 1 This result was announced as Theorem 2 in the first post of this series and also in this preprint here. Since then, Burns, Masur, Wilkinson and I found some evidence indicating that the Weil-Petersson geodesic flow on ${T^1\mathcal{M}_{g,n}}$ is actually exponentially mixing when ${3g-3+n=1}$. The details will hopefully appear in a forthcoming paper (currently still in preparation).

Remark 2 An open problem left by Theorem 1 is to determine the rate of mixing of the WP flow on ${T^1\mathcal{M}_{g,n}}$ for ${3g-3+n>1}$. Indeed, while this theorem provides a polynomial upper bound for the rate of mixing in this setting, it does not rule out the possibility that the actual rate of mixing of the WP flow is sub-polynomial (even for reasonably smooth observables). Heuristically speaking, we believe that the sectional curvatures of the WP metric control the time spent by WP geodesics near the boundary of ${\overline{\mathcal{M}}_{g,n}}$. In particular, it seems that the problem of determining the rate of mixing of the WP flow (when ${3g-3+n>1}$) is somewhat related to the issue of finding suitable (polynomial?) bounds for how close to zero the sectional curvatures of the WP metric can be (in terms of the distance to the boundary of ${\overline{\mathcal{M}}_{g,n}}$). Unfortunately, the best available bounds for the sectional curvatures of the WP metric (due to Wolpert) do not rule out the possibility that some of these quantities get extremely close to zero (see Remark 4 of this post here).

The difference in the rates of mixing of the WP flow on ${T^1\mathcal{M}_{g,n}}$ when ${3g-3+n>1}$ or ${3g-3+n=1}$ in Theorem 1 reflects the following simple (yet important) feature of the WP metric near the boundary of the Deligne-Mumford compactification of ${\mathcal{M}_{g,n}}$.

In the case ${3g-3+n=1}$, e.g., ${g=1=n}$, the moduli space ${\mathcal{M}_{1,1}\simeq\mathbb{H}/PSL(2,\mathbb{Z})}$ equipped with the WP metric looks like the surface of revolution of the profile ${\{v=u^3: 0 < u \leq 1\}}$ near the cusp at infinity (see Remark 6 of this post here). In particular, even though an ${\varepsilon}$-neighborhood of the cusp is “polynomially large” (with area ${\sim \varepsilon^4}$), the Gaussian curvature approaches ${-\infty}$ near the cusp and, as it turns out, this strong negative curvature near the cusp forces all geodesics not pointing directly towards the cusp to come back to the compact part in bounded (say ${\leq 1}$) time. In other words, the excursions of infinite WP geodesics on ${\mathcal{M}_{1,1}}$ near the cusp are so quick that the WP flow on ${T^1\mathcal{M}_{1,1}}$ is “close” to a classical Anosov geodesic flow on a negatively curved compact surface. In particular, it is not entirely surprising that the WP flow on ${T^1\mathcal{M}_{1,1}}$ is rapidly mixing.
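As a quick sanity check of the two claims above (area ${\sim\varepsilon^4}$ and curvature tending to ${-\infty}$), here is a minimal Python sketch for the model profile ${v=u^3}$. It uses the standard formulas for the surface obtained by rotating the graph of ${v=f(u)}$ about the ${u}$-axis: area element ${2\pi f\sqrt{1+f'^2}\,du}$ and Gaussian curvature ${K=-f''/(f(1+f'^2)^2)}$. The function names and the identification of the ${\varepsilon}$-neighborhood of the cusp with ${\{0<u\leq\varepsilon\}}$ are assumptions made for illustration; this concerns the model surface only, not the WP metric itself.

```python
import math

def gaussian_curvature(u, r=3.0):
    """Gaussian curvature K = -f'' / (f * (1 + f'^2)^2) for the profile f(u) = u**r."""
    f = u ** r
    f1 = r * u ** (r - 1)            # f'(u)
    f2 = r * (r - 1) * u ** (r - 2)  # f''(u)
    return -f2 / (f * (1.0 + f1 * f1) ** 2)

def cusp_area(eps, r=3.0, steps=20000):
    """Area of the region {0 < u <= eps} of the surface, via the midpoint rule."""
    h = eps / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += 2.0 * math.pi * u ** r * math.sqrt(1.0 + (r * u ** (r - 1)) ** 2) * h
    return total

if __name__ == "__main__":
    for eps in (0.1, 0.05, 0.025):
        # area / eps^4 should stabilize, and K * eps^2 should approach -r(r-1) = -6
        print(eps, cusp_area(eps) / eps ** 4, gaussian_curvature(eps) * eps ** 2)
```

For ${r=3}$ the ratio of the area to ${\varepsilon^4}$ stabilizes near ${\pi/2}$, while ${K(u)\,u^2}$ approaches ${-6}$, so the curvature blows up like ${-6/u^2}$ at the cusp of the model.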

On the other hand, in the case ${3g-3+n>1}$, the WP metric on ${\mathcal{M}_{g,n}}$ has some sectional curvatures close to zero near the boundary of the Deligne-Mumford compactification ${\overline{\mathcal{M}}_{g,n}}$ of ${\mathcal{M}_{g,n}}$ (see Theorem 3 and Remark 5 of this post here). By exploiting this feature of the WP metric on ${\mathcal{M}_{g,n}}$ for ${3g-3+n>1}$ (that has no counterpart for ${\mathcal{M}_{1,1}}$ or ${\mathcal{M}_{0,4}}$), we will build a non-negligible set of WP geodesics spending a long time near the boundary of ${\overline{\mathcal{M}}_{g,n}}$ before eventually getting into the compact part. In this way, we will deduce that the WP flow on ${\mathcal{M}_{g,n}}$ takes a fair (polynomial) amount of time to mix certain parts of the boundary of ${\overline{\mathcal{M}}_{g,n}}$ with fixed compact subsets of ${\mathcal{M}_{g,n}}$.

In the remainder of this post, we will give some details of the proof of Theorem 1. In the next section, we give a fairly complete proof (assuming the results in this previous post, of course) of the polynomial upper bound on the rate of mixing of the WP flow on ${T^1\mathcal{M}_{g,n}}$ when ${3g-3+n>1}$. After that, in the final section, we provide a sketch of the proof of the rapid mixing property of the WP flow on ${T^1\mathcal{M}_{1,1}}$. In fact, we decided (for pedagogical reasons) to explain some key points of the rapid mixing property only in the toy model case of a negatively curved surface with one cusp corresponding exactly to a surface of revolution of a profile ${\{v=u^r\}}$, ${r\geq 2}$. In this way, since the WP metric near the cusp of ${\mathcal{M}_{1,1}\simeq \mathbb{H}/PSL(2,\mathbb{Z})}$ can be thought of as a “perturbation” of the surface of revolution of ${\{v=u^3\}}$ (thanks to Wolpert’s asymptotic formulas), the reader hopefully will get a flavor of the main ideas behind the proof of rapid mixing of the WP flow on ${T^1\mathcal{M}_{1,1}}$ without getting into the (somewhat boring) technical details needed to check that the arguments used in the toy model case are “sufficiently robust” so that they can be “carried over” to the “perturbative setting” of the WP flow on ${T^1\mathcal{M}_{1,1}}$.

1. Rates of mixing of the WP flow on ${T^1\mathcal{M}_{g,n}}$. I

In this section, our notations are the same as in this previous post here.

Given ${\varepsilon>0}$, let us consider the portion of ${\mathcal{M}_{g,n}}$ consisting of ${X\in\mathcal{M}_{g,n}}$ such that a non-separating (homotopically non-trivial, non-peripheral) simple closed curve ${\alpha}$ has hyperbolic length ${\ell_{\alpha}(X)\leq (2\varepsilon)^2}$. The following picture illustrates this portion of ${\mathcal{M}_{g,n}}$ as a ${(2\varepsilon)^2}$-neighborhood of the stratum ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ of the boundary of the Deligne-Mumford compactification ${\overline{\mathcal{M}}_{g,n}}$ where ${\alpha}$ gets pinched (i.e., ${\ell_{\alpha}}$ becomes zero).

Note that the stratum ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ is non-trivial (that is, not reduced to a single point) when ${3g-3+n>1}$. Indeed, by pinching ${\alpha}$ as above and by disconnecting the resulting node, we obtain Riemann surfaces of genus ${g-1}$ with ${n+2}$ punctures whose moduli space is isomorphic to ${\mathcal{T}_{\alpha}/MCG_{g,n}}$. It follows that ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ is a complex orbifold of dimension ${3(g-1)-3+(n+2)=3g-3+n-1>0}$, and, a fortiori, ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ is not trivial. Evidently, this argument breaks down when ${3g-3+n=1}$: for example, by pinching a curve ${\alpha}$ as above in a once-punctured torus and by removing the resulting node, we obtain thrice punctured spheres (whose moduli space ${\mathcal{M}_{0,3}=\{\overline{\mathbb{C}}-\{0,1,\infty\}\}}$ is trivial). In particular, our Figure 1 concerns exclusively the case ${3g-3+n>1}$.
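The dimension count above is easy to tabulate. The following small Python sketch (the function names are hypothetical and chosen only for this illustration) applies the formula ${\dim_{\mathbb{C}}\mathcal{M}_{g,n}=3g-3+n}$ to the surface of genus ${g-1}$ with ${n+2}$ punctures obtained after pinching a non-separating curve.

```python
def moduli_dim(g, n):
    """Complex dimension 3g - 3 + n of the moduli space M_{g,n}."""
    return 3 * g - 3 + n

def pinched_stratum_dim(g, n):
    """Dimension of the boundary stratum where a non-separating curve is pinched.

    Disconnecting the node gives surfaces of genus g - 1 with n + 2 punctures.
    """
    if g < 1:
        raise ValueError("a non-separating curve requires genus g >= 1")
    return moduli_dim(g - 1, n + 2)

if __name__ == "__main__":
    for g, n in [(1, 1), (1, 2), (2, 0), (3, 1)]:
        print((g, n), moduli_dim(g, n), pinched_stratum_dim(g, n))
```

For ${(g,n)=(1,1)}$ the stratum has dimension ${0}$ (a point), while in all cases the stratum has one complex dimension less than ${\mathcal{M}_{g,n}}$.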

We want to locate certain regions near ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ taking a long time to mix with the compact part of ${\mathcal{M}_{g,n}}$. For this sake, we will exploit the geometry of the WP metric near ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ — e.g., the fact provided by Wolpert’s formulas (cf. Theorem 3 in this post) that some sectional curvatures of the WP metric approach zero — to build nice sets of unit vectors traveling in an “almost parallel” way to ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ for a significant amount of time.

More precisely, we consider the vectors ${\lambda_{\alpha} := \textrm{grad}(\ell_{\alpha}^{1/2})}$ and ${J\lambda_{\alpha}}$ (where ${J}$ is the complex structure). By definition, they span a complex line ${L=\textrm{span}\{\lambda_{\alpha}, J\lambda_{\alpha}\}}$. Intuitively, the complex line ${L}$ points in the normal direction to a “copy” of ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ inside a level set of the function ${\ell_{\alpha}^{1/2}}$ as indicated in the following picture:

Using the complex line ${L}$, we formalize the notion of “almost parallel” vector to ${\mathcal{T}_{\alpha}/MCG_{g,n}}$. Indeed, given ${v\in T^1\mathcal{M}_{g,n}}$, let us denote by ${r_{\alpha}(v)}$ the quantity ${r_{\alpha}(v):=\sqrt{\langle v, \lambda_{\alpha} \rangle^2 + \langle v, J\lambda_{\alpha}\rangle^2}}$ (where ${\langle.,.\rangle}$ is the WP metric). By definition, ${r_{\alpha}(v)}$ measures the size of the projection of the unit vector ${v}$ in the complex line ${L}$. In particular, we can think of ${v}$ as “almost parallel” to ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ whenever the quantity ${r_{\alpha}(v)}$ is very close to zero.

In this setting, we will show that unit vectors almost parallel to ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ whose footprints are close to ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ always generate geodesics staying near ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ for a long time. More concretely, given ${\varepsilon>0}$, let us define the set

$\displaystyle V_{\varepsilon} := \{v\in T^1\mathcal{M}_{g,n}: f_{\alpha}(v)\leq \varepsilon, \, r_{\alpha}(v)\leq \varepsilon^2\}$

where ${f_{\alpha}(v) := \ell_{\alpha}^{1/2}(p)}$ and ${p\in\mathcal{M}_{g,n}}$ is the footprint of the unit vector ${v\in T^1\mathcal{M}_{g,n}}$. Equivalently, ${V_{\varepsilon}}$ is the disjoint union of the pieces of spheres ${S_{\varepsilon}(p):=\{v\in T^1_p\mathcal{M}_{g,n}: r_{\alpha}(v)\leq \varepsilon^2\}}$ attached to points ${p\in\mathcal{M}_{g,n}}$ with ${\ell_{\alpha}(p)\leq \varepsilon^2}$. The following figure summarizes the geometry of ${S_{\varepsilon}(p)}$:

We would like to prove that a geodesic ${\gamma_v(t)}$ originating at any ${v\in V_{\varepsilon}}$ stays in a ${(2\varepsilon)^2}$-neighborhood of ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ for an interval of time ${[0, T]}$ of size of order ${1/\varepsilon}$, so that, during this time, the WP geodesic flow does not mix ${V_{\varepsilon}}$ with any fixed ball ${U}$ in the compact part of ${\mathcal{M}_{g,n}}$ consisting of Riemann surfaces with systole ${> (2\varepsilon)^2}$:

In this direction, we will need the following lemma from the third post of this series (cf. Lemma 13 in this post here).

Lemma 2 Let ${\gamma(t)}$ be a WP geodesic and denote by ${r_{\alpha}(t)=r_{\alpha}(\dot{\gamma}(t))}$ and ${f_{\alpha}(t)=\ell_{\alpha}^{1/2}(\gamma(t))}$. Then,

$\displaystyle r_{\alpha}'(t) = O(f_{\alpha}(t)^3)$

From this lemma, it is not hard to estimate the amount of time spent by a geodesic ${\gamma_v(t)}$ near ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ for an arbitrary ${v\in V_{\varepsilon}}$:

Lemma 3 There exists a constant ${C_0>0}$ (depending only on ${g}$ and ${n}$) such that

$\displaystyle \ell_{\alpha}^{1/2}(\gamma_v(t))=f_{\alpha}(t)\leq 2\varepsilon$

for all ${v\in V_{\varepsilon}}$ and ${0\leq t\leq 1/C_0\varepsilon}$.

Proof: By definition, ${v\in V_{\varepsilon}}$ implies that ${f_{\alpha}(0)\leq \varepsilon}$. Thus, it makes sense to consider the maximal interval ${[0,T]}$ of time such that ${f_{\alpha}(t)\leq 2\varepsilon}$ for all ${0\leq t\leq T}$.

By Lemma 2, we have that ${r_{\alpha}'(s)=O(f_{\alpha}(s)^3)}$, i.e., ${|r_{\alpha}'(s)|\leq B f_{\alpha}(s)^3}$ for some constant ${B>1/4}$ depending only on ${g}$ and ${n}$. In particular, ${|r_{\alpha}'(s)|\leq B f_{\alpha}(s)^3\leq B (2\varepsilon)^3}$ for all ${0\leq s\leq T}$. From this estimate, we deduce that

$\displaystyle r_{\alpha}(t) = r_{\alpha}(0)+\int_0^t r_{\alpha}'(s)\,ds\leq r_{\alpha}(0)+B(2\varepsilon)^3 t = r_{\alpha}(0)+8B\varepsilon^3t$

for all ${0\leq t\leq T}$. Since ${v\in V_{\varepsilon}}$ implies that ${r_{\alpha}(0)\leq\varepsilon^2}$, the previous inequality tells us that

$\displaystyle r_{\alpha}(t)\leq \varepsilon^2+8B\varepsilon^3t$

for all ${0\leq t\leq T}$.

Next, we observe that, by definition, ${f_{\alpha}'(t)=\langle\dot{\gamma}(t),\textrm{grad}\ell_{\alpha}^{1/2}\rangle = \langle\dot{\gamma}(t),\lambda_{\alpha}\rangle}$. Hence,

$\displaystyle |f_{\alpha}'(t)|=|\langle\dot{\gamma}(t),\lambda_{\alpha}\rangle|\leq \sqrt{\langle\dot{\gamma}(t),\lambda_{\alpha}\rangle^2 + \langle\dot{\gamma}(t),J\lambda_{\alpha}\rangle^2} = r_{\alpha}(t)$

By putting together the previous two inequalities and the fact that ${f_{\alpha}(0)\leq\varepsilon}$ (as ${v\in V_{\varepsilon}}$), we conclude that

$\displaystyle f_{\alpha}(T) = f_{\alpha}(0)+\int_0^T f_{\alpha}'(t) \, dt\leq \varepsilon + \varepsilon^2 T + 4B\varepsilon^3T^2$

Since ${T>0}$ was chosen so that ${[0,T]}$ is the maximal interval with ${f_{\alpha}(t)\leq 2\varepsilon}$ for all ${0\leq t\leq T}$, we have that ${f_{\alpha}(T)=2\varepsilon}$. Therefore, the previous estimate can be rewritten as

$\displaystyle 2\varepsilon\leq \varepsilon + \varepsilon^2T + 4B\varepsilon^3T^2$

Because ${B>1/4}$, it follows from this inequality that ${T\geq 1/C_0\varepsilon}$ where ${C_0:=8B}$.

In other words, we showed that ${[0,1/C_0\varepsilon]\subset [0,T]}$, and, a fortiori, ${f_{\alpha}(t)\leq 2\varepsilon}$ for all ${0\leq t\leq 1/C_0\varepsilon}$. This completes the proof of the lemma. $\Box$
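The last step of the proof can be double-checked numerically. Since the right-hand side ${\varepsilon+\varepsilon^2T+4B\varepsilon^3T^2}$ is increasing in ${T\geq 0}$, the inequality ${2\varepsilon\leq \varepsilon+\varepsilon^2T+4B\varepsilon^3T^2}$ forces ${T}$ to be at least the positive root ${T_*=(\sqrt{1+16B}-1)/(8B\varepsilon)}$ of the corresponding quadratic equation, and ${T_*\geq 1/(8B\varepsilon)}$ as soon as ${\sqrt{1+16B}\geq 2}$, which holds for ${B>1/4}$. The minimal Python sketch below (with hypothetical function names and arbitrary sample values of ${B}$ and ${\varepsilon}$) compares the two quantities.

```python
import math

def exit_time_lower_bound(B, eps):
    """Positive root T of eps + eps^2 * T + 4 * B * eps^3 * T^2 = 2 * eps."""
    return (math.sqrt(1.0 + 16.0 * B) - 1.0) / (8.0 * B * eps)

def lemma3_bound(B, eps):
    """The bound 1 / (C_0 * eps) of Lemma 3 with C_0 = 8B."""
    return 1.0 / (8.0 * B * eps)

if __name__ == "__main__":
    for B in (0.26, 1.0, 10.0):   # sample constants with B > 1/4
        for eps in (0.1, 0.01):
            print(B, eps, exit_time_lower_bound(B, eps), lemma3_bound(B, eps))
```

In each sampled case the root is at least ${1/(8B\varepsilon)}$, in agreement with the choice ${C_0=8B}$.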

Once we have Lemma 3 in our toolbox, it is not hard to infer some upper bounds on the rate of mixing of the WP flow on ${T^1\mathcal{M}_{g,n}}$ when ${3g-3+n>1}$.

Proposition 4 Suppose that the WP flow ${\varphi_t}$ on ${T^1\mathcal{M}_{g,n}}$ has a rate of mixing of the form

$\displaystyle C_t(a,b) = \left|\int a\cdot b\circ\varphi_t - \left(\int a\right)\left(\int b\right)\right|\leq C t^{-\gamma}\|a\|_{C^1}\|b\|_{C^1}$

for some constants ${C>0}$, ${\gamma>0}$, for all ${t\geq 1}$, and for all choices of ${C^1}$-observables ${a}$ and ${b}$. Then, ${\gamma\leq 10}$, i.e., the rate of mixing of the WP flow is at most polynomial.

Proof: Let us fix once and for all an open ball ${U}$ (with respect to the WP metric) contained in the compact part of ${\mathcal{M}_{g,n}}$: this means that there exists ${\varepsilon_0>0}$ such that the systoles of all Riemann surfaces in ${U}$ are ${\geq\varepsilon_0^2}$.

Take a ${C^1}$ function ${a}$ supported on the set ${T^1U}$ of unit vectors with footprints on ${U}$ with values ${0\leq a\leq 1}$ such that ${\int a\geq \textrm{vol}(U)/2}$ and ${\|a\|_{C^1}=O(1)}$: such a function ${a}$ can be easily constructed by smoothing the characteristic function of ${U}$ with the aid of bump functions. Next, for each ${\varepsilon>0}$, take a ${C^1}$ function ${b_{\varepsilon}}$ supported on the set ${V_{\varepsilon}}$ with values ${0\leq b_{\varepsilon}\leq 1}$ such that ${\int b_{\varepsilon}\geq \textrm{vol}(V_{\varepsilon})/2}$ and ${\|b_{\varepsilon}\|_{C^1}=O(1/\varepsilon^2)}$: such a function ${b_{\varepsilon}}$ can also be constructed by smoothing the characteristic function of ${V_{\varepsilon}}$ after taking into account the description of the WP metric near ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ given by Theorems 2 and 3 in this post here and the definition of ${V_{\varepsilon}}$ (in terms of the conditions ${\ell_{\alpha}^{1/2}\leq \varepsilon}$ and ${r_{\alpha}\leq\varepsilon^2}$). Furthermore, this description of the WP metric ${g_{WP}}$ near ${\mathcal{T}_{\alpha}/MCG_{g,n}}$ combined with the asymptotic expansion ${g_{WP}\sim 4dx_{\alpha}^2+x_{\alpha}^6d\tau_{\alpha}^2}$ where ${x_{\alpha}:=\ell_{\alpha}^{1/2}/\sqrt{2\pi^2}}$ and ${\tau_{\alpha}}$ is a twist parameter (see the proof of Lemma 4 of this post here) says that ${\textrm{vol}(V_{\varepsilon})\sim \varepsilon^8}$: indeed, the condition ${f_{\alpha}=\ell_{\alpha}^{1/2}\leq\varepsilon}$ on footprints of unit tangent vectors in ${V_{\varepsilon}}$ provides a set of volume ${\sim \varepsilon^4}$ (cf. the proof of Lemma 4 of the aforementioned post for details) and the condition ${r_{\alpha}\leq\varepsilon^2}$ on unit tangent vectors in ${V_{\varepsilon}}$ with a fixed footprint provides a set of volume comparable to the Euclidean area ${\pi\varepsilon^4}$ of the Euclidean ball ${\{\vec{v}\in\mathbb{R}^2: |\vec{v}|\leq\varepsilon^2\}}$ (cf. Theorem 2 in this post here), so that

$\displaystyle \textrm{vol}(V_{\varepsilon}) = \int_{\{\ell_{\alpha}^{1/2}(p)\leq\varepsilon\}} \textrm{vol}(\{v\in T^1_p\mathcal{M}_{g,n}: r_{\alpha}(v)\leq\varepsilon^2\})\sim (\pi\varepsilon^4)\cdot \varepsilon^4\sim \varepsilon^8$

In summary, for each ${\varepsilon>0}$, we have a ${C^1}$ function ${b_{\varepsilon}}$ supported on ${V_{\varepsilon}}$ with ${0\leq b_{\varepsilon}\leq 1}$, ${\|b_{\varepsilon}\|_{C^1} = O(1/\varepsilon^2)}$ and ${\int b_{\varepsilon}\geq c_0\varepsilon^8}$ for some constant ${c_0>0}$ depending only on ${g}$ and ${n}$.

Our plan is to use the observables ${a}$ and ${b_{\varepsilon}}$ to give some upper bounds on the mixing rate of the WP flow ${\varphi_t}$. For this sake, suppose that there are constants ${C>0}$ and ${\gamma>0}$ such that

$\displaystyle C_t(a,b_{\varepsilon})=\left|\int a\cdot b_{\varepsilon}\circ\varphi_t - \left(\int a\right)\left(\int b_{\varepsilon}\right)\right|\leq C t^{-\gamma}\|a\|_{C^1}\|b_{\varepsilon}\|_{C^1}$

for all ${t\geq 1}$ and ${\varepsilon>0}$.

By Lemma 3, there exists a constant ${C_0>0}$ such that ${V_{\varepsilon}\cap \varphi_{\frac{1}{C_0\varepsilon}}(T^1U)=\emptyset}$ whenever ${2\varepsilon<\varepsilon_0}$. Indeed, since ${V_{\varepsilon}}$ is a symmetric set (i.e., ${v\in V_{\varepsilon}}$ if and only if ${-v\in V_{\varepsilon}}$), it follows from Lemma 3 that all Riemann surfaces in the footprints of ${\varphi_{-\frac{1}{C_0\varepsilon}}(V_{\varepsilon})}$ have a systole ${\leq (2\varepsilon)^2<\varepsilon_0^2}$. Because we took ${U}$ in such a way that all Riemann surfaces in ${U}$ have systole ${\geq \varepsilon_0^2}$, we obtain ${\varphi_{-\frac{1}{C_0\varepsilon}}(V_{\varepsilon})\cap T^1U=\emptyset}$, that is, ${V_{\varepsilon}\cap \varphi_{\frac{1}{C_0\varepsilon}}(T^1U)=\emptyset}$, as it was claimed.

Now, let us observe that the function ${a\cdot b_{\varepsilon}\circ \varphi_t}$ is supported on ${T^1U\cap\varphi_{-t}(V_{\varepsilon}) = \varphi_{-t}(V_{\varepsilon}\cap \varphi_t(T^1U))}$ because ${a}$ is supported on ${T^1U}$ and ${b_{\varepsilon}}$ is supported on ${V_{\varepsilon}}$. By putting together this fact and the claim in the previous paragraph (that ${V_{\varepsilon}\cap \varphi_{\frac{1}{C_0\varepsilon}}(T^1U)=\emptyset}$ for ${2\varepsilon<\varepsilon_0}$), we deduce that ${a\cdot b_{\varepsilon}\circ \varphi_{\frac{1}{C_0\varepsilon}}\equiv 0}$ whenever ${2\varepsilon<\varepsilon_0}$. Thus,

$\displaystyle C_{\frac{1}{C_0\varepsilon}}(a,b_{\varepsilon}) := \left|\int a\cdot b_{\varepsilon}\circ\varphi_{\frac{1}{C_0\varepsilon}} - \left(\int a\right)\left(\int b_{\varepsilon}\right)\right| = \left(\int a\right)\left(\int b_{\varepsilon}\right)$

By plugging this identity into the polynomial decay of correlations estimate ${C_t(a,b_{\varepsilon})\leq Ct^{-\gamma}\|a\|_{C^1}\|b_{\varepsilon}\|_{C^1}}$, we get

$\displaystyle \left(\int a\right)\left(\int b_{\varepsilon}\right) = C_{\frac{1}{C_0\varepsilon}}(a,b_{\varepsilon})\leq CC_0^{\gamma}\varepsilon^{\gamma}\|a\|_{C^1}\|b_{\varepsilon}\|_{C^1}$

whenever ${2\varepsilon<\varepsilon_0}$ and ${1/C_0\varepsilon\geq 1}$.

We claim that the previous estimate implies that ${\gamma\leq 10}$. In fact, recall that our choices were made so that ${\int a\geq \textrm{vol}(U)/2}$ where ${U}$ is a fixed ball, ${\|a\|_{C^1}=O(1)}$, ${\int b_{\varepsilon} \geq c_0\varepsilon^8}$ for some constant ${c_0>0}$ and ${\|b_{\varepsilon}\|_{C^1}=O(1/\varepsilon^2)}$. Hence, by combining these facts and the previous mixing rate estimate, we get that

$\displaystyle \left(\frac{\textrm{vol}(U)}{2}\right) c_0\varepsilon^8\leq \left(\int a\right)\left(\int b_{\varepsilon}\right)\leq CC_0^{\gamma}\varepsilon^{\gamma}\|a\|_{C^1}\|b_{\varepsilon}\|_{C^1} = O(\varepsilon^{\gamma}\frac{1}{\varepsilon^2}),$

that is, ${\varepsilon^{10}\leq D \varepsilon^{\gamma}}$, for some constant ${D>0}$ and for all ${\varepsilon>0}$ sufficiently small (so that ${2\varepsilon<\varepsilon_0}$ and ${1/C_0\varepsilon\geq 1}$). It follows that ${\gamma\leq 10}$, as we claimed. This completes the proof of the proposition. $\Box$
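The bookkeeping of exponents in this proof can be summarized in a few lines of Python. In the sketch below (a hypothetical helper, not part of the original argument), `volume_exp` is the exponent in ${\int b_{\varepsilon}\gtrsim\varepsilon^{8}}$, `norm_exp` is the exponent in ${\|b_{\varepsilon}\|_{C^1}=O(\varepsilon^{-2})}$, and `time_exp` records that the non-mixing time is of order ${\varepsilon^{-1}}$; the constant `D` is an arbitrary sample value.

```python
def max_decay_exponent(volume_exp=8.0, norm_exp=2.0, time_exp=1.0):
    """Largest gamma for which eps**(volume_exp + norm_exp) <= D * eps**(gamma * time_exp)
    can hold for all sufficiently small eps > 0."""
    return (volume_exp + norm_exp) / time_exp

def inequality_holds(gamma, eps, D=1.0, volume_exp=8.0, norm_exp=2.0, time_exp=1.0):
    """Check eps**(volume_exp + norm_exp) <= D * eps**(gamma * time_exp) at a given eps."""
    return eps ** (volume_exp + norm_exp) <= D * eps ** (gamma * time_exp)

if __name__ == "__main__":
    print(max_decay_exponent())                    # exponent bound for C^1 observables
    print(inequality_holds(9.0, 1e-3))             # compatible with the estimate
    print(inequality_holds(11.0, 1e-3, D=100.0))   # fails once eps is small enough
```

Any ${\gamma>10}$ eventually violates ${\varepsilon^{10}\leq D\varepsilon^{\gamma}}$ as ${\varepsilon\rightarrow 0}$, no matter how large the constant ${D}$ is.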

Remark 3 In the statement of the previous proposition, the choice of ${C^1}$-norms to measure the rate of mixing of the WP flow is not very important. Indeed, an inspection of the construction of the functions ${b_{\varepsilon}}$ in the argument above reveals that ${\|b_{\varepsilon}\|_{C^{k+\alpha}} = O(1/\varepsilon^{2(k+\alpha)})}$ for any ${k\in\mathbb{N}}$, ${0\leq\alpha<1}$. In particular, the proof of the previous proposition is sufficiently robust to show also that a rate of mixing of the form

$\displaystyle C_t(a,b) = \left|\int a\cdot b\circ\varphi_t - \left(\int a\right)\left(\int b\right)\right|\leq C t^{-\gamma}\|a\|_{C^{k+\alpha}}\|b\|_{C^{k+\alpha}}$

for some constants ${C>0}$, ${\gamma>0}$, for all ${t\geq 1}$, and for all choices of ${C^{k+\alpha}}$-observables ${a}$ and ${b}$ holds only if ${\gamma\leq 8+2(k+\alpha)}$. In other words, even if we replace ${C^1}$-norms by (stronger, smoother) ${C^{k+\alpha}}$-norms in our measurements of rates of mixing of the WP flow (on ${T^1\mathcal{M}_{g,n}}$ for ${3g-3+n>1}$), our discussions so far will always give polynomial upper bounds for the decay of correlations.

At this point, our discussion of the proof of the first item of Theorem 1 is complete (thanks to Proposition 4 and Remark 3). So, we now move on to the next section, where we give some of the key ideas in the proof of the second item of Theorem 1.

2. Rates of mixing of the WP flow on ${T^1\mathcal{M}_{g,n}}$. II

Let us consider the WP flow on ${T^1\mathcal{M}_{g,n}}$ when ${3g-3+n=1}$, that is, when ${(g,n)=(0,4)}$ or ${(1,1)}$.

Actually, we will restrict our attention to the case ${(g,n)=(1,1)}$ because the remaining case ${(g,n)=(0,4)}$ is very similar to ${(g,n)=(1,1)}$.

Indeed, the moduli space ${\mathcal{M}_{0,4}}$ of four-times punctured spheres is a finite cover of the moduli space ${\mathcal{M}_{1,1}\simeq \mathbb{H}/SL(2,\mathbb{Z})}$: this can be seen by sending each four-punctured sphere ${\overline{\mathbb{C}}-\{x_1,\dots, x_4\}}$ to the elliptic curve ${y^2=(x-x_1)\dots(x-x_4)}$, so that ${\mathcal{M}_{0,4}}$ becomes naturally isomorphic to ${\mathbb{H}/\Gamma_0(2)}$ where ${\Gamma_0(2)}$ is a congruence subgroup of ${SL(2,\mathbb{Z})}$ of level ${2}$ with index ${3}$. Since all arguments towards rapid mixing of geodesic flows in this section still work after taking finite covers, it suffices to prove the second item of Theorem 1 for the WP flow on ${T^1\mathcal{M}_{1,1}}$.

The rate of mixing of a geodesic flow on the unit tangent bundle of a negatively curved compact surface is known to be fast: indeed, Chernov used his technique of “Markov approximations” to show stretched exponential decay of correlations, and Dolgopyat added a new crucial ingredient (“Dolgopyat’s estimate”) to Chernov’s work to prove exponential decay of correlations.

Evidently, these works of Chernov and Dolgopyat cannot be applied to the WP flow on ${T^1\mathcal{M}_{1,1}}$ because of the non-compactness of ${\mathcal{M}_{1,1}\simeq\mathbb{H}/SL(2,\mathbb{Z})}$ due to the presence of a (single) cusp (at infinity). Nevertheless, this suggests that we should be able to determine the rate of mixing of the WP flow on ${T^1\mathcal{M}_{1,1}}$ provided we have enough control of the geometry of the WP metric near the cusp.

Fortunately, as we mentioned in Example 5 of this post here, Wolpert showed that the WP metric ${g_{WP}}$ on ${\mathcal{M}_{1,1}\simeq \mathbb{H}/SL(2,\mathbb{Z})}$ has an asymptotic expansion ${g_{WP}^2\sim \frac{|dz|^2}{\textrm{Im}(z)}}$ at a point ${z\in\mathbb{H}}$. Thus, the WP metric on neighborhoods ${\{z=x+iy\in\mathbb{H}: |x|\leq 1/2, y>y_0\}/SL(2,\mathbb{Z})}$ (with ${y_0>1}$) of the cusp at infinity of ${\mathcal{M}_{1,1}}$ becomes closer (as ${y_0\rightarrow\infty}$) to the metric of the surface of revolution of the profile ${v=u^3}$ on neighborhoods ${\{v=u^3: 0\leq u<u_0\}}$ of the cusp at ${0}$ (as ${u_0\rightarrow 0}$).

Partly motivated by the scenario of the previous paragraph, from now on we will pretend that the WP metric on ${\mathbb{H}/PSL(2,\mathbb{Z})}$ looks exactly like the metric ${\frac{|dz|^2}{\textrm{Im}(z)}}$ at all points ${\{z\in\mathbb{H}: \textrm{Im}(z)>y_0\}}$ for some ${y_0\gg 1}$. In other words, instead of studying the WP flow on ${T^1\mathcal{M}_{1,1}}$, we will focus on the rates of mixing of the following toy model: the geodesic flow on a negatively curved surface ${S}$ with a single cusp possessing a neighborhood where the metric is isometric to the surface of revolution of a profile ${\{v=u^r\}}$ for a fixed real number ${r>3}$.

Remark 4 The surface of revolution modeling the WP metric on ${\mathcal{M}_{1,1}}$ is obtained by rotating the profile ${\{v=u^3\}}$. In other words, we see that the study of rates of mixing of the geodesic flow on the surface of revolution approximating the WP metric on ${\mathcal{M}_{1,1}}$ is a “borderline case” in our subsequent discussion.

Here, our main motivations to replace the WP flow ${\varphi_t}$ on ${T^1\mathcal{M}_{1,1}}$ by the toy model described above are:

• all important ideas for the study of rates of mixing of ${\varphi_t}$ are also present in the case of the toy model, and
• even though the WP metric on ${\mathcal{M}_{1,1}}$ is a perturbation of a surface of revolution, the verification of the fact that the arguments used to estimate the decay of correlations of the geodesic flow on the toy model surfaces are robust enough so that they can be carried over to the WP metric situation is somewhat boring: basically, besides performing a slight modification of the proofs to include the borderline case ${r=3}$, one has to introduce “error terms” in the whole discussion below and, after that, one has to check that these error terms do not change the qualitative nature of all estimates.

In summary, the remainder of this section will contain a proof of the following “toy model version” of the second item of Theorem 1.

Theorem 5 Let ${\overline{S}}$ be a compact surface and fix ${0\in\overline{S}}$. Suppose that ${S=\overline{S}-\{0\}}$ is equipped with a negatively curved Riemannian metric ${g}$ such that the restriction of ${g}$ to a neighborhood ${\{p\in S: d(p,0) < \rho_0\}}$ of ${0}$ is isometric to a surface of revolution of a profile ${\{v=u^r: 0<u\leq u_0\}}$ (for some choices of ${\rho_0>0}$ and ${u_0>0}$). Then, the geodesic flow (associated to ${g}$) on ${T^1S}$ is rapidly mixing (faster than any polynomial) in the sense that, for all ${n\in\mathbb{N}}$, one can choose an adequate Banach space ${X_n}$ of “reasonably smooth” observables and a constant ${C_n>0}$ so that

$\displaystyle C_t(a,b)=\left|\int a\cdot b\circ\varphi_t - \left(\int a\right)\left(\int b\right)\right|\leq C_n t^{-n}\|a\|_{X_n}\|b\|_{X_n}$

for all ${t\geq 1}$.

Remark 5 The arguments below show that the statement above also holds when ${S=\overline{S}-\{0_1,\dots, 0_k\}}$ is equipped with a negatively curved metric that is isometric to a surface of revolution ${\{v=u^{r_i}\}}$, ${r_i>1}$, near ${0_i}$ for each ${i=1,\dots, k}$.

Remark 6 The Riemannian metric ${g}$ is incomplete because the surface of revolution of ${\{v=u^r\}}$ is incomplete when ${r>1}$ (as the reader can check via a simple calculation).
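The simple calculation alluded to in Remark 6 can be sketched numerically. The meridian ${u\mapsto (u,u^r)}$, ${0<u\leq u_0}$, has length ${\int_0^{u_0}\sqrt{1+r^2u^{2(r-1)}}\,du}$, and the elementary bounds ${1\leq\sqrt{1+a^2}\leq 1+a}$ (for ${a\geq 0}$) show that this length lies between ${u_0}$ and ${u_0+u_0^r}$. So the cusp is at finite distance and the metric is incomplete. The Python function below (a hypothetical name, using a plain midpoint rule) approximates this length.

```python
import math

def meridian_length(u0, r, steps=20000):
    """Midpoint-rule approximation of the length of the meridian u -> (u, u**r), 0 < u <= u0."""
    h = u0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += math.sqrt(1.0 + (r * u ** (r - 1)) ** 2) * h
    return total

if __name__ == "__main__":
    for r in (2.0, 3.0, 4.0):
        u0 = 1.0
        # the finite length should lie between u0 and u0 + u0**r
        print(r, u0, meridian_length(u0, r), u0 + u0 ** r)
```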

Recall that, in the setting of Theorem 5, we want to understand the dynamics of the excursions of the geodesic flow near the cusp ${0}$ (in order to get rapid mixing). For this sake, we describe these excursions by rewriting the geodesic flow (near ${0}$) as a suspension flow.

2.1. Excursions near the cusp and suspension flows

Consider a small neighborhood in ${S}$ of ${0}$ where the metric is isometric to the surface of revolution of the profile ${\{v=u^r: 0<u\leq u_0\}}$, i.e.,

$\displaystyle \{(x, x^r\cos y, x^r\sin y)\in\mathbb{R}^3: 0<x\leq u_0, \, 0\leq y\leq 2\pi\}$

Next, take ${0<d_0<u_0}$ a small parameter and consider the parallel ${C=C(d_0) = \{(d_0, d_0^r\cos y, d_0^r\sin y)\in \mathbb{R}^3: 0\leq y\leq 2\pi\}}$. We parametrize unit tangent vectors to the surface of revolution with footprints in ${C}$ as follows.

Given ${q=(d_0,d_0^r\cos y_0, d_0^r\sin y_0)\in C}$, we denote by ${V=V(q)\in T_q^1S}$ the unique unit tangent vector pointing towards the cusp ${O}$ at ${x=0}$. Equivalently, ${V}$ is the unit vector tangent to the meridian ${\{(d_0-t, (d_0-t)^r\cos y_0, (d_0-t)^r\sin y_0)\in\mathbb{R}^3: 0\leq t< d_0\}}$ at time ${t=0}$, or, alternatively, ${V(q)=-\nabla d(q)}$ where ${d(p)=\textrm{dist}(O,p)}$ is the distance function from the cusp ${O}$ to a point ${p}$. Also, we let ${JV=JV(q)}$ be the unit vector obtained by rotating ${V}$ by ${\pi/2}$ in the counterclockwise sense (i.e., by applying the natural almost complex structure ${J}$).

In this setting, a unit vector ${v\in T_q^1S}$ pointing towards the cusp ${O}$ is completely determined by a real number ${\beta\in (-\pi/2, \pi/2)}$ such that ${\langle v, V\rangle = \cos\beta}$ and ${\langle v, JV \rangle = \sin\beta}$, i.e.,

$\displaystyle v = \cos\beta\cdot V + \sin\beta\cdot JV:=v(\beta)$

The qualitative behavior of the excursion of a geodesic ${\gamma(t)=(x(t),x(t)^r\cos y(t),x(t)^r\sin y(t))}$ starting at ${\dot{\gamma}(0)=v(\beta)\in T^1_qS}$ can be easily determined in terms of the parameter ${\beta}$ thanks to the classical results in Differential Geometry about surfaces of revolution. Indeed, it is well-known (see, e.g., Do Carmo’s book) that such a geodesic ${\gamma(t)}$ satisfies

$\displaystyle x(t)^{2r}y'(t) = c$

and

$\displaystyle (1+r^2 x(t)^{2(r-1)})x'(t)^2 + \frac{c^2}{x(t)^{2r}}=1$

for a certain constant ${c}$, and, furthermore, these relations imply the famous Clairaut’s relation:

$\displaystyle x(t)^r \cos|\frac{\pi}{2}-|\beta(t)|| = c = constant \ \ \ \ \ (1)$

where ${\beta(t)}$ is the parameter attached to ${\dot{\gamma}(t)}$ (i.e., ${\dot{\gamma}(t)=v(\beta(t))\in T^1_{\gamma(t)}C(x(t))}$). In particular, except for the geodesic going directly to the cusp (i.e., the geodesic starting at ${V(q)}$ associated to ${\beta=0}$), all geodesics ${\gamma(t)}$ (starting at ${v(\beta)}$ with ${\beta\neq0}$) behave qualitatively in a simple way. In the first part ${t\in [0,T(\beta)/2]}$ of its excursion towards the cusp, the angle ${\beta(t)}$ increases (resp. decreases) from ${\beta>0}$ to ${\pi/2}$ (resp. from ${\beta<0}$ to ${-\pi/2}$) while the value of ${x(t)}$ diminishes in order to keep up with Clairaut’s relation. Then, the geodesic ${\gamma(t)}$ reaches its closest position to the cusp at time ${t=T(\beta)/2}$: here, ${\beta(t)=\pm\pi/2}$ (i.e., ${\dot{\gamma}(T(\beta)/2)}$ is tangent to the parallel ${C(x(T(\beta)/2))}$ containing ${\gamma(T(\beta)/2)}$) and, hence,

$\displaystyle x(T(\beta)/2)^r = x(0)^r\sin\beta = d_0^r\sin\beta:=x_{\min}(\beta)^r$

Finally, in the second part ${t\in [T(\beta)/2, T(\beta)]}$, ${\gamma(t)}$ does the “opposite” of the first part: the angle ${\beta(t)}$ goes from ${\pm\pi/2}$ to ${\pm\pi-\beta}$ (so that ${\dot{\gamma}(T(\beta))}$ points away from the cusp) and ${x(t)}$ increases from ${x_{\min}(\beta)}$ back to ${x(0)=d_0}$. The following picture summarizes the discussion of this paragraph:

Remark 7 Note that the time ${T(\beta)}$ taken by the geodesic ${\gamma(t)}$ to go from the parallel ${C=C(d_0)}$ to ${C(x_{\min}(\beta))}$ and then from ${C(x_{\min}(\beta))}$ back to ${C}$ is independent of the basepoint ${q=\gamma(0)\in C}$. Indeed, this is a direct consequence of the rotational symmetry of our surface. Alternatively, this can be easily seen from the formula

$\displaystyle \frac{T(\beta)}{2} = \int_{x_{\min}(\beta)}^{x(0)}x^{r}\sqrt{\frac{1+(rx^{r-1})^2}{x^{2r}-c^2}} \, dx = \int_{d_0(\sin\beta)^{1/r}}^{d_0}x^{r}\sqrt{\frac{1+(rx^{r-1})^2}{x^{2r}-(d_0^r\sin\beta)^2}} \, dx$

deduced by integration of the ODE satisfied by ${x(t)}$. Observe that this formula also shows that ${T(\beta)}$ is uniformly bounded, i.e., ${T(\beta)=O_{d_0, r}(1)}$ for all ${\beta\neq 0}$. Geometrically, this means that all geodesics ${\gamma(t)}$ starting at ${C}$ must return to ${C}$ in bounded time unless they go directly into the cusp.
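The boundedness of ${T(\beta)}$ can also be observed numerically. The following is only a minimal sketch, not part of the argument: the values ${r=4}$ and ${d_0=0.5}$, the function names and the quadrature scheme (midpoint rule after the substitution ${x=x_{\min}+(d_0-x_{\min})s^2}$, which removes the inverse square-root singularity at ${x_{\min}(\beta)}$) are all assumptions made for illustration.

```python
import math

def roof_time(beta, r=4.0, d0=0.5, n=20000):
    """Approximate T(beta) from the integral formula for T(beta)/2 (midpoint rule)."""
    c = d0**r * math.sin(abs(beta))      # Clairaut constant c = d0^r sin(beta)
    x_min = c ** (1.0 / r)               # closest approach: x_min^r = c
    width = d0 - x_min
    total = 0.0
    for i in range(n):
        s = (i + 0.5) / n                # midpoint of the i-th cell of (0, 1)
        x = x_min + width * s * s        # substitution x = x_min + (d0 - x_min) s^2
        radicand = (1.0 + (r * x**(r - 1))**2) / (x**(2 * r) - c * c)
        # integrand of the formula times dx/ds = 2 (d0 - x_min) s
        total += x**r * math.sqrt(radicand) * 2.0 * width * s
    return 2.0 * total / n

def meridian_length(r=4.0, d0=0.5, n=20000):
    """Length of the meridian from the parallel C(d0) to the cusp (midpoint rule)."""
    h = d0 / n
    return h * sum(math.sqrt(1.0 + (r * ((i + 0.5) * h)**(r - 1))**2) for i in range(n))

for beta in (0.01, 0.1, 1.0):
    print(beta, roof_time(beta))
```

For these sample parameters, the computed return times decrease as ${\beta}$ grows and stay below twice the length of the meridian joining ${C(d_0)}$ to the cusp, in agreement with the bound ${T(\beta)=O_{d_0,r}(1)}$.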

This description of the excursions of geodesics near the cusp allows us to build a suspension-flow model of the geodesic flow near ${O}$. Indeed, let us consider the cross-section ${N=T^1_CS = T^1_{C(d_0)}S}$. As we saw above, an element of the surface ${N}$ is parametrized by two angular coordinates ${y}$ and ${\beta}$: the value of ${y}$ determines a point ${q=(d_0, d_0^r\cos y, d_0^r\sin y)\in C}$ and the value of ${\beta}$ determines a unit tangent vector ${v(\beta)\in T^1_qS}$ making angle ${\beta}$ with ${V(q)}$. The subset ${M}$ of ${N}$ consisting of those elements ${v(\beta)}$ with angular coordinate ${-\pi/2<\beta<\pi/2}$ corresponds to the unit vectors with footprint in ${C}$ pointing towards the cusp at ${O}$. The equation ${\beta=0}$ determines a circle ${\Sigma}$ inside ${M}$ corresponding to geodesics going straight into the cusp, and, furthermore, we have a natural “first-return map” ${F:M-\Sigma\rightarrow N}$ defined by ${F(v(\beta)) = \dot{\gamma}_{v(\beta)}(T(\beta))}$ where ${\gamma_{v(\beta)}}$ is the geodesic starting at ${v(\beta)}$ at time ${t=0}$.

In this setting, the orbits ${\gamma_{v(\beta)}(t)}$, ${t\in [0,T(\beta)]}$ are modeled by the “suspension flow” ${\varphi_t(v(\beta),s)=(v(\beta), s+t)}$ if ${0\leq s+t<T(\beta)}$, ${\varphi_{T(\beta)}(v(\beta),0) = (F(v(\beta)),0)}$ over the base map ${F}$ with roof function ${T:M-\Sigma\rightarrow\mathbb{R}}$, ${T(v(\beta))=T(\beta)}$.
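To make the suspension-flow rule concrete, here is a minimal sketch of a generic suspension flow over a base map with a roof function. It is not the actual first-return map ${F}$ of the geodesic flow: the circle rotation `rotation` and the roof `toy_roof` are hypothetical stand-ins chosen only so that the code runs.

```python
import math

def suspension_flow(point, s, t, base_map, roof):
    """Flow (point, s) forward by time t >= 0 in the suspension over base_map."""
    s += t
    # Each time the height reaches the roof, jump to the image under the base map.
    while s >= roof(point):
        s -= roof(point)
        point = base_map(point)
    return point, s

# Hypothetical toy data: a circle rotation as base map, and a roof bounded away from 0.
def rotation(theta):
    return (theta + 0.3) % (2 * math.pi)

def toy_roof(theta):
    return 1.0 + 0.5 * math.cos(theta)

# Flowing for exactly the roof time from height 0 lands on (base_map(point), 0).
print(suspension_flow(1.0, 0.0, toy_roof(1.0), rotation, toy_roof))
```

The `while` loop encodes the identification of ${(v,T(v))}$ with ${(F(v),0)}$; the roof is assumed to be bounded away from zero so that the loop terminates.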

Remark 8 Technically speaking, one needs to “complete” the definition of ${F}$ and ${T}$ by including the dynamics of the geodesic flow on the compact part of ${S}$ in order to properly write the geodesic flow on ${S}$ as a suspension flow. Nevertheless, since the major technical difficulty in the proof of Theorem 5 comes from the presence of the cusp, we will ignore the excursions of geodesics in the compact part of ${S}$ and we will pretend that the (partially defined) flow ${\varphi_t}$ is a “genuine” suspension flow model.

2.2. Rapid mixing of contact suspension flows

One of the advantages of thinking of the geodesic flow on ${S}$ as a suspension flow comes from the fact that several authors have previously studied the interplay between the rates of mixing of this class of flows and the features of the base map and the roof function: see, e.g., these papers of Avila-Gouëzel-Yoccoz and Melbourne for some results in this direction (and also for a precise definition of suspension flows).

For our current purposes, it is worth recalling that Bálint and Melbourne (cf. Theorem 2.1 [and Remarks 2.3 and 2.5] of this paper here) proved the rapid mixing property for contact suspension flows whose base map is modeled by a Young tower with exponential tails and whose roof function is bounded and uniformly piecewise Hölder continuous on each subset of the basis of the Young tower. In particular, the proof of Theorem 5 is complete once we prove that the base map ${F:M-\Sigma\rightarrow N}$ is modeled by Young towers and the roof function ${T:M-\Sigma\rightarrow \mathbb{R}}$ is bounded and uniformly piecewise Hölder continuous on each element of the basis of the Young tower (whatever this means).

As it turns out, the theory of Young towers (introduced by Young in these papers here and here) is a double-edged sword: while it provides an adequate setup for the study of statistical properties of systems with some hyperbolicity once the so-called Young towers were built, it has the drawback that the construction of Young towers (satisfying all five natural but technical axioms in Young’s definition) is usually a delicate issue: indeed, one has to find a countable Markov partition of a positive measure subset (working as the basis of the Young tower) so that the return maps associated to this Markov partition verify several hyperbolicity and distortion controls, and it is not always clear where one could possibly find such a Markov partition for a given dynamical system.

Fortunately, Chernov and Zhang gave a list of sufficient geometric properties for a two-dimensional map like ${F:M-\Sigma\rightarrow N}$ to be modeled by Young towers with exponential tails: in fact, Theorem 10 in Chernov-Zhang paper is a sort of “black-box” producing Young towers with exponential tails whenever seven geometrical conditions are fulfilled. For the sake of exposition, we will not attempt to check all seven conditions for ${F:M-\Sigma\rightarrow N}$: instead, we will focus on two main conditions called distortion bounds and one-step growth condition.

Before we discuss the distortion bounds and the one-step growth condition, we need to recall the concept of homogeneity strips (originally introduced by Bunimovich-Chernov-Sinai). In our setting, we take ${k_0\in\mathbb{N}}$ and ${\nu=\nu(r)\in\mathbb{N}}$ (to be chosen later) and we make a partition of a neighborhood of the singular set ${\Sigma}$ (of geodesics going straight into the cusp) into countably many strips:

$\displaystyle H_k:=\left\{(y,\beta)\in M: \frac{1}{(k+1)^{\nu}}<|\beta|<\frac{1}{k^{\nu}}\right\}$

for all ${k\in\mathbb{N}}$, ${k\geq k_0}$. (Actually, ${H_k}$ has two connected components, but we will slightly abuse notation by denoting each of these connected components by ${H_k}$.)
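As a small illustration of this definition, the index ${k}$ of the strip containing a given angle ${\beta}$ is the integer part of ${|\beta|^{-1/\nu}}$, because ${1/(k+1)^{\nu}<|\beta|<1/k^{\nu}}$ is equivalent to ${k<|\beta|^{-1/\nu}<k+1}$. The sketch below uses hypothetical helper names and sample values of ${\nu}$; boundary points between strips are assigned by floating-point rounding.

```python
import math

def strip_index(beta, nu, k0=1):
    """Return k with 1/(k+1)^nu < |beta| < 1/k^nu, or None outside the strips H_k, k >= k0."""
    b = abs(beta)
    if b == 0.0:
        return None                       # beta = 0 is the singular circle Sigma
    k = math.floor(b ** (-1.0 / nu))      # k < |beta|^(-1/nu) < k + 1
    return k if k >= k0 else None

def strip_width(k, nu):
    """Width of (one component of) H_k in the beta-direction."""
    return k ** (-nu) - (k + 1) ** (-nu)

print(strip_index(0.05, nu=2.0))          # 1/25 < 0.05 < 1/16, so this prints 4
```

Note that `strip_width(k, nu)` is at most ${\nu/k^{\nu+1}}$ (by the mean value theorem applied to ${s\mapsto s^{-\nu}}$ on ${[k,k+1]}$), so the strips shrink polynomially fast as they approach ${\Sigma}$.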

Intuitively, the partition ${H_k}$ into polynomial scales ${1/k^\nu}$ in the parameter ${\beta}$ is useful in our context because the relevant quantities (such as Gaussian curvature, first and second derivatives, etc.) for the study of the geodesic flow of the surface of revolution blow up with a polynomial speed as the excursions of geodesics get closer to the cusp (that is, as ${\beta\rightarrow 0}$). Thus, the important quantities for the analysis of the geodesic flow near the cusp become “almost constant” when restricted to one of the homogeneity strips ${H_k}$.

Also, another advantage of the homogeneity strips is the fact that they give a rough control of the elements of the countable Markov partition at the basis of the Young tower produced by Chernov-Zhang: indeed, the arguments of Chernov-Zhang show that each element of the basis of their Young tower is completely contained in a homogeneity strip. In particular, the verification of the uniform piecewise Hölder continuity of the roof function ${T:M-\Sigma\rightarrow \mathbb{R}}$ follows once we prove that the restriction ${T|_{H_k}}$ of the roof function to each homogeneity strip ${H_k}$ is uniformly Hölder continuous (in the sense that, for some ${0<\alpha=\alpha(r)\leq 1}$, the Hölder norms ${\|T|_{H_k}\|_{C^\alpha}}$ are bounded by a constant independent of ${k}$).

Coming back to the one-step growth and distortion bounds, let us content ourselves with formulating simpler versions of them (while referring to Sections 4 and 5 of the Chernov-Zhang paper for precise definitions): indeed, the actual definitions of these notions involve the properties of the derivative along unstable manifolds, and, in our current setting, we have just a partially defined map ${F:M-\Sigma\rightarrow N}$, so that we cannot talk about future iterates and unstable manifolds unless we “complete” the definition of ${F}$.

Nevertheless, even if ${F}$ is only partially defined, we still can give crude analogs to unstable directions for ${F}$ by noticing that the vector field ${w^u:=\partial/\partial\beta}$ on ${M-\Sigma}$ (whose leaves are ${\{y=constant\}}$) morally works like an unstable direction: in fact, this vector field is transverse to the singular set ${\Sigma=\{\beta=0\}}$ which is a sort of “stable set” because all trajectories of the geodesic flow starting at ${\Sigma}$ converge in the future to the same point, namely, the cusp at ${O}$. In terms of the “unstable direction” ${w^u=\partial/\partial\beta}$, we define the expansion factor ${\Lambda(v)}$ of ${F}$ at a point ${v=(y,\beta)\in M-\Sigma}$ as ${\Lambda(v):=\|DF(v)w^u\|/\|w^u\|}$, that is, the amount of expansion of the “unstable” vector field ${w^u}$ under ${DF(v)}$. Note that, from the definitions, the expansion factor ${\Lambda(v)}$ depends only on the ${\beta}$-coordinate of ${v=(y,\beta)}$. So, from now on, we will think of expansion factors as a function ${\Lambda(\beta)}$ of ${\beta}$.

In terms of expansion factors, the (variant of the) distortion bound condition is

$\displaystyle \frac{d\log\Lambda}{d\beta}(\beta_0)=\frac{\Lambda'(\beta_0)}{\Lambda(\beta_0)}\leq C\frac{1}{\beta_0^{\theta}} \ \ \ \ \ (2)$

where ${\theta=\theta(r)>0}$ satisfies ${\nu\theta < \nu+1}$, and the (variant of the) one-step growth condition is

$\displaystyle \sum\limits_{k=k_0}^{\infty}\Lambda_k^{-1}<1 \ \ \ \ \ (3)$

where ${\Lambda_k:=\min\limits_{v\in H_k}\Lambda(v) = \min\limits_{\frac{1}{(k+1)^{\nu}}\leq |\beta|\leq\frac{1}{k^{\nu}}} \Lambda(\beta)}$.

Remark 9 The one-step growth condition above is very close to the original version in the Chernov-Zhang paper (compare (3) with Equation (5.5) in the Chernov-Zhang article). On the other hand, the distortion bound condition (2) differs slightly from its original version in Equation (4.1) in the Chernov-Zhang paper. Nevertheless, they can be related as follows. The original distortion condition essentially amounts to giving estimates ${\log\prod\limits_{i=0}^n\frac{\Lambda(F^{-i}(v_1))}{\Lambda(F^{-i}(v_2))}\leq \psi(dist(v_1,v_2))}$ (where ${\psi}$ is a smooth function such that ${\psi(s)\rightarrow 0}$ as ${s\rightarrow 0}$) whenever ${v_1}$ and ${v_2}$ belong to the same homogeneous unstable manifold ${W}$ (i.e., a piece ${W}$ of unstable manifold such that ${F^{-j}(W)}$ never intersects the boundaries of the homogeneity strips ${H_k}$ for all ${j\geq 0}$ and ${k\geq k_0}$; the existence of homogeneous unstable manifolds through almost every point is guaranteed by a Borel-Cantelli type argument described in Appendix 2 of this paper of Bunimovich-Chernov-Sinai here). Here, the mean value theorem shows that

$\displaystyle \log\prod\limits_{i=0}^n\frac{\Lambda(F^{-i}(v_1))}{\Lambda(F^{-i}(v_2))} = \sum\limits_{i=0}^n \frac{\Lambda'(z_i)}{\Lambda(z_i)} dist(F^{-i}(v_1), F^{-i}(v_2))$

for some ${z_i\in F^{-i}(W)}$. Using the facts that ${dist(F^{-i}(v_1), F^{-i}(v_2))}$ decays exponentially fast (as ${v_1}$ and ${v_2}$ are in the same unstable manifold ${W}$) and ${F^{-i}(W)}$ is always contained in a homogeneity strip ${H_{k_i}}$ (as ${W}$ is a homogeneous unstable manifold), one can check that the estimate in (2) implies the desired uniform bound on the previous expression in terms of a smooth function ${\psi(s)}$ such that ${\psi(s)\rightarrow 0}$ as ${s\rightarrow 0}$. In other words, the estimate (2) can be shown to imply the original version of distortion bounds, so that we can safely concentrate on the proof of (2).

At this point, we can summarize the discussion so far as follows. By the Bálint-Melbourne criterion for rapid mixing of contact suspension flows and the Chernov-Zhang criterion for the existence of Young towers with exponential tails for the map ${F:M-\Sigma\rightarrow N}$, we have “reduced” the proof of Theorem 5 to the following statements:

Proposition 6 Given ${\nu>0}$ and ${0<\alpha<1/(\nu+1)}$, one has the following “uniform Hölder estimate”

$\displaystyle \sup\limits_{k\in\mathbb{N}} \|T|_{H_k}\|_{C^\alpha}<\infty$

whenever ${d_0}$ is sufficiently small (depending on ${r}$, ${\nu}$ and ${\alpha}$).

Proposition 7 The expansion factor function ${\Lambda(\beta)}$ satisfies:

• given ${\nu>r/(r-1)}$, we can choose ${k_0\in\mathbb{N}}$ large (and ${d_0}$ sufficiently small) so that

$\displaystyle \sum\limits_{k=k_0}^{\infty}\Lambda_k^{-1}<1$

where ${\Lambda_k = \min\limits_{\frac{1}{(k+1)^{\nu}}\leq |\beta|\leq\frac{1}{k^{\nu}}} \Lambda(\beta)}$;

• given ${r>3}$, we can choose ${\nu>r/(r-1)}$ and ${\theta>1+2/r}$ such that ${\nu\theta<\nu+1}$ and

$\displaystyle \frac{\Lambda'(\beta)}{\Lambda(\beta)}\leq C\frac{1}{\beta^{\theta}}$

for some (sufficiently large) constant ${C>0}$ and for all ${\beta}$.

The proofs of these two propositions are given in the next two subsections and they are based on the study of perpendicular unstable Jacobi fields related to the variations of geodesics of the form ${\gamma_{v(\beta)}(t)}$, ${0<\beta<\pi/2}$.

2.3. The derivative of the roof function

From now on, we fix ${q\in C=C(d_0)}$ (e.g., ${q=(d_0, d_0^r, 0)}$) and, for the sake of simplicity, we will denote a geodesic ${\gamma_{v(\beta)}(t)}$ corresponding to an initial vector ${v(\beta)\in T^1_qS}$ by ${\gamma_{\beta}(t)}$. Of course, there is no loss of generality here because of the rotational symmetry of the surface ${S}$. Also, we will suppose that ${\beta>0}$ as the case ${\beta<0}$ is symmetric.

Note that the roof function ${T(\beta)}$ is defined by the condition ${\gamma_{\beta}(T(\beta))\in C=C(d_0)}$, or, equivalently,

$\displaystyle d(\gamma_{\beta}(T(\beta))) = I(d_0) := \int_0^{d_0}\sqrt{1+(rx^{r-1})^2}dx$

where ${d(.)}$ denotes the distance from a point to the cusp at ${O}$ and ${I(d_0)}$ is the distance from ${C(d_0)}$ to ${O}$. By taking the derivative with respect to ${\beta}$ at ${\beta=\beta_0}$ and by recalling that ${-\nabla d = V}$, we obtain that

$\displaystyle 0=\langle\nabla d(c(\beta_0)), \dot{c}(\beta_0)\rangle = -\langle V(c(\beta_0)), \dot{c}(\beta_0)\rangle$

where ${c(\beta):=\gamma_{\beta}(T(\beta))}$. Since ${c(\beta) = C(\beta,T(\beta))}$ where ${C(\beta, t):=\gamma_{\beta}(t)}$, we have ${\dot{c}(\beta) = \frac{D \gamma_{\beta}}{\partial\beta} (T(\beta)) + \dot{\gamma}_{\beta}(T(\beta)) T'(\beta)}$, and, a fortiori,

$\displaystyle 0=\langle V(\gamma_{\beta_0}(T(\beta_0))), \frac {D \gamma_{\beta}}{\partial\beta}|_{\beta=\beta_0}(T(\beta_0))\rangle + \langle V(\gamma_{\beta_0}(T(\beta_0))), \dot{\gamma}_{\beta_0}(T(\beta_0))\rangle T'(\beta_0)$

Let us compute the two inner products above. By definition of the parameter ${\beta}$ and the symmetry of the revolution surface ${S}$, we have ${\langle V(\gamma_{\beta}(T(\beta))), \dot{\gamma}_{\beta}(T(\beta))\rangle = - \cos\beta = -\langle V(\gamma_{\beta}(0)), \dot{\gamma}_{\beta}(0)\rangle}$. Also, if we denote by ${J(t)=\frac {D \gamma_{\beta}}{\partial\beta}(t):= j(t)\cdot J\dot{\gamma}_{\beta}(t)}$ the perpendicular (“unstable”) Jacobi field along the geodesic ${\gamma_{\beta_0}(t)}$ associated to the variation of ${C(\beta,t)=\gamma_{\beta}(t)}$ (cf. Section 2 of this previous post here) with initial conditions ${j(0)=0}$ and ${j'(0)=1}$, then

$\displaystyle \begin{array}{rcl} \langle V(\gamma_{\beta_0}(T(\beta_0))), \frac {D \gamma_{\beta}}{\partial\beta}|_{\beta=\beta_0}(T(\beta_0))\rangle &=& j(T(\beta_0)) \langle V(\gamma_{\beta_0}(T(\beta_0))), J\dot{\gamma}_{\beta_0}(T(\beta_0))\rangle \\ &=& - j(T(\beta_0)) \langle V(\gamma_{\beta_0}(0)), J\dot{\gamma}_{\beta_0}(0)\rangle \\ & = & -j(T(\beta_0))\langle JV(\gamma_{\beta_0}(0)), \dot{\gamma}_{\beta_0}(0)\rangle \\ &=& - j(T(\beta_0))\sin\beta_0 \end{array}$

From the computation of the inner products above and the fact that they add up to zero, we deduce that ${0 = - j(T(\beta_0))\sin\beta_0 - (\cos\beta_0) T'(\beta_0)}$, that is,

$\displaystyle T'(\beta_0)=-(\tan\beta_0) j(T(\beta_0)) \ \ \ \ \ (4)$

In other terms, the previous equation says that the derivative ${T'(\beta_0)}$ can be controlled via the quantity ${j(T(\beta_0))}$ measuring the growth of the perpendicular Jacobi field ${J(t)}$ at the return time ${T(\beta_0)}$. Here, it is worth recalling that Jacobi fields are driven by Jacobi’s equation:

$\displaystyle j''(t)+K(t) j(t)=0$

where ${K(t)<0}$ is the Gaussian curvature of the surface of revolution ${S}$ at the point ${\gamma_{\beta_0}(t)}$. Also, it is useful to keep in mind that Jacobi’s equation implies that the quantity ${u=j'/j}$ satisfies Riccati’s equation

$\displaystyle u'(t)+u(t)^2 = k(t)^2$

where ${-k(t)^2:=K(t)}$.
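The relation between Jacobi’s and Riccati’s equations can be tested on a toy example. The sketch below assumes constant curvature ${K=-\kappa^2}$ (an assumption made only for this illustration, not the curvature of ${S}$), where ${j(t)=\sinh(\kappa t)/\kappa}$ and ${u(t)=\kappa\coth(\kappa t)}$ are explicit; the integrator is a standard fourth-order Runge-Kutta scheme and the function name is hypothetical.

```python
import math

def integrate_jacobi(k, t_end, steps=2000):
    """RK4 for j'' = k(t)^2 j with j(0) = 0, j'(0) = 1; returns (j, j') at t_end."""
    def rhs(t, j, dj):
        return dj, k(t) ** 2 * j
    h = t_end / steps
    t, j, dj = 0.0, 0.0, 1.0
    for _ in range(steps):
        a1, b1 = rhs(t, j, dj)
        a2, b2 = rhs(t + h / 2, j + h / 2 * a1, dj + h / 2 * b1)
        a3, b3 = rhs(t + h / 2, j + h / 2 * a2, dj + h / 2 * b2)
        a4, b4 = rhs(t + h, j + h * a3, dj + h * b3)
        j += h / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
        dj += h / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
        t += h
    return j, dj

kappa = 2.0                                   # toy constant value of k(t)
j, dj = integrate_jacobi(lambda t: kappa, 1.0)
u = dj / j                                    # u = j'/j solves u' + u^2 = kappa^2
print(j, math.sinh(kappa) / kappa)            # numerical vs. exact j(1)
print(u, kappa / math.tanh(kappa))            # numerical vs. exact u(1)
```

Passing a non-constant function `k` (for instance, the curvature along a numerically computed geodesic) would give the corresponding Jacobi field, but that is beyond this sketch.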

In the context of the surface of revolution ${S}$, these equations are important tools because we have the following explicit formula for the Gaussian curvature ${K(q)}$ at a point ${q=(x,x^r\cos y, x^r\sin y)\in S}$:

$\displaystyle K(q) = \frac{-r(r-1)}{x^2(1+(rx^{r-1})^2)^2}$

In particular, ${k(q):=\sqrt{r(r-1)}/x(1+(rx^{r-1})^2)}$ verifies ${-k(q)^2=K(q)}$.
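For the reader’s convenience, here is a sketch of where the curvature formula comes from, assuming the standard expression for the Gaussian curvature of the surface of revolution ${\{(x, f(x)\cos y, f(x)\sin y)\}}$ generated by a profile ${v=f(u)}$ with ${f>0}$:

$\displaystyle K = -\frac{f''(x)}{f(x)\,(1+f'(x)^2)^2}$

For ${f(x)=x^r}$, we have ${f'(x)=rx^{r-1}}$ and ${f''(x)=r(r-1)x^{r-2}}$, so that

$\displaystyle K = -\frac{r(r-1)x^{r-2}}{x^r(1+(rx^{r-1})^2)^2} = \frac{-r(r-1)}{x^2(1+(rx^{r-1})^2)^2}$

In particular, ${K<0}$ when ${r>1}$, and ${|K|}$ blows up like ${1/x^2}$ as ${x\rightarrow 0}$.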

Next, we take ${\varepsilon>0}$ and we consider the following auxiliary function:

$\displaystyle g(q):=\frac{r(1+\varepsilon)}{x}$

By definition, ${k(q)<g(q)}$. Furthermore, since ${g'(t)=-r(1+\varepsilon)x'(t)/x(t)^2}$ along the geodesic,

$\displaystyle k(t)^2-g(t)^2-g'(t)\leq \frac{r(r-1)}{x(t)^2} - \frac{r^2(1+\varepsilon)^2}{x(t)^2} + \frac{r(1+\varepsilon)x'(t)}{x(t)^2}$

Since the equation ${(1+(rx(t)^{r-1})^2) x'(t)^2 = 1-c^2/x(t)^{2r}=\cos^2\beta(t)}$ (describing the motion of geodesics on ${S}$) implies that ${|x'(t)|\leq 1}$, we deduce from the previous inequality that

$\displaystyle k(t)^2 - g(t)^2 - g'(t)\leq \frac{1}{x(t)^2}(r(r-1)-(r(1+\varepsilon))^2+r(1+\varepsilon))<0 \ \ \ \ \ (5)$

for all times ${t\in[0,T(\beta)]}$.
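The sign of the constant in (5) can be checked by expanding it explicitly:

$\displaystyle r(r-1)-r^2(1+\varepsilon)^2+r(1+\varepsilon) = r^2-r-r^2-2\varepsilon r^2-\varepsilon^2r^2+r+\varepsilon r = -\varepsilon r\,(2r-1+\varepsilon r)$

Since ${r>1}$ and ${\varepsilon>0}$, the factor ${2r-1+\varepsilon r}$ is positive, and, hence, the whole expression is negative, as claimed.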

This estimate allows us to control the solution ${u=j'/j}$ of Riccati’s equation along the following lines. The initial data of the Jacobi field ${J(t)}$ is ${j(0)=0}$ and ${j'(0)=1}$. Hence,

$\displaystyle \frac{j'(0)}{j(0)}=\infty>g(0)=\frac{r(1+\varepsilon)}{x(0)} = \frac{r(1+\varepsilon)}{d_0}$

In particular, there exists a well-defined maximal interval ${[0,t_0]\subset [0,T(\beta)]}$ where ${j'(t)/j(t)\geq g(t)}$ for all ${t\in[0, t_0]}$. By plugging this estimate into Jacobi’s equation, we get that

$\displaystyle \frac{j''(t)}{j'(t)}=\frac{k(t)^2j(t)}{j'(t)}\leq \frac{k(t)^2}{g(t)}\leq g(t)$

for each ${t\in [0, t_0]}$.

By integrating this inequality (and using the initial condition ${j'(0)=1}$), we obtain that

$\displaystyle \log j'(t_0)=\log \frac{j'(t_0)}{j'(0)} =\int_0^{t_0} \frac{j''(t)}{j'(t)} dt\leq \int_0^{t_0} g(t) dt.$

Therefore,

$\displaystyle j(t_0)\leq \frac{j'(t_0)}{g(t_0)}\leq \frac{1}{g(t_0)} \exp\left(\int_0^{t_0} g(t) dt\right)$

If ${t_0=T(\beta)}$, we deduce that ${j(T(\beta))\leq \frac{1}{g(T(\beta))}\exp\left(\int_0^{T(\beta)} g(t) dt\right)\leq \frac{1}{k(0)}\exp\left(\int_0^{T(\beta)} g(t) dt\right)}$ (as ${k(0)=k(T(\beta))<g(T(\beta))}$). Otherwise, ${0<t_0<T(\beta)}$ and ${u(t_0)=j'(t_0)/j(t_0) = g(t_0)}$. Since ${u=j'/j}$ satisfies Riccati’s equation, we deduce from (5) that

$\displaystyle u'(t_1) - g'(t_1) = k(t_1)^2 - u(t_1)^2 - g'(t_1) = k(t_1)^2 - g(t_1)^2 - g'(t_1) < 0$

at each time ${t_1}$ where ${u(t_1)=g(t_1)}$. It follows that ${j'(t)/j(t):=u(t)\leq g(t)}$ for all ${t\in [t_0, T(\beta)]}$. Hence,

$\displaystyle \log\frac{j(T(\beta))}{j(t_0)} = \int_{t_0}^{T(\beta)} \frac{j'(t)}{j(t)} dt\leq \int_{t_0}^{T(\beta)} g(t) dt,$

and, a fortiori,

$\displaystyle \begin{array}{rcl} j(T(\beta))&\leq& j(t_0)\exp\left(\int_{t_0}^{T(\beta)} g(t) dt\right) \\ &\leq& \frac{1}{g(t_0)} \exp\left(\int_{0}^{t_0} g(t) dt\right) \exp\left(\int_{t_0}^{T(\beta)} g(t) dt\right) \\ &\leq & \frac{1}{k(0)} \exp\left(\int_0^{T(\beta)} g(t) dt\right). \end{array}$

In other words, we proved that

$\displaystyle j(T(\beta))\leq \frac{1}{k(0)} \exp\left(\int_0^{T(\beta)} g(t) dt\right) \ \ \ \ \ (6)$

independently of whether ${t_0=T(\beta)}$ or ${0<t_0<T(\beta)}$.

Now, the quantity ${\exp\left(\int_0^{T(\beta)} g(t) dt\right)}$ can be estimated as follows. By differentiating Clairaut’s relation ${x(t)^r\sin\beta(t)=c}$, we get

$\displaystyle rx(t)^{r-1}x'(t)\sin\beta(t) + x(t)^r(\cos\beta(t))\beta'(t)=0,$

that is,

$\displaystyle \frac{1}{x(t)} = -\frac{1}{r}\frac{\cos\beta(t)}{x'(t)}\frac{\beta'(t)}{\sin\beta(t)} \ \ \ \ \ (7)$

Since ${\sin\beta(t)\sim\beta(t)}$ (as we are interested in small angles ${|\beta|<1/k_0^{\nu}}$, ${k_0}$ large) and ${\cos\beta(t)\sim |x'(t)|}$ (thanks to the relation ${(1+(rx(t)^{r-1})^2) x'(t)^2 =1 - c^2/x(t)^{2r} = (\cos\beta(t))^2}$ and the fact that ${r>1}$ and, thus, ${1\leq 1+(rx(t)^{r-1})^2\leq 1+(rd_0^{r-1})^2\sim 1}$ for ${d_0}$ small), we conclude that

$\displaystyle g(t)=\frac{r(1+\varepsilon)}{x(t)}\leq (1+2\varepsilon)\frac{\beta'(t)}{\beta(t)}$

for ${t\in [0, T(\beta)/2]}$. Here, we used the fact that ${x'(t)<0}$ for ${t\in[0, T(\beta)/2]}$. Therefore,

$\displaystyle \int_0^{T(\beta)/2} g(t) dt \leq (1+2\varepsilon) \log\frac{\pi/2}{\beta(0)}$

since ${\beta(T(\beta)/2)=\pi/2}$. Also, the symmetry of the surface ${S}$ implies ${x(t)=x(T(\beta)-t)}$ and, hence,

$\displaystyle \int_0^{T(\beta)/2}g(t) dt = \int_{T(\beta)/2}^{T(\beta)} g(t) dt$

In summary, we have shown that ${\int_0^{T(\beta)}g(t) dt\leq 2(1+2\varepsilon)\log(\pi/2\beta(0))}$, i.e.,

$\displaystyle \exp\left(\int_0^{T(\beta)} g(t) dt\right)\leq (\pi/2)^{2(1+2\varepsilon)}\frac{1}{\beta(0)^{2(1+2\varepsilon)}} \ \ \ \ \ (8)$

By putting together (4), (6) and (8), we conclude that

$\displaystyle |T'(\beta_0)|\leq\frac{\tan\beta_0}{k(0)}\exp\left(\int_0^{T(\beta_0)} g(t) dt\right)\leq C\frac{\beta_0}{\beta_0^{2(1+2\varepsilon)}} = \frac{C}{\beta_0^{1+4\varepsilon}} \ \ \ \ \ (9)$

for some constant ${C>0}$ depending on ${r>1}$ and ${\varepsilon>0}$.

At this stage, we are ready to complete the proof of Proposition 6.

Proof: Let us estimate the Hölder constant ${\|T|_{H_k}\|_{C^{\alpha}}}$. For this sake, we fix ${\beta_1, \beta_2\in H_k}$ and we write

$\displaystyle \frac{|T(\beta_1)-T(\beta_2)|}{|\beta_1-\beta_2|^{\alpha}} = |T'(\beta_3)|\cdot |\beta_1-\beta_2|^{1-\alpha}$

for some ${\beta_3\in H_k}$ between ${\beta_1}$ and ${\beta_2}$. Since ${|\beta_1-\beta_2|\leq k^{-\nu}-(k+1)^{-\nu}\leq \nu/k^{\nu+1}}$ and ${|\beta_3|\geq (k+1)^{-\nu}}$, it follows from (9) that

$\displaystyle \frac{|T(\beta_1)-T(\beta_2)|}{|\beta_1-\beta_2|^{\alpha}} \leq C\nu^{1-\alpha}\frac{(k+1)^{\nu(1+4\varepsilon)}}{k^{(\nu+1)(1-\alpha)}}$

Because ${\beta_1}$ and ${\beta_2}$ are arbitrary points in ${H_k}$, we have that

$\displaystyle \|T|_{H_k}\|_{C^{\alpha}}\leq C\frac{(k+1)^{\nu(1+4\varepsilon)}}{k^{(\nu+1)(1-\alpha)}}$

where ${C>0}$ is an appropriate constant.

Now, our assumption ${0<\alpha<1/(\nu+1)}$ implies that ${(\nu+1)(1-\alpha)>\nu}$, so that we can choose ${\varepsilon>0}$ sufficiently small so that ${\nu(1+4\varepsilon)\leq (\nu+1)(1-\alpha)}$. By doing so, we see from the previous estimate that

$\displaystyle \sup\limits_{k\in\mathbb{N}}\|T|_{H_k}\|_{C^{\alpha}}<\infty$

whenever ${\varepsilon>0}$, i.e., ${d_0>0}$, is sufficiently small. This proves Proposition 6. $\Box$
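The bookkeeping of exponents in this proof can be double-checked numerically. The sketch below is only an illustration with hypothetical function names and sample values of ${\nu}$ and ${\alpha}$: it computes the largest ${\varepsilon}$ allowed by the condition ${\nu(1+4\varepsilon)\leq(\nu+1)(1-\alpha)}$ and evaluates the ${k}$-dependent factor of the Hölder estimate.

```python
def max_epsilon(nu, alpha):
    """Largest eps with nu*(1 + 4*eps) <= (nu + 1)*(1 - alpha).

    The result is positive exactly when alpha < 1/(nu + 1).
    """
    return ((nu + 1) * (1 - alpha) - nu) / (4 * nu)

def holder_ratio(k, nu, alpha, eps):
    """The factor (k+1)^(nu(1+4 eps)) / k^((nu+1)(1-alpha)) bounding the Holder constant on H_k."""
    return (k + 1) ** (nu * (1 + 4 * eps)) / k ** ((nu + 1) * (1 - alpha))

nu, alpha = 2.0, 0.2                      # sample values with alpha < 1/(nu + 1)
eps = max_epsilon(nu, alpha) / 2          # any smaller positive eps also works
print(max(holder_ratio(k, nu, alpha, eps) for k in range(1, 10001)))
```

For an admissible ${\varepsilon}$, the factor is at most ${((k+1)/k)^{\nu(1+4\varepsilon)}\leq 2^{\nu(1+4\varepsilon)}}$, which is the uniformity in ${k}$ used above; for ${\alpha>1/(\nu+1)}$, `max_epsilon` becomes negative and no admissible ${\varepsilon>0}$ exists.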

2.4. Some estimates for the expansion factors ${\Lambda(\beta)}$

Similarly to the previous subsection, the proof of Proposition 7 uses the properties of Jacobi’s and Riccati’s equation to study

$\displaystyle \Lambda(\beta):=j(T(\beta)) + j'(T(\beta)) \ \ \ \ \ (10)$

where ${j(t)=j_{\beta}(t)}$ is the scalar function (with ${j(0)=0}$ and ${j'(0)=1}$) measuring the size of the perpendicular “unstable” Jacobi field along ${\gamma_{\beta}(t)}$.

We begin by giving a lower bound on ${\Lambda(\beta)}$. Given ${\varepsilon>0}$, let us choose ${d_0=d_0(\varepsilon, r)>0}$ small so that

$\displaystyle \sqrt{1-\varepsilon}<\frac{1}{1+(rd_0^{r-1})^2}(\leq 1)$

Of course, this choice of ${d_0}$ is possible because ${r>1}$. Next, we consider the auxiliary function:

$\displaystyle h(q):=\frac{(r-1)(1-2\varepsilon)}{x}.$

By definition, ${h(q)<\sqrt{r(r-1)}/x(1+(rd_0^{r-1})^2)\leq k(q)}$. Furthermore,

$\displaystyle h'(t) = -\frac{(r-1)(1-2\varepsilon)}{x(t)^2}x'(t)$

In particular,

$\displaystyle k(t)^2-h(t)^2-h'(t)>\frac{r(r-1)(1-\varepsilon)}{x(t)^2}-\frac{(r-1)^2(1-2\varepsilon)^2}{x(t)^2} + \frac{(r-1)(1-2\varepsilon)x'(t)}{x(t)^2}$

Since ${|x'(t)|\leq 1}$ (cf. the paragraph before (5)), we deduce from the previous estimate that

$\displaystyle k(t)^2-h(t)^2-h'(t)>0$

This inequality implies that the solution ${u(t)=j'(t)/j(t)}$ of Riccati’s equation satisfies ${u(t)\geq h(t)}$ for all ${t\in [0, T(\beta)]}$. Indeed, the initial condition ${j'(0)=1}$, ${j(0)=0}$ says that ${u(0)=\infty>h(0)}$ and the inequality above tells us that

$\displaystyle u'(t_1)-h'(t_1) = k(t_1)^2 - u(t_1)^2 - h'(t_1) = k(t_1)^2 - h(t_1)^2 - h'(t_1) >0$

at any time ${t_1}$ where ${u(t_1)=h(t_1)}$.

By integrating the estimate ${u(t)=j'(t)/j(t)\geq h(t)}$ over the interval ${[t_0, T(\beta)]}$, we obtain that

$\displaystyle \log \frac{j(T(\beta))}{j(t_0)} = \int_{t_0}^{T(\beta)} \frac{j'(t)}{j(t)} dt \geq \int_{t_0}^{T(\beta)} h(t) dt,$

i.e.,

$\displaystyle j(T(\beta))\geq j(t_0)\exp\left(\int_{t_0}^{T(\beta)} h(t) dt\right)$

For the sake of concreteness, let us set ${t_0:=d_0/10}$ and let us restrict our attention to geodesics whose initial angle ${\beta=\beta(0)}$ with the meridians of ${S}$ is sufficiently small so that ${T(\beta)\geq d_0/2}$. In this way, we have that ${j(t_0)\geq t_0=d_0/10}$ (thanks to Jacobi’s equation ${j''=k^2j}$ and our initial conditions ${j(0)=0}$ and ${j'(0)=1}$, which give ${j'(t)\geq 1}$ for all ${t\geq 0}$). Hence, the inequality above becomes

$\displaystyle j(T(\beta))\geq \frac{d_0}{10}\exp\left(\int_{t_0}^{T(\beta)} h(t) dt\right)$

Next, we observe that ${\exp\left(\int_{t_0}^{T(\beta)} h(t) dt\right)}$ can be bounded from below in a similar way to our derivation of a bound from above to ${\exp\left(\int_{0}^{T(\beta)} g(t) dt\right)}$ in the previous subsection: in fact, by repeating the arguments appearing after (7) above, one can show that

$\displaystyle h(t)\geq \frac{(r-1)(1-3\varepsilon)}{r}\frac{\beta'(t)}{\beta(t)}$

and

$\displaystyle \exp\left(\int_{t_0}^{T(\beta)} h(t) dt \right)\geq \overline{c} \frac{1}{\beta(0)^{(r-1)(1-3\varepsilon)/r}}$

where ${\overline{c}>0}$ is an adequate (small) constant depending on ${r}$, ${d_0}$ and ${\varepsilon}$.

By putting together the estimates above, we deduce that

$\displaystyle \Lambda(\beta)\geq j(T(\beta))\geq c\frac{1}{\beta(0)^{(r-1)(1-3\varepsilon)/r}}$

where ${c=d_0\overline{c}/10}$.

This inequality shows that

$\displaystyle \sum\limits_{k=k_0}^{\infty} \Lambda_k^{-1}\leq \frac{1}{c} \sum\limits_{k=k_0}^{\infty}\frac{1}{k^{(r-1)\nu(1-3\varepsilon)/r}}$

Thus, if ${\nu>r/(r-1)}$, then we can choose ${\varepsilon>0}$ small (with ${(r-1)(1-3\varepsilon)\nu/r>1}$) and ${k_0\in\mathbb{N}}$ large so that (our variant of) the one-step growth condition (3) holds. This proves the first part of Proposition 7.
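The choice of ${k_0}$ can be made explicit by comparing the tail of the series with an integral: for ${p>1}$, one has ${\sum_{k\geq k_0}k^{-p}\leq k_0^{-p}+k_0^{1-p}/(p-1)}$. The sketch below is a hedged illustration: the constant `c = 0.1` is hypothetical (the actual constant depends on ${r}$, ${d_0}$ and ${\varepsilon}$ and is not computed here), and ${p}$ stands for the exponent ${(r-1)\nu(1-3\varepsilon)/r}$.

```python
def tail_bound(k0, p):
    """Upper bound for sum_{k >= k0} k^(-p), p > 1, by comparison with an integral."""
    return k0 ** (-p) + k0 ** (1 - p) / (p - 1)

def choose_k0(c, p):
    """Return some k0 (a power of 2, not necessarily the smallest) with tail_bound(k0, p)/c < 1."""
    if p <= 1:
        raise ValueError("the series diverges unless p > 1")
    k0 = 1
    while tail_bound(k0, p) / c >= 1:
        k0 *= 2
    return k0

# Sample values: r = 4, nu = 2, eps = 0.01 give p = (r-1)*nu*(1-3*eps)/r = 1.455.
p, c = 1.455, 0.1                         # c is a hypothetical constant
k0 = choose_k0(c, p)
print(k0, tail_bound(k0, p) / c)
```

The requirement ${p>1}$ is exactly the condition ${(r-1)(1-3\varepsilon)\nu/r>1}$, which is why ${\nu>r/(r-1)}$ is needed; as ${p}$ approaches ${1}$, the required ${k_0}$ grows quickly.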

Finally, we give an indication of the proof of the second part of Proposition 7 (i.e., the distortion bound (2)). We start by writing

$\displaystyle \frac{\Lambda'(\beta)}{\Lambda(\beta)} = \frac{d}{d\beta}\log\Lambda(\beta)$

and by noticing that

$\displaystyle \log\Lambda(\beta) = \log(j(T(\beta))+j'(T(\beta))) = \log j(T(\beta)) + \log(1+u(T(\beta)))$

Next, we take the derivative with respect to ${\beta}$ of the previous expression. Here, we obtain several terms involving some quantities already estimated above via Jacobi’s and Riccati’s equation (such as ${j(T(\beta))}$, ${T'(\beta)}$, etc.), but also a new quantity appears, namely, ${u_{\beta}(t)}$, i.e., the derivative with respect to ${\beta}$ of the family of solutions ${u(t)=u(t,\beta)}$ of Riccati’s equation along ${\gamma_{\beta}(t)}$. Here, the “trick” to give bounds on ${u_{\beta}(t)}$ is to differentiate Riccati’s equation

$\displaystyle u'(t)+u(t)^2 = k(t)^2$

with respect to ${\beta}$ in order to get an ODE (in the time variable ${t}$) satisfied by ${u_{\beta}(t)}$. In this way, it is possible to see that one has reasonable bounds on ${u_{\beta}(t)}$ as soon as one controls the derivative ${k_{\beta}}$ of the square root ${k}$ of the absolute value ${-K}$ of the Gaussian curvature. Here, ${k_{\beta}}$ can be bounded by recalling that we have an explicit formula

$\displaystyle K=\frac{-r(r-1)}{x^2(1+(rx^{r-1})^2)^2}$

for the Gaussian curvature. By following these lines, one can prove that, for a given ${\varepsilon>0}$, the distortion bound

$\displaystyle \frac{\Lambda'(\beta)}{\Lambda(\beta)}\leq C\frac{1}{\beta(0)^{(1+2/r)(1+\varepsilon)}} = \frac{C}{\beta(0)^{\theta}}$

holds whenever ${d_0>0}$ is taken sufficiently small. In other words, by taking ${\theta = \theta(r) = (r+2)(1+\varepsilon)/r}$, we have ${\Lambda'(\beta)/\Lambda(\beta) \leq C\beta(0)^{-\theta}}$.

Note that the estimate in the previous paragraph gives the desired distortion bounds (2) once we show that ${\theta=\theta(r)=\frac{(r+2)}{r}+}$ can be selected such that ${\nu\theta<\nu+1}$. In order to check this, it suffices to recall that ${\nu-r/(r-1)>0}$ can be taken arbitrarily small (cf. the proof of the first part of Proposition 7), i.e., ${\nu=\frac{r}{r-1}+}$. So,

$\displaystyle \nu\theta = \left(\frac{r}{r-1}+\right)\left(\frac{r+2}{r}+\right) = \frac{r+2}{r-1}+$

and

$\displaystyle \nu+1 = \frac{r}{r-1}+1+ = \frac{2r-1}{r-1}+$

Since ${r+2<2r-1}$ for ${r>3}$, it follows that ${\nu\theta<\nu+1}$ for adequate choices of ${\theta}$ and ${\nu}$. This completes our sketch of proof of the second part of Proposition 7.
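The role of the threshold ${r=3}$ in this last step can be checked with a few lines of code. In the sketch below, the function names and the small margins `delta` and `eps` (standing for the “${+}$” in ${\nu=\frac{r}{r-1}+}$ and ${\theta=\frac{r+2}{r}+}$) are assumptions introduced for illustration.

```python
def exponents(r, delta, eps):
    """nu = r/(r-1) + delta and theta = (1 + 2/r)(1 + eps), with small delta, eps > 0."""
    nu = r / (r - 1) + delta
    theta = (1 + 2 / r) * (1 + eps)
    return nu, theta

def distortion_compatible(r, delta, eps):
    """True when the exponents satisfy nu * theta < nu + 1."""
    nu, theta = exponents(r, delta, eps)
    return nu * theta < nu + 1

for r in (2.5, 3.0, 3.1, 4.0):
    print(r, distortion_compatible(r, delta=1e-3, eps=1e-3))
```

For ${r\leq 3}$ the condition fails for every positive choice of the margins (since then ${\frac{r+2}{r-1}\geq\frac{2r-1}{r-1}}$), while for ${r>3}$ it holds as soon as the margins are small enough, which is where the assumption ${r>3}$ enters.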