Posted by: matheuscmss | November 11, 2014

## On the speed of ergodicity of horocycle maps

Last Thursday (November 6, 2014), Giovanni Forni gave a 1-hour talk at Orsay about his joint work with Livio Flaminio and James Tanis on the speed of ergodicity of horocycle maps.

In this blog post, I will transcribe my notes for Giovanni’s talk. Of course, all mistakes in this post are entirely my responsibility. Also, I apologize in advance for any wrong statements in what follows: indeed, I arrived at the seminar room about 10 minutes after Giovanni’s talk had started; furthermore, since the seminar room was crowded (about 30 to 40 mathematicians were attending the talk), I was forced to sit in the back of the room and consequently I sometimes could not properly hear Giovanni’s explanations.

The main actor in Giovanni’s talk was the classical horocycle flow ${h_t}$. By definition, ${h_t}$ is the flow induced by the action of the ${1}$-parameter subgroup ${\left( \begin{array}{cc} 1 & t \\ 0 & 1 \end{array}\right)}$ on the unit cotangent bundle ${SL(2,\mathbb{R})/\Gamma=:M}$ of a hyperbolic surface ${\mathbb{H}/\Gamma}$ of finite area (i.e., ${\Gamma}$ is a lattice of ${SL(2,\mathbb{R})}$).

The optimal speed of ergodicity (rate of convergence of Birkhoff averages) for classical horocycle flows was the subject of several papers in the literature of Dynamical Systems: for example, after the works of Zagier, Sarnak, Burger, Ratner, Flaminio-Forni, Strömbergsson, etc., we know that the rate of ergodicity is intimately related to the eigenvalues of the Laplacian (“size of the spectral gap”) of the corresponding hyperbolic surface (and, furthermore, this is related to the Riemann hypothesis in the case ${\Gamma= SL(2,\mathbb{Z})}$).

The bulk of Giovanni’s talk was the discussion of the analogous problem for horocycle maps, that is, the question of determining the optimal ${\alpha>0}$ such that the iterates of the time ${T}$ map of the horocycle flow ${h_t}$ satisfy

$\displaystyle \left|\sum\limits_{n=0}^{N-1} f(h_{nT}(x)) - N \int f d \textrm{vol}_M\right|\leq C_T N^{1-\alpha}$

The basic motivations behind this question are potential applications to “sparse equidistribution problems” (some of them coming from Number Theory) such as:

• The following particular case of Sarnak’s conjecture on the randomness of the Möbius function: for all non-periodic ${x\in M=SL(2,\mathbb{R})/SL(2,\mathbb{Z})}$ and ${f\in C^0(M)}$, one has

$\displaystyle \frac{1}{\#\{p\leq P: p \textrm{ is prime}\}}\sum\limits_{\stackrel{p\leq P,}{p \textrm{ prime}}} f(h_p(x)) \stackrel{P\rightarrow\infty}{\longrightarrow} \int f d\textrm{vol}_M.$

In other words, the non-conventional ergodic averages of the horocycle flow along prime numbers at every point converge to the spatial average.

• N. Shah’s conjecture: for each ${\gamma>0}$, one has

$\displaystyle \frac{1}{N}\sum\limits_{n=0}^{N-1} f(h_{n^{1+\gamma}}(x)) \rightarrow \int f d\textrm{vol}_M$

for all ${x\in M}$ and ${f\in C^0(M)}$ whenever ${M=SL(2,\mathbb{R})/\Gamma}$ with ${\Gamma}$ cocompact (i.e., the hyperbolic surface ${\mathbb{H}/\Gamma}$ is compact). In other terms, the non-conventional ergodic averages of the horocycle flow along a polynomial sequence of times of the form ${n^{1+\gamma}}$, ${\gamma>0}$, at every point converge to the spatial average.

Also, Giovanni expects that the tools developed to obtain an estimate of the form ${\left|\sum\limits_{n=0}^{N-1} f(h_{nT}(x)) - N \int f d \textrm{vol}_M\right|\leq C_T N^{1-\alpha}}$ could help in deriving quantitative versions of Ratner’s equidistribution results in more general contexts than classical horocycle flows.

Before stating some of the main results of Flaminio, Forni and Tanis, let us just mention that:

• Sarnak and Ubis gave in 2011 the following evidence towards the particular case of Sarnak’s conjecture stated above: every weak-${\ast}$ limit of the sequence of probability measures

$\displaystyle \frac{1}{\#\{p\leq P: p \textrm{ is prime}\}}\sum\limits_{\stackrel{p\leq P,}{p \textrm{ prime}}} \delta_{h_p(x)}$

is an absolutely continuous measure (with respect to ${\textrm{vol}_M}$) whose density is bounded by ${10}$. (Here, ${\delta_z}$ is the usual Dirac mass at ${z}$);

• Very roughly speaking, evidence in favor of Shah’s conjecture for ${\gamma>0}$ very small is the fact that ${n^{1+\gamma}}$ behaves like a linear function ${n^{1+\gamma}\sim N^{\gamma}n}$ (with a mildly large factor ${N^{\gamma}}$) for ${n\sim N}$, so that Shah’s conjecture should not be very far from the corresponding statement of equidistribution for linear sequences of times ${nT}$. As it turns out, Flaminio, Forni and Tanis were able to convert this heuristic argument into a proof of Shah’s conjecture for ${\gamma>0}$ very small: indeed, they are confident that Shah’s conjecture is settled for ${0<\gamma<1/20}$ and they hope to push their methods to get the same results for ${0<\gamma<1/11}$. Here, a key ingredient is Theorem 1 below where Flaminio, Forni and Tanis establish a precise control on the quantity ${\left|\sum\limits_{n=0}^{N-1} f(h_{nT}(x)) - N \int f d \textrm{vol}_M\right|}$.
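
The heuristic in the last item can be checked numerically. The following Python sketch (my own illustration, not part of the talk; the function names and the sample values of ${N}$ and ${\gamma}$ are arbitrary choices) computes the ratio between ${n^{1+\gamma}}$ and the linear sequence ${N^{\gamma}n}$ for ${N\leq n\leq 2N}$: since this ratio equals ${(n/N)^{\gamma}}$, it stays between ${1}$ and ${2^{\gamma}}$.

```python
def ratio_to_linear(n: int, N: int, gamma: float) -> float:
    """Ratio n^(1+gamma) / (N^gamma * n), which equals (n/N)^gamma."""
    return n ** (1 + gamma) / (N ** gamma * n)


def ratio_range(N: int, gamma: float, samples: int = 100):
    """Smallest and largest ratio over sample points n in [N, 2N]."""
    step = max(N // samples, 1)
    ratios = [ratio_to_linear(n, N, gamma) for n in range(N, 2 * N + 1, step)]
    return min(ratios), max(ratios)


if __name__ == "__main__":
    low, high = ratio_range(N=10**6, gamma=0.05)
    # Both values lie in [1, 2**0.05] up to rounding errors.
    print(low, high)
```

For small ${\gamma}$ the factor ${2^{\gamma}}$ is close to ${1}$, which is the sense in which ${n^{1+\gamma}}$ is "almost linear" on the scale ${n\sim N}$.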

After this brief introduction to horocycle maps, we are ready to state the main result of this post:

Theorem 1 (Flaminio-Forni-Tanis) Let ${\Gamma}$ be a cocompact lattice of ${SL(2,\mathbb{R})}$ and fix ${T>0}$. For all ${x\in M = SL(2,\mathbb{R})/\Gamma}$ and ${f\in C^{\infty}(M)}$, one has

$\displaystyle \left|\sum\limits_{n=0}^{N-1} f(h_{nT}(x)) - N\int f d\textrm{vol}_M\right|\leq C N^{(1+\nu_0)/2} \|f\| + C_T N^{5/6} (\log N)^{1/2} \|f\| \ \ \ \ \ (1)$

where ${\mu_0>0}$ is the smallest positive eigenvalue of the Laplacian ${\Delta}$ on the hyperbolic surface ${\mathbb{H}/\Gamma}$, ${\nu_0}$ is the following quantity (related to the so-called spectral gap):

$\displaystyle \nu_0=\left\{\begin{array}{cl}\sqrt{1-4\mu_0} & \textrm{if } \mu_0\leq 1/4 \\ 0 & \textrm{otherwise}\end{array}\right.,$

${\|f\|}$ is an adequate Sobolev norm ${H^s}$ (say, ${s=12}$, i.e., ${\|f\|}$ depends on the first twelve derivatives of ${f}$), ${C}$ is a “universal” constant (depending on ${\Gamma}$ only) and ${C_T}$ is a constant depending on ${T}$ and ${\Gamma}$.

The right-hand side of (1) says that the quantity ${\left|\sum\limits_{n=0}^{N-1} f(h_{nT}(x)) - N\int f d\textrm{vol}_M\right|}$ is controlled by a “spectral term” ${CN^{\frac{1+\nu_0}{2}}\|f\|}$ and by a “uniform term” ${C_T N^{5/6} (\log N)^{1/2} \|f\|}$. In particular, this quantity is controlled exclusively by the uniform part when the spectral gap is sufficiently large (i.e., ${1-\nu_0>1/3}$).
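
To see which term dominates for a given surface, one can compare the two exponents of ${N}$ in (1). The Python sketch below is only a rough comparison: the function names are mine, and the logarithmic factor in the uniform term is ignored. It computes ${\nu_0}$ from ${\mu_0}$ and reports the larger exponent; the two exponents coincide when ${\nu_0=2/3}$, i.e., ${\mu_0=5/36}$.

```python
import math


def nu0(mu0: float) -> float:
    """nu_0 = sqrt(1 - 4*mu_0) if mu_0 <= 1/4, and 0 otherwise (as in Theorem 1)."""
    return math.sqrt(1.0 - 4.0 * mu0) if mu0 <= 0.25 else 0.0


def exponents(mu0: float):
    """Exponents of N in the spectral term and in the uniform term of (1)."""
    spectral = (1.0 + nu0(mu0)) / 2.0
    uniform = 5.0 / 6.0  # the factor (log N)^(1/2) is ignored here
    return spectral, uniform


def dominant_term(mu0: float) -> str:
    """Name of the term with the larger exponent of N."""
    spectral, uniform = exponents(mu0)
    return "spectral" if spectral > uniform else "uniform"


if __name__ == "__main__":
    for mu0 in (0.05, 0.1, 0.2, 0.3):
        print(mu0, exponents(mu0), dominant_term(mu0))
```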

Remark 1 The proof of Theorem 1 shows that the “spectral term” can also be eliminated if ${f}$ is a coboundary. In other terms, if ${f=Ug}$ where ${g}$ is a bounded function and ${U=\left(\begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array}\right)}$ is the infinitesimal generator (vector field) of the horocycle flow ${\{h_t\}_{t\in\mathbb{R}}}$, then one has:

$\displaystyle \left|\sum\limits_{n=0}^{N-1} f(h_{nT}(x)) - N\int f d\textrm{vol}_M\right|\leq C_T N^{5/6} (\log N)^{1/2} \|f\|$

The first step in the proof of Theorem 1 is to take the Fourier transform in the time variable of the expression

$\displaystyle \sum\limits_{n=0}^{N-1} f(h_{nT}(x)) - N\int f d\textrm{vol}_M$

By doing so, we are naturally led to the study of the following twisted ergodic averages:

$\displaystyle \left| \int_0^N e^{i\lambda t} f(h_t(x)) dt\right|$

where ${\lambda\neq 0}$ (“${\lambda=n/T}$”) and ${f\in C^{\infty}(M)}$ has zero average. Then, Flaminio, Forni and Tanis use this to show that Theorem 1 follows from the following result about twisted ergodic averages:

Theorem 2 (Flaminio-Forni-Tanis) In the setting of Theorem 1, one has

$\displaystyle \left|\int_0^N e^{i\lambda t} f(h_t(x)) dt\right|\leq C(\lambda) N^{5/6}(\log N)^{1/2} \|f\|$

for any ${\lambda\neq 0}$ and ${f\in C^{\infty}(M)}$.

Remark 2 The proof of this theorem provides us with a constant ${C(\lambda)\rightarrow\infty}$ as ${\lambda\rightarrow 0}$, and, in fact, ${C(\lambda)\simeq \lambda^{-1/6}}$ in this regime. This is expected because it is known that the speed of mixing of the horocycle flow gets slower as the size of the spectral gap gets close to zero, and, thus, one cannot hope for a “uniform control” ${N^{5/6}(\log N)^{1/2}}$ without letting the constant ${C(\lambda)}$ explode as ${\lambda\rightarrow 0}$.

Remark 3 The proof of this theorem also provides us with a constant ${C(\lambda)\rightarrow\infty}$ as ${\lambda\rightarrow \infty}$ but Flaminio, Forni and Tanis have some hope to improve this (as there is no a priori reason to expect this kind of behavior in this regime).

Before giving a sketch of the proof of Theorem 2, let us recall that:

• Venkatesh obtained in 2010 the following bound:

$\displaystyle \left|\int_0^N e^{i\lambda t} f(h_t(x)) dt\right|\leq C N^{1-b}\|f\|$

where ${0<b<1}$ depends on the spectral gap, and, more recently,

• Tanis and Vishe announced (in 2014) that:

$\displaystyle \left|\int_0^N e^{i\lambda t} f(h_t(x)) dt\right|\leq C (1+|\lambda|^{-2})N^{19/20}\|f\|$

Observe that Venkatesh’s bound has the advantage that the implied constant ${C}$ is “universal”, while the constant ${C(\lambda)}$ depends on ${\lambda}$ in the case of Flaminio-Forni-Tanis and Tanis-Vishe.

On the other hand, Flaminio-Forni-Tanis and Tanis-Vishe obtain uniform exponents (${N^{5/6}(\log N)^{1/2}}$ and ${N^{19/20}}$ resp.) on ${N}$ at the cost of sacrificing the uniformity of the constant ${C(\lambda)}$ (an expected fact, see Remark 2 above), contrary to Venkatesh’s bound where the exponent on ${N}$ depends on the spectral gap.

Also, in the regime ${\lambda\simeq 0}$, the control of the constant by Flaminio-Forni-Tanis (${C(\lambda)\simeq\lambda^{-1/6}}$, see Remark 2 above) blows up more slowly than Tanis-Vishe’s control ${C(\lambda)=C(1+|\lambda|^{-2})}$, and the exponent of ${N}$ obtained by Flaminio-Forni-Tanis (of ${5/6+}$) is better than Tanis-Vishe’s exponent (of ${19/20}$). On the other hand, Tanis-Vishe’s constant stays bounded as ${\lambda\rightarrow\infty}$, while the constant of Flaminio-Forni-Tanis does not (see Remark 3 above).

For the sake of comparison of the techniques employed by Venkatesh and Flaminio-Forni-Tanis, let us now quickly present a sketch of proof of Venkatesh’s bound. In a nutshell, Venkatesh’s method is based on the speed of ergodicity and mixing of the horocycle flow.

More concretely, let ${S\ll N}$ be a parameter to be chosen later and set ${f_S(x) := \frac{1}{S}\int_0^S e^{i\lambda s} f(h_s(x)) ds}$.

A direct computation reveals that there is no harm in replacing ${f}$ by ${f_S}$ when estimating twisted ergodic averages because

$\displaystyle \left| \frac{1}{N}\int_0^N e^{i\lambda t} f(h_t(x)) dt - I_S(N) \right|\leq \frac{S}{N}\|f\|_{L^{\infty}}$

where ${I_S(N):=\frac{1}{N}\int_0^N e^{i\lambda t} f_S(h_t(x)) dt}$
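
For the reader’s convenience, here is one way to carry out this direct computation (a sketch: the change of variables ${u=s+t}$ is my own bookkeeping and not a quote from Venkatesh’s paper). Unfolding the definition of ${f_S}$ and exchanging the order of integration, we get

$\displaystyle I_S(N) = \frac{1}{S}\int_0^S\left(\frac{1}{N}\int_s^{N+s} e^{i\lambda u} f(h_u(x))\, du\right) ds.$

For each ${0\leq s\leq S}$, the intervals ${[s,N+s]}$ and ${[0,N]}$ differ by two pieces of length ${s}$, so that

$\displaystyle \left|\frac{1}{N}\int_s^{N+s} e^{i\lambda u} f(h_u(x))\, du - \frac{1}{N}\int_0^N e^{i\lambda u} f(h_u(x))\, du\right|\leq \frac{2s}{N}\|f\|_{L^{\infty}}.$

Averaging this inequality over ${s\in[0,S]}$ gives the claimed bound, because ${\frac{1}{S}\int_0^S \frac{2s}{N}\, ds = \frac{S}{N}}$.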

Next, by the Cauchy-Schwarz inequality, we have that ${|I_S(N)|^2\leq \frac{1}{N}\int_0^N |f_S(h_t(x))|^2 dt}$. Moreover, the results of Burger and Flaminio-Forni on the speed of ergodicity of horocycle flows say that

$\displaystyle \left| \frac{1}{N}\int_0^N |f_S(h_t(x))|^2 dt - \int |f_S|^2 d\textrm{vol}_M \right| \stackrel{N\rightarrow\infty}{\longrightarrow} 0$

with precise estimates in the error terms. In particular, our task becomes to understand how the quantity

$\displaystyle \int |f_S|^2 d\textrm{vol}_M$

approaches zero. Here, after “unfolding” this integral (using the definition of ${f_S}$), one can check that, for each ${S\ll N}$, the resulting expression can be controlled in terms of Ratner’s results showing that the speed of mixing of horocycle flows is given by the size of the spectral gap. Finally, Venkatesh gets the bound described above by optimizing the choice of the parameter ${S\ll N}$. See Section 3 of Venkatesh’s paper for more details.

For their part, Flaminio-Forni-Tanis use a different route to prove Theorem 2, namely, they employ renormalization methods to study twisted ergodic averages for horocycle flows.

This method is inspired by the renormalization method for classical ergodic averages of horocycle flows where one exploits the facts that the geodesic flow ${g_t = \left( \begin{array}{cc} e^{t/2} & 0 \\ 0 & e^{-t/2}\end{array}\right)}$ dilates the orbits of the horocycle flow in the sense that ${g_t h_s = h_{se^t} g_t}$, and the geodesic flow is exponentially mixing (with precise estimates; see, e.g., the works of Dolgopyat, Liverani, Baladi-Liverani, etc.). Also, this method is similar in spirit to the techniques used by Forni to study deviations of ergodic averages of interval exchange transformations and translation flows.
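
The commutation relation ${g_t h_s = h_{se^t} g_t}$ is a statement about ${2\times 2}$ matrices, so it can be checked directly. The following Python sketch (my own sanity check; the helper names `matmul`, `h`, `g` and `renormalization_defect` are not from the talk) multiplies the matrices numerically and measures the largest discrepancy between the two sides.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def h(s: float):
    """Horocycle matrix h_s = [[1, s], [0, 1]]."""
    return ((1.0, s), (0.0, 1.0))


def g(t: float):
    """Geodesic matrix g_t = diag(e^{t/2}, e^{-t/2})."""
    return ((math.exp(t / 2), 0.0), (0.0, math.exp(-t / 2)))


def renormalization_defect(t: float, s: float) -> float:
    """Largest entry of g_t h_s - h_{s e^t} g_t in absolute value."""
    lhs = matmul(g(t), h(s))
    rhs = matmul(h(s * math.exp(t)), g(t))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    # The defect vanishes up to floating-point rounding.
    print(renormalization_defect(1.3, 0.7))
```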

The basic idea for applying the renormalization method sketched above in the context of twisted ergodic averages

$\displaystyle \int_0^N e^{i\lambda t} f(h_t(x)) dt$

is to reinterpret them as classical ergodic averages

$\displaystyle \int_0^N f(\phi_t^W(x,\theta)) dt - \int f d\textrm{vol}_{M\times S^1}$

of a new flow ${\phi_t^W}$ associated to the vector field ${W=U+c\frac{\partial}{\partial\theta}}$ on ${M\times S^1}$ where ${\theta\in S^1}$ and ${c}$ depends on ${\lambda}$.
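
To make this reinterpretation concrete, here is one possible normalization (my own choice for the sake of illustration, not necessarily the one used in the talk): identify ${S^1=\mathbb{R}/2\pi\mathbb{Z}}$, take ${c=\lambda}$, and replace ${f}$ by the function ${F(x,\theta):=e^{i\theta}f(x)}$ on ${M\times S^1}$. Then ${\phi_t^W(x,\theta) = (h_t(x), \theta+\lambda t)}$, so that

$\displaystyle \int_0^N F(\phi_t^W(x,\theta))\, dt = e^{i\theta}\int_0^N e^{i\lambda t} f(h_t(x))\, dt,$

while ${\int F\, d\textrm{vol}_{M\times S^1} = 0}$ because ${\int_{S^1} e^{i\theta}\, d\theta = 0}$. In other words, with this choice, the twisted ergodic average of ${f}$ along ${h_t}$ coincides, up to the unimodular factor ${e^{i\theta}}$, with the classical ergodic average of ${F}$ along ${\phi_t^W}$.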

Unfortunately, a straightforward application of the corresponding geodesic flow to renormalize ${\phi_t^W}$ does not seem to work well: the orbits of ${\phi_t^W}$ are low (1-)dimensional objects inside the unstable manifolds of the geodesic flow, and, thus, their equidistribution properties are harder to obtain (in comparison with the setting of classical ergodic averages of classical horocycle flows).

Nevertheless, Flaminio-Forni-Tanis noticed that this renormalization scheme works after replacing the geodesic flow by an adequate scaling (playing the role of a “fake geodesic flow”). More precisely, the idea of Flaminio, Forni and Tanis is to find a scaling that dilates the orbits of ${\phi_t^W}$ (in a similar way that the geodesic flow dilates the orbits of the horocycle flow) which is well-behaved enough to allow equidistribution estimates.

In this direction, Flaminio, Forni and Tanis start by showing that the coboundaries ${f}$ (i.e., the functions ${f}$ of the form ${f=Wg}$) are characterized by a countable family of invariant distributions ${\{D_{\lambda}^{(n)}\}_{n\in\mathbb{N}}}$ in the sense that ${f}$ is a coboundary if and only if ${D_{\lambda}^{(n)}f=0}$ for all ${n\in\mathbb{N}}$. (Compare with these works of Flaminio-Forni on horocycle flows, and Forni on translation flows).

After that, they use this family of invariant distributions to build up an adequate scaling: the main point is that the scaling must be such that the sizes of the distributions get smaller; in this way, we can control the ergodic average of an arbitrary function because after scaling it becomes closer to a coboundary and the ergodic averages of a coboundary ${f}$ are easy to control (for example, they stay bounded if ${f=Wg}$ with ${g}$ bounded). Here, they introduce the following scaling (on vector fields):

$\displaystyle X\mapsto X_k = k^{-2/3}X, \quad V\mapsto V_k=k^{-1/3} V, \quad \Theta\mapsto\Theta, \quad W\mapsto W_k=kW$

where ${k\geq 1}$, ${X:=\left(\begin{array}{cc} 1/2 & 0 \\ 0 & -1/2\end{array}\right)}$ is the infinitesimal generator of the geodesic flow ${g_t}$, ${V:=\left(\begin{array}{cc} 0 & 0 \\ 1 & 0\end{array}\right)}$ is the infinitesimal generator of the stable horocycle subgroup of ${SL(2,\mathbb{R})}$, and ${\Theta := \left(\begin{array}{cc} 0 & 1/2 \\ -1/2 & 0\end{array}\right)}$ is the infinitesimal generator of the rotation group ${SO(2,\mathbb{R})}$.
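
As a sanity check on the matrices above (this check is my own addition and was not part of the talk), one can verify the standard commutation relations ${[X,U]=U}$, ${[X,V]=-V}$, ${[U,V]=2X}$, as well as the identity ${\Theta=(U-V)/2}$, where ${U}$ is the generator of the horocycle flow from Remark 1. The Python sketch below does this with exact rational arithmetic; the helper names are mine.

```python
from fractions import Fraction as Fr


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def scale(c, a):
    """Scalar multiple c * a of a 2x2 matrix."""
    return tuple(tuple(c * a[i][j] for j in range(2)) for i in range(2))


def sub(a, b):
    """Difference a - b of two 2x2 matrices."""
    return tuple(tuple(a[i][j] - b[i][j] for j in range(2)) for i in range(2))


def bracket(a, b):
    """Lie bracket [a, b] = ab - ba."""
    return sub(matmul(a, b), matmul(b, a))


X = ((Fr(1, 2), Fr(0)), (Fr(0), Fr(-1, 2)))      # geodesic flow
U = ((Fr(0), Fr(1)), (Fr(0), Fr(0)))             # horocycle flow
V = ((Fr(0), Fr(0)), (Fr(1), Fr(0)))             # stable horocycle subgroup
THETA = ((Fr(0), Fr(1, 2)), (Fr(-1, 2), Fr(0)))  # rotation group

if __name__ == "__main__":
    print(bracket(X, U) == U)
    print(bracket(X, V) == scale(-1, V))
    print(bracket(U, V) == scale(2, X))
    print(THETA == scale(Fr(1, 2), sub(U, V)))
```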

Denoting by ${\|.\|_k}$ the induced metric associated to ${X_k, V_k, \Theta, W_k}$ (i.e., ${\|.\|_k}$ is the metric obtained by making these vector fields into an orthonormal frame), one has

$\displaystyle \|D_{\lambda}^{(n)}\|_k\simeq k^{-1/6} \|D_{\lambda}^{(n)}\|_1 \ \ \ \ \ (2)$

so that the invariant distributions get effectively small when ${k\rightarrow\infty}$.

Furthermore, the crucial point about this scaling — making it into a helpful tool in the proof of Theorem 2 — is that Flaminio-Forni-Tanis can show that the geometry of ${\|.\|_k}$ stays uniformly bounded as ${k\rightarrow\infty}$.

Of course, this is a very important point in this argument because the implied constants above (showing up in the estimates of ergodic averages) are related to the best constants in Sobolev inequalities (among several other things) and, hence, they stay uniformly bounded whenever the geometry of ${\|.\|_k}$ is under control. (For the sake of comparison, let us mention that this “bounded geometry” property (after scalings) in the context of translation flows corresponds to the recurrence properties of the Teichmüller geodesic flow on the moduli space of translation surfaces, see Forni’s papers on this subject).

In summary, the key idea to obtain Theorem 2 is the introduction of a scaling ${\|.\|_k}$ of ${M\times S^1}$ (“mimicking” the action of the geodesic flow on horocycle flow orbits or the Teichmüller geodesic flow on translation surfaces) making all ${\phi_t^W}$-invariant distributions small (i.e., making all functions into almost coboundaries) in such a way that the underlying geometry of ${(M\times S^1, \|.\|_k)}$ stays bounded. This completes our sketch of proof of Theorem 2.

We conclude this post with the following remark.

Remark 4 The factor ${k^{-1/6}}$ in (2) “explains” the behavior ${C(\lambda)\simeq \lambda^{-1/6}}$ in Remark 2 above.

## Responses

1. Very nice post! Some questions: In your second display, how does it relate to Sarnak’s disjointness conjecture of Mobius? Also, does the conjectural (second) display hold for any initial point x \in M? What about for periodic point?

• A quick (but vague) answer to your first question is: Sarnak’s disjointness conjecture concerns the Moebius function which is intimately related to prime numbers. So, it is not surprising that the study of equidistribution at prime times of orbits of zero topological entropy systems (such as the horocycle flow) has something to do with Sarnak’s conjecture (for a slightly more precise comment, see, e.g., page 16 of this lecture note http://publications.ias.edu/sites/default/files/Mobius%20lectures%20Summer%202010.pdf by Sarnak)

Of course, this is not a satisfactory answer (even if it serves as a sort of “motivation” for studying ergodic averages along special sequences of times), but I’m afraid that I can’t provide more detailed explanations in such a short comment.

Also, you’re absolutely right that I forgot to add the word “non-periodic” in the item preceding the statement of Shah’s conjecture: I’ll fix this typo right now.

On the other hand, there is no need to add the non-periodicity in the statement of Shah’s conjecture itself: indeed, there are *no* periodic orbits for the horocycle flow in the context of Shah’s conjecture because we are dealing with *compact* quotients of SL(2,R) by assumption.

2. Thanks for the explanation. I see the point now. So the summation in the second display is roughly the Moebius against f. So that the second display gives a more precise form of Sarnak’s conjecture in the horocycle flow context. While for the vague version they have been able to prove it in this case (B-S-Z), since they are able to treat the bilinear sum (which should be the main difficulty).