In 1966, M. Kac wrote a famous article asking “Can one hear the shape of a drum?”: mathematically speaking, one wants to reconstruct (up to isometries) a domain from the knowledge of the spectrum of its Laplacian.

In his article, M. Kac showed that one can hear the shape of a disk {\mathbb{D}(0,R)=\{z\in\mathbb{R}^2:|z|\leq R\}} because of the following two facts:

  • the area {A_{\Omega}} and the perimeter {L_{\partial\Omega}} of a smooth domain {\Omega\subset\mathbb{R}^2} are determined by the eigenvalues {0<\lambda_1(\Omega)<\lambda_2(\Omega)\leq \dots} of its Laplacian {\Delta_{\Omega}} via the asymptotics of the trace of the heat operator:\displaystyle \textrm{tr}(e^{-t\Delta_{\Omega}}) = \sum\limits_{n=1}^{\infty} e^{-t\lambda_n(\Omega)} \sim \frac{1}{t}\left(\frac{A_{\Omega}}{2\pi} - \frac{L_{\partial\Omega}}{4\sqrt{2\pi}}\sqrt{t}\right) \quad \textrm{ as } t\rightarrow 0^+;
  • the isoperimetric inequality says that we can recognise the disk from its perimeter and area: indeed, any smooth domain {\Omega} satisfies {A_{\Omega}\leq \frac{1}{4\pi}L_{\partial\Omega}^2}, and the equality holds if and only if {\Omega} is isometric to a disk of radius {L_{\partial\Omega}/2\pi}.

(In particular, a smooth domain {\Omega\subset \mathbb{R}^2} with the same Laplace eigenvalues as {\mathbb{D}(0,R)} has area {\pi R^2} and perimeter {2\pi R}, so that the isoperimetric inequality ensures that {\Omega} and {\mathbb{D}(0,R)} are isometric.)
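To make the role of the isoperimetric inequality concrete, here is a minimal numerical sketch (not taken from Kac's article): it computes the isoperimetric deficit {L^2-4\pi A} for the ellipses {x^2+\frac{y^2}{1-\varepsilon^2}=1}. The function names and the midpoint-rule quadrature are our own illustrative choices.

```python
import math

def ellipse_area_perimeter(eps, n=20000):
    """Area and perimeter of x^2 + y^2/(1-eps^2) = 1 (semi-axes 1 and sqrt(1-eps^2))."""
    b = math.sqrt(1.0 - eps * eps)
    area = math.pi * b
    # Perimeter: integrate the speed of t -> (cos t, b sin t) with the midpoint rule.
    h = 2.0 * math.pi / n
    perimeter = h * sum(
        math.hypot(math.sin((j + 0.5) * h), b * math.cos((j + 0.5) * h))
        for j in range(n)
    )
    return area, perimeter

def isoperimetric_deficit(area, perimeter):
    """L^2 - 4*pi*A: non-negative, and zero exactly for disks."""
    return perimeter ** 2 - 4.0 * math.pi * area

for eps in (0.0, 0.1, 0.5):
    area, perimeter = ellipse_area_perimeter(eps)
    print(eps, isoperimetric_deficit(area, perimeter))
```

The deficit vanishes (up to rounding) for {\varepsilon=0} and is strictly positive for the other eccentricities, which is the sense in which area and perimeter alone single out the disk.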

In a preprint from 2019, Hezari and Zelditch showed that one can also hear the shape of an ellipse {E_{\varepsilon}=\{(x,y):x^2+\frac{y^2}{1-\varepsilon^2}=1\}} of small eccentricity {0\leq \varepsilon\leq\varepsilon_0 < 1}. As Zelditch explains in this video here, a first-order approximation to their basic strategy to hear ellipses of small eccentricities is to replace “trace of the heat operator” and “isoperimetric inequality” in Kac’s argument by “trace of the wave operator” and “dynamical rigidity of ellipses”.

More concretely, Hezari and Zelditch considered a smooth domain {\Omega} which is isospectral to an ellipse {E_{\varepsilon}} (i.e., {\lambda_n(\Omega) = \lambda_n(E_{\varepsilon})} for all {n\in\mathbb{N}}) of small eccentricity {\varepsilon}, and they took the following steps:

  • in Section 2 of their paper, it is shown that {\Omega} is necessarily close to {E_{\varepsilon}}; in fact, after parametrising {\partial\Omega} with an arc-length parameter {s}, we can control the {L^2}-norms of the derivatives of the curvature {\kappa} of {\partial\Omega} from the asymptotics of the trace of the heat operator: indeed,\displaystyle \textrm{tr}(e^{-t\Delta_{\Omega}}) = \sum\limits_{n=1}^{\infty} e^{-t\lambda_n(\Omega)} \sim (4\pi t)^{-1}\sum_{k\geq 0}b_k(\Omega) t^{k/2} \quad \textrm{ as } t\rightarrow0^+,where\displaystyle b_3(\Omega) = \frac{\sqrt{\pi}}{64}\int_{\partial\Omega}\kappa^2\,ds, \quad b_5(\Omega) = \frac{37\sqrt{\pi}}{2^{13}}\int_{\partial\Omega}\kappa^4\,ds - \frac{\sqrt{\pi}}{2^{10}}\int_{\partial\Omega}(\kappa')^2\,ds,and, in general, {b_{2n+3}(\Omega) = c_{2n+3}\int_{\partial\Omega}\kappa_n^2\,ds+ \int_{\partial\Omega} Q_n(\kappa,\dots, \kappa_{n-1})\,ds} for a certain constant {c_{2n+3}\neq0} and an adequate “universal” polynomial {Q_n} (here, {\kappa_j} is the {j}-th derivative of {\kappa} with respect to {s}); in particular, since {\Omega} and {E_{\varepsilon}} are isospectral, {b_k(\Omega) = b_k(E_{\varepsilon})} for all {k\in\mathbb{N}}, and Melrose exploited this fact to get a pre-compactness bound {\|\kappa_n(\Omega)\|_{L^2} = O_n(1)} for all {n\in\mathbb{N}} (via a bootstrap argument where the Poincaré inequality and the Sobolev embedding theorem are employed to convert {L^2} bounds on {\kappa}, {\dots}, {\kappa_{j+1}} into {C^0} bounds on {\kappa}, {\dots}, {\kappa_j}); in other terms, after Melrose, the shape of any {\Omega} isospectral to {E_{\varepsilon}} is bounded; by reworking Melrose’s argument, Hezari and Zelditch actually show that {\Omega} is almost circular in the sense that {\|\kappa_n\|_{C^0}=O_n(\sqrt{\varepsilon})} for all {n\geq 1};
  • in view of a theorem of Avila, de Simoi and Kaloshin, an almost circular domain {\Omega} is isometric to an ellipse provided {\Omega} is rationally integrable, i.e., for each {q>2}, the periodic trajectories with rotation number {1/q} in the billiard table determined by {\Omega} are tangent to a smooth convex curve (usually called caustic);
  • in Sections 3 to 6 of their paper, it is shown that the portion between {5} and {L_{\partial\Omega}} of the singular support of the trace of the wave operator {\textrm{tr}(\cos (t\sqrt{\Delta_{\Omega}}))} coincides with the set {\mathcal{L}} of lengths of periodic trajectories with rotation number {1/q}, {q\geq 3}, in an almost circular billiard table {\Omega}; in particular, if {\Omega} is isospectral to {E_{\varepsilon}}, one concludes (in Section 7 of their paper) that, for each {q\geq 3}, all periodic trajectories in {\Omega} with rotation number {1/q} have the same length {t_q(\varepsilon)} and they form a caustic, so that {\Omega} is rationally integrable (and, a fortiori, isometric to {E_{\varepsilon}} after Avila–de Simoi–Kaloshin).

A natural question raised by the work of Hezari–Zelditch is to determine the magnitude of the upper bound {\varepsilon_0} on the eccentricities of the ellipses which can be heard from their methods.

In this direction, I would like to conclude this short post by mentioning that I asked a group of 6 undergraduate students (in their 2nd year) at École Polytechnique to follow closely the articles by Avila–de Simoi–Kaloshin and Hezari–Zelditch (while trying to explicitly compute as many implied constants as possible), and, after 6 months of work, they produced this report here (in French) concluding that {\varepsilon_0} is not bigger than {10^{-112}}. (Of course, there is plenty of room for tiny improvements here, but one will probably need some new ideas before reaching a “normal size” {\varepsilon_0} [e.g., {\varepsilon_0\sim 10^{-10}}].)

Posted by: matheuscmss | April 22, 2020

BiSTRO seminar

Simion Filip, Curtis McMullen, Martin Moeller and I are co-organizing an online seminar called Billiards, Surfaces à la Teichmueller and Riemann, Online (BiSTRO).

Similarly to several other current online seminars, the idea of BiSTRO arose after several conferences, meetings, etc. in Teichmueller dynamics were cancelled due to the covid-19 crisis.  

In any case, the first talk of the BiSTRO seminar will be delivered tomorrow by Kasra Rafi (at 18h CEST): in a nutshell, he will explain how to extend the scope of a result of Furstenberg on stationary measures (previously discussed in this blog here and here) to the context of mapping class groups.

Closing this short post, let me point out that the reader wishing to attend the BiSTRO seminar can find the relevant information at the bottom of the seminar’s official webpage.

Sinai billiards are a fascinating class of dynamical systems: in fact, despite their simple definition in terms of a dynamical billiard on a table given by a square or a two torus with a certain number of dispersing obstacles, they present some “nasty” features (related to the existence of “grazing” collisions) placing them slightly beyond the standard theory of smooth uniformly hyperbolic systems.

The seminal works by several authors (including Sinai, Bunimovich, Chernov, …) paved the way to establish many properties of Sinai billiards including the so-called fast decay of correlations for the Liouville measure {\mu_{\text{SRB}}} (a feature which is illustrated in this numerical simulation [due to Dyatlov] here).

Nevertheless, some ergodic-theoretical aspects of (certain) Sinai billiards were elucidated only very recently: for instance, the existence of a unique probability measure of maximal entropy {\mu_{\text{max}}} was proved by Baladi–Demers in this article here.

An interesting remark made by Baladi–Demers is the fact that the maximal entropy measure {\mu_{\text{max}}} should be typically different from the Liouville measure {\mu_{\text{SRB}}}: in fact, {\mu_{\text{max}}=\mu_{\text{SRB}}} forces a rigidity property for periodic orbits, namely, the Lyapunov exponents of all periodic orbits coincide (note that this is a huge number of conditions because the number of periodic orbits of period {\leq n} grows exponentially with {n}) and, in particular, there are no known examples of Sinai billiards where {\mu_{\text{max}}=\mu_{\text{SRB}}}.

In a recent paper, De Simoi–Leguil–Vinhage–Yang showed that {\mu_{\text{max}}=\mu_{\text{SRB}}} forces another rigidity property for periodic orbits, namely, the Birkhoff normal forms (see also Section 6 of Chapter 6 of the Hasselblatt–Katok book and Appendix A of the Moreira–Yoccoz article) at periodic orbits whose invariant manifolds produce homoclinic orbits are all linear. In particular, they proposed in Remark 5.6 of their preprint to prove that {\mu_{\text{max}}\neq\mu_{\text{SRB}}} for certain Sinai billiards by computing the Taylor expansion of the derivative of the billiard map at a periodic orbit of period two and checking that its quadratic part is non-degenerate.

In this short post, we explain that the strategy from the previous paragraph can be implemented to verify the non-linearity of the Birkhoff normal form at a periodic orbit of period two of the Sinai billiards in triangular lattices considered by Baladi–Demers.

Remark 1 In our subsequent discussion, we shall assume some familiarity with the basic aspects of billiard maps described in the classical book of Chernov–Markarian.


1. Sinai billiards in triangular lattices

We consider the Sinai billiard table on the hexagonal torus obtained from the following picture (extracted from Baladi–Demers article):

Here, the obstacles are disks of radii {1} centered at the points of a triangular lattice and the distance between adjacent scatterers is {d}.

As discussed in the Baladi–Demers paper, the billiard map {F} has a probability measure of maximal entropy whenever {2<d<4/\sqrt{3}} (where the limit case {d=2} corresponds to touching obstacles and the limit case {d=4/\sqrt{3}} produces “infinite horizon”).

2. Billiard map near a periodic orbit of period two

We want to study the billiard map {F} near the periodic orbit of period two given by the horizontal segment between {p_0=(1,0)\in\mathbb{R}^2} and {p_1=(d-1,0)\in\mathbb{R}^2}.

A trajectory leaving the point {e^{i\theta}} in the direction {e^{i\psi}} travels along the line {e^{i\theta}+se^{i\psi}} until it hits, at a first time {t}, the boundary of the disk of radius {1} and center {(d,0)}. By definition, the time {t} is the smallest solution of {|e^{i\theta}+se^{i\psi}-d|=1}, i.e.,

\displaystyle s^2+2(\cos(\psi-\theta)-d\cos\psi)s+(d^2-2d\cos\theta)=0.

Hence, {t=B(\theta,\psi)-\sqrt{B(\theta,\psi)^2-A(\theta,\psi)}} where {B(\theta,\psi):=d\cos\psi-\cos(\psi-\theta)} and {A(\theta,\psi):=d^2-2d\cos\theta}, and the billiard trajectory starting at {e^{i\theta}} in the direction {e^{i\psi}} hits the obstacle at

\displaystyle e^{i\theta}+(B(\theta,\psi)-\sqrt{B(\theta,\psi)^2-A(\theta,\psi)})e^{i\psi}=:d+e^{i\Theta}

where {\Theta=\Theta(\theta,\psi)\approx\pi} and, after a specular reflection, it takes the direction {e^{i(2\Theta-\psi-\pi)}:= e^{i\Psi}}.
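The free flight just described can be sketched numerically as follows. This is only an illustration of the formulas above: the function name `flight` and the choice to report {\Theta} in {[0,2\pi)} are our own conventions.

```python
import cmath
import math

def flight(theta, psi, d):
    """Free flight from e^{i theta} (on the unit disk centered at 0), in the
    direction e^{i psi}, to the unit disk centered at (d, 0).
    Returns (t, Theta, Psi) in the notation of the text."""
    B = d * math.cos(psi) - math.cos(psi - theta)
    A = d * d - 2.0 * d * math.cos(theta)
    disc = B * B - A
    if disc < 0:
        raise ValueError("the ray misses the obstacle centered at (d, 0)")
    t = B - math.sqrt(disc)  # smallest root of s^2 - 2 B s + A = 0
    hit = cmath.exp(1j * theta) + t * cmath.exp(1j * psi)
    Theta = cmath.phase(hit - d) % (2.0 * math.pi)  # hit = d + e^{i Theta}
    Psi = 2.0 * Theta - psi - math.pi               # specular reflection
    return t, Theta, Psi

# On the period-two orbit (theta = psi = 0) the flight time is d - 2.
print(flight(0.0, 0.0, 2.2))
```

For {\theta=\psi=0} one recovers the horizontal segment of length {d-2}, with {\Theta=\pi} and outgoing direction {e^{i\pi}}, i.e., the trajectory bounces straight back.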

As explained on page 35 of Chernov–Markarian’s book, this geometric description of the billiard map near the periodic orbit of period two allows one to compute {DF^2} along the following lines. If we adopt the convention in Chernov–Markarian’s book of using arc-length parametrization {x} on the obstacle and describing the angle {y\in[-\pi/2,\pi/2]} with the normal direction pointing towards the interior of the table using a sign determined by the orientation of the obstacles so that this normal vector stays at the left of the tangent vectors to the obstacles, then the billiard map becomes

\displaystyle F(x,y):=(X(x,y),Y(x,y))=(\pi-\arcsin\left(\sin(x) + t(x, y) \sin(x-y)\right), -X(x,y)+x-y+\pi)

where

\displaystyle t(x,y)=B(x,y)-\sqrt{B(x,y)^2-A(x,y)},

\displaystyle B(x,y)=d\cos(x-y)-\cos(-y), \quad A(x,y)=d^2-2d\cos(x),

(and {\arcsin(z)\in[-\frac{\pi}{2},\frac{\pi}{2}]} for {z\in[-1,1]}) and, moreover,

\displaystyle D_{(x,y)}F=\left(\begin{array}{cc} M_{11}(x,y) & M_{12}(x,y) \\ M_{21}(x,y) & M_{22}(x,y)\end{array}\right)

and

\displaystyle D_{(X,Y)}F = \left(\begin{array}{cc} M_{11}(\pi-X(x,y),Y(x,y)) & M_{12}(\pi-X(x,y),Y(x,y)) \\ M_{21}(\pi-X(x,y),Y(x,y)) & M_{22}(\pi-X(x,y),Y(x,y))\end{array}\right),

where

\displaystyle \begin{array}{rcl} M_{11}(x,y)&=&-\frac{1}{\cos(Y(x,y))}(t(x,y)+\cos(y)) \\ &=& (1 - d) + \frac{1}{2} (2 d^2 - d^3) y^2+ (-d - d^2 + d^3) xy + \frac{1}{2} (d - d^3) x^2 + O(x^3+x^2y^2+y^3), \end{array}

\displaystyle \begin{array}{rcl} M_{12}(x,y)&=&-\frac{1}{\cos(Y(x,y))}t(x,y) \\ &=& (2 - d) + \frac{1}{2} (-2 d + 3 d^2 - d^3) y^2 + (-2 d^2 + d^3) xy + \frac{1}{2} (d + d^2 - d^3) x^2 + O(x^3+x^2y^2+y^3), \end{array}

\displaystyle \begin{array}{rcl} M_{21}(x,y)&=&-\frac{1}{\cos(Y(x,y))}(t(x, y) + \cos(Y(x, y)) + \cos(y)) \\ &=& -d + \frac{1}{2} (2 d^2 - d^3) y^2+ (-d - d^2+d^3) xy+\frac{1}{2} (d - d^3)x^2 + O(x^3+x^2y^2+y^3), \end{array}

\displaystyle \begin{array}{rcl} M_{22}(x,y)&=&-\frac{1}{\cos(Y(x,y))} (t(x, y) + \cos(Y(x, y))) \\ &=& (1 - d) + \frac{1}{2} (-2 d + 3 d^2 - d^3) y^2 + (-2 d^2 + d^3) xy + \frac{1}{2} (d + d^2 - d^3) x^2 + O(x^3+x^2y^2+y^3). \end{array}

Since we also have

\displaystyle \pi-X(x,y)=(2 - d) y+(-1 + d)x+O((y^2+x^2)(x+y))

and

\displaystyle -Y(x,y)=(-1 + d) y-dx+O((y^2+x^2)(x+y)),

it follows that

\displaystyle D_{(x,y)}F^2 = \left(\begin{array}{cc}2d^2-4d+1 & 2d^2-6d+4 \\ 2d^2-2d & 2d^2-4d+1\end{array}\right)+ Q(x,y)+ O(x^4+x^3y+y^4+x^2y^2+xy^3),

where

\displaystyle Q(x,y) = \left(\begin{array}{cc}Q_{11}(x,y) & Q_{12}(x,y) \\ Q_{21}(x,y) & Q_{22}(x,y)\end{array}\right)

and

\displaystyle \begin{array}{rcl} Q_{11}(x,y)&=&d (4 - 4 d - 17 d^2 + 33 d^3 - 20 d^4 + 4 d^5) y^2 + d (2 - 4 d + d^2 + 9 d^3 - 12 d^4 + 4 d^5) x^2 \\ &+& 2 d (-3 + 5 d + 4 d^2 - 19 d^3 + 16 d^4 - 4 d^5) x y, \end{array}

\displaystyle \begin{array}{rcl} Q_{12}(x,y)&=&d (6 + 5 d - 42 d^2 + 51 d^3 - 24 d^4 + 4 d^5) y^2 + d (3 - 5 d - 4 d^2 + 19 d^3 - 16 d^4 + 4 d^5) x^2 \\ &+& 2 d (-4 + 4 d + 17 d^2 - 33 d^3 + 20 d^4 - 4 d^5) x y, \end{array}

\displaystyle \begin{array}{rcl} Q_{21}(x,y)&=&d (4 - 6 d - 16 d^2 + 33 d^3 - 20 d^4 + 4 d^5) y^2 + d(1 - 2 d)^2 (1 - 2 d^2 + d^3) x^2 \\ &+& 2 d (-2 + 6 d + 3 d^2 - 19 d^3 + 16 d^4 - 4 d^5) x y , \end{array}

\displaystyle \begin{array}{rcl} Q_{22}(x,y)&=&d (8 + 2 d - 41 d^2 + 51 d^3 - 24 d^4 + 4 d^5) y^2 + d (2 - 6 d - 3 d^2 + 19 d^3 - 16 d^4 + 4 d^5) x^2 \\ &+& 2 d (-4 + 6 d + 16 d^2 - 33 d^3 + 20 d^4 - 4 d^5) x y. \end{array}
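As a sanity check on the constant term of this expansion, one can square the matrix of constant terms of {M_{11},\dots,M_{22}} and compare with the constant matrix in {DF^2}. The sketch below does this in exact rational arithmetic; it assumes (as the formula for {D_{(X,Y)}F} suggests) that both factors of {DF^2} reduce to the same constant matrix at the periodic orbit, and the helper names are ours.

```python
from fractions import Fraction

def mat_mul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(P):
    return P[0][0] * P[1][1] - P[0][1] * P[1][0]

def linear_part_DF(d):
    """Constant terms of M_11, M_12, M_21, M_22 in the expansions above."""
    return [[1 - d, 2 - d], [-d, 1 - d]]

def linear_part_DF2(d):
    """Constant matrix in the stated expansion of DF^2."""
    return [[2 * d * d - 4 * d + 1, 2 * d * d - 6 * d + 4],
            [2 * d * d - 2 * d, 2 * d * d - 4 * d + 1]]

for d in (Fraction(2), Fraction(21, 10), Fraction(9, 4)):
    M0 = linear_part_DF(d)
    assert mat_mul(M0, M0) == linear_part_DF2(d)
    assert det(M0) == 1  # the billiard map preserves area
    print(d, "trace of the constant part of DF^2:", 2 * (2 * d * d - 4 * d + 1))
```

The trace equals {4(d-1)^2-2}, which exceeds {2} exactly when {d>2}, so the period-two orbit is hyperbolic as soon as the obstacles do not touch.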

In particular, the determinant of the quadratic part of the Taylor expansion of {DF^2} near the periodic orbit of period two {\{p_0, p_1\}} is

\displaystyle \begin{array}{rcl} \frac{1}{d^2}\det Q(x,y) &=& (1 - 3 d - 2 d^2 + 10 d^3 - 8 d^4 + 2 d^5) x^4 - 2 (-2 + d)^2 (1 - d - 4 d^2 + 4 d^3) x^3 y \\ &+& (22 - 27 d - 68 d^2 + 126 d^3 - 68 d^4 + 12 d^5) x^2 y^2 - 2 (-2 + d)^2 (3 + d - 8 d^2 + 4 d^3) x y^3 \\ &+& 2 (-2 + d)^2 (1 - 2 d^2 + d^3) y^4 \end{array}

Thus, for {y=-x}, one has

\displaystyle \frac{1}{d^2}\det Q(x,-x) = x^4 (63 - 70 d - 172 d^2 + 320 d^3 - 176 d^4 + 32 d^5),

so that the quadratic part {Q} of the Taylor expansion of {DF^2} is non-degenerate because {63 - 70 d - 172 d^2 + 320 d^3 - 176 d^4 + 32 d^5\geq 3} for {2\leq d\leq 4/\sqrt{3}}.

Remark 2 {\det Q(x,y)= 12 x^4} in the limit case {d=2}.
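The algebra in the last few displays is easy to double-check by machine. The following sketch (with our own small polynomial helpers, storing coefficients in increasing powers of {d}) substitutes {y=-x} into the stated formula for {\frac{1}{d^2}\det Q(x,y)}, compares the result with the quintic above, and evaluates that quintic on a grid of {[2,4/\sqrt{3}]}. The grid evaluation is only numerical evidence for the lower bound, not a proof.

```python
import math

def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def padd(*ps):
    n = max(len(p) for p in ps)
    return [sum(p[i] for p in ps if i < len(p)) for i in range(n)]

def pscale(c, p):
    return [c * a for a in p]

def peval(p, d):
    val = 0
    for a in reversed(p):  # Horner's rule
        val = val * d + a
    return val

sq = pmul([-2, 1], [-2, 1])                       # (-2 + d)^2
c40 = [1, -3, -2, 10, -8, 2]                      # coefficient of x^4
c31 = pscale(-2, pmul(sq, [1, -1, -4, 4]))        # coefficient of x^3 y
c22 = [22, -27, -68, 126, -68, 12]                # coefficient of x^2 y^2
c13 = pscale(-2, pmul(sq, [3, 1, -8, 4]))         # coefficient of x y^3
c04 = pscale(2, pmul(sq, [1, 0, -2, 1]))          # coefficient of y^4

# Setting y = -x flips the sign of the x^3 y and x y^3 terms.
restricted = padd(c40, pscale(-1, c31), c22, pscale(-1, c13), c04)
claimed = [63, -70, -172, 320, -176, 32]
print("y = -x restriction matches:", restricted == claimed)

d_max = 4.0 / math.sqrt(3.0)
N = 2000
grid_min = min(peval(claimed, 2.0 + j * (d_max - 2.0) / N) for j in range(N + 1))
print("minimum of the quintic on the grid:", grid_min)

# Remark 2: at d = 2 only the x^4 coefficient of det Q = d^2 * (...) survives.
print([4 * peval(c, 2) for c in (c40, c31, c22, c13, c04)])
```

In fact, all derivatives of the quintic are positive at {d=2} and the fourth derivative {3840d-4224} is positive for {d\geq 2}, so the quintic is increasing on {[2,\infty)} and its minimum on the interval is the value {3} attained at {d=2}.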

On January 8, 2020, Jialun Li gave the talk “Decrease of Fourier coefficients of stationary measures on the circle” in the “flat seminar” that I co-organize with Anton Zorich once per month.

In this post, I’ll transcribe my notes of this nice talk (while taking full responsibility for any errors/mistakes in what follows).

1. Introduction

1.1. Stationary measures

The linear action of {G=SL_2(\mathbb{R})} on {\mathbb{R}^2} induces an action on the projective space {X=\mathbb{P}(\mathbb{R}^2)}. For later use, recall that {X\simeq\mathbb{T}^1=\mathbb{R}/\pi\mathbb{Z}} via

\displaystyle \mathbb{T}^1\ni\theta\mapsto [\cos\theta:\sin\theta]=\mathbb{R}\cdot(\cos\theta,\sin\theta)\in X.

Given a probability measure {\mu} on {G}, we can build a Markov chain / random walk whose steps consist of taking points {y\in X} to {gy} where {g\in G} is chosen according to the law {\mu}.

The absence of hypotheses on {\mu} might lead to uninteresting random walks: in fact, if a point {x\in X} is stabilized by two elements {g,h\in Stab_x(G)}, then the random walk starting at {x} associated to {\mu=\frac{1}{2}(\delta_g+\delta_h)} is not very interesting.

For this reason, we shall assume that

Hypothesis (i): the support of {\mu} generates a Zariski-dense semigroup {\langle \textrm{supp}(\mu)\rangle}.

Remark 1 By the Tits alternative, in our current setting of {G=SL_2(\mathbb{R})}, the hypothesis (i) can be reformulated by replacing “Zariski-dense” with “not solvable”.

As it was famously established by Furstenberg, the random walks associated to {\mu} have a well-defined asymptotic behaviour whenever (i) is fulfilled:

Theorem 1 (Furstenberg) Under (i), there exists (a unique) probability measure {\nu} on {X} such that, for all {x\in X},

\displaystyle \mu^n\ast\delta_x\rightharpoonup\nu

as {n\rightarrow\infty}. Here, the convolution of {\mu} with a probability measure {\eta} on {X} is a probability measure {\mu\ast\eta} on {X} defined as

\displaystyle \mu\ast\eta = \int_{g\in G} g_*\eta \, \, d\mu(g),

so that {\mu^n\ast\delta_x=\underbrace{\mu\ast\dots\ast\mu}_{n \textrm{ times}}\ast\delta_x} is the distribution of points obtained from {x} after {n} steps of the Markov chain associated to {\mu}.

In the literature, {\nu} is called the Furstenberg measure, and it is an important example of a {\mu}-stationary measure, i.e., a probability measure on {X} which is “invariant on average”:

\displaystyle \nu=\int_{g\in G}g_*\nu \, \, d\mu(g) := \mu\ast\nu.

1.2. Lyapunov exponents

The stationary measure {\nu} can be used to describe the growth of the norms of random products {g_n\dots g_1} associated to {\mu^{\otimes\mathbb{N}}}-almost every {(g_1,\dots, g_n,\dots)\in G^{\mathbb{N}}} whenever {\mu} satisfies (i) and its first moment is finite:

Theorem 2 (Furstenberg, Guivarch–Raugi) If {\mu} has finite first moment, i.e.,

\displaystyle \int_G\log\|g\|\ d\mu(g)<\infty

and {\mu} satisfies (i), then

\displaystyle \lim\limits_{n\rightarrow\infty}\frac{1}{n}\log\|g_n\dots g_1\|= \sigma_{\mu}:=\int_{x=\mathbb{R}v\in X}\int_{g\in G}\log\frac{\|gv\|}{\|v\|} d\mu(g)\,d\nu(x)>0

for {\mu^{\otimes\mathbb{N}}}-almost every {(g_1,\dots, g_n, \dots)\in G^{\mathbb{N}}}.

The quantity {\sigma_{\mu}} is called the Lyapunov exponent.
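For a concrete feeling of Theorem 2, here is a small Monte Carlo sketch estimating the growth rate of random products. The choice {\mu=\frac{1}{2}(\delta_A+\delta_B)} with the two unipotent matrices below is our own illustrative assumption (they generate a Zariski-dense semigroup of {SL_2(\mathbb{R})}), and the code tracks {\log\|g_n\dots g_1v\|} for a fixed vector {v} with renormalisation at each step, which has the same almost sure growth rate as {\log\|g_n\dots g_1\|}.

```python
import math
import random

# Hypothetical example of a law mu: uniform on two unipotent matrices.
A = ((1.0, 1.0), (0.0, 1.0))
B = ((1.0, 0.0), (1.0, 1.0))

def lyapunov_estimate(matrices, n_steps, seed=0):
    """Estimate (1/n) log ||g_n ... g_1 v|| along one random trajectory."""
    rng = random.Random(seed)
    v = (1.0, 0.0)
    total = 0.0
    for _ in range(n_steps):
        (a, b), (c, d) = rng.choice(matrices)
        w = (a * v[0] + b * v[1], c * v[0] + d * v[1])
        r = math.hypot(w[0], w[1])
        total += math.log(r)       # accumulate the log of the stretch factor
        v = (w[0] / r, w[1] / r)   # renormalise to avoid overflow
    return total / n_steps

print(lyapunov_estimate([A, B], 20000, seed=1))
```

The printed value is a single-trajectory estimate; it should be strictly positive, in agreement with {\sigma_{\mu}>0}.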

1.3. Regularity of stationary measures

The Furstenberg measure dictates the distribution of the Markov chains associated to {\mu} and, for this reason, it is natural to inquire about the regularity properties of stationary measures.

In this direction, Guivarch showed that the Furstenberg measures have a certain regularity when {\mu} satisfies (i) and its exponential moment is finite:

Hypothesis (ii): there exists {\theta>0} with {\int_{g\in G} \|g\|^{\theta}\,\,d\mu(g)<\infty}.

Theorem 3 (Guivarch) Under (i) and (ii), there are {\alpha>0} and {C>0} such that

\displaystyle \nu(B(x,r))\leq C r^{\alpha}

for all {x\in X} and {r>0} (where {B(x,r)} is the interval of radius {r} centered at {x\in X\simeq\mathbb{T}^1}). In particular, {\nu} has no atoms.

More recently, Jialun Li established in this article here another regularity result by showing the decay of the Fourier coefficients

\displaystyle \widehat{\nu}(k):=\int_X e^{2ikx}d\nu(x), \quad k\in\mathbb{Z},

(where {X=\mathbb{P}(\mathbb{R}^2)\simeq\mathbb{T}^1=\mathbb{R}/\pi\mathbb{Z}}). More concretely, he proved that:

Theorem 4 (Li) Under (i) and (ii), we have {\lim\limits_{|k|\rightarrow\infty}\hat{\nu}(k)=0}. In other words, {\nu} is a Rajchman measure.

In a certain sense, the role of assumption (i) in the previous theorem is to avoid the following kind of example:

Example 1 Let {\mu=\frac{1}{2}(\delta_g+\delta_h)} with

\displaystyle g=\left(\begin{array}{cc} \frac{1}{\sqrt{3}} & 0 \\ 0 & \sqrt{3}\end{array}\right) \quad \textrm{and} \quad h=\left(\begin{array}{cc} \frac{1}{\sqrt{3}} & \frac{2}{\sqrt{3}} \\ 0 & \sqrt{3}\end{array}\right).

Note that the semigroup generated by {\textrm{supp}(\mu)=\{g,h\}} is not Zariski dense in {G} (as {g} and {h} are upper-triangular). We claim that there is no decay of Fourier coefficients in this situation. Indeed, recall that if we identify {X} with {\mathbb{R}\cup\{\infty\}} via {X\ni x=\mathbb{R}\cdot (v_1,v_2)\mapsto v_1/v_2\in\mathbb{R}\cup\{\infty\}}, then {G} acts on {\mathbb{R}\cup\{\infty\}} via Möbius transformations, i.e., an element {\left(\begin{array}{cc} a & b \\ c & d\end{array}\right)\in G} acts on {x\in\mathbb{R}\cup\{\infty\}} as

\displaystyle \left(\begin{array}{cc} a & b \\ c & d\end{array}\right)x=\frac{ax+b}{cx+d}.

In particular, {gx=x/3}, {hx=(x+2)/3}, and the Fourier coefficients of the stationary measure given by the standard Hausdorff measure on the middle-third Cantor set do not decay to zero. In a similar vein, if {0<\lambda<1} is a real number such that {1/\lambda} is a Pisot number, then {\mu=\frac{1}{2}(\delta_{g_{\lambda}}+\delta_{h_{\lambda}})} with

\displaystyle g_{\lambda}=\left(\begin{array}{cc} \sqrt{\lambda} & -\frac{1}{\sqrt{\lambda}} \\ 0 & \frac{1}{\sqrt{\lambda}}\end{array}\right) \quad \textrm{and} \quad h_{\lambda}=\left(\begin{array}{cc} \sqrt{\lambda} & \frac{1}{\sqrt{\lambda}} \\ 0 & \frac{1}{\sqrt{\lambda}}\end{array}\right)

admits a stationary measure {\nu_{\lambda}} (called the Bernoulli convolution of parameter {\lambda} describing the distribution of the points {\sum\limits_{j\geq0}\varepsilon_j \lambda^j} where {\varepsilon_j=\pm1} with probability {1/2}) whose Fourier coefficients do not decay.
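The lack of decay for the middle-third Cantor measure can be seen numerically. In the affine coordinate {x} on the real line (not the angular coordinate on {X}), the stationary measure of {x\mapsto x/3}, {x\mapsto (x+2)/3} is the law of {\frac{1}{2}+\sum_{j\geq 1}\varepsilon_j3^{-j}} with independent signs {\varepsilon_j=\pm1}, so its Fourier transform at frequency {\xi} has modulus {\prod_{j\geq 1}|\cos(\xi/3^j)|}. The sketch below (function name and truncation depth are our own choices) evaluates this product along the frequencies {\xi=2\pi 3^m}.

```python
import math

def cantor_fourier_modulus(xi, depth=60):
    """|E exp(i xi x)| for the middle-third Cantor measure,
    via the truncated product prod_{j=1..depth} |cos(xi / 3^j)|."""
    prod = 1.0
    for j in range(1, depth + 1):
        prod *= math.cos(xi / 3.0 ** j)
    return abs(prod)

values = [cantor_fourier_modulus(2.0 * math.pi * 3 ** m) for m in range(7)]
for m, val in enumerate(values):
    print(m, val)
```

Along this sequence of frequencies the factors with {j\leq m} are all equal to {1}, so the modulus stays equal to the same non-zero constant {\prod_{k\geq1}|\cos(2\pi/3^k)|} for every {m}.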

The proof of Theorem 4 is based on a renewal theorem. More concretely, given a function {f\in C^{\infty}_c(\mathbb{R})}, let

\displaystyle Rf(t):=\sum\limits_{n=1}^{\infty} \int f(\log\|g\|-t) d\mu^n(g).

By thinking of {f} as a smooth version of the characteristic function of an interval {I\subset \mathbb{R}}, we see that {Rf(t)} is “counting random products {g_n\dots g_1} with norm in the interval {\exp(I+t)}”. In this context, Guivarch and Le Page established the following renewal theorem:

Theorem 5 (Guivarch–Le Page) Under (i) and (ii), one has

\displaystyle Rf(t)\rightarrow \frac{1}{\sigma_{\mu}}\int_{\mathbb{R}} f(u) \, du

as {t\rightarrow\infty}.

Remark 2 Another important fact in the proof of Theorem 4 is the non-arithmeticity of the Jordan projections {\lambda(g)=\log (\textrm{top eigenvalue of }g)} of the elements of {\langle \textrm{supp}(\mu)\rangle}, i.e., the fact that these Jordan projections generate a dense subgroup of {\mathbb{R}} (whenever (i) is satisfied).

Since we will come back later to the discussion of deriving the decay of Fourier coefficients (e.g., Theorem 4) from a renewal theorem, let us now move forward in order to introduce the main result of this post, namely, a quantitative version of Theorem 4.

2. Quantitative decay of Fourier coefficients

The central result of this post is inspired by the following theorem of Bourgain and Dyatlov.

Theorem 6 (Bourgain–Dyatlov) If {\nu_{PS}} is the Patterson–Sullivan measure associated to a Schottky subgroup of {SL_2(\mathbb{R})}, then there exists {\varepsilon>0} (depending only on the dimension of {\nu_{PS}}, i.e., the Hausdorff dimension of the limit set of the Schottky subgroup) such that

\displaystyle |\widehat{\nu_{PS}}(k)|=O(|k|^{-\varepsilon})

for all {k\in\mathbb{Z}}.

The method of proof of this result is based on the so-called discretized sum-product estimates from additive combinatorics.

Interestingly enough, this result can be interpreted as a decay of Fourier coefficients of certain stationary measures thanks to the following theorem:

Theorem 7 (Furstenberg, Sullivan, …) The Patterson–Sullivan measure {\nu_{PS}} of a Schottky subgroup coincides with the stationary measure {\nu} of some probability measure {\mu} on {G} satisfying (i) and (ii).

Remark 3 We saw the proof of a version of this result for cocompact lattices of {SL_2(\mathbb{R})} in Proposition 14 of this blog post here.

The previous theorems suggest a polynomial decay of the Fourier coefficients of the Furstenberg measure associated to a probability measure {\mu} on {G} satisfying (i) and (ii). This statement was recently proved by Jialun Li in this article here.

Theorem 8 (Li) If {\mu} is a probability measure on {G=SL_2(\mathbb{R})} satisfying (i) and (ii), then there exists {\varepsilon>0} such that the Furstenberg measure {\nu} associated to {\mu} verifies

\displaystyle |\widehat{\nu}(k)|=O(|k|^{-\varepsilon})

for all {k\in\mathbb{Z}}.

Remark 4 Actually, Li’s theorem is stated in his article for any real split semisimple Lie group {G}.

The proof of this result is also based on a discretized sum-product estimate. Moreover, this statement is closely related to spectral gap of transfer operators and a renewal theorem:

Theorem 9 (Li) Let {\mu} be a probability measure on {G} verifying (i) and (ii). Given {b\in \mathbb{R}}, consider the transfer operator

\displaystyle P_{ib} f(x) := \int_G \exp(i b \log\frac{\|gv\|}{\|v\|}) f(gx) \, d\mu(g), \quad x=\mathbb{R}\cdot v\in X,

acting on {f\in C^{\gamma}(X)} (with {\gamma>0} small enough). Then, we have the following spectral gap property: there exists {\rho<1} such that the spectral radius of {P_{ib}} satisfies

\displaystyle \rho(P_{ib})<\rho

for all {|b|>1}.

Theorem 10 (Li) Under (i) and (ii), there exists {\varepsilon>0} such that the renewal operator {Rf(t)=\sum\limits_{n=1}^{\infty} \int f(\log\|g\|-t) d\mu^n(g)} satisfies

\displaystyle Rf(t) = \frac{1}{\sigma_{\mu}}\int_{\mathbb{R}} f(u) \, du + O(e^{-\varepsilon t} |f|_{C^2})

for all {f\in C^{\infty}_c(\mathbb{R})}.

In his article, Li establishes first Theorem 8 from a discretized sum-product estimate, and subsequently Theorems 9 and 10 are deduced from Theorem 8.

Nevertheless, Li pointed out in his talk that Theorems 8, 9 and 10 are “morally equivalent” to each other. In fact,

  • Theorem 8 {\implies} Theorem 9: the Fourier decay can be used to prove spectral gap for transfer operators via the so-called Dolgopyat method (which was discussed in this blog post here);
  • Theorem 9 {\implies} Theorem 10: the spectral gap for transfer operators allows one to deduce the renewal theorem because some elementary calculations reveal that {Rf(t)} is related to {(Id-P_{ib})^{-1}};
  • Theorem 10 {\implies} Theorem 8: let us finally fulfil our promise made in the end of the previous section by briefly explaining the idea of the derivation of the Fourier decay in Theorem 8 from the renewal theorem in Theorem 10; since {\mu^n\ast\nu=\nu}, the {k}th Fourier coefficient of the Furstenberg measure is

    \displaystyle \widehat{\nu}(k) = \int_X e^{2ikx}\,d\nu(x) = \int_G \int_X e^{2ik\,gx}\,d\nu(x)\,d\mu^n(g);

    by the Cauchy–Schwarz inequality, the control of {|\widehat{\nu}(k)|} is reduced to the study of

    \displaystyle \int_G e^{2ik(gx-gy)} d\mu^n(g);

    since {gx-gy\sim \|g\|^{-1} d(x,y)}, we see that the size of the integral above depends on the “number of random products {g=g_n\dots g_1} with norm {\|g\|} in a given interval”, and the answer to this kind of “counting problem” is encoded in the asymptotic property of the renewal operator {Rf(t)} provided by Theorem 10.
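For the reader's convenience, the Cauchy–Schwarz step in the last item can be spelled out as follows. This is our own reconstruction of a standard computation (using only {\mu^n\ast\nu=\nu}, the fact that {\mu^n} is a probability measure, and Fubini's theorem), not a transcription of Li's argument:

```latex
\widehat{\nu}(k)
  = \int_G \Big(\int_X e^{2ik\,gx}\,d\nu(x)\Big)\,d\mu^n(g)
\quad\Longrightarrow\quad
|\widehat{\nu}(k)|^2
  \leq \int_G \Big|\int_X e^{2ik\,gx}\,d\nu(x)\Big|^2\,d\mu^n(g)
  = \int_X\int_X \Big(\int_G e^{2ik(gx-gy)}\,d\mu^n(g)\Big)\,d\nu(x)\,d\nu(y).
```

Thus it suffices to bound the inner integral over {G} for “most” pairs {(x,y)} with respect to {\nu\otimes\nu}.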

Remark 5 The analog of Theorem 10 in Abelian settings is false: the random walks driven by a finitely supported law {\lambda} on {\mathbb{R}} which is not arithmetic (i.e., its support generates a dense subgroup) verify a renewal theorem

\displaystyle Rf(t)=\sum\limits_{n=1}^{\infty}\int f(x-t) d\lambda^n(x)\rightarrow \frac{1}{\mathbb{E}(\lambda)}\int f(u) \, du \quad \textrm{as } t\rightarrow\infty

for {f\in C^{\infty}_c(\mathbb{R})}, but the error term is never exponential because {\#\textrm{supp}(\lambda^n)} grows polynomially with {n}. (Of course, this phenomenon is avoided in the context of {SL_2(\mathbb{R})} thanks to the fact that the Zariski-density assumption (i) on {\mu} ensures an exponential growth of {\#\textrm{supp}(\mu^n)} with {n}.)
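The contrast between polynomial and exponential growth of supports is easy to observe on examples. In the sketch below, both laws are hypothetical illustrations of our own choosing: {\lambda=\frac{1}{2}(\delta_1+\delta_{\sqrt{2}})} on {\mathbb{R}} (with {a+b\sqrt{2}} encoded exactly as the integer pair {(a,b)}), and {\mu} uniform on two integer matrices generating a free semigroup in {SL_2(\mathbb{R})}.

```python
def abelian_support_size(n):
    """#supp(lambda^n) for lambda = (delta_1 + delta_sqrt2)/2,
    with a + b*sqrt(2) stored exactly as the pair (a, b)."""
    support = {(0, 0)}
    for _ in range(n):
        support = ({(a + 1, b) for (a, b) in support}
                   | {(a, b + 1) for (a, b) in support})
    return len(support)

def mul(p, q):
    """Product of 2x2 integer matrices stored as tuples (a, b, c, d)."""
    a, b, c, d = p
    e, f, g, h = q
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

def matrix_support_size(n):
    """#supp(mu^n) for mu uniform on A = [[1,2],[0,1]] and B = [[1,0],[2,1]]."""
    A = (1, 2, 0, 1)
    B = (1, 0, 2, 1)
    support = {(1, 0, 0, 1)}
    for _ in range(n):
        support = {mul(g, w) for g in (A, B) for w in support}
    return len(support)

for n in (2, 5, 10):
    print(n, abelian_support_size(n), matrix_support_size(n))
```

In the Abelian example the support of {\lambda^n} has exactly {n+1} points, whereas the {2^n} words of length {n} in the two matrices give {2^n} distinct elements.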

Posted by: matheuscmss | January 2, 2020

Breuillard–Sert’s joint spectrum (III)

Last time, we saw that if {S\subset G} is a compact subset of a reductive real linear algebraic group {G} such that the monoid {\langle S\rangle} generated by {S} is Zariski dense in {G}, then the Cartan projections {\frac{1}{n}\kappa(S^n)} and the Jordan projections {\frac{1}{n}\lambda(S^n)} associated to {S} converge in the Hausdorff topology to the same limit {J(S)}, an object baptised “joint spectrum of {S}” by Breuillard and Sert.

Today, I’ll transcribe below my notes of a talk by Romain Dujardin explaining to the participants of our groupe de travail some basic convexity and continuity properties of the joint spectrum. After that, we close the post with a brief discussion of the question of prescribing the joint spectrum.

As usual, all mistakes in what follows are my sole responsibility.

1. Preliminaries

Let us warm up by reviewing the setting of the previous posts of this series.

Let {G} be a reductive real linear algebraic group and denote its rank by {d}. By definition, a maximal torus {A\subset G} is isomorphic to {(\mathbb{R}^*_+)^d}.

The Cartan decomposition {G=KAK} (with {K} a maximal compact subgroup of {G}) allows us to write any {g\in G} as {g\in K\exp(\kappa(g))K} for a unique {\kappa(g)\in\mathfrak{a}^+} where {\mathfrak{a}^+} is a choice of Weyl chamber in the Lie algebra {\mathfrak{a}} of {A}. The interior of the Weyl chamber {\mathfrak{a}^+} is denoted by {\mathfrak{a}^{++}}.

Example 1 For {G=GL_n}, we can take {\mathfrak{a}^+\simeq \{(x_1,\dots,x_n)\in\mathbb{R}^n: x_1\geq\dots\geq x_n\}} in {\mathfrak{a}\simeq \mathbb{R}^n}, so that {\mathfrak{a}^{++}\simeq\{(x_1,\dots,x_n)\in\mathbb{R}^n: x_1>\dots>x_n\}}.

The element {\kappa(g)\in\mathfrak{a}^+} is called the Cartan projection of {g}.

Example 2 For {G=GL_n}, {\kappa(g)=(\log a_1(g),\dots, \log a_n(g))\in\mathfrak{a}^+}, where {a_1(g)\geq \dots\geq a_n(g)>0} are the singular values of {g}.

Similarly, the Jordan projection {\lambda(g)} is defined in terms of the Jordan–Chevalley decomposition. For {G=GL_n}, this amounts to writing the Jordan normal form {g=d+n = d(1+d^{-1}n)} with {d} diagonalisable and {n} nilpotent, so that {g=d(1+d^{-1}n) = \widetilde{g}_s g_u = g_e g_s g_u} with {g_u=1+d^{-1}n} unipotent, {g_e\in O(n)}, and {g_s=\exp(\lambda(g))} has eigenvalues {|\lambda_1(g)|\geq\dots\geq|\lambda_n(g)|} where {\lambda_i(g)} are the eigenvalues of {g} (ordered by decreasing sizes of their moduli).

The group {G} has a family {\rho_1,\dots, \rho_d} of distinguished representations such that the components of the vectors {\kappa(g)}, resp. {\lambda(g)}, are linear combinations of {\log\|\rho_i(g)\|}, resp. {\log|\lambda_1(\rho_i(g))|}. In particular, the usual formula for the spectral radius implies that {\frac{1}{n}\kappa(g^n)\rightarrow\lambda(g)} as {n\rightarrow\infty} (and, as it turns out, this fact is important in establishing the coincidence of the limits of the sequences {\frac{1}{n}\kappa(S^n)} and {\frac{1}{n}\lambda(S^n)}).
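The convergence {\frac{1}{n}\kappa(g^n)\rightarrow\lambda(g)} can be observed directly for {G=GL_2}. The sketch below uses closed formulas for the singular values and eigenvalue moduli of a {2\times 2} real matrix; the helper names and the test matrix are our own illustrative choices.

```python
import math

def mat_mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def cartan_projection(g):
    """(log a_1, log a_2), with a_1 >= a_2 > 0 the singular values of g."""
    (a, b), (c, d) = g
    frob2 = a * a + b * b + c * c + d * d   # a_1^2 + a_2^2
    det = abs(a * d - b * c)                # a_1 * a_2
    a1 = math.sqrt((frob2 + math.sqrt(max(frob2 * frob2 - 4 * det * det, 0.0))) / 2)
    return (math.log(a1), math.log(det / a1))

def jordan_projection(g):
    """(log|lambda_1|, log|lambda_2|), eigenvalues ordered by modulus."""
    (a, b), (c, d) = g
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc >= 0:                           # two real eigenvalues
        l1 = (abs(tr) + math.sqrt(disc)) / 2
        return (math.log(l1), math.log(abs(det) / l1))
    half = 0.5 * math.log(det)              # conjugate pair: |lambda|^2 = det
    return (half, half)

def scaled_cartan_of_power(g, n):
    """(1/n) * kappa(g^n), computed by repeated multiplication."""
    power = g
    for _ in range(n - 1):
        power = mat_mul(power, g)
    k1, k2 = cartan_projection(power)
    return (k1 / n, k2 / n)

g = [[2.0, 1.0], [0.0, 0.5]]   # not a normal matrix, so kappa(g) != lambda(g)
print("lambda(g)      =", jordan_projection(g))
for n in (1, 5, 50):
    print(n, scaled_cartan_of_power(g, n))
```

For this non-normal matrix, {\kappa(g)} differs visibly from {\lambda(g)=(\log 2,-\log 2)}, but the rescaled Cartan projections of the powers approach {\lambda(g)} at speed {O(1/n)}.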

Example 3 For {G=GL_n}, the representations {\rho_i} of {G} on {\wedge^i \mathbb{R}^n}, {1\leq i \leq n}, have the property that the eigenvalue of {\rho_i(g)} with the largest modulus is {\lambda_1(g)\dots\lambda_i(g)}.

The rank {d} of {G} can be written as {d=d_s+(d-d_s)} where {d-d_s} is the dimension of the center {Z(G)} of {G}. In the literature, {d_s} is called the semi-simple rank of {G}. In general, we have {d_s} “truly” distinguished representations which are completed by a choice of {d-d_s} characters of {G/[G,G]}.

Example 4 For {G=GL_n}, {n=(n-1)+1}, and the representations {\rho_i} from the previous example have the property that {\rho_i} with {1\leq i<n} is “truly” distinguished and the determinant representation {\rho_n} comes from the center.

Remark 1 Recall that a weight {\chi=\exp\circ\overline{\chi}\circ\log} of a representation {(V,\rho)} of {G} is a generalized eigenvalue associated to a non-trivial {A}-invariant subspace, i.e., {\chi} is a weight whenever

\displaystyle \{0\}\neq V_{\chi}=\{v\in V: \rho(a)v=\chi(a)v:=\exp(\overline{\chi}(\log a))v \, \, \,\, \forall\, \, a\in A\}.

The weights are partially ordered via {\overline{\chi}_1\leq\overline{\chi}_2} if and only if {\overline{\chi}_1(\log a)\leq \overline{\chi}_2(\log a)} for all {a\in A}, and any irreducible representation {(V,\rho)} possesses a unique maximal weight {\overline{\chi}_{\rho}} (and, as it turns out, {V_{\chi_{\rho}}} is one-dimensional). In this context, the distinguished representations {\rho_1,\dots,\rho_{d_s}} form a family of representations whose maximal weights {\overline{\chi}_{\rho_i}} provide a basis of {Hom(\mathfrak{a},\mathbb{R})}.

A matrix {T\in GL_n} is proximal when its projective action on {\mathbb{P}(\mathbb{R}^n)} possesses an attracting fixed point {v_T^+} and a repulsive hyperplane {H_T^<}. Also, an element {g\in G} is called {G}-proximal if and only if the matrices {\rho_i(g)} are proximal for all {1\leq i\leq d_s} (or, equivalently, {\lambda(g)\in \mathfrak{a}^{++}}).

A matrix {T\in GL_n} is {(r,\varepsilon)}-proximal whenever {T} is proximal, {d(v_T^+, H_T^<)\geq 2r}, and {d(Tx, Ty)\leq \varepsilon d(x,y)} for all {d(x,H_T^<)\geq\varepsilon}, {d(y,H_T^<)\geq\varepsilon} (where {d} is the Fubini-Study metric on the projective space {\mathbb{P}(\mathbb{R}^n)}). Moreover, {g\in G} is {(G,r,\varepsilon)}-proximal if and only if the matrices {\rho_i(g)} are {(r,\varepsilon)}-proximal for all {1\leq i\leq d_s}.

A beautiful theorem of Abels–Margulis–Soifer asserts that {(G,r,\varepsilon)}-proximal elements are really abundant: given a Zariski-dense monoid {\Gamma} of {G}, there exists {r=r(\Gamma)>0} such that for all {0<\varepsilon<r}, one can find a finite subset {F=F(\Gamma, r, \varepsilon)\subset\Gamma} with the property that for any {g\in G}, one can find {f\in F} with {gf} {(G,r,\varepsilon)}-proximal.

In the previous post of this series, we saw that the Abels–Margulis–Soifer theorem was at the heart of the Breuillard–Sert proof of the following result:

Theorem 1 If {S\subset G} is compact and the monoid {\langle S\rangle} generated by {S} is Zariski-dense in {G}, then the sequences {\frac{1}{n}\kappa(S^n)} and {\frac{1}{n}\lambda(S^n)} converge in Hausdorff topology to a compact subset {J(S)\subset \mathfrak{a}^+} called the joint spectrum of {S}.

After this brief review of the definition of the joint spectrum, let us now study some of its basic properties.

2. Convexity of the joint spectrum

Theorem 2 {J(S)} is a convex subset of {\mathfrak{a}^+}.

Remark 2 Later, we will see some sufficient conditions to get {\textrm{int}(J(S))\neq\emptyset}.

Similarly to the proof of Theorem 1, some important ideas behind the proof of Theorem 2 are:

  • the Jordan projection {\lambda} behaves well under powers: {\lambda(g^k)=k\lambda(g)};
  • the Cartan projection {\kappa} is subadditive: {\|\kappa(gh)-\kappa(g)\|=O_G(\|h\|)};
  • the Cartan and Jordan projections of proximal elements are comparable: there is a constant {C_r>0} such that {|\lambda(g)-\kappa(g)|\leq C_r} for all {g\in G} {(G,r,\varepsilon)}-proximal;
  • Abels–Margulis–Soifer provides a huge supply of proximal elements.

We start to formalize these ideas with the following lemma:

Lemma 3 If {g} and {h} are {(G,r,\varepsilon)}-proximal elements, then there are {u\in\langle S\rangle} and {M>0} such that {\|\lambda(g^k u h^k u)-k\lambda(g)-k\lambda(h)\|\leq M} for all {k\in\mathbb{N}}.

Proof: After replacing {g} by the matrix {\rho_i(g)}, our task is reduced to studying the behaviour of the eigenvalues {\lambda_1(g)} of largest modulus of proximal matrices {g}.

By definition of proximality, the matrices {\frac{1}{\lambda_1(g^k)}g^k} converge to a projection {\pi_g} on {\mathbb{R}v_g^+} parallel to {H_g^<} as {k\rightarrow\infty}. Also, an analogous statement is valid for {h}. In particular, for any {u}, one has

\displaystyle \frac{g^k u h^k u}{\lambda_1(g^k)\lambda_1(h^k)}\rightarrow \pi_g u \pi_h u

as {k\rightarrow\infty}.

It is not hard to show that there exists {u\in \langle S\rangle} such that {\pi_g u \pi_h u} is not nilpotent: in fact, this happens because {\langle S\rangle} is Zariski-dense and the nilpotency condition can be described in polynomial terms. In particular, {|\lambda_1(\pi_g u \pi_h u)|>0} and, by continuity, there exists {M>1} with

\displaystyle \left|\log\frac{|\lambda_1(g^k u h^k u)|}{|\lambda_1(g^k)|\cdot|\lambda_1(h^k)|}\right|\leq \log M

for all {k\in\mathbb{N}}. This ends the proof. \Box
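The normalisation used in the proof of Lemma 3 can be checked by hand on a toy example. The sketch below takes a sample proximal matrix in {GL_2} (my choice, not from the lectures) and verifies that {g^k/\lambda_1(g)^k} approaches the projection {\pi_g} onto {\mathbb{R}v_g^+} parallel to {H_g^<}; the function `normalized_power` is a hypothetical helper.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][t] * b[t][j] for t in range(2)) for j in range(2))
        for i in range(2)
    )

def normalized_power(g, lam1, k):
    """Return g^k / lam1^k, rescaling at each step to avoid overflow."""
    scaled = tuple(tuple(x / lam1 for x in row) for row in g)
    result = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(k):
        result = matmul(result, scaled)
    return result

# Sample proximal matrix: eigenvalues 2 (eigenvector e_1) and 1/2
# (eigenvector (1, -3/2), which spans the repulsive hyperplane H_g^<).
g = ((2.0, 1.0), (0.0, 0.5))
lam1 = 2.0

# Projection onto R e_1 parallel to H_g^<: it fixes e_1 and kills (1, -3/2).
pi_g = ((1.0, 2.0 / 3.0), (0.0, 0.0))

for k in (1, 5, 30):
    print(k, normalized_power(g, lam1, k))
```

Since the second eigenvalue is {1/2}, the error decays like {4^{-k}}, so thirty iterations already agree with {\pi_g} to machine precision.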

At this point, we are ready to prove Theorem 2. Since {J(S)} is a compact subset of {\mathfrak{a}^+}, the proof of its convexity is reduced to showing that {\frac{x+y}{2}\in J(S)} for all {x, y\in J(S)}.

For this sake, we begin by applying the Abels–Margulis–Soifer theorem in order to fix {r>0} and a finite subset {F\subset\langle S\rangle} so that for any {w\in G} we can find {f\in F} with {wf} {(G,r,\varepsilon)}-proximal. Since {F} is a finite subset of {\langle S\rangle}, there exists {m_0\in\mathbb{N}} such that any {f\in F} satisfies {f\in S^{n_f}} for some {n_f\leq m_0}.

Next, we consider {x, y\in J(S)} and we recall that {J(S)=\lim\frac{1}{n}\lambda(S^n)=\lim\frac{1}{n}\kappa(S^n)}. Hence, given {\delta>0}, we have that for all {n\in \mathbb{N}} sufficiently large, there are {g, h\in S^n} with

\displaystyle |\frac{1}{n}\kappa(g)-x|<\delta \quad \textrm{and} \quad |\frac{1}{n}\kappa(h)-y|<\delta.

Now, we select {f_g, f_h\in F} with {gf_g} and {h f_h} {(G,r,\varepsilon)}-proximal. Recall that, by proximality, there exists a constant {C_r>0} with

\displaystyle \|\lambda(gf_g)-\kappa(gf_g)\|\leq C_r \quad \textrm{and} \quad \|\kappa(gf_g)-\kappa(g)\|\leq C_r

(and an analogous statement is also true for {hf_h}). Furthermore, by Lemma 3, there are {u\in\langle S\rangle}, say {u\in S^{p(n)}}, and {M>0} with

\displaystyle \|\lambda((g f_g)^k u (h f_h)^k u) - k\lambda(g f_g) - k\lambda(h f_h)\|\leq M

for all {k\in\mathbb{N}}. Observe that {(g f_g)^k u (h f_h)^k u\in S^{2kn+k(n_{f_g}+n_{f_h})+2p(n)}}, where {f_g\in S^{n_{f_g}}} and {f_h\in S^{n_{f_h}}}.

By dividing by {2kn+k(n_{f_g}+n_{f_h})+2p(n)}, by taking {n} large (so that {n\gg m_0\geq n_{f_g}, n_{f_h}}) and by letting {k\rightarrow \infty} (so that {k\gg p(n)}), we see that

\displaystyle \left\|\frac{1}{2kn+k(n_{f_g}+n_{f_h})+2p(n)}\lambda((g f_g)^k u (h f_h)^k u)-\frac{x+y}{2}\right\|\leq 2\delta

for {n} and {k} sufficiently large.

Since {\delta>0} is arbitrary and {J(S)} is closed, this proves that {(x+y)/2\in J(S)}. This completes the proof of Theorem 2.

3. Continuity properties of the joint spectrum

3.1. Domination and continuity

Definition 4 We say that {S\subset GL_d(\mathbb{R})} is {1}-dominated if there exists {\delta>0} such that

\displaystyle \frac{a_2(g)}{a_1(g)}\leq (1-\delta)^n

for all {n} sufficiently large and {g\in S^n}. (Recall that {a_1(g)\geq a_2(g)\geq\dots\geq a_d(g)} are the singular values of {g}.)
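To get a feeling for Definition 4, one can compute the worst ratio {a_2(g)/a_1(g)} over {g\in S^n} by brute force. The sketch below does this for a hypothetical pair of matrices in {SL_2(\mathbb{Z})} with positive entries (a classical source of domination, since such matrices preserve the positive cone); the names `singular_ratio` and `worst_ratio` are mine.

```python
import itertools
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][t] * b[t][j] for t in range(2)) for j in range(2))
        for i in range(2)
    )

def singular_ratio(g):
    """Return a_2(g)/a_1(g) for a 2x2 matrix g."""
    frob = sum(x * x for row in g for x in row)          # a_1^2 + a_2^2
    det = abs(g[0][0] * g[1][1] - g[0][1] * g[1][0])     # a_1 * a_2
    a1_sq = (frob + math.sqrt(max(frob * frob - 4 * det * det, 0.0))) / 2
    return det / a1_sq                                   # (a_1 a_2) / a_1^2

def worst_ratio(S, n):
    """Maximum of a_2(g)/a_1(g) over all products g of n elements of S."""
    worst = 0.0
    for word in itertools.product(S, repeat=n):
        g = ((1, 0), (0, 1))
        for letter in word:
            g = matmul(g, letter)
        worst = max(worst, singular_ratio(g))
    return worst

# Hypothetical example: two determinant-one matrices with entries >= 1.
S = [((2, 1), (1, 1)), ((3, 1), (2, 1))]
for n in range(1, 7):
    print(n, worst_ratio(S, n), 0.5 ** n)
```

In this example every entry of a product of {n} letters is at least {2^{n-1}}, which gives {a_2(g)/a_1(g)\leq 2\cdot 4^{-n}\leq (1/2)^n}: the set {S} is {1}-dominated with {\delta=1/2}.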

Definition 5 We say that {S\subset G} is {G}-dominated if {\rho_i(S)} is {1}-dominated for all {1\leq i\leq d_s}.

Remark 3 If {S} is {G}-dominated, then it is possible to show that the joint spectrum {J(S)} is well-defined even when {S} is not Zariski dense in {G}.

The next proposition asserts that the notion of {G}-domination generalizes the concept of matrices with simple spectrum (i.e., all of their eigenvalues have distinct moduli and multiplicity one).

Proposition 6 {S} is {G}-dominated if and only if {J(S)\subset\mathfrak{a}^{++}}.

On the other hand, the notion of {1}-domination is related to Schottky families.

Definition 7 We say that {E\subset GL_d(\mathbb{R})} is a {(r,\varepsilon)}-Schottky family if

  • (a) any {\gamma\in E} is {(r,\varepsilon)}-proximal;
  • (b) {d(v_{\gamma}^+, H_{\gamma'}^<)\geq 6\varepsilon} for all {\gamma, \gamma'\in E}.

Proposition 8 {S\subset GL_d(\mathbb{R})} is {1}-dominated {\iff} there are {n\in\mathbb{N}} and {0<\varepsilon<r} so that {S^n} is a {(r,\varepsilon)}-Schottky family.

Proof: Let us first establish the implication {\Longleftarrow}. It is not hard to see that if {S^n} is {1}-dominated, then {S} is {1}-dominated. Therefore, we can assume that {S} is a {(r,\varepsilon)}-Schottky family. At this point, we invoke the following lemma due to Breuillard–Gelander:

Lemma 9 (Breuillard–Gelander) If {g\in GL_d(\mathbb{R})} is {\varepsilon}-Lipschitz on a non-empty open subset {\Omega} of {\mathbb{P}(\mathbb{R}^d)}, then {a_2(g)/a_1(g)\leq \varepsilon/\sqrt{1-\varepsilon^2}}.

Proof: Thanks to the {KAK} decomposition, we can assume that {g=\textrm{diag}(a_1,\dots, a_d)}. Given {[v]\in\Omega} and {\delta>0} sufficiently small, our assumption on {g} implies that {d([gv], [gv+\delta ge_1])<\varepsilon d([v],[v+\delta e_1])} and {d([gv], [gv+\delta ge_2])<\varepsilon d([v],[v+\delta e_2])}. These inequalities imply the desired fact that {a_2(g)/a_1(g)\leq \varepsilon/\sqrt{1-\varepsilon^2}} after some computations with the Fubini-Study metric {d}. \Box

If {S} is a {(r,\varepsilon)}-Schottky family, then all elements of {S^n} are {\varepsilon^n}-Lipschitz on a neighborhood of any fixed {v_s^+}, {s\in S} for all {n\in\mathbb{N}}. By the previous lemma, we conclude that {a_2(g)/a_1(g)\leq 2\varepsilon^n} for all {n} sufficiently large and {g\in S^n}. Thus, {S} is {1}-dominated.

Let us now prove the implication {\Longrightarrow}. For this sake, we use a result of Bochi–Gourmelon (justifying the nomenclature “domination”): {S} is {1}-dominated if and only if there is a dominated splitting for a natural linear cocycle over the full shift dynamics on {S^{\mathbb{Z}}}, i.e.,

  • Splitting condition: there are continuous maps {E^u:S^{\mathbb{Z}}\rightarrow\mathbb{P}(\mathbb{R}^d)} and {E^s:S^{\mathbb{Z}}\rightarrow\textrm{Gr}(d-1,\mathbb{R})} such that {\mathbb{R}^d=E^u(x)\oplus E^s(x)} for all {x\in S^{\mathbb{Z}}} (here, {\textrm{Gr}(d-1,\mathbb{R})} is the Grassmannian of hyperplanes of {\mathbb{R}^d});
  • Invariance condition: {E^u(\sigma x) = x_0 (E^u(x))} and {E^s(\sigma x)=x_0(E^s(x))} for all {x=(\dots, x_{-1},x_0,x_1,\dots)\in S^{\mathbb{Z}}} (here, {\sigma} denotes the left shift dynamics {\sigma((x_i)_{i\in\mathbb{Z}}) = (x_{i+1})_{i\in\mathbb{Z}}});
  • Domination condition: the weakest contraction along {E^u} dominates the strongest expansion along {E^s}, that is, there are {C>0} and {0<\tau<1} such that {\|x_{n-1}\dots x_0|_{E^s(x)}\| \leq C\tau^n \|x_{n-1}\dots x_0|_{E^u(x)}\|} {\forall} {x=(\dots, x_{-1},x_0,x_1,\dots)\in S^{\mathbb{Z}}}.

Remark 4 For {S\subset SL(2,\mathbb{R})}, the equivalence between {1}-domination and the presence of dominated splittings was established by Yoccoz.

An important metaprinciple in Dynamics (going back to the classical proofs of the stable manifold theorem) asserts that “stable spaces depend only on the future orbit”. In our present context, this is reflected by the fact that one can show that {E^s(x)} depends only on {x_0, x_1,\dots} and {E^u(x)} depends only on {x_{-1}, x_{-2}, \dots} for all {x=(\dots, x_{-1},x_0,x_1,\dots)\in S^{\mathbb{Z}}}.

An interesting consequence of this fact is the following statement about the “non-existence of tangencies between {E^u} and {E^s}”: if {S} is {1}-dominated, then {E^u(x)\notin E^s(y)} for all {x, y\in S^{\mathbb{Z}}}. Indeed, this statement can be easily obtained by contradiction: if {E^u(x)\in E^s(y)} for some {x=(\dots, x_{-1}, x_0, x_1,\dots)} and {y=(\dots, y_{-1}, y_0, y_1,\dots)}, then {z:=(\dots, x_{-2}, x_{-1}, y_0, y_1,\dots)} has the property that {E^u(z)=E^u(x)} and {E^s(z)=E^s(y)}. Hence, {E^u(z) + E^s(z)=E^s(z)\neq \mathbb{R}^d}, a contradiction with the splitting condition above.

At this stage, we are ready to show that if {S} is {1}-dominated, then {S^n} is a {(r,\varepsilon)}-Schottky family for some {n\in\mathbb{N}} and {0<\varepsilon\leq r}. In fact, given {g\in S^n}, let {x(g)\in S^{\mathbb{Z}}} be the periodic sequence obtained by infinite concatenation of the word {g}. We claim that, for {n} sufficiently large, {g} is proximal with {v_g^+= E^u(x(g))} and {H_g^<=E^s(x(g))}, and {\varepsilon}-Lipschitz outside the {\varepsilon}-neighborhood of {H_g^<}. This happens because the compactness of {S^{\mathbb{Z}}} and the non-existence of tangencies between {E^u} and {E^s} provide a uniform transversality between {E^u} and {E^s}. By combining this information with the domination condition above (and the fact that {C\tau^n\ll 1} for {n} sufficiently large), a small linear-algebraic computation reveals that any {g\in S^n} is proximal and {\varepsilon}-Lipschitz outside the {\varepsilon}-neighborhood of {H_g^<=E^s(x(g))} for adequate choices of {\varepsilon>0} and {n\in\mathbb{N}}. \Box

The proof of the previous proposition gave a clear link between {1}-domination and the notion of dominated splittings. Since a dominated splitting is robust under small perturbations (because it is detected by variants of the so-called cone field criterion), a direct consequence of the proof of the proposition above is:

Corollary 10 The {G}-domination property is open: if {S} is {G}-dominated, then any {S'} included in a sufficiently small neighborhood of {S} is also {G}-dominated.

The previous proposition also links {1}-domination to Schottky families and, as it turns out, this is a key ingredient to obtain the continuity of the joint spectrum in the presence of domination.

Theorem 11 If {S_0} is {G}-dominated, then the map {S\mapsto J(S)} is continuous at {S_0}.

Very roughly speaking, the proof of this result relies on the fact that if a matrix is “very Schottky” (like a huge power of a proximal matrix), then this matrix is quite close to a rank 1 operator and, in this regime, the Jordan projection {\lambda} behaves in an “almost additive” way.

3.2. Examples of discontinuity

3.2.1. Calculation of a joint spectrum in {SL_2(\mathbb{R})}

Recall that {SL_2(\mathbb{R})} acts on the Poincaré disk {\mathbb{D}} by isometries of the hyperbolic metric. Consider {S=\{a,b\}}, where {a} and {b} are loxodromic elements of {SL_2(\mathbb{R})} acting by translations along disjoint oriented geodesic axes {\rho_a} and {\rho_b} on {\mathbb{D}} from {x_a^-\in\partial\mathbb{D}} to {x_a^+\in\partial\mathbb{D}} and from {x_b^-\in\partial\mathbb{D}} to {x_b^+\in\partial\mathbb{D}}. We assume that the endpoints of the axes {\rho_a} and {\rho_b} are cyclically ordered on {\partial\mathbb{D}} as {x_a^-, x_b^-, x_b^+, x_a^+}, and we denote by {\tau_a=2\log\lambda_1(a)} and {\tau_b=2\log\lambda_1(b)} the translation lengths of {a} and {b} along {\rho_a} and {\rho_b}.

In the sequel, we want to compute {J(S)\subset\mathbb{R}} and, for this sake, we need to understand {\frac{1}{n}\log\lambda_1(w(a,b))} where {w(a,b)} is a word of length {n} on {a} and {b}.

Proposition 12 If {a} and {b} are elements of {SL_2(\mathbb{R})} as above, {d>0} denotes the distance between the axes of {a} and {b}, and {\tau_b=\tau_a+2d+1}, then {J(S)} is the interval

\displaystyle J(S)=[\tau_a/2, \tau_b/2].

Proof: One can show (using hyperbolic geometry) that {ab} is a loxodromic element whose axis stays between the axes of {a} and {b} while going from a point in {[x_a^-, x_b^-]} to a point in {[x_b^+, x_a^+]}, and the translation length of {ab} satisfies

\displaystyle \cosh(\tau_{ab}/2) = \cosh(d)\sinh(\tau_a/2)\sinh(\tau_b/2)+\cosh(\tau_a/2)\cosh(\tau_b/2).

In particular, {\tau_a+\tau_b\leq \tau_{ab}\leq \tau_a+\tau_b+2d}.
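The two-sided bound on {\tau_{ab}} follows from {\cosh(d)\geq 1} together with the addition formula for {\cosh}, and it is easy to test numerically. The sketch below evaluates {\tau_{ab}} from the displayed formula for a few sample values of {(\tau_a,\tau_b,d)} chosen by me; `translation_length_ab` is a hypothetical helper name.

```python
import math

def translation_length_ab(tau_a, tau_b, d):
    """Solve cosh(tau_ab/2) = cosh(d) sinh(tau_a/2) sinh(tau_b/2)
    + cosh(tau_a/2) cosh(tau_b/2) for tau_ab."""
    x, y = tau_a / 2, tau_b / 2
    c = math.cosh(d) * math.sinh(x) * math.sinh(y) + math.cosh(x) * math.cosh(y)
    return 2 * math.acosh(c)

samples = [(0.5, 0.5, 0.1), (1.0, 3.0, 0.5), (2.0, 5.0, 2.0)]
for tau_a, tau_b, d in samples:
    tau_ab = translation_length_ab(tau_a, tau_b, d)
    lower, upper = tau_a + tau_b, tau_a + tau_b + 2 * d
    print(f"{lower:.4f} <= {tau_ab:.4f} <= {upper:.4f}")
```

The lower bound is attained in the limit {d\rightarrow 0}, and the upper bound is approached when {\tau_a} and {\tau_b} are large compared to {d}.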

We claim that if {w(a,b)} is a word on {a} and {b} and {\widetilde{w}(a,b)} is a word obtained from {w(a,b)} by replacing some letter {a} by {b}, then {\tau_{\widetilde{w}}\geq \tau_w+1}. In fact, by performing a conjugation if necessary, we can assume that {w=w'a} and {\widetilde{w}=w'b}, so that {\tau_{\widetilde{w}}\geq\tau_{w'}+\tau_b=\tau_{w'}+\tau_a+2d+1} and {\tau_w\leq\tau_{w'}+\tau_a+2d}.

Therefore, if we start in {S^n} with {a^n} and we successively replace {a} by {b} until we reach {b^n}, then we see from the claim in the previous paragraph that {\frac{1}{n}\lambda(S^n)} becomes denser in {[\tau_a/2, \tau_b/2]} as {n\rightarrow\infty}. This proves that {J(S)=[\tau_a/2,\tau_b/2]}. \Box

3.2.2. Some joint spectra in {GL_2(\mathbb{R})}

Let {a, b\in SL_2(\mathbb{R})} be as above and fix {\alpha>0}. We assume that there exists {k\in\mathbb{N}} such that {b^{-1}=Ra^kR} where {R} is the rotation by {\pi/2}.

The joint spectrum {J(S_{\infty})} of {S_{\infty}=\{\alpha\cdot\textrm{Id}, a, b\}} in the plane with axes {\log\lambda_1} and {\log\lambda_2} is a triangle with vertex at {(\log\alpha, \log\alpha)}, intersecting the {\log\lambda_1}-axis on the interval {[\log\lambda_1(a), \log\lambda_1(b)]}, and the side opposite to the vertex {(\log\alpha, \log\alpha)} contained in the line {\log\lambda_2=-\log\lambda_1}. Indeed, one eventually gets this description of {J(S_{\infty})} because {S_{\infty}^n}, {n\in\mathbb{N}}, can be computed explicitly in terms of the joint spectrum of {\{a,b\}} thanks to the fact that {\alpha\cdot\textrm{Id}} commutes with {a} and {b}. Note that {(0,0)\notin J(S_{\infty})}.

Let us now consider {S_m=\{\alpha\cdot R_m, a, b\}}, where {R_m} denotes the rotation by {\pi/2m}. We claim that {(0,0)\in J(S_m)} for all {m\in\mathbb{N}} and, a fortiori, {J(.)} is discontinuous at {S_{\infty}} (because {S_m\rightarrow S_{\infty}} as {m\rightarrow\infty}). In fact, given {m\in\mathbb{N}}, since {b^{-1}=Ra^kR}, the word

\displaystyle w_n = b^n(\alpha\cdot R_m)^m a^{kn} (\alpha\cdot R_m)^m\in S_m^{(k+1)n+2m}

equals {\pm\alpha^{2m}\cdot \textrm{Id}} (the sign, which comes from {R^2=-\textrm{Id}}, does not affect the Jordan projection). Therefore,

\displaystyle \frac{2m(\log\alpha,\log\alpha)}{(k+1)n+2m}=\frac{1}{(k+1)n+2m}\lambda(w_n)\in\frac{1}{(k+1)n+2m}\lambda(S_m^{(k+1)n+2m})

and, by letting {n\rightarrow\infty}, we conclude that {(0,0)\in J(S_m)}, as desired.
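The algebraic identity behind this discontinuity can be checked numerically. In the sketch below, the matrix {a}, the exponent {k}, the scalar {\alpha} and the integer {m} are sample values of mine, and {b} is defined by the relation {b^{-1}=Ra^kR}; the computation only tests the identity for {w_n} (up to the sign coming from {R^2=-\textrm{Id}}) and does not check the geometric assumptions on the axes of {a} and {b}.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][t] * b[t][j] for t in range(2)) for j in range(2))
        for i in range(2)
    )

def matpow(g, n):
    result = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(n):
        result = matmul(result, g)
    return result

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def inverse_sl2(g):
    """Inverse of a 2x2 matrix of determinant one."""
    return ((g[1][1], -g[0][1]), (-g[1][0], g[0][0]))

alpha, k, m = 1.5, 2, 3                    # sample parameters (assumptions)
a = ((3.0, 1.0), (2.0, 1.0))               # sample hyperbolic element of SL_2
R = ((0.0, -1.0), (1.0, 0.0))              # rotation by pi/2
b = inverse_sl2(matmul(matmul(R, matpow(a, k)), R))   # b^{-1} = R a^k R
alpha_Rm = tuple(tuple(alpha * x for x in row) for row in rotation(math.pi / (2 * m)))

def w(n):
    """The word b^n (alpha R_m)^m a^{kn} (alpha R_m)^m."""
    quarter_turn = matpow(alpha_Rm, m)     # numerically close to alpha^m R
    left = matmul(matpow(b, n), quarter_turn)
    right = matmul(matpow(a, k * n), quarter_turn)
    return matmul(left, right)

for n in (1, 2, 3):
    length = (k + 1) * n + 2 * m           # w_n belongs to S_m^length
    print(n, w(n), 2 * m * math.log(alpha) / length)
```

The last printed column is the common value of the two coordinates of {\frac{1}{(k+1)n+2m}\lambda(w_n)}, which tends to {0} as {n} grows.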

4. Prescribing the joint spectrum

We close this post with a brief sketch of the following result:

Theorem 13

  • (1) If {\mathcal{C}} is a convex body in {\mathfrak{a}^+}, there exists a compact subset {S} of {G} generating a Zariski-dense monoid such that {J(S)=\mathcal{C}}.
  • (2) Moreover, if {\mathcal{C}} is a convex polyhedron with a finite number of vertices, then there exists a finite subset {S\subset G} generating a Zariski-dense monoid such that {J(S)=\mathcal{C}}.

Proof: (1) If we forget about the Zariski-denseness condition, then we could take simply {S=\exp(\mathcal{C})}. In order to respect the Zariski-density constraint, we fix {a_0\in\textrm{int}(\mathcal{C})} and we set {S=\exp(\mathcal{C})\cup \exp(a_0)V} where {V} is a small neighborhood of the identity. In this way, the monoid generated by {S} is Zariski-dense and it is possible to check that {J(S)=\mathcal{C}} whenever {V} is sufficiently small.

(2) Given a finite set {\mathcal{C}_0} whose convex hull is {\mathcal{C}}, we can take {S=\exp(\mathcal{C}_0)\cup \exp(a_0) F} where {F\subset V} is a finite set with sufficiently many points so that the monoid generated by {S} is Zariski-dense. \Box

Posted by: matheuscmss | December 22, 2019

Dartyge’s talk on ellipsephic integers

Last November, I attended the beautiful conference Prime Numbers, Determinism and Pseudorandomness at CIRM. This conference was originally prepared to celebrate the 60th birthday of Christian Mauduit, but unfortunately a tragic event during the summer of 2019 turned this conference into a celebration of the memory of Christian.

The links to the titles, abstracts, slides and videos for the talks of this excellent meeting can be found here.

In this blog post, I would like to transcribe my notes for the amazing survey talk “On ellipsephic integers” by Cécile Dartyge on one of Christian’s favorite topics in Analytic Number Theory, namely, the statistics of integers missing some digits.

Of course, all mistakes in the sequel are my sole responsibility.

1. Introduction

The term ellipsephic integers refers to a collection of integers with missing digits in a certain basis (e.g., all integers whose representation in basis 10 doesn’t contain the digit 9). Christian Mauduit proposed this nomenclature partly because ellipsis = missing and psiphic = digit in Greek.

Formally, we consider a basis {r\in\mathbb{N}}, {r\geq 3}, and a subset {\mathcal{D}} of {\{0, 1, \dots, r-1\}} of cardinality {2\leq\#\mathcal{D}\leq r-1}. The corresponding set of ellipsephic integers {W_{\mathcal{D}}} is

\displaystyle W_{\mathcal{D}}:=\left\{n=\sum\limits_{j=0}^k a_j r^j: a_j\in\mathcal{D} \, \, \,\forall\,0\leq j\leq k\right\}.

The subset of ellipsephic integers below a certain threshold {x} is denoted by

\displaystyle W_{\mathcal{D}}(x):=\{n\in W_{\mathcal{D}}: n< x\}.

For the sake of exposition, we shall assume from now on that {0\in\mathcal{D}} and

\displaystyle gcd(d\in\mathcal{D})=1

unless it is explicitly said otherwise.
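As a quick illustration of these definitions, here is a small brute-force sketch listing {W_{\mathcal{D}}(x)}. The function names are mine, and the convention that {0} belongs to {W_{\mathcal{D}}} relies on the assumption {0\in\mathcal{D}} made above.

```python
def is_ellipsephic(n, r, digits):
    """True when every base-r digit of n >= 0 lies in `digits`
    (the integer 0 has no digits, so it is accepted)."""
    while n > 0:
        n, a = divmod(n, r)
        if a not in digits:
            return False
    return True

def W(x, r, digits):
    """The ellipsephic integers n < x in base r with digit set `digits`."""
    return [n for n in range(x) if is_ellipsephic(n, r, digits)]

print(W(27, 3, {0, 1}))               # [0, 1, 3, 4, 9, 10, 12, 13]
print(len(W(100, 10, set(range(9))))) # 81 integers below 100 avoid the digit 9
```

More generally, when {0\in\mathcal{D}} one has {\#W_{\mathcal{D}}(r^N)=(\#\mathcal{D})^N}, which quantifies the sparseness of these sets.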

2. Ellipsephic integers on arithmetic progressions

Let {W_{\mathcal{D}}(x,a,q) := \{n\in W_{\mathcal{D}}(x): n\equiv a \, (\textrm{mod } q)\}}. Despite their sparseness, it was proved by Erdös, Mauduit and Sárközy that ellipsephic integers behave well (i.e., “à la Siegel-Walfisz”) along arithmetic progressions:

Theorem 1 (Erdös–Mauduit–Sárközy) There are two constants {c_1=c_1(r,\mathcal{D})} and {c_2=c_2(r,\mathcal{D})} such that

\displaystyle \left|\#W_{\mathcal{D}}(x,a,q)-\frac{\#W_{\mathcal{D}}(x)}{q}\right|\ll_{r,\mathcal{D}}\frac{\#W_{\mathcal{D}}(x)}{q}\exp\left(-c_2\frac{\log x}{\log q}\right)

for all {a\in\mathbb{N}}, {gcd(q,r(r-1))=1}, {q\leq \exp(c_1\sqrt{\log x})} and {x} sufficiently large.

Proof: As it is usual in this kind of counting problem, one relies on exponential sums. More precisely, note that

\displaystyle \#W_{\mathcal{D}}(x,a,q) = \frac{1}{q}\sum\limits_{h=1}^q\sum\limits_{n\in W_{\mathcal{D}}(x)} e\left(\frac{h(n-a)}{q}\right)

where {e(t):=\exp(2\pi i t)}. The “main term” {\#W_{\mathcal{D}}(x)/q} comes from the case {h=q}, so that our task consists in estimating the “error term”. For this sake, one has essentially to study

\displaystyle F_{N,\mathcal{D}}(h/q):=\sum\limits_{n\in W_{\mathcal{D}}(x)} e\left(\frac{hn}{q}\right)

where {x:=r^N}. Observe that

\displaystyle F_{N,\mathcal{D}}(h/q)=\prod\limits_{j=0}^{N-1}\left(\sum\limits_{d\in\mathcal{D}}e\left(\frac{hdr^j}{q}\right)\right):=\prod\limits_{j=0}^{N-1} u_{\mathcal{D}}(hr^j/q).

The terms {u_{\mathcal{D}}(hr^j/q)} are controlled thanks to the following lemma (giving some saving over the trivial bound {|u_{\mathcal{D}}(\alpha)|\leq \#\mathcal{D}} for all {\alpha\in\mathbb{R}}):

Lemma 2 (Erdös–Mauduit–Sárközy) Let {t:=\#\mathcal{D}}. For any {\alpha\in\mathbb{R}}, one has

\displaystyle |u_{\mathcal{D}}(\alpha)|\leq t\left(1-\frac{\|\alpha\|^2}{(r-1)^5}\right)

where {\|\alpha\|:=\min\limits_{n\in\mathbb{Z}}|\alpha-n|}.

In order to take full advantage of the saving on the right-hand side of the inequality, one needs the following lemma:

Lemma 3 (Mauduit–Sárközy) For any {\beta\in\mathbb{R}} and {q\leq r^{(N-8)/2}}, one has

\displaystyle \sum\limits_{0\leq k<N}\left\|\beta+\frac{hr^k}{q}\right\|\geq \frac{(r-1)^2}{128}\frac{N}{\log q}

The details of the derivation of the desired theorem from the two lemmas above are explained in Section 4 of the Erdös–Mauduit–Sárközy paper. \Box
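The two identities at the start of this proof (the detection of the congruence {n\equiv a} (mod {q}) by exponential sums, and the factorisation of {F_{N,\mathcal{D}}} as a product over the digit positions {j=0,\dots,N-1}) can be tested on small cases. The following sketch does so with hypothetical helper names and assumes {0\in\mathcal{D}}, so that the integers below {r^N} are exactly the strings of {N} digits with leading zeros allowed.

```python
import cmath

def e(t):
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)

def ellipsephic_below(N, r, digits):
    """All n < r^N whose N base-r digits (leading zeros allowed) lie in `digits`."""
    values = [0]
    for j in range(N):
        values = [n + d * r ** j for n in values for d in digits]
    return values

def F_direct(N, r, digits, h, q):
    """F_{N,D}(h/q) as a sum over the ellipsephic integers below r^N."""
    return sum(e(h * n / q) for n in ellipsephic_below(N, r, digits))

def F_product(N, r, digits, h, q):
    """F_{N,D}(h/q) as the product of u_D(h r^j / q) over j = 0, ..., N-1."""
    result = 1
    for j in range(N):
        result *= sum(e(h * d * r ** j / q) for d in digits)
    return result

def count_by_characters(N, r, digits, a, q):
    """#W_D(r^N, a, q) computed from the exponential sums F_{N,D}(h/q)."""
    total = sum(e(-h * a / q) * F_direct(N, r, digits, h, q) for h in range(1, q + 1))
    return (total / q).real

D = (0, 1)
print(F_direct(4, 3, D, 2, 5), F_product(4, 3, D, 2, 5))
print(count_by_characters(4, 3, D, 1, 5),
      sum(1 for n in ellipsephic_below(4, 3, D) if n % 5 == 1))
```

Both lines should print two (numerically) equal values; the product form is what makes the digit-by-digit estimates of Lemma 2 applicable.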

The methods of Erdös–Mauduit–Sárközy above paved the way to further results about ellipsephic integers. For instance, similarly to the Bombieri–Vinogradov theorem, it is natural to expect that the distribution result of Erdös–Mauduit–Sárközy gets better on average: as it turns out, this was done independently by C. Dartyge and C. Mauduit, and S. Konyagin (circa 2000):

Theorem 4 (Dartyge–Mauduit, Konyagin) There exists {\alpha=\alpha(r,t)>0} such that for all {B>0} there exists {A>0} with the property that

\displaystyle \sum\limits_{\substack{q\leq x^{\alpha}/(\log x)^A,\\ \textrm{gcd}(q,r(r-1))=1}}\left|\#W_{\mathcal{D}}(x,a,q)-\frac{\#W_{\mathcal{D}}(x)}{q}\right|\ll \frac{\#W_{\mathcal{D}}(x)}{(\log x)^B}.

Proof: One uses Lemmas 2 and 3 above, a large sieve method, and some bounds on the moments

\displaystyle \int_0^1 |F_{N,\mathcal{D}}(s)|^m \, ds

of the function {F_{N,\mathcal{D}}}. \Box

Remark 1 More recently, K. Aloui, C. Mauduit and M. Mkaouar improved (in 2017) some of the results of Erdös–Mauduit–Sárközy to obtain some distribution results for ellipsephic and palindromic integers.

3. Ellipsephic primes and almost primes

By pursuing sieve methods, Dartyge and Mauduit obtained in 2001 the following result about ellipsephic almost primes:

Theorem 5 (Dartyge–Mauduit) There exists {k=k(r, t)\in\mathbb{N}} such that

\displaystyle \#\{n\in W_{\mathcal{D}}(x): \Omega(n)\leq k\}\gg\frac{\#W_{\mathcal{D}}(x)}{\log x},

where {\Omega(n)} stands for the number of prime factors of {n} (counted with multiplicity).

A natural question motivated by this theorem concerns the determination of explicit values of {k} in the previous statement. The answer to this question is somewhat related to the value of {\alpha} in the last theorem of the previous section and, in this direction, it is possible to show that

  • if {\mathcal{D}=\{0,1\}}, then one can take
    • {k=3} (and {\alpha\sim1/3}) for {r=3},
    • {k=5} (and {\alpha\sim 1/4}) for {r=4}, …,
    • {k=23} for {r=10}, and
    • {k=\frac{8}{\pi}(1+o(1))r} as {r\rightarrow\infty}
  • if {\mathcal{D}=\{0,\dots,r-2\}}, {r\geq 5}, then one can take {k=2}.

In 2009 and 2010, C. Mauduit and J. Rivat proved two conjectures of Gelfond on sums of digits of primes and squares. The methods in these articles gave hope to reach the case {k=1} (of ellipsephic primes) in Dartyge–Mauduit theorem above. This was recently accomplished by J. Maynard in 2016: if {\mathcal{D}=\{0,\dots,r-1\}\setminus\{a_0\}} and {r\geq 10}, then

\displaystyle \#\{p\in W_{\mathcal{D}}(x): p \textrm{ prime}\}\gg \frac{\#W_{\mathcal{D}}(x)}{\log x}
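For small thresholds, the set of ellipsephic primes can simply be listed. The sketch below (with hypothetical function names, and trial division as a deliberately naive primality test) counts the primes below {x} whose base {r} expansion avoids a given digit {a_0}.

```python
def is_prime(n):
    """Naive primality test by trial division."""
    if n < 2:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            return False
        p += 1
    return True

def avoids_digit(n, r, missing):
    """True when the base-r expansion of n does not contain `missing`."""
    while n > 0:
        n, a = divmod(n, r)
        if a == missing:
            return False
    return True

def ellipsephic_primes(x, r, missing):
    """Primes p < x whose base-r digits all differ from `missing`."""
    return [p for p in range(2, x) if is_prime(p) and avoids_digit(p, r, missing)]

print(len(ellipsephic_primes(100, 10, 9)))   # 19 of the 25 primes below 100
print(ellipsephic_primes(100, 10, 9))
```

Of course, such experiments say nothing about the asymptotic lower bound, which is the deep part of the theorem.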

Remark 2 In his thesis, A. Irving got analogous results for palindromic ellipsephic integers with {3} digits in basis {r\geq 4} with two prime factors.

After this brief discussion of ellipsephic almost primes, let us now talk about ellipsephic integers possessing only small prime factors.

4. Friable ellipsephic integers

Recall that a friable integer is an integer without large prime factors. For later reference, we denote the largest prime factor of {n} by

\displaystyle p^+(n):=\max\limits_{\substack{p|n\\p\textrm{ prime}}} p.

It was shown by Erdös–Mauduit–Sárközy that, for any fixed {a\in\mathcal{D}\setminus\{0\}} and for all {\varepsilon>0}, there are infinitely many ellipsephic integers {n\in W_{\mathcal{D}}} of the form {n=\sum\limits_{i=0}^{k-1}ar^i} whose largest prime factor is {p^+(n)\leq n^{\varepsilon}}.
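To see this phenomenon on concrete numbers, one can factor the first few integers of the form {n=\sum_{i=0}^{k-1}ar^i}. The sketch below takes {a=1} and {r=10} (the usual repunits) as a sample choice and prints {p^+(n)} together with the exponent {\log p^+(n)/\log n}; the helper names are mine and the factorisation is by plain trial division.

```python
import math

def largest_prime_factor(n):
    """p^+(n) for n >= 2, computed by trial division."""
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest, n = p, n // p
        p += 1
    # Whatever remains (if > 1) is a prime larger than all factors found so far.
    return n if n > 1 else largest

def repdigit(a, r, k):
    """The integer with k digits, all equal to a, in base r."""
    return sum(a * r ** i for i in range(k))

for k in range(2, 10):
    n = repdigit(1, 10, k)
    p = largest_prime_factor(n)
    print(k, n, p, round(math.log(p) / math.log(n), 3))
```

For instance, {111111=3\cdot 7\cdot 11\cdot 13\cdot 37} has largest prime factor {37}, which is roughly {n^{0.31}}.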

Naturally, this result motivates the question of establishing the existence of a positive proportion of friable ellipsephic integers. This seems a hard task for arbitrary {\varepsilon>0}, but this problem becomes more tractable for small values of {\varepsilon} when the basis {r} is large enough.

In fact, S. Col showed that there exists {0<\alpha=\alpha(r,\mathcal{D})<1} such that

\displaystyle \#\{n\in W_{\mathcal{D}}(x): p^+(n)<n^{\alpha}\}\gg\#W_{\mathcal{D}}(x).

Moreover, if {\mathcal{D}=\{0,1\}}, then it is possible to take {\alpha=1-\frac{\pi}{4r}\left(1-\frac{3\pi}{4r}\right)} (which is close to one for {r} large). On the other hand, {\alpha} can be taken very small when {\mathcal{D}=\{0,\dots,r-1\}\setminus\{a_0\}} and {r} sufficiently large.

5. Ellipsephic solutions to Vinogradov systems

A Vinogradov system is a system of equations on the variables {x_1,\dots, x_s, y_1, \dots, y_s} of the form:

\displaystyle x_1^j+\dots+x_s^j = y_1^j+\dots+y_s^j, \quad 1\leq j\leq k.

A major breakthrough on counting solutions to Vinogradov systems was famously obtained by J. Bourgain, C. Demeter and L. Guth (see also the text and the video of L. Pierce’s Bourbaki seminar talk on this subject).

Concerning ellipsephic solutions to Vinogradov systems (i.e., solutions with {x_i, y_i\in W_{\mathcal{D}}} for all {1\leq i\leq s}), Kirsty Briggs showed that for {k=2}, {r=p} prime and {\mathcal{D}=\{d^2: d<\sqrt{p}\}}, the trivial bound {\leq \#W_{\mathcal{D}}(x)^{2s}} on the number of solutions of the Vinogradov system

\displaystyle x_1+\dots+ x_s=y_1+\dots+y_s,

\displaystyle x_1^2+\dots+x_s^2=y_1^2+\dots+y_s^2

with {x_i, y_i\in W_{\mathcal{D}}(x)} can be improved into {\ll_{\varepsilon} \#W_{\mathcal{D}}(x)^{2s-6+\varepsilon}} {\forall \,\varepsilon>0} whenever {s\geq 6}. (In particular, this result is saying that in the case {s=6}, the main contribution to the number of ellipsephic solutions of the corresponding Vinogradov system comes from the trivial solutions {x_i=y_i\in W_{\mathcal{D}}(x)}.)
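For very small parameters, the number of ellipsephic solutions can be counted by brute force. The sketch below uses the digit set of squares below {p=11} and two-digit integers (sample choices of mine), and groups the tuples {(x_1,\dots,x_s)} according to their power sums; the function names are hypothetical, and the parameters are far too small to say anything about the regime {s\geq 6} of the theorem.

```python
from collections import Counter
from itertools import product

def ellipsephic_below(N, r, digits):
    """All n < r^N whose N base-r digits (leading zeros allowed) lie in `digits`."""
    values = [0]
    for j in range(N):
        values = [n + d * r ** j for n in values for d in digits]
    return values

def vinogradov_count(values, s, k):
    """Number of (x, y) in values^s x values^s with equal power sums of
    degrees 1, ..., k: tuples are grouped by their vector of power sums."""
    signatures = Counter(
        tuple(sum(v ** j for v in xs) for j in range(1, k + 1))
        for xs in product(values, repeat=s)
    )
    return sum(c * c for c in signatures.values())

p = 11
digits = sorted({d * d for d in range(p) if d * d < p})   # [0, 1, 4, 9]
values = ellipsephic_below(2, p, digits)                  # 16 integers below 11^2
print(len(values), vinogradov_count(values, 2, 2), vinogradov_count(values, 3, 2))
```

When {s=k=2}, the two power sums determine the pair {\{x_1,x_2\}}, so only the trivial solutions occur and the count is {2W^2-W} with {W=\#\textrm{values}}; this gives a convenient sanity check for the code.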

6. Ellipsephic numbers in finite fields

The notion of finite-field analogs of ellipsephic numbers was studied by several authors including Dartyge, Mauduit and Sárközy.

In order to explain some results in this direction, let us set up some notation. Let {q=p^r} be a power of a prime number {p} and denote by {z\in \mathbb{F}_q} a primitive element generating a basis {\mathcal{B}=\{1,z,\dots, z^{r-1}\}} of {\mathbb{F}_q} over {\mathbb{F}_p}. In this way, we can represent a number {x\in\mathbb{F}_q} as

\displaystyle x=\sum\limits_{i=0}^{r-1} c_i z^i \quad \textrm{ with }0\leq c_i<p.

Given a set of digits {\mathcal{D}\subset\{0,\dots, p-1\}} with {2\leq \#\mathcal{D}\leq p-1}, the associated subset of ellipsephic numbers is

\displaystyle W_{\mathcal{D}}:=\left\{\sum\limits_{i=0}^{r-1} c_iz^i: c_i\in\mathcal{D} \,\,\,\forall\, 0\leq i\leq r-1\right\}

Given a polynomial {f(x)\in\mathbb{F}_q[x]}, we can study the set of its ellipsephic values via the set

\displaystyle W_{\mathcal{D}}(f):=\{a\in\mathbb{F}_q: f(a)\in W_{\mathcal{D}}\}.

The size of {W_{\mathcal{D}}(f)} is described by the following theorem:

Theorem 6 (Dartyge–Mauduit–Sárközy) If {\textrm{deg}(f)=n}, then

\displaystyle \left|\#W_{\mathcal{D}}(f)-\#W_{\mathcal{D}}\right|\leq \frac{n-1}{\sqrt{q}}\left(\#\mathcal{D}+p\sqrt{p-\#\mathcal{D}}\right)

This result is especially interesting when {\mathcal{D}} contains a positive proportion of {\mathbb{F}_p}. Moreover, it can be improved when {\mathcal{D}} contains consecutive digits.

More recently, a better result was obtained by R. Dietmann, C. Elsholtz and I. Shparlinski for the case {f(x)=x^2}. Finally, the reader can consult the work of C. Swaenepoel for further results.

Posted by: matheuscmss | December 21, 2019

Breuillard–Sert’s joint spectrum (II)

In the previous post of this series, we gave the statements of some of the results of Breuillard and Sert on the definition and basic properties of the joint spectrum, and we promised to discuss the proofs in subsequent posts.

Today, after a long hiatus, I’ll try to accomplish part of this promise. More precisely, I’ll transcribe below my notes for the two talks (by Rodolfo Gutiérrez-Romo and myself) aiming to explain to the participants of our groupe de travail the proof of the first portion of Theorem 5 in the previous post, i.e., the convergence of {\frac{1}{n}\kappa(S^n)} (cf. Theorem 5 below), the convergence of {\frac{1}{n}\lambda(S^n)} (cf. Theorem 7 below), and the equality of the limits (cf. Theorem 9 below).

Evidently, all mistakes in what follows are my sole responsibility.

1. Spectral radius formula revisited

Let {G} be a reductive linear algebraic group. Recall that the Cartan projection {\kappa:G\rightarrow\mathfrak{a}^+} and Jordan projection {\lambda:G\rightarrow\mathfrak{a}^+} were defined in the previous post via the Cartan decomposition {g\in K\exp(\kappa(g))K} and the Jordan–Chevalley decomposition {g=g_e g_h g_u} with {g_e} elliptic, {g_u} unipotent, and {g_h} “hyperbolic” conjugate to {\exp(\lambda(g))}.

The semisimple rank {d_s} of {G} is {d_s=\textrm{dim}(A)-\textrm{dim}(Z(G))} where {A} is a maximal torus of {G} and {Z(G)} is the center of {G}. We denote by {\overline{\alpha}_i\in\textrm{Hom}(\mathfrak{a},\mathbb{R})}, {1\leq i\leq d:=\textrm{dim}(A)}, a system of roots such that {\Pi:=\{\overline{\alpha}_1,\dots, \overline{\alpha}_{d_s}\}} is a base of simple roots.

Each {\overline{\alpha}\in\Pi} induces a weight {\omega_{\overline{\alpha}}\in\textrm{Hom}(\mathfrak{a},\mathbb{R})} satisfying {\omega_{\overline{\alpha}}|_{\mathfrak{a}_Z}=0} and {\langle\omega_{\overline{\alpha}},\overline{\beta}\rangle=0} for all {\overline{\beta}\in\Pi\setminus\{\overline{\alpha}\}}, where {\mathfrak{a}_Z} is the Lie subalgebra of {A\cap Z(G)} and {\langle.,.\rangle} is a fixed extension of the Killing form on the Lie subalgebra {\mathfrak{a}_S} of {A\cap [G,G]} to the Lie algebra {\mathfrak{a}} of {A} such that {\mathfrak{a}=\mathfrak{a}_S\oplus\mathfrak{a}_Z} becomes an orthogonal decomposition.

The weights {\omega_{\overline{\alpha}_i}}, {1\leq i\leq d_s}, are the highest weights of distinguished representations {\rho_i}, {1\leq i\leq d_s} of {G}. One has

\displaystyle \omega_{\overline{\alpha}_i}(\kappa(g)) = \log \|\rho_i(g)\|_i \textrm{ and } \omega_{\overline{\alpha}_i}(\lambda(g)) = \log |\lambda_1(\rho_i(g))|

where {\|.\|_i} is a choice of {\rho_i(K)}-invariant norm with {\rho_i(A)} diagonalisable in an orthonormal basis and {\rho_i(G)} stable under the adjoint operation, and {\lambda_1(M)} denotes the top eigenvalue of a matrix {M}. In particular, the Cartan projection {\kappa(g)} is represented by a vector of logarithms of norms of matrices, the Jordan projection {\lambda(g)} is represented by a vector of the logarithms of the moduli of top eigenvalues of matrices, and, a fortiori, the usual formula for the spectral radius implies that:

Lemma 1 One has

\displaystyle \lim\limits_{n\rightarrow\infty}\frac{1}{n}\kappa(g^n) = \lambda(g)

for every {g\in G}.
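As a numerical illustration of Lemma 1, here is a minimal sketch in the simplest case of {2\times 2} real matrices, where the first coordinate of {\kappa(g)} is the logarithm of the operator norm and the first coordinate of {\lambda(g)} is the logarithm of the spectral radius. The matrix `g` and all helper names are hypothetical choices made for this example only.

```python
import math

def matmul(a, b):
    """Product of two 2x2 real matrices stored as ((a, b), (c, d))."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def log_norm(m):
    """Logarithm of the operator norm (largest singular value) of a 2x2 real matrix."""
    (a, b), (c, d) = m
    s = a * a + b * b + c * c + d * d          # trace of m^T m
    det = a * d - b * c
    disc = max(s * s - 4.0 * det * det, 0.0)   # guard against rounding errors
    return 0.5 * math.log((s + math.sqrt(disc)) / 2.0)

def log_spectral_radius(m):
    """Logarithm of the largest modulus of an eigenvalue of an invertible 2x2 real matrix."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4.0 * det
    if disc >= 0.0:
        r = math.sqrt(disc)
        return math.log(max(abs(tr + r), abs(tr - r)) / 2.0)
    return 0.5 * math.log(det)                 # complex pair: |lambda|^2 = det > 0

def normalized_log_norm(g, n):
    """(1/n) log ||g^n||, computed by repeated multiplication."""
    power = g
    for _ in range(n - 1):
        power = matmul(power, g)
    return log_norm(power) / n

# Hypothetical non-normal example with eigenvalues 2 and 1.
g = ((2.0, 5.0), (0.0, 1.0))
for n in (1, 10, 100, 200):
    print(n, normalized_log_norm(g, n), log_spectral_radius(g))
```

Since {g} is not normal, {\log\|g\|} is strictly larger than the logarithm of the spectral radius, and the gap only closes at speed {O(1/n)}. The exponent {n} is kept moderate because the sketch uses plain floating-point numbers.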

2. Proximal elements and Cartan projections

As we indicated in § 2.1 of the previous post of this series, the convergence of {\frac{1}{n}\kappa(S^n)} relies on the notion of proximal matrices.

Definition 2 Let {d([x],[y]):=\frac{\|x\wedge y\|}{\|x\|\cdot\|y\|}} be the Fubini–Study metric on the projective space {\mathbb{P}(V)} of a finite-dimensional real vector space {V} equipped with a Euclidean norm {\|.\|}. Given {0<\varepsilon\leq r}, we say that {g\in GL(V)} is an {(r,\varepsilon)}-proximal matrix whenever:

  • {g} has a unique eigenvalue of maximal modulus with eigendirection {v_g^+\in\mathbb{P}(V)} and {g}-invariant supplementary hyperplane {H_g^{<}\subset\mathbb{P}(V)};
  • {d(v_g^+, H_g^{<})\geq 2r};
  • {d(gx, gy)\leq \varepsilon d(x,y)} for all {x, y\in\mathbb{P}(V)} with {d(x,H_g^{<})\geq \varepsilon} and {d(y,H_g^{<})\geq \varepsilon}.

In general, we say that an element {g\in G} of a reductive linear algebraic group {G} with distinguished representations {\rho_i}, {1\leq i\leq d_s}, is {(G,r,\varepsilon)}-proximal when the matrices {\rho_i(g)} are {(r,\varepsilon)}-proximal for all {1\leq i\leq d_s}.
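To make the quantities in Definition 2 concrete, the following sketch computes {d([x],[y])} in {\mathbb{R}^3} (where {x\wedge y} can be identified with the cross product) and the distance from a point to a projective hyperplane. The diagonal matrix, the sample points and the helper names are assumptions made for illustration. The check at the end concerns a single pair of points, so it illustrates the contraction condition without verifying proximality.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def cross(x, y):
    """In R^3 the wedge product x ^ y is identified with the cross product."""
    return (
        x[1] * y[2] - x[2] * y[1],
        x[2] * y[0] - x[0] * y[2],
        x[0] * y[1] - x[1] * y[0],
    )

def dist(x, y):
    """d([x],[y]) = ||x ^ y|| / (||x|| ||y||): the sine of the angle between two lines."""
    return norm(cross(x, y)) / (norm(x) * norm(y))

def dist_to_hyperplane(x, normal):
    """Distance from [x] to the projective hyperplane orthogonal to `normal`."""
    return abs(sum(a * b for a, b in zip(x, normal))) / (norm(x) * norm(normal))

def apply_diag(diag, x):
    """Action of a diagonal matrix on a vector."""
    return tuple(d * c for d, c in zip(diag, x))

# Hypothetical example: g = diag(100, 1, 1), so v_g^+ = [e_1] and H_g^< = {x_1 = 0}.
g = (100.0, 1.0, 1.0)
v_plus = (1.0, 0.0, 0.0)
h_normal = (1.0, 0.0, 0.0)

x = (1.0, 0.5, 0.0)
y = (1.0, 0.0, 0.5)

print("d(v+, H<) =", dist_to_hyperplane(v_plus, h_normal))
print("d(x, H<)  =", dist_to_hyperplane(x, h_normal))
print("d(x, y)   =", dist(x, y))
print("d(gx, gy) =", dist(apply_diag(g, x), apply_diag(g, y)))
```

For this pair, both points are far from {H_g^<} and their images under {g} are much closer to each other than the points themselves, which is the behaviour required in the third item of the definition.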

A basic feature of proximal elements is the fact that their Cartan and Jordan projections are comparable (cf. Lemmas 2.15 and 2.16 of the Breuillard–Sert paper, extracted from Benoist’s paper).

Lemma 3 There is a constant {C_G>0} such that

\displaystyle \|\kappa(h_1\dots h_n)\|\leq C_G(\|\kappa(h_1)\|+\dots+\|\kappa(h_n)\|)

and

\displaystyle \|\kappa(h_1 g h_2) - \kappa(g)\|\leq C_G(\|\kappa(h_1)\|+\|\kappa(h_2)\|)

for all {g,h_1,h_2,\dots, h_n\in G}. For each {r>0}, there is a constant {C_r>0} such that the Cartan and Jordan projections of any {(G,r,\varepsilon)}-proximal element {g\in G} satisfy

\displaystyle \|\kappa(g)-\lambda(g)\|\leq C_r.

Another crucial feature of proximal elements (discovered by Abels, Margulis and Soifer, see Theorem 4.1 of their paper) is their ubiquity in Zariski dense monoids:

Theorem 4 (Abels–Margulis–Soifer) Let {G} be a connected, reductive, real Lie group. Suppose that {\Gamma\subset G} is a Zariski dense monoid. Then, there exists {r=r(\Gamma)>0} such that, for all {0<\varepsilon\leq r}, there exists a finite subset {F=F(\Gamma, r,\varepsilon)\subset \Gamma} with the property: given {g\in G}, there exists {f\in F} so that {gf} is {(G,r,\varepsilon)}-proximal.

At this point, we are ready to prove the convergence of Cartan projections:

Theorem 5 Let {G} be a connected reductive real Lie group and suppose that {S\subset G} is a compact subset generating a Zariski dense subgroup. Then,

\displaystyle \frac{1}{n}\kappa(S^n)

converges in the Hausdorff topology as {n\rightarrow\infty}.

Proof: By Lemma 2 of the previous post, our task reduces to showing that {\frac{1}{m}\kappa(S^m)}, {m\in\mathbb{N}}, stays in a compact region of {\mathfrak{a}}, and that, for each {\delta>0}, there exists {n_0\in\mathbb{N}} such that {\limsup\limits_{m\rightarrow\infty}d(x,\frac{1}{m}\kappa(S^m))\leq\delta} for all {x\in\frac{1}{n}\kappa(S^n)} and {n\geq n_0}.

By Lemma 3, there exists a constant {C_G>0} such that

\displaystyle \frac{1}{m}\|\kappa(g)\|\leq C_G\sup\limits_{s\in S}\|\kappa(s)\|=:C_{G,S}

for all {g\in S^m}. It follows that {\frac{1}{m}\kappa(S^m)\subset B(0,C_{G,S})} for all {m\in\mathbb{N}}, that is, {\frac{1}{m}\kappa(S^m)} is confined in a compact region of {\mathfrak{a}}.

Let us now estimate {d(x,\frac{1}{m}\kappa(S^m))} for {x\in\frac{1}{n}\kappa(S^n)}, say {x=\frac{1}{n}\kappa(g)} with {g\in S^n}. By the Abels–Margulis–Soifer theorem (Theorem 4), we can select a finite subset {F} of the monoid generated by {S} such that for each {h\in G}, there exists {a\in F} so that {ha} is {(G,r,\varepsilon)}-proximal. In particular, we can take {f\in F} such that {(gf)^k} is {(G,r,\varepsilon)}-proximal for all {k\in\mathbb{N}}. By Lemma 3, we have

\displaystyle \|k\lambda(gf)-\kappa((gf)^k)\| = \|\lambda((gf)^k)-\kappa((gf)^k)\|\leq C_r \quad \forall \, \, k\geq 1

and

\displaystyle \|\kappa(gf)-\kappa(g)\|\leq C_G\|\kappa(f)\|.

Since {\lambda(gf)=\frac{1}{k}\lambda((gf)^k)}, it follows from the triangle inequality that

\displaystyle \begin{array}{rcl} \|\kappa(g)-\frac{1}{k}\kappa((gf)^k)\|&\leq& \|\kappa(g)-\kappa(gf)\|+\|\kappa(gf)-\lambda(gf)\|+\|\frac{1}{k}\lambda((gf)^k)-\frac{1}{k}\kappa((gf)^k)\| \\ &\leq& C_G\|\kappa(f)\|+C_r+\frac{1}{k}C_r = C_G\|\kappa(f)\|+\frac{k+1}{k}C_r. \end{array}

Therefore, if we fix {h_0\in S} and we write {f\in S^{n_f}}, {n_f\in\mathbb{N}}, we can use the Euclidean division {m=k(n+n_f)+j}, {0\leq j<n+n_f} to obtain an element of {S^m} via the formula

\displaystyle g_m:=h_0^j(gf)^k.

It follows from the definitions and Lemma 3 that

\displaystyle \begin{array}{rcl} d(x,\frac{1}{m}\kappa(S^m))&\leq& \|x-\frac{1}{m}\kappa(g_m)\|=\|\frac{1}{n}\kappa(g)-\frac{1}{m}\kappa(g_m)\| \\ &\leq& \frac{1}{n}\|\kappa(g)-\frac{1}{k}\kappa((gf)^k)\|+\|\frac{1}{nk}\kappa((gf)^k)-\frac{1}{m}\kappa(g_m)\| \\ &\leq& \frac{1}{n}\left(C_G\|\kappa(f)\|+\frac{(k+1)C_r}{k}\right)+\left|\frac{1}{nk}-\frac{1}{m}\right|\|\kappa((gf)^k)\|+\frac{1}{m}\|\kappa((gf)^k)-\kappa(g_m)\| \\ &\leq& \frac{1}{n}\left(C_G\|\kappa(f)\|+\frac{(k+1)C_r}{k}\right)+C_G\left(k(n+n_f)\left|\frac{1}{nk}-\frac{1}{m}\right|+\frac{j}{m}\right) \sup_{s\in S}\|\kappa(s)\|. \end{array}


Since

\displaystyle k(n+n_f)\left|\frac{1}{nk}-\frac{1}{m}\right| = \left|\frac{n+n_f}{n}-\frac{k(n+n_f)}{m}\right|=\frac{n_f}{n}+\frac{j}{m},

by taking {m\rightarrow\infty} (or equivalently {k\rightarrow\infty}) we derive that

\displaystyle \limsup\limits_{m\rightarrow\infty}d(x,\frac{1}{m}\kappa(S^m))\leq \frac{1}{n}\left(C_G\sup\limits_{f\in F}\|\kappa(f)\|+C_r\right)+\frac{1}{n}\left(C_G\sup\limits_{f\in F}n_f \sup_{s\in S}\|\kappa(s)\|\right).

Hence, given {\delta>0}, there exists {n_0\in\mathbb{N}} such that

\displaystyle \limsup\limits_{m\rightarrow\infty}d(x,\frac{1}{m}\kappa(S^m))\leq \delta

for all {x\in\frac{1}{n}\kappa(S^n)}, {n\geq n_0}. This completes the proof. \Box
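The bookkeeping in this proof (the Euclidean division {m=k(n+n_f)+j} and the identity {k(n+n_f)\left|\frac{1}{nk}-\frac{1}{m}\right|=\frac{n_f}{n}+\frac{j}{m}}) can be checked with exact rational arithmetic. The sketch below does so for some hypothetical word lengths; the function names are illustrative only.

```python
from fractions import Fraction

def split_length(n, n_f, m):
    """Euclidean division m = k (n + n_f) + j with 0 <= j < n + n_f."""
    k, j = divmod(m, n + n_f)
    return k, j

def check_identity(n, n_f, m):
    """Exact check of k(n+n_f) |1/(nk) - 1/m| = n_f/n + j/m (requires k >= 1)."""
    k, j = split_length(n, n_f, m)
    if k < 1:
        raise ValueError("m must be at least n + n_f")
    lhs = k * (n + n_f) * abs(Fraction(1, n * k) - Fraction(1, m))
    rhs = Fraction(n_f, n) + Fraction(j, m)
    return lhs == rhs

# Hypothetical sizes: g in S^n with n = 7 and f in S^{n_f} with n_f = 3.
n, n_f = 7, 3
for m in (10, 57, 1000, 10**6 + 1):
    k, j = split_length(n, n_f, m)
    # The word h_0^j (g f)^k has length j + k (n + n_f) = m.
    print(m, k, j, j + k * (n + n_f) == m, check_identity(n, n_f, m))
```

As {m\rightarrow\infty} with {n} and {n_f} fixed, the term {\frac{j}{m}} tends to zero, so only {\frac{n_f}{n}} survives in the limit superior.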

3. Twisting and Jordan projections

A Zariski dense monoid of matrices is twisting in the sense that it always contains an element putting a finite configuration of lines and hyperplanes in general position:

Lemma 6 Let {G} be a connected, reductive Lie group and suppose that {\Gamma\subset G} is a Zariski dense monoid. Given a finite collection {(\rho_i, V_i)}, {1\leq i\leq D}, of irreducible representations of {G} and finite configurations {v_i^j} and {H_i^j}, {1\leq j\leq t}, of points and hyperplanes in {\mathbb{P}(V_i)}, there is an element {\gamma\in \Gamma} such that

\displaystyle \rho_i(\gamma)v_i^j\notin H_i^j

for all {1\leq i\leq D} and {1\leq j\leq t}.

Proof: Since {(\rho_i, V_i)} are irreducible, the sets {\{g\in G:\rho_i(g)v_i^j\notin H_i^j\}} are non-empty and Zariski open in {G}. Thus,

\displaystyle \bigcap\limits_{\substack{1\leq i\leq D \\ 1\leq j\leq t}} \{g\in G:\rho_i(g)v_i^j\notin H_i^j\}

is Zariski open in {G} and non-empty (because {G} is connected). Since {\Gamma} is Zariski dense,

\displaystyle \Gamma\cap \bigcap\limits_{\substack{1\leq i\leq D \\ 1\leq j\leq t}} \{g\in G:\rho_i(g)v_i^j\notin H_i^j\}\neq\emptyset.

This completes the argument. \Box

Remark 1 The conclusion of this lemma can be reinforced as follows (cf. Remark 2.22 of Breuillard–Sert paper): it is possible to select {\gamma} from a finite subset of {\Gamma} depending only on {D} and {t} (but not on {v_i^j} and {H_i^j}).

At this stage, we can start the discussion of the convergence of Jordan projections:

Theorem 7 Let {G} be a connected reductive real Lie group and suppose that {S\subset G} is a compact subset generating a Zariski dense subgroup. Then,

\displaystyle \frac{1}{n}\lambda(S^n)

converges in the Hausdorff topology as {n\rightarrow\infty}.

Proof: Similarly to the previous section (on convergence of Cartan projections), our task consists in showing that for each {\delta>0}, there exists {n_0\in\mathbb{N}} such that

\displaystyle \limsup\limits_{m\rightarrow\infty} d(x,\frac{1}{m}\lambda(S^m))\leq \delta

for all {x\in\frac{1}{n}\lambda(S^n)}, {n\geq n_0}. In this direction, let us fix {\delta>0} and let us take {x=\frac{1}{n}\lambda(g)}, {g\in S^n}. By the formula for the spectral radius (cf. Lemma 1), we can fix {\ell\in\mathbb{N}} with

\displaystyle \|\frac{1}{\ell}\kappa(g^{\ell})-\lambda(g)\|<\delta.

By the Abels–Margulis–Soifer theorem (Theorem 4), we can fix a finite subset {F} of the monoid {\Gamma} generated by {S} such that for some {f\in F}, we have that {g^{\ell} f} is {(G,r,\varepsilon)}-proximal.

By Lemma 3,

\displaystyle \|\kappa(g^{\ell}f)-\kappa(g^{\ell})\|\leq C_G\|\kappa(f)\|\leq C_G n_f\sup\limits_{s\in S}\|\kappa(s)\|

where {f\in S^{n_f}}, and

\displaystyle \|\lambda((g^{\ell}f)^k)-\kappa((g^{\ell}f)^k)\|\leq C_r

for all {k\geq 1}.

Consider the distinguished representations {(\rho_i, V_i)}, {1\leq i\leq d_s}, of {G}. Note that the dominant eigendirection {v_i^+} and the dominated hyperplane {H_i^<} for the actions of the proximal matrices {\rho_i((g^{\ell}f)^k)} on {\mathbb{P}(V_i)} are the same for all {k\geq 1}.

We fix {h_0\in S}. By the twisting property in Lemma 6, there exists {\gamma\in\Gamma}, say {\gamma\in S^{n_{\gamma}}}, such that

\displaystyle \rho_i(\gamma)\rho_i(h_0^j)v_i^+\notin H_i^<

for all {1\leq i\leq d_s} and {0\leq j<n\ell+n_f}.

The dynamics of projective actions of the iterates of a proximal matrix {a} is easy to describe: any direction transverse to {H_a^<} is attracted towards {v_a^+}. By rendering this argument slightly more quantitative (with the aid of the so-called Tits proximality criterion), Breuillard and Sert proved in Lemma 3.6 of their paper that

Lemma 8 If {a\in G} is {(G,r,\varepsilon)}-proximal and {T\subset G} is a finite subset such that

\displaystyle \rho_i(t)v_{\rho_i(a)}^+\notin H_{\rho_i(a)}^<

for all {t\in T} and {1\leq i\leq d_s}, then there exists {\widehat{r}>0} such that for all {0<\widehat{\varepsilon}<\widehat{r}} and {t\in T}, one has that {ta^k} is {(G,\widehat{r},\widehat{\varepsilon})}-proximal for all {k\geq k_0=k_0(\widehat{\varepsilon})}.
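The attraction towards {v_a^+} described before Lemma 8 is easy to observe numerically. The sketch below iterates a hypothetical {2\times 2} proximal matrix `a` on the projective line (where a "hyperplane" is just a point) and records the distance {d(a^k x, v_a^+)} for a starting direction {x} transverse to {H_a^<}. It only illustrates the qualitative dynamics, not the quantitative statement of Lemma 8.

```python
import math

def apply(a, x):
    """Action of a 2x2 matrix on a vector of R^2."""
    return (a[0][0] * x[0] + a[0][1] * x[1], a[1][0] * x[0] + a[1][1] * x[1])

def dist(x, y):
    """d([x],[y]) = |x ^ y| / (||x|| ||y||) on the projective line."""
    wedge = abs(x[0] * y[1] - x[1] * y[0])
    return wedge / (math.hypot(*x) * math.hypot(*y))

# Hypothetical proximal matrix with eigenvalues 2 and 1.
a = ((2.0, 1.0), (0.0, 1.0))
v_plus = (1.0, 0.0)     # eigendirection of the top eigenvalue 2
h_minus = (1.0, -1.0)   # eigendirection of the eigenvalue 1, playing the role of H_a^<

def orbit_distances(x, steps):
    """Distances d(a^k x, v_plus) for k = 1, ..., steps."""
    out = []
    for _ in range(steps):
        x = apply(a, x)
        length = math.hypot(*x)
        x = (x[0] / length, x[1] / length)   # renormalise: only the direction matters
        out.append(dist(x, v_plus))
    return out

distances = orbit_distances((0.0, 1.0), 20)
print(distances[0], distances[9], distances[19])
```

Here {a^k(0,1)=(2^k-1,1)}, so the distance to {v_a^+} decays like {2^{-k}}, i.e., at the rate given by the ratio between the two eigenvalues.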

By applying this lemma with {T:=\{\gamma h_0^j: 0\leq j<n\ell+n_f\}}, we can select {0<\widehat{\varepsilon}<\widehat{r}} such that {\gamma h_0^j(g^{\ell}f)^k} is {(G,\widehat{r},\widehat{\varepsilon})}-proximal for all {1\leq i\leq d_s}, {0\leq j<n\ell+n_f} and {k\geq k_0(\widehat{\varepsilon})}.

Once again, it follows from Lemma 3 that

\displaystyle \|\kappa(\gamma h_0^j(g^{\ell}f)^k)-\kappa((g^{\ell}f)^k)\|\leq C_G\sup\limits_{t\in T}\|\kappa(t)\|

and

\displaystyle \|\kappa(\gamma h_0^j(g^{\ell}f)^k)-\lambda(\gamma h_0^j(g^{\ell}f)^k)\|\leq C_{\widehat{r}}

for all {0\leq j<n\ell+n_f} and {k\geq k_0(\widehat{\varepsilon})}.

By Euclidean division, we can write {m-n_{\gamma}=k(n\ell+n_f)+j} with {0\leq j<n\ell+n_f} and define

\displaystyle g_m:= \gamma h_0^j (g^{\ell}f)^k\in S^m.

From our discussion above, we derive that

\displaystyle \begin{array}{rcl} \|x-\frac{1}{m}\lambda(g_m)\| &\leq& \frac{1}{n}\|\lambda(g)-\frac{1}{\ell}\kappa(g^{\ell})\| + \|\frac{1}{n\ell}\kappa(g^{\ell})-\frac{1}{m}\lambda(g_m)\| \\ &\leq& \frac{\delta}{n}+ \frac{1}{n\ell}\|\kappa(g^{\ell})-\kappa(g^{\ell}f)\| + \|\frac{1}{n\ell}\kappa(g^{\ell}f)-\frac{1}{m}\lambda(g_m)\| \\ &\leq& \frac{\delta}{n}+C_G \frac{n_f}{n\ell}\sup\limits_{s\in S}\|\kappa(s)\|+\|\frac{1}{n\ell}\kappa(g^{\ell}f)-\frac{1}{m}\lambda(g_m)\| \\ &\leq& \frac{\delta}{n}+C_G \frac{n_f}{n\ell}\sup\limits_{s\in S}\|\kappa(s)\|+\frac{C_r}{n\ell}+ \|\frac{1}{n\ell}\lambda(g^{\ell}f)-\frac{1}{m}\lambda(g_m)\| \\ &=& \frac{\delta}{n}+C_G \frac{n_f}{n\ell}\sup\limits_{s\in S}\|\kappa(s)\|+\frac{C_r}{n\ell}+ \|\frac{1}{kn\ell}\lambda((g^{\ell}f)^k)-\frac{1}{m}\lambda(g_m)\| \\ &\leq& \frac{\delta}{n}+C_G \frac{n_f}{n\ell}\sup\limits_{s\in S}\|\kappa(s)\|+\frac{C_r}{n\ell}+\frac{C_r}{kn\ell}+ \|\frac{1}{kn\ell}\kappa((g^{\ell}f)^k)-\frac{1}{m}\lambda(g_m)\| \\ &\leq& \frac{\delta}{n}+C_G \frac{n_f}{n\ell}\sup\limits_{s\in S}\|\kappa(s)\|+\frac{C_r}{n\ell}+\frac{C_r+C_G\sup\limits_{t\in T}\|\kappa(t)\|}{kn\ell} +\|\frac{1}{kn\ell}\kappa(g_m)-\frac{1}{m}\lambda(g_m)\| \\ &\leq& \frac{\delta}{n}+C_G \frac{n_f}{n\ell}\sup\limits_{s\in S}\|\kappa(s)\|+\frac{C_r}{n\ell}+\frac{C_r+C_G\sup\limits_{t\in T}\|\kappa(t)\|+C_{\widehat{r}}}{kn\ell} +\|\frac{1}{kn\ell}\lambda(g_m)-\frac{1}{m}\lambda(g_m)\|. \end{array}

Since {\sup\limits_{t\in T}\|\kappa(t)\|\leq C_G(n\ell+n_f+n_{\gamma})\sup\limits_{s\in S}\|\kappa(s)\|} and

\displaystyle \frac{1}{kn\ell}-\frac{1}{m} = \frac{kn_f+j+n_{\gamma}}{mkn\ell},

by letting {m\rightarrow\infty} (or equivalently, {k\rightarrow\infty}) we conclude that

\displaystyle \limsup\limits_{m\rightarrow\infty} d(x,\frac{1}{m}\lambda(S^m))\leq \frac{\delta}{n}+C_G \frac{n_f}{n\ell}\sup\limits_{s\in S}\|\kappa(s)\|+\frac{C_r}{n\ell}+C_G \frac{n_f}{n\ell}\sup\limits_{s\in S}\|\kappa(s)\|

for all {x\in\frac{1}{n}\lambda(S^n)}. This completes the proof. \Box

4. Coincidence of the limits

Let {G} be a connected, reductive real Lie group and let {S\subset G} be a compact subset generating a Zariski dense monoid. By Theorems 5 and 7, we have that

\displaystyle \frac{1}{n}\kappa(S^n)\rightarrow J_{Cartan} \quad \textrm{and} \quad \frac{1}{n}\lambda(S^n)\rightarrow J_{Jordan}

as {n\rightarrow\infty}.

Theorem 9 We have {J_{Cartan}=J_{Jordan}}.

Proof: By the formula for the spectral radius (cf. Lemma 1), for all {g\in G}, one has {\frac{1}{n}\kappa(g^n)\rightarrow \lambda(g)} as {n\rightarrow\infty}. In particular, {J_{Jordan}\subset J_{Cartan}}.

In order to derive the other inclusion, we recall that the proof of Theorem 5 about the convergence of Cartan projections revealed that there exist {i_0=i_0(S)\in\mathbb{N}} and a constant {C_S>0} such that for all {n\in\mathbb{N}} and {g\in S^n}, there exists {f\in S^i} with {i\leq i_0} and

\displaystyle \|\kappa(g)-\lambda(gf)\|\leq C_S.


Therefore,

\displaystyle \begin{array}{rcl} \|\frac{1}{n}\kappa(g)-\frac{1}{n+i}\lambda(gf)\|&\leq& \|\frac{1}{n}\kappa(g)-\frac{1}{n}\lambda(gf)\|+\left|\frac{1}{n}-\frac{1}{n+i}\right|\|\lambda(gf)\| \\ &\leq& \frac{1}{n}C_S + \frac{i_0}{n}\frac{\|\lambda(gf)\|}{n+i}. \end{array}

Since {gf\in S^{n+i}}, we have that {\lambda(gf)/(n+i)} is bounded and, a fortiori,

\displaystyle \|\frac{1}{n}\kappa(g)-\frac{1}{n+i}\lambda(gf)\|\rightarrow 0

as {n\rightarrow\infty}. This shows that {J_{Cartan}\subset J_{Jordan}}, as desired. \Box

5. Realization of the joint spectrum by sequences

Closing this post, let us further discuss Theorem 5 from the previous post by showing that {x=\lim\limits_{n\rightarrow\infty}\frac{1}{n}\kappa(a_n)} with {a_n\in S^n} is realized by a single sequence {b=(b_1,b_2,\dots)\in S^{\mathbb{N}}} in the sense that {x=\lim\limits_{n\rightarrow\infty}\frac{1}{n}\kappa(b_1\dots b_n)}.

To this end, we use the Abels–Margulis–Soifer theorem (Theorem 4) and the strong version in Remark 1 of the twisting property in Lemma 6 to select a finite subset {F} of {\Gamma} and some constants {0<\varepsilon<r} such that for each {n\in\mathbb{N}} there are {f_n, \gamma_n\in F} with the property that the elements {g_n:=\gamma_n a_n f_n} form a Schottky family in the sense that each {g_n} is {(G,r,\varepsilon)}-proximal, and {d(v_{g_n}^+,H_{g_{n+1}}^<)\geq 6r} and {d(v_{g_1}^+,H_{g_{n}}^<)\geq 6r} for all {n\in\mathbb{N}}. (This nomenclature comes from the fact that the projective actions of the elements in a Schottky family resemble the classical Schottky groups.) Note that {g_n\in S^{|g_n|}} where {|g_n|=n+O(1)}.

Let us now choose a rapidly increasing sequence {(\ell_n)_{n\in\mathbb{N}}} so that

\displaystyle \sum\limits_{i=1}^{n-1} i\ell_i=o(\ell_n)

as {n\rightarrow\infty}, and we define {(b_1,b_2,\dots)\in S^{\mathbb{N}}} by

\displaystyle b_1 b_2\dots b_k\dots:=g_1^{\ell_1} g_2^{\ell_2}\dots g_n^{\ell_n}\dots
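For concreteness, one admissible choice of a rapidly increasing sequence is {\ell_n=(n!)^2}. This particular choice is an assumption made here for illustration (the argument only needs {\sum_{i=1}^{n-1} i\ell_i=o(\ell_n)}). The sketch below computes the ratio {\frac{1}{\ell_n}\sum_{i=1}^{n-1} i\ell_i} exactly and shows that it decays roughly like {1/n}.

```python
from fractions import Fraction
from math import factorial

def ell(n):
    """Hypothetical choice of the sequence: ell_n = (n!)^2."""
    return factorial(n) ** 2

def ratio(n):
    """Exact value of (sum_{i=1}^{n-1} i * ell_i) / ell_n."""
    return Fraction(sum(i * ell(i) for i in range(1, n)), ell(n))

for n in (2, 5, 10, 50, 200):
    print(n, float(ratio(n)))
```

With this choice the ratios satisfy the recursion {r_{n+1}=(r_n+n)/(n+1)^2}, which explains the decay. Any faster growing sequence, such as {\ell_n=2^{n^2}}, would also work.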

By definition, any finite word {b_1\dots b_k} has the form {g_1^{\ell_1} g_2^{\ell_2}\dots g_n^{\ell_n} g_{n+1}^{\ell}\overline{g}} where {\ell\leq\ell_{n+1}} and {\overline{g}} is a prefix of {g_{n+1}}. Observe that

\displaystyle \begin{array}{rcl} k&=&\sum\limits_{i=1}^n |g_i|\ell_i + |g_{n+1}|\ell+|\overline{g}| = |g_n|\ell_n+|g_{n+1}|\ell+o(\ell_n) \\ &=& n\ell_n + (n+1)\ell+O(1)(\ell_n+\ell) \end{array}

By Lemma 3, {\|\kappa(\overline{g})\|=O(n)}. Moreover, Lemma 2.17 in the Breuillard–Sert paper ensures that, thanks to the Schottky property of the family {(g_n)_{n\in\mathbb{N}}}, the element {g_1^{\ell_1}\dots g_n^{\ell_n} g_{n+1}^{\ell}} is {(G,2r,2\varepsilon)}-proximal with

\displaystyle \|\lambda(g_1^{\ell_1}\dots g_n^{\ell_n} g_{n+1}^{\ell})-\sum\limits_{i=1}^n\ell_i\lambda(g_i)-\ell\lambda(g_{n+1})\|=O(n).

Therefore, it follows from Lemma 3 that

\displaystyle \begin{array}{rcl} \kappa(b_1\dots b_k) &=& \kappa(g_1^{\ell_1} g_2^{\ell_2}\dots g_n^{\ell_n} g_{n+1}^{\ell}\overline{g}) = \kappa(g_1^{\ell_1} g_2^{\ell_2}\dots g_n^{\ell_n} g_{n+1}^{\ell})+O(n) \\ &=& \lambda(g_1^{\ell_1} g_2^{\ell_2}\dots g_n^{\ell_n} g_{n+1}^{\ell})+O(n) = \sum\limits_{i=1}^n\ell_i\lambda(g_i)+\ell\lambda(g_{n+1})+O(n) \\ &=& \sum\limits_{i=1}^n\ell_i\kappa(g_i)+O(1)\sum\limits_{i=1}^n\ell_i+\ell\kappa(g_{n+1})+O(1)\ell+O(n) \\ &=& \sum\limits_{i=1}^{n-1}(i+O(1))\ell_i+\ell_n\kappa(a_n)+O(1)\ell_n+\ell\kappa(a_{n+1})+O(1)\ell+O(n) \\ &=& n\ell_n\frac{1}{n}\kappa(a_n)+(n+1)\ell\frac{1}{n+1}\kappa(a_{n+1})+O(1)(\ell_n+\ell). \end{array}

Since {\frac{1}{n}\kappa(a_n)} converges to {x}, we conclude that {\frac{1}{k}\kappa(b_1\dots b_k)} converges to {x} as {k\rightarrow\infty}.

Yuri Lima asked me to announce in this blog the following opportunity for a post-doctoral position in Dynamical Systems and Ergodic Theory at Universidade Federal do Ceará (located in Fortaleza, Brazil):

Serrapilheira Postdoctoral Fellowship – UFC

The Department of Mathematics at Universidade Federal do Ceará (UFC) invites applications for a Serrapilheira Postdoctoral Fellowship in dynamical systems and ergodic theory. The position is for one year with start date at any moment between March 2020 and September 2020, with possibility of extension for another year.

Qualifications and expectations

The position is part of the project “Jangada Dinâmica – boosting dynamical systems in Brazil’s Northeastern region”, which is funded by Instituto Serrapilheira and aims to boost dynamical systems and ergodic theory in the mathematical community of universities located in the Northeastern region of Brazil. The applicant must have completed a PhD and be qualified for conducting research in either dynamical systems and/or ergodic theory. There are NO teaching duties. As part of the program, and to foster interaction, the fellow shall visit another department of Mathematics in the Northeast for one month each semester or two months per year. Applications from underrepresented groups in Mathematics are highly encouraged.


The salary will range from 5000–6000 Brazilian Reais monthly, tax free, in a twelve month-base calendar, according to the applicant’s qualifications. There will be an extra 5000 Brazilian Reais for each of the two months of visits to another institution in the Northeast. The salary is more attractive than those offered by regular Brazilian funding agencies.

Department of Mathematics at UFC

The Department of Mathematics at UFC currently holds the highest rank among Brazilian Mathematics departments. Having a strong history in the field of differential geometry, during the last 15 years it has developed new research groups in analysis, graph optimization and, more recently, in dynamical systems. Currently, the group of dynamical systems has two members, with expertise on nonuniform hyperbolicity, partial hyperbolicity, and symbolic dynamics.


UFC is located in the city of Fortaleza, which has approximately 2.5 million inhabitants and is the fifth largest city of Brazil. Located in the Northeastern region of Brazil, Fortaleza is becoming a common port of entry to the country, with many direct flights to the US and Europe. Long known as a tourist destination, it is close to beaches with warm water and white sand dunes, and its cost of living is lower than in bigger cities like Rio de Janeiro and São Paulo, thus making the monthly stipend affordable.

Documentation required

– CV with publication list

– Research statement

– Two (or more) letters of recommendation.

All documents must be sent to The applicant must send the first two documents, and ask two (or more) professors to directly send their letters of recommendation.

Deadline: December 31, 2019.

More information:

In this previous post here (from 2018), I described some “back of the envelope calculations” (based on private conversations with Scott Wolpert) indicating that some sectional curvatures of the Weil–Petersson (WP) metric could be at least exponentially small in terms of the distance to the boundary divisor of Deligne–Mumford compactification.

Very roughly speaking, this heuristic computation went as follows: the WP sectional curvature of any {2}-plane can be written as the sum of three terms; for the {2}-planes considered in the previous post, the main term among those three seemed to be a kind of {L^4}-norm of Beltrami differentials with essentially disjoint supports; finally, this {L^4}-type norm was shown to be really small once a certain Green propagator is ignored.

Last April 2019, I met Scott during an event at Simons Center for Geometry and Physics, and I took the opportunity to tell him that one could perhaps show that the measure of the set of {2}-planes leading to tiny WP curvatures is very small using the real-analyticity of the WP metric.

More concretely, my idea was very simple: since the Grassmannian {G} of {2}-planes in the tangent space at a point {p} is a compact space, the WP sectional curvature defines a real-analytic function {c:G\rightarrow(-\infty, 0)}, and we have good upper bounds for {|c|} and all of its derivatives in terms of the distance of {p} to the boundary (see this article here), we can hope to get reasonable estimates for the measure of the sets {\{P\in G: |c(P)|\leq\varepsilon\}} using the techniques of these articles here and here (which are close in spirit to the classical fact [explained in Lemma 3.2 of the Kleinbock–Margulis paper, for instance] that the measure of the set {\{|P|\leq\varepsilon\}} is small whenever {P} is a polynomial function on {[0,1]} whose degree and {C^0}-norm are bounded).

As it turns out, Scott thought that this strategy made some sense and, in particular, he promised to use my suggestion as a motivation to review his arguments concerning WP sectional curvatures.

After several email exchanges with Howard Masur and me, Scott announced that there were some mistakes in the construction of tiny WP sectional curvatures: in a nutshell, one should not restrict the analysis to a single “main term” in the formula for WP sectional curvatures as a sum of three expressions, and one cannot ignore the effect of the Green propagator. More importantly, Scott made a detailed study of these mistakes which ultimately led him to establish polynomial upper bounds for WP sectional curvatures at the heart of his newest preprint available here.

In this post, we will follow closely Scott’s preprint in order to give an outline of the proof of a polynomial upper bound for WP sectional curvatures:

Theorem 1 (Wolpert) Given two integers {g\geq 0} and {n\geq 0} with {3g-3+n\geq 1}, there exists a constant {C(g,n)>0} with the following property. If {\sigma(X)} denotes the product of the lengths of the short geodesics of a hyperbolic surface {X} of genus {g} with {n} cusps whose systole is sufficiently small, then the sectional curvatures of the Weil–Petersson metric at {X} are at most

\displaystyle -C(g,n)\cdot\sigma(X)^7

Remark 1 As it was pointed out by Scott in his preprint, it is likely that this estimate is not optimal: indeed, one expects that the best exponent should be {3} rather than {7}.

In what follows, we’ll assume some familiarity with some basic aspects of the geometry of the Weil–Petersson metric (such as those described in these posts here and here).

1. Weil–Petersson sectional curvatures

Let {X} be a hyperbolic surface of genus {g\geq0} with {n\geq0} cusps. If we write {X=\mathbb{H}/\Gamma}, where {\mathbb{H}} is the usual hyperbolic plane and {\Gamma} is a group of isometries of {\mathbb{H}} describing the fundamental group of {X}, then the holomorphic tangent space at {X} to the moduli space {\mathcal{M}_{g,n}} of Riemann surfaces of genus {g} with {n} punctures is naturally identified with the space {B(\Gamma)} of harmonic Beltrami differentials on {X} (and the cotangent space is related to quadratic differentials).

In this setting, the Weil–Petersson metric is the Riemannian metric {ds^2=2\sum g_{\alpha\overline{\beta}} dt_{\alpha}\overline{dt_{\beta}}} induced by the Hermitian inner product

\displaystyle g_{\alpha\overline{\beta}} = \langle\mu_{\alpha},\mu_{\beta}\rangle := \int_X \mu_{\alpha}\overline{\mu_{\beta}} \, dA

where {\mu_{\alpha}, \mu_{\beta}\in B(\Gamma)} and {dA} is the hyperbolic area form on {X}.

Remark 2 Note that {\langle.,.\rangle} is well-defined: if {\mu=\mu(z)\overline{dz}/dz} and {\nu=\nu(z)\overline{dz}/dz} are Beltrami differentials, then {\mu\overline{\nu}} is a function on {X}.

The Riemann tensor of the Weil–Petersson metric was computed by Wolpert in 1986:

\displaystyle R_{\alpha\overline{\beta}\gamma\overline{\delta}} = (\alpha\overline{\beta}, \gamma\overline{\delta}) + (\alpha\overline{\delta}, \gamma\overline{\beta})

where {(a\overline{b},c\overline{d}) := \int_X (\mu_a\overline{\mu_b}) \mathcal{D}(\mu_c\overline{\mu_d})\, dA} and {\mathcal{D}:=-2(\Delta-2)^{-1}} is an operator related to the Laplace–Beltrami operator {\Delta} on {L^2(X)}.

Remark 3 Our choice of notation here differs from Wolpert’s preprint! Indeed, he denotes the Laplace–Beltrami operator by {D} and he writes {\Delta=-2(D-2)^{-1}}.

The Riemann tensor gives access to nice formulas for the sectional curvatures thanks to the work of Bochner. More concretely, given {v_1} and {v_2} spanning a {2}-plane {P} in the real tangent space to {\mathcal{M}_{g,n}} at {X}, let us take Beltrami differentials {\mu_1} and {\mu_2} such that {v_1=\mu_1+\overline{\mu_2}}, {v_2=\mu_1-\overline{\mu_2}}, and {\{\mu_1,\mu_2\}} is orthonormal. Then, Bochner showed that the sectional curvature of {P} is

\displaystyle K(P)=\frac{R_{1\overline{2}1\overline{2}}-R_{1\overline{2}2\overline{1}}-R_{2\overline{1}1\overline{2}}+R_{2\overline{1}2\overline{1}}}{4g_{1\overline{1}}g_{2\overline{2}}-2|g_{1\overline{2}}|^2-2\textrm{Re}(g_{1\overline{2}})^2} = \frac{R_{1\overline{2}1\overline{2}}-R_{1\overline{2}2\overline{1}}-R_{2\overline{1}1\overline{2}}+R_{2\overline{1}2\overline{1}}}{4}

Hence, by Wolpert’s formula for the Riemann tensor of the WP metric, we see that

\displaystyle K(P) = \frac{2(1\overline{2}, 1\overline{2})-(1\overline{2}, 2\overline{1})-(1\overline{1}, 2\overline{2})-(2\overline{1}, 1\overline{2})-(2\overline{2}, 1\overline{1})+2(2\overline{1}, 2\overline{1})}{4} \ \ \ \ \ (1)

2. Spectral theory of {\mathcal{D}}

Wolpert’s formula for the Riemann tensor of the WP metric hints that the spectral theory of {\mathcal{D}} plays an important role in the study of the WP sectional curvatures.

For this reason, let us review some key properties of {\mathcal{D}} (and we refer to Section 3 of Wolpert’s preprint for more details and references). First, {\mathcal{D}=-2(\Delta-2)^{-1}} is a positive operator on {L^2(X)} whose norm is {1}: these facts follow by integration by parts. Secondly, {\Delta} is essentially self-adjoint on {L^2(X)}, so that {\mathcal{D}} is self-adjoint on {L^2(X)}. Moreover, the maximum principle allows one to show that {\mathcal{D}} is also a positive operator on {C_0(X)} with unit norm. Finally, {\mathcal{D}} has a positive symmetric integral kernel: indeed,

\displaystyle \mathcal{D}f(p) = \int_X G(p,q) f(q) \, dA

where the Green propagator {G} is the Poincaré series

\displaystyle G(p,q)=-2\sum\limits_{\gamma\in\Gamma} Q_1(d(p,\gamma(q)))

associated to an appropriate Legendre function {Q_1}. (Here, {d(.,.)} stands for the hyperbolic distance on {\mathbb{H}}.) For later reference, we recall that {Q_1} has a logarithmic singularity at {0} and {-Q_1(x)\sim e^{-2x}} whenever {x} is large.
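A finite-dimensional toy model may help to keep these properties in mind. In the sketch below, which is only an analogy and not Wolpert's operator, the hyperbolic surface is replaced by the path graph on three vertices, {\Delta} is replaced by {-L} where {L} is the graph Laplacian, and {\mathcal{D}} is replaced by the matrix {2(L+2I)^{-1}}. The explicit matrix `D` is a hand-computed candidate for this inverse, and the code verifies it exactly.

```python
from fractions import Fraction

# Graph Laplacian of the path graph on 3 vertices (positive semi-definite convention),
# so that -L plays the role of the Laplace-Beltrami operator in this toy model.
L = [[1, -1, 0], [-1, 2, -1], [0, -1, 1]]

# Candidate for 2 (L + 2I)^{-1}, the analog of D = -2 (Delta - 2)^{-1}.
D = [[Fraction(c, 15) for c in row] for row in ([11, 3, 1], [3, 9, 3], [1, 3, 11])]

def matmul(a, b):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

shifted = [[L[i][j] + (2 if i == j else 0) for j in range(3)] for i in range(3)]
twice_identity = [[2 if i == j else 0 for j in range(3)] for i in range(3)]

is_inverse = matmul(D, shifted) == twice_identity               # D (L + 2I) = 2I
is_symmetric = all(D[i][j] == D[j][i] for i in range(3) for j in range(3))
has_positive_kernel = all(entry > 0 for row in D for entry in row)
fixes_constants = all(sum(row) == 1 for row in D)               # D applied to 1 gives 1

print(is_inverse, is_symmetric, has_positive_kernel, fixes_constants)
```

Since the eigenvalues of {L} are {0}, {1} and {3}, the eigenvalues of this matrix are {1}, {2/3} and {2/5}: it is positive with unit norm (attained on constants), and its entries play the role of the positive symmetric kernel {G}.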

3. Negativity of the WP sectional curvatures

Interestingly enough, as was first noticed by Wolpert in 1986, the spectral features of {\mathcal{D}} described above are sufficient to derive the negativity of WP sectional curvatures from the Cauchy–Schwarz inequality. More precisely, since {\mathcal{D}} is self-adjoint, i.e.,

\displaystyle (a\overline{b},c\overline{d}) := \int_X (\mu_a\overline{\mu_b}) \, \mathcal{D}(\mu_c\overline{\mu_d}) \, dA = \int_X \mathcal{D}(\mu_a\overline{\mu_b}) \, \mu_c\overline{\mu_d}\,dA

and its integral kernel {G} is a real function, a straightforward computation reveals that the equation (1) for the sectional curvature {K(P)} of a {2}-plane {P} can be rewritten as

\displaystyle \begin{array}{rcl} K(P) &=& \frac{2(1\overline{2}, 1\overline{2})-(1\overline{2}, 2\overline{1})-(1\overline{1}, 2\overline{2})-(2\overline{1}, 1\overline{2})-(2\overline{2}, 1\overline{1})+2(2\overline{1}, 2\overline{1})}{4} \\ &=& \frac{4\textrm{Re}(1\overline{2}, 1\overline{2}) -2(1\overline{2}, 2\overline{1}) -2(1\overline{1}, 2\overline{2})}{4}. \end{array}

If we decompose the function {\mu_1\overline{\mu_2}} into its real and imaginary parts, say {\mu_1\overline{\mu_2} = f+ih}, then we see that

\displaystyle \begin{array}{rcl} \textrm{Re}(1\overline{2}, 1\overline{2}) - (1\overline{2}, 2\overline{1}) &=& \left[\int_X f\,\mathcal{D}f \, dA - \int_X h\, \mathcal{D}h \, dA\right] - \left[\int_X f\,\mathcal{D}f \, dA + \int_X h\, \mathcal{D}h \, dA\right] \\ &=& -2\int_X h\, \mathcal{D}h \, dA. \end{array}

Since {\mathcal{D}} is a positive operator, we conclude that {\textrm{Re}(1\overline{2}, 1\overline{2}) - (1\overline{2}, 2\overline{1})\leq 0} and, a fortiori,

\displaystyle K(P)\leq \frac{\textrm{Re}(1\overline{2}, 1\overline{2})-(1\overline{1}, 2\overline{2})}{2} \ \ \ \ \ (2)
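The algebra leading to (2) uses only that {\mathcal{D}} is real, symmetric and positive, so it can be tested in a discrete toy model. In the sketch below, the matrix `D` (equal to {2(L+2I)^{-1}} for the Laplacian {L} of a path graph on three vertices) and the complex vectors `mu1`, `mu2` are hypothetical stand-ins for {\mathcal{D}} and for Beltrami differentials. The vectors are not normalised, so the number computed is not a curvature: only the identities are being checked.

```python
# Toy replacement for the operator D: real, symmetric, positive definite.
D = [[11 / 15, 3 / 15, 1 / 15], [3 / 15, 9 / 15, 3 / 15], [1 / 15, 3 / 15, 11 / 15]]

def apply_D(v):
    return [sum(D[p][q] * v[q] for q in range(3)) for p in range(3)]

def pairing(u, v):
    """Discrete analog of the integral of u * D(v) over X (counting measure)."""
    Dv = apply_D(v)
    return sum(u[p] * Dv[p] for p in range(3))

def prod(a, b):
    """Pointwise product a * conj(b), the analog of mu_a * conj(mu_b)."""
    return [x * y.conjugate() for x, y in zip(a, b)]

def bracket(a, b, c, d):
    """Analog of the pairing (a b-bar, c d-bar)."""
    return pairing(prod(a, b), prod(c, d))

# Hypothetical data.
mu1 = [1 + 2j, 0.5 - 1j, -1 + 0.5j]
mu2 = [2 - 1j, 1 + 1j, 0.5 + 0.5j]

h = [z.imag for z in prod(mu1, mu2)]   # imaginary part of mu1 * conj(mu2)

# Identity Re(1 2bar, 1 2bar) - (1 2bar, 2 1bar) = -2 * integral of h D h.
lhs = bracket(mu1, mu2, mu1, mu2).real - bracket(mu1, mu2, mu2, mu1)
rhs = -2 * pairing(h, h)

# Formula (1) and its rewritten form.
k_formula_1 = (2 * bracket(mu1, mu2, mu1, mu2) - bracket(mu1, mu2, mu2, mu1)
               - bracket(mu1, mu1, mu2, mu2) - bracket(mu2, mu1, mu1, mu2)
               - bracket(mu2, mu2, mu1, mu1) + 2 * bracket(mu2, mu1, mu2, mu1)) / 4
k_rewritten = (4 * bracket(mu1, mu2, mu1, mu2).real - 2 * bracket(mu1, mu2, mu2, mu1)
               - 2 * bracket(mu1, mu1, mu2, mu2)) / 4

# Right-hand side of the bound (2).
bound = (bracket(mu1, mu2, mu1, mu2).real - bracket(mu1, mu1, mu2, mu2)) / 2

print(abs(lhs - rhs), abs(k_formula_1 - k_rewritten), k_rewritten.real, bound.real)
```

The gap between `k_rewritten` and `bound` is exactly {\int_X h\,\mathcal{D}h\,dA} in this model, which is non-negative by positivity of the matrix.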

The non-positivity of the right-hand side of (2) can be established in three steps. First, the positivity of {\mathcal{D}} also implies that

\displaystyle \textrm{Re}(1\overline{2}, 1\overline{2})\leq \int_X |f|\,\mathcal{D}|f|\,dA\leq \int_X |f|\,\mathcal{D}|\mu_1\overline{\mu_2}|\,dA.

Secondly, the fact that {\mathcal{D}} has a positive integral kernel {G} allows us to apply the Cauchy–Schwarz inequality to get that {\mathcal{D}|uv| =\int G |u v| = \int G^{1/2}|u| G^{1/2}|v| \leq (\mathcal{D}|u|^2)^{1/2} (\mathcal{D}|v|^2)^{1/2}}. Therefore,

\displaystyle \int_X |f|\,\mathcal{D}|\mu_1\overline{\mu_2}|\,dA\leq \int_X |f|\,(\mathcal{D}|\mu_1|^2)^{1/2} (\mathcal{D}|\mu_2|^2)^{1/2}\,dA\leq \int_X |\mu_1\overline{\mu_2}| \, (\mathcal{D}|\mu_1|^2)^{1/2} (\mathcal{D}|\mu_2|^2)^{1/2}\,dA
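The pointwise Cauchy–Schwarz estimate for {\mathcal{D}|uv|} used in this step can be illustrated on a discrete model. This is a sketch under the assumption that the kernel {G} is replaced by a matrix with positive entries and that functions on {X} are replaced by vectors; `apply_kernel` is a hypothetical helper name.

```python
import math
import random

def apply_kernel(G, u):
    """Discrete analogue of (D u)(p) = int G(p, q) u(q) dA(q)."""
    return [sum(G[p][q] * u[q] for q in range(len(u))) for p in range(len(G))]

rng = random.Random(1)
n = 8
# Positive kernel: every entry is strictly positive.
G = [[rng.uniform(0.1, 1.0) for _ in range(n)] for _ in range(n)]
u = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

# Left-hand side: D|uv|; right-hand side: (D|u|^2)^(1/2) * (D|v|^2)^(1/2).
lhs = apply_kernel(G, [abs(a * b) for a, b in zip(u, v)])
Du2 = apply_kernel(G, [abs(a) ** 2 for a in u])
Dv2 = apply_kernel(G, [abs(b) ** 2 for b in v])
rhs = [math.sqrt(x * y) for x, y in zip(Du2, Dv2)]

# The inequality holds at every point p (up to rounding errors).
holds = all(l <= r + 1e-12 for l, r in zip(lhs, rhs))
print(holds)
```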

Finally, the Cauchy–Schwarz inequality also says that

\displaystyle \int_X |\mu_1\overline{\mu_2}| \, (\mathcal{D}|\mu_1|^2)^{1/2} (\mathcal{D}|\mu_2|^2)^{1/2}\,dA\leq \left(\int_X |\mu_1|^2\,(\mathcal{D}|\mu_2|^2)\, dA\right)^{1/2}\left(\int_X |\mu_2|^2\,(\mathcal{D}|\mu_1|^2)\, dA\right)^{1/2}=(1\overline{1},2\overline{2})

In summary, we showed that

\displaystyle (I)\leq (II)\leq (III)\leq (IV)\leq (V)\leq (VI) \ \ \ \ \ (3)

where

\displaystyle \begin{array}{rcl} & &(I):=\textrm{Re}(1\overline{2}, 1\overline{2}), \quad (II):=\int_X |f|\,\mathcal{D}|f|\,dA, \quad (III):=\int_X |f|\,\mathcal{D}|\mu_1\overline{\mu_2}|\,dA, \\ & & (IV):=\int_X |f|\,(\mathcal{D}|\mu_1|^2)^{1/2} (\mathcal{D}|\mu_2|^2)^{1/2}\,dA, \quad (V):=\int_X |\mu_1\overline{\mu_2}| \, (\mathcal{D}|\mu_1|^2)^{1/2} (\mathcal{D}|\mu_2|^2)^{1/2}\,dA, \\ & & (VI):= (1\overline{1}, 2\overline{2}) \end{array}

In particular, {(I)\leq (VI)}, so that it follows from (2) that all sectional curvatures {K(P)} of the WP metric are non-positive, i.e., {K(P)\leq 0}.

Actually, it is not hard to derive that {K(P)<0} at this stage: indeed, {K(P)=0} would force a case of equality in the Cauchy–Schwarz inequality and this is not possible in our context because {\{\mu_1,\mu_2\}} is orthonormal.

Remark 4 Philosophically speaking, the “analog” to this argument in the realm of Teichmüller dynamics is Forni’s proof of the spectral gap property {\lambda_2<1} for the Lyapunov exponents of the Teichmüller geodesic flow. In fact, after some computations with variational formulas for the so-called Hodge norm, Forni establishes that {\lambda_2<1} by ruling out an equality case in a certain Cauchy-Schwarz estimate.

4. Reduction of Theorem 1 to bounds on the kernel of {\mathcal{D}}

The discussion in the previous section says that small WP sectional curvatures correspond to almost equalities in certain Cauchy-Schwarz inequalities.

Hence, a natural strategy towards the proof of Theorem 1 consists in showing that an almost equality in (3) is impossible. In this direction, Wolpert establishes the following result:

Theorem 2 (Wolpert) There are two constants {c_1(g,n)>0} and {c_2(g,n)>0} with the following property. If we have an almost equality

\displaystyle (V)-(I)\leq c_1(g,n)\cdot\sigma(X)^7,

between the terms {(I)} and {(V)} in (3), then {(VI)} and {(I)} can not be almost equal:

\displaystyle (VI)-(I)\geq c_2(g,n)\cdot\sigma(X)^3

Of course, Theorem 1 is an immediate consequence of Theorem 2 (in view of (2) and the estimate {(VI)-(I)\geq (V)-(I)} [implied by (3)]).

Thus, it remains only to prove Theorem 2. For this purpose, we need further spectral information on {\mathcal{D}}, namely, some lower bounds on its kernel {G(p,q)}. In order to illustrate this point, let us now show Theorem 2 assuming the following statement.

Proposition 3 There exists a constant {c_3(g,n)>0} such that

\displaystyle G(p,q)\geq c_3(g,n)\cdot \sigma(X)^3

whenever {p} and {q} do not belong to the cusp region {X_{cusps}} of {X}.

Remark 5 We recall that the cusp region {X_{cusps}} of {X} is a finite union of portions of {X} which are isometric to a punctured disk {\{0<|w|<c_4(g,n)\}} (equipped with the hyperbolic metric {ds^2=(|dw|/|w|\log|w|)^2}).

For the sake of exposition, let us first establish Theorem 2 when {X} is compact, i.e., {X_{cusps}=\emptyset}, before explaining the extra ingredient needed to treat the general case.

4.1. Proof of Theorem 2 modulo Proposition 3 when {X_{cusps}=\emptyset}

Suppose that {(V)-(I)\leq c_1(g,n)\sigma(X)^7} for a constant {c_1(g,n)} to be chosen later. In this regime, our goal is to show that {(VI)} is “big” and {(II)} is “small”, so that {(VI)-(I)} is necessarily “big”.

We start by quickly showing that {(VI)} is “big”. Since {\mu_1} and {\mu_2} are unit tangent vectors, it follows from Proposition 3 that

\displaystyle (VI)=\int_X |\mu_1|^2\,\mathcal{D}|\mu_2|^2\,dA\geq c_3(g,n) \sigma(X)^3 \ \ \ \ \ (4)

Let us now focus on proving that {(II)} is “small”. Since {(II)-(I)\leq (V)-(I)} (cf. (3)), if we write {\mu_1\overline{\mu_2} = f+ih=f^+-f^-+ih} (where {f^+} and {f^-} are the positive and negative parts of the real part {f} of {\mu_1\overline{\mu_2}}), then we obtain that

\displaystyle \begin{array}{rcl} c_1(g,n)\,\sigma(X)^7\geq (II)-(I) &=& \int_X |f|\,\mathcal{D}|f|\,dA - \textrm{Re}\int_X\mu_1\overline{\mu_2}\,\mathcal{D}(\mu_1\overline{\mu_2})\, dA \\ &=& \int_X f^+\,\mathcal{D}f^+\,dA + 2\int_X f^+\,\mathcal{D}f^-\,dA+\int_X f^-\,\mathcal{D}f^-\,dA \\ & &- \int_X f\,\mathcal{D}f\,dA+\int_X h\,\mathcal{D}h\,dA \\ &=&4\int_X f^+\,\mathcal{D}f^-\,dA+\int_X h\,\mathcal{D}h\,dA. \end{array}

Since {\mathcal{D}} is positive, we derive that {4\int_X f^+\,\mathcal{D}f^-\,dA\leq c_1(g,n)\,\sigma(X)^7}. Thus, if {X} is compact, i.e., {X_{cusps}=\emptyset}, then Proposition 3 says that {G(p,q)\geq c_3(g,n)\,\sigma(X)^3} for all {p,q\in X}. It follows that

\displaystyle 4\,c_3(g,n)\,\sigma(X)^3\int_X f^+\,dA\int_X f^-\,dA\leq c_1(g,n)\,\sigma(X)^7

By orthogonality of {\{\mu_1,\mu_2\}}, we have that {\textrm{Re}\int_X\mu_1\overline{\mu_2}\,dA=0}, i.e., {\int_X f^+\,dA = \int_X f^-\,dA = (1/2) \int_X |f|\,dA}. By plugging this information into the previous inequality, we obtain the estimate

\displaystyle c_3(g,n)\,\left(\int_X |f|\,dA\right)^2\leq c_1(g,n)\,\sigma(X)^4 \ \ \ \ \ (5)

Next, we observe that {(V)-(IV)\leq (V)-(I)} (cf. (3)) in order to obtain that

\displaystyle c_1(g,n)\,\sigma(X)^7\geq (V)-(IV)=\int_X (|\mu_1\overline{\mu_2}|-|f|) \, (\mathcal{D}|\mu_1|^2)^{1/2} (\mathcal{D}|\mu_2|^2)^{1/2}\,dA

On the other hand, Proposition 3 ensures that {\mathcal{D}|\mu_{\ast}|^2\geq c_3(g,n)\,\sigma(X)^3\,\int_X|\mu_{\ast}|^2\,dA} for {\ast=1,2}. Since {\mu_1} and {\mu_2} are unit tangent vectors, one has {\mathcal{D}|\mu_{\ast}|^2\geq c_3(g,n)\,\sigma(X)^3} for {\ast=1,2}. By inserting this inequality into the previous estimate, we derive that

\displaystyle c_1(g,n)\,\sigma(X)^7\geq c_3(g,n)\,\sigma(X)^3\,\int_X (|\mu_1\overline{\mu_2}|-|f|)\,dA \ \ \ \ \ (6)

From (5) and (6), we see that

\displaystyle \int_X |\mu_1\overline{\mu_2}|\,dA\leq \sqrt{\frac{c_1(g,n)}{c_3(g,n)}}\sigma(X)^2+\frac{c_1(g,n)}{c_3(g,n)}\sigma(X)^4\leq 2\sqrt{\frac{c_1(g,n)}{c_3(g,n)}}\sigma(X)^2 \ \ \ \ \ (7)

whenever {X} has a sufficiently small systole.

This {L^1} bound on {|\mu_1\overline{\mu_2}|} can be converted into a {C^0} bound thanks to the Cauchy integral formula. More concretely, as it is explained in Section 2 of Wolpert’s preprint, after observing that {|\mu_1\overline{\mu_2}| = |\mu_1\mu_2|} and replacing the Beltrami differentials {\mu_1} and {\mu_2} by the dual objects {q_1} and {q_2} (namely, quadratic differentials), we are led to study quartic differentials {q_1q_2}. By the Cauchy integral formula on {\mathbb{H}}, one has

\displaystyle |q_1q_2(ds^2)^{-2}|(p)\leq \frac{1}{\pi}\int_{B(p,1)}|q_1q_2(ds^2)^{-2}|\,dA

On the other hand, if {X=\mathbb{H}/\Gamma} has systole {\rho(X)} and the cusp region {X_{cusps}} is empty, then the injectivity radius at any {p\in X} is {\geq \rho(X)/2}. Thus, there exists a universal constant {c_0>0} such that

\displaystyle |q_1q_2(ds^2)^{-2}|(p)\leq \frac{1}{\pi}\int_{B(p,1)}|q_1q_2(ds^2)^{-2}|\,dA\leq c_0\frac{1}{\rho(X)}\|q_1q_2\|_{L^1(X)}

for all {p\in X}. By plugging this inequality into (7), we conclude that

\displaystyle |\mu_1\overline{\mu_2}(p)|\leq 2c_0\sqrt{\frac{c_1(g,n)}{c_3(g,n)}}\frac{1}{\rho(X)}\sigma(X)^2\leq 2c_0\sqrt{\frac{c_1(g,n)}{c_3(g,n)}}\sigma(X)

for all {p\in X}.

Since {\mathcal{D}} is a positive operator on {C_0(X)} with unit norm (cf. Section 2 above) and {|f|\leq |\mu_1\overline{\mu_2}|}, we have that the previous inequality implies the following {C^0} bound on {\mathcal{D}|f|}:

\displaystyle \mathcal{D}|f|(p)\leq 2c_0\sqrt{\frac{c_1(g,n)}{c_3(g,n)}}\sigma(X)

for all {p\in X}. By combining this estimate with (7), we conclude that

\displaystyle (II)=\int_X |f|\,\mathcal{D}|f|\,dA\leq 4c_0\frac{c_1(g,n)}{c_3(g,n)}\sigma(X)^3 \ \ \ \ \ (8)

In summary, (4) and (8) imply that

\displaystyle (VI)-(I)\geq (VI)-(II)\geq \frac{c_3(g,n)}{2}\cdot\sigma(X)^3:=c_2(g,n)\cdot\sigma(X)^3

for the choice of constant {c_1(g,n):=\frac{c_3(g,n)^2}{8c_0}}. This proves Theorem 2 in the absence of cusp regions.
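The bookkeeping of constants in the last step can be double-checked with exact rational arithmetic. The snippet below is only a sketch: the sample values of {c_0} and {c_3(g,n)} are arbitrary assumptions chosen for illustration, and `constants_check` is a hypothetical helper name.

```python
from fractions import Fraction

def constants_check(c0, c3):
    """Return (c1, coefficient of sigma^3 in (8), resulting c2)."""
    c1 = c3 ** 2 / (8 * c0)        # the choice c1 := c3^2 / (8 c0)
    coeff_II = 4 * c0 * c1 / c3    # (II) <= coeff_II * sigma^3, cf. (8)
    c2 = c3 - coeff_II             # (VI) - (II) >= (c3 - coeff_II) * sigma^3, cf. (4)
    return c1, coeff_II, c2

# Arbitrary sample values (assumptions, not values from Wolpert's preprint).
c0, c3 = Fraction(3), Fraction(1, 5)
c1, coeff_II, c2 = constants_check(c0, c3)
print(c1, coeff_II, c2)
```

With this choice of {c_1(g,n)}, the upper bound in (8) is exactly half of the lower bound in (4), which is where the constant {c_2(g,n)=c_3(g,n)/2} comes from.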

4.2. Proof of Theorem 2 modulo Proposition 3 when {X_{cusps}\neq\emptyset}

The arguments above for the case {X_{cusps}=\emptyset} also work in the case {X_{cusps}\neq\emptyset} because the cusp regions carry only a tiny fraction of the mass of the relevant functions, Beltrami differentials, etc.

More precisely, as it is explained in Section 2 of Wolpert’s preprint, if the constant {c_4(g,n)>0} is chosen correctly, then the Cauchy integral formula and the Schwarz lemma can be used to prove that

\displaystyle \int_{X_{cusps}}|\varphi (ds^2)^{-2}|\,dA\leq \frac{1}{8}\|\varphi\|_{L^1(X)}

for all holomorphic quartic differentials {\varphi}.

In particular, we do not lose too much information after truncating {\mu_1}, {\mu_2}, etc. to {X\setminus X_{cusps}}, and this allows us to apply the arguments of the case {X_{cusps}=\emptyset} to the corresponding truncated objects {\widetilde{\mu_1}}, {\widetilde{\mu_2}}, etc. without any extra difficulty: see Section 5 of Wolpert’s preprint for more details.

5. Proof of Proposition 3

Closing this post, let us give an idea of the proof of Proposition 3 (and we refer the reader to Section 4 of Wolpert’s preprint for more details).

Since {G(p,q)=-2\sum\limits_{\gamma\in\Gamma} Q_1(d(p,\gamma(q)))} and {-Q_1\sim e^{-2x}} (cf. Section 2 above), our task reduces to giving lower bounds on the Poincaré series

\displaystyle K(p,q)=\sum\limits_{\gamma\in\Gamma} e^{-2d(p,\gamma(q))}

For this purpose, let us first recall that a hyperbolic surface {X} has a thick-thin decomposition: the thick portion is the region where the injectivity radius is bounded away from zero by a uniform constant, and the thin portion is the complement of the thick portion. Geometrically, the thin region is the disjoint union of the cusp region {X_{cusps}} and a finite number of collars around short simple closed geodesics: roughly speaking, a collar consists of the points at distance {\leq w(\alpha)=\log(1/\ell_{\alpha})+O(1)} from a short simple closed geodesic {\alpha} of length {\ell_{\alpha}}.

We can provide lower bounds on {K(p,q)} in terms of the behaviours of simple geodesic arcs connecting {p} and {q} on {X}.

More concretely, let {\theta_{pq}} be the shortest geodesic connecting {p} and {q}. Since {\theta_{pq}} is simple, for adequate choices of the constants defining the collars, {\theta_{pq}} cannot “back track” after entering a collar, i.e., it must connect the two boundary components (rather than going out via the same boundary component). Furthermore, {\theta_{pq}} cannot go very high into a cusp. Thus, if we decompose {\theta_{pq}} according to its visits to the thick region, the collars and the cusps, then the fact that {p,q\in X\setminus X_{cusps}} permits us to check that it suffices to study the passages of {\theta_{pq}} through collars in order to get a lower bound on {K(p,q)}.

Next, if {\eta} is a subarc of {\theta_{pq}} crossing a collar around a short closed geodesic {\alpha}, then we can apply Dehn twists to {\eta} to get a family of simple arcs indexed by {k\in\mathbb{Z}} giving a “contribution” to {K(p,q)} of

\displaystyle \sum\limits_{k\in\mathbb{Z}}e^{-2(2w(\alpha)+|k|\ell_{\alpha})}\geq c_5(g,n)\cdot \ell_{\alpha}^3

for some constant {c_5(g,n)>0} depending only on the topology of {X}. In this way, the desired result follows by putting all “contributions” together.
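To see where the exponent {3} comes from, one can evaluate the collar sum explicitly. The sketch below makes the simplifying assumption that the width is exactly {w(\alpha)=\log(1/\ell_{\alpha})} (i.e., the {O(1)} term is dropped), so that the sum over the twisting index becomes a geometric series equal to {\ell_{\alpha}^4\coth(\ell_{\alpha})}; the function names are illustrative.

```python
import math

def collar_contribution(ell, terms=20000):
    """Truncated sum over integers k of exp(-2(2w + |k| ell)), with w = log(1/ell)."""
    w = math.log(1.0 / ell)
    tail = sum(math.exp(-2.0 * k * ell) for k in range(1, terms))
    return math.exp(-4.0 * w) * (1.0 + 2.0 * tail)

def closed_form(ell):
    """Same sum via the geometric series: ell^4 * coth(ell)."""
    return ell ** 4 / math.tanh(ell)

for ell in (0.5, 0.1, 0.01):
    # Since coth(ell) >= 1/ell, the contribution is at least ell^3.
    print(ell, collar_contribution(ell), closed_form(ell), ell ** 3)
```

In other words, the factor {e^{-4w(\alpha)}} is of order {\ell_{\alpha}^4}, while the sum over the Dehn twists gains a factor of order {1/\ell_{\alpha}}.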

The celebrated works of several mathematicians (including Poincaré, Denjoy, …, Arnold, Herman, Yoccoz, …) provide a very satisfactory picture of the dynamics of smooth circle diffeomorphisms:

  • each {C^r}-diffeomorphism {f\in\textrm{Diff}^r(\mathbb{T})} of the circle {\mathbb{T}:=\mathbb{R}/\mathbb{Z}} has a well-defined rotation number {\alpha=\rho(f)} (which can be defined using the cyclic order of its orbits, for instance);
  • {f\in\textrm{Diff}^r(\mathbb{T})} is topologically semi-conjugated to the rigid rotation {R_{\alpha}(x)=x+\alpha} (i.e., {h\circ f=R_{\alpha}\circ h} for a surjective continuous map {h:\mathbb{T}\rightarrow \mathbb{T}}) whenever its rotation number {\alpha=\rho(f)} is irrational;
  • if {f\in\textrm{Diff}^2(\mathbb{T})} has irrational rotation number {\alpha}, then {f} is topologically conjugated to {R_{\alpha}} (i.e., there is a homeomorphism {h:\mathbb{T}\rightarrow\mathbb{T}} such that {h\circ f = R_{\alpha}\circ h});
  • if {f\in\textrm{Diff}^r(\mathbb{T})}, {r\geq 3}, has an irrational rotation number {\alpha} satisfying a Diophantine condition of the form {|\alpha-p/q|\geq c/q^{2+\beta}} for some {c>0}, {(r-1)/2>\beta\geq 0}, and all {p/q\in\mathbb{Q}}, then there exists {h\in\textrm{Diff}^{r-1-\beta-}(\mathbb{T}):= \bigcap\limits_{\varepsilon>0}\textrm{Diff}^{r-1-\beta-\varepsilon}(\mathbb{T})} conjugating {f} and {R_{\alpha}} (i.e., {h\circ f = R_{\alpha}\circ h});
  • etc.

In particular, if {\alpha} has Roth type (i.e., for all {\varepsilon>0}, there exists {c_{\varepsilon}>0} such that {|\alpha-p/q|\geq c_{\varepsilon}/q^{2+\varepsilon}} for all {p/q\in\mathbb{Q}}), then any {f\in\textrm{Diff}^r(\mathbb{T})} with rotation number {\alpha} is {C^{r-1-}} conjugated to {R_{\alpha}} whenever {r>3}. (The nomenclature is motivated by Roth’s theorem saying that any irrational algebraic number has Roth type, and it is well-known that the set of Roth type numbers has full Lebesgue measure in {\mathbb{R}}.)
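As a toy illustration of such Diophantine conditions, one can compute the best constant {c} over a finite range of denominators. This is only a numerical sketch: a finite search proves nothing about all {p/q}, the golden mean is chosen here merely as a familiar example, and `diophantine_constant` is a hypothetical helper.

```python
import math

GOLDEN = (math.sqrt(5.0) - 1.0) / 2.0   # illustrative irrational number

def diophantine_constant(alpha, beta, q_max):
    """Minimum over 1 <= q <= q_max of q^(2+beta) * |alpha - p/q|,
    where p is the integer closest to q * alpha."""
    best = float("inf")
    for q in range(1, q_max + 1):
        p = round(q * alpha)
        best = min(best, q ** (2 + beta) * abs(alpha - p / q))
    return best

# With beta = 0, the minimum stays bounded away from zero in this range.
c = diophantine_constant(GOLDEN, 0, 2000)
print(c)
```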

In the last twenty years, many authors gave important contributions towards the extension of this beautiful theory.

In this direction, a particularly successful line of research consists in thinking of circle rotations {R_{\alpha}} as standard interval exchange transformations on 2 intervals and trying to build smooth conjugations between generalized interval exchange transformations (g.i.e.t.) and standard interval exchange transformations. In fact, Marmi–Moussa–Yoccoz studied the notion of standard i.e.t. of restricted Roth type (a concept designed so that the circle rotation {R_{\alpha}} has restricted Roth type [when viewed as an i.e.t. on 2 intervals] if and only if {\alpha} has Roth type). They proved that, for any {r\geq 2}, the {C^{r+3}} g.i.e.t.s {T} which are close to a standard i.e.t. {T_0} of restricted Roth type and {C^r}-conjugated to {T_0} form a {C^1}-submanifold of codimension {(g-1)(2r+1)+s}. Here, {T_0} is an i.e.t. on {d=2g+s-1} intervals arising as the first return map to an interval transverse to a translation flow on a translation surface of genus {g\geq 1}.

An interesting consequence of this result of Marmi–Moussa–Yoccoz is the fact that local conjugacy classes behave differently for circle rotations and arbitrary i.e.t.s. Indeed, a circle rotation is an i.e.t. on 2 intervals associated to the first return map of a translation flow on the torus {\mathbb{T}^2=\mathbb{R}^2/\mathbb{Z}^2}, so that {R_{\alpha}} has genus {g=1} and also {s=1}. Hence, the Marmi–Moussa–Yoccoz theorem says that the local conjugacy class of {R_{\alpha}} with {\alpha} of Roth type has codimension {(g-1)(2r+1)+s=1} regardless of the differentiability scale {r}. Of course, this fact was previously known from the theory of circle diffeomorphisms: by the results of Herman and Yoccoz, the sole obstruction to obtain a smooth conjugation between {f} and {R_{\alpha}} (with {\alpha} of Roth type) is described by a single parameter, namely, the rotation number of {f}. On the other hand, the Marmi–Moussa–Yoccoz theorem says that the codimension

\displaystyle (g-1)(2r+1)+s

of the local conjugacy class of an i.e.t. of restricted Roth type with genus {g\geq 2} grows linearly with the differentiability scale {r}.
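The contrast between genus {1} and higher genus is immediate when the formula is tabulated. The following sketch simply evaluates the two expressions quoted above; the function names and the sample value {(g,s)=(2,1)} are illustrative choices.

```python
def num_intervals(g, s):
    """Number d = 2g + s - 1 of intervals of the i.e.t."""
    return 2 * g + s - 1

def codimension(g, s, r):
    """Codimension (g - 1)(2r + 1) + s of the local C^r conjugacy class."""
    return (g - 1) * (2 * r + 1) + s

# Circle rotations: g = 1, s = 1, hence d = 2 and codimension 1 for every r.
rotation = [codimension(1, 1, r) for r in range(2, 7)]
# A sample genus 2 case with s = 1 (d = 4): the codimension is 2r + 2.
genus_two = [codimension(2, 1, r) for r in range(2, 7)]
print(num_intervals(1, 1), rotation)
print(num_intervals(2, 1), genus_two)
```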

Remark 1 This indicates that KAM theoretical approaches to the study of the dynamics of g.i.e.t.s might be delicate because the “loss of regularity” in the usual KAM schemes forces the analysis of cohomological equations (linearized versions of the conjugacy problem) in several differentiability scales and the Marmi–Moussa–Yoccoz theorem says that these changes of differentiability scale produce non-trivial effects on the numbers of obstructions (“codimensions”) to solve cohomological equations.

In any case, this interesting phenomenon concerning the codimension of local conjugacy classes of i.e.t.s of genus {g\geq 2} led Marmi–Moussa–Yoccoz to make a series of conjectures (cf. Section 1.2 of their paper) in order to further compare the local conjugacy classes of circle rotations and i.e.t.s of genus {g\geq 2}.

Among these fascinating conjectures, the second open problem in Section 1.2 of Marmi–Moussa–Yoccoz paper asks whether, for almost all i.e.t.s {T_0}, any {C^4} g.i.e.t. {T} with trivial conjugacy invariants (e.g., “simple deformations”) and {C^0} conjugated to {T_0} is also {C^1} conjugated to {T_0}. In other words, the {C^0} and {C^1} conjugacy classes of a typical i.e.t. {T_0} coincide.

In this short post, I would like to transcribe below some remarks made during recent conversations with Pascal Hubert showing that the hypothesis “for almost all i.e.t.s {T_0}” cannot be removed from the conjecture above. In a nutshell, we will see in the sequel that the self-similar standard interval exchange transformations associated to two special translation surfaces (called Eierlegende Wollmilchsau and Ornithorynque) of genera {3} and {4} are {C^0} but not {C^1} conjugated to a rich family of piecewise affine interval exchange transformations. Of course, I think that these examples are probably well-known to experts (and Jean-Christophe Yoccoz was probably aware of them by the time Marmi–Moussa–Yoccoz wrote down their conjectures), but I’m including some details of the construction of these examples here mostly for my own benefit.

Disclaimer: As usual, even though the content of this post arose from conversations with Pascal, all mistakes/errors in the sequel are my sole responsibility.

1. Preliminaries

1.1. Rauzy–Veech algorithm

The notion of “irrational rotation number” for generalized interval exchange transformations relies on the so-called Rauzy–Veech algorithm.

More concretely, consider a {C^r}-g.i.e.t. {f:I\rightarrow I} sending a finite partition (modulo zero) {I=\bigcup\limits_{\alpha\in\mathcal{A}} I_{\alpha}^t} of {I} into closed subintervals {I_{\alpha}^t} arranged according to a bijection {\pi_t:\mathcal{A}\rightarrow\{1,\dots,d\}} to a finite partition (modulo zero) {I=\bigcup\limits_{\alpha\in\mathcal{A}} I_{\alpha}^b} of {I} into closed subintervals {I_{\alpha}^b} arranged according to a bijection {\pi_b:\mathcal{A}\rightarrow\{1,\dots,d\}} (via {C^r}-diffeomorphisms {f|_{I_{\alpha}^t}:I_{\alpha}^t\rightarrow I_{\alpha}^b}). An elementary step of the Rauzy–Veech algorithm produces a new {C^r}-g.i.e.t. {\mathcal{R}(f)} by taking the first return map of {f} to the interval {I\setminus J}, where {J=I_{\pi_t^{-1}(d)}^t}, resp. {I_{\pi_b^{-1}(d)}^b}, whenever {|I_{\pi_t^{-1}(d)}^t|<|I_{\pi_b^{-1}(d)}^b|}, resp. {|I_{\pi_t^{-1}(d)}^t|>|I_{\pi_b^{-1}(d)}^b|} (and {\mathcal{R}(f)} is not defined when {|I_{\pi_t^{-1}(d)}^t| = |I_{\pi_b^{-1}(d)}^b|}).

We say that a {C^r}-g.i.e.t. {f} has irrational rotation number whenever the Rauzy–Veech algorithm {\mathcal{R}} can be iterated indefinitely. This nomenclature is partly justified by the fact that Yoccoz generalized the proof of Poincaré’s theorem in order to establish that a {C^r}-g.i.e.t. {f} with irrational rotation number is topologically semi-conjugated to a standard, minimal i.e.t. {T_0}.
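For standard i.e.t.s (where each branch is a translation), an elementary step can be written purely in terms of combinatorial and length data. The code below is a minimal sketch under that assumption and does not treat general {C^r}-g.i.e.t.s; the encoding by two lists of labels and a dictionary of lengths, as well as the name `rauzy_veech_step`, are illustrative choices.

```python
def rauzy_veech_step(top, bottom, lengths):
    """One elementary Rauzy-Veech step for a standard i.e.t.

    top, bottom: lists of labels giving the left-to-right order of the
    intervals before and after applying the map.
    lengths: dict mapping each label to the (positive) length of its interval.
    Returns the data of the first return map to the interval obtained by
    cutting off the shorter of the two rightmost intervals.
    """
    a_t, a_b = top[-1], bottom[-1]
    if lengths[a_t] == lengths[a_b]:
        raise ValueError("step undefined: rightmost intervals have equal length")
    new_top, new_bottom, new_lengths = list(top), list(bottom), dict(lengths)
    if lengths[a_t] > lengths[a_b]:
        # The rightmost bottom interval is shorter and is cut off.
        new_lengths[a_t] = lengths[a_t] - lengths[a_b]
        new_bottom.remove(a_b)
        new_bottom.insert(new_bottom.index(a_t) + 1, a_b)
    else:
        # The rightmost top interval is shorter and is cut off.
        new_lengths[a_b] = lengths[a_b] - lengths[a_t]
        new_top.remove(a_t)
        new_top.insert(new_top.index(a_b) + 1, a_t)
    return new_top, new_bottom, new_lengths

# Example on 3 intervals with lengths 1, 1, 3.
step = rauzy_veech_step(["A", "B", "C"], ["C", "B", "A"], {"A": 1, "B": 1, "C": 3})
print(step)
```

For an i.e.t. on 2 intervals (a circle rotation), the lengths evolve by the subtractive Euclidean algorithm, so the step can be iterated indefinitely exactly when the ratio of the two lengths is irrational.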

1.2. Denjoy counterexamples

Similarly to Denjoy’s theorem in the case of circle diffeomorphisms, the obstruction to promote topological semi-conjugations between {f} and {T_0} as above into {C^0}-conjugations is the presence of wandering intervals for {f}, i.e., non-trivial intervals {A} whose iterates under {f} are pairwise disjoint (i.e., {f^i(A)\cap f^j(A)=\emptyset} for all {i,j\in\mathbb{Z}}, {i\neq j}).

Moreover, as it was also famously established by Denjoy, a little bit of smoothness (e.g., {C^1} with derivative of bounded variation) suffices to preclude the existence of wandering intervals for circle diffeomorphisms, and, actually, some smoothness is needed because there are several examples of {C^1}-diffeomorphisms with any prescribed irrational rotation number and possessing wandering intervals. Nevertheless, as it was pointed out by several authors (including Camelier–Gutierrez, Bressaud–Hubert–Maass, Marmi–Moussa–Yoccoz, …), a high amount of smoothness is not enough to avoid wandering intervals for arbitrary {C^r}-g.i.e.t.s: indeed, there are many examples of piecewise affine interval exchange transformations possessing wandering intervals.

Remark 2 The facts mentioned in the previous two paragraphs partly justify the nomenclature Denjoy counterexample for a {C^r}-g.i.e.t. with irrational rotation number possessing wandering intervals.

In the context of piecewise affine i.e.t.s, the Denjoy counterexamples are also characterized by the behavior of certain Birkhoff sums. More concretely, let {T} be a piecewise affine i.e.t. with irrational rotation number, say {T} is semi-conjugated to a standard i.e.t. {T_0:\bigcup I_{\alpha}^t\rightarrow \bigcup I_{\alpha}^b}. By definition, the logarithm {\log DT} of the slope of {T} is constant on the continuity intervals of {T} and, hence, it allows us to naturally define a function {w} taking a constant value {w_{\alpha}} on each continuity interval {I_{\alpha}^t} of {T_0}. In this setting, it is possible to prove (see, e.g., Subsection 3.3.2 of the Marmi–Moussa–Yoccoz paper) that {T} has wandering intervals if and only if there exists a point {x^*\in I=\bigcup I_{\alpha}^t} with bi-infinite {T_0}-orbit such that

\displaystyle \sum\limits_{n\in\mathbb{Z}} \exp(S_n w(x^*))<\infty

where the Birkhoff sum {S_nw(x^*)} at a point {x^*} with orbit {T_0^j(x^*)\in \textrm{int}(I_{\alpha_j}^t)} for all {j\in\mathbb{Z}} is defined as {S_nw(x^*)=\sum\limits_{j=0}^{n-1}w_{\alpha_j}}, resp. {\sum\limits_{j=-1}^{n}w_{\alpha_j}} for {n\geq 0}, resp. {n<0}.

For our subsequent purposes, it is worth recording the following interesting (direct) consequence of this “Birkhoff sums” characterization of piecewise affine Denjoy counterexamples:

Proposition 1 Let {T} be a piecewise affine i.e.t. topologically semi-conjugated to a standard, minimal i.e.t. {T_0}. Denote by {w} the piecewise constant function associated to the logarithms of the slopes of {T}. If {\liminf\limits_{n\rightarrow\infty} |S_n w(y)|<\infty} for all {y} with bi-infinite {T_0}-orbit, then {T} is topologically conjugated to {T_0} (i.e., {T} is not a Denjoy counterexample).

1.3. Special Birkhoff sums and the Kontsevich–Zorich cocycle

An elementary step of the Rauzy–Veech algorithm {\mathcal{R}} replaces a standard, minimal i.e.t. {T_0} on an interval {I=\bigcup\limits_{\alpha\in\mathcal{A}} I_{\alpha}^t} by a standard, minimal i.e.t. {\mathcal{R}(T_0)} given by the first return map of {T_0} on an appropriate subinterval {J=\bigcup\limits_{\alpha\in\mathcal{A}} J_{\alpha}^t\subset I}.

The special Birkhoff sum {\mathcal{S}} associated to an elementary step {\mathcal{R}} is the operator mapping a function {\phi:I\rightarrow \mathbb{R}} to the function {\mathcal{S}\phi(x)=S_{r(x)}\phi(x):=\sum\limits_{j=0}^{r(x)-1}\phi(T_0^j(x))}, {x\in J}, where {r(x)} stands for the first return time to {J}.

The special Birkhoff sum operator {\mathcal{S}} preserves the space of piecewise constant functions in the sense that {\mathcal{S}\phi} is constant on each {J_{\alpha}^t} whenever {\phi} is constant on each {I_{\beta}^t}. In particular, the restriction of {\mathcal{S}} to the space of such piecewise constant functions gives rise to a matrix {B:\mathbb{R}^{\mathcal{A}}\rightarrow \mathbb{R}^{\mathcal{A}}}. The family of matrices obtained from the successive iterates of the Rauzy–Veech algorithm provides a concrete description of the so-called Kontsevich–Zorich cocycle.
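For a single elementary step on a standard i.e.t., the matrix {B} is very simple: only the points of one interval need two iterations to come back, while all other points return after one iteration. The sketch below encodes this with the names “winner” and “loser” (a common terminology which is an assumption here and not notation from this post) for the longer and the shorter of the two rightmost intervals. The example data corresponds to an i.e.t. on 3 intervals of lengths {1,1,3} whose intervals {A,B,C} are rearranged in the order {C,B,A}.

```python
def special_birkhoff_matrix(labels, winner, loser):
    """Matrix B of one step: (B phi)[loser] = phi[loser] + phi[winner],
    and (B phi)[a] = phi[a] for every other label a."""
    B = {a: {b: int(a == b) for b in labels} for a in labels}
    B[loser][winner] += 1
    return B

def apply_matrix(B, phi):
    """Apply B to a piecewise constant function phi (dict label -> value)."""
    return {a: sum(B[a][b] * phi[b] for b in B[a]) for a in B}

def integral(lengths, phi):
    """Integral of a piecewise constant function."""
    return sum(lengths[a] * phi[a] for a in lengths)

labels = ["A", "B", "C"]
old_lengths = {"A": 1, "B": 1, "C": 3}   # lengths before the step
new_lengths = {"A": 1, "B": 1, "C": 2}   # lengths after cutting off the shorter interval
phi = {"A": 2, "B": -1, "C": 5}          # arbitrary test function

B = special_birkhoff_matrix(labels, winner="C", loser="A")
S_phi = apply_matrix(B, phi)
print(S_phi, integral(old_lengths, phi), integral(new_lengths, S_phi))
```

A useful consistency check is that the integral of {\mathcal{S}\phi} over {J} equals the integral of {\phi} over {I}, since the pieces of orbits between consecutive returns to {J} cover {I} exactly once.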

In summary, the behaviour of special Birkhoff sums (i.e., Birkhoff sums at certain “return” times) of piecewise constant functions is described by the Kontsevich–Zorich cocycle. Therefore, in view of Proposition 1, it is probably not surprising to the reader at this point that the Lyapunov exponents of the Kontsevich–Zorich cocycle will have something to do with the presence or absence of piecewise affine Denjoy counterexamples.

1.4. Eierlegende Wollmilchsau and Ornithorynque

The Eierlegende Wollmilchsau and Ornithorynque are two remarkable translation surfaces {M_{EW}} and {M_{O}} of genera {3} and {4} obtained from finite branched covers of the torus {\mathbb{T}^2}. Among their several curious features, we would like to point out the following fact proved by Jean-Christophe Yoccoz and myself: if {T_0} is a standard i.e.t. on {\#\mathcal{A}=9} or {10} intervals (resp.) associated to the first return map of the translation flow {V} in a typical direction on {M_{EW}} or {M_{O}} (resp.), then there are vectors {q_V}, {p_{T_0}} and a {(\#\mathcal{A}-2)}-dimensional vector subspace {H} such that {\mathbb{R}^{\mathcal{A}} = \mathbb{R} q_V\oplus H\oplus \mathbb{R} p_{T_0}} is an equivariant decomposition with respect to the matrices of the Kontsevich–Zorich cocycle with the following properties:

  • (a) {q_V} generates the Oseledets direction of the top Lyapunov exponent {\theta_1>0};
  • (b) {p_{T_0}} generates the Oseledets direction of the smallest Lyapunov exponent {-\theta_1};
  • (c) the matrices of the Kontsevich–Zorich cocycle act on {H} through a finite group.

In the literature, the Lyapunov exponents {\pm\theta_1} are usually called the tautological exponents of the Kontsevich–Zorich cocycle. In this terminology, the third item above is saying that all non-tautological Lyapunov exponents of the Kontsevich–Zorich cocycle associated to {M_{EW}} and {M_{O}} vanish.

In the next two sections, we will see that this curious behaviour of the Kontsevich–Zorich cocycle of {M_{EW}} or {M_{O}} along {H} allows us to construct plenty of piecewise affine i.e.t.s which are {C^0} but not {C^1} conjugated to standard (and uniquely ergodic) i.e.t.s.

2. “Il n’y a pas de contre-exemple de Denjoy affine par morceaux issu de {M_{EW}} et {M_{O}}”

In this section (whose title is an obvious reference to a famous article by Jean-Christophe Yoccoz), we will see that the Eierlegende Wollmilchsau and Ornithorynque never produce piecewise affine Denjoy counterexamples with irrational rotation number of “bounded type”.

More precisely, let us consider a piecewise affine i.e.t. {T} topologically semi-conjugated to a standard i.e.t. {T_0} coming from (the first return map of the translation flow in the direction of a pseudo-Anosov homeomorphism of) {M_{EW}} or {M_{O}}. It is well-known that the piecewise constant function {w} associated to the logarithms of the slopes {DT} of {T} belongs to {H\oplus \mathbb{R} p_{T_0}} (see, e.g., Section 3.4 of the Marmi–Moussa–Yoccoz paper). In order to simplify the exposition, we assume that the “irrational rotation number” {T_0} has “bounded type”, that is, {T_0} is self-similar in the sense that some iterate {\mathcal{R}^k(T_0)} of {T_0} under the Rauzy–Veech algorithm coincides with {T_0} up to scaling.

If {w\in H}, then the item (c) from Subsection 1.4 above implies that all special Birkhoff sums of {w} (in the future and in the past) are bounded. From this fact, we conclude that {\liminf\limits_{n\rightarrow\infty} |S_nw(y)|\leq C} for all {y} with bi-infinite {T_0}-orbit: indeed, as it is explained in detail in the article of Bressaud–Bufetov–Hubert, if {T_0} is self-similar, then the orbits of {T_0} can be described by a substitution on a finite alphabet {\mathcal{A}} and this allows us to select a bounded subsequence of {S_nw(y)} thanks to the repetition of certain words in the prefix-suffix decomposition.

In particular, it follows from Proposition 1 above that there is no Denjoy counterexample among the piecewise affine i.e.t.s {T} topologically semi-conjugated to a self-similar standard i.e.t. {T_0} coming from {M_{EW}} or {M_O} such that {w\in H}.

Remark 3 Actually, it is possible to exploit the fact that {p_{T_0}} is a stable vector (i.e., it generates the Oseledets space of a negative Lyapunov exponent) to remove the constraint “{w\in H}” from the statement of the previous paragraph.

In other words, we showed that any {w\in H} always provides a piecewise affine i.e.t. {C^0}-conjugated to {T_0}. Note that this is a relatively rich family of piecewise affine i.e.t.s because {H} is a vector space of dimension {7}, resp. {8}, when {T_0} is a self-similar standard i.e.t. coming from {M_{EW}}, resp. {M_O}.

3. Cohomological obstructions to {C^1} conjugations

Closing this post, we will show that the elements {w\in H\setminus\{0\}} always lead to piecewise affine i.e.t.s which are not {C^1} conjugated to self-similar standard i.e.t.s of {M_{EW}} or {M_O}. Of course, this shows that the {C^0} and {C^1} conjugacy classes of a self-similar standard i.e.t. of {M_{EW}} or {M_O} are distinct and, a fortiori, the Marmi–Moussa–Yoccoz conjecture about the coincidence of {C^0} and {C^1} conjugacy classes of standard i.e.t.s becomes false if we remove “for almost all standard i.e.t.s” from its statement.

Suppose that {T} is a piecewise affine i.e.t. {C^1}-conjugated to a self-similar standard i.e.t. {T_0} of {M_{EW}} or {M_O}, say {T\circ h = h\circ T_0} for some {C^1}-diffeomorphism {h}. By taking derivatives, we get

\displaystyle (DT\circ h) \cdot h' = h'\circ T_0

since {T_0} is an isometry. Of course, we recognize the slope of {T} on the left-hand side of the previous equation. So, by taking logarithms, we obtain

\displaystyle w=\Psi\circ T_0 - \Psi

where {\Psi:=\log h'} is a {C^0} function. In other terms, {\Psi} is a solution of the cohomological equation and {w} is a {C^0}-coboundary. Hence, the Birkhoff sums {S_nw=\Psi\circ T_0^n-\Psi} are bounded and, by continuity of {\Psi}, the special Birkhoff sums {\mathcal{S}w} of {w} converge to zero. Equivalently, {w\in\mathbb{R}^{\mathcal{A}}} belongs to the weak stable space of the Kontsevich–Zorich cocycle (compare with Remark 3.9 of Marmi–Moussa–Yoccoz paper).
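The telescoping behind the boundedness of the Birkhoff sums of a coboundary can be checked numerically. The sketch below is only an illustration of this mechanism under simplifying assumptions: {T_0} is replaced by a circle rotation and {\Psi} by an arbitrary smooth function (so that {w} is not piecewise constant here); the names `ALPHA`, `psi` and `birkhoff_sum` are hypothetical.

```python
import math

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0   # illustrative rotation angle

def T0(x):
    """Circle rotation, standing in for the standard i.e.t."""
    return (x + ALPHA) % 1.0

def psi(x):
    """Arbitrary continuous function with |psi| <= 1.5."""
    return math.cos(2.0 * math.pi * x) + 0.5 * math.sin(4.0 * math.pi * x)

def w(x):
    """Coboundary w = psi o T0 - psi."""
    return psi(T0(x)) - psi(x)

def birkhoff_sum(x, n):
    """Return (S_n w(x), T0^n(x)) for n >= 0."""
    total = 0.0
    for _ in range(n):
        total += w(x)
        x = T0(x)
    return total, x

x0 = 0.3
total, xn = birkhoff_sum(x0, 1000)
# Telescoping: S_n w(x) = psi(T0^n x) - psi(x), hence |S_n w| <= 2 sup|psi|.
print(total, psi(xn) - psi(x0))
```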

However, the item (c) from Subsection 1.4 above tells us that the Kontsevich–Zorich cocycle acts on {w\in H\setminus\{0\}} through a finite group of matrices and, thus, {w\in H\setminus\{0\}} cannot converge to zero under the Kontsevich–Zorich cocycle.

This contradiction proves that {T} is not {C^1}-conjugated to {T_0}, as desired.
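The key point in the argument above is elementary: the orbit of a non-zero vector under a finite group of invertible matrices is a finite set of non-zero vectors, so it stays at a definite distance from the origin. Here is a toy sketch of this fact; the two integer matrices below generate a group of order {6} and are chosen purely for illustration (they are not matrices of the Kontsevich–Zorich cocycle of {M_{EW}} or {M_O}).

```python
def mat_mul(A, B):
    """Product of two square matrices stored as tuples of tuples."""
    n = len(A)
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

def mat_vec(A, v):
    """Product of a matrix with a vector."""
    return tuple(sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A)))

def generate_group(generators, max_size=1000):
    """All products of the generators, assuming they generate a finite group."""
    group = set(generators)
    frontier = list(generators)
    while frontier:
        g = frontier.pop()
        for h in generators:
            p = mat_mul(g, h)
            if p not in group:
                if len(group) >= max_size:
                    raise ValueError("the group seems to be infinite")
                group.add(p)
                frontier.append(p)
    return group

M = ((0, -1), (1, -1))   # matrix of order 3
S = ((0, 1), (1, 0))     # matrix of order 2
group = generate_group([M, S])

w = (2, 1)
orbit = {mat_vec(g, w) for g in group}
min_norm_sq = min(x * x + y * y for x, y in orbit)
print(len(group), sorted(orbit), min_norm_sq)
```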
