Posted by: matheuscmss | February 8, 2019

Breuillard-Sert’s joint spectrum (I)

Last November 2018, Romain Dujardin, Charles Favre, Thomas Gauthier, Rodolfo Gutiérrez-Romo and I started a groupe de travail around the preprint The joint spectrum by Emmanuel Breuillard and Cagri Sert.

My plan is to transcribe my notes from this groupe de travail in a series of posts starting today with a summary of the first meeting where an overview of the whole article was provided. As usual, all mistakes/errors in the sequel are my sole responsibility.

1. Introduction

Let {M_d(\mathbb{C})} be the set of {d\times d} matrices with complex entries. Given {A\in M_d(\mathbb{C})}, recall that its spectral radius {r(A)} is given by Gelfand’s formula

\displaystyle r(A) = \lim\limits_{n\rightarrow\infty}\|A^n\|^{1/n}

More generally, given a compact subset {S\subset M_d(\mathbb{C})}, recall that the joint spectral radius of {S} (introduced by Rota–Strang in 1960) is the quantity

\displaystyle R(S) := \lim\limits_{n\rightarrow\infty} \sup\limits_{g_1,\dots, g_n\in S} \|g_1\dots g_n\|^{1/n} = \lim\limits_{n\rightarrow\infty} \sup\limits_{g\in S^n} \|g\|^{1/n}

where {S^n:=\{g_1\dots g_n: g_1,\dots, g_n\in S\}}.

Remark 1 By submultiplicativity (or, more precisely, Fekete’s lemma), the limit defining {R(S)} always exists.

Remark 2 {R(S)} is independent of the choice of {\|.\|}. In particular, {R(S) = R(g S g^{-1})} for all {g\in GL_d(\mathbb{C})}.
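To get a concrete feeling for this definition, here is a minimal Python sketch (not taken from the Breuillard–Sert article) computing the quantity {\sup_{g\in S^n}\|g\|^{1/n}} by brute force when {S} is finite. The helper names mat_mul, norm_inf and jsr_upper_bound are ours, and the choice of the maximal-row-sum operator norm is an arbitrary assumption (by Remark 2, the limit does not depend on it). By submultiplicativity, each of these quantities is an upper bound for {R(S)}.

```python
from itertools import product


def mat_mul(a, b):
    """Product of two square matrices given as tuples of rows."""
    d = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(d)) for j in range(d))
        for i in range(d)
    )


def norm_inf(m):
    """Operator norm induced by the sup norm: maximal absolute row sum."""
    return max(sum(abs(x) for x in row) for row in m)


def jsr_upper_bound(S, n):
    """Return the sup over g in S^n of ||g||^(1/n), enumerating all |S|^n words."""
    best = 0.0
    for word in product(S, repeat=n):
        g = word[0]
        for m in word[1:]:
            g = mat_mul(g, m)
        best = max(best, norm_inf(g))
    return best ** (1.0 / n)


# Hypothetical example (our choice): two unipotent 2x2 matrices.
A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))
for n in (1, 2, 4, 8):
    print(n, jsr_upper_bound((A, B), n))
```

The enumeration costs {|S|^n} matrix products, so this is only meant for very small {n}; it illustrates the definition and is not an efficient algorithm.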

The joint spectral radius appears naturally in several areas of Mathematics (such as wavelets and control theory), and my first contact with this notion occurred through a subfield of Dynamical Systems called ergodic optimization (where one considers an observable {f} and one seeks to maximize {\int f d\mu} among all invariant probability measures {\mu} of a given dynamical system).

The goal of the Breuillard–Sert article is two-fold: they introduce a notion of joint spectrum of {S} and they show that it vastly refines previous related concepts such as joint spectral radius, Benoist cone, etc.

Today, our plan is to provide an overview of some of the main results obtained by Breuillard–Sert. To this end, we divide this post into two sections: the first one contains a potpourri of prototypical versions of Breuillard–Sert’s theorems, and the last section provides the precise statements whose proofs will be discussed in subsequent posts in this series.


Posted by: matheuscmss | February 6, 2019

Examples of Rauzy classes (after Yoccoz)

This week I attended the mini-conference Autour des surfaces de translation organized by Corentin Boissy and Slavyana Geninska at Toulouse.

One of the main objectives of this meeting was to discuss in detail a somewhat long (66 pages) text by Jean-Christophe Yoccoz containing new notions and tools allowing one to efficiently describe certain combinatorial objects known as Rauzy diagrams.

In fact, this text was still in preliminary format when Jean-Christophe passed away and, for this reason, Corentin and I spent some time discussing the insertion of footnotes in order to clarify several portions of Jean-Christophe’s text. After Corentin and I found that the text was finally “accessible” (to anyone with a certain familiarity with Jean-Christophe’s survey here, say), it was decided that we should “celebrate” the occasion with a meeting around this matter.

In any case, one of the outcomes of the mini-conference is that Jean-Christophe’s text entitled Examples of Rauzy classes with footnotes by Corentin is finally publicly available here.

In a nutshell, the first part of Jean-Christophe’s text is devoted to the notions of height, bi-monotonous cycles, etc., allowing one to explore a given Rauzy diagram starting from a certain subgraph whose vertices consist of the so-called standard permutations; then, the second part of Jean-Christophe’s text is a sort of “proof of concept” where several Rauzy diagrams are described (including some containing several thousands of vertices). Here, it is worth noting that he did the corresponding calculations by hand (mostly during winter vacations at Loctudy as he told me)! Corentin wrote a few Sage programs to double check some of these calculations and, as expected, they turned out to be correct.

Closing this short post, let me try to explain below some of Jean-Christophe’s motivations to get a systematic description of Rauzy diagrams.

First of all, recall that the study of the dynamics of interval exchange transformations and translation flows often relies on a renormalization scheme (“continued fraction algorithm”) called Rauzy–Veech induction: for a detailed exposition of this topic, the reader can consult Yoccoz’s survey here.

Roughly speaking, the Rauzy–Veech induction serves to encode the renormalization of interval exchange transformations and translation flows via topological Markov shifts induced by Rauzy diagrams: more concretely, a Rauzy diagram is a special type of oriented graph {\mathcal{D}} and the dynamics of the renormalization procedure is described by the topological Markov shift consisting of the shift dynamics {(e_n)_{n\in\mathbb{Z}}\mapsto (e_{n+1})_{n\in\mathbb{Z}}} on the space of bi-infinite paths (i.e., concatenations of edges) on {\mathcal{D}}.

In general, Rauzy diagrams are defined as follows. We take an abstract finite alphabet {\mathcal{A}} on {d\geq 2} letters. A permutation {\pi=(\pi_t, \pi_b)} is a pair of bijections {\pi_t,\pi_b:\mathcal{A}\rightarrow\{1,\dots,d\}} (normally we would like to say that {\pi_b\circ\pi_t^{-1}} is a permutation of {\{1,\dots, d\}}, but the data {\pi=(\pi_t,\pi_b)} provides a more “symmetric” way to describe permutations). In the literature, {\pi} is often denoted as a list of the form

\displaystyle \pi=\left(\begin{array}{ccc} \pi_t^{-1}(1) & \dots & \pi_t^{-1}(d) \\ \pi_b^{-1}(1) & \dots & \pi_b^{-1}(d) \end{array}\right)

and the first, resp. last letter of the top and bottom rows are denoted {_{t}\alpha=\pi_t^{-1}(1)} and {_{b}\alpha = \pi_b^{-1}(1)}, resp. {\alpha_{t}=\pi_t^{-1}(d)} and {\alpha_{b}= \pi_b^{-1}(d)}.

The top operation {\mathcal{R}_t} maps a permutation {\pi=(\pi_t,\pi_b)} to {\mathcal{R}_t(\pi)=(\pi_t,\pi_b')} where {\pi_b'} is obtained from {\pi_b} by performing a cyclic permutation of the letters appearing after {\alpha_t} on the bottom row of {\pi}. Similarly, one can define the bottom operation {\mathcal{R}_b} by symmetry (i.e., essentially by exchanging the roles of top and bottom rows). In this setting, a Rauzy diagram {\mathcal{D}} is the oriented graph whose vertices correspond to the orbit of a given permutation {\pi} under the top and bottom operations, and whose oriented edges have the form {\kappa\rightarrow\mathcal{R}_t(\kappa)} and {\kappa\rightarrow\mathcal{R}_b(\kappa)}.

Exercise 1 Draw the three Rauzy diagrams associated to the following three permutations: {\left(\begin{array}{cc} A & B \\ B & A \end{array}\right)}, {\left(\begin{array}{ccc} A & B & C \\ C & B & A \end{array}\right)}, {\left(\begin{array}{cccc} A & B & C & D \\ D & C & B & A \end{array}\right)}.
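For readers who want to check their drawings in Exercise 1, here is a minimal Python sketch enumerating the vertices and edges of a Rauzy diagram. The function names top_op, bottom_op and rauzy_diagram are ours. Moreover, since the text above only speaks of “a cyclic permutation”, the code assumes the usual convention that the last letter of the losing row is moved to the position right after the last letter of the winning row; it also assumes that the starting permutation is irreducible (so that {\alpha_t\neq\alpha_b} at every vertex).

```python
from collections import deque


def top_op(pi):
    """Top operation: move the last bottom letter right after alpha_t in the bottom row."""
    top, bot = pi
    if top[-1] == bot[-1]:
        raise ValueError("alpha_t == alpha_b: the operations are not defined")
    i = bot.index(top[-1])
    return (top, bot[: i + 1] + (bot[-1],) + bot[i + 1 : -1])


def bottom_op(pi):
    """Bottom operation: same rule with the roles of the two rows exchanged."""
    top, bot = pi
    if top[-1] == bot[-1]:
        raise ValueError("alpha_t == alpha_b: the operations are not defined")
    j = top.index(bot[-1])
    return (top[: j + 1] + (top[-1],) + top[j + 1 : -1], bot)


def rauzy_diagram(start):
    """Breadth-first exploration; returns {vertex: {'t': R_t(vertex), 'b': R_b(vertex)}}."""
    start = (tuple(start[0]), tuple(start[1]))
    edges = {}
    queue = deque([start])
    while queue:
        pi = queue.popleft()
        if pi in edges:
            continue
        edges[pi] = {"t": top_op(pi), "b": bottom_op(pi)}
        queue.extend(edges[pi].values())
    return edges


for rows in (("AB", "BA"), ("ABC", "CBA"), ("ABCD", "DCBA")):
    diagram = rauzy_diagram(rows)
    print(rows, "->", len(diagram), "vertices")
```

Under these conventions, the three diagrams of Exercise 1 have 1, 3 and 7 vertices respectively; the dictionary returned by rauzy_diagram records the two arrows leaving each vertex.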

Among many other results in this topic, our recent work with Avila and Yoccoz on the partial solution of the so-called Zorich conjecture (previously discussed in this post here) relies upon the precise knowledge of the geometry of Rauzy diagrams.

For this reason, right after partially solving Zorich’s conjecture, Jean-Christophe started his detailed study of arbitrary Rauzy diagrams in the hope of solving Zorich’s conjecture in full generality.

As it turns out, Zorich’s conjecture was recently solved in full generality by Rodolfo Gutiérrez-Romo while bypassing many fine aspects of Rauzy diagrams (see the original article here and/or this post here), but it is clear that Jean-Christophe’s text on Rauzy diagrams will pave the way for further applications of these fascinating combinatorial objects.

Posted by: matheuscmss | January 3, 2019

Romain Dujardin’s Bourbaki seminar talk 2018

Last October, Romain Dujardin gave a nice talk at the Bourbaki seminar about the equidistribution of Fekete points, pluripotential theory and the works of Robert Berman, Sébastien Boucksom and David Witt Nyström (including this article here). The video of Dujardin’s talk (in French) is available here and the corresponding lecture notes (also in French) are available here.

In the sequel, I will transcribe my notes for Dujardin’s talk (while referring to his text for all omitted details). In particular, we will follow his path, that is, we will describe how a question related to polynomial interpolation was solved by complex geometry methods, but we will not discuss the relationship of the material below with point processes.

Remark 1 As usual, any errors/mistakes in this post are my sole responsibility.

1. Polynomial interpolation and logarithmic potential theory in one complex variable

1.1. Polynomial interpolation

Let {\mathcal{P}_k(\mathbb{C})} be the vector space of polynomials of degree {\leq k} in one complex variable. By definition, {\dim\mathcal{P}_k(\mathbb{C})=k+1}.

The classical polynomial interpolation problem can be stated as follows: given {k+1} points {z_0,\dots, z_k} on {\mathbb{C}}, can we find a polynomial with prescribed values at the {z_j}’s? In other terms, can we invert the evaluation map {ev(z_0,\dots,z_k):\mathcal{P}_k(\mathbb{C})\rightarrow \mathbb{C}^{k+1}}, {P\mapsto (P(z_0),\dots, P(z_k))} ?

The solution to this old question is well-known: in particular, the problem can be explicitly solved (whenever the points are distinct).

What about the effectiveness and/or numerical stability of these solutions? It is also well-known that they might be “unstable” in many aspects: for instance, the inverse of {ev(z_0,\dots, z_k)} starts to behave badly when some of the points {z_0,\dots, z_k} get close together, a small error on the values {P(z_0),\dots, P(z_k)} might lead to huge errors in the polynomial {P}, Runge’s phenomenon shows that certain interpolations at equidistant points in {[-1,1]} are highly oscillating, etc.

This motivates the following question: are there “optimal” choices for the points (leading to “minimal instabilities” in the solution of the interpolation problem)?

This vague question can be formalized in several ways. For instance, the interpolation problem turns out to be a linear algebra question asking to invert an appropriate Vandermonde matrix {V(z_0,\dots, z_k)} and, a fortiori, the calculations will eventually oblige us to divide by an adequate determinant {\det V(z_0,\dots, z_k)}. Hence, if we denote by {e_i(z)=z^i}, {i=0, \dots, k}, the basis of monomials of {\mathcal{P}_k(\mathbb{C})}, then we can say that an optimal configuration maximizes the modulus of Vandermonde’s determinant

\displaystyle \det V(z_0,\dots, z_k) = \det(e_i(z_j))_{0\leq i, j\leq k} = \prod\limits_{0\leq i<j\leq k} (z_j-z_i).

Of course, this optimization problem has a trivial solution if we do not impose constraints on {z_0,\dots, z_k}. For this reason, we shall fix some compact subset {K\subset \mathbb{C}} and we will assume that {z_0,\dots, z_k\in K}.

Definition 1 A Fekete configuration {(z_0,\dots,z_k)\in K^{k+1}} is a maximum of

\displaystyle K^{k+1}\ni(w_0,\dots, w_k)\mapsto \prod\limits_{0\leq i<j\leq k} |w_j-w_i|

Definition 2 The {(k+1)}-diameter of {K} is

\displaystyle d_{k+1}(K) := \prod\limits_{0\leq i<j\leq k} |z_j-z_i|^{2/k(k+1)}

where {(z_0,\dots, z_k)} is a Fekete configuration.

It is not hard to see that {d_{k+1}(K)\leq d_k(K)}, i.e., {d_k(K)} is a decreasing sequence.

Definition 3 {d_{\infty}(K)=\lim\limits_{k\rightarrow\infty} d_k(K)} is the transfinite diameter.

The transfinite diameter is related to the logarithmic potential of {K}.
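As a quick numerical illustration of these definitions (which is not part of Dujardin’s text), the following Python sketch compares the modulus of Vandermonde’s determinant for two explicit configurations in {K=[-1,1]}: equidistant points and the Chebyshev extrema {\cos(j\pi/k)}. The function names are ours, and neither configuration is claimed to be a Fekete configuration: by Definition 1, the numbers below are only lower bounds for the corresponding maximum.

```python
from math import cos, exp, log, pi


def log_vandermonde(points):
    """Logarithm of the product of |z_j - z_i| over all pairs i < j."""
    total = 0.0
    for j in range(len(points)):
        for i in range(j):
            total += log(abs(points[j] - points[i]))
    return total


def normalized_product(points):
    """The product above raised to the power 2/(k(k+1)), as in Definition 2."""
    k = len(points) - 1
    return exp(2.0 * log_vandermonde(points) / (k * (k + 1)))


def equispaced(k):
    """k+1 equidistant points in [-1, 1]."""
    return [-1.0 + 2.0 * i / k for i in range(k + 1)]


def chebyshev_extrema(k):
    """The k+1 points cos(j*pi/k), j = 0, ..., k, in [-1, 1]."""
    return [cos(pi * j / k) for j in range(k + 1)]


for k in (3, 10, 20):
    print(k, normalized_product(equispaced(k)), normalized_product(chebyshev_extrema(k)))
```

In this experiment, the Chebyshev extrema give a larger Vandermonde determinant than the equidistant points (for instance, {9/8} versus {256/243} when {k=3}), in line with the instability of equidistant interpolation mentioned above.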

1.2. Logarithmic potential

In the one-dimensional setting, the equidistribution of Fekete configurations towards an equilibrium measure was established by Fekete and Szegö.

Theorem 4 (Fekete, Szegö) If {d_{\infty}(K)>0} and {F_k} is a sequence of Fekete configurations, then the sequence of probability measures

\displaystyle \frac{1}{k+1}\sum\limits_{z\in F_k} \delta_z =:\frac{1}{k+1}[F_k]

converges in the weak-* topology to the so-called equilibrium measure {\mu_K} of {K}.

Proof: Let us introduce the following “continuous” version of Fekete configurations. Given a measure {\mu} on {K}, its “energy” is

\displaystyle I(\mu) := \int\log|z-w| \, d\mu(z) \, d\mu(w),

so that if we forget about the “diagonal terms” {\log|z_i-z_i|}, then the energy of the normalized measure {\frac{1}{k+1}[F_k]} is {\frac{k}{k+1}\log d_{k+1}(K)}. Recall that the capacity of {K} is {\textrm{cap}(K)=\exp(V(K))} where {V(K) = \sup \{I(\mu): \mu\in\mathcal{M}(K)\}} (and {\mathcal{M}(K)} stands for the space of probability measures on {K}).

Theorem 5 (Frostman) Either {K} is polar, i.e., {I(\mu)=-\infty} for all {\mu\in\mathcal{M}(K)} or there is a unique {\mu_K\in\mathcal{M}(K)} with {I(\mu_K)=V(K)}.

We are not going to prove this result here. Nevertheless, let us mention that an important ingredient in the proof of Frostman’s theorem is the logarithmic potential {u_{\mu}(z)=\int\log|z-w|\,d\mu(w)} associated to {\mu\in\mathcal{M}(K)}: it is a subharmonic function whose (distributional) Laplacian is {\Delta u_{\mu} = 2\pi\mu}. A key feature of the logarithmic potential is the fact that if {I(\mu) = V(K)}, then {u_{\mu}(z)=V(K)} for {\mu}-almost every {z}: observe that this allows one to conclude the uniqueness of {\mu_K} because it would follow from {I(\mu)=V(K)=I(\nu)} that {u_{\mu}-u_{\nu}} is harmonic, “basically” zero on {K}, and {u_{\mu}-u_{\nu} = O(1)} near infinity.

Anyhow, it is not hard to deduce the equidistribution of Fekete configurations towards {\mu_K} from Frostman’s theorem. Indeed, let {\mu_k} be {\frac{1}{k+1}[F_k]} and consider the modified energy {\widetilde{I}(\mu_k) = \int_{z\neq w} \log|z-w|\,d\mu_k(z)\,d\mu_k(w) = \frac{k}{k+1}\log d_{k+1}(K)}. A straightforward calculation (cf. the proof of Théorème 1.1 in Dujardin’s text) reveals that if {\mu_{k_j}} is a converging subsequence, say {\mu_{k_j}\rightarrow\nu}, then {I(\nu)\geq \log d_{\infty}(K) \geq V(K)}. Since {V(K)} is the supremum of {I} over {\mathcal{M}(K)}, it follows that {I(\nu)=V(K)} and, hence, {\nu=\mu_K} by the uniqueness part of Frostman’s theorem. \Box

1.3. Two remarks

The capacity of {K} admits several equivalent definitions: for instance, the quantities

\displaystyle \tau_k(K)=\inf\{\|P\|_K:=\sup\limits_{z\in K}|P(z)|: P \textrm{ is a monic polynomial of degree } k\}

form a submultiplicative sequence (i.e., {\tau_{k+l}(K)\leq\tau_k(K)\tau_l(K)}) and the so-called Chebyshev constant

\displaystyle \tau_{\infty}(K) := \lim\limits_{n\rightarrow\infty}\tau_n(K)^{1/n}

coincides with {\textrm{cap}(K)}. In other terms, the capacity of {K} is the limit of certain geometrical quantities {\tau_k(K)} associated to a natural norm {\|.\|_K} on the spaces of polynomials {\mathcal{P}_k(\mathbb{C})}.
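To make this first remark more concrete, here is a worked example for {K=[-1,1]}. It only uses the classical extremal property of Chebyshev polynomials (a standard fact which is not discussed in Dujardin’s talk), namely that {2^{1-k}T_k} has the smallest sup norm on {[-1,1]} among monic polynomials of degree {k\geq 1}, where {T_k(\cos\theta)=\cos(k\theta)}.

```latex
% Worked example for K = [-1,1]; T_k is the k-th Chebyshev polynomial.
\tau_k([-1,1]) = \| 2^{1-k} T_k \|_{[-1,1]} = 2^{1-k} \quad (k \geq 1),
\qquad
\tau_{k+l}([-1,1]) = 2^{1-k-l} \leq 2^{2-k-l} = \tau_k([-1,1]) \, \tau_l([-1,1]),
\qquad
\tau_{\infty}([-1,1]) = \lim_{k\rightarrow\infty} 2^{(1-k)/k} = \frac{1}{2} = \textrm{cap}([-1,1]).
```

In particular, the submultiplicativity and the convergence of {\tau_k(K)^{1/k}} can be checked by hand in this case, and one recovers the classical fact that a segment of length {4} has capacity {1}.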

Also, it is interesting to consider the maximization problem for weighted versions

\displaystyle I_Q(\mu) = I(\mu)+\int Q\,d\mu

of the energy {I(\mu)} of measures.

As it turns out, these ideas play a role in the higher-dimensional context discussed below.

2. Pluripotential theory on {\mathbb{C}^n}

Denote by {\mathcal{P}_k(\mathbb{C}^n)} the space of polynomials of degree {\leq k} in {n} complex variables: it is a vector space of dimension {\binom{n+k}{k}:=N_k\sim k^n/n!} as {k\rightarrow \infty}.

Let {K} be a compact subset of {\mathbb{C}^n} and consider {z_1,\dots, z_{N_k}\in K}. Similarly to the case {n=1}, the interpolation problem of inverting the evaluation map {\mathcal{P}_k(\mathbb{C}^n)\ni P\mapsto (P(z_1),\dots, P(z_{N_k}))\in\mathbb{C}^{N_k}} involves the computation of the determinant {\det(e_i(z_j))} where {(e_i)} is the basis of monomials. Once again, we say that a collection {(z_1,\dots, z_{N_k})} of {N_k} points in {K} maximizing the quantity {|\det(e_i(z_j))|} is a Fekete configuration and the transfinite diameter of {K} is {d_{\infty}(K)=\limsup\limits_{k\rightarrow\infty} d_k(K)} where

\displaystyle d_k(K) = \max\limits_{(z_1,\dots, z_{N_k})\in K^{N_k}}|\det(e_i(z_j))|^{1/kN_k}

is the {k}-diameter of {K}.

Given the discussion of the previous section, it is natural to ask the following questions: do Fekete configurations equidistribute? what about the relation of the transfinite diameter and pluripotential theory?

A first difficulty in solving these questions comes from the fact that it is not easy to produce a “continuous” version of Fekete configurations via a natural concept of energy of measures having all properties of the quantity {I(\mu)} in the case {n=1}.

A second difficulty towards the questions above is the following: besides the issues coming from pairs of points which are too close together, our new interpolation problem has new sources of instability such as the case of a configuration of points lying in an algebraic curve. In particular, this hints that some techniques coming from complex geometry will help us here.

The next result provides an answer (comparable to Frostman’s theorem above) to the second question:

Theorem 6 (Zaharjuta (1975)) The limit of {(d_k(K))_{k\in\mathbb{N}}} exists. Moreover, if {K} is not pluripolar, then {d_{\infty}(K)>0}.

Here, we recall that a pluripolar set is defined in the context of pluripotential theory as follows. First, a function {u:\Omega\rightarrow[-\infty,+\infty)} on an open subset {\Omega\subset\mathbb{C}^n} is called plurisubharmonic (psh) whenever {u} is upper semicontinuous (usc) and {u|_C} is subharmonic for any {C\subset\Omega} holomorphic curve. Equivalently, if {u\not\equiv-\infty}, then {u} is psh when {u} is usc and the matrix of distributions {\left(\frac{\partial^2 u}{\partial z_j\partial\overline{z_k}}\right)} is positive semi-definite Hermitian, i.e., {dd^cu:=\frac{i}{\pi}\sum\frac{\partial^2 u}{\partial z_j\partial \overline{z_k}} dz_j\wedge d\overline{z_k}\geq 0}. Next, {E} is pluripolar whenever {E\subset \{u=-\infty\}} where {u\not\equiv-\infty} is a psh function.

An important fact in pluripotential theory is that the positive currents {dd^c u} can be multiplied: if {u_1,\dots, u_m} are bounded psh functions, then the exterior product {dd^c u_1\wedge\dots\wedge dd^c u_m} can be defined as a current. In particular, we can define the Monge-Ampère operator {MA(u)=(dd^c u)^n = \frac{n!}{\pi^n}\det\left(\frac{\partial^2 u}{\partial z_j\partial\overline{z_k}}\right)idz_1\wedge d\overline{z_1}\wedge\dots\wedge i dz_n\wedge d\overline{z_n}} on the space of bounded psh functions.

Note that {MA(u)} is a positive current of maximal degree, i.e., a positive measure. This allows us to define a candidate for the equilibrium measure in higher dimensions in the following way.

Let

\displaystyle \mathcal{L} = \{u \textrm{ psh on } \mathbb{C}^n \textrm{ with } u(z)\leq \log|z|+O(1) \textrm{ as }z\rightarrow\infty\}

be the so-called Lelong class of psh functions. Given a compact subset {K\subset\mathbb{C}^n}, let

\displaystyle V_K(z) = \sup\{u(z): u\in\mathcal{L}, u|_K\leq 0\}.

Observe that {V_K(z)} is a natural object: for instance, it differs only by an additive constant from the logarithmic potential of the equilibrium measure of {K} when {n=1}. Indeed, this follows from the key property of the logarithmic potential (“{I(\mu)=V(K)} implies {u_{\mu}(z)=V(K)} for {\mu}-almost every {z}”) mentioned earlier and the fact that {V_K} is a subharmonic function which essentially vanishes on {K}.

In general, {V_K(z)} is not psh. So, let us consider the psh function given by its usc regularization {V_K^*(z):=\inf\{u(z): u \textrm{ usc, } u\geq V_K\}}. Note that {V_K^*\geq V_K\geq 0}, so that we can define the equilibrium measure of {K} as

\displaystyle \mu_K:=(d d^c V_K^*)^n.

It is worth pointing out that {V_K} can be recovered from the study of polynomials (in a similar way to our discussion of the Chebyshev constant in the previous section). More concretely, we have {\frac{1}{k}\log |P|\in\mathcal{L}} for all {P\in\mathcal{P}_k(\mathbb{C}^n)} and a result of Siciak ensures that

\displaystyle V_K(z)=\sup\left\{\frac{1}{k}\log |P(z)|: k\in\mathbb{N}, P\in\mathcal{P}_k(\mathbb{C}^n), |P|\leq 1 \textrm{ on } K\right\}

Finally, note that this discussion admits a weighted version where the usual Euclidean norm {|.|} on {\mathbb{C}^n} is replaced by {|.| \exp(-Q)}, the determinant {|\det (e_i(z_j))|} is replaced by {|\det (e_i(z_j))| \exp(-Q(z_1))\dots \exp(-Q(z_{N_k}))}, etc.


Posted by: matheuscmss | December 30, 2018

Continued fractions, binary quadratic forms and Markov spectrum

In Number Theory, the study of the so-called Markov spectrum often relies upon the following classical fact relating continued fractions and the values of real indefinite binary quadratic forms at integral points:

Theorem 1 The sets

\displaystyle M_1:=\left\{\frac{\sqrt{b^2-4ac}}{\inf\limits_{(x,y)\in\mathbb{Z}^2\setminus\{(0,0)\}}|ax^2+bxy+cy^2|}<\infty: a,b,c\in\mathbb{R}, b^2-4ac>0\right\}

and

\displaystyle M_2:=\left\{\sup\limits_{n\in\mathbb{Z}}([g_n; g_{n+1},\dots]+[0;g_{n-1},g_{n-2},\dots])<\infty: (g_m)_{m\in\mathbb{Z}}\in(\mathbb{N}^*)^{\mathbb{Z}}\right\}

coincide. (Here, {[t_0;t_1,\dots] := t_0+\frac{1}{t_1+\frac{1}{\ddots}}} stands for continued fraction expansions.)
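In order to experiment with the set {M_2}, one can approximate the quantity {\sup_{n}([g_n; g_{n+1},\dots]+[0;g_{n-1},g_{n-2},\dots])} for periodic sequences by truncating the continued fractions. The Python sketch below does exactly this; the function names cf_value and markov_value, the restriction to periodic sequences and the truncation depth are our own assumptions made for the sake of illustration.

```python
from math import sqrt


def cf_value(digits):
    """Value of the finite continued fraction [t0; t1, ..., tm]."""
    value = 0.0
    for t in reversed(digits[1:]):
        value = 1.0 / (t + value)
    return digits[0] + value


def markov_value(period, depth=60):
    """Approximate sup_n ([g_n; g_{n+1}, ...] + [0; g_{n-1}, g_{n-2}, ...])
    for the bi-infinite periodic sequence g_m = period[m mod len(period)],
    truncating both continued fractions after `depth` terms."""
    p = len(period)

    def g(m):
        return period[m % p]

    best = 0.0
    for n in range(p):  # by periodicity, it suffices to test one period
        forward = cf_value([g(n + j) for j in range(depth)])
        backward = cf_value([0] + [g(n - j) for j in range(1, depth)])
        best = max(best, forward + backward)
    return best


print(markov_value([1]), sqrt(5))  # constant sequence of 1's
print(markov_value([2]), sqrt(8))  # constant sequence of 2's
```

For the constant sequences of {1}’s and {2}’s, one finds approximations of {\sqrt{5}} and {\sqrt{8}}, since {[1;1,1,\dots]+[0;1,1,\dots]=\sqrt{5}} and {[2;2,2,\dots]+[0;2,2,\dots]=\sqrt{8}}.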

Remark 1 It is worth noting that the set {M_1} concerns only the real binary quadratic forms {q(x,y)=ax^2+bxy+cy^2} with {b^2-4ac>0} such that the quantity

\displaystyle \frac{\sqrt{b^2-4ac}}{\inf\limits_{(x,y)\in\mathbb{Z}^2\setminus\{(0,0)\}}|ax^2+bxy+cy^2|}

is finite. In particular, we are excluding real binary quadratic forms {q} with {0\in q(\mathbb{Z}^2\setminus\{(0,0)\})}.

In this post, we follow the books of Dickson and Cusick–Flahive in order to give a proof of this result via the classical reduction theory of binary quadratic forms.

1. Generalities on binary quadratic forms

A real binary quadratic form is {q(x,y)=ax^2+bxy+cy^2} with {a,b,c\in\mathbb{R}}. From the point of view of linear algebra, a binary quadratic form is

\displaystyle q(x,y) = ax^2+bxy+cy^2 =\langle Mv,v\rangle = v^t M v \ \ \ \ \ (1)

where {\langle .,.\rangle} is the usual Euclidean inner product of {\mathbb{R}^2}, {v} is the column vector {v=\left(\begin{array}{c} x \\ y \end{array}\right)}, and {M} is the matrix {M=\left(\begin{array}{cc} a & b/2 \\ b/2 & c \end{array}\right)}.

The discriminant of {q(x,y)=ax^2+bxy+cy^2 = \langle Mv, v\rangle} is {d:=b^2-4ac = -4\det(M)}.

Remark 2 The values taken by {q(x,y)=bxy} are very easy to describe: in particular, {0\in q(\mathbb{Z}^2\setminus\{(0,0)\})} in this context. Hence, by Remark 1, we can focus on {q(x,y)=ax^2+bxy+cy^2} with {a\neq 0} or {c\neq0}. Moreover, by symmetry (i.e., exchanging the roles of the variables {x} and {y}), we can assume that {a\neq0}. So, from now on, we shall assume that {a\neq0}.

Note that {4aq(x,y)=(2ax+by)^2-dy^2}. Therefore, {q} is definite (i.e., its values are all positive or all negative) whenever the discriminant is {d<0} (when {d=0}, {q} is only semi-definite). Conversely, when {a\neq 0}, {q} is indefinite (i.e., it takes both positive and negative values) whenever {d>0}. In view of the definition of {M_1} in the statement of Theorem 1 above, we will restrict ourselves from now on to the indefinite case {d>0}.

Observe also that {q(x,y)=a\prod\limits_{a\omega^2+b\omega+c=0}(x-\omega y) = a(x-fy)(x-sy)} where

\displaystyle f:=\frac{\sqrt{d}-b}{2a} \quad \textrm{and} \quad s := \frac{-\sqrt{d}-b}{2a} \ \ \ \ \ (2)

are the first and second roots of {a\omega^2+b\omega+c=0}. In particular, {0\in q(\mathbb{Z}^2\setminus\{(0,0)\})} when {f\in\mathbb{Q}} or {s\in\mathbb{Q}}. So, by Remark 1, we will suppose from now on that {f, s\notin\mathbb{Q}}.

In summary, from now on, our standing assumptions on {q(x,y)=ax^2+bxy+cy^2} are {a\neq0}, {d:=b^2-4ac>0} and {f:=\frac{\sqrt{d}-b}{2a}\notin\mathbb{Q}}, {s:= \frac{-\sqrt{d}-b}{2a}\notin\mathbb{Q}}.

Remark 3 The first and second roots {f}, {s} and the discriminant {d>0} determine the coefficients {a}, {b}, {c} of the binary quadratic form {q(x,y)}. Indeed, the formula {q(x,y)=a(x-fy)(x-sy)} says that it suffices to determine {a}. On one hand, the modulus {|a|} is given by {4a^2(f-s)^2 = d}. On the other hand, the fact that {f} is the 1st root and {s} is the 2nd root determines the sign of {a}: if we replace {a} by {-a} in the formula {q(x,y)=a(x-fy)(x-sy)}, then we obtain the binary quadratic form {-q} whose 1st root is {s} and 2nd root is {f}; hence, an ambiguity on the sign of {a} could only occur when {f=s}, i.e., {d=0}, in contradiction with our assumption {d>0}.

2. Action of {GL_2(\mathbb{R})} on binary quadratic forms

A key idea (going back to Lagrange) to study the values of a fixed binary quadratic form {q(x,y)=ax^2+bxy+cy^2} is to investigate the equivalent problem of describing the values of a family of binary quadratic forms on a fixed vector.

More precisely, if a vector {v=\left(\begin{array}{c} x \\ y \end{array}\right)=T(V)} is obtained from a fixed vector {V=\left(\begin{array}{c} X \\ Y \end{array}\right)} via a matrix {T=\left(\begin{array}{cc} \alpha & \beta \\ \gamma & \delta \end{array}\right)\in GL_2(\mathbb{R})}, i.e., {x=\alpha X + \beta Y}, {y=\gamma X+ \delta Y}, then the value of {q} at {v} equals the value of {q\circ T} at {V}. By (1), the binary quadratic forms {q} and {q\circ T} are related by:

\displaystyle q\circ T (V) = (T(V))^t\cdot M \cdot T(V) = V^t (T^t M T) V \ \ \ \ \ (3)

where {M=\left(\begin{array}{cc} a & b/2 \\ b/2 & c \end{array}\right)}. In other terms, {q\circ T(X,Y)=AX^2+BXY+CY^2} where

\displaystyle \left(\begin{array}{cc} A & B/2 \\ B/2 & C \end{array}\right) = T^t M T = \left(\begin{array}{cc} \alpha & \gamma \\ \beta & \delta \end{array}\right)\left(\begin{array}{cc} a & b/2 \\ b/2 & c \end{array}\right)\left(\begin{array}{cc} \alpha & \beta \\ \gamma & \delta \end{array}\right), \ \ \ \ \ (4)

i.e., {A=a\alpha^2+b\alpha\gamma+c\gamma^2}, {B=2a\alpha\beta+b(\alpha\delta+\beta\gamma)+ 2c\gamma\delta}, and {C=a\beta^2+b\beta\delta+c\delta^2}.

Observe that (3) implies that the discriminant of {q\circ T} is {\det(T)^2\cdot d}. Thus, {q\circ T} and {q} have the same discriminant whenever {\det(T)=\pm1}.
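The formulas in (4) and the behaviour of the discriminant are easy to test on examples. Here is a minimal Python sketch doing so with integer coefficients; the names compose, discriminant and evaluate are ours, and a form {ax^2+bxy+cy^2} is encoded (by assumption) as the triple (a, b, c) while {T} is encoded as a pair of rows.

```python
def compose(form, T):
    """Coefficients (A, B, C) of q o T, following formula (4)."""
    a, b, c = form
    (alpha, beta), (gamma, delta) = T
    A = a * alpha**2 + b * alpha * gamma + c * gamma**2
    B = 2 * a * alpha * beta + b * (alpha * delta + beta * gamma) + 2 * c * gamma * delta
    C = a * beta**2 + b * beta * delta + c * delta**2
    return (A, B, C)


def discriminant(form):
    """Discriminant b^2 - 4ac of the form (a, b, c)."""
    a, b, c = form
    return b * b - 4 * a * c


def evaluate(form, x, y):
    """Value of the form at the point (x, y)."""
    a, b, c = form
    return a * x * x + b * x * y + c * y * y


q = (1, 1, -1)          # x^2 + xy - y^2, of discriminant 5
T = ((2, 1), (1, 1))    # an element of SL_2(Z)
Q = compose(q, T)
print(Q, discriminant(q), discriminant(Q))
# The value of q o T at V = (3, -2) equals the value of q at T(V):
X, Y = 3, -2
print(evaluate(Q, X, Y), evaluate(q, 2 * X + 1 * Y, 1 * X + 1 * Y))
```

With these choices, {q\circ T(X,Y)=5X^2+5XY+Y^2} has the same discriminant {5} as {q}, whereas replacing {T} by a matrix of determinant {3} would multiply the discriminant by {9}.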

2.1. Action of {SL_2(\mathbb{Z})} on binary quadratic forms

The quantity {\inf\{|z|: z\in q(\mathbb{Z}^2\setminus\{(0,0)\})\}} appearing in the definition of {M_1} in the statement of Theorem 1 leads us to restrict our attention to the action of {SL_2(\mathbb{Z})} on {q(x,y)=ax^2+bxy+cy^2}. In fact, since {q(\lambda x,\lambda y) = \lambda^2 q(x,y)}, we have that

\displaystyle \inf\{|z|: z\in q(\mathbb{Z}^2\setminus\{(0,0)\})\} = \inf\{|z|: z\in q(\mathbb{Z}^2_{prim})\} \ \ \ \ \ (5)

where {\mathbb{Z}^2_{prim}:=\{(p,q)\in\mathbb{Z}^2: gcd(p,q)=1\}} is the set of primitive vectors of {\mathbb{Z}^2}. So, it is natural to concentrate on {SL_2(\mathbb{Z})} because it acts transitively on {\mathbb{Z}^2_{prim}}.

Definition 2 Two binary quadratic forms {q} and {Q} are equivalent whenever {Q=q\circ T} for some {T\in SL_2(\mathbb{Z})}.

Note that two equivalent binary quadratic forms have the same discriminant. Furthermore, if {(x-\omega y)} is a factor of {q(x,y)=ax^2+bxy+cy^2} and {x=\alpha X+\beta Y}, {y=\gamma X+\delta Y} with {T=\left(\begin{array}{cc}\alpha & \beta\\ \gamma&\delta\end{array}\right)\in SL_2(\mathbb{Z})}, then {(\alpha X+\beta Y - \omega(\gamma X+\delta Y))} is a factor of {q\circ T(X,Y)=AX^2+BXY+CY^2} and, a fortiori, {(X-\frac{-\beta+\omega\delta}{\alpha-\omega\gamma}Y)} is a factor of {q\circ T}. In particular, the roots of {A\Omega^2+B\Omega+C=0} are related to the roots of {a\omega^2+b\omega+c=0} via

\displaystyle \Omega=\frac{-\beta+\omega\delta}{\alpha-\omega\gamma} \ \ \ \ \ (6)

In other words, {\omega} and {\Omega} are related to each other via the action of {T^{-1} = \left(\begin{array}{cc} \delta & -\beta \\ -\gamma & \alpha \end{array}\right)} by Möbius transformations. Actually, a direct calculation with the formulas for {A} and {B} in (4) and the fact that {\alpha\delta-\beta\gamma=1} show that (6) respects the order of the roots, i.e.,

\displaystyle F=\frac{-\beta+f\delta}{\alpha-f\gamma} \quad \textrm{and} \quad S=\frac{-\beta+s\delta}{\alpha-s\gamma} \ \ \ \ \ (7)

where {f} and {F} are the first roots, and {s} and {S} are the 2nd roots.

Remark 4 Note that {\alpha-\omega\gamma\neq 0} because {(\alpha, \gamma)\in\mathbb{Z}^2_{prim}} and we are assuming that the roots of {a\omega^2+b\omega+c=0} are irrational.

Lemma 3 Under our standing assumptions, any binary quadratic form {q} is equivalent to some {ax^2+bxy+cy^2} with {|b|\leq |a|\leq \sqrt{d/3}}.

Proof: We will proceed in two steps: first, we apply certain elements of {SL_2(\mathbb{Z})} to ensure that {|a|\leq \sqrt{d/3}}, and, after that, we use a parabolic matrix to obtain {|b|\leq |a|}.

By (4), the matrix {H_0=\left(\begin{array}{cc} h_0 & 1 \\ -1 & 0 \end{array}\right)\in SL_2(\mathbb{Z})} converts {q(x,y)=a_0x^2+b_0xy+c_0y^2} into {q\circ H_0(X,Y) = a_1X^2+b_1XY+a_0Y^2} with {b_1=2a_0h_0-b_0}.

If {|a_0|>\sqrt{d/3}}, the choice of {h_0\in\mathbb{Z}} with {|2a_0h_0-b_0|\leq |a_0|} in the discussion above leads to {q\circ H_0(X,Y)=a_1X^2+b_1XY+a_0Y^2} with

\displaystyle 4a_1a_0 = b_1^2-d\leq b_1^2 = (2a_0h_0-b_0)^2\leq a_0^2 \quad \textrm{and} \quad -4a_1a_0 = d-b_1^2\leq d<3a_0^2.

In other terms, if {|a_0|>\sqrt{d/3}}, then {q} is equivalent to {a_1X^2+b_1XY+a_0Y^2} with {|4a_1a_0|<3a_0^2}, i.e., {|a_1|<(3/4)|a_0|}.

By iterating this process, we find a sequence {a_nX^2+b_nXY+a_{n-1}Y^2} of binary quadratic forms equivalent to {q} such that {|a_n|<(3/4)^n |a_0|} whenever {|a_{n-1}|>\sqrt{d/3}}. It follows that {q} is equivalent to some {ax^2+\widetilde{b}xy+\widetilde{c}y^2} with {|a|\leq \sqrt{d/3}}.

Finally, by (4), the matrix {P=\left(\begin{array}{cc} 1 & k \\ 0 & 1 \end{array}\right)\in SL_2(\mathbb{Z})} converts {ax^2+\widetilde{b}xy+\widetilde{c}y^2} into {aX^2+bXY+cY^2} with {b=\widetilde{b}+2ka}. Hence, the choice of {k\in\mathbb{Z}} with {|\widetilde{b}+2ka|\leq |a|} gives that {q} is equivalent to {ax^2+bxy+cy^2} with {|b|\leq |a|\leq \sqrt{d/3}}. \Box
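The two steps of this proof are completely algorithmic. The Python sketch below follows them literally for forms with integer coefficients whose discriminant {d>0} is not a perfect square (an extra assumption of ours, ensuring that the leading coefficient never vanishes along the way); the function name lemma3_normalize is hypothetical, and the integers {h_0} and {k} are chosen by rounding to a nearest integer.

```python
from fractions import Fraction


def lemma3_normalize(a, b, c):
    """Return a form equivalent to ax^2 + bxy + cy^2 with |b| <= |a| <= sqrt(d/3).

    Assumes integer coefficients and a non-square discriminant d > 0."""
    d = b * b - 4 * a * c
    # Step 1: while |a| > sqrt(d/3), apply H_0 = [[h, 1], [-1, 0]] with
    # |2ah - b| <= |a|; the new form is a_1 X^2 + (2ah - b) XY + a Y^2.
    while 3 * a * a > d:
        h = round(Fraction(b, 2 * a))
        a, b, c = a * h * h - b * h + c, 2 * a * h - b, a
    # Step 2: apply P = [[1, k], [0, 1]] with |b + 2ka| <= |a|.
    k = round(Fraction(-b, 2 * a))
    a, b, c = a, b + 2 * k * a, a * k * k + b * k + c
    return a, b, c


print(lemma3_normalize(5, 11, 5))    # discriminant 21
print(lemma3_normalize(13, 41, 31))  # discriminant 69
```

For instance, {5x^2+11xy+5y^2} (with {d=21}) is converted into {-X^2-XY+5Y^2} after a single application of {H_0} with {h_0=1}.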

2.2. Reduction theory (I)

In general, the study of the dynamics of the action of a group {G} on a certain space {\mathcal{M}} is greatly improved in the presence of a nice fundamental domain, i.e., a portion {\mathcal{D}\subset \mathcal{M}} with good geometrical properties capturing all orbits in the sense that the {G}-orbit of any {x\in \mathcal{M}} intersects {\mathcal{D}}.

In the setting of {SL_2(\mathbb{Z})} acting on binary quadratic forms, the role of fundamental domain is played by the notion of reduced binary quadratic form:

Definition 4 We say that {q(x,y)=ax^2+bxy+cy^2} is reduced whenever

\displaystyle \frac{|\sqrt{d}-b|}{2|a|}=|f|<1, \quad \frac{|\sqrt{d}+b|}{2|a|}=|s|>1 \quad \textrm{and} \quad f\cdot s<0

Remark 5 Suppose that {q} is reduced. Then, {d-b^2=(\sqrt{d}-b)(\sqrt{d}+b)=-4a^2fs>0} thanks to the condition {f\cdot s<0}. In particular, {|b|<\sqrt{d}}, so that {\sqrt{d}- b, \sqrt{d}+b > 0}. Hence, {0 < \sqrt{d}-b < 2|a| < \sqrt{d}+b} thanks to the condition {|f|<1<|s|}. Note that the inequality {\sqrt{d}-b<\sqrt{d}+b} says that {b>0}. In other words, {q} reduced implies ({0<b<\sqrt{d}} and) {0<\sqrt{d}-b<2|a|< \sqrt{d}+b}. Conversely, the inequality {0<\sqrt{d}-b<2|a|< \sqrt{d}+b} implies that ({0<b<\sqrt{d}} and) {q} is reduced.

Therefore, {q} is reduced if and only if

\displaystyle 0<\sqrt{d}-b<2|a|< \sqrt{d}+b

Furthermore, the identity {|\sqrt{d}-b|\cdot|\sqrt{d}+b|=4|ac|} implies that {q} is reduced if and only if

\displaystyle 0<\sqrt{d}-b<2|c|< \sqrt{d}+b

In particular, {q} is reduced if and only if {q\circ R} is reduced where {R\in GL_2(\mathbb{Z})} is the matrix {R=\left(\begin{array}{cc} 0&1\\ 1 & 0\end{array}\right)} inducing the change of variables {x=Y}, {y=X}. Moreover, if {q} is reduced, then {f\cdot a = (\sqrt{d}-b)/2>0} and {c/a = f\cdot s<0}, i.e., the sign of {c} is opposite to the signs of {f} and {a}.

The next result says that reduced forms are a fundamental domain for the {SL_2(\mathbb{Z})} action on binary quadratic forms:

Theorem 5 Under our standing assumptions, any binary quadratic form {q} is equivalent to some reduced binary quadratic form.

Proof: By Lemma 3, we can assume that {q(x,y)=ax^2+bxy+cy^2} with {|b|\leq |a|\leq \sqrt{d/3}\leq\sqrt{d}}, so that {|4ac|=d-b^2\leq d}. Thus, {\min\{2|a|, 2|c|\}\leq\sqrt{d}}. By performing the change of variables {x=Y}, {y=-X} (i.e., applying the matrix {\left(\begin{array}{cc} 0&1 \\ -1&0\end{array}\right)\in SL_2(\mathbb{Z})}) if necessary, we can suppose that {2|a|\leq \sqrt{d}}.

Since we are assuming that {a\neq 0}, it is possible to choose {k\in\mathbb{Z}} such that {0\leq\sqrt{d}-2|a|\leq \widetilde{b}:=b+2ak\leq \sqrt{d}}. So, if we apply the matrix {\left(\begin{array}{cc} 1&k \\ 0&1\end{array}\right)\in SL_2(\mathbb{Z})} (i.e., {x=X+kY}, {y=Y}) to {q}, then we get the equivalent binary quadratic form {ax^2+\widetilde{b}xy+\widetilde{c}y^2} with {0\leq\sqrt{d}-\widetilde{b}\leq 2|a|\leq \sqrt{d}\leq \sqrt{d}+\widetilde{b}}.

We claim that {ax^2+\widetilde{b}xy+\widetilde{c}y^2} is reduced. In fact, by Remark 5, our task is to show that we have strict inequalities

\displaystyle 0<\sqrt{d}-\widetilde{b}< 2|a|< \sqrt{d}+\widetilde{b}

However, {0=\sqrt{d}-\widetilde{b}} or {\sqrt{d}-\widetilde{b}=2|a|} or {2|a|=\sqrt{d}+\widetilde{b}} would imply that {0} or {\pm1} is a root of {a\omega^2+\widetilde{b}\omega+\widetilde{c}=0}, a contradiction with our assumption that none of its roots is rational. Hence, {ax^2+\widetilde{b}xy+\widetilde{c}y^2} is reduced. \Box
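The two steps of this proof (an optional swap followed by a shift of {b}) are easy to turn into a procedure. Below is a minimal Python sketch under explicit assumptions: the input already satisfies {|b|\leq\sqrt{d}} (as provided by Lemma 3), and {d=b^2-4ac} is a positive non-square integer. The function names `is_reduced` and `reduce_form` are ours.

```python
from math import isqrt

def is_reduced(a, b, c):
    """Exact test of 0 < sqrt(d) - b < 2|a| < sqrt(d) + b (d > 0 non-square)."""
    r = isqrt(b * b - 4 * a * c)
    return b <= r < 2 * abs(a) + b and 2 * abs(a) - b <= r

def reduce_form(a, b, c):
    """Return coefficients of a reduced form equivalent to ax^2 + bxy + cy^2.

    Assumptions (not checked beyond the guard below): d = b^2 - 4ac is a
    positive non-square integer and |b| <= sqrt(d).
    """
    d = b * b - 4 * a * c
    if d <= 0 or isqrt(d) ** 2 == d or abs(b) > isqrt(d):
        raise ValueError("need d > 0 non-square and |b| <= sqrt(d)")
    r = isqrt(d)
    # Step 1: since min(2|a|, 2|c|) <= sqrt(d), the change of variables
    # x = Y, y = -X (which maps (a, b, c) to (c, -b, a)) ensures 2|a| <= sqrt(d).
    if 2 * abs(a) > r:
        a, b, c = c, -b, a
    # Step 2: pick the unique b_new = b + 2ak with sqrt(d) - 2|a| < b_new < sqrt(d).
    m = 2 * abs(a)
    b_new = r - ((r - b) % m)
    k = (b_new - b) // (2 * a)      # exact division by construction
    c_new = a * k * k + b * k + c   # coefficient of Y^2 after x = X + kY, y = Y
    return a, b_new, c_new
```

For instance, `reduce_form(1, 0, -2)` returns `(1, 2, -1)`, i.e., {x^2-2y^2} is equivalent to the reduced form {x^2+2xy-y^2} (both have {d=8}).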

In this last post of this series, we want to complete the discussion of the Oh–Benoist–Miquel theorem by giving a sketch of its proof in the cases not covered in previous posts.

More precisely, let us recall that the Oh–Benoist–Miquel theorem (answering a conjecture of Margulis) asserts that:

Theorem 1 Let {G} be a semisimple algebraic Lie group of real rank {\textrm{rank}_{\mathbb{R}}(G)\geq 2}. Denote by {U\subset G} a horospherical subgroup of {G}. If {\Gamma\subset G} is a discrete Zariski-dense and irreducible subgroup such that {\Gamma\cap U} is cocompact, then {\Gamma} is commensurable to an arithmetic lattice {G_{\mathbb{Z}}}.

Moreover, we recall that the proof of this result was worked out in the previous two posts of this series for {G = SL(4,\mathbb{R})} and {U = \left\{\left(\begin{array}{cccc} 1 & 0 & \ast & \ast \\ 0 & 1 & \ast & \ast \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{array}\right)\right\}}. Furthermore, we observed en passant that these arguments can be generalized without too much effort to yield a proof of Theorem 1 when

  • {U} is commutative;
  • {U} is reflexive (i.e., {U} is conjugate to an opposite horospherical subgroup {U^-});
  • {S} is not compact (where {P=N_G(U)}, {P^-=N_G(U^-)}, {L=P\cap P^-} and {S=[L,L]}).

Today, we will divide our discussion below into five sections discussing prototypical examples covering all possible remaining cases for {U} and {S}.

Remark 1 The fact that Theorem 1 holds for the examples in Sections 1, 2 and 3 below is originally due to Oh. Similarly, the example in Section 4 was originally treated by Selberg. Finally, the original proof of Theorem 1 for the example in Section 5 is due to Benoist–Oh. Nevertheless, except for Section 3, the arguments discussed below are some particular examples illustrating the general strategy of Benoist–Miquel and, hence, they provide some proofs which are different from the original ones.

1. {U} is not reflexive

The prototype example of this case is {G=SL(3,\mathbb{R})} and {U = \left\{\left(\begin{array}{ccc} 1 & \ast & \ast \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right)\right\}}.

The corresponding parabolic subgroup {P=N_G(U)} is the stabilizer of the line {\mathbb{R}\, e_1}:

\displaystyle P=\left\{\left(\begin{array}{ccc} \ast & \ast & \ast \\ 0 & \ast & \ast \\ 0 & \ast & \ast \end{array}\right)\right\} = \{T\in G: T(e_1)\in\mathbb{R}e_1\}.

Equivalently, {P} is the stabilizer of the flag {\{0\}\subset\mathbb{R}e_1\subset \mathbb{R}^3}. Therefore, {U} is not reflexive because its opposite is the stabilizer of a plane.

Since {\Gamma} is Zariski-dense in {G}, we can find {g, h\in\Gamma} such that {\{e_1, g(e_1), h(e_1)\}} is a basis of {\mathbb{R}^3}. Hence, there is no loss in generality in assuming that {g(e_1)=e_2} and {h(e_1)=e_3}. In this setting,

\displaystyle U'=gUg^{-1} = \left\{\left(\begin{array}{ccc} 1 & 0 & 0 \\ \ast & 1 & \ast \\ 0 & 0 & 1 \end{array}\right)\right\}, \quad U''=hUh^{-1} = \left\{\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ \ast & \ast & 1 \end{array}\right)\right\}.
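The shapes of {U'} and {U''} can be double-checked on a concrete example. In the Python sketch below, the matrices `g` and `h` are hypothetical choices of ours (rotations in {SL(3,\mathbb{R})} with {g(e_1)=e_2} and {h(e_1)=e_3}); they only illustrate the conjugation and are not claimed to belong to any particular {\Gamma}.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def conjugate(g, u):
    """Return g u g^{-1}, assuming g is orthogonal (inverse = transpose)."""
    return matmul(matmul(g, u), transpose(g))

# Hypothetical elements of SL(3, R) with g(e_1) = e_2 and h(e_1) = e_3.
g = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
h = [[0, 0, -1], [0, 1, 0], [1, 0, 0]]

# A sample element of U (non-trivial entries in the first row only).
u = [[1, 2, 5], [0, 1, 0], [0, 0, 1]]

u_prime = conjugate(g, u)    # non-trivial entries in the second row only
u_second = conjugate(h, u)   # non-trivial entries in the third row only
```

With these choices, `u_prime` has its off-diagonal entries in the second row and `u_second` in the third row, in agreement with the displayed formulas.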

Also, we know that {U'/(U'\cap\Gamma)} and {U''/(U''\cap\Gamma)} are compact. Moreover, {\Delta = \langle\Gamma\cap U, \Gamma\cap U'\rangle} is a discrete and Zariski dense subgroup of the semi-direct product

\displaystyle H=\langle U, U'\rangle = \left\{ \left(\begin{array}{ccc} a & b & v_1 \\ c & d & v_2 \\ 0 & 0 & 1 \end{array}\right): \left(\begin{array}{cc} a & b \\ c & d \end{array}\right)\in S, \left(\begin{array}{c} v_1 \\ v_2 \end{array}\right)\in V \right\}

of {S=SL(2,\mathbb{R})} and {V=\mathbb{R}^2}.
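To make the semi-direct product structure concrete, here is a small Python sketch (the helper names `embed` and `project` are ours). It multiplies two elements of {H} and checks that the linear parts multiply as in {S} while the translation parts combine as {Mv'+v}; in particular, the projection {p:H\rightarrow S} onto the upper-left block is a homomorphism.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def embed(M, v):
    """3x3 matrix of H with linear part M (2x2) and translation part v."""
    return [[M[0][0], M[0][1], v[0]],
            [M[1][0], M[1][1], v[1]],
            [0, 0, 1]]

def project(h):
    """The projection p: H -> S keeping the upper-left 2x2 block."""
    return [row[:2] for row in h[:2]]

# Two sample elements (M1, v1) and (M2, v2) with M1, M2 in SL(2, Z).
M1, v1 = [[1, 1], [0, 1]], [2, 3]
M2, v2 = [[1, 0], [1, 1]], [5, 7]

product = matmul(embed(M1, v1), embed(M2, v2))
# Expected: linear part M1*M2 and translation part M1*v2 + v1 = (14, 10).
```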

In this context, a key fact is the following result of Auslander (compare with Proposition 4.17 in the Benoist–Miquel paper):

Theorem 2 (Auslander) Let {H} be an algebraic subgroup obtained from a semi-direct product of {S} semisimple and {V} solvable, and denote by {p:H\rightarrow S} the natural projection. If {\Delta\subset H} is discrete and Zariski dense, then {p(\Delta)} is also discrete and Zariski dense in {S}.

The information about the discreteness of the projection {p(\Delta)} in the previous statement is extremely precious for our purposes. Indeed, Auslander's theorem implies that the projections {p(\Gamma\cap U)} and {p(\Gamma\cap U')} are discrete. Using these facts, one checks that

\displaystyle \Gamma\cap\left\{\left(\begin{array}{ccc} 1 & 0 & \ast \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right)\right\} \neq \{\textrm{Id}\} \quad \textrm{ and } \quad \Gamma\cap\left\{\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & \ast \\ 0 & 0 & 1 \end{array}\right)\right\} \neq \{\textrm{Id}\}

By repeating this argument with {(U, U'')} and {(U', U'')} in the place of {(U,U')}, one can “fill all non-diagonal entries”, that is, one essentially gets that {\Gamma} contains finite-index subgroups of

\displaystyle \left\{ \left(\begin{array}{ccc} 1 & \ast & \ast \\ 0 & 1 & \ast \\ 0 & 0 & 1 \end{array}\right)\in SL(3,\mathbb{Z})\right\} \textrm{ and } \left\{ \left(\begin{array}{ccc} 1 & 0 & 0 \\ \ast & 1 & 0 \\ \ast & \ast & 1 \end{array}\right)\in SL(3,\mathbb{Z})\right\},

so that the Raghunathan–Venkataramana–Oh theorem (stated in the previous post of this series) guarantees that {\Gamma} is commensurable to {SL(3,\mathbb{Z})}.

This completes our sketch of proof of Theorem 1 for our prototype of non-reflexive subgroup {U} above.

2. {U} is Heisenberg and {S} is not compact

A Heisenberg horospherical subgroup {U} is a {2}-step nilpotent group whose associated parabolic group {P=N_G(U)} acts by similarities (of some Euclidean norm) on the center of the Lie algebra {\mathfrak{u}} of {U}.

A prototypical example of {U} Heisenberg and {S} non-compact is {G=SL(4,\mathbb{R})} and

\displaystyle U=\left\{ \left( \begin{array}{cccc} 1 & \ast & \ast & \ast \\ 0 & 1 & 0 & \ast \\ 0 & 0 & 1 & \ast \\ 0 & 0 & 0 & 1 \end{array} \right) \right\}
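As a quick sanity check of the {2}-step nilpotency in this example, the following Python sketch (with helper names `heis` and `bracket` of our own) computes the bracket of two elements of the Lie algebra {\mathfrak{u}} of {U} and confirms that it lands in the line spanned by the top-right matrix unit, which commutes with everything in {\mathfrak{u}}.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(len(X))] for i in range(len(X))]

def heis(x1, x2, y1, y2, z):
    """Element of the Lie algebra of U in the prototypical SL(4, R) example."""
    return [[0, x1, x2, z],
            [0, 0, 0, y1],
            [0, 0, 0, y2],
            [0, 0, 0, 0]]

X = heis(1, 2, 3, 4, 5)
Y = heis(6, 7, 8, 9, 10)
Z = bracket(X, Y)   # only the top-right entry can be non-zero
W = bracket(Z, X)   # brackets with the center vanish
```

A direct computation gives {[X,Y]=(x_1y_1'+x_2y_2'-x_1'y_1-x_2'y_2)E_{14}}, which is {-20\,E_{14}} for the sample values above.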

As it turns out, any Heisenberg {U} is reflexive. Thus, we have that {U^-=\gamma_0 U \gamma_0^{-1}} is opposite to {U} for some adequate choice {\gamma_0\in\Gamma}.

In particular, it is tempting to mimic the arguments from the second and third posts of this series, namely, one introduces the lattices

\displaystyle \Lambda = \log(\Gamma\cap U)\in X_{\mathfrak{u}}, \quad \Lambda^- = \log(\Gamma\cap U^-)\in X_{\mathfrak{u}^-},

so that the arithmeticity of {\Gamma} follows from the closedness of the {\textrm{Ad}_L}-orbit of {(\Lambda, \Lambda^-)} in {X_{\mathfrak{u}}\times X_{\mathfrak{u}^-}} when {S} is not compact; moreover, the closedness of {\textrm{Ad}_L(\Lambda, \Lambda^-)} is basically a consequence of the closedness and discreteness of {F(\Lambda)} in {\mathbb{R}} for an appropriate choice of polynomial function {F}.

In the case of {U} commutative, we took {F(X) = \Phi(\exp(X)\gamma_0)}, where {\Phi(g) = \textrm{det}_{\mathfrak{u}}(M(g))}, {M(g)=\pi\circ \textrm{Ad}(g)\circ \pi} and {\pi:\mathfrak{g}\rightarrow \mathfrak{u}} was the natural projection with respect to the decomposition {\mathfrak{g} = \mathfrak{u}\oplus \mathfrak{l} \oplus \mathfrak{u}^-}.

As it turns out, the case of {U} Heisenberg can be dealt with by slightly modifying the construction in the previous paragraph. More precisely, one considers a natural grading

\displaystyle \mathfrak{g} = \underbrace{\mathfrak{g}_{-2}\oplus \mathfrak{g}_{-1}}_{=\mathfrak{u}^-} \oplus \underbrace{\mathfrak{g}_0}_{=\mathfrak{l}} \oplus \underbrace{\mathfrak{g}_1 \oplus \overbrace{\mathfrak{g}_2}^{=\textrm{ center of }\mathfrak{u}}}_{=\mathfrak{u}}

and one sets {F(X)=\Phi(\exp(X)\gamma_0)}, {\Phi(g) = \textrm{det}(M(g))}, {M(g)=\pi\circ\textrm{Ad}(g)\circ \pi}, where {\pi} is the natural projection {\pi:\mathfrak{g}\rightarrow \mathfrak{g}_2}. In our prototypical example, the polynomial function {F(X) = \Phi(\exp(X)\gamma_0)} is very explicit:

\displaystyle F(\left(\begin{array}{cccc} 0 & x_1 & x_2 & z \\ 0 & 0 & 0 & y_1 \\ 0 & 0 & 0 & y_2 \\ 0 & 0 & 0 & 0 \end{array}\right)) = (x_1 y_1 + x_2 y_2)^2 - z^2

This completes our sketch of proof of Theorem 1 when {U} is Heisenberg and {S} is not compact.

3. {U} is not commutative and {U} is not Heisenberg

Our prototype of non-commutative and non-Heisenberg {U} is the subgroup

\displaystyle U=\left\{ \left(\begin{array}{cccc} 1 & \ast & \ast & \ast \\ 0 & 1 & \ast & \ast \\ 0 & 0 & 1 & \ast \\ 0 & 0 & 0 & 1 \end{array}\right) \right\}

of {G=SL(4,\mathbb{R})}.

In this context, we will explore some well-known results from the theory of lattices in nilpotent groups to reduce our task to the case of {U} commutative and reflexive.

More concretely, the properties of nilpotent groups together with our hypothesis that {\Delta=\Gamma\cap U} is a lattice in {U} allow us to conclude that {\Delta':=[\Delta,\Delta]} is a lattice in

\displaystyle U':=[U, U] = \left\{ \left( \begin{array}{cccc} 1 & 0 & \ast & \ast \\ 0 & 1 & 0 & \ast \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right) \right\},

and, consequently, the centralizer {\Delta_0} of {\Delta'} in {\Delta} is a lattice in the centralizer

\displaystyle U_0= \left\{ \left( \begin{array}{cccc} 1 & 0 & \ast & \ast \\ 0 & 1 & \ast & \ast \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right) \right\}

of {U'} in {U}. Therefore, we reduced matters to the case of {U_0} commutative and reflexive which was discussed in the previous two posts of this series.
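The shapes of {[U,U]} and {U_0} displayed above can be double-checked at the level of Lie algebras. The Python sketch below is only a sanity check on matrix units {E_{ij}} (it is not a proof, and all names are ours): it records which positions are reached by brackets of strictly upper triangular matrix units, and then which matrix units commute with all of those.

```python
N = 4
# Positions (i, j), i < j, of the strictly upper triangular entries (0-indexed).
positions = [(i, j) for i in range(N) for j in range(i + 1, N)]

def unit(i, j):
    """Matrix unit E_ij of size N x N."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(N)] for r in range(N)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def bracket(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(N)] for i in range(N)]

def support(M):
    """Set of positions of the non-zero entries of M."""
    return {(i, j) for i in range(N) for j in range(N) if M[i][j] != 0}

# Positions reached by brackets of matrix units: the shape of [u, u].
derived = set()
for p in positions:
    for q in positions:
        derived |= support(bracket(unit(*p), unit(*q)))

# Matrix units commuting with every matrix unit of [u, u]: the shape of u_0.
centralizer = {p for p in positions
               if all(not support(bracket(unit(*p), unit(*q))) for q in derived)}
```

In {0}-indexed notation, `derived` consists of the positions {(0,2)}, {(0,3)}, {(1,3)}, and `centralizer` of {(0,2)}, {(0,3)}, {(1,2)}, {(1,3)}, matching the matrices {U'} and {U_0} above.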

In particular, our sketch of proof of Theorem 1 when {U} is non-commutative and non-Heisenberg is complete.

4. {U} commutative and {S} is compact

The basic example of {U} commutative and {S} compact is {G=SL(2,\mathbb{R})\times SL(2,\mathbb{R})} and

\displaystyle U= \left\{ (\left(\begin{array}{cc} 1 & \ast \\ 0 & 1 \end{array}\right), \left(\begin{array}{cc} 1 & \ast \\ 0 & 1 \end{array}\right)) \right\}.

In this setting, we consider

\displaystyle L = \left\{ (\left(\begin{array}{cc} \lambda_1 & 0 \\ 0 & \lambda_1^{-1} \end{array}\right), \left(\begin{array}{cc} \lambda_2 & 0 \\ 0 & \lambda_2^{-1} \end{array}\right)): \lambda_1, \lambda_2\in \mathbb{R}^*\right\} = P\cap P^-

the common Levi subgroup of {P=N_G(U)} and the parabolic subgroup {P^-} normalizing an opposite of {U}, and the “unimodular Levi subgroup”

\displaystyle \begin{array}{rcl} L_0 &=& \{\ell\in L: \textrm{det}_{\mathfrak{u}} \textrm{Ad}(\ell)=1\} \\ &=& \left\{ (\left(\begin{array}{cc} \lambda_1 & 0 \\ 0 & \lambda_1^{-1} \end{array}\right), \left(\begin{array}{cc} \lambda_2 & 0 \\ 0 & \lambda_2^{-1} \end{array}\right)): \lambda_1 \lambda_2=\pm1\right\} \\ &\simeq& \mathbb{R}^*\times\{\pm1\} \end{array}

The discussion in the second post of this series ensures that the {\textrm{Ad}_{L_0}}-orbit of {(\Lambda, \Lambda^-)} is closed in {X_{\mathfrak{u}}\times X_{\mathfrak{u}^-}}.

We claim that {\textrm{Ad}_{L_0}(\Lambda, \Lambda^-)} is compact. Indeed, this fact can be proved via Mahler’s compactness criterion: more concretely, recall from the second post of this series that the proof of the closedness of {\textrm{Ad}_L(\Lambda, \Lambda^-)} produced a polynomial {F} on {\mathfrak{u}} which is {\textrm{Ad}_{L_0}}-invariant and whose values {F(\Lambda)} on {\Lambda} form a closed and discrete subset of {\mathbb{R}}; in our prototypical example, a direct computation shows that

\displaystyle F(\left(\begin{array}{cc} 0 & x_1 \\ 0 & 0 \end{array}\right), \left(\begin{array}{cc} 0 & x_2 \\ 0 & 0 \end{array}\right)) = x_1^2 x_2^2;

in particular, {\inf\limits_{\ell\in L_0} \inf\limits_{X\in\Lambda\setminus\{0\}} \|\textrm{Ad}(\ell) X\|^4 > \inf\limits_{\ell\in L_0} \inf\limits_{X\in\Lambda\setminus\{0\}} F(\textrm{Ad}(\ell) X)}; therefore, the {\textrm{Ad}_{L_0}}-invariance of {F} together with the closedness and discreteness of {F(\Lambda)} imply that

\displaystyle \begin{array}{rcl} \inf\limits_{\ell\in L_0} \inf\limits_{X\in\Lambda\setminus\{0\}} \|\textrm{Ad}(\ell) X\|^4 &>& \inf\limits_{\ell\in L_0} \inf\limits_{X\in\Lambda\setminus\{0\}} F(\textrm{Ad}(\ell) X) \\ &=& \inf\limits_{X\in\Lambda\setminus\{0\}} F(X) =\min\limits_{X\in\Lambda\setminus\{0\}} F(X); \end{array}

since {\Lambda} is irreducible, {\min\limits_{X\in\Lambda\setminus\{0\}} F(X)>0}, and, a fortiori, there are no arbitrarily short non-trivial vectors in the closed family of lattices {\textrm{Ad}_{L_0}(\Lambda)}; hence, we can apply Mahler’s compactness criterion to complete the proof of our claim.
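The {\textrm{Ad}_{L_0}}-invariance of {F(x_1,x_2)=x_1^2x_2^2} used in this argument can be illustrated numerically. The Python sketch below uses exact rational arithmetic; the helper names and the sample values {\lambda_1=3}, {\lambda_2=-1/3} (so that {\lambda_1\lambda_2=-1}) are our own choices.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def ad_diag(lam, x):
    """Conjugate [[0, x], [0, 0]] by diag(lam, 1/lam); return the new top-right entry."""
    D = [[lam, 0], [0, 1 / lam]]
    D_inv = [[1 / lam, 0], [0, lam]]
    X = [[0, x], [0, 0]]
    return matmul(matmul(D, X), D_inv)[0][1]

def F(x1, x2):
    """The polynomial F of the prototypical example."""
    return x1 ** 2 * x2 ** 2

lam1, lam2 = Fraction(3), Fraction(-1, 3)   # lam1 * lam2 = -1, an element of L_0
x1, x2 = Fraction(2), Fraction(5)

y1, y2 = ad_diag(lam1, x1), ad_diag(lam2, x2)   # (lam1^2 * x1, lam2^2 * x2)
```

Since {\textrm{Ad}(\ell)} multiplies {x_i} by {\lambda_i^2}, one gets {F(\textrm{Ad}(\ell)X)=(\lambda_1\lambda_2)^4F(X)=F(X)} on {L_0}; here both values equal {100}.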

At this point, we observe that {L_0\simeq \mathbb{R}^*\times \{\pm1\}} is not compact (because {\textrm{rank}_{\mathbb{R}}(G)\geq 2}), so that the compactness of {\textrm{Ad}_{L_0}(\Lambda,\Lambda^-)} means that the stabilizer of this orbit is infinite. Consequently, {\Gamma\cap L} is infinite, and a quick inspection of the previous post reveals that this is precisely the information needed to apply Margulis’ construction of {\mathbb{Q}}-forms and the Raghunathan–Venkataramana theorem in order to derive the arithmeticity of {\Gamma}. This completes our sketch of proof of Theorem 1 when {U} is commutative and {S} is compact (and the reader is invited to consult Section 4.6 of the Benoist–Miquel paper for more details).

5. {U} is Heisenberg and {S} compact

Closing this series of posts, let us discuss the remaining case of {U} Heisenberg and {S} compact. A concrete example of this situation is {G=SL(3,\mathbb{R})} and

\displaystyle U=\left\{ \left(\begin{array}{ccc} 1 & \ast & \ast \\ 0 & 1 & \ast \\ 0 & 0 & 1 \end{array}\right) \right\}.

In this context, {L=\left\{ \left(\begin{array}{ccc} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{array}\right): abc=1 \right\}} and a unimodular Levi subgroup is

\displaystyle L_0 = \left\{ \left(\begin{array}{ccc} a & 0 & 0 \\ 0 & 1/a^2 & 0 \\ 0 & 0 & a \end{array}\right): a\in\mathbb{R}^* \right\}
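One can check by hand, or with the short Python sketch below (where the function name `ad_L0` is ours), that the elements of {L_0} act on {\mathfrak{u}} with determinant one: conjugation by {\textrm{diag}(a,a^{-2},a)} multiplies the entries {x}, {y}, {z} of an element of {\mathfrak{u}} by {a^3}, {a^{-3}} and {1} respectively.

```python
from fractions import Fraction

def ad_L0(a, x, y, z):
    """Conjugate u(x, y, z) by diag(a, 1/a^2, a) and return the new (x, y, z).

    Conjugation by a diagonal matrix diag(d_0, d_1, d_2) multiplies the
    (i, j) entry by d_i / d_j.
    """
    d = [a, 1 / (a * a), a]
    u = [[0, x, z],
         [0, 0, y],
         [0, 0, 0]]
    v = [[d[i] * u[i][j] / d[j] for j in range(3)] for i in range(3)]
    return v[0][1], v[1][2], v[0][2]

# Sample value a = 2: the scaling factors on (x, y, z) are (8, 1/8, 1).
factors = ad_L0(Fraction(2), 1, 1, 1)
det_on_u = factors[0] * factors[1] * factors[2]
```

Note that the center of {\mathfrak{u}} (the coordinate {z}) is fixed pointwise by {L_0} in this example.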

Once again, let us recall that we know that {\textrm{Ad}_{L_0}(\Lambda, \Lambda^-)} is closed, where

\displaystyle \Lambda= \left\{ u=u(x,y,z)=\left(\begin{array}{ccc} 0 & x & z \\ 0 & 0 & y \\ 0 & 0 & 0 \end{array}\right)\in \mathfrak{u} \right\}

We claim that there is no loss of generality in assuming that {x\neq 0} and {y\neq 0} for all {u=u(x,y,z)\in \Lambda\setminus (\Lambda\cap [U, U])}. Indeed, if this is not the case (say {y=0} for some {u\in\Lambda\setminus (\Lambda\cap [U,U])}), then we are back to the setting of Section 1 above (of the horospherical subgroup {\left\{\left(\begin{array}{ccc} 1 & \ast & \ast \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right)\right\}}).

Here, we can derive the arithmeticity of {\Gamma} along the same lines as in Section 4 above (where it sufficed to study an appropriate polynomial {F(\left(\begin{array}{cc} 0 & x_1 \\ 0 & 0 \end{array}\right), \left(\begin{array}{cc} 0 & x_2 \\ 0 & 0 \end{array}\right))=x_1^2 x_2^2} to employ Mahler’s compactness criterion). More precisely, one uses the fact that {x\neq0} and {y\neq0} for all {u\in\Lambda\setminus (\Lambda\cap [U,U])} to prove that {\textrm{Ad}_{L_0}(\Lambda, \Lambda^-)} is compact, so that {\Gamma\cap L} is infinite and, thus, by Margulis’ construction of {\mathbb{Q}}-forms and the Raghunathan–Venkataramana theorem, {\Gamma} is arithmetic.

Recall that the main goal of this series of posts is the proof of the following result:

Theorem 1 Let {G} be a semisimple algebraic Lie group of real rank {\textrm{rank}_{\mathbb{R}}(G)\geq 2}. Denote by {U\subset G} a horospherical subgroup of {G}. If {\Gamma\subset G} is a discrete Zariski-dense and irreducible subgroup such that {\Gamma\cap U} is cocompact, then {\Gamma} is commensurable to an arithmetic lattice {G_{\mathbb{Z}}}.

Last time, we discussed the first half of the proof of this theorem in the particular case of {G=SL(2p,\mathbb{R})}, {p\geq 2}, and {U=\left\{\left( \begin{array}{cc} I & B \\ 0 & I \end{array} \right): B\in M(p,\mathbb{R})\right\}}. Actually, we saw that this specific form of {U\subset G} is not very important: all results from the previous post hold whenever

  • {U} is reflexive: in the context of the example above, this is the fact that {U} is conjugate to the opposite horospherical subgroup {U^-=\left\{\left( \begin{array}{cc} I & 0 \\ C & I \end{array} \right): C\in M(p,\mathbb{R})\right\}};
  • {U} is commutative.

Indeed, we observed that {U} reflexive allows us to also assume that {\Gamma\cap U^-} is cocompact in {U^-}. Then, this property and the commutativity of {U} were exploited to establish the closedness of the {\textrm{Ad}_L}-orbit of {(\Lambda, \Lambda^-)} in {X_{\mathfrak{u}}\times X_{\mathfrak{u}^-}}, where {\Lambda:=\log(\Gamma\cap U)\in X_{\mathfrak{u}} = \{\textrm{lattices in }\mathfrak{u}\}}, {\Lambda^-:=\log(\Gamma\cap U^-)\in X_{\mathfrak{u}^-} = \{\textrm{lattices in }\mathfrak{u}^-\}}, and

\displaystyle L=P\cap P^- = \left\{\left(\begin{array}{cc} A & 0 \\ 0 & D \end{array}\right):\textrm{det}(A)\cdot\textrm{det}(D)=1\right\}

is the common Levi subgroup of the parabolic subgroups {P=N_G(U)} and {P^-=N_G(U^-)} normalizing {U} and {U^-}.
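For concreteness, the adjoint action of {L} on {U} in this example is {B\mapsto ABD^{-1}}. The Python sketch below checks this identity for {p=2} on sample integer matrices; the specific matrices {A}, {D}, {B} and the helper names are our own choices, with {\textrm{det}(A)=\textrm{det}(D)=1} so that all inverses stay integral.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def block(A, B, C, D):
    """Assemble the 4x4 matrix [[A, B], [C, D]] from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(A, B)] + [rc + rd for rc, rd in zip(C, D)]

def inv_sl2(M):
    """Inverse of a 2x2 matrix of determinant one."""
    (a, b), (c, d) = M
    return [[d, -b], [-c, a]]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]

A = [[2, 1], [1, 1]]   # det(A) = 1
D = [[1, 1], [0, 1]]   # det(D) = 1
B = [[1, 2], [3, 4]]

ell = block(A, Z2, Z2, D)                        # element of L
ell_inv = block(inv_sl2(A), Z2, Z2, inv_sl2(D))
u = block(I2, B, Z2, I2)                         # element of U

conj = matmul(matmul(ell, u), ell_inv)           # ell * u * ell^{-1}
expected = block(I2, matmul(matmul(A, B), inv_sl2(D)), Z2, I2)
```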

Today, we will discuss the second half of the proof of Theorem 1 in the particular case of {G=SL(2p,\mathbb{R})}, {p\geq 2}, and {U=\left\{\left( \begin{array}{cc} I & B \\ 0 & I \end{array} \right): B\in M(p,\mathbb{R})\right\}}: in other terms, our goal below is to obtain the arithmeticity of {\Gamma} from the closedness of {\textrm{Ad}_L(\Lambda, \Lambda^-)} in the homogeneous space {X_{\mathfrak{u}}\times X_{\mathfrak{u}^-} := G_0/\Gamma_0}. This step is due to Hee Oh (see Proposition 3.4.4 of her paper).

1. From closedness to infinite stabilizer

Let {S=[L,L] = \left\{ \left(\begin{array}{cc} A & 0 \\ 0 & D \end{array}\right)\in G: \textrm{det}(A) = \textrm{det}(D) = 1 \right\} \simeq SL(p,\mathbb{R})\times SL(p,\mathbb{R})} and {A=\left\{ \left(\begin{array}{cc} \lambda I & 0 \\ 0 & \lambda^{-1} I \end{array}\right)\in G: \lambda\in\mathbb{R}^*\right\}}, so that {L=AS}. In particular, the closedness of {\textrm{Ad}_L(\Lambda, \Lambda^-)} implies that

{\textrm{Ad}_S(\Lambda, \Lambda^-)} is closed in {X_{\mathfrak{u}}\times X_{\mathfrak{u}^-} = G_0/\Gamma_0}.

The next proposition asserts that the stabilizer of this orbit is large whenever {S} is not compact:

Proposition 2 The stabilizer {\textrm{Stab}_S(\Lambda,\Lambda^-)=\{s\in S: \textrm{Ad}_s(\Lambda,\Lambda^-) = (\Lambda,\Lambda^-)\}} is a lattice in {S}.

This proposition is a direct consequence of the closedness of the {\textrm{Ad}_S(\Lambda, \Lambda^-)} in {G_0/\Gamma_0} and the following general fact:

Proposition 3 Let {G_0} be a Lie group, {\Gamma_0} a lattice in {G_0}, and {x_0\in X_0=G_0/\Gamma_0}. Suppose that {S_0\subset G_0} is a semisimple subgroup with finite center such that {S_0 x_0} is closed. Then, {S_0\cap \Gamma_0} is a lattice in {S_0}.

Proof: The first ingredient of the argument is Howe–Moore’s mixing theorem: it asserts that if {S_0} is a semisimple group with finite center and {(\mathcal{H}_0, \pi_0)} is a unitary representation of {S_0} with {\{v\in\mathcal{H}_0: \pi_0(S_0)v = v\}=\{0\}}, then

\displaystyle \lim\limits_{s\rightarrow\infty}\langle\pi_0(s) v_0, w_0\rangle = 0

for all {v_0, w_0\in\mathcal{H}_0}. (Here, {s\rightarrow\infty} means that the projection of {s} to any simple factor {S^{(i)}} of {S_0 = \prod S^{(i)}} diverges.)

The second ingredient of the argument is Dani–Margulis recurrence theorem: it says that if {\Gamma_0} is a lattice in a Lie group {G_0} and {u_t} is a one-parameter unipotent subgroup of {G_0}, then, given {x_0\in X_0=G_0/\Gamma_0} and {\varepsilon>0}, there exists a compact subset {K\subset X_0} such that

\displaystyle \frac{1}{T}\textrm{Leb}(\{t\in [0,T]: u_t x_0\in K\}) \geq 1-\varepsilon

for all {T>0}.

The basic idea to obtain the desired proposition is to apply these ingredients to {\mathcal{H}_0 = L^2(X_0,\lambda_0)}, where {\lambda_0} is an {S_0}-invariant measure on {S_0 x_0}, and {u_t} is a one-parameter unipotent subgroup in the product {S_0''} of non-compact simple factors of {S_0}. Here, we observe that {\lambda_0} is a bona fide Radon measure because we are assuming that {S_0 x_0} is closed, and, if we take {u_t} not contained in proper normal subgroups of {S_0''}, then {u_t\rightarrow\infty} as {t\rightarrow\infty} thanks to the absence of compact factors in {S_0''}. In this setting, our task is reduced to proving that {\lambda_0} is a finite measure.

In this direction, we apply the Dani–Margulis recurrence theorem to get {A\subset X_0} with {0 < \lambda_0(A) < \infty} and a compact subset {K\subset X_0} such that

\displaystyle \frac{1}{T}\textrm{Leb}(\{t\in [0,T]: u_t x\in K\}) \geq \frac{1}{2}

for all {T>0} and {x\in A}. In this way, we obtain that the characteristic functions {1_A} and {1_K} of {A} and {K} are two elements of {\mathcal{H}_0} with {\langle \pi_0(u_t) 1_A, 1_K \rangle = \lambda_0(A\cap u_t^{-1}(K))}, and, hence, by Fubini’s theorem,

\displaystyle \begin{array}{rcl} \frac{1}{T}\int_0^T \langle \pi_0(u_t) 1_A, 1_K \rangle\,dt &=& \frac{1}{T}\int_0^T \lambda_0(A\cap u_t^{-1}(K)) \,dt \\ &=& \int_A \frac{1}{T}\textrm{Leb}(\{t\in [0,T]: u_t x\in K\}) \, d\lambda_0 \\ &\geq& \lambda_0(A)/2 > 0 \end{array}

for all {T>0}. It follows from Howe–Moore’s mixing theorem that

\displaystyle \{v\in\mathcal{H}_0: \pi_0(S_0'')v=v\}\neq \{0\},

that is, there exists a non-zero function {\varphi''\in\mathcal{H}_0} which is {S_0''}-invariant. By averaging {\varphi''} over the product {S_0'} of the compact factors of {S_0} if necessary, we obtain a non-zero function {\varphi\in\mathcal{H}_0} which is {S_0}-invariant. By ergodicity of {\lambda_0}, we have that {\varphi} is constant and, a fortiori, {\lambda_0} is a finite measure. \Box

2. From infinite stabilizer to {\Gamma\cap L} infinite

Our previous discussions (about {\textrm{Ad}_L}-actions) paved the way to understand {\Gamma\cap L}. Intuitively, it is important to get some information about {\Gamma\cap L} in our way towards showing the arithmeticity of {\Gamma} because we already know that {\Gamma\cap U} and {\Gamma\cap U^-} are lattices (i.e., {\Gamma} projects to lattices in “other directions”).

The intuition in the previous paragraph is confirmed by Margulis’ construction of {\mathbb{Q}}-forms:

Theorem 4 (Margulis) If {\Gamma\cap L} is infinite, then {\Gamma} is contained in some {\mathbb{Q}}-form {G_{\mathbb{Q}}} of {G}.

We will discuss the proof of this result in the next section. For now, we want to exploit the information {\Gamma\subset G_{\mathbb{Q}}} in order to derive the arithmeticity of {\Gamma}. For this sake, we invoke the following result:

Theorem 5 (Raghunathan-Venkataramana) Assume that {G} is semisimple of {\textrm{rank}_{\mathbb{R}}(G)\geq 2} defined over {\mathbb{Q}}, and {G} is {\mathbb{Q}}-simple. Suppose that {U} and {U^-} are opposite horospherical subgroups defined over {\mathbb{Q}}. If {\Gamma\subset G_{\mathbb{Z}}} is a subgroup such that {\Gamma\cap U_{\mathbb{Z}}}, resp. {\Gamma\cap U_{\mathbb{Z}}^-}, has finite index in {U_{\mathbb{Z}}}, resp. {U_{\mathbb{Z}}^-}, then {\Gamma} has finite index in {G_{\mathbb{Z}}}.

Remark 1 As it was kindly pointed out to me by David Fisher, Raghunathan-Venkataramana theorem is due to Margulis in the cases covered by Raghunathan (at least).

Remark 2 This result is false when {\textrm{rank}_{\mathbb{R}}(G)=1}, e.g., {G=SL(2,\mathbb{R})} (and {\Gamma=\left\langle\left(\begin{array}{cc}1 & 3 \\ 0 & 1 \end{array}\right), \left(\begin{array}{cc}1 & 0 \\ 3 & 1 \end{array}\right) \right\rangle}).

Roughly speaking, Raghunathan-Venkataramana theorem essentially establishes the desired Theorem 1 provided we know in advance that {\Gamma\subset G_{\mathbb{Q}}}.

In the sequel, we will treat Raghunathan-Venkataramana theorem as a blackbox and we will complete the proof of Theorem 1 in the case {U} reflexive and commutative, and {S} non-compact such as {G=SL(2p,\mathbb{R})} and {U = \left\{\left(\begin{array}{cc} I & B \\ 0 & I\end{array}\right)\in G: B\in M(p,\mathbb{R})\right\}}.

Remark 3 In some natural situations (e.g., subgroups generated by the matrices of the so-called Kontsevich–Zorich cocycle) we have that {\Gamma\subset G_{\mathbb{Z}}}. In particular, it is a pity that, for lack of time, Yves Benoist could not explain to me the proof of Raghunathan-Venkataramana theorem. Anyhow, I hope to come back to discuss this point in more detail in the future.

Proof of Theorem 1:  We consider the subgroup {\Gamma':=\langle \Gamma\cap U, \Gamma\cap U^-\rangle} of {\Gamma}. It is discrete and Zariski dense in {G}. Therefore, the normalizer {\Gamma''=N_G(\Gamma')} is Zariski dense in {G}, and it is not hard to check that it is also discrete.

Observe that Proposition 2 says that {\Gamma''\cap S} is a lattice in {S} (because {\Gamma''=N_G(\Gamma')\supset \textrm{Stab}_S(\Lambda, \Lambda^-)}). Since {S} is non-compact, we have that {\Gamma''\cap L} is infinite. Thus, Margulis’ Theorem 4 implies that {\Gamma'\subset\Gamma''\subset G_{\mathbb{Q}}} for some {\mathbb{Q}}-form of {G}.

Note also that {U} and {U^-} are defined over {\mathbb{Q}}: in fact, if {H\subset GL(n,\mathbb{R})} is an algebraic subgroup such that {H\cap GL(n,\mathbb{Q})} is Zariski dense in {H}, then {H} is defined over {\mathbb{Q}}.

Hence, we can apply Raghunathan-Venkataramana Theorem 5 to get that {\Gamma'} is commensurable to {G_{\mathbb{Z}}}. Since {\Gamma'\subset\Gamma}, this proves the arithmeticity of {\Gamma}. \Box

Remark 4 As we noticed above, the arguments presented so far allow us to prove Theorem 1 when {U} is reflexive and commutative, and {S} is non-compact.

3. From {\Gamma\cap L} infinite to arithmeticity

In this (final) section (of this post), we discuss some steps in the proof of Margulis’ Theorem 4 stated above.

We write {\mathfrak{g} = \mathfrak{u}^-\oplus \underbrace{\mathfrak{s}\oplus\mathfrak{a}}_{=\mathfrak{l}} \oplus\mathfrak{u}}, i.e., we decompose the Lie algebra of {G} in terms of the Lie algebras of {U^-}, {U} and {L=AS}.

Our goal is to find a {\mathbb{Q}}-form of {G} containing {\Gamma}. For this sake, let us do some “reverse engineering”: assuming that we found {G_{\mathbb{Q}}\supset\Gamma}, what are the constraints satisfied by the {\mathbb{Q}}-structure {\mathfrak{g}_{\mathbb{Q}}} on its Lie algebra?

First, we note that we have at our disposal the lattices {\Lambda\subset \mathfrak{u}} and {\Lambda^-\subset\mathfrak{u}^-}. Hence, we are “forced” to define {\mathfrak{u}_{\mathbb{Q}}} and {\mathfrak{u}^-_{\mathbb{Q}}} as the {\mathbb{Q}}-vector spaces spanned by {\Lambda} and {\Lambda^-}.

Next, we observe that the choice of {\mathfrak{u}_{\mathbb{Q}}} above imposes a natural {\mathbb{Q}}-structure {\mathfrak{s}_{\mathbb{Q}}} on {\mathfrak{s}} via the adjoint map {\textrm{Ad}:S\rightarrow SL(\mathfrak{u})}. In fact, {S} is defined over {\mathbb{Q}} because we know that {\Gamma\cap S} is a lattice (and, hence, Zariski-dense) in {S} and {\textrm{Ad}(S\cap\Gamma)\subset SL(\mathfrak{u}_{\mathbb{Q}})} (by definition).

Finally, since {U}, {U^-} and {S} are already defined over {\mathbb{Q}} and we want a {\mathbb{Q}}-structure on {\mathfrak{g} = \mathfrak{u}^-\oplus \mathfrak{s}\oplus\mathfrak{a} \oplus\mathfrak{u}}, it remains to put a {\mathbb{Q}}-structure on {\mathfrak{a}}. In general this is not difficult: for instance, we can take

\displaystyle \mathfrak{a}_{\mathbb{Q}} = \left\{ \left( \begin{array}{cc} \mu I & 0 \\ 0 & -\mu I\end{array}\right):\mu\in\mathbb{Q} \right\}

in the context of the example {G=SL(2p,\mathbb{R})} and {U = \left\{\left(\begin{array}{cc} I & B \\ 0 & I\end{array}\right)\in G: B\in M(p,\mathbb{R})\right\}}.

Once we have understood the constraints on a {\mathbb{Q}}-form of {G} containing {\Gamma}, we can work backwards and set

\displaystyle \mathfrak{g}_\mathbb{Q} = \mathfrak{u}^-_\mathbb{Q}\oplus \mathfrak{s}_\mathbb{Q}\oplus\mathfrak{a}_\mathbb{Q} \oplus\mathfrak{u}_\mathbb{Q}

where {\mathfrak{u}_\mathbb{Q}}, {\mathfrak{u}^-_\mathbb{Q}}, {\mathfrak{s}_\mathbb{Q}} and {\mathfrak{a}_\mathbb{Q}} are the {\mathbb{Q}}-structures from the previous paragraphs.

At this point, the proof of Theorem 4 is complete once we show the following facts:

  • {\mathfrak{g}_{\mathbb{Q}}} doesn’t depend on the choices (of {U^-}, etc.);
  • {\textrm{Ad}(\gamma) \mathfrak{g}_{\mathbb{Q}}\subset \mathfrak{g}_{\mathbb{Q}}} for all {\gamma\in\Gamma};
  • {\mathfrak{g}_{\mathbb{Q}}} is a Lie algebra.

The proof of these statements is described in the proof of Proposition 4.11 of Benoist–Miquel paper. Closing this post, let us just make some comments on the independence of {\mathfrak{g}_{\mathbb{Q}}} on the choices. For this sake, suppose that {U'} is another choice of horospherical subgroup. Denote by {\mathfrak{u}'} its Lie algebra, and let {P'=N_G(U')} be the associated parabolic subgroup. Our task is to verify that the Lie algebra {\mathfrak{p}' = \textrm{Lie}(P')} is defined over {\mathbb{Q}}. In this direction, the basic strategy is to reduce to the case when {\mathfrak{u}'} is opposite to {\mathfrak{u}} and {\mathfrak{u}^-} in order to get {\mathfrak{p}' = (\mathfrak{p}'\cap\mathfrak{p})\oplus(\mathfrak{p}'\cap\mathfrak{p}^-)}. Finally, during the implementation of this strategy, one relies on the following properties discussed in Lemma 4.8 of Benoist–Miquel article of the action of the unimodular normalizers {Q=\{g\in P: \textrm{det}_{\mathfrak{u}}\textrm{Ad}(g) = 1\}} of horospherical subgroups {U\subset P} on the space {X=G/\Gamma} with basepoint {x_0=\Gamma/\Gamma}:

  • If {Ux_0} is compact, then {Qx_0} is closed;
  • If {Qx_0} and {Q^-x_0} are closed, then {Sx_0 = (Q\cap Q^-) x_0} is closed;
  • If {Ux_0} is compact and {S x_0} is closed, then {(\Gamma\cap S)(\Gamma\cap U)} has finite index in {\Gamma\cap SU = \Gamma\cap Q}.

In any event, this completes our discussion of Theorem 4. In particular, we gave a (sketch of) proof of Theorem 1 when {U} is commutative and reflexive, and {S} is non-compact (cf. Remark 4).

Next time, we will establish Theorem 1 in the remaining cases of {U} and {S}.

As announced at the end of the first post of this series, we will discuss today the first half of the proof of the following result:

Theorem 1 Let {G:=SL(2p,\mathbb{R})}, {p\geq 2}, and {U := \left\{\left(\begin{array}{cc} I & B \\ 0 & I\end{array}\right)\in G: B\in M(p,\mathbb{R})\right\}}. Suppose that {\Gamma} is a discrete and Zariski dense subgroup of {G} such that {\Gamma\cap U} is cocompact. Then, {\Gamma} is commensurable to some {\mathbb{Z}}-form {G_{\mathbb{Z}}} of {G}.

Remark 1 This statement is originally due to Hee Oh, but the proof below is a particular case of Benoist–Miquel’s arguments. In particular, our subsequent discussions can be generalized to obtain the statement of Theorem 1 of the previous post in full generality.

Remark 2 Theorem 1 is not true without the higher rank assumption {p\geq 2} (i.e., {\textrm{rank}_{\mathbb{R}} G = 2p-1\geq 2}): indeed, {\Gamma = \left\langle\left(\begin{array}{cc} 1 & 3 \\ 0 & 1\end{array}\right), \left(\begin{array}{cc} 1 & 0 \\ 3 & 1\end{array}\right)\right\rangle} has infinite index in {SL(2,\mathbb{Z})}.

Our task is to construct a {\mathbb{Z}}-form {G_{\mathbb{Z}}} satisfying the conclusions of Theorem 1. This is not very easy because it must cover all possible cases of {\mathbb{Z}}-forms such as:

Example 1

  • {G_{\mathbb{Z}} = SL(2p,\mathbb{Z})};
  • {G_{\mathbb{Z}} = SL(2s,D_{\mathbb{Z}})} where {D_{\mathbb{Z}}} are integers in a division algebra over {\mathbb{Q}};
  • “{G_{\mathbb{Z}} = SU(2p,\mathbb{Z}[\sqrt{2}])}”, i.e., {G_{\mathbb{Z}} = \left\{g\in SL(2p,\mathbb{Z}[\sqrt{2}]): g^{\sigma} = {}^Tg^{-1}\right\}} where {\sigma} is Galois conjugation (and {{}^Tg} is the transpose of {g}).

Before trying to construct adequate {\mathbb{Z}}-forms, let us make some preliminary reductions.

We denote by {P=N_G(U)} the parabolic subgroup normalizing {U} in {G}: more concretely,

\displaystyle P = \left\{g\in G: g=\left(\begin{array}{cc} A & B \\ 0 & D\end{array}\right) \right\} = \{g\in G: g(\mathbb{W}) = \mathbb{W}\}

where {\mathbb{W}:=\mathbb{R}^p\times\{0\}\subset\mathbb{R}^{2p}}.

Next, we consider {U^- = \left\{\left(\begin{array}{cc} I & 0 \\ C & I\end{array}\right)\in G\right\}}. In the literature, {U^-} is called an opposite horospherical subgroup to {U}.

Since {\Gamma} is Zariski dense in {G}, there exists {\gamma_0 = \left(\begin{array}{cc} A_0 & B_0 \\ C_0 & D_0\end{array}\right)\in \Gamma} such that {\gamma_0(\mathbb{W})\oplus \mathbb{W} = \mathbb{R}^{2p}} (i.e., {\textrm{det}(C_0)\neq 0}). By taking a basis of {\mathbb{R}^{2p}} such that {\gamma_0(\mathbb{W}) = \{0\}\times\mathbb{R}^{p}}, we have that

\displaystyle \gamma_0 U\gamma_0^{-1} = U^- \ \ \ \ \ (1)

In particular, {U^-\cap\Gamma} is cocompact in {U^-} (thanks to the assumptions of Theorem 1).

Remark 3 This is one of the few places in the Benoist–Miquel argument where the Zariski denseness of {\Gamma} is used.

Remark 4 In general, the argument above works when {U} is reflexive, that is, {U} is conjugate to an opposite horospherical subgroup {U^-}.

We denote by {P^-=N_G(U^-) = \left\{g\in G: g=\left(\begin{array}{cc} A & 0 \\ C & D\end{array}\right)\in G \right\}} an opposite parabolic subgroup, and

\displaystyle L=P\cap P^- = \left\{g\in G: g=\left(\begin{array}{cc} A & 0 \\ 0 & D\end{array}\right)\in G \right\}

the common Levi subgroup of {P} and {P^-}. In particular, we have decompositions (in semi-direct products)

\displaystyle P=L\,U \quad \textrm{and} \quad P^- = L\,U^-

Let {\mathfrak{u}=\textrm{Lie}(U)} and {\mathfrak{u}^- = \textrm{Lie}(U^-)} be the Lie algebras of {U} and {U^-}. Note that {\Lambda:=\log(\Gamma\cap U)} and {\Lambda^-:=\log(\Gamma\cap U^-)} (resp.) are lattices in {\mathfrak{u}} and {\mathfrak{u}^-} (resp.). In other terms,

\displaystyle \Lambda\in X_{\mathfrak{u}} \textrm{ and } \Lambda^-\in X_{\mathfrak{u}^-}

where {X_{\ast}} is the space of lattices in {\ast\in\{\mathfrak{u}, \mathfrak{u}^-\}}.

Remark 5 Note that {\log g = g-\textrm{Id}_{2p\times 2p}} for all {g\in U} in the context of the example {G=SL(2p,\mathbb{R})} and {U := \left\{\left(\begin{array}{cc} I & B \\ 0 & I\end{array}\right)\in G: B\in M(p,\mathbb{R})\right\}}.
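
The identity in Remark 5 can be checked directly. The following computation is only a sketch, with {N} an auxiliary notation not used elsewhere in this post. Writing {g=\textrm{Id}+N} with {N=\left(\begin{array}{cc} 0 & B \\ 0 & 0\end{array}\right)}, one has {N^2=0}, so that the logarithm series stops after its first term:

\displaystyle \log g = \sum\limits_{k\geq 1}\frac{(-1)^{k+1}}{k}N^k = N = g-\textrm{Id}_{2p\times 2p}

In the same way, {\exp(N)=\textrm{Id}+N}, so that {\exp:\mathfrak{u}\rightarrow U} is simply {X\mapsto \textrm{Id}+X} in this example.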

Observe that {L} is the intersection of the normalizers of {U} and {U^-}. Therefore, {L} acts on the spaces of lattices {X_{\mathfrak{u}}} and {X_{\mathfrak{u}^-}} via the adjoint map {\textrm{Ad}_L} (i.e., by conjugation).

As it turns out, the key step towards the proof of Theorem 1 consists in showing that the {\textrm{Ad}_{L}}-orbits of {\Lambda\in X_{\mathfrak{u}}} and {(\Lambda, \Lambda^-)\in X_{\mathfrak{u}}\times X_{\mathfrak{u}^-}} are closed. In other terms, the proof of Theorem 1 can be divided into two parts:

  • closedness of the {\textrm{Ad}_L}-orbits of {\Lambda\in X_{\mathfrak{u}}} and {(\Lambda, \Lambda^-)\in X_{\mathfrak{u}}\times X_{\mathfrak{u}^-}};
  • construction of the {\mathbb{Z}}-form {G_{\mathbb{Z}}} based on the closedness of the {\textrm{Ad}_L}-orbits above.

In the remainder of this post, we shall establish the closedness of relevant {\textrm{Ad}_L}-orbits. Then, the next post of this series will be dedicated to obtain an adequate {\mathbb{Z}}-form {G_{\mathbb{Z}}} (i.e., arithmeticity) from this closedness property.

Remark 6 Hee Oh’s original argument used Ratner’s theory for the semi-simple part of {L} to derive the desired closedness property. The drawback of this strategy is that it does not allow one to treat some cases (such as {G=SO(2,m)}), and, for this reason, Benoist and Miquel are forced to proceed along the lines below.

1. Closedness of the {\textrm{Ad}_L}-orbit of {\Lambda}

Consider Bruhat’s decomposition {\mathfrak{g} = \mathfrak{u}\oplus\mathfrak{l}\oplus\mathfrak{u}^-} (where {\mathfrak{l}=\textrm{Lie}(L)} is the Lie algebra of {L}) and the corresponding projection {\pi:\mathfrak{g}\rightarrow\mathfrak{u}}.

Given {g\in G}, set {M(g)=\pi \circ \textrm{Ad}_g \circ \pi\in \textrm{End}(\mathfrak{u})}, i.e.,

\displaystyle \textrm{Ad}_g = \left(\begin{array}{ccc} M(g) & \ast & \ast \\ \ast & \ast & \ast \\ \ast & \ast & \ast\end{array}\right),

and consider the Zariski open set

\displaystyle \Omega = U^- \, P = \left\{\left(\begin{array}{cc} A & B \\ C & D \end{array}\right)\in G: \textrm{det} A\neq 0\right\}.
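
For concreteness, this description of {\Omega} can be verified by hand. The block factorization below is a standard Schur complement computation, written here as a sketch for {g=\left(\begin{array}{cc} A & B \\ C & D \end{array}\right)\in G} with {\textrm{det} A\neq 0}:

\displaystyle \left(\begin{array}{cc} A & B \\ C & D \end{array}\right) = \left(\begin{array}{cc} I & 0 \\ CA^{-1} & I \end{array}\right) \left(\begin{array}{cc} A & 0 \\ 0 & D-CA^{-1}B \end{array}\right) \left(\begin{array}{cc} I & A^{-1}B \\ 0 & I \end{array}\right)

The three factors belong to {U^-}, {L} and {U} (resp.), so that {g\in U^-LU = U^-P}. Conversely, multiplying out a product {v\ell u} with {v\in U^-}, {\ell\in L}, {u\in U} shows that its upper-left block is the upper-left block of {\ell}, which is invertible.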

Our first step towards the closedness of the {\textrm{Ad}_L}-orbit of {\Lambda} is to exploit the discreteness of {\Gamma} and the commutativity of {U} in order to get that the actions of the matrices {M(g)}, {g\in \Gamma\cap\Omega}, on the vectors {X} of the lattice {\Lambda} do not produce arbitrarily short vectors:

Proposition 2 The set {\{M(g) X: g\in\Gamma\cap\Omega, X\in\Lambda\}} is closed and discrete in {\mathfrak{u}}.

Proof: Let {g_n = v_n \ell_n u_n\in\Gamma} with {v_n\in U^-}, {\ell_n\in L}, {u_n\in U}, and {X_n\in \Lambda} such that {X_n' :=M(g_n)X_n \rightarrow X_{\infty}'\in \mathfrak{u}}.

Our task is to show that {X_{\infty}' = X_n'} for all {n} sufficiently large.

For this sake, note that a direct calculation reveals that {\textrm{Ad}_u\circ \pi = \pi = \pi\circ \textrm{Ad}_v} for all {u\in U} and {v\in U^-}. In particular, {X_n' = M(g_n) X_n= \textrm{Ad}(\ell_n)X_n}. Now, we use the cocompactness of {\Gamma\cap U^-} to write {v_n = \delta_n^{-1} v_n'} with {\delta_n\in \Gamma\cap U^-} and {v_n'\rightarrow v_{\infty}'} (modulo taking subsequences).

By definition, {\gamma_n:= \delta_n g_n \exp(X_n) g_n^{-1}\delta_n^{-1} = v_n'\ell_n u_n \exp(X_n) u_n^{-1} \ell_n^{-1} v_n'^{-1}\in \Gamma}. Since {U} is commutative, {u_n\exp(X_n)u_n^{-1} = \exp(X_n)} and hence

\displaystyle \Gamma\ni \gamma_n = v_n' \ell_n\exp(X_n)\ell_n^{-1} v_n'^{-1} = v_n' \exp(X_n') v_n'^{-1} \rightarrow v_{\infty}' \exp(X_{\infty}') v_{\infty}'^{-1}

as {n\rightarrow \infty}. Because {\Gamma} is discrete, it follows that {\textrm{Ad}_{v_n'}(X_n') = \textrm{Ad}_{v_{\infty}'}(X_{\infty}')} for all {n} sufficiently large. Therefore, {X_n'=\pi\circ \textrm{Ad}_{v_n'}(X_n') = \pi\circ\textrm{Ad}_{v_{\infty}'}(X_{\infty}') = X_{\infty}'} for all {n} sufficiently large. This completes the proof of the proposition. \Box

Remark 7 The fact that {U} is commutative plays a key role in the proof of this proposition.

Next, we shall combine this proposition with Mahler’s compactness criterion to study the set of determinants of the matrices {M(g)} for {g\in\Gamma}.

Proposition 3 Let {\Phi(g) = \textrm{det}_{\mathfrak{u}} M(g) = \textrm{det}(A)^{2p}} for {g=\left(\begin{array}{cc} A & B \\ C & D \end{array}\right)\in G}. Then, the set {\{\Phi(g): g\in \Gamma\}} is closed and discrete in {\mathbb{R}}.

Proof: Given {g_n\in\Gamma} such that {\Phi(g_n)\rightarrow\Phi_{\infty}}, we want to show that {\Phi(g_n) = \Phi_{\infty}} for all {n} sufficiently large.

By contradiction, let us assume that this is not the case. In particular, there is a subsequence {g_{n_k}} with {\Phi(g_{n_k})\neq 0}, i.e., {g_{n_k}\in\Gamma\cap\Omega}, and also {\Phi(g_{n_k})\neq \Phi_{\infty}} for all {k}. Note that, by definition, {\Phi(g_{n_k})} is the covolume of {M(g_{n_k})\Lambda}.

By Proposition 2, {M(g_{n_k})\Lambda} doesn’t have small vectors. Since these lattices also have bounded covolumes (because {\Phi(g_{n_k})\rightarrow\Phi_{\infty}}), we can invoke Mahler’s compactness criterion to extract a subsequence {g_{n_c}} such that {M(g_{n_c})\Lambda\rightarrow \Lambda_{\infty}}. In this setting, Proposition 2 says that we must have {M(g_{n_c})\Lambda = \Lambda_{\infty}} for all {c} sufficiently large, so that {\Phi(g_{n_c}) = \Phi_{\infty}} for all {c} sufficiently large, a contradiction. \Box
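
The formula {\Phi(g)=\textrm{det}(A)^{2p}} in the statement of Proposition 3 can be recovered as follows. This is a sketch, using the identification of {\mathfrak{u}} with {M(p,\mathbb{R})} via {\left(\begin{array}{cc} 0 & B \\ 0 & 0 \end{array}\right)\mapsto B}. For {\ell = \left(\begin{array}{cc} A & 0 \\ 0 & D \end{array}\right)\in L}, one has

\displaystyle \textrm{Ad}_{\ell}\left(\begin{array}{cc} 0 & B \\ 0 & 0 \end{array}\right) = \left(\begin{array}{cc} 0 & ABD^{-1} \\ 0 & 0 \end{array}\right)

and the linear map {B\mapsto ABD^{-1}} of {M(p,\mathbb{R})} has determinant {\textrm{det}(A)^p\,\textrm{det}(D)^{-p}}. Since {\textrm{det}(A)\,\textrm{det}(D)=1}, we get

\displaystyle \textrm{det}_{\mathfrak{u}}\textrm{Ad}_{\ell} = \textrm{det}(A)^p\,\textrm{det}(D)^{-p} = \textrm{det}(A)^{2p}

For a general {g=v\ell u\in\Omega}, the proof of Proposition 2 gives {M(g)=\textrm{Ad}_{\ell}} on {\mathfrak{u}}, and a direct multiplication shows that {g} and {\ell} have the same upper-left block {A}.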

Now, we shall modify {\Phi} to obtain a polynomial function {F} on {\mathfrak{u}}. For this sake, we recall the element {\gamma_0 = \left(\begin{array}{cc} A_0 & B_0 \\ C_0 & D_0 \end{array}\right)\in \Gamma} introduced in (1) (conjugating {U} and {U^-}). In this context,

\displaystyle F(X) := \Phi(\exp(X)\gamma_0) = (\textrm{det}(C_0))^{2p} (\textrm{det}(B))^{2p}

is a polynomial function of {X= \left(\begin{array}{cc} 0 & B \\ 0 & 0 \end{array}\right)\in \mathfrak{u}}. Moreover, an immediate consequence of Proposition 3 is:

Corollary 4 {F(\Lambda)} is a closed and discrete subset of {\mathbb{R}}.
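
The closed formula for {F} stated before Corollary 4 comes from a one-line block multiplication. As a sketch, assuming (as in the choice of basis made before (1)) that {\gamma_0(\mathbb{W}) = \{0\}\times\mathbb{R}^p}, which forces {A_0=0}:

\displaystyle \exp(X)\gamma_0 = \left(\begin{array}{cc} I & B \\ 0 & I \end{array}\right)\left(\begin{array}{cc} 0 & B_0 \\ C_0 & D_0 \end{array}\right) = \left(\begin{array}{cc} BC_0 & B_0+BD_0 \\ C_0 & D_0 \end{array}\right)

so that {\Phi(\exp(X)\gamma_0) = \textrm{det}(BC_0)^{2p} = (\textrm{det}(C_0))^{2p}(\textrm{det}(B))^{2p}}. In particular, {F} is a homogeneous polynomial of degree {2p^2} in the entries of {B}.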

The polynomial {F} is relevant to our discussion because it is intimately connected to the action of {\textrm{Ad}_L}:

Remark 8

  • on one hand, a straightforward calculation reveals that {F\circ \textrm{Ad}_{\ell}} is proportional to {F} for all {\ell\in L} (i.e., {F\circ \textrm{Ad}_{\ell} = \lambda_{\ell} F} for some [explicit] {\lambda_{\ell}\in \mathbb{R}}): see Lemma 3.12 of Benoist–Miquel paper;
  • on the other hand, some purely algebraic considerations show that {\textrm{Ad}_L} is the virtual stabilizer of the proportionality class of {F}, i.e., {\textrm{Ad}_L} is a finite-index subgroup of {\{\varphi\in \textrm{Aut}(\mathfrak{u}): F\circ \varphi \textrm{ is proportional to } F\}}: see Proposition 3.13 of Benoist–Miquel paper.

At this stage, we are ready to establish the closedness of the {\textrm{Ad}_L}-orbit of {\Lambda}:

Theorem 5 The {\textrm{Ad}_L}-orbit of {\Lambda} is closed in {X_{\mathfrak{u}}}.

Proof: Given {\ell_n\in L} such that {\textrm{Ad}_{\ell_n}\Lambda\rightarrow\Lambda_{\infty}}, we write {\textrm{Ad}_{\ell_n}\Lambda = \varphi_n(\Lambda_{\infty})} with {\varphi_n\in \textrm{Aut}(\mathfrak{u})} converging to the identity element {e\in\textrm{Aut}(\mathfrak{u})}.

Our task is reduced to showing that {\varphi_{n}\in\textrm{Ad}_L} for all {n} sufficiently large. For this sake, it suffices to find {\varphi_{n_k}} stabilizing the proportionality class of the polynomial {F} (thanks to Remark 8).

In this direction, we take {X\in\Lambda_{\infty}}. By definition, {\varphi_n(X)\in \textrm{Ad}_{\ell_n}\Lambda}. Also, by Remark 8, we know that {F(\textrm{Ad}_{\ell_n}\Lambda) = \lambda_{\ell_n} F(\Lambda)} for some constant {\lambda_{\ell_n}} depending on the covolume {\Phi(\ell_n)} of {\textrm{Ad}_{\ell_n}\Lambda}. In particular, {F(\varphi_n X)\in\lambda_{\ell_n} F(\Lambda)}.

As it turns out, since the lattices {\textrm{Ad}_{\ell_n}\Lambda} converge to {\Lambda_{\infty}}, one can check that the quantities {\lambda_{\ell_n}} converge to some {\lambda_{\infty}\neq 0} related to the covolume of {\Lambda_{\infty}}. Moreover, {\lambda_{\ell_n} F(\Lambda)\ni F(\varphi_n X)\rightarrow F(X)} because {\varphi_n\rightarrow e}. Hence, we can apply Corollary 4 to deduce that {F(X)\in\lambda_{\infty} F(\Lambda)} and

\displaystyle F(\varphi_n X) = (\lambda_{\ell_n}/\lambda_{\infty}) F(X)

for all {n} sufficiently large depending on {X\in\Lambda_{\infty}}, say {n\geq n(X)} (indeed, {F(\varphi_n X)/\lambda_{\ell_n}} is a sequence in the closed and discrete set {F(\Lambda)} converging to {F(X)/\lambda_{\infty}}, so that it is eventually constant).

At this point, we observe that the degrees of the polynomials {F\circ\varphi_n - (\lambda_{\ell_n}/\lambda_{\infty}) F} are uniformly bounded and {\Lambda_{\infty}} is Zariski-dense in {\mathfrak{u}}. Thus, we can choose {n_0\in\mathbb{N}} such that

\displaystyle F(\varphi_n X) = (\lambda_{\ell_n}/\lambda_{\infty}) F(X)

for all {n\geq n_0} and {X\in\mathfrak{u}}.

In other terms, {\varphi_n\in\textrm{Aut}(\mathfrak{u})} stabilizes the proportionality class of {F} for all {n\geq n_0}. It follows from Remark 8 that {\varphi_n\in\textrm{Ad}_L} for all {n} sufficiently large. This completes the argument. \Box

2. Closedness of the {\textrm{Ad}_L}-orbit of {(\Lambda, \Lambda^-)}

The proof of the fact that the {\textrm{Ad}_L}-orbit of {(\Lambda,\Lambda^-)} is closed in {X_{\mathfrak{u}} \times X_{\mathfrak{u}^-}} follows the same ideas as the previous section: one introduces the polynomial {G(X,Y)=\Phi(\exp(X)\exp(Y))} for {(X,Y)\in \mathfrak{u}\times \mathfrak{u}^-}, one shows that {G(\Lambda\times\Lambda^-)} is closed and discrete in {\mathbb{R}}, and one exploits this information to get the desired conclusion.
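
In the matrix model of this post, the polynomial {G} can be written explicitly. As a sketch, for {X=\left(\begin{array}{cc} 0 & B \\ 0 & 0 \end{array}\right)\in\mathfrak{u}} and {Y=\left(\begin{array}{cc} 0 & 0 \\ C & 0 \end{array}\right)\in\mathfrak{u}^-}:

\displaystyle \exp(X)\exp(Y) = \left(\begin{array}{cc} I & B \\ 0 & I \end{array}\right)\left(\begin{array}{cc} I & 0 \\ C & I \end{array}\right) = \left(\begin{array}{cc} I+BC & B \\ C & I \end{array}\right)

so that the formula for {\Phi} in Proposition 3 gives {G(X,Y) = \textrm{det}(I+BC)^{2p}}.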

In particular, our discussion of the first half of the proof of Theorem 1 is complete. Next time, we will see how this information can be used to derive the arithmeticity of {\Gamma}. We end this post with the following remark:

Remark 9 Roughly speaking, we covered Section 3 of Benoist–Miquel article (and the reader is invited to consult it for more details about all results mentioned above). Finally, a closer inspection of the arguments shows that the statements are true in greater generality provided {U} is reflexive and commutative (cf. Remarks 4 and 7).

Last week, Jon Chaika, Jing Tao and I co-organized the Summer School on Teichmüller Theory and its Connections to Geometry, Topology and Dynamics at Fields Institute.

This activity was part of the Thematic Program on Teichmüller Theory and its Connections to Geometry, Topology and Dynamics, and it consisted of four excellent minicourses by Yves Benoist, Hee Oh, Giulio Tiozzo and Alex Wright.

These minicourses were fully recorded and the corresponding videos will be available at Fields Institute video archive in the near future.

Meanwhile, I decided to transcribe my notes of Benoist’s minicourse in a series of four posts (corresponding to the four lectures delivered by him).

Today, we shall begin this series by discussing the statement of the main result of Benoist’s minicourse, namely:

Theorem 1 (Oh, Benoist–Miquel) Let {G} be a semisimple algebraic Lie group of real rank {\textrm{rank}_{\mathbb{R}}(G)\geq 2}. Suppose that {U} is a horospherical subgroup of {G}, and assume that {\Gamma} is a Zariski dense and irreducible subgroup of {G} such that {U\cap \Gamma} is cocompact. Then, there exists an arithmetic subgroup {G_{\mathbb{Z}}} such that {\Gamma} and {G_{\mathbb{Z}}} are commensurable.

The basic reference for the proof of this theorem (conjectured by Margulis) is the original article by Benoist and Miquel. This theorem completes the discussion in Hee Oh’s thesis where she dealt with many families of examples of semisimple Lie groups {G} (as Hee Oh kindly pointed out to me, the reader can find more details about her contributions to Theorem 1 in these articles here).

Remark 1 I came across Benoist–Miquel theorem during my attempts to understand a question by Sarnak about the nature of Kontsevich–Zorich monodromies. In particular, I’m thankful to Yves Benoist for explaining in his minicourse the proof of a result that Pascal Hubert and I used as a black box in our recent preprint here.

Below the fold, the reader will find my notes of the first lecture of Benoist’s minicourse (whose goal was simply to discuss several keywords in the statement of Theorem 1).


During the preparation of my joint articles with K. Burns, H. Masur and A. Wilkinson about the rates of mixing of the Weil-Petersson geodesic flow (on moduli spaces of Riemann surfaces), we exchanged some emails with S. Wolpert about the sectional curvatures of the Weil-Petersson metric near the boundary of moduli spaces.

As it turns out, Wolpert communicated to us an interesting mechanism to show that some sectional curvatures can be exponentially small in terms of the square of the distance to the boundary.

On the other hand, this mechanism does not seem to be well-known: indeed, I was asked on many occasions about the behavior of the Weil-Petersson sectional curvatures near the boundary, and each time my colleagues were surprised by Wolpert’s examples.

In this short post, I will try to describe Wolpert’s construction of tiny Weil-Petersson sectional curvatures. (Of course, all mistakes below are my responsibility.)

1. Weil-Petersson metric

Recall that the cotangent bundle to the moduli space of Riemann surfaces is naturally identified with the space of quadratic differentials on Riemann surfaces.

The Weil-Petersson inner product {\langle\phi,\psi\rangle_{WP}} between two quadratic differentials {\phi} and {\psi} on a Riemann surface {S} is

\displaystyle \langle\phi,\psi\rangle_{WP} = \int_S \phi\overline{\psi}(ds^2)^{-1} \ \ \ \ \ (1)

where {ds^2} is the hyperbolic metric of {S}.

Remark 1 The quadratic differentials {\phi} and {\psi} are locally given by {\phi=f(z)dz^2} and {\psi=g(z)dz^2}, so that {\phi\overline{\psi} = f(z) \overline{g(z)} dz^2d\overline{z}^2 = f(z) \overline{g(z)} |dz|^4}. In particular, we use the hyperbolic metric to obtain an {L^2}-type formula (because the area form is {|dz|^2}).

The incomplete, smooth, Kähler, negatively curved Riemannian metrics on moduli spaces of Riemann surfaces induced by the Weil-Petersson inner products are the so-called Weil-Petersson (WP) metrics.

Recall that the moduli spaces of Riemann surfaces are not compact because a hyperbolic closed geodesic {\alpha} on a Riemann surface {S} might have arbitrarily small hyperbolic length {\ell_S(\alpha)}. Moreover, the Weil-Petersson metric is incomplete because we can pinch off a hyperbolic closed geodesic {\alpha} on {S} in finite time {\leq \ell_S(\alpha)^{1/2}}. Nevertheless, the natural compactification of the moduli space of Riemann surfaces with respect to the Weil-Petersson metric turns out to be the Deligne-Mumford compactification where one adds a boundary by including stable nodal Riemann surfaces into the picture. (See Burns-Masur-Wilkinson paper and the references therein for more details.)

Today, we are interested in the order of magnitude of the Weil-Petersson sectional curvatures at a point {X} of the moduli space of Riemann surfaces. More concretely, we want to understand WP sectional curvatures {K} of cotangent planes to {X} in terms of the distance {d} of {X} to the boundary (of the Deligne-Mumford compactification).

By Wolpert’s work, we know that {-K=O(1/d)}, i.e., WP sectional curvatures {K} are bounded away from {-\infty} by a polynomial function of the inverse {1/d} of the distance to the boundary.

On the other hand, a potential cancellation in Wolpert’s formulas for WP curvatures makes it hard to infer upper bounds on WP sectional curvatures in terms of {1/d}. (Nevertheless, the situation is better understood for holomorphic sectional curvatures and WP Ricci curvatures: see, e.g., Melrose-Zhu paper.)

In any event, Wolpert discovered that there is no chance to expect an upper bound on all WP sectional curvatures {K} at {X} in terms of a polynomial function of the distance {d} of {X} to the boundary: in fact, we will see below that Wolpert constructed examples of Riemann surfaces {X} where some WP sectional curvature {K} behaves like {-K\sim \exp(-1/d^2)}.

2. Plumbing coordinates

The geometry of a Riemann surface near the boundary of moduli space is described by the so-called plumbing construction.

Roughly speaking, if a Riemann surface {X} is close to acquiring a node at a curve {\alpha}, then we can describe an annular region {A} surrounding {\alpha} using a complex parameter {t\in\mathbb{C}} with {|t|\ll 1} and two complex coordinates {z} and {w} with the following properties.

The curve {\alpha} separates the annular region {A} into two components {A_1} and {A_2}. The coordinate {z} takes {A} to {\{|t|\leq |z|\leq 1\}} in such a way {\alpha} is mapped to {\{|z|=\sqrt{|t|}\}} and {A_1} is mapped into {\{\sqrt{|t|}\leq |z|\leq1\}}. Similarly, the coordinate {w} takes {A} to {\{|t|\leq |w|\leq 1\}} in such a way {\alpha} is mapped to {\{|w|=\sqrt{|t|}\}} and {A_2} is mapped into {\{\sqrt{|t|}\leq |w|\leq1\}}. Furthermore, we recover the annular region {A} by identifying points via the relation

\displaystyle zw=t

In the figure below, we depicted a Riemann surface {X_t} obtained from this plumbing construction near a curve separating it into two tori.

Remark 2 In the plumbing construction, the size {|t|} of the parameter {t} gives a bound on the distance of {X} to the boundary of moduli space: indeed, the hyperbolic length of the geodesic representative of {\alpha} is {\sim 1/\log(1/|t|)}. Of course, this is coherent with the idea that {zw=0} describes a node. Also, the phase of {t} is related to the so-called twist parameters.

3. Tiny Weil-Petersson curvature

Consider the plumbing construction in the figure above. It illustrates a curve {\alpha_t} separating a Riemann surface {X_t} of genus {2} into two tori {T_1} and {T_2} with natural coordinates {z} and {w}. In these coordinates, the curve {\alpha_t} is {\{z:|z|=|t|^{1/2}\} = \{w:|w|=|t|^{1/2}\}}.

We start with the quadratic differential {\psi_1=dz^2} on {T_1}. If we want to extend {\psi_1} to {T_2} and, a fortiori, {X_t}, then we need to understand the behavior of {\psi_1} in the portion {\{|t|^{1/2}\leq |w|\leq 1\}} of {T_2} intersecting the annular region surrounding {\alpha_t}. In other words, we have to describe {\psi_1} in {w}-coordinates.

For this sake, we recall that the definition of plumbing construction says that {zw=t}. Thus, {z=t/w} and the formula {dz^2 = t^2 w^{-4} dw^2} allows us to extend {\psi_1} to {X_t}.
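
As a quick check of this formula (a routine computation added here for convenience), differentiating {z=t/w} with {t} fixed gives

\displaystyle dz = -\frac{t}{w^2}\,dw, \qquad dz^2 = \frac{t^2}{w^4}\,dw^2

Note that the coefficient {t^2/w^4} is holomorphic on the annulus {\{|t|^{1/2}\leq |w|\leq 1\}}, where it is bounded by {1} in absolute value because {|w|^4\geq |t|^2} there.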

This description has the following interesting consequence: while the Weil-Petersson size of {\psi_1} on {T_1} is {\sim 1} (because {\psi_1|_{T_1}=dz^2}), the Weil-Petersson size (cf. (1)) of {\psi_1} on {T_2} is {\sim |t|} because

\displaystyle \int_{|t|^{1/2}\leq |w|\leq 1} \underbrace{\frac{|t|^4}{|w|^8}\frac{(dw d\overline{w})^2}{|dw|^2}}_{L^2\textrm{-norm of } \psi_1=t^2w^{-4} dw^2} \underbrace{|w|^2\log^2|w|}_{(\textrm{hyperbolic metric})^{-1}}\sim \int_{|t|^{1/2}\leq |w|\leq 1} \frac{|t|^4}{|w|^6} |dw|^2

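\displaystyle \sim \int_{\sqrt{|t|}}^1 \frac{|t|^4}{r^6}r dr = \frac{|t|^2}{4}(1-|t|^2)\sim |t|^2,

so that the corresponding norm (the square root of this integral) is {\sim |t|}.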

By exchanging the roles of the subindices {1} and {2} in the previous discussion, we also get a quadratic differential {\psi_2} with Weil-Petersson size {\sim 1} on {T_2} and {\sim |t|} on {T_1}.

Remark 3 The reader is invited to consult Sections 2 and 3 of this paper of Wolpert for a more detailed discussion of quadratic differentials on Riemann surfaces coming from plumbing constructions.

At this point, Wolpert notices that the Weil-Petersson sectional curvature of the plane {P(\psi_1, \psi_2)} spanned by {\psi_1} and {\psi_2} is tiny in the following sense.

It is explained in this paper of Wolpert that the Weil-Petersson curvature is a sort of “{L^4}-norm” which in the case of {P(\psi_1,\psi_2)} corresponds to simply computing the size of the product {\psi_1\psi_2}. Since {\psi_n} has size {\sim 1} on {T_n} and {\sim |t|} on {T_{3-n}} for {n\in\{1,2\}}, the product {\psi_1\psi_2} has size {\sim |t|} on {X_t=T_1\cup T_2}. In summary, the Weil-Petersson curvature of {P(\psi_1,\psi_2)} is

\displaystyle \sim-|t|

On the other hand, the geodesic representative of {\alpha_t} has hyperbolic length {\ell(t)} satisfying

\displaystyle \ell(t)\sim1/\log(1/|t|),

so that the Weil-Petersson distance of {X_t} to the boundary of moduli space is

\displaystyle \sim\ell(t)^{1/2}\sim 1/\log^{1/2}(1/|t|)

In summary, we exhibited Riemann surfaces {X} at Weil-Petersson distance {d\rightarrow 0} to the boundary of moduli space where some Weil-Petersson sectional curvature has size

\displaystyle \sim -\exp(-1/d^2)
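
The exponent comes from inverting the relation between {d} and {|t|}. As a heuristic sketch, ignoring multiplicative constants (including those in the exponent) as in the {\sim} notation above:

\displaystyle d\sim \frac{1}{\log^{1/2}(1/|t|)} \iff \log(1/|t|)\sim \frac{1}{d^2} \iff |t|\sim \exp(-1/d^2)

and it remains to recall that the curvature of {P(\psi_1,\psi_2)} is {\sim -|t|}.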

The collection of best constants {c} for the Diophantine approximation problem of finding infinitely many rational solutions {p/q\in\mathbb{Q}} to the inequality

\displaystyle |\alpha-\frac{p}{q}|<\frac{1}{cq^2}

with {\alpha\in\mathbb{R}\setminus\mathbb{Q}} is encoded by the so-called Lagrange spectrum {L}.

In a similar vein, the Markov spectrum {M} encodes best constants for a Diophantine problem involving indefinite binary quadratic real forms.

These spectra were first studied in a systematic way by A. Markov in 1880, and, since then, their structures attracted the attention of several mathematicians (including Hurwitz, Perron, etc.).

Among the basic properties of these spectra, it is worth mentioning that {L\subset M} are closed subsets of the real line. Moreover, the works of Markov from 1880 and Hall from 1947 imply that

\displaystyle L\cap(-\infty, 3) = M\cap(-\infty, 3) = \{\sqrt{5}<\sqrt{8}<\dots\}

is an increasing sequence of quadratic surds converging to {3}, and

\displaystyle L\cap[6,\infty) = M\cap[6,\infty) = [6,\infty)
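
The discrete part of these spectra below {3} can be listed explicitly. By Markov's theorem, its elements are the numbers {\sqrt{9-4/m^2}} where {m} runs over the Markov numbers, i.e., the positive integers appearing in solutions of {x^2+y^2+z^2=3xyz}. The Python sketch below is only an illustration (the function names `markov_numbers` and `spectrum_below_three` are ours and do not come from the literature): it enumerates Markov numbers by walking the tree of Markov triples starting from {(1,1,1)}.

```python
from math import sqrt


def markov_numbers(bound):
    """Return the sorted list of Markov numbers <= bound.

    Markov triples are the positive integer solutions of
    x^2 + y^2 + z^2 = 3xyz; all of them are reached from (1, 1, 1)
    by the moves (x, y, z) -> (x, y, 3xy - z) and its permutations.
    """
    seen = set()
    numbers = set()
    stack = [(1, 1, 1)]
    while stack:
        triple = tuple(sorted(stack.pop()))
        # Skip triples already visited or whose largest entry is too big.
        if triple in seen or triple[2] > bound:
            continue
        seen.add(triple)
        numbers.update(triple)
        x, y, z = triple
        stack.append((x, y, 3 * x * y - z))
        stack.append((x, z, 3 * x * z - y))
        stack.append((y, z, 3 * y * z - x))
    return sorted(numbers)


def spectrum_below_three(bound):
    """Values sqrt(9 - 4/m^2) for the Markov numbers m <= bound."""
    return [sqrt(9 - 4 / m**2) for m in markov_numbers(bound)]


if __name__ == "__main__":
    for m, value in zip(markov_numbers(100), spectrum_below_three(100)):
        print(m, value)
```

The first two values are {\sqrt{5}} (for {m=1}) and {\sqrt{8}} (for {m=2}), in agreement with the display above, and all values stay below {3} since {9-4/m^2<9}.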

On the other hand, it took some time to decide whether {L=M}. Indeed, Freiman proved in 1968 that {M\setminus L\neq\emptyset} by exhibiting a countable (infinite) collection of isolated points in {M\setminus L}. After that, Freiman constructed in 1973 an element of {M\setminus L} which was shown to be a non-isolated point of {M\setminus L} by Flahive in 1977.

A common feature of these examples of elements in {M\setminus L} is the fact that they occur before {\sqrt{12}=3.46\dots}. In 1975, Cusick conjectured that there were no elements in {M\setminus L} beyond {\sqrt{12}}.

In our preprint uploaded to arXiv a couple of days ago, Gugu and I provide the following negative answer to Cusick’s conjecture:

Theorem 1 The Hausdorff dimension of {(M\setminus L)\cap (3.7, 3.71)} is {\geq 0.53128}.

Below the fold, we give an outline of the proof of this theorem.

Remark 1 The basic reference for this post is the classical book of Cusick and Flahive.

