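Let $T:X\to X$ (resp. $T^t:X\to X$, $t\in\mathbb{R}$) be a discrete-time (resp. continuous-time) dynamical system preserving a probability measure $\mu$.
We say that $T$ (resp. $T^t$) is mixing when $\mu(A\cap T^{-n}(B))\to\mu(A)\mu(B)$ as $n\to\infty$ (resp. $\mu(A\cap T^{-t}(B))\to\mu(A)\mu(B)$ as $t\to\infty$) for all measurable sets $A$, $B$. Intuitively, a dynamical system is mixing whenever the events $A$ and $T^{-n}(B)$ become “independent” as $n\to\infty$.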
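Equivalently, $T$, resp. $T^t$, is mixing when, given two observables $f$, $g$ in $L^2(\mu)$, the correlation function
$$C_{f,g}(n):=\int f\cdot(g\circ T^n)\,d\mu-\int f\,d\mu\int g\,d\mu,\quad\textrm{resp. }C_{f,g}(t):=\int f\cdot(g\circ T^t)\,d\mu-\int f\,d\mu\int g\,d\mu,$$
goes to zero as $n\to\infty$, resp. $t\to\infty$.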
Remark 1 The equivalence between the two definitions of mixing can be easily seen by noticing that $\mu(A\cap T^{-n}(B))-\mu(A)\mu(B)=C_{\chi_A,\chi_B}(n)$ (where $\chi_A$ is the characteristic function of $A$) and by recalling that measurable observables are always nicely approximated by linear combinations of characteristic functions.
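Example 1 The Bernoulli shift $\sigma:\Sigma\to\Sigma$ on a space $\Sigma$ of sequences over a finite alphabet, equipped with the Bernoulli (product) measure $\mu$, is a mixing dynamical system. Indeed, since the cylinders generate the Borel $\sigma$-algebra, it suffices to study the correlation function $C_{f,g}(n)$ in the case $f=\chi_A$ and $g=\chi_B$ with $A$, $B$ cylinders, and, in this situation, it is not hard to see that the events $A$ and $\sigma^{-n}(B)$ are independent (i.e., $\mu(A\cap\sigma^{-n}(B))=\mu(A)\mu(B)$) for all $n$ sufficiently large.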
Given a mixing dynamical system, it is natural to ask oneself about the speed of decay of the correlation function $C_{f,g}(n)$, that is, how fast (polynomially? exponentially?) does it go to zero as $n\to\infty$. In fact, besides its intrinsic interest, this question has interesting applications in other areas: for example, the exponential (or at least sufficiently high degree polynomial) decay of correlation functions (for “smooth” observables) of the geodesic flow on hyperbolic manifolds was recently used by J. Kahn and V. Markovic to study fundamental groups of hyperbolic $3$-manifolds (and, in its turn, this work of J. Kahn and V. Markovic was used by I. Agol in his solution of the so-called virtual Haken conjecture).
The long-term goal of this post is the discussion of the so-called Dolgopyat estimate and its application to exponential decay of correlations of certain flows, i.e., continuous-time dynamical systems. In this direction, we will divide this post into two sections. In the next section, we will first discuss exponential decay of correlations for certain discrete-time dynamical systems. Then, in the final section, we will see that, as far as decay of correlations is concerned, the case of flows is quite different from the discrete-time case, and we will discuss how Dolgopyat’s estimate enters into the game.
Remark 2 For a recent application of Dolgopyat’s estimate in a number-theoretical setting, see the paragraph after Theorem 4.6 of this article of J. Bourgain and A. Kontorovich on Zaremba’s conjecture.
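Closing this introduction, let us make the following technically useful remark. In general, while studying correlation functions $C_{f,g}(n)$ (or $C_{f,g}(t)$), we can assume (without loss of generality) that $\int f\,d\mu=0$. Indeed, this is a consequence of the following identities for $\bar f:=f-\int f\,d\mu$:
$$C_{\bar f,g}(n)=\int\bar f\cdot(g\circ T^n)\,d\mu=\int f\cdot(g\circ T^n)\,d\mu-\int f\,d\mu\int g\circ T^n\,d\mu=\int f\cdot(g\circ T^n)\,d\mu-\int f\,d\mu\int g\,d\mu=C_{f,g}(n).$$
Here, in the third equality, we used the fact that $T$ preserves $\mu$ to deduce that $\int g\circ T^n\,d\mu=\int g\,d\mu$. Thus, we can (and do) systematically assume that $\int f\,d\mu=0$ in what follows.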
1. Correlations of certain discrete-time systems
Let’s warm up by studying decay of correlations in the following baby example. Consider $T(x)=\ell x$ (mod 1), $\ell\geq2$ integer, a linear (uniformly) expanding map of the circle $\mathbb{R}/\mathbb{Z}$. Denote by $\mu$ the Lebesgue measure on $\mathbb{R}/\mathbb{Z}$. The reader is invited to check that the Lebesgue measure is $T$-invariant.
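Consider now two analytic observables, say $f(x)=\sum_{k\in\mathbb{Z}}a_ke^{2\pi ikx}$ and $g(x)=\sum_{k\in\mathbb{Z}}b_ke^{2\pi ikx}$ such that, for some fixed $\delta>0$, we have $|a_k|\leq e^{-\delta|k|}$ and $|b_k|\leq e^{-\delta|k|}$, $|k|$ sufficiently large.
Let us now study the correlation $C_{f,g}(n)$. As we mentioned above, we can assume that $\int f\,d\mu=0$ (i.e., $a_0=0$), so that $C_{f,g}(n)=\int f\cdot(g\circ T^n)\,d\mu$. In this setting, we have
$$C_{f,g}(n)=\int\Big(\sum_{j}a_je^{2\pi ijx}\Big)\Big(\sum_{k}b_ke^{2\pi i\ell^nkx}\Big)\,dx=\sum_{k\neq0}a_{-\ell^nk}\,b_k.$$
It follows that
$$|C_{f,g}(n)|\leq\sum_{k\neq0}|a_{-\ell^nk}|\,|b_k|\leq C\sum_{k\neq0}e^{-\delta\ell^n|k|}\leq C'e^{-\delta\ell^n}$$
for all $n$ sufficiently large (where $C=\sup_k|b_k|$ and $C'$ depends only on $C$ and $\delta$),
that is, the correlation decays super-exponentially as $n\to\infty$.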
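The Fourier computation above is easy to test numerically. The following is a minimal sketch, assuming the hypothetical coefficients $a_k=b_k=e^{-\delta|k|}$ for $k\neq0$ and $a_0=b_0=0$ for the linear expanding map $x\mapsto\ell x$ (mod 1); the names `fourier_coeff`, `correlation`, `closed_form` and the truncation level `kmax` are illustrative choices and not part of the argument. It evaluates the sum of $a_{-\ell^nk}b_k$ over $k\neq0$ by truncation and compares it with the geometric-series value.

```python
import math

def fourier_coeff(k, delta):
    """Hypothetical analytic observable: coefficient exp(-delta*|k|), zero mean."""
    return 0.0 if k == 0 else math.exp(-delta * abs(k))

def correlation(n, ell=2, delta=0.5, kmax=200):
    """Truncated sum of a_{-ell^n k} * b_k over 0 < |k| <= kmax."""
    shift = ell ** n
    total = 0.0
    for k in range(-kmax, kmax + 1):
        if k != 0:
            total += fourier_coeff(-shift * k, delta) * fourier_coeff(k, delta)
    return total

def closed_form(n, ell=2, delta=0.5):
    """Geometric series 2q/(1-q) with q = exp(-delta*(ell^n + 1))."""
    q = math.exp(-delta * (ell ** n + 1))
    return 2.0 * q / (1.0 - q)

for n in range(6):
    print(n, correlation(n), closed_form(n))
```

For these particular coefficients the ratio of consecutive correlations itself tends to zero, which is one way to see that the decay is faster than any exponential rate.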
Despite its simplicity, this calculation contains some important ideas in the study of correlations of hyperbolic systems.
Firstly, the constant $\ell$ “responsible” for the decay of $C_{f,g}(n)$ is strongly related to the “expanding” (“hyperbolic”) nature of $T(x)=\ell x$, $\ell\geq2$: indeed, by composing $g$ with $T^n$ and testing against $f$, we are “shifting” the Fourier modes of $g$ from their original spot $k$ to the spot $\ell^nk$; geometrically, this means that the iterates of the expanding map are “spreading” the support of $f$ around the circle, so that the variations (derivatives) of the resulting function at any given scale become weaker as $n$ increases. This “spreading” and “smoothing” process due to the expanding features of $T$ is illustrated in the following picture (Figure 1):
Secondly, the decay of the correlations is measured by the quantity $\delta$ (the decay rate of the Fourier coefficients) related to the “oscillations” (sizes of “derivatives”) of $f$. In particular, it is important to impose some regularity on the observables $f$ and $g$ and then try to estimate $C_{f,g}(n)$ in terms of certain norms capturing the fine structure (oscillations) of $f$ and $g$ at small scales (such as Hölder norms, $C^r$ norms, Sobolev norms, bounded variation norms, analytic norms, etc.). Of course, there is no unique choice of such norms but the main point here is that some regularity better than mere integrability or boundedness is certainly needed: for instance, in Figure 1 above, the two functions shown have the same size in such a coarse norm even though one of them “oscillates” more than the other, i.e., the coarse norm doesn’t see the difference between their fine structures. Actually, as it turns out, even for dynamical systems such as the Bernoulli shift and the map $x\mapsto2x$ (mod 1) that we would like to call “exponentially mixing”, it is possible to select observables $f$ and $g$ with low regularity (for instance not Hölder) such that the correlation $C_{f,g}(n)$ decays “slowly”. We refer the reader to Subsection 1.4 of V. Baladi’s book for more comments on such examples.
After our warm up with the baby example $T(x)=\ell x$ (mod 1) on the circle $\mathbb{R}/\mathbb{Z}$, let us mention that a Fourier-analytic (Harmonic Analysis) approach to the study of decay of correlations normally works well in algebraic contexts (hyperbolic automorphisms/flows on nilmanifolds), but, in general, it is not well-adapted to treat non-algebraic hyperbolic systems. For this reason, one needs other methods to study the decay of correlations of hyperbolic systems. In the sequel, we will loosely follow the excellent book by Viviane Baladi to describe a popular functional-analytic method for the decay of correlations of uniformly expanding discrete-time systems.
Let $M$ be a connected compact Riemannian manifold without boundary, and let $T:M\to M$ be a uniformly expanding map, i.e., for some $\lambda>1$, one has $\|DT(x)v\|\geq\lambda\|v\|$ for all $x\in M$, $v\in T_xM$.
Denote by $\textrm{Leb}$ the Lebesgue measure on $M$ induced by its Riemannian structure. In the sequel, we will illustrate the application of methods of Functional Analysis to the study of decay of correlations by giving an outline of the proof of the following result:
Theorem 1 Suppose that $T$ is a $C^{1+\alpha}$ expanding map for some $\alpha>0$. Then, $T$ preserves a unique probability $\mu$ absolutely continuous with respect to the Lebesgue measure whose Radon-Nikodym derivative $d\mu/d\textrm{Leb}$ is $C^\alpha$ (i.e., $\alpha$-Hölder continuous) and positive. Moreover, the correlation functions $C_{f,g}(n)$ decay exponentially as $n\to\infty$ whenever $f$ and $g$ are $C^\alpha$ observables.
Remark 3 The assumption that $T$ is $C^{1+\alpha}$ can’t be removed: indeed, it is not hard to construct $C^1$ expanding maps where the existence of absolutely continuous invariant probabilities fails (see, e.g., here and here). Actually, A. Avila and J. Bochi showed here that the non-existence of absolutely continuous invariant probabilities is a generic property among $C^1$ maps.
Historically, the first part of this theorem (concerning the existence and uniqueness of an absolutely continuous invariant probability with Hölder density) is due to K. Krzyzewski and W. Szlenk, while the presentation below of the exponential decay of correlations is due to C. Liverani.
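The first step towards the proof of Theorem 1 is the following observation. Suppose that $\nu=\varphi\,\textrm{Leb}$ is an absolutely continuous measure. Then, the pullback $T^*\nu$ of $\nu$ by $T$ (defined as $T^*\nu(A):=\nu(T^{-1}(A))$) satisfies
$$T^*\nu=(\mathcal{L}\varphi)\,\textrm{Leb},\quad\textrm{where}\quad\mathcal{L}\varphi(x):=\sum_{T(y)=x}\frac{\varphi(y)}{|\det DT(y)|}$$
is the so-called Ruelle’s transfer operator. The verification of this identity is a simple application of the change of variables theorem and it is left to the reader as an exercise.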
Remark 4 In the previous identity it is implicit that, as $T$ is a local diffeomorphism of a compact manifold $M$, the function $|\det DT|$ is bounded away from $0$ and $\infty$, and the quantity $\#T^{-1}(x)$ is finite for every $x\in M$ (actually, the quantity $\#T^{-1}(x)$ is also independent of $x$ and it is the so-called degree of $T$).
Remark 5 More generally, given a function $\psi:M\to\mathbb{R}$, we can define the transfer operator $\mathcal{L}_\psi\varphi(x):=\sum_{T(y)=x}\psi(y)\varphi(y)$. In this language, the transfer operator defined above is $\mathcal{L}_{1/|\det DT|}$. In the literature, these slightly generalized transfer operators are important: for instance, they allow one to construct a class of dynamically relevant invariant probabilities called equilibrium states (see V. Baladi’s book for more explanations).
In words, this identity says that the transfer operator keeps track of what happens with the density when we pull back the measure (in other words, the pullback operator on measures and the transfer operator are in a sort of duality).
In terms of the transfer operator, an absolutely continuous probability $\mu=\varphi_0\,\textrm{Leb}$ that is $T$-invariant (i.e., $\mu(T^{-1}(A))=\mu(A)$ for all measurable sets $A$) corresponds to a fixed point $\varphi_0$ of $\mathcal{L}$. In particular, this hints that functional-analytic methods might be useful to find $\varphi_0$.
In our setting, the basic idea is very simple: we will find the fixed point $\varphi_0$ by iterating $\mathcal{L}$, or, more precisely, we will show that the sequence $\mathcal{L}^n1$ “converges” to $\varphi_0$ (where $1$ is the constant function). However, before talking about “convergence”, we need to specify a nice functional space where this convergence will take place.
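To get a feeling for this iteration scheme, here is a minimal numerical sketch. It assumes a hypothetical expanding circle map $T(y)=2y+\frac{\varepsilon}{2\pi}\sin(2\pi y)$ (mod 1) with $\varepsilon=0.3$ (this map, the grid size and all function names are illustrative choices, not taken from the discussion above), and it evaluates the $n$-th iterate of the transfer operator applied to the constant function $1$ pointwise, by summing $1/|(T^n)'|$ over the preimages under $T^n$.

```python
import math

EPS = 0.3  # hypothetical size of the nonlinear perturbation (must be < 1)

def lift(y):
    """Lift of T(y) = 2y + (EPS/2pi) sin(2 pi y) mod 1; increasing from 0 to 2."""
    return 2.0 * y + EPS / (2.0 * math.pi) * math.sin(2.0 * math.pi * y)

def derivative(y):
    """T'(y) = 2 + EPS cos(2 pi y) >= 2 - EPS > 1, so T is expanding."""
    return 2.0 + EPS * math.cos(2.0 * math.pi * y)

def preimages(x):
    """The two solutions y in [0, 1) of T(y) = x, found by bisection."""
    points = []
    for branch in (0, 1):
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if lift(mid) < x + branch:
                lo = mid
            else:
                hi = mid
        points.append(0.5 * (lo + hi))
    return points

def iterate_transfer(x, n):
    """Value at x of the n-th iterate of the transfer operator applied to 1."""
    if n == 0:
        return 1.0
    return sum(iterate_transfer(y, n - 1) / derivative(y) for y in preimages(x))

grid = [(i + 0.5) / 16 for i in range(16)]
density_7 = [iterate_transfer(x, 7) for x in grid]
density_8 = [iterate_transfer(x, 8) for x in grid]
print(max(abs(a - b) for a, b in zip(density_7, density_8)))
```

The printed number is the sup-distance (on the grid) between two consecutive iterates; it should be small, in agreement with the idea that the iterates stabilize towards an invariant density. The cost grows like $2^n$ per point, so this is only meant for small $n$.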
In fact, there is no unique/standard choice of adequate functional spaces for the study of the transfer operator, and this explains why there is a vast literature (see, e.g., these papers of V. Baladi, V. Baladi and S. Gouezel, S. Gouezel and C. Liverani, V. Baladi and M. Tsujii) on the selection of Banach spaces adapted to certain classes of dynamical systems.
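Fortunately, in our current setting, the choice is not hard: we will consider the transfer operator $\mathcal{L}$ acting on the Banach space $C^\alpha(M)$ of $\alpha$-Hölder continuous functions equipped with the norm
$$\|\varphi\|_{C^\alpha}:=\sup_{x\in M}|\varphi(x)|+\sup_{x\neq y}\frac{|\varphi(x)-\varphi(y)|}{d(x,y)^\alpha}$$
where $d$ is the distance induced by the Riemannian structure on $M$.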
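Note that $\mathcal{L}$ preserves $C^\alpha(M)$: indeed, as $T$ is a $C^{1+\alpha}$ expanding map of a compact manifold $M$, we have that each term of the finite sum
$$\mathcal{L}\varphi(x)=\sum_{T(y)=x}\frac{\varphi(y)}{|\det DT(y)|}$$
is a locally uniformly Hölder continuous function of $x$.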
Once we have selected the functional space $C^\alpha(M)$ on which $\mathcal{L}$ acts, we need some sort of “contraction property” in order to check that the sequence $\mathcal{L}^n1$ is actually converging to some $\varphi_0$. For this purpose, we recall below some of Garrett Birkhoff’s results on Hilbert (projective) metrics on positive cones in Banach spaces.
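Let $\mathcal{B}$ be a Banach space and $\mathcal{C}\subset\mathcal{B}$ a closed convex cone (i.e., $\mathcal{C}$ is a closed subset of $\mathcal{B}$ such that $\varphi+\psi\in\mathcal{C}$ if $\varphi$ and $\psi$ belong to $\mathcal{C}$, and $t\varphi\in\mathcal{C}$ whenever $t>0$ and $\varphi\in\mathcal{C}$). A convex cone defines a partial order $\preceq$: we say that $\varphi\preceq\psi$ when $\psi-\varphi\in\mathcal{C}$. This partial order is compatible with the vector space structure of $\mathcal{B}$ (i.e., it is respected by additions and multiplication by positive scalars), and it is continuous (i.e., if $\varphi_n\to\varphi$, $\psi_n\to\psi$ and $\varphi_n\preceq\psi_n$ for all $n$, then $\varphi\preceq\psi$) when the cone is closed.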
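Using the partial order $\preceq$, we can define the so-called Hilbert projective metric $\Theta$ on $\mathcal{C}$ as follows. Given $\varphi,\psi\in\mathcal{C}\setminus\{0\}$, we put
$$a(\varphi,\psi):=\sup\{t>0:t\varphi\preceq\psi\}\quad\textrm{and}\quad b(\varphi,\psi):=\inf\{s>0:\psi\preceq s\varphi\}$$
(with the convention that $a(\varphi,\psi)=0$ and/or $b(\varphi,\psi)=\infty$ when these sets are empty), and we define the Hilbert projective metric as
$$\Theta(\varphi,\psi):=\log\frac{b(\varphi,\psi)}{a(\varphi,\psi)}.$$
It is worth pointing out that $\Theta$ is a natural (and classical) object in Geometry because it induces a metric on the projectivization obtained by quotienting $\mathcal{C}$ by the relation $\varphi\sim\psi$ iff $\varphi=t\psi$ for some $t>0$.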
A key result of Garrett Birkhoff in this context is the following theorem:
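Theorem 2 Let $\mathcal{C}$ be a convex cone in a Banach space $\mathcal{B}$ and let $\mathcal{L}:\mathcal{B}\to\mathcal{B}$ be a linear operator such that $\mathcal{L}(\mathcal{C})\subset\mathcal{C}$. Then,
$$\Theta(\mathcal{L}\varphi,\mathcal{L}\psi)\leq\tanh\Big(\frac{\Delta}{4}\Big)\,\Theta(\varphi,\psi),\quad\textrm{where}\quad\Delta:=\sup_{\varphi',\psi'\in\mathcal{L}(\mathcal{C})}\Theta(\varphi',\psi'),$$
for all $\varphi,\psi\in\mathcal{C}$ (with the convention $\tanh(\infty)=1$).
In a nutshell, this theorem says that any linear operator sending a convex cone inside itself doesn’t increase the Hilbert distance, and, furthermore, it strictly contracts the Hilbert distance if the $\Theta$-diameter $\Delta$ of $\mathcal{L}(\mathcal{C})$ is finite.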
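Birkhoff’s theorem can be tested by hand in a finite-dimensional toy case. The sketch below is only an illustration under a swapped-in setting: instead of cones of Hölder functions, it takes the cone of vectors with positive entries in $\mathbb{R}^2$ and a matrix with positive entries (the matrix `A`, the vectors and all function names are hypothetical choices). For this cone, the Hilbert projective distance between $x$ and $y$ is the logarithm of the ratio between the largest and the smallest of the quotients $y_i/x_i$, and the diameter of the image of the cone is the largest distance between two columns of the matrix.

```python
import math

def hilbert_metric(x, y):
    """Hilbert projective metric on the cone of positive vectors:
    log(b/a) with a = min_i y_i/x_i and b = max_i y_i/x_i."""
    ratios = [yi / xi for xi, yi in zip(x, y)]
    return math.log(max(ratios) / min(ratios))

def apply_matrix(matrix, x):
    """Matrix-vector product with plain lists."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in matrix]

def image_diameter(matrix):
    """Hilbert diameter of the image of the positive cone: the largest
    distance between two columns of a matrix with positive entries."""
    columns = list(zip(*matrix))
    return max(hilbert_metric(c1, c2) for c1 in columns for c2 in columns)

A = [[2.0, 1.0], [1.0, 3.0]]        # hypothetical positive matrix
x, y = [1.0, 2.0], [3.0, 1.0]       # two points of the positive cone
rate = math.tanh(image_diameter(A) / 4.0)
print(hilbert_metric(apply_matrix(A, x), apply_matrix(A, y)),
      rate * hilbert_metric(x, y))
```

The first printed number (the distance after applying the matrix) should not exceed the second one (the contraction rate times the original distance).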
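In the context of transfer operators associated to a $C^{1+\alpha}$ expanding map $T$, the plan is to apply Birkhoff’s result to the following family $\mathcal{C}_L$, $L>0$, of cones in $C^\alpha(M)$:
$$\mathcal{C}_L:=\{\varphi\in C^\alpha(M):\varphi>0\textrm{ and }\varphi(x)\leq e^{L\,d(x,y)^\alpha}\varphi(y)\textrm{ whenever }d(x,y)<\epsilon\}$$
where $\epsilon>0$ is a fixed small scale.
In plain terms, $\mathcal{C}_L$ is the cone of positive $\alpha$-Hölder functions $\varphi$ such that $\log\varphi$ has (local) $\alpha$-Hölder constant at most $L$. As the reader can check, $\mathcal{C}_L$ is a family of closed convex cones in $C^\alpha(M)$.
The family $\mathcal{C}_L$ is well-adapted for our current goals because of the following result: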
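Lemma 3 Let $T$ be a $C^{1+\alpha}$ expanding map, say $\|DT(x)v\|\geq\lambda\|v\|$ for some $\lambda>1$. Then, given $\lambda^{-\alpha}<\sigma<1$, there exists $L_0>0$ such that
$$\mathcal{L}(\mathcal{C}_L)\subset\mathcal{C}_{\sigma L}$$
for every $L\geq L_0$.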
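Proof: We start the argument with the following auxiliary estimate. Given $y,y'\in M$, we have that
$$\frac{|\det DT(y')|}{|\det DT(y)|}\leq e^{K\,d(y,y')^\alpha}$$
where $K:=\lambda^{-m}\sup_{y\neq y'}\big||\det DT(y)|-|\det DT(y')|\big|/d(y,y')^\alpha$. Here, we used that $|\det DT|$ is $C^\alpha$ (because $T$ is $C^{1+\alpha}$), and $|\det DT(x)|\geq\lambda^m$ for all $x\in M$ where $m$ is the dimension of $M$ (because $\|DT(x)v\|\geq\lambda\|v\|$).
After this preliminary estimate, let us consider $\varphi\in\mathcal{C}_L$ and let us try to show that $\mathcal{L}\varphi\in\mathcal{C}_{\sigma L}$ for $L$ sufficiently large. Denote by $k$ the degree of $T$. Given $x\in M$, let us fix a numbering $y_1,\dots,y_k$ of the elements of $T^{-1}(x)$. From the expanding features of $T$, we have that for any $x'$ close to $x$, the elements of $T^{-1}(x')$ can be numbered as $y_1',\dots,y_k'$ in such a way that $d(y_i,y_i')\leq\lambda^{-1}d(x,x')$. Now, let us study the quantity
$$\frac{\mathcal{L}\varphi(x)}{\mathcal{L}\varphi(x')}.$$
The proof of the lemma will be complete if we show that this quantity is bounded from above by $e^{\sigma L\,d(x,x')^\alpha}$ (for $L$ sufficiently large). In this direction, let us note that, by definition,
$$\mathcal{L}\varphi(x)=\sum_{i=1}^k\frac{\varphi(y_i)}{|\det DT(y_i)|}.$$
From our auxiliary estimate and the fact that $\varphi\in\mathcal{C}_L$, we deduce that
$$\frac{\varphi(y_i)}{|\det DT(y_i)|}\leq e^{(L+K)\,d(y_i,y_i')^\alpha}\frac{\varphi(y_i')}{|\det DT(y_i')|}\quad\textrm{for each }1\leq i\leq k.$$
Since $d(y_i,y_i')\leq\lambda^{-1}d(x,x')$, we obtain from this inequality that
$$\mathcal{L}\varphi(x)\leq e^{(L+K)\lambda^{-\alpha}d(x,x')^\alpha}\mathcal{L}\varphi(x').$$
Because $x$ and $x'$ play symmetric roles, we conclude that
$$e^{-(L+K)\lambda^{-\alpha}d(x,x')^\alpha}\leq\frac{\mathcal{L}\varphi(x)}{\mathcal{L}\varphi(x')}\leq e^{(L+K)\lambda^{-\alpha}d(x,x')^\alpha},$$
that is, $\mathcal{L}\varphi\in\mathcal{C}_{(L+K)\lambda^{-\alpha}}$ when $\varphi\in\mathcal{C}_L$. In particular, given $\lambda^{-\alpha}<\sigma<1$, since
$$(L+K)\lambda^{-\alpha}\leq\sigma L\quad\textrm{for all }L\geq L_0:=\frac{K\lambda^{-\alpha}}{\sigma-\lambda^{-\alpha}},$$
the proof of the lemma is complete.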
In the literature, this type of “contraction estimate” for transfer operators is called a Lasota-Yorke inequality (after A. Lasota and J. Yorke, who first introduced this kind of inequality in this paper here). From Lemma 3 above (i.e., the Lasota-Yorke inequality), we can outline (leaving the details to Baladi’s book) the end of the proof of Theorem 1 as follows.
By (directly) computing the supremum and the infimum appearing in the definition of the Hilbert projective metric in the case of the family of cones $\mathcal{C}_L$ (see Equations (2.13) and (2.14) in Baladi’s book), it is possible to calculate the Hilbert projective metric $\Theta_L$ on the cone $\mathcal{C}_L$ (see Equation (2.15) in Baladi’s book). From this calculation, one can show that, for $\sigma<1$, the cone $\mathcal{C}_{\sigma L}$ has finite $\Theta_L$-diameter.
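In particular, since the constant function $1$ belongs to $\mathcal{C}_L$ for all $L$, we deduce that the sequence $\mathcal{L}^n1$ is Cauchy for the Hilbert metric $\Theta_L$ (for $L\geq L_0$). In principle, this may not sound very interesting because we would like to extract a limit from the sequence $\mathcal{L}^n1$ and, for this purpose, it is more relevant to get the Cauchy property with respect to the Hölder norm $\|\cdot\|_{C^\alpha}$. Fortunately, it is possible to compare the $\Theta_L$-distance and the $C^\alpha$-distance for elements in $\mathcal{C}_L$ (see Lemmas 2.2 and 2.3 in Baladi’s book), so that we have a fixed point $\varphi_0\in\mathcal{C}_L$ (for $\mathcal{L}$). In particular, from the very definition of $\mathcal{L}$, it follows that $T$ has an absolutely continuous invariant probability, namely $\mu=\varphi_0\,\textrm{Leb}$ with $\varphi_0\in\mathcal{C}_L$. Furthermore, it is not hard to see that $\mu$ is unique. Indeed, if $\nu=\psi_0\,\textrm{Leb}$ is another absolutely continuous $T$-invariant probability, then $\psi_0$ is another fixed point of $\mathcal{L}$. Granting that $\psi_0$ belongs to $\mathcal{C}_L$ for some $L\geq L_0$, from the previous (Cauchy) estimate for the iterates of $\mathcal{L}$ we would have, for some $\tau<1$ and all $n$,
$$\Theta_L(\varphi_0,\psi_0)=\Theta_L(\mathcal{L}^n\varphi_0,\mathcal{L}^n\psi_0)\leq\tau^{n-1}\,\Theta_L(\mathcal{L}\varphi_0,\mathcal{L}\psi_0),$$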
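and thus $\varphi_0=\psi_0$: actually, from the fact that the Hilbert distance between the two densities $\varphi_0$ and $\psi_0$ vanishes we only get that $\psi_0=t\varphi_0$ for some $t>0$ (as the Hilbert metric is a projective metric), but in our case we can deduce the equality because $\varphi_0\,\textrm{Leb}$ and $\psi_0\,\textrm{Leb}$ are probability measures (i.e., $\varphi_0$ and $\psi_0$ are normalized so that $\int\varphi_0\,d\textrm{Leb}=\int\psi_0\,d\textrm{Leb}=1$).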
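At this point, it remains only to justify why the correlation functions of Hölder observables decay exponentially fast to complete the proof of Theorem 1. Below we will just sketch how this is proven (referring to Theorem 2.3 in Baladi’s book for more details). Using the duality relation $\int\psi\cdot(g\circ T^n)\,d\textrm{Leb}=\int(\mathcal{L}^n\psi)\cdot g\,d\textrm{Leb}$ and $\mu=\varphi_0\,\textrm{Leb}$, one writes
$$C_{f,g}(n)=\int(f\varphi_0)\cdot(g\circ T^n)\,d\textrm{Leb}-\int f\,d\mu\int g\,\varphi_0\,d\textrm{Leb}=\int\Big(\mathcal{L}^n(f\varphi_0)-\Big(\int f\,d\mu\Big)\varphi_0\Big)\,g\,d\textrm{Leb}.$$
Thus, it is clear that $C_{f,g}(n)$ decays exponentially if we can show that $\mathcal{L}^n(f\varphi_0)$ converges exponentially fast towards the point $(\int f\,d\mu)\,\varphi_0$ in the line $\mathbb{R}\varphi_0$.
Here, this convergence takes place for $f\varphi_0\in\mathcal{C}_L$ ($L\geq L_0$) in the $C^\alpha$ norm because, as we vaguely mentioned, this norm can be compared with the Hilbert distance on $\mathcal{C}_L$ and for the latter we have the Cauchy estimate coming from Birkhoff’s theorem. In general, the case $f\in C^\alpha$ can be reduced to the case $f\varphi_0\in\mathcal{C}_L$ because the cones $\mathcal{C}_L$ are big enough to contain open balls in the $C^\alpha$-norm (formally, one replaces $f$ by $f+c$ where $c$ is a sufficiently large constant so that $(f+c)\varphi_0$ belongs to some $\mathcal{C}_L$). Anyhow, this is essentially all we can say about the proof of Theorem 1 without entering into technical details. So, let’s close this section with the following remarks.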
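For a concrete feeling of exponential (as opposed to super-exponential) decay, here is a minimal numerical sketch in the simplest case, the doubling map $x\mapsto2x$ (mod 1) with Lebesgue measure, whose transfer operator sends $\varphi$ to the function $x\mapsto\frac12(\varphi(\frac{x}{2})+\frac12\cdot2\,\varphi(\frac{x+1}{2}))/1$, i.e., the average of $\varphi$ over the two preimages of $x$. The observable $f(x)=x^2-x+\frac16$ is a hypothetical choice (as are the function names and the number of quadrature cells): it is Lipschitz on the circle, has zero mean, and a direct computation shows that the transfer operator multiplies it by $\frac14$, so that the self-correlation of $f$ at time $n$ is $4^{-n}\int f^2=4^{-n}/180$.

```python
def observable(x):
    """Hypothetical Lipschitz observable on the circle: x^2 - x + 1/6 (zero mean)."""
    return x * x - x + 1.0 / 6.0

def transfer(phi):
    """Transfer operator of the doubling map with respect to Lebesgue:
    the average of phi over the two preimages x/2 and (x+1)/2."""
    return lambda x: 0.5 * (phi(0.5 * x) + phi(0.5 * (x + 1.0)))

def correlation(n, cells=2 ** 16):
    """Midpoint-rule value of the integral of f(x) * f(2^n x mod 1) over [0, 1)."""
    total = 0.0
    for i in range(cells):
        x = (i + 0.5) / cells
        total += observable(x) * observable((2 ** n * x) % 1.0)
    return total / cells

for n in range(6):
    print(n, correlation(n), 4.0 ** (-n) / 180.0)
```

The two printed columns should agree up to quadrature error, exhibiting the exponential rate $\frac14$ for this particular observable.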
Remark 6 The Cauchy estimate for $\mathcal{L}$ also allows one to deduce its quasi-compactness in the space $C^\alpha(M)$ and the spectral gap property, i.e., its spectral radius is $1$, a simple eigenvalue of $\mathcal{L}$ with eigenspace $\mathbb{R}\varphi_0$, and all other elements of the spectrum of $\mathcal{L}$ belong to the ball centered at $0$ and radius $\tau<1$. In general, the exponential decay of correlations and the quasi-compactness of the transfer operator are intimately related: for example, one can deduce exponential decay from quasi-compactness, and, for this reason, some authors try to prove first quasi-compactness before approaching the question of decay of correlation functions. However, as we will see in the next section, in the case of flows, one doesn’t necessarily have to show quasi-compactness of the transfer operator to deduce exponential decay of correlations, although some control of its spectrum is needed in some way.
Remark 7 In this section we focused exclusively on expanding maps, but the theory of exponential decay of correlations exists also for hyperbolic diffeomorphisms, e.g., diffeomorphisms $T:M\to M$ such that the tangent bundle $TM=E^s\oplus E^u$ decomposes into two complementary subbundles where $DT$ expands along $E^u$ and contracts along $E^s$. However, one has to be careful about how to construct the functional spaces along the stable direction: indeed, the fact that $DT$ contracts along $E^s$ means that, by iterating $T$ forward, we see exactly the inverse of Figure 1! In particular, the transfer operator tends to behave “badly” on smooth functions in the direction $E^s$, and this is why normally one has to either “reduce” the dynamics to an expanding one by “quotienting modulo stable manifolds” (cf. Baladi’s book) or one considers the transfer operators on “anisotropic” functional spaces, i.e., a space of observables that are “smooth along the unstable direction $E^u$” and “distributions along the stable direction $E^s$” (cf. these articles here for more discussion on the “anisotropic approach”).
2. Dolgopyat’s estimate and correlations for flows
In this section we consider a volume-preserving continuous-time hyperbolic dynamical system $T^t:M\to M$, e.g., we assume that $TM=E^s\oplus E^0\oplus E^u$ where $E^s$ is contracted by $DT^t$, $E^u$ is expanded by $DT^t$ and $E^0$ is the flow direction. A prominent class of hyperbolic (Anosov) flows are the geodesic flows on negatively curved manifolds (see, e.g., Katok-Hasselblatt’s book). In this setting, denoting by $\mu$ the volume (“Lebesgue measure”) preserved by $T^t$, we wish to study the correlation functions
$$C_{f,g}(t)=\int f\cdot(g\circ T^t)\,d\mu-\int f\,d\mu\int g\,d\mu$$
via the (spectral) properties of the transfer operator
$$\mathcal{L}_tf:=f\circ T^{-t}.$$
At first sight, it is tempting to conjecture (by analogy with the case of diffeomorphisms) that the correlation functions of smooth observables for hyperbolic flows decay exponentially. However, one must be careful here because of a subtle but fundamental difference between hyperbolic diffeomorphisms and hyperbolic flows: while the derivative of a hyperbolic diffeomorphism has a definite (hyperbolic, i.e., contracting or expanding) behavior along all directions in $TM$, the derivative of a hyperbolic flow has a definite hyperbolic behavior only along directions that are transverse to the flow direction $E^0$. Indeed, it is not hard to see that the behavior of $DT^t$ along the flow direction is neutral (“almost isometric”), i.e., the fact that the distance between $x$ and $T^\epsilon(x)$ is of order $\epsilon$ remains almost unchanged after flowing these points by any time $t$ (at least for $\epsilon$ small).
In order to see how “nasty” the presence of the neutral (flow) direction might be (for the decay of correlations), let us consider the following “prototype” of hyperbolic flows.
Let $T:M\to M$ be a discrete-time dynamical system, e.g., $T=\left(\begin{array}{cc}2&1\\1&1\end{array}\right)$ acting on $\mathbb{T}^2=\mathbb{R}^2/\mathbb{Z}^2$ (a.k.a. Arnold’s cat map). Given a positive “reasonable” function $r:M\to\mathbb{R}^+$, we can define a suspension flow from the following recipe. We consider $M_r:=\{(x,s):x\in M,\,0\leq s\leq r(x)\}/\sim$ where $\sim$ is the equivalence relation obtained by gluing the points $(x,r(x))$ and $(T(x),0)$, and we let $T_r^t$ be the flow induced by the flow $(x,s)\mapsto(x,s+t)$. In the literature, $T_r^t$ is called the suspension of $T$ with roof function $r$. In the case that the “basis dynamics” $T$ is hyperbolic, it is not hard to see that $T_r^t$ is a hyperbolic flow. The suspensions of hyperbolic maps represent all nice hyperbolic flows (because hyperbolic flows admit nice Markov partitions; see, e.g., Dolgopyat’s paper and references therein for more comments).
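The recipe defining a suspension flow can be turned into a few lines of code. The sketch below is only an illustration: the base dynamics (the doubling map on $[0,1)$), the roof function $x\mapsto1+x(1-x)$ and the function names are hypothetical choices made so that exact rational arithmetic can be used, and only non-negative times are handled. A point is a pair $(x,s)$ with $0\leq s<r(x)$; the flow raises the height $s$ at unit speed and, each time the roof is reached, it jumps from $(x,r(x))$ to $(T(x),0)$.

```python
from fractions import Fraction

def base_map(x):
    """Hypothetical base dynamics: the doubling map on [0, 1)."""
    return (2 * x) % 1

def roof(x):
    """Hypothetical positive roof function."""
    return 1 + x * (1 - x)

def suspension_flow(x, s, t):
    """Flow the point (x, s), with 0 <= s < roof(x), for a time t >= 0."""
    s = s + t
    while s >= roof(x):
        # the roof was reached: glue (x, roof(x)) to (base_map(x), 0)
        s = s - roof(x)
        x = base_map(x)
    return x, s

start = (Fraction(1, 3), Fraction(0))
print(suspension_flow(*start, Fraction(2)))
```

With exact fractions, flowing for a time $t_1$ and then for a time $t_2$ gives exactly the same point as flowing for the time $t_1+t_2$, as it should for a flow.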
Now, let us consider the hyperbolic diffeomorphism $T$ displaying a horseshoe whose dynamics is defined by two rectangles $R_0$ and $R_1$, i.e., $R_0$ and $R_1$ are a Markov partition of the horseshoe (in Figure 1 of this post here, $R_0$ and $R_1$ are the connected components of the intersection shown there). Next, we fix a roof function $r$ such that $r|_{R_0}=r_0$, $r|_{R_1}=r_1$ and $r_0/r_1\notin\mathbb{Q}$, that is, $r$ is a piecewise constant roof function whose values on different parts of the horseshoe are rationally independent. In this setting, it is possible to check that the suspension flow obtained from the data $T$ and $r$ is mixing (i.e., correlation functions of smooth observables decay to zero). But, as it was shown by D. Ruelle in this article here, the correlation functions do not decay exponentially fast! Intuitively, the decay of correlations gets slower for suspensions of hyperbolic maps with piecewise constant roof functions because, despite the fast mixing imposed by $T$ in the “$x$-direction”, a set contained in a single horizontal level of the suspension travels with unit linear speed under the flow and hence it can’t hit another level before a time comparable to the difference of heights has passed (that is, the set has to wait a little before visiting other levels).
Once we know that the decay of correlation functions may be not exponential for hyperbolic flows, it is natural to ask oneself about the largest class of hyperbolic flows with exponential decay of correlations. In this direction, after the notable works of M. Ratner, M. Pollicott and P. Collet, H. Epstein and G. Gallavotti, we know that geodesic flows on constant negative curvature manifolds of dimension 2 and 3 (and, in dimension 2, some small perturbations of such flows) exhibit exponential decay of correlations. In these works, the techniques were algebraic (and/or perturbative), and thus they are hard to generalize.
About six years after these first works, N. Chernov came up with the first dynamical approach and he showed that geodesic flows on negatively curved surfaces have sub-exponential (stretched exponential) decay of correlations. Then, D. Dolgopyat developed a crucial estimate (here) allowing him to deduce exponential decay of correlations for Anosov flows with smooth ($C^1$) invariant foliations. In principle, the regularity assumption imposed by D. Dolgopyat is quite restrictive, but, a posteriori, some authors (such as C. Liverani, V. Baladi and B. Vallée, and A. Avila, S. Gouezel and J.-C. Yoccoz) saw that his estimate could work with less restrictive assumptions. In naive terms, the estimates by Dolgopyat can be used to prove exponential decay of correlations for suspension flows over hyperbolic maps with roof functions that are not integrable, i.e., they don’t look like piecewise constant functions even after a change of variables.
In the remainder of this post, we will try to briefly (and vaguely) explain from a geometrical point of view the meaning of Dolgopyat’s estimate and its relationship to suspension flows over hyperbolic maps with non-integrable roof functions. Here, we will loosely follow this excellent article of C. Liverani on contact Anosov flows (and we strongly recommend the reader interested in the details to consult this nicely written paper).
Firstly, while C. Liverani works with Anosov flows, I will somewhat simplify our lives by forgetting the stable direction, or, if one wishes, I will think of my flow as a suspension of an expanding map $T$ (instead of a hyperbolic map) via some smooth roof function $r$. Next, I will assume that $r$ is not integrable, i.e., $r$ is not of the form $r=\tilde{r}+h$ where $\tilde{r}$ is piecewise constant and $h$ is a smooth coboundary, i.e., $h=u\circ T-u$. In simpler terms, $r$ is integrable if it is possible to “change variables” (with the aid of the coboundary term $h$) so that our flow is conjugated to a suspension of an expanding map with a piecewise constant roof function $\tilde{r}$. In other words, the non-integrability says that our flow is not one of the “bad flows” of D. Ruelle in disguise, and hence we have some chance of getting exponential mixing to start off. In C. Liverani’s paper, the analog of the non-integrability condition is “hidden” in the fact that he considers contact Anosov flows.
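In this setting, given a $C^1$ observable $f$ with zero mean (i.e., $\int f\,d\mu=0$), our goal is to show that
$$\|\mathcal{L}_tf\|\leq Ce^{-ct}\|f\|$$
where $\mathcal{L}_tf=f\circ T^{-t}$ is the transfer operator, $\|\cdot\|$ is some norm trying to measure the oscillations of $f$ along the unstable direction, and $C,c>0$. In fact, this estimate is sufficient to derive exponential decay of correlations for $C^1$ observables because
$$|C_{f,g}(t)|=\Big|\int f\cdot(g\circ T^t)\,d\mu\Big|=\Big|\int(\mathcal{L}_tf)\cdot g\,d\mu\Big|\leq\|\mathcal{L}_tf\|_{L^1}\|g\|_{L^\infty}\leq Ce^{-ct}\|f\|\,\|g\|_{L^\infty}$$
(here the last inequality uses that the norm $\|\cdot\|$ controls the $L^1$ norm).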
Remark 8 The estimate above doesn’t imply quasi-compactness of the transfer operator nor the spectral gap property (compare with Remark 6). In fact, this happens because the current technology doesn’t permit one to analyze directly the time-one map of the flow (except in some rare cases such as in this paper here). In some sense, this is an incarnation of the fundamental differences between maps and flows as far as correlations are involved.
In simple terms, we wish to take advantage of the expansion effect of in the unstable (“-direction”) in order to show the estimate , and, hence, it is a nice idea to consider a (Hölder or) norm along this direction such as
where is a sufficiently small constant (for practical purposes, one may think that ).
Remark 9 Of course, is not the norm used by C. Liverani in his article (instead, he deals with a slightly more technical version of it). However, we will stick to this norm because it is “good enough” to illustrate the cancellation idea behind Dolgopyat’s estimate.
By mimicking the argument of the previous section, we wish to understand how the norm of $\mathcal{L}_tf$ evolves as $t\to\infty$. In this direction, we fix some piece $W$ of size $\delta$ of “(fake) unstable manifold” of the flow, i.e., a set of the form $W=B(x,\delta)\times\{s\}$ (where $B(x,\delta)$ is the ball of center $x$ and radius $\delta$) and we let $W$ evolve under the flow. Since the basis dynamics of the suspension flow is an expanding map, we see that the image of $W$ after time $t$ consists of a union of a certain exponential number of pieces of size $\delta$.
In particular, the function $\mathcal{L}_tf$ is nicely decomposed into an exponential number of functions $f_i$ supported on (fake) unstable manifolds crossing a given small region, whose graphs are obtained from the one of $f$ after applying the dynamics a certain number of times. At this stage, the naive idea to estimate the norm of $\mathcal{L}_tf$ is: we estimate the norms of each $f_i$ and we sum everything up. Here, since the basis dynamics is expanding, the derivative part of the norm of $f_i$ along the unstable direction (“$x$-direction”) gets better (by the reason explained in Figure 1). However, it is possible to see that the trivial bounds on the remaining (zeroth-order) part of the norm are not sufficient to get the desired bound because of the large (exponential) number of terms (i.e., functions $f_i$). In other words, the naive idea doesn’t work unless we get some sort of cancellation making the norms of the $f_i$’s somewhat smaller than the trivial bound.
In order to get the desired cancellation, we need essentially to compare (using the flow) the values of $f$ on the several pieces of unstable manifolds intersecting a given small region. Here, the fact that the roof function $r$ is not integrable plays a key role. Indeed, if $r$ is (piecewise) constant (or more generally $r$ is integrable), then the graphs of the restrictions of $f$ to the several pieces match “perfectly” (possibly after a change of variables) as it is shown in the left side of the figure below. On the other hand, if $r$ is not integrable, the graphs of the restrictions of $f$ to the several pieces will look somewhat shifted with respect to each other. In particular, a sum (or more generally an integral) of the form $\sum_if_i$ exhibits a certain amount of cancellation (as the signs of the $f_i$ really change) and, by quantifying this, one gets a Dolgopyat estimate saying that the sum of the zeroth-order parts of the norm of the functions $f_i$ is suitably small (compare with Lemma 5.2 in C. Liverani’s paper). This cancellation phenomenon is illustrated in the right-hand side of the figure below.
Closing this post, let us reiterate that this description is a very crude (first order) approximation of the beautiful ideas of Dolgopyat (several details are missing: for instance, the precise definition of the norms is not the same, one studies the transfer operator in terms of its resolvent and not directly, etc.), but I hope that the sacrifices in rigor are compensated by the fact that the previous picture explains (at least intuitively) why the words “Dolgopyat estimate” and “cancellation due to oscillation and non-integrability” appear together in the papers in the fascinating subject of decay of correlations for hyperbolic flows.
Update [November 10, 2012]: The reader might find it interesting to consult also this survey article by C. Liverani written on the occasion when the Brin prize 2009 was awarded to D. Dolgopyat.
Update [November 12, 2012]: As it was pointed out to me by Artur Avila today, there are some situations where the transfer operator of the time-one map of the flow can be analyzed directly (contrary to what was said in a previous version of Remark 8 above). Indeed, this is the case in this article here of Masato Tsujii.