Posted by: yglima | October 24, 2009

ERT2: Polynomial Von Neumann’s Theorem

As proved in ERT1, the existence of ergodic averages {(M-N)^{-1}\cdot\sum_{n=N+1}^{M}T^nf} implies recurrence results for measure-preserving systems (from now on, denoted by mps). A natural question is whether some kind of generalized ergodic averages also converge, and what this implies for recurrence. By generalized ergodic averages we mean expressions like

\displaystyle \dfrac{1}{N}\sum_{n=1}^{N}T^{a_n}f,

where {(a_n)_{n\in\mathbb N}} is a sequence of positive integers. Von Neumann’s Theorem shows that convergence holds if {a_n=an+b}, where {a,b} are positive integers: just apply the result to the transformation {T^a} and the function {T^bf}. In this post, we prove that the same result holds if {a_n=p(n)}, where {p(x)\in\mathbb Z[x]} is a polynomial such that {p(n)\ge 0}, for every {n\ge 0}. We can assume, without loss of generality, that {p(0)=0}. In fact, {T^{p(n)}f=T^{p(n)-p(0)}(T^{p(0)}f)} and {\tilde p(x)=p(x)-p(0)} satisfies the required condition.

Theorem 1 (H. Furstenberg) If {(X,\mathcal B,\mu,T)} is a mps and {p(x)\in\mathbb Z[x]} is a polynomial such that {p(n)\ge 0}, for every {n\ge 0}, and {p(0)=0}, then the limit

\displaystyle \lim_{N\rightarrow+\infty}\dfrac{1}{N}\sum_{n=1}^{N}T^{p(n)}f

converges in {L^2} for every {f\in L^2}.

Again, this theorem is Hilbertian in nature and follows from a more general version for unitary operators.

Theorem 2 If {T:\mathcal H\rightarrow\mathcal H} is a unitary operator on a Hilbert space {\mathcal H} and {p(x)\in\mathbb Z[x]} is a polynomial such that {p(n)\ge 0}, for every {n\ge 0}, and {p(0)=0}, then the sequence of operators

\displaystyle T_N=\dfrac{1}{N}\sum_{n=1}^NT^{p(n)},\ N\ge 1,

converges pointwise in norm.
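
As a quick sanity check of the statement (not part of the proof), one can look at the simplest unitary operator: multiplication by {e^{2\pi i\theta}} on {\mathcal H=\mathbb C}, for which {T_N} applied to the vector 1 is just an exponential sum. The sketch below is only a numerical illustration under that assumption; the function name `polynomial_average` and the sample angles are ours, chosen for illustration.

```python
import cmath
import math


def polynomial_average(theta, p, N):
    """Return (1/N) * sum_{n=1}^{N} exp(2*pi*i*theta*p(n)).

    This is T_N applied to the vector 1 when T is multiplication by
    exp(2*pi*i*theta) on the one-dimensional Hilbert space C.
    """
    total = 0j
    for n in range(1, N + 1):
        total += cmath.exp(2j * math.pi * theta * p(n))
    return total / N


def square(n):
    return n * n


# Rational angle: the vector 1 is periodic (T^3 = identity), and the averages
# settle on the mean of exp(2*pi*i*j^2/3) over one period j = 0, 1, 2.
rational_avg = polynomial_average(1 / 3, square, 3000)
period_mean = sum(cmath.exp(2j * math.pi * j * j / 3) for j in range(3)) / 3

# Irrational angle (sqrt(2) is an arbitrary choice): there is no periodic
# component, and the averages become small as N grows.
irrational_avg = polynomial_average(math.sqrt(2), square, 200000)
```

In the rational case the limit is nonzero and differs from the vector 1 itself, which already hints that periodic vectors, and not only invariant ones, have to be treated as the structured part.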

Proof: The idea is the same as in Von Neumann’s Theorem: we look for an orthogonal decomposition {\mathcal H=\mathcal M\oplus\mathcal M^{\perp}} such that the behaviour of {T_N} is understood in each component. {\mathcal M} will represent the structured component of {T} and {\mathcal M^{\perp}} the random one, in the following sense:

  • the long-time behaviour of elements {x\in\mathcal M} is (almost-)periodic.
  • the long-time behaviour of elements of {\mathcal M^{\perp}} cancels out on average and converges to zero.

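Unfortunately, the decomposition of Von Neumann’s Theorem does not work here: the invariant vectors of {T} are not the only ones with structured behaviour. In fact, let {x\in\mathcal H} be periodic with respect to {T}, say {T^ax=x}, {a\in\mathbb N}. We carry out the computation for {p(n)=n}; the general case is analogous, because {p(n+a)\equiv p(n)\ ({\rm mod}\ a)} and so {T^{p(n)}x} depends only on {n\ ({\rm mod}\ a)}. If we write {N=aq+r}, {0\le r<a}, then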

\displaystyle \begin{array}{rcl} T_Nx&=&\dfrac{1}{N}\displaystyle\sum_{n=1}^{N}T^nx\\ &=&\dfrac{1}{aq+r}\displaystyle\sum_{n=1}^{aq+r}T^{n\,({\rm mod}\,a)}x\\ &=&\dfrac{1}{aq+r}\left(q\cdot\displaystyle\sum_{n=0}^{a-1}T^nx+\displaystyle\sum_{n=1}^{r}T^nx\right)\\  &=&\dfrac{1}{a+rq^{-1}}\displaystyle\sum_{n=0}^{a-1}T^nx+\dfrac{1}{N}\displaystyle\sum_{n=1}^{r}T^nx  \end{array}

converges to {(x+Tx+\cdots+T^{a-1}x)/a}, because

\displaystyle \lim_{N\rightarrow+\infty}rq^{-1}=\lim_{N\rightarrow+\infty}\dfrac{1}{N}\sum_{n=1}^{r}T^nx=0.

This means that every periodic point of {T} has a structured behaviour with respect to {T_N}. For this reason, let

\displaystyle \mathcal M=\overline{\{x\in\mathcal H\,;\,\exists\,a\in\mathbb N\text{ such that }T^ax=x\}}.
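
The computation for periodic vectors can be checked on a toy example. The sketch below assumes {\mathcal H=\mathbb C^4} with {T} the cyclic shift of coordinates (so {T^4} is the identity); the helper names are hypothetical. It compares the averages along {p(n)=n} and {p(n)=n^2} with the mean over one period.

```python
def shift_power(x, k):
    """Apply T^k to x, where T cyclically shifts coordinates: (T x)[i] = x[i-1]."""
    a = len(x)
    return [x[(i - k) % a] for i in range(a)]


def average_along(x, p, N):
    """Return (1/N) * sum_{n=1}^{N} T^{p(n)} x for the cyclic shift T."""
    a = len(x)
    total = [0.0] * a
    for n in range(1, N + 1):
        shifted = shift_power(x, p(n) % a)  # T^a = identity, so reduce mod a
        total = [t + s for t, s in zip(total, shifted)]
    return [t / N for t in total]


def period_mean(x, p):
    """Return (1/a) * sum_{j=0}^{a-1} T^{p(j)} x, the candidate limit."""
    a = len(x)
    total = [0.0] * a
    for j in range(a):
        shifted = shift_power(x, p(j) % a)
        total = [t + s for t, s in zip(total, shifted)]
    return [t / a for t in total]


x = [1.0, 0.0, 0.0, 0.0]  # a = 4, so T^4 x = x

linear = average_along(x, lambda n: n, 1002)
quadratic = average_along(x, lambda n: n * n, 1002)
```

For {p(n)=n} the averages approach the uniform vector, while for {p(n)=n^2} they approach the mean of {T^{p(j)}x} over {j=0,\ldots,3}, since squares only take the values 0 and 1 modulo 4.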

Exercise 1 Prove that the set of {x\in \mathcal H} for which the sequence {(T_Nx)_{N\ge 1}} converges is a closed subspace of {\mathcal H}.

By Exercise 1, the sequence {(T_Nx)_{N\ge 1}} converges whenever {x\in\mathcal M}. By linearity, it remains to prove convergence for {x\in\mathcal M^{\perp}}. This subspace is characterized by

\mathcal M^{\perp}=\left\{x\in\mathcal H\,;\,\displaystyle\dfrac{1}{N}\sum_{n=1}^{N}T^{an}x\rightarrow 0\,,\ \forall\,a\in\mathbb N\right\}.\ \ \ \ \ \ \ (1)

This follows from Von Neumann’s Theorem: if {\mathcal H=\mathcal M_a\oplus{\mathcal M_a}^{\perp}} is the decomposition with respect to {T^a}, {a\in\mathbb N}, then {\mathcal M} is equal to {\overline{\bigcup_{a\in\mathbb N}\mathcal M_a}} and its orthogonal complement is given by

\displaystyle \mathcal M^{\perp}=\bigcap_{a\in\mathbb N}{\mathcal M_a}^{\perp}\,,

which proves (1). This means that {N^{-1}\cdot\sum_{n=1}^{N}T^{p(n)}x\ \rightarrow\ 0} for every {x\in\mathcal M^{\perp}} and for every degree-one polynomial {p(x)=ax}, {a\in\mathbb N}. The proof will be complete if we show that the same happens for higher-degree polynomials. By induction, suppose that

\displaystyle \dfrac{1}{N}\sum_{n=1}^{N}T^{p(n)}x\ \rightarrow\ 0\,,\ \forall\,x\in\mathcal M^{\perp},\ \ \ \ \ \ (2)

for every polynomial {p(x)\in\mathbb Z[x]} such that {p(0)=0}, {p(\mathbb N)\subset\mathbb N} and {{\rm deg}(p)\le n_0}. Consider {q(x)\in\mathbb Z[x]} such that {q(0)=0}, {q(\mathbb N)\subset\mathbb N} and {{\rm deg}(q)=n_0+1}. We wish to reduce the convergence {N^{-1}\cdot\sum_{n=1}^{N}T^{q(n)}x\rightarrow 0} to one of the form {N^{-1}\cdot\sum_{n=1}^{N}T^{p(n)}x\rightarrow 0}, with {{\rm deg}(p)\le n_0} (which we know to be true, by the induction hypothesis). This is done with the use of Van der Corput’s Trick (see this lecture of Terry Tao for a broader discussion on this trick).

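Theorem 3 (Van der Corput Trick) If {(x_n)_{n\in\mathbb N}\subset\mathcal H} is a bounded sequence such that

\displaystyle \lim_{N\rightarrow+\infty}\dfrac{1}{N}\sum_{n=1}^{N}\left\langle x_{n+h},x_n\right\rangle=0

for every {h\in\mathbb N}, then {\left\|\dfrac{1}{N}\sum_{n=1}^{N}x_n\right\|\rightarrow 0.}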

Exercise 2 Prove the above theorem. (Hint: this is Theorem 2.2 of this survey of Vitaly Bergelson.)
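
To get a feeling for the Van der Corput Trick, here is a small numerical sketch in the one-dimensional case {\mathcal H=\mathbb C}, where the inner product is {\left\langle u,v\right\rangle=u\bar v}. It assumes the sequence {x_n=e^{2\pi i\theta n^2}} with {\theta=\sqrt 2} (an arbitrary irrational angle) and hypothetical helper names; it illustrates the statement and proves nothing.

```python
import cmath
import math


def correlation_average(x, h, N):
    """Return (1/N) * sum_{n=1}^{N} <x_{n+h}, x_n> for a sequence x in C."""
    total = 0j
    for n in range(1, N + 1):
        total += x(n + h) * x(n).conjugate()
    return total / N


def mean_norm(x, N):
    """Return | (1/N) * sum_{n=1}^{N} x_n |."""
    return abs(sum(x(n) for n in range(1, N + 1)) / N)


THETA = math.sqrt(2)  # illustrative irrational angle (an assumption)


def x(n):
    # x_n = T^{q(n)} 1 with q(n) = n^2 and T = multiplication by exp(2*pi*i*THETA)
    return cmath.exp(2j * math.pi * THETA * n * n)


N = 100000
correlations = [abs(correlation_average(x, h, N)) for h in range(1, 6)]
average_size = mean_norm(x, N)
```

Each correlation average is, up to a unimodular factor, an average of the linear phases {e^{2\pi i(2\theta h)n}}, which is why it is tiny; the trick converts this into smallness of the average of the quadratic phases themselves.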

We’re done if the sequence {x_n=T^{q(n)}x}, {n\in\mathbb N}, satisfies the conditions of Theorem 3. In fact, as {T} is unitary,

\displaystyle \left\langle x_{n+h},x_n \right\rangle=\left\langle T^{q(n+h)}x,T^{q(n)}x\right\rangle=  \left\langle T^{q(n+h)-q(n)}x,x\right\rangle=\left\langle T^{p_h(n)}x,x\right\rangle,

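where {p_h(x)=q(x+h)-q(x)} is a polynomial of degree at most {n_0}. Although {p_h(0)=q(h)} need not vanish, we can write {T^{p_h(n)}x=T^{p_h(n)-p_h(0)}(T^{q(h)}x)}, and {T^{q(h)}x\in\mathcal M^{\perp}} because {\mathcal M} is invariant under {T} and {T^{-1}}. Hence, as in the reduction to {p(0)=0} at the beginning of the post, (2) gives {N^{-1}\cdot\sum_{n=1}^{N}\left\langle x_{n+h},x_n\right\rangle\rightarrow 0} for every {h\in\mathbb N}, and so the conditions of Theorem 3 are satisfied. This concludes the proof of Theorem 2. \Box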
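
The degree drop in the induction step is easy to verify by hand or by machine. The sketch below uses a hypothetical helper, `difference_polynomial`, with polynomials stored as coefficient lists, lowest degree first. It computes {p_h(x)=q(x+h)-q(x)} and shows that the leading term cancels.

```python
from math import comb


def difference_polynomial(q, h):
    """Coefficients of p_h(x) = q(x + h) - q(x).

    Polynomials are lists of integer coefficients, lowest degree first,
    so [0, 2, 0, 1] stands for x^3 + 2x.
    """
    shifted = [0] * len(q)
    for k, c in enumerate(q):
        # Expand c * (x + h)^k with the binomial theorem.
        for j in range(k + 1):
            shifted[j] += c * comb(k, j) * h ** (k - j)
    diff = [s - c for s, c in zip(shifted, q)]
    while diff and diff[-1] == 0:  # drop the cancelled leading terms
        diff.pop()
    return diff


q = [0, 2, 0, 1]                      # q(x) = x^3 + 2x, degree 3
p_2 = difference_polynomial(q, 2)     # 6x^2 + 12x + 12, degree 2
```

Note that the constant term of {p_h} is {q(h)}, which is nonzero in general even though {q(0)=0}.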

The method used above is one of the main principles of Ergodic Ramsey Theory: the dichotomy between structure and randomness, decomposing the object of study into these two components. Usually, we first define the structured one, in terms of the desired ergodic averages, so that convergence follows almost directly from the definition. Its orthogonal complement is the random component and convergence along it is proved using Van der Corput-type theorems. For a further discussion on this dichotomy, the reader is referred to this paper of Terence Tao. Observe that the same method applies to prove the following

Theorem 4 If {T:\mathcal H\rightarrow\mathcal H} is a unitary operator on a Hilbert space {\mathcal H} and {p(x)\in\mathbb Z[x]} is a polynomial such that {p(n)\ge 0}, for every {n\ge 0}, and {p(0)=0}, then the sequence of operators

\displaystyle \dfrac{1}{M-N}\cdot\sum_{n=N+1}^{M}T^{p(n)},\ M-N\rightarrow+\infty,

converges pointwise in norm.

Now it’s time to obtain the recurrence consequences (which, as expected, will be stronger than those in ERT1). Let {P:\mathcal H\rightarrow\mathcal M} be the orthogonal projection. We’ll proceed exactly as in Proposition 6 of ERT1, except that the notation will be heavier.

Proposition 5 Let {f\in L^2\backslash\{0\}} be such that {f\ge 0}. Then {Pf\ge 0} and {\left\|Pf\right\|>0}.

Proof: Consider the subspaces {\mathcal M^{(n)}=\mathcal M_{n!}}, {n\in\mathbb N}. Since {\mathcal M_a\subset\mathcal M_{ab}} for all {a,b\in\mathbb N}, these subspaces increase with {n} and their union is {\bigcup_{a\in\mathbb N}\mathcal M_a}, which is dense in {\mathcal M}. By approximation, if each projection {f_n} of {f} onto {\mathcal M^{(n)}} satisfies {f_n\ge 0} and {\left\|f_n\right\|>0}, the same happens to {Pf}. Fix {n} and consider the function {g_n=\max\{f_n,0\}}. Then {g_n\in\mathcal M^{(n)}} (Exercise 3) and, because {f\ge 0}, {\left\|f-g_n\right\|\le\left\|f-f_n\right\|}. Because {f_n} minimizes the distance of {f} to {\mathcal M^{(n)}}, we have {f_n=g_n\ge0}. In addition, if we had {\left\|f_n\right\|=0}, then

\displaystyle f\in{\mathcal M^{(n)}}^\perp={\mathcal M_{n!}}^{\perp}\,,

implying that {N^{-1}\cdot\sum_{k=1}^{N}T^{ak}f\rightarrow0} for {a=n!}. Integrating, and using that {T} preserves {\mu}, we conclude

\displaystyle \int_X fd\mu=\int_X\left(\dfrac{1}{N}\sum_{k=1}^{N}T^{ak}f\right)d\mu\rightarrow0\ \Longrightarrow\ \int_X fd\mu=0\ \Longrightarrow\ f=0,

a contradiction. \Box

Theorem 6 If {(X,\mathcal B,\mu,T)} is a mps, {p(x)\in\mathbb Z[x]} is a polynomial such that {p(n)\ge 0}, for every {n\ge 0}, {p(0)=0}, and {A\in\mathcal B} such that {\mu(A)>0}, then the set

\displaystyle \left\{n\in\mathbb N\,;\,\mu\left(A\cap T^{-p(n)}A\right)>0\right\}

is syndetic.

Proof: If {f=\chi_A}, then from (2) the expression {\left\langle f,(M-N)^{-1}\cdot\sum_{n=N+1}^{M}T^{p(n)}f\right\rangle} converges to {\left\langle f,Pf\right\rangle=\left\|Pf\right\|^2>0} as {M-N\rightarrow+\infty}. Since

\displaystyle \left\langle f,T^{p(n)}f \right\rangle=\left\langle \chi_A,\chi_{T^{-p(n)}A} \right\rangle=\mu(A\cap T^{-p(n)}A)\,,

Theorem 4 guarantees the conclusion. \Box
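
A toy instance of Theorem 6 can be computed exactly. The sketch below assumes the finite system {X=\mathbb Z/m} with the uniform measure, {T(x)=x+1} and {A=\{0\}}; the function names are hypothetical. In this setting {\mu(A\cap T^{-p(n)}A)>0} exactly when {m} divides {p(n)}, and every multiple of {m} is a return time because {p(0)=0} forces {m\mid p(mn)}.

```python
def return_times(m, p, limit):
    """List the n in 1..limit with mu(A intersect T^{-p(n)} A) > 0.

    Here X = Z/m with the uniform measure, T(x) = x + 1 (mod m) and
    A = {0}, so the condition is simply p(n) = 0 (mod m).
    """
    return [n for n in range(1, limit + 1) if p(n) % m == 0]


def max_gap(times):
    """Largest gap between consecutive elements (counting the first one from 0)."""
    previous, worst = 0, 0
    for t in times:
        worst = max(worst, t - previous)
        previous = t
    return worst


# m = 12 and p(n) = n^2: n^2 is divisible by 12 exactly when 6 divides n.
times = return_times(12, lambda n: n * n, 60)
```

Here the return times are the multiples of 6, so the set has bounded gaps, which is what syndeticity means.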

Previous posts: ERT0, ERT1.


  1. It’s actually not that much harder to prove (maybe a good exercise for the students, using the van der Corput trick) that in fact convergence in the polynomial von Neumann theorem takes place almost everywhere with respect to any totally ergodic (meaning that there are no rational eigenvalues) system and for any function in L^2 (or bounded, to be on the safe side). In this case the limit equals the integral.

    It’s a much deeper theorem due to Bourgain that almost sure convergence always holds in any ergodic system for L^2-functions. This is known to fail for L^1-functions. Note however that the limit in this case might fail to be invariant.

    It’s also plain that the mean convergence can be extended to cover linear contractions (not necessarily unitary) by a simple lifting argument (going back to Sz. Nagy I believe). I don’t think the corresponding theorem (however the exact phrasing is not quite clear to me) is known for non-linear contractions.

    • Dear Reader,

      Thanks for the information. In fact, Bourgain’s and other related results and conjectures will be the topic for the next post. With respect to contractions, you are absolutely right. This follows from the two facts below: if T is a linear contraction operator (meaning that |Tf| <= |f|) on a Hilbert space H, then
      (a) |T^*f| <= |f| for every f in H;
      (b) Ker(T^* – I) = Ker(T – I), where T^* stands for the adjoint operator of T,
      so that the proof presented in ERT1 also holds.

  2. Hi,
    This was a very nice article.
    Could you give a reference for Furstenberg’s theorem?

  3. actually, i realized that the neatest proof (but maybe not extendible) is from the spectral theorem + weyl’s theorem on equidistribution.

    but a reference to furstenberg’s paper containing the argument you mentioned above would be nice.

    • Dear asdf, please see below Yuri’s comments (sent to me by email):

      Dear asdf,

      The original reference is Theorem 3.5 of Furstenberg’s paper “Poincare recurrence and number theory”, published in Bulletin of the American Mathematical Society, 1981, no. 3, 211–234. His argument is just like you said: spectral theorem and Weyl’s equidistribution theorem.

      If I am not mistaken, the proof in the post is due to V. Bergelson. Actually, the reference I used was Bergelson’s survey Ergodic Ramsey theory – an update.

      Best, Yuri.
