Posted by: yglima | January 7, 2012

Applications of Szemerédi’s regularity lemma: triangle removal lemma, Roth’s theorem, corner’s theorem and graph removal lemma

In the previous post we proved Szemerédi’s regularity lemma, and now we give a few of its various applications: the triangle removal lemma, Roth’s theorem on the existence of arithmetic progressions of length {3} in subsets of the integers with positive density, the corner’s theorem and, finally, the graph removal lemma.

1. Triangle removal lemma

Most applications of Szemerédi’s regularity lemma deal with monotone problems, when throwing in more edges can only help. In these applications, one starts applying the original form of the regularity lemma to create a regular partition, then gets rid of all edges within the clusters of the partition, also the edges of non-regular pairs as well as those of regular pairs with small density. The leftover “pure” graph is much easier to handle and still contains most of the original edges. This is what happens in proving the triangle removal lemma.

The triangle removal lemma is the (intuitive, yet nontrivial) fact that if one has to delete at least {\varepsilon n^2} edges of a graph with {n} vertices to destroy all triangles in it, then the graph must contain at least {\delta n^3} triangles, where {\delta=\delta(\varepsilon)>0}. If one only thinks naively, the conclusion is that the graph contains at least {\varepsilon n^2} triangles, and the strength of the triangle removal lemma is that, instead of quadratic, the number of triangles is cubic. It was first proved by Ruzsa and Szemerédi in the paper Triple systems with no six points carrying three triangles, who also observed it implies Roth’s theorem, as we shall see in the next section.

Definition 1 Given {\varepsilon>0}, a graph {G=(V,E)} is {\varepsilon}-far from being triangle free if one has to delete at least {\varepsilon\cdot|V|^2} edges of {G} to destroy all triangles in it.

In particular, every graph that is {\varepsilon}-far from being triangle-free has at least one triangle (indeed, at least {\varepsilon\cdot|V|^2} of them).

Theorem 2 (Triangle removal lemma) For any {0<\varepsilon<1}, there is {\delta=\delta(\varepsilon)>0} such that, whenever {G=(V,E)} is {\varepsilon}-far from being triangle-free, then it contains at least {\delta\cdot|V|^3} triangles.

Proof: Let {G=(V,E)} be an {\varepsilon}-far from being triangle-free graph and {|V|=n}. We can assume {n>N(\varepsilon/4,\lfloor 4/\varepsilon\rfloor)} by just taking {\delta} sufficiently small so that

\displaystyle \delta\cdot N(\varepsilon/4,\lfloor 4/\varepsilon\rfloor)^3<1\,.

Consider the {\varepsilon/4}-regular partition {\mathcal U=\{V_0,V_1,\ldots,V_k\}} given by Szemerédi’s regularity lemma. Let {c=|V_1|=\cdots=|V_k|} and {G'} the graph obtained from {G} by deleting the following edges:

  1. All edges incident in {V_0}: there are at most {\varepsilon n^2/4} edges.
  2. All edges inside the clusters {V_1,\ldots,V_k}: the number of edges is at most

    \displaystyle c^2\cdot k<\dfrac{n^2}{k}<\dfrac{\varepsilon\cdot n^2}{4}\,\cdot

  3. All edges that lie in irregular pairs: there are less than

    \displaystyle \left(\dfrac{\varepsilon}{4}\cdot k^2\right)\cdot c^2<\dfrac{\varepsilon\cdot n^2}{4}

    edges.

  4. All edges lying in a pair of clusters whose density is less than {\varepsilon/2}: their cardinality is at most

    \displaystyle {k\choose 2}\cdot\dfrac{\varepsilon\cdot c^2}{2}<\dfrac{\varepsilon\cdot n^2}{4}\,\cdot

The number of deleted edges is less than {\varepsilon\cdot n^2} and so {G'} contains a triangle. The three vertices of such triangle belong to three remaining distinct clusters, let us say {V_1,V_2,V_3}. We’ll show that in fact these clusters support many triangles.

Call a vertex {v_1\in V_1} typical if it has at least {\varepsilon c/4} adjacent vertices in {V_2} and at least {\varepsilon c/4} adjacent vertices in {V_3}. As, by hypothesis,

\displaystyle d\left(V_i',V_j'\right)\ge\dfrac{\varepsilon}{4} \ \ \ \ \ (1)

whenever {V_i'\subset V_i}, {V_j'\subset V_j} have cardinality at least {\varepsilon c/4}, there are more than {c/2} typical vertices in {V_1}. In fact, the number of vertices in {V_1} with at least {\varepsilon c/4} adjacent vertices in {V_2} is greater than {(1-\varepsilon/4)\cdot c}. If this were not the case, the subset {V_1'\subset V_1} of non-typical vertices would have more than {\varepsilon c/4} elements and would satisfy

\displaystyle d\left(V_1',V_2\right)<\dfrac{|V_1'|\cdot\dfrac{\varepsilon c}{4}}{|V_1'|\cdot|V_2|} =\dfrac{\varepsilon}{4}\,,

contradicting (1). As the same argument holds to {V_3}, the number of typical vertices in {V_1} is at least

\displaystyle \left(1-2\cdot\dfrac{\varepsilon}{4}\right)\cdot c>\dfrac{c}{2}\,\cdot

Let {v_1\in V_1} be one of them and consider {V_2'\subset V_2,V_3'\subset V_3} the vertices adjacent to {v_1}.

Every edge between {V_2'} and {V_3'} generates a triangle. Observe that this number is at least

\displaystyle e(V_2',V_3')\ge \dfrac{\varepsilon}{4}\cdot|V_2'|\cdot|V_3'|\ge \dfrac{\varepsilon^3\cdot c^2}{4^3}\,\cdot

Summing this up in {v_1\in V_1} typical, {G'} has at least {(\varepsilon c/4)^3} triangles. Because {c>n/2T(\varepsilon/4,\lfloor 4/\varepsilon\rfloor)}, this quantity is greater or equal to

\displaystyle \left(\dfrac{\varepsilon}{4}\cdot \dfrac{n}{2\cdot T(\varepsilon/4,\lfloor 4/\varepsilon\rfloor)}\right)^3= \left(\dfrac{\varepsilon}{8\cdot T(\varepsilon/4,\lfloor 4/\varepsilon\rfloor)}\right)^3\cdot n^3= \delta(\varepsilon)\cdot n^3.

\Box

1.1. Roth’s theorem

As an application of the triangle removal lemma, we prove Roth’s theorem proved in On certain sets of integers.

Theorem 3 (Roth) If {A\subset\mathbb N} has positive upper density, then it contains an arithmetic progression of length {3}.

Proof: Assume that

\displaystyle |A\cap\{1,\ldots,n\}|>\varepsilon n\,,\ \forall\,n\ge n_0.

Consider a graph {G=(V,E)} in the following way:

  1. {V=V_1\cup V_2\cup V_3}, where {V_1,V_2,V_3} have {3n} vertices labeled from 1 to {3n} each.
  2. There is an edge from {i\in V_1} to {j\in V_2} iff {j-i\in A}.
  3. There is an edge from {j\in V_2} to {k\in V_3} iff {k-j\in A}.
  4. There is an edge from {i\in V_1} to {k\in V_3} iff {(k-i)/2\in A}.

Then {i,j,k} form a triangle exists iff

\displaystyle \left\{\begin{array}{c} j-i=a_1\in A\\ k-j=a_3\in A\\ \dfrac{k-i}{2}=a_2\in A \end{array}\right.\Longrightarrow (a_1,a_2,a_3)\text{ is an arithmetic progression in }A,

that is, the triangles identify arithmetic progressions of length 3 in {A}, including the trivial ones {(a,a,a)}, {a\in A}. There are more than {\varepsilon n\cdot n=\varepsilon n^2} of these trivial triangles {i,i+a,i+2a} and they are all disjoint. This mere disjointness implies {G} is {\varepsilon}-far from being triangle-free and so, by the triangle removal lemma, {G} has at least {\delta n^3} triangles of which at least {\delta n^3-81n^2} are non-trivial. The proof is complete by taking {n>81\delta^{-1}}. \Box

1.2. Corner’s theorem

This result was first proved by Ajtai and Szemerédi in Sets of lattice points that form no squares. The simpler proof we present here, using the triangle removal lemma, was obtained by Solymosi in Note on a generalization of Roth’s theorem. We point out, and leave the proof to the reader, that the corner’s theorem is a strengthening of Roth’s theorem.

Definition 4 A corner is an axis-aligned isosceles triangle of {\mathbb Z^2}, that is, it is a set of three different elements of {\mathbb Z^2} of the form

\displaystyle (x,y),(x+h,y)\text{ and }(x,y+h).

The corner’s theorem states that every set of positive density has a corner.

Theorem 5 (Corner’s theorem) For every {\varepsilon>0}, there exists {n>0} such that any subset of {\{1,\ldots,n\}\times\{1,\ldots,n\}} with at least {\varepsilon n^2} points has a corner.

Proof: We proceed similarly to the proof of Roth’s theorem. Let {A} be a subset of {\{1,\ldots,n\}\times\{1,\ldots,n\}} with at least {\varepsilon n^2} points and consider the tripartite graph {G=(V,E)} defined by:

  1. {V=V_1\cup V_2\cup V_3}, where {V_1,V_2} and {V_3} represent the horizontal, vertical and diagonal lines of {\{1,\ldots,n\}\times\{1,\ldots,n\}} respectively.
  2. There is an edge from a line of {V_i} to a line of {V_j} iff the intersection of the two lines belongs to {A}.

{G} has {|V_1|+|V_2|+|V_3|=n+n+2n=4n} vertices. The triangles of {G} correspond to the corners of {A}, including the trivial ones {(x,y),(x,y),(x,y)}. {G} has more than {|A|\ge \varepsilon n^2} of these trivial triangles and they are all disjoint, so that {G} is {\varepsilon/16}-far from being triangle-free. By the triangle removal lemma, {G} has at least {\delta n^3} triangles of which at least {\delta n^3-n^2} are non-trivial and give the required corner. \Box

2. Graph removal lemma

The triangle removal lemma asserts that every graph for which it is necessary to throw a positive fraction of edges in order to destroy all triangles indeed has a positive fraction of triangles. The fact is that, as proved by Erdös, Frankl and Rödl in The asymptotic number of graphs not containing a fixed subgraph and a problem for hypergraphs having no exponent, instead of fixing the triangle configuration, the result is also true for any fixed configuration. More formally, let {H} be a finite graph with {h} vertices and, analogously, consider the following

Definition 6 Given {\varepsilon>0}, a graph {G=(V,E)} is {\varepsilon}-far from being {H}-free if one has to delete at least {\varepsilon\cdot|V|^2} edges of {G} to destroy all copies of {H} in it.

Theorem 7 (Graph removal lemma) For any {0<\varepsilon<1}, there is {\delta=\delta(\varepsilon)>0} such that, whenever {G=(V,E)} is {\varepsilon}-far from being {H}-free, then it contains at least {\delta\cdot|V|^h} copies of {H}.

The proof of this theorem is more intricate than that of the triangle removal lemma. Actually, it depends on the structure of the graph {H}. If, for example, {H} is a four-cycle, then the argument applied in the proof of the triangle removal lemma does not work, mainly because, once the “impure” edges are discarded, the copy of {H} that remains may have two vertices in a same cluster. In other words, the connectivity properties of {H} influence the distribution of the vertices along the clusters in a potential candidate for copy of {H} in {G}. As in the triangle removal lemma, this problem does not occur if {H} is the complete graph {K_r} in {r} vertices. For this reason, the proof of the graph removal lemma will be accomplished in three parts:

Part 1. The establishment of the graph removal lemma for {K_r}.

Part 2. We observe that, for a general {H}, the application of the same idea in Part 1 only guarantees the existence of {r} clusters, where {r} is the chromatic number of {H}.

Part 3. If we apply the same idea as in Part 1, allowing the choice of more than one vertex in a same cluster, we obtain the result for any {H}.

As remarked above, Part 1 follows the same lines of the proof of the triangle removal lemma: we clean out the graph and the remaining copy of {K_r} is supported in {r} different clusters, which indeed contain many copies of {K_r}. The construction of many copies is again accomplished by the typicality of most of the vertices, and is given by the following

Lemma 8 If {(A,B)} is {\varepsilon'}-regular and {d(A,B)>\varepsilon}, then at least {(1-\varepsilon')|A|} vertices of {A} are adjacent to at least {(\varepsilon-\varepsilon')|B|} vertices of {B}.

Proof: Let {A'=\{v\in A\,;\, v\text{ is adjacent to less than }(\varepsilon-\varepsilon')|B|\text{ vertices of }B\}}. Then

\displaystyle d(A',B)<\dfrac{|A'|\cdot(\varepsilon-\varepsilon')|B|}{|A'|\cdot|B|}=\varepsilon-\varepsilon'. \ \ \ \ \ (2)

If {|A'|\ge\varepsilon'|A|}, the {\varepsilon'}-regularity guarantees that

\displaystyle d(A',B)>d(A,B)-\varepsilon'>\varepsilon-\varepsilon',

thus contradicting (2). \Box

Proof of the graph removal lemma for K_r: Let {G=(V,E)} be {\varepsilon}-far from being {K_r}-free graph with {|V|=n>N((\varepsilon/6)^r,(\varepsilon/6)^{-r})}, and consider the {(\varepsilon/6)^r}-regular partition {\mathcal U=\{V_0,V_1,\ldots,V_k\}} given by Szemerédi’s regularity lemma. Let {c=|V_1|=\cdots=|V_k|} and {G'} the graph obtained from {G} by deleting the following edges:

  1. All edges incident in {V_0}: there are at most {(\varepsilon/6)^r\cdot n^2} edges.
  2. All edges inside the clusters {V_1,\ldots,V_k}: the number of edges is at most

    \displaystyle c^2\cdot k<\dfrac{n^2}{k}<(\varepsilon/6)^r\cdot n^2.

  3. All edges that lie in irregular pairs: there are less than

    \displaystyle \left((\varepsilon/6)^r\cdot k^2\right)\cdot c^2<(\varepsilon/6)^r\cdot n^2

    edges.

  4. All edges lying in a pair of clusters whose density is less than {\varepsilon/3}: their cardinality is at most

    \displaystyle {k\choose 2}\cdot\dfrac{\varepsilon\cdot c^2}{3}<\dfrac{\varepsilon\cdot n^2}{3}\,\cdot

The number of deleted edges is less than {\varepsilon\cdot n^2} and so {G'} contains {K_r}. The vertices of such {K_r} belong to {r} distinct remaining clusters, say {V_1,V_2,\ldots,V_r}. We’ll show that these clusters support many copies of {K_r}. This is done in {r} steps, the step {i} consisting of choosing a vertex {v_i} from {V_i} in such a way that {v_i} is adjacent to each of the previously chosen vertices {v_1,\ldots,v_{i-1}}. If there are {\delta_i c} ways of choosing {v_i}, where {\delta_i} is independent of {n}, then {G} contains at least

\displaystyle (\delta_1 c)\cdots(\delta_r c)> \left(\dfrac{\delta_1\cdots\delta_r}{2^r\cdot T((\varepsilon/6)^r,(\varepsilon/6)^{-r})^r}\right)\cdot n^r =: \delta\cdot n^r

copies of {K_r} and we’re done.

By Lemma 8, at least {(1-r\cdot(\varepsilon/6)^r)|V_1|} points in {V_1} are joined to at least {(\varepsilon/3-(\varepsilon/6)^r)|V_j|} points of {V_j} for each {j=2,\ldots,r}. Take one such point {v_1} and denote by {V_j^1} the subset of {V_j} of all the adjacent vertices to {v_1}, for each {j=2,\ldots,r}. We have

\displaystyle |V_j^1|\ge\left(\dfrac{\varepsilon}{3}-\left(\dfrac{\varepsilon}{6}\right)^r\right)|V_j| >\left(\dfrac{\varepsilon}{6}\right)|V_j|

and hence any two of the clusters {V_2^1,\ldots,V_r^1} are {(\varepsilon/6)^{r-1}}-regular and have density at least {\varepsilon/3-(\varepsilon/6)^r}. This concludes the first step.

We now proceed to step 2: again by Lemma 8, at least {(1-r\cdot(\varepsilon/6)^{r-1})|V_2^1|} points in {V_2^1} are joined to at least {(\varepsilon/3-(\varepsilon/6)^r-(\varepsilon/6)^{r-1})|V_j^1|} points of {V_j^1} for each {j=3,\ldots,r}. Take one such point {v_2\in V_2^1} and denote by {V_j^2} the subset of {V_j^1} of all the adjacent vertices to {v_2}, for each {j=3,\ldots,r}. We have

\displaystyle |V_j^2|\ge \left(\dfrac{\varepsilon}{3}-\left(\frac{\varepsilon}{6}\right)^r-\left(\frac{\varepsilon}{6}\right)^{r-1}\right)|V_j^1| >\left(\dfrac{\varepsilon}{6}\right)|V_j^1|

and hence any two of the clusters {V_3^2,\ldots,V_r^2} are {(\varepsilon/6)^{r-2}}-regular and have density at least {\varepsilon/3-(\varepsilon/6)^r-(\varepsilon/6)^{r-1}}.

Assuming, without loss of generality, that {r\varepsilon<1} and {\varepsilon/6+\cdots+(\varepsilon/6)^r<\varepsilon/3}, the above procedure can be repeated {r} times. This concludes the proof. \Box

Now let {H} be an arbitrary graph. The chromatic number of {H} is the smallest number of colors needed to paint the vertices of {H} in such a way that no two adjacent vertices have the same color. Equivalently, it is the smallest {r} for which {H} is {r}-partite, that is, for which one can divide the vertices of {H} into {r} disjoint subsets such that no two vertices on a same subset are adjacent. Let these subsets have cardinality {h_1,\ldots,h_r} and {K_{h_1,\ldots,h_r}} be the complete {r}-partite graph whose subsets have cardinality {h_1,\ldots,h_r}. Obviously, {K_{h_1,\ldots,h_r}} contains {H} and so the number of copies of {H} in a given graph is at least the number of copies of {K_{h_1,\ldots,h_r}}.

Observe that if we apply the same idea as in Part 1 to a {\varepsilon}-far from being {H}-free graph, the remaining copy of {H} has vertices in at least {r} clusters {V_1,\ldots,V_r}, and not necessarily in {h} different clusters. This is not a problem: instead of choosing one vertex in each {V_i}, we choose {h_i} of them. If the same procedure works, each of these choices generates a copy of {K_{h_1,\ldots,h_r}} and thus of {H}. This is how we proceed below.

Proof of the graph removal lemma: Apply the same argument as in Part 1 to obtain clusters {V_1,\ldots,V_r} such that any pair is {(\varepsilon/6)^h}-regular and has density at least {\varepsilon/3}. We can thus find {(1-r\cdot(\varepsilon/6)^h)|V_1|} points of {V_1} which are joined to at least {(\varepsilon/3-(\varepsilon/6)^h)|V_i|} points of {V_i} for each {i=2,\ldots,r}. Take one such point {v_1\in V_1} and denote by {V_i^1} the set of all vertices of {V_i} which are joined to {v_1}, for each {i=2,\ldots,r}. Also, set {V_1^1=V_1\backslash\{v_1\}} (here is the difference: we don’t discard {V_1}). We have

\displaystyle |V_i^1|\ge (\varepsilon/3-(\varepsilon/6)^h)|V_i|>(\varepsilon/6)|V_i|

for every {i=1,\ldots,r} and hence each pair among {V_1^1,\ldots,V_r^1} is {(\varepsilon/6)^{h-1}}-regular and has density at least {\varepsilon/3-(\varepsilon/6)^h}.

Repeating this argument successively {h_1} times in the first cluster, {h_2} times in the second cluster, {\ldots} , {h_r} times in the {r}-th cluster, we construct vertices {v_1,\ldots,v_h} forming a copy of {K_{h_1,\ldots,h_r}}. This completes the proof. \Box

We point out a recent proof of the graph removal lemma avoiding the use of Szemerédi’s regularity lemma has been obtained by Fox in the paper A new proof of the graph removal lemma. Although it does not use the regularity lemma, its idea is similar. Instead of using the mean square density given by the index

\displaystyle q(\mathcal U)=\sum_{A,B\in\mathcal U}q(A,B)= \sum_{A,B\in\mathcal U}\dfrac{|A|}{n}\cdot \dfrac{|B|}{n}\cdot d^2(A,B),

it uses a mean entropy density

\displaystyle \sum_{A,B\in\mathcal U}\dfrac{|A|}{n}\cdot \dfrac{|B|}{n}\cdot f(d(A,B)),

where {f(x)=x\log x} for {0< x\le 1} and {f(0)=0}. Like in the regularity lemma, whenever a partition does not supports many copies of {H}, Fox shows, using a Jensen defect inequality, that there is a refinement of the partition that increases the mean square entropy a fixed amount.

We finish this post mentioning another version of the graph removal lemma that counts the number of induced graphs. A subgraph {H} of a graph {G} is said to be induced if any pair of vertices of {H} are adjacent if and only if they are adjacent in {G}. For example, {K_5} has an induced {K_4} but does not have an induced four-cycle. This shows that induced graphs are harder to find, and actually that the excess of edges might prevent them to exist. In this setting, we have to consider a new definition of {\varepsilon}-far from being {H}-free, in which one can remove or include edges.

Definition 9 Given {\varepsilon>0}, a graph {G=(V,E)} is {\varepsilon}-unavoidable for {H} if any graph that differs from {G} in no more that {\varepsilon\cdot |V|^2} edges has an induced copy of {H}.

We thus have the graph removal lemma, proved by Alon, Fischer, Krivelevich and Szegedy in Efficient testing of large graphs.

Theorem 10 (Graph removal lemma for induced graphs) For any {0<\varepsilon<1}, there is {\delta=\delta(\varepsilon)>0} such that, whenever {G=(V,E)} is {\varepsilon}-unavoidable for {H}, then it contains at least {\delta\cdot|V|^h} induced copies of {H}.

We won’t prove it. Instead, we refer the reader to the original paper.

The content of this and the post Szemerédi’s regularity lemma can be downloaded in PDF format in my lecture notes available at my homepage.

About these ads

Responses

  1. Question on Corner’s Theorem:

    G has V = 3A vertices and at least A >= d * n^2 = (d * n^2 / 9A^2) V^3 trivial triangles.
    Hence, triangle removal must be applied with e = e(d * n^2 / 9A^2).
    It follows that G has at least

    27 e A^3 – A

    triangles. But this need not necessarily be > 1, since e = e(d * n^2 / 9A^2) converges to zero as n –> infty (Notice that A^2 ~ n^4).

  2. Dear D.H.J. Nam,

    You’re absolutely right. The constructed graph has O(n^2) vertices and so. to apply the triangle removal lemma, it should be O(n^4)-far from being triangle free, which I don’t see why.

    To correctly prove the theorem, we define a graph with O(n) vertices in the following way: instead of labeling the graph by the points of A itself, we label it by horizontal, vertical and diagonal lines of the grid and join two lines if their intersection belongs to A. You can check the details in the post, now upgraded with a (hopefully) right proof.

    Thanks for the observations.

    • Dear yglima,

      if you wish, you can delete my correction, since by now, it is unnecessary and only makes this blog longer than it could be …


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Categories

Follow

Get every new post delivered to your Inbox.

Join 107 other followers

%d bloggers like this: