Posted by: matheuscmss | December 30, 2018

Continued fractions, binary quadratic forms and Markov spectrum

In Number Theory, the study of the so-called Markov spectrum often relies upon the following classical fact relating continued fractions and the values of real indefinite binary quadratic forms at integral points:

Theorem 1 The sets

\displaystyle M_1:=\left\{\frac{\sqrt{b^2-4ac}}{\inf\limits_{(x,y)\in\mathbb{Z}^2\setminus\{(0,0)\}}|ax^2+bxy+cy^2|}<\infty: a,b,c\in\mathbb{R}, b^2-4ac>0\right\}

and

\displaystyle M_2:=\left\{\sup\limits_{n\in\mathbb{Z}}([g_n; g_{n+1},\dots]+[0;g_{n-1},g_{n-2},\dots])<\infty: (g_m)_{m\in\mathbb{Z}}\in(\mathbb{N}^*)^{\mathbb{Z}}\right\}

coincide. (Here, {[t_0;t_1,\dots] := t_0+\frac{1}{t_1+\frac{1}{\ddots}}} stands for continued fraction expansions.)

Remark 1 It is worth to note that the set {M_1} concerns only the real binary quadratic forms {q(x,y)=ax^2+bxy+cy^2} with {b^2-4ac>0} such that the quantity

\displaystyle \frac{\sqrt{b^2-4ac}}{\inf\limits_{(x,y)\in\mathbb{Z}^2\setminus\{(0,0)\}}|ax^2+bxy+cy^2|}

is finite. In particular, we are excluding real binary quadratic forms {q} with {0\in q(\mathbb{Z}^2\setminus\{(0,0)\})}.

In this post, we follow the books of Dickson and Cusick–Flahive in order to give a proof of this result via the classical reduction theory of binary quadratic forms.

1. Generalities on binary quadratic forms

A real binary quadratic form is {q(x,y)=ax^2+bxy+cy^2} with {a,b,c\in\mathbb{R}}. From the point of view of linear algebra, a binary quadratic form is

\displaystyle q(x,y) = ax^2+bxy+cy^2 =\langle Mv,v\rangle = v^t M v \ \ \ \ \ (1)

where {\langle .,.\rangle} is the usual Euclidean inner product of {\mathbb{R}^2}, {v} is the column vector {v=\left(\begin{array}{c} x \\ y \end{array}\right)}, and {M} is the matrix {M=\left(\begin{array}{cc} a & b/2 \\ b/2 & c \end{array}\right)}.

The discriminant of {q(x,y)=ax^2+bxy+cy^2 = \langle Mv, v\rangle} is {d:=b^2-4ac = -4\det(M)}.

Remark 2 The values taken by {q(x,y)=bxy} are very easy to describe: in particular, {0\in q(\mathbb{Z}^2\setminus\{(0,0)\})} in this context. Hence, by Remark 1, we can focus on {q(x,y)=ax^2+bxy+cy^2} with {a\neq 0} or {c\neq0}. Moreover, by symmetry (i.e., exchanging the roles of the variables {x} and {y}), we can assume that {a\neq0}. So, from now on, we shall assume that {a\neq0}.

Note that {4aq(x,y)=(2ax+by)^2-dy^2}. Therefore, {q} is definite (i.e., its values are all positive or all negative) whenever the discriminant is {d\leq 0}. Conversely, when {a\neq 0}, {q} is indefinite (i.e., it takes both positive and negative values) whenever {d>0}. In view of the definition of {M_1} in the statement of Theorem 1 above, we will restrict ourselves from now on to the indefinite case {d>0}.

Observe also that {q(x,y)=a\prod\limits_{a\omega^2+b\omega+c=0}(x-\omega y) = a(x-fy)(x-sy)} where

\displaystyle f:=\frac{\sqrt{d}-b}{2a} \quad \textrm{and} \quad s := \frac{-\sqrt{d}-b}{2a} \ \ \ \ \ (2)

are the first and second roots of {a\omega^2+b\omega+c=0}. In particular, {0\in q(\mathbb{Z}^2\setminus\{(0,0)\})} when {f\in\mathbb{Q}} or {s\in\mathbb{Q}}. So, by Remark 1, we will suppose from now on that {f, s\notin\mathbb{Q}}.

In summary, from now on, our standing assumptions on {q(x,y)=ax^2+bxy+cy^2} are {a\neq0}, {d:=b^2-4ac>0} and {f:=\frac{\sqrt{d}-b}{2a}\notin\mathbb{Q}}, {s:= \frac{-\sqrt{d}-b}{2a}\notin\mathbb{Q}}.

Remark 3 The first and second roots {f}, {s} and the discriminant {d>0} determine the coefficients {a}, {b}, {c} of the binary quadratic form {q(x,y)}. Indeed, the formula {q(x,y)=a(x-fy)(x-sy)} says that it suffices to determine {a}. On one hand, the modulus {|a|} is given by {4a^2(f-s)^2 = d}. On the other hand, the fact that {f} is the 1st root and {s} is the 2nd root determines the sign of {a}: if we replace {a} by {-a} in the formula {q(x,y)=a(x-fy)(x-sy)}, then we obtain the binary quadratic form {-q} whose 1st root is {s} and 2nd root is {f}; hence, an ambiguity on the sign of {a} could only occur when {f=s}, i.e., {d=0}, in contradiction with our assumption {d>0}.

2. Action of {GL_2(\mathbb{R})} on binary quadratic forms

A key idea (going back to Lagrange) to study the values of a fixed binary quadratic form {q(x,y)=ax^2+bxy+cy^2} is to investigate the equivalent problem of describing the values of a family of binary quadratic forms on a fixed vector.

More precisely, if a vector {v=\left(\begin{array}{c} x \\ y \end{array}\right)=T(V)} is obtained from a fixed vector {V=\left(\begin{array}{c} X \\ Y \end{array}\right)} via a matrix {T=\left(\begin{array}{cc} \alpha & \beta \\ \gamma & \delta \end{array}\right)\in GL_2(\mathbb{R})}, i.e., {x=\alpha X + \beta Y}, {y=\gamma X+ \delta Y}, then the value of {q} at {v} equals the value of {q\circ T} at {V}. By (1), the binary quadratic forms {q} and {q\circ T} are related by:

\displaystyle q\circ T (V) = (T(V))^t\cdot M \cdot T(V) = V^t (T^t M T) V \ \ \ \ \ (3)

where {M=\left(\begin{array}{cc} a & b/2 \\ b/2 & c \end{array}\right)}. In other terms, {q\circ T(X,Y)=AX^2+BXY+CY^2} where

\displaystyle \left(\begin{array}{cc} A & B/2 \\ B/2 & C \end{array}\right) = T^t M T = \left(\begin{array}{cc} \alpha & \gamma \\ \beta & \delta \end{array}\right)\left(\begin{array}{cc} a & b/2 \\ b/2 & c \end{array}\right)\left(\begin{array}{cc} \alpha & \beta \\ \gamma & \delta \end{array}\right), \ \ \ \ \ (4)

i.e., {A=a\alpha^2+b\alpha\gamma+c\gamma^2}, {B=2a\alpha\beta+b(\alpha\delta+\beta\gamma)+ 2c\gamma\delta}, and {C=a\beta^2+b\beta\delta+c\delta^2}.

Observe that (3) implies that the discriminant of {q\circ T} is {\det(T)^2\cdot d}. Thus, {q\circ T} and {q} have the same discriminant whenever {\det(T)=\pm1}.

2.1. Action of {SL_2(\mathbb{Z})} on binary quadratic forms

The quantity {\inf\{|z|: z\in q(\mathbb{Z}^2\setminus\{(0,0)\})\}} appearing in the definition of {M_1} in the statement of Theorem 1 leads us to restrict our attention to the action of {SL_2(\mathbb{Z})} on {q(x,y)=ax^2+bxy+cy^2}. In fact, since {q(\lambda x,\lambda y) = \lambda^2 q(x,y)}, we have that

\displaystyle \inf\{|z|: z\in q(\mathbb{Z}^2\setminus\{(0,0)\})\} = \inf\{|z|: z\in q(\mathbb{Z}^2_{prim})\} \ \ \ \ \ (5)

where {\mathbb{Z}^2_{prim}:=\{(p,q)\in\mathbb{Z}^2: gcd(p,q)=1\}} is the set of primitive vectors of {\mathbb{Z}^2}. So, it is natural to concentrate on {SL_2(\mathbb{Z})} because it acts transitively on {\mathbb{Z}^2_{prim}}.

Definition 2 Two binary quadratic forms {q} and {Q} are equivalent whenever {Q=q\circ T} for some {T\in SL_2(\mathbb{Z})}.

Note that two equivalent binary quadratic forms have the same discriminant. Furthermore, if {(x-\omega y)} is a factor of {q(x,y)=ax^2+bxy+cy^2} and {x=\alpha X+\beta Y}, {y=\gamma X+\delta Y} with {T=\left(\begin{array}{cc}\alpha & \beta\\ \gamma&\delta\end{array}\right)\in SL_2(\mathbb{Z})}, then {(\alpha X+\beta Y - \omega(\gamma X+\delta Y))} is a factor of {q\circ T(X,Y)=AX^2+BXY+CY^2} and, a fortiori, {(X-\frac{-\beta+\omega\delta}{\alpha-\omega\gamma}Y)} is a factor of {q\circ T}. In particular, the roots of {A\Omega^2+B\Omega+C=0} are related to the roots of {a\omega^2+b\omega+c=0} via

\displaystyle \Omega=\frac{-\beta+\omega\delta}{\alpha-\omega\gamma} \ \ \ \ \ (6)

In other words, {\omega} and {\Omega} are related to each other via the action of {T^{-1} = \left(\begin{array}{cc} \delta & -\beta \\ -\gamma & \alpha \end{array}\right)} by Möbius transformations. Actually, a direct calculation with the formulas for {A} and {B} in (4) and the fact that {\alpha\delta-\beta\gamma=1} show that (6) respect the order of the roots, i.e.,

\displaystyle F=\frac{-\beta+f\delta}{\alpha-f\gamma} \quad \textrm{and} \quad S=\frac{-\beta+s\delta}{\alpha-s\gamma} \ \ \ \ \ (7)

where {f} and {F} are the first roots, and {s} and {S} are the 2nd roots.

Remark 4 Note that {\alpha-\omega\gamma\neq 0} because {(\alpha, \gamma)\in\mathbb{Z}^2_{prim}} and we are assuming that the roots of {a\omega^2+b\omega+c=0} are irrational.

Lemma 3 Under our standing assumptions, any binary quadratic form {q} is equivalent to some {ax^2+bxy+cy^2} with {|b|\leq |a|\leq \sqrt{d/3}}.

Proof: We will proceed in two steps: first, we apply certain elements of {SL_2(\mathbb{Z})} to ensure that {|a|\leq \sqrt{d/3}}, and, after that, we use a parabolic matrix to obtain {|b|\leq |a|}.

By (4), the matrix {H_0=\left(\begin{array}{cc} h_0 & 1 \\ -1 & 0 \end{array}\right)\in SL_2(\mathbb{Z})} converts {q(x,y)=a_0x^2+b_0xy+c_0y^2} into {q\circ H_0(X,Y) = a_1X^2+b_1XY+a_0Y^2} with {b_1=2a_0h_0-b_0}.

If {|a_0|>\sqrt{d/3}}, the choice of {h_0\in\mathbb{Z}} with {|2a_0h_0-b_0|\leq |a_0|} in the discussion above leads to {q\circ H_0(X,Y)=a_1X^2+b_1XY+a_0Y^2} with

\displaystyle 4a_1a_0 = b_1^2-d\leq b_1^2 = (2a_0h_0-b_0)^2\leq a_0^2 \quad \textrm{and} \quad -4a_1a_0 = d-b_1^2\leq d<3a_0^2.

In other terms, if {|a_0|>\sqrt{d/3}}, then {q} is equivalent to {a_1X^2+b_1XY+a_0Y^2} with {|4a_1a_0|<3a_0^2}, i.e., {|a_1|<(3/4)|a_0|}.

By iterating this process, we find a sequence {a_nX^2+b_nXY+a_{n-1}Y^2} of binary quadratic forms equivalent to {q} such that {|a_n|<(3/4)^n |a_0|} whenever {|a_{n-1}|>\sqrt{d/3}}. It follows that {q} is equivalent to some {ax^2+\widetilde{b}xy+\widetilde{c}y^2} with {|a|\leq \sqrt{d/3}}.

Finally, by (4), the matrix {P=\left(\begin{array}{cc} 1 & k \\ 0 & 1 \end{array}\right)\in SL_2(\mathbb{Z})} converts {ax^2+\widetilde{b}xy+\widetilde{c}y^2} into {aX^2+bXY+cY^2} with {b=\widetilde{b}+2ka}. Hence, the choice of {k\in\mathbb{Z}} with {|\widetilde{b}+2ka|\leq |a|} gives that {q} is equivalent to {ax^2+bxy+cy^2} with {|b|\leq |a|\leq \sqrt{d/3}}. \Box

2.2. Reduction theory (I)

In general, the study of the dynamics of the action of a group {G} on a certain space {\mathcal{M}} is greatly improved in the presence of nice fundamental domain, i.e., a portion {\mathcal{D}\subset \mathcal{M}} with good geometrical properties capturing all orbits in the sense that the {G}-orbit of any {x\in \mathcal{M}} intersects {\mathcal{D}}.

In the setting of {SL_2(\mathbb{Z})} acting on binary quadratic forms, the role of fundamental domain is played by the notion of reduced binary quadratic form:

Definition 4 We say that {q(x,y)=ax^2+bxy+cy^2} is reduced whenever

\displaystyle \frac{|\sqrt{d}-b|}{2|a|}=|f|<1, \quad \frac{|\sqrt{d}+b|}{2|a|}=|s|>1 \quad \textrm{and} \quad f\cdot s<0

Remark 5 Suppose that {q} is reduced. Then, {d-b^2=(\sqrt{d}-b)(\sqrt{d}+b)=-4a^2fs>0} thanks to the condition {f\cdot s<0}. In particular, {|b|<\sqrt{d}}, so that {\sqrt{d}- b, \sqrt{d}+b > 0}. Hence, {0 < \sqrt{d}-b < 2|a| < \sqrt{d}+b} thanks to the condition {|f|<1<|s|}. Note that the inequality {\sqrt{d}-b<\sqrt{d}+b} says that {b>0}. In other words, {q} reduced implies ({0<b<\sqrt{d}} and) {0<\sqrt{d}-b<2|a|< \sqrt{d}+b}.Conversely, the inequality {0<\sqrt{d}-b<2|a|< \sqrt{d}+b} implies that ({0<b<\sqrt{d}} and) {q} is reduced.

Therefore, {q} is reduced if and only if

\displaystyle 0<\sqrt{d}-b<2|a|< \sqrt{d}+b

Furthermore, the identity {|\sqrt{d}-b|\cdot|\sqrt{d}+b|=4|ac|} implies that {q} is reduced if and only if

\displaystyle 0<\sqrt{d}-b<2|c|< \sqrt{d}+b

In particular, {q} is reduced if and only {q\circ R} is reduced where {R\in GL_2(\mathbb{Z})} is the matrix {R=\left(\begin{array}{cc} 0&1\\ 1 & 0\end{array}\right)} inducing the change of variables {x=Y}, {y=X}.Moreover, if {q} is reduced, then {f\cdot a = (\sqrt{d}-b)/2>0} and {c/a = f\cdot s<0}, i.e., the sign of {c} is opposite to the signs of {f} and {a}.

The next result says that reduced forms are a fundamental domain for the {SL_2(\mathbb{Z})} action on binary quadratic forms:

Theorem 5 Under our standing assumptions, any binary quadratic form {q} is equivalent to some reduced binary quadratic form.

Proof: By Lemma 3, we can assume that {q(x,y)=ax^2+bxy+cy^2} with {|b|\leq |a|\leq \sqrt{d/3}\leq\sqrt{d}}, so that {|4ac|=d-b^2\leq d}. Thus, {\min\{2|a|, 2|c|\}\leq\sqrt{d}}. By performing the change of variables {x=Y}, {y=-X} (i.e., applying the matrix {\left(\begin{array}{cc} 0&1 \\ -1&0\end{array}\right)\in SL_2(\mathbb{Z})}) if necessary, we can suppose that {2|a|\leq \sqrt{d}}.

Since we are assuming that {a\neq 0}, it is possible to choose {k\in\mathbb{Z}} such that {0\leq\sqrt{d}-2|a|\leq \widetilde{b}:=b+2ak\leq \sqrt{d}}. So, if we apply the matrix {\left(\begin{array}{cc} 1&k \\ 0&1\end{array}\right)\in SL_2(\mathbb{Z})} (i.e., {x=X+kY}, {y=Y}) to {q}, then we get the equivalent binary quadratic form {ax^2+\widetilde{b}xy+\widetilde{c}} with {0\leq\sqrt{d}-\widetilde{b}\leq 2|a|\leq \sqrt{d}\leq \sqrt{d}+\widetilde{b}}.

We affirm that {ax^2+\widetilde{b}xy+\widetilde{c}} is reduced. In fact, by Remark 5, our task is to show that we have strict inequalities

\displaystyle 0<\sqrt{d}-\widetilde{b}< 2|a|< \sqrt{d}+\widetilde{b}

However, {0=\sqrt{d}-b} or {\sqrt{d}-\widetilde{b}=2|a|} or {2|a|=\sqrt{d}+\widetilde{b}} would imply that {0} or {\pm1} is a root of {a\omega^2+\widetilde{b}\omega+\widetilde{c}=0}, a contradiction with our assumption that none of its roots is rational. Hence, {ax^2+\widetilde{b}xy+\widetilde{c}} is reduced. \Box

2.3. Reduction theory (II)

In this subsection, we use the matrices {L_{\delta}:=\left(\begin{array}{cc} 0 & 1 \\ -1 & \delta \end{array}\right)\in SL_2(\mathbb{Z})} inducing the change of variables {x=Y}, {y=-X+\delta Y} (similar to an Euclidean division) to navigate through the set of reduced binary quadratic forms (cf. Lemma 6 below).

Before explaining this fact, let us introduce the following definition: we say that the binary quadratic forms {q\circ L_{\delta}}, resp. {q\circ L_{\delta}^{-1}}, {\delta\in\mathbb{Z}}, are the right, resp. left neighbors of the binary quadratic form {q}.

The notion of neighbors is adapted to the investigation of reduced binary quadratic forms because of the following result:

Lemma 6 Under our standing assumptions, a reduced binary quadratic form has an unique reduced right neighbor and an unique reduced left neighbor.

Proof: Let us first show the existence and uniqueness of the reduced right neighbor of a reduced binary quadratic form {q(x,y)=ax^2+bxy+cy^2}.

Consider the 1st and 2nd roots {f} and {s} of {a\omega^2+b\omega+c=0}. Since {q} is reduced, i.e., {|f|<1<|s|} and {f\cdot s<0}, the largest integer {|\delta|<1/|f|} is not zero. We take {\delta\in\mathbb{Z}} with the same sign as {f} and {a} and, hence, opposite to {c} (cf. Remark 5).

By (4) and (7), the right neighbor {q\circ L_{\delta}} is {cX^2+\widetilde{b}XY+\widetilde{c}Y^2} with {\widetilde{b}=-b-2\delta c} and the roots of {c\Omega^2+\widetilde{b}\Omega+\widetilde{c}=0} satisfy {F=\delta-\frac{1}{f}} and {S=\delta-\frac{1}{s}}.

Thus, our choice of {\delta} implies that {|F|<1} and {F} has opposite sign to {f} and {\delta}. Since {f\cdot s<0}, {s} has opposite sign to ({f} and) {\delta}, we have that {S} has the same sign of {\delta} and {|S|>|\delta|\geq 1}. In summary, the right neighbor {q\circ L_{\delta}} of {q} is reduced.

Moreover, this reduced right neighbor is unique because {|f|, |F|<1} and {F=\delta-\frac{1}{f}} imply that {\delta\in\mathbb{Z}} must be the unique integer with the same sign of {f} and {|\delta|=\lfloor1/|f|\rfloor}.

Finally, let us establish the existence and the uniqueness of the reduced left neighbor of {q(x,y)=ax^2+bxy+cy^2}. Consider the matrix {R=\left(\begin{array}{cc} 0 & 1 \\ 1 & 0\end{array}\right)} (with determinant {-1}) inducing the change of variables {x=Y} and {y=X}. By Remark 5, the fact that {q} is reduced implies that {q\circ R(X,Y)=cX^2+bXY+aY^2} is also reduced. In particular, {q\circ R} has an unique reduced right neighbor {q\circ R \circ L_{\delta}}. In particular, {q\circ R\circ L_{\delta}\circ R} is reduced. Since

\displaystyle R L_{\delta} R = \left(\begin{array}{cc} 0 & 1 \\ 1 & 0\end{array}\right)\left(\begin{array}{cc} 0 & 1 \\ -1 & \delta\end{array}\right)\left(\begin{array}{cc} 0 & 1 \\ 1 & 0\end{array}\right) = \left(\begin{array}{cc} \delta & -1 \\ 1 & 0\end{array}\right)=L_{\delta}^{-1},

we conclude that {q\circ L_{\delta}^{-1}} is the unique left neighbor of {q}. \Box

This lemma allows us to organize the reduced binary quadratic forms into chains:

Definition 7 chain {\dots, \Phi_{-1},\Phi_0, \Phi_1,\dots} is a sequence of reduced binary quadratic forms such that {\Phi_{i+1}} is the right neighbor {\Phi_i} for all {i\in\mathbb{Z}}.

2.4. Reduction theory (III)

The proof of Lemma 6 hints to a connection between chains and continued fraction expansions. More concretely, let {\{\Phi_i\}_{i\in\mathbb{Z}}} be a chain, say {\Phi_i(x,y) = a_ix^2+b_ixy+c_iy^2} and the change of variables {x=Y}, {y=-X+\delta_i Y} converts {\Phi_i} into {\Phi_{i+1}} (i.e., {\Phi_{i+1} = \Phi_i\circ L_{\delta_i}}). Note that {a_{i+1} = c_i} for all {i\in\mathbb{Z}}. By Remark 5, since {\Phi_i} is reduced, it follows that the sign of {a_{i+1}=c_i} is opposite to the sign of {a_i}. In particular, up to replacing each {\Phi_i} by its right neighbor {\Phi_{i+1}}, we can assume that {a_0>0}.

In this context, we write {\Phi_i(x,y) = (-1)^i A_i+B_ixy+(-1)^{i+1}A_{i+1}} with {A_i>0} for all {i\in\mathbb{Z}}. Since {\Phi_{i+1}=\Phi_i\circ L_{\delta_i}}, we have

\displaystyle B_{i+1}+B_i = 2g_i A_{i+1} \ \ \ \ \ (8)

where {g_i=(-1)^i\delta_i}. As {B_i>0} (because {\Phi_i} is reduced) and {A_i>0} for all {i\in\mathbb{Z}}, the previous equation implies that

\displaystyle g_i>0 \ \ \ \ \ (9)

for all {i\in\mathbb{Z}}. Moreover, {\Phi_{i+1}=\Phi_i\circ L_{\delta_i}} means that the roots of {a_j\omega^2+b_j\omega+c_j=0} satisfy

\displaystyle f_{i+1} = \delta_i - \frac{1}{f_i} \quad \textrm{and} \quad s_{i+1} = \delta_i - \frac{1}{s_i}.

In other words,

\displaystyle F_i = g_i+\frac{1}{F_{i+1}} \quad \textrm{and} \quad S_i = \frac{1}{g_{i-1}+S_{i-1}} \ \ \ \ \ (10)

where {F_i:=(-1)^i/f_i} and {S_i=(-1)^{i+1}/s_i}.

In particular, {F_i=[g_i; g_{i+1},\dots]:=g_i+\frac{1}{g_{i+1}+\frac{1}{\ddots}}} and {S_i = [0;g_{i-1},g_{i-2},\dots]} where {g_i\in\mathbb{N}^*} describe the entries of the continued fraction expansions of

\displaystyle \frac{1}{f_0} = [g_0;g_1,g_2,\dots] \quad \textrm{and} \quad -\frac{1}{s_0} = [0;g_{-1},g_{-2},\dots].

Moreover, the fact that {f_i=\frac{\sqrt{d}-B_i}{2(-1)^iA_i}} and {s_i=\frac{-\sqrt{d}-B_i}{2(-1)^iA_i}} (where {d} is the common discriminant of {\Phi_i}), we have

\displaystyle \begin{array}{rcl} [g_i; g_{i+1},\dots]+[0;g_{i-1},g_{i-2},\dots] &=& F_i+S_i = 2A_i\left(\frac{1}{\sqrt{d}-B_i}+\frac{1}{\sqrt{d}+B_i}\right) \\ &=& 2A_i\frac{2\sqrt{d}}{d-B_i^2} = \frac{4A_i\sqrt{d}}{-4(-1)^iA_i(-1)^{i+1}A_{i+1}} \\ &=& \frac{\sqrt{d}}{A_{i+1}}, \end{array}

that is

\displaystyle [g_i; g_{i+1},\dots]+[0;g_{i-1},g_{i-2},\dots] = \frac{\sqrt{d}}{|\Phi_{i+1}(1,0)|} \ \ \ \ \ (11)

2.5. Reduction theory (IV)

Coming back to our study of binary quadratic forms, we saw that, under our standing assumption, any binary quadratic form is equivalent to some chain of reduced forms.

The next fundamental result says that any equivalence class of binary quadratic forms contains an unique chain.

Theorem 8 Two equivalent reduced binary quadratic forms belong to the same chain.

As it turns out, one can deduce Theorem 1 directly from (11) and this theorem. For this reason, let us postpone the proof of Theorem 8 to the last portion of this section in order to complete the proof of Theorem 1.

2.6. End of proof of Theorem 1 (modulo Theorem 8)

Lagrange used Theorem 8 to obtain the following result:

Theorem 9 Let {\{\Phi_i\}_{i\in\mathbb{Z}}} be a chain of reduced binary quadratic forms with discriminant {d}. Then,

\displaystyle \bigcup\limits_{j\in\mathbb{Z}} \{\Phi_j(n,m): (n,m)\in\mathbb{Z}^2_{prim}, |\Phi_j(n,m)|<\sqrt{d}/2\}\subset \{\Phi_i(1,0):i\in\mathbb{Z}\}

Proof: Fix {j\in\mathbb{Z}} and {(n,m)\in\mathbb{Z}^2_{prim}} such that {a:=\Phi_j(n,m)} verifies

\displaystyle |a|<\sqrt{d}/2

Take {T\in SL_2(\mathbb{Z})} sending {(1,0)} to {(n,m)}. By definition, {a=\Phi_j\circ T(1,0)}. In other terms, {\Phi_j} is equivalent to {\Phi_j\circ T(x,y)=ax^2+Bxy+Cy^2}. Similarly to the proof of Theorem 5, we can take {k\in\mathbb{Z}} such that {b:=B+2ak} satisfies {\sqrt{d}-2|a|<b<\sqrt{d}}, so that the matrix {\left(\begin{array}{cc} 1 & k \\ 0 & 1 \end{array}\right)} converts {\Phi_j\circ T} into the equivalent binary quadratic form {Q(X,Y)=aX^2+bXY+cY^2}. Note that {Q} is reduced because {2|a|<\sqrt{d}} and {\sqrt{d}-2|a|<b<\sqrt{d}}.

In summary, {\Phi_j} and {Q} are equivalent reduced binary quadratic forms. By Theorem 8, {\Phi_j} and {Q} belong to the same chain, i.e., there exists {i\in\mathbb{Z}} such that {Q=\Phi_i}. By definition, this means that {\Phi_j(n,m):=a=\Phi_i(1,0)}. \Box

This result yields the following fact about the values of binary quadratic forms:

Proposition 10 Let {q(x,y)=ax^2+bxy+cy^2} be a binary quadratic form satisfying our standing assumptions. Denote by {\{\Phi_i\}_{i\in\mathbb{Z}}} the unique chain in the equivalence class of {q} (coming from Theorems 5 and 8). Then,

\displaystyle \inf\{|z|:z\in q(\mathbb{Z}^2_{prim})\} = \inf\{|\Phi_i(1,0)|: i\in\mathbb{Z}\}

Proof: Since {q(\mathbb{Z}^2_{prim})=q\circ T(\mathbb{Z}^2_{prim})} for all {T\in SL_2(\mathbb{Z}^2_{prim})}, there is no loss of generality in applying Theorem 5 in order to assume that {q(x,y)=ax^2+bxy+cy^2} is reduced, say {q=\Phi_j}. In this situation, {4|ac|=d-b^2<d}, so that {\min\{2|a|,2|c|\}<\sqrt{d}}, i.e., {\min\{|q(1,0)|, |q(0,1)|\}<\sqrt{d}/2}. Hence,

\displaystyle \inf\{|z|:z\in q(\mathbb{Z}^2_{prim})\} = \inf\{|z|:j\in\mathbb{Z}, z\in \Phi_j(\mathbb{Z}^2_{prim}), |z|<\sqrt{d}/2\}.

The desired conclusion now follows from Theorem 9. \Box

The proof of Theorem 1 is now very easy to complete: indeed, it suffices to put together (5), Proposition 10 and (11).

At this stage, it remains only to prove Theorem 8. This matter is the purpose of the remainder of this section.

2.7. Four basic properties of continued fractions

Before starting our discussion of Theorem 8, we need to recall the following four basic properties of continued fractions:

Proposition 11 Given a sequence {g_i\in\mathbb{N}^*}, {i\in\mathbb{N}}, let {p_k/q_k := [g_0;g_1,\dots,g_k]} be the {k}th convergent of {[g_0;g_1,\dots]}. Then:

  • one has the recurrence relation {\left(\begin{array}{cc} p_{k+1} & p_k \\ q_{k+1} & q_k\end{array}\right) \left(\begin{array}{cc} g_{k+2} & 1 \\ 1 & 0\end{array}\right) = \left(\begin{array}{cc} p_{k+2} & p_{k+1} \\ q_{k+2} & q_{k+1}\end{array}\right)} with the convention that {\left(\begin{array}{cc} p_{-1} & p_{-2} \\ q_{-1} & q_{-2}\end{array}\right) = \left(\begin{array}{cc} 1 & 0 \\ 0 & 1\end{array}\right)};
  • {[g_0;g_1,\dots,g_{k-1},z] = \frac{zp_{k-1}+p_{k-2}}{zq_{k-1}+q_{k-2}}};
  • {\frac{q_k}{q_{k-1}}=[g_k;g_{k-1},\dots, g_1]} and {\frac{p_k}{p_{k-1}} = [g_k;g_{k-2},\dots, g_0]}.
  • {p_{k+1}q_k-p_kq_{k+1}=(-1)^k};

The reader can prove this elementary proposition along the following lines. An induction argument based on the fact that {[t_0;t_1,\dots, t_n, t_{n+1}] = [t_0;t_1,\dots, t_n+\frac{1}{t_{n+1}}]} gives the recurrence relations {p_{k+2} = g_{k+2}p_{k+1}+p_k} and {q_{k+2}=g_{k+2}q_{k+1}+q_k} implying the first three items of the proposition. Finally, the third item is a consequence of the first item because the matrix {\left(\begin{array}{cc} \ast & 1 \\ 1 & 0\end{array}\right)} has determinant {-1}. For more details, the reader is invited to consult the book of Cusick–Flahive and/or Dickson.

2.8. Reduction theory (IV) bis: proof of Theorem 8

Let {q(x,y)=ax^2+bxy+cy^2} and {Q(x,y)=Ax^2+Bxy+Cy^2} be two equivalent reduced binary quadratic forms. By replacing {q} and/or {Q} by their right neighbors if necessary, we can assume that {a, A>0}. In this case, the 1st and 2nd roots of {a\omega^2+b\omega+c=0=A\Omega^2+B\Omega+C} satisfy {0< f, F <1} and {s, S < -1}.

Denote by {T=\left(\begin{array}{cc} \alpha & \beta \\ \gamma & \delta \end{array}\right)\in SL_2(\mathbb{Z})} the matrix realizing the equivalence between {q} and {Q=q\circ T}, so that {f=\frac{\alpha F+\beta}{\gamma F +\delta}} and {s=\frac{\alpha F+\beta}{\gamma F +\delta}} (cf. (7)).

Since there is nothing to prove when {T} is the identity matrix (i.e., {q=Q}), we will assume that {T\neq\textrm{Id}}. Up to replacing {T} by {-T} if necessary, we can further assume that either {\alpha>0} or {\alpha=0} and {\gamma>0}.

Proposition 12 One has {\alpha>0}.

Proof: Indeed, {\alpha=0} and {\gamma>0} would imply that {\gamma=1=-\beta} (because {T\in SL_2(\mathbb{Z})}). Therefore, {f = -\frac{1}{F+\delta}} and {s=-\frac{1}{S+\delta}}, i.e.,

\displaystyle -\delta = f+\frac{1}{F}>1 \quad \textrm{and} \quad \delta=-S-\frac{1}{s}>1,

a contradiction. \Box

Since {\frac{1}{f}=\frac{\gamma F+\delta}{\alpha F+\beta}}, {\frac{1}{s}=\frac{\gamma S+\delta}{\alpha S+\beta}} and {T\in SL_2(\mathbb{Z})}, one has

\displaystyle \left(\frac{\alpha}{f}-\gamma\right)(\alpha F+\beta) = 1 = \left(\frac{\alpha}{s}-\gamma\right)(\alpha S+\beta) \ \ \ \ \ (12)

Proposition 13 If {T} is not the identity matrix, then {\beta\gamma\neq0}.

Proof: If {\beta=0}, then {\alpha=\delta=1} (because {T\in SL_2(\mathbb{Z})} and Proposition 12 ensures that {\alpha>0}). This would imply that {(\frac{1}{s}-\gamma)S=1}, i.e.,

\displaystyle \gamma=\frac{1}{s}-\frac{1}{S} \quad \textrm{and} \quad -\gamma = \frac{1}{S}-\frac{1}{s}

Since {s, S<-1}, one would get {|\gamma|<1}, i.e., {\gamma=0}, so that {T} would be the identity matrix. Similarly, if {\gamma=0}, then one would have {\alpha=\delta=1}, so that {|\beta|=|f-F|}. Since {0<f,F<1}, this would force {|\beta|<1}, i.e., {\beta=0}, so that {T} would be the identity matrix. \Box

Proposition 14 If {T} is not the identity matrix, then {\beta\gamma>0}.

Proof: By the previous proposition, our task is to prove that {\beta>0} {\iff} {\gamma>0}.

If {\beta>0}, then {\alpha F+\beta>1} (as {F>0} and Proposition 12 says that {\alpha>0}). By (12), it follows that {\frac{\alpha}{f}-\gamma<1}. Since {0<f<1}, one gets that

\displaystyle \gamma+1>\alpha/f >\alpha\geq 1

that is, {\gamma>0}.

Similarly, if {\gamma>0}, then {\frac{\alpha}{s}-\gamma<-1}. By (12), one gets that {0>\alpha S+\beta>-1}. Thus, {\beta>0} (as {-S>1} and {\alpha\geq 1}). This completes the argument. \Box

The previous propositions allow us to suppose that {\alpha>0} and {\beta\gamma>0}. Since {\alpha\delta=\beta\gamma+1} (as {T\in SL_2(\mathbb{Z})}), we also have {\delta>0}. Moreover, up to exchanging the roles of {q} and {Q}, i.e., replacing {T=\left(\begin{array}{cc} \alpha & \beta \\ \gamma & \delta \end{array}\right)} by its inverse {T^{-1}=\left(\begin{array}{cc} \delta & -\beta \\ -\gamma & \alpha \end{array}\right)} if necessary, we can assume that {\beta}, {\gamma>0}. Finally, the proof of Proposition 14 also tells us that {\beta}, {\gamma\geq \alpha}. In particular, the fact that {\alpha\delta=\beta\gamma+1>\beta\gamma} says that {\beta}, {\gamma<\delta}.

In summary, we can (and do) assume from now on that

\displaystyle 0<\alpha\leq \beta, \gamma< \delta.

Therefore, {x=\alpha} and {y=\gamma} is a solution of the equation

\displaystyle \delta x-\beta y=1

with {0<x\leq\beta} and {0<y<\delta}.

Proposition 15 The equation {\delta x - \beta y=1} has an unique solution with {0<x\leq\beta} and {0<y<\delta}.

Proof: Since any solution of this equation satisfies {\delta(x-\alpha)=\beta(y-\gamma)} and we have that {gcd(\delta,\beta)=1} (as {T\in SL_2(\mathbb{Z})}), there exists {m\in\mathbb{Z}} with

\displaystyle x-\alpha = \beta m \quad \textrm{and} \quad y-\gamma = \delta m.

On the other hand, if {0<x\leq\beta} and {0<y<\delta}, then {|x-\alpha|<\beta} and {|y-\gamma|<\delta}. It follows that {m=0} and, a fortiori, {x-\alpha = 0 = y-\gamma}. This completes the proof. \Box We affirm that {\alpha} and {\gamma} are determined by a continued fraction expansion of {\delta/\beta}.

In fact, let us write {\delta/\beta=[g_0;g_1,\dots, g_{i-1}]} with {i} even and {g_k\geq 1} for all {0\leq k\leq i-1}: this is always possible because if {i} is odd, then we replace the term {g_{i-2}+\frac{1}{g_{i-1}}} in the continued fraction expansion by {g_{i-2}+1}, resp. {g_{i-2}+\frac{1}{g_{i-1}-1+\frac{1}{1}}} when {g_{i-1}=1}, resp. {g_{i-1}>1}. Denoting by

\displaystyle \frac{y}{x} := [g_0;g_1,\dots, g_{i-2}],

we have from Proposition 11 that {\delta x - \beta y=(-1)^{i-2}=1} and {0<x<\beta}, {0<y<\delta}. By Proposition 15, this implies that {x=\alpha} and {y=\gamma}, i.e.,

\displaystyle \frac{\gamma}{\alpha}=[g_0;g_1,\dots, g_{i-2}].

Moreover, Proposition 11 also tells us that

\displaystyle [g_0; g_1,\dots, g_{i-1}, 1/F] = \frac{\delta/F+\gamma}{\beta/F+\alpha} = \frac{1}{f}.

Since {1/F>1}, we have that {g_k}, {0\leq k\leq i-1}, are the first entries of the continued fraction expansion of {1/f}. Furthermore, (10) says that {1/F = (-1)^i/f_i = 1/f_i} where {f_i} is the first root of the {i}th element {\Phi_i} of the chain containing {\Phi_0:=q}. In other words, {Q} and {\Phi_i} have the same discriminant and the same first root {F=f_i}.

At this point, we are ready to complete the proof of Theorem 8. For this sake, it suffices to establish the following fact:

Proposition 16 One has {Q=\Phi_i}. In particular, {Q} and {q} belong to the same chain.

Proof: We saw that {Q} and {\Phi_i} share the same discriminant and 1st root. Since a binary quadratic form is determined by its discriminant and its 1st and 2nd roots (cf. Remark 3), our task is simply to show that {Q} and {\Phi_i} have the same 2nd root {S=s_i}.

By Proposition 11, the fact that {\delta/\beta=[g_0;g_1,\dots, g_{i-1}]} and {\frac{\gamma}{\alpha}=[g_0;g_1,\dots, g_{i-2}]} implies {\frac{\beta}{\alpha}=[g_{i-1};g_{i-2},\dots, g_1]}, {\frac{\delta}{\gamma}=[g_{i-1};g_{i-2},\dots, g_0]} and, hence,

\displaystyle [g_{i-1};g_{i-2},\dots, g_0, \frac{1}{S_0}]:= [g_{i-1};g_{i-2},\dots, g_0, -s] = \frac{-s\delta+\beta}{-s\gamma+\alpha}=-S.

Since {-s>1}, it follows from (10) that {-S=(-1)^{i+1}s_i=-s_i}, i.e., {S=s_i}. \Box


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Categories

%d bloggers like this: