Category: Uncategorized

  • Why is the fundamental theorem of arithmetic a theorem?

    The fundamental theorem of arithmetic states that every natural number can be factorized uniquely as a product of prime numbers. The word “uniquely” here means unique up to rearranging. The theorem means that if you and I take the same number \(n\) and I write \(n = p_1p_2\ldots p_k\) and you write \(n = q_1q_2\ldots q_l\) where each \(p_i\) and \(q_i\) is a prime number, then in fact \(k=l\) and we wrote the same prime numbers (but maybe in a different order).

    Most people happily accept this theorem as self evident and believe it without proof. Indeed some people take it to be so self evident they feel it doesn’t really deserve the name “theorem” – hence the title of this blog post. In this post I want to highlight two situations where an analogous theorem fails.

    Situation One: The Even Numbers

    Imagine a world where everything comes in twos. In this world nobody knows of the number one or indeed any odd number. Their counting numbers are the even numbers \(\mathbb{E} = \{2,4,6,8,\ldots\}\). People in this world can add numbers and multiply numbers just like we can. They can even talk about divisibility, for example \(2\) divides \(8\) since \(8 = 4\cdot 2\). Note that things are already getting a bit strange in this world. Since there is no number one, numbers in this world do not divide themselves.

    Once people can talk about divisibility, they can talk about prime numbers. A number is prime in this world if it is not divisible by any other number. For example \(2\) is prime but as we saw \(8\) is not prime. Surprisingly the number \(6\) is also prime in this world. This is because there are no two even numbers that multiply together to make \(6\).

    If a number is not prime in this world, we can reduce it to a product of primes. This is because if \(n\) is not prime, then there are two number \(a \) and \(b\) such that \(n = ab\). Since \(a\) and \(b\) are both smaller than \(n\), we can apply the same argument and recursively write \(n\) as a product of primes.

    Now we can ask whether or not the fundamental theorem of arthimetic holds in this world. Namely we want to know if their is a unique way to factorize each number in this world. To get an idea we can start with some small even numbers.

    • \(2\) is prime.
    • \(4 = 2 \cdot 2\) can be factorized uniquely.
    • \(6\) is prime.
    • \(8 = 2\cdot 2 \cdot 2\) can be factorized uniquely.
    • \(10\) is prime.
    • \(12 = 2 \cdot 6\) can be factorized uniquely.
    • \(14\) is prime.
    • \(16 = 2\cdot 2 \cdot 2 \cdot 2\) can be factorized uniquely.
    • \(18\) is prime.
    • \(20 = 2 \cdot 10\) can be factorized uniquely.

    Thus it seems as though there might be some hope for this theorem. It at least holds for the first handful of numbers. Unfortunately we eventually get to \(36\) and we have:

    \(36 = 2 \cdot 18\) and \(36 = 6 \cdot 6\).

    Thus there are two distinct ways of writing \(36\) as a product of primes in this world and thus the fundamental theorem of arithmetic does not hold.

    Situtation Two: A Number Ring

    While the first example is fun and interesting, it is somewhat artificial. We are unlikely to encounter a situation where we only have the even numbers. It is however common and natural for mathematicians to be lead into certain worlds called number rings. We will see one example here and see what an effect the fundamental theorem of arithmetic can have.

    Consider wanting to solve the equation \(x^2+19=y^3\) where \(x\) and \(y\) are both integers. One way to try to solve this is by rewriting the equation as \((x+\sqrt{-19})(x-\sqrt{-19}) = y^3\). With this rewriting we have left the familiar world of the whole numbers and entered the number ring \(\mathbb{Z}[\sqrt{-19}]\).

    In \(\mathbb{Z}[\sqrt{-19}]\) all numbers have the form \(a + b \sqrt{-19}\), where \(a\) and \(b\) are integers. Addition of two such numbers is defined like so

    \((a+b\sqrt{-19}) + (c + d \sqrt{-19}) = (a+c) + (b+d)\sqrt{-19}\).

    Multiplication is define by using the distributive law and the fact that \(\sqrt{-19}^2 = -19\). Thus

    \((a+b\sqrt{-19})(c+d\sqrt{-19}) = (ac-19bd) + (ad+bc)\sqrt{-19}\).

    Since we have multiplication we can talk about when a number in \(\mathbb{Z}[\sqrt{-19}]\) divides another and hence define primes in \(\mathbb{Z}[\sqrt{-19}]\). One can show that if \(x^2 + 19 = y^3\), then \(x+\sqrt{-19}\) and \(x-\sqrt{-19}\) are coprime in \(\mathbb{Z}[\sqrt{-19}]\) (see the references at the end of this post).

    This means that there are no primes in \(\mathbb{Z}[\sqrt{-19}]\) that divides both \(x+\sqrt{-19}\) and \(x-\sqrt{-19}\). If we assume that the fundamental theorem of arthimetic holds in \(\mathbb{Z}[\sqrt{-19}]\), then this implies that \(x+\sqrt{-19}\) must itself be a cube. This is because \((x+\sqrt{-19})(x-\sqrt{-19})=y^3\) is a cube and if two coprime numbers multiply to be a cube, then both of those coprime numbers must be cubes.

    Thus we can conclude that there are integers \(a\) and \(b\) such that \(x+\sqrt{-19} = (a+b\sqrt{-19})^3 \). If we expand out this cube we can conclude that

    \(x+\sqrt{-19} = (a^3-57ab^2)+(3a^2b-19b^3)\sqrt{-19}\).

    Thus in particular we have \(1=3a^2b-19b^3=(3a^2-19b^2)b\). This implies that \(b = \pm 1\) and \(3a^2-19b^2=\pm 1\). Hence \(b^2=1\) and \(3a^2-19 = \pm 1\). Now if \(3a^2 -19 =-1\), then \(a^2=6\) – a contradiction. Similarly if \(3a^2-19=1\), then \(3a^2=20\) – another contradiction. Thus we can conclude there are no integer solutions to the equation \(x^2+19=y^3\)!

    Unfortunately however, a bit of searching reveals that \(18^2+19=343=7^3\). Thus simply assuming that that the ring \(\mathbb{Z}[\sqrt{-19}]\) has unique factorization led us to incorrectly conclude that an equation had no solutions. The question of unique factorization in number rings such as \(\mathbb{Z}[\sqrt{-19}]\) is a subtle and important one. Some of the flawed proofs of Fermat’s Last Theorem incorrectly assume that certain number rings have unique factorization – like we did above.

    References

    The lecturer David Smyth showed us that the even integers do not have unique factorization during a lecture of the great course MATH2222.

    The example of \(\mathbb{Z}[\sqrt{-19}]\) failing to have unique factorization and the consequences of this was shown in a lecture for a course on algebraic number theory by James Borger. In this class we followed the (freely available) textbook “Number Rings” by P. Stevenhagen. Problem 1.4 on page 8 is the example I used in this post. By viewing the textbook you can see a complete solution to the problem.

  • Combing groups

    Two weeks ago I gave talk titled “Two Combings of \(\mathbb{Z}^2\)”. The talk was about some material I have been discussing lately with my honours supervisor. The talk went well and I thought it would be worth sharing a written version of what I said.

    Geometric Group Theory

    Combings are a tool that gets used in a branch of mathematics called geometric group theory. Geometric group theory is a relatively new area of mathematics and is only around 30 years old. The main idea behind geometric group theory is to use tools and ideas from geometry and low dimensional topology to study and understand groups. It turns out that some of the simplest questions one can ask about groups have interesting geometric answers. For instance, the Dehn function of a group gives a natural geometric answer to the solvability of the word problem.

    Generators

    Before we can define what a combing is we’ll need to set up some notation. If \(A\) is a set then we will write \(A^*\) for the set of words written using elements of \(A\) and inverses of elements of \(A\). For instance if \(A = \{a,b\}\), then \(A^* = \{\varepsilon, a,b,a^{-1},b^{-1},aa=a^2,abb^{-1}a^{-3},\ldots\}\) (here \(\varepsilon\) refers to the empty word). If \(w\) is a word in \(A^*\), we will write \(l(w)\) for the length of \(w\). Thus \(l(\varepsilon)=0, l(a)=1, l(abb^{-1}a^{-3})=6\) and so on.

    If \(G\) is a group and \(A\) is a subset of \(G\), then we have a natural map \(\pi : A^* \rightarrow G\) given by:

    \(\pi(a_1^{\pm 1}a_2^{\pm 1}\ldots a_n^{\pm 1}) = a_1^{\pm 1}\cdot a_2^{\pm 1} \cdot \ldots \cdot a_n^{\pm 1}\).

    We will say that \(A\) generates \(G\) if the above map is surjective. In this case we will write \(\overline{w}\) for \(\pi(w)\) when \(w\) is a word in \(A^*\).

    The Word Metric

    The geometry in “geometric group theory” often arises when studying how a group acts on different geometric spaces. A group always acts on itself by left multiplication. The following definition adds a geometric aspect to this action. If \(G\) is a group with generators \(A\), then the word metric on \(G\) with respect to \(A\) is the function \(d_G : G \times G \rightarrow \mathbb{R}\) given by

    \(d_G(g,h) = \min \{ l(w) \mid w \in A^*, \overline{w} = g^{-1}h \}.\)

    That, is the distance between two group elements \(g,h \in G\) is the length of the shortest word in \(A^*\) we can use to represent \(g^{-1}h\). Equivalently the distance between \(g\) and \(h\) is the length of the shortest word we have to append to \(g\) to produce \(h\). This metric is invariant under left-multiplication by \(G\) (ie \(d_G(g\cdot h,g\cdot h’) =d_G(h,h’)\) for all \(g,h,h’ \in G\)). Thus \(G\) acts on \((G,d_G)\) by isometries.

    Words are Paths

    Now that we are viewing the group \(G\) as a geometric space, we can also change how we think of words \(w \in A^*\). Such a word can be thought of as discrete path in \(G\). That is we can think of \(w\) as a function from \(\mathbb{N}\) to \(G\). This way of thinking of \(w\) as a discrete path is best illuminated with an example. Suppose we have the word \(w = ab^2a^{-1}b\), then

    \(w(0) = e,\)
    \(w(1) = \overline{a}\)
    \(w(2) = \overline{ab}\)
    \(w(3) = \overline{ab^2}\)
    \(w(4) = \overline{ab^2a^{-1}}\)
    \(w(5) = \overline{ab^2a^{-1}b}\)
    \(w(t) = \overline{ab^2a^{-1}b},\) \(t \ge 5\).

    Thus the path \(w : \mathbb{N} \rightarrow G\) is given by taking the first \(t\) letters of \(w\) and mapping this word to the group element it represents. With this interpretation of word in \(A^*\) in mind we can now define combings.

    Combings

    Let \(G\) be a group with a finite set of generators \(A\). Then a combing of \(G\) with respect to \(A\) is a function \(\sigma : G \rightarrow A^*\) such that

    1. For all \(g \in G\), \(\overline{\sigma_g} = g\) (we will write \(\sigma_g\) for \(\sigma(g))\).
    2. There exists \(k >0\) such that for all \( g, h \in G\) with \(g \cdot \overline{a} = h\) for some \(a \in A\), we have that \(d_G(\sigma_g(t),\sigma_h(t)) \le k\) for all \(t \in \mathbb{N}\).

    The first condition says that we can think of \(\sigma\) as a way of picking a normal form \(\sigma_g \in A^*\) for each \(g \in G\). The second condition is a bit more involved. It states that if the group elements \(g, h \in G\) are distance 1 from each other in the word metric, then the paths \(\sigma_g,\sigma_h : \mathbb{N} \rightarrow G\) are within distance \(k\) of each other at any point in time.

    An Example

    Not all groups can be given a combing. Indeed if we have a combing of \(G\), then the word problem in \(G\) is solvable and the Dehn function of \(G\) is at most exponential. One group that does admit a combing is \(\mathbb{Z}^2 = \{(m,n) \mid m,n \in \mathbb{Z}\}\). This group is generated by \(A = \{(1,0),(0,1)\} = \{\beta,\gamma\}\) and one combing of \(\mathbb{Z}^2\) with respect to this generating set is

    \(\sigma_{(m,n)} = \beta^m\gamma^n\).

    The first condition of being a combing is clearly satisfied and the following picture shows that the second condition can be satisfied with \(k = 2\).

    A Non-Example

    The discrete Heisenberg group, \(H_3\), can be given by the following presentation

    \(H_3 = \langle \alpha,\beta,\gamma \mid \alpha\gamma = \gamma\alpha, \beta\gamma = \gamma\beta, \beta\alpha = \alpha\beta\gamma \rangle\).

    That is, the group \(H_3\) has three generators \(\alpha,\beta\) and \(\gamma\). The generator \(\gamma\) commutes with both \(\alpha\) and \(\beta\). The generators \(\alpha\) and \(\beta\) almost commute but don’t quite as seen in the relation \(\beta\alpha = \alpha\beta\gamma\).

    Any \(h \in H_3\) can be represented uniquely as \(\sigma_h = \alpha^p\beta^m\gamma^n\) for \(p,m,n \in \mathbb{Z}\). To see why such a representation exists it’s best to consider an example. Suppose that \(h = \overline{\gamma\beta\alpha\gamma\alpha\beta\gamma}\). Then we can use the fact that \(\gamma\) commutes with \(\alpha\) and \(\beta\) to push all \(\gamma\)’s to the right and we get that \(h = \overline{\beta\alpha\alpha\beta\gamma^3}\). We can then apply the third relation to switch the order of \(\alpha\) and \(\beta\) on the right. This gives us that that \(h = \overline{\alpha\beta\gamma\alpha\beta\gamma^3}=\overline{\alpha\beta\alpha\beta\gamma^4}\). If we apply this relation once more we get that \(h = \overline{\alpha^2\beta^2\gamma^5}\) and thus \(\sigma_h = \alpha^2\beta^2\gamma^5\). The procedure used to write \(h\) in the form \(\alpha^p\beta^m\gamma^n\) can be generalized to any word written using \(\alpha^{\pm 1}, \beta^{\pm 1}, \gamma^{\pm 1}\).

    The fact the such a representation is unique (that is if \(\overline{\alpha^p\beta^m\gamma^n} = \overline{\alpha^{p’}\beta^{m’}\gamma^{n’}}\), then \((p,m,n) = (p’,m’,n’)\)) is harder to justify but can be proved by defining an action of \(H_3\) on \(\mathbb{Z}^3\). Thus we can define a function \(\sigma : H_3 \rightarrow \{\alpha,\beta,\gamma\}^*\) by setting \(\sigma_h\) to be the unique word of the form \(\alpha^p \beta^m\gamma^n\) that represents \(h\). This map satisfies the first condition of being a combing and has many nice properties. These include that it is easy to check whether or not a word in \(\{\alpha,\beta,\gamma\}^*\) is equal to \(\sigma_h\) for some \(h \in H_3\) and there are fast algorithms for putting a word in \(\{\alpha,\beta,\gamma\}^*\) into its normal form. Unfortunately this map fails to be a combing.

    The reason why \(\sigma : H_3 \rightarrow \{\alpha,\beta,\gamma\}^*\) fails to be a combing can be seen back when we turned \(\overline{ \gamma\beta\alpha\gamma\alpha\beta\gamma }\) into \( \overline{\alpha^2\beta^2\gamma^5} \). To move \(\alpha\)’s on the right to the left we had to move past \(\beta\)’s and produce \(\gamma\)’s in the process. More concretely fix \(m \in \mathbb{N}\) and let \(h = \overline{\beta^m}\) and \(g = \overline{\beta^m \alpha} = \overline{\alpha \beta^m\gamma^m}\). We have \(\sigma_h = \beta^m\) and \(\sigma_g = \alpha \beta^m \gamma^m\). The group elements \(g\) and \(h\) differ by a generator. Thus, if \(\sigma\) was a combing we should be able to uniformly bound \(d_{H_3}(\sigma_g(t),\sigma_h(t))\) for all \(t \in \mathbb{N}\) and all \(m \in \mathbb{N}\).

    If we then let \(t = m+1\), we can recall that

    \(d_{H_3}(\sigma_g(t),\sigma_h(t)) = \min\{l(w) \mid w \in \{\alpha,\beta,\gamma\}^*, \overline{w} = \sigma_g(t)^{-1}\sigma_h(t)\}.\)

    We have that \(\sigma_h(t) = \overline{\beta^m}\) and \(\sigma_g(t) = \overline{\alpha \beta^m}\) and thus

    \(\sigma_g(t)^{-1}\sigma_h(t) = (\overline{\alpha\beta^m})^{-1}\overline{\beta^m} = \overline{\beta^{-m}\alpha^{-1}\beta^m} = \overline{\alpha^{-1}\beta^{-m}\gamma^{m}\beta^m}=\overline{\alpha^{-1}\gamma^{m}}.\)

    The group element \(\overline{\alpha^{-1}\gamma^{m}}\) cannot be represented as a shorter element of \(\{\alpha,\beta,\gamma\}^*\) and thus \(d_{H_3}(\sigma_g(t),\sigma_h(t)) = m+1\) and the map \(\sigma \) is not a combing.

    Can we comb the Heisenberg group?

    This leaves us with a question, can we comb the group \(H_3\)? It turns out that we can but the answer actually lies in finding a better combing of \(\mathbb{Z}^2\). This is because \(H_3\) contains the subgroup \(\mathbb{Z}^2 \cong \langle \beta, \gamma \rangle \subseteq H_3\). Rather than using the normal form \(\sigma_h = \alpha^p \beta^m \gamma^n\), we will use \(\sigma’_h = \alpha^p \tau_{(m,n)}\) where \(\tau : \mathbb{Z}^2 \rightarrow \{\beta,\gamma \}^*\) is a combing of \(\mathbb{Z}^2\) that is more symmetric. The word \(\tau_{(m,n)}\) is defined to be the sequence of \(m \) \(\beta\)’s and \(n\) \(\gamma\)’s that stays closest to the straight line in \(\mathbb{R}^2\) that joins \((0,0)\) to \((n,p)\) (when we view \(\beta\) and \(\gamma\) as representing \((1,0)\) and \((0,1)\) respectively). Below is an illustration:

    This new function isn’t quite a combing of \(H_3\) but it is the next best thing! It is an asynchronous combing. An asynchronous combing is one where we again require that the paths \(\sigma_h,\sigma_g\) stay close to each other whenever \(h\) and \(g\) are close to each other. However we allow the paths \(\sigma_h\) and \(\sigma_g\) to travel at different speeds. Many of the results that can be proved for combable groups extend to asynchronously combable groups.

    References

    Hairdressing in Groups by Sarah Rees is a survey paper that includes lots examples of groups that do or do not admit combings. It also talks about the language complexity of a combing, something I didn’t have time to touch on in my talk.

    Combings of Semidirect Products and 3-Manifold Groups by Martin Bridson contains a proof that \(H_3\) is asynchornously combable. He actually proves the more general result that any group of the form \(\mathbb{Z}^n \rtimes \mathbb{Z}^m\) is asynchronously combable.

    Thank you to my supervisor, Tony Licata, for suggesting I give my talk on combing \(H_3\) and for all the support he has given me so far.