Recent Posts

  • Why is the fundamental theorem of arithmetic a theorem?

    The fundamental theorem of arithmetic states that every natural number can be factorized uniquely as a product of prime numbers. The word “uniquely” here means unique up to rearranging. The theorem means that if you and I take the same number \(n\) and I write \(n = p_1p_2\ldots p_k\) and you write \(n = q_1q_2\ldots q_l\) where each \(p_i\) and \(q_i\) is a prime number, then in fact \(k=l\) and we wrote the same prime numbers (but maybe in a different order).

    Most people happily accept this theorem as self evident and believe it without proof. Indeed some people take it to be so self evident they feel it doesn’t really deserve the name “theorem” – hence the title of this blog post. In this post I want to highlight two situations where an analogous theorem fails.

    Situation One: The Even Numbers

    Imagine a world where everything comes in twos. In this world nobody knows of the number one or indeed any odd number. Their counting numbers are the even numbers \(\mathbb{E} = \{2,4,6,8,\ldots\}\). People in this world can add numbers and multiply numbers just like we can. They can even talk about divisibility, for example \(2\) divides \(8\) since \(8 = 4\cdot 2\). Note that things are already getting a bit strange in this world. Since there is no number one, numbers in this world do not divide themselves.

    Once people can talk about divisibility, they can talk about prime numbers. A number is prime in this world if it is not divisible by any other number. For example \(2\) is prime but as we saw \(8\) is not prime. Surprisingly the number \(6\) is also prime in this world. This is because there are no two even numbers that multiply together to make \(6\).

    If a number is not prime in this world, we can reduce it to a product of primes. This is because if \(n\) is not prime, then there are two number \(a \) and \(b\) such that \(n = ab\). Since \(a\) and \(b\) are both smaller than \(n\), we can apply the same argument and recursively write \(n\) as a product of primes.

    Now we can ask whether or not the fundamental theorem of arthimetic holds in this world. Namely we want to know if their is a unique way to factorize each number in this world. To get an idea we can start with some small even numbers.

    • \(2\) is prime.
    • \(4 = 2 \cdot 2\) can be factorized uniquely.
    • \(6\) is prime.
    • \(8 = 2\cdot 2 \cdot 2\) can be factorized uniquely.
    • \(10\) is prime.
    • \(12 = 2 \cdot 6\) can be factorized uniquely.
    • \(14\) is prime.
    • \(16 = 2\cdot 2 \cdot 2 \cdot 2\) can be factorized uniquely.
    • \(18\) is prime.
    • \(20 = 2 \cdot 10\) can be factorized uniquely.

    Thus it seems as though there might be some hope for this theorem. It at least holds for the first handful of numbers. Unfortunately we eventually get to \(36\) and we have:

    \(36 = 2 \cdot 18\) and \(36 = 6 \cdot 6\).

    Thus there are two distinct ways of writing \(36\) as a product of primes in this world and thus the fundamental theorem of arithmetic does not hold.

    Situtation Two: A Number Ring

    While the first example is fun and interesting, it is somewhat artificial. We are unlikely to encounter a situation where we only have the even numbers. It is however common and natural for mathematicians to be lead into certain worlds called number rings. We will see one example here and see what an effect the fundamental theorem of arithmetic can have.

    Consider wanting to solve the equation \(x^2+19=y^3\) where \(x\) and \(y\) are both integers. One way to try to solve this is by rewriting the equation as \((x+\sqrt{-19})(x-\sqrt{-19}) = y^3\). With this rewriting we have left the familiar world of the whole numbers and entered the number ring \(\mathbb{Z}[\sqrt{-19}]\).

    In \(\mathbb{Z}[\sqrt{-19}]\) all numbers have the form \(a + b \sqrt{-19}\), where \(a\) and \(b\) are integers. Addition of two such numbers is defined like so

    \((a+b\sqrt{-19}) + (c + d \sqrt{-19}) = (a+c) + (b+d)\sqrt{-19}\).

    Multiplication is define by using the distributive law and the fact that \(\sqrt{-19}^2 = -19\). Thus

    \((a+b\sqrt{-19})(c+d\sqrt{-19}) = (ac-19bd) + (ad+bc)\sqrt{-19}\).

    Since we have multiplication we can talk about when a number in \(\mathbb{Z}[\sqrt{-19}]\) divides another and hence define primes in \(\mathbb{Z}[\sqrt{-19}]\). One can show that if \(x^2 + 19 = y^3\), then \(x+\sqrt{-19}\) and \(x-\sqrt{-19}\) are coprime in \(\mathbb{Z}[\sqrt{-19}]\) (see the references at the end of this post).

    This means that there are no primes in \(\mathbb{Z}[\sqrt{-19}]\) that divides both \(x+\sqrt{-19}\) and \(x-\sqrt{-19}\). If we assume that the fundamental theorem of arthimetic holds in \(\mathbb{Z}[\sqrt{-19}]\), then this implies that \(x+\sqrt{-19}\) must itself be a cube. This is because \((x+\sqrt{-19})(x-\sqrt{-19})=y^3\) is a cube and if two coprime numbers multiply to be a cube, then both of those coprime numbers must be cubes.

    Thus we can conclude that there are integers \(a\) and \(b\) such that \(x+\sqrt{-19} = (a+b\sqrt{-19})^3 \). If we expand out this cube we can conclude that

    \(x+\sqrt{-19} = (a^3-57ab^2)+(3a^2b-19b^3)\sqrt{-19}\).

    Thus in particular we have \(1=3a^2b-19b^3=(3a^2-19b^2)b\). This implies that \(b = \pm 1\) and \(3a^2-19b^2=\pm 1\). Hence \(b^2=1\) and \(3a^2-19 = \pm 1\). Now if \(3a^2 -19 =-1\), then \(a^2=6\) – a contradiction. Similarly if \(3a^2-19=1\), then \(3a^2=20\) – another contradiction. Thus we can conclude there are no integer solutions to the equation \(x^2+19=y^3\)!

    Unfortunately however, a bit of searching reveals that \(18^2+19=343=7^3\). Thus simply assuming that that the ring \(\mathbb{Z}[\sqrt{-19}]\) has unique factorization led us to incorrectly conclude that an equation had no solutions. The question of unique factorization in number rings such as \(\mathbb{Z}[\sqrt{-19}]\) is a subtle and important one. Some of the flawed proofs of Fermat’s Last Theorem incorrectly assume that certain number rings have unique factorization – like we did above.

    References

    The lecturer David Smyth showed us that the even integers do not have unique factorization during a lecture of the great course MATH2222.

    The example of \(\mathbb{Z}[\sqrt{-19}]\) failing to have unique factorization and the consequences of this was shown in a lecture for a course on algebraic number theory by James Borger. In this class we followed the (freely available) textbook “Number Rings” by P. Stevenhagen. Problem 1.4 on page 8 is the example I used in this post. By viewing the textbook you can see a complete solution to the problem.

  • Cayley Graphs and Cakes

    Over the past month I have been studying at the AMSI Summer School at La Trobe University in Melbourne. Eight courses are offered at the AMSI Summer School and I took the one on geometric group theory. Geometric group theory is also the subject of my honours thesis and a great area of mathematics. I previously wrote about some ideas from geometric group theory here.

    One of the main ideas in geometric group theory is to take a finitely generated group and turn it into a geometric object by constructing a Cayley graph (or in Germany, a Dehn gruppenbild or Dehn group picture).

    If \(G\) is a group with a generating set \(A\), then the Cayley graph of \((G,A)\) has the elements of the group as vertices and for each group element \(g \in G\) and generator \(a \in A\), there is an edge from \(g\) to \(ga\).

    Cayley graphs can be very pretty and geometrical interesting. In the final week of the course, our homework was to creatively make a Cayley graph. Here’s a sample of the Cayley graphs we made.

    With a friend I baked a cake and decorated it with the Cayley graph of the group \(C_2 * C_3 \cong \langle a ,b \mid a^2,b^3 \rangle\) with respect to the generating set \(\{a,b\}\). We were really happy with how it looked and tasted and are proud to say that the whole thing got eaten at a BBQ for the summer school students.

    Staying with the food theme, a friend use grapes and skewers to make their Cayley graph. It’s a graph of the discrete Heisenberg group \(\langle a,b,c \mid ac=ca, bc=cb, ba=abc \rangle\). I was amazed at the structural integrity of the grapes. There’s a video about this Cayley graph that you can watch here (alternatively it’s the first result if you seach “Heisenberg group grapes”).

    This Cayley graph is made out of paper and shows how picking a different generating set \(A\) can change the appearance of the Cayley graph. It is a Cayley graph of \(\mathbb{Z}^2\). Normally this Cayley graph is drawn with respect to the generating set \(\{(1,0),(0,1)\}\) and Cayley graph looks like the square lattice on the right. The Cayley graph on the left was made with respect to the generating set \(\{(1,0),(0,1),(1,1)\}\) and the result is a triangular tiling of the plane. Note that while the two Cayley graphs of \(\mathbb{Z}^2\) look quite different, they have some important shared properties. In particular, both of them are two dimensional and flat. (The second image is taken from here)

    Someone also made a Cayley graph of the Baumslag Solitar group \(\text{BS}(1,2) \cong \langle a,t \mid tat^{-1}=a^2 \rangle\) with respect to \(\{a,t\}\). This was a group that came up frequently in the course as it can be used to construct a number of surprising counter examples.

    Finally, my favourite Cayley graph of the day was the incredibly pretty Cayley graph of the Coxeter group

    \(\langle a,b,c,d \mid a^2, b^2, c^2, d^2, (ab)^3, (bc)^3, (ac)^2, (ad)^2, (cd)^2 \rangle\)

    with respect to the set \(\{a,b,c,d\}\).

    I’d like to thank both AMSI and La Trobe for putting on the Summer School and the three geometric group theory lecturers Alejandra Garrido, Murray Elder and Lawrence Reeves for teaching a great course. A huge thanks to the other students for making some great Cayley graphs and letting me share some of them.

  • A Nifty Proof of the Cauchy-Schwarz Inequality

    This blog post is entirely based on the start of this blog post by Terry Tao. I highly recommend reading the post. It gives an interesting insight into how Terry sometimes thinks about proving inequalities. He also gives a number of cool and more substantial examples.

    The main idea in the blog post is that Terry likes to do “arbitrage” on an inequality to improve it. By starting with a weak inequality he exploits the symmetry of the environment he is working in to get better and better inequalities. He first illustrates this with a proof of the Cauchy-Schwarz inequality. The proof given is really nifty and much more memorable than previous proofs I’ve seen. I felt that just had to write it up and share it.

    Let \((V,\langle, \rangle)\) be an inner product space. The Cauchy-Schwarz inequality states that for all \(v,w \in V\), \(|\langle v, w \rangle | \le \|v\| \|w\|\). It’s an important result that leads, among other things, to a proof that \(\| \cdot \|\) satisfies the triangle inequality. There are many proofs of the Cauchy-Schwarz inequality but here is the one Terry presents.

    Since \(\langle \cdot, \cdot \rangle \) is positive definite we have \(0 \le \langle v-w,v-w\rangle\). Now using the fact that \(\langle \cdot, \cdot \rangle\) is additive in each coordinate we have

    \(0 \le \langle v,v \rangle -\langle v, w \rangle -\langle w,v\rangle+ \langle w,w \rangle\).

    Since \(\langle w,v\rangle = \overline{\langle v,w\rangle}\), we can rearrange the above expression to get the inequality

    \(\text{Re}(\langle v,w \rangle) \le \frac{1}{2}\left(\| v \|^2 + \|w\|^2\right)\).

    And now it is time to exploit the symmetry of the above expression and turn this inequality into the Cauchy-Schwarz inequality. The above inequality is worse than the Cauchy Schwarz inequality for two reasons. Firstly, unless \(\langle v, w \rangle\) is a positive real number, the left hand side is smaller than \(|\langle v,w \rangle|\). Secondly, unless \(\|v\|=\|w\|\), the right hand side is larger than the quantity \(\|v\|\|w\|\) that we want. Indeed we want the geometric mean of \(\|v\|^2\) and \(\|w\|^2\) whereas we currently have the arithmetic mean on the right.

    Note that the right hand side is invariant under the symmetry \(v \mapsto e^{i \theta} v\) for any real number \(\theta\). Thus choose \(\theta\) to be the negative of the argument of \(\langle v,w \rangle\). This turns the left hand side into \(|\langle v,w \rangle |\) while the right hand side remains invariant. Thus we have done our first bit of arbitrage and now have the improved inequality

    \(| \langle v,w \rangle | \le \frac{1}{2}\left(\|v\|^2 + \|w\|^2\right)\)

    We now turn our attention to the right hand side and observe that the left hand side is invariant under the map \((v,w) \mapsto \left(c\cdot v, \frac{1}{c} \cdot w\right)\) for any \(c > 0\). Thus by choosing \(c\) we can minimize the right hand side. A little bit of calculus shows that the best choice is \(c = \sqrt{\frac{\|w\|}{\|v\|}}\) (this is valid provided that \(v,w \neq 0\), the case when \(v=0\) or \(w=0\) is easy since we would then have \(\langle v,w \rangle=0\)). If we substitute in this optimal value of \(c\), the right hand side of the above inequality becomes

    \( \frac{1}{2}\left( \left\| \sqrt{\frac{\|w\|}{\|v\|}}\cdot v \right \|^2 +\left \| \sqrt{\frac{\|v\|}{\|w\|}} \cdot w\right \|^2 \right)=\frac{1}{2}\left(\frac{\|w\|}{\|v\|}\|v\|^2+\frac{\|v\|}{\|w\|}\|w\|^2 \right) = \|v\|\|w\|.\)

    Thus we have turned our weak starting inequality into the Cauchy-Schwarz inequality! Again I recommend reading Terry’s original post to see many more examples of this sort of arbitrage and symmetry exploitation.

  • The Outer Automorphism of S6

    This blog post is about why the number 6 is my favourite number – because the symmetric group on six elements is the only finite permutation group that contains an outer automorphism. In the post we will first look at why there are no other symmetric groups with outer automorphisms and then we will define an outer automorphism of \(S_6\).

    Inner and Outer Automorphisms

    If \(G\) is any group, then we can construct a new group \(\text{Aut}(G)\) consisting of all bijective group homomorphism \(\phi : G \to G\). This set forms a group under function composition. There is a natural map from \(G\) to \(\text{Aut}(G)\). A group element \(g \in G\) is taken to the function \(\gamma_g : G \to G\) given by \(\gamma_g(h) = ghg^{-1}\). Thus \(\gamma_g\) is conjugation by \(g\).

    The kernel of \(g \mapsto \gamma_g\) is called the center of \(G\) and is denoted by \(Z(G)\). That is \(Z(G)\) is the subgroup consisting of group elements that commute with every other group element. The image of this map is denoted by \(\text{Inn}(G) \subseteq \text{Aut}(G)\). An automorphism which is in \(\text{Inn}(G)\) is called an inner automorphism. One can check that the subgroup of inner automorphisms is a normal subgroup. Indeed if we have a group element \(g \in G\) and an automorphism \(\phi \in \text{Aut}(G)\), then \(\phi \circ \gamma_g \circ \phi^{-1} = \gamma_{\phi(g)}\). Thus we can form the quotient \(\text{Aut}(G)/\text{Inn}(G)\) which we will call the outer automorphisms of \(G\) and denote by \(\text{Out}(G)\).

    Thus the inner automorphisms of \(G\) are those which come from \(G\) in the form of conjugation. The outer automorphisms of \(G\) are equivalence classes of all the automorphisms of \(G\). Two automorphisms \(\phi, \phi’\) are equivalent in \(\text{Out}(G)\) if there is an inner automorphism \(\gamma_g\) such that \(\gamma_g \circ \phi = \phi’\).

    The Symmetric Group

    For a natural number \(n \in \mathbb{N}\), \(S_n\) will denote the group of permutations of the set \(\{1,2,\ldots,n\}\). Any permutation \(\sigma \in S_n\) can be written as product of disjoint cycles which we will call its cycle type. For instance, if \(n=5\) and \(\sigma(1)=3\), \(\sigma(2)=5\), \(\sigma(3)=4\), \(\sigma(4)=1\) and \(\sigma(5)=2\), then \(\sigma\) can be written in cycle notation as \((1,3,4)(2,5)\). Cycle notation is very useful for describing conjugation in \(S_n\). If we have two permutations \(\sigma, \tau \in S_n\) and we know the cycle notation for \(\sigma\), then the cycle notation for \(\tau \sigma \tau^{-1}\) can be derived by applying \(\tau\) to every entry in the cycle notation of \(\sigma\). For instance, with \(\sigma = (1,3,4)(2,5)\), we have

    \(\tau(1,3,4)(2,5)\tau^{-1} = (\tau(1),\tau(3),\tau(4))(\tau(2),\tau(5))\).

    To see why this result about conjugation is true, note the following. If \(\sigma(i)=j\) for some \(i,j \in \{1,2,\ldots,n\}\), then

    \((\tau\sigma\tau^{-1})\tau(i)=\tau(\sigma(\tau^{-1}(\tau(i))))=\tau(\sigma(i))=\tau(j)\).

    Thus two permutations \(\sigma, \sigma’\) are conjugate to one another if and only if \(\sigma\) and \(\sigma’\) have the same cycle type (ie they have the same number of cycles of any given size). This implies that for \(n \ge 3\), the centre of \(S_n\) must be trivial as for any non-identity \(\sigma \in S_n\), there is at least one other element of the same cycle type. Hence the map \(\sigma \mapsto \gamma_\sigma\) from \(S_n\) to \(\text{Aut}(S_n)\) is injective for \(n \ge 3\).

    Counting Conjugacy Classes

    Before constructing an outer automorphism for \(S_6\), we will see why such a map is special. In particular we will learn that \(\text{Aut}(S_n)\) consists entirely of inner automorphism for \(n \neq 6\). The argument is based off studying the size of different conjugacy classes in \(S_n\). This is important because if \(\phi\) is an automorphism of \(S_n\) and \(\sigma,\sigma’ \in S_n\) are two permutations, then \(\sigma\) is conjugate to \(\sigma’\) if and only if \(\phi(\sigma)\) is conjugate to \(\phi(\sigma’)\).

    In \(S_n\) we know that two elements are conjugate if and only if they have the same cycle type. We can use this to count the size of different conjugacy classes of \(S_n\). In particular the number of permutations with \(n_k\) cycles of size \(k\) for \(k = 1,2,\ldots, n\) is

    \(\displaystyle \frac{\displaystyle n!}{\displaystyle \prod\limits_{k=1}^n n_k! k^{n_k}}\).

    The transpositions \((j,k)\) for \(1 \le j < k\le n\) generate \(S_n\) and form a conjugacy class which we call \(T\). For any automorphism \(\phi \in \text{Aut}(S_n)\) , the set \(\phi(T)\) must be a conjugacy class that is also of size \(| T | = n(n-1)/2\). Since automorphism preserve order, the elements of \(\phi(T)\) must all be of order two. Thus the cycle type of the conjugacy class \(\phi(T)\) must contain only cycles of size two or size one. Also there must be an odd number of two cycles. If there was an even number of two cycles, then we would have that \(\phi(T)\) is contained in the subgroup \(A_n \subseteq S_n\). However this would imply that \(\phi(S_n) \subseteq \phi(A_n)\) which contradicts the fact that \(\phi : S_n \rightarrow S_n\) is a bijection. Thus the following lemma implies that \(\phi(T)=T\) when \(n \neq 6\).

    Lemma: Fix \(n \neq 6\) and an odd number \(k\) such that \(k > 1\) and \(2k \le n\). Let \(m_k\) denote the number of permutations \(\sigma \in S_n\) which have \(k\) cycles of size two and \(n-2k\) cycles of size one, then \(m_k \neq n(n-1)/2\).

    Proof: This claim can be manually checked for \(n < 6\). For \(n > 6\), assume in order to get a contradiction that \(m_k = \frac{n(n-1)}{2}\). Note that \(m_k\) is equal to \(\frac{n!}{k! 2^k(n-2k)! 1^{n-2k}}=\frac{n!}{k! (n-2k)! 2^k}\). Thus we have \(\frac{n(n-1)}{2} = \frac{n!}{k!(n-2k)!2^k}\) and hence

    \(2^{k-1} = \frac{(n-2)(n-3)\ldots(n-2k+1)}{k!}\).

    Now if \(k \ge 5\), then \((n-4)(n-5)\ldots(n-2k+1)\) is a product of at least \(k\) consecutive integers and hence divisible by \(k!\) (see here). Thus

    \(2^{k-1} = \frac{(n-2)(n-3)\ldots(n-2k+1)}{k!} = (n-2)(n-3)\frac{(n-4)(n-5)\ldots(n-2k+1)}{k!}\)

    and hence \(\frac{(n-2)(n-3)\ldots(n-2k+1)}{k!}\) is divisible by an odd number other than one (either \(n-2\) or \(n-3\)). Thus \(\frac{(n-2)(n-3)\ldots(n-2k+1)}{k!}\) cannot be a power of \(2\), contradicting the above equation.

    When \(k = 3\), the above equation becomes:

    \(4= \frac{(n-2)(n-3)(n-4)(n-5)}{3!} = (n-2)\frac{(n-3)(n-4)(n-5)}{3!} = \frac{(n-2)(n-3)(n-4)}{3!}(n-5)\).

    Thus if \(n\) is odd, \(n-2\) is an odd number other than \(1\) that divides 4, a contradiction. If \(n\) is even, then since \(n > 6\), \(n-5\) is an odd number other than \(1\) that divides \(4\), which is again a contradiction. Thus we cannot have any case when \(m_k = \frac{n(n-1)}{2}\).

    \(S_n\) does not have an outer automorphism if \(n \neq 6\)

    With the above lemma we can now prove that all the automorphism of \(S_n\) for \(n \neq 6\) are inner. First note that the only automorphism of \(S_2 = \{ \text{id}, (12)\}\) is the identity which is an inner automorphism. When \(n \ge 3\), the map \(\sigma \mapsto \gamma_\sigma\) from \(S_n\) to \(\text{Aut}(S_n)\) is injective. Thus there are exactly \(n!\) inner automorphisms of \(S_n\). We will show that for \(n \neq 6\) there are exactly \(n!\) automorphisms. Hence there can be no outer automorphisms.

    Let \(\phi\) be an automorphism of \(S_n\). The symmetric group \(S_n\) is also generated by the \(n-1\) transpositions \(\sigma_i = (i,i+1)\) (for \(i =1, \ldots, n-1\)). By the previous lemma we know that if \(n \neq 6\), then each \(\phi(\sigma_i)\) must be a transposition \((j_i,k_i)\) for some \(j_i, k_i \in \{1,\ldots, n\}\) such that \(j_i \neq k_i\). As the collection \(\sigma_1,\ldots,\sigma_{n-1}\) generates \(S_n\), the automorphism \(\phi\) is determined by the transpositions \(\phi(\sigma_i)= (j_i,k_i)\). We will now show that there are exactly \(n!\) choices for these transpositions.

    Since \(\sigma_1\sigma_{2} = (1,2,3)\), we know that \(\phi(\sigma_1)\phi(\sigma_2) = (j_1,k_1)(j_2,k_2)\) must be a three cycle. Thus one of \(j_2,k_2\) must be equal to one of \(j_1,k_1\) and the other must be distinct. By relabeling if necessary, we will require that \(k_1=j_2\) and \(j_1 \neq k_2\). Thus there are \(n\) choices for \(j_1\), \(n-1\) choices for \(k_1=j_2\) and \(n-2\) choices for \(k_2\). This gives us \(n(n-1)(n-2)\) choices for the first two transpositions. Continuing in this manner, we note that \(\sigma_1\sigma_3 = (1,2)(3,4)\) and \(\sigma_2\sigma_3 = (2,3,4)\), we get that one of \(j_3,k_3\) is equal to one of \(k_2\) and the other one is distinct from \(j_1,k_1=j_2,k_2\), thus giving us \(n-3\) choices for the third transposition. In general we will have that \(\sigma_{i-1}\sigma_i\) is a three cycle and \(\sigma_l\sigma_i\) is a pair of disjoint two cycles if \(l < i-1\). Thus one can show inductively that there are \(n-i\) choices for the transposition \(\phi(\sigma_i)\). Hence there are \(n(n-1)(n-2)\ldots 1 = n!\) choices for all the transpositions \(\phi(\sigma_i)\) and hence \(n!\) choices for \(\phi\). Thus, if \(n \neq 6\), all automorphisms of \(S_n\) are inner.

    The outer automorphism of \(S_6\).

    The key part of showing that \(S_n\) didn’t have any outer automorphisms when \(n \neq 6\) was showing that any automorphism must map transpositions to transpositions. Thus to find the outer automorphism of \(S_6\) we will want to find an automorphism that does not preserve cycle type. In particular we will find one which maps transpositions to the cycle type \((a,b)(c,d)(e,f)\). There are \(\frac{6!}{3!2^3} = 15\) such triple transpositions which is exactly \(\frac{6\cdot 5}{2}\) – the number of single transpositions.

    The outer automorphism, \(\psi\), will be determined by the values it takes on \(\sigma_1,\sigma_2,\sigma_3,\sigma_4\) and \(\sigma_5\) (where again \(\sigma_i = (i,i+1)\)). Each of these transpositions will get to a different triple transposition \((a,b)(c,d)(e,f)\). To extend \(\psi\) to the rest of \(S_6\) it suffices to check the Coxeter relations \(\psi(\sigma_k)^2 = (\psi(\sigma_i)\psi(\sigma_{i+1}))^3 = (\psi(\sigma_i)\psi(\sigma_j))^2=\text{id}\) for \(k = 1,2,3,4,5\), \(i =1,2,3,4\) and \(j \ge i+2\).

    The relation \(\psi(\sigma_k)^2=\text{id}\) will hold for any triple transposition \(\psi(\sigma_k)\). The other relations are a bit trickier. If we have two triple transpositions \(\tau = (a,b)(c,d)(e,f)\) and \(\tau’ = (a’,b’)(c’,d’)(e’,f’)\), then \(\tau\tau’\) will be a product of two three cycles if none of the transpositions of \(\tau\) is a transposition of \(\tau’\). If \(\tau\) and \(\tau’\) share exactly one transposition, then \(\tau\tau’\) is a product of two two cycles. If \(\tau\) and \(\tau’\) share two or more transpositions they must actually be equal and hence \(\tau\tau’\) is the identity.

    With this in mind, one can verify that following assignment extends to a valid group homomorphism from \(S_6\) to \(S_6\):

    \(\psi(\sigma_1) = (1,2)(3,4)(5,6)\),
    \(\psi(\sigma_2) = (1,3)(2,5)(4,6)\),
    \(\psi(\sigma_3) = (1,5)(2,6)(3,4)\),
    \(\psi(\sigma_4) = (1,3)(2,4)(5,6)\),
    \(\psi(\sigma_5) = (1,6)(2,5)(3,4)\).

    We now just need to justify that the group homomorphism \(\psi\) is an automorphism. Since \(S_6\) is a finite group, \(\psi\) is a bijection if and only if \(\psi\) is an injection. Recall that only normal subgroups of \(S_6\) are the trivial subgroup, \(A_6\) and \(S_6\). Thus to prove injectivity of \(\psi\), it suffices to show that \(\psi(\tau) \neq \text{id}\) for some \(\tau \in A_6\). Indeed \(\psi(\sigma_1\sigma_3) = (1,6)(2,5) \neq \text{id}\) as required. Thus we have our outer automorphism!

    This isn’t the most satisfying way of defining the outer automorphism as it requires using the generators of the symmetric group. There are many more intrinsic ways to define this outer automorphism which will hopefully be the topic of a later blog post!

    References

    The proof I give that for \(n \neq 6\), the group \(S_n\) has no outer automorphisms is based off the one given in this paper On the Completeness of Symmetric Group.

  • Polynomial Pairing Functions

    One of the great results of the 19th century German mathematician Georg Cantor is that the sets \(\mathbb{N}\) and \(\mathbb{N} \times \mathbb{N}\) have the same cardinality. That is the set of all non-negative integers \(\mathbb{N} = \{0,1,2,3,\ldots\}\) has the same size as the set of all pairs on non-negative integers \(\mathbb{N} \times \mathbb{N} = \{(m,n) \mid m,n \in \mathbb{N} \}\), or put less precisely “infinity times infinity equals infinity”.

    Proving this result amounts to finding a bijection \(p : \mathbb{N} \times \mathbb{N} \to \mathbb{N}\). We will call such a function a pairing function since it takes in two numbers and pairs them together to create a single number. An example of one such function is

    \(p(m,n) = 2^m(2n+1)-1\)

    This function is a bijection because every positive whole number can be written uniquely as the product of a power of two and an odd number. Such functions are of practical as well as theoretical interest. Computers use pairing functions to store matrices and higher dimensional arrays. The entries of the matrix are actually stored in a list. When the user gives two numbers corresponding to a matrix entry, the computer uses a pairing function to get just one number which gives the index of the corresponding entry in the list. Thus having efficiently computable pairing functions is of practical importance.

    Representing Pairing Functions

    Our previous example of a pairing function was given by a formula. Another way to define pairing functions is by first representing the set \(\mathbb{N} \times \mathbb{N}\) as a grid with an infinite number of rows and columns like so:

    We can then represent a pairing function as a path through this grid that passes through each square exactly once. Here are two examples:

    The way to go from one of these paths to a function from \(\mathbb{N} \times \mathbb{N}\) to \(\mathbb{N}\) is as follows. Given as input a pair of integers \((m,n)\), first find the dot that \((m,n)\) represents in the grid. Next count the number of backwards steps that need to be taken to get to the start of the path and then output this number.

    We can also do the reverse of the above procedure. That is, given a pairing function \(p : \mathbb{N} \times \mathbb{N} \to \mathbb{N}\), we can represent \(p\) as a path in the grid. This is done by starting at \(p^{-1}(0)\) and joining \(p^{-1}(n)\) to \(p^{-1}(n+1)\). It’s a fun exercise to work out what the path corresponding to \(p(m,n) = 2^m(2n+1)-1\) looks like.

    Cantor’s Pairing Function

    The pairing function that Cantor used is not any of the ones we have seen so far. Cantor used a pairing function which we will call \(q\). When represented as a path, this is what \(q\) looks like:

    Surprisingly there’s a simple formula that represents this pairing function \(q\) which we will now derive. First note that if we are at a point \((k,0)\), then the value of \(q(k,0)\) is \(1+2+3+\ldots+(k-1)+k= \frac{1}{2}k(k+1)\). This is because to get to \((k,0)\) from \((0,0) = q^{-1}(0)\), we have to go along \(k\) diagonals which each increase in length.

    Now let \((m,n)\) be an arbitrary pair of integers and let \(k = m+n\). The above path first goes through \((k,0)\) and then takes \(n\) steps to get to \((m,n)\). Thus

    \(q(m,n)= q(k,0)+n=\frac{1}{2}k(k+1)+n=\frac{1}{2}(n+m)(n+m+1)+n\)

    And so Cantor’s pairing function is actually a quadratic polynomial in two variables!

    Other Polynomial Pairing Functions?

    Whenever we have a pairing function \(p\), we can switch the order of the inputs and get a new pairing function \(\widetilde{p}\). That is the function \(\widetilde{p}\) is given by \(\widetilde{p}(m,n)=p(n,m)\). When thinking of pairing functions as paths in a grid, this transformation amounts to reflecting the picture along the diagonal \(m = n\).

    Thus there are at least two quadratic pairing functions, Cantor’s function \(q\) and its switched cousin \(\widetilde{q}\). The Fueter–Pólya theorem states these two are actually the only quadratic pairing functions! In fact it is conjectured that these two quadratics are the only polynomial pairing functions but this is still an open question.

    Thank you to Wikipedia!

    I first learnt that the sets \(\mathbb{N}\) and \(\mathbb{N} \times \mathbb{N}\) have the same cardinality in class a number of years ago. I only recently learnt about Cantor’s polynomial pairing function and the Fueter–Pólya theorem by stumbling across the Wikipedia page for pairing functions. Wikipedia is a great source for discovering new mathematics and for checking results. I use Wikipedia all the time. Many of these blog posts were initially inspired by Wikipedia entries.

    Currently, Wikipedia is doing their annual fundraiser. If you are a frequent user of Wikipedia like me, I’d encourage you to join me in donating a couple of dollars to them: https://donate.wikimedia.org.