Primality Testing in OCaml - Part I

This is the first in a series of posts about implementing primality tests in the OCaml programming language. I first stumbled upon OCaml by accident when I was reading about the Coq proof assistant, and it quickly became one of my favorite general-purpose programming languages.

One of the exercises in OCaml’s list of 99 problems is to test the primality of a given integer. This problem got me thinking about different primality tests and how easy or difficult it would be to implement them in a functional style in vanilla OCaml. In this post I will talk about implementing a naïve primality test: an Euler seive. In future posts I plan to work through some more sophisticated tests and compare their performance.

The idea:

The idea behind the Euler seive is very simple: let \(n\) be a positive integer. If \(n\) is composite, then it must have a factor less than or equal to \(\sqrt{n}\), which means it must have a prime factor less than or equal to \(\sqrt{n}\). So, consider the set of integers

\[\{2,3,4,\ldots, \lfloor{\sqrt{n}}\rfloor\}.\]

\(n\) is composite if and only it has a divisor in this set. So beginning with the smallest element of the set, test whether or not it divides \(n\). If it does, then \(n\) is composite. If it does not, remove that element and all of its multiples from the set, and repeat. At each step you either find a divisor of \(n\) or you strictly reduce the number of elements remaining in the set. If you manage to reach the empty set, then \(n\) is prime.

A worked example:

Say we wanted to test the primality of \(409\). Its square root rounds down to \(20\), so we need to test whether \(409\) has a divisor in the set

\[\{2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20\}.\]

Since \(2\) does not divide \(409\), we remove the multiples of \(2\) from the set and try again, with the set of candidate divisors reduced to:

\[\{3,5,7,9,11,13,15,17,19\}\]

Since \(3\) does not divide \(409\), we now remove all multiples of \(3\) from the set, leaving

\[\{5,7,11,13,17,19\}.\]

Further iterations of this process will end up only removing the smallest element from the set. We end up testing whether \(409\) is divisible by \(2,3,5,7,11,13,17,19\), and in each case finding that it is not. Since \(409\) doesn’t have a prime factor less than or equal to \(\sqrt{20}\), it must be prime.

An OCaml implementation:

The Euler Seive as described above is very amenable to being implemented in a recursive and functional style. The only hiccup is that OCaml doesn’t have a built-in utility for creating a list of all integers between given bounds. We will want this not only as a component of our algorithm, but also for testing it. So we’ll create a range utility as a standalone function and also choose a convenient infix operator.

  let range a b =
    let rec range_aux a b lst = match b - a with
      | d when d < 0 -> []
      | 0 -> a :: lst
      | _ -> range_aux a (b-1) (b :: lst) in
    range_aux a b []

  let (--) a b = range a b

It is important to point out that a slightly more direct implementation might feel more natural here:

  let rec range a b = match b - a with
    | d when d < 0 -> []
    | 0 -> [a]
    | _ -> a :: range (a + 1) b

However, this second range function is not suitable for our purposes, because is not tail recursive. Using this second range function to create a list of length around \(10^6\) will overflow the stack. But our original (and tail-recursive) implementation can handle ranges several orders of magnitude longer before it starts to slow down for other reasons.

With our range function in hand, we’re basically ready to implement the Euler seive in OCaml: Here’s the quick summary. Given an integer \(n\):

  1. If \(n < 2\) then it’s not prime
  2. But if \(n \geqslant 2\), make the list from \(2\) to \(\lfloor\sqrt{n}\rfloor\). If \(n\) has a proper divisor, then it must have a proper divisor in this list.
  3. If the smallest integer in that list divides \(n\), then \(n\) is not prime. If the list is empty, then \(n\) is prime.
  4. But if neither of those cases hold, then remove the smallest element of the list together with all of its multiples, and repeat steps 3-4.

My OCaml implementation ended up looking like this:

  let is_prime n = match n with
    | n when n < 2 -> false
    | n -> let candidate_divisors =
             n
             |> float_of_int
             |> Float.sqrt
             |> Float.floor
             |> int_of_float
             |> range 2 in
           let rec seive div_list m = match div_list with
             | [] -> true
             | p :: qs -> if m mod p = 0
                          then false
                          else
                            let not_p_mult q = q mod p <> 0 in
                            let ds = List.filter not_p_mult qs in
                            seive ds m in
           seive candidate_divisors n;;

I’ll save the more exhaustive benchmarking for a future post, where I’ll measure the performance of this is_prime function against some newer implementations. But for now it’s nice to at least spot check the correctness of the function. For example, the list of primes less than \(500\) can be found with 1 -- 500 |> List.filter is_prime, which gives

- : int list =
[2; 3; 5; 7; 11; 13; 17; 19; 23; 29; 31; 37; 41; 43; 47; 53; 59; 61;
 67; 71; 73; 79; 83; 89; 97; 101; 103; 107; 109; 113; 127; 131; 137;
 139; 149; 151; 157; 163; 167; 173; 179; 181; 191; 193; 197; 199; 211;
 223; 227; 229; 233; 239; 241; 251; 257; 263; 269; 271; 277; 281; 283;
 293; 307; 311; 313; 317; 331; 337; 347; 349; 353; 359; 367; 373; 379;
 383; 389; 397; 401; 409; 419; 421; 431; 433; 439; 443; 449; 457; 461;
 463; 467; 479; 487; 491; 499]

and these are indeed the \(95\) primes less than \(500\). We could also test the primality of a large prime like is_prime 1_000_000_009, or a challenging semiprime like is_prime 10_003_800_361. This second number is a worst-case composite number for the Euler sieve, since it’s the square of a large prime.

As one more spot check of correctness we might use our is_prime function to implement the prime counting function. Usually denoted \(\pi(x)\), it is defined to be the number of primes less than or equal to \(x\).

  let prime_pi n = 1 -- n
                   |> List.filter is_prime
                   |> List.length

Asking for prime_pi 1_000_000 (accurately) gives 78498, so we can be reasonably assured of the correctness of our code (Although this took about a minute to run on my system. If we were actually interested in the prime counting function itself, this would not be the best way to implement it).

A Cayley table for SL(2,3)

This is a post about generating a Cayley table for the group \(\text{SL}(2,3)\) in a way that visualizes a maximal flag of normal subgroups, similar to my previous post about \(S_4\). I am going to take much of the underlying group theory for granted here, as the focus of this post is the idiomatic Mathematica code that makes generating this example so easy.

The group \(\text{SL}(2,3)\) is the set of all \(2 \times 2\) matrices with entries in \(\mathbb{Z}/3\mathbb{Z}\) with determinant equal to \(1\bmod{3}\). We give special names to four particular elements of this group.

\[e = \begin{bmatrix}1 & 0 \\ 0 & 1\end{bmatrix}, ~ i = \begin{bmatrix}1 & 1 \\ 1 & 2\end{bmatrix}, ~j = \begin{bmatrix}2 & 1 \\ 1 & 1\end{bmatrix}, ~ k = \begin{bmatrix}0 & 2 \\ 1 & 0\end{bmatrix}.\]

Or, in Mathematica:

e := {{1, 0}, {0, 1}};
i := {{1, 1}, {1, 2}};
j := {{2, 1}, {1, 1}};
k := {{0, 2}, {1, 0}};

As it turns out, the elements \(i,j\) and \(k\) generate a normal subgroup \(Q \trianglelefteq \text{SL}(2,3)\) with \(8\) elements, namely \({\pm e,\pm i, \pm j, \pm k}\). Up to isomorphism, this group \(Q\) is the group of Quaternions, and it is the maximal normal subgroup of \(\text{SL}(2,3)\). While every subgroup of \(Q\) is a normal subgroup of \(Q\), not every subgroup of \(Q\) is a normal subgroup of \(\text{SL}(2,3)\). But some of them are, and it turns out that the subgroup \(K = {\pm e}\) is the maximal subgroup of \(Q\) that is normal in \(\text{SL}(2,3)\). In Mathematica:

Q = {e, -e, i, -i, j, -j, k, -k};
K = {e, -e};

These lists are not ordered arbitrarily. In fact, we’ve ordered the elements so that \(Q\) is partitioned into the four left cosets of \(K\), namely \(K\), \(iK\), \(jK\) and \(kK\). Recall that for any group \(G\), any subgroup \(H \leqslant G\), and any choice of element \(g \in G\), the left coset of \(H\) represented by \(g\) is just the set

\[gH = \{g \cdot h ~|~ h \in H\},\]

which is easy enough to express in Mathematica, especially since our sets are actually lists and our group operation is matrix multiplication.

coset[g_, H_] := Table[g.h, {h, H}]

Now we (somewhat arbitrarily) choose coset representatives for the nontrivial cosets of \(Q\) in \(\text{SL}(2,3)\). For the element

\[r = \begin{bmatrix}1 & 2 \\ 0 & 1 \end{bmatrix},\]

the group \(\text{SL}(2,3)\) is a union of the three left cosets \(Q\), \(r Q\) and \(r^2 Q.\)

r = {{1, 2}, {0, 1}};
G = Join[Q, coset[r, Q], coset[r.r, Q]] // Mod[#, 3]&;

At this stage, \(G\) is a list of the elements of \(\text{SL}(2,3)\), but with a very carefully chosen ordering: the first eight, middle eight, and last eight elements are the cosets of \(Q\). And each of those three cosets is partitioned into left cosets of \(K\).

What we need next is not a natural group theoretic construction, but it will simplify the rest of our code: if G[[i]] represents the element at index i in our ordered group G, we want to know the index of G[[i]].G[[j]]. For example, G[[3]] is \(i\), while G[[9]] is \(r\). Their product is \(i\cdot r\), which is equal to \(r\cdot j\), which is G[[13]]. This fact (“third element times ninth element equals thirteenth element”) is really just the group operation for G, but where elements are referenced using their indices. This “group operation on indices” is encapsulated in the following function:

idxop[i_, j_] := PositionIndex[G][Mod[G[[i]].G[[j]], 3]][[1]]

where we can easily check that idxop[3,9] = 13, as noted above. We can use this function to make a \(24 \times 24\) table of integers. This table is precisely the Cayley table we’re interested in, but using position indices instead of group elements.

idxtable = Table[idxop[i, j], {i, 1, 24}, {j, 1, 24}];

idxtable

Note how the top left \(8 \times 8\) square contains only the indices from 1 to 8. The \(8 \times 8\) square to the right of it contains only the indices from 9 to 16, and so on: the entire table respects the cosets of \(Q\) and \(K\) because of the order we chose for the elements of \(\text{SL}(2,3).\) All that’s left is to choose colors and visualize the resulting ArrayPlot. The following constructs a list of \(24\) colors for the indices from 1 to 24 so that the first \(8\) are gradually lightening shades of yellow, then next \(8\) are gradually lightening shades of red, and the last \(8\) are gradually lightening shades of blue. Once we have these \(24\) colors in the flat list colorlist, we Blend them to make a ColorFunction that can be passed as an option to ArrayPlot.

colorchoices = {RGBColor["#ffd100"],
		RGBColor["#eb5e28"],
		RGBColor["#457b9d"]};
		
cosetcolors = NestList[Lighter[#, 1/7] &, #, 7] &;

colorlist = Map[cosetcolors, colorchoices] // Flatten;

threecolors = Function[x, Blend[colorlist, x]];

Now all that’s left to do is to color our idxtable using the ColorFunction that we tailor made for this pupose.

ArrayPlot[idxtable, ColorFunction -> threecolors]

SL23CayleyTable

This visualizes the multiplication in \(\text{SL}(2,3)\) in a way that accentuates the flag of normal subgroups

\[\{e\} \trianglelefteq K \trianglelefteq Q \trianglelefteq \text{SL}(2,3)\]

In particular, you can also see Cayley tables for the quotients of \(\text{SL}(2,3)\) by \(Q\) and \(K\): The \(3 \times 3\) grid of red, yellow and blue is the Cayley table for the quotient by \(Q\), while the \(12 \times 12\) grid of \(2\times 2\) squares is the Cayley table for the quotient by \(K\) (which turns out to be isomoprhic to the alternating group \(A_4\)).

You may notice that the top left \(8 \times 8\) square can be viewed as a \(2 \times 2\) grid of \(4 \times 4\) squares, one dark yellow and one light yellow. This is because \(Q\) has normal subgroups that are bigger than \(K\). The one that’s most clearly visible in this Cayley table is \(\{e,-e,i,-i\}\). But while this is normal in \(Q\), it is not normal in \(\text{SL}(2,3)\), which is why the entire Cayley table does not get neatly broken up in to dark and light \(4 \times 4\) squares, and there is no ordering of G that would make that happen because \(\text{SL}(2,3)\) doesn’t have any normal subgroups of index \(6\).

The tangent variety to the twisted cubic

The twisted cubic curve is a famous example (and counterexample) in algebraic geometry, usually described parametrically as image of the map \(\vec{r} : \mathbb{P}^1 \rightarrow \mathbb{P}^3\) given by

\[ [s:t] \mapsto [s^3 : s^2 t : s t^2 : t^3],\]

though typically for visualization purposes, we’ll restrict our attention to the affine patch where \(s \not = 0,\) where we have the more familiar parametrization \(\vec{r} : \mathbb{R}^1 \rightarrow \mathbb{R}^3\) given by

\[\vec{r}(t) = (t, t^2, t^3).\]

Sean Grate and I were thinking about ideas for a 3D printed object that demonstrates something about the twisted cubic \(C.\) Right now we’re working on printing a model of the tangent variety to the twisted cubic which shows this surface as the union of of the tangent lines to the curve \(C\) itself.

To start, we’ll need parametric equations for the twisted cubic and it’s tangent vector. The model we’ll eventually be making will be bounded by the cube with vertices \((\pm 1, \pm 1, \pm 1),\) and we’ll be making this model out of cylindrical tubes of a fixed radius \(\epsilon,\) so we set up a region function and a choose a parameter accordingly.

r = {t, t^2, t^3};
v = D[r, t];

boundingcube = Function[{x, y, z},
   Abs[x] <= 1 &&
   Abs[y] <= 1 &&
   Abs[z] <= 1];
ep = 1/40;

For the twisted cubic itself, it’s straightforward to make a parametric plot.

curve = ParametricPlot3D[r, {t, -1, 1}, 
   PlotRange -> {{-1.1, 1.1}, 
                 {-1.1, 1.1},
                 {-1.1, 1.1}},
   PlotStyle -> Tube[ep],
   RegionFunction -> boundingcube,
   PlotPoints -> 50];

Instead of plotting the tangent variety itself as a surface, we’ll instead plot a dense collection of tangent lines. For a fixed value of \(t,\) we get a tangent line to \(C\) is parametrized by

\[ \ell(s) = \vec{r}(t) + s \cdot\frac{d\vec{r}}{dt}(t).\]

We use this to create our collection of tangent lines.

tangents = Table[
   ParametricPlot3D[r + s v, {s, -5, 5}, 
    PlotRange -> {{-1, 1},
                  {-1, 1},
                  {-1, 1}},
    PlotStyle -> Tube[ep],
    RegionFunction -> boundingcube,
    PlotPoints -> 60],
   {t, -5, 5, 0.059}];

When we inspect what we have so far, we see that the curve together with this collection of tangent lines has some disconnected pieces, so it isn’t yet suitable for 3D printing.

	Show[curve,tangents,Boxed -> False, Axes -> False]
	

tangentlines

To fix this, we’d like to add one more component to our model. We want to add the intersection of the tangent variety with the bounding box. Finding this intersection will be much easier if we know the implicit equation of the tangent variety, which we can get from the parametrization using Macaulay2. Our parametrization defines a ring map \(\mathbb{R}[x,y,z] \rightarrow \mathbb{R}[s,t],\) and the kernel of this ring map will be generated by the implicit equation of the tangent variety.

i1 : R = QQ[X,Y,Z];

i2 : S = QQ[s,t];

i3 : f = map(S,R,matrix {{t+s, t^2 + 2*s*t, t^3 + 3*s*t^2}});

o3 : RingMap S <--- R

i4 : ker f
             2 2     3      3             2
o4 = ideal(3X Y  - 4X Z - 4Y  + 6X*Y*Z - Z )

o4 : Ideal of R

To find the intersection of this surface with the faces of our bounding cube, we’ll set \(x,y\) or \(z\) equal to \(\pm 1,\) set the resulting expression equal to zero, and solve for one of the remaining variables.

tangentvariety = 3 x^2 y^2 - 4 x^3 z - 4 y^3 + 6 x y z - z^2;

face1 = Solve[(tangentvariety /. x -> -1) == 0, z] // Flatten;
face2 = Solve[(tangentvariety /. x -> 1) == 0, z] // Flatten;
face3 = Solve[(tangentvariety /. y -> -1) == 0, z] // Flatten;
face4 = Solve[(tangentvariety /. y -> 1) == 0, z] // Flatten;
face5 = Solve[(tangentvariety /. z -> -1) == 0, y] // Flatten;
face6 = Solve[(tangentvariety /. z -> 1) == 0, y] // Flatten;

What we get can be used to parametrize the different components of the intersection of the tangent variety with the bounding cube.

facecurves1 = Table[
   ParametricPlot3D[({-1, y, z} /. p), {y, -1, 1},
    PlotStyle -> Tube[ep],
    RegionFunction -> boundingcube,
    PlotPoints -> 40],
   {p, face1}];
facecurves2 = Table[
   ParametricPlot3D[({1, y, z} /. p), {y, -1, 1},
    PlotStyle -> Tube[ep],
    RegionFunction -> boundingcube,
    PlotPoints -> 40],
   {p, face2}];
facecurves3 = Table[
   ParametricPlot3D[({x, -1, z} /. p), {x, -1, 1},
    PlotStyle -> Tube[ep],
    RegionFunction -> boundingcube,
    PlotPoints -> 40],
   {p, face3}];
facecurves4 = Table[
   ParametricPlot3D[({x, 1, z} /. p), {x, -1, 1},
    PlotStyle -> Tube[ep],
    RegionFunction -> boundingcube,
    PlotPoints -> 40],
   {p, face4}];
facecurves5 = Table[
   ParametricPlot3D[({x, y, -1} /. p), {x, -1, 1},
    PlotStyle -> Tube[ep],
    RegionFunction -> boundingcube,
    PlotPoints -> 40],
   {p, face5}];
facecurves6 = Table[
   ParametricPlot3D[({x, y, 1} /. p), {x, -1, 1},
    PlotStyle -> Tube[ep],
    RegionFunction -> boundingcube,
    PlotPoints -> 40],
   {p, face6}];

facecurves = {
   facecurves1,
   facecurves2,
   facecurves3,
   facecurves4,
   facecurves5,
   facecurves6
   };

At the intersection points of these various facecurves, there’s a bit of artifacting we’d like to avoid, which we accomplish by covering these intersection points with a sphere of radius \(\epsilon.\) That way, the corners will be smooth instead of looking like two overlapping cylinders.

cornerpoints = {
   {-1, 1, -1},
   {1, 1, 1},
   {1, -1, 0.6569},
   {-1, -1, -0.6569},
   {1, 0.5408, -1},
   {-1, 0.5408, 1},
   {-0.5408, -1, -1},
   {0.5408, -1, 1}
   };

cornerdots = Table[
    Graphics3D[Sphere[p, ep]],
    {p, cornerpoints}];

The model we want is comprised of the twisted cubic curve, the collection of tangent lines, the curves of intersection between the tangent variety and the bounding cube, and these small corrective spheres.

components = {
	curve, 
	tangents, 
	cornerdots, 
	facecurves
};

Show[components, Boxed -> False, Axes -> False]

finaltangentlines

This can be exported to an stl file and 3D printed. (put picture here). It can be hard to understand what object looks like from a static picture, even a picture of a physical object. So for the purpose of this post, here’s a .gif made of a parametrized family of different views of the model.

frames = Table[
   Show[components,
   Boxed -> False,
   Axes -> False, 
   ViewVector -> {3 Cos[t],
                  3 Sin[t],
                  6 Sin[t/2]}],
                  {t, 0, 4 Pi, 2 Pi/60}];

animation = ListAnimate[frames, 24];

animation

27 lines on a cubic suface

This is a summary of a project Sean Grate an I worked on, using Sage and a 3D printer to make a model of the Clebsch diagonal cubic and its \(27\) lines.

It is a classical fact from algebraic geometry that any smooth cubic surface contains exactly \(27\) lines, and furthermore that there exists a cubic surface for which all these lines are visible in the real locus with a high degree of symmetry. The precise statement of these facts requires some care and some more precise terminology, which I won’t go into here, but you can read more on wikipedia here or here, or in this pair of excellent posts here and here, or in the book The Geometry of some Special Arithmetic Quotients. The relevant facts can be summarized as follows:

The Clebsch diagonal cubic is embedded in \(\mathbb{P}^3\) by the projective equation

\[ (x_0 + x_1 + x_2 + x_3)^3 = x_0^3 + x_1^3 + x_2^3 +x_3^3, \]

and in a suitable affine patch, this surface can be visualized in \( \mathbb{R}^3 \) as the graph of the implicit equation

\[ \begin{aligned} 0 &= 81(x^3 + y^3 + z^3) - 189(x^2y + xy^2 + x^2z + xz^2 + y^2z + yz^2) \\ &+ 54(xyz) + 126(xy + xz + yz) - 9(x^2 + y^2 + z^2) \\ &- 9(x + y + z) + 1, \\ \end{aligned}, \]

or in Sage:

var('x,y,z')

def clebsch(x,y,z):
	return 81*(x^3 + y^3 + z^3) - 189*(x^2*y + x*y^2 + x^2*z + x*z^2 + y^2*z + y*z^2) + 54*(x*y*z) + 126*(x*y + x*z + y*z) - 9*(x^2 + y^2 + z^2) - 9*(x + y + z) + 1

and all of the \(27\) lines will be visible on this graph and can be described by explicit parametrizations, see here for details. If the line passing through \((a,b,c)\) with direction vector \((d,e,f)\) is contained in this surface, we add the list of parameters [a,b,c,d,e,f] to the linedata list, shown here (full list suppressed)

linedata = [[0,0,-1/3,1,-1,0],[0,-1/3,0,1,0,-1],...]

Creating an .stl file with Sage

The implicit equation of the surface, together with the lines it contains, is all we need to produce an .stl file for 3D printing our model. Of course, there already exist many 3D printed models of the Clebsch diagonal cubic, including various models which accentuate the \(27\) lines in various ways.

Some models use color, or embossed the lines on the surface itself, but we were particularly interested in models which displayed only a thin ribbon of the surface, together with the \(27\) lines printed as a collection of cylindrical tubes, like in this example. Inspired by this model, our aim was to produce a 3D printed object in the same vein, but with two main differences: We wanted our model to be bounded by a sphere, and instead of printing the lines as a collection of cylindrical tubes, we wanted to show the lines by printing a thin strip of the part of the surface which surrounds each line.

This can easily be done with the Sage implicit_plot3d function, for which the region option lets us plot only the points for which a specified boolean condition is satisfied. For example, we only want to see the points on the Clebsch diagonal cubic which are inside a sphere of radius \(R\), and which are either within \(\epsilon\) of one of the \(27\) lines or are outside a sphere of radius \(r\), for some reasonable choices of parameter values.

R = 2
r = 1
ep = 0.05

# region function
def rf(x,y,z):

    # dl(i,x,y,z):
    # the distance from (x,y,z) to the
    # line described by linedata[i]
    
    def dl(i,x,y,z):
        [a,b,c,d,e,f] = linedata[i]
    	s = (d*(x-a)+e*(y-b)+f*(z-c))/(d^2+e^2+f^2)
    	return n(sqrt(((x-a) - s*d)^2 +
                      ((y-b) - s*e)^2 +
                      ((z-c) - s*f)^2))
		      		      
    mindl = min([dl(i,x,y,z) for i in [0..26]])
    dist = x^2 + y^2 + z^2
    
    return (dist <= R) and (r <= dist or mindl <= ep)

And to incorporate this region function while making our sage plot:

# the value of the plot_points option
# controls the level of detail. p = 100
# is fine for prototyping, but p = 500 is
# better for the final product. Even
# larger p values may create .stl files
# that are too big for our 3D printer.
p = 100

surface = implicit_plot3d(clebsch(x,y,z),
                          (x,-R,R),
                          (y,-R,R),
                          (z,-R,R),
                          region=rf,
                          plot_points=p,
                          smooth=True)

From a Sage object to a physical object

To make an .stl file, all that’s left to do is surface.save('path/to/file.stl'). Since this object has no thickness, it is not yet suitable for 3D printing. We first used Blender to solidify the mesh, resulting in something like this:

clebsch

We sent the .stl file for the thickened surface to the Form 2 printer at the University of Kentucky Math Lab, and after a 23 hour print and some careful support clipping, this was our final product:

print

We’re quite happy with the results, since it accentuates the pieces of the surface that contain straight lines, while still showing the local twisting of the surface around those lines. This model now has a home in the display case in the math department at UK.

Visualizing normal subgroups in S4

The Math Lab at the University of Kentucky has a group which visualizes mathematical ideas through quilting, with projects involving partitions, hyperbolic tilings, and prime sieves. One idea for a quilting project, originally suggested by Dave Jensen, was to create a quilt representing the Cayley table for the symmetric group \(S_4\) to exhibit its flag of normal subgroups. This post describes implementation of that idea using Mathematica.

The symmetric group \(S_4\) is the group of bijective functions from the set \(\lbrace 1,2,3,4 \rbrace\) to itself. It has an index \(2\) normal subgroup \(A_4\), consisting of only the even permutations, and contained inside \(A_4\) is the Klein \(4\)-group \(V\), generated by all products of disjoint \(2\)-cycles. \(V\) is also normal is \(S_4\). Taken together this describes a flag of normal subgroups in \(S_4\)

\[ (1) \trianglelefteq V \trianglelefteq A_4 \trianglelefteq S_4\]

To work with these permutation groups in Mathematica, all we need to know is that \(S_n\) is generated by any \(2\)-cycle and any \(n\)-cycle, \(A_4\) is generated by \((123)\) and \((124)\), and \(V\) is generated by any two of its non-identity elements.

a = Cycles[{{1, 2}}];
b = Cycles[{{1, 2, 3, 4}}];
G = PermutationGroup[{a, b}];

c = Cycles[{{1, 2, 3}}];
d = Cycles[{{1, 2, 4}}];
A = PermutationGroup[{c, d}];

e = Cycles[{{1, 2}, {3, 4}}];
f = Cycles[{{1, 3}, {2, 4}}];
V = PermutationGroup[{e, f}];

The Cayley table of a group is just a multiplication table for the group operation. If we form a naive Cayley table for \(S_4\) using the default ordering of the elements given by Mathematica, and encoding the elements using a built-in color function, then we don’t expect to see much of the structure, but it’ll show us our starting point.

ArrayPlot[GroupMultiplicationTable[G], 
 ColorFunction -> "ThermometerColors"]

naivecayley

This Cayley table seems to have some structure, especially in the way that the top six rows look like they’re organized by color into squares. Explaining this takes a bit of additional effort to figure out exactly which elements of \(S_4\) have been assigned which colors. We generate a list of elements of \(S_4\), written in cycle notation and backed by their Mathematica-assigned color. Note that since the first element of \(S_4\) is the identity permutation, this is also a description of the topmost row and leftmost column in the Cayley table.

elements = GroupMultiplicationTable[G][[1]];

ArrayPlot[{elements} // Transpose, 
 ColorFunction -> "ThermometerColors", AspectRatio -> 7, 
 Epilog -> { 
   MapIndexed[
    Text[Style[ToString[Flatten[Level[GroupElements[G][[#1]], 1], 1]],40],
    Reverse[#2 - 1/2]] &, 
    Reverse[{elements} // Transpose], {2}]}]

arbitrarycolors

Note that for the built-in ordering of \(S_4\), the first six elements are the subgroup \(S_3 \leqslant S_4\) of permutations that fix \(1\). Since subgroups are closed under multiplication, this explains why the top-left \(6 \times 6\) square of the Cayley table only uses the first six colors from the above list. After staring a little longer, we can also note that the \(7^\text{th}\) through \(12^\text{th}\) elements are exactly the left coset \((12)S_3\), the \(13^\text{th}\) through \(18^\text{th}\) elements are exactly \((132)S_3\), and the last six elements form the last coset \((1432) S_3\). This explains why the first six rows of the Cayley table form what appear to be four distinct squares: The left cosets of \(S_3\) are fixed, but permuted internally, by right multiplication by an element of \(S_3\). Additionally, we observe that each column in the Cayley table seems to consist of four distinct strips: dark blue, light blue, dark red and light red, in some order and internally permuted. This is because left multiplication by an element of \(S_4\) permutes the left cosets of \(S_3\).

However, right multiplication does not preserve the left cosets of a subgroup, unless that subgroup happens to be a normal subgroup. This leads us to our process for making a Cayley table that represents the flag of normal subgroups. We want to re-order the elements of \(S_4\) so that the first four elements form the subgroup \(V\), and so the rest of the list is sorted into the five left cosets of \(V\). Furthermore, we want the first three cosets to comprise the elements of \(A_4\). This accomplishes a partition of the elements of \(S_4\) into the two cosets of \(A_4\), which are further subdivided into three cosets of \(V\) each.

Vreps =
  Table[RightCosetRepresentative[V, g], {g, GroupElements[G]}] // 
   DeleteDuplicates;

(* rearrange them so the first three are in A_4 *)

VrepsSorted = (Select[Vreps, GroupElementQ[A, # ] &] // Sort)~
   Join~(Select[Vreps, ! GroupElementQ[A, # ] &] // Sort);

(* a list of indices giving the desired order*)

idx = Table[
     PermutationProduct[v, r], {r, VrepsSorted}, {v, 
      GroupElements[V]}] // Flatten // GroupElementPosition[G, #] &;

(* The Cayley table, re-arranged accordingly *)
M = Table[
   GroupElementPosition[G, 
    PermutationProduct[GroupElements[G][[idx[[i]]]], 
     GroupElements[G][[idx[[j]]]]]], {i, 1, 24}, {j, 1, 24}];

The index list idx is itself a permutation, which tells us how to re-order the elements of \(S_4\) for our purposes. Inspecting idx we see:

{1, 8, 17, 24, 4, 13, 21, 12, 5, 20, 9, 16, 3,
14, 11, 22, 6, 19, 15, 10, 2, 7, 23, 18}

All that’s left to do is to come up with a color function meeting our specifications. We need to come up with six visually distinct colors \(c_1, \ldots, c_6\), and assign to the numbers \(1,8,17,24\) four different shades of \(c_1\), assign to the numbers \(4,13,21,12\) four different shades of \(c_2\), and so forth. This will distinguish the six cosets of \(V\). Additionally, we want to choose colors so that \(c_1, c_2,\) and \(c_3\) look similar to each other, as do the colors \(c_4, c_5\) and \(c_6\), but with some distinction between the two triplets to distinguish the cosets of \(A_4\).

These are just aesthetic choices, but we might choose some reds, oranges and yellows for the cosets of \(V\) in \(A_4\), and some blues and greens for the cosets of \(V\) that aren’t in \(A_4.\)

fourfades  = NestList[Lighter[#, 1/6] &, #, 3] &;

colorchoices = {
   RGBColor["#EF476F"],
   RGBColor["#F78C6B"],
   RGBColor["#FFD166"],
   RGBColor["#06D6A0"],
   RGBColor["#118AB2"],
   RGBColor["#073B4C"]};
   
colorlist = Map[fourfades, colorchoices] // Flatten;

sixcolors = Function[x, Blend[colorlist, x]];

Looking at our re-ordered list of elements of \(S_4\) with this new color palette, we see that it more or less accomplishes what we set out to accomplish.

ArrayPlot[({Range[24]} // Transpose)/24, 
 ColorFunction -> sixcolors,
 AspectRatio -> 7, 
 Epilog -> {MapIndexed[
    Text[
     Style[ToString[Flatten[Level[GroupElements[G][[idx]][[#1]], 1], 1]], 40],
     Reverse[#2 - 1/2]] &, 
     Reverse[{Table[n, {n, 1, 24}]} // Transpose], {2}]}]

sixcolors

Finally, making a new Cayley table for \(S_4\) with this color palette and element re-ordering, we end up with this:

loc = Position[idx, #][[1]][[1]] &;

ArrayPlot[Map[loc, M, {2}]/24, ColorFunction -> sixcolors]

goodcayley

This Cayley table for \(S_4\) shows the flag of normal subgroup as we hoped it would. It’s still a multiplication table for the group \(S_4\), though with the elements listed in a different order. The top left \(12\times 12\) block is a Cayley table for \(A_4\), and the top left \(4 \times 4\) is a Cayley table for \(V\). Furthermore, the four \(12 \times 12\) blocks form a Cayley table of the quotient group \(S_4 / A_4\), while the thirty-six \(4 \times 4\) blocks are a Cayley table of \(A_4 / V\), and the nine \(4 \times 4\). blocks in the top left are a Cayley table for \(A_4 / V\).

Update (early 2020): The 2D visualization group in the UK math lab is working on a quilt based on this design, and they’re making fast progress. Their work might one day be displayed at mathematical art exhibit at the JMM, like their partitions quilt was this year.