Sunday, April 14, 2013

Group Theory for 5th Graders

In 2011 I was very fortunate to be awarded an internship as an elementary school teaching assistant in my hometown. I worked with 3rd, 4th, and 5th graders who were having difficulties with mathematics.

One thing I noticed while working with these students is that they are still curious enough to be motivated by the fact that something is cool in its own right, without having to scrutinize its utility. In some sense, curiosity is the only genuine prerequisite for mathematics. Unfortunately, we squander this resource by offering a curriculum barren of anything not assumed to be helpful on the end-of-year exams\(^1\).

Given that this constraint is not going away any time soon, what more can we do with the curriculum we already have? Let's consider a few points:
  • Starting in 3rd grade, students begin to build and work with multiplication tables, which, aside from closure, are not unlike Cayley tables.
  • Also in 3rd grade, students learn to tell time by reading a clock. They are asked questions such as "What time is 5 hours before 3:00?" and "If it is 3:15, where will the minute hand be in 1 hour?". These are precisely the same questions that undergraduate math students are asked in an initial survey of \(\mathbb{Z}_{12}\) and \(\mathbb{Z}_{60}\).
  • Fourth graders are asked to divide one natural number by another and to consider the remainder.
  • In order to simplify fractions, fifth grade students are asked to find the greatest common factor of two natural numbers. This is a great opportunity to introduce prime factorization.
  • Notions of Reflection, Rotation, and Translation are covered explicitly in 5th grade geometry standards, at least in Tennessee.
So, why not introduce group theory to fifth graders? The only thing they are really lacking is a structured concept of a "Set". Further, given that all of these students are headed towards an algebra curriculum\(^2\), it makes sense to introduce them to concepts like operators, inverses, identities, associativity, and commutativity ahead of time in a structured, possibly even visual way, rather than relying on a purely intuitive approach later on.
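The clock questions above are already questions about \(\mathbb{Z}_{12}\). As a quick illustration, here is a sketch in Python (the helper name `clock_add` is my own invention) that makes the group structure explicit:

```python
def clock_add(a, b, modulus=12):
    """Add hours on a clock face: ordinary addition, wrapped modulo 12."""
    return (a + b) % modulus

# "What time is 5 hours before 3:00?" is just 3 + (-5) in Z_12.
print(clock_add(3, -5))   # 10, i.e. 10:00

# 0 acts as the identity element: adding it changes nothing.
print(clock_add(7, 0))    # 7

# Every hour has an inverse: 5 and 7 together wrap back around to 0.
print(clock_add(5, 7))    # 0
```

The identity, inverse, and closure properties all fall out of one `%` operation, which is exactly the sense in which "clock math" is a gentle first group.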

For example, consider the task of explaining how fractions are added. This involves rewriting one or both fractions in terms of a common denominator. There are two challenges here: (a) how does one find a common denominator? and (b) why is it legitimate to rewrite a fraction as some other fraction?

The answer to the second question hinges, mathematically, on the concept of a group identity element. In this case, we know that for any fraction \(\frac{a}{b}\), it is the case that \(1 \cdot \frac{a}{b} = \frac{a}{b}\). However, it is also the case that for any nonzero \(n \in \mathbb{Z}\), \(\frac{n}{n} = 1\). Thus, the natural connection to make is that for any nonzero \(n \in \mathbb{Z}\), \(\frac{a}{b} = \frac{na}{nb}\). I have seen many students struggle to connect those two facts, and my guess is that the connection seems to violate their mathematical intuition. I strongly suspect that discussing the identity concept more explicitly would help clarify this.
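For instance, the standard recipe for adding \(\frac{1}{3} + \frac{1}{4}\) amounts to multiplying each term by a disguised copy of the identity: \[\frac{1}{3} + \frac{1}{4} = \frac{4}{4} \cdot \frac{1}{3} + \frac{3}{3} \cdot \frac{1}{4} = \frac{4}{12} + \frac{3}{12} = \frac{7}{12}\] Nothing was changed at any step, because multiplying by \(1\) never changes anything; the only cleverness was in choosing which form of \(1\) to use.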

Another concept that frequently troubles students at this stage is the claim that statements such as "\(\frac{1}{7} \cdot 7 = 1\)" should be true. One way to introduce this concept is by representing "parts of a whole" visually, as in pizza slices. However, presenting it as a necessary consequence of the fact that group elements have inverses captures the idea without relying on intuition.
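In group-theoretic language, \(\frac{1}{7}\) is simply the multiplicative inverse of \(7\): \[\frac{1}{7} \cdot 7 = \frac{7}{7} = 1\] which is just the inverse axiom \(a^{-1} \cdot a = e\) with the identity \(e = 1\), and it connects directly back to the \(\frac{n}{n} = 1\) fact used when adding fractions.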

\(^1\) See any work by Diane Ravitch for a more thorough discussion of this problem.
\(^2\) Ostensibly. See Tavis Smiley's work on the "School to Prison Pipeline".

Sunday, March 31, 2013

Leveraging Sow to Simplify Loops

One of Mathematica's coolest list manipulation techniques is the Reap/Sow pattern. Using these functions together allows you to build up multiple collections at once in a decoupled manner.

To understand this idea and when it might make a difference, consider how you would sort a list of integers by divisor. What I mean is, given the list \[\{12,13,14,15,21,24,28,41,47,49,55\}\] sort it into the following lists based on divisor: \[\begin{align} 2 &\rightarrow \{12,14,24,28\} \\ 3 &\rightarrow \{12,15,21,24\} \\ 5 &\rightarrow \{15,55\} \end{align}\] (and here it's okay if one item shows up in two lists)

You might think to do it by using a for-loop and storing each item in a named list depending on what it matches. But what if you want a variable list of divisors? In that case, managing the list of results can get a little tricky.

The Reap/Sow pattern presents a different approach: when important values are computed inside a function they can be "sown" with Sow, which means they bubble up the call stack until a matching Reap expression is encountered. You can use a Reap expression to process or discard sown values as you see fit.

An implementation of the divisor sorting algorithm using Reap and Sow might look something like this:

SortByDivisors[numbers_, divisors_] := Reap[
  For[i = 1, i <= Length[numbers], i++,
    For[j = 1, j <= Length[divisors], j++,
      n = numbers[[i]];
      d = divisors[[j]];
      (* sow n under the tag d whenever d divides n evenly *)
      If[Mod[n, d] == 0, Sow[n, d]];
    ];
  ];, _, Rule][[2]];


The function returns a collection of Rules, each of which has a divisor as its key and the corresponding list of multiples as its value.
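For readers who want a point of comparison outside Mathematica, roughly the same grouping can be sketched in Python (the function name `sort_by_divisors` is my own; Python has no direct analogue of Reap/Sow, so an explicit dictionary stands in for the reaped collection):

```python
from collections import defaultdict

def sort_by_divisors(numbers, divisors):
    """Group each number under every divisor that divides it evenly."""
    groups = defaultdict(list)
    for n in numbers:
        for d in divisors:
            if n % d == 0:
                groups[d].append(n)   # plays the role of Sow[n, d]
    return dict(groups)               # plays the role of the enclosing Reap

print(sort_by_divisors([12, 13, 14, 15, 21, 24, 28, 41, 47, 49, 55], [2, 3, 5]))
# {2: [12, 14, 24, 28], 3: [12, 15, 21, 24], 5: [15, 55]}
```

The Mathematica version is still more decoupled, though: `Sow` can be called from arbitrarily deep inside other functions and the values will bubble up to the nearest `Reap`, whereas the dictionary here must be passed around explicitly.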

Sunday, March 17, 2013

Employing "Map" to Make Mathematica More Elegant

One of Mathematica's most under-used features is its support for functional programming.  Many researchers treat Mathematica as a procedural language; that is, as though it were C or FORTRAN with interactive plotting. This certainly works but, with apologies to Dijkstra, it can lead to code that is neither simple nor clear -- in short: not what mathematicians would call "elegant".

One quick way to make your Mathematica code more elegant is to use the Map command in place of For loops when building up arrays. Map is not an outright replacement for For loops, but it can be very helpful when you are trying to transform one data set into another.

For a simple example, let's say that you need to compute the sine of a set of angles. If you were doing this in the style of a procedural programming language, your code might look like the following:
input = {0, Pi/3, 2 Pi/3, Pi};

output = ConstantArray[0, Length[input]];
For[i = 1, i <= Length[input], i++,
   output[[i]] = Sin[input[[i]]];
];
Now, that really doesn't look so bad. It's only three lines of code. However, it expresses only one idea: the transformation of a single data set. It would be more elegant if we could express this one idea in a single -- readable -- line of code. Fortunately, this is exactly the task for which Map was intended:
output = Map[Sin,input];
This statement applies the Sin command to each element of the input list and collects the results in an output list in corresponding order. It is equivalent to the previous example in behavior (and at least as fast), yet it is superior in terms of clarity because it expresses the idea more compactly.

Of course, for a single loop, the payoff in clarity will be minimal. On the other hand, if you make it a habit to express transformations in this fashion, the benefits you reap from code simplicity will grow along with your project.

Sunday, March 3, 2013

Using Linear Algebra to Teach Linear Algebra

Linear Algebra is supposed to be the study of linear transformations between vector spaces. However, it can be hard to tell that from the way Linear Algebra classes usually start -- i.e. a disconnected, unmotivated survey of row manipulation operations.

To be fair, this discussion isn't entirely unmotivated. It's usually presented in the context of Gaussian elimination for the purpose of solving a system of equations. While that's certainly not inaccurate, presenting the material only from that perspective unnecessarily narrows its scope in the mind of the student, making it harder to generalize later. The problem is three-fold:
  • row manipulation is presented as something that is specifically "for" equation solving
  • the row manipulation operations are presented as external algorithms
  • the matrix concept is treated as a passive thing (a data structure), rather than an active thing (a transformation).
Why do this? Why introduce extra algorithms to fiddle with values in a 2D array? Linear Algebra already provides an algorithm powerful enough to do all this stuff and more: matrix multiplication.

For example, let's start with the following matrix:

\[\left(\begin{array}{ccc} a & b & c \\ d & e & f \\ g & h & i \end{array}\right)\]
Now suppose we want to interchange Row 1 with Row 2. We can do this by multiplying on the left using a special matrix designed for interchanging those rows:

\[
\left(\begin{array}{ccc} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array}\right) *
\left(\begin{array}{ccc} a & b & c \\ d & e & f \\ g & h & i \end{array}\right) =
\left(\begin{array}{ccc} d & e & f \\ a & b & c \\ g & h & i \end{array}\right)
\]
Another common row manipulation operation is to add a scalar multiple of one row to another. Let's say we want to triple Row 1 and add those values to Row 3. Again, we can achieve this via left multiplication with a special matrix designed for that purpose:

\[
\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & 0 & 1 \end{array}\right) *

\left(\begin{array}{ccc} a & b & c \\ d & e & f \\ g & h & i \end{array}\right) =
\left(\begin{array}{ccc} a & b & c \\ d & e & f \\ 3a + g & 3b + h & 3c + i \end{array}\right)
\]
Two questions arise here: first, how are these special matrices constructed? And second, what is the advantage of doing any of this?

Constructing these matrices becomes obvious once we invoke one of the fundamental principles of Linear Algebra: the matrix representation of any linear transformation comes from applying that transformation to the identity matrix.

So, if you'll notice, our matrix for swapping rows 1 and 2 was constructed by simply swapping rows 1 and 2 of the identity matrix. Likewise, our matrix for adding the triple of row 1 to row 3 was constructed by tripling row 1 of the identity matrix and adding it to row 3 of the identity matrix.

That also partially answers the question "What is the advantage?". As a pedagogical tool, this would provide an early opportunity to teach the core notions of Linear Algebra without bogging the student down in what is frequently perceived as accounting homework.

However, there is a further advantage in that tedious row manipulation algorithms can be represented compactly as products of their corresponding matrices. Not only does this allow for an early discussion of the composition of linear transformations, but taking a giant list of row operations and expressing it compactly as a single matrix is an excellent way to demonstrate that Linear Algebra is Powerful.
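Both points can be checked numerically. Here is a quick sketch in plain Python (no libraries; the helper names are my own) that builds the two elementary matrices from the identity and confirms that their product performs the composed operation in a single multiplication:

```python
def identity(n):
    """n-by-n identity matrix as nested lists."""
    return [[int(i == j) for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Plain-Python matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Each elementary matrix is built by applying its row operation to the identity.
E_swap = identity(3)
E_swap[0], E_swap[1] = E_swap[1], E_swap[0]                 # interchange rows 1 and 2

E_add = identity(3)
E_add[2] = [x + 3 * y for x, y in zip(E_add[2], E_add[0])]  # row 3 += 3 * row 1

M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

# Applying the operations one at a time...
step_by_step = matmul(E_add, matmul(E_swap, M))
# ...matches composing them into a single matrix first.
composed = matmul(matmul(E_add, E_swap), M)
print(step_by_step == composed)  # True
```

The single matrix `matmul(E_add, E_swap)` encodes the entire sequence of row operations, which is exactly the compact representation described above.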

Sunday, February 17, 2013

Limits through the Lens of Linear Operators

I've been tutoring students in Calculus for about three years, both at the college level and at the AP high school level. One thing I've noticed is that many of the trickier problems take advantage of the fact that limits, derivatives, and integrals are all linear operators. However, the notion of a linear operator doesn't come up until near the end of a first course in Linear Algebra, so this stuff isn't made clear until well after students have finished calculus.

Recently, I tried doing things the other way around. I was working with a very bright student who was struggling in AP Calculus. After a few sessions, I could tell that the student's core algebra skills were pretty solid, but that there still seemed to be some trouble with the mechanics of solving limit and derivative problems.

First I tried showing how constants can be factored out of limits, and that the operation of finding a limit can be distributed over addition. The student seemed to follow, but not necessarily buy in to what I was saying. Then I rushed on to derivatives and I said "See! The derivative is defined as a limit, so we should expect the same thing here." Well, of course that was a bad explanation, not just because it was hand-wavy, but particularly because it wasn't illuminating.

So I decided to back up and try teaching the mechanics of how linear operators behave in the abstract, with no reference to calculus.

The formal definition of a linear operator is a bit much for an AP Calc student, but the salient features can be presented in terms of Pre-Calculus concepts as follows:
Given two functions \(f(x)\) and \(g(x)\), a linear operator \(L\) will take \(L(a*f(x) + b*g(x))\) and give you \(a*L(f(x)) + b*L(g(x))\).
In this case, the student was a bit jarred about what to do with expressions like \(L(f(x))\), but I explained that \(L\) is an unknown function just like \(f(x)\) and that the important bit is to learn how it affects the algebra and order of operations. To cement the idea, I had the student do some practice problems like the following:
  • Show that \(L(5x + 4x^2)\) can be simplified to \(5*L(x) + 4*L(x^2)\).
  • Show that \(L(x + x^2) + L(2x) + L(2x^2)\) can be simplified to \(3L(x) + 3L(x^2)\).
  • Show that \(L(a*f(x) + b*g(x)) - L(b*g(x) - c*h(x))\) can be simplified to \(L(a*f(x) + c*h(x))\).
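As a sample solution, the second of these exercises unwinds by distributing \(L\) over each sum and then collecting like terms: \[L(x + x^2) + L(2x) + L(2x^2) = L(x) + L(x^2) + 2L(x) + 2L(x^2) = 3L(x) + 3L(x^2)\] The other two follow the same mechanical pattern, which is the point: the student never needs to know what \(L\) actually is.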
Simple exercises like these can help a student grasp the concept of a linear operator in a mechanical way rather than in a conceptual way. Given that the concepts of linear operators don't become crucial until Linear Algebra, I think this sin is a forgivable one, especially in light of the opportunity for facilitating calculus exercises (and thus calculus concepts).

Once the student had a decent grasp of how to manipulate expressions with linear operators, I went back and reintroduced the limit concept from scratch with an emphasis on why we should expect limits to behave like linear operators. I wasn't super rigorous about this, but because of the way limits are usually taught (i.e. using intuition about long-term behavior rather than \(\epsilon-\delta\) neighborhood arguments) I really didn't have to.

After that, I set some more problems for the student, with the goal of making them appear very complex. The idea here was to promote confidence by demonstrating that nasty-looking limit problems could be broken down into separate components because of the fact that the limit is a linear operator. For example, I had the student prove statements similar to the following:

\[\begin{aligned} \lim_{x \to 0}\frac{1}{x} + \lim_{x \to 0}\frac{-1}{x} &= 0\\ \lim_{x \to 0}\left(\frac{\sin(x)}{x}+\cos(x)\right) &= 2\\ \lim_{x \to 0}\frac{\sin(x) + 1}{x} - \lim_{x \to 0}\left(\frac{1}{x} - \cos(x) \right) &= 2\end{aligned}\]

This was particularly successful, because being able to break these problems down into smaller components that could be simplified and cancelled helped the student realize that \(\lim_{x \to 0}\frac{\sin(x)}{x}\) was the only part that really presented a challenge. Once I showed how to prove that this limit is 1, everything else seemed to become second nature.
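The third identity above, for example, unravels once the limit is manipulated as a linear operator: \[\begin{aligned} \lim_{x \to 0}\frac{\sin(x) + 1}{x} - \lim_{x \to 0}\left(\frac{1}{x} - \cos(x)\right) &= \lim_{x \to 0}\frac{\sin(x)}{x} + \lim_{x \to 0}\frac{1}{x} - \lim_{x \to 0}\frac{1}{x} + \lim_{x \to 0}\cos(x) \\ &= \lim_{x \to 0}\frac{\sin(x)}{x} + \lim_{x \to 0}\cos(x) = 1 + 1 = 2\end{aligned}\] After the cancellation, only the two easy pieces remain.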

In later sessions, I introduced the student to derivatives (and, when the time came, integrals) as examples of linear operators. I will discuss those experiences in future posts, but for now I will just note that the student in question quickly became a top performer and remained so for both semesters of the AP Calculus course. Mostly this was because of the student's stellar work ethic, but I am inclined to believe that this linear operator approach facilitated the student's ability to grasp new material more quickly and reinforced confidence by unmasking the hidden tricks in many of the harder textbook problems.