Sunday, March 3, 2013

Using Linear Algebra to Teach Linear Algebra

Linear Algebra is supposed to be the study of linear transformations between vector spaces. However, it can be hard to tell that from the way Linear Algebra classes usually start: with a disconnected, unmotivated survey of row manipulation operations.

To be fair, this discussion isn't entirely unmotivated. It's usually presented in the context of Gaussian elimination, for the purpose of solving systems of equations. While that application is perfectly legitimate, presenting the material only from that perspective unnecessarily narrows its scope in the mind of the student, making it harder to generalize later. The problem is three-fold:
  • row manipulation is presented as something that is specifically "for" equation solving
  • the row manipulation operations are presented as external algorithms
  • the matrix concept is treated as a passive thing (a data structure), rather than an active thing (a transformation).
Why do this? Why introduce extra algorithms to fiddle with values in a 2D array? Linear Algebra already provides an operation powerful enough to do all of this and more: matrix multiplication.

For example, let's start with the following matrix:

\[\left(\begin{array}{ccc} a & b & c \\ d & e & f \\ g & h & i \end{array}\right)\]
Now suppose we want to interchange Row 1 with Row 2. We can do this by multiplying on the left using a special matrix designed for interchanging those rows:

\[
\left(\begin{array}{ccc} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array}\right) *
\left(\begin{array}{ccc} a & b & c \\ d & e & f \\ g & h & i \end{array}\right) =
\left(\begin{array}{ccc} d & e & f \\ a & b & c \\ g & h & i \end{array}\right)
\]
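If you want to experiment with this, here is a minimal sketch using Python's sympy library (one convenient tool among many; the symbolic entries mirror the matrices above):

```python
from sympy import Matrix, symbols

a, b, c, d, e, f, g, h, i = symbols('a b c d e f g h i')
A = Matrix([[a, b, c],
            [d, e, f],
            [g, h, i]])

# The special matrix that interchanges rows 1 and 2 when applied on the left.
E_swap = Matrix([[0, 1, 0],
                 [1, 0, 0],
                 [0, 0, 1]])

print(E_swap * A)  # Matrix([[d, e, f], [a, b, c], [g, h, i]])
```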
Another common row manipulation operation is to add a scalar multiple of one row to another. Let's say we want to triple Row 1 and add those values to Row 3. Again, we can achieve this via left multiplication with a special matrix designed for that purpose:

\[
\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & 0 & 1 \end{array}\right) *
\left(\begin{array}{ccc} a & b & c \\ d & e & f \\ g & h & i \end{array}\right) =
\left(\begin{array}{ccc} a & b & c \\ d & e & f \\ 3a + g & 3b + h & 3c + i \end{array}\right)
\]
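The row-addition matrix checks out the same way (again a sympy sketch, repeated in full so it runs on its own):

```python
from sympy import Matrix, symbols

a, b, c, d, e, f, g, h, i = symbols('a b c d e f g h i')
A = Matrix([[a, b, c],
            [d, e, f],
            [g, h, i]])

# The special matrix that adds 3 * (row 1) to row 3 when applied on the left.
E_add = Matrix([[1, 0, 0],
                [0, 1, 0],
                [3, 0, 1]])

print(E_add * A)
# Matrix([[a, b, c], [d, e, f], [3*a + g, 3*b + h, 3*c + i]])
```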
Two questions arise here: first, how are these special matrices constructed? And second, what is the advantage of doing any of this in the first place?

Constructing these matrices becomes obvious once we invoke one of the fundamental principles of Linear Algebra: the matrix representation of any linear transformation comes from applying that transformation to the identity matrix. (This is almost a tautology: since \(E = E \cdot I\), the matrix of a transformation is literally what the transformation does to the identity.)

So, if you'll notice, our matrix for swapping rows 1 and 2 was constructed by simply swapping rows 1 and 2 of the identity matrix. Likewise, our matrix for adding the triple of row 1 to row 3 was constructed by performing exactly that operation on the identity matrix: tripling its row 1 and adding the result to its row 3.
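In code, that construction really is just "perform the row operation on the identity matrix." A sketch, again assuming sympy (whose mutable matrices support in-place row operations):

```python
from sympy import eye

# Build each elementary matrix by applying its row operation to the identity.
E_swap = eye(3)
E_swap.row_swap(0, 1)  # sympy rows are 0-indexed: this swaps rows 1 and 2

E_add = eye(3)
E_add[2, :] = E_add[2, :] + 3 * E_add[0, :]  # add 3 * (row 1) to row 3

print(E_swap)  # Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]])
print(E_add)   # Matrix([[1, 0, 0], [0, 1, 0], [3, 0, 1]])
```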

That also partially answers the question "What is the advantage?". As a pedagogical tool, this would provide an early opportunity to teach the core notions of Linear Algebra without bogging the student down in what is frequently perceived as accounting homework.

However, there is a further advantage in that tedious row manipulation algorithms can be represented compactly as products of their corresponding matrices. Not only does this allow for an early discussion of the composition of linear transformations, but taking a giant list of row operations and expressing it compactly as a single matrix is an excellent way to demonstrate that Linear Algebra is Powerful.
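As a concrete illustration, the two operations from earlier collapse into one matrix. Note the order: the matrix nearest the operand acts first, so the product is written in the reverse of the order in which the operations are performed. A sketch under the same sympy assumption:

```python
from sympy import Matrix, symbols

a, b, c, d, e, f, g, h, i = symbols('a b c d e f g h i')
A = Matrix([[a, b, c], [d, e, f], [g, h, i]])

E_swap = Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]])  # swap rows 1 and 2
E_add = Matrix([[1, 0, 0], [0, 1, 0], [3, 0, 1]])   # add 3 * (row 1) to row 3

# A single matrix that swaps rows 1 and 2, then adds 3 * (row 1) to row 3.
E = E_add * E_swap

assert E * A == E_add * (E_swap * A)
print(E)  # Matrix([[0, 1, 0], [1, 0, 0], [0, 3, 1]])
```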
