Inverses of Linear Transformations

$\require{amsmath}$
Notice, that the operation that "does nothing" to a two-dimensional vector (i.e., leaves it unchanged) is also a linear transformation, and plays the role of an identity for $2 \times 2$ matrices under "multiplication". We can verify that this is true by quickly examining the two properties of a linear transformation...


Suppose $I(\bar{x}) = \bar{x}$   (i.e., $I$ does not change its input vector.)

  1. Is it true that $I(c\bar{x}) = cI(\bar{x})$ ?

    Yes!
    $$I(c\bar{x}) = c\bar{x} = cI(\bar{x})$$

  2. Is it true that $I(\bar{x}+\bar{y}) = I(\bar{x}) + I(\bar{y})$ ?

    Yes!
    $$I(\bar{x}+\bar{y}) = \bar{x} + \bar{y} = I(\bar{x}) + I(\bar{y})$$


So what does the matrix form of this identity, $I$, look like? Remember, the first and second columns of the matrix form indicate where the vectors $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$ go under the linear transformation.
Since $I$ does nothing to its inputs, the aforementioned vectors are the outputs, and hence the columns of the matrix form. In other words... $$I=\begin{bmatrix}1 & 0\\0 & 1\end{bmatrix}$$


Now that we have established the existence of an identity, we can talk about inverses.

Given any linear transformation, $M$ (in matrix form), does there exist a linear transformation (which we will call $M^{-1}$) that will "undo" the effect of $M$ ? ...and how do we find it?

First, let us make sure that if such a $M^{-1}$ that will "undo" the effect of $M$ actually exists, then it must be a linear transformation...

Let $c$ be a scalar and $\bar{x}$ be some arbitrary vector, and start by trying to show $M^{-1}(c\bar{x}) = cM^{-1}(\bar{x})$. Suppose that there is a vector $\bar{z}$ such that
$$M^{-1}(c\bar{x}) = \bar{z}$$
But then (presuming $c \ne 0$),
$$\begin{array}{rcll}
M(\bar{z}) &=& c \bar{x} & \quad \quad \textrm{now divide by $c$}\\\\
\displaystyle{\frac{M({\bar{z}})}{c}} &=& \bar{x} & \quad \quad \textrm{remember $M$ is linear, so}\\\\
M \left( \displaystyle{\frac{\bar{z}}{c}} \right) &=& \bar{x} & \quad \quad \textrm{now apply $M^{-1}$ to both sides}\\\\
\displaystyle{\frac{\bar{z}}{c}} &=& M^{-1}(\bar{x}) & \quad \quad \textrm{finally, multiply by $c$}\\\\
\bar{z} &=& cM^{-1}(\bar{x}) & \quad \quad
\end{array}$$
However, we know from above that $M^{-1}(c\bar{x}) = \bar{z}$, so
$$M^{-1}(c\bar{x}) = cM^{-1}(\bar{x})$$
Even if $c=0$, we can still come to this conclusion. Consider the following

Suppose $M^{-1}(0 \cdot \bar{x}) = M^{-1}(\bar{0}) = \bar{z}$ for some vector $\bar{z}$. Then, applying $M$ we find $M(\bar{z}) = \bar{0}$. Note, it is trivial to demonstrate $M(\bar{0}) = \bar{0}$ (consider the matrix product). Further, since $M^{-1}$ exists, there must be a unique input vector for $M$ that produces any given output vector, so $M^{-1}(\bar{0}) = \bar{0}$. Lastly, since we know $M^{-1}(\bar{x})$ is some vector, then $0 \cdot M^{-1}(\bar{x}) = \bar{0}$. Putting these things together, we have
$$M^{-1}(0 \cdot \bar{x}) = M^{-1}(\bar{0}) = \bar{0} = 0 \cdot M^{-1}(\bar{x})$$
Hence, the first property of linear transformations holds for $M^{-1}$.

Now, additionally assume that $\bar{y}$ is a second vector. We will attempt to show that $M^{-1}(\bar{x} + \bar{y}) = M^{-1}(\bar{x}) + M^{-1}(\bar{y})$.

Suppose there is a vector $\bar{z}$ such that
$$M^{-1}(\bar{x}) + M^{-1}(\bar{y}) = \bar{z}$$
But then, by applying $M$ to both sides, we have
$$\begin{array}{rcll}
M(M^{-1}(\bar{x}) + M^{-1}(\bar{y}) &=& M(\bar{z}) & \quad \quad \textrm{now appeal to the linearity of $M$}\ldots\\\\
M(M^{-1}(\bar{x})) + M(M^{-1}(\bar{y})) &=& M(\bar{z}) & \quad \quad \textrm{inverses cancel one another, so}\\\\
\bar{x} + \bar{y} &=& M(\bar{z}) & \quad \quad \textrm{now apply $M^{-1}$ to both sides}\ldots\\\\
M^{-1}(\bar{x} + \bar{y}) &=& M^{-1}(M(\bar{z})) & \quad \quad \textrm{again, inverses cancel one another, so}\\\\
M^{-1}(\bar{x} + \bar{y}) &=& \bar{z} & \quad \quad
\end{array}$$
However, we know from above that $M^{-1}(\bar{x}) + M^{-1}(\bar{y}) = \bar{z}$, so
$$M^{-1}(\bar{x} + \bar{y}) = M^{-1}(\bar{x}) + M^{-1}(\bar{y})$$
Hence, the second property for $M^{-1}$ to be a linear transformation holds.

With both of the necessary properties for being a linear transformation holding for $M^{-1}$, it must itself be a linear transformation.

Now, that we know what we are after is a linear transformation, let's consider a specific example with regard to how the matrix form for this linear transformation might be found:

Let us attempt to find (if it exists) the inverse of $$M=\begin{bmatrix}2 & 3\\5 & 7\end{bmatrix}$$
Being a linear transformation, $M^{-1}$ must then also have some matrix form -- let us assume it is given by the following:
$$M^{-1} = \begin{bmatrix}2 & 3\\5 & 7\end{bmatrix}^{-1} = \begin{bmatrix}a & b\\c & d\end{bmatrix}$$
We know that $M^{-1}$ will "undo" $M$, which means for any vector $\bar{v}$, we have:
$$(M^{-1} \circ M)(\bar{v}) = M^{-1}(M(\bar{v})) = \bar{v}$$
Notice, $(M^{-1} \circ M)$ leaves its input vector unchanged -- but that can only mean one thing... $(M^{-1} \circ M)$ must be the identity matrix!

Thus
$$\begin{bmatrix}a & b\\c & d\end{bmatrix} \begin{bmatrix}2 & 3\\5 & 7\end{bmatrix} = \begin{bmatrix}1 & 0\\0 & 1\end{bmatrix}$$
But then, "multiplying" the matrices to compose the underlying linear transformations, we find
$$\begin{align*}
2a+5b &= 1\\
3a+7b &= 0\\
2c+5d &= 0\\
3c+7d &= 1
\end{align*}$$
We can use the first two equations to solve for $a$ and $b$. (We solve this system in the normal way, multiplying each equation by well-chosen constants, so that when two equations are added together, one of the variables is eliminated.)

$\displaystyle{
\begin{align}
7 \cdot 2a + 7 \cdot 5b &= 7\\
-5 \cdot 3a - 5 \cdot 7b &= 0\\
\end{align}}$         $\displaystyle{\begin{align}
3 \cdot 2a + 3 \cdot 5b &= 3\\
-2 \cdot 3a - 2 \cdot 7 b &= 0\\
\end{align}}$

Adding the two equations on the left leaves an equation which can be solved for $a$, while adding the equations on the right leaves an equation that can be solved for $b$.

$$(7 \cdot 2 - 5 \cdot 3)a = 7 \hspace{0.5in} (3 \cdot 5 - 2 \cdot 7)b = 3$$

These in turn provide the solutions:

$$a = \frac{7}{7 \cdot 2 - 5 \cdot 3} \qquad b = \frac{3}{3 \cdot 5 - 2 \cdot 7}$$
We pause here for two reasons. First, note that we left the solution for $a$ and $b$ unsimplified, in that we did not evaluate the denominators. We will continue to leave these denominators unsimplified throughout the rest of the problem. This is so that we can see the form of our solution as it relates to the numbers of our original matrix.

We would like to come up with a shortcut, if you will, to find the inverse of any given matrix quickly. If we simplify things now, spotting that shortcut will be much more difficult.

Second, notice that everything we have done so far we could have done in the context of congruences $\pmod{n}$ instead of equations (as will be the case when we start talking about the Hill cipher). However, "dividing both sides by some coefficient" in a congruence is handled a bit differently. Instead of dividing (and hence, potentially forming a fraction), we will multiply both sides by "multiplicative inverse$\pmod{n}$ of the coefficient in question". This will become more clear later on, but remember that here lies the critical difference between finding inverses of matrices and finding inverses of matrices$\pmod{n}$.

Using the second two equations from our group of four above, we can similarly solve for $c$ and $d$.

We first multiply by well-chosen constants...

$\displaystyle{\begin{align}
-7 \cdot 2c - 7 \cdot 5d &= 0\\
5 \cdot 3c + 5 \cdot 7d &= 5\\
\end{align}}$       $\displaystyle{\begin{align}
-3 \cdot 2c - 3 \cdot 5d &= 0\\
2 \cdot 3c + 2 \cdot 7d &= 2\\
\end{align}}$

Adding the equations on the left, we arrive at an equation that can be solved for $c$, and adding the equations on the right, we get an equation that can be solved for $d$.

$$(5 \cdot 3 - 7 \cdot 2)c = 5 \hspace{0.5in} (2 \cdot 7 - 3 \cdot 5)d = 2$$

Solving these, we see that
$$c = \frac{5}{5 \cdot 3 - 7 \cdot 2} \qquad d = \frac{2}{2 \cdot 7 - 3 \cdot 5}$$
Thus
$$\begin{bmatrix}a & b\\c & d\end{bmatrix} = \begin{bmatrix}\dfrac{7}{7 \cdot 2 - 5 \cdot 3} & \dfrac{3}{3 \cdot 5 - 2 \cdot 7}\\
\dfrac{5}{5 \cdot 3 - 7 \cdot 2} & \dfrac{2}{2 \cdot 7 - 3 \cdot 5}\end{bmatrix}$$
Of course, this looks a little messy, so let's see if we can't tighten it up a bit.

First, notice that if we multiply the upper right and lower left fractions by $\frac{-1}{-1}$, we can get all of the denominators to look the same.
$$\begin{bmatrix}a & b\\c & d\end{bmatrix} = \begin{bmatrix}\dfrac{7}{2 \cdot 7 - 5 \cdot 3} & \dfrac{-3}{2 \cdot 7 - 5 \cdot 3}\\
\dfrac{-5}{2 \cdot 7 - 5 \cdot 3} & \dfrac{2}{2 \cdot 7 - 5 \cdot 3}\end{bmatrix}$$
Now, pulling this common denominator out as a factor sitting outside of the matrix (why can we do this?), we have
$$\begin{bmatrix}a & b\\c & d\end{bmatrix} = \frac{1}{2 \cdot 7 - 5 \cdot 3}\begin{bmatrix}7 & -3\\-5 & 2\end{bmatrix}$$

As such,
$$\begin{bmatrix}2 & 3\\5 & 7\end{bmatrix}^{-1} = \dfrac{1}{2 \cdot 7 - 5 \cdot 3}\begin{bmatrix}7 & -3\\-5 & 2\end{bmatrix}$$
Since, we never simplified any of the arithmetic, we can now clearly see where the numbers went in the final form. This suggests the following "formula" for the matrix form of the inverse of a given linear transformation:

If a linear transformation, $M$, has matrix form
$$M=\begin{bmatrix}x & y\\z & w\end{bmatrix}$$
Then its inverse is given by
$$M^{-1} = \begin{bmatrix}x & y\\z & w\end{bmatrix}^{-1} = \dfrac{1}{x \cdot w - z \cdot y}\begin{bmatrix}w & -y\\-z & x\end{bmatrix}$$
Notice that, depending on the values of $x$, $y$, $z$, and $w$, it is possible that we might have a zero in the denominator of the fraction above. This would be bad in the sense that $M$ would no longer have an inverse. So, the value of the denominator, $(x \cdot w - y \cdot z)$, "determines" whether or not our matrix has an inverse. As such, let us call this special value the determinant of the matrix and denote it in the following way
$$\textrm{det} = \begin{vmatrix}x & y\\z & w\end{vmatrix} = x \cdot w - z \cdot y$$
We can then rewrite the inverse of a matrix as
$$M^{-1} = \begin{bmatrix}x & y\\z & w\end{bmatrix}^{-1} = \textrm{det}^{-1}\begin{bmatrix}w & -y\\-z & x\end{bmatrix}$$