Change of basis

We look at the problem of expressing a vector $p$, whose coordinates relative to some basis are known, in terms of its coordinates relative to a different basis.

Let’s assume that $p$ lies in an $n$-dimensional vector space $V$ defined over the field $F$. Let $A$ and $B$ be two different bases of $V$, consisting of the vectors $\set{\alpha_1, \ldots, \alpha_n}$ and $\set{\beta_1, \ldots, \beta_n}$ respectively.

Let $p_A$ represent the coordinates of $p$ with respect to the basis $A$. Now, how do we go about computing $p_B$?

If the coordinates $p_A$ are $(a_1, \ldots, a_n)$ (with $a_i \in F$), we have:

$$p = a_1\alpha_1 + \ldots + a_n\alpha_n = \sum_{i=1}^{n}a_i\alpha_i$$

Assume that the coordinates $p_B$ are $(b_1, \ldots, b_n)$. It is important to note that when a vector is represented through coordinates relative to a basis, the vector itself remains unchanged. Changing the basis merely changes the representation of the vector, that is, the scalars in the linear combination of the basis vectors. So when the basis changes, the coordinates change in such a way that the new linear combination still represents the same vector.

$$p = \sum_{i=1}^{n}a_i\alpha_i = \sum_{i=1}^{n}b_i\beta_i$$

Since $A$ and $B$ are both bases of the same vector space, we can represent each basis vector of $A$ as a linear combination of the vectors in $B$.

$$\alpha_i = \sum_{j=1}^{n}c_{ij}\beta_j \qquad c_{ij} \in F$$

Using this in the previous equation, we get:

$$p = \sum_{i=1}^{n}\sum_{j=1}^{n}a_ic_{ij}\beta_j = \sum_{i=1}^{n}b_i\beta_i$$

The above equation makes it clear that each of the new coordinates $b_i$ can be obtained as a linear combination of the known coordinates $a_i$. This means that there is a matrix of scalars that transforms the old coordinates into the new ones: $p_B = Pp_A$.
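To make the linear combination explicit, here is a short worked step that uses only the symbols already defined. Swapping the order of summation and collecting the coefficient of each $\beta_j$ gives:

$$\sum_{i=1}^{n}\sum_{j=1}^{n}a_ic_{ij}\beta_j = \sum_{j=1}^{n}\left(\sum_{i=1}^{n}c_{ij}a_i\right)\beta_j = \sum_{j=1}^{n}b_j\beta_j$$

Because coordinates relative to a basis are unique, the coefficients of each $\beta_j$ must agree:

$$b_j = \sum_{i=1}^{n}c_{ij}a_i$$

Reading this as a matrix product $p_B = Pp_A$, the entry of $P$ in row $j$ and column $i$ is $c_{ij}$, so $P$ is the transpose of the array $[c_{ij}]$.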

What does this matrix $P$ consist of exactly? We can go ahead and fill up the matrix with the $c_{ij}$ scalars, but this isn’t very intuitive and doesn’t tell us much about the matrix itself without careful inspection. Let’s try something more intuitive.

Imagine that instead of the basis vectors $\alpha_i$, we use $\alpha_{iB}$, which represent the basis vectors as coordinates with respect to $B$. Now what does $\sum_{i=1}^{n}a_i\alpha_{iB}$ give us? As each of the basis vectors is represented relative to $B$, it should give us the vector $p$ relative to $B$, or $p_B$. The matrix transformation $p_B = Pp_A$ now becomes:

$$\begin{bmatrix} b_1\\ \vdots\\ b_n \end{bmatrix} = \begin{bmatrix} \alpha_{1B} & \ldots & \alpha_{nB} \end{bmatrix}\begin{bmatrix} a_1\\ \vdots\\ a_n \end{bmatrix}$$

Hence the columns of $P$ consist of the basis vectors $\alpha_i$ represented as coordinates in the frame $B$. With this representation, it is easy to see that the matrix $P$ is invertible: its columns are the coordinate representations of basis vectors, which are linearly independent. So we also have:

$$p_A = P^{-1}p_B$$

To summarize, given the coordinates $p_A$ of a vector, to represent it relative to a different basis $B$, all we have to do is represent each of the original basis vectors $\alpha_i$ relative to the new basis and set these as the columns of a matrix $P$. Multiplying $P$ and $p_A$ gives us the required $p_B$.
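The recipe can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the helper names `solve`, `change_of_basis_matrix` and `mat_vec` are made up for this sketch, vectors are assumed to be given as lists of numbers in some common coordinate system, and the basis $\set{[1, 1], [1, -1]}$ in the usage lines is an arbitrary choice for demonstration. Exact arithmetic with `fractions.Fraction` is used to avoid rounding noise.

```python
from fractions import Fraction

def solve(columns, target):
    """Solve sum_j x_j * columns[j] = target by Gauss-Jordan elimination.

    Assumes the vectors in `columns` are linearly independent; otherwise
    the pivot search below raises StopIteration.
    """
    n = len(target)
    # Row r of the augmented matrix holds component r of every column vector,
    # followed by component r of the target.
    aug = [[Fraction(columns[j][r]) for j in range(n)] + [Fraction(target[r])]
           for r in range(n)]
    for col in range(n):
        # Bring a row with a non-zero entry in this column into pivot position.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def change_of_basis_matrix(basis_a, basis_b):
    """Return P as a list of rows; column i is alpha_i in B coordinates."""
    columns = [solve(basis_b, alpha) for alpha in basis_a]
    n = len(columns)
    return [[columns[i][r] for i in range(n)] for r in range(n)]

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Illustrative usage with an assumed basis B = {[1, 1], [1, -1]}.
basis_a = [[1, 0], [0, 1]]
basis_b = [[1, 1], [1, -1]]
P = change_of_basis_matrix(basis_a, basis_b)
p_b = mat_vec(P, [3, 1])
# p_b equals [2, 1], since 2*[1, 1] + 1*[1, -1] = [3, 1].
```

Each column of $P$ is found by solving one linear system, which is the computational form of "represent $\alpha_i$ relative to $B$".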


Example

Let’s go over a quick example showcasing this change of basis. Consider the two-dimensional vector space $\mathbb{R}^2$ over the field of real numbers $\mathbb{R}$. Let the standard basis for this space be $A = \set{\alpha_1, \alpha_2} = \set{[1, 0], [0, 1]}$.

Consider a vector $p = [5, 0]$. Using the standard basis, we have $p_A = (5, 0)$.

Introducing a new basis $B = \set{\beta_1, \beta_2} = \set{[0, -1], [1, 0]}$, we want to find the representation of $p$ relative to $B$. If the standard basis $A$ represents vectors along the $x$ and $y$ axes, $B$ represents vectors along the $-y$ and $x$ axes. Geometrically, we can say that the coordinate axes of $B$ are those of $A$ rotated clockwise by an angle of $90^\circ$ ($-90^\circ$ using positive angles for anti-clockwise rotations).

Representing $A$ relative to $B$,

$$\alpha_{1B} = (0, 1) \qquad \alpha_{2B} = (-1, 0)$$

This is because $0\beta_1 + 1\beta_2 = [1, 0] = \alpha_1$ and $-1\beta_1 + 0\beta_2 = [0, 1] = \alpha_2$. We can construct the matrix $P$ as:

$$P = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$$

$$p_B = \begin{bmatrix} 0 & -1\\ 1 & 0 \end{bmatrix}\begin{bmatrix} 5\\ 0 \end{bmatrix} = \begin{bmatrix} 0\\ 5 \end{bmatrix}$$

We can verify that this $p_B$ is correct by taking the linear combination of the new basis vectors: $0\beta_1 + 5\beta_2 = [5, 0] = p$. Geometrically, $p_B$ lies in the $y$ direction relative to the new coordinates, which is the $x$ direction with respect to the standard basis.
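The same check can be run numerically. The snippet below is a small sketch in plain Python using the values from this example; the variable names are chosen here for illustration only.

```python
# Basis vectors of B and the matrix P from the example.
beta_1, beta_2 = (0, -1), (1, 0)
P = [[0, -1],
     [1, 0]]
p_A = [5, 0]

# p_B = P p_A, written out as a row-by-row dot product.
p_B = [sum(P[r][c] * p_A[c] for c in range(2)) for r in range(2)]

# Rebuild p from its B coordinates: p = b_1 * beta_1 + b_2 * beta_2.
p = tuple(p_B[0] * beta_1[k] + p_B[1] * beta_2[k] for k in range(2))

print(p_B)  # [0, 5]
print(p)    # (5, 0)
```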

Rotating the basis by $-90^\circ$ in this case is equivalent to rotating the vector $p$ by $90^\circ$. We can construct the rotation matrix:

$$Rot(90^\circ) = \begin{bmatrix} \cos{90^\circ} & -\sin{90^\circ}\\ \sin{90^\circ} & \cos{90^\circ} \end{bmatrix} = \begin{bmatrix} 0 & -1\\ 1 & 0 \end{bmatrix}$$

As we can see, the rotation matrix and $P$ are one and the same. In this case, the change of basis can be interpreted as the vector being rotated.
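As a rough numerical check of this interpretation, the sketch below builds the rotation matrix with Python's `math` module and compares it with $P$. The helpers `rotation_matrix`, `apply` and `close` are names invented for this sketch, and comparisons use a small tolerance because $\cos{90^\circ}$ is not exactly zero in floating point.

```python
import math

def rotation_matrix(degrees):
    """2x2 anti-clockwise rotation matrix for an angle given in degrees."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]

def apply(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def close(u, v, tol=1e-12):
    """Component-wise comparison within an absolute tolerance."""
    return all(math.isclose(x, y, abs_tol=tol) for x, y in zip(u, v))

P = [[0, -1],
     [1, 0]]
R = rotation_matrix(90)

# The rotation by +90 degrees agrees with P up to rounding.
same_matrix = all(close(R[r], P[r]) for r in range(2))

# Rotating p = [5, 0] by +90 degrees gives approximately [0, 5], i.e. p_B.
rotated_p = apply(R, [5, 0])

# Rotating the standard basis by -90 degrees gives the basis B.
R_neg = rotation_matrix(-90)
basis_matches = (close(apply(R_neg, [1, 0]), [0, -1])
                 and close(apply(R_neg, [0, 1]), [1, 0]))

print(same_matrix, basis_matches)  # True True
```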
