Matrices and Vectors
A matrix is a rectangular array of numbers enclosed by square brackets. For example:
[  1  2  3 ]
[  4  5  6 ]
[  7  8  9 ]
[ 10 11 12 ]
This is a 4×3 matrix (4 rows, 3 columns). The dimension is always given as number of rows × number of columns.
Vectors are a special type of matrix with only one column. An n-dimensional vector has n rows and 1 column:
[1]
[2]
[3]
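As a rough sketch of how these shapes look in code, a matrix can be modeled as a list of rows in plain Python. This representation and the helper name `shape` are assumptions made for illustration; a numerical library would normally supply both.

```python
# A matrix modeled as a list of rows (an illustrative assumption).
A = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12],
]

# A vector modeled as a matrix with a single column.
v = [[1], [2], [3]]


def shape(matrix):
    """Return (rows, columns) for a non-empty list-of-rows matrix."""
    return len(matrix), len(matrix[0])


print(shape(A))  # (4, 3): 4 rows, 3 columns
print(shape(v))  # (3, 1): a 3-dimensional vector
```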
Notation
- \(A_{ij}\) refers to the element at row i, column j of matrix A.
- \(v_i\) refers to the i-th element of vector v.
- In mathematical notation, vectors are typically 1-indexed: the first element is \(v_1\).
- Matrices are denoted by uppercase letters (A, B, X), vectors by lowercase (a, b, x, y).
- Scalars are single real numbers.
- \(\mathbb{R}\) is the set of real numbers; \(\mathbb{R}^n\) is the set of n-dimensional real vectors.
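Python lists are 0-indexed, so the 1-indexed notation above has to be shifted by one when it is translated to code. A minimal sketch, assuming a list-of-rows matrix and a flat list for the vector; `element` is a hypothetical helper, not part of any library:

```python
A = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12],
]
v = [1, 2, 3]


def element(matrix, i, j):
    """Return A_ij, where i and j follow the 1-indexed math convention."""
    return matrix[i - 1][j - 1]


print(element(A, 1, 2))  # 2   (row 1, column 2)
print(element(A, 4, 3))  # 12  (row 4, column 3)
print(v[2 - 1])          # 2   (v_2, the second element of v)
```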
Addition and Scalar Multiplication
Both addition and scalar multiplication are element-wise operations.
Matrix Addition/Subtraction
Two matrices must have the same dimensions to be added or subtracted:
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} w & x \\ y & z \end{bmatrix} = \begin{bmatrix} a+w & b+x \\ c+y & d+z \end{bmatrix} \]
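The rule above can be sketched in a few lines of plain Python. The function name `mat_add` and the list-of-rows representation are assumptions for illustration; the dimension check mirrors the requirement that both matrices have the same size.

```python
def mat_add(A, B):
    """Element-wise sum of two matrices with identical dimensions."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same dimensions")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 0], [2, 5]]
B = [[4, 0.5], [2, 5]]
print(mat_add(A, B))  # [[5, 0.5], [4, 10]]
```

Subtraction follows the same pattern with `a - b` in place of `a + b`.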
Scalar Multiplication/Division
Multiply (or divide) every element by the scalar:
\[ k \times \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} k \cdot a & k \cdot b \\ k \cdot c & k \cdot d \end{bmatrix} \]
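A short sketch of the same operation in plain Python, assuming a list-of-rows matrix; `scalar_mul` is a hypothetical name. Division by a scalar k is treated here as multiplication by 1/k.

```python
def scalar_mul(k, A):
    """Multiply every element of matrix A by the scalar k."""
    return [[k * a for a in row] for row in A]


A = [[1, 0], [2, 5]]
print(scalar_mul(3, A))  # [[3, 0], [6, 15]]

# Dividing by 4 is the same as multiplying by 1/4.
print(scalar_mul(1 / 4, [[4, 0], [6, 3]]))  # [[1.0, 0.0], [1.5, 0.75]]
```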
Matrix-Vector Multiplication
To multiply a matrix \(A\) (size \(m \times n\)) by a vector \(x\) (size \(n \times 1\)):
- The number of columns in \(A\) must equal the number of rows in \(x\).
- The result is a vector \(y\) of size \(m \times 1\).
- Each element \(y_i\) is the dot product of the i-th row of \(A\) with \(x\).
\[ \begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix} \cdot \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \cdot x + b \cdot y \\ c \cdot x + d \cdot y \\ e \cdot x + f \cdot y \end{bmatrix} \]
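The row-by-row dot product described above can be sketched as follows. This is an illustrative plain-Python version, not a library routine: `mat_vec` is a hypothetical name, and the vector is stored as a flat list rather than as a one-column matrix to keep the code short.

```python
def mat_vec(A, x):
    """Multiply an m x n matrix A by a length-n vector x.

    Returns a list of length m whose i-th entry is the dot product
    of row i of A with x.
    """
    if any(len(row) != len(x) for row in A):
        raise ValueError("number of columns in A must equal the length of x")
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = [[1, 3], [4, 0], [2, 1]]  # 3 x 2
x = [1, 5]                    # length 2
print(mat_vec(A, x))          # [16, 4, 7]
```

The result has one entry per row of the matrix, matching the \(m \times 1\) size stated above.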
Example: Applying a Hypothesis to Multiple Houses
Suppose we have a hypothesis \(h(x) = \theta_0 + \theta_1 x\) for predicting house prices from size. For four houses with sizes [2104, 1416, 1534, 852] and parameters \(\theta_0 = -40\), \(\theta_1 = 0.25\), we can compute all predictions in one operation:
\[ \begin{bmatrix} 1 & 2104 \\ 1 & 1416 \\ 1 & 1534 \\ 1 & 852 \end{bmatrix} \cdot \begin{bmatrix} -40 \\ 0.25 \end{bmatrix} = \begin{bmatrix} h(2104) \\ h(1416) \\ h(1534) \\ h(852) \end{bmatrix} \]
This matrix-vector multiplication produces all four predicted prices in a single step, which is both simpler to write and, with an optimized linear algebra library, more computationally efficient than looping through each house individually. This technique is the foundation of vectorized implementations in machine learning.
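To make the house-price example concrete, the sketch below builds the data matrix from the four sizes and multiplies it by the parameter vector. The helper `mat_vec` and the variable names are assumptions for illustration. This pure-Python version only demonstrates the arithmetic; any speed advantage would come from an optimized linear algebra library, not from this code.

```python
def mat_vec(A, x):
    """Multiply an m x n matrix A by a length-n vector x (flat list)."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("number of columns in A must equal the length of x")
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


sizes = [2104, 1416, 1534, 852]
theta = [-40, 0.25]  # [theta_0, theta_1]

# Each row is [1, size]; the leading 1 multiplies theta_0.
X = [[1, size] for size in sizes]

predictions = mat_vec(X, theta)
print(predictions)  # [486.0, 314.0, 343.5, 173.0]

# The same predictions computed one house at a time, for comparison.
looped = [theta[0] + theta[1] * size for size in sizes]
print(predictions == looped)  # True
```

For instance, the first prediction is \(h(2104) = -40 + 0.25 \cdot 2104 = 486\).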