Linear Algebra Review for Machine Learning: Matrices, Vectors, and Basic Operations

Matrices and Vectors

A matrix is a rectangular array of numbers enclosed by square brackets. For example:

[ 1 2 3 ]
[ 4 5 6 ]
[ 7 8 9 ]
[10 11 12]

This is a 4×3 matrix (4 rows, 3 columns). The dimension is always given as number of rows × number of columns.

Vectors are a special type of matrix with only one column. An n-dimensional vector has n rows and 1 column:

[1]
[2]
[3]
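
The matrix and vector above can be represented concretely; as a sketch (the notes don't mention any particular library, but NumPy is the standard choice in Python):

```python
import numpy as np

# The 4x3 matrix from above: 4 rows, 3 columns
A = np.array([[ 1,  2,  3],
              [ 4,  5,  6],
              [ 7,  8,  9],
              [10, 11, 12]])

# The 3-dimensional column vector: 3 rows, 1 column
v = np.array([[1],
              [2],
              [3]])

print(A.shape)  # (4, 3) -- always (rows, columns)
print(v.shape)  # (3, 1)
```

The `.shape` attribute reports dimensions in the same rows × columns order used in the mathematical notation.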

Notation

  • (A_{ij}) refers to the element at row i, column j of matrix A.
  • (v_i) refers to the i-th element of vector v.
  • We typically use 1-indexed vectors (indices start at 1) in mathematical notation.
  • Matrices are denoted by uppercase letters (A, B, X), vectors by lowercase (a, b, x, y).
  • Scalars are single real numbers.
  • (\mathbb{R}) is the set of real numbers; (\mathbb{R}^n) is the set of n-dimensional real vectors.
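
One practical caveat worth illustrating: the notation above is 1-indexed, while most programming languages (including Python/NumPy, used here as an assumed example) are 0-indexed, so (A_{ij}) maps to `A[i-1, j-1]` in code:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])

# Mathematical A_{1,2} = 2 (row 1, column 2, 1-indexed)
# In 0-indexed NumPy this is A[0, 1]
print(A[0, 1])  # 2
```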

Addition and Scalar Multiplication

Both addition and scalar multiplication are element-wise operations.

Matrix Addition/Subtraction

Two matrices must have the same dimensions to be added or subtracted:

[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} w & x \\ y & z \end{bmatrix} = \begin{bmatrix} a+w & b+x \\ c+y & d+z \end{bmatrix} ]
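
In code, element-wise addition of two same-sized matrices is a single operator; a minimal sketch using NumPy (an assumption, as the notes are library-agnostic):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Addition is element-wise and requires identical dimensions
C = A + B
print(C)
# [[ 6  8]
#  [10 12]]
```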

Scalar Multiplication/Division

Multiply (or divide) every element by the scalar:

[ k \times \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} k \cdot a & k \cdot b \\ k \cdot c & k \cdot d \end{bmatrix} ]
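
Scalar multiplication and division follow the same element-wise pattern; a sketch (again assuming NumPy):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

# Every element is multiplied (or divided) by the scalar
scaled = 3 * A
halved = A / 2

print(scaled)
# [[ 3  6]
#  [ 9 12]]
print(halved)
# [[0.5 1. ]
#  [1.5 2. ]]
```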

Matrix-Vector Multiplication

To multiply a matrix (A) (size (m \times n)) by a vector (x) (size (n \times 1)):

  • The number of columns in (A) must equal the size of (x).
  • The result is a vector (y) of size (m \times 1).
  • Each element (y_i) is computed by taking the dot product of the i-th row of (A) with (x).

[ \begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix} \cdot \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a \cdot x + b \cdot y \\ c \cdot x + d \cdot y \\ e \cdot x + f \cdot y \end{bmatrix} ]
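
The same 3×2-times-2×1 product can be sketched with NumPy's `@` operator (matrix multiplication); the specific numbers here are illustrative:

```python
import numpy as np

# A is 3x2, x is 2x1: columns of A match the size of x
A = np.array([[1, 2],
              [3, 4],
              [5, 6]])
x = np.array([[7],
              [8]])

# Each entry of y is the dot product of a row of A with x
y = A @ x  # result is 3x1
print(y)
# [[23]    1*7 + 2*8
#  [53]    3*7 + 4*8
#  [83]]   5*7 + 6*8
```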

Example: Applying a Hypothesis to Multiple Houses

Suppose we have a hypothesis (h(x) = \theta_0 + \theta_1 x) for predicting house prices from size. For four houses with sizes [2104, 1416, 1534, 852] and parameters (\theta_0 = -40), (\theta_1 = 0.25), we can compute all predictions in one operation:

[ \begin{bmatrix} 1 & 2104 \\ 1 & 1416 \\ 1 & 1534 \\ 1 & 852 \end{bmatrix} \cdot \begin{bmatrix} -40 \\ 0.25 \end{bmatrix} = \begin{bmatrix} h(2104) \\ h(1416) \\ h(1534) \\ h(852) \end{bmatrix} ]

This matrix-vector multiplication produces all four predicted prices in a single step, which is both simpler and more computationally efficient than looping through each house individually. This technique is the foundation of vectorized implementations in machine learning.
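
The house-price example above can be sketched in NumPy; the design matrix gets a column of ones so that (\theta_0) is applied to every house:

```python
import numpy as np

# Sizes of the four houses
sizes = np.array([2104, 1416, 1534, 852])

# Design matrix: a column of ones (multiplied by theta_0) next to the sizes
X = np.column_stack([np.ones_like(sizes), sizes])

# Parameters theta_0 = -40, theta_1 = 0.25
theta = np.array([-40, 0.25])

# One matrix-vector product yields all four predictions at once
predictions = X @ theta
print(predictions)  # h(2104)=486, h(1416)=314, h(1534)=343.5, h(852)=173
```

For example, the first entry is (1 \cdot (-40) + 2104 \cdot 0.25 = 486).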

Tags: linear algebra matrices vectors matrix multiplication scalar multiplication

Posted on Fri, 08 May 2026 11:11:33 +0000 by atstein