Coordinate Systems and Projection Matrix Debugging in GAMES101 Assignment 1

Coordinate Systems and Transformations

A coordinate system provides a framework for describing spatial information. Typically, a local frame is established with the observer at the origin, allowing descriptions like "three meters ahead." However, in a shared environment, multiple observers lead to conflicting descriptions (e.g., "3m ahead" vs. "3m behind"). To resolve this, a unified World Coordinate System is required to map local observations to a single, consistent global position.

Frame Definition

Geometrically, a frame consists of an origin and a basis (a set of three vectors). Orthonormal bases are standard due to their mathematical convenience. In a frame with origin $\mathbf{p}$ and basis $\{\mathbf{u}, \mathbf{v}, \mathbf{w}\}$, the coordinates $(u, v, w)$ describe the point:

$$ \mathbf{p} + u \mathbf{u} + v\mathbf{v} + w\mathbf{w} $$

In computer graphics, these vectors must be represented relative to a canonical system, usually the "World" coordinates. For 2D right-handed systems, we typically use origin $o$ and basis vectors $x$ (right) and $y$ (up).
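The formula above translates directly into code. A minimal sketch (plain `std::array` instead of Eigen so the snippet stands alone; the frame values in the usage are made up for illustration):

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<float, 3>;

// World position of the point with frame coordinates (u, v, w) in the
// frame with origin p and basis {U, V, W}: p + u*U + v*V + w*W.
Vec3 frame_to_world(const Vec3& p, const Vec3& U, const Vec3& V, const Vec3& W,
                    float u, float v, float w)
{
    Vec3 out{};
    for (int i = 0; i < 3; ++i)
        out[i] = p[i] + u * U[i] + v * V[i] + w * W[i];
    return out;
}
```

With the canonical world basis, the frame coordinates simply offset the origin: a point at $(1, 1, 1)$ in a frame with origin $(1, 2, 3)$ lands at $(2, 3, 4)$ in world space.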

Vector Properties

A vector is defined solely by magnitude and direction; it is independent of any starting point. In the world coordinate system, a vector $\mathbf{a}$ remains $\mathbf{a}$ regardless of where it is translated, provided its direction and length are preserved.

Camera Implementation (Local to World)

In the assignment, the camera coordinate system has its origin at position $e$. The basis vectors align with the world axes, so the view transform reduces to a pure translation, and the camera looks down the negative Z-axis.

Eigen::Vector3f camera_pos = {0, 0, 5};

Eigen::Matrix4f compute_view_transform(Eigen::Vector3f pos)
{
    Eigen::Matrix4f view_matrix = Eigen::Matrix4f::Identity();

    // Translate the camera position to the origin. Since the camera axes
    // already coincide with the world axes, no rotation is needed.
    Eigen::Matrix4f translation;
    translation << 1, 0, 0, -pos[0], 
                   0, 1, 0, -pos[1], 
                   0, 0, 1, -pos[2], 
                   0, 0, 0, 1;

    view_matrix = translation * view_matrix;
    return view_matrix;
}
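Because the view transform here is a pure translation by $-e$, it can be sanity-checked with plain subtraction. A sketch with `std::array` instead of Eigen so it stands alone (the numbers match the scenario discussed below: camera at $z=5$, triangle at $z=-2$):

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<float, 3>;

// View transform for a camera whose axes coincide with the world axes:
// a pure translation by -eye, equivalent to the Eigen matrix above.
Vec3 world_to_camera(const Vec3& eye, const Vec3& p)
{
    return { p[0] - eye[0], p[1] - eye[1], p[2] - eye[2] };
}
```

A world-space point at $z=-2$ seen from a camera at $z=5$ ends up at $z=-7$ in camera space.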

Perspective Projection and Flipping Issues

The rotation matrix logic involves decomposing a vector into components parallel and perpendicular to the rotation axis. The parallel component remains unchanged, while the perpendicular component rotates.
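This decomposition leads to Rodrigues' rotation formula: for a unit axis $\mathbf{n} = (n_x, n_y, n_z)$ and angle $\alpha$,

$$ R(\mathbf{n}, \alpha) = \cos\alpha \, \mathbf{I} + (1 - \cos\alpha)\, \mathbf{n}\mathbf{n}^{T} + \sin\alpha \begin{pmatrix} 0 & -n_z & n_y \\ n_z & 0 & -n_x \\ -n_y & n_x & 0 \end{pmatrix} $$

where the last matrix is the cross-product matrix of $\mathbf{n}$: the $\mathbf{n}\mathbf{n}^{T}$ term keeps the parallel component fixed, while the remaining terms rotate the perpendicular component.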

Analyzing the Inverted Triangle

Assuming the following:

  1. The projection matrix parameters (FOV, aspect ratio) are correct.
  2. The World Coordinate system has X/Y on the ground plane and Z pointing up (Right-handed).
  3. The view matrix is implemented correctly, looking towards $-Z$.
  4. The triangle lies flat on the ground (parallel to X/Y) at a depth of -2m.

With the camera at $z=5$, the triangle is at $z=-2$ in world space, translating to $z=-7$ in camera space.

The Problem: Standard perspective projection implements "foreshortening" where $y' = (n/z) \cdot y$.

  • Instructor's Impl: Assuming $n$ and $f$ (near/far) are negative (e.g., $-0.1, -50$), $(n/z)$ is positive. The image scales but does not flip vertically. The transformed $z' = n + f - (nf/z)$ remains negative.
  • Code's Impl: The provided code uses positive $zNear=0.1$ and $zFar=50$. Since $z$ is negative ($-7$), the term $(n/z)$ becomes negative. This scales the image and flips the Y-axis.
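The sign effect is easy to verify with a one-line foreshortening computation (illustrative values: $z=-7$ as above, with near planes of $-0.1$ and $0.1$ respectively):

```cpp
#include <cassert>

// Foreshortened y' = (n / z) * y for a point at camera-space depth z.
float foreshorten_y(float n, float z, float y)
{
    return (n / z) * y;
}
```

With $n=-0.1$ and $z=-7$ the ratio $n/z$ is positive and the sign of $y$ is preserved; with $n=0.1$ it is negative and the image flips.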

Furthermore, the standard pipeline later remaps the negative Z values into a positive range for depth testing. However, the code performs no explicit clipping against the viewing frustum in homogeneous coordinates; it proceeds directly to perspective division (vert /= vert.w()) and the viewport transformation. The viewport transform maps Y coordinates assuming a top-left origin (0,0), whereas intuitive display expects a bottom-left origin.

Solution: Change the near and far planes to negative values (e.g., $-0.1, -50$) to match the right-handed coordinate system looking down $-Z$.
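For reference, the squish matrix consistent with the convention used above (signed near/far planes $n$ and $f$, both negative) is

$$ M_{persp \to ortho} = \begin{pmatrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n+f & -nf \\ 0 & 0 & 1 & 0 \end{pmatrix} $$

Applying it to $(x, y, z, 1)$ gives $(nx, ny, (n+f)z - nf, z)$, and dividing by $w = z$ reproduces exactly the relations above: $y' = (n/z)\,y$ and $z' = n + f - nf/z$.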

Viewport Transformation and Buffer Writes

OpenCV Mat displays images with the origin at the top-left. To render correctly for human viewing (bottom-left origin), we must flip the Y-coordinate when writing to the framebuffer.

Original Flawed Logic:

void rst::rasterizer::set_pixel(const Eigen::Vector3f& point, const Eigen::Vector3f& color)
{
    // Maps (0,0) to top-left, but the index calculation is off by one
    if (point.x() < 0 || point.x() >= width ||
        point.y() < 0 || point.y() >= height) return;
    
    // Bug: If point.y() is 0, height - 0 = height, causing out-of-bounds access
    auto idx = (height - point.y()) * width + point.x(); 
    frame_buf[idx] = color;
}

If point.y() is 0, the index becomes height * width, which is one past the valid buffer range $[0, height \times width)$. std::vector's operator[] performs no bounds checking, so the out-of-range write is undefined behavior: it may not fail immediately, but it corrupts memory.
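The off-by-one is easy to confirm with small numbers (a 4x4 buffer holds indices 0 through 15):

```cpp
#include <cassert>

// Reproduces the flawed index calculation from set_pixel above.
int flawed_index(int height, int width, int x, int y)
{
    return (height - y) * width + x;
}
```

For height = width = 4, the pixel (0, 0) maps to index 16, one past the last valid index 15, while any y > 0 stays in range.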

Corrected Logic:

void rst::rasterizer::set_pixel(const Eigen::Vector3f& point, const Eigen::Vector3f& color)
{
    if (point.x() < 0 || point.x() >= width ||
        point.y() < 0 || point.y() >= height) return;
        
    // Subtract 1 to ensure index stays within [0, size-1]
    auto idx = (height - 1 - point.y()) * width + point.x();
    frame_buf[idx] = color;
}

Note that the viewport transformation also maps the Z-buffer depth from the projected range to the normalized device coordinates (NDC) range for depth testing.
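A sketch of that depth remap as an affine map, assuming the NDC depth lies in $[-1, 1]$ and the hard-coded 0.1/50 planes of the assignment framework (adjust the constants if the planes change):

```cpp
#include <cassert>
#include <cmath>

// Affine map from NDC depth in [-1, 1] to the [zNear, zFar] range
// stored in the depth buffer (0.1 and 50 assumed here).
float ndc_to_depth(float z_ndc, float z_near = 0.1f, float z_far = 50.0f)
{
    float f1 = (z_far - z_near) / 2.0f;
    float f2 = (z_far + z_near) / 2.0f;
    return z_ndc * f1 + f2;
}
```

The endpoints check out: $-1$ maps to the near value 0.1 and $+1$ to the far value 50.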

Tags: Computer Graphics, Coordinate Systems, Transformation Matrix, Perspective Projection, Viewport Transform

Posted on Fri, 08 May 2026 15:30:08 +0000 by chaddsuk