SQL Query Strategies: Handling NULL Values in Referee Filtering

Problem Context

The objective is to retrieve a list of customer names from a database table, specifically exclduing those who were referred by the customer with ID 2. A naive approach might involve a direct comparison operator in the WHERE clause.

SELECT name 
FROM customer 
WHERE referee_id <> 2;

However, executing this query often yields unexpected results, failing to return rows where the referee_id is NULL. This discrepancy arises from the SQL standard's implementation of logic.

The Logic of NULLs

SQL databases operate on a three-valued logic system comprising TRUE, FALSE, and UNKNOWN. In this system, any comparison operation involving a NULL value—such as NULL = 2 or NULL <> 2—automatically evaluates to UNKNOWN. Since the WHERE clause only filters for rows that evaluate to TRUE, rows resulting in FALSE or UNKNOWN are discarded.

Robust Query Solutions

To accurately capture the required data, the query must explicitly account for the UNKNOWN state. There are two primary methods to handle this.

Method 1: Explicit NULL Checking

The most standard SQL approach involves expanding the condition to explicitly include NULL values using the IS NULL predicate.

SELECT name 
FROM customer 
WHERE referee_id <> 2 
   OR referee_id IS NULL;

Method 2: Value Substitution

Alternatively, one can use a function to substitute the NULL value with a placeholder (such as 0) that will satisfy the inequality condition. The COALESCE function (or IFNULL in MySQL) is suitable for this.

SELECT name 
FROM customer 
WHERE COALESCE(referee_id, 0) <> 2;

Deep Dive: Three-Valued Logic

The Nature of NULL

In databace theory, NULL serves as a marker for the absence of a value, which generally falls into two categories:

  • Unknown: The value exists but is currently missing (e.g., the color of a person's eyes in a photo where they are wearing sunglasses).
  • Inapplicable: The attribute does not exist for the entity (e.g., the email adress of a rock).

While these concepts are distinct logically, SQL Relational Database Management Systems (RDBMS) typically treat them as a single NULL state.

Predicates and NULLs

Predicates in SQL (such as =, <, BETWEEN, IN) are functions that return boolean values. Because NULL is not a value itself but a placeholder, applying a comparison predicate to it is semantically invalid, resulting in UNKNOWN.

This explains why IS NULL is required rather than = NULL. The IS NULL construct should be viewed as a single, atomic predicate (conceptually similar to IS_NULL) designed specifically to detect the presence of this marker. Unlike standard comparison operators, it returns TRUE when the marker is detected.

Logical Operations with UNKNOWN

When combining UNKNOWN with standard boolean values, the result follows specific priority rules. It is crucial to distinguish between the boolean value unknown and the SQL NULL marker.

  • NOT unknown results in unknown.
  • unknown = unknown results in true (treating them as the same boolean state).
  • UNKNOWN = UNKNOWN (comparing the markers) results in unknown.

Understanding the interaction in AND and OR operations is vital for writing complex conditions:

-- OR Operations: TRUE dominates
TRUE OR unknown     => TRUE
unknown OR unknown  => unknown
FALSE OR unknown    => unknown

-- AND Operations: FALSE dominates
TRUE AND unknown    => unknown
unknown AND unknown => unknown
FALSE AND unknown   => FALSE

A useful mnemonic for precedence is:

  • For AND: FALSE > unknown > TRUE
  • For OR: TRUE > unknown > FALSE

Tags: sql MySQL Database Optimization Three-Valued Logic Null Handling

Posted on Sun, 07 Jun 2026 17:20:07 +0000 by ollmorris