SQL Aggregate Functions: Monthly Transaction Analysis

Prerequisite Knowledge

The DATE_FORMAT() function converts date and time values into different string representasions. It accepts two parameters: a valid date value and a format string that specifies the output pattern.

DATE_FORMAT(date, expression)

Parameters:

  • date: The date value to be formatted
  • expression: A string defining the output format

Common Format Specifiers:

  • %Y: Four-digit year (e.g., 2024)
  • %m: Two-digit month (01-12)
  • %d: Two-digit day of month (01-31)
  • %H: 24-hour format hour (00-23)
  • %i: Two-digit minutes (00-59)
  • %s: Two-digit seconds (00-59)

Example Usage:

SELECT
    product_name,
    DATE_FORMAT(sale_date, '%Y-%m-%d') AS formatted_date
FROM SalesRecords;

Problem Analysis: Monthly Transaction Aggregation

This problem requires grouping transacsions by month and country, then calculating several metrics including total transaction count, approved transaction count, total amount, and approved transaction amount.

Incorrect Approach

SELECT 
    DATE_FORMAT(transaction_date, '%Y-%m') AS month,
    country_code,
    COUNT(CASE WHEN status = 'approved' THEN 1 END) AS approved_count
    COUNT(month) AS transaction_count,
    COUNT(sale_amount) AS total_amount,
    COUNT(CASE WHEN status = 'approved' THEN sale_amount END) AS approved_amount
FROM FinancialTransactions
GROUP BY month, country_code;

Correct Implementation

SELECT DATE_FORMAT(transaction_date, '%Y-%m') AS month,
    country_code,
    COUNT(*) AS transaction_count,
    COUNT(IF(status = 'approved', 1, NULL)) AS approved_count,
    SUM(sale_amount) AS total_amount,
    SUM(IF(status = 'approved', sale_amount, 0)) AS approved_amount
FROM FinancialTransactions
GROUP BY month, country_code

Detailed Error Analysis

Point 1: Conditional Counting Equivalence

Both approaches below produce identical results. When the condition is not met, the CASE expression implicitly returns NULL, which COUNT ignores:

COUNT(CASE WHEN status = 'approved' THEN 1 END) AS approved_count
COUNT(IF(status = 'approved', 1, NULL)) AS approved_count

Point 2: COUNT Behavior with NULL Values

Understanding COUNT's behavior is crucial:

  • COUNT(*) counts all rows in the result set, regardless of NULL values
  • COUNT(column_name) counts only rows where the specified column contains a non-NULL value
  • Use COUNT(column_name) when you want to exclude NULL entries or when you know the column contains no NULL values

Point 3: SUM vs COUNT for Numerical Aggregation

The critical error in the incorrect approach is using COUNT on numerical columns. COUNT returns the number of non-NULL values, not their sum. When you need to calculate the total of numerical values, always use SUM:

Point 4: Handling NULL in SUM Expressions

These two formulations are equivalent. Remember that SQL uses = for comparison, not ==:

SUM(CASE WHEN status = 'approved' THEN sale_amount ELSE 0 END) AS approved_amount
SUM(IF(status = 'approved', sale_amount, 0)) AS approved_amount

Important: When using conditional SUM with CASE expressions, always include an ELSE clause that returns 0. Without this, non-matching rows produce NULL, and SUM(NULL) results in NULL rather then 0.

Point 5: Grouping Consideration

The GROUP BY clause should include both the formatted month and the country identifier. This ensures transactions are properly segmented by both time period and geographic region.

Tags: MySQL sql date_format aggregate-functions sum

Posted on Mon, 18 May 2026 17:48:48 +0000 by yashvant