Mastering Oracle Analytic and Window Functions

Understanding Window Functions

Oracle’s window functions enable calculations across a set of table rows that are related to the current row, without collapsing the result set into a single aggregated output like standard GROUP BY queries. These operations rely on the OVER() clause, which defines three primary analytical components:

  • PARTITION BY: Divides the result set into logical groups.
  • ORDER BY: Determines the sequential arrangement of rows within each partition.
  • Frame Clause (ROWS/RANGE): Specifies the physical or logical boundaries of the calculation window.

Syntax Template:

function_name(argument_list) OVER (
  [PARTITION BY partition_column]
  [ORDER BY sort_column]
  [frame_clause]
)

Note: The OVER() evaluation occurs after all filtering and grouping clauses are processed. If an outer ORDER BY matches the inner one, the database optimizer may skip redundant sorting steps.

Frame Clauses: ROWS vs RANGE

The frame clause restricts the subset of rows participating in the calculation. It must be combined with an ORDER BY clause inside OVER().

  • RANGE: Defines a logical window based on values. All rows with identical ORDER BY values fall within the same range.
  • ROWS: Defines a physical window based on row positions relative to the current row.

If omitted alongside ORDER BY, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Without ORDER BY, the entire partition constitutes the window.

WITH transaction_log AS (
  SELECT level AS txn_id, 
         CASE WHEN MOD(level, 3) = 0 THEN 100
              WHEN MOD(level, 5) = 0 THEN 200
              ELSE level * 10 END AS amount
  FROM dual CONNECT BY level <= 8
)
SELECT txn_id,
       amount,
       SUM(amount) OVER(ORDER BY txn_id RANGE BETWEEN 1 PRECEDING AND 2 FOLLOWING) AS range_window_sum,
       SUM(amount) OVER(ORDER BY txn_id ROWS BETWEEN 1 PRECEDING AND 2 FOLLOWING) AS row_window_sum
FROM transaction_log;

Core Analytic Functions

Ranking Operations

These functions assign positional integers based on sorted data. They all require an ORDER BY specification.

  • ROW_NUMBER(): Assigns unique, sequential integers. Ideal for deduplication.
  • RANK(): Leaves gaps after ties (e.g., 1, 2, 2, 4).
  • DENSE_RANK(): Maintains consecutive ranking despite ties (e.g., 1, 2, 2, 3).
-- Sample Dataset: warehouse inventory
CREATE TABLE inventory_batch (
  batch_id NUMBER,
  sku VARCHAR2(20),
  quantity NUMBER,
  region VARCHAR2(10)
);

INSERT INTO inventory_batch VALUES (101, 'WIDGET-A', 50, 'NORTH');
INSERT INTO inventory_batch VALUES (102, 'WIDGET-B', 75, 'NORTH');
INSERT INTO inventory_batch VALUES (103, 'WIDGET-C', 75, 'SOUTH');
INSERT INTO inventory_batch VALUES (104, 'WIDGET-D', 120, 'SOUTH');

SELECT sku, quantity, region,
       ROW_NUMBER() OVER(PARTITION BY region ORDER BY quantity DESC) rn_unique,
       RANK() OVER(PARTITION BY region ORDER BY quantity DESC) rnk_skip,
       DENSE_RANK() OVER(PARTITION BY region ORDER BY quantity DESC) drn_consecutive
FROM inventory_batch;

Aggregation Over Windows

Standard aggregates can operate continuously across partitions or ordered sequences.

-- Cumulative and partitioned metrics
SELECT sku, quantity, region,
       SUM(quantity) OVER() total_stock,
       AVG(quantity) OVER() global_avg,
       COUNT(*) OVER(PARTITION BY region) regional_count,
       SUM(quantity) OVER(PARTITION BY region ORDER BY sku) running_total
FROM inventory_batch;

Key Behavior: Adding ORDER BY transforms static partition aggregates into rolling/cumulative computations extending from the partition's start to the current row.

Distribution & Positional Functions

  • NTILE(n): Distributes rows evenly into n buckets.
  • FIRST_VALUE(col) / LAST_VALUE(col): Extracts boundary values. Use IGNORE NULLS to skip blanks.
  • LAG(col, offset, default) / LEAD(col, offset, default): Access preceding or succeeding row values.
-- Bucketing and offset access
SELECT sku, quantity, region,
       NTILE(4) OVER(ORDER BY quantity) quartile_bucket,
       LAG(quantity, 1, 0) OVER(PARTITION BY region ORDER BY sku) prev_sku_qty,
       LEAD(sku, 1, 'END') OVER(PARTITION BY region ORDER BY quantity DESC) next_high_sku
FROM inventory_batch;

Statistical & Percentile Functions

Oracle provides robust statistical window functions:

  • STDDEV() / STDDEV_SAMP() / STDDEV_POP(): Standard deviation calculations.
  • VARIANCE() / VAR_SAMP() / VAR_POP(): Variance metrics.
  • PERCENT_RANK(): Relative rank as (rank-1)/(total_rows-1).
  • CUME_DIST(): Cumulative distribution probability.
  • PERCENTILE_CONT(x): Continuous interpolation for exact quantile values.
-- Statistical windows
SELECT sku, quantity, region,
       PERCENT_RANK() OVER(PARTITION BY region ORDER BY quantity) prc_rank,
       CUME_DIST() OVER(ORDER BY quantity) cum_prob,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY quantity) OVER(PARTITION BY region) median_qty
FROM inventory_batch;

Maintaining Top/Bottom Values with KEEP

The KEEP (DENSE_RANK FIRST/LAST ORDER BY ...) clause allows extracting associated columns corresponding to extreme ranked values.

-- Retrieve SKU associated with highest quantity per region
SELECT sku, quantity, region,
       MAX(sku) KEEP(DENSE_RANK LAST ORDER BY quantity) OVER(PARTITION BY region) top_sku_by_region
FROM inventory_batch;

Relative Proportions

RATIO_TO_REPORT(expression) calculates the percentage contribution of a row relative to its partition total. Ordering is explicitly disabled for this function.

SELECT sku, quantity, region,
       ROUND(RATIO_TO_REPORT(quantity) OVER(PARTITION BY region) * 100, 2) pct_of_region
FROM inventory_batch;

Conditional Logic Alternatives: DECODE vs CASE

While modern development favors ANSI-standard CASE, Oracle’s legacy DECODE offers compact equality-based barnching.

-- Mapping status codes using DECODE
SELECT sku, quantity,
       DECODE(SIGN(quantity - 50), -1, 'LOW_STOCK', 1, 'PLENTY', 'EXACT') status_level
FROM inventory_batch;

-- Equivalent ANSI implementation
SELECT sku, quantity,
       CASE SIGN(quantity - 50)
         WHEN -1 THEN 'LOW_STOCK'
         WHEN 1 THEN 'PLENTY'
         ELSE 'EXACT'
       END status_level
FROM inventory_batch;

Crosstabulation Techniques: PIVOT

Static Conversion

The PIVOT operator rotates row values into column headers. Requires explicit enumeration in the IN clause.

-- Assuming an orders_expanded table exists
SELECT * FROM (
  SELECT category, product_type, units_sold 
  FROM sales_records
) PIVOT (
  SUM(units_sold) FOR product_type IN ('Hardware' AS hw, 'Software' AS sw, 'Accessories' AS acc)
);

Dynamic Conversion via PL/SQL

When categories are unpredictable, construct the PIVOT string dynamically using LISTAGG.

CREATE OR REPLACE PROCEDURE sp_dynamic_pivot(
  p_src_sql    VARCHAR2,
  p_col_query  VARCHAR2,
  p_agg_func   VARCHAR2,
  p_target_col VARCHAR2,
  p_view_out   VARCHAR2
) IS
  v_in_clause VARCHAR2(4000);
  v_full_stmt VARCHAR2(4000);
BEGIN
  -- Build the comma-separated list for the IN clause
  EXECUTE IMMEDIATE 
    'SELECT LISTAGG(''"'' || col_val || ''" ''' || ' || 'col_val || '', '') WITHIN GROUP (ORDER BY id)'
    INTO v_in_clause
    FROM (' || p_col_query || ');

  -- Assemble and create the view
  v_full_stmt := 'CREATE OR REPLACE VIEW ' || p_view_out || ' AS ' ||
                 'SELECT * FROM (' || p_src_sql || ') PIVOT (' ||
                 p_agg_func || ' FOR ' || p_target_col || ' IN (' || v_in_clause || '))';
  
  EXECUTE IMMEDIATE v_full_stmt;
END sp_dynamic_pivot;/

Source Differentiation with Union Logic

To merge datasets from two tables while tagging their origin, UNION ALL paired with a literal flag marker provides an efficient pattern.

-- Mark records existing in target table vs those absent
SELECT master_id, name_val, type_flag, 1 AS source_tag
FROM primary_catalog pc
WHERE EXISTS (SELECT 1 FROM secondary_audit sa WHERE sa.id = pc.master_id)
UNION ALL
SELECT master_id, name_val, type_flag, 2 AS source_tag
FROM primary_catalog pc
WHERE NOT EXISTS (SELECT 1 FROM secondary_audit sa WHERE sa.id = pc.master_id);

Tags: Oracle-SQL Window-Functions Analytic-Queries PIVOT-operator PL-SQL

Posted on Sun, 31 May 2026 22:53:36 +0000 by sungpeng