Apache Hive previously used the global parameter hive.mapred.mode to enable or disable strict mode. Setting this parameter to strict would enforce rules like requiring partition filters for queries on partitioned tables.
In Hive 3.1.3, this coarse-grained parameter is deprecated. The new approach uses specific, granular parameters for different strictness checks. If hive.mapred.mode is set, it overrides all new parameters, so it should be avoided. The new parameters are boolean, accepting true or false.
1. hive.strict.checks.orderby.no.limit
Prevents ORDER BY clauses without a LIMIT. This is a syntactic check, independent of actual data volume.
When hive.strict.checks.orderby.no.limit is true, this query fails:
SELECT id, name FROM employees ORDER BY id;
A query with LIMIT is allowed:
SELECT id, name FROM employees ORDER BY id LIMIT 50;
2. hive.strict.checks.no.partition.filter
Prevents queries on partitioned tables without a filter on a partition column. This is a syntactic check.
Given a table sales with partition columns year and month, the following query fails when the parameter is true:
SELECT product, SUM(revenue)
FROM sales
WHERE region = 'North'
GROUP BY product;
A valid query must include atleast one partition filter:
SELECT product, SUM(revenue)
FROM sales
WHERE region = 'North'
AND year = 2024
GROUP BY product;
3. hive.strict.checks.type.safety
Enforces strict type safety, prohibiting certain comparisons that may cause precision loss or errors.
When enabled, comparisons between BIGINT and STRING or BIGINT and DOUBLE are disallowed.
With the parameter set:
CREATE TABLE data_types(large_num BIGINT, text_val STRING, decimal_val DOUBLE);
Executing this query generates a warning about potential precision loss:
SELECT * FROM data_types WHERE large_num > decimal_val;
4. hive.strict.checks.cartesian.product
Prevents Cartesian products (cross joins) without explicit join conditions. When enabled, this query fails:
SELECT * FROM users, orders;
A join condition must be specified using ON or WHERE:
SELECT * FROM users JOIN orders ON users.id = orders.user_id;
-- OR
SELECT * FROM users, orders WHERE users.id = orders.user_id;
5. hive.strict.checks.bucketing
When enabled, prohibits using LOAD DATA INPATH to load data directly into bucketed tables.