Hive configuration parameters directly influence query execution and resource management. Familiarity with the current runtime settings is essential for performance tuning and debugging.
Categories of Hive Configuration
Settings are generally grouped into two types based on when they take effect. Permanent configuration requires changes to files like hive-site.xml and a service restart to become active. Session-level parameters can be modified during a Hive session using specific commands and take effect immediately.
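As a minimal sketch of a session-level change, the statement below overrides one property for the current session only. The property hive.exec.compress.output and the value shown are used purely for illustration. The override is not written back to hive-site.xml, so a new session starts from the configured value again.

```sql
-- Session-level override: applies to the current Hive session only.
-- The value `true` is illustrative, not a recommendation.
SET hive.exec.compress.output=true;
```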
Retrieving Current Parameter Values
The primary method for examining active settings is the SET command within a Hive session.
- Initiate a Hive Session: Open a terminal and start the Hive command-line interface.

      hive

- Execute the SET Command: Running SET without any arguments prints all current configuration properties to the console.

      SET;

  The output is a comprehensive list of name-value pairs, including system defaults, static configurations from hive-site.xml, and any session-level overrides.
- Sample Output Excerpt: The command produces output similar to the following, often spanning many lines.

      hive.exec.dynamic.partition=true
      hive.exec.max.dynamic.partitions.pernode=1000
      mapreduce.job.queuename=default
      ...
Querying Specific Parameters
To check the value of a single parameter, use SET followed by the property name.
    SET hive.exec.compress.output;
This returns the specific value, such as hive.exec.compress.output=false.
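The same lookup works for any property name. The sketch below queries two more properties chosen only as examples, then shows the verbose form SET -v, which the Hive CLI documentation describes as also printing Hadoop configuration variables. The exact contents and length of that listing depend on the Hive version and cluster setup.

```sql
-- Look up individual properties by name.
SET hive.exec.dynamic.partition;
SET hive.exec.max.dynamic.partitions.pernode;

-- Verbose listing: also includes Hadoop configuration variables.
SET -v;
```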
Guidelines for Parameter Adjustment
When modifying settings for optimization:
- Understand the Impact: Research the parameter's function to avoid unintended consequences.
- Adjust Incrementally: Make changes one at a time and test performance after each modification.
- Session vs. Permanent: Use SET for temporary, session-scoped changes. For permanent adjustments, update the hive-site.xml configuration file.
- Validate Changes: Use the SET command to confirm that new values are active.
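The guidelines above can be sketched as a short session workflow. This is an illustration under stated assumptions, not a tuning recommendation: the property and the value 500 are arbitrary examples, RESET is assumed to be available (the Hive CLI documentation lists it from Hive 0.10 onward), and some deployments may restrict which properties can be changed at runtime.

```sql
-- 1. Record the current value before changing anything.
SET hive.exec.max.dynamic.partitions.pernode;

-- 2. Change one parameter for this session (500 is an arbitrary example value).
SET hive.exec.max.dynamic.partitions.pernode=500;

-- 3. Validate that the new value is active.
SET hive.exec.max.dynamic.partitions.pernode;

-- 4. Run the workload under test here, then discard session overrides.
RESET;
```

Changing a single property between test runs makes it easier to attribute any performance difference to that change.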
Accurate knowledge of the active configuration is a prerequisite for effective Hive performance management and troubleshooting.