Hive Fundamentals and Core Concepts
Hive Introduction
What is Hive?
Hive is an open-source data warehouse system, originally developed at Facebook, that runs on top of Hadoop
It provides a SQL-like query language, HiveQL (HQL), for structured data stored in HDFS
Its core function is translating HQL queries into MapReduce jobs
Primary use case: batch data analytics wi ...
Posted on Mon, 15 Jun 2026 18:24:52 +0000 by bobbfwed
Deploying Apache Hive 2.3.6 on Hadoop 2.10.0
Binary Extraction and Setup
Acquire the Apache Hive 2.3.6 binary archive from the official distribution repository. Extract the contents to a standard application directory and establish a symbolic link for simplified version management.
sudo tar -xzf apache-hive-2.3.6-bin.tar.gz -C /usr/local/
cd /usr/local
sudo ln -s apache-hive-2.3.6-bin hive
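The symbolic link is what makes later version changes simple: only the link has to be repointed, while paths such as /usr/local/hive stay the same. A minimal sketch of that idea follows. The function name switch_version is hypothetical, and the -n flag of ln is assumed to be available (it is in GNU and BSD ln, but it is not required by POSIX).

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): repoint a stable symlink
# at a specific versioned directory inside the same base directory.
# Usage: switch_version <base_dir> <versioned_dir_name> <link_name>
switch_version() {
    base_dir=$1
    target=$2
    link=$3
    # Refuse to link to a directory that does not exist.
    [ -d "$base_dir/$target" ] || return 1
    # -s: symbolic link; -f: replace an existing link;
    # -n: treat an existing link as a file instead of following it.
    ln -sfn "$target" "$base_dir/$link"
}
```

With sufficient privileges on the base directory, a call such as switch_version /usr/local apache-hive-2.3.6-bin hive would reproduce the link created above, and a later call with a different directory name would switch versions in place.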
E ...
Posted on Fri, 15 May 2026 00:46:07 +0000 by Hardwarez
Hadoop Cluster Configuration and Data Pipeline Setup for Offline Data Warehouse
When configuring a Hadoop cluster for an offline data warehouse, proper host mapping and configuration file adjustments are essential.
In core-site.xml, the proxy user settings for the atguigu account should permit impersonation requests from any host and on behalf of any group or user:
<property>
<name>hadoop.proxyuser.atguigu.hosts</name>
<value>*</value>
</property> ...
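To confirm which proxy user properties a configuration file actually declares, a quick text search can help. The sketch below is only an illustration: list_proxyuser_props is a hypothetical helper, it relies on grep -o (available in GNU and BSD grep), and it assumes each name element sits on a single line as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: print the hadoop.proxyuser.* property names
# found in the given configuration file, one per line.
# Usage: list_proxyuser_props <path-to-core-site.xml>
list_proxyuser_props() {
    # Extract each matching <name>...</name> element, then strip the tags.
    grep -o '<name>hadoop\.proxyuser\.[^<]*</name>' "$1" |
        sed -e 's/<name>//' -e 's/<\/name>//'
}
```

For example, list_proxyuser_props "$HADOOP_HOME/etc/hadoop/core-site.xml" would list the declared properties, assuming the configuration lives under that directory on the cluster in question.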
Posted on Thu, 07 May 2026 07:42:31 +0000 by bruckerrlb