Understanding Hive, HBase, and HDFS in the Hadoop Ecosystem
Hive, HBase, and HDFS serve distinct but complementary roles in the Hadoop architecture—each addressing different data access patterns and storage requirements.
Hive: SQL Abstraction Over Batch Processing
Hive is a data warehousing infrastructure built atop Hadoop that translates declarative SQL-like queries (HiveQL) into distributed batch jobs ...
Posted on Sun, 21 Jun 2026 17:35:21 +0000 by drbigfresh
Setting Up HBase Single-Node Environment for Big Data Processing
Prerequisites
Server Specifications
Cloud Instance: Basic tier (pay-as-you-go)
Operating System: Linux CentOS 6.8
CPU: 1 core
Memory: 1GB
Storage: 40GB
Software Stack
Java Development Kit: Version 1.8 (jdk-8u144-linux-x64.tar.gz)
Apache Hadoop: Version 2.8.2 (hadoop-2.8.2.tar.gz)
Apache HBase: Version 1.2.6 (hbase-1.2.6-bin.tar.gz)
Download ...
Posted on Tue, 16 Jun 2026 16:49:07 +0000 by Yesideez
Troubleshooting HBase Snapshot Reads with LZO Compression
Issue 1: UnsatisfiedLinkError for gplcompression
When attempting to read HBase snapshot data, the following error occurs:
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:31)
at com.hadoop.compression.lzo.LzoCodec.<clinit> ...
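One way to narrow down this kind of error is to check whether any directory on the JVM's java.library.path actually contains the native library. The sketch below is only an illustration: the function name find_native_library is hypothetical, the path string is whatever value the JVM was started with, and the file name libgplcompression.so assumes the usual Linux naming for a library loaded as "gplcompression".

```python
import os


def find_native_library(library_path, name="gplcompression"):
    """Return every lib<name>.so found in the directories of library_path.

    library_path is a string in the same format as java.library.path:
    directories joined with the platform path separator (":" on Linux).
    """
    file_name = f"lib{name}.so"  # assumption: Linux shared-library naming
    hits = []
    for directory in library_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        candidate = os.path.join(directory, file_name)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits
```

An empty result suggests the library is missing from every listed directory, which is consistent with the "no gplcompression in java.library.path" message above.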
Posted on Sun, 14 Jun 2026 16:47:10 +0000 by zMastaa
Deploying a Three-Node HBase Cluster on Ubuntu and Inserting Sample Data
To set up a functional HBase cluster across three Ubuntu servers and insert sample data, follow this guide. The setup assumes the following IP addresses:
Master node: 192.168.1.101
RegionServer nodes: 192.168.1.102, 192.168.1.103
1. System Preparation
On all three machines, begin by updating the system and installing Java:
sudo apt update && ...
Posted on Sat, 06 Jun 2026 17:02:54 +0000 by psyqosis
Understanding and Parsing HBase Configuration Files
HBase relies on several key configuration files to manage its distributed, column-oriented NoSQL database behavior. These files define critical settings for cluster operation, integration with HDFS, ZooKeeper coordination, and performance tuning.
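As a rough illustration of what parsing such a file can involve, the sketch below reads a Hadoop-style site file (the format used by hbase-site.xml, a configuration element containing property entries with name and value children) into a dictionary. The function name parse_site_xml and the host names in the sample are hypothetical; hbase.rootdir and hbase.zookeeper.quorum are used only as familiar example keys.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; the host names are placeholders.
SAMPLE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode.example:8020/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example,zk2.example</value>
  </property>
</configuration>
"""


def parse_site_xml(xml_text):
    """Map each <property> name to its value (whitespace stripped)."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is None:
            continue  # ignore malformed entries without a <name>
        props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


if __name__ == "__main__":
    for key, value in parse_site_xml(SAMPLE).items():
        print(f"{key} = {value}")
```

This only reads a single file; it does not reproduce how HBase layers default values under site-specific overrides.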
Core Configuration Files
The primary configuration files include:
hbase-site.xml: Contains site-sp ...
Posted on Fri, 29 May 2026 21:13:42 +0000 by vbcoach
Configuring LZO Compression for Hadoop 3.1.2 and HBase 2.2.0
To implement LZO compression within an HBase environment running on Hadoop, it is necessary to compile the native LZO libraries and the corresponding Hadoop-LZO Java bridge from source. Older guides often reference the deprecated hadoop-gpl-compression library, which is incompatible with modern Hadoop versions. The following procedure outlines t ...
Posted on Mon, 18 May 2026 18:24:19 +0000 by neron-fx
Troubleshooting Common Errors in Big Data Environment Setup: Hadoop, Spark, HBase, Hive, and ZooKeeper
Hadoop Pseudo-Distributed Mode Issues
Configuration Parsing Failure in hdfs-site.xml
When you encounter FATAL conf.Configuration: error parsing conf hdfs-site.xml, the root cause is typically an encoding mismatch. Resolve it by opening the file and saving it with a uniform character encoding such as UTF-8.
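If re-saving by hand in an editor is inconvenient, the same fix can be scripted. The sketch below is one possible approach, not a prescribed tool: the names to_utf8 and resave_as_utf8 are hypothetical, and the original encoding (latin-1 by default here) is an assumption you would need to replace with the file's real encoding.

```python
from pathlib import Path


def to_utf8(raw, source_encoding="latin-1"):
    """Return raw unchanged if it is valid UTF-8, else transcode it.

    source_encoding is a guess at the original encoding; adjust as needed.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(source_encoding).encode("utf-8")


def resave_as_utf8(path, source_encoding="latin-1"):
    """Rewrite the file at path as UTF-8. Returns True if it was changed."""
    file_path = Path(path)
    raw = file_path.read_bytes()
    fixed = to_utf8(raw, source_encoding)
    if fixed != raw:
        file_path.write_bytes(fixed)
        return True
    return False
```

For example, resave_as_utf8("hdfs-site.xml") would rewrite the file only when its bytes are not already valid UTF-8. If the XML declaration names a different encoding, that attribute would also need updating by hand.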
HDFS Command Deprecation Warning
The w ...
Posted on Wed, 13 May 2026 16:56:31 +0000 by ashutosh.titan
Building a Music Ranking System with HBase and MapReduce
Environment: Windows 10, CentOS 7.9, Hadoop 3.2, HBase 2.5.3, and ZooKeeper 3.8 in fully distributed mode.
Environment setup procedures can be found in these articles:
CentOS7 Hadoop3.X Fully Distributed Environment Setup
Hadoop3.x Fully Distributed Environment Setup with Zookeeper and Hbase
1. Integrating MapReduce and HBase
Copy hbase-site.x ...
Posted on Wed, 13 May 2026 00:15:56 +0000 by LawsLoop