From Static Tables to Continuous Streams: The Evolution of Streaming SQL

Modern data architectures are shifting from batch-oriented processing to real-time analysis. In traditional systems, data is stored in static tables and queried at a specific point in time. However, in today’s data-driven landscape, information is generated continuously by sensors, logs, and transactions. To handle this, engineers are moving be ...

Posted on Sun, 24 May 2026 20:00:35 +0000 by firemankurt

Setting Up a Flink Cluster in Standalone and YARN Modes

Configuring TaskManager Hostnames
Each TaskManager must be configured with its respective hostname in flink-conf.yaml:
    taskmanager.host: hadoop103
On another node:
    taskmanager.host: hadoop104
Starting and Stopping a Standalone Cluster
From the JobManager node (hadoop102):
    # Start cluster
    bin/start-cluster.sh
    # Stop cluster
    bin/stop-cluster.s ...

Posted on Wed, 20 May 2026 05:09:43 +0000 by quark76

Key Features and Enhancements in Apache Flink 1.14 to 1.17

Apache Flink 1.14.0 Highlights
Core Features
- Checkpointing for Bounded Streams.
- Mixed DataStream and Table/SQL Applications in Batch Execution Mode.
- Introduction of the Hybrid Source for seamless reading across multiple sources.
- Buffer Debloating to minimize checkpoint latency.
- Fine-Grained Resource Management for dynamic Slot sizing.
- New Puls ...

Posted on Thu, 07 May 2026 19:17:24 +0000 by kade119

Apache Flink Checkpoint Configuration Guide

Prerequisites
Exactly-Once Processing
For exactly-once semantics to work properly:
- Source systems: must support data retransmission (e.g., message queues like Kafka, distributed file systems like HDFS)
- Sink systems: must support idempotent operations (e.g., Doris supports deduplication)
At-Least-Once Processing
For at-least-once semantics: S ...

Posted on Thu, 07 May 2026 06:14:41 +0000 by smpdawg
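
The exactly-once prerequisites in the checkpoint guide excerpt above (a source that can retransmit data, a sink with idempotent writes) can be illustrated with a small simulation. This is only a sketch in plain Python, not Flink code: `IdempotentSink`, `run_with_one_failure`, and the parameters `checkpoint_every` and `fail_at` are hypothetical names made up for the illustration, and the "source" is just a list that can be re-read from a saved offset.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentSink:
    """Hypothetical sink keyed by record offset: rewriting a key overwrites it."""
    rows: dict = field(default_factory=dict)
    writes: int = 0  # counts every write attempt, including replays

    def upsert(self, key, value):
        self.writes += 1
        self.rows[key] = value


def run_with_one_failure(records, sink, checkpoint_every, fail_at):
    """Process `records`, crash once at offset `fail_at`, then restart.

    After the simulated crash, processing rewinds to the last checkpointed
    offset, so records after that checkpoint are delivered a second time.
    """
    checkpoint = 0   # last saved source offset
    crashed = False
    position = 0
    while position < len(records):
        if not crashed and position == fail_at:
            crashed = True
            position = checkpoint  # replayable source: re-read from checkpoint
            continue
        sink.upsert(position, records[position])
        position += 1
        if position % checkpoint_every == 0:
            checkpoint = position  # simulated checkpoint of the source offset
    return sink


sink = run_with_one_failure(["a", "b", "c", "d", "e"], IdempotentSink(),
                            checkpoint_every=2, fail_at=3)
print(sink.writes, len(sink.rows))
```

In this run the record at offset 2 is written twice, because the crash happens after it was processed but before the next checkpoint. The sink still ends up with one row per record, which is the effect the idempotent-sink requirement is meant to guarantee. With an append-only sink, the same replay would leave a duplicate.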