EMQX Configuration Fundamentals and Distributed Cluster Architecture

Network Endpoint Allocation

EMQX uses a predefined set of network ports for its protocol listeners and administrative interfaces. Protocol listener ports are set in the primary configuration file:

# Primary MQTT over TCP
listener.tcp.primary = 0.0.0.0:1883

# Secure MQTT over TLS
listener.ssl.secure = 8883

# MQTT over WebSocket
listener.ws.bridge = 8083

Management and dashboard interfaces are configured separately through their plugin configuration files:

# HTTP Management API
management.listener.http.port = 8080

# Web Dashboard
dashboard.listener.http.port = 18083

Note that the Erlang Port Mapper Daemon (epmd) listens on TCP port 4369 by default for node discovery; the port can be changed with the ERL_EPMD_PORT environment variable. Erlang nodes on the same host share a single epmd instance, so running several EMQX instances side by side requires unique node names and non-overlapping listener ports, or isolation through containers or network namespaces.
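
Before starting a broker, it can help to confirm that nothing else is already listening on these ports. The following is a minimal sketch, assuming the port numbers listed above; the helper name port_in_use is illustrative and not part of EMQX.

```python
import socket

# Port numbers taken from the configuration shown above; adjust if changed.
DEFAULT_PORTS = {
    "mqtt-tcp": 1883,
    "mqtt-tls": 8883,
    "mqtt-ws": 8083,
    "http-api": 8080,
    "dashboard": 18083,
    "epmd": 4369,
}


def port_in_use(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the port as free.
        return False


if __name__ == "__main__":
    for name, port in DEFAULT_PORTS.items():
        state = "in use" if port_in_use("127.0.0.1", port) else "free"
        print(f"{name:10s} {port:5d} {state}")
```

A port reported as "in use" before EMQX starts points to a conflict with another service or a second broker instance.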

Erlang Virtual Machine Tuning

Connection capacity is directly bounded by Erlang VM resource limits. Two critical parameters govern scalability:

  • node.process_limit: Defines the maximum number of concurrent Erlang processes. Each active client connection typically spawns two internal processes (connection handler and session manager).
  • node.max_ports: Specifies the upper bound for open ports. In the Erlang context, a port represents an I/O resource driver (functionally similar to a file descriptor), and each client connection consumes one.

# Adjust based on expected concurrent connections
node.process_limit = 2097152
node.max_ports = 1048576
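
The relationship between connection count and these two limits can be sketched as a small calculation. This is an illustration only: the headroom multiplier and the rounding to a power of two are assumptions made for the example, not EMQX rules, and the function names are hypothetical.

```python
import math

PROCESSES_PER_CONNECTION = 2  # connection handler + session manager (per the text)
PORTS_PER_CONNECTION = 1      # one Erlang port per client socket (per the text)


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def vm_limits(expected_connections: int, headroom: float = 2.0) -> dict:
    """Suggest node.process_limit and node.max_ports for a connection target.

    headroom is an assumed safety multiplier that leaves room for the
    broker's own processes and ports; it is not an EMQX-defined constant.
    """
    processes = math.ceil(expected_connections * PROCESSES_PER_CONNECTION * headroom)
    ports = math.ceil(expected_connections * PORTS_PER_CONNECTION * headroom)
    return {
        "node.process_limit": next_power_of_two(processes),
        "node.max_ports": next_power_of_two(ports),
    }


for key, value in vm_limits(500_000).items():
    print(f"{key} = {value}")
# node.process_limit = 2097152
# node.max_ports = 1048576
```

Under these assumptions, a target of 500,000 connections reproduces the values shown above.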

Distributed Cluster Architecture

A cluster consists of multiple independent nodes cooperating over a network to present a unified messaging service. This architecture delivers several operational advantages:

  • High Availability: Node failures do not cascade into total service outages.
  • Load Distribution: Traffic is distributed across available instances, preventing resource saturation.
  • Horizontal Scalability: Capacity grows by adding nodes without downtime.

Node Identification and Communication

Each cluster member is identified by a unique identifier in the format name@host, where name is user-defined and host resolves to an IP address or fully qualified domain name. Example identifiers:

emqx_broker@10.0.0.10
emqx_broker@node-primary.cluster.local

Internal node communication relies on epmd for TCP port mapping. All participating nodes must share an identical Erlang magic cookie for authentication. The cluster supports IPv4, IPv6, and TLS-encrypted internal channels.

Message Routing Mechanics

Each client maintains a persistent connection to a single cluster node. Message distribution follows two core principles:

  1. Subscription requests propagate across the entire cluster topology.
  2. Published messages are forwarded exclusively to nodes hosting active subscribers for that specific topic, leveraging a distributed topic tree and routing table.

Every node maintains a synchronized copy of the routing table. A simplified routing table structure might resemble:

Topic Pattern            | Destination Nodes
---------------------------------------------
sensors/temp/reading     | broker-a, broker-c
sensors/humidity/reading | broker-a
alerts/#                 | broker-b
system/config            | broker-c
---------------------------------------------

Subscriber-to-client mappings are stored locally on each node to minimize cross-cluster metadata traffic.
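
The forwarding decision can be sketched against the example table. This is a minimal illustration, assuming a linear scan over topic filters instead of the distributed topic tree the broker uses. The function names are hypothetical, and the MQTT rule that wildcards do not match $-prefixed topics is omitted for brevity.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """MQTT filter match: '+' spans one level, '#' spans the remainder."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # matches the parent level and everything below it
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# Routing table from the example: topic filter -> nodes with subscribers.
ROUTES = {
    "sensors/temp/reading": {"broker-a", "broker-c"},
    "sensors/humidity/reading": {"broker-a"},
    "alerts/#": {"broker-b"},
    "system/config": {"broker-c"},
}


def destination_nodes(routes: dict, topic: str) -> set:
    """Collect every node that hosts a subscriber matching the topic."""
    nodes = set()
    for topic_filter, subscribers in routes.items():
        if topic_matches(topic_filter, topic):
            nodes |= subscribers
    return nodes


print(sorted(destination_nodes(ROUTES, "sensors/temp/reading")))  # ['broker-a', 'broker-c']
print(sorted(destination_nodes(ROUTES, "alerts/fire/zone1")))     # ['broker-b']
print(sorted(destination_nodes(ROUTES, "unrouted/topic")))        # []
```

A message whose topic matches no filter is forwarded nowhere, which is what keeps inter-node traffic proportional to actual subscriptions.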

Cluster Initialization Strategies

Manual Joining

Administrators can explicitly connect nodes using the CLI. Configuration requires defining node names, synchronizing the cluster cookie, and setting the discovery mode to manual:

node.name = emqx_broker@10.0.0.10
node.cookie = shared_cluster_secret
cluster.discovery = manual

After startup, execute emqx_ctl cluster join emqx_broker@10.0.0.20 on the initiating node, then verify cluster state with emqx_ctl cluster status. To remove a node, run emqx_ctl cluster leave on that node, or emqx_ctl cluster force-leave <node> from another cluster member.

Automatic Discovery

EMQX utilizes the Ekka library to automate node discovery and membership management. Supported strategies include:

  • static: Predefined node list
  • mcast: UDP multicast discovery
  • dns: DNS-based resolution
  • etcd: etcd key-value store
  • k8s: Kubernetes API integration

Ekka also provides automatic split-brain recovery and dead node eviction. The static strategy requires no external dependencies and operates over standard TCP:

cluster.discovery = static
cluster.static.seeds = emqx_broker@10.0.0.10,emqx_broker@10.0.0.20,emqx_broker@10.0.0.30

All nodes must reference the identical seed list. On startup, each node contacts the listed seeds and joins the cluster without manual intervention.
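
Because only node.name differs between members, the per-node settings can be rendered from a single seed list. The sketch below assumes the keys shown in this article; the function names and the validation rule are illustrative.

```python
SEEDS = [
    "emqx_broker@10.0.0.10",
    "emqx_broker@10.0.0.20",
    "emqx_broker@10.0.0.30",
]


def validate_node_name(node: str) -> None:
    """Raise ValueError unless node looks like name@host."""
    name, sep, host = node.partition("@")
    if not (name and sep and host) or "@" in host:
        raise ValueError(f"expected name@host, got {node!r}")


def static_cluster_config(node_name: str, seeds: list, cookie: str) -> str:
    """Render the cluster settings for one node; all nodes share the seed list."""
    for node in [node_name, *seeds]:
        validate_node_name(node)
    return "\n".join([
        f"node.name = {node_name}",
        f"node.cookie = {cookie}",
        "cluster.discovery = static",
        f"cluster.static.seeds = {','.join(seeds)}",
    ])


for node in SEEDS:
    print(static_cluster_config(node, SEEDS, "shared_cluster_secret"))
    print()
```

Generating every node's file from the same list avoids the most common static-discovery mistake, a seed list that drifts between nodes.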

Load Balancing Integration

While optional, a load balancer in front of the cluster spreads client connections across nodes and gives clients a single endpoint. Compatible solutions include cloud-native load balancers, NGINX, and HAProxy. The load balancer should target the MQTT listener ports. Because MQTT connections are long-lived TCP sessions, enable sticky sessions only when reconnecting clients must return to the same node.

Protocol Parameters and Zone Management

MQTT protocol behavior can be tuned under the mqtt. configuration namespace:

mqtt.client_id_max_length = 64
mqtt.max_packet_size = 1048576
mqtt.keepalive_backoff_factor = 1.5

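The keepalive_backoff_factor adjusts the disconnect timeout calculation. If a client remains unresponsive for keepalive × backoff_factor × 2, the broker terminates the connection. With the factor of 1.5 shown above and a 60-second keepalive, the timeout is 60 × 1.5 × 2 = 180 seconds; a factor of 0.75 would reproduce the 1.5 × keepalive grace period described in the MQTT specification.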
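
The timeout formula is simple enough to check by hand, but a short sketch makes the effect of the factor explicit. The function name is illustrative; treating a keepalive of 0 as "disabled" follows the MQTT specification.

```python
from typing import Optional


def keepalive_timeout(keepalive_s: int, backoff_factor: float) -> Optional[float]:
    """Seconds of silence after which the broker drops the client.

    Implements keepalive * backoff_factor * 2 as described in the text.
    A keepalive of 0 disables the mechanism, so None is returned.
    """
    if keepalive_s == 0:
        return None
    return keepalive_s * backoff_factor * 2


print(keepalive_timeout(60, 1.5))   # 180.0
print(keepalive_timeout(60, 0.75))  # 90.0
print(keepalive_timeout(0, 1.5))    # None
```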

EMQX groups listeners into logical zones, allowing granular policy application. Configuration follows the pattern zone.<zone_name>.<parameter>. Multiple listeners can attach to a single zone, and clients inherit zone-level constraints upon connection.

Listener Definitions

Listeners define the network interfaces and protocols the broker exposes. The syntax listener.<type>.<name> = <address>:<port> assigns an endpoint. Zones are bound to listeners using:

listener.tcp.primary.zone = production_zone
listener.ssl.secure.zone = secure_zone
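
How a listener picks up zone-level settings can be sketched with a small parser over the flat key = value format. This is an illustration of the lookup, not EMQX's implementation. The function names are hypothetical, and the secure_zone value in the sample is invented for the example.

```python
SAMPLE = """
listener.tcp.primary.zone = production_zone
listener.ssl.secure.zone = secure_zone
zone.production_zone.message_publish_cap = 20,2m
# Illustrative value, not taken from the article
zone.secure_zone.message_publish_cap = 5,1m
"""


def parse_config(text: str) -> dict:
    """Parse 'key = value' lines, skipping blanks and '#' comment lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def zone_settings(settings: dict, listener_type: str, listener_name: str) -> dict:
    """Return the zone parameters inherited by clients of the given listener."""
    zone = settings.get(f"listener.{listener_type}.{listener_name}.zone")
    if zone is None:
        return {}
    prefix = f"zone.{zone}."
    return {
        key[len(prefix):]: value
        for key, value in settings.items()
        if key.startswith(prefix)
    }


config = parse_config(SAMPLE)
print(zone_settings(config, "tcp", "primary"))  # {'message_publish_cap': '20,2m'}
print(zone_settings(config, "ssl", "secure"))   # {'message_publish_cap': '5,1m'}
```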

Plugin Ecosystem

Extended functionality operates through a modular plugin system. Each plugin requires a corresponding .conf manifest:

plugins.etc_dir = etc/plugins/
plugins.loaded_file = data/loaded_plugins

The loaded_plugins file dictates which extensions activate during broker startup.

Traffic Regulation and Rate Limiting

To prevent resource exhaustion from high-frequency connections or unbounded publishing, EMQX implements multi-layer rate control:

# Byte-level throughput cap (bytes per second, burst buffer)
listener.tcp.primary.throughput_limit = 2048,8192

# Connection acceptance threshold (connections per second)
listener.tcp.primary.connection_rate = 2000

# Publish rate cap per client (messages per time window)
zone.production_zone.message_publish_cap = 20,2m

Exceeding the byte or connection limit temporarily suspends the offending socket. Publish limits are enforced as a quota at the zone level and reset after each time window; in the example above, each client may publish 20 messages per 2-minute window.
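
A rate paired with a burst allowance maps naturally onto a token bucket. The sketch below illustrates that general technique with the 2048,8192 values from the example; it is not taken from EMQX's source. The class name is hypothetical, and the explicitly passed clock is a choice made to keep the example deterministic.

```python
class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, holds at most `burst`."""

    def __init__(self, rate: float, burst: float, now: float = 0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # start full so an initial burst is allowed
        self.last = now

    def consume(self, amount: float, now: float) -> float:
        """Take `amount` tokens at time `now`.

        Returns 0.0 if the bucket covered the request, otherwise the number
        of seconds the caller should pause before reading again.
        """
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        self.tokens -= amount
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate


bucket = TokenBucket(rate=2048, burst=8192)
print(bucket.consume(8192, now=0.0))  # 0.0  (burst absorbed)
print(bucket.consume(1024, now=0.0))  # 0.5  (1024 bytes over, at 2048 B/s)
```

Returning a pause duration instead of a yes/no answer mirrors the "temporary suspension" behaviour described above: the socket is not closed, only held back until the deficit is repaid.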

Advanced Subscription Models

EMQX supports shared subscriptions to distribute message load across multiple consumers:

Topic Syntax               | Subscription Example
-----------------------------------------------------
$queue/<topic>             | $queue/upstream/data
$share/<group>/<topic>     | $share/sensors_group/upstream/data

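A message matching one of these filters is delivered to only one subscriber within the group. The $queue/ prefix is shorthand for a single shared group, and the broker's dispatch strategy (for example random, round-robin, sticky, or hash) determines which subscriber receives each message.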
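
The two prefixes and a round-robin dispatch can be sketched as follows. This is a minimal illustration, assuming exact topic matching and an in-memory member list; the class and function names are hypothetical and do not reflect EMQX internals.

```python
from collections import defaultdict


def parse_subscription(topic_filter: str):
    """Return (group, real_filter); group is None for ordinary subscriptions."""
    if topic_filter.startswith("$queue/"):
        return "$queue", topic_filter[len("$queue/"):]
    if topic_filter.startswith("$share/"):
        parts = topic_filter.split("/", 2)
        if len(parts) != 3 or not parts[1] or not parts[2]:
            raise ValueError(f"malformed shared subscription: {topic_filter!r}")
        return parts[1], parts[2]
    return None, topic_filter


class SharedDispatcher:
    """Round-robin delivery to one member of each shared group."""

    def __init__(self):
        self._members = defaultdict(list)  # (group, filter) -> client ids
        self._cursor = defaultdict(int)    # (group, filter) -> next index

    def subscribe(self, client_id: str, topic_filter: str) -> None:
        group, real = parse_subscription(topic_filter)
        if group is None:
            raise ValueError("not a shared subscription")
        self._members[(group, real)].append(client_id)

    def pick(self, group: str, real_filter: str):
        """Return the next client for this group, or None if it is empty."""
        key = (group, real_filter)
        members = self._members.get(key)
        if not members:
            return None
        client = members[self._cursor[key] % len(members)]
        self._cursor[key] += 1
        return client


dispatcher = SharedDispatcher()
dispatcher.subscribe("c1", "$share/sensors_group/upstream/data")
dispatcher.subscribe("c2", "$share/sensors_group/upstream/data")
print([dispatcher.pick("sensors_group", "upstream/data") for _ in range(3)])
# ['c1', 'c2', 'c1']
```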

Proxy Subscriptions and External Bridging

Broker-side proxy subscriptions allow administrative or automated clients to subscribe on behalf of other devices; they are managed via REST APIs or dedicated modules. For external system integration, EMQX provides bridge connectors that forward data to other message brokers and endpoints, including Apache Kafka, RabbitMQ, and custom TCP endpoints.


Posted on Wed, 27 May 2026 20:34:23 +0000 by fusioneko