Serial communication remains a cornerstone in embedded and IoT systems due to its simplicity, reliability, and hardware-level support. However, developers new to serial protocols often face a persistent challenge: even when the sender transmits data in structured packets, the receiver frequently ingests incomplete, concatenated, or fragmented data streams. This phenomenon — known as frame stick (multiple frames merged) or frame split (a single frame fragmented across multiple reads) — is inherent to the byte-stream nature of serial transport.
1. Root Causes of Frame Stick and Frame Split
Unlike message-oriented network transports such as UDP, which preserve datagram boundaries, serial communication exposes the underlying byte stream directly to the application (much as TCP does). Without explicit framing, there is no built-in mechanism to delimit individual messages.
- Frame stick occurs when two or more frames are received in a single read, e.g., 0xAA 0x01 0x02 0x55 0xAA 0x03 0x04 0x55 appears as one continuous chunk. Common in high-frequency transmission or delayed read loops.
- Frame split happens when a single frame spans multiple reads, e.g., 0xAA 0x01 in one event, 0x02 0x55 in the next. Prevalent at low baud rates or with large payloads.
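Both cases can be reproduced without hardware. The sketch below is only an illustration: it assumes a hypothetical 4-byte frame (the byte patterns from the examples above) and treats each list element as the result of one read call. The function name reassemble is made up for this example.

```python
FRAME_LEN = 4  # assumption: every frame is exactly 4 bytes, as in the examples above

def reassemble(chunks):
    """Accumulate arbitrary read chunks and cut them into 4-byte frames."""
    buf = bytearray()
    frames = []
    for chunk in chunks:          # one chunk == one read() result
        buf.extend(chunk)
        while len(buf) >= FRAME_LEN:
            frames.append(bytes(buf[:FRAME_LEN]))
            del buf[:FRAME_LEN]
    return frames

# Frame stick: two frames arrive in a single read.
stuck = [bytes([0xAA, 0x01, 0x02, 0x55, 0xAA, 0x03, 0x04, 0x55])]
# Frame split: one frame arrives across two reads.
split = [bytes([0xAA, 0x01]), bytes([0x02, 0x55])]

stuck_frames = reassemble(stuck)   # two separate frames
split_frames = reassemble(split)   # one complete frame
```

The point is that the receiver never treats one read as one message; it appends to a buffer and lets the framing rule decide where messages begin and end.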
Contributing factors include:
- Hardware FIFO buffering behavior
- OS scheduling granularity
- Asynchronous sender/receiver timing
- Absence of application-layer framing rules
In a real-world project involving multi-sensor environmental monitoring, unhandled framing led to misaligned readings: humidity values were overwritten by adjacent sensor IDs, and temperature readings drifted because payloads were associated with the wrong fields.
2. Framing Strategies for Robust Parsing
A reliable protocol parser must lock onto well-defined frame boundaries. Below are three battle-tested framing schemes, each with its own trade-offs between flexibility and complexity.
2.1 Fixed-Length Frames
Best for constrained, static-message systems (e.g., periodic sensor telemetry with known structure).
Frame layout:
- 1-byte sync header (0xAA)
- 2-byte sensor ID
- 2-byte temperature (scaled ×10)
- 2-byte humidity (scaled ×10)
- 1-byte checksum
- 1-byte footer (0x55)
Total fixed size: 9 bytes
Implementation sketch (C++-style parser logic):
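    #include <cstddef>
    #include <cstdint>
    #include <cstring>
    #include <optional>
    #include <span>  // C++20

    struct Frame {
        uint8_t  sync;   // 0xAA
        uint16_t id;
        int16_t  temp;   // scaled x10
        int16_t  hum;    // scaled x10
        uint8_t  cksum;  // XOR of the 7 bytes that precede it
        uint8_t  end;    // 0x55
    } __attribute__((packed));  // GCC/Clang extension
    static_assert(sizeof(Frame) == 9, "frame must match the 9-byte wire layout");

    // Returns a frame if a valid one starts at the front of `buffer`.
    // On a bad header, footer, or checksum it drops one byte and returns nullopt,
    // so the caller should keep calling while buffer.size() >= sizeof(Frame).
    std::optional<Frame> parseFrame(std::span<const uint8_t>& buffer) {
        if (buffer.size() < sizeof(Frame))
            return std::nullopt;  // Incomplete (half-frame): wait for more data

        Frame f;
        // Assumes the wire byte order of the 16-bit fields matches the host.
        std::memcpy(&f, buffer.data(), sizeof(Frame));

        // XOR the raw bytes (sync, id, temp, hum) to avoid sign extension of int16_t.
        uint8_t calcCksum = 0;
        for (std::size_t i = 0; i < 7; ++i)
            calcCksum ^= buffer[i];

        if (f.sync != 0xAA || f.end != 0x55 || calcCksum != f.cksum) {
            buffer = buffer.subspan(1);  // Discard one byte and resynchronize
            return std::nullopt;
        }
        buffer = buffer.subspan(sizeof(Frame));
        return f;
    }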
Pros: Minimal CPU overhead, deterministic parsing, ideal for 8-bit MCUs.
Cons: Inflexible; padding wastes bandwidth for variable-length payloads.
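For testing on a host, the same 9-byte layout can be exercised in Python. This is a minimal sketch under stated assumptions: the 16-bit fields are taken to be little-endian (the text does not specify a byte order), the checksum is the XOR of the first seven bytes, and the names build_frame and parse_stream are hypothetical helpers introduced here.

```python
import struct

FRAME_FMT = "<BHhhBB"                    # assumption: little-endian 16-bit fields
FRAME_SIZE = struct.calcsize(FRAME_FMT)  # 9 bytes, no padding with "<"

def xor_checksum(data) -> int:
    """XOR of all bytes in data."""
    c = 0
    for b in data:
        c ^= b
    return c

def build_frame(sensor_id: int, temp_x10: int, hum_x10: int) -> bytes:
    """Pack sync, ID, temperature, humidity, then append checksum and footer."""
    body = struct.pack("<BHhh", 0xAA, sensor_id, temp_x10, hum_x10)
    return body + bytes([xor_checksum(body), 0x55])

def parse_stream(buf: bytearray) -> list:
    """Consume complete frames from buf; leave any trailing partial frame in place."""
    frames = []
    while len(buf) >= FRAME_SIZE:
        sync, sid, temp, hum, ck, end = struct.unpack_from(FRAME_FMT, buf)
        if sync == 0xAA and end == 0x55 and ck == xor_checksum(buf[:7]):
            frames.append((sid, temp, hum))
            del buf[:FRAME_SIZE]
        else:
            del buf[0]  # slide one byte and retry (resynchronization)
    return frames

# Two bytes of line noise, one full frame, then the first half of another frame.
rx = bytearray(b"\x00\x13" + build_frame(0x10, 235, 610) + build_frame(0x11, -15, 590)[:4])
parsed = parse_stream(rx)
```

After the call, parsed holds the one complete reading and rx still contains the four bytes of the unfinished frame, ready to be extended by the next read.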
2.2 Length-Prefix Framing
Ideal for command/response or variable-length telemetry (e.g., firmware update packets, embedded logs).
Frame layout:
- 2-byte sync: 0xAA 0x55
- 1-byte length L (bytes following the length field, excluding checksum)
- L-byte payload
- 2-byte CRC16-CCITT
Example frame: 0xAA 0x55 0x03 0x12 0x34 0x56 0x78 0x9A
Design considerations:
- L must encode only the payload length (sync, length, and CRC bytes are not included)
- Cap L at a safe upper bound (e.g., 255 → 1 KB internal buffer)
- Transmit multi-byte fields (here, the CRC) big-endian explicitly so both ends agree regardless of host endianness
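On the sending side, these rules translate into a small encoder. The sketch below is an assumption-laden illustration rather than a reference implementation: the text says only "CRC16-CCITT", so it assumes the CCITT-FALSE variant (polynomial 0x1021, initial value 0xFFFF) computed over the payload alone, using Python's binascii.crc_hqx. The names build_frame and check_frame are hypothetical.

```python
import binascii

SYNC = bytes([0xAA, 0x55])
MAX_PAYLOAD = 255  # upper bound imposed by the 1-byte length field

def crc16(data: bytes) -> int:
    # Assumption: CRC-16/CCITT-FALSE (poly 0x1021, init 0xFFFF), payload only.
    return binascii.crc_hqx(data, 0xFFFF)

def build_frame(payload: bytes) -> bytes:
    """sync + L + payload + big-endian CRC; L counts payload bytes only."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too long for a 1-byte length field")
    return SYNC + bytes([len(payload)]) + payload + crc16(payload).to_bytes(2, "big")

def check_frame(frame: bytes):
    """Return the payload of one complete frame, or None if it is malformed."""
    if len(frame) < 5 or frame[:2] != SYNC:
        return None
    length = frame[2]
    if len(frame) != 3 + length + 2:
        return None
    payload, crc = frame[3:3 + length], frame[3 + length:]
    return payload if crc16(payload) == int.from_bytes(crc, "big") else None

frame = build_frame(bytes([0x12, 0x34, 0x56]))
```

Whichever CRC variant and coverage a real protocol picks, sender and receiver must agree on it; that choice belongs in the protocol specification, not in the code alone.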
Parser skeleton (state machine):
    import binascii

    class LengthPrefixParser:
        HEADER = bytes([0xAA, 0x55])
        MAX_PAYLOAD = 255  # largest L accepted; lower it to reject oversized frames

        def __init__(self):
            self._buf = bytearray()
            self._expected_len = None  # None while hunting for the header

        @staticmethod
        def _verify_crc(payload, crc) -> bool:
            # Assumption: CRC-16/CCITT-FALSE (poly 0x1021, init 0xFFFF) over the
            # payload only, transmitted big-endian.
            return binascii.crc_hqx(bytes(payload), 0xFFFF) == int.from_bytes(crc, "big")

        def push(self, data: bytes) -> list:
            self._buf.extend(data)
            frames = []
            while True:
                if self._expected_len is None:
                    if len(self._buf) < len(self.HEADER) + 1:
                        break  # Wait until sync + length byte have arrived
                    if self._buf[:2] != self.HEADER or self._buf[2] > self.MAX_PAYLOAD:
                        del self._buf[0]  # Discard one byte and resynchronize
                        continue
                    self._expected_len = self._buf[2]
                    del self._buf[:3]
                # Now expecting payload + CRC
                needed = self._expected_len + 2  # payload + 2-byte CRC
                if len(self._buf) < needed:
                    break  # Still incomplete
                payload = bytes(self._buf[:self._expected_len])
                crc = self._buf[self._expected_len:needed]
                if self._verify_crc(payload, crc):
                    frames.append(payload)
                # A CRC failure drops the frame silently; count or log it in production.
                del self._buf[:needed]
                self._expected_len = None
            return frames
Why it’s preferred industrially: Supports variable payloads, decouples frame size from transmission rate, and avoids reliance on end markers (which may appear in data).
2.3 Delimiter-Based Framing (e.g., ASCII/Text Protocols)
Best for debug interfaces, JSON/CLI-style protocols, or human-readable logs.
Example:
$ID=0x10|T=235|H=61#<CR><LF>
$ID=0x11|T=237|H=59#<CR><LF>
- Start delimiter: $
- End delimiter: # + optional <CR><LF>
- Delimiter escape handling: # → \#
Typical parsing strategy:
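    #include <cstddef>
    #include <string>
    #include <string_view>
    #include <vector>

    // Extracts complete "$...#" messages and leaves any trailing partial message
    // in `data`. Escaped delimiters ("\#") are not handled in this sketch.
    std::vector<std::string> extractLines(std::string_view& data) {
        std::vector<std::string> msgs;
        std::size_t consumed = 0;
        while (true) {
            auto start = data.find('$', consumed);
            if (start == std::string_view::npos) {
                consumed = data.size();  // No start delimiter: everything left is noise
                break;
            }
            auto end = data.find('#', start);
            if (end == std::string_view::npos) {
                consumed = start;  // Frame split: keep from '$' and wait for more bytes
                break;
            }
            msgs.emplace_back(data.substr(start, end - start + 1));
            consumed = end + 1;
            // Skip the optional CR/LF that follows the end delimiter
            while (consumed < data.size() && (data[consumed] == '\r' || data[consumed] == '\n'))
                ++consumed;
        }
        data.remove_prefix(consumed);
        return msgs;
    }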
Caveats: Requires escaping special characters in payloads or limiting character set to avoid ambiguity. Not suitable for binary/raw data unless escaping or base64 is applied.
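To make the escaping requirement concrete, here is a minimal sketch of an escape-aware encoder and extractor. It assumes a backslash escape for both '#' and the backslash itself (the text only specifies # → \#; escaping the backslash is an added assumption so that decoding stays unambiguous). The function names are hypothetical.

```python
def escape(payload: str) -> str:
    """Escape the backslash first, then the end delimiter."""
    return payload.replace("\\", "\\\\").replace("#", "\\#")

def encode(payload: str) -> str:
    return "$" + escape(payload) + "#\r\n"

def extract(buf: str):
    """Return (decoded messages, unconsumed remainder of buf)."""
    msgs, cur, escaped = [], None, False
    keep_from = len(buf)           # index of an unfinished message's '$', if any
    for i, ch in enumerate(buf):
        if cur is None:
            if ch == "$":          # start delimiter; anything else is noise or CR/LF
                cur, keep_from = [], i
        elif escaped:
            cur.append(ch)         # take the escaped character literally
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == "#":            # unescaped end delimiter closes the message
            msgs.append("".join(cur))
            cur, keep_from = None, len(buf)
        else:
            cur.append(ch)
    return msgs, buf[keep_from:]

stream = encode("note=a#b") + "$ID=0x11|T"
messages, remainder = extract(stream)
```

The remainder is returned from the unfinished message's start delimiter onward, so the caller can prepend it to the next read and call extract again.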
3. Robustness: Error Resilience & Edge Cases
Beyond framing, production-grade parsers must handle:
- Out-of-sync starts: Header detection must cope with partial matches (e.g., 0xAA alone in one read, then 0x55 in a later one)
- Clock drift in timeouts: Use monotonic timestamps per byte instead of polling intervals
- Buffer thrashing: Reset parser state on a bad CRC or invalid frame length to prevent cascading errors
- Memory pressure: Preallocate bounded ring buffers; never call malloc in the hot parsing loop
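The first two points can be combined in a byte-wise header detector. The following is a rough sketch, not a production design: the class name SyncHunter, the 50 ms idle timeout, and the injectable clock parameter (used so the timeout can be tested deterministically) are all assumptions made for this example.

```python
import time

class SyncHunter:
    """Byte-wise detector for the 0xAA 0x55 header with an inter-byte timeout."""

    def __init__(self, timeout_s: float = 0.05, clock=time.monotonic):
        self._timeout = timeout_s
        self._clock = clock        # monotonic, so wall-clock adjustments cannot break it
        self._seen_aa = False      # True when the previous byte was 0xAA
        self._last = None          # timestamp of the previous byte

    def feed(self, byte: int) -> bool:
        """Feed one byte; return True when it completes the header."""
        now = self._clock()
        if self._last is not None and now - self._last > self._timeout:
            self._seen_aa = False  # a stale partial match is discarded
        self._last = now
        if self._seen_aa and byte == 0x55:
            self._seen_aa = False
            return True
        self._seen_aa = (byte == 0xAA)  # 0xAA 0xAA 0x55 still matches on the last pair
        return False

# Deterministic demonstration with a fake clock.
fake_time = [0.0]
hunter = SyncHunter(timeout_s=0.05, clock=lambda: fake_time[0])
first = hunter.feed(0xAA)      # partial match, possibly the end of one read
fake_time[0] = 0.01
second = hunter.feed(0x55)     # completes the header in a later read
```

Because the state lives in the object rather than in a single read buffer, a header split across two reads is still recognized, while a lone 0xAA followed by a long silence is forgotten.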
In high-throughput industrial gateways, mixing framing schemes (e.g., protocol header uses length-prefix, payload contains delimited sub-frames) provides both throughput and introspectability.