What data type is 'data' in Node.js TCP sockets? How to split a byte stream into separate messages?

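In a Node.js TCP server built on the net module, the socket.on('data', ...) callback receives a Buffer object by default (or a string, if socket.setEncoding() has been called). The Buffer holds the raw bytes sent by the client over the TCP connection. TCP is a stream-oriented protocol: it preserves byte order but has no inherent message boundaries, so one write on the client may arrive as several data events, and several writes may arrive merged into one. To extract discrete messages from this continuous byte stream, you must implement a framing protocol.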

1. Data type of data

data is a Buffer object, Node.js's type for raw binary data (unless socket.setEncoding() has been called, in which case it is a string). You can convert it to a string or manipulate it byte by byte:

socket.on('data', (chunk) => {
  console.log('Received Buffer:', chunk);
  console.log('As string:', chunk.toString());
});
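
One caveat with chunk.toString(): a chunk can end in the middle of a multi-byte UTF-8 character, and decoding each chunk on its own then produces replacement characters. The following is a minimal sketch of the problem and one way around it, using StringDecoder from Node's built-in string_decoder module. The two hard-coded chunks are made up for illustration and stand in for two consecutive data events.

```javascript
const { StringDecoder } = require('node:string_decoder');

// '€' is three bytes in UTF-8 (E2 82 AC); pretend TCP split it across two chunks.
const chunk1 = Buffer.from([0x68, 0x69, 0x20, 0xe2, 0x82]); // 'hi ' + first two bytes of '€'
const chunk2 = Buffer.from([0xac]);                         // last byte of '€'

// Decoding each chunk independently corrupts the split character.
const naive = chunk1.toString('utf8') + chunk2.toString('utf8');

// StringDecoder holds back incomplete trailing bytes until the rest arrives.
const decoder = new StringDecoder('utf8');
const safe = decoder.write(chunk1) + decoder.write(chunk2);

console.log(naive === 'hi €'); // false: contains U+FFFD replacement characters
console.log(safe);             // hi €
```

Calling socket.setEncoding('utf8') has a similar effect: the stream then decodes multi-byte characters correctly across chunk boundaries and passes strings to the data callback.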

2. Common message framing strategies

Because TCP may deliver data in arbitrarily sized chunks, you need a way to recognize message boundaries. Below are three widely used approaches.

Method 1: Fixed‑length messages

If all messages have the same size, read exactly that many bytes each time.

const MSG_LEN = 10;

socket.on('data', (chunk) => {
  let offset = 0;
  while (offset + MSG_LEN <= chunk.length) {
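    const msg = chunk.subarray(offset, offset + MSG_LEN); // view into chunk, no copy
    console.log('Fixed message:', msg.toString());
    offset += MSG_LEN;
  }
  // Bytes from offset to the end form an incomplete message. They must be kept and
  // prepended to the next chunk (see section 3); dropping them corrupts the stream.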
});

Method 2: Length‑prefixed messages

Each message is preceded by a fixed‑size integer indicating its length. This works well for variable‑length messages.

const LEN_SIZE = 4; // 4 bytes for length (uint32)

socket.on('data', (chunk) => {
  let pos = 0;
  while (pos + LEN_SIZE <= chunk.length) {
    const msgLen = chunk.readUInt32BE(pos);  // big-endian
    const end = pos + LEN_SIZE + msgLen;
    if (end > chunk.length) {
      // incomplete message: stop without consuming the length header
      break;
    }
    const msg = chunk.subarray(pos + LEN_SIZE, end);
    console.log('Prefixed message:', msg.toString());
    pos = end;
  }
  // Bytes from pos to the end (length header included) must be kept in a
  // persistent buffer and prepended to the next chunk (see section 3).
});
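
The sending side has to produce the matching header. A minimal sketch of that step, assuming the same 4-byte big-endian prefix; encodeFrame is a hypothetical helper name introduced here, not part of any library. Note that the prefix must hold the byte length of the payload, which differs from the string length as soon as non-ASCII characters are involved.

```javascript
// Hypothetical helper: wraps a payload in a 4-byte big-endian length header.
function encodeFrame(payload) {
  const body = Buffer.isBuffer(payload) ? payload : Buffer.from(String(payload), 'utf8');
  const header = Buffer.alloc(4);
  header.writeUInt32BE(body.length, 0); // byte length, not character count
  return Buffer.concat([header, body]);
}

const frame = encodeFrame('hello');
console.log(frame); // <Buffer 00 00 00 05 68 65 6c 6c 6f>

// 'héllo' has 5 characters but 6 bytes in UTF-8, so the prefix is 6.
console.log(encodeFrame('héllo').readUInt32BE(0)); // 6
```

On a client socket this would be used as socket.write(encodeFrame('hello')).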

Method 3: Delimiter‑based messages

For text‑based protocols, use a delimiter such as \n to separate messages.

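let pending = ''; // text after the last '\n' seen so far (keep one per socket)

// With an encoding set, data events deliver strings that are decoded
// correctly even when a multi-byte character spans two chunks.
socket.setEncoding('utf8');
socket.on('data', (text) => {
  pending += text;
  const lines = pending.split('\n');
  // The last element is an incomplete line, or '' if the text ended with '\n'.
  pending = lines.pop();
  for (const line of lines) {
    if (line) {
      console.log('Delimited message:', line);
    }
  }
});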

3. Handling partial data across multiple data events

TCP does not guarantee that a complete message arrives in a single data callback. You must buffer incomplete data and process it only when enough bytes are available.

// LEN_SIZE is the 4-byte length prefix size defined in Method 2.
let buffer = Buffer.alloc(0); // unprocessed bytes; keep one such buffer per socket

socket.on('data', (chunk) => {
  buffer = Buffer.concat([buffer, chunk]);

  while (buffer.length >= LEN_SIZE) {
    const msgLen = buffer.readUInt32BE(0);
    if (buffer.length < LEN_SIZE + msgLen) {
      break; // message still incomplete: wait for more data
    }
    const msg = buffer.subarray(LEN_SIZE, LEN_SIZE + msgLen);
    console.log('Complete message:', msg.toString());
    buffer = buffer.subarray(LEN_SIZE + msgLen); // drop the consumed frame
  }
});
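
To check that buffered parsing does not depend on how TCP happens to split the stream, the same loop can be wrapped in a function and fed a byte stream in pieces of any size. This is a sketch without a real socket; createFrameParser, frame, and the sample messages are hypothetical names and data introduced here for illustration.

```javascript
const LEN_SIZE = 4;

// Hypothetical wrapper around the buffering loop: returns a function that
// accepts chunks and calls onMessage once per complete message.
function createFrameParser(onMessage) {
  let buffer = Buffer.alloc(0);
  return function push(chunk) {
    buffer = Buffer.concat([buffer, chunk]);
    while (buffer.length >= LEN_SIZE) {
      const msgLen = buffer.readUInt32BE(0);
      if (buffer.length < LEN_SIZE + msgLen) break; // wait for more bytes
      onMessage(buffer.subarray(LEN_SIZE, LEN_SIZE + msgLen));
      buffer = buffer.subarray(LEN_SIZE + msgLen);
    }
  };
}

// Builds one length-prefixed frame for the simulated stream.
function frame(text) {
  const body = Buffer.from(text, 'utf8');
  const header = Buffer.alloc(LEN_SIZE);
  header.writeUInt32BE(body.length, 0);
  return Buffer.concat([header, body]);
}

const stream = Buffer.concat([frame('ping'), frame('hello world')]);

// Worst case: the stream arrives one byte per "data event".
const received = [];
const push = createFrameParser((msg) => received.push(msg.toString('utf8')));
for (let i = 0; i < stream.length; i++) {
  push(stream.subarray(i, i + 1));
}
console.log(received); // [ 'ping', 'hello world' ]
```

Feeding the whole stream in a single call yields the same two messages, which is the property a framing layer is supposed to provide.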

4. Choosing a framing method

  • Fixed length: simplest, but only works if all messages have identical size.
  • Length prefix: flexible for variable‑length messages; requires careful handling of multi‑byte integer encoding (big‑endian vs little‑endian).
  • Delimiter: easy to implement for text, but you must escape the delimiter inside the message if it can appear naturally.
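
For the delimiter approach, one common way to avoid escaping by hand is to serialize each message as JSON on a single line (often called newline-delimited JSON). When JSON.stringify is called without an indentation argument, a line break inside a string is emitted as the two characters \ and n, so a raw newline in the output can only be a delimiter. A minimal sketch, with encodeLine and decodeLines as hypothetical helper names:

```javascript
// Hypothetical helpers for newline-delimited JSON.
function encodeLine(value) {
  return JSON.stringify(value) + '\n';
}

// Assumes text contains only complete lines; a real receiver would first
// buffer partial lines as described in the delimiter method.
function decodeLines(text) {
  return text
    .split('\n')
    .filter((line) => line !== '')
    .map((line) => JSON.parse(line));
}

const wire = encodeLine({ text: 'line one\nline two' }) + encodeLine({ text: 'bye' });
const messages = decodeLines(wire);

console.log(wire.split('\n').length - 1); // 2 delimiters, despite the newline inside the first message
console.log(messages[0].text === 'line one\nline two'); // true
```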

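Always design your protocol to handle malformed or malicious input, for example by rejecting length prefixes above a sane maximum so that a client cannot force the server to buffer without limit. For production systems, consider using an established length-prefix framing library or an existing protocol that already defines framing (e.g., HTTP, WebSockets).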
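
As one concrete guard for a length-prefixed protocol: with a 4-byte prefix, a peer can announce a message of up to about 4 GiB, and a receiver that trusts the header would keep buffering until that much data arrives. A sketch of an upper-bound check follows; MAX_MSG_LEN and readDeclaredLength are assumptions chosen for illustration, and the 1 MiB limit is arbitrary.

```javascript
const LEN_SIZE = 4;
const MAX_MSG_LEN = 1024 * 1024; // assumed limit (1 MiB); pick one that fits your protocol

// Returns the declared message length, or null if the header has not fully
// arrived yet. Throws if the peer announces more than the limit allows.
function readDeclaredLength(buffer) {
  if (buffer.length < LEN_SIZE) return null;
  const msgLen = buffer.readUInt32BE(0);
  if (msgLen > MAX_MSG_LEN) {
    throw new RangeError(`declared length ${msgLen} exceeds limit ${MAX_MSG_LEN}`);
  }
  return msgLen;
}

console.log(readDeclaredLength(Buffer.from([0, 0, 0, 5]))); // 5
console.log(readDeclaredLength(Buffer.from([0, 0])));       // null
```

In a data handler, the error would typically be caught and followed by socket.destroy(), so that the connection is dropped instead of being left to accumulate data.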

Tags: Node.js tcp Buffer message framing byte stream

Posted on Sat, 09 May 2026 06:54:08 +0000 by SnakeFox