System Architecture & Workflow
Implementing client-side segmented file transfer requires a deterministic pipeline: acquiring the file, calculating segmentation parameters, distributing the work across auxiliary threads, generating integrity hashes, and transmitting the resulting segments to a backend service.
Core Configuration & Metadata Calculation
The segmentation logic relies on a fixed byte threshold, typically configured at 5 * 1024 * 1024 (5 MB). The total segment count is the file size divided by this threshold, rounded up. To prevent main-thread blocking during intensive cryptographic operations, processing is delegated to Web Workers. The optimal thread pool size aligns with navigator.hardwareConcurrency, falling back to a baseline of 4 when the value is unavailable.
Segment distribution requires precise index mapping. Each worker receives a designated start and end range. Since total segments rarely divide evenly across available cores, the final worker handles the remainder. A completion tracker aggregates worker responses, resolving the final payload once all auxiliary threads terminate.
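For example, a 23 MB file at the 5 MB threshold yields Math.ceil(23 / 5) = 5 segments; on a 4-core machine each worker is assigned Math.ceil(5 / 4) = 2 segments, producing ranges [0, 2), [2, 4), and [4, 5), with the fourth core left idle.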
Parallel Processing & Incremental Hashing
File segmentation utilizes the Blob.prototype.slice method combined with FileReader to extract each slice as an ArrayBuffer. Integrity verification employs the SparkMD5 library. Rather than loading the entire document into memory, each worker hashes one slice at a time: a segment's binary data is appended to a fresh MD5 instance and the digest is finalized before the buffer is released, capping memory consumption at the scale of a segment rather than the whole file.
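The per-segment fingerprints computed below cover chunk-level validation; the server contract in the next section additionally expects a hash of the entire file. Here is a minimal sketch of producing that fingerprint incrementally with SparkMD5, assuming Blob.prototype.arrayBuffer support; fingerprintFile is a hypothetical helper, not one of the modules listed later:

import SparkMD5 from './spark-md5.min.js';

// Hypothetical helper: hash the whole file one slice at a time so only a
// single slice's ArrayBuffer is resident in memory at any moment.
export async function fingerprintFile(blob, sizeLimit = 1024 * 1024 * 5) {
  const hashEngine = new SparkMD5.ArrayBuffer();
  for (let offset = 0; offset < blob.size; offset += sizeLimit) {
    const buffer = await blob.slice(offset, offset + sizeLimit).arrayBuffer();
    hashEngine.append(buffer); // feed the running digest
  }
  return hashEngine.end(); // hex digest of the entire file
}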
Server-Side API Contract
The back end must expose three endpoints to support resumable transfers (a client-side sketch of the contract follows the list):
- Integrity Verification: Accepts the file hash. Returns a status indicating whether the upload is complete (enabling instant transfer), partially complete (returning the next required segment index), or missing.
- Segment Transmission: Receives individual chunks with their corresponding metadata.
- Assembly Trigger: Validates the hash and instructs the server to concatenate stored segments into the final file.
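A minimal client-side sketch of this contract; the endpoint paths, field names, and response shapes are illustrative assumptions, not a fixed API:

// Hypothetical endpoint paths and payload shapes, for illustration only.
export async function verifyUpload(fileHash) {
  const res = await fetch(`/upload/verify?hash=${encodeURIComponent(fileHash)}`);
  // e.g. { status: 'complete' | 'partial' | 'missing', nextIndex?: number }
  return res.json();
}

export async function sendSegment(fileHash, segment) {
  const body = new FormData();
  body.append('hash', fileHash);
  body.append('index', String(segment.index));
  body.append('chunk', segment.data); // the Blob slice produced by extractSegment
  await fetch('/upload/segment', { method: 'POST', body });
}

export async function triggerAssembly(fileHash, totalSegments) {
  await fetch('/upload/assemble', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ hash: fileHash, total: totalSegments })
  });
}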
Client-Side Implementation
The architecture is modularized across an initialization script, a coordinator module, a worker definition, and a segment extractor.
Coordinator Module (segmentator.js)
Handles range calculation, thread instantiation, and response aggregation.
const SLICE_BYTE_LIMIT = 1024 * 1024 * 5; // 5 MB segmentation threshold
const AVAILABLE_CORES = navigator.hardwareConcurrency || 4; // baseline when unreported

export async function segmentDocument(targetFile) {
  return new Promise((resolve, reject) => {
    const totalSegments = Math.ceil(targetFile.size / SLICE_BYTE_LIMIT);
    const segmentsPerWorker = Math.ceil(totalSegments / AVAILABLE_CORES);
    let activeThreads = 0;
    let completedThreads = 0;
    const workerResults = new Array(totalSegments);
    for (let coreIndex = 0; coreIndex < AVAILABLE_CORES; coreIndex++) {
      const rangeStart = coreIndex * segmentsPerWorker;
      // Surplus cores are skipped once every segment has been assigned
      if (rangeStart >= totalSegments) break;
      const rangeEnd = Math.min(rangeStart + segmentsPerWorker, totalSegments);
      activeThreads++;
      const worker = new Worker('./chunk-worker.js', { type: 'module' });
      worker.onmessage = ({ data }) => {
        const { payload, origin } = data;
        // Write results at their absolute segment indices so ordering
        // survives out-of-order worker completion
        for (let i = 0; i < payload.length; i++) {
          workerResults[origin + i] = payload[i];
        }
        worker.terminate(); // this worker's range is complete
        completedThreads++;
        if (completedThreads === activeThreads) {
          resolve(workerResults);
        }
      };
      // Without an error handler, a failed worker would leave the promise pending forever
      worker.onerror = (event) => {
        worker.terminate();
        reject(event);
      };
      worker.postMessage({
        sourceBlob: targetFile,
        byteLimit: SLICE_BYTE_LIMIT,
        startOffset: rangeStart,
        endOffset: rangeEnd
      });
    }
  });
}
Initialization Entry (entrypoint.js)
Binds the UI input to the segmentation pipeline.
import { segmentDocument } from './segmentator.js';
const fileSelector = document.querySelector('input[type="file"]');
fileSelector.addEventListener('change', async ({ target }) => {
if (!target.files.length) return;
const processedSegments = await segmentDocument(target.files[0]);
console.log('Segments ready for transmission:', processedSegments);
});
Worker Definition (chunk-worker.js)
Manages concurrent execution of segment extraction and hashing.
import { extractSegment } from './segment-extractor.js';
self.onmessage = async ({ data }) => {
  const { sourceBlob, byteLimit, startOffset, endOffset } = data;
  // Launch every extraction in this worker's range concurrently
  const processingQueue = [];
  for (let idx = startOffset; idx < endOffset; idx++) {
    processingQueue.push(extractSegment(sourceBlob, idx, byteLimit));
  }
  const resolvedPayload = await Promise.all(processingQueue);
  // Report the absolute starting index so the coordinator can merge
  // results from out-of-order workers
  self.postMessage({
    payload: resolvedPayload,
    origin: startOffset
  });
};
Segment Extractor (segment-extractor.js)
Reads binary slices and computes cryptographic fingerprints.
import SparkMD5 from './spark-md5.min.js';
export function extractSegment(blob, segmentIndex, sizeLimit) {
  return new Promise((resolve, reject) => {
    const byteStart = segmentIndex * sizeLimit;
    // Clamp to the blob size so the final segment reports its true extent
    const byteEnd = Math.min(byteStart + sizeLimit, blob.size);
    const segmentBlob = blob.slice(byteStart, byteEnd); // slice once, reuse below
    const hashEngine = new SparkMD5.ArrayBuffer();
    const reader = new FileReader();
    reader.onload = ({ target: { result } }) => {
      hashEngine.append(result); // hash only this segment's bytes
      resolve({
        index: segmentIndex,
        data: segmentBlob,
        fingerprint: hashEngine.end(), // per-segment MD5 digest
        byteRange: { start: byteStart, end: byteEnd }
      });
    };
    reader.onerror = () => reject(reader.error); // avoid a silently hung promise
    reader.readAsArrayBuffer(segmentBlob);
  });
}
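Tying the modules together, a hedged sketch of the end-to-end transfer flow. It reuses the hypothetical fingerprintFile and endpoint helpers from the earlier sketches; the module paths './fingerprint.js' and './transport.js' are likewise assumptions:

import { segmentDocument } from './segmentator.js';
// Hypothetical modules from the earlier sketches
import { fingerprintFile } from './fingerprint.js';
import { verifyUpload, sendSegment, triggerAssembly } from './transport.js';

export async function uploadDocument(file) {
  const fileHash = await fingerprintFile(file);
  const { status, nextIndex = 0 } = await verifyUpload(fileHash);
  if (status === 'complete') return; // server already has it: instant transfer
  const segments = await segmentDocument(file);
  // Resume from the first segment the server has not yet stored
  const pending = status === 'partial' ? segments.slice(nextIndex) : segments;
  for (const segment of pending) {
    await sendSegment(fileHash, segment); // sequential for clarity
  }
  await triggerAssembly(fileHash, segments.length);
}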