High-Performance Multipart Upload and Merge with MinIO

Multipart Upload Architecture

MinIO delivers near-linear scalability for large-object workloads by combining goroutine-level parallelism with a shared-nothing distributed design. Written in Go, it exposes the complete S3 multipart API surface:

  1. CreateMultipartUpload – reserve an upload session
  2. UploadPart – stream individual parts concurrently
  3. CompleteMultipartUpload – atomically stitch parts into a single object
  4. AbortMultipartUpload – rollback on failure or timeout

These primitives map one-to-one to MinIO SDK calls, eliminating the need for custom glue code.

Detecting an Existing Part

Before each upload, verify that the part has not already been persisted under the key chunks/{fileHash}/{partNumber}.
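The key layout can be captured in a small helper. This is a sketch; the class and method names (`PartKeys`, `partKey`) are illustrative, not part of any SDK:

```java
public class PartKeys {
    // Builds the object key under which part {partNumber} of the file
    // identified by {fileHash} is stored: chunks/{fileHash}/{partNumber}.
    public static String partKey(String fileHash, int partNumber) {
        return "chunks/%s/%d".formatted(fileHash, partNumber);
    }
}
```

Because the key is a pure function of the file hash and part index, a retried upload of the same part always targets the same object, which is what makes the existence check below meaningful.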

private boolean partExists(String bucket, String objectKey) {
    try {
        minioClient.statObject(
            StatObjectArgs.builder()
                          .bucket(bucket)
                          .object(objectKey)
                          .build()
        );
        return true;
    } catch (ErrorResponseException e) {
        // A missing key is reported as an error response, not a null return.
        if ("NoSuchKey".equals(e.errorResponse().code())) {
            return false;
        }
        throw new RuntimeException("Unable to query object " + objectKey, e);
    } catch (Exception e) {
        // statObject also declares IOException, InvalidKeyException,
        // XmlParserException and friends; wrap them uniformly.
        throw new RuntimeException("Unable to query object " + objectKey, e);
    }
}

Streaming a Part

public void uploadPart(MultipartFile source, String bucket, String key) throws IOException {
    try (InputStream in = source.getInputStream()) {
        PutObjectArgs args = PutObjectArgs.builder()
                                          .bucket(bucket)
                                          .object(key)
                                          // partSize -1: the object size is known
                                          // up front, so the SDK streams it as-is
                                          // without internal re-chunking
                                          .stream(in, source.getSize(), -1)
                                          .contentType(source.getContentType())
                                          .build();
        minioClient.putObject(args);
    } catch (Exception e) {
        throw new IOException("Upload of part " + key + " failed", e);
    }
}

End-to-End Merge Pipeline

1. Authorization Gate

Query file_upload by composite key (userId, fileMd5) to confirm the user initiated the upload.

SELECT id FROM file_upload WHERE user_id = ? AND file_md5 = ? AND status = 'PENDING';

2. Completeness Check

Compute expected part count:

long expectedParts = (fileSizeBytes + PART_SIZE - 1) / PART_SIZE;
long uploadedParts = jdbcTemplate.queryForObject(
        "SELECT COUNT(*) FROM chunk_info WHERE user_id = ? AND file_md5 = ? AND status = 'OK'",
        Long.class, userId, fileMd5);
if (uploadedParts != expectedParts) {
    throw new IncompleteUploadException(uploadedParts, expectedParts);
}
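As a worked example of the ceiling division above (assuming a 5 MiB part size, a common choice that also satisfies ComposeObject's minimum source size), a 12 MiB file needs three parts:

```java
public class PartCount {
    static final long PART_SIZE = 5L * 1024 * 1024; // 5 MiB, an assumed value

    // Ceiling division: the final part may be smaller than PART_SIZE.
    public static long expectedParts(long fileSizeBytes) {
        return (fileSizeBytes + PART_SIZE - 1) / PART_SIZE;
    }
}
```

For a 12 MiB file this yields 3 parts (two full 5 MiB parts plus a 2 MiB tail); an exact multiple of the part size adds no extra part.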

3. Atomic Merge

MinIO natively supports server-side concatenation via ComposeObject (note that every source part except the last must be at least 5 MiB). The orchestrator iterates over the part keys and builds a single manifest:

List<ComposeSource> sources = IntStream.range(0, (int) expectedParts)
    .mapToObj(i -> ComposeSource.builder()
                                .bucket("uploads")
                                .object("chunks/%s/%d".formatted(fileMd5, i))
                                .build())
    .toList();

minioClient.composeObject(
    ComposeObjectArgs.builder()
                     .bucket("uploads")
                     .object("final/%s".formatted(fileMd5))
                     .sources(sources)
                     .build()
);
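Stripped of the client calls, the manifest construction reduces to generating the part keys in order. A minimal sketch (the `Manifest` class is hypothetical) showing what the source list resolves to for expectedParts = 3:

```java
import java.util.List;
import java.util.stream.IntStream;

public class Manifest {
    // Same iteration order as the ComposeSource list above: part 0 first.
    public static List<String> partKeys(String fileMd5, int expectedParts) {
        return IntStream.range(0, expectedParts)
                        .mapToObj(i -> "chunks/%s/%d".formatted(fileMd5, i))
                        .toList();
    }
}
```

Order matters: ComposeObject concatenates sources in list order, so the keys must be emitted by ascending part index.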

4. Post-Merge Cleanup

  • Verify merged object size equals original file size via statObject.
  • Delete part objects in parallel using removeObjects.
  • Evict Redis keys matching chunks:{fileMd5}:* (DEL takes literal keys, not glob patterns, so enumerate the matches with SCAN first).
  • Update file_upload.status = 'COMPLETED'.
  • Produce a Kafka event FileMerged containing the object key and user ID.
  • Generate a presigned GET URL with one-hour expiry for immediate client access.
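On the Redis eviction step: since DEL does not expand patterns, the keys must be enumerated (via SCAN) and filtered client-side. The filter itself is a simple prefix test, sketched here independently of any Redis client library (class and method names are illustrative):

```java
import java.util.List;

public class ChunkKeyEviction {
    // Returns the subset of scanned keys belonging to this upload,
    // i.e. those matching the pattern chunks:{fileMd5}:*.
    public static List<String> keysToEvict(List<String> scannedKeys, String fileMd5) {
        String prefix = "chunks:" + fileMd5 + ":";
        return scannedKeys.stream()
                          .filter(k -> k.startsWith(prefix))
                          .toList();
    }
}
```

In practice the SCAN cursor loop would feed batches of keys through this filter and pass the survivors to a single DEL (or UNLINK, to avoid blocking) call.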

The entire pipeline is idempotent: re-invoking the merge endpoint with the same parameters is safe because the final object key is deterministic and prior cleanup removes any stale state.

Tags: MinIO multipart-upload s3 object-storage java

Posted on Fri, 08 May 2026 14:21:11 +0000 by Robert07