In Java applications, it's common to convert CSV (Comma-Separated Values) data into a binary stream, for instance when preparing files for download or transmitting structured data over a network. This article demonstrates how to read CSV content and serialize it into a byte stream using standard I/O classes and the OpenCSV library.
1. Reading CSV Content
Parsing CSV by hand is error-prone, so third-party libraries such as OpenCSV are commonly used to handle edge cases like quoted fields and embedded commas. The following example reads a CSV file record by record:
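import com.opencsv.CSVReader;
import com.opencsv.exceptions.CsvValidationException;
import java.io.FileReader;
import java.io.IOException;

// Note: FileReader decodes the file with the platform default charset
try (CSVReader csvReader = new CSVReader(new FileReader("data.csv"))) {
    String[] record;
    while ((record = csvReader.readNext()) != null) {
        // Each 'record' is an array of field values
    }
} catch (IOException | CsvValidationException ex) {
    // OpenCSV 5.x declares CsvValidationException on readNext()
    ex.printStackTrace();
}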
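To illustrate why a dedicated parser helps with quoted fields, the following sketch shows what happens when a line is split on every comma using only the standard library. NaiveSplitDemo and naiveSplit are hypothetical names chosen for this illustration and are not part of OpenCSV.

```java
final class NaiveSplitDemo {
    // Splits on every comma and ignores CSV quoting rules (deliberately naive).
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        // Two logical fields: "Smith, John" and 42
        String line = "\"Smith, John\",42";
        String[] fragments = naiveSplit(line);
        System.out.println(fragments.length); // prints 3, not 2
        for (String fragment : fragments) {
            System.out.println("[" + fragment + "]");
        }
    }
}
```

The quoted name is cut in two at its embedded comma, which is the kind of case a CSV parser is expected to handle.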
2. Serializing CSV Rows to a Binary Stream
Once parsed, each row can be reconstructed as a comma-separated string and encoded into bytes. These bytes are accumulated in a ByteArrayOutputStream, which acts as an in-memory binary buffer:
import com.opencsv.CSVReader;
import com.opencsv.exceptions.CsvValidationException;
import java.io.*;
import java.nio.charset.StandardCharsets;

try (CSVReader reader = new CSVReader(new FileReader("data.csv"));
     ByteArrayOutputStream buffer = new ByteArrayOutputStream()) {
    String[] row;
    while ((row = reader.readNext()) != null) {
        String line = String.join(",", row);
        buffer.write(line.getBytes(StandardCharsets.UTF_8));
        buffer.write('\n'); // Preserve line breaks
    }
    byte[] binaryPayload = buffer.toByteArray();
    // 'binaryPayload' now holds the full CSV content as a byte array
} catch (IOException | CsvValidationException e) {
    // OpenCSV 5.x declares CsvValidationException on readNext()
    e.printStackTrace();
}
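Note the explicit use of StandardCharsets.UTF_8 to ensure consistent encoding, and the addition of newline characters to maintain row separation in the output. One caveat: String.join(",", row) does not restore quoting, so a field that itself contains a comma, a double quote, or a line break is written unescaped, and the output may no longer parse back into the same rows.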
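Joining fields with a plain comma loses any quoting the parser removed. One way to keep the output parseable is to re-quote fields before encoding them. The following is a minimal sketch using only the standard library, assuming RFC 4180-style quoting and non-null field values. CsvRowEncoder, escapeField, and encodeRow are hypothetical names introduced here, not OpenCSV APIs.

```java
import java.nio.charset.StandardCharsets;

final class CsvRowEncoder {
    // Quotes a field only if it contains a comma, quote, CR, or LF,
    // and doubles any embedded double quotes.
    static String escapeField(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Builds one newline-terminated CSV line and encodes it as UTF-8 bytes.
    static byte[] encodeRow(String[] row) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < row.length; i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(escapeField(row[i]));
        }
        line.append('\n');
        return line.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = encodeRow(new String[] {"Smith, John", "42"});
        System.out.print(new String(bytes, StandardCharsets.UTF_8)); // prints "Smith, John",42
    }
}
```

In the loop shown earlier, the result of encodeRow(row) could be passed to buffer.write in place of the joined string and the separate newline write.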
3. Writing the Binary Stream to a File or HTTP Response
The resulting byte array can be written to a file:
// Assumes 'binaryPayload' is in scope; the enclosing method must handle IOException
try (FileOutputStream fos = new FileOutputStream("output.bin")) {
    fos.write(binaryPayload);
}
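As an alternative sketch, the java.nio.file.Files utility class can write and read a byte array in one call each. The example below writes a small payload to a temporary file and reads it back to confirm the bytes survive unchanged. PayloadFileRoundTrip and roundTrip are hypothetical names, and the sample payload is made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class PayloadFileRoundTrip {
    // Writes the payload to a temporary file, reads it back, then deletes the file.
    static byte[] roundTrip(byte[] payload) throws IOException {
        Path target = Files.createTempFile("csv-payload", ".bin");
        try {
            Files.write(target, payload);
            return Files.readAllBytes(target);
        } finally {
            Files.deleteIfExists(target);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "id,name\n1,Ada\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(payload, roundTrip(payload))); // prints true
    }
}
```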
Or served directly in a web application via an HTTP servlet:
import com.opencsv.CSVReader;
import com.opencsv.exceptions.CsvValidationException;
import javax.servlet.http.*;
import java.io.*;
import java.nio.charset.StandardCharsets;

public class CsvDownloadServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        byte[] content;
        // Build the payload first so a read or parse failure can still yield an error response
        try (CSVReader reader = new CSVReader(new FileReader("data.csv"));
             ByteArrayOutputStream buffer = new ByteArrayOutputStream()) {
            String[] row;
            while ((row = reader.readNext()) != null) {
                buffer.write(String.join(",", row).getBytes(StandardCharsets.UTF_8));
                buffer.write('\n');
            }
            content = buffer.toByteArray();
        } catch (IOException | CsvValidationException e) {
            // OpenCSV 5.x declares CsvValidationException on readNext()
            resp.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
            return;
        }
        // Headers are set only after the payload was built successfully
        resp.setContentType("application/octet-stream");
        resp.setContentLength(content.length);
        resp.setHeader("Content-Disposition", "attachment; filename=\"data.bin\"");
        try (OutputStream out = resp.getOutputStream()) {
            out.write(content);
        }
    }
}
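The servlet buffers the entire file in memory before sending it. For large files, one alternative is to write each row straight to the output stream as it is read. The following is a minimal sketch using only the standard library. RowStreamer and streamRows are hypothetical names, and a ByteArrayOutputStream stands in for resp.getOutputStream() so the example runs without a servlet container.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class RowStreamer {
    // Writes each row to 'out' as soon as it is available and
    // returns the total number of bytes written.
    static long streamRows(Iterable<String[]> rows, OutputStream out) throws IOException {
        long written = 0;
        for (String[] row : rows) {
            byte[] line = (String.join(",", row) + "\n").getBytes(StandardCharsets.UTF_8);
            out.write(line);
            written += line.length;
        }
        out.flush();
        return written;
    }

    public static void main(String[] args) throws IOException {
        List<String[]> rows = Arrays.asList(
                new String[] {"id", "name"},
                new String[] {"1", "Ada"});
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for the response stream
        System.out.println(streamRows(rows, sink)); // prints 14
    }
}
```

With this approach the total size is not known before the first byte is sent, so the setContentLength call would have to be dropped.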
Understanding Binary Streams in Java
Binary streams in Java are handled through the InputStream and OutputStream abstract classes. Unlike character streams, they operate on raw bytes and are suitable for non-textual data or when precise byte-level control is needed.
- InputStream subclasses like FileInputStream and ByteArrayInputStream read raw bytes.
- OutputStream subclasses like FileOutputStream and ByteArrayOutputStream write raw bytes.
Key methods include read(byte[] b) for bulk reading, which returns the number of bytes actually read or -1 at end of stream, and write(byte[] b) for bulk writing. Always close streams using try-with-resources to prevent resource leaks.
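The bulk read and write methods are typically combined in a copy loop. The sketch below shows one common form, using in-memory streams so it runs without any files. StreamCopy and copy are hypothetical names, and the 4096-byte chunk size is an arbitrary choice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class StreamCopy {
    // Copies all bytes from 'in' to 'out' and returns how many were copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] chunk = new byte[4096];
        long total = 0;
        int n;
        // read(byte[]) may fill only part of the array, so write exactly n bytes
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = "id,name\n1,Ada\n".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(source);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            System.out.println(copy(in, out)); // prints 14
        }
    }
}
```

Writing with write(chunk, 0, n) rather than write(chunk) matters on the final iteration, when the array is usually only partly filled.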