A recent industry-focused code review competition highlighted numerous common pitfalls found in production-level Java applications. The exercise involved identifying logical errors, security vulnerabilities, performance bottlenecks, and style inconsistencies across various code snippets. The following analysis breaks down key defect categories discovered during this audit process.
Variable Naming and Input Validation
The first layer of defects often resides in basic coding standards. A frequent observation involves typos in variable names, such as using cazeList instead of caseList. While seemingly minor, inconsistent spelling reduces code readability and maintenance efficiency. Furthermore, assuming that input parameters like caseList are never null is a critical oversight. Accessing the first element (get(0)) without an explicit null check can lead to immediate runtime exceptions. Robust code must validate inputs at entry points.
// Improved Version
public void process(List testCaseList) {
if (testCaseList == null || testCaseList.isEmpty()) {
return;
}
// Safe access to data
}
Exception Handling Logic
Catching exceptions to alter program flow is generally discouraged. A common mistake involves placing business logic, such as creating a database table, inside a catch block intended for unexpected errors. Table existence checks should occur before the operation attempt, typically wrapped in a standard conditional check rather than relying on exception propagation. This improves performance and clarity.
// Better Practice: Pre-check instead of Catch-block logic
if (!tableExists(tableName)) {
createTable(tableName);
}
batchInsert();
Spring Transaction Management
A classic anti-pattern observed involves the internal invocation of methods annotated with @Transactional. Spring AOP manages transactions through proxies. If a method calls another transactional method within the same class (self-invocation), the proxy is bypassed, and the transactional semantics do not apply. To resolve this, transactional methods should be called via the injected proxy reference.
Singleton Pattern Implementation
In concurrent environments, the Double-Checked Locking (DCL) singleton pattern requires careful implementation. Simply declaring the instance variable is insufficient. It must be marked with the volatile keyword to prevent instruction reordering during object initialization. Without volatile, other threads might retrieve an uninitialized object instance, leading to potential crashes or data corruption.
Resource Management in Streams
Operating system resources, such as file output streams, require deterministic cleanup. Failing to close streams within a finally block or utilizing Java 7+ try-with-resources syntax can result in file lock issues and memory leaks. Additionally, log statements involving stack traces must include the exception object as a second argument (e.g., logger.error("msg", e)). Passing only a message string discards the stack trace, hindering root cause analysis.
Data Processing Optimization
Efficiency is paramount in bulk operations. Querying large datasets entirely into memory before filtering is inefficient compared to pushing the filter criteria (like pagination limits) to the database level. Fetching 100,000 records to sort the top 10 locally wastes bandwidth and memory. Concurrent processing also warrants attention: parallel streams should not be used if the underlying collection is mutable during iteration without synchronization.
Concurrency and Thread Safety
Using non-thread-safe collections like ArrayList within multi-threaded contexts poses significant risks. Specifically, calling addAll on a shared list from multiple threads can corrupt internal state. While parallel streams iterate safely over immutable views, mutating a shared list concurrently requires protection. Similarly, iterating over an aray or list while removing elements triggers ConcurrentModificationException. The safe approach is using an iterator for removals.
Another subtle concurrency issue involves modifying static fields of enum instances concurrently. Since enums are singletons, mutating their internal state (e.g., appending an ID to a reason string) across multiple threads leads to race conditions where logs reflect incorrect correlation IDs.
Database Interaction and Injection Risks
Passing table names or column identifiers directly into SQL queries creates SQL injection vulnerabilities. Parameterized queries should strictly separate executable code from dynamic data. Deep paging strategies (e.g., skipping milllions of rows) degrade performance significantly; cursor-based pagination using range queries (last seen ID) is preferred for large datasets.
Serialization Consistency
Mixing different serialization libraries (e.g., FastJSON for serialization and Gson for deserialization) can cause data loss. Specific field naming conventions differ between implementations. For instance, boolean getters prefixed with is might be serialized differently depending on the library version. Using wrapper types (e.g., Boolean instead of boolean) is safer to distinguish between missing data (null) and false values (false).
Thread Pool Configuration
Default thread pools like newCachedThreadPool create new threads without limit, which can exhaust system resources under heavy load, especially with high-volume tasks like processing URLs. A bounded executor with a fixed core size and rejection policies tailored to application needs is recommended. Never leave rejection handlers silent; they must log errors to ensure operational visibility.
String Manipulation Security
Regular expressions passed as arguments to string replacement functions can be exploited if user input contains special regex characters. Special characters like * or . in input strings may trigger unintended behavior during replacements. Literal matching or escaping special characters mitigates this risk.
Atomic Operations in Caching
Setting keys in distributed caches (e.g., Redis) followed by setting expiration times must be treated as a single atomic operation. Performing set(key, val) then expire(key, time) separately leaves a window where the key exists but lacks a timeout, potentially causing cache bloat. Using setnx or native TTL support ensures atomicity.
Coding Standards and Equality
Comparing Integer objects using == is unreliable due to caching mechanisms for values between -128 and 127. Always use the equals method for object comparison. Custom objects added to Sets or HashMaps must override both hashCode and equals to ensure correct identity mapping and containment checks.
Observability and Logging
Debugging production issues requires contextual data. Log entries should include unique identifiers (like Order IDs or Trace IDs) to correlate events across distributed systems. Generic logs without context are ineffective for incident response.