Avoiding Negative Hash Codes from Java's String.hashCode() in Sharding Scenarios

When sharding data by a non-integer primary key or a composite key (usually concatenated in to a string), developers often rely on Java's built-in String.hashCode() method to derive a hash value, which is then used for modulo-based sharding. While convenient, this approach can produce negative hash codes, leading to invalid shard numbers. Understanding why this happens and how to mitigate it is essential.

Java's String.hashCode() calculates a hash using the following formula:

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

Where s[i] is the i-th character of the string, n is the string length. The computation uses int arithmetic, meaning results wrap around on overflow. The implementation is:

public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

The hash is an int, so its range is Integer.MIN_VALUE (-2147483648) to Integer.MAX_VALUE (2147483647). When the string is long enough, the repeated multiplication by 31 and addition of character values can exceed Integer.MAX_VALUE, causing integer overflow and a negative result.

Here are some illustrative examples:

String hashStr0 = "35953305172933/";
System.out.println(hashStr0.hashCode());            // 2147483647   Integer.MAX_VALUE
System.out.println(Math.abs(hashStr0.hashCode()));  // 2147483647   Integer.MAX_VALUE
System.out.println("-------------------");
String hashStr = "359533051729330";
System.out.println(hashStr.hashCode());             // -2147483648  Integer.MIN_VALUE
System.out.println(Math.abs(hashStr.hashCode()));   // -2147483648  Integer.MIN_VALUE
System.out.println("-------------------");
String hashStr2 = "56800004874";
System.out.println(hashStr2.hashCode());            // -2082984168
System.out.println(Math.abs(hashStr2.hashCode()));  // 2082984168
System.out.println("-------------------");
String hashStr3 = "";
System.out.println(hashStr3.hashCode());            // 0
System.out.println(Math.abs(hashStr3.hashCode()));  // 0

Notice that "359533051729330" yields Integer.MIN_VALUE. Calling Math.abs() on Integer.MIN_VALUE returns Integer.MIN_VALUE (a negative number) because the absolute value overflows int range. This makes Math.abs() unreliable for ensuring non-negative results in edge cases.

To safely obtain a non-negative hash, use a bitwise AND with Integer.MAX_VALUE. This clears the sign bit while preserving the rest of the value:

int safeHash = str.hashCode() & Integer.MAX_VALUE;

Since Integer.MAX_VALUE is 0x7FFFFFFF, and its most significant bit is 0, the result of & is always non-negative (in the range 0 to Integer.MAX_VALUE). This approach handles all strings, including those whose hash code is Integer.MIN_VALUE.

Tags: java hashCode Sharding integer overflow String hashing

Posted on Tue, 19 May 2026 15:09:12 +0000 by brianbehrens