Key-Value vs. Unique-Key Abstractions
Java’s Map and Set interfaces serve distinct roles in efficient data retrieval. Map implements a key-value association model—ideal for scenarios like counting word frequencies or mapping identifiers to attributes. In contrast, Set models uniqueness-only lookups—suitable for membership checks, such as verifying whether a username exists in a registry.
Map<K, V> is not part of the Collection hierarchy; it declares operations for associating keys with values. Keys must be unique; values may repeat. Set<E> extends Collection, enforcing element uniqueness without value pairing.
Core Map Operations
public static void demonstrateMapUsage() {
    Map<String, Integer> frequencyMap = new HashMap<>();
    // Insert or update entry
    frequencyMap.put("apple", 1);
    // Retrieve value; returns null if absent
    Integer count = frequencyMap.get("apple");
    // Retrieve with fallback default
    int safeCount = frequencyMap.getOrDefault("banana", 0);
    // Remove mapping
    frequencyMap.remove("apple");
    // Check existence
    boolean hasApple = frequencyMap.containsKey("apple");
    boolean hasZero = frequencyMap.containsValue(0);
}
Key behavioral notes:
- Map implementations include HashMap (average O(1) access, permits null keys and values) and TreeMap (O(log n), sorted iteration, disallows null keys).
- Keys are immutable in place; updating a key requires removal and reinsertion.
- Keys can be extracted via keySet() (a Set<K>), while values form a Collection<V>, which is not necessarily unique.
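The minimal sketch below (assuming java.util imports; the words and counts are arbitrary) illustrates the last two points: TreeMap iterates its keys in sorted order, and the keySet() and values() views expose keys and values separately.
public static void demonstrateMapViews() {
    Map<String, Integer> counts = new TreeMap<>();
    counts.put("pear", 2);
    counts.put("apple", 2);
    counts.put("banana", 5);
    // TreeMap iterates keys in sorted order: apple, banana, pear
    for (String key : counts.keySet()) {
        System.out.println(key + " -> " + counts.get(key));
    }
    // values() is a Collection<V>; duplicate values (here, the two 2s) are preserved
    Collection<Integer> values = counts.values();
    System.out.println(values.size()); // 3
}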
Core Set Operations
public static void demonstrateSetUsage() {
    Set<String> userNames = new HashSet<>();
    // Add element; no-op if duplicate
    userNames.add("alice");
    // Membership test
    boolean exists = userNames.contains("bob");
    // Remove element
    userNames.remove("alice");
    // Size and emptiness
    int size = userNames.size();
    boolean isEmpty = userNames.isEmpty();
    // Clear all entries
    userNames.clear();
}
Important distinctions:
- HashSet uses hashing for average O(1) operations; TreeSet maintains sorted order via a red-black tree (O(log n)).
- LinkedHashSet preserves insertion order using a doubly-linked list overlay.
- TreeSet rejects null; HashSet accepts it.
- Like Map keys, Set elements cannot be mutated post-insertion without removal.
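A minimal sketch of the ordering differences (assuming java.util imports; the element values are arbitrary):
public static void demonstrateSetOrdering() {
    List<String> input = List.of("carol", "alice", "bob");
    Set<String> hashed = new HashSet<>(input);       // iteration order unspecified
    Set<String> linked = new LinkedHashSet<>(input); // insertion order: carol, alice, bob
    Set<String> sorted = new TreeSet<>(input);       // natural order: alice, bob, carol
    System.out.println(hashed);
    System.out.println(linked);
    System.out.println(sorted);
}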
Hash Table Fundamentals
A hash table enables near-constant-time lookup by transforming keys into array indices via a hash function. For example:
int index = key.hashCode() % table.length;
When two distinct keys map to the same index—a collision—the structure must resolve it. Collisions are inevitable when the number of possible keys exceeds table capacity.
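Note that hashCode() can return a negative value, so a raw % may yield a negative index. One defensive sketch (not the exact scheme used by java.util.HashMap, which mixes the hash bits and masks against a power-of-two length) is:
static int bucketIndex(Object key, int tableLength) {
    // floorMod keeps the result in [0, tableLength), even for negative hash codes
    return Math.floorMod(key.hashCode(), tableLength);
}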
Collision Resolution Strategies
Open Addressing (Closed Hashing)
- Probes alternate slots within the same array upon collision.
- Linear probing: check index + 1, index + 2, etc., wrapping modulo the table length.
- Quadratic probing: try index + i² for i = 1, 2, ... to reduce clustering.
- Requires tombstone markers for deletions to preserve search integrity.
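A compact open-addressing sketch using linear probing and tombstones (integer keys only; no resizing, so it assumes the table never fills; Integer.MIN_VALUE is reserved as the tombstone sentinel and cannot be stored as a real key):
class LinearProbingSet {
    private static final Integer TOMBSTONE = Integer.MIN_VALUE; // reserved sentinel, not a valid key
    private final Integer[] slots = new Integer[16];

    boolean contains(int key) {
        int idx = Math.floorMod(key, slots.length);
        // A never-used (null) slot terminates the probe sequence; tombstones do not
        while (slots[idx] != null) {
            if (!TOMBSTONE.equals(slots[idx]) && slots[idx] == key) {
                return true;
            }
            idx = (idx + 1) % slots.length; // linear probe with wraparound
        }
        return false;
    }

    void add(int key) {
        if (contains(key)) return; // avoid duplicates even when tombstones precede the key
        int idx = Math.floorMod(key, slots.length);
        // Insert into the first tombstone or never-used slot on the probe path
        while (slots[idx] != null && !TOMBSTONE.equals(slots[idx])) {
            idx = (idx + 1) % slots.length;
        }
        slots[idx] = key;
    }

    void remove(int key) {
        int idx = Math.floorMod(key, slots.length);
        while (slots[idx] != null) {
            if (!TOMBSTONE.equals(slots[idx]) && slots[idx] == key) {
                slots[idx] = TOMBSTONE; // tombstone keeps later colliding entries reachable
                return;
            }
            idx = (idx + 1) % slots.length;
        }
    }
}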
Separate Chaining (Open Hashing)
- Each bucket holds a linked list (or other collection) of colliding entries.
- Lookup involves computing the bucket index, then traversing its chain.
- Scalable: chains grow dynamically; load factor controls performance degradation.
Load Factor and Resizing
Load factor α = (number of entries) / (table capacity). Java’s HashMap defaults to 0.75. Exceeding this triggers rehashing: a larger table is allocated, and every existing entry is rehashed into the new structure.
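For example, with the default capacity of 16 and load factor 0.75, the threshold is 16 × 0.75 = 12 entries, so the 13th insertion triggers a resize to 32 buckets. Both parameters can be tuned at construction time; a brief sketch (the sizing values here are arbitrary):
// Pre-size for a large batch of insertions to avoid repeated rehashing
Map<String, Integer> wordCounts = new HashMap<>(2048, 0.75f);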
Minimal Hash Table Implementation
class SimpleHashMap {
    private static class Entry {
        final int key;
        int value;
        Entry next;
        Entry(int k, int v) {
            this.key = k;
            this.value = v;
        }
    }

    private Entry[] buckets;
    private int size;
    private static final double LOAD_THRESHOLD = 0.75;

    SimpleHashMap() {
        this.buckets = new Entry[16];
    }

    void put(int key, int value) {
        int idx = Math.floorMod(key, buckets.length); // floorMod avoids negative indices
        Entry current = buckets[idx];
        // Update existing key
        while (current != null) {
            if (current.key == key) {
                current.value = value;
                return;
            }
            current = current.next;
        }
        // Insert new entry at head
        Entry newNode = new Entry(key, value);
        newNode.next = buckets[idx];
        buckets[idx] = newNode;
        size++;
        if (size >= LOAD_THRESHOLD * buckets.length) {
            resize();
        }
    }

    int get(int key) {
        int idx = Math.floorMod(key, buckets.length);
        Entry current = buckets[idx];
        while (current != null) {
            if (current.key == key) {
                return current.value;
            }
            current = current.next;
        }
        return -1; // Not found
    }

    private void resize() {
        Entry[] oldBuckets = buckets;
        buckets = new Entry[oldBuckets.length * 2];
        size = 0;
        for (Entry head : oldBuckets) {
            while (head != null) {
                put(head.key, head.value); // Rehash each entry
                head = head.next;
            }
        }
    }
}
This implementation uses separate chaining with a linked list per bucket, automatic resizing based on load factor, and basic integer hashing.
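A brief usage sketch of SimpleHashMap:
SimpleHashMap map = new SimpleHashMap();
map.put(42, 7);
map.put(42, 8);                   // overwrites the previous value for key 42
System.out.println(map.get(42));  // 8
System.out.println(map.get(99));  // -1 (not-found sentinel)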