Hash Maps
The most versatile interview data structure — internals, patterns, and when they fail
Hash Maps
If you could only bring one data structure to an interview, bring the hash map. It turns O(n) lookups into O(1), and that single property solves an astonishing number of problems. Most "optimize from brute force" interview answers boil down to "use a hash map."
How Hash Maps Work
A hash map stores key-value pairs in an array of "buckets." The process:
- Hash the key -- convert the key to an integer (the hash code)
- Map to a bucket -- hash code % array size = bucket index
- Handle collisions -- when two keys map to the same bucket
Collision Resolution
Chaining (most common): Each bucket holds a linked list. Collisions just append to the list.
bucket[0] -> (key1, val1) -> (key5, val5)
bucket[1] -> (key2, val2)
bucket[2] -> null
bucket[3] -> (key3, val3) -> (key7, val7) -> (key9, val9)Open addressing: On collision, probe other buckets (linear probing, quadratic probing, double hashing). Used by Python's dict.
Performance
| Operation | Average | Worst case |
|---|---|---|
| Get | O(1) | O(n) |
| Set | O(1) | O(n) |
| Delete | O(1) | O(n) |
| Has key | O(1) | O(n) |
Worst case happens when all keys hash to the same bucket. In practice with a decent hash function, this essentially never happens.
Load factor: When the ratio of entries to buckets exceeds a threshold (typically 0.75), the map resizes -- doubles the bucket array and rehashes everything. This is O(n) but amortized O(1) per insertion.
Pattern 1: Frequency Counting
Count occurrences of each element.
function topKFrequent(nums: number[], k: number): number[] {
// Step 1: Count frequencies
const freq = new Map<number, number>();
for (const num of nums) {
freq.set(num, (freq.get(num) ?? 0) + 1);
}
// Step 2: Bucket sort by frequency
const buckets: number[][] = Array.from({ length: nums.length + 1 }, () => []);
for (const [num, count] of freq) {
buckets[count].push(num);
}
// Step 3: Collect top K from highest frequency
const result: number[] = [];
for (let i = buckets.length - 1; i >= 0 && result.length < k; i--) {
result.push(...buckets[i]);
}
return result.slice(0, k);
}
// Time: O(n), Space: O(n)Character Frequency
function firstUniqChar(s: string): number {
const count = new Map<string, number>();
for (const c of s) {
count.set(c, (count.get(c) ?? 0) + 1);
}
for (let i = 0; i < s.length; i++) {
if (count.get(s[i]) === 1) return i;
}
return -1;
}Pattern 2: Two Sum (Complement Lookup)
The canonical hash map problem. For each element, check if its complement exists.
function twoSum(nums: number[], target: number): [number, number] {
const seen = new Map<number, number>(); // value -> index
for (let i = 0; i < nums.length; i++) {
const complement = target - nums[i];
if (seen.has(complement)) {
return [seen.get(complement)!, i];
}
seen.set(nums[i], i);
}
throw new Error('No solution');
}
// Time: O(n), Space: O(n)This pattern generalizes: whenever you need to find a pair that satisfies some relationship, hash the first element and look up the second.
Pattern 3: Group By
Group elements by a computed key.
function groupAnagrams(strs: string[]): string[][] {
const groups = new Map<string, string[]>();
for (const str of strs) {
// Sort characters as the group key
const key = [...str].sort().join('');
if (!groups.has(key)) groups.set(key, []);
groups.get(key)!.push(str);
}
return [...groups.values()];
}
// Time: O(n * k log k) where k = max string lengthOptimization: Instead of sorting, use a character count as the key:
function groupAnagramsFast(strs: string[]): string[][] {
const groups = new Map<string, string[]>();
for (const str of strs) {
const count = new Array(26).fill(0);
for (const c of str) {
count[c.charCodeAt(0) - 'a'.charCodeAt(0)]++;
}
const key = count.join(',');
if (!groups.has(key)) groups.set(key, []);
groups.get(key)!.push(str);
}
return [...groups.values()];
}
// Time: O(n * k) -- no sorting neededPattern 4: Seen / Visited Tracking
Track which elements you have encountered.
function containsDuplicate(nums: number[]): boolean {
const seen = new Set<number>();
for (const num of nums) {
if (seen.has(num)) return true;
seen.add(num);
}
return false;
}
function longestConsecutive(nums: number[]): number {
const numSet = new Set(nums);
let longest = 0;
for (const num of numSet) {
// Only start counting from the beginning of a sequence
if (!numSet.has(num - 1)) {
let length = 1;
let current = num;
while (numSet.has(current + 1)) {
current++;
length++;
}
longest = Math.max(longest, length);
}
}
return longest;
}
// Time: O(n), Space: O(n)LRU Cache
A classic design problem that combines a hash map with a doubly linked list.
class LRUCache {
private capacity: number;
private cache: Map<number, number>;
constructor(capacity: number) {
this.capacity = capacity;
// Map preserves insertion order in JS/TS
this.cache = new Map();
}
get(key: number): number {
if (!this.cache.has(key)) return -1;
// Move to end (most recently used)
const val = this.cache.get(key)!;
this.cache.delete(key);
this.cache.set(key, val);
return val;
}
put(key: number, value: number): void {
if (this.cache.has(key)) {
this.cache.delete(key);
}
this.cache.set(key, value);
if (this.cache.size > this.capacity) {
// Delete least recently used (first inserted)
const firstKey = this.cache.keys().next().value!;
this.cache.delete(firstKey);
}
}
}
// All operations: O(1)Note: This uses the fact that JavaScript Maps preserve insertion order. In other languages, you would use a doubly linked list + hash map explicitly.
When Hash Maps Fail
Hash maps are not always the answer:
| Situation | Better alternative |
|---|---|
| Need sorted order | Balanced BST, sorted array |
| Need range queries | Balanced BST, segment tree |
| Need min/max quickly | Heap |
| Keys are small integers (0..n) | Plain array -- faster constant |
| Memory is extremely tight | Sorted array + binary search |
| Need persistent/immutable version | Trie, immutable tree |
Hash map overhead: Each entry has overhead for the hash, pointers, and load factor slack. For small integer keys where the range is known, a plain array is both faster and more memory-efficient.
Tips for Interviews
- Default to hash map when you need O(1) lookup. It is almost always the right instinct.
- Consider the key carefully. The grouping key determines the pattern. Anagram -> sorted string. Coordinate -> "x,y" string.
- Watch out for hash collisions in custom keys. If you use a string as a compound key, make sure the encoding is unambiguous.
- Sets are hash maps without values. Use
Setfor membership testing,Mapwhen you need associated data. - Frequency counting + bucket sort gives O(n) top-K without a heap. Useful when the frequency range is bounded by n.