Introduction to HashMap

Principle

Its in essence a advanced array. It’s implementation is based on an array

  • Key is unique and values can be duplicated
key -> hash function -> index -> value

If we can make sure hash function has time complexity of O(1), then we can achieve O(1) time complexity for search, insert and delete operations.

Hash Collision

Its possible that two different keys generate the same index, this is called hash collision.

  • Two keys can generate the same index
  • Two common ways to resolve collision:
    • Chaining: use a linked list to store multiple values at the same index
    • Open addressing: find the next available index

Naturally, to avoid hash collision(if we already have an awesome hash function, there are too many kv pairs in the HashMap), we want to have a large array size (resizing). But this will lead to memory waste. So we need to find a balance between size of array and number of kv pairs. This is called load factor.

\[\text{Load Factor} = \frac{\text{Number of Elements}}{\text{Size of Array}}\]

Resizing happens when load factor exceeds a certain threshold.

In addition, the load factor for chaining can be infinite, but for open addressing, it must be less than 1.

Additional Notes

  • Dont rely on iterative order of HashMap
  • Key should be immutable(something that never change)

Implementation

Chaining

TODO

Open addressing

TODO

LinkedHashMap

HashMap is unordered, so what if we want to maintain the insertion order? We can use LinkedHashMap, which is a combination of HashMap and Doubly Linked List(A Must).

ArrayHashMap

What if we want to add randomKey() function to HashMap, which returns a random key from the HashMap. We can use ArrayHashMap, which is a combination of HashMap and Dynamic Array.

It’s a bit subtle. The Dynamic array will be list of kv nodes. The HashMap will have key to index mapping.

To avoid the cost of O(n) when deleting an element from the dynamic array, we can swap the element to be deleted with the last element, and then pop the last element. This way, we only need to update the index of the swapped element in the HashMap.