Maps in Go 1.24 have undergone a complete rewrite, significantly improving their performance. The new implementation draws inspiration from Google's high-performance hash map design known as Swiss Tables.
Under the new implementation, a map is a collection of groups, each holding 8 key/value pairs. Each group contains 8 slots of data in addition to a metadata field that holds a control word. The control word is 64 bits in size, with each byte representing one of the slots.
+---------------------+
| Control Word (64b) | <- 8 bytes (1 byte per slot)
+---------------------+
| Key 0 | Value 0 |
| Key 1 | Value 1 |
| Key 2 | Value 2 |
| Key 3 | Value 3 |
| Key 4 | Value 4 |
| Key 5 | Value 5 |
| Key 6 | Value 6 |
| Key 7 | Value 7 |
+---------------------+
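To make this layout more concrete, here is a minimal Go sketch of a group. The names (group, slot, ctrl) are illustrative, not the runtime's actual definitions.

// Illustrative sketch of a group; not the runtime's actual types.
const slotsPerGroup = 8

type slot[K comparable, V any] struct {
    key   K
    value V
}

type group[K comparable, V any] struct {
    ctrl  uint64                    // control word: one metadata byte per slot
    slots [slotsPerGroup]slot[K, V] // the 8 key/value pairs
}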
These groups are distributed across multiple tables, with each table containing a number of groups that is always a power of two. A power of two is a number that can be written as 2^n, such as 1, 2, 4, 8, and so on.
+---------------------+
| Table 0 |
+---------------------+
+---------------------+
| Control Word (64b) | <- Group 0
+---------------------+
| Key 0 | Value 0 |
...
| Key 7 | Value 7 |
+---------------------+
+---------------------+
| Control Word (64b) | <- Group 1
+---------------------+
| Key 0 | Value 0 |
...
| Key 7 | Value 7 |
+---------------------+
The map holds a pointer to the array of tables it manages and replaces that array with another when the number of tables changes.
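Building on the group sketch above, the top-level layout might be pictured like this (again, the names are illustrative, not the runtime's):

// Illustrative only: the map points at an array (directory) of tables,
// and each table owns a power-of-two number of groups.
type table[K comparable, V any] struct {
    groups []group[K, V] // length is always a power of two
}

type hmap[K comparable, V any] struct {
    directory []*table[K, V] // replaced wholesale when the table count changes
}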
When you look up, insert, or update a key, a hashing function is used to generate a 64-bit hash code. This code is then divided into two parts:
- The first 57 bits: referred to as h1.
- The last 7 bits: referred to as h2.
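Assuming h1 is the upper 57 bits and h2 the lower 7 bits of the 64-bit hash, the split can be sketched as:

// A minimal sketch of the hash split; the exact bit positions are an
// assumption based on the description above.
func splitHash(hash uint64) (h1 uint64, h2 uint8) {
    return hash >> 7, uint8(hash & 0x7f)
}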
The full hash code is used to determine which table stores a given key in a multi-table map. Within each table, the h1 portion of the code identifies the potential groups that may contain the key. Once a group is selected, the h2 portion is matched against the control word to enable fast key lookups within that group.
The hash code of a key is used to determine which table the key belongs to. This selection uses a number of leading bits of the hash code, depending on how many tables are available. For example, if there's only a single table, the hash code is not consulted and all keys go straight to that table. If there are two tables, the first bit of the code determines which table the key belongs to:
Hash      : 101010110110
First bit : 1
Table     : 1

Hash      : 001010110110
First bit : 0
Table     : 0
Now let's say there are 4 tables. In that case, two bits are used:
Hash         : 101010110110
First 2 bits : 10
Table        : 2

Hash         : 001010110110
First 2 bits : 00
Table        : 0

Hash         : 011010110110
First 2 bits : 01
Table        : 1
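A hedged sketch of this selection, assuming the map holds 1 << depth tables and the leading bits of the hash pick one of them (tableIndex and depth are illustrative names, not the runtime's API):

// Illustrative: pick a table from the leading bits of the hash.
func tableIndex(hash uint64, depth uint) uint64 {
    if depth == 0 {
        return 0 // single table: the hash isn't consulted
    }
    return hash >> (64 - depth) // the first `depth` bits of the hash
}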
Using the first 57 bits of the hash (h1), an internal function calculates the index of the group where a key should be stored. If that group is full (i.e., all 8 slots are occupied), quadratic probing is used to find the next group to check. Instead of simply checking the groups one after another (as in linear probing), quadratic probing uses a mathematical formula to determine the next group, with the step size increasing on each successive probe. This ensures that keys are distributed across as many groups as possible, reducing congestion and leading to faster lookups.
If linear probing were used instead, the runtime would just check the next group in line each time it encounters a full group. For example:
Try group 2 => (full).
Try group 3 => (also full).
Try group 4 => (available slot).
This could cause congestion in the groups at the beginning of the table, and lookups would become slower because the runtime would have to check more and more groups to find the desired key.
In contrast, with quadratic probing, the steps spread out across the table, reducing congestion and making lookups faster. Even when groups are full, the formula guarantees the next probe is further away in a way that distributes keys more evenly.
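One common way to implement such a probe sequence over a power-of-two number of groups is with triangular steps (1, 2, 3, ...), so the probe lands 1, 3, 6, 10, ... groups away from the starting point. The sketch below illustrates the idea; it is not the runtime's exact code.

// Illustrative quadratic probe sequence over mask+1 groups, where mask+1
// is a power of two. The step grows by one on every probe, and every
// group is still visited exactly once.
type probeSeq struct {
    mask   uint64 // number of groups in the table, minus one
    offset uint64 // current group index
    index  uint64 // how many probes have been made so far
}

func makeProbeSeq(h1, mask uint64) probeSeq {
    return probeSeq{mask: mask, offset: h1 & mask}
}

func (s probeSeq) next() probeSeq {
    s.index++
    s.offset = (s.offset + s.index) & s.mask
    return s
}

Starting from group h1 & mask, successive calls to next visit offset+1, offset+3, offset+6, and so on, wrapping around the table.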
As mentioned earlier, the control word is 64 bits in size and each byte represents one of the slots. The last 7 bits of the hash (h2) are used to fill the last 7 bits of the byte representing the slot. The one extra bit is used to flag the slot as empty, deleted, or occupied.
Empty   : 10000000
Deleted : 11111110
Occupied: 0******* (where * is a bit of h2)
When looking up a key in a group, the control word is scanned to find potential slots in which the control byte matches h2 of the key. Once a slot is located, the runtime compares the input key with the key occupying the slot. If there's a match, the value is returned. If not, the next possible slot is checked.
In this example, occupied slots where the h2 value matches 1011011 appear twice in the control word. This means there are two slots where the key could potentially be stored:
[01011011] 01011010 [01011011] 11111110 11111110 01010101 01110111 00101001
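One way to perform this scan efficiently is a word-at-a-time byte comparison against the control word. The sketch below assumes slot i's control byte occupies bits 8i through 8i+7 of the word; the bit trick can produce rare false positives, but the key comparison filters those out anyway.

import "math/bits"

// matchH2 returns a mask with the high bit set in every byte of ctrl that
// equals 0|h2, i.e. an occupied slot whose 7-bit hash matches. This is the
// classic SWAR "find a byte" trick and may flag a rare false positive next
// to a real match; the key comparison below catches those.
func matchH2(ctrl uint64, h2 uint8) uint64 {
    v := ctrl ^ (0x0101010101010101 * uint64(h2))
    return (v - 0x0101010101010101) &^ v & 0x8080808080808080
}

// lookupInGroup verifies every candidate slot flagged by matchH2 against
// the stored key. Names and layout are illustrative.
func lookupInGroup[K comparable](ctrl uint64, keys *[8]K, key K, h2 uint8) (int, bool) {
    for m := matchH2(ctrl, h2); m != 0; m &= m - 1 {
        i := bits.TrailingZeros64(m) / 8 // index of the slot whose byte matched
        if keys[i] == key {
            return i, true
        }
    }
    return -1, false
}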
When updating a key, the control word is first used to find if the key already exists in the group. If it doesn't exist in the group, or any other group in the probe sequence, the first available slot is used to insert the new key.
The process for looking up a key in a map follows these steps:
- Use the hash to determine the table.
- Create a probe sequence for potential groups that may house the key (using h1).
- Go over the groups one by one.
- Scan the group's control word to find possible slots that match h2.
- Check the key stored in each matching slot to verify it matches the input key.
While traversing the groups, the search will stop immediately if the runtime encounters a group with an empty slot. This optimization improves the lookup process by allowing the search to return early, rather than continuing to probe all the groups in the sequence unnecessarily.
For this to work effectively, it is important that a slot which previously held a deleted key is not marked as empty. Instead, it is marked as "deleted." This ensures that the search process can differentiate between truly empty slots and those that were once occupied but now deleted, allowing the lookup to behave correctly without mistakenly stopping prematurely.
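A simple sketch of that early-exit check (the real code uses bit tricks similar to matchH2 above rather than a loop):

// ctrlEmpty mirrors the "empty" control byte shown earlier.
const ctrlEmpty uint8 = 0b1000_0000

// groupHasEmpty reports whether any slot in the group is empty, which lets
// a lookup stop probing: the key would have been stored here if it existed.
func groupHasEmpty(ctrl uint64) bool {
    for i := 0; i < 8; i++ {
        if uint8(ctrl>>(8*i)) == ctrlEmpty {
            return true
        }
    }
    return false
}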
The optimizations in this process stem from the following factors:
- The runtime only needs to scan a single table in a large map.
- It only checks a subset of groups within that table.
- It uses the control word first to verify the key's existence, rather than iterating through each key-value pair individually.
- It returns early if an empty slot is found.
When inserting a value into a map (m[k] = v), the runtime first attempts to find a slot that is occupied by the same key, using the same probing sequence as in key lookups.
If the runtime encounters a group that contains an empty slot during the search, it concludes that the key does not exist in the map. The empty slot is then used to store the new key-value pair.
If no empty slots are found, the runtime proceeds to look for the first slot that is marked as "deleted." This slot is available for reuse, and the new key-value pair is inserted into this deleted slot, ensuring that the table's capacity is efficiently used.
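That slot-selection policy can be sketched as follows, assuming the probe walk recorded the position of the first empty slot and the first deleted slot it came across (-1 meaning none); the bookkeeping is illustrative, not the runtime's:

// slotForInsert picks where the new key/value pair goes once the probe
// walk has proven the key is absent.
func slotForInsert(firstEmpty, firstDeleted int) int {
    if firstEmpty >= 0 {
        // An empty slot proves the key doesn't exist; store the pair there.
        return firstEmpty
    }
    // No empty slot along the probe sequence: reuse the first tombstone.
    return firstDeleted
}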
When deleting a key from a map, the runtime searches for the key using the same probing sequence as in key lookups. For each group, the runtime checks for the key using the control word and then inspects the actual key stored in each matching slot.
If the key is found, the runtime checks whether there are any empty slots in the group. If there are, the slot is marked as empty. Otherwise, it's marked as deleted to avoid interrupting lookups prematurely, as explained in a previous section.
If the key isn't found in a group that has empty slots, the search stops, as this means the key doesn't exist in the map.
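That marking decision can be sketched like this, reusing the ctrlEmpty constant from the earlier sketch and assuming the caller already knows whether the group still has empty slots:

// ctrlDeleted mirrors the "deleted" control byte shown earlier.
const ctrlDeleted uint8 = 0b1111_1110

// markOnDelete returns the control byte to write for a slot whose key was
// just removed.
func markOnDelete(groupStillHasEmpty bool) uint8 {
    if groupStillHasEmpty {
        return ctrlEmpty // probes already stop at this group, so empty is safe
    }
    return ctrlDeleted // tombstone: probe sequences keep walking past it
}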
When you make a small map (size <= 8), the runtime creates a map with no tables and a single group. Inside that single group, all slots are either empty or occupied (no slots are marked as "deleted"). This is because a small map doesn't use the probing sequence for lookups; it simply iterates over the 8 slots of the single group until it finds a match.
As the map grows in size, or if you make a large map up front, the tables concept is introduced to speed up lookups by distributing the keys across multiple groups.
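A rough sketch of the small-map lookup described above, assuming the single group stores its keys and values in parallel arrays next to its control word (all names are illustrative):

// smallLookup walks the single group's slots directly: no table selection
// and no probe sequence, just a scan of up to 8 slots.
func smallLookup[K comparable, V any](ctrl uint64, keys *[8]K, values *[8]V, key K) (V, bool) {
    for i := 0; i < 8; i++ {
        occupied := uint8(ctrl>>(8*i))&0x80 == 0 // occupied bytes have the top bit clear
        if occupied && keys[i] == key {
            return values[i], true
        }
    }
    var zero V
    return zero, false
}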
When a table is first created, it is allocated a capacity that allows it to hold between 16 and 1024 slots (equivalent to 2 to 128 groups). As the number of occupied slots approaches ~87% of the table capacity, the table undergoes a growth operation in which the runtime multiplies the current capacity by 2. If the new value is 1024 slots or less, the table is simply replaced with a new one that has double the old capacity. However, if expanding the table would push its capacity beyond 1024 slots, the table is instead split into two separate tables to manage the growing data efficiently.
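The growth rule can be sketched as below, assuming a maximum table capacity of 1024 slots and a load-factor threshold of roughly 7/8 (~87%), as described above:

const maxTableSlots = 1024

// needsGrow reports whether a table is close enough to full to grow.
func needsGrow(used, capacity int) bool {
    return used >= capacity*7/8
}

// grow doubles the table's capacity, or signals that the table must be
// split instead because doubling would exceed the maximum size.
func grow(capacity int) (newCapacity int, split bool) {
    if capacity*2 <= maxTableSlots {
        return capacity * 2, false // replace the table with a bigger one
    }
    return capacity, true // split the table into two
}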
When a table is split into two, the tables array in the map data structure is replaced with a larger array, whose size is always a power of two (1, 2, 4, 8, …). For example, imagine a scenario where a map initially has two tables (Table 0 and Table 1). If Table 0 needs to be split, the total number of tables increases to three. However, to maintain a power-of-two-sized array, the tables array grows to size 4 and multiple entries in the new array can point to the same table:
0 => Table 0a (split version of Table 0)
1 => Table 0b (split version of Table 0)
2 => Table 1
3 => Table 1
Using the key hash, what used to map to Table 0 before the split may now map to Table 0a or Table 0b, and what used to map to Table 1 before the growth will keep pointing to Table 1 (represented by indexes 2 and 3 in the array):
Before Growth:
Selection bit = 0 --> [index 0] --> Table 0
Selection bit = 1 --> [index 1] --> Table 1
After Growth:
Selection bits = 00 --> [index 0] --> Table 0a
Selection bits = 01 --> [index 1] --> Table 0b
Selection bits = 10 --> [index 2] --> Table 1
Selection bits = 11 --> [index 3] --> Table 1
This technique avoids rehashing the entire map and allows individual tables to grow independently.
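The directory rebuild can be sketched with a small generic helper: old entry i becomes new entries 2i and 2i+1 (one extra leading bit of the hash now distinguishes them), and only the split table gets new pointers. The helper is illustrative, not the runtime's code.

// splitDirectory doubles a directory of tables when the table at idx is
// split into a and b; every other table is simply referenced twice.
func splitDirectory[T any](dir []*T, idx int, a, b *T) []*T {
    newDir := make([]*T, 2*len(dir))
    for i, t := range dir {
        newDir[2*i] = t
        newDir[2*i+1] = t
    }
    newDir[2*idx] = a
    newDir[2*idx+1] = b
    return newDir
}

With the example above, splitting Table 0 turns [Table 0, Table 1] into [Table 0a, Table 0b, Table 1, Table 1].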
Reading the source code of Go 1.24's map implementation was a fascinating experience. One of the things I truly admire about Go is that its core functionality is implemented in the language itself. This makes it possible to dive deep into the internals, understand exactly how things work behind the scenes, and learn from the incredibly talented contributors who have shaped the language.
If you're curious and want to explore the source code yourself, you can start here.