Handmade Rust Part 3: Containers

The most commonly used kinds of containers are arrays and maps. Pretty much any other container type can be built using those two, so that’s what we’ll build today!

The Rust standard library has Vec, which is a dynamic array, and two kinds of maps: HashMap, which relies on the key being hashable, and BTreeMap, which relies on the key being sortable.

The hash map pays the cost of hashing the key, but lookup is then constant time (in theory at least), whereas the ordered map only compares keys, but lookup is logarithmic time. I chose to implement a hash map because it’s the one I find myself using most of the time, but ordered maps are interesting as well.

Of course, just like for Unq, we won’t be making full replacements; instead we’ll build the most minimal containers necessary for now and add features later as needed. We will, however, make them allocator-aware.

As always, full source code is available at github.com/stevenlr/HandmadeRust.

RawArray: the dumbest container

All containers will somehow need to allocate aggregates of data. We’ll be making the RawArray type for allocating contiguous regions of memory, resizing them, and then deallocating the memory. This container will not be taking care of initializing memory, calling destructors, or anything like that.

pub struct RawArray<T, A: Allocator>
{
    pub ptr: *mut T,
    pub capacity: usize,
    alloc: A,
    _phantom: PhantomData<T>,
}

Let’s analyze this. First we have two type parameters. T is the type contained in the array. A is an allocator, as defined in the first part of this series. ptr is the pointer to the memory region containing our data. capacity is the size of that region. Finally, _phantom is used to tell the compiler that this struct acts like it contains some data of type T.

When creating a RawArray, we simply set the pointer to null and the capacity to 0, unless T is zero-sized, in which case we set the capacity to the maximum value of a usize. That way, the array will never grow, because it never needs to: zero-sized values take up no storage.
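
As a sketch, that constructor looks something like this (the allocator field is omitted here for brevity; the real type stores one, as shown above):

```rust
use core::marker::PhantomData;
use core::mem::size_of;
use core::ptr;

// Simplified RawArray without the allocator field, to illustrate
// the zero-sized-type special case in the constructor.
pub struct RawArray<T> {
    pub ptr: *mut T,
    pub capacity: usize,
    _phantom: PhantomData<T>,
}

impl<T> RawArray<T> {
    pub fn new() -> Self {
        // Zero-sized types need no storage, so we pretend the capacity
        // is already infinite: the array will never try to grow.
        let capacity = if size_of::<T>() == 0 { usize::MAX } else { 0 };

        Self {
            ptr: ptr::null_mut(),
            capacity,
            _phantom: PhantomData,
        }
    }
}
```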

We also define a reserve method that takes in a new capacity. If the current capacity is larger than the new capacity, we don’t do anything. Otherwise we allocate a new memory region of the desired size, copy the old data using ptr::copy_from, and deallocate the old region. Simple! Note that I don’t know how this container will behave with the Pin type, and actually I don’t care for now.
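
Roughly, and substituting std::alloc for the series’ own Allocator trait (which isn’t reproduced here), reserve does something like the following sketch, written as a free function over a raw (ptr, capacity) pair:

```rust
use std::alloc::{alloc, dealloc, Layout};

// Sketch of reserve; the real method lives on RawArray and goes
// through the stored allocator instead of std::alloc. Allocation
// failure handling is omitted.
unsafe fn reserve<T>(ptr: &mut *mut T, capacity: &mut usize, new_capacity: usize) {
    // Nothing to do if we're already big enough.
    if new_capacity <= *capacity {
        return;
    }

    // Layout::array computes size and alignment for us, so the new
    // region is properly aligned for T.
    let new_layout = Layout::array::<T>(new_capacity).unwrap();
    let new_ptr = alloc(new_layout) as *mut T;

    if !(*ptr).is_null() {
        // Copy the old elements over, then free the old region.
        new_ptr.copy_from(*ptr, *capacity);
        dealloc(*ptr as *mut u8, Layout::array::<T>(*capacity).unwrap());
    }

    *ptr = new_ptr;
    *capacity = new_capacity;
}
```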

Finally, when dropped, the raw array simply deallocates its memory if its pointer is not null.

Array: the less dumb container

Now that we can allocate raw arrays, we can define the proper Array type that can be pushed to and popped from.

pub struct Array<T, A: Allocator>
{
    size: usize,
    buffer: RawArray<T, A>,
}

As you can see, the Array simply wraps a raw array and keeps track of the number of elements actually contained. Unlike RawArray, Array will guarantee that memory is always properly initialized for the first size elements of the memory region, and it will drop those elements when needed.

First let’s define the push operation. It needs to grow the container when its max capacity is reached, and then write the new value to the end of the used region.

pub fn push(&mut self, value: T)
{
    if self.size == self.buffer.capacity
    {
        self.grow_auto();
    }

    unsafe
    {
        self.buffer.ptr.offset(self.size as isize).write(value);
    }

    self.size += 1;
}

grow_auto is a simple method that calls reserve on the raw array and doubles the current capacity (or sets it to 1 if the array was empty). Note that when writing the value, we don’t use write_unaligned because we made sure that RawArray properly aligns its memory region.
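
The growth policy itself, as a standalone sketch (overflow handling omitted):

```rust
// grow_auto picks the next capacity: double the current one,
// or start at 1 for an empty array.
fn grown_capacity(capacity: usize) -> usize {
    if capacity == 0 { 1 } else { capacity * 2 }
}
```

grow_auto then just passes this value to reserve on the raw array.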

pub fn pop(&mut self) -> Option<T>
{
    if self.size == 0
    {
        None
    }
    else
    {
        let value = unsafe
        {
            self.buffer.ptr.offset((self.size - 1) as isize).read()
        };

        self.size -= 1;
        Some(value)
    }
}

Nothing very exciting here. Compared to C++, pop not only shrinks the array but also returns the removed element. In C++ you need to access the last element and then pop to get the same effect, and you also need to check whether the array is empty first. We can avoid all of this simply because we have the Option type!
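
std’s Vec has the same Option-returning pop, so the pattern looks like this in practice:

```rust
// Draining a vector with `while let`: pop returns None when the
// vector is empty, so no separate emptiness check is needed.
fn drain_reversed(mut v: Vec<i32>) -> Vec<i32> {
    let mut out = Vec::new();
    while let Some(x) = v.pop() {
        out.push(x);
    }
    out
}
```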

pub fn clear(&mut self)
{
    if needs_drop::<T>()
    {
        unsafe
        {
            for i in 0..self.size
            {
                drop_in_place(self.buffer.ptr.offset(i as isize));
            }
        }
    }

    self.size = 0;
}

When clearing, we need to drop the elements contained in the array. We do this by calling drop_in_place which calls drop without needing any copy. This is of course unsafe. Then, drop for the array can be implemented by simply calling clear.
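
The behavior can be checked on a self-contained stand-in (the types here are illustrative, not the article’s; std’s Vec plays the role of the backing storage):

```rust
use std::cell::Cell;

// A container whose Drop just delegates to clear(), mirroring how
// Array's Drop is implemented.
struct Container<'a> {
    items: Vec<Droppable<'a>>,
}

// An element type that counts how many times its destructor ran.
struct Droppable<'a>(&'a Cell<u32>);

impl Drop for Droppable<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

impl Container<'_> {
    fn clear(&mut self) {
        self.items.clear(); // drops every element in place
    }
}

impl Drop for Container<'_> {
    fn drop(&mut self) {
        self.clear();
    }
}
```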

Finally, let’s talk about resize. There are two cases here:

  • When resizing to a smaller size, we need to drop all elements above the new size.
  • When resizing to a bigger size, we may need to reserve to the new size, and then write a default value to each new slot. For this, T would need to implement Clone. However, we don’t really want our array to only be usable with cloneable types. Instead we’ll provide a resize_with method that takes in a closure that generates new Ts; the plain resize method can then be written on top of it, with the closure simply being || value.clone(). We could also provide a resize_default method that only takes in the new size, for when T implements Default.

pub fn resize_with<F>(&mut self, new_size: usize, f: F)
    where F: Fn() -> T
{
    if new_size < self.size && needs_drop::<T>()
    {
        for i in new_size..self.size
        {
            unsafe
            {
                drop_in_place(self.buffer.ptr.offset(i as isize));
            }
        }
    }
    else if new_size > self.size
    {
        if new_size > self.buffer.capacity
        {
            self.reserve(new_size);
        }

        for i in self.size..new_size
        {
            unsafe
            {
                self.buffer.ptr.offset(i as isize).write(f());
            }
        }
    }

    self.size = new_size;
}

pub fn resize(&mut self, new_size: usize, value: T)
    where T: Clone
{
    self.resize_with(new_size, || value.clone());
}
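
The resize_default variant mentioned above is one line on top of resize_with; demonstrated here on std’s Vec, which exposes the same resize_with signature:

```rust
// resize_default: like resize, but fills new slots with T::default()
// instead of clones of a given value.
fn resize_default<T: Default>(v: &mut Vec<T>, new_size: usize) {
    v.resize_with(new_size, T::default);
}
```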

Now, that’s all fine, but we’d like our array to behave like any other contiguous region of memory: we want to index in it (possibly even index ranges), iterate over it, and basically be able to do anything a slice can. There is a very simple way of doing this.

impl<T, A: Allocator> Deref for Array<T, A>
{
    type Target = [T];

    #[inline]
    fn deref(&self) -> &Self::Target
    {
        unsafe
        {
            slice::from_raw_parts(self.buffer.ptr, self.size)
        }
    }
}

Also implement DerefMut and voilà! Our array can behave just like a slice, and that was free!

Actually, we may want to implement Index directly so we don’t have to pay for creating a slice every time we index. Careful examination of the generated assembly will tell whether that’s needed.

HashMaps need to Hash

Before implementing the hash map, we need to somehow be able to hash some data. The Rust core library defines the Hash trait. This trait indicates that a type can be hashed using a Hasher. No hasher is provided by default in the core library so let’s make one! I chose to implement SipHash 2-4 because that’s what the standard library uses, and it looked simple enough.

I won’t bother you with the implementation, just know that the Hasher trait requires two methods:

  • write to run a few rounds of hashing on a slice of bytes
  • finish to actually finish hashing and return the hash.

It is also a good idea for the hasher to implement Default so we can use BuildHasherDefault which implements BuildHasher, basically a hasher factory.
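
For illustration, here is that wiring with a toy FNV-1a hasher standing in for SipHash (the constants are the standard FNV-1a offset basis and prime; what matters here is the Hasher + Default combination, not the hash function itself):

```rust
use std::hash::{BuildHasher, BuildHasherDefault, Hash, Hasher};

struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // FNV-1a offset basis
    }
}

impl Hasher for Fnv1a {
    // Run the hash rounds over a slice of bytes.
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= b as u64;
            self.0 = self.0.wrapping_mul(0x100_0000_01b3); // FNV prime
        }
    }

    // Return the final hash value.
    fn finish(&self) -> u64 {
        self.0
    }
}

// BuildHasherDefault turns any Default + Hasher type into a factory.
type Fnv1aBuilder = BuildHasherDefault<Fnv1a>;

fn hash_value<T: Hash>(builder: &Fnv1aBuilder, value: &T) -> u64 {
    let mut h = builder.build_hasher();
    value.hash(&mut h);
    h.finish()
}
```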

HashMap, finally

Now we have everything we need to build our hash map. I won’t go into details of the implementation, but just go through some interesting implementation points.

I chose to implement something that resembles Google’s Swiss table. The basic idea is that you have a separate array that stores 7 bits of the hash of each element; the 8th bit is used to mark whether a slot is occupied. When looking up a key, this array is used to quickly find a suitable slot without having to iterate through the main array that contains the keys and values. Yay for cache locality. Note that with some SIMD magic, this can be made even faster. In my implementation, this 8-bit hint is called a ShortBucket.

    1xxx xxxx = occupied, xxx xxxx = first 7 bits of the key hash.
    0000 0001 = free
    0000 0010 = deleted, required for linear probing
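
That encoding, sketched as code (names are illustrative; the article doesn’t say which 7 bits of the hash are kept, so the low 7 are used here):

```rust
const FREE: u8 = 0b0000_0001;
const DELETED: u8 = 0b0000_0010;

// Control byte for an occupied slot: high bit set, plus 7 bits of
// the key's hash.
fn occupied(hash: u64) -> u8 {
    0x80 | (hash & 0x7f) as u8
}

fn is_occupied(b: u8) -> bool {
    b & 0x80 != 0
}
```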

In my implementation, the main array is defined as containing Option<Bucket>. Bucket itself contains the hash of the key (so we don’t need to re-hash it), the key, and the value. This is not really efficient, but I didn’t want to have to deal with uninitialized values, and Option takes care of that. Note that I could have used RawArray instead, but I’d have had more boilerplate code to write. I’ll probably revise all of this one day, but for now it works well enough.

struct Bucket<K, V>
where
    K: Sized + Eq + Hash,
    V: Sized,
{
    hash: u64,
    key: K,
    value: V,
}

struct HashMapInner<K, V, A>
where
    K: Sized + Eq + Hash,
    V: Sized,
    A: Allocator + Clone,
{
    shortb: Array<ShortBucket, A>,
    buckets: Array<Option<Bucket<K, V>>, A>,
}

pub struct HashMap<K, V, A>
where
    K: Sized + Eq + Hash,
    V: Sized,
    A: Allocator + Clone,
{
    alloc: A,
    hash: BuildHasherDefault<SipHash>,
    inner: HashMapInner<K, V, A>,
}

The reason for having HashMapInner is for resizing: the simplest way of doing that is to create a new HashMapInner with the new capacity, then iterate over all elements and swap them into the new container, and finally replace the old HashMapInner with the new one.
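
The rehashing loop can be sketched on a deliberately simplified inner table (a Vec of optional (hash, key, value) entries here; the names and layout are illustrative, not the article’s actual code):

```rust
// Move every entry from the old table into a bigger one, re-deriving
// each slot from the stored hash: no re-hashing of keys is needed.
fn grow<K, V>(inner: &mut Vec<Option<(u64, K, V)>>, new_capacity: usize) {
    let mut new_inner: Vec<Option<(u64, K, V)>> =
        (0..new_capacity).map(|_| None).collect();

    for slot in inner.iter_mut() {
        if let Some((hash, k, v)) = slot.take() {
            let mut idx = (hash as usize) % new_capacity;
            while new_inner[idx].is_some() {
                idx = (idx + 1) % new_capacity; // linear probing
            }
            new_inner[idx] = Some((hash, k, v));
        }
    }

    // Replace the old table with the new one.
    *inner = new_inner;
}
```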

For finding a suitable slot, we define the find_bucket method. For a given key, it returns either the matching slot, or the first one that the key can be inserted into.

  • It first hashes the key to find the start position for linear probing, then iterates over the short buckets array.
    • If the first 7 bits match the short bucket, we check the full hash in the buckets array. If that matches too we check the keys for equality. If that matches too, we found the correct bucket, we’re done.
    • If we find a deleted slot, we mark it as acceptable for insertion if none was marked as such before and then continue.
    • If we find a free slot, linear probing is done. If we had an acceptable bucket for insertion, we return it, otherwise we return the current one.
  • If we wrap back around to the starting index, we’re done and we return the acceptable slot for insertion (which may be none).

This method alone is enough for lookup, insertion, deletion, etc.
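
A self-contained sketch of that probing loop, over simplified parallel arrays (bucket values are omitted, and FindResult and the helper names are mine, not the article’s):

```rust
const FREE: u8 = 0b0000_0001;
const DELETED: u8 = 0b0000_0010;

#[derive(Debug, PartialEq)]
enum FindResult {
    Present(usize), // the key is already in this bucket
    Vacant(usize),  // the key is absent; insert here
    Full,           // no match and nowhere to insert
}

// Linear probing over the short bucket array, checking the full
// hash and the key only when the 7-bit hint matches.
fn find_bucket<K: Eq>(
    shortb: &[u8],
    buckets: &[Option<(u64, K)>],
    hash: u64,
    key: &K,
) -> FindResult {
    let len = shortb.len();
    let short = 0x80 | (hash & 0x7f) as u8;
    let start = (hash as usize) % len;
    let mut vacant = None;

    for i in 0..len {
        let idx = (start + i) % len;
        let b = shortb[idx];

        if b == short {
            // Hint matches: confirm with the full hash, then the key.
            if let Some((full, k)) = &buckets[idx] {
                if *full == hash && k == key {
                    return FindResult::Present(idx);
                }
            }
        } else if b == DELETED {
            // Remember the first reusable slot, but keep probing.
            if vacant.is_none() {
                vacant = Some(idx);
            }
        } else if b == FREE {
            // A free slot ends the probe sequence.
            return FindResult::Vacant(vacant.unwrap_or(idx));
        }
    }

    match vacant {
        Some(idx) => FindResult::Vacant(idx),
        None => FindResult::Full,
    }
}
```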

When we want to count the elements in the map, we just need to iterate over the short buckets and count the ones that are occupied.

Currently, resizing only happens when we run out of space. Ideally it should be based on load factor.

There are many things that this hash map doesn’t do such as iterating over elements, but for now that’s enough…

HashSet is free

Since Array supports zero-sized types, …

pub struct HashSet<T, A>
where
    T: Sized + Eq + Hash,
    A: Allocator + Clone,
{
    map: HashMap<T, (), A>,
}

… done! (Well you still need to implement a few methods but you know…)

Dynamic Strings

Dynamic strings just wrap an Array<u8, A>. We implement a method for creating them from a str.

pub fn from_str(s: &str, alloc: A) -> Self
{
    let slice = s.as_bytes();
    let mut buf = Array::new(alloc);
    buf.resize(slice.len(), 0);

    unsafe
    {
        copy_nonoverlapping(s.as_ptr(), buf.as_mut_ptr(), slice.len());
    }

    Self { buf }
}

Just like Array can be coerced to a slice, we would like this string to coerce to a str, so that our String can do anything str can. We implement Deref and DerefMut, and an as_str method.

impl<A: Allocator> String<A>
{
    #[inline]
    pub fn as_str(&self) -> &str
    {
        self
    }
}

impl<A: Allocator> Deref for String<A>
{
    type Target = str;

    #[inline]
    fn deref(&self) -> &str
    {
        unsafe
        {
            str::from_utf8_unchecked(&self.buf)
        }
    }
}

Now we can very easily implement useful traits such as Display, Hash or PartialEq:

impl<A: Allocator> hash::Hash for String<A>
{
    fn hash<H: hash::Hasher>(&self, h: &mut H)
    {
        hash::Hash::hash(self.as_str(), h);
    }
}

And that’s it for this container, for now. We’ll probably want to be able to concatenate and do other mutating operations on it eventually, but the basic structure is here.

Wrap up

The containers we have implemented so far should be enough for most things we need. Of course they lack some very useful features at the moment, but we’ll get back to them as needed. I don’t suggest anybody use these in production because, not being an expert at Rust, I’ve probably left many bugs in. I guess time will tell. Next time we’ll take a look at initializing Vulkan. Until then, you can contact me on Twitter.
