num: Lessons in Rust optimisation
- Published: March 15th 2025
Recently, this one embarked on an experiment (ridiculously late at night, it should add, as per usual with its neurodivergent little brain) to learn how to format integers as ASCII strings. You may ask why, but that isn't the point of this blog post. The point is to condense the lessons this unit learned about optimisation, which allowed it to go from printing 100,000,000 in 16.74 seconds to 0.001 seconds.
This blog post will go over the generic optimisations which can be applied to a vast range of Rust programs. If you want to see each optimisation done at a code level, including optimisations specific to this type of program, see the commit history of num.
First lesson: conversions tank performance
At first in num, this one was using a Vec<char> so that it did not have to concatenate a string at the end of the function. It built each character by converting the digit to a u8, adding it to a constant u8 value, and then converting the result back to a char.
While a lot of other programming languages such as C often don't make a distinction between their equivalents of char and u8 (sometimes a string is just an array of bytes), Rust does: its strings are UTF-8 and a char is a four-byte Unicode scalar value rather than a single byte. Conversions between char and u8 therefore cost time, and in this case a lot of time at that.
After switching from Vec<char> to Vec<u8> and doing the arithmetic directly on the u8 values without conversions, this one managed to cut the time to print 100,000,000 from 14.04 seconds to 8.33 seconds.
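As a minimal sketch of the byte-only approach (the function names here are hypothetical and not taken from num), a digit can be turned into its ASCII byte by adding it to the byte for the character 0, without ever going through a char:

```rust
/// Converts a single decimal digit (0 to 9) into its ASCII byte by
/// adding it to the byte for '0'. No `char` conversion is involved.
fn digit_to_ascii(digit: u8) -> u8 {
    debug_assert!(digit < 10);
    b'0' + digit
}

/// Hypothetical helper: formats `n` as ASCII digits in a Vec<u8>.
/// This is an illustration, not the actual code from num.
fn format_u64_vec(mut n: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        // Take the lowest digit, store it as an ASCII byte, then drop it.
        out.push(digit_to_ascii((n % 10) as u8));
        n /= 10;
        if n == 0 {
            break;
        }
    }
    // The digits were pushed lowest first, so flip them into reading order.
    out.reverse();
    out
}
```

The resulting bytes are plain ASCII, so they can be written straight to standard output with std::io::Write::write_all.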
Second lesson: cloning and re-allocating data on the heap is convenient but costly
There are a few ways of modifying a Vec<T> parameter in a function so that the change is visible outside the scope of the function: one way is to clone the Vec<T> and return it from the function, another is to pass the Vec<T> as a mutable reference.
If you find yourself doing the first often, stop and do the second. For example, switching from the first method to the second would result in fn foo(bar: Vec<T>) -> Vec<T> becoming fn foo(bar: &mut Vec<T>) and the call going from bar = foo(bar); to just foo(&mut bar);.
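The two methods can be sketched side by side. The function names are hypothetical, and where exactly the clone happened in num is an assumption; here the first version copies its input before modifying it:

```rust
/// First method: copies the input and returns the modified copy.
/// `to_vec` makes a fresh heap allocation and copies every element.
fn push_digit_cloned(buf: &[u8], digit: u8) -> Vec<u8> {
    let mut copy = buf.to_vec();
    copy.push(b'0' + digit);
    copy
}

/// Second method: modifies the caller's vector in place through a
/// mutable reference, so no copy of the existing data is made.
fn push_digit_in_place(buf: &mut Vec<u8>, digit: u8) {
    buf.push(b'0' + digit);
}
```

With the second version the caller keeps one vector for the whole run and simply passes &mut to it on every call.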
The first method is inefficient because it wastes time re-allocating the data on the heap. A Vec<T> is dynamically sized and can hold anywhere from 0 elements up to, in theory, usize::MAX elements. As a result, a Vec<T> often reserves more memory than the data it currently stores, to help manage things such as resizing.
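The gap between what a vector stores and what it has reserved can be observed through len and capacity. A small sketch (the helper name is hypothetical, and the exact capacity is left unspecified because the growth strategy is an implementation detail):

```rust
/// Pushes `n` bytes into a fresh vector and reports how many elements
/// it holds (`len`) and how many it has room for (`capacity`).
fn len_and_capacity(n: usize) -> (usize, usize) {
    let mut v: Vec<u8> = Vec::new();
    for i in 0..n {
        // Each push may trigger a reallocation if the capacity is exhausted.
        v.push(i as u8);
    }
    (v.len(), v.capacity())
}
```

The capacity is always at least the length, and is frequently larger after a run of pushes.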
Modifying the Vec<T> directly through a mutable reference avoids cloning the memory and then effectively deleting it at the end of the function's scope, removing wasteful allocations. Doing this optimisation on num allowed this one to go from 8.33 seconds to 7.55 seconds to print 100,000,000.
Third lesson: Use arrays instead of vectors for data with a known maximum or fixed length
As established in the previous section, the heap can be inefficient. What's more efficient than the heap then? The stack! This one isn't going to get into the logistics and the theory of the stack but, in short, arrays are like vectors but more efficient. The cost of arrays is their fixed length: the length has to be defined at compile time and cannot be changed. They do not shrink and they do not grow.
The upshot of this is that an array's memory is much faster to use than a vector's, because the compiler knows exactly how long it will be and where it will be. That means no heap allocation and fewer runtime checks.
For example, in num, it replaced the Vec<u8> it was using with a [u8; 20], since the largest unsigned 64-bit number has 20 digits, so the formatted string can be 20 characters long at most. This greatly improved speed, from 5.76 seconds to 0.34 seconds to print 100,000,000. It shows the speed-ups that an array can bring over a vector when the data has a maximum or fixed length, especially in combination with mutable references.
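A minimal sketch of the array-based approach, assuming a hypothetical function name and a buffer that is filled from the back (the real code in num may differ):

```rust
/// Writes the decimal digits of `n` into the end of `buf` and returns
/// the filled part as a slice. 20 bytes is enough for any u64, since
/// u64::MAX (18446744073709551615) has 20 digits.
fn format_u64_array(mut n: u64, buf: &mut [u8; 20]) -> &[u8] {
    let mut pos = buf.len();
    loop {
        // Fill from the back so the digits end up in reading order.
        pos -= 1;
        buf[pos] = b'0' + (n % 10) as u8;
        n /= 10;
        if n == 0 {
            break;
        }
    }
    &buf[pos..]
}
```

The caller declares let mut buf = [0u8; 20]; once and reuses it for every number, so the formatting itself never allocates on the heap.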
Overall lesson
The first version of a program you write will most likely not be the most efficient it can be, and that's okay. Sometimes efficiency is less important, and a program that works is what matters most, even if it is a bit slow. Remember to get a solid foundation first before optimising, and remember to enjoy your programming! :)