This is an old post that has been migrated over one or more times. It may contain issues with certain images and formatting.
I am by no means proficient in Rust, as you will very well see, so these results should be taken with a pinch of salt. Any improvements are most welcome! There are some great discussions over at the /r/rust subreddit about code optimisation that I highly recommend reading.
I recently had a task to implement a very simple Kohonen-Grossberg Neural Network, which was particularly fun as it is relatively simple to implement.
My initial implementation was in Python at less than 60 lines of code. I wrapped a CLI around it, which brought it to around 90 lines of code.
After some thought, I figured that this would be a great learning experience for Rust (and it was) and would give me the opportunity to compare the two languages from multiple perspectives.
The following are the two implementations of KNN in Python and Rust. Keep in mind that this was originally written in Python and then ported to Rust.
The following is the Python implementation which you can find here.
# NOTE: the original listing was lost during migration. The code below is a best-effort
# reconstruction built around the function names referenced later in this post
# (generate_random_units, calculate_nets, update_units, main).
import argparse, csv, math, random

def normalise(vector):
    # Scale the vector to unit length so the dot product acts as a similarity measure
    magnitude = math.sqrt(sum(x * x for x in vector))
    return [round(x / magnitude, 8) for x in vector]

def generate_random_units(neurons, columns):
    # One random weight vector (unit) per neuron, drawn from a uniform distribution
    return [[random.uniform(0.0, 1.0) for _ in range(columns)] for _ in range(neurons)]

def calculate_nets(entry, units):
    # Net (dot product) between the input entry and every unit
    nets = []
    for unit in units:
        net = 0.0
        for x, w in zip(entry, unit):
            net += round(x * w, 8)
        nets.append(round(net, 8))
    return nets

def update_units(entry, units, winner, learning_rate):
    # Nudge the winning unit's weights towards the input entry
    units[winner] = [round(w + learning_rate * (x - w), 8) for x, w in zip(entry, units[winner])]
    return units

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('-f', '--csv-file', required=True)
    parser.add_argument('-e', '--epoch', type=int, default=100)
    parser.add_argument('-l', '--learning-rate', type=float, default=0.1)
    parser.add_argument('-n', '--neurons', type=int, default=3)
    args = parser.parse_args()
    print('[+] Normalising dataset')
    with open(args.csv_file) as f:
        dataset = [normalise([float(x) for x in row]) for row in csv.reader(f)]
    # Used to determine the number of columns in the generate_random_units call,
    # assuming that the dataset is consistent in width
    columns = len(dataset[0])
    units = generate_random_units(args.neurons, columns)
    for _ in range(args.epoch):
        for entry in dataset:
            nets = calculate_nets(entry, units)
            winner = nets.index(max(nets))
            units = update_units(entry, units, winner, args.learning_rate)
    print(units)

if __name__ == '__main__':
    main()
You can find the Rust implementation here.
extern crate clap;
extern crate rand;

use std::fs::File;
use std::iter::repeat_with;
use clap::{App, Arg};
use rand::{thread_rng, Rng};
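The rest of the Rust listing has not survived the migration either, so the following is only a rough sketch of the core training loop, mirroring the Python version above (identifiers are illustrative and the Clap/CSV plumbing is omitted).
// Sketch only: a winner-take-all training loop mirroring the Python version.
// The function and variable names here are illustrative, not the original code.
fn calculate_nets(entry: &[f32], units: &[Vec<f32>]) -> Vec<f32> {
    // Net (dot product) between the input entry and every unit
    units
        .iter()
        .map(|unit| entry.iter().zip(unit).map(|(x, w)| x * w).sum())
        .collect()
}

fn update_units(entry: &[f32], units: &mut [Vec<f32>], winner: usize, learning_rate: f32) {
    // Nudge the winning unit's weights towards the input entry
    for (w, x) in units[winner].iter_mut().zip(entry) {
        *w += learning_rate * (*x - *w);
    }
}

fn train(dataset: &[Vec<f32>], units: &mut [Vec<f32>], epochs: u32, learning_rate: f32) {
    for _ in 0..epochs {
        for entry in dataset {
            let nets = calculate_nets(entry, units);
            // Winner-take-all: the unit with the largest net gets updated.
            // (Picking the index of the largest f32 is discussed further down.)
            let mut winner = 0;
            for (i, net) in nets.iter().enumerate() {
                if *net > nets[winner] {
                    winner = i;
                }
            }
            update_units(entry, units, winner, learning_rate);
        }
    }
}

fn main() {
    // Toy demo with three 2-D units and a tiny hand-made dataset
    let dataset = vec![vec![0.6, 0.8], vec![-0.8, 0.6], vec![0.0, 1.0]];
    let mut units = vec![vec![0.9, 0.9], vec![0.1, 0.8], vec![0.3, 0.7]];
    train(&dataset, &mut units, 100, 0.1);
    println!("{:?}", units);
}
The per-entry work is the same as in Python: compute the nets, pick the winning unit, and nudge its weights towards the input.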
If you want to test this with the sample dataset I used, you can find it here. Of course, feel free to test this with larger datasets, both in dimensions and count.
The CLI for both implementations is essentially the same (minus defaults not being handled in Rust via Clap).
KohonenGrossberg-NN.exe --help
KohonenGrossberg-NN 1.0
Juxhin D. Brigjaj <juxhinbox at gmail.com>
A naive Kohonen-Grossberg Counterpropogation Network in Rust
USAGE:
KohonenGrossberg-NN.exe --csv-file <CSVFile> --epoch <Epoch> --learning-rate <R> --neurons <Neurons>
FLAGS:
-h, --help Prints help information
-V, --version Prints version information
OPTIONS:
-f, --csv-file <CSVFile> Path to CSV file containing dataset
-e, --epoch <Epoch> Number of epochs to complete
-l, --learning-rate <R> Float indicating the learning rate (step) the network should use (i.e. 0.1)
-n, --neurons <Neurons> Number of neurons (units) to generate
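For reference, the argument definitions behind that help output would look roughly like the following with Clap's builder API (a sketch; the exact builder calls from the original listing are not reproduced here).
extern crate clap;
use clap::{App, Arg};

fn main() {
    // Rough reconstruction of the CLI definition that produces the help output above
    let matches = App::new("KohonenGrossberg-NN")
        .version("1.0")
        .author("Juxhin D. Brigjaj <juxhinbox at gmail.com>")
        .about("A naive Kohonen-Grossberg Counterpropogation Network in Rust")
        .arg(Arg::with_name("CSVFile")
            .short("f")
            .long("csv-file")
            .takes_value(true)
            .required(true)
            .help("Path to CSV file containing dataset"))
        .arg(Arg::with_name("Epoch")
            .short("e")
            .long("epoch")
            .takes_value(true)
            .required(true)
            .help("Number of epochs to complete"))
        .arg(Arg::with_name("R")
            .short("l")
            .long("learning-rate")
            .takes_value(true)
            .required(true)
            .help("Float indicating the learning rate (step) the network should use (i.e. 0.1)"))
        .arg(Arg::with_name("Neurons")
            .short("n")
            .long("neurons")
            .takes_value(true)
            .required(true)
            .help("Number of neurons (units) to generate"))
        .get_matches();

    // No default values are set, so every option has to be supplied and parsed by hand
    let csv_file = matches.value_of("CSVFile").unwrap();
    let epochs: u32 = matches.value_of("Epoch").unwrap().parse().unwrap();
    let learning_rate: f32 = matches.value_of("R").unwrap().parse().unwrap();
    let neurons: usize = matches.value_of("Neurons").unwrap().parse().unwrap();
    println!("{} {} {} {}", csv_file, epochs, learning_rate, neurons);
}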
Running it against the previous dataset with the following parameters.
KohonenGrossberg-NN.exe -f "\path\to\2d-unnormalised-dataset.csv" -e 100 -l 0.1 -n 3
[+] Normalising dataset
[-0.85355574, 0.5210016]
... TRUNCATED ...
[0.99772507, -0.06741386]
Starting Weights:
[0.8944668, 0.8694155]
[0.0746305, 0.84058756]
[0.34859443, 0.71816105]
[+] Running Epoch #100
[+] Final Weights:
[0.95343673, -0.24061918]
[-0.75190365, 0.6541567]
[0.43543196, 0.8938945]
If we take each distinct section of the output and plot it using Desmos, we can observe the result.
Unfortunately these images have been lost in time. If there's demand, I can attempt to recreate them.
Starting off with the Python implementation on a small dataset with 5000 data entries.
Measure-Command { python .\KohonenGrossberg-NN.py -f ".\2d-unnormalised-dataset.csv" -e 100 -l 0.1 -n 3 }
TotalSeconds : 4.2432031
TotalMilliseconds : 4243.2031
Compared with the Rust implementation on the same dataset.
Measure-Command { .\KohonenGrossberg-NN.exe -f ".\2d-unnormalised-dataset.csv" -e 100 -n 3 -l 0.1 }
TotalSeconds : 0.0667547
TotalMilliseconds : 66.7547
I kept increasing the dataset size in checkpoints, up to 150k lines.
Python (ms) Rust (ms) Lines
72.114 13.1077 24
117.7726 18.2308 48
141.9611 18.8265 100
476.7803 21.0633 500
884.6529 23.1228 1000
4243.2031 66.7547 4999
124274.4748 1751.4639 150000
Whilst I did expect Rust to be faster, the margin seemed excessive. Profiling the Python implementation, I noticed the following.
Name # Calls Time(ms)
<built-in method builtins.round> 5508904 2701
calculate_nets 499900 3588
update_units 499900 1273
main 1 5098
Looks like the built-in round() function took 53% of the execution time! Removing all floating-point round() calls alters the result by a noticeable degree.
Python (ms) Rust (ms) Lines
59.4613 13.1077 24
64.5703 18.2308 48
79.1932 18.8265 100
174.9629 21.0633 500
304.2106 23.1228 1000
1321.0426 66.7547 4999
39051.7759 1751.4639 150000
Overall we are looking at a ~22x advantage that Rust has over Python. This seemed more reasonable compared to the previous result.
The idea behind this post was not to point out how fast Rust is compared to Python, but rather to weigh that speed against the ease of use that Python provides.
There are many simple scenarios that Python handles beautifully if accurate assumptions are made. For example, getting the index of the largest f32 in a list.
In Python we can simply write the following.
_i = nets.index(max(nets))
The same can’t be said for Rust (see this StackOverflow question I posted).
let mut iter = nets.iter().enumerate();
let init = iter.next().unwrap();
let _i = iter.try_fold(init, |acc, x| {
    // f32 only implements PartialOrd, so the comparison itself can fail (NaN)
    let cmp = x.1.partial_cmp(acc.1)?;
    Some(if cmp == std::cmp::Ordering::Greater { x } else { acc })
}).unwrap().0;
Another example is generating the initial random weights. Using a uniform distribution in Python is beautiful, whereas with Rust it's far more obscure and harder to understand, at least to an eye untrained in functional programming.
Note — looking back at this blog post I would definitely say that the second example is a lot more intuitive to read!
let mut rng = thread_rng();
let weights: Vec<f32> = repeat_with(|| rng.gen::<f32>())
    .take(neurons * columns)  // one weight per neuron per input column
    .collect();
Despite the odd quirks and syntax, I'm in love with Rust, its performance, compiler and community. That said, I still use Python every day for most operations and feel that I can use Rust and Python hand-in-hand to complement each other when needed. I can also see myself creating effective proof-of-concepts in Python and then porting them over to Rust.