Optimization
This handout is tied to the following lecture
13. OptimizationAssembly, CPU instructions and computer architecture
For the audience that takes this course it will be quite unlikely that you will ever have to actually read or write assembly. However, it can be important to understand that it exists, what it does, and how it works. For example, understanding that a square root is many many more instructions in the CPU than a multiplication can maybe help you write your algorithms in such a way that they are naturally fast.
Below are some good videos that go trough these concepts.
Computer Timescales Mapped onto Human Timescales - Computerphile
How CPU Memory & Caches Work - Computerphile
How Branch Prediction Works in CPUs - Computerphile
Theo - I finally know how CPUs work (w/ Casey Muratori)
Actually, the entire playlist with Matt Godbolt on Computerphile is fantastic and I can really recommend watching trough it at some point. And yes, that is the legendary Godbolt that created Compiler Explorer.
If you want to try writing some assembly yourself just to experiment, I have played around with fasm before which you can install a compiler for.
Additionally, while trying to figure out if my compiler actually managed to properly optimize my code and while reading the output assembly I used these resources:
- University of Virginia Computer Science - x86 Assembly Guide
- Agner Fog. Technical University of Denmark - Software optimization resources
- Agner Fog. Technical University of Denmark - Optimizing subroutines in assembly language
Comments to lecture
During the lecture we noticed the “page faults” statistic from perf. As we discussed, we assumed
it is quite close to cache misses (see CPU cache).
However, it turns out the answer is both yes and no, looking at definition of a page
fault, it is actually due to the much more complicated
layout of modern CPU’s which now has a memory management unit (MMU). So far I have not found good
documentation that details the relation between page faults and cache misses but it seems that in
most cases, a page fault is triggered by a cache miss, see this stackoverflow
post.
Optimization
Casey Muratori is a well known proponent of writing performant code by default and he has a wealth of knowledge on the subject. So if you want to get into optimization I can recommend having a listen to the series of lectures he has created on the subject, or at least the first episode that is linked below which is about his take on the subject.
Molly Rocket - Refterm Lecture Part 1 - Philosophies of Optimization
In one of the previous lectures I linked a video about optimizing hash maps. In case you did not watch it that time, I would recommend giving it a go for this lecture as now with this context in mind it might make a lot of sense!
strager - Faster than Rust and C++: the PERFECT hash table
Lastly, below are two very entertaining stores about optimization that I can recommend watching/listening to.
First is the story of hatetris on the episode CORECURSIVE #109: HATETRIS - Obsession, Friendship, and World Records. Hatetris is a version of Tetris where you always get the worst piece, and how people beat world records by both algorithmic advances but also optimization.
I like math, and sometimes I do some “recreational math” and watch videos about math. One evening I found a video about a hilarious story of community optimization. The problem itself is about finding five-letter words with twenty-five unique letters.
Stand-up Maths - Someone improved my code by 40,832,277,770%
Algorithms
In the lecture I mentioned the visualization of sorting algorithms and data representations used by chess engine algorithms, so below I have linked those videos!
Timo Bingmann - 15 Sorting Algorithms in 6 Minutes
Coding with Tom - I Coded a Chess Engine in 7 Languages to test Performance!
Testing language speed
Lastly I would like to highlight how bad most “language speed tests” are. They are almost never representations of the real world applications and sometimes just plain bad. Below is a longer discussion about why one of these tests that has made the rounds on the internet is not good.
ThePrimeTime - It’s Really Just That Bad