A New Way of Thinking
Motivation
Rust was my language of the year. You know, that thing where programmers set out to learn a new programming language every year. Usually not to be productive at it but to familiarize themselves with current trends in language design, implementation, and paradigms. I had heard a lot of good stuff about Rust and decided late last year to make it my 2018 language. I even attended the all-hands of the Rust Core Team, and participated in a few discussions. To be fair, it was in the hopes of meeting my internet friend Yehuda Katz in person. He wasn’t there.
A few days ago (it’s November already) I began serious study using both Philipp Oppermann’s Writing an OS in Rust and The Rust Programming Language Book. You’ve probably heard it said that Rust is a systems programming language. Well, an operating system is the ultimate system, so it’s as perfect a match as it gets. Plus both materials are available for free.
I’m only a few days in but I’ve been smacked by some of what I consider the best ideas in programming I’ve encountered yet. I’ve also come to grips with gaps in my knowledge, things I took for granted but shouldn’t have, and a better understanding of my current tools of the trade. In this post I talk about two things: the move (haven’t met in any language aside Rust) and scope, a concept familiar to most programming language.
The Move
In most programming languages, code similar to what’s below will compile or run. What we want to do has been done many times and over. Assign a value to a variable, bind one more variable to the same value but referring to it by the previous variable, and continue to use both variables in the same scope.
1// This code accessible on the Rust playground
2// at this link:
3// https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=1640e708979885d7219b84ec45917d76
4
5fn main() {
6 let s = String::from("hello");
7 let t = s;
8 println!("{} is the same as {}", t, s);
9}
In every language I’ve written useful software in this is perfectly valid. But not in Rust, because, basically it fucks up memory reclamation if you’re allowed to do that.
You see, Rust is not garbage collected. That means
you allocate and free memory yourself, C-style. If
this second statement was self-evident to you then
you just got smacked by the first falsehoods
programmers believe about garbage collection.
Rust isn’t GC-ed but you don’t manually malloc
and free
either. Instead heap memory is allocated
and freed as you enter and leave scopes. And this
is the difference between Rust and GC-ed
languages: Rust frees unused memory immediately
the scope has been exited while GC-ed languages
like Go, through a sophistication that I now no
longer find necessary, reserve dead objects to be
pruned later.
That was a digression. Back to what The Move is
or what happened in the code above. In simple
terms, s
moved to t
, after all who needs two
references to the same thing? Give me a good
reason why you’d have two variables point to the
same place in memory. I’ll wait. This is the
commonsense explanation of The Move. The computer
science-y explanation is: it avoid trying to free
the same memory address twice when we exit the
scope. Asking that the same memory location be
cleared twice, accidentally, can have dire
consequences. If in-between the two calls the
operating system offered the spot to another
process, we effectively made life difficult for
this process. If Newton lived long enough to make
laws of allocating and freeing memory, he’d have
quipped that in a given scope, number of
mallocs and frees are equal but opposite. In the
code above there’s only one allocation and it’s on
line 6. Line 7 doesn’t allocate. So there can be
only one freedom at the end of the scope. Thus
both s
and t
can’t survive to the end of the
scope.
The book has more details on this behavior, and
which types it applies to. It’s
here, with more details about the
Copy
trait, clone
-ing, and a lesson on how to
explain potentially incendiary concepts.
Scope
Scope can be an elusive concept, but only if you
let it be. You can get away with thinking of it as
the stuff in between the curly braces (in
programming languages with curly braces). In
Python it’s defined by the indentation, in Ruby a
mix of things. Let’s stick with the curly braces
for now. Everything in the opening ({
) and
closing (}
) braces are in the same scope,
including, wait for it, other scopes. It could be
scopes all the way down—it depends on how you
like your sugar-free coke, really. For example,
here’s an inception of scopes:
fn main() { // parent scope
let s = String::from("hello");
// len, a function, defines another scope.
// Thus the scope of `len` is a child of
//`main`'s scope.
let l = len(s);
println!("{} has length of {}", s, l);
}
fn len(s: String) -> usize { // child scope
s.len()
}
The asymmetry of life applies: child has access to everything in/on parent but not the other way round. By all means everything we already know and expect of scope inceptions.
That aside, the code above has been written many
times over in several languages, text editors, and
color schemes. And all variations compile or work.
But not with Rust. The innocent looking code isn’t
really innocent: it tries to double spend a
value. How? you ask. Well it sold ownership of
its variable s
to another scope (the len
function)
but tries to use it immediately after. What? sold
what? you ask again. Well, that’s how scopes work.
They own, and can bequeath ownership of their
resources. Thus passing s
to len
is a
statement from the main
scope to bequeath
ownership of s
to len
. Henceforth len
owns
s
. Which means that main
has lost every right
to use it (and so shouldn’t try to use it). It
also means that immediately the scope of len
is over,
the memory occupied by s
is freed. I wasn’t
expecting this. Born and bred on the jarring
principles of pass by value and pass by
reference, Rust’s ownership is a breath of
fresh air. Once a variable is passed to another
scope, the parent scope has lost ownership and
can’t use it again. I don’t even want to think in
terms of variables any more. I want to think of
everything as a resource. This is probably me
taking it too far but that’s what happens when
you’re hit by a revelation. It changes a lot of
things for you at the same time.
Any workarounds? I’d love to still use s
after
finding its length. No, no workaround needed. We
could, instead of bequeathing ownership, lease it
to len
. How? len
could be defined as a
function that doesn’t take ownership of its
arguments but rather requests a lease valid for
the duration of its scope. Here’s how the new
len
looks like (together with main
):
fn main() {
let s = String::from("hello");
let l = len(&s);
println!("{} has length of {}", s, l);
}
// The New & Improved `len`. Instead of taking
// ownership of its arguments, it requests a
// lease (aka a reference) and operates with that.
fn len(s: &str) -> usize {
s.len()
}
With this change, the heap memory used by s
won’t be reclaimed when len
's scope is over. In
fact, in the current implementation there’s no
allocation so there will be zero freeing. The
allocation was in main
and so the deallocation
or freeing or reclamation will happen there,
mainly because it didn’t bequeath ownership of s
to anyone.
Conclusion
These concepts were pretty interesting to think about and use. And once you get a hang of it the entire field of garbage collection, which the best minds of our generation continue to tinker to some desired perfection, begin to look like wasted effort. But I’d like to think otherwise and hope that there’s a significant benefit to garbage collection (over immediately freeing memory) that I just don’t know yet.
That said I’ve learnt a few things and as I continue to write and maintain programs in GC-ed languages I’ll see what lessons from Rust I could apply there.
The biggest takeaway is that I’ll still do my interviews in Python.
Enjoy!
Got comments or corrections for factual errors? There’s a Hacker News thread for that.