Rust is a multi-paradigm, general-purpose programming language that has been widely adopted in recent years, especially in the blockchain world. Polkadot and NEAR smart contracts are written in Rust. Subgraph will use Rust to build a server indexing data from blockchains. So it is worth exploring Rust's features. Slides: docs.google.com/presentation/d/1H7mXnoaZRTw..
Cargo
Cargo is Rust's build system and package manager. It is installed together with Rust by the official installer. Cargo has many CLI commands, but the most common ones are:
- cargo new creates a new executable project with the --bin flag or a new library project with the --lib flag. Executables are binaries that perform some task, e.g. servers or command-line tools. Libraries are shared code for others to use. For example, we develop NEAR smart contracts as libraries, and the NEAR runtime then runs them.
- cargo run builds and runs the program. If we develop a library, we can use cargo build instead.
A project created by Cargo has the following layout:
- src/ contains all our source code.
- target/ contains all build output. It is ignored in the .gitignore file.
- Cargo.toml is the manifest containing metadata and dependencies of the project.
- Cargo.lock is generated by Cargo automatically. It locks the versions of all dependencies.
Package structure
Reading a large project in a language we don't know well is confusing: where is the entry point, which file is a submodule of which, and so on. So it makes sense to talk about package structure first.
Executables
After creating an executable with Cargo, we will have one file, main.rs, in the src/ directory. This main.rs is the default executable, which we can run with cargo run --bin {project_name}. The function named main is the program's entry point.
A project can have multiple commands. Each extra command is a .rs file in the src/bin/ directory. For example, if we want a command named another-exe, we need a main function in src/bin/another-exe.rs.
At some point in development, we need to split large files into smaller ones. We cannot put a file named some_module.rs directly at the src/bin/ level, because Cargo will treat it as a new command named some_module. If we want to create a multi-file-exe command, we need to create a directory named multi-file-exe in src/bin/. We then add src/bin/multi-file-exe/main.rs as the entry point, plus as many modules as we need, such as src/bin/multi-file-exe/some_module.rs.
Libraries
After creating a library with Cargo, we will have one file, lib.rs, in the src/ directory. We can put anything, such as functions, directly in lib.rs, or group functions into namespaces. In Rust, namespaces are modules (mod).
At some point in development, some modules in the single file lib.rs become very large, and we may need to separate them into other files. To do this, we create a file named after the module and move everything inside the module into that file. Then we declare the module in lib.rs; for example, pub mod some_module; includes everything from some_module.rs as the some_module module.
If we want to separate modules recursively, we need to create a directory. For example, to split some_submodule out of nested_module.rs, we create a nested_module directory. Then we put everything in some_submodule into nested_module/some_submodule.rs and the remaining code into nested_module/mod.rs. Remember to declare pub mod some_submodule; in mod.rs.
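The file layout above maps one-to-one onto module declarations. As a minimal single-file sketch, the inline mod blocks below stand in for the separate files; the module names come from the text, while the function names (hello, answer, doubled_answer) are hypothetical.

```rust
// Single-file sketch: each inline `mod` block stands in for a separate file.
// Function names are hypothetical and only illustrate the paths.

// Would live in src/some_module.rs, declared in lib.rs as `pub mod some_module;`
pub mod some_module {
    pub fn hello() -> &'static str {
        "hello from some_module"
    }
}

// Would live in src/nested_module/mod.rs
pub mod nested_module {
    // Would live in src/nested_module/some_submodule.rs,
    // declared in mod.rs as `pub mod some_submodule;`
    pub mod some_submodule {
        pub fn answer() -> u32 {
            42
        }
    }

    // Code left in mod.rs can call into the submodule.
    pub fn doubled_answer() -> u32 {
        some_submodule::answer() * 2
    }
}

fn main() {
    println!("{}", some_module::hello());
    println!("{}", nested_module::some_submodule::answer());
    println!("{}", nested_module::doubled_answer());
}
```

Callers use the same paths (some_module::hello, nested_module::some_submodule::answer) whether the modules are inline or in their own files.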
Common concepts
This section will cover some notes about common concepts. For full details, refer to doc.rust-lang.org/book/ch03-00-common-progr...
We use let to declare a variable. By default, variables in Rust are immutable. If we want to mutate the value, we use let mut. In most cases we don't need to specify the type, because the compiler infers it for us.
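A small sketch of let versus let mut, with one explicit type annotation for comparison. The function and variable names are made up for illustration.

```rust
// Hypothetical helper showing immutable and mutable bindings.
fn counter_demo() -> (i32, i32) {
    let fixed = 5; // immutable; the type i32 is inferred
    // fixed = 6; // would not compile: cannot assign twice to immutable variable

    let mut counter = 0; // `mut` allows reassignment
    counter += fixed;
    counter += 1;

    let explicit: i32 = fixed; // the annotation is optional here
    (explicit, counter)
}

fn main() {
    let (fixed, counter) = counter_demo();
    println!("fixed = {}, counter = {}", fixed, counter);
}
```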
Native data types in Rust are divided into two main groups: scalar and compound. Scalar types are integers, floating-point numbers, booleans and characters. Compound types are tuples and arrays. Like in Python, a tuple can hold elements of different types. Note that an array in Rust has a fixed size. Variables of these native types are stored on the stack.
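To make the two groups concrete, here is a short sketch that declares one value of each kind. The names and values are arbitrary.

```rust
// Hypothetical helper touching each scalar and compound type once.
fn type_demo() -> (i64, f64, bool, char, usize) {
    // Scalar types
    let integer: i64 = -42;
    let float: f64 = 2.5;
    let flag: bool = true;
    let letter: char = 'R';

    // Compound types
    let pair: (i64, char) = (integer, letter); // tuple: mixed element types
    let numbers: [i32; 3] = [1, 2, 3]; // array: the length is part of the type

    // Tuple fields are accessed by position, array length via len().
    (pair.0, float, flag, pair.1, numbers.len())
}

fn main() {
    println!("{:?}", type_demo());
}
```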
Rust distinguishes statements and expressions very clearly. Statements perform actions and do not produce a value, while expressions evaluate to a value. Moreover, the last expression in a block is the value of that block. For example, in the image below, x is the returned value of the function. If we change return x+1; to x+1, the code won't compile, because x+1 would become the value of the if block instead of being returned from the function.
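The following sketch illustrates the same point in code. It is a hypothetical function written for this purpose, not necessarily the code from the original image.

```rust
// Hypothetical function: returns early from inside an `if`, otherwise
// falls through to the trailing expression.
fn bump_if_small(x: i32) -> i32 {
    if x < 10 {
        // `return` is needed here. Writing just `x + 1` (no semicolon) would
        // make `x + 1` the value of this `if` block, and an `if` without
        // `else` must evaluate to `()`, so the compiler reports a type mismatch.
        return x + 1;
    }
    x // last expression, no semicolon: the function's return value
}

fn block_value() -> i32 {
    // A block is an expression too; its last expression is its value.
    let y = {
        let base = 3;
        base * 2
    };
    y
}

fn main() {
    println!("{} {} {}", bump_if_small(5), bump_if_small(20), block_value());
}
```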
Ownership
Ownership is the most interesting topic in Rust. It may feel unfamiliar at first, but later we will thank Rust for this feature. The idea of Rust is "memory safety without a garbage collector". In C, we have to free memory ourselves. In Go or JS, there is a garbage collector, but it adds runtime overhead.
Principles
We will use the stack and the heap to explain ownership. The stack is LIFO and holds fixed-size variables, while the heap holds growable data or data whose size is unknown at compile time.
When we declare a number, let num = 9;, the program allocates an integer slot on the stack. If we declare a growable vector, like let vector = vec![1, 2, 3, 5];, the program asks for a memory allocation on the heap, then creates a management value (pointer, length, and capacity) on the stack that points to the data on the heap.
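As a rough sketch of that layout, the snippet below inspects the length and capacity that a Vec keeps alongside its heap pointer. The function name is hypothetical, and the exact capacity after growth is deliberately not asserted because it is an allocation detail.

```rust
// Hypothetical helper: observes the bookkeeping a Vec keeps for its heap data.
fn vec_parts() -> (usize, bool) {
    let num = 9; // a plain integer stored directly in the stack frame
    let mut vector = vec![1, 2, 3, 5]; // the elements live on the heap

    // The Vec value itself holds a pointer, a length and a capacity.
    let len_before = vector.len();
    vector.push(num); // may reallocate the heap buffer so it can grow

    let grew = vector.len() == len_before + 1 && vector.capacity() >= vector.len();
    (vector.len(), grew)
}

fn main() {
    println!("{:?}", vec_parts());
}
```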
The trick below helps me understand ownership:
- Owners are “on stack”, e.g. num and vector in the image below.
- Data has exactly one owner. Data on the stack owns itself, and there cannot be multiple owners. In other languages, data on the heap can have multiple owners pointing to it, e.g. vector2, vector3, but Rust won't allow that.
- When the owner is dropped, the data is dropped. No garbage collector is needed.
- In an assignment, data is “moved” to the new owner if its type does not implement the Copy trait (interface). The examples section makes this clearer.
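The "owner dropped, so data dropped" rule can be observed with the standard Drop trait. This is a minimal sketch: the Tracked type and its counter are hypothetical and exist only to record when the drop happens.

```rust
use std::cell::Cell;

// Hypothetical type that counts how many times a value has been dropped.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn drops_at_scope_end() -> (u32, u32) {
    let drops = Cell::new(0);
    let during;
    {
        let _owner = Tracked { drops: &drops };
        during = drops.get(); // the owner is still alive here
    } // `_owner` goes out of scope: its data is dropped right here
    (during, drops.get())
}

fn main() {
    println!("{:?}", drops_at_scope_end());
}
```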
Examples
After declaring num, we make an assignment let n = num. Because integers, like all scalar types, implement the Copy trait, the data is copied from num to n. So num still owns 9, and n owns a different 9.
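A short sketch of the copy case, using the same names num and n. The wrapping function is hypothetical.

```rust
// Hypothetical helper: assignment of a Copy type duplicates the value.
fn copy_demo() -> (i32, i32) {
    let num = 9;
    let mut n = num; // i32 is Copy: the value is duplicated, `num` stays valid
    n += 1; // changing `n` does not affect `num`
    (num, n)
}

fn main() {
    println!("{:?}", copy_demo());
}
```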
After declaring vector, we make an assignment let own_v = vector. Because Vec does not implement the Copy trait, the data is moved to the new owner own_v. If we try to use the vector variable after this, we get a compile error. We can "borrow" the data instead by making a reference: let ref_v = &own_v.
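The move and the borrow can be sketched as follows, with the variable names from the text. The wrapping function and the sum are only there to make the result observable.

```rust
// Hypothetical helper: moves a Vec to a new owner, then borrows it.
fn move_and_borrow() -> (usize, i32) {
    let vector = vec![1, 2, 3, 5];
    let own_v = vector; // Vec is not Copy: ownership moves to `own_v`
    // println!("{:?}", vector); // would not compile: borrow of moved value

    let ref_v = &own_v; // shared borrow: `own_v` is still the owner
    let total: i32 = ref_v.iter().sum();
    (own_v.len(), total)
}

fn main() {
    println!("{:?}", move_and_borrow());
}
```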
Structs
In Rust, there are 3 kinds of struct: named-field, tuple-like and unit-like. Named-field structs are like structs in other languages. Tuple structs are useful for giving a tuple its own type without verbose field names. Unit structs are useful for implementing traits that require no data. Note that functions that do not need an object to be called on are associated functions, and they can be used as constructors.
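Here is a minimal sketch of the three kinds of struct plus an associated function used as a constructor. All type and field names (Account, Meters, Marker) are hypothetical.

```rust
// Named-field struct
struct Account {
    owner: String,
    balance: u64,
}

// Tuple struct: a distinct type without field names
struct Meters(f64);

// Unit struct: carries no data
struct Marker;

impl Account {
    // Associated function (no `self`), used as a constructor
    fn new(owner: &str) -> Self {
        Account { owner: owner.to_string(), balance: 0 }
    }

    // Method (takes `self`), called on an existing value
    fn deposit(&mut self, amount: u64) {
        self.balance += amount;
    }
}

fn struct_demo() -> (String, u64, f64) {
    let mut account = Account::new("alice");
    account.deposit(50);
    let distance = Meters(1.5);
    let _marker = Marker;

    let Account { owner, balance } = account; // destructure the fields
    (owner, balance, distance.0)
}

fn main() {
    println!("{:?}", struct_demo());
}
```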
Traits and Derive
Traits in Rust are like interfaces in other languages. We can declare placeholder functions in a trait or even define default implementations for them. To implement a trait for a struct, use impl {trait} for {struct}. We can override the default implementations too.
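A minimal sketch of a trait with one required function and one default implementation, which the second struct overrides. The trait and struct names are made up for illustration.

```rust
// Hypothetical trait: `name` is a placeholder, `greet` has a default body.
trait Greet {
    fn name(&self) -> String;

    fn greet(&self) -> String {
        format!("Hello, {}!", self.name())
    }
}

struct English;
struct Pirate;

// Uses the default `greet`.
impl Greet for English {
    fn name(&self) -> String {
        "Ann".to_string()
    }
}

// Overrides the default `greet`.
impl Greet for Pirate {
    fn name(&self) -> String {
        "Jack".to_string()
    }

    fn greet(&self) -> String {
        format!("Ahoy, {}!", self.name())
    }
}

fn main() {
    println!("{}", English.greet());
    println!("{}", Pirate.greet());
}
```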
Derive macros are code that generates Rust code. Rust provides some useful traits together with derivable implementations, e.g. Copy, Clone, Debug, etc. With a Copy implementation, assigning a struct no longer moves it to the new owner. However, the Copy trait can be derived only for structs whose fields all implement Copy.
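The sketch below derives these traits on two hypothetical structs. PartialEq is derived as well, purely so the values can be compared; that is an addition for this example, not something the text above requires.

```rust
// All fields are Copy, so Copy can be derived.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// String is not Copy, so this struct can derive Clone but not Copy.
#[derive(Debug, Clone, PartialEq)]
struct Label {
    text: String,
}

fn derive_demo() -> (Point, Point, String) {
    let a = Point { x: 1, y: 2 };
    let b = a; // copied, not moved: `a` is still usable below

    let label = Label { text: "origin".to_string() };
    let cloned = label.clone(); // an explicit clone instead of an implicit copy

    (a, b, format!("{:?}", cloned)) // Debug gives us `{:?}` formatting
}

fn main() {
    println!("{:?}", derive_demo());
}
```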
Enums
Like in C/C++, enums in Rust list the variants of a type. But in Rust, enums are more powerful: each variant can take the shape of any of the 3 kinds of struct.
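As an illustration, the hypothetical Shape enum below has one variant of each shape (unit-like, tuple-like, named-field), and a match handles them all.

```rust
// Hypothetical enum with one variant per struct-like shape.
enum Shape {
    Empty,                            // unit-like variant
    Circle(f64),                      // tuple-like variant (radius)
    Rect { width: f64, height: f64 }, // named-field variant
}

fn area(shape: &Shape) -> f64 {
    // `match` must cover every variant, or the code does not compile.
    match shape {
        Shape::Empty => 0.0,
        Shape::Circle(radius) => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
    }
}

fn main() {
    let shapes = [
        Shape::Empty,
        Shape::Circle(1.0),
        Shape::Rect { width: 2.0, height: 3.0 },
    ];
    for shape in &shapes {
        println!("{}", area(shape));
    }
}
```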
Error handling
Rust has two kinds of errors: unrecoverable ones using panic! and recoverable ones using the Result enum. In smart contracts, people prefer the unrecoverable kind. In server development, the recoverable kind is the usual choice, because Rust doesn't have something like try ... catch .... Rust defines the ? operator for the common case of Result handling. For example, let f = File::open("hello.txt")?; is roughly equivalent to the following (? additionally converts the error type via From):
let f = File::open("hello.txt");
let f = match f {
    Ok(file) => file,
    Err(e) => return Err(e),
};
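To keep the example runnable without a file on disk, the sketch below applies the same pattern to string parsing instead of File::open. The function names are hypothetical; the two functions are meant to behave identically.

```rust
use std::num::ParseIntError;

// Version using `?`: on Err, the function returns early with that error.
fn parse_and_double(input: &str) -> Result<i32, ParseIntError> {
    let value = input.trim().parse::<i32>()?;
    Ok(value * 2)
}

// Same logic written out with an explicit match.
fn parse_and_double_explicit(input: &str) -> Result<i32, ParseIntError> {
    let value = match input.trim().parse::<i32>() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    Ok(value * 2)
}

fn main() {
    // The caller decides how to recover from the error.
    match parse_and_double("21") {
        Ok(n) => println!("doubled: {}", n),
        Err(e) => println!("could not parse: {}", e),
    }
    println!("{:?}", parse_and_double_explicit("abc").is_err());
}
```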
That's all. I hope this blog helps you start your Rust journey quickly. Thank you for reading my blog.
Reference: doc.rust-lang.org/book