Rust: Data types

Rust is a statically typed language, so it must know the types of all variables at compile time. The data types in Rust are divided into two subsets: scalar and compound.

The scalar and compound types discussed here all have a fixed size known at compile time, so their values are stored on the stack by default.

Scalar Types

A scalar type represents a single value. Rust has four primary scalar types:

  • Integer, defaults to i32
  • Floating-point, IEEE-754 compliant, defaults to f64
  • Boolean
  • Character, the char type, four bytes in size, representing a Unicode Scalar Value
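As a rough illustration of the four scalar types above, the following sketch declares one value of each and reports their sizes in bytes. The function name scalar_sizes and the variable names are made up for this example.

```rust
use std::mem::size_of;

/// Returns the size in bytes of i32, f64, bool and char, in that order.
/// (Hypothetical helper, named for this example only.)
fn scalar_sizes() -> (usize, usize, usize, usize) {
    (
        size_of::<i32>(),
        size_of::<f64>(),
        size_of::<bool>(),
        size_of::<char>(),
    )
}

fn main() {
    let count = 42; // no annotation: integer literals default to i32
    let ratio = 2.5; // no annotation: float literals default to f64
    let done: bool = true;
    let letter: char = '\u{00E9}'; // a char holds one Unicode scalar value

    println!("{} {} {} {}", count, ratio, done, letter);
    println!("sizes in bytes: {:?}", scalar_sizes());
}
```

Note that a char is always four bytes, even for a character that would fit in one byte as UTF-8.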

Caution: Rust handles integer overflow differently in debug and release modes. In debug mode, Rust checks for integer overflow and causes the program to panic at runtime if this behaviour occurs. In release mode, Rust does not include checks for integer overflow. Instead, if overflow occurs, Rust performs two’s complement wrapping.
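Because the default behaviour depends on the build mode, it can be clearer to state the intended overflow behaviour explicitly. The sketch below uses the standard library's wrapping_add, checked_add and saturating_add methods on u8; the wrapper function names are made up for this example.

```rust
/// Adds with two's complement wrapping, in both debug and release builds.
fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

/// Adds and returns None if the result does not fit in a u8.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Adds and clamps the result at u8::MAX instead of wrapping.
fn add_saturating(a: u8, b: u8) -> u8 {
    a.saturating_add(b)
}

fn main() {
    // 250 + 10 = 260, which does not fit in a u8 (maximum 255).
    println!("wrapping:   {}", add_wrapping(250, 10));
    println!("checked:    {:?}", add_checked(250, 10));
    println!("saturating: {}", add_saturating(250, 10));
}
```

These methods behave the same way in debug and release builds, so the result no longer depends on how the program was compiled.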

Compound Types

A compound type groups multiple values into one type. Rust has two primitive compound types:

  • Tuple, a fixed-length collection of values that may have different types
  • Array, a fixed-length collection of values of the same type

Arrays are useful when you want your data allocated on the stack rather than the heap.
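A minimal sketch of both compound types follows. The function names and the values are made up for this example.

```rust
/// Returns a tuple that mixes three different types.
fn sample_tuple() -> (i32, f64, char) {
    (500, 6.4, 'z')
}

/// Sums a fixed-length array; the length 5 is part of the type.
fn sum(values: [i32; 5]) -> i32 {
    values.iter().sum()
}

fn main() {
    let tup = sample_tuple();
    let (x, y, z) = tup; // destructuring binds each element to a name
    println!("{} {} {}", x, y, z);
    println!("first element by index: {}", tup.0);

    let numbers = [1, 2, 3, 4, 5]; // type is [i32; 5]
    let zeros = [0; 5]; // five elements, all set to 0
    println!("sum of numbers: {}", sum(numbers));
    println!("sum of zeros: {}", sum(zeros));
    println!("array size in bytes: {}", std::mem::size_of::<[i32; 5]>());
}
```

The array's size is fixed by its type (five i32 values), which is what allows it to live on the stack.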

A vector (Vec<T>) is similar to an array but is allowed to grow or shrink in size, so its elements are stored on the heap.
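For comparison with the fixed-length array, here is a small sketch of a vector growing and shrinking at runtime. The function name grow_then_shrink is made up for this example.

```rust
/// Builds a vector by pushing three values, then removes the last one.
fn grow_then_shrink() -> Vec<i32> {
    let mut v = Vec::new(); // starts empty
    v.push(1);
    v.push(2);
    v.push(3); // length is now 3
    v.pop(); // removes the last element; length is now 2
    v
}

fn main() {
    let v = grow_then_shrink();
    println!("{:?} has length {}", v, v.len());
}
```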

Stack and heap

In most programming languages, one need not think about the stack and the heap very often. But in a systems programming language like Rust, whether a value is on the stack or the heap has more of an effect on how the language behaves and why you have to make certain decisions.

The stack stores values in the order it gets them and removes the values in the opposite order. Last in, first out. Adding data is called pushing onto the stack, and removing data is called popping off the stack. All data stored on the stack must have a known, fixed size at compile time.
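The last in, first out order can be modelled with an ordinary Vec used as a stack. This is only a model of the discipline, not of how the program's call stack is actually implemented, and the function name pop_order is made up for this example.

```rust
/// Pushes the items in the given order, then pops them all,
/// returning the order in which they came off.
fn pop_order(items: &[i32]) -> Vec<i32> {
    let mut stack = Vec::new();
    for &item in items {
        stack.push(item); // push onto the top
    }

    let mut popped = Vec::new();
    while let Some(top) = stack.pop() {
        popped.push(top); // pop always removes the most recent item
    }
    popped
}

fn main() {
    println!("{:?}", pop_order(&[1, 2, 3]));
}
```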

Data with an unknown size at compile time or a size that might change must be stored on the heap instead. The heap is less organised: when you put data on the heap, you request a certain amount of space. The memory allocator finds an empty spot in the heap that is big enough, marks it as being in use, and returns a pointer, which is the address of that location. This process is called allocating on the heap and is sometimes abbreviated as just allocating.
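One simple way to see a heap allocation in Rust is Box<T>, which places a value on the heap and keeps only a pointer to it on the stack. The sketch below assumes nothing beyond the standard library; the function name boxed is made up for this example.

```rust
use std::mem::size_of;

/// Allocates an i32 on the heap and returns the pointer that owns it.
fn boxed(value: i32) -> Box<i32> {
    Box::new(value)
}

fn main() {
    let b = boxed(5); // `b` itself is a pointer stored on the stack
    println!("value on the heap: {}", *b); // dereference to follow the pointer
    println!("size of the pointer: {} bytes", size_of::<Box<i32>>());
} // `b` goes out of scope here and the heap allocation is freed
```

The Box is the same size as a usize regardless of what it points to, which is why the pointer can be stored on the stack even when the data behind it cannot.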

Pushing to the stack is faster than allocating on the heap because the memory allocator never has to search for a place to store new data; that location is always at the top of the stack.

Accessing data on the heap is slower than accessing data on the stack because you have to follow a pointer to get there.

When your code calls a function, the values passed into the function (including the pointers to data on the heap) and the function’s local variables get pushed onto the stack. When the function is over, those values get popped off the stack.
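The following sketch illustrates that passing heap-backed data to a function passes the pointer, not the data itself. It hands a String to a function and compares the address of the heap buffer before and after the call. The function name heap_address is made up for this example.

```rust
/// Takes ownership of a String and returns the address of its heap buffer.
fn heap_address(s: String) -> usize {
    s.as_ptr() as usize
} // `s` goes out of scope here, so the heap buffer is freed

fn main() {
    let s = String::from("hello");
    let before = s.as_ptr() as usize; // address of the text on the heap

    // Only the String's pointer, length and capacity are passed to the
    // function; the text on the heap is not copied or moved.
    let after = heap_address(s);

    println!("same heap buffer: {}", before == after);
}
```

The addresses are compared only as integers here; the buffer itself has already been freed by the time the comparison runs.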