Advanced Topics: Unsafe Rust and FFI
Rust’s strict compile-time safety guarantees are foundational. However, there are scenarios where these guarantees need to be circumvented to achieve specific goals, such as interacting with hardware, operating system features, or code written in other languages. This is where unsafe Rust and the Foreign Function Interface (FFI) come into play.
Entering an unsafe block means you are telling the compiler, “I know what I’m doing, and I guarantee that this code is memory-safe.” The compiler will trust you. It is your responsibility to ensure that unsafe code actually upholds Rust’s memory safety guarantees.
Understanding unsafe
unsafe blocks (or unsafe fn for entire functions) allow you to do five things that the safe Rust compiler normally prohibits:
- Dereference a raw pointer.
- Call an unsafe function or unsafe method.
- Access or modify a mutable static variable.
- Implement an unsafe trait.
- Access fields of unions.
Let’s look at each of these.
1. Dereference a Raw Pointer
Raw pointers (*const T and *mut T) are analogous to pointers in C/C++. They can be null, dangle, or point to invalid memory. They do not have ownership, borrowing rules, or lifetime guarantees enforced by the compiler.
fn main() {
    let mut num = 5;
    // Derive both raw pointers from a single mutable borrow so that creating
    // one does not invalidate the other.
    let r2 = &mut num as *mut i32; // Mutable raw pointer
    let r1 = r2 as *const i32; // Immutable raw pointer to the same location
    unsafe {
        println!("r1 is: {}", *r1); // Dereferencing raw pointer
        println!("r2 is: {}", *r2); // Dereferencing raw pointer
        *r2 = 10; // Modifying value through mutable raw pointer
    }
    println!("num is now: {}", num); // Output: num is now: 10
}
- You can create raw pointers in safe code (e.g., by casting a reference).
- You can only dereference them within an unsafe block.
- DANGER: Dereferencing an invalid raw pointer is undefined behavior.
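Because a raw pointer may be null, a common defensive step is to check for null before dereferencing. The following is a minimal sketch; the helper name read_or_default is hypothetical and not part of any library. It relies on the standard pointer methods is_null and as_ref.

```rust
/// Reads the value behind `ptr`, or returns `default` when `ptr` is null.
/// (`read_or_default` is a hypothetical helper for this sketch.)
///
/// # Safety
/// If `ptr` is non-null, it must be aligned and point to a live, initialized `i32`.
unsafe fn read_or_default(ptr: *const i32, default: i32) -> i32 {
    // `as_ref` converts a raw pointer into `Option<&i32>`: `None` for null.
    match unsafe { ptr.as_ref() } {
        Some(value) => *value,
        None => default,
    }
}

fn main() {
    let num = 5;
    let valid: *const i32 = &num;
    let null: *const i32 = std::ptr::null();

    // Creating and inspecting raw pointers is safe; only dereferencing is not.
    assert!(!valid.is_null());
    assert!(null.is_null());

    let a = unsafe { read_or_default(valid, -1) };
    let b = unsafe { read_or_default(null, -1) };
    println!("valid: {a}, null: {b}");
}
```

A null check only rules out one kind of invalid pointer. A dangling or misaligned pointer still passes it, which is why the function remains unsafe.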
2. Calling an unsafe Function or Method
Functions can be marked unsafe to indicate that they contain code that relies on certain invariants that the caller must uphold for the function to be safe.
unsafe fn dangerous() {
    println!("This is a dangerous function!");
}
fn main() {
    // Calling `dangerous()` requires an unsafe block
    unsafe {
        dangerous();
    }
}
Standard library functions that are unsafe include slice::get_unchecked (which doesn’t perform bounds checking) and std::mem::transmute (for reinterpreting bytes as a different type).
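As a rough illustration of those two functions, the sketch below wraps each in a safe function that establishes the required invariant first. The names sum_prefix and float_bits are made up for this example.

```rust
/// Sums the first `n` elements without per-access bounds checks.
/// The single up-front check is what makes the unsafe block sound.
fn sum_prefix(values: &[i32], n: usize) -> Option<i32> {
    if n > values.len() {
        return None;
    }
    let mut total = 0;
    for i in 0..n {
        // SAFETY: i < n <= values.len(), checked above.
        total += unsafe { *values.get_unchecked(i) };
    }
    Some(total)
}

/// Reinterprets the bytes of an `f32` as a `u32`.
fn float_bits(x: f32) -> u32 {
    // SAFETY: `f32` and `u32` have the same size, and every bit pattern is a valid `u32`.
    unsafe { std::mem::transmute::<f32, u32>(x) }
}

fn main() {
    let data = [1, 2, 3, 4];
    println!("{:?}", sum_prefix(&data, 3));
    println!("{:?}", sum_prefix(&data, 9));
    println!("{:#x}", float_bits(1.0));
    // The safe standard-library method gives the same answer.
    assert_eq!(float_bits(1.0), 1.0f32.to_bits());
}
```

In real code, f32::to_bits is the better choice for this particular conversion. transmute appears here only to show the shape of the call.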
3. Accessing or Modifying a Mutable Static Variable
Static variables are similar to global variables. While immutable static variables are safe to access, mutable static variables (static mut) are inherently unsafe because they introduce the possibility of data races if multiple threads try to access and modify them without synchronization.
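static mut COUNTER: u32 = 0; // Mutable static variable
fn add_to_counter(inc: u32) {
    unsafe {
        COUNTER += inc; // Modifying mutable static variable
    }
}
fn main() {
    add_to_counter(1);
    // If multiple threads called `add_to_counter` without synchronization, it would be a data race.
    // Copy the value out first: passing `COUNTER` straight to `println!` would take a
    // reference to the `static mut`, which recent compilers lint against.
    let current = unsafe { COUNTER }; // Accessing mutable static variable
    println!("COUNTER: {}", current);
}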
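For shared mutable state, prefer a plain static that holds a synchronized type over static mut: an atomic such as AtomicU32, or a Mutex (e.g., static MY_DATA: Mutex<u32> = Mutex::new(0);). Values that need run-time initialization can use std::sync::OnceLock or the lazy_static/once_cell crates. Arc is unnecessary here because a static already lives for the whole program.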
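A minimal sketch of the safe alternative, assuming a counter and a small log shared by four threads. The names ATOMIC_COUNTER, LOG, and record are illustrative only.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Mutex;
use std::thread;

// Hypothetical globals for this sketch; no `static mut` and no `unsafe` needed.
static ATOMIC_COUNTER: AtomicU32 = AtomicU32::new(0);
static LOG: Mutex<Vec<u32>> = Mutex::new(Vec::new());

/// Adds `inc` atomically and returns the new value.
fn add_to_counter(inc: u32) -> u32 {
    // `fetch_add` returns the previous value, so add `inc` to get the new one.
    ATOMIC_COUNTER.fetch_add(inc, Ordering::SeqCst) + inc
}

/// Appends `value` to the shared log under the mutex.
fn record(value: u32) {
    LOG.lock().unwrap().push(value);
}

fn main() {
    let handles: Vec<_> = (0..4)
        .map(|id| {
            thread::spawn(move || {
                for _ in 0..1000 {
                    add_to_counter(1);
                }
                record(id);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    println!("counter = {}", ATOMIC_COUNTER.load(Ordering::SeqCst));
    println!("threads finished = {}", LOG.lock().unwrap().len());
}
```

An atomic is enough for a single integer. The Mutex is the more general tool when the shared value is a collection or a struct.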
4. Implementing an unsafe Trait
A trait can be marked unsafe if its implementation has invariants that the compiler cannot verify. For example, Send and Sync (which mark types as safe to move to another thread or to share between threads, respectively) are unsafe traits: the compiler implements them automatically for types whose fields are all Send or Sync, but a manual implementation requires unsafe impl. If you implement Send or Sync for a type containing raw pointers, you take on the responsibility for thread safety.
unsafe trait MyUnsafeTrait {
    // ...
}
struct MyStruct;
unsafe impl MyUnsafeTrait for MyStruct {
    // ... implement methods
}
This is a rare operation, mostly for low-level system programming.
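To make the Send case concrete, here is a sketch of a type that holds a raw pointer and opts back in to Send by hand. OwnedValue is a hypothetical type invented for this example. The justification in the SAFETY comment is the part the compiler cannot check for you.

```rust
use std::thread;

/// Hypothetical wrapper that owns a heap allocation through a raw pointer.
struct OwnedValue {
    ptr: *mut i32,
}

impl OwnedValue {
    fn new(value: i32) -> Self {
        OwnedValue { ptr: Box::into_raw(Box::new(value)) }
    }

    fn get(&self) -> i32 {
        // SAFETY: `ptr` came from `Box::into_raw` and is freed only in `Drop`.
        unsafe { *self.ptr }
    }
}

impl Drop for OwnedValue {
    fn drop(&mut self) {
        // SAFETY: `ptr` was produced by `Box::into_raw` and is released exactly once.
        unsafe { drop(Box::from_raw(self.ptr)) };
    }
}

// Raw pointers are not `Send`, so the compiler does not implement it automatically.
// SAFETY: `OwnedValue` is the sole owner of its allocation, so moving it to
// another thread moves the only access path along with it.
unsafe impl Send for OwnedValue {}

fn main() {
    let value = OwnedValue::new(42);
    // Without the `unsafe impl Send` above, this `spawn` would not compile.
    let handle = thread::spawn(move || value.get());
    println!("read on another thread: {}", handle.join().unwrap());
}
```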
5. Accessing Fields of unions
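unions are similar to structs, but all fields share the same storage, so only one field holds a meaningful value at a time. Reading a union field requires unsafe because the compiler cannot know which field was written last, and reading a field whose bytes are not a valid value of that field’s type is undefined behavior. Unions are primarily used for FFI with C code.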
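A short sketch of a union, assuming a C-style layout with two 4-byte fields. The type IntOrFloat and the function float_to_bits are illustrative names.

```rust
/// Hypothetical union whose two fields occupy the same 4 bytes.
#[repr(C)]
union IntOrFloat {
    int: u32,
    float: f32,
}

/// Stores a float, then reads the same bytes back as an integer.
fn float_to_bits(x: f32) -> u32 {
    let value = IntOrFloat { float: x };
    // SAFETY: both fields are 4 bytes, and every bit pattern is a valid `u32`.
    unsafe { value.int }
}

fn main() {
    let mut value = IntOrFloat { int: 0 };
    value.float = 1.0; // Writing a `Copy` field is safe; only reads need `unsafe`.
    let bits = unsafe { value.int };
    println!("{:#x}", bits);
    assert_eq!(bits, 1.0f32.to_bits());
}
```

Reading int after writing float is sound here only because any 32 bits form a valid u32. The reverse is not true for every type: reading a bool or an enum field from arbitrary bytes would be undefined behavior.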
Foreign Function Interface (FFI)
FFI is the mechanism that allows code in one programming language to call functions implemented in another. In Rust, FFI primarily involves interacting with C code.
extern "C" Functions
To call functions written in other languages (typically C), you declare them in an extern "C" block. The "C" part specifies the Application Binary Interface (ABI) that Rust should use to call the function.
// In Rust code (src/main.rs or src/lib.rs)
#[link(name = "c_math")] // Link against the `c_math` library
extern "C" {
    // Declare a C function that adds two integers
    fn c_add(a: i32, b: i32) -> i32;
    // Declare a C function that multiplies two doubles
    fn c_multiply(a: f64, b: f64) -> f64;
    // Declare a C function that takes a C string and returns a C string
    fn c_greet(name: *const libc::c_char) -> *const libc::c_char;
}
fn main() {
    let result_add: i32;
    let result_mult: f64;
    let greeting_ptr: *const libc::c_char;
    // `c_add`, `c_multiply`, `c_greet` are `unsafe` functions because
    // Rust cannot guarantee their behavior or memory safety from the C side.
    unsafe {
        result_add = c_add(10, 20);
        result_mult = c_multiply(2.5, 3.0);
        let c_string_name = std::ffi::CString::new("Alice").unwrap();
        greeting_ptr = c_greet(c_string_name.as_ptr());
        let greeting = std::ffi::CStr::from_ptr(greeting_ptr).to_str().unwrap();
        println!("Result of c_add: {}", result_add); // Output: 30
        println!("Result of c_multiply: {}", result_mult); // Output: 7.5
        println!("Greeting from C: {}", greeting); // Output: Hello, Alice!
    }
}
To make this example runnable, you’d need a C library:
c_math.h:
// c_math.h
#include <stdint.h> // For int32_t
#include <stddef.h> // For size_t
int32_t c_add(int32_t a, int32_t b);
double c_multiply(double a, double b);
const char* c_greet(const char* name);
c_math.c:
// c_math.c
#include "c_math.h"
#include <stdio.h> // For printf, snprintf
int32_t c_add(int32_t a, int32_t b) {
    return a + b;
}
double c_multiply(double a, double b) {
    return a * b;
}
const char* c_greet(const char* name) {
    printf("C side received name: %s\n", name);
    // Be careful with memory management across FFI!
    // Returning a dynamically allocated C string is tricky, so for simplicity
    // this returns a pointer to a static buffer.
    // A proper solution would pass a buffer from Rust for C to fill, or use `libc::free` in Rust.
    static char buffer[100]; // Dangerous: not thread-safe, fixed size
    // snprintf truncates instead of overflowing when the name is too long.
    snprintf(buffer, sizeof buffer, "Hello, %s!", name);
    return buffer;
}
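The comment in c_greet mentions letting Rust pass a buffer for C to fill. One way that pattern might look from the Rust side is sketched below. To keep the sketch self-contained, the C function is simulated by a Rust extern "C" function. The name c_greet_into, its signature, and its return convention are assumptions made for this example, not part of the c_math library above.

```rust
use std::ffi::{c_char, CStr, CString};

/// Stand-in for a hypothetical C function
/// `int c_greet_into(const char *name, char *out, size_t cap)`.
/// Returns the number of bytes written (excluding the NUL), or -1 if `out` is too small.
unsafe extern "C" fn c_greet_into(name: *const c_char, out: *mut c_char, cap: usize) -> i32 {
    let name = unsafe { CStr::from_ptr(name) }.to_bytes();
    let mut msg = b"Hello, ".to_vec();
    msg.extend_from_slice(name);
    msg.push(b'!');
    if msg.len() + 1 > cap {
        return -1;
    }
    unsafe {
        std::ptr::copy_nonoverlapping(msg.as_ptr(), out as *mut u8, msg.len());
        *out.add(msg.len()) = 0; // NUL terminator
    }
    msg.len() as i32
}

/// Safe wrapper: Rust allocates the buffer, the callee fills it, Rust frees it.
fn greet(name: &str) -> Option<String> {
    let c_name = CString::new(name).ok()?; // fails on interior NUL bytes
    let mut buf = vec![0u8; 64];
    // SAFETY: `c_name` is NUL-terminated and `buf` is writable for `buf.len()` bytes.
    let written =
        unsafe { c_greet_into(c_name.as_ptr(), buf.as_mut_ptr() as *mut c_char, buf.len()) };
    if written < 0 {
        return None;
    }
    buf.truncate(written as usize);
    String::from_utf8(buf).ok()
}

fn main() {
    println!("{:?}", greet("Alice"));
    println!("{:?}", greet(&"x".repeat(100)));
}
```

With this design there is no static buffer and no question about who frees the memory: the Vec is owned by Rust for its whole life, and the callee only borrows it for the duration of the call.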
Compilation steps (Linux/macOS):
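- Compile C code into a static library:
gcc -c c_math.c -o c_math.o
ar rcs libc_math.a c_math.o
- Place libc_math.a somewhere the linker will look (for example, a directory in your project). If the linker cannot find it, add that directory to the search path, e.g., with a build script that prints cargo:rustc-link-search=native=<dir> or by passing -L <dir> through RUSTFLAGS.
- Add #[link(name = "c_math", kind = "static")] to your Rust code if it’s a static library. If you create a shared library (.so or .dylib), you’d typically omit kind and ensure the library is on LD_LIBRARY_PATH (Linux) or DYLD_LIBRARY_PATH (macOS) at run time.
- Run your Rust code.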
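One way to tell Cargo where libc_math.a lives is a build script. The sketch below assumes the archive sits in a directory called native next to Cargo.toml. That directory name and the helper link_directives are assumptions for illustration. The cargo: directives it prints are standard Cargo build-script output.

```rust
// build.rs (a sketch; the `native` directory name is an assumption)
use std::env;
use std::path::{Path, PathBuf};

/// Builds the lines Cargo reads from a build script's standard output.
fn link_directives(lib_dir: &Path) -> Vec<String> {
    vec![
        // Add `lib_dir` to the native library search path.
        format!("cargo:rustc-link-search=native={}", lib_dir.display()),
        // Link `libc_math.a` statically.
        "cargo:rustc-link-lib=static=c_math".to_string(),
    ]
}

fn main() {
    // Cargo sets CARGO_MANIFEST_DIR when it runs a build script.
    let manifest_dir = env::var("CARGO_MANIFEST_DIR").unwrap_or_else(|_| ".".to_string());
    let lib_dir = PathBuf::from(manifest_dir).join("native");
    for line in link_directives(&lib_dir) {
        println!("{line}");
    }
}
```

When the build script emits rustc-link-lib, the #[link] attribute on the extern block becomes optional, since both express the same linking request.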
Exposing Rust Functions to C
You can also make Rust functions callable from C.
- #[no_mangle]: Prevents the Rust compiler from “mangling” the function name, ensuring it has a predictable name in the compiled output.
- extern "C": Makes the function follow the C calling convention and ABI.
// In src/lib.rs
#[no_mangle]
pub extern "C" fn rust_hello() {
    println!("Hello from Rust!");
}
#[no_mangle]
pub extern "C" fn rust_add_one(x: i32) -> i32 {
    x + 1
}
To use this from C:
main.c (C program that calls Rust functions):
#include <stdio.h>
#include <stdint.h> // For int32_t
// Declare the Rust functions
extern void rust_hello();
extern int32_t rust_add_one(int32_t x);
int main() {
    rust_hello(); // Call Rust function
    int32_t result = rust_add_one(5);
    printf("Rust added one: %d\n", result); // Output: 6
    return 0;
}
Compilation steps (Linux/macOS):
- Compile the Rust library. Cargo only produces a shared library that C can load when Cargo.toml sets crate-type = ["cdylib"] under [lib]:
cargo build --release # Compiles src/lib.rs into target/release/libyour_project_name.so (or .dylib)
- Compile the C code, linking against the Rust library:
gcc main.c -L target/release -l your_project_name -o c_caller # -L specifies library search path, -l specifies library name (without lib prefix or .so/.dylib)
- Run the C executable:
LD_LIBRARY_PATH=target/release ./c_caller # Linux
DYLD_LIBRARY_PATH=target/release ./c_caller # macOS
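Exported functions that accept pointers from C cannot trust their arguments. A hedged sketch of validating input at the boundary follows. rust_sum is a hypothetical function whose C declaration would be int64_t rust_sum(const int32_t *ptr, size_t len). The #[no_mangle] attribute is left out because this sketch only calls the function from Rust. Add it when exporting from a library.

```rust
/// Sums `len` integers starting at `ptr`; returns 0 when `ptr` is null.
///
/// # Safety
/// If `ptr` is non-null, it must point to `len` readable, initialized `i32` values.
pub unsafe extern "C" fn rust_sum(ptr: *const i32, len: usize) -> i64 {
    if ptr.is_null() {
        return 0; // C callers may pass NULL; never dereference it.
    }
    // SAFETY: non-null was checked above; the length is the caller's guarantee.
    let values = unsafe { std::slice::from_raw_parts(ptr, len) };
    // Widen to i64 so the sum of many i32 values does not overflow.
    values.iter().map(|&v| i64::from(v)).sum()
}

fn main() {
    let data = [1, 2, 3];
    let total = unsafe { rust_sum(data.as_ptr(), data.len()) };
    let from_null = unsafe { rust_sum(std::ptr::null(), 0) };
    println!("total = {total}, from null = {from_null}");
}
```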
Best Practices for FFI
- Be extremely careful: FFI is the most common place to introduce memory unsafety into Rust programs.
- Encapsulate unsafe: Wrap unsafe FFI calls in safe Rust functions, providing a safe API to the rest of your application.
- Types and ABI: Ensure Rust types correctly map to C types. Use the C type aliases from std::ffi or the libc crate (c_char, c_int, etc.), and mark structs shared with C as #[repr(C)] so that layout, alignment, and padding match.
- Memory Management: This is the biggest pitfall. Who allocates memory, and who deallocates it? Memory must be released by the same allocator that produced it.
  - Rust allocates, Rust deallocates: Pass a Rust-owned pointer to C for read-only access.
  - C allocates, Rust triggers deallocation: Requires C to provide a free function (or you call libc::free from Rust when the memory came from malloc).
  - Rust allocates, C deallocates: Dangerous and usually discouraged. C’s free must never be called on memory from Rust’s allocator, so hand the pointer back to a Rust function that releases it.
  - Pass pointers, not references or slices: For data owned by Rust that C just needs to peek at, pass a *const T obtained from &T. For a slice, pass as_ptr() and len() as two separate arguments, because &[T] itself has no C-compatible layout.
- C Strings: Rust’s String and &str are not C-compatible. Use std::ffi::CString (Rust to C) and std::ffi::CStr (C to Rust) for null-terminated C strings.
- Error Handling: C functions often return error codes. Map these to Rust’s Result enum.
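Several of these practices can be combined in one safe wrapper: convert the string with CString, keep the unsafe call in one place, and translate the C-style status code into a Result. In the sketch below the C function is simulated in Rust so the example runs on its own. c_parse_port, its status codes, and PortError are all hypothetical.

```rust
use std::ffi::{c_char, c_int, CStr, CString};

/// Stand-in for a hypothetical C function `int c_parse_port(const char *text, int *out)`.
/// Status codes (assumed for this sketch): 0 = ok, 1 = not a number, 2 = out of range.
unsafe extern "C" fn c_parse_port(text: *const c_char, out: *mut c_int) -> c_int {
    let text = unsafe { CStr::from_ptr(text) };
    let parsed = text.to_str().ok().and_then(|s| s.parse::<i64>().ok());
    match parsed {
        None => 1,
        Some(n) if !(1..=65535).contains(&n) => 2,
        Some(n) => {
            unsafe { *out = n as c_int };
            0
        }
    }
}

#[derive(Debug, PartialEq)]
enum PortError {
    InteriorNul,
    NotANumber,
    OutOfRange,
    Unknown(c_int),
}

/// Safe API: callers never see raw pointers or integer status codes.
fn parse_port(text: &str) -> Result<u16, PortError> {
    let c_text = CString::new(text).map_err(|_| PortError::InteriorNul)?;
    let mut out: c_int = 0;
    // SAFETY: `c_text` is NUL-terminated and outlives the call; `out` is a valid, writable int.
    let code = unsafe { c_parse_port(c_text.as_ptr(), &mut out) };
    match code {
        0 => Ok(out as u16),
        1 => Err(PortError::NotANumber),
        2 => Err(PortError::OutOfRange),
        other => Err(PortError::Unknown(other)),
    }
}

fn main() {
    println!("{:?}", parse_port("8080"));
    println!("{:?}", parse_port("abc"));
    println!("{:?}", parse_port("70000"));
}
```

The catch-all Unknown variant matters in practice: a C library may add status codes later, and the wrapper should not silently treat them as success.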
unsafe Rust and FFI are powerful tools for specific, low-level tasks, but they require a deep understanding of memory management and careful application to maintain the overall safety of your Rust codebase.
Exercises / Mini-Challenges
Exercise 10.1: Unsafe Array Access
Write a function get_unchecked_element(arr: &[i32], index: usize) -> i32 that uses unsafe to access an array element without bounds checking.
WARNING: This function is inherently unsafe. Only call it with valid indices.
Instructions:
- Define the get_unchecked_element function.
- Inside, obtain a raw pointer to the slice’s first element (as_ptr()) and use add() to get the pointer to the desired element.
- Dereference the raw pointer inside an unsafe block.
- In main, call this function with a valid index and print the result.
- (Optional, for demonstration only) Call it with an invalid index. This is undefined behavior rather than a guaranteed panic: the program may crash, print garbage, or appear to work. Comment out this line after observing!
// Solution Hint:
/*
fn get_unchecked_element(arr: &[i32], index: usize) -> i32 {
    // The caller must guarantee index < arr.len(); nothing here checks it.
    unsafe {
        let ptr = arr.as_ptr(); // Get a raw pointer to the start of the slice
        *ptr.add(index) // Calculate pointer to desired index and dereference
    }
}
fn main() {
    let numbers = [10, 20, 30, 40, 50];
    // Safe usage
    let value = get_unchecked_element(&numbers, 2);
    println!("Element at index 2: {}", value); // Output: 30
    // DANGEROUS: This is undefined behavior if you uncomment and run it.
    // It may crash (e.g., segmentation fault), print garbage, or appear to work.
    // let invalid_value = get_unchecked_element(&numbers, 10);
    // println!("Element at invalid index 10: {}", invalid_value);
}
*/
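The exercise's signature is a safe fn that can trigger undefined behavior, which is fine for practice but not how such a function is usually exposed. A more conventional shape, sketched here as one possible design, marks the unchecked function unsafe with a documented contract and offers a checked wrapper. get_element is a hypothetical name.

```rust
/// Returns `arr[index]` without a bounds check.
///
/// # Safety
/// `index` must be less than `arr.len()`.
unsafe fn get_unchecked_element(arr: &[i32], index: usize) -> i32 {
    // Catches contract violations in debug builds only.
    debug_assert!(index < arr.len(), "index out of bounds");
    // SAFETY: the caller guarantees `index < arr.len()`.
    unsafe { *arr.as_ptr().add(index) }
}

/// Checked wrapper: safe to call with any index.
fn get_element(arr: &[i32], index: usize) -> Option<i32> {
    if index < arr.len() {
        // SAFETY: the bound was checked on the line above.
        Some(unsafe { get_unchecked_element(arr, index) })
    } else {
        None
    }
}

fn main() {
    let numbers = [10, 20, 30, 40, 50];
    println!("{:?}", get_element(&numbers, 2));
    println!("{:?}", get_element(&numbers, 10));
}
```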
Exercise 10.2: Simple FFI with a C Function (Simulated)
Simulate calling a C function c_power(base: i32, exp: i32) -> i32 from Rust that calculates base raised to the power of exp.
Instead of actually linking to a C library, define a Rust unsafe extern "C" block for c_power and then provide a stub implementation within Rust itself to make the code compile and run for demonstration purposes.
Instructions:
- Add libc to your Cargo.toml [dependencies] section if it’s not already there (it provides C type definitions): cargo add libc
- Declare the c_power function inside an extern "C" block in main.rs.
- Provide a Rust implementation for c_power as a separate fn (this simulates the C function). Make sure this simulated function is also extern "C" and has #[no_mangle], and put it in its own module so its name does not collide with the declaration.
- In main, call the unsafe c_power function with some values and print the result.
// Solution Hint:
/*
use libc::c_int; // For C integer type
// Declare the C function
extern "C" {
    fn c_power(base: c_int, exp: c_int) -> c_int;
}
// --- THIS PART SIMULATES THE C CODE ---
// In a real scenario, this would be in a `.c` file and compiled as a library.
// It lives in its own module because a declaration and a definition named
// `c_power` cannot share one Rust scope; the linker matches them by symbol name.
mod simulated_c {
    use libc::c_int;
    #[no_mangle]
    pub extern "C" fn c_power(base: c_int, exp: c_int) -> c_int {
        if exp < 0 {
            return 0; // Simplified error handling
        }
        let mut res = 1;
        for _ in 0..exp {
            res *= base;
        }
        res
    }
}
// --- END OF SIMULATED C CODE ---
fn main() {
    let base = 2;
    let exp = 5;
    // Bind each result separately: a non-`mut` variable cannot be assigned twice.
    let result: c_int = unsafe { c_power(base, exp) };
    println!("{} to the power of {} is: {}", base, exp, result); // Output: 32
    let base2 = 3;
    let exp2 = 4;
    let result2: c_int = unsafe { c_power(base2, exp2) };
    println!("{} to the power of {} is: {}", base2, exp2, result2); // Output: 81
}
*/