Theory and Design of PL (CS 538)
April 29, 2020
If you want to know more, talk to Mark!
unsafe
Defines allowed, disallowed, and unspecified behaviors.
null pointer
bool that is not true or false
a = f(b) + g(c): f or g?
there are no restrictions on the behavior of the program.
Compilers are not required to diagnose undefined behavior (although many simple situations are diagnosed),
and the compiled program is not required to do anything meaningful.
ISO C++ forbids mutating string literals (ISO C++ §2.13.4p2)
Dereferencing an invalid pointer is forbidden (ISO C §6.5.3.2p4)
fn pointers
bool that isn’t true or false
char outside the ranges [0x0, 0xD7FF] and [0xE000, 0x10FFFF]
str
fn do_the_foo_thing() {
    let foo1 = Arc::new(Mutex::new(Foo(None)));
    let foo2 = Arc::new(Mutex::new(Foo(None)));
    // Reference cycle
    foo1.lock().unwrap().0 = Some(Arc::clone(&foo2));
    foo2.lock().unwrap().0 = Some(Arc::clone(&foo1));
    // `foo1` and `foo2` are never dropped!
    // Memory never freed. Foo::drop never called. No UB!
}
MAX_INT + 1)
In my program (Rust):
/// Read from file `fd` into buffer `buf`.
fn read_file(fd: i32, buf: &mut [u8]) {
    let len = buf.len();
    // `libc::read` takes a `*mut c_void`, so the `*mut u8` is cast.
    libc::read(fd, buf.as_mut_ptr() as *mut libc::c_void, len);
}
In libc (C):
Compiler error: no unsafe C from safe Rust!
/// Read from the file descriptor into the buffer.
fn read_file(fd: i32, buf: &mut [u8]) {
    let len = buf.len();
    // Compile error! `libc::read` is an unsafe fn called outside `unsafe`.
    libc::read(fd, buf.as_mut_ptr() as *mut libc::c_void, len);
}
Ok, but how do we call C libraries or the OS?
unsafe
Compiler can’t check these: Be careful!
/// Read from the file descriptor into the buffer.
fn read_file(fd: i32, buf: &mut [u8]) {
    let len = buf.len();
    unsafe {
        // `libc::read` takes a `*mut c_void`, so the `*mut u8` is cast.
        libc::read(fd, buf.as_mut_ptr() as *mut libc::c_void, len);
    }
}
Rust compiles, but C code may do something bad: Be careful!
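One common way to keep the "be careful" part small is to wrap the foreign call in a safe function that derives the pointer and the length from the same slice. The sketch below is an illustration only: `fake_read` is a hypothetical stand-in written in Rust with the C ABI (it is not `libc::read`), and `read_into` is an assumed wrapper name.

```rust
use std::ptr;

/// Hypothetical stand-in for a C function such as `read`: writes up to
/// `len` bytes to `dst` and returns the number of bytes written.
///
/// # Safety
/// `dst` must be valid for writes of `len` bytes.
unsafe extern "C" fn fake_read(dst: *mut u8, len: usize) -> isize {
    const DATA: &[u8] = b"Hello, world!";
    let n = if DATA.len() < len { DATA.len() } else { len };
    // SAFETY: the caller guarantees `dst` is valid for `len >= n` bytes,
    // and `DATA` is a separate constant, so the ranges cannot overlap.
    unsafe { ptr::copy_nonoverlapping(DATA.as_ptr(), dst, n) };
    n as isize
}

/// Safe wrapper: pointer and length come from the same slice, so a
/// caller cannot pass a mismatched pair.
fn read_into(buf: &mut [u8]) -> usize {
    // SAFETY: `buf.as_mut_ptr()` is valid for writes of `buf.len()` bytes.
    let n = unsafe { fake_read(buf.as_mut_ptr(), buf.len()) };
    // In the C convention a negative return value signals an error.
    if n < 0 { 0 } else { n as usize }
}
```

Callers of `read_into` never write `unsafe`; the single `unsafe` block inside it is the only place where the buffer-size promise has to be checked by hand.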
What does unsafe mean?
unsafe blocks”
unsafe blocks”
fn main() {
    let mut my_vec = Vec::with_capacity(0); // empty vector
    my_vec.set_len(100);
    my_vec[30] = 0; // UB!
}
Huh?!? UB in safe Rust? How?
unsafe fn
impl Vec {
    /// Sets the length of the vector to `new_len`.
    pub unsafe fn set_len(&mut self, new_len: usize) {
        self.len = new_len;
    }
}
Can only be called in an unsafe block!
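For contrast, here is a minimal sketch of a use of `set_len` that keeps its promise: reserve capacity, initialize the elements through a raw pointer, and only then raise the length. The function name `first_n` is an assumption made for illustration.

```rust
/// Builds a vector holding `0, 1, ..., n - 1` by writing into spare
/// capacity first and raising the length afterwards.
fn first_n(n: usize) -> Vec<u32> {
    let mut v: Vec<u32> = Vec::with_capacity(n);
    let p = v.as_mut_ptr();
    for i in 0..n {
        // SAFETY: `i < n <= v.capacity()`, so the write stays inside
        // the allocation.
        unsafe { p.add(i).write(i as u32) };
    }
    // SAFETY: `n <= v.capacity()` and the first `n` elements have just
    // been initialized.
    unsafe { v.set_len(n) };
    v
}
```

Each `// SAFETY:` comment records why the caller's side of the contract holds at that point.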
But why is it possible in the first place?
bool is always true or false
With unsafe, breaking program invariants can break language invariants, leading to UB
Language invariant: no accesses to invalid memory
Program invariant: len is no longer than buf
Bad use of Vec::set_len violates program invariant => access memory out of bounds == UB.
Not sufficient to just look in unsafe blocks!
unsafe: someone promises to uphold invariants!
“Promise” is called a proof obligation.
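A small sketch of how a program invariant and a proof obligation fit together. The type `SmallStack` is hypothetical and not part of any library. Its `unsafe` block is sound only because the safe method `push` keeps `len` within bounds.

```rust
/// A fixed-capacity stack of bytes (hypothetical example type).
/// Program invariant: `len <= data.len()`.
pub struct SmallStack {
    data: [u8; 4],
    len: usize,
}

impl SmallStack {
    pub fn new() -> Self {
        SmallStack { data: [0; 4], len: 0 }
    }

    /// Entirely safe code, yet it must uphold the invariant:
    /// it refuses to grow past the capacity.
    pub fn push(&mut self, byte: u8) -> bool {
        if self.len == self.data.len() {
            return false;
        }
        self.data[self.len] = byte;
        self.len += 1;
        true
    }

    /// Returns the top element without a bounds check.
    pub fn top(&self) -> Option<u8> {
        if self.len == 0 {
            return None;
        }
        // SAFETY: the invariant gives `1 <= len <= data.len()`,
        // so `len - 1` is in bounds.
        Some(unsafe { *self.data.get_unchecked(self.len - 1) })
    }
}
```

If `push` dropped its capacity check, `top` could read out of bounds even though no `unsafe` block changed. Because the fields are private, only code in this module can break the invariant.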
unsafe
unsafe { ... } blocks
unsafe fn
unsafe
unsafe trait and unsafe impl
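The last form is the least familiar, so here is a minimal sketch. The trait `Zeroable` and the function `zeroed` are hypothetical names chosen for illustration. The `unsafe impl` is where the implementor makes the promise, and the trait bound lets safe callers rely on it.

```rust
/// Hypothetical marker trait.
///
/// # Safety
/// Implementors promise that the all-zero bit pattern is a valid
/// value of `Self`.
pub unsafe trait Zeroable: Sized {}

// SAFETY: zero is a valid `u32`, and four zero bytes are a valid `[u8; 4]`.
unsafe impl Zeroable for u32 {}
unsafe impl Zeroable for [u8; 4] {}

/// Safe to call: the `Zeroable` bound carries the proof obligation.
pub fn zeroed<T: Zeroable>() -> T {
    // SAFETY: `T: Zeroable` guarantees all-zero bytes form a valid `T`.
    unsafe { std::mem::zeroed() }
}
```

Writing `unsafe impl Zeroable for &u8 {}` would also compile, but it would break the promise (a reference is never null), and `zeroed::<&u8>()` would then be UB.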
Idea: Abstraction hides unsafe
unsafe to expose dangerous interfaces
Vec
Using only safe methods of Vec, it is impossible to cause UB, even though Vec uses unsafe internally.
Vec all uphold invariants.
unsafe (e.g. set_len)
use std::fs::File;
use std::io::{BufReader, Read};

fn main() -> std::io::Result<()> {
    // Open: call libc and OS. Safely!
    let file = File::open("foo.txt")?;
    let mut buf_reader = BufReader::new(file);
    let mut contents = String::new();
    // Read: call libc and OS. Safely!
    buf_reader.read_to_string(&mut contents)?;
    assert_eq!(contents, "Hello, world!");
    Ok(())
    // Close: call libc and OS. Safely!
}
File, BufReader are safe abstractions that uphold invariants about files, memory, etc.
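User code can build safe abstractions the same way. The sketch below follows the well-known `split_at_mut` pattern. The name `split_halves` is hypothetical, since the standard library already provides `split_at_mut`. The bounds check in safe code is what makes the `unsafe` block sound.

```rust
use std::slice;

/// Splits one mutable slice into two non-overlapping parts at `mid`.
/// Panics if `mid > s.len()`.
pub fn split_halves(s: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = s.len();
    let ptr = s.as_mut_ptr();
    // Checked in safe code before any raw pointer is used.
    assert!(mid <= len, "mid out of bounds");
    // SAFETY: `[0, mid)` and `[mid, len)` both lie inside `s` and do not
    // overlap, so the two mutable slices never alias.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

The borrow checker cannot see that the two halves are disjoint, which is why the body needs `unsafe`. The signature still gives callers ordinary, fully checked `&mut` slices.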
#[repr(C)] and #[repr(packed)]
Vec
*const T and *mut T
unsafe to dereference
std::ptr
NonNull
impl Vec
impl RawVec
impl Vec
pub fn push(&mut self, value: T) {
    // Are we out of space?
    if self.len == self.buf.cap {
        self.buf.double(); // alloc more space
    }
    // put the element in the `Vec`
    unsafe {
        // compute address of end of buffer
        // (`add` takes a `usize`; `offset` would need an `isize`)
        let end = self.buf.ptr.add(self.len);
        ptr::write(end, value); // write data to raw pointer
        self.len += 1; // increase length
    }
}
impl RawVec
unsafe tools
#[repr(...)]
extern fn
Strings, variadic fns (e.g. printf), extern types
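A minimal sketch of three of these tools together: a `#[repr(C)]` struct, functions with the C ABI, and conversion between Rust and C strings. The names `Point`, `point_sum`, `c_str_len`, and `len_via_c` are hypothetical. Variadic functions and extern types are not covered here.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// `#[repr(C)]` fixes field order and layout to match a C struct.
#[repr(C)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

/// A function with the C ABI.
///
/// # Safety
/// `p` must point to a valid `Point`.
pub unsafe extern "C" fn point_sum(p: *const Point) -> i32 {
    // SAFETY: the caller guarantees `p` is valid for reads.
    let p = unsafe { &*p };
    p.x + p.y
}

/// Counts the bytes of a NUL-terminated C string.
///
/// # Safety
/// `s` must point to a valid NUL-terminated string.
pub unsafe extern "C" fn c_str_len(s: *const c_char) -> usize {
    // SAFETY: the caller guarantees `s` is NUL-terminated.
    let c = unsafe { CStr::from_ptr(s) };
    c.to_bytes().len()
}

/// Safe helper: converts a Rust `&str` into a C string first.
/// Returns `None` if `s` contains an interior NUL byte.
pub fn len_via_c(s: &str) -> Option<usize> {
    let c = CString::new(s).ok()?;
    // SAFETY: `c` is NUL-terminated and outlives the call.
    Some(unsafe { c_str_len(c.as_ptr()) })
}
```

Rust strings carry a length and may contain NUL bytes, while C strings end at the first NUL. That mismatch is why `CString::new` can fail and why the helper returns an `Option`.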