Skip to content
Snippets Groups Projects
Commit 344ab33b authored by Jasper Clemens Gräflich's avatar Jasper Clemens Gräflich
Browse files

Complete 1.1.2, 1.1.3, and 1.1.4

Also update to \inline.
parent 0bdb032f
No related branches found
No related tags found
No related merge requests found
Pipeline #86557 passed
......@@ -5,7 +5,7 @@
\subsection{Ownership}
In most programming languages, one can freely create and delete resources. For example, in C the programmer allocates memory on the heap with \lstinline{malloc} and frees the memory when they don’t need the memory anymore using \lstinline{free}.
In most programming languages, one can freely create and delete resources. For example, in C the programmer allocates memory on the heap with \inline{malloc} and frees the memory when they don’t need the memory anymore using \inline{free}.
\begin{lstlisting}[language=C, caption={Memory management in C}]
// Allocate space for an `int`.
......@@ -44,9 +44,9 @@ with open("path/to/some.file", "r") as file:
This code is equivalent to the previous one, but even better: It also closes the file in case of an exception. Programmers can also define custom context managers for their own resources.
Other languages have different strategies. Go supports the \lstinline{defer} statement which takes an operation that is not to be run immediately but at the end of the scope.
Other languages have different strategies. Go supports the \inline{defer} statement which takes an operation that is not to be run immediately but at the end of the scope.
\begin{lstlisting}[language=Go, caption={The \lstinline{defer} statement in Go}]
\begin{lstlisting}[language=Go, caption={The \inline{defer} statement in Go}]
{ // Block starts here.
file, err := os.Open("path/to/some.file")
defer file.Close()
......@@ -61,7 +61,7 @@ in the end and changes in one line of the code have to be mirrored in the other.
\subsubsection{\acs{RAII} and Drop Responsibilities}
A third variant is \acfi{RAII}, also called \acfi{SBRM}, used by C++ and Rust. This ensures that the destructor of an object is run when it goes out of scope, which will clean up automatically. An example is the \lstinline{std::unique_ptr} in C++. The pointer \emph{owns} the memory it points to, meaning when it is created, it automatically allocates memory on the heap and when it goes out of scope, it frees the memory.
A third variant is \acfi{RAII}, also called \acfi{SBRM}, used by C++ and Rust. This ensures that the destructor of an object is run when it goes out of scope, which will clean up automatically. An example is the \inline{std::unique_ptr} in C++. The pointer \emph{owns} the memory it points to, meaning when it is created, it automatically allocates memory on the heap and when it goes out of scope, it frees the memory.
\begin{lstlisting}[language=C++, caption={\ac{RAII} in C++}]
{ // Block starts here.
......@@ -72,7 +72,7 @@ A third variant is \acfi{RAII}, also called \acfi{SBRM}, used by C++ and Rust. T
} // Block ends here; run the destructor and free the memory
\end{lstlisting}
\ac{RAII} ensures that resources are always freed when they are not live (can’t be accessed) anymore. The programmer can’t forget to free the resources nor cause use after free or double free. While in C++ only objects conform to \ac{RAII}, in Rust every value is owned by a variable. For example, the Rust analog to an owning pointer is \lstinline{Box}.
\ac{RAII} ensures that resources are always freed when they are not live (may not be used) anymore. The programmer can’t forget to free the resources nor cause use after free or double free. While in C++ only objects conform to \ac{RAII}, in Rust every value is owned by a variable. For example, the Rust analog to an owning pointer is \inline{Box}.
\begin{lstlisting}[language=Rust,caption={Heap allocation in Rust}]
{ // Block starts here.
......@@ -84,7 +84,7 @@ A third variant is \acfi{RAII}, also called \acfi{SBRM}, used by C++ and Rust. T
\end{lstlisting}
As soon as a variable falls out of scope, it is \emph{dropped}, meaning all
associated resources are freed. It is said that a variable has a \emph{drop responsibility}. A programmer can define custom drop behavior for their own resources by implementing the \lstinline{Drop} trait for their type, but normally the compiler does it for us.
associated resources are freed. It is said that a variable has a \emph{drop responsibility}. A programmer can define custom drop behavior for their own resources by implementing the \inline{Drop} trait for their type, but normally the compiler does it for us.
\subsubsection{Move Semantics}
......@@ -110,13 +110,13 @@ We create a pointer and use it as an argument to a function that prints the pont
error[E0382]: use of moved value: `*ptr`
\end{lstlisting}
Because we supplied \lstinline{ptr} as an argument to \lstinline{print_value}, it took ownership of that value, and now has the drop responsibility. As soon as the call to \lstinline{print_value} is finished, the memory is freed and accessing it afterwards is a use after free. We say that the value was \emph{moved} out of \lstinline{ptr} and into \lstinline{print_value}.
Because we supplied \inline{ptr} as an argument to \inline{print_value}, it took ownership of that value, and now has the drop responsibility. As soon as the call to \inline{print_value} is finished, the memory is freed and accessing it afterwards is a use after free. We say that the value was \emph{moved} out of \inline{ptr} and into \inline{print_value}.
\subsubsection{Clone}
We can still make it work by \emph{cloning} \lstinline{ptr} and supply the clone as an argument. \lstinline{ptr.clone()} creates a new \lstinline{Box}, allocates new memory, and initializes it with the same value as \lstinline{ptr}’s:
We can still make it work by \emph{cloning} \inline{ptr} and supply the clone as an argument. \inline{ptr.clone()} creates a new \inline{Box}, allocates new memory, and initializes it with the same value as \inline{ptr}’s:
\begin{lstlisting}[language=Rust, caption={Usage of \lstinline{Clone}}]
\begin{lstlisting}[language=Rust, caption={Usage of \inline{Clone}}]
fn print_value(ptr: Box<i32>) {
println!("{}", *ptr);
}
......@@ -130,13 +130,13 @@ fn main() {
}
\end{lstlisting}
Now everything works but keep in mind that the clone is completely independent from the original value. If we changed the value of the clone in \lstinline{print_value}, it would not be visible to the outside.
Now everything works but keep in mind that the clone is completely independent from the original value. If we changed the value of the clone in \inline{print_value}, it would not be visible to the outside.
\subsubsection{Copy}
We previously noted that ownership, and therefore move semantics, apply to every value in Rust. But if we change the code to use an \lstinline{i32} directly instead of putting it on the heap, we don’t need to clone anything:
We previously noted that ownership, and therefore move semantics, apply to every value in Rust. But if we change the code to use an \inline{i32} directly instead of putting it on the heap, we don’t need to clone anything:
\begin{lstlisting}[language=Rust, caption={Number types are \lstinline{Copy}}]
\begin{lstlisting}[language=Rust, caption={Number types are \inline{Copy}}]
fn print_value(number: i32) {
println!("{}", number);
}
......@@ -150,18 +150,18 @@ fn main() {
}
\end{lstlisting}
This compiles and works. That is not because \lstinline{i32} breaks move semantics but because the type implements the \lstinline{Copy} trait. Normally, if a value is moved, the physical bits making up that value are moved to a new location, e. g. the new stack frame, and are not available at the previous position anymore. If a type implements \lstinline{Copy}, the bit pattern is instead copied over and retained. Because \lstinline{i32} doesn’t allocate any heap or handles any resources, such a copy is valid. A \lstinline{Box} is not \lstinline{Copy}, because then two owners for the same resource would exist which would violate drop responsibility. If we want to duplicate a \lstinline{Box}, we need to allocate new memory on the heap and initialize it properly, which is what \lstinline{clone} does.
This compiles and works. That is not because \inline{i32} breaks move semantics but because the type implements the \inline{Copy} trait. Normally, if a value is moved, the physical bits making up that value are moved to a new location, e. g. the new stack frame, and are not available at the previous position anymore. If a type implements \inline{Copy}, the bit pattern is instead copied over and retained. Because \inline{i32} doesn’t allocate any heap or handles any resources, such a copy is valid. A \inline{Box} is not \inline{Copy}, because then two owners for the same resource would exist which would violate drop responsibility. If we want to duplicate a \inline{Box}, we need to allocate new memory on the heap and initialize it properly, which is what \inline{clone} does.
\subsubsection{ManuallyDrop}
Sometimes we don’t want \ac{RAII} to happen, we want to free resources ourselves. If that is the case, we can wrap a value in a \lstinline{ManuallyDrop} which tells the compiler to not drop it for us. This is a general trend in Rust: The correct way should be the easiest to do, and all potentially unsafe constructs are opt-in. More information on \lstinline{ManuallyDrop} can be found in the documentation for the type.\footnote{\url{https://doc.rust-lang.org/stable/std/mem/struct.ManuallyDrop.html}}
Sometimes we don’t want \ac{RAII} to happen, we want to free resources ourselves. If that is the case, we can wrap a value in a \inline{ManuallyDrop} which tells the compiler to not drop it for us. This is a general trend in Rust: The correct way should be the easiest to do, and all potentially unsafe constructs are opt-in. More information on \inline{ManuallyDrop} can be found in the documentation for the type.\footnote{\url{https://doc.rust-lang.org/stable/std/mem/struct.ManuallyDrop.html}}
\subsection{Borrowing}
As we have seen, ownership and move semantics ensure memory safety. But they are not easy to use, \lstinline{clone}ing data has a runtime overhead, and many correct programs are rejected. For example, \autoref{lst:ownership-transfer} describes a program that would work if it were not for move semantics.
As we have seen, ownership and move semantics ensure memory safety. But they are not easy to use, \inline{clone}ing data has a runtime overhead, and many correct programs are rejected. For example, \autoref{lst:ownership-transfer} describes a program that would work if it were not for move semantics.
That is where references and borrowing come in. Instead of taking ownership of a value, a function can only borrow it through a reference. Then the drop responsibility stays with the caller. References, of course, can not be used for everything, but for our case it is sufficcient. We mark the argument to \lstinline{print_value} as a reference using \lstinline{&}, and creating a reference from a value works the same.
That is where references and borrowing come in. Instead of taking ownership of a value, a function can only borrow it through a reference. Then the drop responsibility stays with the caller. References, of course, can not be used for everything, but for our case it is sufficcient. We mark the argument to \inline{print_value} as a reference using \inline{&}, and creating a reference from a value works the same.
\begin{lstlisting}[language=Rust, caption={Borrow using references to prevent a move}, label={lst:borrow}]
fn print_value(ptr: &Box<i32>) {
......@@ -177,7 +177,7 @@ fn main() {
}
\end{lstlisting}
\autoref{lst:borrow} compiles without an error and does what we would expect. There are two kinds of references. We just looked at \emph{shared} or \emph{immutable references}. The other kind is \emph{unique} or \emph{mutable references}, and they are denoted with \lstinline[language=Rust]{&mut}. The different kinds of references have a different semantics attached to them. While both are used to access values without taking ownership, there are specific rules for the creation and guarantees associated with each:
\autoref{lst:borrow} compiles without an error and does what we would expect. There are two kinds of references. We just looked at \emph{shared} or \emph{immutable references}. The other kind is \emph{unique} or \emph{mutable references}, and they are denoted with \inline{&mut}. The different kinds of references have a different semantics attached to them. While both are used to access values without taking ownership, there are specific rules for the creation and guarantees associated with each:
\begin{itemize}
\item A shared reference to a value can always be created, as long as there is no unique reference to the same value. It is not possible to mutate the pontee through a shared reference (which is why it is also called immutable).
......@@ -188,12 +188,12 @@ The part of the compiler that enforces these rules is the \emph{borrow checker}.
\subsubsection{The Owner of a Borrowed Value}
References have to agree to these rules, but the owner has to as well. While shared references exist, the owner may not mutate, e. g. \lstinline{let mut x = 42; let xref = &x; x += 1} is forbidden. Similarly, the owner can’t access a value at all as long as there is a unique reference around.
References have to agree to these rules, but the owner has to as well. While shared references exist, the owner may not mutate, e. g. \inline{let mut x = 42; let xref = &x; x += 1} is forbidden. Similarly, the owner can’t access a value at all as long as there is a unique reference around.
\todo[inline]{
\textbf{This doesn’t not work, I have to think about this some more. I want something to throe Error E0507.}
\textbf{This doesn’t not work, I have to think about this some more. I want something to throw Error E0507.}
In particular, a value may not be moved, as long as there is a reference to it because that would create a dangling pointer. This means that \lstinline{let y = &x; let z = *y;} doesn’t work.
In particular, a value may not be moved, as long as there is a reference to it because that would create a dangling pointer. This means that \inline{let y = &x; let z = *y;} doesn’t work.
%\begin{lstlisting}[language=Rust, caption={Moving out of borrowed value}]
% let x = 42;
......@@ -205,26 +205,39 @@ References have to agree to these rules, but the owner has to as well. While sha
\subsubsection{Lexical Lifetimes and Lifetime Analysis}
Before we can discuss extensions to the borrow checker, let’s discuss naïve references. References are values like every other and therefore have an owner themselves. That means if we create a reference to a value, the owner can only access it after the reference has gone out of scope again.
The parts of the program in which a reference is valid is called its \emph{lifetime}. The borrow checker keeps track of all references’ lifetimes and their so-called \emph{provenances}, that is, who the original owner of the referenced value is. References are values like all others and so they are dropped at the end of the scope they are defined in. In other words: Their lifetime starts at their creation and ends at the end of the scope. This is called a \emph{lexical lifetime}. Therefore the code in \autoref{lst:lexical-lifetimes} does not compile.
\begin{lstlisting}[language=Rust, caption={Lexical Lifetimes prevent usage}, label={lst:lexical-lifetimes}]
let mut x = 0;
{
let xref = &x; // Create a reference to `x`
x += 1; // Error: Cannot mutate while a reference exists
}
let xref = &x; // `xref` is created
println!("{xref}");
x += 1; // error
} // `xref` is dropped
x += 1; // OK, no references exist anymore
\end{lstlisting}
The program in \autoref{lst:lexical-lifetimes} is rejected by our naïve borrow checker since it registers a mutation to \lstinline{x} while a reference to it still exists. Only after the block ends, the borrow is returned to the owner and it can used again. Similarly, we would not be able to create a unique reference while a shared one exists and vice versa.
The program in \autoref{lst:lexical-lifetimes} is rejected by our borrow checker since it registers a mutation to \inline{x} while a reference to it still exists. Only after the block ends, the borrow is returned to the owner and it can be used again. Similarly, we would not be able to create a unique reference while a shared one exists and vice versa.
\todo{Example for Scoped Lifetimes using regions (?)}
But we can see that the program is correct since \inline{xref} is never accessed after \inline{x} is changed and we can save the code by introducing an additional scope.
But we can see that the program is correct since \lstinline{xref} is never accessed and therefore no aliasing happens. This code could be saved by wrapping the line in an extra pair of braces but we would like to avoid it. This can be achieved with \acfi{NLL}, which are part of Rust since version 1.31.0.
\begin{lstlisting}[language=Rust, caption={More scopes make the code check}, label={lst:saving-lexical-lifetimes}]
let mut x = 0;
{
{
let xref = &x; // `xref` is created
println!("{xref}");
} // `xref` is dropped
x += 1;
}
x += 1;
\end{lstlisting}
This is tedious, though, and we would like the borrow checker to recognize patterns like this automatically. Luckily, Rust implements an improved borrow checker since version 1.31. Because it is not based on lexical scope for lifetime analysis anymore, these new lifetimes are called \acfi{NLL}. We will take a closer look at them in \autoref{subsec:non-lexical-lifetimes}.
\subsubsection{Access Guards}\todo{Put this in an info box?}
Sometimes we need access to a resource from multiple places at the same time, for example when sharing data between threads. For this, Rust provides the \lstinline{Mutex} container type. References to a mutex can be shared freely, but to change the value in the container, one has to acquire a lock, therefore making the access \emph{guarded}.\todo{Add example} While the mutex is locked, no other thread can access the data at all, a mutex lock is therefore similar to a \lstinline[language=Rust]{&mut} but its guarantees are enforced at runtime. The \lstinline{RwLock} type can give out both read and write locks which have behavior analogous to \lstinline[language=Rust]{&} and \lstinline[language=Rust]{&mut}, respectively. There are more constructs for similar use cases in the standard library, like \lstinline{Arc} and \lstinline{Cow}.
Sometimes we need access to a resource from multiple places at the same time, for example when sharing data between threads. For this, Rust provides the \inline{Mutex} container type. References to a mutex can be shared freely, but to change the value in the container, one has to acquire a lock, therefore making the access \emph{guarded}.\todo{Add example} While the mutex is locked, no other thread can access the data at all, a mutex lock is therefore similar to a \inline{&mut} but its guarantees are enforced at runtime. The \inline{RwLock} type can give out both read and write locks which have behavior analogous to \inline{&} and \inline{&mut}, respectively. There are more constructs for similar use cases in the standard library, like \inline{Arc} and \inline{Cow}.
These data structures are implemented using so-called \emph{raw pointers}. Raw pointers, like pointers in C, are not borrow checked and can therefore alias. The programmer has to check manually that no rules are violated and should provide a safe interface that hides the raw pointers from the users. \todo{Pointers to Strict Provenance, Miri, …?}
......@@ -238,25 +251,63 @@ fn to_ref(number: i32) -> &i32 {
}
\end{lstlisting}
This fails because \lstinline[language=Rust]{number} is owned by \lstinline[language=Rust]{to_ref} and is dropped as soon as the function returns. The reference would already be invalid when the caller gets access to it. But if the function takes a reference as an argument, it can pass another reference back to the caller. This is called a \emph{borrow-through} and looks like the following:
This fails because \inline{number} is owned by \inline{to_ref} and is dropped as soon as the function returns. The reference would already be invalid when the caller gets access to it. But if the function takes a reference as an argument, it can pass another reference back to the caller. This is called a \emph{borrow-through} and looks like the following:
\begin{lstlisting}[language=Rust, caption={Borrow-through}]
fn to_ref(number: &i32) -> &i32 {
&number
}
\end{lstlisting}
fn to_ref(number: &i32) -> &i32 {
&number
}
\end{lstlisting}
Now Rust can couple both references to each other. The returned reference has the same provenance as the input one, and the borrow checker can check that the input reference’s lifetime is at least as long as the returned one.
\subsubsection{Lifetime Polymorphism}
\todo[inline]{Inferred lifetimes/Lifetime elision/Lifetime Polymorphism}
Consider a function that takes two references and returns one reference itself. Imagine for example, we have two slices and want to create an iterator over all elements of both slices. Then we could try to write a function like in \autoref{lst:two-refs}. The \inline{impl Iterator<Item=&T>} is a so-called existential type. It just means that we don’t care about what type exactly it is as long as it is an iterator.
\begin{lstlisting}[language=Rust, caption={Iterator over two slices}, label={lst:two-refs}]
fn both(first: &[i32], second: &[i32])
-> impl Iterator<Item=&i32>
{
first.iter().chain(second.iter())
}
\end{lstlisting}
This does not work however.
\begin{lstlisting}
error[E0106]: missing lifetime specifier
\end{lstlisting}
The compiler is confused because it does not know which provenance and lifetime to assign to the returned references. We as programmers can see that the return value depends on both input references, so we can help Rust by providing lifetime hints.\footnote{Actually, the compiler could infer these lifetimes but it would rely on global program analysis to do so. It is also possible to infer function types, but Rust chose to always be explicit about the types and lifetimes you use and have them be explicit parts of the API of a function.}
\begin{lstlisting}[language=Rust, caption={Iterator over two slices}, label={lst:two-refs}]
fn both<'a>(first: &'a [i32], second: &'a [i32])
-> impl Iterator<Item=&'a i32>
{
first.iter().chain(second.iter())
}
\end{lstlisting}
\inline{'a} is a \emph{polymorphic lifetime parameter}. We tell Rust that the output relies on both inputs being valid for at least as long as itself. If the output relied on only one of the arguments, we would mark only one of them with a lifetime specifier. This would for example be the case if we want to search for an element in a slice and, if it exists, return a mutable reference to that element. We implement such a function in \autoref{lst:two-refs-one-dependent}.
\begin{lstlisting}[language=Rust, caption={The output lifetime depends only on one of the inputs}, label={lst:two-refs-one-dependent}]
fn find_mut<'a>(haystack: &'a mut [i32], needle: &i32)
-> Option<&'a mut i32>
{
haystack.iter_mut().find(|x| **x == *needle)
}
\end{lstlisting}
\subsection{Reference Conversions}
\subsubsection{\lstinline[language=Rust]{Deref}, \lstinline[language=Rust]{AsRef}, and \lstinline[language=Rust]{Borrow}}
Sometimes, we have a reference to one type but need a reference to another, similar type. For example, a vector of some type \inline{Vec<T>} is conceptually the same as a slice of that same type \inline{[T]}, except that a \inline{Vec} can grow and shrink and a slice can not. This means that all functions which operate on slices should also work with vectors, and in fact there is a library method \inline{Vec::as_slice} that takes a reference to a vector and provides a reference to a slice. Similarly there is \inline{String::as_str} which transforms a \inline{&String} into a \inline{&str}.
Sometimes, we have a reference to one type but need a reference to another, similar type. For example, a vector of some type \lstinline[language=Rust]{Vec<T>} is conceptually the same as a slice of that same type \lstinline[language=Rust]{[T]}, except that a \lstinline[language=Rust]{Vec} can grow and shrink and a slice can not. This means that all functions which operate on slices should also work with vectors, and in fact there is a library method \lstinline[language=Rust]{Vec::as_slice} that takes a reference to a vector and provides a reference to a slice. Similarly there is \lstinline[language=Rust]{String::as_str} which transforms a \lstinline[language=Rust]{&String} into a \lstinline[language=Rust]{&str}.
\subsubsection{AsRef}
Generally, there are many types that can act as a substitute for another. The common interface for this behavior is the \lstinline[language=Rust]{AsRef} trait. There can be many implementations of this trait for a type. for example, \lstinline[language=Rust]{String} can stand in for \lstinline[language=Rust]{str}, \lstinline[language=Rust]{[u8]}, \lstinline[language=Rust]{OsStr} and \lstinline[language=Rust]{Path}. Everytime a reference to one of these types are needed, we can use a \lstinline[language=Rust]{String} instead, if we first call \lstinline[language=Rust]{as_ref} on it:
Generally, there are many types that can act as a substitute for another. The common interface for this behavior is the \inline{AsRef} trait. There can be many implementations of this trait for a given type. for example, \inline{String} can stand in for \inline{str}, \inline{[u8]}, \inline{OsStr} and \inline{Path}. Everytime a reference to one of these types are needed, we can use a \inline{String} instead, if we first call \inline{as_ref} on it:
\begin{lstlisting}[language=Rust, caption={Use \lstinline{String} in place of \lstinline{[u8]}}]
\begin{lstlisting}[language=Rust, caption={Use \inline{String} in place of \inline{[u8]}}, label={lst:as_ref_bytes}]
fn needs_bytes(x: &[u8]) { /*...*/ }
// ...
......@@ -265,13 +316,99 @@ let s = String::from("Hello Bytes");
needs_bytes(s.as_ref());
\end{lstlisting}
\todo[inline]{\lstinline[language=Rust]{Deref}, \lstinline[language=Rust]{Borrow}, and \lstinline[language=Rust]{mut} versions}
\subsubsection{Deref}
Always calling \inline{as_ref} is a bit cumbersome, expecially if it is clear which type the target should be. For example, if we have a \inline{Box<T>} there is really only one reasonable type we want to get out of it, namely a \inline{&T}. The trait \inline{Deref} can be used for that. It specifies a single type which Rust automatically converts into if it is needed. For example, \inline{String} has \inline{str} as its Deref target which allows us to omit the \inline{as_ref} in that case and leave the conversion implicit:
\begin{lstlisting}[language=Rust, caption={Use \inline{String} in place of \inline{[u8]}}]
fn needs_str(x: &str) { /*...*/ }
// ...
let s = String::from("Hello Bytes");
needs_bytes(s);
\end{lstlisting}
We couldn’t do the same in \autoref{lst:as_ref_bytes} because \inline{[u8]} is not Deref target of String.
\subsubsection{Borrow}
Sometimes we want to express an even stronger connection between two types. For example, a \inline{HashMap} needs to take ownership of its entries but we want to be able to do a lookup even if we only have a reference. But that requires that the owned and the borrowed value behave exactly the same. This is what the \inline{Borrow} trait signals. In particular, \inline{x.borrow() == y.borrow()} if and only if \inline{x == y} and \inline{x.hash() == x.borrow().hash()}.
\subsubsection{Mutable reference conversions}
All three of the aforementioned traits work for shared references only, but their variants \inline{AsMut}, \inline{DerefMut} and \inline{BorrowMut} all take and provide an exclusive reference.
\subsection{Non-Lexical Lifetimes}\label{subsec:non-lexical-lifetimes}
In \autoref{lst:lexical-lifetimes} we saw how lexical lifetimes stood in our way and we had to manually wrap a reference in an additional block just to make our correct program pass the borrow checker. This was because the lifetime of any value is bound to its scope, at the end of which it is dropped. \acl{NLL} is a different approach to lifetime analysis. A reference is live for as long as it may be used later. The borrow checker determines the points of the program in which the reference is live by building a \acfi{CFG}. All nodes of the \ac{CFG} that can be reached from the point of creation until the last use of the reference are where it is live. Consider the program in \autoref{lst:non-lexical-lifetimes}.
\begin{lstlisting}[language=Rust, caption={\ac{NLL} example}, label={lst:non-lexical-lifetimes}]
let mut x = 0;
let xref = &x;
println!("{xref}");
x += 1;
\end{lstlisting}
Then the (simplified) \ac{CFG} looks something like \autoref{fig:cfg-non-lexical-lifetimes}. It is quite boring because the control flow is linear. The nodes with thick borders are where \inline{xref} is live. The first node is the one in which the reference is created and the second one is where it is last used. Because \inline{x} is not modified in that section of the \ac{CFG}, the borrow checker doesn’t complain.
\begin{figure}[!h]
\centering
\begin{tikzpicture}
\begin{scope}[every node/.style={draw, ellipse}]
\node (A) at (0,4.5) {\inline{let mut x = 0;}};
\node[ultra thick] (B) at (0,3) {\inline{let xref = &x;}};
\node[ultra thick] (C) at (0,1.5) {\inline{println!("\{xref\}");}};
\node (D) at (0,0) {\inline{x += 1;}};
\end{scope}
\path [->] (A) edge node[left] {} (B);
\path [->, ultra thick] (B) edge node[left] {} (C);
\path [->] (C) edge node[left] {} (D);
\end{tikzpicture}
\caption{Control-flow diagram for \autoref{lst:non-lexical-lifetimes}}
\label{fig:cfg-non-lexical-lifetimes}
\end{figure}
Consider a more complicated example that includes branching in \autoref{lst:non-lexical-lifetimes-branch}.
\begin{lstlisting}[language=Rust, caption={\ac{NLL} }, label={lst:non-lexical-lifetimes-branch}]
let mut x = 0;
let xref = &x;
if xref != &0 {
x += 1;
println!("xref equals {xref}");
} else {
println!("xref is 0");
}
x = 42;
\end{lstlisting}
Now the \ac{CFG} splits up to accomodate both possible paths the program could take during execution. At the \inline{if}, the graph splits into two, but in the last node, both paths join up again. Here we can see that in the \inline{false} branch no problem occurs, but in the \inline{true} branch we try to modify \inline{x} even though \inline{xref} is used later in the same path. Modifying \inline{x} on the last line is okay, though.
\begin{figure}[!h]
\centering
\begin{tikzpicture}
\begin{scope}[every node/.style={draw, rectangle}]
\node (A) at (0,7.5) {\inline{let mut x = 0;}};
\node[ultra thick] (B) at (0,6) {\inline{let xref = &x;}};
\node[ultra thick] (C) at (0,4.5) {\inline{if xref != &0 }};
\node[ultra thick, draw=red] (D) at (-3,3) {\inline{x += 1;}};
\node[ultra thick] (E) at (-3,1.5) {\inline{println!("xref equals \{xref\}");}};
\node (G)[ultra thick] at (3,2.5) {\inline{println!("xref is 0");}};
\node (H) at (0,0) {\inline{x = 42;}};
\end{scope}
\path [->] (A) edge node[left] {} (B);
\path [->, ultra thick] (B) edge node[left] {} (C);
\path [->, ultra thick] (C) edge node[left] {\inline{true}} (D);
\path [->, ultra thick] (D) edge node[left] {} (E);
\path [->, ultra thick] (C) edge node[right] {\inline{false}} (G);
\path [->] (G) edge node[left] {} (H);
\path [->] (E) edge node[left] {} (H);
\end{tikzpicture}
\caption{Control-flow diagram for \autoref{lst:non-lexical-lifetimes-branch}}
\label{fig:cfg-non-lexical-lifetimes-branch}
\end{figure}
\subsection{Non-Lexical Lifetimes}
\begin{enumerate}
\item Example: Correct but rejected program
\item Solution: Early drop of reference/non-lexical lifetimes
\end{enumerate}
\subsection{Reborrows, the Borrow Stack and Two-Phase Borrowing}
\begin{enumerate}
\item What is a reborrow?
......@@ -308,12 +445,13 @@ needs_bytes(s.as_ref());
\item Dependent Refinement Types/Refined Function Types
\item Refinement Types are “first-class”
\end{enumerate}
\subsection{Liquid Types SMT Solving}
\subsection{Liquid Types and SMT Solving}
\begin{enumerate}
\item Refinement Types are not decidable
\item SAT is decidable
\item SMT is decidable
\item Liquid Types = $\{ T : \text{RefinementType } | \text{ decidable } T \}$
\item SMT-LIB2 and Z3
\end{enumerate}
\subsection{Hybrid Type Checking}
\begin{enumerate}
......@@ -321,14 +459,14 @@ needs_bytes(s.as_ref());
\item Sprinkle in casts everywhere an assertion is made
\item If the compiler can prove that this is ok or not ok, remove the cast (and potentially throw a compiler error)
\item All other casts are dynamic and lead to a panic if they fail.
\item In Rust, \lstinline{Result} and \lstinline{?} could be used instead. Problem: The error must be handled somewhere.
\item In Rust, \inline{Result} and \inline{?} could be used instead. Problem: The error must be handled somewhere.
\end{enumerate}
\section{Other Extensions to a Type System}
\subsection{Totality and Termination Metrics}
\begin{enumerate}
\item Some functions can panic or loop indefinitely
\item In Rust, \lstinline{!} as return type helps (such a function does definitely not terminate)
\item In Rust, \inline{!} as return type helps (such a function does definitely not terminate)
\item There is no “may not terminate”, though, so you have to rely on the documentation
\item You can prove that a loop/recursion finishes by termination metrics
\item Panics can technically occur in every function (every function call needs memory space for a stack frame)
......@@ -340,7 +478,7 @@ needs_bytes(s.as_ref());
\item What are structural rules? Exchange, Weakening, Contraction
\item Move semantics are affine types
\item Clone and Copy is weakening
\item \lstinline{#[must_use]} and relevant types
\item \inline{#[must_use]} and relevant types
\item Linear types
\item The borrow stack is ordered types
\item Uniqueness types: There is only one reference (in contrast to linear types: no more references can be created)
......@@ -357,6 +495,4 @@ needs_bytes(s.as_ref());
\item non-totality is an effect, but do termination metrics fit in?
\item Weakening and Contraction are effects, but what about borrowing?
\item Refinements $\cap$ Effects
\end{enumerate}
\subsection{SMT Solving, Z3, SMT-LIB2}
\end{enumerate}
\ No newline at end of file
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment