Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get_impl_paths/recursion-checking does not handle Sync/Send inference #1335

Open
tjhance opened this issue Nov 5, 2024 · 3 comments
Open

Comments

@tjhance
Copy link
Collaborator

tjhance commented Nov 5, 2024

Consider this variation of the recently added test test_recursion_through_sync_impl_is_checked from 4ab656f:

trait Tr {
    proof fn tr_g() {
    }
}

struct Xinner<T, S> {
    rc: std::rc::Rc<u64>,
    t: T,
    s: S,
}

struct X<T, S> {
    inner: Xinner<T, S>,
}

#[verifier::external]
unsafe impl<T: Sync + Tr, S: Sync> Sync for Xinner<T, S> {
}


trait Sr {
    proof fn f() { }
}

struct Y<R> {
    r: R,
}

impl<W: Sync> Sr for Y<W> {
    proof fn f() { }
}


struct A1 { }

struct B1 { }

impl Tr for A1 {
    proof fn tr_g() {
        test();
    }
}

proof fn test() {
    let r = Y::<X<A1, B1>>::f();
}

The cycle has the dependency Xinner<T, S>: Sync --> X<T, S>: Sync which is not covered by get_impl_paths

@tjhance
Copy link
Collaborator Author

tjhance commented Nov 5, 2024

this might be okay appealing to the coinductiveness of Send/Sync? I'm not sure.

@Chris-Hawblitzel
Copy link
Collaborator

Not all cycles are unsound. For example, cycles through exec functions are sound and we don't need to report them as errors (unless we deliberately want to prohibit non-terminating exec functions). Mainly, we just need to guarantee termination of spec and proof functions.

@Chris-Hawblitzel
Copy link
Collaborator

I'll try to write out a more rigorous analysis. This follows the OOPSLA paper and the comments in check_traits in recursive_types.rs. To prove soundness, we write out operational semantics and typing rules (where verification is included in the typing) and prove that well-typedness is preserved when we step according to the operational semantics. Additionally, we want to be able to erase ghost blocks (including calls to ghost functions) when we actually compile and execute the code, so we have to prove that ghost code can safely be skipped during execution. To do this, we prove that well-typed ghost code always terminates by translating ghost code into some other typed language that is known to terminate, such as CIC. (The CIC code must simulate the original ghost code so that if the CIC code terminates, then the original ghost code also terminates.) This translation into CIC is where cycle checking appears, since, roughly speaking, CIC prohibits cycles in a series of definitions.

Note that we only have to translate the ghost code, not the exec code, since the exec code doesn't need to terminate.

To extend this argument with traits, we encode traits using datatypes. check_traits in recursive_types.rs describes how to do this, with a note that says REVIEW: we could be more lenient and allow cycles through exec functions.. I'll try to sketch out the argument for the exec function leniency, since I think understanding why cycles are allowed in exec functions in traits makes it clearer why cycles are allowed with Send and Sync. Let's represent each trait T as two dictionary structs, one struct Exec_T<Self> for the exec value dictionary and one struct Ghost_T<Self> for the ghost value dictionary. An exec function fn f<A: T>() will become fn f<A>(a_Exec_T: Exec_T<A>, a_Ghost_T: Ghost_T<A>). A ghost function like proof fn g<A: T>() will become proof fn g<A>(a_Ghost_T: Ghost_T<A>), since ghost functions cannot access exec trait members.

Furthermore, if T has no exec values, there is no need for Exec_T, and if T has no ghost values, there is no need for Ghost_T (because if a dictionary is just an empty struct, anyone can create the dictionary from scratch at any point; there's no need to pass empty dictionaries around). In fact, since Send and Sync have no values at all, we don't need dictionaries for them at all. They disappear entirely in the CIC encoding and play no role in cycle checking.

Nevertheless, just to spell out the argument carefully, and because I think non-termination for exec trait members is interesting, I'll write out the example above using an Exec_Sync<Self> dictionary, as if Sync declared some exec members:

struct Exec_Sync<Self> {}

struct Ghost_Tr<Self> {
    tr_g: proof fn() -> (),
}
struct Exec_Tr<Self> {}

struct Xinner<T, S> {
    rc: std::rc::Rc<u64>,
    t: T,
    s: S,
}

struct X<T, S> {
    inner: Xinner<T, S>,
}

fn impl1_exec<T, S>(t_Exec_Sync: Exec_Sync<T>, t_Exec_Tr: Exec_Tr<T>, t_Ghost_Tr: Ghost_Tr<T>, s_Exec_Sync: Exec_Sync<S>) -> ExecSync<Xinner<T, S>> {
    Exec_Sync {}
}

struct Ghost_Sr<Self> {
    f: proof fn() -> (),
}
struct Exec_Sr<Self> {}

struct Y<R> {
    r: R,
}

proof fn impl2_ghost<W>() -> Ghost_Sr<Y<W>> {
    Ghost_Sr {
        f: proof || {},
    }
}
fn impl2_exec<W>(w_Exec_Sync: Exec_Sync<W>>) -> Exec_Sr<Y<W>> {
    Exec_Sr {}
}

struct A1 {}

struct B1 {}

proof fn test() {
    // note: we're not calling impl2_exec,
    // so we don't need an Exec_Sync<X<A1, A2>> here,
    // so there's no cycle
    let y_Sr_Ghost: Ghost_Sr<X<A1, A2>> = impl2_ghost();
    let r = y_Sr_Ghost.f();
}

proof fn impl3_ghost() -> Ghost_Tr<A1> {
    Ghost_Tr {
        tr_g: proof || { test() }
    }
}
fn impl3_exec() -> Exec_Tr<A1> {
    Exec_Tr {}
}

This doesn't mean that we can ignore Send and Sync entirely in a proof of Verus's soundness (or Rust's soundness). If, for example, we wanted to prove that an Rc value was never accessed from two different threads, we'd need to make sure that the typing rules prohibited passing an Rc value from one thread to another. For this, we need to be able to mark types as sendable/syncable in the type system. I don't think it's likely that this would influence the CIC encoding used for termination, though. In particular, I don't think it's likely we would be passing around dictionaries to represent Send/Sync bounds in the CIC encoding, given that Send/Sync don't have any members that would make the dictionaries non-empty. Making this argument rigorous is hard without choosing a particular set of concurrency operations, since we could always contrive to invent some operations for which a single proof block would fail to terminate if there were a Send/Sync-based cycle in the trait graph. But I'll sketch out some simple extensions to the OOPSLA formalization to show why I think Send/Sync would not matter to the CIC encoding in most natural operations.

The OOPSLA formalization contains a points-to type permission(i -> t) for memory address i holding a value of type t. There is a spec-level expression pdata(...) for specifying the current value associated with a permission and exec pread(...) and pwrite(...) operations for reading and writing. The CIC translation of permission(i -> t) is simply the CIC translation of t. I could also imagine formalizing heap invariants that can be shared among threads, for which the CIC translation might require representing the heap as a map (or something more complicated for higher-order invariants). Although the OOPSLA paper's pread and pwrite were exec only, it should be easy to allow proof-mode read and write operations without affecting the CIC translation (e.g. permission(i -> t) would still be translated to t).

Now suppose we want to add Send/Sync to this formalization. Perhaps each address i is owned by a particular thread and some types t are only usable from a particular thread, so we want to mark some permissions as being non-Send. It's the job of the typing rules and operational semantics to express these restrictions. For example, the operational semantics could crash if a restricted value were used from the wrong thread, and the typing rules would ensure that this never happens. The soundness proof would ensure the lack of crashes on the unerased code. Once this is established, we would then turn to the CIC translation to show that the ghost code (which has already been shown not to crash) always terminates and can be erased. This is made easier because a proof block executes atomically with respect to other threads. So the proof block's CIC translation doesn't really care about multithreaded execution; it just receives CIC-translated points-to permissions or CIC-translated heaps and acts on those permissions and heaps without regard to other threads. In other words, in this setting, even with Send/Sync restrictions on the permissions, the CIC translation of permission(i -> t) is still just the CIC translation of t, regardless of whether t is Send/Sync or not.

It's probably worth checking this for fancier, higher-order heap encodings as well, but my expectation is for the purpose of checking that single proof blocks terminate in one atomic step, Send and Sync can be omitted from the CIC encoding that is used to check this termination.

I've been assuming in this discussion that we want to allow nontermination of exec code. In other discussions, we've talked about also checking exec code for termination, ranging from lightweight decreases checking for while loops (already supported) and of recursive functions, to heavyweight checking that guarantees no infinite executions or even guarantees non-blocking execution for primitive blocking operations. For heavyweight guaranteed exec termination of concurrent code, there are many concurrency operations we would have to consider more carefully, and it might be that Send/Sync matters again for this. Even if this is the case, though, I think we could coordinate the cycle checking policy for Send/Sync with the cycle checking policy for exec values in traits: if nonterminating exec code is allowed, then cycles are allowed through exec values in traits and are also allowed through the Send/Sync traits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants