ICFP 2024
Mon 2 - Sat 7 September 2024 Milan, Italy

This paper provides a novel approach to reconciling complex low-level memory model features, such as pointer–integer casts, with the refinements needed to justify the correctness of program transformations. The idea is to use a "two-phase" memory model: one phase with an unbounded memory and a corresponding unbounded integer type, and one with a finite memory. The connection between the two levels is made explicit by our notion of refinement, which handles out-of-memory behaviors. This approach allows more optimizations to be performed and establishes a clear boundary between the idealized semantics of a program and the implementation of that program on finite hardware.

To demonstrate the utility of this idea in practice, we instantiate the two-phase memory model in the context of Zakowski et al.’s VIR semantics, yielding infinite- and finite-memory models of LLVM IR, including low-level features like undef and bitcast. Both the infinite and finite models, which act as specifications, can provably be refined to executable reference interpreters. The semantics justify optimizations, such as dead-alloca-elimination, that were previously impossible or difficult to prove correct.
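
As a rough illustration of how a refinement that tolerates out-of-memory behavior can relate the two models, consider the Python sketch below. It is a toy model, not the VIR semantics or the Coq development: the names Memory, alloca, run, and refines are hypothetical, and the refinement check (a finite run either agrees with the infinite run or stops with out-of-memory) is an assumed simplification of the notion described above.

```python
from typing import Callable, Optional


class OutOfMemory(Exception):
    """Raised by the toy finite model when an allocation does not fit."""


class Memory:
    """Toy bump allocator. capacity=None models the unbounded (infinite) phase."""

    def __init__(self, capacity: Optional[int] = None) -> None:
        self.capacity = capacity
        self.used = 0

    def alloca(self, size: int) -> int:
        # Only the finite phase can run out of space.
        if self.capacity is not None and self.used + size > self.capacity:
            raise OutOfMemory()
        addr = self.used
        self.used += size
        return addr


OOM = "out-of-memory"  # hypothetical marker for a run that halted with OOM


def run(program: Callable[[Memory], int], capacity: Optional[int] = None):
    """Run a program in the infinite model (capacity=None) or a finite one."""
    try:
        return program(Memory(capacity))
    except OutOfMemory:
        return OOM


def refines(finite_result, infinite_result) -> bool:
    """Assumed simplification: a finite run is acceptable if it matches the
    infinite run or halts with out-of-memory."""
    return finite_result == OOM or finite_result == infinite_result


def original(mem: Memory) -> int:
    mem.alloca(8)  # dead allocation: the address is never used
    return 42


def optimized(mem: Memory) -> int:
    return 42  # the same program after dead alloca elimination


# In the infinite model the two programs are indistinguishable.
spec = run(original)
assert spec == run(optimized)

# In a small finite memory the original may halt with OOM, while the
# optimized program still completes; both are allowed by the toy refinement.
assert refines(run(original, capacity=4), spec)
assert refines(run(optimized, capacity=4), spec)
```

In this sketch the optimization is judged against the unbounded model, where allocation never fails, and the finite model is then only required to agree with that result or stop with out-of-memory.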

The artifact contains the full development, including the Coq proofs of the important theorems. The source code can also be used to build the executable interpreter, which can run LLVM IR programs.