The Long Way to Deforestation: A Type Inference and Elaboration Technique for Removing Intermediate Data StructuresDistinguished Paper
Deforestation is a compiler optimization that removes intermediate data structure allocations from functional programs to improve their efficiency. This is an old idea, but previous approaches have proved limited or impractical — they either only worked on compositions of predefined combinators (shortcut fusion), or involved the aggressive unfolding of recursive definitions until a depth limit was reached or a reoccurring pattern was found to tie the recursive knot, resulting in impractical algorithmic complexity and large amounts of code duplication. We present Lumberhack, a general-purpose deforestation approach for purely functional call-by-value programs. Lumberhack uses subtype inference to reason about data structure production and consumption and uses an elaboration pass to fuse the corresponding recursive definitions. It fuses large classes of mutually recursive definitions while avoiding much of the unproductive (and sometimes counter-productive) code duplication inherent in previous approaches. We prove the soundness of Lumberhack using logical relations and experimentally demonstrate significant speedups in the standard nofib benchmark suite. We manually adapted nofib programs to call-by-value semantics and compiled them using the OCaml compiler. The average speedup over the 38 benchmarked programs is 8.2% while the average code size increases by just about 1.79x. In particular, 19 programs see their performance mostly unchanged, 17 programs improve significantly (by an average speedup of 16.6%), and only three programs visibly worsen (by an average slowdown of 1.8%). As a point of comparison, we measured that the well-proven but semi-manual list fusion technique of the Glasgow Haskell Compiler (GHC), which only works for call-by-need programs, had an average speedup of 6.5%. Our technique is in its infancy still and misses many deforestation opportunities. We are confident that further refinements to the core technique will yield greater performance improvements in the future.
Thu 5 SepDisplayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna change
10:30 - 12:00 | Refinement Types, Type InferenceICFP Papers and Events / JFP First Papers at Green 1-2-3 Chair(s): Dominic Orchard University of Kent; University of Cambridge | ||
10:30 18mTalk | The Long Way to Deforestation: A Type Inference and Elaboration Technique for Removing Intermediate Data StructuresDistinguished Paper ICFP Papers and Events Yijia Chen Hong Kong University of Science and Technology, Lionel Parreaux HKUST (The Hong Kong University of Science and Technology) DOI | ||
10:48 18mTalk | Contextual Typing ICFP Papers and Events DOI | ||
11:06 18mTalk | Synchronous Programming with Refinement Types ICFP Papers and Events Jiawei Chen University of Michigan at Ann Arbor, José Luiz Vargas de Mendonça University of Michigan at Ann Arbor, Bereket Shimels Ayele Addis Ababa Institute of Technology, Bereket Ngussie Bekele Addis Ababa Institute of Technology, Shayan Jalili University of Michigan at Ann Arbor, Pranjal Sharma University of Michigan at Ann Arbor, Nicholas Wohlfeil University of Michigan at Ann Arbor, Yicheng Zhang University of Michigan at Ann Arbor, Jean-Baptiste Jeannin University of Michigan at Ann Arbor DOI | ||
11:24 18mTalk | Automated Verification of Higher-Order Probabilistic Programs via a Dependent Refinement Type System ICFP Papers and Events DOI File Attached | ||
11:42 18mTalk | Trace contractsJFP First Paper JFP First Papers DOI |