Sampling Optimized Code for Type Feedback


Article by Olivier Flückiger, Andreas Wälchli, Sebastián Krynski, and Jan Vitek, published at DLS '20. Slides, Talk.

To efficiently execute dynamically typed languages, many language implementations have adopted a two-tier architecture. The first tier aims for low startup latency and collects dynamic profiles, such as the types of every program variable. The second tier provides high throughput using an optimizing compiler that specializes code to the recorded type information. If the program's behavior changes to the point that previously unseen types occur in specialized code, that specialized code becomes invalid, it is deoptimized, and control is transferred back to the first-tier execution engine, which starts specializing anew. However, if the program's behavior becomes more specific, for instance, if a variable that was recorded as holding values of many types becomes monomorphic, no deoptimization is triggered. Once the program is running optimized code, there is no mechanism to notice that an optimization opportunity has been missed or to restart specialization.
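To make the missed opportunity concrete, here is a minimal C++ sketch of a type feedback slot; it is not Ř's actual data structure, and the Type enum and TypeFeedback struct are hypothetical. The slot only ever widens: once several types have been observed, a later phase in which the site becomes monomorphic leaves the recorded feedback, and any code specialized against it, unchanged.

```cpp
// Hypothetical sketch of a type feedback slot that only widens, never narrows.
#include <cassert>
#include <cstdint>

enum class Type : uint8_t { Int = 1, Real = 2, String = 4 }; // toy type set

struct TypeFeedback {
    uint8_t seen = 0;                                   // bitset of observed types
    void record(Type t) { seen |= static_cast<uint8_t>(t); }
    bool monomorphic() const { return seen && !(seen & (seen - 1)); }
};

int main() {
    TypeFeedback slot;

    // First tier: the variable holds integers and strings, so the
    // feedback is polymorphic and the second tier compiles generic code.
    slot.record(Type::Int);
    slot.record(Type::String);
    assert(!slot.monomorphic());

    // Later the program only ever produces integers, yet nothing shrinks
    // `seen`: the feedback stays polymorphic, no deoptimization occurs,
    // and no recompilation to integer-specialized code is triggered.
    for (int i = 0; i < 1000; ++i) slot.record(Type::Int);
    assert(!slot.monomorphic());
    return 0;
}
```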

We propose to employ a sampling-based profiler to monitor native code without any instrumentation. The absence of instrumentation means that no overhead is incurred while the profiler is inactive; when it is active, the overhead can be controlled by limiting the sampling rate. Our implementation is in the context of the Ř just-in-time optimizing compiler for the R language. Based on the sampled profiles, we are able to detect when the native code produced by Ř is specialized for stale type feedback and to recompile it to more type-specific code. We show that, when engaged, our profiler's recording adds an overhead of less than 3% in most cases and up to 9% in a few, and that it reliably detects stale type feedback within milliseconds.
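As a rough illustration of instrumentation-free monitoring, the sketch below (assuming Linux on x86-64; it is not Ř's profiler, and the buffer layout and function names are made up) arms a SIGPROF timer whose handler only records the interrupted program counter. A separate consumer would map sampled PCs back to compiled code and compare the types observed at runtime against the feedback the code was specialized for, marking stale code for recompilation. Because samples arrive per unit of CPU time, the overhead tracks the chosen sampling interval, and disabling the timer removes it entirely.

```cpp
// Hypothetical sampling sketch: record PCs from a SIGPROF handler,
// no instrumentation of the generated native code.
#include <atomic>
#include <csignal>
#include <cstddef>
#include <cstdint>
#include <sys/time.h>
#include <ucontext.h>

static std::atomic<uint64_t> sampleBuffer[4096];   // lock-free ring of sampled PCs
static std::atomic<size_t> sampleHead{0};

// Async-signal-safe handler: only store the interrupted program counter.
static void onSample(int, siginfo_t*, void* ctx) {
    uint64_t pc = 0;
#if defined(__linux__) && defined(__x86_64__)
    auto* uc = static_cast<ucontext_t*>(ctx);
    pc = static_cast<uint64_t>(uc->uc_mcontext.gregs[REG_RIP]);
#endif
    size_t slot = sampleHead.fetch_add(1, std::memory_order_relaxed) % 4096;
    sampleBuffer[slot].store(pc, std::memory_order_relaxed);
}

void startSampling(long intervalUs) {
    struct sigaction sa {};
    sa.sa_sigaction = onSample;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGPROF, &sa, nullptr);

    // SIGPROF fires every intervalUs of consumed CPU time, so the
    // profiling overhead scales directly with the sampling rate.
    itimerval timer{};
    timer.it_interval.tv_usec = intervalUs;
    timer.it_value.tv_usec = intervalUs;
    setitimer(ITIMER_PROF, &timer, nullptr);
}

void stopSampling() {
    itimerval off{};                      // zero interval disables the timer,
    setitimer(ITIMER_PROF, &off, nullptr); // leaving no residual overhead
}
```

A consumer running off the hot path would periodically drain sampleBuffer, resolve each PC to the compiled function containing it, and trigger recompilation when the types currently flowing through that function are strictly more specific than the feedback it was compiled against.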