Interpreter Taming to Realize Multiple Compilations in a Meta-Tracing JIT Compiler Framework
by Yusuke Izawa, Hidehiko Masuhara and Carl Friedrich Bolz-Tereick
Abstract:
A wide range of JIT compilation policies exists, with different compilation scopes such as method-based, trace-based, and region-based. These heavyweight compilation techniques can produce fast machine code but consume compilation time. If such a heavyweight compilation is applied to infrequently executed code (which we call warm spots), the compilation overhead cannot be avoided while little benefit is obtained. To avoid this problem, today's VMs employ a lightweight compiler that generates machine code quickly instead of applying time-consuming optimizations. For example, current Java and JavaScript VMs have multiple compilers: HotSpot has C1/C2, V8 has Ignition/TurboFan, and JavaScriptCore has Baseline/DFG/FTL. By applying heavyweight compilation to hot spots and lightweight compilation to warm spots, today's VMs balance code quality and compilation time. This multilevel compilation technique is important because a larger application tends to have a small area of hot spots.

The RPython framework is a toolchain that generates a VM equipped with a heavyweight trace-based JIT compiler from a bytecode interpreter. Although RPython makes it easier to create a high-performance VM, there is a dilemma: the generated VMs are hard to extend because they need to use the fixed VM components provided by RPython. Given this context, realizing a lightweight compilation or a new heavyweight compilation with a new compilation scope requires much engineering effort when done by extending the meta-tracing JIT compiler.

We propose Multilevel RPython, which can perform two-level compilation with different compilation scopes. Its compilation levels consist of a lightweight level, which emits method-based threaded code, and a heavyweight level, which emits trace-based optimizing code. Multilevel RPython is realized not by creating different compilers from scratch but by taming a bytecode interpreter given to the RPython toolchain. In other words, the lightweight compilation is performed by an interpreter tamed for threaded code generation, and the heavyweight compilation is carried out by tracing JIT compilation.

In this talk, we present the implementation status of Multilevel RPython. In particular, we implemented an inline caching technique for threaded code generation and a prototype of the compilation-level shifting mechanism. Both techniques are realized by taming the definition of an interpreter and slightly modifying the meta-tracing compiler.

The microbenchmark evaluation showed that inline caching makes threaded code generation approximately 20% faster than threaded code generation without the inline caching technique. In addition, we conducted a multilevel JIT experiment on an application combining large benchmark programs to simulate a real-world workload. This experiment shows that multilevel JIT compilation on Multilevel RPython is about 14% faster.
Reference:
Interpreter Taming to Realize Multiple Compilations in a Meta-Tracing JIT Compiler Framework (Yusuke Izawa, Hidehiko Masuhara and Carl Friedrich Bolz-Tereick), Talk at the MoreVMs workshop 2023, 2023.
Bibtex Entry:
@misc{izawa2023morevms,
  author = {Yusuke Izawa and Hidehiko Masuhara and {Carl Friedrich} Bolz-Tereick},
  title = {Interpreter Taming to Realize Multiple Compilations in a Meta-Tracing {JIT} Compiler Framework},
  howpublished = {Talk at the MoreVMs workshop 2023},
  month = mar,
  date = {2023-03-13},
  year = 2023,
  url = {https://2023.programming-conference.org/details/MoreVMs-2023-papers/2/Interpreter-Taming-to-Realize-Multiple-Compilations-in-a-Meta-Tracing-JIT-Compiler-Fr},
  slides = {morevms2023-slides.pdf},
  abstract = {A wide range of JIT compilation policies exists, with different compilation scopes such as method-based, trace-based, and region-based. These heavyweight compilation techniques can produce fast machine code but consume compilation time. If such a heavyweight compilation is applied to infrequently executed code (which we call warm spots), the compilation overhead cannot be avoided while little benefit is obtained. To avoid this problem, today's VMs employ a lightweight compiler that generates machine code quickly instead of applying time-consuming optimizations. For example, current Java and JavaScript VMs have multiple compilers: HotSpot has C1/C2, V8 has Ignition/TurboFan, and JavaScriptCore has Baseline/DFG/FTL. By applying heavyweight compilation to hot spots and lightweight compilation to warm spots, today's VMs balance code quality and compilation time. This multilevel compilation technique is important because a larger application tends to have a small area of hot spots.

The RPython framework is a toolchain that generates a VM equipped with a heavyweight trace-based JIT compiler from a bytecode interpreter. Although RPython makes it easier to create a high-performance VM, there is a dilemma: the generated VMs are hard to extend because they need to use the fixed VM components provided by RPython. Given this context, realizing a lightweight compilation or a new heavyweight compilation with a new compilation scope requires much engineering effort when done by extending the meta-tracing JIT compiler.

We propose Multilevel RPython, which can perform two-level compilation with different compilation scopes. Its compilation levels consist of a lightweight level, which emits method-based threaded code, and a heavyweight level, which emits trace-based optimizing code. Multilevel RPython is realized not by creating different compilers from scratch but by taming a bytecode interpreter given to the RPython toolchain. In other words, the lightweight compilation is performed by an interpreter tamed for threaded code generation, and the heavyweight compilation is carried out by tracing JIT compilation.

In this talk, we present the implementation status of Multilevel RPython. In particular, we implemented an inline caching technique for threaded code generation and a prototype of the compilation-level shifting mechanism. Both techniques are realized by taming the definition of an interpreter and slightly modifying the meta-tracing compiler.

The microbenchmark evaluation showed that inline caching makes threaded code generation approximately 20\% faster than threaded code generation without the inline caching technique. In addition, we conducted a multilevel JIT experiment on an application combining large benchmark programs to simulate a real-world workload. This experiment shows that multilevel JIT compilation on Multilevel RPython is about 14\% faster.}
}