Bytecode
Bytecode (also called portable code or p-code[
The name bytecode stems from instruction sets that have one-
Since bytecode instructions are processed by software, they may be arbitrarily complex, but they are nonetheless often akin to traditional hardware instructions: virtual stack machines are the most common, but virtual register machines have also been built.[2][3] Different parts of a program may be stored in separate files, similar to object modules, but dynamically loaded during execution.
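To make the idea of a virtual stack machine concrete, the following is a minimal sketch of a bytecode interpreter loop. The opcode values, the one-byte operand encoding, and the names `PUSH`, `ADD`, `MUL`, `HALT` and `run` are hypothetical choices made for this illustration and do not correspond to any real virtual machine.

```python
# Hypothetical opcodes for this sketch; not taken from any real virtual machine.
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF


def run(code: bytes) -> int:
    """Interpret a byte string on a tiny stack machine and return the top of the stack."""
    stack = []
    pc = 0  # program counter: index of the next byte to decode
    while True:
        op = code[pc]
        pc += 1
        if op == PUSH:
            stack.append(code[pc])  # one-byte immediate operand follows the opcode
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#x} at offset {pc - 1}")


# Bytecode for the expression (2 + 3) * 4
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
result = run(program)
print(result)  # prints 20
```

A register-based design would instead name source and destination registers in each instruction, which typically means fewer but wider instructions for the same expression.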
Execution
A bytecode program may be executed by parsing and directly executing the instructions, one at a time. This kind of bytecode interpreter is very portable. Some systems, called dynamic translators, or
Because of its performance advantage, many language implementations today execute a program in two phases: first compiling the source code into bytecode, then passing the bytecode to the virtual machine. There are bytecode-based virtual machines of this sort for Java, Raku, Python, PHP,[a] Tcl, mawk and Forth (however, Forth is seldom compiled via bytecodes in this way, and its virtual machine is more generic instead). The implementations of Perl and Ruby 1.8 instead work by walking an abstract syntax tree representation derived from the source code.
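The two phases can be observed directly in CPython, where the built-in functions `compile()` and `exec()` expose them separately. The sketch below is illustrative only; the source string and the variable names are assumptions made for this example, and the exact bytes produced differ between Python versions.

```python
source = "total = sum(n * n for n in range(4))"

# Phase 1: compile the source text into a code object that holds bytecode.
code_obj = compile(source, "<example>", "exec")
raw = code_obj.co_code  # the raw bytecode as a bytes object (version-specific)

# Phase 2: hand the compiled code object to the virtual machine for execution.
namespace = {}
exec(code_obj, namespace)
total = namespace["total"]

print(len(raw) > 0)  # prints True
print(total)         # prints 14
```

Because the code object is kept separate from the source text, it can be executed repeatedly without parsing the source again.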
More recently, the authors of V8[1] and Dart[7] have challenged the notion that intermediate bytecode is needed for fast and efficient VM implementation. Both of these language implementations currently do direct JIT compiling from source code to machine code with no bytecode intermediary.[8]
Examples
- ActionScript executes in the ActionScript Virtual Machine (AVM), which is part of Flash Player and AIR. ActionScript code is typically transformed into bytecode format by a compiler. Examples of compilers include one built into Adobe Flash Professional and one built into Adobe Flash Builder and available in the Adobe Flex SDK.
- Adobe Flash objects
- BANCStar, originally bytecode for an interface-building tool but used also as a language
- Berkeley Packet Filter
- Berkeley Pascal[9]
- Byte Code Engineering Library
- C to Java virtual machine compilers
- CLISP implementation of Common Lisp used to compile only to bytecode for many years; however, now it also supports compiling to native code with the help of GNU lightning
- CMUCL and Scieneer Common Lisp implementations of Common Lisp can compile either to native code or to bytecode, which is far more compact
- Common Intermediate Language executed by Common Language Runtime, used by .NET languages such as C#
- Dalvik bytecode, designed for the Android platform, is executed by the Dalvik virtual machine
- Dis bytecode, designed for the Dis virtual machine
- EiffelStudio for the Eiffel programming language
- EM, the Amsterdam Compiler Kit virtual machine used as an intermediate compiling language and as a modern bytecode language
- Emacs is a text editor with most of its functions implemented by Emacs Lisp, its built-in dialect of Lisp. These features are compiled into bytecode. This architecture allows users to customize the editor with a high level language, which after compiling into bytecode yields reasonable performance.
- Embeddable Common Lisp implementation of Common Lisp can compile to bytecode or C code
- Common Lisp provides a disassemble function[10] which prints to the standard output the underlying code of a specified function. The result is implementation-dependent and may or may not resolve to bytecode. Its inspection can be utilized for debugging and optimization purposes.[11] Steel Bank Common Lisp, for instance, produces:
    (disassemble '(lambda (x) (print x)))
    ; disassembly for (LAMBDA (X))
    ; 2436F6DF:       850500000F22     TEST EAX, [#x220F0000]     ; no-arg-parsing entry point
    ;       E5:       8BD6             MOV EDX, ESI
    ;       E7:       8B05A8F63624     MOV EAX, [#x2436F6A8]      ; #<FDEFINITION object for PRINT>
    ;       ED:       B904000000       MOV ECX, 4
    ;       F2:       FF7504           PUSH DWORD PTR [EBP+4]
    ;       F5:       FF6005           JMP DWORD PTR [EAX+5]
    ;       F8:       CC0A             BREAK 10                   ; error trap
    ;       FA:       02               BYTE #X02
    ;       FB:       18               BYTE #X18                  ; INVALID-ARG-COUNT-ERROR
    ;       FC:       4F               BYTE #X4F                  ; ECX
- Ericsson implementation of Erlang uses BEAM bytecodes
- Ethereum's Virtual Machine (EVM) is the runtime environment, using its own bytecode, for transaction execution in Ethereum (smart contracts).
- Icon[12] and Unicon[13] programming languages
- Infocom used the Z-machine to make its software applications more portable
- Java bytecode, which is executed by the Java virtual machine
- ASM
- BCEL
- Javassist
- Oberon operating system […] more portable.
- LLVM IR
- LSL, a scripting language used in virtual worlds, compiles into bytecode running on a virtual machine. Second Life has the original Mono version; Inworldz developed the Phlox version.
- Lua language uses a register-based bytecode virtual machine
- m-code of the MATLAB language[16]
- machine language for a ternary virtual machine.
- Visual C++ and Visual Basic
- Multiplan[17]
- O-code of the BCPL programming language
- OCaml language optionally compiles to a compact bytecode form
- p-code of UCSD Pascal implementation of the Pascal language
- Parrot virtual machine
- MultiValue BASIC
- The R environment for statistical computing offers a bytecode compiler through the compiler package, which has been standard since R version 2.13.0. It is possible to build R so that the base and recommended packages take advantage of this.[18]
- Pyramid 2000 adventure game
- Python scripts are compiled to Python's bytecode language on execution, and the compiled files (.pyc) of imported modules are cached alongside the source, in a __pycache__ subdirectory in Python 3
- Compiled code can be analysed and investigated using the built-in dis module, a tool for inspecting the low-level bytecode. It can be run from the interactive shell, for example:
    >>> import dis # "dis" - Disassembler of Python byte code into mnemonics.
    >>> dis.dis('print("Hello, World!")')
      1           0 LOAD_NAME                0 (print)
                  2 LOAD_CONST               0 ('Hello, World!')
                  4 CALL_FUNCTION            1
                  6 RETURN_VALUE
- Scheme 48 implementation of Scheme using bytecode interpreter
- Bytecodes of many implementations of the Smalltalk language
- The Parallax Propeller microcontroller
- The SQLite database engine translates SQL statements into a bespoke byte-code format.[19]
- Apple SWEET16
- Tcl
- TIMI is used by compilers on the IBM i platform.
- Tiny BASIC
- Visual FoxPro compiles to bytecode
- WebAssembly
- YARV and Rubinius for Ruby
- ZCODE
- Zend Engine opcodes for PHP
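As an illustration of the SQLite entry in the list above, the bytecode program that SQLite generates for a statement can be listed by prefixing the statement with EXPLAIN. The following is a minimal sketch using Python's standard sqlite3 module; the table name `t` and the query are assumptions made for this example, and the exact opcodes emitted vary between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")

# EXPLAIN makes SQLite return the bytecode program for the statement
# instead of executing it. Each row describes one virtual machine instruction.
rows = conn.execute("EXPLAIN SELECT x FROM t WHERE x > 1").fetchall()

# Column 0 is the instruction address and column 1 is the opcode name.
opcodes = [row[1] for row in rows]
for row in rows:
    print(row[0], row[1])

conn.close()
```

The listing ends up containing a Halt instruction, which stops the virtual machine once the program has finished.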
See also
- Intermediate representation
- Platform (computing)
- Runtime system
Notes
- ^ PHP has had just-in-time compilation since PHP 8;[5][6] before that, JIT compilation was not part of the default implementation but was available through alternatives such as HHVM. In older versions of PHP, opcodes are generated each time the program is launched and are always interpreted rather than just-in-time compiled.
References
- ^ a b "Dynamic Machine Code Generation". Google Inc. Archived from the original on 2017-03-05. Retrieved 2023-02-21.
- ^ "The Implementation of Lua 5.0". (NB. This involves a register-based virtual machine.)
- ^ "Dalvik VM". Archived from the original on 2013-05-18. Retrieved 2012-10-29. (NB. This VM is register based.)
- ^ "Byte Code Vs Machine Code". www.allaboutcomputing.net. Retrieved 2017-10-23.
- ^ O’Phinney, Matthew Weier. "Exploring the New PHP JIT Compiler". Zend by Perforce. Retrieved 2021-02-19.
- ^ "PHP 8: The JIT - stitcher.io". stitcher.io. Retrieved 2021-02-19.
- ^ Loitsch, Florian. "Why Not a Bytecode VM?". Google. Archived from the original on 2013-05-12.
- ^ "JavaScript myth: JavaScript needs a standard bytecode". 2ality.com.
- ^ G., Adam Y. (2022-07-11). "Berkeley Pascal". GitHub. Retrieved 2022-01-08.
- ^ "CLHS: Function DISASSEMBLE". www.lispworks.com.
- ^ "Performance Tuning and Tips". lispcookbook.github.io.
- ^ "The Implementation of the Icon Programming Language" (PDF). Archived from the original (PDF) on 2016-03-05. Retrieved 2011-09-09.
- ^ "The Implementation of Icon and Unicon a Compendium" (PDF). Archived (PDF) from the original on 2022-10-09.
- KEYB driver has such a huge memory footprint compared to table-driven keyboard drivers which can be done in 3 - 4 Kb getting the same level of function except for the interpreter. […]
- KEYBOARD.SYS instead of mixing Microsoft and IBM versions […] (NB. What is meant by "procedures" here are some additional bytecodes in the IBM KEYBOARD.SYS file not supported by the Microsoft version of the KEYB driver.)
- ^ "United States Patent 6,973,644". Archived from the original on 2017-03-05. Retrieved 2009-05-21.
- floating point format to calculate on, and an external (standard) format, which was binary coded decimal (BCD). The PACK and UNPACK instructions converted between the two.
- ^ "R Installation and Administration". cran.r-project.org.
- ^ "The SQLite Bytecode Engine". Archived from the original on 2017-04-14. Retrieved 2016-08-29.