author | Johannes Stoelp <johannes.stoelp@gmail.com> | 2024-12-27 10:47:02 +0100 |
---|---|---|
committer | Johannes Stoelp <johannes.stoelp@gmail.com> | 2024-12-27 10:47:02 +0100 |
commit | 941cc50b5038c74fe44910731d13f1f7f10a2e7d (patch) | |
tree | c4aa9ff765d892c3b91f47bde2735465733c3fc8 /content | |
parent | cb0562f9fba187a0db111a7f4532706a5140945e (diff) | |
download | blog-941cc50b5038c74fe44910731d13f1f7f10a2e7d.tar.gz blog-941cc50b5038c74fe44910731d13f1f7f10a2e7d.zip |
Diffstat (limited to 'content')
-rw-r--r-- | content/2024-12-20-xpost-juicebox-asm/index.md | 57 |
1 files changed, 37 insertions, 20 deletions
diff --git a/content/2024-12-20-xpost-juicebox-asm/index.md b/content/2024-12-20-xpost-juicebox-asm/index.md
index d8796bd..15c16f4 100644
--- a/content/2024-12-20-xpost-juicebox-asm/index.md
+++ b/content/2024-12-20-xpost-juicebox-asm/index.md
@@ -11,8 +11,8 @@ in rust. The code is available on github [>> juicebox-asm <<][jb].
 
 The _assembler_ only implements a small set of the x86 instruction set, but
 enough to create some interesting examples. Additionally, the repository
-provides a simple _runtime_ based on [mmap(2)][mmap] (only on linux), which is
-used to execute the dynamically assembled code and provided examples.
+provides a simple _runtime_ based on [mmap(2)][mmap] (linux only), which is
+used to execute the dynamically assembled code and the provided examples.
 
 ## Assembler
 The _assembler_ implements all the supported instructions and provides a similar
@@ -33,23 +33,26 @@ asm.disasm();
 // 00000006 C3 ret
 ```
 
-An interesting design-choice is given by the implementation of the assembler API
-for the various instructions. Since x86 is a _CISC_ instruction set, it allows
-same the mnemonic to be used with different types and permutations of
-operands. This can be seen in the code listing above where the _add_ instruction
-is used with both `register-register` and `register-memory` operands. It becomes
-even clearer when looking into the instruction reference manual, in the example
-of the _add_ instruction.
+An interesting design-choice is given by the implementation of the assembler
+API for the various instructions, since x86 overloads instruction mnemonics
+with many operand types. This can be seen in the code listing above where the
+_add_ instruction is used with both `register-register` and `register-memory`
+operands. It becomes even clearer when looking into the instruction reference
+manual, in the example of the _add_ instruction.
 
 <img src="x86-add.png">
 
 > Table taken from [Intel® 64 and IA-32 Architectures Software Developer’s
 > Manual - Volume 2 Instruction Set Reference][intel-im].
 
-Since rust does not have function overloading
+One approach is to implement the functions with _explicit_ types for the
+operands. This has the benefit that the API clearly expresses which operands
+are allowed for a given instruction and the compiler can check if wrong
+operands are provided. However, since rust does not have function overloading
 <sup id="sup-1-src"><a href="#sup-1-dst">1</a></sup>
-as in cpp, one does need to define functions with different function names to
-implement the _add_ instruction for different combinations of operands.
+as in cpp, one does need to define functions with different names to
+implement the instructions. The following shows an example for the _add_
+instruction with different operands.
 ```rust
 // Pseudo code.
 fn add_r64_r64(Reg64, Reg64)
@@ -57,10 +60,24 @@ fn add_r64_m64(Reg64, Mem64)
 // ..
 ```
 
-However, in this experiment, I did not go with that approach. Instead I used a
-different solution and defined generic traits for the various instructions which
-have different operand combinations. The _Add_ trait below gives an example of
-that.
+Another approach is to provided a _flexible_ `Operand` type and implement the
+instructions with that. The benefit here is that there is only **one** _add_
+function. The downside however, is that the API does not clearly express which
+operands are allowed and the operand check must be done at _runtime_ rather
+than _compile-time_.
+```rust
+// Pseudo code.
+fn add(Operand, Operand)
+```
+
+During this experiment I have learned that there is a way in rust to emulate
+sort of function overloading, which allows to get all the benefits of the above
+discussed approaches while getting rid of all the downsides at the same time at
+the expense of a little boiler-plate code.
+
+The idea is to define generic traits for the various instructions with
+overloaded mnemonics. The listing below shows the `Add` trait as example for
+the _add_ instruction.
 
 ```rust
 pub trait Add<T, U> {
@@ -69,9 +86,9 @@ pub trait Add<T, U> {
 ```
 
 This in turn allows to implement the traits for all the different combinations
-of operands that should be supported by each instruction. It gives fine control
-of what to support and provides an ergonomic API as seen in the first example
-code listing.
+of (explicit) operands that should be supported by each instruction. It gives
+fine control of what to support and provides an ergonomic API as seen in the
+first example code listing.
 
 ```rust
 impl Add<Reg64, Reg64> for Asm {
@@ -124,7 +141,7 @@ fn main() {
 ## Label
 A label is used to associate a location in the generated code with a symbolic
 name (the label), which can then be used as branch target for control-flow
-instructions as for example `jz`, `jnz` and so on.
+instructions as for example `jz`, `jnz`.
 
 The act of associating a label with a location is called _binding_. Each label
 can only be bound **once**, else the branch target would become _ambiguous_.
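
For readers of this diff, the trait-based overloading that the added paragraphs describe can be tried as a standalone program. The following is a minimal sketch and not the juicebox-asm implementation: `Reg64`, `Mem64` and `Asm` are simplified stand-ins invented for illustration, and this `Asm` only records a textual trace instead of encoding machine code. Only the shape of the `Add<T, U>` trait follows the post.

```rust
// Hypothetical operand types (illustrative only, not the juicebox-asm API).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Reg64 {
    Rax,
    Rcx,
    Rdx,
}

// Hypothetical memory operand, addressed through a base register.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mem64(pub Reg64);

// Toy assembler: records one line of text per instruction instead of bytes.
#[derive(Default)]
pub struct Asm {
    pub trace: Vec<String>,
}

// One generic trait per overloaded mnemonic, as described in the post.
pub trait Add<T, U> {
    fn add(&mut self, op1: T, op2: U);
}

// Each supported operand combination gets its own implementation.
impl Add<Reg64, Reg64> for Asm {
    fn add(&mut self, op1: Reg64, op2: Reg64) {
        self.trace.push(format!("add {:?}, {:?}", op1, op2));
    }
}

impl Add<Reg64, Mem64> for Asm {
    fn add(&mut self, op1: Reg64, op2: Mem64) {
        self.trace.push(format!("add {:?}, [{:?}]", op1, op2.0));
    }
}

fn main() {
    let mut asm = Asm::default();

    // The same method name is used for both operand combinations; the
    // compiler selects the implementation from the argument types.
    asm.add(Reg64::Rax, Reg64::Rcx);
    asm.add(Reg64::Rax, Mem64(Reg64::Rdx));

    // No `Add<Mem64, Mem64>` implementation exists, so the following line
    // would be rejected at compile time rather than at runtime:
    // asm.add(Mem64(Reg64::Rax), Mem64(Reg64::Rcx));

    for line in &asm.trace {
        println!("{line}");
    }
}
```

Each `impl` block is the "little boiler-plate" the post mentions: supporting another operand combination means adding one more implementation, while call sites keep using the single name `add`.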