From 941cc50b5038c74fe44910731d13f1f7f10a2e7d Mon Sep 17 00:00:00 2001
From: Johannes Stoelp
Date: Fri, 27 Dec 2024 10:47:02 +0100
Subject: xpost: minor enhancements to juicebox-asm

---
 content/2024-12-20-xpost-juicebox-asm/index.md | 57 +++++++++++++++++---------
 1 file changed, 37 insertions(+), 20 deletions(-)

(limited to 'content')

diff --git a/content/2024-12-20-xpost-juicebox-asm/index.md b/content/2024-12-20-xpost-juicebox-asm/index.md
index d8796bd..15c16f4 100644
--- a/content/2024-12-20-xpost-juicebox-asm/index.md
+++ b/content/2024-12-20-xpost-juicebox-asm/index.md
@@ -11,8 +11,8 @@ in rust. The code is available on github [>> juicebox-asm <<][jb].
 
 The _assembler_ only implements a small set of the x86 instruction set, but
 enough to create some interesting examples. Additionally, the repository
-provides a simple _runtime_ based on [mmap(2)][mmap] (only on linux), which is
-used to execute the dynamically assembled code and provided examples.
+provides a simple _runtime_ based on [mmap(2)][mmap] (linux only), which is
+used to execute the dynamically assembled code and the provided examples.
 
 ## Assembler
 The _assembler_ implements all the supported instructions and provides a similar
@@ -33,23 +33,26 @@ asm.disasm();
 // 00000006 C3 ret
 ```
 
-An interesting design-choice is given by the implementation of the assembler API
-for the various instructions. Since x86 is a _CISC_ instruction set, it allows
-same the mnemonic to be used with different types and permutations of
-operands. This can be seen in the code listing above where the _add_ instruction
-is used with both `register-register` and `register-memory` operands. It becomes
-even clearer when looking into the instruction reference manual, in the example
-of the _add_ instruction.
+An interesting design-choice is given by the implementation of the assembler
+API for the various instructions, since x86 overloads instruction mnemonics
+with many operand types. This can be seen in the code listing above where the
+_add_ instruction is used with both `register-register` and `register-memory`
+operands. It becomes even clearer when looking into the instruction reference
+manual, in the example of the _add_ instruction.
 
 > Table taken from [Intel® 64 and IA-32 Architectures Software Developer’s
 > Manual - Volume 2 Instruction Set Reference][intel-im].
 
-Since rust does not have function overloading
+One approach is to implement the functions with _explicit_ types for the
+operands. This has the benefit that the API clearly expresses which operands
+are allowed for a given instruction and the compiler can check if wrong
+operands are provided. However, since rust does not have function overloading
 1
-as in cpp, one does need to define functions with different function names to
-implement the _add_ instruction for different combinations of operands.
+as in cpp, one does need to define functions with different names to
+implement the instructions. The following shows an example for the _add_
+instruction with different operands.
 ```rust
 // Pseudo code.
 fn add_r64_r64(Reg64, Reg64)
@@ -57,10 +60,24 @@ fn add_r64_m64(Reg64, Mem64)
 // ..
 ```
 
-However, in this experiment, I did not go with that approach. Instead I used a
-different solution and defined generic traits for the various instructions which
-have different operand combinations. The _Add_ trait below gives an example of
-that.
+Another approach is to provide a _flexible_ `Operand` type and implement the
+instructions with that. The benefit here is that there is only **one** _add_
+function. The downside however, is that the API does not clearly express which
+operands are allowed and the operand check must be done at _runtime_ rather
+than _compile-time_.
+```rust
+// Pseudo code.
+fn add(Operand, Operand)
+```
+
+During this experiment I have learned that there is a way in rust to emulate
+sort of function overloading, which allows to get all the benefits of the above
+discussed approaches while getting rid of all the downsides at the same time at
+the expense of a little boiler-plate code.
+
+The idea is to define generic traits for the various instructions with
+overloaded mnemonics. The listing below shows the `Add` trait as example for
+the _add_ instruction.
 
 ```rust
 pub trait Add {
@@ -69,9 +86,9 @@ pub trait Add {
 ```
 
 This in turn allows to implement the traits for all the different combinations
-of operands that should be supported by each instruction. It gives fine control
-of what to support and provides an ergonomic API as seen in the first example
-code listing.
+of (explicit) operands that should be supported by each instruction. It gives
+fine control of what to support and provides an ergonomic API as seen in the
+first example code listing.
 
 ```rust
 impl Add for Asm {
@@ -124,7 +141,7 @@ fn main() {
 ## Label
 A label is used to associate a location in the generated code with a symbolic
 name (the label), which can then be used as branch target for control-flow
-instructions as for example `jz`, `jnz` and so on.
+instructions as for example `jz`, `jnz`.
 
 The act of associating a label with a location is called _binding_. Each label
 can only be bound **once**, else the branch target would become _ambiguous_.
--
cgit v1.2.3
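
Note on the patched text: the trait-based emulation of function overloading that it describes can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the juicebox-asm implementation: the type parameters `<T, U>` on `Add`, the `Reg64`/`Mem64` tuple structs, the `listing` field, and the `demo` helper are all hypothetical, and the toy `Asm` records text instead of encoding machine code.

```rust
// Hypothetical operand types for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Reg64(pub u8); // register operand, identified by index

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mem64(pub u8); // memory operand, addressed via a base register index

/// One trait per overloaded mnemonic, generic over the operand types
/// (the generic parameters are an assumption of this sketch).
pub trait Add<T, U> {
    fn add(&mut self, op1: T, op2: U);
}

/// Toy assembler that records a textual listing instead of machine code.
#[derive(Default)]
pub struct Asm {
    pub listing: Vec<String>,
}

// Each supported operand combination gets its own impl, so unsupported
// combinations are rejected at compile-time.
impl Add<Reg64, Reg64> for Asm {
    fn add(&mut self, op1: Reg64, op2: Reg64) {
        self.listing.push(format!("add r{}, r{}", op1.0, op2.0));
    }
}

impl Add<Reg64, Mem64> for Asm {
    fn add(&mut self, op1: Reg64, op2: Mem64) {
        self.listing.push(format!("add r{}, [r{}]", op1.0, op2.0));
    }
}

/// Uses the single method name `add` with two different operand combinations.
pub fn demo() -> Vec<String> {
    let mut asm = Asm::default();
    asm.add(Reg64(0), Reg64(1)); // resolves to the Add<Reg64, Reg64> impl
    asm.add(Reg64(0), Mem64(2)); // resolves to the Add<Reg64, Mem64> impl
    // asm.add(Mem64(2), Mem64(3)); // would not compile: no such impl exists
    asm.listing
}
```

The compiler selects the impl from the argument types, which gives call sites a single `add` name while keeping the operand check at compile-time; the cost is one `impl` block per supported combination.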