Diffstat (limited to 'content/2024-12-20-xpost-juicebox-asm/index.md')
-rw-r--r-- content/2024-12-20-xpost-juicebox-asm/index.md | 57
1 files changed, 37 insertions, 20 deletions
diff --git a/content/2024-12-20-xpost-juicebox-asm/index.md b/content/2024-12-20-xpost-juicebox-asm/index.md
index d8796bd..15c16f4 100644
--- a/content/2024-12-20-xpost-juicebox-asm/index.md
+++ b/content/2024-12-20-xpost-juicebox-asm/index.md
@@ -11,8 +11,8 @@ in rust. The code is available on github [>> juicebox-asm <<][jb].
The _assembler_ only implements a small set of the x86 instruction set, but
enough to create some interesting examples. Additionally, the repository
-provides a simple _runtime_ based on [mmap(2)][mmap] (only on linux), which is
-used to execute the dynamically assembled code and provided examples.
+provides a simple _runtime_ based on [mmap(2)][mmap] (linux only), which is
+used to execute the dynamically assembled code and the provided examples.
## Assembler
The _assembler_ implements all the supported instructions and provides a similar
@@ -33,23 +33,26 @@ asm.disasm();
// 00000006 C3 ret
```
-An interesting design-choice is given by the implementation of the assembler API
-for the various instructions. Since x86 is a _CISC_ instruction set, it allows
-same the mnemonic to be used with different types and permutations of
-operands. This can be seen in the code listing above where the _add_ instruction
-is used with both `register-register` and `register-memory` operands. It becomes
-even clearer when looking into the instruction reference manual, in the example
-of the _add_ instruction.
+An interesting design-choice is given by the implementation of the assembler
+API for the various instructions, since x86 overloads instruction mnemonics
+with many operand types. This can be seen in the code listing above where the
+_add_ instruction is used with both `register-register` and `register-memory`
+operands. It becomes even clearer when looking into the instruction reference
+manual, in the example of the _add_ instruction.
<img src="x86-add.png">
> Table taken from [Intel® 64 and IA-32 Architectures Software Developer’s
> Manual - Volume 2 Instruction Set Reference][intel-im].
-Since rust does not have function overloading
+One approach is to implement the functions with _explicit_ types for the
+operands. This has the benefit that the API clearly expresses which operands
+are allowed for a given instruction and the compiler can check if wrong
+operands are provided. However, since rust does not have function overloading
<sup id="sup-1-src"><a href="#sup-1-dst">1</a></sup>
-as in cpp, one does need to define functions with different function names to
-implement the _add_ instruction for different combinations of operands.
+as in cpp, one does need to define functions with different names to
+implement the instructions. The following shows an example for the _add_
+instruction with different operands.
```rust
// Pseudo code.
fn add_r64_r64(Reg64, Reg64)
@@ -57,10 +60,24 @@ fn add_r64_m64(Reg64, Mem64)
// ..
```
-However, in this experiment, I did not go with that approach. Instead I used a
-different solution and defined generic traits for the various instructions which
-have different operand combinations. The _Add_ trait below gives an example of
-that.
+Another approach is to provide a _flexible_ `Operand` type and implement the
+instructions with that. The benefit here is that there is only **one** _add_
+function. The downside, however, is that the API does not clearly express which
+operands are allowed and the operand check must be done at _runtime_ rather
+than _compile-time_.
+```rust
+// Pseudo code.
+fn add(Operand, Operand)
+```
+
+During this experiment I learned that rust offers a way to emulate a form of
+function overloading. It keeps the benefits of both approaches discussed
+above, namely a single _add_ function and compile-time operand checks, while
+avoiding their downsides at the expense of a little boiler-plate code.
+
+The idea is to define generic traits for the various instructions with
+overloaded mnemonics. The listing below shows the `Add` trait as an example
+for the _add_ instruction.
```rust
pub trait Add<T, U> {
@@ -69,9 +86,9 @@ pub trait Add<T, U> {
```
This in turn allows to implement the traits for all the different combinations
-of operands that should be supported by each instruction. It gives fine control
-of what to support and provides an ergonomic API as seen in the first example
-code listing.
+of (explicit) operands that should be supported by each instruction. It gives
+fine control of what to support and provides an ergonomic API as seen in the
+first example code listing.
```rust
impl Add<Reg64, Reg64> for Asm {
@@ -124,7 +141,7 @@ fn main() {
## Label
A label is used to associate a location in the generated code with a symbolic
name (the label), which can then be used as branch target for control-flow
-instructions as for example `jz`, `jnz` and so on.
+instructions such as `jz` and `jnz`.
The act of associating a label with a location is called _binding_. Each label
can only be bound **once**, else the branch target would become _ambiguous_.
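
The bind-once rule can be illustrated with a minimal sketch. The `Label` type below is hypothetical and is not taken from the juicebox-asm code; its field and method names are assumptions made only for illustration. It stores an optional code offset and rejects a second bind, since a second location would make the branch target ambiguous.

```rust
/// Hypothetical label type for illustration only (not the juicebox-asm API).
#[derive(Debug, Default)]
pub struct Label {
    /// Code offset the label is bound to; `None` while the label is unbound.
    location: Option<usize>,
}

impl Label {
    /// Create a new, unbound label.
    pub fn new() -> Self {
        Self::default()
    }

    /// Bind the label to the code offset `loc`.
    ///
    /// Returns an error if the label is already bound, because a second
    /// location would make the branch target ambiguous.
    pub fn bind(&mut self, loc: usize) -> Result<(), String> {
        match self.location {
            Some(prev) => Err(format!("label already bound at offset {prev}")),
            None => {
                self.location = Some(loc);
                Ok(())
            }
        }
    }

    /// Location the label is bound to, if any.
    pub fn location(&self) -> Option<usize> {
        self.location
    }
}
```

In this sketch, calling `bind(0x10)` and then `bind(0x20)` on the same label returns an error for the second call and leaves the first location intact. A real assembler might instead choose to panic, since a double bind is a programming error rather than a recoverable condition.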