xpost: minor enhancements to juicebox-asm

author: Johannes Stoelp <johannes.stoelp@gmail.com> 2024-12-27 10:47:02 +0100
committer: Johannes Stoelp <johannes.stoelp@gmail.com> 2024-12-27 10:47:02 +0100
commit: 941cc50b5038c74fe44910731d13f1f7f10a2e7d (patch)
tree: c4aa9ff765d892c3b91f47bde2735465733c3fc8 /content
parent: cb0562f9fba187a0db111a7f4532706a5140945e (diff)
1 files changed, 37 insertions, 20 deletions
diff --git a/content/2024-12-20-xpost-juicebox-asm/index.md b/content/2024-12-20-xpost-juicebox-asm/index.md
index d8796bd..15c16f4 100644
--- a/content/2024-12-20-xpost-juicebox-asm/index.md
+++ b/content/2024-12-20-xpost-juicebox-asm/index.md
@@ -11,8 +11,8 @@ in rust. The code is available on github [>> juicebox-asm <<][jb].
 
 The _assembler_ only implements a small set of the x86 instruction set, but
 enough to create some interesting examples. Additionally, the repository
-provides a simple _runtime_ based on [mmap(2)][mmap] (only on linux), which is
-used to execute the dynamically assembled code and provided examples.
+provides a simple _runtime_ based on [mmap(2)][mmap] (linux only), which is
+used to execute the dynamically assembled code and the provided examples.
 
 ## Assembler
 The _assembler_ implements all the supported instructions and provides a similar
@@ -33,23 +33,26 @@ asm.disasm();
 //   00000006  C3        ret
 ```
 
-An interesting design-choice is given by the implementation of the assembler API
-for the various instructions. Since x86 is a _CISC_ instruction set, it allows
-same the mnemonic to be used with different types and permutations of
-operands. This can be seen in the code listing above where the _add_ instruction
-is used with both `register-register` and `register-memory` operands. It becomes
-even clearer when looking into the instruction reference manual, in the example
-of the _add_ instruction.
+An interesting design-choice is given by the implementation of the assembler
+API for the various instructions, since x86 overloads instruction mnemonics
+with many operand types. This can be seen in the code listing above where the
+_add_ instruction is used with both `register-register` and `register-memory`
+operands. It becomes even clearer when looking into the instruction reference
+manual, in the example of the _add_ instruction.
 
 <img src="x86-add.png">
 
 > Table taken from [Intel® 64 and IA-32 Architectures Software Developer’s
 > Manual - Volume 2 Instruction Set Reference][intel-im].
 
-Since rust does not have function overloading
+One approach is to implement the functions with _explicit_ types for the
+operands. This has the benefit that the API clearly expresses which operands
+are allowed for a given instruction and the compiler can check if wrong
+operands are provided. However, since rust does not have function overloading
 <sup id="sup-1-src"><a href="#sup-1-dst">1</a></sup>
-as in cpp, one does need to define functions with different function names to
-implement the _add_ instruction for different combinations of operands.
+as in cpp, one does need to define functions with different names to
+implement the instructions. The following shows an example for the _add_
+instruction with different operands.
 ```rust
 // Pseudo code.
 fn add_r64_r64(Reg64, Reg64)
@@ -57,10 +60,24 @@ fn add_r64_m64(Reg64, Mem64)
 // ..
 ```
 
-However, in this experiment, I did not go with that approach. Instead I used a
-different solution and defined generic traits for the various instructions which
-have different operand combinations. The _Add_ trait below gives an example of
-that.
+Another approach is to provide a _flexible_ `Operand` type and implement the
+instructions with that. The benefit here is that there is only **one** _add_
+function. The downside, however, is that the API does not clearly express which
+operands are allowed and the operand check must be done at _runtime_ rather
+than _compile-time_.
+```rust
+// Pseudo code.
+fn add(Operand, Operand)
+```
+
+During this experiment I learned that rust offers a way to emulate a form of
+function overloading. It combines the benefits of both approaches discussed
+above and avoids their downsides, at the expense of a little boiler-plate
+code.
+
+The idea is to define generic traits for the various instructions with
+overloaded mnemonics. The listing below shows the `Add` trait as an example for
+the _add_ instruction.
 
 ```rust
 pub trait Add<T, U> {
@@ -69,9 +86,9 @@ pub trait Add<T, U> {
 ```
 
 This in turn allows to implement the traits for all the different combinations
-of operands that should be supported by each instruction. It gives fine control
-of what to support and provides an ergonomic API as seen in the first example
-code listing.
+of (explicit) operands that should be supported by each instruction. It gives
+fine control of what to support and provides an ergonomic API as seen in the
+first example code listing.
 
 ```rust
 impl Add<Reg64, Reg64> for Asm {
@@ -124,7 +141,7 @@ fn main() {
 ## Label
 A label is used to associate a location in the generated code with a symbolic
 name (the label), which can then be used as branch target for control-flow
-instructions as for example `jz`, `jnz` and so on.
+instructions such as `jz` and `jnz`.
 
 The act of associating a label with a location is called _binding_. Each label
 can only be bound **once**, else the branch target would become _ambiguous_.
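
The "bind once" rule for labels can be illustrated with a small sketch. The `Label` type below, its `location` field and the `Result`-returning `bind` method are hypothetical names chosen for illustration; the actual juicebox-asm label type may look and behave differently (for example, it may panic instead of returning an error).

```rust
/// Hypothetical label: remembers the code offset it was bound to, if any.
#[derive(Debug, Default)]
pub struct Label {
    location: Option<usize>,
}

impl Label {
    /// Create a new, unbound label.
    pub fn new() -> Self {
        Label { location: None }
    }

    /// Bind the label to `loc`. A second bind is rejected, since the
    /// branch target would otherwise become ambiguous.
    pub fn bind(&mut self, loc: usize) -> Result<(), String> {
        match self.location {
            Some(prev) => Err(format!("label already bound at offset {prev}")),
            None => {
                self.location = Some(loc);
                Ok(())
            }
        }
    }

    /// Offset the label is bound to, or `None` if still unbound.
    pub fn location(&self) -> Option<usize> {
        self.location
    }
}

fn main() {
    let mut lbl = Label::new();
    println!("first bind:  {:?}", lbl.bind(6));
    println!("second bind: {:?}", lbl.bind(9));
    println!("location:    {:?}", lbl.location());
}
```

Returning a `Result` keeps the sketch easy to test; whether a double bind should be a recoverable error or a hard failure is a design choice this sketch does not settle.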
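
The trait-based overloading that this patch describes can be sketched as a self-contained program. Only the `Add<T, U>` trait header and the `impl Add<Reg64, Reg64> for Asm` line appear in the diff; the method signature, the `Reg64` and `Mem64` definitions and the `log` field used here are assumptions made for illustration, not the real juicebox-asm code.

```rust
// Hypothetical operand types; the real juicebox-asm definitions may differ.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Reg64(pub u8);
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mem64(pub u8);

/// One generic trait per overloaded mnemonic (signature is an assumption).
pub trait Add<T, U> {
    fn add(&mut self, op1: T, op2: U);
}

/// Stand-in assembler that records text instead of encoding machine code.
#[derive(Default)]
pub struct Asm {
    pub log: Vec<String>,
}

// Each supported operand combination gets its own trait implementation.
impl Add<Reg64, Reg64> for Asm {
    fn add(&mut self, op1: Reg64, op2: Reg64) {
        self.log.push(format!("add r{}, r{}", op1.0, op2.0));
    }
}

impl Add<Reg64, Mem64> for Asm {
    fn add(&mut self, op1: Reg64, op2: Mem64) {
        self.log.push(format!("add r{}, [r{}]", op1.0, op2.0));
    }
}

fn main() {
    let mut asm = Asm::default();
    // The same method name resolves to different impls based on operand types.
    asm.add(Reg64(0), Reg64(1));
    asm.add(Reg64(0), Mem64(2));
    // asm.add(Mem64(2), Mem64(3)); // rejected at compile time: no such impl
    for line in &asm.log {
        println!("{line}");
    }
}
```

The caller writes a single `add` name, while unsupported operand combinations fail to compile because no matching implementation exists.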