rv64iss: add quick thought about instruction handler

author: Johannes Stoelp <johannes.stoelp@gmail.com> 2022-05-12 01:16:26 +0200
committer: Johannes Stoelp <johannes.stoelp@gmail.com> 2022-05-12 01:16:26 +0200
commit: 3389d0a128d874d930e22f256b2646dd86b3b402 (patch)
tree: f5af2667f85764108fe4caf5afd57b5c0355e1fe /content
parent: 71c44dcbafd8078f6099b47d89617e8e7279031d (diff)
download: blog-3389d0a128d874d930e22f256b2646dd86b3b402.tar.gz
blog-3389d0a128d874d930e22f256b2646dd86b3b402.zip
1 files changed, 138 insertions, 0 deletions
diff --git a/content/2022-05-12-rv64iss-qt-instruction-handler.md b/content/2022-05-12-rv64iss-qt-instruction-handler.md
new file mode 100644
index 0000000..c5d7e0f
--- /dev/null
+++ b/content/2022-05-12-rv64iss-qt-instruction-handler.md
@@ -0,0 +1,138 @@
++++
+title = "rv64 iss quick thought instruction handler"
+
+[taxonomies]
+tags = ["rust", "iss", "riscv"]
++++
+
+This post was mainly written as a future reference and to capture a quick
+thought that came up while hacking on my riscv64 instruction set simulator (iss).
+
+At the time of writing, the decoder and the instruction interpreter were
+structured roughly as follows.
+```rust
+enum Insn {
+    Add { rd: Register, rs1: Register, rs2: Register },
+    Sub { rd: Register, rs1: Register, rs2: Register },
+    // ..
+}
+
+fn decode(insn: u32) -> Insn {
+    match insn & 0x7f {
+        0 => {
+            // decode insn
+            Add { .. }
+        }
+        1 => {
+            // decode insn
+            Sub { .. }
+        }
+        // ..
+    }
+}
+
+fn interp() {
+    // read insn from mem
+    match decode(insn) {
+        Add { .. } => // interpret add instruction
+        Sub { .. } => // interpret sub instruction
+        // ..
+    }
+}
+
+fn disasm(insn: &Insn) {
+    match insn {
+        Add { .. } => // format disasm for add
+        Sub { .. } => // format disasm for sub
+        // ..
+    }
+}
+```
+
+While sitting there and losing myself in future enhancements that I would
+like to add at some point, I thought about decode caching and getting rid of
+that huge `match` in the critical path of the interpreter loop.
+
+The following came to my mind, and I just wanted to capture it in this post
+for some later time.
+```rust
+trait InstructionHandler: Sized {
+    type Ret;
+    fn add(&mut self, insn: &Instruction<Self>) -> Self::Ret;
+    fn sub(&mut self, insn: &Instruction<Self>) -> Self::Ret;
+}
+
+struct Instruction<H: InstructionHandler> {
+    dst: usize,
+    op1: u32,
+    op2: u32,
+    exec: fn(&mut H, &Self) -> H::Ret,
+}
+
+impl<H: InstructionHandler> Instruction<H> {
+    fn run(&self, ctx: &mut H) -> H::Ret {
+        (self.exec)(ctx, self)
+    }
+}
+
+#[derive(Debug, Default)]
+struct Interpreter {
+    regs: [u32; 4],
+}
+
+impl InstructionHandler for Interpreter {
+    type Ret = ();
+
+    fn add(&mut self, insn: &Instruction<Self>) {
+        self.regs[insn.dst] = insn.op1.wrapping_add(insn.op2);
+    }
+
+    fn sub(&mut self, insn: &Instruction<Self>) {
+        self.regs[insn.dst] = insn.op1.wrapping_sub(insn.op2);
+    }
+}
+
+struct Disassembler;
+
+impl InstructionHandler for Disassembler {
+    type Ret = String;
+
+    fn add(&mut self, insn: &Instruction<Self>) -> Self::Ret {
+        format!("add {}, {}, {}", insn.dst, insn.op1, insn.op2)
+    }
+
+    fn sub(&mut self, insn: &Instruction<Self>) -> Self::Ret {
+        format!("sub {}, {}, {}", insn.dst, insn.op1, insn.op2)
+    }
+}
+
+fn decode<H: InstructionHandler>(opc: usize) -> Instruction<H> {
+    match opc {
+        0 => Instruction { dst: 1, op1: 222, op2: 42, exec: H::add },
+        1 => Instruction { dst: 3, op1: 110, op2: 23, exec: H::sub },
+        _ => todo!(),
+    }
+}
+
+fn main() {
+    let mut c = Interpreter::default();
+    println!("{:?}", &c);
+
+    decode(0).run(&mut c);
+    decode(1).run(&mut c);
+
+    println!("{}", decode(0).run(&mut Disassembler));
+    println!("{}", decode(1).run(&mut Disassembler));
+
+    println!("{:?}", &c);
+}
+```
+
+The nice part is that the handler is directly attached to the instruction. The
+bad part is that the instruction is tied to one specific handler type. So as
+sketched above, we can't just decode an instruction for the `Interpreter` and
+then __'run'__ it with the `Disassembler`.
+
+Additionally, it would be interesting to compare the generated code and
+benchmark the interpreter loop. But this has to wait until the iss is somewhat
+more mature :^)
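
One possible way around the handler coupling described in the post, sketched here as an assumption rather than as what the iss actually does: keep the decoded instruction free of any handler type and store only an operation index, then look up the function pointer in a small per-handler table at run time. The names `Op` and `run` and the table layout are hypothetical and only serve the illustration. Whether the table lookup is any faster than a `match` is untested.

```rust
// Hypothetical sketch: `Op`, `run` and the table layout are illustrative
// names and not taken from the iss.
#[derive(Clone, Copy)]
enum Op {
    Add = 0,
    Sub = 1,
}

// Decoded instruction that does not mention any handler type, so the same
// value can be cached and passed to different handlers.
#[derive(Clone, Copy)]
struct Instruction {
    op: Op,
    dst: usize,
    op1: u32,
    op2: u32,
}

trait InstructionHandler: Sized {
    type Ret;
    fn add(&mut self, insn: &Instruction) -> Self::Ret;
    fn sub(&mut self, insn: &Instruction) -> Self::Ret;
}

// Select the handler function from a table indexed by the operation.
// The table order must match the discriminants of `Op`.
fn run<H: InstructionHandler>(ctx: &mut H, insn: &Instruction) -> H::Ret {
    let table: [fn(&mut H, &Instruction) -> H::Ret; 2] = [H::add, H::sub];
    (table[insn.op as usize])(ctx, insn)
}

#[derive(Debug, Default)]
struct Interpreter {
    regs: [u32; 4],
}

impl InstructionHandler for Interpreter {
    type Ret = ();

    fn add(&mut self, insn: &Instruction) {
        self.regs[insn.dst] = insn.op1.wrapping_add(insn.op2);
    }

    fn sub(&mut self, insn: &Instruction) {
        self.regs[insn.dst] = insn.op1.wrapping_sub(insn.op2);
    }
}

struct Disassembler;

impl InstructionHandler for Disassembler {
    type Ret = String;

    fn add(&mut self, insn: &Instruction) -> Self::Ret {
        format!("add {}, {}, {}", insn.dst, insn.op1, insn.op2)
    }

    fn sub(&mut self, insn: &Instruction) -> Self::Ret {
        format!("sub {}, {}, {}", insn.dst, insn.op1, insn.op2)
    }
}

// Same toy decoder as in the post, but without a handler type parameter.
fn decode(opc: usize) -> Instruction {
    match opc {
        0 => Instruction { op: Op::Add, dst: 1, op1: 222, op2: 42 },
        1 => Instruction { op: Op::Sub, dst: 3, op1: 110, op2: 23 },
        _ => todo!(),
    }
}

fn main() {
    // Decode once, then run the very same instruction with both handlers.
    let insn = decode(0);

    let mut c = Interpreter::default();
    run(&mut c, &insn);
    println!("{:?}", c);

    println!("{}", run(&mut Disassembler, &insn));
}
```

The trade-off in this sketch is that the function pointer no longer lives in the instruction itself, so each `run` pays for one table index before the indirect call. In exchange, a decoded `Instruction` can be shared between the `Interpreter` and the `Disassembler`.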