+++
title = "rv64 iss quick thought instruction handler"

[taxonomies]
tags = ["rust", "iss", "riscv"]
+++

This post was mainly written as a future reference and to capture a quick
thought that came up while hacking on my riscv64 instruction set simulator
(iss).

At the time of writing, the structure of the decoder and the instruction
interpreter looked something like the following.

```rust
enum Insn {
    Add { rd: Register, rs1: Register, rs2: Register },
    Sub { rd: Register, rs1: Register, rs2: Register },
    // ..
}

fn decode(insn: u32) -> Insn {
    match insn & 0x7f {
        0 => {
            // decode insn
            Add { .. }
        }
        1 => {
            // decode insn
            Sub { .. }
        }
        // ..
    }
}

fn interp() {
    // read insn from mem
    match decode(insn) {
        Add { .. } => // interpret add instruction
        Sub { .. } => // interpret sub instruction
        // ..
    }
}

fn disasm(insn: &Insn) {
    match insn {
        Add { .. } => // format disasm for add
        Sub { .. } => // format disasm for sub
        // ..
    }
}
```

While sitting there and losing myself in some future enhancements that I would
like to add at some point, I thought about decode caching and getting rid of
that huge `match` in the critical path of the interpreter loop.

The following came to my mind, which I just wanted to capture here in this post
for some later time.

```rust
// One method per instruction. Each handler picks its own return type.
trait InstructionHandler: Sized {
    type Ret;

    fn add(&mut self, insn: &Instruction<Self>) -> Self::Ret;
    fn sub(&mut self, insn: &Instruction<Self>) -> Self::Ret;
}

// A decoded instruction that carries a pointer to its handler method.
struct Instruction<H: InstructionHandler> {
    dst: usize,
    op1: u32,
    op2: u32,
    exec: fn(&mut H, &Self) -> H::Ret,
}

impl<H: InstructionHandler> Instruction<H> {
    // Dispatch through the stored function pointer, no match needed.
    fn run(&self, ctx: &mut H) -> H::Ret {
        (self.exec)(ctx, self)
    }
}

#[derive(Debug, Default)]
struct Interpreter {
    regs: [u32; 4],
}

impl InstructionHandler for Interpreter {
    type Ret = ();

    fn add(&mut self, insn: &Instruction<Self>) {
        self.regs[insn.dst] = insn.op1 + insn.op2;
    }
    fn sub(&mut self, insn: &Instruction<Self>) {
        self.regs[insn.dst] = insn.op1 - insn.op2;
    }
}

struct Disassembler;

impl InstructionHandler for Disassembler {
    type Ret = String;

    fn add(&mut self, insn: &Instruction<Self>) -> Self::Ret {
        format!("add {}, {}, {}", insn.dst, insn.op1, insn.op2)
    }
    fn sub(&mut self, insn: &Instruction<Self>) -> Self::Ret {
        format!("sub {}, {}, {}", insn.dst, insn.op1, insn.op2)
    }
}

// The handler type H is fixed at decode time via the `exec` pointer.
fn decode<H: InstructionHandler>(opc: usize) -> Instruction<H> {
    match opc {
        0 => Instruction { dst: 1, op1: 222, op2: 42, exec: H::add },
        1 => Instruction { dst: 3, op1: 110, op2: 23, exec: H::sub },
        _ => todo!(),
    }
}

fn main() {
    let mut c = Interpreter::default();
    println!("{:?}", &c);

    // H is inferred as Interpreter from the argument to run().
    decode(0).run(&mut c);
    decode(1).run(&mut c);

    // H is inferred as Disassembler here.
    println!("{}", decode(0).run(&mut Disassembler));
    println!("{}", decode(1).run(&mut Disassembler));

    println!("{:?}", &c);
}
```

The nice part is that the handler is directly attached to the instruction. The
bad part is that the instruction is tied to one specific handler type, because
`H` is baked into `Instruction<H>` when it is decoded. So, as sketched above, we
can't just decode an instruction for the `Interpreter` and then __'run'__ it
with the `Disassembler`.

Additionally, it would be interesting to compare the generated code against the
`match`-based version and benchmark the interpreter loop. But this has to wait
until the iss is somewhat more mature :^)
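As a note for that later experiment, the decode caching idea mentioned above
could be layered on top of the handler-attached instruction roughly like this.
This is only a minimal sketch under a few assumptions: `DecodeCache`, `step` and
`run` are hypothetical names that do not exist in the iss, the cache is a plain
`HashMap` keyed by the pc, memory is a slice of 32-bit words, and the toy
`decode` treats each word as the opcode directly. Invalidation when code memory
is written is ignored completely.

```rust
use std::collections::HashMap;

// Same handler / instruction shape as in the post, reduced to the minimum.
trait InstructionHandler: Sized {
    type Ret;
    fn add(&mut self, insn: &Instruction<Self>) -> Self::Ret;
    fn sub(&mut self, insn: &Instruction<Self>) -> Self::Ret;
}

struct Instruction<H: InstructionHandler> {
    dst: usize,
    op1: u32,
    op2: u32,
    exec: fn(&mut H, &Self) -> H::Ret,
}

// Toy decoder (assumption): the memory word is used as the opcode.
fn decode<H: InstructionHandler>(opc: u32) -> Instruction<H> {
    match opc {
        0 => Instruction { dst: 1, op1: 222, op2: 42, exec: H::add },
        1 => Instruction { dst: 3, op1: 110, op2: 23, exec: H::sub },
        _ => todo!(),
    }
}

#[derive(Debug, Default)]
struct Interpreter {
    regs: [u32; 4],
}

impl InstructionHandler for Interpreter {
    type Ret = ();
    fn add(&mut self, insn: &Instruction<Self>) {
        self.regs[insn.dst] = insn.op1 + insn.op2;
    }
    fn sub(&mut self, insn: &Instruction<Self>) {
        self.regs[insn.dst] = insn.op1 - insn.op2;
    }
}

// Hypothetical decode cache: pc -> already decoded instruction.
struct DecodeCache<H: InstructionHandler> {
    insns: HashMap<u64, Instruction<H>>,
    misses: usize, // how often we actually had to decode
}

impl<H: InstructionHandler> DecodeCache<H> {
    fn new() -> Self {
        Self { insns: HashMap::new(), misses: 0 }
    }

    // Execute the instruction at `pc`, decoding it only on a cache miss.
    fn step(&mut self, ctx: &mut H, pc: u64, mem: &[u32]) -> H::Ret {
        let misses = &mut self.misses;
        let insn = self.insns.entry(pc).or_insert_with(|| {
            *misses += 1;
            decode(mem[(pc / 4) as usize])
        });
        // On a hit there is no match at all, just the indirect call.
        (insn.exec)(ctx, insn)
    }
}

// Run the whole program `rounds` times, return the final state and the
// number of decodes that were needed.
fn run(mem: &[u32], rounds: usize) -> (Interpreter, usize) {
    let mut cpu = Interpreter::default();
    let mut cache = DecodeCache::new();
    for _ in 0..rounds {
        for idx in 0..mem.len() {
            cache.step(&mut cpu, idx as u64 * 4, mem);
        }
    }
    (cpu, cache.misses)
}
```

Calling `run(&[0, 1], 3)` executes six instructions but decodes only two, since
each pc is decoded once and then served from the cache. Because the cache stores
`Instruction<H>`, it inherits the limitation from above: a cache filled for the
`Interpreter` can't be reused by the `Disassembler`.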