-
Notifications
You must be signed in to change notification settings - Fork 654
Add Zcb and Zcmp extensions #2324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
This is required to ensure we can trace all expanded instructions but also already advance the PC on the last expanded instruction
Tracking also the original instruction, not only the micro-op, allows the DV to track whether and which instruction we are expanding
The tracer usually only sees the instructions that reach the ID stage. Since the Zcmp instructions are expanded in the IF stage, they will be traced as their micro-ops. This adds information in the trace from which expanded instruction those micro-ops come from.
The handshake only considered whether the ID stage would be ready. But the actual pipeline register will also take the `pc_set_i` signal into account, which signals a jump. Since the compressed decoder has state now (through the Zcmp extension), this improper handshake led to some of the expanded instructions to get lost. At the same time, we also take this signal into account for the enable signal of the pipeline stage to avoid unnecessary switching.
Add a function and state to process multiple Ibex instructions that are modeled as a single instruction in riscv-isa-sim. The current function checks that the same registers are written overall.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of nitty comments on the draft (sorry, but I'd started reading so thought I may as well review it properly...)
I really like this though! Thank you all so much for the improvement.
rtl/ibex_compressed_decoder.sv
Outdated
2'b01, instr_i[9:7], {OPCODE_OP}}; | ||
end | ||
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Boring minor note... but you'll probably want to go through and drop trailing whitespace when this is ready to go.
|
||
function automatic logic [6:0] cm_stack_adj_base(input logic [3:0] rlist); | ||
unique case (rlist) | ||
// Deliberately not written as `case .. inside` because that is not supported by all tools. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks perfectly sensible, but (Rupert is interested...) is this better than a chain of if/else branches? That would avoid you having to e.g. list the numbers between 8 and 12...
return rlist; | ||
endfunction | ||
|
||
typedef enum logic [3:0] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent a while thinking about this, because there's no reason that e.g. CmPushStoreReg
and CmPopLoadReg
need to be different values (since the selection of "sub-FSM" depends on the instruction bits already).
So you could make things smaller (dropping a bit?) by making this a genuine counter ("sub phase"). But that's sad because you lose the explicit enum names. Squashing things up and keeping the names would (I think) need several different enums and a union type. But that's complicated. I'm very unconvinced it's worth the effort!
But maybe it's worth leaving a comment above the definition that explains what the enum is used for? I'd also consider splitting the list into blocks for the different instructions and leaving a mini comment above each?
rtl/ibex_compressed_decoder.sv
Outdated
// More registers have to be stored. | ||
// Remove the current register from `rlist`. | ||
cm_rlist_d -= 5'd1; | ||
// Initialize SP offset to 2. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure this is right, but I haven't yet figured out how it works! Can you extend the comment slightly to explain why we start at offset 2 but will increment by just one later?
rtl/ibex_compressed_decoder.sv
Outdated
end else begin | ||
// More registers have to be loaded. | ||
// Remove the current register from `rlist` and decrement the SP offset. | ||
cm_rlist_d -= 5'd1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This took me a couple of minutes to understand because I hadn't twigged that cm_push_store_reg
subtracts its offset from a top address: I was struggling to work out why things didn't end up getting reversed!
For other readers who are equally dim, I wonder whether it would make sense to explicitly mention the stored register layout before the case switches for push and pop?
| | | stall characteristics (see above). | | ||
+-----------------------+--------------------------------------+-------------------------------------------------------------+ | ||
| Zcmp Push/Pop | 2 - N | The cm.push/pop instructions as defined in 'Zcmp' of the | | ||
| | | RISC-V specification. Internally they expand to multiple | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: To go in this order, I think the sentence needs a comma: "Internally, they expand ...". (Similarly for the move box below)
PC_BP | ||
} pc_sel_e; | ||
|
||
// Compressed instruction expansion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha! This matches a comment I had on an earlier commit. Maybe we should just "promote" this change to there :-)
.rvfi_mem_rdata, | ||
.rvfi_mem_wdata | ||
.rvfi_mem_wdata, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stray newline?
uint32_t dm_end_addr) | ||
: nmi_mode(false), pending_iside_error(false), insn_cnt(0) { | ||
: nmi_mode(false), pending_iside_error(false), insn_cnt(0), | ||
pending_expanded_insn(0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think the indentation is missing a couple of spaces here.
// does not expand instructions, so we have to consume a single Spike step | ||
// over multiple steps of Ibex. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it makes sense to explicitly say that we're stepping spike at the last step in Ibex?
This PR adds support for the Zcb and Zcmp code-saving extensions. This support is implemented in a parameterizable way via the
RV32ZC
parameter, which allows choosing none, either, or both extensions. By default, both extensions are enabled because their combined hardware overhead is small, approximately 800 gate equivalents.The Zcb extension introduces new compressed encodings for common instructions already supported in Ibex. The Zcmp extension introduces single compressed instructions that expand into multiple existing instructions within Ibex. For example, a stack push expands into multiple store instructions and a stack pointer update. Therefore, both extensions are implemented in the IF stage.
Adding these new instructions requires updates to the following components:
riscv-isa-sim
(Spike): Updated Spike to support those new extensions Add the Zc* extensions riscv-isa-sim#27riscv-dv
: Add both extensions to riscv-dv. We can either try to upstream those or create a fork. I didn't open a PR yet, but the diff is here: chipsalliance/riscv-dv@master...SamuelRiedel:riscv-dv:zcbzcmpMissing Components and Known Issues: