Skip to content

Conversation

SamuelRiedel
Copy link
Contributor

This PR adds support for the Zcb and Zcmp code-saving extensions. This support is implemented in a parameterizable way via the RV32ZC parameter, which allows choosing none, either, or both extensions. By default, both extensions are enabled because their combined hardware overhead is small, approximately 800 gate equivalents.

The Zcb extension introduces new compressed encodings for common instructions already supported in Ibex. The Zcmp extension introduces single compressed instructions that expand into multiple existing instructions within Ibex. For example, a stack push expands into multiple store instructions and a stack pointer update. Therefore, both extensions are implemented in the IF stage.

Adding these new instructions requires updates to the following components:

  • Ibex RTL: Implemented in this PR
  • Documentation: Updated in this PR
  • Verification: Implemented in this PR as well. Some missing parts still (see at the bottom)
    • riscv-isa-sim (Spike): Updated Spike to support those new extensions Add the Zc* extensions riscv-isa-sim#27
    • riscv-dv: Add both extensions to riscv-dv. We can either try to upstream those or create a fork. I didn't open a PR yet, but the diff is here: chipsalliance/riscv-dv@master...SamuelRiedel:riscv-dv:zcbzcmp
      • Support for Zcb extension
      • Support for the Zcmp's CMMV instructions
      • Support for Zcmp's CMPP instructions
    • Ibex co-simulation: Updated in this PR. Requires changes to get Zcmp's CMPP to work fully.
      • Support for Zcb extension
      • Support for the Zcmp's CMMV instructions
      • Support for Zcmp's CMPP instructions
  • Compiler Support: We need to update the compiler to a version that supports the Zcb and Zcmp instructions. Currently, we are still on an rather old version. The latest RISC-V GCC and Clang support those instructions already.

Missing Components and Known Issues:

  • The verification for the Zcmp's CMPP instructions (cm.push, cm.pop(ret(z))) is not fully working yet. The primary issue lies in the cosimulation interface with Spike. Checking multiple memory accesses that occur within a single step causes riscv-isa-sim to throw an error.

Elias Christen and others added 12 commits October 11, 2025 01:35
This is required to ensure we can trace all expanded instructions but
also already advance the PC on the last expanded instruction
Tracking also the original instruction, not only the micro-op, allows
the DV to track whether and which instruction we are expanding
The tracer usually only sees the instructions that reach the ID stage.
Since the Zcmp instructions are expanded in the IF stage, they will be
traced as their micro-ops. This adds information in the trace from which
expanded instruction those micro-ops come from.
The handshake only considered whether the ID stage would be ready. But
the actual pipeline register will also take the `pc_set_i` signal into
account, which signals a jump. Since the compressed decoder has state
now (through the Zcmp extension), this improper handshake led to some of
the expanded instructions to get lost.

At the same time, we also take this signal into account for the enable
signal of the pipeline stage to avoid unnecessary switching.
Add a function and state to process multiple Ibex instructions that are
modeled as a single instruction in riscv-isa-sim. The current function
checks that the same registers are written overall.
Copy link
Contributor

@rswarbrick rswarbrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of nitty comments on the draft (sorry, but I'd started reading so thought I may as well review it properly...)

I really like this though! Thank you all so much for the improvement.

2'b01, instr_i[9:7], {OPCODE_OP}};
end

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Boring minor note... but you'll probably want to go through and drop trailing whitespace when this is ready to go.


function automatic logic [6:0] cm_stack_adj_base(input logic [3:0] rlist);
unique case (rlist)
// Deliberately not written as `case .. inside` because that is not supported by all tools.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks perfectly sensible, but (Rupert is interested...) is this better than a chain of if/else branches? That would avoid you having to e.g. list the numbers between 8 and 12...

return rlist;
endfunction

typedef enum logic [3:0] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent a while thinking about this, because there's no reason that e.g. CmPushStoreReg and CmPopLoadReg need to be different values (since the selection of "sub-FSM" depends on the instruction bits already).

So you could make things smaller (dropping a bit?) by making this a genuine counter ("sub phase"). But that's sad because you lose the explicit enum names. Squashing things up and keeping the names would (I think) need several different enums and a union type. But that's complicated. I'm very unconvinced it's worth the effort!

But maybe it's worth leaving a comment above the definition that explains what the enum is used for? I'd also consider splitting the list into blocks for the different instructions and leaving a mini comment above each?

// More registers have to be stored.
// Remove the current register from `rlist`.
cm_rlist_d -= 5'd1;
// Initialize SP offset to 2.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure this is right, but I haven't yet figured out how it works! Can you extend the comment slightly to explain why we start at offset 2 but will increment by just one later?

end else begin
// More registers have to be loaded.
// Remove the current register from `rlist` and decrement the SP offset.
cm_rlist_d -= 5'd1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This took me a couple of minutes to understand because I hadn't twigged that cm_push_store_reg subtracts its offset from a top address: I was struggling to work out why things didn't end up getting reversed!

For other readers who are equally dim, I wonder whether it would make sense to explicitly mention the stored register layout before the case switches for push and pop?

| | | stall characteristics (see above). |
+-----------------------+--------------------------------------+-------------------------------------------------------------+
| Zcmp Push/Pop | 2 - N | The cm.push/pop instructions as defined in 'Zcmp' of the |
| | | RISC-V specification. Internally they expand to multiple |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: To go in this order, I think the sentence needs a comma: "Internally, they expand ...". (Similarly for the move box below)

PC_BP
} pc_sel_e;

// Compressed instruction expansion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha! This matches a comment I had on an earlier commit. Maybe we should just "promote" this change to there :-)

.rvfi_mem_rdata,
.rvfi_mem_wdata
.rvfi_mem_wdata,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stray newline?

uint32_t dm_end_addr)
: nmi_mode(false), pending_iside_error(false), insn_cnt(0) {
: nmi_mode(false), pending_iside_error(false), insn_cnt(0),
pending_expanded_insn(0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think the indentation is missing a couple of spaces here.

Comment on lines +220 to +221
// does not expand instructions, so we have to consume a single Spike step
// over multiple steps of Ibex.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it makes sense to explicitly say that we're stepping spike at the last step in Ibex?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants