Fix/false load-use stall by gating rs1 amd rs2 hazard check #8

jgw0915 · 2025-12-18T17:18:15Z

Fix false load-use hazards by gating rs1/rs2 usage in ID stage

Background

While analyzing control hazard waveforms, I discovered that the current load-use hazard detection logic can spuriously trigger stalls for certain instruction patterns.

Specifically, the control logic in Control.scala compares rd_ex against both rs1_id and rs2_id unconditionally. However, for several instruction types, the bit positions corresponding to rs1 or rs2 do not represent architectural source registers.

Root Cause

For I-type load instructions (e.g., lw a0, 16(a5)), only rs1 is architecturally used.
rs2_id is still wired directly from instruction bits [24:20], which encode imm[4:0] for I-type instructions.
When imm[4:0] happens to equal the destination register of the load in EX (e.g., imm = 16 while rd_ex = x16), the condition
```
rd_ex == rs2_id
```
may incorrectly evaluate to true.
This causes io_pc_stall, io_if_stall, and io_id_flush to assert, inserting an unnecessary bubble even though no real data dependency exists.

This behavior is technically consistent with the existing implementation but reflects a decode-level limitation: the hazard unit does not know whether the ID-stage instruction actually uses rs1 or rs2.
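To make the field overlap concrete, here is a minimal plain-Scala sketch (not the Chisel decoder from this PR). The object and helper names are hypothetical; it only assumes the standard RV32I I-type layout, and shows that slicing bits [24:20] of `lw a0, 16(a5)` yields 16 even though the instruction has no rs2.

```scala
// Plain-Scala model; FieldDecode and its helpers are illustrative names, not from the repo.
object FieldDecode {
  // Assemble a 32-bit I-type instruction word from its fields.
  def encodeIType(imm: Int, rs1: Int, funct3: Int, rd: Int, opcode: Int): Int =
    ((imm & 0xFFF) << 20) | ((rs1 & 0x1F) << 15) | ((funct3 & 0x7) << 12) |
      ((rd & 0x1F) << 7) | (opcode & 0x7F)

  // Raw bit slices, taken regardless of the instruction format.
  def rs1Field(inst: Int): Int = (inst >>> 15) & 0x1F
  def rs2Field(inst: Int): Int = (inst >>> 20) & 0x1F

  // lw a0, 16(a5): imm = 16, rs1 = x15 (a5), funct3 = 0b010, rd = x10 (a0), opcode = LOAD (0x03)
  val lwA0: Int = encodeIType(imm = 16, rs1 = 15, funct3 = 2, rd = 10, opcode = 0x03)
}
```

Here `FieldDecode.rs2Field(FieldDecode.lwA0)` evaluates to 16, so an ungated comparison against rd_ex = x16 would match.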

Fixes in this PR

Explicit rs1 / rs2 usage signals

- Added decode-time signals to indicate whether an instruction uses rs1 and/or rs2
- Identified that jal, auipc, and lui do not use rs1 (the same bit positions encode immediates)
- rs2 is only considered for instruction classes that architecturally use a second source register, including:
  - R-type ALU instructions (e.g., add, sub, and, or)
  - S-type store instructions (e.g., sw)
  - B-type branch instructions (e.g., beq, bne)
- These signals follow the added design hint mentioned in CA25 Exercise 19
- Enables the control logic to reason about architectural source operands correctly
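The classification above could be sketched in plain Scala as follows. This is a software model under assumptions, not the InstructionDecode.scala code: the object name and helpers are hypothetical, the opcode values are the standard RV32I ones, and every opcode not named in the list is conservatively treated as reading rs1.

```scala
// Plain-Scala model; OperandUsage is an illustrative name, not from the repo.
object OperandUsage {
  // RV32I base opcodes (instruction bits [6:0]).
  val Lui = 0x37
  val Auipc = 0x17
  val Jal = 0x6F
  val Op = 0x33      // R-type ALU
  val Store = 0x23   // S-type
  val Branch = 0x63  // B-type

  def opcode(inst: Int): Int = inst & 0x7F

  // Per the list above: jal, auipc, and lui do not read rs1.
  // Assumption: all remaining opcodes are treated as reading rs1.
  def usesRs1(inst: Int): Boolean = !Set(Lui, Auipc, Jal).contains(opcode(inst))

  // Per the list above: only R-type ALU, store, and branch instructions read rs2.
  def usesRs2(inst: Int): Boolean = Set(Op, Store, Branch).contains(opcode(inst))
}
```

For example, `lw a0, 16(a5)` (0x0107A503) gives `usesRs1 = true` and `usesRs2 = false`, while `add x1, x2, x3` (0x003100B3) gives `true` for both.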

Result

- Eliminates false-positive load-use stalls
- Preserves correct handling of genuine hazards
- Improves correctness and precision of control hazard detection
- Waveform behavior after the fix matches architectural expectations and avoids unnecessary pipeline stalls

The complete implementation is available in Control.scala in my GitHub repo.

3-pipeline/src/main/scala/riscv/core/fivestage_final/InstructionDecode.scala

jserv

The pull request adds uses_rs1_id and uses_rs2_id signals and wires them to Control, but Control.scala still has ? placeholders. The hazard detection logic is not modified to use these signals.
Looking at the author's full implementation in their fork, the intended fix is:

```
val hazard_ex_rs1 = io.uses_rs1_id && (io.rd_ex === io.rs1_id)
val hazard_ex_rs2 = io.uses_rs2_id && (io.rd_ex === io.rs2_id)

when(
  ((io.memory_read_enable_ex || io.jump_instruction_id) &&
   (io.rd_ex =/= 0.U) &&
   (hazard_ex_rs1 || hazard_ex_rs2))
  ||
  (io.jump_instruction_id &&
   io.memory_read_enable_mem &&
   (io.rd_mem =/= 0.U) &&
   ((io.uses_rs1_id && (io.rd_mem === io.rs1_id)) ||
    (io.uses_rs2_id && (io.rd_mem === io.rs2_id))))
) { ... }
```

The PR needs to include this Control.scala change or the fix is non-functional.

3-pipeline/src/main/scala/riscv/core/fivestage_final/InstructionDecode.scala

jserv · 2025-12-19T04:07:45Z

Exercise 19 intent: this is part of CA25 Exercise 19. The PR adds the usage signals, which is good guidance, but it should:

- Keep the exercise structure intact (placeholders remain for students)
- Add the signals as "hints" without solving the exercise completely

Since this PR is meant to be a complete fix, the Control.scala hazard logic needs updating.

jserv · 2025-12-19T04:08:47Z

Consider contributing a test case demonstrating that the false stall is eliminated:

```
lw  x16, 16(x0)   # imm[4:0] = 16 = x16
add x1, x2, x3    # rs2 = x3, should NOT stall despite x16 match
```

jgw0915 · 2025-12-24T09:08:47Z

So, in the requested change, I should put my full implementation of Control.scala in this PR (replace placeholders), modify camelCase to snake_case, append missing edge cases, and contribute false stall test cases, right?

jserv · 2025-12-25T01:44:40Z

> So, in the requested change, I should put my full implementation of Control.scala in this PR (replace placeholders), modify camelCase to snake_case, append missing edge cases, and contribute false stall test cases, right?

Yes, that is the purpose of review.

3-pipeline/csrc/Makefile

jgw0915 · 2025-12-25T12:31:06Z

I have updated the test case for the false load-use stall in Section 11 of hazard_extended.S. Section 11 checks mem[0x38] == 3 because the test is designed to detect only the presence of an extra (spurious) stall, not to count absolute clock cycles.

In hazard_extended.S, Section 11 measures a local cycle window using two csrr cycle instructions surrounding two back-to-back lws:

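```
csrr a2, cycle      # start of the measurement window
lw   a6, 0(a5)      # load into a6 (x16)
lw   a7, 16(a5)     # imm[4:0] = 16 matches rd of the previous load
csrr a3, cycle      # end of the measurement window
sub  a3, a3, a2     # a3 = a3 - a2 (elapsed cycles)
```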

Assume the first csrr a2, cycle reads cycle = 1000.

Between the two CSR reads, there are two intervening instructions (lw a6, 0(a5) and lw a7, 16(a5)), which execute in the next two cycles:

after first lw: cycle = 1001
after second lw: cycle = 1002

The second csrr a3, cycle itself executes in the following cycle and therefore reads cycle = 1003.

Thus, the measured delta is:

a3 - a2 = 1003 - 1000 = 3

This value (3) represents the expected baseline when no extra bubble is inserted.

The purpose of this test is to detect whether the hazard unit inserts an additional stall cycle due to a false-positive load-use hazard. If rs2 is incorrectly considered for I-type instructions (i.e., imm[4:0] is treated as rs2), the control logic asserts pc_stall / if_stall / id_flush, inserting one extra bubble. In that case, the second csrr is delayed by one more cycle and reads 1004, yielding:

a3 - a2 = 4

Therefore:

mem[0x38] == 3 ⇒ no false stall (correct behavior)
mem[0x38] == 4 ⇒ one spurious stall inserted (buggy behavior)

Section 11 is thus a regression test that isolates decode-level false hazard detection, rather than a test of absolute cycle timing.
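The arithmetic above can be restated as a small plain-Scala sketch. It is only the idealized model from this comment (one cycle per intervening instruction, one more for the second csrr, plus any bubbles), not a cycle-accurate simulation, and the object and function names are hypothetical.

```scala
// Plain-Scala model; CycleWindow is an illustrative name, not from the repo.
object CycleWindow {
  // Delta between two `csrr cycle` reads: each intervening instruction adds one
  // cycle, the second csrr adds one more, and every inserted bubble adds one.
  def expectedDelta(interveningInstructions: Int, bubbles: Int): Int =
    interveningInstructions + 1 + bubbles

  // Interpretation of the value Section 11 stores at mem[0x38].
  def hasSpuriousStall(measuredDelta: Int): Boolean =
    measuredDelta > expectedDelta(interveningInstructions = 2, bubbles = 0)
}
```

With two intervening lw instructions, `expectedDelta(2, 0)` is 3 and `expectedDelta(2, 1)` is 4, matching the two outcomes listed above.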

Without gating on uses_rs1 and uses_rs2, the observed waveform shows the false stall at 301~305 ps:

```
((io.memory_read_enable_ex || io.jump_instruction_id) &&   // Either:
  // - Jump in ID needs register value, OR
  // - Load in EX (load-use hazard)
  (io.rd_ex =/= 0.U) &&                                    // Destination is not x0
  ((io.rd_ex === io.rs1_id) || (io.rd_ex === io.rs2_id)))  // Destination matches ID source
```

After implementing the gating with uses_rs1 and uses_rs2, the false stall is successfully eliminated in the test case.

```
((io.memory_read_enable_ex || io.jump_instruction_id) &&   // Either:
  // - Jump in ID needs register value, OR
  // - Load in EX (load-use hazard)
  (io.rd_ex =/= 0.U) &&                                    // Destination is not x0
  ((io.uses_rs1_id && (io.rd_ex === io.rs1_id)) ||
   (io.uses_rs2_id && (io.rd_ex === io.rs2_id))))          // Destination matches a source the ID instruction actually uses
```
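For readers without a simulator at hand, the difference between the two conditions can be checked with a plain-Scala software model. This is a sketch, not the Chisel module: the case class and function names are hypothetical, the fields mirror the io.* signals in the snippets, and only the EX-stage term is modelled (the MEM-stage jump term is omitted).

```scala
// Plain-Scala model; LoadUseModel and HazardIn are illustrative names, not from the repo.
object LoadUseModel {
  final case class HazardIn(
    memoryReadEnableEx: Boolean,
    jumpInstructionId: Boolean,
    rdEx: Int,
    rs1Id: Int,
    rs2Id: Int,
    usesRs1Id: Boolean,
    usesRs2Id: Boolean)

  // Original condition: compares the raw register fields unconditionally.
  def stallUngated(h: HazardIn): Boolean =
    (h.memoryReadEnableEx || h.jumpInstructionId) && h.rdEx != 0 &&
      (h.rdEx == h.rs1Id || h.rdEx == h.rs2Id)

  // Gated condition: a field only counts if the ID instruction really reads it.
  def stallGated(h: HazardIn): Boolean =
    (h.memoryReadEnableEx || h.jumpInstructionId) && h.rdEx != 0 &&
      ((h.usesRs1Id && h.rdEx == h.rs1Id) || (h.usesRs2Id && h.rdEx == h.rs2Id))

  // Section 11 pattern: lw a6, 0(a5) in EX (rd = x16) and lw a7, 16(a5) in ID
  // (rs1 = x15, bits [24:20] = 16, rs2 not used).
  val falseHazard: HazardIn = HazardIn(
    memoryReadEnableEx = true, jumpInstructionId = false,
    rdEx = 16, rs1Id = 15, rs2Id = 16,
    usesRs1Id = true, usesRs2Id = false)

  // Same register numbers, but the ID instruction genuinely reads rs2 = x16.
  val trueHazard: HazardIn = falseHazard.copy(usesRs2Id = true)
}
```

In this model `stallUngated(falseHazard)` is true while `stallGated(falseHazard)` is false, and `stallGated(trueHazard)` remains true, so the genuine load-use stall is preserved.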

3-pipeline/src/main/scala/riscv/core/fivestage_final/Control.scala

jserv

Run 'git rebase -i' to squash commits and enforce the rules described in https://cbea.ms/git-commit/ .

Read the above carefully!

jgw0915 · 2025-12-26T02:59:36Z

My apologies, I thought the request was intended to shorten the commit message. Instead, I should squash the commits into one or a few meaningful commits, right?

jserv · 2025-12-26T03:01:27Z

> My apologies, I thought the request was intended to shorten the commit message. Instead, I should squash the commits into one or a few meaningful commits, right?

Don't repeat my words.

Load-use detection compared rd_ex against rs1_id/rs2_id without checking whether the instruction actually consumes those operands. For I-type loads, rs2 encodes imm[4:0]; when it matched rd_ex the control unit stalled PC/IF and flushed ID, inserting a bubble even though rs2 was not a real source register. Decode previously exposed raw rs1/rs2 bits, so the hazard logic could not distinguish immediates from architectural sources.

Add uses_rs1/uses_rs2 from decode and gate hazard checks on these flags. Also incorporate mem-read and jump enables to avoid false positives while preserving true hazards.

Section 11 of hazard_extended.S now reports mem[0x38] == 3 (delta 3), rather than 4, indicating the extra bubble is removed.

jserv · 2025-12-26T05:45:39Z

Thank @jgw0915 for contributing!

jserv reviewed Dec 19, 2025

jserv requested changes Dec 19, 2025

jserv reviewed Dec 19, 2025

jserv changed the title ~~Fix/false load-use stall by gating rs1 amd rs2 harzard check~~ Fix/false load-use stall by gating rs1 amd rs2 hazard check Dec 19, 2025

jgw0915 force-pushed the Control-Hazard-Logic branch from b7ace8c to 0b2c4ff Compare December 24, 2025 09:36

jgw0915 force-pushed the Control-Hazard-Logic branch from d148958 to bbd344d Compare December 25, 2025 09:30

jserv reviewed Dec 25, 2025

jgw0915 force-pushed the Control-Hazard-Logic branch 3 times, most recently from a6d445e to 81839a6 Compare December 25, 2025 10:19

jgw0915 requested a review from jserv December 25, 2025 12:52

jserv reviewed Dec 25, 2025

jgw0915 force-pushed the Control-Hazard-Logic branch from 81839a6 to e49f798 Compare December 25, 2025 15:33

jgw0915 requested a review from jserv December 25, 2025 15:34

jgw0915 force-pushed the Control-Hazard-Logic branch from e49f798 to 8ffca19 Compare December 26, 2025 02:07

jgw0915 requested a review from jserv December 26, 2025 02:08

jserv requested changes Dec 26, 2025

jgw0915 force-pushed the Control-Hazard-Logic branch from 8ffca19 to c07e80c Compare December 26, 2025 03:18

jgw0915 requested a review from jserv December 26, 2025 03:18

jgw0915 force-pushed the Control-Hazard-Logic branch from c07e80c to 5be6751 Compare December 26, 2025 04:16

jgw0915 requested a review from jserv December 26, 2025 04:18

jserv merged commit 3ca52d2 into sysprog21:main Dec 26, 2025
