74 changes: 74 additions & 0 deletions arch/ext/Smctr.yaml
@@ -0,0 +1,74 @@
# yaml-language-server: $schema=../../schemas/ext_schema.json

$schema: "ext_schema.json#"
kind: extension
name: Smctr
long_name: Control Transfer Records
description: |
A method for recording control flow transfer history is valuable not only for performance
profiling but also for debugging.
Control flow transfers refer to jump instructions (including function calls and returns), taken
branch instructions, traps, and trap returns.
Profiling tools, such as Linux perf, collect control transfer history when sampling software
execution, thereby enabling tools, like AutoFDO, to identify hot paths for optimization.

Control flow trace capabilities offer very deep transfer history, but the volume of data produced
can result in significant performance overheads due to memory bandwidth consumption, buffer
management, and decoder overhead.
The Control Transfer Records (CTR) extension provides a method to record a limited history in
register-accessible internal chip storage, with the intent of dramatically reducing the
performance overhead and complexity of collecting transfer history.

CTR defines a circular (FIFO) buffer. Each buffer entry holds a record for a single recorded
control flow transfer.
The number of records that can be held in the buffer depends upon both the implementation (the
maximum supported depth) and the CTR configuration (the software selected depth).

Only qualified transfers are recorded.
Qualified transfers are those that meet the filtering criteria, which include the privilege mode
and the transfer type.

Recorded transfers are inserted at the write pointer, which is then incremented, while older
recorded transfers may be overwritten once the buffer is full.
Or the user can enable RAS (Return Address Stack) emulation mode, where only function calls are
recorded, and function returns pop the last call record.
The source PC, target PC, and some optional metadata (transfer type, elapsed cycles) are stored
for each recorded transfer.

The CTR buffer is accessible through an indirect CSR interface, such that software can specify
which logical entry in the buffer it wishes to read or write.
Logical entry 0 always corresponds to the youngest recorded transfer, followed by entry 1 as the
next youngest, and so on.
params:
CTR_CYCLE_COUNTER:
description: |
An internal counter used to count CPU cycles while CTR is active, where active implies that the
current privilege mode is enabled for recording and CTR is not frozen. This counter is only for
implementations that support cycle counting. It increments at the same rate as the mcycle counter.
This counter is used to populate the CC field of ctrdata when a qualified control transfer occurs.
It resets on writes to xctrctl and on execution of SCTRCLR.
schema:
type: integer
Collaborator

Is this meant to represent the optionality of the CC field in the ctrdata CSR? If so, it seems like this should be a boolean.

Also, it's not clear that this is what the parameter is for, so maybe include a bit more about the purpose of the parameter:

The ctrdata register may optionally include a count of CPU cycles elapsed since the prior CTR record.

Author
@Zain2050, Oct 4, 2025

The spec is a bit vague here. As I understand it, CtrCycleCounter and ctrdata.CC are different: CtrCycleCounter is an independent counter whose value is stored in ctrdata when a control transfer is recorded. From section 11.5.3:

The CC field is encoded such that CCE holds 0 if the CtrCycleCounter value is less than 4096, otherwise it holds the index of the most significant one bit in the CtrCycleCounter value, minus 11.

ctrdata for each entry is just a copy of CtrCycleCounter at the time of control transfer. Also,

The SCTRCLR instruction performs the following operations:
Zeroes all CTR Entry Registers, for all DEPTH values
Zeroes the CTR cycle counter and CCV

Here, zeroing out the entry registers means zeroing out ctrdata as well, but the spec also explicitly says to zero the CTR cycle counter and CCV.
Since it's a counter, its type would be integer, right?
I'll improve the description of the param, as it seems a bit unclear.
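
The CC encoding quoted above can be sketched in Python as an executable model. The CCE rule follows the quoted spec sentence. The CCM bit selection and the decode formula are not quoted in this thread; they reflect my reading of the CTR spec and should be checked against it. The function names are hypothetical, and saturation of the counter at its maximum width is not modeled.

```python
def encode_cc(counter: int) -> tuple[int, int]:
    """Return (CCE, CCM) for a CtrCycleCounter value (illustrative model)."""
    if counter < 4096:
        # Small counts are stored exactly in the 12-bit mantissa.
        return 0, counter
    # CCE is the index of the most significant one bit, minus 11.
    cce = (counter.bit_length() - 1) - 11
    # Assumption: CCM holds the 12 bits just below the implied leading one.
    ccm = (counter >> (cce - 1)) & 0xFFF
    return cce, ccm


def decode_cc(cce: int, ccm: int) -> int:
    """Recover the (possibly rounded-down) cycle count from CCE and CCM."""
    if cce == 0:
        return ccm
    # Assumption: the leading one bit is implied and restored on decode.
    return (4096 + ccm) << (cce - 1)
```

Under this model, values below 8192 round-trip exactly, while larger values lose their low-order bits, which is why ctrdata.CC and the counter itself are distinct pieces of state.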

Collaborator

I'm not following, sorry. The purpose of parameters is for configurations to resolve optionality in the spec. (For example, when the spec says "should", "may", or recommends a behavior, but does not mandate it with "must".) What optionality is being resolved with this parameter?

Author

I guess I might be using parameters the wrong way.
The spec states that SCTRCLR zeroes the CTR cycle counter.

The SCTRCLR instruction performs the following operations:
Zeroes all CTR Entry Registers, for all DEPTH values
Zeroes the CTR cycle counter and CCV

Therefore, I used params to represent the hardware counter and CCV flag. What would be the correct way to represent the counter so that I could zero it out in IDL?

Collaborator

OK, so "CCV" is a field in the ctrdata register. This register is not yet defined in UDB, but needs to be for this work to proceed. It would probably be defined much like other CSRs, but it might be a little weird in that I think it can only be accessed indirectly (siselect + sireg3). And there isn't a single set of CTR registers, it's a buffer of up to 256 records that can be accessed by being mapped to indirect register selections 0x200-0x2ff. And, how many records are actually supported is implementation-dependent (needs a configuration parameter) -- see sctrdepth.

So, it's a little complicated. #554 is a start on defining the extension, but isn't merged, and doesn't have the parameters defined.
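
The indirect mapping described above can be sketched as a small Python model. The 0x200-0x2ff window and the 256-record maximum come from this comment, and the `16 << DEPTH` formula is taken from the loop in this PR's `sctrclr.yaml`. The function and constant names are hypothetical.

```python
CTR_SISELECT_BASE = 0x200  # first indirect selection mapped to a CTR record
CTR_SISELECT_LAST = 0x2FF  # last one, giving up to 256 records


def ctr_entry_index(siselect: int):
    """Return the logical CTR entry selected by siselect, or None if outside the window."""
    if CTR_SISELECT_BASE <= siselect <= CTR_SISELECT_LAST:
        return siselect - CTR_SISELECT_BASE
    return None


def depth_from_field(depth_field: int) -> int:
    """Number of records for a given sctrdepth.DEPTH encoding (16 << DEPTH)."""
    return 16 << depth_field
```

Which DEPTH encodings an implementation actually accepts is the implementation-dependent part, so a configuration parameter would need to constrain the argument of `depth_from_field`.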

Collaborator

The response was essentially that CtrCycleCounter is not directly visible machine state, only indirectly through ctrdata register(s). So, I think this needs to be implemented using a callout to a builtin function, akin to the use of read_mcycle() in cycle.yaml.

Author

Okay. What about clearing out the entry registers? We don't have direct access to those registers. Should I use a builtin for that as well?

Collaborator

At this point, I don't think it's terribly important whether it is represented as a set of (indirect) actual CSRs, or as a set of indirect CSRs that have sw_write() and sw_read() methods that access some builtin memory buffer. The former is pretty straightforward. The latter would require defining a builtin accessor function.

Author

So, my current approach makes sense, right?

Collaborator

If you are talking about configuration parameters, then no. :-)

Flipping through the extension documentation, I see at least these as needed parameters:

  • what DEPTH values are supported
  • whether MISP is supported
  • the optionality of each ctrdata field
  • number of supported CCE bits

For registers, you'll need to use indirect access as noted previously.

For CtrCycleCounter, you should implement as a call to a builtin function.

For CCV, each of those bits is contained in each Control Transfer Record, shadowed by the ctrdata register, accessed indirectly through sireg3 (when siselect = buffer offset/register index), so if you're already setting the register to zero, you have also set CCV to zero.

I hope that makes sense. :-)

CCV_HW:
description: |
An internal hardware flag, which is stored in ctrdata.CCV of the next record in case of a qualified
control transfer.
It is cleared out after a write to xctrctl or execution of SCTRCLR, since CTR_CYCLE_COUNTER is reset.
This flag should additionally be cleared after any other implementation-specific scenarios where
active cycles might not be counted in CTR_CYCLE_COUNTER.
schema:
type: boolean
type: privileged
versions:
- version: "1.0.0"
state: ratified
ratification_date: 2024-11
implies:
name: Ssctr
version: "1.0.0"
requires:
allOf:
- name: S
version: ~> 1.13 # The latest ratified version of S when Smctr was ratified
- name: Smcsrind
version: ~> 1.0
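
The buffer behavior in the Smctr description above (insert at the write pointer, overwrite the oldest record once full, logical entry 0 is always the youngest) can be sketched as a minimal Python model. The class and method names are hypothetical, and RAS emulation mode, filtering, and the ctrdata metadata are deliberately left out.

```python
class CtrBuffer:
    """Illustrative circular buffer of (source_pc, target_pc) records."""

    def __init__(self, depth: int):
        self.depth = depth
        self.entries = [None] * depth  # physical storage
        self.wrptr = 0                 # next physical slot to write

    def record(self, source_pc: int, target_pc: int) -> None:
        # Insert at the write pointer, then advance it; once the buffer
        # has wrapped, this overwrites the oldest record.
        self.entries[self.wrptr] = (source_pc, target_pc)
        self.wrptr = (self.wrptr + 1) % self.depth

    def logical(self, index: int):
        # Logical entry 0 is the slot written most recently, entry 1 the
        # one before it, and so on.
        return self.entries[(self.wrptr - 1 - index) % self.depth]
```

For example, after five transfers are recorded into a depth-4 buffer, `logical(0)` returns the fifth transfer and the first one is no longer reachable.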
19 changes: 19 additions & 0 deletions arch/ext/Ssctr.yaml
@@ -0,0 +1,19 @@
# yaml-language-server: $schema=../../schemas/ext_schema.json

$schema: "ext_schema.json#"
kind: extension
name: Ssctr
long_name: Control Transfer Records
description: |
The supervisor view of `Smctr`.
type: privileged
versions:
- version: "1.0.0"
state: ratified
ratification_date: 2024-11
requires:
allOf:
- name: S
version: ~> 1.13 # The latest ratified version of S when Ssctr was ratified
- name: Sscsrind
version: ~> 1.0
55 changes: 52 additions & 3 deletions arch/inst/Smdbltrp/sctrclr.yaml
@@ -3,10 +3,31 @@
$schema: inst_schema.json#
kind: instruction
name: sctrclr
long_name: No synopsis available.
long_name: Supervisor Control Transfer Record (CTR) clear
description: |
No description available.
definedBy: Smdbltrp
When `mstateen0.CTR`=1, the SCTRCLR instruction performs the following operations:

* Zeroes all CTR Entry Registers, for all DEPTH values
* Resets the optional CTR cycle counter to zero, where implemented
** `ctrdata.CC` and `ctrdata.CCV` bit fields.

Any read of `ctrsource`, `ctrtarget`, or `ctrdata` that follows SCTRCLR, such that it precedes the next
qualified control transfer, will return the value 0.

Further, the first recorded transfer following SCTRCLR will have `ctrdata.CCV`=0.

SCTRCLR execution causes an `IllegalInstruction` exception if:

* `Smctr` is not implemented
* The instruction is executed in S/VS/VU-mode and `Ssctr` is not implemented, or `mstateen0.CTR`=0
* The instruction is executed in U-mode

SCTRCLR execution causes a `VirtualInstruction` exception if `mstateen0.CTR`=1 and:

* The instruction is executed in VS-mode and `hstateen0.CTR`=0
* The instruction is executed in VU-mode
definedBy:
anyOf: [Smctr, Ssctr]
assembly: sctrclr
encoding:
match: "00010000010000000000000001110011"
@@ -18,3 +39,31 @@ access:
vu: always
data_independent_timing: false
operation(): |
if (implemented?(ExtensionName::Smstateen)) {
if (CSR[mstateen0].CTR == 1'b0) {
if (mode() != PrivilegeMode::M) {
raise (ExceptionCode::IllegalInstruction, mode(), $encoding);
}
}
else if (implemented?(ExtensionName::H)) {
if (CSR[hstateen0].CTR == 1'b0 && mode() == PrivilegeMode::VS) {
raise (ExceptionCode::VirtualInstruction, mode(), $encoding);
}
}
}
if (mode() == PrivilegeMode::U) {
raise (ExceptionCode::IllegalInstruction, mode(), $encoding);
}
else if (implemented?(ExtensionName::H) && mode() == PrivilegeMode::VU) {
raise (ExceptionCode::VirtualInstruction, mode(), $encoding);
}
else {
for (U32 i = 0; i < (16 << CSR[sctrdepth].DEPTH); i++) {
CSR[siselect] = (0x200 + i);
CSR[sireg1] = 0;
CSR[sireg2] = 0;
CSR[sireg3] = 0;
}
Comment on lines +61 to +66
Collaborator

Unfortunately, there is no sireg1... it's sireg. :-/

While this should accomplish the goal of clearing the CTR buffer, it has the side effect of changing the value of siselect. That value should probably be saved and restored.

Now I'm wondering, though, if we also need to be concerned with changing the value of siselect at all here, because it would be an observable change during a trap, for example. The only way I can think to avoid that issue would be to bypass the indirect register usage and only manipulate the internal state directly.

So, let's define all of the CTR registers as CSRs. This is in line with how they are described in the spec under "CSR Listing":

[Two images from the spec's CSR listing; not reproduced here]

You can use a "layout" to generate the 256 definitions.

Then, you can manipulate them directly without needing to change the value of siselect.

This is a nice case in support of #1169, but we don't have a means to treat these individual registers as an array. You could also use a layout to initialize/reset them all.
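
The difference between the two approaches discussed here can be sketched with a small Python model. All names are hypothetical, each record is modeled as a `[ctrsource, ctrtarget, ctrdata]` list, and the 256-record maximum is taken from the earlier comments. The point is only that clearing through the indirect window leaves an observable change in siselect, whereas clearing the state directly does not.

```python
from dataclasses import dataclass, field

MAX_RECORDS = 256  # maximum number of CTR records, per the earlier comments


@dataclass
class CtrState:
    siselect: int = 0
    cycle_counter: int = 0
    # Each record is [ctrsource, ctrtarget, ctrdata].
    records: list = field(
        default_factory=lambda: [[0, 0, 0] for _ in range(MAX_RECORDS)]
    )


def clear_via_siselect(state: CtrState, depth: int) -> None:
    """Mimics the loop under review: walks siselect over the current depth."""
    for i in range(depth):
        state.siselect = 0x200 + i  # side effect: siselect is overwritten
        state.records[i] = [0, 0, 0]
    state.cycle_counter = 0


def clear_directly(state: CtrState) -> None:
    """Clears every record without touching siselect."""
    for record in state.records:
        record[:] = [0, 0, 0]
    state.cycle_counter = 0
```

The model also makes a second difference visible: the loop under review only clears records within the current depth, while the instruction description says all entry registers are zeroed "for all DEPTH values".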

CTR_CYCLE_COUNTER = 16'b0;
CCV_HW = 1'b0;
}
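
For comparison with the `operation()` body above, the exception conditions listed in the instruction description can be sketched as a Python decision function. This is a reading of the description text only, not a replacement for the IDL: the function name, the string mode labels, and the order in which conditions are checked are assumptions, and callers are assumed to pass `mstateen_ctr=True` and `hstateen_ctr=True` when the corresponding state-enable register is not implemented.

```python
def sctrclr_exception(mode, smctr, ssctr, mstateen_ctr, hstateen_ctr):
    """Return the exception SCTRCLR would raise, or None if it executes.

    mode is one of "M", "S", "U", "VS", "VU".
    """
    if not smctr:
        return "IllegalInstruction"
    if mode == "M":
        return None
    if mode == "U":
        return "IllegalInstruction"
    # Remaining modes: S, VS, VU.
    if not ssctr or not mstateen_ctr:
        return "IllegalInstruction"
    # From here on, mstateen0.CTR is 1.
    if mode == "VU":
        return "VirtualInstruction"
    if mode == "VS" and not hstateen_ctr:
        return "VirtualInstruction"
    return None
```

One case where this differs from the IDL as written: the description makes S/VS/VU-mode execution illegal when `Ssctr` is not implemented, and the `operation()` body above has no corresponding check.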