suisseWalter
Contributor

At the request of @phsauter, I have implemented a new STA command that reads timing information from a config file.

It extracts the longest and shortest path for each FF, then creates a report whose contents are limited according to the configuration.
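As an illustration of the longest/shortest path extraction described above, here is a minimal sketch, not the code from this PR. It assumes the flattened netlist has already been reduced to a DAG of `(src, dst, delay)` edges; the function name `path_extremes` and the node names are hypothetical.

```python
from collections import defaultdict, deque

def path_extremes(edges, sources):
    """Return {node: (shortest, longest)} arrival delay measured from any source.

    edges: list of (src, dst, delay) tuples describing a DAG (assumed input form).
    sources: start nodes, e.g. flip-flop outputs.
    """
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set(sources)
    for src, dst, delay in edges:
        succ[src].append((dst, delay))
        indeg[dst] += 1
        nodes.update((src, dst))

    arrival = {s: (0.0, 0.0) for s in sources}
    # Kahn-style topological traversal: a node is processed only after all
    # of its predecessors, so its (min, max) arrival is final by then.
    queue = deque(n for n in nodes if indeg[n] == 0)
    while queue:
        node = queue.popleft()
        for dst, delay in succ[node]:
            if node in arrival:  # skip nodes not reachable from a source
                lo, hi = arrival[node]
                cand = (lo + delay, hi + delay)
                old = arrival.get(dst)
                arrival[dst] = cand if old is None else (
                    min(old[0], cand[0]), max(old[1], cand[1]))
            indeg[dst] -= 1
            if indeg[dst] == 0:
                queue.append(dst)
    return arrival

edges = [("ff_a.Q", "n1", 1.0), ("ff_a.Q", "n2", 2.0),
         ("n1", "n2", 0.5), ("n2", "ff_b.D", 1.0)]
print(path_extremes(edges, ["ff_a.Q"])["ff_b.D"])
```

Both extremes are tracked in one pass, which is why each node only needs to be visited once.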

The delays are configured through a modified Liberty file, in which the user can provide their own definition of delay for every cell type. This works for standard cells as well as for any Yosys-internal cells. Early in the flow, cells can be parameterised and can have arbitrary widths, and the config is built for that: it accepts one- or two-dimensional tables that give the timing as a function of the width of the cell. A one-dimensional table is simply indexed by the maximum width. For a two-dimensional table, one also needs to specify which port widths are used as the indices.
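To make the width-indexed tables concrete, the following is a rough sketch of how such a lookup could work. The rounding rule (pick the first breakpoint at or above the width, clamped to the last entry), the reading of "maximum width" as the widest port, and all names are assumptions for illustration; the PR does not spell out these details.

```python
from bisect import bisect_left

def pick(index, width):
    """Position of the first breakpoint >= width, clamped to the last entry."""
    return min(bisect_left(index, width), len(index) - 1)

def lookup_1d(index, values, port_widths):
    """1-D table: indexed by the widest port of the cell (assumed meaning)."""
    return values[pick(index, max(port_widths.values()))]

def lookup_2d(index_a, index_b, values, port_widths, port_a, port_b):
    """2-D table: indexed by the widths of two explicitly named ports."""
    row = pick(index_a, port_widths[port_a])
    col = pick(index_b, port_widths[port_b])
    return values[row][col]

# Hypothetical delay tables for a parameterised cell with ports A, B and Y.
widths = {"A": 8, "B": 12, "Y": 12}
print(lookup_1d([1, 8, 16, 32], [1.0, 4.0, 6.0, 9.0], widths))
print(lookup_2d([8, 16], [8, 16], [[5.0, 7.0], [7.0, 10.0]], widths, "A", "B"))
```

Rounding up to the next breakpoint is a pessimistic choice; interpolating between breakpoints would be an equally plausible design.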

For the output, it is possible to specify minimum and maximum path lengths.
The command prints all paths longer than the maximum path length
and all paths shorter than the minimum path length.
This can be used to extract all setup and hold violations.
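The min/max filtering described above could look roughly like this. It is a sketch only: the function name `find_violations`, the dictionary input, and the choice not to report paths exactly at a limit are assumptions, not the behaviour of the actual command.

```python
def find_violations(path_delays, min_length=None, max_length=None):
    """Split paths into those shorter than min_length and longer than max_length.

    path_delays maps a path name to its total delay. A limit of None disables
    that check. Paths exactly at a limit are not reported.
    """
    too_short = sorted(p for p, d in path_delays.items()
                       if min_length is not None and d < min_length)
    too_long = sorted(p for p, d in path_delays.items()
                      if max_length is not None and d > max_length)
    return too_short, too_long

paths = {"ff_a->ff_b": 3.0, "ff_a->ff_c": 0.4, "ff_b->ff_c": 7.5}
short, long_ = find_violations(paths, min_length=0.5, max_length=5.0)
print("possible hold violations:", short)
print("possible setup violations:", long_)
```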

Because the delays are part of the config file, any definition of delay can be used. I used a nand2 gate as a reference and created configurations for every cell type on that basis, but any other definition would work as well.
I have created the following three definitions:
IHP130 standard cells: ihp130_stdcell.lib.txt
Yosys internal cells (which are used after techmap): nand2_based_internal_delay.lib.txt
Yosys internal cells (before techmap): by_width.lib.txt

This command produces outputs such as this:
large_verbosity.sta.txt
middle_verbositry.sta.txt
no_verbosity.sta.txt

There is currently the following problem with the generated report:
The design needs to be flattened first. This is done internally in the sta2 command, but flattening generates a lot of output, which is printed in the same place as the timing report. This is not ideal, but I have no idea how it could be fixed.

As this is part of a small thesis, I also have a report and some other background work. If needed, I can provide these as well.

There is also a Python generator for the pre-techmap lib file, which uses nand2-equivalent delays. Adjusting it for any other concept of delay wouldn't be a problem. I'm not sure where this should be published, if at all. liberty_generator.py.txt

@rowanG077

rowanG077 commented Aug 11, 2025

It extracts the longest and shortest path for each FF, then creates a report whose contents are limited according to the configuration.

Does this mean the longest and shortest path for each source -> sink pair? If not, that could be a problem, since different FFs can have different setup/hold requirements. For example, the shortest path from an FF to all of its consumer FFs might be a path without a hold violation, while a longer path to some specific FF could still contain a violation.
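A small numeric sketch of this concern, using a deliberately simplified hold model (a violation occurs when the shortest data delay to a sink is below that sink's hold requirement, with clock skew ignored). All names and numbers are made up for illustration.

```python
def hold_violations(min_delay_to_sink, hold_time):
    """Sinks whose shortest incoming delay is below their own hold requirement."""
    return sorted(s for s, d in min_delay_to_sink.items() if d < hold_time[s])

# Shortest path delay from one source FF to each of its consumer FFs.
min_delay_to_sink = {"ff_b": 0.3, "ff_c": 0.5}
# Per-sink hold requirements (hypothetical values).
hold_time = {"ff_b": 0.2, "ff_c": 0.6}

# Checking only the overall shortest path (to ff_b) finds nothing wrong ...
overall = min(min_delay_to_sink, key=min_delay_to_sink.get)
print(overall, min_delay_to_sink[overall] < hold_time[overall])

# ... while the per-pair check reports the longer path to ff_c.
print(hold_violations(min_delay_to_sink, hold_time))
```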

@suisseWalter
Contributor Author

It extracts the longest and shortest path for each FF, then creates a report whose contents are limited according to the configuration.

Does this mean the longest and shortest path for each source -> sink pair? If not, that could be a problem, since different FFs can have different setup/hold requirements. For example, the shortest path from an FF to all of its consumer FFs might be a path without a hold violation, while a longer path to some specific FF could still contain a violation.

Mostly yes. It is possible to define a "delay" property for a FF, which gets added to the path at both ends.
Therefore it is possible to define different delays for different FFs, but it is not possible to distinguish between different constraints for one FF.
This is just a further simplification. I think it is reasonable in the context where this command will most likely be used.
If the design consists of standard cells, it should be possible to use something like OpenSTA to get a fully accurate and more detailed report. OpenSTA is also faster than this command.
If the design does not consist of standard cells, there is probably still potential for optimisation. In that case, the different delays for FFs are not that significant in most cases.

If somebody needs accurate setup and hold constraints for FFs, it should be possible to add them.

@Ravenslofty
Collaborator

May I ask what the problem with the original sta command was?

@Ravenslofty Ravenslofty self-assigned this Aug 13, 2025
@suisseWalter
Contributor Author

suisseWalter commented Aug 13, 2025

The original STA command had, as far as I can tell, three issues:

  • It only counted the number of cells on a path. There was no real notion of time or delay, just the number of hops. This can be sufficient in certain cases, but in many cases it isn't enough information.
  • The design needs to be completely flattened before the command is run.
  • It uses the RTLIL::Module::derive() function, which is only implemented in modules created by the AST frontend (AstModule). There are now alternative frontends, especially for SystemVerilog (for example slang), that do not implement this function.

It might have been possible to change the existing command to accommodate these three things, but keeping it backwards compatible would have been hard. Therefore I implemented it as a separate command; this way people can still use the STA command where it's useful.

@Ravenslofty
Collaborator

The original STA command had, as far as I can tell, three issues:

* It only counted the number of cells on a path. There was no real notion of time or delay, just the number of hops. This can be sufficient in certain cases, but in many cases it isn't enough information.

Did you get sta confused with ltp? sta supports timing information through Yosys' TimingInfo structures that reference $specify2/$specify3 cells inside a module.

* The design needs to be completely flattened before the command is run.

While I personally don't consider this an issue (if you have a hierarchical design, select the section you want to examine as top module, then run synthesis and sta), I accept this is a limitation for big designs.

* It uses the RTLIL::Module::derive() function, which is only implemented in modules created by the AST frontend (AstModule). There are now alternative frontends, especially for SystemVerilog (for example [slang](https://github.com/povik/yosys-slang)), that do not implement this function.

Yes, and this is something we've discussed with them, because deriving modules is used everywhere.

In Verilog specify syntax one can do if (condition) (A => B) = X; and one must derive the module to tell if this condition applies.

It might have been possible to change the existing command to accommodate these three things, but keeping it backwards compatible would have been hard. Therefore I implemented it as a separate command; this way people can still use the STA command where it's useful.

Well, you only have to change one thing now.

To be honest, I think inside this PR are two much smaller PRs: a command to import specify delays, and sta supporting hierarchical designs.

@suisseWalter
Contributor Author

I did not know that the specify2/3 cells exist.
Is there a way to generate them?
I have found the portarcs command, but that only works for purely combinatorial modules without any submodules.

@Ravenslofty
Collaborator

read_verilog -specify on, for example, +/ice40/cells_sim.v will generate them.

More programmatically? Uh, there doesn't appear to be an RTLIL helper for this, likely on the basis that specify rules were never expected to be added after the fact.

So the Verilog front-end generates them directly:

specify_if TOK_LPAREN specify_edge expr TOK_SPECIFY_OPER specify_target TOK_RPAREN TOK_EQ specify_rise_fall TOK_SEMICOL {

I will admit I'm not that familiar with the internals of how this would be implemented, but I hope this is at least a little useful.

@suisseWalter
Contributor Author

The read_verilog -specify option only enables Yosys to read specify cells. But this implies that somebody has to write all of these cells by hand, which is not feasible.
I would expect something that generates them from the cells used by Yosys. (It seems they are part of the PDK in most cases, but sometimes they are just empty in there.)

Because adding more work for the designer is not realistic in my eyes.

@Ravenslofty
Collaborator

The read_verilog -specify option only enables Yosys to read specify cells. But this implies that somebody has to write all of these cells by hand, which is not feasible.

At risk of asking the obvious question: what makes writing specify blocks infeasible, but writing your pseudo-liberty format containing the same information feasible?

I would expect something that generates them from the cells used by Yosys. (It seems they are part of the PDK in most cases, but sometimes they are just empty in there.)

Then you can modify read_liberty to directly read the timing information and produce specify cells.

But the problem that both your pseudo-liberty and Verilog specify formats have is that the actual Liberty timing model is far more complicated than either, depending on things like input slew and output capacitances.
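For context on why the Liberty model is richer: in the common table-based (NLDM) style, a cell's delay is stored as a 2-D table indexed by input transition time (slew) and output load capacitance, and the value for a given operating point is interpolated between table entries. The sketch below shows plain bilinear interpolation over such a table; the function names and the numbers are illustrative assumptions and are not taken from any real library or from Yosys.

```python
from bisect import bisect_right

def _segment(axis, x):
    """Index i so that axis[i]..axis[i+1] brackets x (clamped, so the
    outermost segment is reused for extrapolation beyond the table)."""
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))

def nldm_delay(slews, loads, table, slew, load):
    """Bilinear interpolation in a delay table indexed by (slew, load)."""
    i = _segment(slews, slew)
    j = _segment(loads, load)
    ts = (slew - slews[i]) / (slews[i + 1] - slews[i])
    tl = (load - loads[j]) / (loads[j + 1] - loads[j])
    low = table[i][j] * (1 - tl) + table[i][j + 1] * tl
    high = table[i + 1][j] * (1 - tl) + table[i + 1][j + 1] * tl
    return low * (1 - ts) + high * ts

# Made-up 2x2 table: rows follow input slew, columns follow output load.
slews = [0.0, 1.0]
loads = [0.0, 2.0]
table = [[1.0, 3.0],
         [2.0, 4.0]]
print(nldm_delay(slews, loads, table, slew=0.5, load=1.0))
```

A single fixed delay per arc, as in a specify rule or a width-indexed table, corresponds to collapsing this surface to one number.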

Because adding more work for the designer is not realistic in my eyes.

I agree, to be clear.

@suisseWalter
Contributor Author

At risk of asking the obvious question: what makes writing specify blocks infeasible, but writing your pseudo-liberty format containing the same information feasible?

If we assume that we can write a generic specify block for each Yosys-internal cell, then yes, this is the same amount of effort.
But I can't really see a way to generate a specify cell for a module that contains a submodule. For a fully flattened design, that would work.

One other aspect that is important for PULP is that it should be possible to use the STA command early on in the flow, if possible already while higher-level cells are in use. In that case, the width of the input cells has to be considered. I'm pretty sure that the specify2/3 cells do not support this, because they do not even use the input width parameters anywhere.

Yes, it might be possible to use the specify cells, but the project would probably require:

  • new type of specify cell that can have arbitrary rules using any of the parameters.
  • new parser for this cell type.
  • modify STA to ensure that it can deal with these cells.
  • modify STA to flatten by itself. (should be simple)
  • modify slang to implement the derive() and any other thing that STA uses that is optional.
  • write specify definitions for all the yosys internal cells.

Yes, this might be possible to do, but I see no advantage over using Liberty files, because Liberty files are also used to define timing.

@Ravenslofty
Collaborator

At risk of asking the obvious question: what makes writing specify blocks infeasible, but writing your pseudo-liberty format containing the same information feasible?

If we assume that we can write a generic specify block for each Yosys-internal cell, then yes, this is the same amount of effort. But I can't really see a way to generate a specify cell for a module that contains a submodule. For a fully flattened design, that would work.

And then... how does your pseudo-liberty format somehow succeed where specify cells fail?

One other aspect that is important for PULP is that it should be possible to use the STA command early on in the flow, if possible already while higher-level cells are in use.

Okay, sure, although I think this is not as useful as you think it is. You might as well be printing random numbers.

In that case, the width of the input cells has to be considered. I'm pretty sure that the specify2/3 cells do not support this, because they do not even use the input width parameters anywhere.

Correct, they don't use the input width parameters, because they look at the signal widths instead. So they support this behaviour just fine. It's necessary to support syntax like *>.

Yes, it might be possible to use the specify cells, but the project would probably require:

* new type of specify cell that can have arbitrary rules using any of the parameters.

Based on the above, we can use them directly.

* new parser for this cell type.

N/A.

* modify STA to ensure that it can deal with these cells.

N/A.

* modify STA to flatten by itself. (should be simple)

Sure.

* modify slang to implement the derive() and any other thing that STA uses that is optional.

Yep.

* write specify definitions for all the yosys internal cells.

Yep.

Yes, this might be possible to do, but I see no advantage over using Liberty files, because Liberty files are also used to define timing.

See, here it feels like you haven't read what I said, so I'm going to repeat myself:

But the problem that both your pseudo-liberty and Verilog specify formats have is that the actual Liberty timing model is far more complicated than either, depending on things like input slew and output capacitances.

One has to translate the Liberty timing model into something Yosys can use - specify cells (to avoid confusion, let's refer to the Verilog syntax as "specify blocks") - and either write Verilog specify blocks to do this, or modify read_liberty to import Liberty timing into specify cells. (I'm fine with either approach.)

@widlarizer
Collaborator

Needs more integration with existing functionality in Yosys and deeper design discussions. This would best be done ahead of submitting a new PR, so I'm closing this one.

@widlarizer widlarizer closed this Aug 19, 2025