Skip to content

Conversation

rocallahan
Copy link
Contributor

Format string checking happens at compile time if -std=c++20 (or greater) is enabled. Otherwise the checking happens at run time --- invalid format strings will result in a call to abort().

Compile-time format string checking requires the format string to be a compile-time constant, so fix a few places where that isn't true.

What are the reasons/motivation for this change?

See #5210. By replacing stringf() and then log() and related varargs functions with variadic template implementations, and adding direct support for C++ types such as std::string, we will be able to clean up some pretty ugly code and avoid some space leaks.

@rocallahan
Copy link
Contributor Author

A followup commit to extend this to log() is here if you want to see what that will look like.

For the sake of keeping this PR simple I've done the bare minimum to get things working and support std::string. There's a lot more that could be done. Runtime performance could be improved in various ways, e.g. by avoiding the calls to vstringf() for common conversions like %s and %d. A lot of .c_str() calls can be removed after this PR.

@KrystalDelusion
Copy link
Member

A lot of .c_str() calls can be removed after this PR.

👀

@rocallahan rocallahan force-pushed the typed-stringf branch 2 times, most recently from c8887f6 to 6a04722 Compare July 11, 2025 04:36
@rocallahan
Copy link
Contributor Author

The build is failing in VerificRewriter.cpp. I don't have that code so I guess someone who does will have to help with that. Maybe there is a declaration of stringf() with its old signature copied into a file somewhere?

@rocallahan
Copy link
Contributor Author

Also the Visual Studio C++ build is failing. I'll see if I can guess the problems and fix them...

@KrystalDelusion
Copy link
Member

You'll probably need to deprecate the stringf usage, as discussed in #5211 (comment); even ignoring that the Verific builds are failing, just removing it outright has a pretty good chance of breaking a lot of Yosys plugins.

@rocallahan
Copy link
Contributor Author

What compatibility expectations/guarantees are you offering? Are they documented anywhere?

Are you promising that Yosys maintains ABI compatibility, so that existing binary builds of Yosys-using code continue to work with future versions of Yosys?

If you require that people upgrading Yosys recompile their Yosys-using code, then this change should probably be OK. As you can see, only a very small number of the existing uses of stringf() had to be changed to compile with the new signature.

@rocallahan
Copy link
Contributor Author

The Windows build failure was because the Win32 header files helpfully #define NONE to something :-(. Fixed that by using FMTSPEC_ identifiers instead of enum class.

@KrystalDelusion
Copy link
Member

What compatibility expectations/guarantees are you offering? Are they documented anywhere?

Are you promising that Yosys maintains ABI compatibility, so that existing binary builds of Yosys-using code continue to work with future versions of Yosys?

If you require that people upgrading Yosys recompile their Yosys-using code, then this change should probably be OK. As you can see, only a very small number of the existing uses of stringf() had to be changed to compile with the new signature.

I don't think we provide any explicit guarantees, certainly not ABI compatibility. I know I do tend towards more verbosity than many of the other maintainers, but I think we generally deprecate public methods for a release version so that downstream developers get a deprecation warning that informs them of any changes necessary. Though in this case I'm not sure exactly how that applies since there is still a stringf method just not exactly the same. Unfortunately our resident Verific maintainer is on leave for the next few weeks so there might be some delay in checking the verific builds to try fix the failures there.

@rocallahan
Copy link
Contributor Author

in this case I'm not sure exactly how that applies since there is still a stringf method just not exactly the same.

Yeah, the intent here is to be 100% compatible with stringf at the source level. The only stringf calls I had to change in Yosys were where the format string was non-constant, and that only matters when overriding the default of C++17 to compile with C++20 or later. So I hope we can do this.

Unfortunately our resident Verific maintainer is on leave for the next few weeks so there might be some delay in checking the verific builds to try fix the failures there.

I guess there's some prebuilt library that links to Yosys symbols that needs to be rebuilt every time Yosys changes ABI?

@nakengelhardt nakengelhardt added the merge-before-release Merge: PR should be included in the next release label Jul 14, 2025
@nakengelhardt
Copy link
Member

Ah, this will be a little tricky to merge - there's some proprietary code (for our rewriters that manipulate the verific AST) that's pre-compiled and manually uploaded to our private CI runner each month along with the verific binaries, and this code is also making use of stringf. This means that we need to time merging this PR with uploading the new verific archive, since otherwise all CI verific builds will start failing. And as mentioned, the person who maintains the verific build and the CI server just left on holiday, so it'll most likely be about two weeks until we can merge this. I've added the label to make sure we'll catch it then.

@rocallahan
Copy link
Contributor Author

@nakengelhardt I have some other changes that reuse this code for log() and other variadic print functions. Is there a way for us to coordinate things so they all land in the same merge window? How would you like me to do that --- submit multiple PRs, attach more commits to this PR, something else? Thanks!!

@widlarizer
Copy link
Collaborator

I like making split PRs, with the disclaimer that until a dependency PR is merged, the diff will look messy, as some of the commits aren't part of the discussed change. That could be a good move here as well but requested changes then need awkward rebasing

FYI we have public development meetings on mondays 15:00 Central European (Summer) Time which are a good opportunity to discuss hairier contributions like this. I will unfortunately miss the next one as I'm away next week though

@rocallahan
Copy link
Contributor Author

FYI we have public development meetings on mondays 15:00 Central European (Summer) Time which are a good opportunity to discuss hairier contributions like this.

I will try to join.

@widlarizer
Copy link
Collaborator

Unfortunately this has a 3.4% performance regression when synthesizing the OpenROAD flow scripts jpeg example design that I use as a perf smoke test. The performance impact is spread out over commands that print things. As an isolated example, reading and dumping any RTLIL file should do

$ hyperfine "./uut/main-lto/bin/yosys $cmd"
Benchmark 1: ./uut/main-lto/bin/yosys -dp "read_rtlil garbage/jpeg.synth.il; dump"
  Time (mean ± σ):      63.8 ms ±   2.3 ms    [User: 51.8 ms, System: 11.6 ms]
  Range (min … max):    61.3 ms …  69.4 ms    42 runs

$ hyperfine "./uut/typed-stringf/bin/yosys $cmd"
Benchmark 1: ./uut/typed-stringf/bin/yosys -dp "read_rtlil garbage/jpeg.synth.il; dump"
  Time (mean ± σ):      67.2 ms ±   2.0 ms    [User: 55.1 ms, System: 11.7 ms]
  Range (min … max):    63.9 ms …  74.4 ms    39 runs

$ ./uut/main-lto/bin/yosys --version
Yosys 0.55+109 (git sha1 2223d7848, ccache clang++ 16.0.6 -fPIC -O3)
$ ./uut/typed-stringf/bin/yosys --version
Yosys 0.55+109 (git sha1 6ea8281fb, ccache clang++ 16.0.6 -fPIC -O3)
$ echo $cmd
-dp "read_rtlil garbage/jpeg.synth.il; dump"

jpeg.synth.il.txt

Sorry I only got to seriously playing with it now but I suppose it means we'll push this and the improvements that build on top of it into the next release cycle

@rocallahan
Copy link
Contributor Author

Sorry I only got to seriously playing with it now but I suppose it means we'll push this and the improvements that build on top of it into the next release cycle

OK. Thanks for looking at it anyway.

I dug into performance and it was reasonably easy to make improvements. Probably the most important one was to add a special case for %s on const char* arguments that doesn't go through vsprintf(), but I also added a few other tweaks like recording whether we hit the common case of no %% so we can accelerate processing the constant parts of the format, and fiddling with the code to make the fast paths we want the compiler to inline as small/simple as possible so it actually does inline them. With those changes, I see this PR now improving performance: (clang 19 -O3 on Linux):

hyperfine './yosys -dp "read_rtlil /usr/local/google/home/rocallahan/Downloads/jpeg.synth.il; dump"'

C++17 before: Time (mean ± σ): 101.3 ms ± 0.8 ms [User: 85.6 ms, System: 15.6 ms]
C++17 after: Time (mean ± σ): 98.4 ms ± 1.2 ms [User: 82.1 ms, System: 16.1 ms]
C++20 before: Time (mean ± σ): 100.9 ms ± 1.1 ms [User: 87.0 ms, System: 13.8 ms]
C++20 after: Time (mean ± σ): 97.8 ms ± 1.4 ms [User: 83.1 ms, System: 14.7 ms]

@rocallahan
Copy link
Contributor Author

One thing I should point out, which is now mentioned in the commit message, is that this makes format string processing more lenient. E.g. passing a long long to %d prints the full value without bit truncation. In particular doing something like passing an int to %hd will not truncate if that's desired. I had a quick scan of the codebase and couldn't see anywhere obviously caring about this, but it's hard to tell.

Applying strict bitwidth truncation could be done but it would require more code.

One other thing about performance: this PR really does open up more possibilities for improvements. For example everywhere that currently calls stringf() passing std::strings with .c_str() is a bit silly, since .c_str() forgets the string length which means we then have to check for null bytes in a loop. Incrementally removing those .c_str()s will not only clean up the code but also make things a bit more efficient since we can use the string length we already have.

… format string checking.

Checking only happens at compile time if -std=c++20 (or greater) is enabled. Otherwise
the checking happens at run time.

This requires the format string to be a compile-time constant (when compiling with
C++20), so fix a few places where that isn't true.

The format string behavior is a bit more lenient than C printf. For %d/%u
you can pass any integer type and it will be converted and output without
truncating bits, i.e. any length specifier is ignored and the conversion is
always treated as 'll'. Any truncation needs to be done by casting the argument itself.
For %f/%g you can pass anything that converts to double, including integers.

Performance results with clang 19 -O3 on Linux:
```
hyperfine './yosys -dp "read_rtlil /usr/local/google/home/rocallahan/Downloads/jpeg.synth.il; dump"'
```
C++17 before: Time (mean ± σ):     101.3 ms ±   0.8 ms    [User: 85.6 ms, System: 15.6 ms]
C++17 after:  Time (mean ± σ):      98.4 ms ±   1.2 ms    [User: 82.1 ms, System: 16.1 ms]
C++20 before: Time (mean ± σ):     100.9 ms ±   1.1 ms    [User: 87.0 ms, System: 13.8 ms]
C++20 after:  Time (mean ± σ):      97.8 ms ±   1.4 ms    [User: 83.1 ms, System: 14.7 ms]

The generated code is reasonably efficient. E.g. with clang 19, `stringf()` with a format
with no %% escapes and no other parameters (a weirdly common case) often compiles to a fully
inlined `std::string` construction. In general the format string parsing is often (not always)
compiled away.
Copy link
Collaborator

@widlarizer widlarizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, it's a (tiny) performance improvement now in that synth smoke test. I'm okay with removing accidental truncations in our string formatting

@mmicko mmicko merged commit 1d229ae into YosysHQ:main Jul 29, 2025
32 of 34 checks passed
@mmicko mmicko mentioned this pull request Jul 29, 2025
@rocallahan
Copy link
Contributor Author

Thanks for merging this! There's PR #5243 as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-before-release Merge: PR should be included in the next release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants