Skip to content

Conversation

Noratrieb
Copy link
Member

@Noratrieb Noratrieb commented Sep 23, 2025

The current code size is really wastefully large. Originally, it was 1500 lines of assembly in Godbolt, now I reduced it to just under 800. The effect of .text size in hello world is from 297028 to 295453 (measured with -Clto=fat -Copt-level=s with a normal sysroot) which is small but not completely irrelevant. It's just a small fish in the bigger pond of DWARF parsing, but it's better than nothing.

I extracted the parsing of each component into a separate function to allow for better sharing. I replaced the string methods with manual iteration since that results in simpler code because it has to handle fewer cases. I also had to use unsafe because the bounds checks were sadly not optimized out and were really large.

I also made the parser less resilient against whitespace, now it no longer handles Unicode whitespace (an obvious simplification) but also no longer handles any whitespace except the normal SP. I think this is fine, it seems highly unlikely that a system would suddenly use another type of whitespace (but I guess not impossible?).

Another simplification was simply removing the parsing of unused fields that were not needed.

I can split it into separate commits if that helps review and maybe some of these changes are more of a hassle than it's worth (while some others like the field removals are obviously good), but I'll let you choose.

@workingjubilee
Copy link
Member

I think we can safely assume that procfs will only ever use '\x20', yes.

This will require careful review for the unsafe usage, but I am fairly happy to see it. I made a similar pass at this but wasn't quite happy with it.

@Noratrieb
Copy link
Member Author

apparently it's normal that these tests fail, lol

@workingjubilee
Copy link
Member

I am certainly treating the current macOS failures as nonblocking as it is very possible a rustc change is the underlying reason. I have opened #730

@workingjubilee
Copy link
Member

workingjubilee commented Sep 23, 2025

I can split it into separate commits if that helps review

@Noratrieb If you can cleanly split the changes that micro-optimize the parsing using bytes and such from those that do straight reduction of code, like removing fields, into separate commits, that would be appreciated. Order doesn't matter. Though if they don't have a reasonably clean split then it doesn't matter and it should remain one big mash.

The vec move is very funny, as moving it removes the vec drop from the
early return, which is fairly irrelevant but more than 0 bytes.
The current code size is really wastefully large. Originally, it was
1500 lines of assembly in Godbolt, now I reduced it to just under 800.
The effect of `.text` size in hello world is from 297028 to 295453 which
is small but not completely irrelevant. It's just a small fish in the
bigger pond of DWARF parsing, but it's better than nothing.

I extracted the parsing of each component into a separate function to
allow for better sharing. I replaced the string methods with manual
iteration since that results in simpler code because it has to handle
fewer cases. I also had to use unsafe because the bounds checks were
sadly not optimized out and were really large.

I also made the parser less resilient against whitespace, now it no
longer handles Unicode whitespace (an obvious simplification) but also
no longer handles any whitespace except the normal SP. I think this is
fine, it seems highly unlikely that a system would suddenly use another
type of whitespace (but I guess not impossible?).
@Noratrieb Noratrieb force-pushed the optimize-proc-self-map-size branch from b13e32d to 0b9eeb1 Compare September 26, 2025 18:54
.read_to_string(&mut buf)
.map_err(|_| "Couldn't read /proc/self/maps")?;

let mut v = Vec::new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kinda insane that this matters that much

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's only a tiny improvement, so that much is a bit of an overstatement :). actually i don't have numbers on this one, but I did see the call to drop_in_place before but not after.

Comment on lines 88 to 94
fn error(msg: &str) -> &str {
if cfg!(debug_assertions) {
msg
} else {
"invalid map entry"
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this makes sense.

Comment on lines +94 to +123
// While there are nicer standard library APIs available for this, we aim for minimal code size.

let mut state = s;

fn parse_start<'a>(state: &mut &'a str) -> &'a str {
// Unsafe is unfortunately necessary to get the bounds check removed (for code size).

let start_idx = state.bytes().position(|b| b != b' ');
if let Some(start_idx) = start_idx {
// SAFETY: It comes from position, so it's in bounds.
// It must be on a UTF-8 boundary as it's the first byte that isn't ' '.
*state = unsafe { state.get_unchecked(start_idx..) };
}
let match_idx = state.bytes().position(|b| b == b' ');
match match_idx {
None => {
let result = *state;
*state = "";
result
}
Some(match_idx) => {
// SAFETY: match_index comes from .bytes().position() of an ASCII character,
// so it's both in bounds and a UTF-8 boundary
let result = unsafe { state.get_unchecked(..match_idx) };
// SAFETY: Since match_idx is the ' ', there must be at least the end after it.
*state = unsafe { state.get_unchecked((match_idx + 1)..) };
result
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hkBst This is an example of the kind of microoptimization that is both very hard to do in the compiler and also can reap significant benefits since here we care mostly about code size, even in the dead code, as this code size is essentially multiplied by all Rust binaries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@workingjubilee Thanks for thinking of me, although I'm not quite sure why you did.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants