-
Notifications
You must be signed in to change notification settings - Fork 270
Optimize proc maps parsing code size #729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
I think we can safely assume that procfs will only ever use '\x20', yes. This will require careful review for the |
apparently it's normal that these tests fail, lol |
I am certainly treating the current macOS failures as nonblocking as it is very possible a rustc change is the underlying reason. I have opened #730 |
@Noratrieb If you can cleanly split the changes that micro-optimize the parsing using bytes and such from those that do straight reduction of code, like removing fields, into separate commits, that would be appreciated. Order doesn't matter. Though if they don't have a reasonably clean split then it doesn't matter and it should remain one big mash. |
The vec move is very funny, as moving it removes the vec drop from the early return, which is fairly irrelevant but more than 0 bytes.
The current code size is really wastefully large. Originally, it was 1500 lines of assembly in Godbolt, now I reduced it to just under 800. The effect of `.text` size in hello world is from 297028 to 295453 which is small but not completely irrelevant. It's just a small fish in the bigger pond of DWARF parsing, but it's better than nothing. I extracted the parsing of each component into a separate function to allow for better sharing. I replaced the string methods with manual iteration since that results in simpler code because it has to handle fewer cases. I also had to use unsafe because the bounds checks were sadly not optimized out and were really large. I also made the parser less resilient against whitespace, now it no longer handles Unicode whitespace (an obvious simplification) but also no longer handles any whitespace except the normal SP. I think this is fine, it seems highly unlikely that a system would suddenly use another type of whitespace (but I guess not impossible?).
b13e32d
to
0b9eeb1
Compare
.read_to_string(&mut buf) | ||
.map_err(|_| "Couldn't read /proc/self/maps")?; | ||
|
||
let mut v = Vec::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kinda insane that this matters that much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's only a tiny improvement, so that much is a bit of an overstatement :). actually i don't have numbers on this one, but I did see the call to drop_in_place
before but not after.
fn error(msg: &str) -> &str { | ||
if cfg!(debug_assertions) { | ||
msg | ||
} else { | ||
"invalid map entry" | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah this makes sense.
// While there are nicer standard library APIs available for this, we aim for minimal code size. | ||
|
||
let mut state = s; | ||
|
||
fn parse_start<'a>(state: &mut &'a str) -> &'a str { | ||
// Unsafe is unfortunately necessary to get the bounds check removed (for code size). | ||
|
||
let start_idx = state.bytes().position(|b| b != b' '); | ||
if let Some(start_idx) = start_idx { | ||
// SAFETY: It comes from position, so it's in bounds. | ||
// It must be on a UTF-8 boundary as it's the first byte that isn't ' '. | ||
*state = unsafe { state.get_unchecked(start_idx..) }; | ||
} | ||
let match_idx = state.bytes().position(|b| b == b' '); | ||
match match_idx { | ||
None => { | ||
let result = *state; | ||
*state = ""; | ||
result | ||
} | ||
Some(match_idx) => { | ||
// SAFETY: match_index comes from .bytes().position() of an ASCII character, | ||
// so it's both in bounds and a UTF-8 boundary | ||
let result = unsafe { state.get_unchecked(..match_idx) }; | ||
// SAFETY: Since match_idx is the ' ', there must be at least the end after it. | ||
*state = unsafe { state.get_unchecked((match_idx + 1)..) }; | ||
result | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hkBst This is an example of the kind of microoptimization that is both very hard to do in the compiler and also can reap significant benefits since here we care mostly about code size, even in the dead code, as this code size is essentially multiplied by all Rust binaries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@workingjubilee Thanks for thinking of me, although I'm not quite sure why you did.
The current code size is really wastefully large. Originally, it was 1500 lines of assembly in Godbolt, now I reduced it to just under 800. The effect of
.text
size in hello world is from 297028 to 295453 (measured with-Clto=fat -Copt-level=s
with a normal sysroot) which is small but not completely irrelevant. It's just a small fish in the bigger pond of DWARF parsing, but it's better than nothing.I extracted the parsing of each component into a separate function to allow for better sharing. I replaced the string methods with manual iteration since that results in simpler code because it has to handle fewer cases. I also had to use unsafe because the bounds checks were sadly not optimized out and were really large.
I also made the parser less resilient against whitespace, now it no longer handles Unicode whitespace (an obvious simplification) but also no longer handles any whitespace except the normal SP. I think this is fine, it seems highly unlikely that a system would suddenly use another type of whitespace (but I guess not impossible?).
Another simplification was simply removing the parsing of unused fields that were not needed.
I can split it into separate commits if that helps review and maybe some of these changes are more of a hassle than it's worth (while some others like the field removals are obviously good), but I'll let you choose.