RFE: encoder: add stride & various raw input formats support #11
See also: Devolutions/IronRDP#670 |
@aldanor hi, wdyt? |
@aldanor ping :thanks: |
@aldanor are you still maintaining the crate? thanks |
The API feels kind of confusing to me. It would be cool to control the input format and output format independently. My use case: I have an image with RGBA data that I know has no translucency, so I want to save it as RGB. I don't know how I would do this with this PR, which is why I created #15. It would be cool if the output format could be controlled independently, like in my PR. I also don't really understand the stride argument; that should probably be documented better. |
You can use the RawChannels::Rgbx or Bgrx, or Xbgr etc.. from this PR for that.
Well, you can't really mix input and output formats freely. You need alpha channel input for alpha channel output. And QOI has only two output formats, rgb and rgba...
https://learn.microsoft.com/en-us/windows/win32/medfound/image-stride |
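For readers unfamiliar with the term, here is a minimal sketch of what a stride means for a raw pixel buffer. The helper names `pixel_at` and `strip_padding` are made up for illustration and are not part of this PR; the only assumption is that rows start `stride` bytes apart and may carry trailing padding.

```rust
/// Returns the bytes of pixel (x, y) in a raw buffer whose rows start
/// `stride` bytes apart. `stride` can be larger than
/// `width * bytes_per_pixel` when each row ends with padding.
fn pixel_at(data: &[u8], stride: usize, bytes_per_pixel: usize, x: usize, y: usize) -> &[u8] {
    let start = y * stride + x * bytes_per_pixel;
    &data[start..start + bytes_per_pixel]
}

/// Copies only the visible pixels of each row, skipping the padding.
/// Assumes `stride` is non-zero and at least `width * bytes_per_pixel`.
fn strip_padding(
    data: &[u8],
    width: usize,
    height: usize,
    stride: usize,
    bytes_per_pixel: usize,
) -> Vec<u8> {
    let row_bytes = width * bytes_per_pixel;
    let mut out = Vec::with_capacity(row_bytes * height);
    for row in data.chunks(stride).take(height) {
        out.extend_from_slice(&row[..row_bytes]);
    }
    out
}
```

With a 2x2 RGB image and a stride of 8, each row holds 6 bytes of pixels followed by 2 bytes of padding, so pixel (1, 1) starts at byte offset 11.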
True, didn't think about that. But what if someone wants to add an alpha channel?
Adding an alpha channel with default values of 255 would be imho fine and not unexpected. |
Well, this use case goes beyond the typical simple encoding imo. |
Why is |
We could make stride an Option. Alternatively, we could have an EncoderBuilder, but this might be overkill. |
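To make the two options concrete, here is a rough sketch of a builder where the stride is an `Option` that defaults to a tightly packed row. Everything below (`EncoderConfig`, the field names, the `String` error) is a hypothetical illustration and not the API that ended up in this PR.

```rust
/// Resolved settings. Hypothetical type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct EncoderConfig {
    width: usize,
    height: usize,
    bytes_per_pixel: usize,
    stride: usize,
}

/// Hypothetical builder: the stride is optional and filled in by `build`.
struct EncoderBuilder {
    width: usize,
    height: usize,
    bytes_per_pixel: usize,
    stride: Option<usize>,
}

impl EncoderBuilder {
    fn new(width: usize, height: usize, bytes_per_pixel: usize) -> Self {
        Self { width, height, bytes_per_pixel, stride: None }
    }

    /// Overrides the row stride, in bytes.
    fn stride(mut self, stride: usize) -> Self {
        self.stride = Some(stride);
        self
    }

    /// Falls back to `width * bytes_per_pixel` when no stride was given,
    /// and rejects a stride that is too small to hold one row.
    fn build(self) -> Result<EncoderConfig, String> {
        let row_bytes = self.width * self.bytes_per_pixel;
        let stride = self.stride.unwrap_or(row_bytes);
        if stride < row_bytes {
            return Err(format!("stride {stride} is smaller than a row of {row_bytes} bytes"));
        }
        Ok(EncoderConfig {
            width: self.width,
            height: self.height,
            bytes_per_pixel: self.bytes_per_pixel,
            stride,
        })
    }
}
```

Callers with packed data never mention the stride, while callers with padded rows add a single `.stride(n)` call.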
Stride could even be |
I like the EncoderBuilder. It really improves the usability. I have a few nitpicks.
src/encode.rs
Outdated
if stride * (height - 1) as usize + width as usize * raw_channels.bytes_per_pixel() < size {
    return Err(Error::InvalidImageLength { size, width, height });
}
if guess_stride && size != width as usize * height as usize * raw_channels.bytes_per_pixel()
width as usize * raw_channels.bytes_per_pixel()
is a common element in the if statements, maybe that could be put in a variable
width_usize ? hmm, I don't know if this is a common pattern. I am a bit unsure.
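One possible shape for the suggestion above is to compute the row size once and reuse it. This is only a sketch: `check_len` and `row_bytes` are made-up names, the comparison direction is my own reading of what a length check would need, and overflow handling is left out.

```rust
/// Returns true when `size` bytes can hold `height` rows that start
/// `stride` bytes apart. The last row only needs `row_bytes`, not a
/// full stride. Hypothetical helper, not the code under review.
fn check_len(size: usize, width: u32, height: u32, bytes_per_pixel: usize, stride: usize) -> bool {
    // Computed once and reused by both checks below.
    let row_bytes = width as usize * bytes_per_pixel;
    if height == 0 || stride < row_bytes {
        return false;
    }
    let required = stride * (height as usize - 1) + row_bytes;
    size >= required
}
```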
@elmarco Added some comments, please lmk if you want to continue working on this pr or not? Also, you'd need to rebase your branch after removing commits from this branch as lots of fixes are already on main:
There's also a few questions:
|
Signed-off-by: Marc-André Lureau <[email protected]>
@aldanor thanks for the review
This is the biggest problem, it seems I didn't benchmark properly. I get a -33% encoding perf. I am trying to understand where it comes from and how to fix it. thanks |
It might come from a double loop with now-unknown number of iterations in each preventing some common loop optimizations? I guess you can godbolt something equivalent. Yea, -33% is pretty bad for sure... -3% would still be not nice but within noise bounds. |
It turns out it's the assert_eq!(): it adds a bunch of panic-handling code and prevents some optimizations. I switched it to debug_assert!(). Now no performance regression is observable (with perf stat). |
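A small illustration of the difference, using a made-up function (`sum_channels` is not from this PR): `debug_assert_eq!` is only evaluated when debug assertions are enabled, which by default is not the case in `--release` builds, so the check and its panic path are absent there.

```rust
/// Sums every byte of a tightly packed buffer, one `N`-byte pixel at a time.
/// Hypothetical hot loop, used only to show where the assertion sits.
fn sum_channels<const N: usize>(data: &[u8]) -> u64 {
    let mut total = 0u64;
    for px in data.chunks_exact(N) {
        // Checked only when `debug_assertions` is on. With the plain
        // `assert_eq!` the comparison and its panic branch would be
        // kept in release builds as well.
        debug_assert_eq!(px.len(), N);
        total += px.iter().map(|&b| u64::from(b)).sum::<u64>();
    }
    total
}
```

Whether the always-on variant costs anything measurable depends on the surrounding code, so a benchmark is still needed to confirm.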
9fd810a to 5b9aa9b
assert_eq
debug_assert_eq
For me it's still slower with debug_assert_eq |
Signed-off-by: Marc-André Lureau <[email protected]>
did you compile with --release ? |
I ran |
This extra feature allows turning the extra input formats on and off, and shows that encode_impl() isn't correctly optimized independently of the various existing formats. This should probably be reported to or analyzed by the compiler team. At least, I am not able to explain the reason. Signed-off-by: Marc-André Lureau <[email protected]>
(I switched to stable Rust, as I get slightly different results with nightly, which can create confusion.) @Joshix-1 try with the latest commit. You should not see a performance loss (I actually observe a slight improvement, something like 0.5%). Something is weird when enabling the "extra-source" feature: encode_impl() doesn't get optimized the same way and we can observe -5-10%. It seems to be related to the Fn / closure somehow. But each format or fn specialization should receive an independent analysis and optimization, no? It may be worth asking or reporting to the compiler team. In the meantime, perhaps the "extra-source" feature flag is acceptable? |
ce61683e8682edb68f8f040f4cbbce1eae143fb7 has a similar effect. Seems to be some weird case where the compiler doesn't optimize |
@Joshix-1 I am really curious to know how you found out about adding this extra E/Infallible! |
Created a flamegraph and saw Try take a big chunk. |
I tried to make a subset test case to report to the compiler without success atm. @Joshix-1 what to do next?:
thanks |
I think my commit could also improve the performance of master, which would mean that this PR has to get even faster. I have some ideas I want to test out. I'll do some more testing and benchmarking. The stream writer is imho useful; I don't see a reason to remove it. |
Could not improve performance of master. I would suggest using 50293f3 With [profile.release]
opt-level = 3
debug = true in
without the
on master I get:
But I don't really know anymore, it's all really weird and unexpected |
indeed, I am ok with 50293f3, we can later revert it. |
Why revert it later? The change makes sense. Infallible methods shouldn't return results that could contain errors |
The compiler should be able to infer that the BytesMut Writer is in fact Infallible when inlining; I think that's what it's doing when "extra-source" is off. |
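For context, the general idea behind an `Infallible` error type can be sketched as below. The `Writer` trait, `VecWriter` and `unwrap_infallible` are hypothetical stand-ins and not the crate's actual types; the point is only that a writer which cannot fail says so in its error type, so callers do not depend on the optimizer to prove the error path dead.

```rust
use std::convert::Infallible;

/// Hypothetical writer trait with an associated error type.
trait Writer {
    type Error;
    fn write_byte(&mut self, b: u8) -> Result<(), Self::Error>;
}

/// In-memory writer that can never fail.
struct VecWriter(Vec<u8>);

impl Writer for VecWriter {
    type Error = Infallible;

    fn write_byte(&mut self, b: u8) -> Result<(), Infallible> {
        self.0.push(b);
        Ok(())
    }
}

/// Generic code propagates whatever error the writer declares.
fn write_all<W: Writer>(w: &mut W, bytes: &[u8]) -> Result<(), W::Error> {
    for &b in bytes {
        w.write_byte(b)?;
    }
    Ok(())
}

/// Unwraps a result whose error type has no values, so the `Err`
/// arm is statically unreachable and needs no panic path.
fn unwrap_infallible<T>(r: Result<T, Infallible>) -> T {
    match r {
        Ok(v) => v,
        Err(never) => match never {},
    }
}
```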
The input format is not necessarily RGB or RGBA, and doing a pre-pass conversion can be quite costly (adding about 15-20% of total time from empirical study)
For simplicity reasons, I made sure not to break the existing API.
Those changes don't seem to affect the encoder performance in a significant way.
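As a rough illustration of reading a raw layout on the fly instead of converting the whole buffer first, consider the sketch below. `RawLayout` is a hypothetical enum whose variant names merely echo the `Rgbx`/`Bgrx`/`Xbgr` formats mentioned in the discussion; it is not the `RawChannels` type from this PR.

```rust
/// Hypothetical input layouts, for illustration only.
#[derive(Clone, Copy)]
enum RawLayout {
    Rgb,
    Rgba,
    Bgrx,
    Xbgr,
}

impl RawLayout {
    fn bytes_per_pixel(self) -> usize {
        match self {
            RawLayout::Rgb => 3,
            _ => 4,
        }
    }

    /// Reads one pixel as [r, g, b, a]. Layouts without a real alpha
    /// channel (including the padding byte "x") get an alpha of 255.
    fn read(self, px: &[u8]) -> [u8; 4] {
        match self {
            RawLayout::Rgb => [px[0], px[1], px[2], 255],
            RawLayout::Rgba => [px[0], px[1], px[2], px[3]],
            RawLayout::Bgrx => [px[2], px[1], px[0], 255],
            RawLayout::Xbgr => [px[3], px[2], px[1], 255],
        }
    }
}

/// Yields pixels in a common order without allocating a converted copy.
fn pixels(data: &[u8], layout: RawLayout) -> impl Iterator<Item = [u8; 4]> + '_ {
    data.chunks_exact(layout.bytes_per_pixel())
        .map(move |px| layout.read(px))
}
```

An encoder loop that consumes such an iterator touches each input byte once, which is the cost the pre-pass conversion would otherwise add on top.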