@v0y4g3r v0y4g3r commented Aug 25, 2025

Enhance Benchmarking and Serialization Logic

  • src/ser.rs:
    • Optimized the put_slice method in MaybeFlip to handle the non-flip case more efficiently by calling the output buffer's put_slice directly.
  • benches/serde.rs:
    • Added a new bytes benchmark function to test serialization performance for varying byte sizes.
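
For readers without the diff open, here is a minimal sketch of the fast path described above. It is not the crate's actual code: the struct layout is hypothetical, a `Vec<u8>` stands in for the real output buffer (so `extend_from_slice` plays the role of `put_slice`), and it assumes that "flip" means bitwise inversion of each byte.

```rust
/// Hypothetical stand-in for `MaybeFlip`; the real type wraps the crate's
/// output buffer, which is assumed here to behave like a `Vec<u8>`.
struct MaybeFlip {
    output: Vec<u8>,
    flip: bool,
}

impl MaybeFlip {
    fn put_slice(&mut self, src: &[u8]) {
        if self.flip {
            // Flip path (assumed to be bitwise NOT): each byte has to be
            // transformed, so the bytes are mapped while being appended.
            self.output.extend(src.iter().map(|&b| !b));
        } else {
            // Non-flip path: the bytes are written unchanged, so a single
            // bulk copy is enough.
            self.output.extend_from_slice(src);
        }
    }
}
```

With this shape, only the flip path pays a per-byte cost; the non-flip path reduces to a single bulk copy.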

Performance

Before:

bytes/size-10           time:   [30.731 ns 30.753 ns 30.777 ns]
                        change: [+0.0016% +0.1073% +0.2163%] (p = 0.06 > 0.05)
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
bytes/size-10-reverse   time:   [37.930 ns 37.962 ns 37.990 ns]
                        change: [-0.1676% -0.0705% +0.0326%] (p = 0.16 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
bytes/size-100          time:   [78.906 ns 79.051 ns 79.212 ns]
                        change: [-8.0844% -7.8302% -7.5333%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild
bytes/size-100-reverse  time:   [76.496 ns 76.588 ns 76.687 ns]
                        change: [-2.4687% -2.2936% -2.1130%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
bytes/size-1000         time:   [465.03 ns 465.70 ns 466.41 ns]
                        change: [-0.1893% +0.0319% +0.2457%] (p = 0.77 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
bytes/size-1000-reverse time:   [444.55 ns 445.19 ns 445.83 ns]
                        change: [-0.7640% -0.5800% -0.4119%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

After:

bytes/size-10           time:   [30.266 ns 30.295 ns 30.321 ns]
                        change: [-1.4914% -1.3650% -1.2513%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild
bytes/size-10-reverse   time:   [39.182 ns 39.214 ns 39.260 ns]
                        change: [+3.2080% +3.3230% +3.4508%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe
bytes/size-100          time:   [58.992 ns 59.056 ns 59.111 ns]
                        change: [-26.162% -25.929% -25.712%] (p = 0.00 < 0.05)
                        Performance has improved.
bytes/size-100-reverse  time:   [56.670 ns 56.719 ns 56.768 ns]
                        change: [-26.225% -26.116% -26.011%] (p = 0.00 < 0.05)
                        Performance has improved.
bytes/size-1000         time:   [201.11 ns 202.35 ns 203.84 ns]
                        change: [-56.517% -56.362% -56.195%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
bytes/size-1000-reverse time:   [188.81 ns 192.68 ns 197.55 ns]
                        change: [-57.698% -57.399% -56.973%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high severe

 ### Enhance Benchmarking and Serialization Logic

 - **`benches/serde.rs`**:
   - Added a new `bytes` benchmark function to test serialization performance for varying byte sizes.
   - Integrated `black_box` to prevent compiler optimizations during benchmarking.

 - **`src/ser.rs`**:
   - Optimized the `put_slice` method in `MaybeFlip` to handle the non-flip case more efficiently by calling the output buffer's `put_slice` directly.

Signed-off-by: Lei, HUANG <[email protected]>
@v0y4g3r changed the title from "perf: improve serialize bytes" to "perf: improve bytes serialization performance" on Aug 25, 2025
Signed-off-by: Lei, HUANG <[email protected]>
Comment on lines +83 to +97
let num_chunks = src.len() / 8;
let remainder = src.len() % 8;
let mut tmp = [0u8; 8];
for chunk in 0..num_chunks {
    for idx in 0..8 {
        tmp[idx] = src[chunk * 8 + idx];
    }
    self.output.put_slice(&tmp);
}
if remainder != 0 {
    for idx in 0..remainder {
        tmp[idx] = src[num_chunks * 8 + idx];
    }
    self.output.put_slice(&tmp[0..remainder]);
}

Why not self.output.put_slice(src)?
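
To make the question concrete, the sketch below puts the two approaches side by side. It is an illustration only: `put_chunked` and `put_direct` are hypothetical free functions, and a `Vec<u8>` stands in for `self.output`, so `extend_from_slice` plays the role of `put_slice`. In this non-flip path, copying through an 8-byte temporary writes exactly the same bytes as one bulk call.

```rust
/// Chunked copy that mirrors the snippet under review.
fn put_chunked(output: &mut Vec<u8>, src: &[u8]) {
    let num_chunks = src.len() / 8;
    let remainder = src.len() % 8;
    let mut tmp = [0u8; 8];
    for chunk in 0..num_chunks {
        // Stage one full 8-byte chunk in the temporary, then append it.
        tmp.copy_from_slice(&src[chunk * 8..chunk * 8 + 8]);
        output.extend_from_slice(&tmp);
    }
    if remainder != 0 {
        // Stage and append the trailing bytes that do not fill a chunk.
        tmp[..remainder].copy_from_slice(&src[num_chunks * 8..]);
        output.extend_from_slice(&tmp[..remainder]);
    }
}

/// The reviewer's suggestion: hand the whole slice over in one call.
fn put_direct(output: &mut Vec<u8>, src: &[u8]) {
    output.extend_from_slice(src);
}
```

Both functions append the same bytes for any input length, including lengths that are not a multiple of 8. Which one is faster for a given buffer type is a question for the benchmark, not something this sketch settles.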
