Commit f0ad39a

Merge pull request #4 from intel/iaa_batch_release_v1.0
Updated with changes required for IAA batching benchmarking
2 parents bd87b7c + cc0c0ee commit f0ad39a

10 files changed, +341 -253 lines changed

tests/madvise/README.md

Lines changed: 38 additions & 42 deletions
Original file line number | Diff line number | Diff line change
@@ -1,55 +1,51 @@
1-
# Instructions
2-
The madvise-test suite allows benchmarking of zswap with different compression algorithms. It loads a data corpus (silesia.tar, for example), swaps out all the pages, then swaps them all back in, while monitoring the swap-in and swap-out latency along with the compressed size using bpftrace.
3-
4-
## Prerequisite
5-
#### Hardware
6-
* 4th Gen Intel® Xeon® Scalable processors or later with IAA enabled. Please see the [IAA user guide](https://cdrdv2-public.intel.com/780887/354834_IAA_UserGuide_June23.pdf) for more details about IAA configuration.
7-
#### Software
8-
This framework depends on the IAA RFC kernel patches. Please see the [instructions on building the kernel with the right patch sets](https://github.com/intel/memory-usage-analyzer/wiki/Integration-of-IAA-RFC-patches-to-6.12-upstream-kernel).
9-
10-
## Run in Baremetal
11-
1. Run
12-
```
13-
./make_swap_space.sh
14-
```
15-
Additional details of make_swap_space.sh script:
16-
Command Line Arguments: The script now accepts -l for specifying the swap file location and -s for specifying the swap size in GB.
17-
Default Values: If no arguments are provided, it defaults to /mnt/nvme1/swapfile for the location and 1GB for the size.
18-
Dynamic Path Handling: It dynamically checks the available space in the directory derived from the provided or default swap file location.
19-
Usage (for example, `-l /mnt/nvme1/swapfile -s 4` creates 4GB of swap space at /mnt/nvme1/swapfile):
20-
```
21-
./make_swap_space.sh [-l <path_to_swap_file>] [-s <swap_size_in_GBi>]
22-
```
23-
2. Configure IAA device
24-
```
25-
./enable_kernel_iaa
26-
```
27-
3. Activate zswap by running
28-
```
29-
./enable_zswap.sh
30-
```
31-
4. Collect data and generate reports for all the compressors.
32-
```
33-
./collect_all.sh
34-
```
35-
This will generate html files for CDFs of compress, decompress, compression ratio and page fault latencies and a summary.xlsx file. This will also generate (to stdout) P50 & P99 values.
1+
# Introduction
2+
zswap performance benchmarking with IAA uses the madvise() system call with MADV_PAGEOUT. It loads the entire dataset (silesia.tar, for example) into memory, swaps out all the pages, and swaps them all back in, monitoring the time spent in swap-in and swap-out along with other key metrics. There are two benchmarking scenarios:
3+
4+
1. Benchmark with single pages.
5+
2. Benchmark with IAA batching, which takes advantage of parallel processing in IAA.
6+
7+
## Prerequisites
8+
1. Platform with a 4th Gen Intel Xeon (or later) processor and IAA.
9+
2. [Memory Usage Analyzer Framework](https://github.com/intel/memory-usage-analyzer/tree/main?tab=readme-ov-file#install_)
10+
3. Kernel with IAA RFC patches. Please see instructions on building the kernel [here](https://github.com/intel/memory-usage-analyzer/wiki/Integration-of-IAA-RFC-patches-to-6.12-upstream-kernel).
11+
12+
## Run single-page Microbenchmarks
13+
14+
Collect data and generate reports for all the compressors in single-page mode. Depending on the number of IAA devices on the system, the setup scripts need to be modified. The list of compressors and datasets can be modified as needed.
15+
```
16+
# For all 4 devices per socket
17+
./collect_single_page.sh | tee single_page.txt
18+
19+
# For SKUs with only 1 IAA device per socket
20+
./collect_single_page.sh -d 1 | tee single_page.txt
21+
22+
```
23+
This will generate a summary of the key metrics for each dataset. In addition, more detailed data points such as CDFs and a summary .xls will be generated under the results_* directory.
24+
25+
## Run Microbenchmarks with IAA batching
26+
Collect data and generate reports for all the compressors for batch processing. The list of compressors, datasets and batch sweep can be modified as needed.
27+
```
28+
# For all 4 devices per socket
29+
./collect_batch.sh | tee batch.txt
30+
# For SKUs with only 1 IAA device per socket
31+
./collect_batch.sh -d 1 | tee batch.txt
32+
```
33+
This will generate swap-in and swap-out latency reports for different batches for IAA along with software compressors.
3634

3735
## Additional details
38-
For running individual compressors
36+
To run individual compressors, the low-level script can be used.
3937

40-
```
38+
```
4139
echo 'lz4' > /sys/module/zswap/parameters/compressor
4240
./collect_bpftraces.sh
4341
echo 'deflate-iaa-canned' > /sys/module/zswap/parameters/compressor
4442
./collect_bpftraces.sh
45-
```
43+
```
4644
... and so on. This will generate a <compressor>_output file for each run
4745

4846
Once all runs are collected, run the post-processing Python script:
49-
```
47+
```
5048
./process_bpftraces.py
51-
```
52-
This will generate html files for CDFs of compress, decompress, compression ratio and page fault latencies and a summary.xlsx file.
53-
This will also generate (to stdout) P50 & P99 values.
49+
```
5450

5551
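
The README above says each run produces a `<compressor>_output` file that is later post-processed into P50 and P99 values. As a rough illustration of that step (not the actual `process_bpftraces.py` logic, which is not shown in this diff), the sketch below computes a nearest-rank percentile in shell. It assumes the output file holds lines of the form `<TAG> <value>`, matching the `C`, `D`, `R`, `SW` and `SR` records printed by the bpftrace program in `collect_bpftraces.sh`; the function name `percentile` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository.
# percentile TAG PCT FILE
#   Prints the nearest-rank PCT-th percentile of the values found on
#   lines of FILE whose first field is TAG (for example "C 1234").
#   Returns non-zero if no matching lines exist.
percentile() {
    local tag="$1" pct="$2" file="$3"

    # Keep only two-field records for the requested tag, so banner lines
    # such as "Attaching N probes..." are ignored.
    awk -v tag="$tag" '$1 == tag && NF == 2 { print $2 }' "$file" |
        sort -n |
        awk -v pct="$pct" '
            { v[NR] = $1 }
            END {
                if (NR == 0) exit 1
                # Nearest-rank index: ceil(pct/100 * N), clamped to >= 1.
                r = pct * NR / 100
                idx = int(r)
                if (idx < r) idx++
                if (idx < 1) idx = 1
                print v[idx]
            }'
}
```

For example, `percentile C 50 lz4_output` would print the median compress latency in nanoseconds for an lz4 run, and `percentile C 99 lz4_output` the P99, assuming the file was produced by `collect_bpftraces.sh`.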

tests/madvise/clear_logs.sh

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
1+
#!/bin/bash
2+
#SPDX-License-Identifier: BSD-3-Clause
3+
#Copyright (c) 2025, Intel Corporation
4+
#Description: clear intermediate logs
5+
rm -f *_output
6+
rm -f *.html
7+
rm -f *.xlsx
8+
rm -f perf_*

tests/madvise/collect_bpftraces.sh

Lines changed: 79 additions & 18 deletions
@@ -1,25 +1,32 @@
1-
#!/usr/bin/env bash
1+
#!/usr/bin/bash
22
#SPDX-License-Identifier: BSD-3-Clause
3-
#Copyright (c) 2023, Intel Corporation
3+
#Copyright (c) 2025, Intel Corporation
4+
#Description: Wrapper which runs madvise workload and collects bpftraces
45

56
# Define the location of the swap file
67
SWAP="silesia.tar"
78
COMP=$(cat /sys/module/zswap/parameters/compressor 2>/dev/null)
89
CORE_FREQUENCY=2500
10+
CBATCH=1
11+
DBATCH=0
12+
MTHP="4kB"
913

1014

11-
while getopts "f:t:" opt; do
15+
while getopts "f:t:c:d:m:" opt; do
1216
case $opt in
1317
f) CORE_FREQUENCY="$OPTARG" ;;
1418
t) SWAP="$OPTARG" ;;
19+
c) CBATCH="$OPTARG" ;;
20+
d) DBATCH="$OPTARG" ;;
21+
m) MTHP="$OPTARG" ;;
1522
\?) echo "Invalid option: -$OPTARG" >&2; exit 1 ;;
1623
esac
1724
done
1825

1926
# Function to handle errors
2027
handle_error() {
2128
echo "Error: $1"
22-
exit 1
29+
#exit 1
2330
}
2431

2532

@@ -35,7 +42,10 @@ if [ ! -f "$SWAP" ] ; then
3542
"silesia.tar")
3643
echo "Downloading $SWAP from http://wanos.co/assets/silesia.tar"
3744
wget --no-check-certificate http://wanos.co/assets/silesia.tar || handle_error "Failed to download $SWAP" ;;
38-
*) echo "Invalid file $SWAP" ; exit 1 ;
45+
"4300_all")
46+
echo "using $SWAP locally" ;;
47+
*) echo "Invalid file $SWAP" ; exit 1 ;;
48+
3949
esac
4050
fi
4151

@@ -53,41 +63,92 @@ if [ ! -f madvise_test ]; then
5363
fi
5464

5565
# Turn off read-ahead to optimize performance
56-
echo 0 > /proc/sys/vm/page-cluster || handle_error "Failed to set page-cluster"
66+
#echo 0 > /proc/sys/vm/page-cluster || handle_error "Failed to set page-cluster"
5767

5868

69+
if [[ $COMP == 'deflate-iaa-canned' || $COMP == 'deflate-iaa-dynamic'||$COMP == 'deflate-iaa' ]]; then
70+
echo "true" > /sys/kernel/mm/swap/singlemapped_ra_enabled
71+
else
72+
echo "false" > /sys/kernel/mm/swap/singlemapped_ra_enabled
73+
#CBATCH=1
74+
#DBATCH=0
75+
fi
76+
77+
echo ${DBATCH} > /proc/sys/vm/page-cluster || handle_error "Failed to set page-cluster"
78+
sysctl vm.compress-batchsize=${CBATCH} || handle_error "Failed to set compress-batchsize"
79+
5980
# Clear transparent huge pages configuration
6081
echo 'never' > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled || handle_error "Failed to clear hugepages-2048kB configuration"
6182
echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled || handle_error "Failed to clear transparent_hugepage configuration"
6283

63-
# Calculate swap size and pages
64-
SZ=$(ls -s "$SWAP" | awk '{print $1}')
65-
NPGS=$(echo "scale=0; $SZ/4-1" | bc -l)
6684

6785

68-
CMD="./madvise_test ${SWAP} ${NPGS}"
86+
# Set mTHP
87+
mthp_sizes=('16kB' '32kB' '64kB' '128kB' '256kB' '512kB' '1024kB' '2048kB')
88+
IFS=',' read -a mthp_list <<< "$MTHP"
89+
#echo ${mthp_list[@]}
90+
91+
for mthp in "${mthp_sizes[@]}"; do
92+
echo 'never'> /sys/kernel/mm/transparent_hugepage/hugepages-${mthp}/enabled
93+
done
94+
95+
max_mthp=0
96+
for mthp in "${mthp_list[@]}"; do
97+
mthp=`echo $mthp | tr -d ' '`
98+
if [[ ${mthp_sizes[@]} =~ ($mthp) ]] ; then
99+
if [ $mthp != '4kB' ]; then
100+
echo "configuring mthp ${mthp}"
101+
echo 'always' > /sys/kernel/mm/transparent_hugepage/hugepages-${mthp}/enabled
102+
fi
103+
fi
104+
done
105+
106+
107+
# Calculate swap size and pages
108+
SZ=$(ls -s "$SWAP" | awk '{print $1}')
109+
PAGE_SIZE=${mthp_list[0]}
110+
PAGE_SIZE=4kB
111+
echo "page-size:${PAGE_SIZE}"
112+
PAGE_SIZE=$(echo ${PAGE_SIZE} | awk '{print substr($1,1,length($1)-2)}' )
113+
NPGS=$(echo "scale=0; $SZ/${PAGE_SIZE}" | bc -l)
114+
# change it to bytes
115+
PAGE_SIZE=$(( PAGE_SIZE * 1024))
116+
117+
QAT_ENABLED=`bpftrace -l | grep qat_comp_alg_compress | wc -l`
118+
119+
CMD="./madvise_test ${SWAP} ${NPGS} ${PAGE_SIZE}"
120+
echo ${CMD}
69121
if [[ $COMP == 'deflate-iaa-canned' || $COMP == 'deflate-iaa-dynamic'||$COMP == 'deflate-iaa' ]];then
70122
COMP_STR="iaa_comp_acompress"
71123
DCOMP_STR="iaa_comp_adecompress"
72124
SZ_STR="arg0+68"
125+
#SZ_STR="((struct acomp_req*)arg0)->dlen"
73126
elif [ $COMP == 'lzo-rle' ]; then
74127
COMP_STR="lzorle_scompress"
75128
DCOMP_STR="lzorle_sdecompress"
76129
SZ_STR="arg4"
130+
elif [ $COMP == 'qat_deflate' ] && [ ${QAT_ENABLED} -gt 0 ]; then
131+
COMP_STR="qat_comp_alg_compress"
132+
DCOMP_STR="qat_comp_alg_decompress"
133+
SZ_STR="arg4"
77134
else
78135
COMP_STR="${COMP}_scompress"
79136
DCOMP_STR="${COMP}_sdecompress"
80137
SZ_STR="arg4"
81138
fi
82139

83-
PROG=`echo "kprobe:${COMP_STR} { @start=nsecs; @sz=${SZ_STR}; } kretprobe:${COMP_STR} { printf (\"C %d\\nR %d\\n\", nsecs-@start, *@sz); }
84-
kprobe:${DCOMP_STR} { @start=nsecs; } kretprobe:${DCOMP_STR} { printf (\"D %d\\n\", nsecs-@start); }
85-
kprobe:handle_mm_fault /(arg1&0xfff) == 0/{@pf[cpu]=nsecs; if(arg2!=0x1254) {@pf[cpu]=0;}} kretprobe:handle_mm_fault {if(@pf[cpu]) {printf(\"P %d\\n\", nsecs-@pf[cpu]);} @pf[cpu]=0;}
86-
kprobe:swap_writepage { @start_swap_write=nsecs; } kretprobe:swap_writepage { printf (\"SW %d\\n\", nsecs-@start_swap_write); }
87-
kprobe:swap_read_folio { @start_swap_read=nsecs; } kretprobe:swap_read_folio { printf (\"SR %d\\n\", nsecs-@start_swap_read);}
88-
kprobe:zswap_compress { @start_zswap_comp=nsecs; @zsz=arg1;} kretprobe:zswap_compress { printf (\"ZC %d\\nZCSZ %d\\nZCR %d\\n\", nsecs-@start_zswap_comp, *(@zsz+8), retval&0x1);}
89-
kprobe:zswap_decompress { @start_zswap_decomp=nsecs; } kretprobe:zswap_decompress { printf (\"ZD %d\\n\", nsecs-@start_zswap_decomp);}
90-
END { clear(@pf); delete(@start); delete(@sz);}"`
140+
141+
# TODO: count errors from swap_crypto_acomp_compress_batch
142+
143+
PROG=`echo "kprobe:${COMP_STR} { @start=nsecs; @sz=${SZ_STR}; } kretprobe:${COMP_STR} { printf (\"C %d\\nR %d\\n\", nsecs-@start, *@sz); }
144+
kprobe:${DCOMP_STR} { @start=nsecs; } kretprobe:${DCOMP_STR} { printf (\"D %d\\n\", nsecs-@start); }
145+
146+
kprobe:swap_writepage { @start_swap_write=nsecs; } kretprobe:swap_writepage { printf (\"SW %d\\n\", nsecs-@start_swap_write); }
147+
kprobe:swap_read_folio { @start_swap_read=nsecs; } kretprobe:swap_read_folio { printf (\"SR %d\\n\", nsecs-@start_swap_read);}
148+
149+
END { delete(@start); delete(@sz);}"`
91150

92151
perf stat -o perf_${COMP}.log bpftrace -e "${PROG}" -c "${CMD}" -o ${COMP}_output
93152
echo "Script completed successfully."
153+
154+
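
The new script validates mTHP sizes with `[[ ${mthp_sizes[@]} =~ ($mthp) ]]`, which is a regex match against the joined list, so a string such as `6kB` would match inside `16kB`. It then derives the page count and the page size in bytes from a size string such as `4kB`. The sketch below shows one possible way to do both steps with exact matching and plain shell arithmetic. The function names `is_valid_mthp` and `page_args` are hypothetical, and the file size is assumed to be given in KiB, which is what GNU `ls -s` reports by default.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the repository.

# Sizes accepted: the base 4kB page plus the mTHP sizes listed in the script.
MTHP_SIZES=(4kB 16kB 32kB 64kB 128kB 256kB 512kB 1024kB 2048kB)

# is_valid_mthp SIZE
#   Succeeds only on an exact match, so "6kB" does not match "16kB".
is_valid_mthp() {
    local want="$1" s
    for s in "${MTHP_SIZES[@]}"; do
        [[ "$s" == "$want" ]] && return 0
    done
    return 1
}

# page_args FILE_KB SIZE
#   Prints "<number_of_pages> <page_size_in_bytes>" for a file of
#   FILE_KB KiB split into pages of SIZE (for example "64kB").
page_args() {
    local file_kb="$1" size="$2"
    is_valid_mthp "$size" || return 1
    local kb="${size%kB}"                 # strip the "kB" suffix
    echo "$(( file_kb / kb )) $(( kb * 1024 ))"
}
```

With an 8192 KiB file, `page_args 8192 4kB` prints `2048 4096` and `page_args 8192 64kB` prints `128 65536`. Note that the script in this commit currently pins `PAGE_SIZE` to `4kB` regardless of the `-m` list, so only the first case reflects what it passes to `madvise_test` today.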

tests/madvise/disable_iaa

Lines changed: 10 additions & 1 deletion
@@ -1,6 +1,8 @@
11
#!/usr/bin/env bash
22
#SPDX-License-Identifier: BSD-3-Clause
3-
#Copyright (c) 2023, Intel Corporation
3+
#Copyright (c) 2025, Intel Corporation
4+
#Description: Disable all IAA devices
5+
46

57
IAX_CONFIG_PATH=/sys/bus/dsa/devices
68

@@ -10,6 +12,7 @@ IAX_BIND_WQ_PATH=/sys/bus/dsa/drivers/crypto/bind
1012
IAX_UNBIND_PATH=/sys/bus/dsa/drivers/idxd/unbind
1113
IAX_UNBIND_WQ_PATH=/sys/bus/dsa/drivers/crypto/unbind
1214

15+
echo lzo > /sys/module/zswap/parameters/compressor
1316
#
1417
# count iax instances
1518
#
@@ -34,3 +37,9 @@ done
3437

3538
rmmod iaa_crypto
3639
modprobe iaa_crypto
40+
41+
42+
43+
44+
45+

tests/madvise/enable_kernel_iaa

Lines changed: 0 additions & 89 deletions
This file was deleted.

tests/madvise/enable_zswap.sh

Lines changed: 3 additions & 2 deletions
@@ -1,6 +1,7 @@
1-
#!/usr/bin/env bash
1+
#!/bin/bash
22
#SPDX-License-Identifier: BSD-3-Clause
3-
#Copyright (c) 2023, Intel Corporation
3+
#Copyright (c) 2025, Intel Corporation
4+
#Description: Configure zswap
45

56
# Enable zswap
67
echo "Enabling zswap..."
