-
Notifications
You must be signed in to change notification settings - Fork 179
Add new cpu case of check die id #6567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
WalkthroughAdds a new CPU topology die-id test: a config with NUMA and die variants, plus a Python test that parses per-CPU sysfs topology, validates die/grouping/counts, and supports host_capabilities or on_vm execution with VM setup and cleanup. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Tester
participant Runner as Test Runner
participant Libvirt as libvirt/virsh
participant VM as Guest VM
participant Guest as Guest OS
participant SysFS as /sys/devices/system/cpu/...
rect rgb(248,249,255)
note right of Runner: Mode: host_capabilities
Tester->>Runner: run(..., host_capabilities)
Runner->>Libvirt: query capability_xml
Libvirt-->>Runner: capability XML
Runner->>Runner: inspect die_id per NUMA cell
Runner-->>Tester: result (pass/fail)
end
rect rgb(245,255,245)
note right of Runner: Mode: on_vm
Tester->>Runner: run(..., on_vm, dies=N)
Runner->>Libvirt: define VM XML (vcpu=N*8, NUMA, cpu topology)
Libvirt-->>Runner: VM defined
Runner->>Libvirt: start VM
Libvirt-->>VM: boot
Runner->>Guest: execute topology probe script
Guest->>SysFS: read cpu*/topology
SysFS-->>Guest: per-CPU fields
Guest-->>Runner: topology text
Runner->>Runner: parse_cpu_topology()
Runner->>Runner: check_cpu_topology(die_count=N)
alt Validation OK
Runner-->>Tester: pass with summary
else Validation errors
Runner-->>Tester: fail with errors
end
Runner->>Libvirt: restore/undefine VM XML
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20–30 minutes Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (7)
libvirt/tests/cfg/cpu/cpu_topology_die_id.cfg (1)
14-16: NUMA cpus mapping is fixed to 0-15; dies=5/11 will leave most vCPUs unmapped.For dies=5 and 11, vcpu = dies*8 yields 40 and 88 vCPUs, but NUMA cells only cover 0-15. This risks undefined placement or libvirt rejection depending on version. Make the cpulists depend on dies.
Apply this diff to generate dynamic ranges per socket:
- cpu_numa_attrs = {'numa_cell': [{'id': '0', 'discard': 'yes', 'memory': '512000', 'cpus': '0-7', 'unit': 'KiB'}, {'memAccess': 'shared', 'id': '1', 'memory': '512000', 'cpus': '8-15', 'unit': 'KiB'}]} + cpu_numa_attrs = { + 'numa_cell': [ + {'id': '0', 'discard': 'yes', 'memory': '512000', + 'cpus': '0-' + str(${dies}*4 - 1), 'unit': 'KiB'}, + {'memAccess': 'shared', 'id': '1', 'memory': '512000', + 'cpus': str(${dies}*4) + '-' + str(${dies}*8 - 1), 'unit': 'KiB'} + ] + }If your environment requires integers in topology, you can also drop the quotes around '2' values in sockets/cores/threads.
libvirt/tests/src/cpu/cpu_topology_die_id.py (6)
3-7: Remove unused imports/const.
virsh,libvirt, andVIRSH_ARGSare unused. Trim to keep the module clean.-from virttest import virsh from virttest.libvirt_xml import vm_xml -from virttest.utils_test import libvirt from virttest.libvirt_xml import capability_xml -VIRSH_ARGS = {'ignore_status': False, 'debug': True}
138-141: Docstring mismatch.Update the docstring to reflect die-id topology validation.
- Test start VM with hyper-v related features + Validate die_id topology via host capabilities or inside a VM.
167-187: Heredoc is safer than echo for multi-line shell script.Prevents quoting/escape surprises and preserves tabs/newlines reliably.
- session.cmd(f"echo '{shell_str}' > {shell_file}") - LOG.debug(session.cmd(f'cat {shell_file}')) + session.cmd(f"cat > {shell_file} <<'EOF'\n{shell_str}\nEOF") + LOG.debug(session.cmd_output(f'cat {shell_file}'))
194-196: test.fail expects a string; join the errors.Otherwise you pass a list and get an unhelpful representation.
- if result['valid'] is False: - test.fail(result['errors']) + if result['valid'] is False: + test.fail('; '.join(result['errors']))
142-172: Ensure session/VM cleanup and restore XML after shutdown.Avoid leaking sessions and restoring XML while the VM is running.
@@ - vmxml = vm_xml.VMXML.new_from_dumpxml(vm_name) - bkxml = vmxml.copy() + vmxml = vm_xml.VMXML.new_from_dumpxml(vm_name) + bkxml = vmxml.copy() + session = None @@ - session = vm.wait_for_login() + session = vm.wait_for_login() @@ finally: - bkxml.sync() + try: + if session: + session.close() + finally: + if vm.is_alive(): + vm.destroy() + bkxml.sync()Also applies to: 197-199
82-111: Topology checks are fragile for non-contiguous lists.
phylist/dielistequality to single ranges assumes contiguous numbering. Real kernels may return comma-separated lists (e.g., “0-3,32-35”). Consider parsing lists into sets and validating membership instead of string equality.I can add a tiny cpulist parser and update the checks—want me to push a patch?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
libvirt/tests/cfg/cpu/cpu_topology_die_id.cfg(1 hunks)libvirt/tests/src/cpu/cpu_topology_die_id.py(1 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
libvirt/tests/src/cpu/cpu_topology_die_id.py
145-145: Use of possibly insecure function; consider using ast.literal_eval
(S307)
146-146: Local variable cpu_attrs is assigned to but never used
Remove assignment to unused variable cpu_attrs
(F841)
146-146: Use of possibly insecure function; consider using ast.literal_eval
(S307)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Python 3.9
- GitHub Check: Python 3.12
- GitHub Check: Python 3.8
- GitHub Check: Python 3.11
| vm_attrs = eval(params.get('vm_attrs', '{}')) | ||
| cpu_attrs = eval(params.get('cpu_attrs', '{}')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace eval() with ast.literal_eval() and drop unused variable.
Avoid eval on untrusted config. Also cpu_attrs local is unused.
+import ast
@@
- vm_attrs = eval(params.get('vm_attrs', '{}'))
- cpu_attrs = eval(params.get('cpu_attrs', '{}'))
+ vm_attrs = ast.literal_eval(params.get('vm_attrs', '{}'))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| vm_attrs = eval(params.get('vm_attrs', '{}')) | |
| cpu_attrs = eval(params.get('cpu_attrs', '{}')) | |
| import ast | |
| vm_attrs = ast.literal_eval(params.get('vm_attrs', '{}')) |
🧰 Tools
🪛 Ruff (0.12.2)
145-145: Use of possibly insecure function; consider using ast.literal_eval
(S307)
146-146: Local variable cpu_attrs is assigned to but never used
Remove assignment to unused variable cpu_attrs
(F841)
146-146: Use of possibly insecure function; consider using ast.literal_eval
(S307)
🤖 Prompt for AI Agents
libvirt/tests/src/cpu/cpu_topology_die_id.py around lines 145-146: replace the
unsafe eval() calls with ast.literal_eval() for parsing params.get('vm_attrs',
'{}') and remove the unused cpu_attrs variable; add "import ast" at the top of
the file if not already present and update the assignment to only create
vm_attrs = ast.literal_eval(params.get('vm_attrs', '{}')) (delete the cpu_attrs
line).
| try: | ||
| if case == 'host_capabilities': | ||
| capa = capability_xml.CapabilityXML() | ||
| cells = capa.cells_topology.get_cell() | ||
| die_id_dict = {cell.cell_id: [cpu['die_id'] | ||
| for cpu in cell.cpu] for cell in cells} | ||
| die_ids = [set(v) for v in die_id_dict] | ||
| if not all([len(x) == 1 for x in die_ids]): | ||
| test.error( | ||
| 'The values of die_id of the same cell should be the same') | ||
| die_id_set = set([int(x) for y in die_ids for x in y]) | ||
| if die_id_set != {0, 1}: | ||
| test.error( | ||
| f'The values of die_id should be 0 or 1, not {die_id_set}') | ||
| elif case == 'on_vm': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Fix host_capabilities logic: wrong dict iteration and over‑strict die_id check.
for v in die_id_dict iterates keys, not values. Also requiring die_id_set == {0,1} will fail on hosts with other die counts. Validate per‑cell die_id contiguity (0..max) instead.
if case == 'host_capabilities':
capa = capability_xml.CapabilityXML()
cells = capa.cells_topology.get_cell()
- die_id_dict = {cell.cell_id: [cpu['die_id']
- for cpu in cell.cpu] for cell in cells}
- die_ids = [set(v) for v in die_id_dict]
- if not all([len(x) == 1 for x in die_ids]):
- test.error(
- 'The values of die_id of the same cell should be the same')
- die_id_set = set([int(x) for y in die_ids for x in y])
- if die_id_set != {0, 1}:
- test.error(
- f'The values of die_id should be 0 or 1, not {die_id_set}')
+ die_id_map = {
+ cell.cell_id: [int(cpu['die_id']) for cpu in cell.cpu]
+ for cell in cells
+ }
+ die_sets = [set(v) for v in die_id_map.values()]
+ if any(len(s) == 0 for s in die_sets):
+ test.error('No die_id values found in one or more cells')
+ # Expect zero-based, contiguous die ids within each cell, e.g. {0..N}
+ for cid, s in zip(die_id_map.keys(), die_sets):
+ if sorted(s) != list(range(max(s) + 1)):
+ test.error(f'Cell {cid} die_id set must be contiguous starting at 0, got {sorted(s)}')If cells_topology.get_cell() or cell.cpu shapes differ in your tree, adjust attribute names accordingly.
🏁 Script executed:
#!/bin/bash
# Inspect capability_xml API to confirm attribute names used above.
fd -t f capability_*.py | xargs -I{} sh -c 'echo ">>> {}"; rg -n "class CapabilityXML|cells_topology|get_cell|die_id|cpu" -n {} -C2'Length of output: 0
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "PWD: $(pwd)"
echo "Top-level files:"
ls -la | sed -n '1,120p'
echo
echo "Searching repository for relevant symbols..."
rg -n --hidden --no-ignore -S "CapabilityXML|capability_xml|cells_topology|get_cell|die_id|cell\.cpu|cell_id" -g '!**/node_modules/**' -g '!**/.git/**' || true
echo
TARGET="libvirt/tests/src/cpu/cpu_topology_die_id.py"
if [ -f "$TARGET" ]; then
echo "---- Showing $TARGET (first 400 lines) ----"
nl -ba "$TARGET" | sed -n '1,400p'
else
echo "Target file not found: $TARGET"
fiLength of output: 1922
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "PWD: $(pwd)"
echo
echo "Listing target directory (if present):"
ls -la libvirt/tests/src/cpu 2>/dev/null || true
FILE="libvirt/tests/src/cpu/cpu_topology_die_id.py"
if [ -f "$FILE" ]; then
echo "---- Showing $FILE (up to 400 lines) ----"
sed -n '1,400p' "$FILE"
else
echo "$FILE not found. Searching for file..."
if command -v fd >/dev/null 2>&1; then
fd --hidden --type f 'cpu_topology_die_id.py' || true
else
find . -type f -name 'cpu_topology_die_id.py' -print || true
fi
fi
echo
echo "Searching repository for capability_xml / CapabilityXML / die_id occurrences..."
if command -v rg >/dev/null 2>&1; then
rg -n -uu "CapabilityXML|capability_xml|cells_topology|get_cell|die_id|cell\.cpu|cell_id" || true
else
echo "rg not available; falling back to grep"
grep -RIn --exclude-dir=.git -E "CapabilityXML|capability_xml|cells_topology|get_cell|die_id|cell\.cpu|cell_id" . || true
fiLength of output: 33720
Fix host_capabilities logic: wrong dict iteration and over‑strict die_id check.
for v in die_id_dict iterates keys, not values. Also requiring die_id_set == {0,1} will fail on hosts with other die counts — validate per‑cell die_id contiguity (0..max) instead.
if case == 'host_capabilities':
capa = capability_xml.CapabilityXML()
cells = capa.cells_topology.get_cell()
- die_id_dict = {cell.cell_id: [cpu['die_id']
- for cpu in cell.cpu] for cell in cells}
- die_ids = [set(v) for v in die_id_dict]
- if not all([len(x) == 1 for x in die_ids]):
- test.error(
- 'The values of die_id of the same cell should be the same')
- die_id_set = set([int(x) for y in die_ids for x in y])
- if die_id_set != {0, 1}:
- test.error(
- f'The values of die_id should be 0 or 1, not {die_id_set}')
+ die_id_map = {
+ cell.cell_id: [int(cpu['die_id']) for cpu in cell.cpu]
+ for cell in cells
+ }
+ die_sets = [set(v) for v in die_id_map.values()]
+ if any(len(s) == 0 for s in die_sets):
+ test.error('No die_id values found in one or more cells')
+ # Expect zero-based, contiguous die ids within each cell, e.g. {0..N}
+ for cid, s in zip(die_id_map.keys(), die_sets):
+ if sorted(s) != list(range(max(s) + 1)):
+ test.error(f'Cell {cid} die_id set must be contiguous starting at 0, got {sorted(s)}')If cells_topology.get_cell() or cell.cpu shapes differ in your tree, adjust attribute names accordingly.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try: | |
| if case == 'host_capabilities': | |
| capa = capability_xml.CapabilityXML() | |
| cells = capa.cells_topology.get_cell() | |
| die_id_dict = {cell.cell_id: [cpu['die_id'] | |
| for cpu in cell.cpu] for cell in cells} | |
| die_ids = [set(v) for v in die_id_dict] | |
| if not all([len(x) == 1 for x in die_ids]): | |
| test.error( | |
| 'The values of die_id of the same cell should be the same') | |
| die_id_set = set([int(x) for y in die_ids for x in y]) | |
| if die_id_set != {0, 1}: | |
| test.error( | |
| f'The values of die_id should be 0 or 1, not {die_id_set}') | |
| elif case == 'on_vm': | |
| try: | |
| if case == 'host_capabilities': | |
| capa = capability_xml.CapabilityXML() | |
| cells = capa.cells_topology.get_cell() | |
| die_id_map = { | |
| cell.cell_id: [int(cpu['die_id']) for cpu in cell.cpu] | |
| for cell in cells | |
| } | |
| die_sets = [set(v) for v in die_id_map.values()] | |
| if any(len(s) == 0 for s in die_sets): | |
| test.error('No die_id values found in one or more cells') | |
| # Expect zero-based, contiguous die ids within each cell, e.g. {0..N} | |
| for cid, s in zip(die_id_map.keys(), die_sets): | |
| if sorted(s) != list(range(max(s) + 1)): | |
| test.error(f'Cell {cid} die_id set must be contiguous starting at 0, got {sorted(s)}') | |
| elif case == 'on_vm': |
🤖 Prompt for AI Agents
In libvirt/tests/src/cpu/cpu_topology_die_id.py around lines 152-166, the
host_capabilities branch iterates the die_id_dict keys instead of values and
then enforces a global set equality to {0,1}; change the comprehension to
iterate die_id_dict.values(), build per-cell sets of die_id integers, and
replace the global equality check with a per-cell contiguity check that verifies
each cell's die_id set equals set(range(0, max_die+1)) (i.e. starts at 0 and
contains contiguous IDs up to the max for that cell); ensure die_id values are
cast to int when forming sets and keep the existing fallback note about
adjusting attribute names if cells_topology.get_cell() or cell.cpu differ.
- RHEL-187329 - [Topology] Configure 'die_id' in CPU topology part for VM Signed-off-by: Haijiao Zhao <[email protected]>
bd985f6 to
496000d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
libvirt/tests/src/cpu/cpu_topology_die_id.py (1)
170-179: Consider security implications of shell script execution.The shell script construction uses string formatting with user-controlled input (
vcpu - 1). Whilevcpucomes from test parameters rather than external user input, consider validating it's a positive integer to prevent potential command injection.Add input validation:
vcpu = vm_attrs['vcpu'] + if not isinstance(vcpu, int) or vcpu <= 0: + test.error(f'Invalid vcpu value: {vcpu}') vmxml.setup_attrs(**vm_attrs)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
libvirt/tests/cfg/cpu/cpu_topology_die_id.cfg(1 hunks)libvirt/tests/src/cpu/cpu_topology_die_id.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- libvirt/tests/cfg/cpu/cpu_topology_die_id.cfg
🧰 Additional context used
🧬 Code graph analysis (1)
libvirt/tests/src/cpu/cpu_topology_die_id.py (2)
provider/guest_os_booting/guest_os_booting_base.py (1)
get_vm(18-56)libvirt/tests/src/serial/serial_functional.py (1)
wait_for_login(235-258)
🪛 Ruff (0.13.1)
libvirt/tests/src/cpu/cpu_topology_die_id.py
143-143: Use of possibly insecure function; consider using ast.literal_eval
(S307)
144-144: Local variable cpu_attrs is assigned to but never used
Remove assignment to unused variable cpu_attrs
(F841)
144-144: Use of possibly insecure function; consider using ast.literal_eval
(S307)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Python 3.8
- GitHub Check: Python 3.9
- GitHub Check: Python 3.11
- GitHub Check: Python 3.12
🔇 Additional comments (6)
libvirt/tests/src/cpu/cpu_topology_die_id.py (6)
143-144: Replace unsafe eval() with ast.literal_eval().The
eval()function poses security risks as it can execute arbitrary code, whileast.literal_eval()is a safer alternative that only processes basic Python literals.Apply this diff to address the security concern:
+import ast + - vm_attrs = eval(params.get('vm_attrs', '{}')) - cpu_attrs = eval(params.get('cpu_attrs', '{}')) + vm_attrs = ast.literal_eval(params.get('vm_attrs', '{}'))Note that the
cpu_attrsvariable is unused and should be removed entirely.
150-164: Fix dictionary iteration and overly restrictive die_id validation.The current logic has two issues:
for v in die_id_dictiterates over keys instead of values, and the global die_id check{0, 1}is too restrictive for systems with different die counts.Apply this diff to correct the logic:
if case == 'host_capabilities': capa = capability_xml.CapabilityXML() cells = capa.cells_topology.get_cell() - die_id_dict = {cell.cell_id: [cpu['die_id'] - for cpu in cell.cpu] for cell in cells} - die_ids = [set(v) for v in die_id_dict] - if not all([len(x) == 1 for x in die_ids]): - test.error( - 'The values of die_id of the same cell should be the same') - die_id_set = set([int(x) for y in die_ids for x in y]) - if die_id_set != {0, 1}: - test.error( - f'The values of die_id should be 0 or 1, not {die_id_set}') + die_id_map = { + cell.cell_id: [int(cpu['die_id']) for cpu in cell.cpu] + for cell in cells + } + die_sets = [set(v) for v in die_id_map.values()] + if any(len(s) == 0 for s in die_sets): + test.error('No die_id values found in one or more cells') + # Expect zero-based, contiguous die ids within each cell + for cid, s in zip(die_id_map.keys(), die_sets): + if sorted(s) != list(range(max(s) + 1)): + test.error(f'Cell {cid} die_id set must be contiguous starting at 0, got {sorted(s)}')
10-39: LGTM! Well-structured parsing function.The function correctly parses CPU topology output into structured data with appropriate field handling and error resilience for empty lines.
41-134: LGTM! Comprehensive validation logic.The validation function implements thorough checks for CPU topology consistency including total CPU count, physical ID distribution, phylist patterns, die list ranges, and die ID progression. The error reporting structure provides clear diagnostics.
164-194: LGTM! Well-implemented VM test execution.The VM setup, shell script execution, and validation flow are correctly implemented with proper session management and cleanup.
195-197: LGTM! Proper cleanup in finally block.The backup XML restoration ensures proper cleanup regardless of test outcome.
Test result:
(1/1) type_specific.local.cpu_topology_die_id.on_vm.dies_5: STARTED
(1/1) type_specific.local.cpu_topology_die_id.on_vm.dies_5: PASS (29.71 s)
(1/1) type_specific.local.cpu_topology_die_id.on_vm.dies_2: STARTED
(1/1) type_specific.local.cpu_topology_die_id.on_vm.dies_2: PASS (27.66 s)
(1/1) type_specific.local.cpu_topology_die_id.on_vm.dies_11: STARTED
(1/1) type_specific.local.cpu_topology_die_id.on_vm.dies_11: PASS (50.99 s)
Summary by CodeRabbit