-
Notifications
You must be signed in to change notification settings - Fork 177
virsh_domjobinfo: fix pipe file read stuck when background virsh cmd failed #6564
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
virsh_domjobinfo: fix pipe file read stuck when background virsh cmd failed #6564
Conversation
WalkthroughAdds a bounded 6s wait and explicit error handling for two background Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Test as Test harness
participant Proc as virsh subprocess
participant FIFO as Temp FIFO
Test->>Proc: Start background virsh action
note right of Test: New: bounded wait (timeout=6s)
rect rgba(230,245,255,0.5)
Test->>Proc: communicate(timeout=6)
alt Completes within 6s
Proc-->>Test: stdout, stderr, returncode
alt returncode != 0
Test->>Test: unlink FIFO
Test->>Test: test.error(stderr)
else returncode == 0
Test->>FIFO: Read FIFO (existing flow)
end
else TimeoutExpired
Test->>Test: Log debug "background command still running"
Test->>FIFO: Read FIFO (existing flow)
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
✨ Finishing touches
🧪 Generate unit tests
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal). Please share your feedback with us on this Discord post. Comment Pre-merge checks✅ Passed checks (3 passed)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
libvirt/tests/src/virsh_cmd/domain/virsh_domjobinfo.py (2)
196-203
: Process cleanup bug: kill runs only when process already exited non‑zero
if process.poll():
is truthy only for non‑zero return codes; it never kills a still‑running process, risking a leak. Terminate whenpoll() is None
and fall back to kill on timeout.- if process: - if process.poll(): - try: - process.kill() - except OSError: - pass + if process and process.poll() is None: + try: + process.terminate() + process.wait(timeout=5) + except subprocess.TimeoutExpired: + try: + process.kill() + except OSError: + pass + except OSError: + pass
41-45
: Avoid shell=True; build argv with shlex to remove injection riskInputs ultimately come from params; using
shell=True
is unnecessary and risky. Build argv and run withshell=False
. Also log the joined command for parity.+import shlex @@ - command = "virsh %s %s %s" % (action, vm_name, file) - logging.debug("Action: %s", command) - p = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, - stderr=subprocess.PIPE) + cmd = ["virsh"] + shlex.split(action) + [vm_name] + if file: + cmd.append(file) + logging.debug("Action: %s", " ".join(cmd)) + p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)Also applies to: 1-6
🧹 Nitpick comments (4)
libvirt/tests/src/virsh_cmd/domain/virsh_domjobinfo.py (4)
153-160
: Good fix; add readable stderr and cleanup orderingDecode stderr for human-readable logs and unlink the FIFO before raising to avoid accidentally proceeding to open it. Minor, but makes failures clearer and safer.
- _, stderr = process.communicate(timeout=6) - if process.returncode: - test.error('Background cmd met unexpected failure of %s, abort the test.' % stderr) - os.unlink(tmp_pipe) + _, stderr_b = process.communicate(timeout=6) + if process.returncode: + # Decode for readable logs; ensure FIFO is removed before aborting. + stderr_s = (stderr_b.decode(locale.getpreferredencoding(), 'ignore') + if isinstance(stderr_b, (bytes, bytearray)) else str(stderr_b)) + try: + os.unlink(tmp_pipe) + except FileNotFoundError: + pass + test.error(f"Background virsh command failed: {stderr_s}")
140-141
: Guard against unboundf
and make the check explicitIf an exception occurs before assigning
f
,if process and f:
can raiseUnboundLocalError
whenprocess
is truthy. Initializef
toNone
and check explicitly.- process = None + process = None + f = None @@ - if process and f: + if process and f is not None: dummy = f.read() f.close()Also applies to: 187-190
106-110
: Make managedsave detection robust; trim action to avoid trailing‑space trapsRely on
actions
value and strip the composedaction
. Prevents brittle string equality with trailing spaces.- action = ' '.join([actions, act_opt]) + action = ' '.join([actions, act_opt]).strip() @@ - if action == "managedsave ": + if actions == "managedsave": process = get_subprocess(action, vm_name, "", None) else: process = get_subprocess(action, vm_name, tmp_pipe, None)Also applies to: 147-151
58-58
: Split once to avoid accidental multi‑colon key/value breakageSafer parsing if any value contains a colon.
- out_dict = dict([x.split(':') for x in out_list]) + out_dict = dict([x.split(':', 1) for x in out_list])
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
libvirt/tests/src/virsh_cmd/domain/virsh_domjobinfo.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Python 3.12
- GitHub Check: Python 3.8
- GitHub Check: Python 3.9
- GitHub Check: Python 3.11
7aa6817
to
48b0277
Compare
@smitterl please help review. |
…failed Since libvirt 11.5.0, the virsh dump --live option has been disabled and now returns an error message. This causes the test to hang while attempting to read the virsh dump pipe file's input. The fix implements subprocess.communicate() to handle the background command interaction: Command failure: the test aborts immediately with the error message. Command success: original logic remains unchanged. Note: If the background command fails after the timeout period, this likely indicates a libvirt issue, so no changes are made for now. Signed-off-by: Liang Cong <[email protected]>
575f781
to
8b89623
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you
Since libvirt 11.5.0, the virsh dump --live option has been disabled and now returns an error message.
This causes the test to hang while attempting to read the virsh dump pipe file's input.
The fix implements subprocess.communicate() to handle the background command interaction:
Command failure: the test aborts immediately with the error message.
Command success: original logic remains unchanged.
Note: If the background command fails after the timeout period, this likely indicates a libvirt issue, so no changes are made for now.
Note: this PR comes from the discussion of closed PR: #6478
Test result with libvirt >= 11.5.0:
(01/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.dump_action.live_dump.running_state.vm_id: STARTED
(01/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.dump_action.live_dump.running_state.vm_id: ERROR: Background cmd met unexpected failure of b"error: Failed to core dump domain 'avocado-vt-vm1' to /var/tmp/avocado_m85vaczv/domjobinfo.fifo\nerror: unsupported flags (0x2) in function qemuDomainCoreDumpWithFormat\n", abort the test. (21.61 s)
(02/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.dump_action.live_dump.paused_state.vm_name: STARTED
(02/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.dump_action.live_dump.paused_state.vm_name: ERROR: Background cmd met unexpected failure of b"error: Failed to core dump domain 'avocado-vt-vm1' to /var/tmp/avocado_4zt69ar6/domjobinfo.fifo\nerror: unsupported flags (0x2) in function qemuDomainCoreDumpWithFormat\n", abort the test. (18.84 s)
(03/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.dump_action.crash_dump.running_state.vm_id: STARTED
(03/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.dump_action.crash_dump.running_state.vm_id: PASS (30.01 s)
(04/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.dump_action.crash_dump.paused_state.vm_uuid: STARTED
(04/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.dump_action.crash_dump.paused_state.vm_uuid: PASS (29.90 s)
(05/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.dump_action.keep_complete_test.running_state.vm_name: STARTED
(05/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.dump_action.keep_complete_test.running_state.vm_name: ERROR: Background cmd met unexpected failure of b"error: Failed to core dump domain 'avocado-vt-vm1' to /var/tmp/avocado_62liqx51/domjobinfo.fifo\nerror: unsupported flags (0x2) in function qemuDomainCoreDumpWithFormat\n", abort the test. (22.09 s)
(06/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.save_action.running_state.vm_id: STARTED
(06/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.save_action.running_state.vm_id: PASS (37.78 s)
(07/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.save_action.paused_state.vm_name: STARTED
(07/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.save_action.paused_state.vm_name: PASS (37.94 s)
(08/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.managedsave_action.running_state.vm_id: STARTED
(08/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.managedsave_action.running_state.vm_id: PASS (37.37 s)
(09/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.managedsave_action.paused_state.vm_name: STARTED
(09/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.normal_test.managedsave_action.paused_state.vm_name: PASS (37.46 s)
(10/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.error_test.no_name: STARTED
(10/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.error_test.no_name: PASS (26.84 s)
(11/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.error_test.shutoff_state: STARTED
(11/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.error_test.shutoff_state: PASS (5.50 s)
(12/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.error_test.with_libvirtd_stop: STARTED
(12/12) type_specific.io-github-autotest-libvirt.virsh.domjobinfo.error_test.with_libvirtd_stop: PASS (26.66 s)
Summary by CodeRabbit
New Features
Bug Fixes
Tests