-
Notifications
You must be signed in to change notification settings - Fork 157
Generate SBOMs for repaired libraries #577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #577 +/- ##
==========================================
+ Coverage 92.83% 93.00% +0.16%
==========================================
Files 21 22 +1
Lines 1773 1815 +42
Branches 333 338 +5
==========================================
+ Hits 1646 1688 +42
Misses 77 77
Partials 50 50 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Taking this pull request out of draft to begin inviting reviews. I still need to add some integration tests to show that the SBOM documents get generated for many different operating systems offered by manylinux images :) cc @lkollar, @mayeut, and @captn3m0 who expressed interested in the linked issues. |
@@ -0,0 +1,359 @@ | |||
# SPDX-License-Identifier: MIT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a vendored project "whichprovides" which I maintain. It's the abstraction of the package provider detection that was mentioned here: #541 (comment)
Please feel free to read and review this code, too. This code hasn't been reviewed by anyone except myself, so your perspective is more than welcome!
for more information, see https://pre-commit.ci
Okay @mayeut and @auvipy this pull request is now ready to be reviewed.
Let me know what else I can do to get this PR ready :) |
|
||
@staticmethod | ||
def distro() -> typing.Optional[str]: | ||
return PackageProvider.os_release().get("ID", None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be cached or maybe _package_providers
should be cached and only return available providers (or add a cached method _available_package_providers()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created as a PR to whichprovides here: sethmlarson/whichprovides#2
src/auditwheel/sboms.py
Outdated
wheel_name, wheel_version, *_ = wheel_fname.split("-", 2) | ||
wheel_name = wheel_name.lower() | ||
wheel_purl = ( | ||
f"pkg:pypi/{quote(wheel_name, safe='')}@{quote(wheel_version, safe='')}" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name normalization seems a bit fragile not sure if we can reuse some packaging stuff here https://github.com/package-url/purl-spec/blob/main/types-doc/pypi-definition.md#name-definition
The package name is probably safer to retrieve from metadata rather than filename which already undergoes normalization ?
The file_name
optional qualifier should be used I think https://github.com/package-url/purl-spec/blob/main/types-doc/pypi-definition.md#qualifiers-definition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your work on this @sethmlarson
I left some inline comments
Co-authored-by: Matthieu Darbois <[email protected]>
Co-authored-by: Matthieu Darbois <[email protected]>
Co-authored-by: Matthieu Darbois <[email protected]>
Co-authored-by: Matthieu Darbois <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds automatic SBOM (Software Bill of Materials) generation for wheels that are repaired by auditwheel. The implementation leverages a vendored package called "whichprovides" to identify which system packages provide the shared libraries that get grafted into wheels during the repair process.
- Introduces SBOM generation using CycloneDX format for repaired wheels
- Vendors the "whichprovides" library to identify system package sources for grafted libraries
- Adds comprehensive test coverage for both unit and integration scenarios
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
File | Description |
---|---|
src/auditwheel/sboms.py | Core SBOM creation logic that generates CycloneDX format documents |
src/auditwheel/repair.py | Integration of SBOM generation into the wheel repair workflow |
src/auditwheel/wheeltools.py | Moved wheel filename parsing regex for reuse |
src/auditwheel/_vendor/whichprovides/ | Vendored library for identifying package providers of system files |
tests/unit/test_sboms.py | Unit tests for SBOM creation functionality |
tests/integration/test_manylinux.py | Integration test validating SBOM generation with numpy wheel |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
||
def create_sbom_for_wheel( | ||
wheel_fname: str, sbom_filepaths: list[Path] | ||
) -> None | dict[str, typing.Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return type annotation uses None | dict
syntax which requires Python 3.10+. Consider using Optional[dict[str, typing.Any]]
for broader compatibility or ensure the project's minimum Python version supports this syntax.
) -> None | dict[str, typing.Any]: | |
) -> typing.Optional[dict[str, typing.Any]]: |
Copilot uses AI. Check for mistakes.
@@ -64,6 +59,9 @@ def repair_wheel( | |||
raise ValueError(msg) | |||
|
|||
dest_dir = Path(match.group("name") + lib_sdir) | |||
dist_info_dirs = glob.glob(str(ctx.path / "*.dist-info")) | |||
assert len(dist_info_dirs) == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assertion lacks a descriptive error message. Consider adding a message like assert len(dist_info_dirs) == 1, f'Expected exactly one .dist-info directory, found {len(dist_info_dirs)}'
to help with debugging when this fails.
assert len(dist_info_dirs) == 1 | |
assert len(dist_info_dirs) == 1, f"Expected exactly one .dist-info directory, found {len(dist_info_dirs)}: {dist_info_dirs}" |
Copilot uses AI. Check for mistakes.
@@ -341,7 +342,7 @@ def docker_exec( | |||
ec, output = container.exec_run(cmd, workdir=cwd, environment=env) | |||
output = output.decode("utf-8") | |||
if ec != expected_retcode: | |||
print(output) | |||
logger.info("docker exec error %s: %s", container.id[:12], output) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] This appears to be an unrelated change that converts a print statement to a logger call. While this is an improvement, it should be documented in the PR description or separated into its own commit as it's not directly related to SBOM generation.
Copilot uses AI. Check for mistakes.
(Draft, do not merge) This is my initial pull request for adding automatic SBOM generation based on package manager info for libraries that are repaired into wheels. I've tested locally by building wheels from various projects and operating systems, will work on getting those pulled into the test suite.
The majority of the "logic" comes from the project "whichprovides" which gets bundled into auditwheel as a single file. You can review that project in its entirety within this pull request. If you'd like to submit comments that you have about "whichprovides" you can do so here and I'll get them addressed in the upstream project.
Closes #541
Closes #398