@DilumAluthge DilumAluthge commented May 14, 2025

Motivation

Currently, PackageAnalyzer.jl can process a Manifest.toml and extract the licenses of all the Julia packages listed in it.

However, many Julia packages use binary artifacts, and the licenses of those binary artifacts often differ from the license of the Julia wrapper package.

This PR adds the ability to extract the licenses for the binary artifacts.

Example usage

import Pkg
import PackageAnalyzer

Pkg.activate("/path/to/my/project")
Pkg.instantiate()
Pkg.precompile()

my_manifest = "/path/to/my/project/Manifest.toml"

all_pkgs = PackageAnalyzer.find_packages_in_manifest(my_manifest)
jll_pkgs = filter(x -> endswith(x.name, "_jll"), all_pkgs)


artifact_hash_to_licenses = Dict{Base.SHA1,Vector{PackageAnalyzer.ArtifactLicenseInfo}}()

PackageAnalyzer.generate_artifact_hash_to_licenses!(
    artifact_hash_to_licenses,
    jll_pkgs;
    allow_no_artifacts=Base.PkgId[],
)

pkgid_to_licenses = PackageAnalyzer.artifact_license_map(
    jll_pkgs,
    artifact_hash_to_licenses;
    allow_no_artifacts=Base.PkgId[],
)

@DilumAluthge DilumAluthge force-pushed the dpa/artifact-licenses branch from 000b7b3 to f5dd073 Compare May 14, 2025 12:18
@DilumAluthge DilumAluthge marked this pull request as ready for review May 14, 2025 12:18
@DilumAluthge DilumAluthge force-pushed the dpa/artifact-licenses branch from f5dd073 to 7e9157c Compare May 14, 2025 12:43
@DilumAluthge
Member Author

@ericphanson @giordano: Would you be able to review this?

@DilumAluthge DilumAluthge requested a review from giordano June 3, 2025 02:52
artifacts_toml_files = String[]
for (root, dirs, files) in walkdir(local_dir)
    for name in files
        if name in Artifacts.artifact_names
Member

Huh, this looks a bit weird to me. Does Artifacts have a global registry of all names or something? Is this the right way to match them? (Can there be collisions?)
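For context on the question above: in the Julia versions I have looked at, `Artifacts.artifact_names` is an unexported constant in the Artifacts stdlib holding the two recognized file names, `"JuliaArtifacts.toml"` and `"Artifacts.toml"`. It is not a registry of artifact names. The snippet under review is truncated here, so the following is only a sketch of what such a walk might look like; the function name `find_artifacts_toml_files` and everything inside the `if` are assumptions, not the PR's actual code.

```julia
import Artifacts

# Hypothetical helper: collect the path of every file under `local_dir` whose
# name matches one of the file names listed in `Artifacts.artifact_names`
# (an unexported constant, so it may change between Julia versions).
function find_artifacts_toml_files(local_dir::AbstractString)
    artifacts_toml_files = String[]
    for (root, dirs, files) in walkdir(local_dir)
        for name in files
            if name in Artifacts.artifact_names
                # Assumed body: record the full path of the matching file.
                push!(artifacts_toml_files, joinpath(root, name))
            end
        end
    end
    return artifacts_toml_files
end
```

Because the match is on file names rather than artifact names, a "collision" in this sketch would mean several such files in different subdirectories, each of which is simply collected separately.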

@ericphanson
Member

I think it would make sense for this feature to be integrated more into the existing code. For example, we can add an artifact_licenses field to PackageV1, alongside the existing license fields:

license_files::Vector{LicenseV1} # a table of all possible license files
licenses_in_project::Vector{String} # any licenses in the `license` key of the Project.toml

This is not breaking (see the note at https://juliaecosystem.github.io/PackageAnalyzer.jl/dev/#The-PackageV1-struct).

Then we can automatically populate it when analyzing that specific package.

Then we don't need to call obtain_code here or expand the API with artifact_license_map. Instead, we just update the analysis of the packagedir to also pull in artifact licenses and expose it in PackageV1. This allows the existing API (analyze_manifest, analyze with lists of packages, etc) to do a lot of the work, so we could have a smaller PR here but still get the artifact license functionality.

The other thing is I'm not sure we should use PkgId. That's a Base internal type and could expose us to breaking changes. I think it's probably enough to expose artifact licenses in PackageV1; the caller can map to PkgIds if they want.
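To illustrate the last point, here is a minimal sketch of how a caller could build a `PkgId` on their own side. It assumes the analyzed package record exposes `name` and `uuid` fields; the named tuple below is a stand-in for such a record, the helper `to_pkgid` is hypothetical, and the name and UUID are placeholders rather than a real package.

```julia
import UUIDs

# Stand-in for an analyzed package record; assumed to carry `name` and `uuid`.
# The name and UUID are placeholders, not a real package.
pkg = (name = "Example_jll", uuid = UUIDs.UUID("00000000-0000-0000-0000-000000000001"))

# Hypothetical helper: construct a Base.PkgId from the record on the caller's side,
# so the library itself does not need to expose PkgId in its API.
to_pkgid(pkg) = Base.PkgId(pkg.uuid, pkg.name)

pkgid = to_pkgid(pkg)
```

This keeps the dependency on `Base.PkgId` in user code, which is where any breakage from changes to that internal type would then be contained.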

3 participants