Copilot AI commented Dec 17, 2025

KGX validation incorrectly flagged mixin classes such as biolink:GenomicEntity as invalid node categories. This occurred because BMT's get_descendants() returns only the classes linked through is_a/mixin relationships and omits the mixin classes that those descendants reference.

Changes

  • Added helper method _collect_valid_uris_with_mixins(): Single-pass collection of URIs from descendants and their referenced mixins, reducing complexity from O(n*m) to O(n+m)

  • Updated _get_valid_categories() and _get_valid_predicates(): Now include mixin classes that are actually used by descendants

  • Added tests: Verify GenomicEntity and other mixins are recognized as valid categories
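
The single-pass collection described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: ToyClass, TOY_SCHEMA, and toy_descendants are hypothetical stand-ins for the Biolink model and BMT, and the class hierarchy shown is deliberately simplified. Each descendant is visited once and each of its mixin references is visited once.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyClass:
    """Hypothetical stand-in for a schema class definition (not BMT's type)."""
    name: str
    uri: str
    is_a: Optional[str] = None
    mixins: list = field(default_factory=list)


# Simplified, made-up schema for illustration; not the real Biolink hierarchy.
TOY_SCHEMA = {c.name: c for c in [
    ToyClass("named thing", "biolink:NamedThing"),
    ToyClass("gene", "biolink:Gene", is_a="named thing",
             mixins=["genomic entity", "ontology class"]),
    ToyClass("genomic entity", "biolink:GenomicEntity"),
    ToyClass("ontology class", "biolink:OntologyClass"),
    ToyClass("unused mixin", "biolink:UnusedMixin"),
]}


def toy_descendants(root: str) -> list:
    """Return root plus every class reachable through is_a links only."""
    found = [root]
    for name in found:  # the list grows while we iterate (breadth-first walk)
        found.extend(c.name for c in TOY_SCHEMA.values() if c.is_a == name)
    return found


def collect_valid_uris_with_mixins(root: str) -> set:
    """Collect URIs of descendants and of the mixins they reference."""
    valid = set()
    for name in toy_descendants(root):
        cls = TOY_SCHEMA[name]
        valid.add(cls.uri)
        # Only mixins actually used by a descendant are added.
        for mixin_name in cls.mixins:
            valid.add(TOY_SCHEMA[mixin_name].uri)
    return valid


valid_uris = collect_valid_uris_with_mixins("named thing")
```

In this toy schema, "unused mixin" is never referenced by a descendant, so its URI stays out of the result, which mirrors the "actually used by descendants" wording above.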

Example

Before this fix, nodes with mixin categories were flagged:

{
    'id': 'HGNC:1234',
    'category': ['biolink:GenomicEntity'],  # ❌ Invalid warning
    'name': 'Test Gene'
}

After:

{
    'id': 'HGNC:1234',
    'category': ['biolink:GenomicEntity'],  # ✅ Valid
    'name': 'Test Gene'
}

Mixins like GenomicEntity, PhysicalEssence, and OntologyClass are now correctly recognized as valid categories.
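
To make the before/after behaviour concrete, here is a rough sketch of a category check. The function name find_invalid_categories and both category sets are assumptions made for illustration; only the warning fields and message wording are taken from the report in the original issue. The validator's real implementation may differ.

```python
def find_invalid_categories(nodes, valid_categories):
    """Return one WARN record per category that is not in valid_categories."""
    warnings = []
    for index, node in enumerate(nodes):
        for category in node.get("category", []):
            if category not in valid_categories:
                warnings.append({
                    "type": "biolink-model validation",
                    "severity": "WARN",
                    "message": (
                        f"Node at /nodes/{index} has potentially invalid "
                        f"category '{category}'"
                    ),
                })
    return warnings


node = {
    "id": "HGNC:1234",
    "category": ["biolink:GenomicEntity"],
    "name": "Test Gene",
}

# Hypothetical category sets: the first omits mixins, the second includes them.
descendants_only = {"biolink:NamedThing", "biolink:Gene"}
with_mixins = descendants_only | {"biolink:GenomicEntity"}

warnings_before = find_invalid_categories([node], descendants_only)
warnings_after = find_invalid_categories([node], with_mixins)
```

With the mixin-free set the node produces one warning; once the mixin URI is part of the valid set, the same node passes without warnings.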

Original prompt

This section details the original issue you should resolve

<issue_title>why is GenomicEntity not an allowed class for a node category?</issue_title>
<issue_description>Our team just ran our latest KGX knowledge graph through validate_biolink_kgx.py, and we are seeing warnings like this:

          {
            "type": "biolink-model validation",
            "severity": "WARN",
            "message": "Node at /nodes/247 has potentially invalid category 'biolink:GenomicEntity'",
            "instance_path": "unknown"
          },

On one level, I'm happy it's not an error, but on another level, I'd like to understand why. Is it because genomic entity is a mixin?

Thanks,
Steve</issue_description>

Comments on the Issue (you are @copilot in this section)

@saramsey Tagging @sierra-moxon @mbrush and @RichardBruskiewich

@saramsey Also see this issue: https://github.com/biolink/biolink-model/issues/1405

@saramsey I know we allow edge predicates to be mixins, but maybe not node categories?

@sierra-moxon I think this warning is invalid. It comes from the validator using BMT, which has flags for including or excluding mixins. Those flags aren’t being set in the validator, but they should be.

Copilot AI self-assigned this Dec 17, 2025
Copilot AI changed the title from "[WIP] Fix warnings for GenomicEntity category validation" to "Fix validation to recognize mixin classes as valid node categories" Dec 17, 2025
Copilot AI requested a review from sierra-moxon December 17, 2025 22:15
sierra-moxon marked this pull request as ready for review December 18, 2025 00:49
@sierra-moxon

This ended up being more of a collaboration than a fully Copilot-solved issue. tbf, I extended the scope a bit to centralize the Biolink naming convention transforms into a single method and reused BMT there.

Successfully merging this pull request may close these issues.

why is GenomicEntity not an allowed class for a node category?
