Contributor

@fellen31 fellen31 commented Dec 11, 2025

Description

Added

  • REVEL, dbnsfp_gerp++_rs, dbnsfp_phastcons100way_vertebrate, dbnsfp_phylop100way_vertebrate for SNVs
  • most severe pli for SVs
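One way to read "most severe pli" is as the highest pLI among the genes an SV overlaps. A minimal sketch of that reading, assuming the annotation arrives as a comma-separated string of per-gene values; the function name and the input format are hypothetical and not taken from the pipeline:

```python
from typing import Optional


def most_severe_pli(raw: Optional[str], separator: str = ",") -> Optional[float]:
    """Return the highest pLI in a separator-delimited string, or None if absent.

    Assumption: a higher pLI means a more loss-of-function-intolerant gene,
    so the "most severe" value is taken to be the maximum.
    """
    if not raw:
        return None
    values = []
    for token in raw.split(separator):
        token = token.strip()
        if not token or token == ".":  # skip empty and VCF-style missing entries
            continue
        values.append(float(token))
    return max(values) if values else None
```

Returning `None` for a missing annotation keeps "no value" distinct from a pLI of 0, so the rank model can fall back to its `not_reported` score.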

Changed

  • CoLorsDB and LoqusDB SNVs to same frequencies as short-read SNV databases
  • gnomAD SVs to the same frequencies and scores as in short-read
  • CoLorsDB and LoqusDB SVs to same frequencies as short-read SV databases, except set common to 5% (from 10%)
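The effect of the tighter `common` cutoff for SVs can be sketched as a simple band lookup. Only the `common` score (-12), its old and new lower bounds (10% and 5%), and the `not_reported` score (4) come from this PR; the function name, the neutral score of 0 for every other band, and the inclusive lower bound are assumptions made for illustration:

```python
from typing import Optional

COMMON_SCORE = -12       # from the rank model diff
NOT_REPORTED_SCORE = 4   # from the rank model diff
NEUTRAL_SCORE = 0        # hypothetical placeholder for all other bands


def sv_frequency_score(frequency: Optional[float], common_lower: float = 0.05) -> int:
    """Score an SV database frequency; None means the variant was not reported."""
    if frequency is None:
        return NOT_REPORTED_SCORE
    if frequency >= common_lower:
        return COMMON_SCORE
    return NEUTRAL_SCORE


# A 7% variant is penalised with the new 5% cutoff but not with the old 10% one.
print(sv_frequency_score(0.07))                    # -12
print(sv_frequency_score(0.07, common_lower=0.1))  # 0
```

The real rank model has more bands between `not_reported` and `common`; they are collapsed here because their names and scores do not appear in the visible diff.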

How to prepare for test

  • Ssh to relevant server (depending on type of change)
  • Use stage: us
  • Paxa the environment: paxa
  • Install on stage (example for Hasta):
    bash /home/proj/production/servers/resources/hasta.scilifelab.se/update-tool-stage.sh -e S_[TOOL] -t [TOOL] -b [THIS-BRANCH-NAME] -a

How to test

  • Do ...

Expected test outcome

  • Check that ...
  • Take a screenshot and attach or copy/paste the output.

Review

  • Tests executed by
  • "Merge and deploy" approved by
    Thanks for filling in who performed the code review and the test!

This version is a

  • MAJOR - when you make incompatible API changes
  • MINOR - when you add functionality in a backwards compatible manner
  • PATCH - when you make backwards compatible bug fixes or documentation/instructions

Implementation Plan

  • Document in ...
  • Deploy this branch on ...
  • Inform to ...

+3 regardless of model

Add polyphen, revel, sift, dbnsfp_gerp++_rs, dbnsfp_phastcons100way_vertebrate, dbnsfp_phylop100way_vertebrate

Add most severe pli for SVs

update rank model suggestion
@fellen31 fellen31 marked this pull request as ready for review December 11, 2025 15:39
@fellen31 fellen31 requested review from dnil and jemten December 18, 2025 13:29
@fellen31 fellen31 force-pushed the update-nallo-rank-model branch from 946238e to 85fcb39 Compare December 29, 2025 15:16
Member

@dnil dnil left a comment

Nice! Good test results override any guesswork we might have on scores. Note the diff on SV loqusdb not_reported/missing scores (4 vs 6): is it intentional and still valid?

[[common]]
score = -12
-lower = 0.1
+lower = 0.02
Member

💯

[[not_reported]]
score = 4

[[missing]]
Member

Right, are we still getting these? 🤔 Oh well, good fallback.

Contributor Author

Yeah, never got around to fixing it in echtvar.

-upper = 0.01
+upper = 0.0005

[revel]
Member

💯

lower = 0.75
upper = 1

[dbnsfp_gerp++_rs]
Member

💯

lower = 0
upper = 2

[dbnsfp_phastcons100way_vertebrate]
Member

👍

lower = 0
upper = 0.8

[dbnsfp_phylop100way_vertebrate]
Member

score = 4
lower = 0
-upper = 0.01
+upper = 0.0005
Member

👍

separators = ',',

[[not_reported]]
score = 4
Member

This is 6 in the SR SV model. 🤔

[[common]]
score = -12
-lower = 0.1
+lower = 0.05
Member

🫡

lower = -400
upper = -1

[gene_intolerance_score]
Member

💯


[[not_reported]]
-score = 0
+score = 3
Member

This feels weird, but again, the results are most important. The genotypes for SVs are still too noisy I suppose. 😞

Contributor Author

We don't have any variants that don't get a model, so in practice this doesn't seem to matter. I can change it back next update.

Member

Yikes, yes, then no worries. Something to bump over at the CNV callers? We added a simple model for CNVnator way back when, to at least get some genotypes for the copy number changes, where we have some decent statistics to work with. Is that better in Sawfish compared to HiFiCNV by any chance?

Contributor Author

Reverted to 0.

Contributor Author

fellen31 commented Jan 7, 2026

> Nice! Good test results override any guesswork we might have on scores. Note the diff on SV loqusdb not_reported/missing scores (4 vs 6): is it intentional and still valid?

Yes, I tried giving +6 but I think that brings too many small intronic indels too high in the ranking.

I don't think the difference between something that has been seen once in loqusdb (gets +2) and something completely new should be that big. I wonder if there should even be a difference in score.

Member

dnil commented Jan 7, 2026

> > Nice! Good test results override any guesswork we might have on scores. Note the diff on SV loqusdb not_reported/missing scores (4 vs 6): is it intentional and still valid?
>
> Yes, I tried giving +6 but I think that brings too many small intronic indels too high in the ranking.
>
> I don't think the difference between something that has been seen once in loqusdb (gets +2) and something completely new should be that big. I wonder if there should even be a difference in score.

No, not really ever. The intention is that once the db grows to a size where the somewhat rare threshold is well estimated one can start differentiating. But indeed, that would likely need a few thousand cases with that very_rare frequency.

EDIT: and yes, maybe not for a long time, given a lot of small intronic events. Until we have some functional predictors for them perhaps.
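The trade-off in this thread comes down to the gap between a variant never seen in loqusdb and one seen once. A small sketch of that arithmetic, using the +2, +4 and +6 values quoted in the comments; the function and constant names are illustrative only:

```python
SEEN_ONCE_SCORE = 2  # score quoted in the thread for a variant seen once in loqusdb


def novelty_gap(not_reported_score: int, seen_once_score: int = SEEN_ONCE_SCORE) -> int:
    """Extra rank score a never-seen variant gets over one seen once."""
    return not_reported_score - seen_once_score


print(novelty_gap(4))  # 2, with the not_reported score kept in this PR
print(novelty_gap(6))  # 4, with the short-read SV model value that was tried
```

With +6 the bonus for being unseen is twice as large as with +4, which fits the observation that many small intronic indels rose too high in the ranking.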

@fellen31 fellen31 merged commit 0cd376c into master Jan 7, 2026
@fellen31 fellen31 deleted the update-nallo-rank-model branch January 7, 2026 11:59