feat(website): show Jaccard index for predefined Nextclade variant mutations#1286

Draft
fhennig wants to merge 3 commits into main from jaccard-index-for-nextclade-predefined-variants

fhennig commented Jun 24, 2026

Contributor

resolves cbg-ethz/WisePulse#240

Compute and display a Jaccard index column for mutations loaded from Nextclade tree collections, using the same clinical LAPIS queries as the computed signature mode. When no clinical sequences match the lineage (e.g. no public data available), the column is silently omitted.
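As a rough illustration of the behaviour described above, here is a minimal sketch of deriving the index from aggregated counts and skipping it when no clinical sequences match the lineage. The `JaccardCounts` shape and the `jaccardIndex` function are hypothetical names for illustration, not taken from the PR, and the assumption that the index compares "sequences in the lineage" with "sequences carrying the mutation" is mine.

```typescript
// Hypothetical input: counts of clinical sequences, e.g. from three aggregated queries.
interface JaccardCounts {
    lineageCount: number; // sequences assigned to the selected lineage
    mutationCount: number; // sequences carrying the mutation (any lineage)
    intersectionCount: number; // sequences in the lineage that carry the mutation
}

// Returns undefined when there are no clinical sequences for the lineage,
// which the caller can use to omit the column.
function jaccardIndex(counts: JaccardCounts): number | undefined {
    const { lineageCount, mutationCount, intersectionCount } = counts;
    if (lineageCount === 0) {
        return undefined;
    }
    // |A ∪ B| = |A| + |B| - |A ∩ B|
    const unionCount = lineageCount + mutationCount - intersectionCount;
    if (unionCount <= 0) {
        // Defensive: only reachable with inconsistent counts.
        return undefined;
    }
    return intersectionCount / unionCount;
}
```

Returning `undefined` instead of `0` keeps "no data" distinguishable from "no overlap", which matters if the UI later explains why the column is missing.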

…tations (#240)

Compute and display a Jaccard index column for mutations loaded from
Nextclade tree collections, using the same clinical LAPIS queries as
the computed signature mode. When no clinical sequences match the
lineage (e.g. no public data available), the column is silently omitted.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

vercel Bot commented Jun 24, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| dashboards | Ready | Preview, Comment | Jun 25, 2026 1:51pm |

fhennig commented Jun 24, 2026

Contributor Author

Note to self, I haven't checked this yet.

fhennig commented Jun 25, 2026

Contributor Author

I see a few issues:

  • The Jaccard index isn't rounded to 2 decimal places, so it looks weird. Why isn't it rounded? Is there code duplication?
  • From a UX perspective, I find it confusing: where is this number coming from?
  • If I pick a variant like A.11, the column is just missing. There should be an info box.

I'm thinking we could add an info box specifically titled 'Jaccard Index Information' that would contain part of the information in the current box (e.g. how many clinical sequences there are for the selected lineage). I think that would help with the disconnect I'm feeling now.
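One way the rounding and the info box could be handled, sketched under assumptions: `formatJaccardIndex` and `jaccardInfoBox` are hypothetical helpers, and the message wording is invented for illustration. Only the title 'Jaccard Index Information' comes from the comment above. Sharing a single formatter between both modes would also avoid the suspected duplication.

```typescript
// Hypothetical shared formatter so every table rounds the index the same way.
function formatJaccardIndex(value: number): string {
    return value.toFixed(2);
}

interface InfoBox {
    title: string;
    text: string;
}

// Hypothetical helper: always produces an info box, even when the column is omitted,
// so the user can see why no Jaccard index is shown.
function jaccardInfoBox(lineage: string, clinicalSequenceCount: number): InfoBox {
    const text =
        clinicalSequenceCount === 0
            ? `No clinical sequences found for ${lineage}, so no Jaccard index can be shown.`
            : `Jaccard index computed against ${clinicalSequenceCount} clinical sequences of ${lineage}.`;
    return { title: "Jaccard Index Information", text };
}
```

With this shape, selecting a lineage without clinical data would still render the box, just with the "no sequences" message instead of a silently missing column.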

fhennig commented Jun 25, 2026

Contributor Author
image

We have this now. Hmm. There are 100+ mutations, and we can't sort by relevance, which is annoying.

I will try to see if we can at least implement the threshold.

Hmm, with thresholding I don't see any relevant mutations anymore. Very weird. I think I'll have to look at this differently.
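For reference, thresholding plus sorting by score could look roughly like the sketch below. `MutationRow` and `filterAndSortByJaccard` are hypothetical names, and the row shape is an assumption rather than the actual table model.

```typescript
// Hypothetical table row: a mutation with an optional Jaccard index.
interface MutationRow {
    mutation: string;
    jaccardIndex: number | undefined;
}

// Keeps rows whose index is defined and at least `threshold`,
// then orders them from highest to lowest score.
function filterAndSortByJaccard(rows: readonly MutationRow[], threshold: number): MutationRow[] {
    return rows
        .filter(
            (row): row is MutationRow & { jaccardIndex: number } =>
                row.jaccardIndex !== undefined && row.jaccardIndex >= threshold,
        )
        .sort((a, b) => b.jaccardIndex - a.jaccardIndex);
}
```

If every row drops out at a threshold that works in the other mode, that points at the scores themselves rather than at the filter.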


I found these, which seem to have a "high" score:

image

In the other mode the scores are much higher. Something seems quite wrong with the computation.

I think I first need to reason about it myself: How is this computed? Since we're using two different data sets, it's easy to mix things up.
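As a starting point for that reasoning, here is the standard definition with a worked example. The assumption (mine, not confirmed by the code) is that L is the set of clinical sequences assigned to the lineage and M is the set of clinical sequences carrying the mutation; the numbers are made up.

```latex
J(L, M) = \frac{|L \cap M|}{|L \cup M|}
        = \frac{|L \cap M|}{|L| + |M| - |L \cap M|}
\qquad\text{e.g.}\qquad
\frac{180}{200 + 1000 - 180} = \frac{180}{1020} \approx 0.18
```

In this example the mutation is present in 90% of the lineage's sequences but still scores only about 0.18, because it is also common outside the lineage. The definition also only makes sense when all three counts come from the same data set, so mixing counts from two sources would distort the score. Either effect might explain low values, but that is a guess until the queries are checked.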

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

Provide a Jaccard Index for mutations from Nextclade tree