
don't conflate builds with datasets #284

Merged: eharkins merged 2 commits into master from build-vs-dataset on Mar 30, 2021

@eharkins

Contributor

This changes a bunch of places where I had used the word "build" when I really meant "dataset". For a clear explanation of the difference, see:
#271 (comment)

I think of a "build" as a workflow. seasonal-flu is a build. ncov is a build. I think of the products produced by these builds (eg ncov_global.json) as "datasets". One build may produce multiple datasets. Perhaps this makes the most sense when encountering Auspice JSONs like ncov_global_2020-06-01 where it's clear this is a "dataset" distinct from the ongoing day-to-day rebuilds.

Sorry I didn't see that comment until just now, Trevor, but it really clarified the difference for me! In basically none of the cases where I had used "build" on nextstrain.org pages did I mean "a workflow"; I instead meant "the products produced by these builds (eg ncov_global.json)".
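To make the distinction concrete, here is a minimal sketch of the relationship described above: one build (a workflow) can produce several datasets (its output JSONs). The `Build` class and its `produce` method are hypothetical names used only for illustration; they are not part of the Nextstrain codebase.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Build:
    """Hypothetical model of a build: a workflow such as ncov or seasonal-flu."""
    name: str
    # Names of the datasets (output JSONs) this build has produced so far.
    datasets: List[str] = field(default_factory=list)

    def produce(self, dataset_name: str) -> str:
        """Record a dataset produced by running this build and return its name."""
        self.datasets.append(dataset_name)
        return dataset_name


# One build, multiple datasets: the day-to-day output and a dated snapshot.
ncov = Build("ncov")
ncov.produce("ncov_global")
ncov.produce("ncov_global_2020-06-01")
```

In this sketch, text on a nextstrain.org page that links to `ncov_global` is pointing at a dataset, not at the `ncov` build itself.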

@eharkins eharkins requested a review from trvrb March 18, 2021 17:11
@tsibley tsibley temporarily deployed to nextstrain-s-build-vs-d-qvx5gm March 18, 2021 17:12 Inactive
@eharkins

eharkins commented Mar 18, 2021

Contributor Author

@trvrb

trvrb commented Mar 23, 2021

Member

Thanks Eli! A couple thoughts:

This section is an index of public Nextstrain datasets for flu, organized by type. If you know of a dataset not listed here, please let us know!

Do we intend to allow non-core flu datasets to be included in this list? If this page is meant to replace /flu I'd keep it to just the datasets that live below /flu in the URL hierarchy. Otherwise, if we're keeping /influenza we could imagine separating this, like we do for /ncov vs /sars-cov-2.

Jump to our latest global SARS-CoV-2 dataset which is updated daily

I'd use slightly different language, perhaps: "Jump to the latest update to our SARS-CoV-2 globally-subsampled dataset which is updated daily"

We also maintain regional builds for Africa...

"We also keep updated regional datasets for Africa..."

@eharkins

Contributor Author

Thanks @trvrb! I was under the impression that "this page is meant to replace /flu", so I removed language that suggests it's listing anything other than "just the datasets that live below /flu in the URL hierarchy".

Let me know if you were intending the other way around.

For /sars-cov-2, I went with

Jump to our globally-subsampled SARS-CoV-2 dataset which is updated daily. We also keep updated regional datasets for...

I don't think we need to say "latest" if we are saying it's updated daily.

@trvrb

trvrb commented Mar 26, 2021

Member

These changes are great. Thanks Eli. Looks like this may need to be rebased, but otherwise I think it's good to go.
