Skip to content

feat: 142 update usgs instant values endpoint#205

Draft
Enovotny wants to merge 3 commits into
mainfrom
142-update-usgs-instant-values-endpoint
Draft

feat: 142 update usgs instant values endpoint#205
Enovotny wants to merge 3 commits into
mainfrom
142-update-usgs-instant-values-endpoint

Conversation

@Enovotny

Copy link
Copy Markdown
Collaborator

create new version for the new USGS api
add backfill version to allow users to change the version of the timeseries if doing a backfill
add tests

@Enovotny Enovotny linked an issue Apr 21, 2026 that may be closed by this pull request
@Enovotny Enovotny changed the title 142 update usgs instant values endpoint feat: 142 update usgs instant values endpoint Apr 21, 2026
@Enovotny Enovotny linked an issue Apr 21, 2026 that may be closed by this pull request

@krowvin krowvin left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really just curious on the plans for the api target version

We use cda v2 timeseries now but we do not call that cwms-cli load timeseries-v2

Not that it has to be consistent but I would think exposing api-v through arg is more futureproof?

Comment thread cwmscli/usgs/__init__.py


@usgs_group.command(
"timeseries-v2",

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could get a bit confusing to have multiple sub commands for timeseries.

What are your thoughts on having an api-source with it defaulting to v1 until that goes away? Then you add a warning in this version+ that says

"switch to using source --version=2 as soon as possible"

Something like cwms-cli usgs timeseries --version=2

They're almost certainly going to make a v3, v4, v5 and so on

That would make the command look like this

commands:
  timeseries
  timeseries-v2
  timeseries-v3
  timeseries-v4
  timeseries-v5
  ...
  timeseries-v349

or you could even be more explicit and say --api-version

Then we continue to default to v1, then when USGS sunsets or at some point we just remove v1 as the default and/or attempt backwards compat. Same inputs but to a different api endpoint if possible.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe even just switch to the new USGS OGC API by default if there isn't a defined USGS method in the timeseries group and fall back to the legacy NWIS if there is a method defined w/ a warning.

Also, have a conversion script available to run to lookup up the OGC timeseries metadata and try and replace the old method with the new timeseries id in the cwms group.

Our local USGS changed a bunch of site configs on their end a couple weeks ago, so legacy nwis is not bringing data unless I specify the USGS method which is a pain to look up. If I use the new OGC, it just provides what is posted on the web. Not sure if that is rolling out to other USGS offices.

Comment thread cwmscli/usgs/__init__.py
help='Backfill version, save data to a different version of the timeseries (e.g. "Rev-USGS" instead of "Raw-USGS")',
)
@requires(reqs.cwms, reqs.requests)
def getusgs_timeseries_v2(

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then you'd put this inside the other getusgs_timeseries and have a switch for based on the version the enduser wants to target?

@msweier msweier Jun 2, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to not require the backfill version tsid to be in the USGS group? That's how I'm using it now.

TS group has the full timeseries ID, for example, SABM5.Stage.Inst.~15Minutes.0.Raw-USGS but then I have a side script that will backfill to SABM5.Stage.Inst.~15Minutes.0.rev-USGS which isn't in the group. I use CCP to copy the Raw-USGS to rev-USGS. That way I can keep the record of the raw data and backfill the rev. That keeps the group 1/2 the size. And if folks want it in there USGS group it would still work without the CCP copy too.

Also, does it make sense to require the backfill argument to just be the cwms location rather than the full TSID? It's pulling all the parameters anyway. Is there a case were you would only want one parameter backfilled and not the others?

return usgs_ts


def CWMS_writeData(

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opinion on this

Do not start a function/def with uppercase only do that with classes

Then we mix other case as well

PEP 8 would be lower_snake_case and would otherwise be conventionally cwms_write_data

See this in other variable names/ functions that mix camelCase and capitalization

)


_OGC_BASE_URL = "https://api.waterdata.usgs.gov/ogcapi/v0/collections/continuous"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we split the API domain/endpoints out from this and centralize them in the init.py so we will have one place to change them later on


Produces the same DataFrame structure as getUSGS_ts() so CWMS_writeData() works unchanged.

TODO: fill in OGC response parsing once sample JSON is available.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we are heading towards a requests call?

We use dataretrieval in other places, do we know if there are plans to update it to v2? (or 1 since they use v0 now) - imo we keep the api target the same as theres and say v0 default, v1 for current to match?

(it's v2 for us sure)

https://github.com/DOI-USGS/dataretrieval-python

i.e.

from dataretrieval import waterdata

df, metadata = waterdata.get_continuous(
    monitoring_location_id="USGS-01646500",
    parameter_code="00065",
    time="2024-10-01/2025-09-30",
)

USGS_data_method = pd.DataFrame()

if len(sites) > 0:
USGS_data = getUSGS_ts_ogc(sites, startDT, endDT)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both of these getUSGS_ts_ogccould throw an error if they do -h because of the NotImplementedError?

)
usgs_ts = usgs_ts[~usgs_ts["timeseries-id"].isin(missing_original_ts)]

except Exception as e:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catching every error generically will fail, but then continue anyways. We may want to return here so we don't continue anyways if we had failures?

And/or handle explicit exceptions?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update USGS Instant Values endpoint. Get USGS Timeseries add backfill version option

3 participants