
Communicate actual data availability per section when selected date range has gaps or no data #301

Description

@DominicBM

Problem

The dashboard allows users to select arbitrary date ranges, but the data behind each section of the dashboard comes from different sources with different start dates, different end dates, and in some cases known gaps in coverage. Currently the UI does not communicate actual data availability per section in a reliable or accurate way, leading to confusion when users see blank or near-zero numbers with no explanation.

Data Sources and Their Real Boundaries

Each section of the dashboard draws from a different source with its own date constraints:

| Section | Source | Configured min date | Actual data start (reality) |
| --- | --- | --- | --- |
| Website sessions / users / click-throughs | GA4 | Jan 2018 (min_date) | May 2025 (hub-attributed events); raw page views from Apr 2023 |
| Catalog / exhibition / PSS / click-through events | GA4 | Jan 2018 (min_date) | May 2025 (hub attribution working) |
| Search terms | GA4 | Jan 2018 (min_date) | May 2025 |
| Metadata completeness (S3) | AWS S3 | Apr 2018 (mc_min_date) | Continuous from Apr 2018 |
| Wikimedia activity | Wikimedia API | Apr 2018 (mc_min_date) | Per-hub variable (@wikimedia_start_date) |
| API item views | GA4 (UA, now stubbed) | May 2018 (api_min_date) | Never; currently no data |
| BWS usage | GA4 | Sep 2020 (bws_min_date) | Separate GA4 property; currently ungated |
| Item count / contributor count | DPLA API | All time | All time |

There is also a known gap in GA4 data from July 2023 to approximately May 2025 caused by the UA shutdown and the subsequent period before hub-attributed GA4 events were fully working. A user selecting any date range within that window will see near-zero or zero data across all GA4-sourced sections with no explanation.
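As a rough illustration of how a selected range could be checked against this gap, here is a minimal Ruby sketch. The constant names, the helper names, and the exact gap end date are assumptions made for this example (the issue only gives an approximate end), not existing dashboard code.

```ruby
require "date"

# Assumed gap boundaries. The issue gives the end only approximately,
# so both values are placeholders to confirm against real data.
GA4_GAP_START = Date.new(2023, 7, 1)
GA4_GAP_END   = Date.new(2025, 4, 30) # assumption: last day before May 2025

# True when the selected range shares at least one day with the gap.
def overlaps_ga4_gap?(range_start, range_end)
  range_start <= GA4_GAP_END && range_end >= GA4_GAP_START
end

# Hypothetical helper: returns a notice string, or nil when none is needed.
def ga4_gap_notice(range_start, range_end)
  return nil unless overlaps_ga4_gap?(range_start, range_end)

  "GA4 data is incomplete from Jul 2023 to approximately May 2025 " \
    "because of a tracking outage; low numbers in that window do not " \
    "indicate low traffic."
end
```

A range of Jan 2023 to Dec 2023 would trigger the notice, while Jun 2025 to Dec 2025 would not.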

Current Handling (Incomplete)

The existing _date_range_note partial shows "Showing [range]" or "Data starts [date]" — but:

  • website_data_start_date returns Settings.min_date (hardcoded Jan 2018), not the actual GA4 data start (May 2025). The note is wrong.
  • The data_start_date note only appears when no date range is active, not when a date range is selected that predates actual data.
  • api_data_for_date_range? and bws_data_for_date_range? gate entire sections, but with hardcoded global dates — not per-hub actual start dates.
  • Wikimedia has per-hub start date tracking (@wikimedia_start_date from cache), but this is only passed to _date_range_note on the hub sections page, not on contributor or events pages.
  • Metadata completeness, GA4-sourced events, and search terms have no per-section data-availability messaging at all.
  • When data is absent, sections show "No activity recorded for this period", which does not distinguish a hub that genuinely had zero traffic from a date range that predates the available data.

What the UI Needs

Each section that is date-filtered should communicate:

  1. The actual date range covered by data in that section, not just the selected range. If the user selected Jan 2023 – Dec 2023 but GA4 data only starts May 2025, the section should say so, not just show zeros.

  2. When a selected range partially overlaps available data, indicate which portion has data. Example: "Data available from May 2025; showing May–Dec 2025 only."

  3. When a selected range is entirely outside available data, show a clear message that distinguishes "no data for this section in this period" from "this hub had no activity." A generic "No activity recorded" is misleading.

  4. The GA4 gap (Jul 2023 – May 2025) specifically may warrant a site-wide notice for any selected range that overlaps it, since it affected all hubs equally and was caused by infrastructure issues, not lack of traffic.
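The first three cases above (full, partial, and no overlap) could be sketched roughly as follows. `Coverage`, `section_coverage`, and `fmt_month` are hypothetical names introduced for this example, and the message wording is only a suggestion.

```ruby
require "date"

# Hypothetical value object describing what a section can actually show.
Coverage = Struct.new(:status, :shown_start, :shown_end, :message)

def fmt_month(date)
  date.strftime("%b %Y")
end

# selected_* is the user's range; available_* is the span for which the
# section's source really has data (available_end is nil when ongoing).
def section_coverage(selected_start, selected_end, available_start, available_end = nil)
  if selected_end < available_start
    return Coverage.new(:none, nil, nil,
      "No data for this section in this period; data starts #{fmt_month(available_start)}.")
  end
  if available_end && selected_start > available_end
    return Coverage.new(:none, nil, nil,
      "No data for this section in this period; data ends #{fmt_month(available_end)}.")
  end

  # Clamp the selected range to the span that actually has data.
  shown_start = [selected_start, available_start].max
  shown_end   = available_end ? [selected_end, available_end].min : selected_end

  if shown_start == selected_start && shown_end == selected_end
    Coverage.new(:full, shown_start, shown_end, nil)
  else
    Coverage.new(:partial, shown_start, shown_end,
      "Data available from #{fmt_month(available_start)}; " \
      "showing #{fmt_month(shown_start)} to #{fmt_month(shown_end)} only.")
  end
end
```

Returning a status alongside the message would let a partial render the `:none` case differently from a genuine "no activity" result, which is the distinction item 3 asks for.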

Complicating Factors

  • Per-hub actual start dates vary. A hub that joined DPLA in 2022 has no data before that date, regardless of what min_date says. Issue #290 ("Date range picker and data availability messaging should reflect actual per-hub data start date, not hardcoded 2018") tracks making the date picker itself aware of actual data start per hub/contributor; the per-section messaging work here is the complementary UI layer.

  • Wikimedia and GA4 data do not overlap cleanly. A hub may have Wikimedia data going back to 2018 but GA4 data only from 2025. Selecting 2020–2022 would show Wikimedia data but no GA4 data; selecting 2025–present would show GA4 data but potentially limited Wikimedia data. Each section needs its own range annotation, not a single page-level note.

  • The current implementation uses a single _date_range_note partial shared across section types, which makes it difficult to provide per-section accuracy without refactoring the partial or passing richer context.
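To make the per-section, per-hub point concrete, the sketch below resolves an effective start date for each section and clamps one selected range against each of them. The `CONFIGURED_MIN` hash, the section keys, and the shape of `observed_starts` are assumptions for illustration; the real app reads these values from `Settings` and from cached per-hub data.

```ruby
require "date"

# Hypothetical configured minimums for two sections, mirroring the
# configured min dates listed in this issue.
CONFIGURED_MIN = {
  ga4:       Date.new(2018, 1, 1),
  wikimedia: Date.new(2018, 4, 1)
}.freeze

# observed_starts maps section => first date with real data for one hub;
# a missing entry means the actual start is not known yet.
def effective_start(section, observed_starts)
  configured = CONFIGURED_MIN.fetch(section)
  observed = observed_starts[section]
  observed ? [observed, configured].max : configured
end

# Returns { section => [start, end] }, with nil for sections that have
# no data inside the selected range.
def section_ranges(selected_start, selected_end, observed_starts)
  CONFIGURED_MIN.keys.each_with_object({}) do |section, result|
    start = [selected_start, effective_start(section, observed_starts)].max
    result[section] = start <= selected_end ? [start, selected_end] : nil
  end
end
```

For a hub whose GA4 data starts in May 2025 and whose Wikimedia data starts in Apr 2018, selecting 2020 to 2022 yields a range for `:wikimedia` and `nil` for `:ga4`, so each section can carry its own annotation.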

Proposed Approach

  1. Audit each section partial for what data source it uses and what the actual data start date is (including per-hub variability where applicable).
  2. Extend the data layer to return, alongside values, the actual date range covered (first and last month with data). This may already be partially available from the GA4 response.
  3. Replace the single page-level _date_range_note with per-section notes that reflect the actual coverage of that section's data source.
  4. Add a special-case notice for the known GA4 gap (Jul 2023 – approximately May 2025) when a selected range overlaps it, explaining that low numbers in that window are due to a tracking outage rather than low traffic.
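Step 2 might look something like the sketch below, which wraps a monthly series with the first and last month that actually has data. The method name, the "YYYY-MM" key format, and the convention that a missing month is `nil` while a genuine zero is `0` are all assumptions; whether the GA4 response can express that distinction would need to be checked during the audit in step 1.

```ruby
# monthly_values: hypothetical hash of "YYYY-MM" => count, where nil means
# the source returned nothing for that month and 0 means real zero activity.
def with_coverage(monthly_values)
  # "YYYY-MM" strings sort chronologically when sorted as plain strings.
  months_with_data = monthly_values.reject { |_month, value| value.nil? }.keys.sort

  {
    values: monthly_values,
    first_month_with_data: months_with_data.first,
    last_month_with_data: months_with_data.last
  }
end
```

With this shape, a per-section note can report the covered range directly, and a section with no non-nil months can show a "no data available" message instead of "No activity recorded".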

Coordinate with #290 (dynamic date range minimum in picker) — the two issues are complementary. #290 prevents users from selecting ranges with no data; this issue ensures that if they do (via URL parameter or edge cases), they get a clear explanation rather than silent zeros.

/cc @megannp4

Labels: bug, enhancement
