Added historical file in gcs bucket for EurostatData_LifeExpectancy #2032

Merged
niveditasing merged 4 commits into datacommonsorg:master from niveditasing:added_historical_in_gcs
May 25, 2026

Conversation

@niveditasing
Contributor

@niveditasing niveditasing commented May 22, 2026

The EurostatData_LifeExpectancy feed shows 8,598 deleted records. Because the source has not officially stated that this data is discontinued, the records may reappear. To prevent the duplicates that could later arise from storing this deleted data in CNS, I implemented deduplication logic in the code and uploaded the historical data file to the GCS bucket so that the code can read the data from there.
Historical data: https://storage.mtls.cloud.google.com/unresolved_mcf/eurostat/life_expectancy/deleted_historical_data.csv
Differ output after code change: https://storage.mtls.cloud.google.com/unresolved_mcf/eurostat/life_expectancy/obs_diff_log.csv
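For readers following the description, here is a minimal sketch of what such a deduplicating merge could look like. It is not the code from this PR: the key columns (`geo`, `year`, `sv`), the helper names, and the rule that current rows take precedence over historical ones are all assumptions for illustration. It uses only the standard library so that it is self-contained; the actual preprocess.py may be written differently.

```python
import csv
import io

# Hypothetical key columns that identify one observation; the real script
# may use different column names.
KEY_COLUMNS = ("geo", "year", "sv")


def read_csv_text(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_with_historical(current_rows, historical_rows, key_columns=KEY_COLUMNS):
    """Return current rows plus historical rows whose key is not already present.

    Assumption: when the same key appears in both inputs, the current row wins,
    so a record that reappears at the source is not emitted twice.
    """
    merged = {}
    for row in current_rows:
        merged[tuple(row[c] for c in key_columns)] = row
    for row in historical_rows:
        # setdefault keeps the existing (current) row on a key collision.
        merged.setdefault(tuple(row[c] for c in key_columns), row)
    return list(merged.values())


# Illustrative data only, not taken from the Eurostat feed.
current = read_csv_text(
    "geo,year,sv,value\n"
    "nuts/AT11,2020,LifeExpectancy,81.2\n"
)
historical = read_csv_text(
    "geo,year,sv,value\n"
    "nuts/AT11,2020,LifeExpectancy,80.9\n"
    "nuts/AT12,2019,LifeExpectancy,82.0\n"
)
merged = merge_with_historical(current, historical)
```

In this sketch the historical row for `nuts/AT11` is dropped because the current feed already carries that key, while the `nuts/AT12` row, which exists only in the historical file, is carried forward.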

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a mechanism to merge historical deleted data from a GCS path into the Eurostat life expectancy preprocessing pipeline. The code reviewer suggested enhancing the robustness of the data ingestion process by explicitly filtering for required columns, handling missing values, and enforcing consistent data types to prevent schema mismatches and ensure the integrity of the final dataset.
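The three suggestions in the review summary (filter to required columns, handle missing values, enforce consistent types) might be sketched as follows. The column names, the choice to drop incomplete or unparseable rows, and the function name `clean_historical_rows` are assumptions made for this illustration, not the reviewer's actual suggestion or the merged code.

```python
import csv
import io

# Hypothetical required columns; the real historical file may differ.
REQUIRED_COLUMNS = ("geo", "year", "sv", "value")


def clean_historical_rows(rows, required=REQUIRED_COLUMNS):
    """Keep required columns only, drop incomplete rows, and coerce types."""
    cleaned = []
    for row in rows:
        # Keep only the required columns; treat absent or blank cells as missing.
        values = {c: (row.get(c) or "").strip() for c in required}
        if not all(values.values()):
            continue  # skip rows with any missing required value
        try:
            # Enforce consistent types so the merge does not mix str and numbers.
            values["year"] = int(values["year"])
            values["value"] = float(values["value"])
        except ValueError:
            continue  # skip rows whose year or value cannot be parsed
        cleaned.append(values)
    return cleaned


# Illustrative data only: one valid row, one with a missing year,
# and one with a non-numeric value.
raw = list(csv.DictReader(io.StringIO(
    "geo,year,sv,value,note\n"
    "nuts/AT11,2020,LifeExpectancy,81.2,x\n"
    "nuts/AT12,,LifeExpectancy,82.0,y\n"
    "nuts/AT13,2019,LifeExpectancy,n/a,z\n"
)))
cleaned = clean_historical_rows(raw)
```

Dropping bad rows silently is only one possible policy; logging or failing fast may be preferable in a production import.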

Comment thread scripts/eurostat/regional_statistics_by_nuts/life_expectancy/preprocess.py Outdated
@niveditasing niveditasing requested a review from saanikaaa May 22, 2026 09:50
@saanikaaa
Contributor

Can you please provide more explanation in the PR description: exactly what we are doing and the reason behind it?

Comment thread scripts/eurostat/regional_statistics_by_nuts/life_expectancy/preprocess.py Outdated
@niveditasing niveditasing merged commit 93d51ff into datacommonsorg:master May 25, 2026
13 checks passed
3 participants