STRAST-UPM/DataIngestApi

Data Ingest Api

This repository contains a production-ready service designed to run on Kubernetes as a persistent data ingestion and delivery endpoint.

The service exposes a simple HTTP API that allows clients to ingest data payloads (such as CSV or JSON files) into durable storage and later retrieve, list, download, or preview those stored datasets through stable HTTP endpoints.

It is specifically designed to act as a target endpoint for data plane “push” transfers, for example when a dataset is delivered after a successful contract negotiation in a data connector. In this scenario, the service receives the transferred data over HTTP, persists it reliably, and makes it available for subsequent access or publication.

In addition to machine-to-machine transfers, the service also provides a multipart/form-data upload endpoint to support manual uploads, testing, and operational workflows without requiring a connector.


Features

  • Ingest endpoint for machine-to-machine transfers
    curl -i -X POST "https://csv-api.edaccit.anycastprivacy.org/ingest" \
    -H "Content-Type: text/csv" \
    -H "X-Filename: example.csv" \
    --data-binary @./example.csv
  • Manual upload endpoint
    • POST /files accepts multipart/form-data (field name: file)
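  • File management
    • GET /files lists stored files with their metadata
    • GET /files/{file_id}/preview returns the first rows of a CSV file
    • GET /files/{file_id} downloads a stored file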
  • Persistent storage
    • Files + metadata persisted to a PVC mounted at /data
  • Health probes
    • GET /healthz liveness
    • GET /readyz readiness
  • CI/CD
    • GitHub Actions builds and publishes a Docker image to GitHub Container Registry (GHCR)

API Endpoints

Health

  • GET /healthz returns {"status": "ok"}
  • GET /readyz returns {"status": "ready"}
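
The two probes can be checked by hand with curl. The following is a minimal sketch: `check_probe` is a hypothetical helper name, and the default base URL is simply the host used in the ingest example, so replace it with your own deployment's address.

```shell
# Base URL taken from the ingest example; override it for your own deployment.
BASE_URL="${BASE_URL:-https://csv-api.edaccit.anycastprivacy.org}"

# check_probe (hypothetical helper): prints the probe's response body.
# curl -f makes the command exit non-zero on HTTP status >= 400,
# so the function can be used directly in scripts.
check_probe() {
  curl -fsS "$BASE_URL/$1"
}

# Usage:
#   check_probe healthz   # liveness
#   check_probe readyz    # readiness
```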

Ingest (recommended for connectors / data plane push)

POST /ingest

  • Body: raw bytes (CSV/JSON/binary)
  • Headers:
    • Content-Type: text/csv (or any appropriate type)
    • Optional: X-Filename: myfile.csv

Multipart Upload (Manual)

POST /files

  • Content-Type: multipart/form-data
  • Field name: file
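
A manual upload could look like the sketch below. `upload_file` is a hypothetical helper name, and the default base URL is assumed from the ingest example. The response format is not documented here, so the body is printed as-is.

```shell
# Base URL taken from the ingest example; override it for your own deployment.
BASE_URL="${BASE_URL:-https://csv-api.edaccit.anycastprivacy.org}"

# upload_file (hypothetical helper): sends a local file as multipart/form-data.
# curl -F implies POST and sets the multipart Content-Type itself;
# the form field must be named "file".
upload_file() {
  curl -fsS -F "file=@$1" "$BASE_URL/files"
}

# Usage:
#   upload_file ./example.csv
```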

List Files

GET /files

Returns a list of all uploaded files with their associated metadata.
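
As a sketch, the listing can be fetched with a plain GET. `list_files` is a hypothetical helper name and the default base URL is assumed from the ingest example. The exact shape of the response is not specified here, so the output is printed unmodified.

```shell
# Base URL taken from the ingest example; override it for your own deployment.
BASE_URL="${BASE_URL:-https://csv-api.edaccit.anycastprivacy.org}"

# list_files (hypothetical helper): prints the file listing returned by the service.
list_files() {
  curl -fsS "$BASE_URL/files"
}

# Usage:
#   list_files
```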

Preview CSV

GET /files/{file_id}/preview?rows=10

Returns the first N rows of the specified CSV file, allowing clients to inspect the dataset without downloading the full file.
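
A preview request might be wrapped as follows. This is a sketch: `preview_file` is a hypothetical helper, the default of 10 rows mirrors the example URL above rather than a documented server default, and the file_id is whatever identifier the service assigned to the file.

```shell
# Base URL taken from the ingest example; override it for your own deployment.
BASE_URL="${BASE_URL:-https://csv-api.edaccit.anycastprivacy.org}"

# preview_file (hypothetical helper): prints the first rows of a stored CSV.
#   $1 = file_id, $2 = number of rows (falls back to 10 when omitted)
preview_file() {
  curl -fsS "$BASE_URL/files/$1/preview?rows=${2:-10}"
}

# Usage:
#   preview_file <file_id>       # first 10 rows
#   preview_file <file_id> 50    # first 50 rows
```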

Download File

GET /files/{file_id}

Downloads the full file associated with the given file_id.
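
A download can be scripted along these lines. `download_file` is a hypothetical helper name and the default base URL is assumed from the ingest example. The output path is chosen by the caller, since this sketch does not rely on any filename header from the server.

```shell
# Base URL taken from the ingest example; override it for your own deployment.
BASE_URL="${BASE_URL:-https://csv-api.edaccit.anycastprivacy.org}"

# download_file (hypothetical helper): saves a stored file to a local path.
#   $1 = file_id, $2 = local output path
# With -f, curl exits non-zero on HTTP status >= 400 (for example an unknown id).
download_file() {
  curl -fsS -o "$2" "$BASE_URL/files/$1"
}

# Usage:
#   download_file <file_id> ./downloaded.csv
```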

Storage Layout

The container stores data under DATA_DIR (default: /data):

  • /data/uploads/
    Directory holding the contents of the stored files
  • /data/metadata.json
    JSON file containing metadata indexed by file_id
