API Data Upload

Overview

Use this API to upload a single tabular data file (CSV or TSV) inside an archive.

Current endpoints:

  - POST /api/v1/data-sources (create a data source from an uploaded archive)
  - GET /api/v1/data-sources/{source_id} (check processing status)
  - PUT /api/v1/data-sources/{source_id} (update, not yet implemented)

Quick Start

  1. Build a ZIP or TAR.GZ archive containing exactly one .csv or .tsv file.
  2. Compute the archive MD5 and send it in X-Checksum.
  3. Upload with bearer auth.
  4. Poll the status endpoint until status becomes ready or error.

ARCHIVE="sales_data.zip"
CHECKSUM=$(md5sum "$ARCHIVE" | awk '{print $1}')

curl -X POST "https://api.ecue.ai/api/v1/data-sources" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/zip" \
  -H "X-Checksum: ${CHECKSUM}" \
  --data-binary "@${ARCHIVE}"

Example 201 Created response:

{
  "source_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "table_name": "cust_...",
  "status": "processing"
}

Then check status:

curl -X GET "https://api.ecue.ai/api/v1/data-sources/a1b2c3d4-e5f6-7890-abcd-ef1234567890" \
  -H "Authorization: Bearer YOUR_TOKEN"
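The same Quick Start flow can be sketched in Python using only the standard library. This is a minimal sketch, not a definitive client: `API_BASE` and `TOKEN` are placeholders, and the helper names are illustrative.

```python
import hashlib
import json
import urllib.request
from pathlib import Path

API_BASE = "https://api.ecue.ai"  # base URL from the examples above
TOKEN = "YOUR_TOKEN"              # placeholder bearer token


def md5_checksum(body: bytes) -> str:
    """Hex MD5 digest of the body, as expected by the X-Checksum header."""
    return hashlib.md5(body).hexdigest()


def upload_archive(archive_path: str) -> dict:
    """POST the archive and return the parsed 201 Created JSON body."""
    body = Path(archive_path).read_bytes()
    request = urllib.request.Request(
        f"{API_BASE}/api/v1/data-sources",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/zip",
            "X-Checksum": md5_checksum(body),
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())
```

After a successful upload, the returned source_id can be passed to the status endpoint shown above.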

Authentication

All data source endpoints require Authorization: Bearer ....

Accepted bearer types:

If authentication fails, requests return 401.

Endpoint: Create Data Source

POST /api/v1/data-sources

Required headers:

  - Authorization: Bearer <token>
  - X-Checksum: hex MD5 of the request body

Recommended header:

  - Content-Type: application/zip (as in the Quick Start example; use a type matching your archive)

Body:

The raw archive bytes, sent directly as the request body (curl --data-binary in the Quick Start example).

Success response:

201 Created with a JSON body containing source_id, table_name, and status, as shown in the Quick Start.

Status values on create:

  - processing: the archive was accepted and the import is running.

Endpoint: Get Data Source Status

GET /api/v1/data-sources/{source_id}

Required headers:

  - Authorization: Bearer <token>

Success response:

{
  "source_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "table_name": "cust_...",
  "status": "ready",
  "row_count": 1234,
  "created_at": "2026-03-20T12:00:00Z",
  "updated_at": "2026-03-20T12:00:12Z"
}

Notes:

  - status moves from processing to ready or error.
  - row_count appears once the import has completed.

Endpoint: Update Data Source

PUT /api/v1/data-sources/{source_id}

Current behavior: the update endpoint is not yet implemented and returns an error body:

{
  "error": "not_implemented",
  "message": "Update endpoint not yet implemented"
}

Archive Requirements

Supported archive formats:

  - ZIP (.zip)
  - TAR.GZ (.tar.gz)

Archive validation rules:

  - The archive must extract successfully (otherwise: invalid_archive).
  - The layout must be flat, with no nested directories (otherwise: nested_directories).
  - The archive must contain exactly one .csv or .tsv file.
  - An optional column_desc.json may accompany the data file.

Flat archive example:

sales_upload.zip
|- sales.csv
`- column_desc.json
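A compliant flat archive like the one above can be produced with the standard library's zipfile module. A minimal sketch; the function name and file names are illustrative:

```python
import zipfile
from pathlib import Path
from typing import Optional


def build_flat_archive(
    archive_path: str,
    csv_path: str,
    column_desc_path: Optional[str] = None,
) -> None:
    """Write a flat ZIP: one .csv at the root plus an optional column_desc.json."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # arcname drops directory components so the archive stays flat
        zf.write(csv_path, arcname=Path(csv_path).name)
        if column_desc_path is not None:
            zf.write(column_desc_path, arcname="column_desc.json")
```

Using arcname explicitly is the key step: writing files with their full paths would introduce nested directories and trigger the nested_directories error.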

column_desc.json

column_desc.json is optional. If present, it must be a non-empty JSON object with non-empty keys:

{
  "customer_id": "Unique customer identifier",
  "revenue": "Total revenue in USD"
}

Invalid JSON or invalid structure returns invalid_column_desc.
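These rules can be checked client-side before uploading to avoid a round trip. A minimal sketch; the helper name is illustrative, and the server's validation remains authoritative:

```python
import json


def is_valid_column_desc(raw: str) -> bool:
    """Client-side check: raw must parse to a non-empty JSON object with non-empty keys."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return False
    # Must be an object, must not be empty
    if not isinstance(parsed, dict) or not parsed:
        return False
    # Every key must be a non-empty string (JSON keys parse as str, so check emptiness)
    return all(key for key in parsed)
```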

Checksum

X-Checksum must contain the MD5 checksum of the archive payload.

Linux:

md5sum data.zip | awk '{print $1}'

macOS:

md5 -q data.zip

If the checksum is missing, malformed, or does not match the uploaded body, the request fails with 400.
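The same checksum can be computed in Python without reading the whole archive into memory at once. A sketch using the standard library's hashlib; the function name and chunk size are illustrative:

```python
import hashlib


def archive_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file through MD5 and return the 32-character hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks until EOF (read() returns b"" at end of file)
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```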

Error Examples

Missing checksum:

{
  "error": "missing_checksum",
  "message": "X-Checksum header is required"
}

Invalid checksum format or mismatch:

{
  "error": "invalid_request",
  "message": "invalid checksum format: must be 32-character hexadecimal MD5 hash"
}

Invalid archive layout:

{
  "error": "nested_directories",
  "message": "Archive must have a flat structure"
}

Unsupported or malformed archive:

{
  "error": "invalid_archive",
  "message": "Failed to extract archive"
}
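Client code can branch on the error field of these bodies to distinguish checksum problems from archive problems. A sketch; the exception names are illustrative, and the mapping assumes only the error codes documented above:

```python
class UploadError(Exception):
    """Base class for failures reported by the API."""


class ChecksumError(UploadError):
    """missing_checksum or invalid_request (checksum format/mismatch)."""


class ArchiveError(UploadError):
    """nested_directories or invalid_archive."""


def raise_for_error(payload: dict) -> None:
    """Map an error response body to a typed exception; no-op if no error key."""
    code = payload.get("error")
    if code is None:
        return
    message = payload.get("message", code)
    if code in ("missing_checksum", "invalid_request"):
        raise ChecksumError(message)
    if code in ("nested_directories", "invalid_archive"):
        raise ArchiveError(message)
    raise UploadError(message)
```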

Polling Example (Python)

import time
import requests

def wait_for_source_ready(api_base, token, source_id, timeout_seconds=300):
    started = time.time()
    headers = {"Authorization": f"Bearer {token}"}

    while time.time() - started < timeout_seconds:
        response = requests.get(
            f"{api_base}/api/v1/data-sources/{source_id}",
            headers=headers,
            timeout=30,
        )
        response.raise_for_status()
        payload = response.json()

        if payload.get("status") == "ready":
            return payload
        if payload.get("status") == "error":
            raise RuntimeError(payload.get("error", "source failed"))

        time.sleep(2)

    raise TimeoutError("source did not become ready in time")