API Data Upload
Overview
Use this API to upload a single tabular data file (CSV or TSV) inside an archive.
Current endpoints:
- POST /api/v1/data-sources
- GET /api/v1/data-sources/{source_id}
- PUT /api/v1/data-sources/{source_id} (currently returns 501 not_implemented)
Quick Start
- Build a ZIP or TAR.GZ archive containing exactly one .csv or .tsv file.
- Compute the archive MD5 and send it in X-Checksum.
- Upload with bearer auth.
- Poll the status endpoint until status becomes ready or error.
ARCHIVE="sales_data.zip"
CHECKSUM=$(md5sum "$ARCHIVE" | awk '{print $1}')
curl -X POST "https://api.ecue.ai/api/v1/data-sources" \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/zip" \
-H "X-Checksum: ${CHECKSUM}" \
--data-binary "@${ARCHIVE}"
Example 201 Created response:
{
"source_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"table_name": "cust_...",
"status": "processing"
}
Then check status:
curl -X GET "https://api.ecue.ai/api/v1/data-sources/a1b2c3d4-e5f6-7890-abcd-ef1234567890" \
-H "Authorization: Bearer YOUR_TOKEN"
Authentication
All data source endpoints require an Authorization: Bearer ... header.
Accepted bearer types:
- JWT access token
- API key for importer callback upload flows
If authentication fails, requests return 401.
Endpoint: Create Data Source
POST /api/v1/data-sources
Required headers:
- Authorization: Bearer YOUR_TOKEN
- X-Checksum: <md5-hex> (32 hex characters)
Recommended header:
- Content-Type: application/zip or application/x-tar+gzip
Body:
- Raw archive bytes (--data-binary)
Success response:
- 201 Created
- JSON:
  - source_id (UUID)
  - table_name (generated table name)
  - status
Status values on create:
- User upload path: processing
- Importer callback path: ready
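The request described above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names build_upload_request and upload_archive are hypothetical, the base URL is taken from the curl example, and the 60-second timeout is an arbitrary choice.

```python
import hashlib
import json
import urllib.request

API_BASE = "https://api.ecue.ai"  # base URL as used in the curl example


def build_upload_request(archive_bytes, token, content_type="application/zip"):
    """Build (but do not send) the POST request that creates a data source."""
    # X-Checksum is the MD5 of the exact bytes sent as the request body.
    checksum = hashlib.md5(archive_bytes).hexdigest()
    return urllib.request.Request(
        f"{API_BASE}/api/v1/data-sources",
        data=archive_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": content_type,
            "X-Checksum": checksum,
        },
    )


def upload_archive(archive_bytes, token, content_type="application/zip"):
    """Send the upload and return the parsed JSON body of the response."""
    request = build_upload_request(archive_bytes, token, content_type)
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the headers before any network call. The source_id in the returned JSON is what the status endpoint expects.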
Endpoint: Get Data Source Status
GET /api/v1/data-sources/{source_id}
Required headers:
Authorization: Bearer YOUR_TOKEN
Success response:
200 OK
{
"source_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"table_name": "cust_...",
"status": "ready",
"row_count": 1234,
"created_at": "2026-03-20T12:00:00Z",
"updated_at": "2026-03-20T12:00:12Z"
}
Notes:
- row_count is only returned when metadata status is ready.
- If status is error, response may include error with details.
- Source access is customer-scoped; unknown or inaccessible IDs return 404 source_not_found.
Endpoint: Update Data Source
PUT /api/v1/data-sources/{source_id}
Current behavior:
501 Not Implemented
{
"error": "not_implemented",
"message": "Update endpoint not yet implemented"
}
Archive Requirements
Supported archive formats:
- ZIP
- TAR.GZ
Archive validation rules:
- Exactly one data file with .csv or .tsv extension
- No nested directories
- Optional column_desc.json
- Archive size limit: 256MB
Flat archive example:
sales_upload.zip
|- sales.csv
`- column_desc.json
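A flat archive like the one above could be produced in memory with Python's zipfile module. This is a sketch under assumptions: build_flat_zip is a hypothetical helper, and its local checks only mirror the documented rules (the server's own validation remains authoritative).

```python
import io
import json
import zipfile


def build_flat_zip(data_name, data_bytes, column_desc=None):
    """Return ZIP bytes with one data file and an optional column_desc.json."""
    # Reject path separators so every entry sits at the archive root.
    if "/" in data_name or "\\" in data_name:
        raise ValueError("data file must sit at the archive root")
    if not data_name.endswith((".csv", ".tsv")):
        raise ValueError("data file must have a .csv or .tsv extension")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(data_name, data_bytes)
        if column_desc is not None:
            archive.writestr("column_desc.json", json.dumps(column_desc))
    return buffer.getvalue()
```

For example, build_flat_zip("sales.csv", csv_bytes, {"revenue": "Total revenue in USD"}) yields bytes that can be sent directly as the request body.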
column_desc.json
column_desc.json is optional. If present, it must be a non-empty JSON object with non-empty keys:
{
"customer_id": "Unique customer identifier",
"revenue": "Total revenue in USD"
}
Invalid JSON or invalid structure returns invalid_column_desc.
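A local pre-check can catch these problems before uploading. The sketch below enforces only the rule stated here (a non-empty JSON object with non-empty keys); check_column_desc is a hypothetical name, and any further server-side constraints, such as required value types, are not documented and therefore not checked.

```python
import json


def check_column_desc(raw_text):
    """Return the parsed mapping, or raise ValueError if the structure is wrong."""
    try:
        parsed = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    # An empty object or a non-object (list, string, number) is rejected.
    if not isinstance(parsed, dict) or not parsed:
        raise ValueError("must be a non-empty JSON object")
    if any(key == "" for key in parsed):
        raise ValueError("keys must be non-empty")
    return parsed
```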
Checksum
X-Checksum must contain the MD5 checksum of the archive payload.
Linux:
md5sum data.zip | awk '{print $1}'
macOS (on versions where md5sum is not installed):
md5 -q data.zip
If the checksum is missing, malformed, or does not match the uploaded body, the request fails with 400.
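The same checksum can be computed in Python without loading the whole archive into memory, which helps near the 256MB limit. This is a minimal sketch: file_md5 and looks_like_md5 are hypothetical helpers, and accepting uppercase hex digits in the format check is an assumption, since the document only specifies 32 hexadecimal characters.

```python
import hashlib
import re

# 32 hexadecimal characters; case-insensitivity is an assumption.
MD5_HEX = re.compile(r"[0-9a-fA-F]{32}")


def file_md5(path, chunk_size=1024 * 1024):
    """Return the lowercase hex MD5 of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def looks_like_md5(value):
    """Check that a string has the shape of an MD5 hex digest."""
    return MD5_HEX.fullmatch(value) is not None
```

Running looks_like_md5 on the header value before sending is a cheap way to avoid a malformed-checksum 400.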
Error Examples
Missing checksum:
{
"error": "missing_checksum",
"message": "X-Checksum header is required"
}
Invalid checksum format or mismatch:
{
"error": "invalid_request",
"message": "invalid checksum format: must be 32-character hexadecimal MD5 hash"
}
Invalid archive layout:
{
"error": "nested_directories",
"message": "Archive must have a flat structure"
}
Unsupported or malformed archive:
{
"error": "invalid_archive",
"message": "Failed to extract archive"
}
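Since the error bodies above share an error code and a message field, a client can reduce them to a single log line. The helper below is a sketch; describe_error is a hypothetical name, and the fallback for non-JSON bodies is an assumption for responses (for example from a proxy) that do not follow this shape.

```python
import json


def describe_error(status_code, body_text):
    """Summarize an error response as 'HTTP <status> <error>: <message>'."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        payload = None
    if not isinstance(payload, dict):
        # Body was not a JSON object, so the error code cannot be extracted.
        return f"HTTP {status_code}: unrecognized error body"
    code = payload.get("error", "unknown_error")
    message = payload.get("message")
    if message:
        return f"HTTP {status_code} {code}: {message}"
    return f"HTTP {status_code} {code}"
```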
Polling Example (Python)
import time

import requests


def wait_for_source_ready(api_base, token, source_id, timeout_seconds=300):
    """Poll the status endpoint until the source is ready, fails, or times out."""
    started = time.time()
    headers = {"Authorization": f"Bearer {token}"}
    while time.time() - started < timeout_seconds:
        response = requests.get(
            f"{api_base}/api/v1/data-sources/{source_id}",
            headers=headers,
            timeout=30,
        )
        response.raise_for_status()  # raises on 401, 404, and other HTTP errors
        payload = response.json()
        if payload.get("status") == "ready":
            return payload
        if payload.get("status") == "error":
            raise RuntimeError(payload.get("error", "source failed"))
        time.sleep(2)  # still processing; wait before the next poll
    raise TimeoutError("source did not become ready in time")