# Files API

> Upload and retrieve files from S3

The Files API enables uploading source documents to S3, listing files within a container, generating presigned URLs from S3 keys, and proxying presigned URLs through the backend.

## List Files

Retrieve all files in a specific container (folder) for your organization.

```bash theme={null}
GET /api/v1/files/{container}
```

### Path Parameters

| Parameter   | Type   | Required | Description                                                     |
| ----------- | ------ | -------- | --------------------------------------------------------------- |
| `container` | string | Yes      | Container/folder name (e.g., `templates`, `documents`, `input`) |

### Request Example

```bash theme={null}
curl -X GET "https://api.artosai.com/api/v1/files/documents" \
  -H "Authorization: Bearer YOUR_TOKEN"
```

### Response

```json theme={null}
{
  "files": [
    {
      "name": "protocol.pdf",
      "url": "https://bucket.s3.amazonaws.com/org-id/documents/protocol.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=3600&..."
    },
    {
      "name": "csr-data.xlsx",
      "url": "https://bucket.s3.amazonaws.com/org-id/documents/csr-data.xlsx?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=3600&..."
    }
  ]
}
```

### Response Fields

| Field          | Type   | Description                                                                  |
| -------------- | ------ | ---------------------------------------------------------------------------- |
| `files`        | array  | Array of file objects                                                        |
| `files[].name` | string | File name                                                                    |
| `files[].url`  | string | Time-limited presigned S3 URL for the file (typically expires after 1 hour). |

### Status Codes

* **200 OK**: Successfully retrieved file list
* **400 Bad Request**: Invalid request parameters
* **401 Unauthorized**: Missing or invalid Bearer token
* **500 Internal Server Error**: S3 operation failed
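
The request and response above could be wrapped in Python roughly as follows. This is a minimal sketch using only the standard library; the helper names `list_files` and `index_by_name` are illustrative and not part of the API, and the token is a placeholder you supply.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.artosai.com/api/v1"


def index_by_name(payload: dict) -> dict:
    """Map each file name to its presigned URL from a List Files response."""
    return {item["name"]: item["url"] for item in payload.get("files", [])}


def list_files(container: str, token: str) -> dict:
    """Call GET /files/{container} and return {file name: presigned URL}.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so the
    status codes listed above surface as exceptions.
    """
    request = urllib.request.Request(
        f"{BASE_URL}/files/{urllib.parse.quote(container, safe='')}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return index_by_name(json.load(response))
```

Calling `list_files("documents", "YOUR_TOKEN")` would return a dictionary keyed by file name. Because the URLs are time-limited, it is safer to fetch a fresh listing than to cache the result.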

***

## Upload File

Upload a file to S3. The API validates the file type by both extension and MIME type, and automatically converts DOCX files to PDF unless they are uploaded to the `templates` container.

```bash theme={null}
POST /api/v1/files/upload
```

### Request

**Content-Type**: `multipart/form-data`

| Parameter      | Type   | Required | Description                      |
| -------------- | ------ | -------- | -------------------------------- |
| `file_name`    | string | Yes      | Name to give the uploaded file   |
| `file_content` | file   | Yes      | Binary file content              |
| `container`    | string | Yes      | Container/folder name for upload |

### Supported File Types

* `.docx` - Word documents
* `.pdf` - PDF documents
* `.csv` - CSV files
* `.xlsx` - Excel spreadsheets
* `.rtf` - Rich Text Format
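
A client can check the extension before uploading to avoid a round trip that would end in `400 Bad Request`. The sketch below is a hypothetical pre-check, not part of the API. It assumes extension matching is case-insensitive, and it does not reproduce the server's MIME type validation.

```python
from pathlib import Path

# Extensions listed under "Supported File Types".
SUPPORTED_EXTENSIONS = {".docx", ".pdf", ".csv", ".xlsx", ".rtf"}


def is_supported(file_name: str) -> bool:
    """Return True if the file name ends in a supported extension.

    Assumption: matching is case-insensitive; the server may be stricter.
    """
    return Path(file_name).suffix.lower() in SUPPORTED_EXTENSIONS
```

The server remains the source of truth: a file that passes this check can still be rejected if its MIME type does not match.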

### Request Example

```bash theme={null}
curl -X POST "https://api.artosai.com/api/v1/files/upload" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file_name=protocol.pdf" \
  -F "file_content=@/path/to/protocol.pdf" \
  -F "container=documents"
```

### Python Example

```python theme={null}
import requests

url = "https://api.artosai.com/api/v1/files/upload"
headers = {"Authorization": "Bearer YOUR_TOKEN"}

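# Open the file in a context manager so the handle is closed after the upload.
with open("/path/to/protocol.pdf", "rb") as file_content:
    files = {
        # A (None, value) tuple sends a plain form field without a filename.
        "file_name": (None, "protocol.pdf"),
        "container": (None, "documents"),
        "file_content": file_content,
    }
    response = requests.post(url, headers=headers, files=files)

print(response.status_code)
print(response.json())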
```

### Response

```json theme={null}
{
  "message": "File uploaded successfully"
}
```

### Status Codes

* **200 OK**: File uploaded successfully
* **400 Bad Request**: File validation failure (invalid type, size, etc.)
* **401 Unauthorized**: Missing or invalid Bearer token
* **500 Internal Server Error**: S3 upload failed

### Error Examples

**Invalid File Type**:

```json theme={null}
{
  "detail": "File type .exe is not supported"
}
```

**Missing Required Field**:

```json theme={null}
{
  "detail": "Missing required parameter: container"
}
```
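
Both error examples carry the message in a `detail` field. A small helper can extract it for logging or display. This is a sketch that assumes other error responses follow the same shape; it falls back to the raw body text when they do not. The function name `error_detail` is illustrative.

```python
import json


def error_detail(body: bytes) -> str:
    """Return the `detail` message from an error body, or the raw text."""
    text = body.decode("utf-8", errors="replace")
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON (for example, an HTML error page): return it unchanged.
        return text
    if isinstance(parsed, dict) and "detail" in parsed:
        return str(parsed["detail"])
    return text
```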

***

## Auto-Conversion: DOCX to PDF

Files uploaded as `.docx` are automatically converted to PDF:

* **Input**: `document.docx`
* **Output**: `document.pdf`
* **Exception**: Files uploaded to `templates` container are not converted

This ensures compatibility with document processing pipelines while allowing template files to remain in their original format.
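
Because conversion changes the extension, the name you upload may differ from the name that appears in a later listing. The helper below sketches the rule as described above. The function name is hypothetical, and case-insensitive matching of `.docx` is an assumption.

```python
from pathlib import PurePosixPath


def expected_stored_name(file_name: str, container: str) -> str:
    """Predict the stored file name under the DOCX-to-PDF rule."""
    path = PurePosixPath(file_name)
    # Assumption: the server treats ".DOCX" the same as ".docx".
    if container != "templates" and path.suffix.lower() == ".docx":
        return str(path.with_suffix(".pdf"))
    return file_name
```

For example, `expected_stored_name("document.docx", "documents")` gives `document.pdf`, while the same file in `templates` keeps its original name.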

***

## Container Types

Common container names and their purposes:

| Container   | Purpose                             |
| ----------- | ----------------------------------- |
| `documents` | Processed documents                 |
| `templates` | Template files (not auto-converted) |
| `input`     | Source documents for ingestion      |
| `output`    | Generated output files              |

***

## Downloading Files

`GET /api/v1/files/{container}` returns presigned S3 URLs directly in `files[].url`. You can either:

1. Open/download the presigned URL directly
2. Stream it through the backend proxy (`GET /api/v1/proxy/file`) for browser/VPN-restricted environments

### Proxy Presigned URL Through Backend

```bash theme={null}
GET /api/v1/proxy/file?presigned_url={URL_ENCODED_PRESIGNED_URL}
```

| Parameter       | Type   | Required | Description                                                        |
| --------------- | ------ | -------- | ------------------------------------------------------------------ |
| `presigned_url` | string | Yes      | Fully-qualified AWS S3 presigned URL (URL-encoded in query string) |

#### Request Example

```bash theme={null}
curl -G "https://api.artosai.com/api/v1/proxy/file" \
  --data-urlencode "presigned_url=https://bucket.s3.amazonaws.com/org-id/documents/protocol.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=3600&..."
```

#### Notes

* Proxy route accepts only AWS S3 hosts (`amazonaws.com`)
* Expired presigned URLs return `410 Gone`
* Response is streamed binary content with the file content-type
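
The presigned URL contains its own `?`, `&`, and `=` characters, so it must be percent-encoded before it is placed in the proxy query string. The sketch below builds the proxy URL in Python. `build_proxy_url` is an illustrative name, and the host check only approximates the restriction noted above; the backend's actual validation may differ.

```python
from urllib.parse import urlencode, urlsplit

PROXY_ENDPOINT = "https://api.artosai.com/api/v1/proxy/file"


def build_proxy_url(presigned_url: str) -> str:
    """Return the proxy URL with `presigned_url` percent-encoded."""
    host = urlsplit(presigned_url).hostname or ""
    # Client-side approximation of the documented S3-host restriction.
    if host != "amazonaws.com" and not host.endswith(".amazonaws.com"):
        raise ValueError(f"not an AWS S3 host: {host!r}")
    return f"{PROXY_ENDPOINT}?{urlencode({'presigned_url': presigned_url})}"
```

Rejecting non-S3 hosts on the client avoids a request the proxy would refuse.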

***

## Generate Presigned URL from S3 Key (Optional)

Use this endpoint when another API returns a raw S3 key (instead of a presigned URL).

### Generate Presigned URL

```bash theme={null}
POST /generate-presigned-url
```

| Parameter | Type   | Required | Description                         |
| --------- | ------ | -------- | ----------------------------------- |
| `key`     | string | Yes      | S3 object key returned by another API |

#### Request Example

```bash theme={null}
curl -X POST "https://api.artosai.com/generate-presigned-url" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"key": "org-id/documents/protocol.pdf"}'
```

#### Response

```json theme={null}
{
  "url": "https://bucket.s3.amazonaws.com/org-id/documents/protocol.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=3600&..."
}
```

The presigned URL:

* **Expires after 1 hour**
* **Requires no authentication** to use — download directly
* Is scoped to your organization's S3 path
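
Since the URL expires, a client may want to check its remaining lifetime before starting a download. The sketch below assumes the URL carries the standard AWS Signature Version 4 query parameters `X-Amz-Date` and `X-Amz-Expires`. The examples on this page show `X-Amz-Expires` but elide the rest, so treat the presence of `X-Amz-Date` as an assumption. The function names are illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def presigned_expiry(url: str) -> datetime:
    """Return the UTC expiry time encoded in a SigV4 presigned URL.

    Raises KeyError if X-Amz-Date or X-Amz-Expires is missing.
    """
    params = parse_qs(urlsplit(url).query)
    # X-Amz-Date uses the ISO 8601 basic format, e.g. 20240101T120000Z.
    signed_at = datetime.strptime(params["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    lifetime = timedelta(seconds=int(params["X-Amz-Expires"][0]))
    return signed_at.replace(tzinfo=timezone.utc) + lifetime


def is_expired(url: str, now: Optional[datetime] = None) -> bool:
    """Return True if the presigned URL has passed its expiry time."""
    now = now or datetime.now(timezone.utc)
    return now >= presigned_expiry(url)
```

If `is_expired` returns `True`, request a new presigned URL instead of retrying the old one.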

### Full Key-Based Download Workflow

```bash theme={null}
# Step 1: Obtain an S3 key from an endpoint that returns keys
# Example key: org-id/documents/protocol.pdf

# Step 2: Generate a presigned URL
curl -X POST "https://api.artosai.com/generate-presigned-url" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"key": "org-id/documents/protocol.pdf"}'

# Step 3a: Download directly from S3 (no auth needed)
# curl -X GET "https://bucket.s3.amazonaws.com/org-id/documents/protocol.pdf?X-Amz-Algorithm=..."

# Step 3b (optional): Stream via backend proxy
# curl -G "https://api.artosai.com/api/v1/proxy/file" \
#   --data-urlencode "presigned_url=https://bucket.s3.amazonaws.com/org-id/documents/protocol.pdf?X-Amz-Algorithm=..."
```
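
The same workflow could look like this in Python. It is a sketch using only the standard library, with hypothetical helper names (`build_presign_request`, `download_by_key`), and it covers the direct S3 download path from Step 3a.

```python
import json
import shutil
import urllib.request

PRESIGN_ENDPOINT = "https://api.artosai.com/generate-presigned-url"


def build_presign_request(key: str, token: str) -> urllib.request.Request:
    """Build the POST request that exchanges an S3 key for a presigned URL."""
    return urllib.request.Request(
        PRESIGN_ENDPOINT,
        data=json.dumps({"key": key}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def download_by_key(key: str, token: str, destination: str) -> None:
    """Generate a presigned URL for `key` and save the file to `destination`."""
    with urllib.request.urlopen(build_presign_request(key, token)) as response:
        presigned_url = json.load(response)["url"]
    # The presigned URL is self-authenticating, so no Authorization header is sent.
    with urllib.request.urlopen(presigned_url) as source, open(destination, "wb") as target:
        shutil.copyfileobj(source, target)
```

A call such as `download_by_key("org-id/documents/protocol.pdf", "YOUR_TOKEN", "protocol.pdf")` would perform Steps 2 and 3a in sequence. Keeping the bearer token out of the S3 request avoids sending it to a host that does not need it.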
