Documents API

The Documents API enables asynchronous document generation from source materials using structured templates. Document generation is processed as a background Celery task and returns immediately with a document ID for status tracking.

List Documents

Retrieve all documents accessible to the authenticated user across all their document sets.
GET /api/v1/documents/
Access rules:
  • Internal / Owner roles: all documents within their organization
  • All other roles: only documents in document sets they belong to

Request Example

curl -X GET "https://api.artosai.com/api/v1/documents/" \
  -H "Authorization: Bearer YOUR_TOKEN"

Response

{
  "documents": [
    {
      "document_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "document_name": "CSR_Final.docx",
      "created_at": "2024-01-25 12:00:00.000000",
      "updated_at": "2024-01-25 12:45:00.000000",
      "workspace_id": "ws-uuid-123",
      "template": {
        "template_id": "tmpl-uuid-456",
        "template_name": "CSR Template"
      }
    },
    {
      "document_id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
      "document_name": "Protocol_v2.docx",
      "created_at": "2024-01-20 09:30:00.000000",
      "updated_at": "2024-01-20 11:00:00.000000",
      "workspace_id": "ws-uuid-789",
      "template": null
    }
  ]
}

Response Fields

| Field | Type | Description |
|-------|------|-------------|
| `documents` | array | List of document items |
| `documents[].document_id` | string | Document UUID |
| `documents[].document_name` | string | Document file name |
| `documents[].created_at` | string | Creation timestamp |
| `documents[].updated_at` | string | Last updated timestamp |
| `documents[].workspace_id` | string | Document set ID the document belongs to |
| `documents[].template` | object | Template metadata (`null` if no template) |
| `documents[].template.template_id` | string | Template UUID |
| `documents[].template.template_name` | string | Template name |
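
Because `template` can be `null`, client code that reads `template.template_name` needs a guard. The following is a minimal sketch of one way to group a List Documents response by `workspace_id`; the helper name `group_by_workspace` is hypothetical and the sample data is copied from the response shown earlier.

```python
from collections import defaultdict


def group_by_workspace(payload):
    """Group (document_name, template_name) pairs by workspace_id."""
    grouped = defaultdict(list)
    for doc in payload.get("documents", []):
        # "template" is null (None in Python) when no template is attached.
        template = doc.get("template") or {}
        grouped[doc["workspace_id"]].append(
            (doc["document_name"], template.get("template_name"))
        )
    return dict(grouped)


# Sample data taken from the example response in this section.
sample = {
    "documents": [
        {
            "document_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
            "document_name": "CSR_Final.docx",
            "workspace_id": "ws-uuid-123",
            "template": {"template_id": "tmpl-uuid-456", "template_name": "CSR Template"},
        },
        {
            "document_id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
            "document_name": "Protocol_v2.docx",
            "workspace_id": "ws-uuid-789",
            "template": None,
        },
    ]
}

print(group_by_workspace(sample))
```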

Status Codes

  • 200 OK: Documents retrieved successfully
  • 401 Unauthorized: Missing or invalid Bearer token
  • 500 Internal Server Error: Database error

Generate Document

Request document generation from source documents using a template. The request is queued as a background Celery task and returns 202 Accepted immediately. A placeholder document record is created synchronously before queuing — use the returned task_id to poll status.
POST /api/v1/documents/generate

Request Body

Fields marked with an alias can be sent using either name — both are accepted.
{
  "document_type": "CSR",
  "file_paths": ["org-id/documents/protocol.pdf", "org-id/documents/data.xlsx"],
  "connector_data_id": "project-2024-001",
  "workspace_name": "Q1 CSR Documents",
  "template_id": "tmpl-uuid-123",
  "output_name": "CSR_Final.docx",
  "selected_section_ids": ["section-uuid-1", "section-uuid-2"],
  "generic_mrt_outline_full": {},
  "document_instructions": "Follow company style guide",
  "style_guide_id": "sg-uuid-456"
}
The following field names are interchangeable:
| Alias (also accepted) | Internal field name |
|-----------------------|---------------------|
| `connector_data_id` | `document_set_key` |
| `template_id` | `generic_mrt_id` |
| `workspace_name` | `document_set_name` |

### Request Parameters

The API accepts both the alias name and the internal field name for aliased fields (both are equivalent).

| Parameter | Alias | Type | Required | Description |
|-----------|-------|------|----------|-------------|
| `document_type` | — | string | Yes | Type of document (e.g., `"CSR"`, `"Protocol"`, `"IND"`) |
| `file_paths` | — | array | Yes | S3 file paths for source documents |
| `connector_data_id` | `document_set_key` | string | Yes | Unique key scoping the document set for search |
| `workspace_name` | `document_set_name` | string | No | Human-readable name for the document set (defaults to `""`) |
| `template_id` | `generic_mrt_id` | string | Yes | Template ID to use |
| `output_name` | — | string | No | Name for the generated output file |
| `selected_section_ids` | — | array | No | Specific section IDs (strings or `{section_id, user_instructions}` objects) to include |
| `generic_mrt_outline_full` | — | object | No | Full pre-built outline structure to use directly |
| `document_instructions` | — | string | No | Document-level instructions for generation |
| `style_guide_id` | — | string | No | Style guide ID to apply to the generated document |
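
A client that receives payloads from several callers may want to settle on one name per aliased field and check the required fields before sending the request. The sketch below shows one way to do that on the client side. The helper `normalize_generate_payload` is hypothetical, and rejecting conflicting values is a choice made for this example; the documentation does not say how the server resolves a request that sends both names with different values.

```python
# Name pairs taken from the parameters table: parameter -> equivalent alias.
ALIASES = {
    "connector_data_id": "document_set_key",
    "template_id": "generic_mrt_id",
    "workspace_name": "document_set_name",
}
# Fields marked "Yes" in the Required column.
REQUIRED = ("document_type", "file_paths", "connector_data_id", "template_id")


def normalize_generate_payload(payload):
    """Return a copy that uses one name per aliased field and has all required fields."""
    alias_to_parameter = {alias: name for name, alias in ALIASES.items()}
    normalized = {}
    for key, value in payload.items():
        name = alias_to_parameter.get(key, key)
        # Client-side assumption: two different values for the same field is an error.
        if name in normalized and normalized[name] != value:
            raise ValueError(f"conflicting values for {name!r} and its alias")
        normalized[name] = value
    missing = [name for name in REQUIRED if not normalized.get(name)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    return normalized


print(normalize_generate_payload({
    "document_type": "CSR",
    "file_paths": ["org-id/documents/protocol.pdf"],
    "document_set_key": "project-2024-001",
    "generic_mrt_id": "tmpl-uuid-123",
}))
```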

### Request Example

```bash
curl -X POST "https://api.artosai.com/api/v1/documents/generate" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_type": "CSR",
    "file_paths": ["org-id/documents/protocol.pdf", "org-id/documents/csr-data.xlsx"],
    "connector_data_id": "project-2024-001",
    "workspace_name": "Q1 CSR Documents",
    "template_id": "tmpl-uuid-123",
    "output_name": "CSR_Final.docx"
  }'
# Note: "document_set_key", "document_set_name", and "generic_mrt_id" are accepted too
```

Python Example

import requests

url = "https://api.artosai.com/api/v1/documents/generate"
headers = {
    "Authorization": "Bearer YOUR_TOKEN",
    "Content-Type": "application/json"
}

payload = {
    "document_type": "CSR",
    "file_paths": ["org-id/documents/protocol.pdf"],
    # Each aliased field below accepts either name:
    "connector_data_id": "project-2024-001",   # or "document_set_key"
    "workspace_name": "Q1 CSR Documents",       # or "document_set_name"
    "template_id": "tmpl-uuid-123",             # or "generic_mrt_id"
    "output_name": "CSR_Final.docx"
}

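response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()  # surface 4xx/5xx errors before reading the body
result = response.json()
document_id = result["task_id"]  # task_id is the document UUID
print(f"Document ID: {document_id}")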

Response (202 Accepted)

{
  "message": "Request to generate document has been accepted and is being processed in the background.",
  "task_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}

Response Fields

| Field | Type | Description |
|-------|------|-------------|
| `message` | string | Status message |
| `task_id` | string | The document UUID; use this to poll status and retrieve the document |

Status Codes

  • 202 Accepted: Request accepted for background processing
  • 400 Bad Request: Missing required parameters or document set not found
  • 401 Unauthorized: Missing or invalid Bearer token
  • 500 Internal Server Error: Database operation failed

Idempotency

If a document with the same output_name already exists for your organization, the existing document ID is returned immediately (no duplicate is created).
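
One consequence is that reusing an `output_name` returns the earlier document rather than starting a new generation. If a fresh document is wanted, the caller can pick a name that is not yet taken. The sketch below is one possible client-side approach, not part of the API: the helper `versioned_output_name` and the `_vN` suffix scheme are hypothetical, and it assumes the set of existing names is built from `document_name` values returned by List Documents.

```python
from pathlib import PurePosixPath


def versioned_output_name(output_name, existing_names):
    """Return output_name, or the first 'stem_vN.suffix' variant not in existing_names."""
    if output_name not in existing_names:
        return output_name
    path = PurePosixPath(output_name)
    version = 2
    while True:
        candidate = f"{path.stem}_v{version}{path.suffix}"
        if candidate not in existing_names:
            return candidate
        version += 1


# Names assumed to come from a prior List Documents call.
existing = {"CSR_Final.docx", "Protocol_v2.docx"}
print(versioned_output_name("CSR_Final.docx", existing))
```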

Document Generation Pipeline

The background task performs the following steps:
  1. Extract — Extract and classify content from source documents
  2. Ingest — Ingest documents using classification results
  3. Create Outline — Generate an outline from the template
  4. Orchestrate — Execute document outline rule orchestration
  5. Generate — Produce the final DOCX document

Get Document Status

Poll the current status of a document being generated.
GET /api/v1/documents/status/{document_id}

Path Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `document_id` | string | Yes | UUID of the document (returned as `task_id` from `/generate`) |

Request Example

curl -X GET "https://api.artosai.com/api/v1/documents/status/a1b2c3d4-e5f6-7890-abcd-ef1234567890" \
  -H "Authorization: Bearer YOUR_TOKEN"

Response

{
  "task_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "Generating",
  "progress": null,
  "error": null
}

Status Values

| Status | Description |
|--------|-------------|
| `Pending` | Document accepted but not yet picked up by a worker |
| `Ingesting` | Source documents are being ingested |
| `Generating` | Document content is being generated |
| `Ready` | Document generation completed successfully |
| `Failed` | Document generation encountered an error |

Response Fields

| Field | Type | Description |
|-------|------|-------------|
| `task_id` | string | Document UUID |
| `status` | string | Current processing status |
| `progress` | integer | Progress percentage (reserved; currently always `null`) |
| `error` | string | Error message if status is `"Failed"`, otherwise `null` |

Status Codes

  • 200 OK: Status retrieved successfully
  • 404 Not Found: Document not found
  • 500 Internal Server Error: Database error

Polling Workflow

#!/bin/bash
TOKEN="your_bearer_token"
API="https://api.artosai.com"
DOC_ID="a1b2c3d4-e5f6-7890-abcd-ef1234567890"

# Poll every 5 seconds until the document reaches a terminal status.
while true; do
  RESPONSE=$(curl -s -X GET "$API/api/v1/documents/status/$DOC_ID" \
    -H "Authorization: Bearer $TOKEN")
  # Quote the variable so the JSON reaches jq unmodified.
  STATUS=$(echo "$RESPONSE" | jq -r '.status')

  echo "Status: $STATUS"

  if [ "$STATUS" = "Ready" ]; then
    echo "Document generation complete!"
    break
  elif [ "$STATUS" = "Failed" ]; then
    echo "Error: $(echo "$RESPONSE" | jq -r '.error')"
    exit 1
  fi

  sleep 5
done
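
The same loop can be written in Python with an overall timeout, so that a document stuck in a non-terminal status does not block the caller forever. This is a sketch under stated assumptions: `wait_for_document` and `fetch_status_http` are hypothetical helper names, and the 5-second interval and 30-minute timeout are illustrative defaults rather than documented limits. The fetch function is passed in as a parameter so the polling logic can be exercised without network access.

```python
import json
import time
import urllib.request


def fetch_status_http(document_id, token, api="https://api.artosai.com"):
    """GET /api/v1/documents/status/{document_id} and return the parsed JSON body."""
    request = urllib.request.Request(
        f"{api}/api/v1/documents/status/{document_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_document(fetch_status, document_id, interval=5.0, timeout=1800.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until the status is Ready; raise on Failed or when the timeout elapses."""
    deadline = clock() + timeout
    while True:
        body = fetch_status(document_id)
        status = body.get("status")
        if status == "Ready":
            return body
        if status == "Failed":
            raise RuntimeError(f"document generation failed: {body.get('error')}")
        # Pending, Ingesting, and Generating are non-terminal: keep waiting.
        if clock() >= deadline:
            raise TimeoutError(f"document {document_id} still {status!r} after {timeout}s")
        sleep(interval)
```

A call might look like `wait_for_document(lambda doc_id: fetch_status_http(doc_id, "YOUR_TOKEN"), "a1b2c3d4-e5f6-7890-abcd-ef1234567890")`.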

Get Single Document

Retrieve a completed document by ID. Returns all document metadata and sections.
GET /api/v1/documents/{document_id}

Path Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `document_id` | string | Yes | UUID of the document |

Request Example

curl -X GET "https://api.artosai.com/api/v1/documents/a1b2c3d4-e5f6-7890-abcd-ef1234567890" \
  -H "Authorization: Bearer YOUR_TOKEN"

Response

{
  "document": {
    "document_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "document_name": "CSR_Final.docx",
    "document_set": "Q1 CSR Documents",
    "document_set_name": "Q1 CSR Documents",
    "document_type": "CSR",
    "product_name": "",
    "status": "Ready",
    "version": 0,
    "template_id": "tmpl-uuid-123",
    "template_nickname": "CSR",
    "organization_id": "org-uuid-789",
    "user": "user@example.com",
    "all_sources": ["org-id/documents/protocol.pdf", "org-id/documents/csr-data.xlsx"],
    "selected_section_ids": ["section-uuid-1", "section-uuid-2"],
    "last_regeneration": "2024-01-25T12:45:00Z",
    "created_at": "2024-01-25T12:00:00Z",
    "updated_at": "2024-01-25T12:45:00Z"
  }
}

Status Codes

  • 200 OK: Document retrieved successfully
  • 401 Unauthorized: Missing or invalid Bearer token
  • 403 Forbidden: Document belongs to a different organization
  • 404 Not Found: Document not found
  • 500 Internal Server Error: Database error
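
The timestamps in the sample response for this endpoint are ISO 8601 strings with a trailing `Z`. The sketch below shows one way to parse them and compute the time between `created_at` and `updated_at`; the helper names are hypothetical. The `Z` is rewritten to an explicit offset because `datetime.fromisoformat` only accepts the `Z` suffix from Python 3.11 onward.

```python
from datetime import datetime, timedelta


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp, mapping a trailing 'Z' to an explicit UTC offset."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def time_between_create_and_update(document):
    """Return updated_at minus created_at as a timedelta."""
    return parse_timestamp(document["updated_at"]) - parse_timestamp(document["created_at"])


# Values copied from the sample response in this section.
document = {
    "created_at": "2024-01-25T12:00:00Z",
    "updated_at": "2024-01-25T12:45:00Z",
}
print(time_between_create_and_update(document))
```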

Create Local Document

Upload a Word document directly from the Office Add-in and create a minimal document record. Unlike /generate, this does not process the document through the generation pipeline — it simply stores the document for use with chat functionality.
POST /api/v1/documents/local

Request

Content-Type: multipart/form-data
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `file` | file | Yes | The .docx file to upload |
Only .docx files are accepted.

Request Example

curl -X POST "https://api.artosai.com/api/v1/documents/local" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file=@/path/to/my-document.docx"

Python Example

import requests

url = "https://api.artosai.com/api/v1/documents/local"
headers = {"Authorization": "Bearer YOUR_TOKEN"}

with open("/path/to/my-document.docx", "rb") as f:
    response = requests.post(
        url,
        headers=headers,
        files={"file": ("my-document.docx", f, "application/vnd.openxmlformats-officedocument.wordprocessingml.document")}
    )

result = response.json()
print(f"Document ID: {result['document_id']}")
print(f"S3 Key: {result['s3_key']}")

Response (201 Created)

{
  "document_id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
  "document_name": "my-document.docx",
  "s3_key": "org-uuid-789/local-documents/c3d4e5f6-a7b8-9012-cdef-123456789012/my-document.docx",
  "status": "Ready"
}

Response Fields

| Field | Type | Description |
|-------|------|-------------|
| `document_id` | string | UUID of the created document |
| `document_name` | string | Name of the uploaded file |
| `s3_key` | string | S3 key where the document is stored |
| `status` | string | Always `"Ready"` for local documents |

Status Codes

  • 201 Created: Document uploaded and record created successfully
  • 400 Bad Request: Invalid file type (only .docx accepted)
  • 401 Unauthorized: Authentication failed
  • 500 Internal Server Error: S3 upload or database error
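
Since anything other than a .docx file is rejected with 400 Bad Request, a client can check the file name before uploading and skip a wasted round trip. A minimal sketch follows; `is_docx_filename` is a hypothetical helper, and treating the extension as case-insensitive is an assumption, as the documentation does not say how the server compares it.

```python
from pathlib import PurePath


def is_docx_filename(filename):
    """Return True if the file name ends in .docx (case-insensitive by assumption)."""
    return PurePath(filename).suffix.lower() == ".docx"


for name in ("my-document.docx", "protocol.pdf", "notes.doc"):
    print(name, is_docx_filename(name))
```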

Get Sections for Document

Retrieve a flat list of all section identifiers and their associated metadata within a given document. Use it to enumerate the full section structure so that the Sources panel selection dropdown can be pre-populated with all available sections when a document loads.
POST /get-sections-for-document

Request Body

{
  "document_id": "doc_9f3c1e72ab84"
}

Request Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `document_id` | string | Yes | Unique identifier of the document whose section list is being retrieved. Must reference an existing document accessible by the authenticated user. |

Request Example

curl -X POST "https://api.artosai.com/get-sections-for-document" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_id": "doc_9f3c1e72ab84"
  }'

Python Example

import requests

url = "https://api.artosai.com/get-sections-for-document"
headers = {
    "Authorization": "Bearer YOUR_TOKEN",
    "Content-Type": "application/json"
}

payload = {
    "document_id": "doc_9f3c1e72ab84"
}

response = requests.post(url, headers=headers, json=payload)
sections = response.json()["sections"]
for section in sections:
    print(f"{section['section_order']}: {section['section_title']}")

Response

{
  "sections": [
    {
      "section_id": "1.1 Title of Study:",
      "section_title": "1.1 Title of Study:",
      "section_order": 0
    },
    {
      "section_id": "1.2 Study Objectives:",
      "section_title": "1.2 Study Objectives:",
      "section_order": 1
    },
    {
      "section_id": "1.3 Study Design:",
      "section_title": "1.3 Study Design:",
      "section_order": 2
    },
    {
      "section_id": "1.4 Study Population:",
      "section_title": "1.4 Study Population:",
      "section_order": 3
    },
    {
      "section_id": "1.5 Study Duration:",
      "section_title": "1.5 Study Duration:",
      "section_order": 4
    }
  ]
}

Response Fields

| Field | Type | Description |
|-------|------|-------------|
| `sections` | array | Array of section objects, sorted by document order |
| `sections[].section_id` | string | Unique identifier of the section (the section title). Used as `section_id` in the get-sources-for-selected-text endpoint. |
| `sections[].section_title` | string | Human-readable display name of the section, rendered as an option in the Sources panel selection dropdown. |
| `sections[].section_order` | integer | Zero-based index representing the section’s position within the document, used to render dropdown options in document order. |

Status Codes

  • 200 OK: Successfully retrieved section list
  • 400 Bad Request: Authentication failed or invalid Bearer token
  • 404 Not Found: Document MRT not found for the given document ID
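
To illustrate how the three section fields fit together, the sketch below turns a response into an ordered list of dropdown options. The helper `dropdown_options` and the `value`/`label` keys are hypothetical, chosen for this example only. The response is already described as sorted, so the explicit sort on `section_order` is a defensive step.

```python
def dropdown_options(payload):
    """Build ordered {value, label} options from a get-sections-for-document response."""
    sections = sorted(payload.get("sections", []), key=lambda s: s["section_order"])
    return [
        {"value": s["section_id"], "label": s["section_title"]}
        for s in sections
    ]


# Two entries from the sample response, deliberately out of order.
sample_sections = {
    "sections": [
        {"section_id": "1.2 Study Objectives:", "section_title": "1.2 Study Objectives:", "section_order": 1},
        {"section_id": "1.1 Title of Study:", "section_title": "1.1 Title of Study:", "section_order": 0},
    ]
}

for option in dropdown_options(sample_sections):
    print(option["label"])
```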

Complete Workflow Example

#!/bin/bash

TOKEN="your_bearer_token"
API="https://api.artosai.com"

# Step 1: Upload source documents
echo "Uploading source documents..."
curl -X POST "$API/api/v1/files/upload" \
  -H "Authorization: Bearer $TOKEN" \
  -F "file_name=protocol.pdf" \
  -F "file_content=@protocol.pdf" \
  -F "container=documents"

# Step 2: Request document generation
echo "Requesting document generation..."
RESPONSE=$(curl -s -X POST "$API/api/v1/documents/generate" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_type": "CSR",
    "file_paths": ["org-id/documents/protocol.pdf"],
    "connector_data_id": "project-2024",
    "workspace_name": "Q1 CSR",
    "template_id": "tmpl-uuid-123",
    "output_name": "CSR_Final.docx"
  }')

DOC_ID=$(echo "$RESPONSE" | jq -r '.task_id')
echo "Document ID: $DOC_ID"

# Step 3: Poll status until Ready
echo "Waiting for generation to complete..."
while true; do
  STATUS=$(curl -s -X GET "$API/api/v1/documents/status/$DOC_ID" \
    -H "Authorization: Bearer $TOKEN" | jq -r '.status')

  echo "Status: $STATUS"

  if [ "$STATUS" = "Ready" ]; then
    echo "Document generation complete!"
    break
  elif [ "$STATUS" = "Failed" ]; then
    echo "Document generation failed"
    exit 1
  fi

  sleep 5
done

# Step 4: Retrieve document
echo "Retrieving document..."
curl -s -X GET "$API/api/v1/documents/$DOC_ID" \
  -H "Authorization: Bearer $TOKEN" | jq '.document'