
Documents API

The Documents API enables asynchronous document generation from source materials using Machine Readable Templates (MRTs). Generation runs as a background task; the request returns immediately with a task ID for status tracking.

Generate Document

Request document generation from source documents using an MRT template. The request is queued as a background Celery task and returns 202 Accepted.
POST /api/v1/documents/generate

Request Body

{
  "document_type": "CSR",
  "file_paths": ["s3://bucket/protocol.pdf", "s3://bucket/data.xlsx"],
  "document_set_key": "project-2024-001",
  "document_set_name": "Q1 CSR Documents",
  "generic_mrt_id": "template-uuid-123",
  "output_name": "CSR_Final.docx",
  "selected_section_ids": ["section-1", "section-2"],
  "generic_mrt_outline_full": {},
  "document_instructions": "Follow company style guide",
  "style_guide_id": "style-guide-uuid"
}

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| document_type | string | Yes | Type of document (e.g., 'CSR', 'Protocol', 'IND') |
| file_paths | array | Yes | S3 file paths for source documents |
| document_set_key | string | Yes | Unique key for document set |
| document_set_name | string | Yes | Human-readable name for document set |
| generic_mrt_id | string | Yes | MRT template ID to use |
| output_name | string | Yes | Name for generated output file |
| selected_section_ids | array | No | Specific sections to include |
| generic_mrt_outline_full | object | No | Full outline structure |
| document_instructions | string | No | Document-level instructions |
| style_guide_id | string | No | Style guide ID for formatting |
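
Because a missing required parameter yields a 400 response, it can help to check the payload locally before sending it. The following is a minimal sketch, not part of the API: the helper name `validate_generate_payload` is hypothetical, and the type checks simply map the JSON types in the table (string, array, object) to Python's `str`, `list`, and `dict`. The server may apply further validation that this sketch does not cover.

```python
# Field names and types mirror the Request Parameters table.
REQUIRED_FIELDS = {
    "document_type": str,
    "file_paths": list,
    "document_set_key": str,
    "document_set_name": str,
    "generic_mrt_id": str,
    "output_name": str,
}
OPTIONAL_FIELDS = {
    "selected_section_ids": list,
    "generic_mrt_outline_full": dict,
    "document_instructions": str,
    "style_guide_id": str,
}


def validate_generate_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks well-formed."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in payload:
            problems.append(f"missing required field: {name}")
        elif not isinstance(payload[name], expected):
            problems.append(f"{name} should be of type {expected.__name__}")
    for name, expected in OPTIONAL_FIELDS.items():
        if name in payload and not isinstance(payload[name], expected):
            problems.append(f"{name} should be of type {expected.__name__}")
    return problems


payload = {
    "document_type": "CSR",
    "file_paths": ["s3://bucket/protocol.pdf"],
    "document_set_key": "project-2024-001",
    "document_set_name": "Q1 CSR Documents",
    "generic_mrt_id": "template-uuid-123",
    "output_name": "CSR_Final.docx",
}
print(validate_generate_payload(payload))  # prints []
```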

Request Example

curl -X POST "https://api.artosai.com/api/v1/documents/generate" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_type": "CSR",
    "file_paths": ["s3://bucket/protocol.pdf"],
    "document_set_key": "project-2024-001",
    "document_set_name": "Q1 CSR Documents",
    "generic_mrt_id": "template-uuid-123",
    "output_name": "CSR_Final.docx"
  }'

Python Example

import requests

url = "https://api.artosai.com/api/v1/documents/generate"
headers = {
    "Authorization": "Bearer YOUR_TOKEN",
    "Content-Type": "application/json"
}

payload = {
    "document_type": "CSR",
    "file_paths": ["s3://bucket/protocol.pdf"],
    "document_set_key": "project-2024-001",
    "document_set_name": "Q1 CSR Documents",
    "generic_mrt_id": "template-uuid-123",
    "output_name": "CSR_Final.docx"
}

response = requests.post(url, headers=headers, json=payload)
task = response.json()
print(f"Task ID: {task['task_id']}")

Response (202 Accepted)

{
  "message": "Document generation started",
  "task_id": "celery-task-uuid-456"
}

Response Fields

| Field | Type | Description |
|---|---|---|
| message | string | Status message |
| task_id | string | Celery task ID for status tracking |

Status Codes

  • 202 Accepted: Request accepted for background processing
  • 400 Bad Request: Missing required parameters
  • 401 Unauthorized: Missing or invalid Bearer token
  • 500 Internal Server Error: Database operation failed
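
A client will usually want to branch on these status codes rather than assume success. The sketch below shows one way to do that; the function name and the choice of Python exception types are illustrative assumptions, not something the API prescribes. With the `requests` example above, it could be called as `task_id_from_generate_response(response.status_code, response.json())`.

```python
def task_id_from_generate_response(status_code: int, body: dict) -> str:
    """Return the task ID on 202 Accepted; raise a descriptive error otherwise."""
    if status_code == 202:
        return body["task_id"]
    if status_code == 400:
        raise ValueError("400 Bad Request: a required parameter is missing")
    if status_code == 401:
        raise PermissionError("401 Unauthorized: missing or invalid Bearer token")
    if status_code == 500:
        raise RuntimeError("500 Internal Server Error: database operation failed")
    # Any code not listed in the documentation is treated as unexpected.
    raise RuntimeError(f"unexpected status code: {status_code}")


accepted = {"message": "Document generation started", "task_id": "celery-task-uuid-456"}
print(task_id_from_generate_response(202, accepted))  # prints celery-task-uuid-456
```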

Document Generation Pipeline

The background task performs the following steps:
  1. Extract - Extract and classify content from source documents
  2. Ingest - Ingest documents using classification
  3. Create Outline - Generate MRT outline from template
  4. Orchestrate - Execute document outline rule orchestration
  5. Generate - Produce final DOCX document
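
To make the ordering concrete, the five steps can be pictured as stages run in sequence, where a failure in any stage ends the task with status Failed. The sketch below is purely illustrative: the runner, the placeholder stage functions, and the context keys are hypothetical and do not reflect the service's actual implementation.

```python
from typing import Callable

# A stage receives the accumulated context and returns an updated copy.
Stage = Callable[[dict], dict]


def run_pipeline(stages: list[tuple[str, Stage]], context: dict) -> dict:
    """Run stages in order; stop at the first failure and record which stage failed."""
    completed = []
    for name, stage in stages:
        try:
            context = stage(context)
        except Exception as exc:
            return {**context, "completed": completed,
                    "status": "Failed", "error": f"{name}: {exc}"}
        completed.append(name)
    return {**context, "completed": completed, "status": "Complete", "error": None}


def placeholder(key: str) -> Stage:
    """Build a stand-in stage that only marks itself as done (hypothetical)."""
    def stage(ctx: dict) -> dict:
        return {**ctx, key: True}
    return stage


PIPELINE = [
    ("Extract", placeholder("extracted")),
    ("Ingest", placeholder("ingested")),
    ("Create Outline", placeholder("outline_created")),
    ("Orchestrate", placeholder("orchestrated")),
    ("Generate", placeholder("docx_generated")),
]

result = run_pipeline(PIPELINE, {"file_paths": ["s3://bucket/protocol.pdf"]})
print(result["status"], result["completed"])
```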

Asynchronous Processing

The endpoint returns immediately with a task ID. Use the Status endpoint to track progress:
# Immediately after generation request
{
  "task_id": "celery-task-uuid-456"
}

# Poll the status endpoint periodically
GET /api/v1/documents/status/celery-task-uuid-456

Get Document Status

Poll the status of a document being generated.
GET /api/v1/documents/status/{document_id}

Path Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| document_id | string | Yes | Document ID or task ID |

Request Example

curl -X GET "https://api.artosai.com/api/v1/documents/status/celery-task-uuid-456" \
  -H "Authorization: Bearer YOUR_TOKEN"

Response

{
  "task_id": "celery-task-uuid-456",
  "status": "Generating",
  "progress": null,
  "error": null
}

Status Values

| Status | Description |
|---|---|
| Generating | Currently processing |
| Complete | Successfully finished |
| Failed | Encountered error |

Response Fields

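| Field | Type | Description |
|---|---|---|
| task_id | string | Document ID |
| status | string | Current status |
| progress | integer or null | Progress percentage (reserved for future use; currently null) |
| error | string or null | Error message if status is Failed; null otherwise |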

Status Codes

  • 200 OK: Status retrieved successfully
  • 404 Not Found: Document not found
  • 500 Internal Server Error: Database error

Status Polling Workflow

# 1. Request document generation
curl -X POST "https://api.artosai.com/api/v1/documents/generate" ...
# Response: task_id = "abc-123"

# 2. Poll status immediately
curl -X GET "https://api.artosai.com/api/v1/documents/status/abc-123" \
  -H "Authorization: Bearer YOUR_TOKEN"
# Response: status = "Generating"

# 3. Continue polling until the status is "Complete" or "Failed"
curl -X GET "https://api.artosai.com/api/v1/documents/status/abc-123" \
  -H "Authorization: Bearer YOUR_TOKEN"
# Response: status = "Complete"

# 4. Retrieve completed document
curl -X GET "https://api.artosai.com/api/v1/documents/document-uuid" \
  -H "Authorization: Bearer YOUR_TOKEN"
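
The same polling loop can be written in Python with a timeout, so that a stuck task does not block the caller forever. This is a sketch under stated assumptions: `wait_for_document`, its `interval` and `timeout` defaults, and the injectable `fetch_status` callable are all hypothetical names chosen for illustration. The HTTP call is passed in so the loop can be exercised without network access.

```python
import time
from typing import Callable

# Statuses after which polling should stop, per the Status Values table.
TERMINAL_STATUSES = {"Complete", "Failed"}


def wait_for_document(
    fetch_status: Callable[[str], dict],
    task_id: str,
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the status is terminal; raise TimeoutError once the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        body = fetch_status(task_id)
        if body["status"] in TERMINAL_STATUSES:
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"task {task_id} still {body['status']!r} after {timeout} seconds"
            )
        sleep(interval)


# Offline demonstration: a stand-in for the HTTP call that finishes on the third poll.
responses = iter([
    {"task_id": "abc-123", "status": "Generating", "progress": None, "error": None},
    {"task_id": "abc-123", "status": "Generating", "progress": None, "error": None},
    {"task_id": "abc-123", "status": "Complete", "progress": None, "error": None},
])
final = wait_for_document(lambda _task_id: next(responses), "abc-123", sleep=lambda _s: None)
print(final["status"])  # prints Complete
```

In real use, `fetch_status` could wrap the status endpoint, for example `lambda tid: requests.get(f"https://api.artosai.com/api/v1/documents/status/{tid}", headers=headers).json()`, reusing the `headers` dictionary from the Python example earlier on this page.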

Get Single Document

Retrieve a completed document by ID.
GET /api/v1/documents/{document_id}

Path Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| document_id | string | Yes | UUID of document |

Request Example

curl -X GET "https://api.artosai.com/api/v1/documents/document-uuid" \
  -H "Authorization: Bearer YOUR_TOKEN"

Response

{
  "document": {
    "document_id": "document-uuid",
    "document_set_id": "set-uuid",
    "document_type": "CSR",
    "status": "Complete",
    "output_name": "CSR_Final.docx",
    "sections": [
      {
        "section_id": "section-1",
        "title": "Executive Summary",
        "content": "..."
      }
    ],
    "created_at": "2024-01-25T12:00:00Z",
    "updated_at": "2024-01-25T12:30:00Z"
  }
}

Status Codes

  • 200 OK: Document retrieved successfully
  • 401 Unauthorized: Missing or invalid Bearer token
  • 403 Forbidden: User not authorized (different organization)
  • 404 Not Found: Document not found
  • 500 Internal Server Error: Database error
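
Once a 200 response arrives, the useful data sits under the `document` key. As a small sketch of reading it, the hypothetical helper below pulls out the section titles and the time elapsed between `created_at` and `updated_at`. It assumes the response has the shape shown in the example above and that timestamps are ISO 8601 strings ending in `Z`.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-01-25T12:00:00Z."""
    # fromisoformat() accepts a trailing "Z" only from Python 3.11, so normalize it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_document(body: dict) -> dict:
    """Extract a few fields from a Get Single Document response body."""
    doc = body["document"]
    created = parse_timestamp(doc["created_at"])
    updated = parse_timestamp(doc["updated_at"])
    return {
        "document_id": doc["document_id"],
        "status": doc["status"],
        "section_titles": [section["title"] for section in doc.get("sections", [])],
        "seconds_between_timestamps": (updated - created).total_seconds(),
    }


sample = {
    "document": {
        "document_id": "document-uuid",
        "document_set_id": "set-uuid",
        "document_type": "CSR",
        "status": "Complete",
        "output_name": "CSR_Final.docx",
        "sections": [
            {"section_id": "section-1", "title": "Executive Summary", "content": "..."}
        ],
        "created_at": "2024-01-25T12:00:00Z",
        "updated_at": "2024-01-25T12:30:00Z",
    }
}
summary = summarize_document(sample)
print(summary["section_titles"])  # prints ['Executive Summary']
```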

Complete Workflow Example

#!/bin/bash

TOKEN="your_bearer_token"
API="https://api.artosai.com"

# Step 1: Upload source documents
echo "Uploading source documents..."
curl -X POST "$API/api/v1/files/upload" \
  -H "Authorization: Bearer $TOKEN" \
  -F "file_name=protocol.pdf" \
  -F "file=@protocol.pdf" \
  -F "container=documents"

# Step 2: Request document generation
echo "Requesting document generation..."
RESPONSE=$(curl -s -X POST "$API/api/v1/documents/generate" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_type": "CSR",
    "file_paths": ["s3://bucket/protocol.pdf"],
    "document_set_key": "project-2024",
    "document_set_name": "Q1 CSR",
    "generic_mrt_id": "template-uuid",
    "output_name": "CSR_Final.docx"
  }')

TASK_ID=$(echo "$RESPONSE" | jq -r '.task_id')
echo "Task ID: $TASK_ID"

# Step 3: Poll status until complete
echo "Waiting for generation to complete..."
while true; do
  STATUS=$(curl -s -X GET "$API/api/v1/documents/status/$TASK_ID" \
    -H "Authorization: Bearer $TOKEN" | jq -r '.status')

  echo "Status: $STATUS"

  if [ "$STATUS" = "Complete" ]; then
    echo "Document generation complete!"
    break
  elif [ "$STATUS" = "Failed" ]; then
    echo "Document generation failed"
    exit 1
  fi

  sleep 5
done

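# Step 4: Retrieve document
# This reuses $TASK_ID as the document ID. If the document UUID differs
# from the task ID, substitute the document UUID here.
echo "Retrieving document..."
curl -s -X GET "$API/api/v1/documents/$TASK_ID" \
  -H "Authorization: Bearer $TOKEN" | jq '.document'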