# Template and Document Generation Cookbook

> Complete guide to auto-generating templates from examples and generating documents with practical examples

This cookbook walks you through the complete workflow of automatically generating a template from example documents and using it to generate new documents. You'll learn by uploading example CSR (Clinical Study Report) documents, automatically generating a reusable template from them, and then generating new documents using that template.

## Overview

The document generation workflow follows these steps:

1. **Upload Template and Example Documents** - Provide a template document and sample documents
2. **Generate Template from Examples** - System automatically creates a template structure from your examples
3. **Upload Source Documents** - Prepare source materials for new document generation
4. **Create Document Set** - Organize documents into a logical group
5. **Generate Document** - Request document generation using the auto-generated template
6. **Monitor Progress** - Poll status until generation completes
7. **Retrieve Results** - Download the generated document and metadata

## Prerequisites

Before starting, ensure you have:

* **API Token** - Valid Bearer token for authentication (see [Authentication](/api-reference/authentication))
* **API Endpoint** - Access to `https://api.artosai.com`
* **Tools** - curl, Python 3.6+, or equivalent HTTP client
* **Example Documents** - Sample regulatory documents (PDF, DOCX) that show your desired template structure
* **Source Documents** - Regulatory documents to process (PDF, DOCX, Excel)

## Section 1: Preparing and Uploading Documents

The template generation system learns from example documents. Prepare documents that represent your desired template structure.

### Document Types

**Template Document** (optional):

* A single document showing the desired output structure and format
* Should include section headers, formatting, and style guidelines
* Helps the system understand your preferred organization

**Example Documents**:

* 1-3 existing documents of the type you want to generate
* Should follow the same structure as your desired output
* Used to identify sections, content patterns, and extraction rules
* More varied examples generally produce a more robust template

**Source Documents** (for generation):

* Raw materials to be processed
* Analyzed so that the extracted content can fill the generated template
* May differ from the examples; the extracted content still follows the template structure

### Upload Documents

Use the Files API to upload template and example documents:

**curl Example - Upload Template Document:**

```bash theme={null}
curl -X POST "https://api.artosai.com/api/v1/files/upload" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file_name=csr_template.docx" \
  -F "file_content=@/path/to/csr_template.docx" \
  -F "container=templates"
```

**curl Example - Upload Example Documents:**

```bash theme={null}
# Upload first example
curl -X POST "https://api.artosai.com/api/v1/files/upload" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file_name=example_csr_2024.docx" \
  -F "file_content=@/path/to/example_csr_2024.docx" \
  -F "container=documents"

# Upload second example
curl -X POST "https://api.artosai.com/api/v1/files/upload" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file_name=example_csr_2023.docx" \
  -F "file_content=@/path/to/example_csr_2023.docx" \
  -F "container=documents"
```

**Python Example:**

```python theme={null}
import requests

def upload_file(token, filename, filepath, container):
    """Upload a file for template generation."""
    url = "https://api.artosai.com/api/v1/files/upload"
    headers = {"Authorization": f"Bearer {token}"}

    # Open the file in a context manager so the handle is closed after the request
    with open(filepath, "rb") as fh:
        files = {
            "file_name": (None, filename),
            "container": (None, container),
            "file_content": fh
        }
        response = requests.post(url, headers=headers, files=files)

    if response.status_code == 200:
        print(f"✓ {filename} uploaded to {container}")
        return True
    else:
        print(f"✗ Upload failed: {response.json().get('detail')}")
        return False

# Upload documents
token = "YOUR_TOKEN"
upload_file(token, "csr_template.docx", "csr_template.docx", "templates")
upload_file(token, "example_csr_2024.docx", "example_csr_2024.docx", "documents")
upload_file(token, "example_csr_2023.docx", "example_csr_2023.docx", "documents")
```

## Section 2: Generating a Template from Examples

The `POST /api/v1/templates/generate` endpoint automatically analyzes your example documents and generates a structured template with sections and extraction rules.

### How Template Generation Works

The system:

1. **Analyzes** example documents for structure and content patterns
2. **Identifies** sections, headers, and content types
3. **Generates** extraction rules based on patterns found
4. **Creates** a reusable template with hierarchical sections
5. **Returns** a template ID for document generation

### Create a Template

Submit a template generation request that references your uploaded documents:

**curl Example:**

```bash theme={null}
curl -X POST "https://api.artosai.com/api/v1/templates/generate" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "template_filename": "csr_template.docx",
    "example_filenames": [
      "example_csr_2024.docx",
      "example_csr_2023.docx"
    ],
    "name": "CSR Template 2024",
    "description": "Automatically generated template for Clinical Study Reports",
    "document_type": "CSR"
  }'
```

**Python Example:**

```python theme={null}
import requests

def generate_template(token, template_file, example_files, name, doc_type):
    """Generate a template from template and example documents."""
    url = "https://api.artosai.com/api/v1/templates/generate"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }

    payload = {
        "template_filename": template_file,
        "example_filenames": example_files,
        "name": name,
        "description": f"Auto-generated template for {doc_type} documents",
        "document_type": doc_type,
    }

    response = requests.post(url, headers=headers, json=payload)

    if response.status_code in (200, 202):
        result = response.json()
        template_id = result.get('template_id') or result.get('id')
        print(f"✓ Template generated: {template_id}")
        print(f"  Task ID: {result.get('task_id', 'N/A')}")
        return template_id, result.get('task_id')
    else:
        print(f"✗ Error: {response.json().get('detail', 'Unknown error')}")
        return None, None

# Generate template
token = "YOUR_TOKEN"
template_id, task_id = generate_template(
    token,
    template_file="csr_template.docx",
    example_files=["example_csr_2024.docx", "example_csr_2023.docx"],
    name="CSR Template 2024",
    doc_type="CSR"
)
```

**Response (200 OK or 202 Accepted):**

```json theme={null}
{
  "template_id": "template-uuid-abc123",
  "name": "CSR Template 2024",
  "document_type": "CSR",
  "status": "Complete",
  "section_count": 6,
  "created_at": "2024-01-25T12:00:00Z"
}
```

### Request Parameters

| Parameter                    | Type   | Required | Description                                                       |
| ---------------------------- | ------ | -------- | ----------------------------------------------------------------- |
| `template_filename`          | string | Yes      | Filename of template document (will be prefixed with org S3 path) |
| `example_filenames`          | array  | No       | Filenames of example documents for analysis                       |
| `name`                       | string | Yes      | Name for the generated template                                   |
| `description`                | string | No       | Description of the template                                       |
| `document_type`              | string | No       | Document type (default: "CSR")                                    |
| `default_connector_data_ids` | array  | No       | Default data source connectors                                    |
| `cache_version`              | string | No       | Cache version for reprocessing                                    |

### Generated Template Structure

The system automatically creates sections with:

* **Hierarchical organization** - Top-level and nested sections
* **Extraction rules** - Auto-identified rules for content extraction
* **Content patterns** - Recognized from examples
* **Reusable structure** - Can be applied to similar documents

Save the `template_id` for document generation steps.

## Section 3: Uploading Source Documents

Before generating documents, upload the source materials to be processed.

### Supported File Types

| Type  | Extension | Notes                                                 |
| ----- | --------- | ----------------------------------------------------- |
| PDF   | `.pdf`    | Recommended for documents                             |
| Word  | `.docx`   | Auto-converted to PDF (except in templates container) |
| Excel | `.xlsx`   | For data/tables                                       |
| CSV   | `.csv`    | For structured data                                   |
| RTF   | `.rtf`    | Rich text format                                      |
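
A client-side check against the table above can catch unsupported files before any upload is attempted. The sketch below is illustrative only: `is_supported` and `partition_uploads` are hypothetical helpers, and matching extensions case-insensitively is an assumption about how the API treats them.

```python
from pathlib import Path

# Extensions taken from the supported file types table.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".xlsx", ".csv", ".rtf"}


def is_supported(filename: str) -> bool:
    """Return True if the filename's extension appears in the supported list."""
    # Assumption: extensions are compared case-insensitively.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def partition_uploads(filenames):
    """Split filenames into (supported, unsupported) lists, preserving order."""
    supported = [f for f in filenames if is_supported(f)]
    unsupported = [f for f in filenames if not is_supported(f)]
    return supported, unsupported
```

Calling `partition_uploads` on the list of files you plan to send lets you report every unsupported file at once instead of discovering them one failed request at a time.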

### Upload a File

Use the Files API to upload documents:

**curl Example:**

```bash theme={null}
curl -X POST "https://api.artosai.com/api/v1/files/upload" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file_name=protocol.pdf" \
  -F "file_content=@/path/to/protocol.pdf" \
  -F "container=documents"
```

**Python Example:**

```python theme={null}
import requests

url = "https://api.artosai.com/api/v1/files/upload"
headers = {"Authorization": "Bearer YOUR_TOKEN"}

# Open the file in a context manager so the handle is closed after the upload
with open("/path/to/protocol.pdf", "rb") as fh:
    files = {
        "file_name": (None, "protocol.pdf"),
        "container": (None, "documents"),
        "file_content": fh
    }
    response = requests.post(url, headers=headers, files=files)

result = response.json()

if response.status_code == 200:
    print("✓ File uploaded successfully")
else:
    print(f"✗ Error: {result.get('detail', 'Upload failed')}")
```

**Response (200 OK):**

```json theme={null}
{
  "message": "File uploaded successfully"
}
```

### Upload Multiple Files

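Repeat the upload process for each source document. DOCX files uploaded to the `documents` container are auto-converted to PDF, which is why `safety_report.docx` is referenced as `safety_report.pdf` in the generation request in Section 4: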

```bash theme={null}
# Upload protocol
curl -X POST "https://api.artosai.com/api/v1/files/upload" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file_name=protocol.pdf" \
  -F "file_content=@protocol.pdf" \
  -F "container=documents"

# Upload safety report
curl -X POST "https://api.artosai.com/api/v1/files/upload" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file_name=safety_report.docx" \
  -F "file_content=@safety_report.docx" \
  -F "container=documents"

# Upload efficacy data
curl -X POST "https://api.artosai.com/api/v1/files/upload" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file_name=efficacy_data.xlsx" \
  -F "file_content=@efficacy_data.xlsx" \
  -F "container=documents"
```

### File Container Types

| Container   | Purpose                | Auto-Convert   |
| ----------- | ---------------------- | -------------- |
| `documents` | Processed documents    | Yes (DOCX→PDF) |
| `templates` | Template files         | No             |
| `input`     | Source documents       | Yes            |
| `output`    | Generated output files | No             |
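
Because auto-conversion changes the file extension, it can help to compute the path you expect to reference later. The following is a rough sketch under stated assumptions: the `org-id/<container>/<filename>` layout is inferred from the `file_paths` values used in the generation examples, only `.docx` is assumed to be converted, and `expected_path` is a hypothetical helper rather than an API feature.

```python
from pathlib import PurePosixPath

# Containers the table marks as auto-converting uploads.
AUTO_CONVERT_CONTAINERS = {"documents", "input"}


def expected_path(org_id: str, container: str, filename: str) -> str:
    """Guess the stored path for an uploaded file (assumed layout, not confirmed)."""
    name = PurePosixPath(filename)
    # Assumption: only .docx is converted, and the converted file keeps its stem.
    if container in AUTO_CONVERT_CONTAINERS and name.suffix.lower() == ".docx":
        name = name.with_suffix(".pdf")
    return f"{org_id}/{container}/{name}"
```

For instance, `expected_path("org-id", "documents", "safety_report.docx")` yields `org-id/documents/safety_report.pdf`, while the same file in the `templates` container keeps its `.docx` extension.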

## Section 4: Generating a Document

Now that you have a generated template and source documents, request document generation.

### Create a Document Set

First, organize your documents into a document set:

**curl Example:**

```bash theme={null}
curl -X POST "https://api.artosai.com/api/v1/document-sets/" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_set_name": "Q1 2024 CSR Submission"
  }'
```

**Python Example:**

```python theme={null}
import requests

url = "https://api.artosai.com/api/v1/document-sets/"
headers = {
    "Authorization": "Bearer YOUR_TOKEN",
    "Content-Type": "application/json"
}

payload = {"document_set_name": "Q1 2024 CSR Submission"}

response = requests.post(url, headers=headers, json=payload)
result = response.json()

if response.status_code == 201:
    document_set_id = result['document_set_id']
    print(f"✓ Document set created: {document_set_id}")
else:
    print(f"✗ Error: {result.get('detail', 'Failed to create set')}")
```

**Response (201 Created):**

```json theme={null}
{
  "document_set_id": "set-uuid-123",
  "document_set_name": "Q1 2024 CSR Submission",
  "organization_id": "org-uuid",
  "version": 1
}
```

### Request Document Generation

Submit a generation request using your generated template:

**curl Example:**

```bash theme={null}
curl -X POST "https://api.artosai.com/api/v1/documents/generate" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "document_type": "CSR",
    "file_paths": [
      "org-id/documents/protocol.pdf",
      "org-id/documents/safety_report.pdf",
      "org-id/documents/efficacy_data.xlsx"
    ],
    "document_set_key": "random_uuid",
    "document_set_name": "Q1 2024 CSR Submission",
    "template_id": "template-uuid-abc123",
    "output_name": "CSR_Final_Q1_2024.docx"
  }'
```

**Python Example:**

```python theme={null}
import requests

url = "https://api.artosai.com/api/v1/documents/generate"
headers = {
    "Authorization": "Bearer YOUR_TOKEN",
    "Content-Type": "application/json"
}

payload = {
    "document_type": "CSR",
    "file_paths": [
        "org-id/documents/protocol.pdf",
        "org-id/documents/safety_report.pdf",
        "org-id/documents/efficacy_data.xlsx"
    ],
    "document_set_key": "random_uuid",
    "document_set_name": "Q1 2024 CSR Submission",
    "template_id": "template-uuid-abc123",
    "output_name": "CSR_Final_Q1_2024.docx"
}

response = requests.post(url, headers=headers, json=payload)
result = response.json()

if response.status_code == 202:
    task_id = result['task_id']
    print(f"✓ Generation request accepted")
    print(f"  Task ID: {task_id}")
else:
    print(f"✗ Error: {result.get('detail', 'Generation failed')}")
```

**Response (202 Accepted):**

```json theme={null}
{
  "message": "Document generation started",
  "task_id": "celery-task-uuid-456"
}
```

### Generation Request Parameters

| Parameter               | Type   | Required | Description                                    |
| ----------------------- | ------ | -------- | ---------------------------------------------- |
| `document_type`         | string | Yes      | Document type (e.g., 'CSR', 'IND', 'Protocol') |
| `file_paths`            | array  | Yes      | S3 paths to source documents                   |
| `document_set_key`      | string | Yes      | Unique key for this document set               |
| `document_set_name`     | string | Yes      | Human-readable name                            |
| `template_id`           | string | Yes      | Generated template ID from Section 2           |
| `output_name`           | string | Yes      | Output filename                                |
| `selected_section_ids`  | array  | No       | Specific sections to include (optional)        |
| `document_instructions` | string | No       | Additional instructions                        |
| `style_guide_id`        | string | No       | Style guide for formatting                     |

## Section 5: Monitoring Generation Status

Document generation is asynchronous. Poll the status endpoint until complete:

**curl Example:**

```bash theme={null}
curl -X GET "https://api.artosai.com/api/v1/documents/status/celery-task-uuid-456" \
  -H "Authorization: Bearer YOUR_TOKEN"
```

**Python Example:**

```python theme={null}
import requests
import time

url = "https://api.artosai.com/api/v1/documents/status/celery-task-uuid-456"
headers = {"Authorization": "Bearer YOUR_TOKEN"}

# Poll until complete
while True:
    response = requests.get(url, headers=headers)
    result = response.json()
    status = result['status']

    print(f"Status: {status}")

    if status == "Complete":
        print("✓ Document generation complete!")
        break
    elif status == "Failed":
        print(f"✗ Generation failed: {result.get('error', 'Unknown error')}")
        break

    # Wait before next poll
    time.sleep(5)
```

**Response:**

```json theme={null}
{
  "task_id": "celery-task-uuid-456",
  "status": "Generating",
  "progress": null,
  "error": null
}
```

### Status Values

| Status       | Meaning               |
| ------------ | --------------------- |
| `Generating` | Currently processing  |
| `Complete`   | Successfully finished |
| `Failed`     | Encountered an error  |

## Section 6: Retrieving Generated Documents

Once generation completes, retrieve the document details and metadata.

### Get Document Details

Retrieve the completed document:

**curl Example:**

```bash theme={null}
curl -X GET "https://api.artosai.com/api/v1/documents/document-uuid" \
  -H "Authorization: Bearer YOUR_TOKEN"
```

**Python Example:**

```python theme={null}
import requests

url = "https://api.artosai.com/api/v1/documents/document-uuid"
headers = {"Authorization": "Bearer YOUR_TOKEN"}

response = requests.get(url, headers=headers)
result = response.json()

if response.status_code == 200:
    doc = result['document']
    print(f"✓ Document retrieved")
    print(f"  ID: {doc['document_id']}")
    print(f"  Type: {doc['document_type']}")
    print(f"  Status: {doc['status']}")
    print(f"  Output: {doc['output_name']}")
    print(f"  Sections: {len(doc.get('sections', []))}")
else:
    print(f"✗ Error: {result.get('detail', 'Not found')}")
```

**Response:**

```json theme={null}
{
  "document": {
    "document_id": "document-uuid",
    "document_set_id": "set-uuid-123",
    "document_type": "CSR",
    "status": "Complete",
    "output_name": "CSR_Final_Q1_2024.docx",
    "sections": [
      {
        "section_id": "section-1",
        "title": "Executive Summary",
        "content": "..."
      },
      {
        "section_id": "section-2",
        "title": "Methodology",
        "content": "..."
      }
    ],
    "created_at": "2024-01-25T12:00:00Z",
    "updated_at": "2024-01-25T12:30:00Z"
  }
}
```
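
Given the response shape above, a quick sanity check is to list any sections that came back without content. The sketch below assumes only the fields shown in the sample response; `empty_sections` is a hypothetical helper, and whether an empty section indicates a real problem depends on your template.

```python
from typing import Any, Dict, List


def empty_sections(response: Dict[str, Any]) -> List[str]:
    """Return the titles of sections whose content is missing or blank."""
    sections = response.get("document", {}).get("sections", [])
    flagged = []
    for section in sections:
        content = section.get("content") or ""
        if not content.strip():
            # Fall back to the section ID when a title is absent.
            flagged.append(section.get("title") or section.get("section_id", "unknown"))
    return flagged
```

An empty return value means every section in the response carries some text; it says nothing about the quality of that text.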

### Get Section and Rule Details

Retrieve the section and rule details for the generated document:

**curl Example:**

```bash theme={null}
curl -X GET "https://api.artosai.com/api/v1/document-mrt/by-document/document-uuid" \
  -H "Authorization: Bearer YOUR_TOKEN"
```

**Python Example:**

```python theme={null}
import requests

url = "https://api.artosai.com/api/v1/document-mrt/by-document/document-uuid"
headers = {"Authorization": "Bearer YOUR_TOKEN"}

response = requests.get(url, headers=headers)
result = response.json()

if response.status_code == 200:
    mrt = result['outline']
    print(f"✓ Document details retrieved")
    print(f"  MRT ID: {mrt['mrt_id']}")
    print(f"  Sections: {len(mrt['sections'])}")

    for section in mrt['sections']:
        print(f"\n  Section: {section['title']}")
        print(f"    Rules: {len(section.get('rules', []))}")
        for rule in section.get('rules', []):
            print(f"      - {rule['rule_type']}: {rule.get('description', 'N/A')}")
            if 'confidence_score' in rule:
                print(f"        Confidence: {rule['confidence_score']:.2%}")
else:
    print(f"✗ Error: {result.get('detail', 'Not found')}")
```

**Response:**

```json theme={null}
{
  "outline": {
    "mrt_id": "mrt-uuid",
    "document_id": "document-uuid",
    "sections": [
      {
        "order_index": 0,
        "level": 1,
        "section_id": "section-1",
        "title": "Executive Summary",
        "synopsis": "High-level overview",
        "rules": [
          {
            "rule_type": "extraction",
            "rule_mode": "auto",
            "confidence_score": 0.95,
            "description": "Extract key findings",
            "generated_content": "..."
          }
        ]
      }
    ],
    "created_at": "2024-01-25T12:00:00Z",
    "updated_at": "2024-01-25T12:30:00Z"
  }
}
```
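
The `confidence_score` on each rule can be used to decide which sections deserve manual review. The following sketch walks the response shown above and collects rules below a cutoff. The `0.8` threshold is an arbitrary illustrative value, and `low_confidence_rules` is a hypothetical helper.

```python
from typing import Any, Dict, List, Tuple


def low_confidence_rules(
    response: Dict[str, Any], threshold: float = 0.8
) -> List[Tuple[str, str, float]]:
    """Return (section title, rule description, score) for rules below threshold."""
    flagged = []
    for section in response.get("outline", {}).get("sections", []):
        for rule in section.get("rules", []):
            score = rule.get("confidence_score")
            # Rules without a score are skipped rather than treated as zero.
            if score is not None and score < threshold:
                flagged.append(
                    (section.get("title", ""), rule.get("description", ""), score)
                )
    return flagged
```

Sorting the result by score, for example with `sorted(flagged, key=lambda item: item[2])`, puts the least certain rules first.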

## Complete End-to-End Workflow

Here's a complete example showing the entire workflow from template generation to document retrieval.

### Bash Script

```bash theme={null}
#!/bin/bash

# Configuration
TOKEN="your_bearer_token"
API="https://api.artosai.com"
ORG_ID="your_org_id"

# Step 1: Upload Template and Example Documents
echo "=== Step 1: Uploading Documents ==="

curl -s -X POST "$API/api/v1/files/upload" \
  -H "Authorization: Bearer $TOKEN" \
  -F "file_name=csr_template.docx" \
  -F "file_content=@csr_template.docx" \
  -F "container=templates" > /dev/null
echo "✓ Template document uploaded"

curl -s -X POST "$API/api/v1/files/upload" \
  -H "Authorization: Bearer $TOKEN" \
  -F "file_name=example_csr_2024.docx" \
  -F "file_content=@example_csr_2024.docx" \
  -F "container=documents" > /dev/null
echo "✓ Example document 1 uploaded"

curl -s -X POST "$API/api/v1/files/upload" \
  -H "Authorization: Bearer $TOKEN" \
  -F "file_name=example_csr_2023.docx" \
  -F "file_content=@example_csr_2023.docx" \
  -F "container=documents" > /dev/null
echo "✓ Example document 2 uploaded"

# Step 2: Generate Template from Examples
echo ""
echo "=== Step 2: Generating Template from Examples ==="

TEMPLATE=$(curl -s -X POST "$API/api/v1/templates/generate" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "template_filename": "csr_template.docx",
    "example_filenames": ["example_csr_2024.docx", "example_csr_2023.docx"],
    "name": "CSR Template 2024",
    "description": "Auto-generated from example documents",
    "document_type": "CSR"
  }')

TEMPLATE_ID=$(echo "$TEMPLATE" | jq -r '.template_id // .id')
echo "✓ Template generated: $TEMPLATE_ID"

# Step 3: Upload Source Documents for Generation
echo ""
echo "=== Step 3: Uploading Source Documents ==="

curl -s -X POST "$API/api/v1/files/upload" \
  -H "Authorization: Bearer $TOKEN" \
  -F "file_name=protocol.pdf" \
  -F "file_content=@protocol.pdf" \
  -F "container=documents" > /dev/null
echo "✓ Protocol uploaded"

curl -s -X POST "$API/api/v1/files/upload" \
  -H "Authorization: Bearer $TOKEN" \
  -F "file_name=safety_report.pdf" \
  -F "file_content=@safety_report.pdf" \
  -F "container=documents" > /dev/null
echo "✓ Safety report uploaded"

# Step 4: Create Document Set
echo ""
echo "=== Step 4: Creating Document Set ==="
DOCSET=$(curl -s -X POST "$API/api/v1/document-sets/" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"document_set_name": "Q1 2024 CSR"}')

DOCSET_ID=$(echo "$DOCSET" | jq -r '.document_set_id')
echo "✓ Document set created: $DOCSET_ID"

# Step 5: Request Document Generation
echo ""
echo "=== Step 5: Requesting Document Generation ==="
GEN=$(curl -s -X POST "$API/api/v1/documents/generate" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"document_type\": \"CSR\",
    \"file_paths\": [
      \"$ORG_ID/documents/protocol.pdf\",
      \"$ORG_ID/documents/safety_report.pdf\"
    ],
    \"document_set_key\": \"random_uuid\",
    \"document_set_name\": \"Q1 2024 CSR\",
    \"template_id\": \"$TEMPLATE_ID\",
    \"output_name\": \"CSR_Final.docx\"
  }")

TASK_ID=$(echo "$GEN" | jq -r '.task_id')
echo "✓ Generation request accepted"
echo "  Task ID: $TASK_ID"

# Step 6: Poll Status Until Complete
echo ""
echo "=== Step 6: Monitoring Generation Progress ==="
while true; do
  STATUS=$(curl -s -X GET "$API/api/v1/documents/status/$TASK_ID" \
    -H "Authorization: Bearer $TOKEN" | jq -r '.status')

  echo "Status: $STATUS"

  if [ "$STATUS" = "Complete" ]; then
    echo "✓ Generation complete!"
    DOC_ID=$TASK_ID
    break
  elif [ "$STATUS" = "Failed" ]; then
    echo "✗ Generation failed"
    exit 1
  fi

  sleep 5
done

# Step 7: Retrieve Document Details
echo ""
echo "=== Step 7: Retrieving Document Details ==="
DOC=$(curl -s -X GET "$API/api/v1/documents/$DOC_ID" \
  -H "Authorization: Bearer $TOKEN")

echo "✓ Document retrieved"
echo "  Document ID: $(echo "$DOC" | jq -r '.document.document_id')"
echo "  Type: $(echo "$DOC" | jq -r '.document.document_type')"
echo "  Status: $(echo "$DOC" | jq -r '.document.status')"
echo "  Output: $(echo "$DOC" | jq -r '.document.output_name')"

echo ""
echo "=== Workflow Complete ==="
```

### Python Script

```python theme={null}
#!/usr/bin/env python3

import requests
import time
import sys

# Configuration
TOKEN = "your_bearer_token"
API = "https://api.artosai.com"
ORG_ID = "your_org_id"

headers = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/json"
}

def log_step(step, message):
    print(f"\n{'='*50}")
    print(f"Step {step}: {message}")
    print('='*50)

def log_success(message):
    print(f"✓ {message}")

def log_error(message):
    print(f"✗ {message}")
    sys.exit(1)

def upload_file(filename, filepath, container):
    """Upload a file to S3."""
    url = f"{API}/api/v1/files/upload"
    upload_headers = {"Authorization": f"Bearer {TOKEN}"}

    # Open the file in a context manager so the handle is closed after the request
    with open(filepath, "rb") as fh:
        files = {
            "file_name": (None, filename),
            "container": (None, container),
            "file_content": fh
        }
        response = requests.post(url, headers=upload_headers, files=files)
    if response.status_code == 200:
        log_success(f"{filename} uploaded")
    else:
        log_error(f"Failed to upload {filename}: {response.json().get('detail')}")

# Step 1: Upload Documents
log_step(1, "Uploading Documents")

upload_file("csr_template.docx", "csr_template.docx", "templates")
upload_file("example_csr_2024.docx", "example_csr_2024.docx", "documents")
upload_file("example_csr_2023.docx", "example_csr_2023.docx", "documents")

# Step 2: Generate Template from Examples
log_step(2, "Generating Template from Examples")

template_data = {
    "template_filename": "csr_template.docx",
    "example_filenames": ["example_csr_2024.docx", "example_csr_2023.docx"],
    "name": "CSR Template 2024",
    "description": "Auto-generated from example documents",
    "document_type": "CSR",
}

response = requests.post(f"{API}/api/v1/templates/generate", headers=headers, json=template_data)
if response.status_code not in [200, 202]:
    log_error(f"Template generation failed: {response.json().get('detail')}")

result = response.json()
template_id = result.get('template_id') or result.get('id')
log_success(f"Template generated: {template_id}")

# Step 3: Upload Source Documents
log_step(3, "Uploading Source Documents")

upload_file("protocol.pdf", "protocol.pdf", "documents")
upload_file("safety_report.pdf", "safety_report.pdf", "documents")

# Step 4: Create Document Set
log_step(4, "Creating Document Set")

docset_data = {"document_set_name": "Q1 2024 CSR"}
response = requests.post(f"{API}/api/v1/document-sets/", headers=headers, json=docset_data)

if response.status_code != 201:
    log_error(f"Document set creation failed: {response.json().get('detail')}")

docset_id = response.json()['document_set_id']
log_success(f"Document set created: {docset_id}")

# Step 5: Request Document Generation
log_step(5, "Requesting Document Generation")

gen_data = {
    "document_type": "CSR",
    "file_paths": [
        f"{ORG_ID}/documents/protocol.pdf",
        f"{ORG_ID}/documents/safety_report.pdf"
    ],
    "document_set_key": "random_uuid",
    "document_set_name": "Q1 2024 CSR",
    "template_id": template_id,
    "output_name": "CSR_Final.docx"
}

response = requests.post(f"{API}/api/v1/documents/generate", headers=headers, json=gen_data)

if response.status_code != 202:
    log_error(f"Generation request failed: {response.json().get('detail')}")

task_id = response.json()['task_id']
log_success(f"Generation request accepted: {task_id}")

# Step 6: Poll Status Until Complete
log_step(6, "Monitoring Generation Progress")

while True:
    response = requests.get(f"{API}/api/v1/documents/status/{task_id}", headers=headers)
    status = response.json()['status']

    print(f"Status: {status}")

    if status == "Complete":
        log_success("Generation complete!")
        doc_id = task_id
        break
    elif status == "Failed":
        log_error(f"Generation failed: {response.json().get('error')}")

    time.sleep(5)

# Step 7: Retrieve Document Details
log_step(7, "Retrieving Document Details")

response = requests.get(f"{API}/api/v1/documents/{doc_id}", headers=headers)
doc = response.json()['document']

log_success("Document retrieved")
print(f"  Document ID: {doc['document_id']}")
print(f"  Type: {doc['document_type']}")
print(f"  Status: {doc['status']}")
print(f"  Output: {doc['output_name']}")

log_step("Complete", "Workflow Complete")
print(f"Document ID: {doc_id}")
print("Ready for download or further processing")
```

## Troubleshooting

### Common Issues

#### Template Generation Fails

**Error**: `400 Bad Request: File not found`

**Causes**:

* Template or example filenames don't match uploaded files
* Files uploaded to wrong container
* Filename typos

**Solution**:

* Verify filenames match exactly (case-sensitive)
* Upload template to `templates` container
* Upload examples to `documents` container
* Use just the filename, not full path
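
The checks in this list can be automated against a list of filenames you recorded at upload time. The sketch below is a hypothetical helper (`check_filenames` is not part of the API) and it relies on your own record of uploads, since this guide does not cover an endpoint for listing files.

```python
from typing import Dict, Iterable, List


def check_filenames(requested: Iterable[str], uploaded: List[str]) -> Dict[str, str]:
    """Classify each requested filename as ok, path-like, case-mismatched, or missing."""
    uploaded_set = set(uploaded)
    # Lowercased lookup used to detect names that differ only by case.
    by_lower = {name.lower(): name for name in uploaded}
    report = {}
    for name in requested:
        if "/" in name or "\\" in name:
            report[name] = "contains a path; use the bare filename"
        elif name in uploaded_set:
            report[name] = "ok"
        elif name.lower() in by_lower:
            report[name] = f"case mismatch; uploaded as {by_lower[name.lower()]}"
        else:
            report[name] = "missing"
    return report
```

Running this on `template_filename` plus every entry of `example_filenames` before calling the generate endpoint gives a per-file explanation rather than a single `File not found` error.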

#### File Upload Fails

**Error**: `400 Bad Request: File type not supported`

**Causes**:

* Unsupported file extension
* Corrupted file
* MIME type mismatch

**Solution**:

* Use only supported types: PDF, DOCX, XLSX, CSV, RTF
* Verify file is not corrupted
* Check file extension matches actual file type
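
Whether an extension matches the actual file type can be checked locally by comparing the first bytes of the file with well-known format signatures. This is a minimal sketch: the signatures are standard for these formats (DOCX and XLSX are ZIP archives), but `extension_matches_content` is a hypothetical helper and nothing here reflects how the server itself validates uploads.

```python
from pathlib import Path

# Leading bytes that identify each format; CSV is plain text and has none.
SIGNATURES = {
    ".pdf": b"%PDF-",
    ".docx": b"PK\x03\x04",  # DOCX is a ZIP archive
    ".xlsx": b"PK\x03\x04",  # XLSX is a ZIP archive
    ".rtf": b"{\\rtf",
}


def extension_matches_content(filename: str, head: bytes) -> bool:
    """Return True if the first bytes of a file are consistent with its extension."""
    signature = SIGNATURES.get(Path(filename).suffix.lower())
    if signature is None:
        # No signature is known for this extension, so nothing can be checked.
        return True
    return head.startswith(signature)
```

To use it, read a few bytes with `open(path, "rb").read(8)` inside a `with` block and pass them as `head`. A `False` result usually means the file was renamed or saved in a different format than its extension suggests.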

#### Generation Fails with Template

**Error**: `Task failed: Missing required section`

**Causes**:

* Template ID not found
* Source documents don't contain expected content
* Extraction rules unable to find data

**Solution**:

* Verify template ID is correct and generation is complete
* Ensure source documents contain similar content to examples
* Review source documents for required sections
* Try with simpler/more complete source documents

#### Status Endpoint Returns 404

**Error**: `404 Not Found: Task not found`

**Causes**:

* Wrong task ID used
* Task expired (old IDs)
* Task ID typo

**Solution**:

* Copy `task_id` immediately after generation request
* Don't wait more than 24 hours to poll status
* Check for typos in task ID

### Debug Checklist

When troubleshooting generation issues, verify:

* [ ] Bearer token is valid (not expired)
* [ ] All files were uploaded successfully
* [ ] Template filename and example filenames are correct
* [ ] Template generation completed successfully
* [ ] All source files were uploaded (S3 paths are correct)
* [ ] Generation request returned 202 status
* [ ] Task ID is being used correctly for polling
* [ ] Status is polled every 5-10 seconds (not more frequently)
* [ ] Status endpoint returns valid status values

### Performance Considerations

**Typical Processing Times**:

* Template generation from examples: 1-3 minutes
* Document generation (simple): 2-5 minutes
* Document generation (complex): 5-15 minutes

**Factors Affecting Speed**:

* Document size and complexity
* Number of sections in template
* Number of extraction rules
* Available processing resources

**Optimization Tips**:

* Use high-quality example documents
* Start with simple templates (fewer sections)
* Keep extraction rules focused
* Monitor system resources

## Next Steps

Now that you understand the complete workflow, explore:

* **[Template Workflow Concepts](/concepts/mrt-workflow)** - Deeper understanding of template structure
* **[Document Generation Pipeline](/concepts/document-generation)** - How generation works internally
* **[Documents API Reference](/api-reference/documents)** - Document generation endpoints
* **[Async Operations](/concepts/async-operations)** - Background processing details

## Additional Resources

* **API Playground** - Test endpoints interactively at [https://api.artosai.com/docs](https://api.artosai.com/docs)
* **Authentication** - Set up and manage API tokens at [https://artosai.com](https://artosai.com)
* **Support** - Contact support at [internal@artosai.com](mailto:internal@artosai.com)
