Overview

Document generation follows this workflow:
  1. Create a pipeline with agents, documents, and configuration
  2. Execute the pipeline to start document generation
  3. Poll execution status until completion
  4. Access the generated document from the S3 outputs folder

Pipeline Components

Agents

Processing steps that perform specific actions in your document generation workflow. Agents can be chained together, with output from one agent flowing to the next. The orchestration system automatically determines the optimal execution strategy, running agents in parallel when possible to maximize performance.

Documents

Ingested documents that agents will search and use as source material for generation.

Connectors

Batch processing identifiers for handling groups of uploaded documents.

Reference Documents

Styling templates that control the formatting and appearance of generated documents.

Create Pipeline

Endpoint

POST /pipelines

Headers

Authorization: Bearer <access_token>
Content-Type: application/json

Request Body

{
  "name": "RegulatoryFlow",
  "agentIds": ["agent-001", "agent-002", "agent-003"],
  "documents": ["document-123", "document-456"],
  "connectorId": "conn-456",
  "referenceDocument": "style-template-001",
  "outputFileName": "regulatory-report.docx"
}

Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | String | Yes | Human-readable pipeline name |
| agentIds | Array[String] | Yes | List of agent IDs to execute |
| documents | Array[String] | Yes | List of ingested document IDs to use as source |
| connectorId | String | Yes | Batch ID for document processing |
| referenceDocument | String | No | Reference document ID for styling |
| outputFileName | String | Yes | Name for generated document file |
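
The parameter table above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the API: the function name `validate_pipeline_config` and the wording of its messages are hypothetical, and the server remains the authority on what it accepts.

```python
from typing import Any, Dict, List

# Required fields and their expected Python types, taken from the parameter table.
REQUIRED_FIELDS = {
    "name": str,
    "agentIds": list,
    "documents": list,
    "connectorId": str,
    "outputFileName": str,
}
# Optional fields from the same table.
OPTIONAL_FIELDS = {"referenceDocument": str}


def validate_pipeline_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the config looks valid.

    Hypothetical client-side helper, not an API call.
    """
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in config:
            problems.append(f"missing required field: {field}")
        elif not isinstance(config[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    for field, expected in OPTIONAL_FIELDS.items():
        if field in config and not isinstance(config[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    # Array[String] fields must contain only strings.
    for field in ("agentIds", "documents"):
        value = config.get(field)
        if isinstance(value, list) and not all(isinstance(v, str) for v in value):
            problems.append(f"{field} must contain only strings")
    return problems


config = {
    "name": "RegulatoryFlow",
    "agentIds": ["agent-001", "agent-002", "agent-003"],
    "documents": ["document-123", "document-456"],
    "connectorId": "conn-456",
    "outputFileName": "regulatory-report.docx",
}
print(validate_pipeline_config(config))
```

Running the check locally catches missing fields early, but a config that passes can still be rejected by the server, for example with INVALID_AGENT_ID.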

Example Request

curl -X POST "https://api.artosai.com/pipelines" \
  -H "Authorization: Bearer your-access-token" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Clinical Trial Report Generator",
    "agentIds": ["extract-efficacy", "analyze-safety", "format-report"],
    "documents": ["protocol-001", "results-data-002"],
    "connectorId": "batch-clinical-001",
    "referenceDocument": "fda-report-template",
    "outputFileName": "phase3-efficacy-report.docx"
  }'

Success Response (201 Created)

{
  "pipelineId": "pipeline-c78d9e12",
  "name": "Clinical Trial Report Generator",
  "status": "created",
  "created_at": "2024-01-15T10:30:00Z",
  "created_by": "user-abc123"
}

Execute Pipeline

Endpoint

POST /pipelines/{pipelineId}/execute

Path Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| pipelineId | String | Yes | Pipeline ID from creation response |

Headers

Authorization: Bearer <access_token>
Content-Type: application/json

Example Request

curl -X POST "https://api.artosai.com/pipelines/pipeline-c78d9e12/execute" \
  -H "Authorization: Bearer your-access-token" \
  -H "Content-Type: application/json"

Success Response (200 OK)

{
  "jobId": "job-exec-def456ghi789",
  "pipelineId": "pipeline-c78d9e12",
  "status": "pending",
  "started_at": "2024-01-15T10:35:00Z",
  "estimated_completion": "2024-01-15T10:45:00Z"
}

Parallel Execution

The orchestration system executes document pipelines in parallel to maximize performance and reduce processing time. When agents don’t have dependencies on each other’s outputs, they run concurrently across available compute resources. The system automatically:
  • Analyzes agent dependencies to determine which can run in parallel
  • Distributes workload across available computational resources
  • Optimizes execution order to minimize total processing time
  • Manages resource allocation to prevent conflicts between parallel agents

Parallel execution shortens total processing time most in complex workflows that contain several independent processing steps.
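
The dependency analysis described above can be pictured as grouping agents into stages, where every agent in a stage has all of its inputs ready. The sketch below is illustrative only: the documentation does not say how the orchestration system represents dependencies, so the `dependencies` mapping and the `plan_stages` function are assumptions, not the service's implementation.

```python
from typing import Dict, List, Set


def plan_stages(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group agents into stages; agents in the same stage can run concurrently.

    `dependencies` maps each agent ID to the set of agent IDs whose output it
    needs. This mapping is a hypothetical representation for illustration.
    """
    known = set(dependencies)
    for agent, deps in dependencies.items():
        unknown = set(deps) - known
        if unknown:
            raise ValueError(f"{agent} depends on unknown agents: {sorted(unknown)}")

    remaining = {agent: set(deps) for agent, deps in dependencies.items()}
    stages = []
    while remaining:
        # Agents with no unmet dependencies are ready to run in parallel.
        ready = sorted(agent for agent, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle detected")
        stages.append(ready)
        for agent in ready:
            del remaining[agent]
        # Mark the finished agents as satisfied for everything still waiting.
        for deps in remaining.values():
            deps.difference_update(ready)
    return stages


stages = plan_stages({
    "extract-efficacy": set(),
    "analyze-safety": set(),
    "format-report": {"extract-efficacy", "analyze-safety"},
})
print(stages)  # [['analyze-safety', 'extract-efficacy'], ['format-report']]
```

In this example the two independent agents form the first stage and the formatting agent waits for both, which is the kind of ordering the list above describes.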

Poll Execution Status

Endpoint

POST /pipelines/job/{jobId}

Path Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| jobId | String | Yes | Job ID from execution response |

Headers

Authorization: Bearer <access_token>
Content-Type: application/json

Example Request

curl -X POST "https://api.artosai.com/pipelines/job/job-exec-def456ghi789" \
  -H "Authorization: Bearer your-access-token" \
  -H "Content-Type: application/json"

Response Format

Execution Pending/Processing (200 OK)

{
  "jobId": "job-exec-def456ghi789",
  "pipelineId": "pipeline-c78d9e12",
  "status": "processing",
  "current_agent": 2,
  "total_agents": 3,
  "progress": 67,
  "started_at": "2024-01-15T10:35:00Z",
  "updated_at": "2024-01-15T10:41:30Z",
  "estimated_completion": "2024-01-15T10:45:00Z",
  "output_filename": "phase3-efficacy-report.docx"
}

Execution Completed (200 OK)

{
  "jobId": "job-exec-def456ghi789",
  "pipelineId": "pipeline-c78d9e12",
  "status": "completed",
  "progress": 100,
  "started_at": "2024-01-15T10:35:00Z",
  "completed_at": "2024-01-15T10:43:22Z",
  "processing_time_seconds": 502,
  "output_filename": "phase3-efficacy-report.docx",
  "output_url": "s3://your-bucket/outputs/phase3-efficacy-report.docx",
  "agents_executed": 3,
  "total_pages_generated": 24,
  "word_count": 8947
}
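
The `output_url` in the completed response uses the `s3://bucket/key` form. Most S3 clients take the bucket and the object key as separate arguments, so a small helper to split the URL can be useful. This is a sketch using only the Python standard library; the helper name `split_s3_url` is hypothetical, and the download itself (with whichever S3 client you use) is left out.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_url(output_url: str) -> Tuple[str, str]:
    """Split an s3:// URL into (bucket, key). Hypothetical helper for illustration."""
    parsed = urlparse(output_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {output_url}")
    # The path starts with "/", which is not part of the object key.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_url("s3://your-bucket/outputs/phase3-efficacy-report.docx")
print(bucket)  # your-bucket
print(key)     # outputs/phase3-efficacy-report.docx
```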

Execution Failed (200 OK)

{
  "jobId": "job-exec-def456ghi789",
  "pipelineId": "pipeline-c78d9e12",
  "status": "failed",
  "progress": 45,
  "started_at": "2024-01-15T10:35:00Z",
  "failed_at": "2024-01-15T10:38:15Z",
  "error": {
    "error_code": "AGENT_EXECUTION_FAILED",
    "message": "Agent failed after maximum retry attempts",
    "details": "Agent 'analyze-safety' exceeded retry limit"
  },
  "output_filename": "phase3-efficacy-report.docx",
  "agents_completed": 1,
  "agents_failed": 1,
  "retry_count": 3,
  "max_retries": 3
}

Status Values

| Status | Description |
| --- | --- |
| pending | Pipeline queued for execution |
| processing | Agents are executing (may be in parallel) |
| completed | All agents executed successfully |
| failed | Pipeline failed after retries |

Pipeline Management

List All Pipelines

Endpoint
GET /pipelines
Query Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| page | Integer | No | Page number (default: 1) |
| limit | Integer | No | Results per page (default: 20, max: 100) |
| status | String | No | Filter by pipeline status |
| name | String | No | Filter by pipeline name (partial match) |
Headers
Authorization: Bearer <access_token>
Content-Type: application/json
Example Request
curl -X GET "https://api.artosai.com/pipelines?page=1&limit=10" \
  -H "Authorization: Bearer your-access-token"
Success Response (200 OK)
[
  {
    "pipelineId": "pipeline-c78d9e12",
    "name": "Clinical Trial Report Generator", 
    "status": "created",
    "created_at": "2024-01-15T10:30:00Z",
    "last_executed": "2024-01-15T10:35:00Z",
    "agents_count": 3,
    "documents_count": 2,
    "executions_count": 1,
    "created_by": "user-abc123"
  },
  {
    "pipelineId": "pipeline-preset-001",
    "name": "FDA Regulatory Submission",
    "status": "preset",
    "created_at": "2024-01-10T09:00:00Z",
    "last_executed": null,
    "agents_count": 5,
    "documents_count": 0,
    "executions_count": 0,
    "created_by": "system"
  }
]
Permission-Based Access
  • Admin users: See all pipelines in organization
  • Regular users: Only see pipelines they created
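
The query parameters for listing pipelines can be assembled and range-checked before the request is made. The sketch below uses only the standard library; `build_list_url` is a hypothetical helper, and the lower bounds it enforces (`page >= 1`, `limit >= 1`) are assumptions, since the table only documents the defaults and the maximum limit of 100.

```python
from typing import Dict, Optional, Union
from urllib.parse import urlencode


def build_list_url(base_url: str, page: int = 1, limit: int = 20,
                   status: Optional[str] = None,
                   name: Optional[str] = None) -> str:
    """Build the GET /pipelines URL with the documented query parameters."""
    if page < 1:
        raise ValueError("page must be >= 1")  # assumed lower bound
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")  # max is documented
    params: Dict[str, Union[int, str]] = {"page": page, "limit": limit}
    # Optional filters are only sent when provided.
    if status is not None:
        params["status"] = status
    if name is not None:
        params["name"] = name
    return f"{base_url.rstrip('/')}/pipelines?{urlencode(params)}"


print(build_list_url("https://api.artosai.com", page=1, limit=10))
# https://api.artosai.com/pipelines?page=1&limit=10
```

`urlencode` also takes care of escaping, so a partial-match filter such as `name="Clinical Trial"` is sent as `name=Clinical+Trial`.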

Get Specific Pipeline

Endpoint
GET /pipelines/{pipelineId}
Path Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| pipelineId | String | Yes | Pipeline ID |
Example Request
curl -X GET "https://api.artosai.com/pipelines/pipeline-c78d9e12" \
  -H "Authorization: Bearer your-access-token"
Success Response (200 OK)
{
  "pipelineId": "pipeline-c78d9e12",
  "name": "Clinical Trial Report Generator",
  "status": "created",
  "agentIds": ["extract-efficacy", "analyze-safety", "format-report"],
  "documents": ["protocol-001", "results-data-002"],
  "connectorId": "batch-clinical-001",
  "referenceDocument": "fda-report-template",
  "outputFileName": "phase3-efficacy-report.docx",
  "created_at": "2024-01-15T10:30:00Z",
  "created_by": "user-abc123",
  "last_executed": "2024-01-15T10:35:00Z",
  "executions_count": 1
}

Update Pipeline

Endpoint
PUT /pipelines/{pipelineId}
Path Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| pipelineId | String | Yes | Pipeline ID |
Request Body
Same format as the Create Pipeline request.
Example Request
curl -X PUT "https://api.artosai.com/pipelines/pipeline-c78d9e12" \
  -H "Authorization: Bearer your-access-token" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Updated Clinical Trial Report Generator",
    "agentIds": ["extract-efficacy", "analyze-safety", "format-report", "quality-check"],
    "documents": ["protocol-001", "results-data-002", "adverse-events-003"],
    "connectorId": "batch-clinical-001",
    "referenceDocument": "fda-report-template-v2",
    "outputFileName": "phase3-comprehensive-report.docx"
  }'
Success Response (200 OK)
{
  "pipelineId": "pipeline-c78d9e12",
  "name": "Updated Clinical Trial Report Generator",
  "status": "updated",
  "updated_at": "2024-01-15T11:30:00Z",
  "version": 2
}

Clone Pipeline

Endpoint
POST /pipelines/{pipelineId}/clone
Path Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| pipelineId | String | Yes | Pipeline ID to clone |
Request Body
{
  "name": "Cloned Pipeline Name"
}
Example Request
curl -X POST "https://api.artosai.com/pipelines/pipeline-preset-001/clone" \
  -H "Authorization: Bearer your-access-token" \
  -H "Content-Type: application/json" \
  -d '{"name": "My Custom FDA Report Generator"}'
Success Response (201 Created)
{
  "pipelineId": "pipeline-clone-abc123",
  "name": "My Custom FDA Report Generator",
  "cloned_from": "pipeline-preset-001",
  "status": "created",
  "created_at": "2024-01-15T11:45:00Z"
}

Delete Pipeline

Endpoint
DELETE /pipelines/{pipelineId}
Path Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| pipelineId | String | Yes | Pipeline ID |
Example Request
curl -X DELETE "https://api.artosai.com/pipelines/pipeline-c78d9e12" \
  -H "Authorization: Bearer your-access-token"
Success Response (200 OK)
{
  "message": "Pipeline successfully deleted",
  "pipelineId": "pipeline-c78d9e12",
  "deleted_at": "2024-01-15T12:00:00Z"
}

Reference Document Upload

Styling templates are uploaded as reference documents, separately from regular document ingestion.

Upload Reference Document

Endpoint
POST /ingest/reference-document
Headers
Authorization: Bearer <access_token>
Content-Type: multipart/form-data
Request Body (Multipart Form Data)
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| file | File | Yes | Reference document file (.docx format) |
| name | String | Yes | Reference document identifier |
| description | String | No | Description of styling template |
Example Request
curl -X POST "https://api.artosai.com/ingest/reference-document" \
  -H "Authorization: Bearer your-access-token" \
  -F "file=@fda-report-template.docx" \
  -F "name=fda-report-template" \
  -F "description=Standard FDA regulatory report formatting"
Success Response (200 OK)
{
  "referenceDocumentId": "ref-doc-fda-001",
  "name": "fda-report-template",
  "filename": "fda-report-template.docx",
  "upload_date": "2024-01-15T12:30:00Z",
  "file_size": 1048576,
  "s3_url": "s3://your-bucket/reference-docs/ref-doc-fda-001/fda-report-template.docx"
}

List Reference Documents

Endpoint
GET /reference-documents
Example Response
[
  {
    "referenceDocumentId": "ref-doc-fda-001",
    "name": "fda-report-template",
    "description": "Standard FDA regulatory report formatting",
    "filename": "fda-report-template.docx",
    "upload_date": "2024-01-15T12:30:00Z",
    "file_size": 1048576
  }
]

Agent Retry Logic

When an agent fails during pipeline execution:
  1. First Failure: Agent retries immediately
  2. Second Failure: Agent retries after 30 seconds
  3. Third Failure: Agent retries after 60 seconds
  4. Final Failure: Pipeline continues with remaining agents

The pipeline continues even if individual agents fail, which allows partial completion where possible. During parallel execution, a failed agent does not block other agents from continuing their work.
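
The retry schedule above (an immediate retry, then retries after 30 and 60 seconds) can be sketched as a small wrapper. This only illustrates the documented timing and is not the service's implementation. The `run_with_retries` function, its return shape, and the injectable `sleep` parameter are hypothetical names introduced here.

```python
import time
from typing import Any, Callable, Dict, Sequence

# Delay in seconds before each retry, following the schedule above:
# immediate, then 30 seconds, then 60 seconds.
RETRY_DELAYS = (0, 30, 60)


def run_with_retries(task: Callable[[], Any],
                     delays: Sequence[float] = RETRY_DELAYS,
                     sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Run `task` once, then retry after each delay until it succeeds.

    Returns a result record instead of raising, so a caller can carry on
    with other work after the final failure.
    """
    attempts = 0
    last_error = None
    # None marks the initial attempt; each later entry is the wait before a retry.
    for delay in (None, *delays):
        if delay:
            sleep(delay)
        attempts += 1
        try:
            return {"ok": True, "result": task(), "attempts": attempts}
        except Exception as exc:
            last_error = exc
    return {"ok": False, "error": str(last_error), "attempts": attempts}


# Usage: a task that succeeds on its third attempt, with sleeping stubbed out.
calls = {"count": 0}

def flaky_agent() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "done"

print(run_with_retries(flaky_agent, sleep=lambda seconds: None))
# {'ok': True, 'result': 'done', 'attempts': 3}
```

With three retries the task is attempted at most four times, which matches `max_retries: 3` in the failed-execution response shown earlier.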

Error Handling

Common Error Codes

| Error Code | HTTP Status | Description |
| --- | --- | --- |
| PIPELINE_NOT_FOUND | 404 | Pipeline does not exist |
| INVALID_AGENT_ID | 400 | One or more agent IDs are invalid |
| INVALID_DOCUMENT_ID | 400 | One or more document IDs are invalid |
| MISSING_REFERENCE_DOCUMENT | 400 | Reference document not found |
| AGENT_EXECUTION_FAILED | 500 | Agent failed after all retries |
| PIPELINE_EXECUTION_TIMEOUT | 408 | Pipeline execution exceeded time limit |
| INSUFFICIENT_PERMISSIONS | 403 | User lacks permission for operation |

Error Response Format

{
  "error_code": "INVALID_AGENT_ID",
  "message": "One or more agent IDs are not valid",
  "details": {
    "invalid_agents": ["nonexistent-agent"],
    "valid_agents": ["extract-efficacy", "analyze-safety", "format-report"]
  },
  "request_id": "req-abc123def456"
}
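
An error body in the format above can be turned into a typed exception so that callers can branch on `error_code` rather than parsing message strings. The following is a minimal sketch: the `PipelineApiError` class, the `parse_error` function, and the `"UNKNOWN"` fallback code are hypothetical names, not part of the API.

```python
import json
from typing import Any, Optional


class PipelineApiError(Exception):
    """Hypothetical exception type carrying the fields of an error response."""

    def __init__(self, http_status: int, error_code: str, message: str,
                 details: Optional[Any] = None,
                 request_id: Optional[str] = None) -> None:
        super().__init__(f"{error_code} (HTTP {http_status}): {message}")
        self.http_status = http_status
        self.error_code = error_code
        self.message = message
        self.details = details
        self.request_id = request_id


def parse_error(http_status: int, body: str) -> PipelineApiError:
    """Build a PipelineApiError from a response body, tolerating non-JSON bodies."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    return PipelineApiError(
        http_status=http_status,
        # "UNKNOWN" is a placeholder chosen for this sketch, not an API error code.
        error_code=payload.get("error_code", "UNKNOWN"),
        message=payload.get("message", body),
        details=payload.get("details"),
        request_id=payload.get("request_id"),
    )


body = json.dumps({
    "error_code": "INVALID_AGENT_ID",
    "message": "One or more agent IDs are not valid",
    "details": {"invalid_agents": ["nonexistent-agent"]},
    "request_id": "req-abc123def456",
})
error = parse_error(400, body)
print(error)  # INVALID_AGENT_ID (HTTP 400): One or more agent IDs are not valid
```

Keeping `request_id` on the exception makes it easy to include in logs or support requests.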

Best Practices

Pipeline Design

  1. Design agents for parallel execution - create independent agents when possible
  2. Use appropriate document sets - include all necessary source material
  3. Test with small document sets before scaling up
  4. Choose descriptive names for easy management

Execution Management

// Example: Robust pipeline execution with proper error handling
class PipelineExecutor {
    constructor(baseUrl, accessToken) {
        this.baseUrl = baseUrl;
        this.accessToken = accessToken;
        this.headers = {
            'Authorization': `Bearer ${accessToken}`,
            'Content-Type': 'application/json'
        };
    }
    
    async createAndExecutePipeline(config) {
        try {
            // Create pipeline
            const createResponse = await fetch(`${this.baseUrl}/pipelines`, {
                method: 'POST',
                headers: this.headers,
                body: JSON.stringify(config)
            });
            
            if (!createResponse.ok) {
                throw new Error(`Pipeline creation failed: ${createResponse.statusText}`);
            }
            
            const { pipelineId } = await createResponse.json();
            
            // Execute pipeline
            const executeResponse = await fetch(`${this.baseUrl}/pipelines/${pipelineId}/execute`, {
                method: 'POST',
                headers: this.headers
            });
            
            if (!executeResponse.ok) {
                throw new Error(`Pipeline execution failed: ${executeResponse.statusText}`);
            }
            
            const { jobId } = await executeResponse.json();
            
            // Poll for completion
            return await this.pollExecution(jobId);
            
        } catch (error) {
            console.error('Pipeline execution error:', error);
            throw error;
        }
    }
    
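    async pollExecution(jobId, maxAttempts = 60) {
        const pollInterval = 30000; // 30 seconds between status checks
        const retryDelay = 5000;    // shorter wait after a transient request error

        for (let attempt = 1; attempt <= maxAttempts; attempt++) {
            let jobStatus;
            try {
                const response = await fetch(`${this.baseUrl}/pipelines/job/${jobId}`, {
                    method: 'POST',
                    headers: this.headers
                });
                if (!response.ok) {
                    throw new Error(`Status request failed: ${response.statusText}`);
                }
                jobStatus = await response.json();
            } catch (error) {
                // Transient network or HTTP error: the attempt still counts, then retry
                console.error(`Polling attempt ${attempt} failed:`, error);
                if (attempt >= maxAttempts) throw error;
                await new Promise(resolve => setTimeout(resolve, retryDelay));
                continue;
            }

            // Terminal states are handled outside the try block so that a failed
            // pipeline is reported immediately instead of being polled again
            if (jobStatus.status === 'completed') {
                return jobStatus;
            }
            if (jobStatus.status === 'failed') {
                const reason = jobStatus.error ? jobStatus.error.message : 'unknown error';
                throw new Error(`Pipeline failed: ${reason}`);
            }

            console.log(`Pipeline progress: ${jobStatus.progress}%`);
            await new Promise(resolve => setTimeout(resolve, pollInterval));
        }

        throw new Error('Pipeline execution timeout');
    }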
}

// Usage
const executor = new PipelineExecutor("https://api.artosai.com", "your-token");

const pipelineConfig = {
    name: "Clinical Report Generator",
    agentIds: ["extract-data", "analyze-results", "format-report"],
    documents: ["protocol-001", "results-002"],
    connectorId: "batch-001",
    referenceDocument: "fda-template",
    outputFileName: "clinical-report.docx"
};

executor.createAndExecutePipeline(pipelineConfig)
    .then(result => {
        console.log('Pipeline completed:', result.output_url);
    })
    .catch(error => {
        console.error('Pipeline failed:', error);
    });

Performance Optimization

  • Leverage parallel execution by designing independent agents
  • Reuse pipelines instead of creating new ones for similar tasks
  • Limit document scope to only necessary source materials
  • Monitor execution times to optimize agent selection
  • Use reference documents to maintain consistent formatting

Integration Example

Complete Document Generation Workflow

import requests
import time
from typing import Dict

class DocumentGenerationClient:
    def __init__(self, base_url: str, access_token: str):
        self.base_url = base_url
        self.access_token = access_token
        self.headers = {
            'Authorization': f'Bearer {access_token}',
            'Content-Type': 'application/json'
        }
    
    def create_pipeline(self, config: Dict) -> str:
        """Create a new document generation pipeline"""
        response = requests.post(
            f"{self.base_url}/pipelines",
            headers=self.headers,
            json=config
        )
        response.raise_for_status()
        return response.json()['pipelineId']
    
    def execute_pipeline(self, pipeline_id: str) -> str:
        """Execute a pipeline and return job ID"""
        response = requests.post(
            f"{self.base_url}/pipelines/{pipeline_id}/execute",
            headers=self.headers
        )
        response.raise_for_status()
        return response.json()['jobId']
    
    def poll_execution(self, job_id: str, timeout: int = 1800) -> Dict:
        """Poll pipeline execution until completion"""
        start_time = time.time()
        
        while time.time() - start_time < timeout:
            response = requests.post(
                f"{self.base_url}/pipelines/job/{job_id}",
                headers=self.headers
            )
            response.raise_for_status()
            
            job_status = response.json()
            
            if job_status['status'] == 'completed':
                return job_status
            elif job_status['status'] == 'failed':
                raise Exception(f"Pipeline failed: {job_status['error']['message']}")
            
            print(f"Progress: {job_status['progress']}%")
            time.sleep(30)  # Poll every 30 seconds
        
        raise TimeoutError("Pipeline execution timed out")
    
    def generate_document(self, pipeline_config: Dict) -> str:
        """Complete document generation workflow"""
        try:
            # Create pipeline
            pipeline_id = self.create_pipeline(pipeline_config)
            print(f"Created pipeline: {pipeline_id}")
            
            # Execute pipeline
            job_id = self.execute_pipeline(pipeline_id)
            print(f"Started execution: {job_id}")
            
            # Wait for completion
            result = self.poll_execution(job_id)
            print(f"Generation completed: {result['output_url']}")
            
            return result['output_url']
            
        except Exception as e:
            print(f"Document generation failed: {str(e)}")
            raise

# Usage example
client = DocumentGenerationClient("https://api.artosai.com", "your-access-token")

# Configure pipeline for regulatory report
config = {
    "name": "Monthly Safety Report",
    "agentIds": ["extract-adverse-events", "analyze-trends", "format-regulatory-report"],
    "documents": ["safety-data-001", "clinical-results-002"],
    "connectorId": "safety-batch-001",
    "referenceDocument": "fda-safety-template",
    "outputFileName": "monthly-safety-report-jan-2024.docx"
}

# Generate document
output_url = client.generate_document(config)
print(f"Generated document available at: {output_url}")