Template Workflow

Templates are structured frameworks for generating regulatory and medical documents with consistent format, content, and compliance requirements.

What is a Template?

A template is a hierarchical structure that defines:
  • Document structure - How sections are organized
  • Content requirements - What information each section needs
  • Extraction rules - How to pull data from source documents
  • Formatting rules - Style and formatting requirements
  • Validation rules - Compliance and data validation
Templates enable organizations to generate consistent, high-quality documents by:
  • Eliminating manual document assembly
  • Ensuring regulatory compliance
  • Maintaining brand consistency
  • Accelerating document production

Template Components

1. Template

The top-level template definition that specifies:
  • Document type (CSR, IND, Protocol, etc.)
  • Overall structure and sections
  • Default extraction and formatting rules
{
  "template_id": "template-uuid",
  "template_name": "CSR Template 2024",
  "document_type": "CSR",
  "sections": [...]
}

2. Sections

Hierarchical sections that organize document content with nesting levels:
{
  "section_id": "section-1",
  "order_index": 0,
  "level": 1,
  "title": "Executive Summary",
  "synopsis": "High-level overview",
  "subsections": [
    {
      "section_id": "section-1-1",
      "order_index": 0,
      "level": 2,
      "title": "Key Findings",
      "subsections": [...]
    }
  ]
}
Section Levels:
  • Level 1: Top-level sections (chapters)
  • Level 2+: Nested subsections (subcategories)
  • Max Nesting: Unlimited levels supported
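
The nesting described above can be walked recursively. The following is a minimal Python sketch, assuming sections are plain dicts shaped like the JSON example. The helper name walk_sections is hypothetical and not part of the product API, and the check that a child's level equals its parent's level plus one is inferred from the example rather than stated by the documentation.

```python
from typing import Iterator


def walk_sections(sections: list[dict], expected_level: int = 1) -> Iterator[tuple[int, str]]:
    """Yield (level, title) for every section, depth first."""
    for section in sections:
        # Assumption: a nested section sits exactly one level below its parent.
        if section["level"] != expected_level:
            raise ValueError(
                f"{section['section_id']}: level {section['level']}, "
                f"expected {expected_level}"
            )
        yield section["level"], section["title"]
        # Recurse into children; a missing "subsections" key means no children.
        yield from walk_sections(section.get("subsections", []), expected_level + 1)


template_sections = [
    {
        "section_id": "section-1", "order_index": 0, "level": 1,
        "title": "Executive Summary",
        "subsections": [
            {"section_id": "section-1-1", "order_index": 0, "level": 2,
             "title": "Key Findings", "subsections": []},
        ],
    },
]

for level, title in walk_sections(template_sections):
    print("  " * (level - 1) + title)
```

Because the walk is recursive, it places no fixed limit on depth beyond Python's own recursion limit.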

3. Extraction Rules

Rules that specify how to extract content from source documents:
{
  "rule_id": "rule-1",
  "rule_type": "extraction",
  "rule_mode": "auto",
  "source_section": "Safety Analysis",
  "description": "Extract adverse events data",
  "confidence_threshold": 0.85,
  "data_type": "structured_table"
}
Rule Types:
  • extraction - Pull specific data
  • summary - Create condensed summaries
  • synthesis - Combine multiple sources
  • validation - Check compliance
  • custom - User-defined processing
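
A client can catch malformed rules before sending them. This is a sketch only: the Rule class is hypothetical, its field names mirror the JSON example above, and the 0 to 1 range for confidence_threshold is an assumption based on the sample value 0.85.

```python
from dataclasses import dataclass

# The five rule types listed above.
RULE_TYPES = {"extraction", "summary", "synthesis", "validation", "custom"}


@dataclass
class Rule:
    rule_id: str
    rule_type: str
    rule_mode: str = "auto"
    source_section: str = ""
    description: str = ""
    confidence_threshold: float = 0.0
    data_type: str = ""

    def __post_init__(self) -> None:
        if self.rule_type not in RULE_TYPES:
            raise ValueError(f"unknown rule_type: {self.rule_type!r}")
        # Assumption: thresholds are fractions between 0 and 1.
        if not 0.0 <= self.confidence_threshold <= 1.0:
            raise ValueError("confidence_threshold must be between 0 and 1")


rule = Rule(**{
    "rule_id": "rule-1",
    "rule_type": "extraction",
    "rule_mode": "auto",
    "source_section": "Safety Analysis",
    "description": "Extract adverse events data",
    "confidence_threshold": 0.85,
    "data_type": "structured_table",
})
```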

4. Outlines

Document-specific instances of templates that include:
  • Extracted content for a specific document
  • Section ordering and customization
  • Metadata about content sources
{
  "outline_id": "outline-uuid",
  "parent_template_id": "template-uuid",
  "document_type": "CSR",
  "sections": [
    {
      "section_id": "section-1",
      "title": "Executive Summary",
      "extracted_content": "Content extracted from source docs",
      "sources": ["protocol.pdf", "analysis.docx"]
    }
  ]
}
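
The sources list on each outline section makes it possible to trace content back to its inputs. As an illustration, the hypothetical helper below inverts that mapping for an outline held as a plain dict. It only looks at top-level sections, since the example above does not show nested outline sections.

```python
def collect_sources(outline: dict) -> dict[str, list[str]]:
    """Map each source file to the titles of the sections that cite it."""
    index: dict[str, list[str]] = {}
    for section in outline.get("sections", []):
        for source in section.get("sources", []):
            index.setdefault(source, []).append(section["title"])
    return index


outline = {
    "outline_id": "outline-uuid",
    "parent_template_id": "template-uuid",
    "document_type": "CSR",
    "sections": [
        {
            "section_id": "section-1",
            "title": "Executive Summary",
            "extracted_content": "Content extracted from source docs",
            "sources": ["protocol.pdf", "analysis.docx"],
        }
    ],
}

sources_index = collect_sources(outline)
```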

Typical Workflow

1. Create Template

Create a template that defines document structure:
POST /api/v1/templates/
{
  "template_name": "CSR Template",
  "document_type": "CSR",
  "sections": [
    {
      "order_index": 0,
      "level": 1,
      "section_name": "Executive Summary"
    },
    {
      "order_index": 1,
      "level": 1,
      "section_name": "Methodology"
    }
  ]
}
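
From Python, the same request could be assembled with the standard library. This is a sketch under assumptions: the host in BASE_URL is a placeholder, authentication is not shown because this page does not describe it, and the request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not a real endpoint


def build_create_template_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a template."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/templates/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "template_name": "CSR Template",
    "document_type": "CSR",
    "sections": [
        {"order_index": 0, "level": 1, "section_name": "Executive Summary"},
        {"order_index": 1, "level": 1, "section_name": "Methodology"},
    ],
}

request = build_create_template_request(payload)
# To send it: urllib.request.urlopen(request)
```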

2. Define Extraction Rules

Add extraction rules to template sections to specify how each section's content is obtained:
PUT /api/v1/templates/{template_id}
{
  "sections": [
    {
      "section_id": "section-1",
      "title": "Executive Summary",
      "rules": [
        {
          "rule_type": "extraction",
          "description": "Extract key findings",
          "source_field": "findings_section"
        }
      ]
    }
  ]
}
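
When the sections are held locally before the PUT, a small helper can attach a rule to the right section. The function add_rule below is hypothetical and only searches top-level sections. It returns a copy so the original structure stays unchanged.

```python
import copy


def add_rule(sections: list[dict], section_id: str, rule: dict) -> list[dict]:
    """Return a copy of `sections` with `rule` appended to the matching section."""
    updated = copy.deepcopy(sections)
    for section in updated:
        if section.get("section_id") == section_id:
            section.setdefault("rules", []).append(rule)
            return updated
    raise KeyError(f"no section with id {section_id!r}")


sections = [{"section_id": "section-1", "title": "Executive Summary"}]
put_body = {
    "sections": add_rule(
        sections,
        "section-1",
        {
            "rule_type": "extraction",
            "description": "Extract key findings",
            "source_field": "findings_section",
        },
    )
}
```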

3. Create Outline from Template

Generate a document-specific outline based on the template:
# During document generation, the system:
# 1. Reads the template structure
# 2. Creates a new outline with the same sections
# 3. Extracts content using defined rules
# 4. Populates the outline with extracted content

POST /api/v1/documents/generate
{
  "template_id": "template-uuid",
  "file_paths": ["protocol.pdf", "data.xlsx"],
  ...
}
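
The four steps listed in the comments can be pictured as the loop below. This is an illustrative sketch, not the service's implementation: build_outline and fake_extract are hypothetical, extraction is stubbed out, and every input file is recorded as a source for every section, which is a simplification.

```python
from typing import Callable

# An extractor takes a rule and the input file paths and returns text.
Extractor = Callable[[dict, list[str]], str]


def build_outline(template: dict, file_paths: list[str], extract: Extractor) -> dict:
    outline_sections = []
    for section in template["sections"]:              # 1. read the template structure
        parts = [
            extract(rule, file_paths)                 # 3. extract using the defined rules
            for rule in section.get("rules", [])
        ]
        outline_sections.append({                     # 2. mirror the section in the outline
            "section_id": section["section_id"],
            "title": section["title"],
            "extracted_content": "\n".join(parts),    # 4. populate with extracted content
            "sources": list(file_paths),
        })
    return {
        "parent_template_id": template["template_id"],
        "document_type": template["document_type"],
        "sections": outline_sections,
    }


def fake_extract(rule: dict, file_paths: list[str]) -> str:
    """Stand-in extractor that just echoes the rule description."""
    return f"[{rule['description']}]"


template = {
    "template_id": "template-uuid",
    "document_type": "CSR",
    "sections": [
        {"section_id": "section-1", "title": "Executive Summary",
         "rules": [{"rule_type": "extraction", "description": "Extract key findings"}]},
    ],
}

outline = build_outline(template, ["protocol.pdf", "data.xlsx"], fake_extract)
```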

4. Generate Document

Execute the outline to produce the final document:
Template + Extraction Rules + Source Content
↓
Outline (extracted content organized by template)
↓
Document details (document-specific section content)
↓
Generated Document (final DOCX output)

Use Cases

1. Regulatory Document Generation

Generate compliant regulatory documents (CSR, IND, etc.):
Source Documents (protocols, analyses, data)
↓ Extract sections
Outline (organized by regulatory requirements)
↓ Apply formatting rules
Final Document (formatted for regulatory submission)

2. Document Customization

Customize a generic template for different organizations:
Generic CSR Template
↓ Customize for company A (add specific rules)
Company A CSR Template
↓
Company A CSR Document

Generic CSR Template
↓ Customize for company B (different rules)
Company B CSR Template
↓
Company B CSR Document

3. Multi-Section Documents

Organize complex documents with nested sections:
CSR Document
├── Executive Summary
│   ├── Key Findings
│   └── Conclusions
├── Methodology
│   ├── Study Design
│   └── Population
├── Results
│   ├── Safety
│   │   ├── Adverse Events
│   │   └── Laboratory Findings
│   └── Efficacy
└── Discussion
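
A tree like the one above can be printed from nested section data. The sketch below assumes each section is a dict with a title and an optional subsections list; render_tree is a hypothetical helper written for illustration.

```python
def render_tree(root_title: str, sections: list[dict]) -> str:
    """Render nested sections as a box-drawing tree, one line per section."""
    lines = [root_title]

    def visit(nodes: list[dict], prefix: str) -> None:
        for i, node in enumerate(nodes):
            last = i == len(nodes) - 1
            lines.append(prefix + ("└── " if last else "├── ") + node["title"])
            # Children of the last node get blank padding instead of a vertical bar.
            visit(node.get("subsections", []), prefix + ("    " if last else "│   "))

    visit(sections, "")
    return "\n".join(lines)


csr_sections = [
    {"title": "Results", "subsections": [
        {"title": "Safety", "subsections": [{"title": "Adverse Events"}]},
        {"title": "Efficacy"},
    ]},
    {"title": "Discussion"},
]

print(render_tree("CSR Document", csr_sections))
```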

4. Content Reuse

Reuse template structure across multiple documents:
Shared Template
├── Document 1 (CSR 2024)
├── Document 2 (IND Amendment)
└── Document 3 (Update Report)

All documents maintain a consistent structure and compliance.

Architecture

┌───────────────────────────────────────┐
│    Template (Structure Definition)    │
│  ├─ Sections                          │
│  ├─ Extraction Rules                  │
│  └─ Formatting Rules                  │
└──────────────────┬────────────────────┘
                   │
                   ├────────────────────────────┐
                   │                            │
        ┌──────────▼─────────────┐   ┌──────────▼─────────────┐
        │ Outline 1 (Doc Set A)  │   │ Outline 2 (Doc Set B)  │
        │ ├─ Extracted Sections  │   │ ├─ Extracted Sections  │
        │ ├─ Source References   │   │ ├─ Source References   │
        │ └─ Metadata            │   │ └─ Metadata            │
        └──────────┬─────────────┘   └──────────┬─────────────┘
                   │                            │
        ┌──────────▼─────────────┐   ┌──────────▼─────────────┐
        │  Document Details 1    │   │  Document Details 2    │
        │ ├─ Section Content     │   │ ├─ Section Content     │
        │ ├─ Rule Results        │   │ ├─ Rule Results        │
        │ └─ Generated Content   │   │ └─ Generated Content   │
        └──────────┬─────────────┘   └──────────┬─────────────┘
                   │                            │
        ┌──────────▼─────────────┐   ┌──────────▼─────────────┐
        │   Generated Document   │   │   Generated Document   │
        │ ├─ Formatted Sections  │   │ ├─ Formatted Sections  │
        │ ├─ Styled Content      │   │ ├─ Styled Content      │
        │ └─ Final DOCX Output   │   │ └─ Final DOCX Output   │
        └────────────────────────┘   └────────────────────────┘

Key Concepts

Section Hierarchy

Sections are organized hierarchically with levels:
  • Level 1: Main sections (equivalent to chapters)
  • Level 2: Subsections (equivalent to section headings)
  • Level 3+: Further nesting as needed
The order_index field determines a section's position among its siblings at the same level; in the examples on this page it starts at 0.
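
To show how order_index and nesting interact, the sketch below sorts each group of sibling sections independently. It assumes sections are dicts with order_index and an optional subsections list; sort_sections is a hypothetical helper.

```python
def sort_sections(sections: list[dict]) -> list[dict]:
    """Return sections sorted by order_index, applied recursively to children."""
    ordered = sorted(sections, key=lambda s: s["order_index"])
    return [
        {**s, "subsections": sort_sections(s.get("subsections", []))}
        for s in ordered
    ]


unsorted_sections = [
    {"order_index": 1, "level": 1, "title": "Methodology"},
    {"order_index": 0, "level": 1, "title": "Executive Summary", "subsections": [
        {"order_index": 1, "level": 2, "title": "Conclusions"},
        {"order_index": 0, "level": 2, "title": "Key Findings"},
    ]},
]

sorted_sections = sort_sections(unsorted_sections)
```

Each sibling group is ordered on its own, so a subsection with order_index 0 always stays under its parent and never competes with top-level sections.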

Extraction Rules

Rules define how content is extracted and processed. The rule_mode field is either "auto" or "manual":
{
  "rule_type": "extraction",
  "rule_mode": "auto",
  "source_document": "protocol",
  "source_section": "Safety",
  "target_section": "Safety Data",
  "processing": "table_to_text"
}

Content Organization

Content is organized by:
  1. Source - Where data comes from (which document/section)
  2. Extraction Rule - How to process the data
  3. Target Section - Where it goes in the final document
  4. Formatting - How it’s styled in output
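
One way to picture these four dimensions is as a record per piece of content, grouped by target section. The sketch below is illustrative: ContentItem, group_by_target, and the style name "body_text" are hypothetical and do not come from the product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentItem:
    source: str          # where the data comes from (document/section)
    rule_type: str       # how the data is processed
    target_section: str  # where it goes in the final document
    formatting: str      # how it is styled in output (hypothetical style name)


def group_by_target(items: list[ContentItem]) -> dict[str, list[ContentItem]]:
    """Group content items by the section they will appear in."""
    grouped: dict[str, list[ContentItem]] = defaultdict(list)
    for item in items:
        grouped[item.target_section].append(item)
    return dict(grouped)


items = [
    ContentItem("protocol/Safety", "extraction", "Safety Data", "body_text"),
    ContentItem("analysis/Safety", "summary", "Safety Data", "body_text"),
    ContentItem("protocol/Design", "extraction", "Methodology", "body_text"),
]

by_target = group_by_target(items)
```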

Best Practices

  1. Start Simple - Begin with basic section structure, add rules incrementally
  2. Reuse Templates - Create templates for common document types
  3. Version Templates - Maintain template versions for compliance tracking
  4. Document Rules - Keep clear documentation of extraction rules
  5. Test Extraction - Validate extraction rules on sample documents before production use
  6. Monitor Quality - Review generated documents for extraction accuracy
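
For the "Test Extraction" practice, one possible check is to compare each sample result against its rule's confidence_threshold. The result shape below (rule_id plus a confidence score) is an assumption, since this page does not describe what extraction returns, and low_confidence is a hypothetical helper.

```python
def low_confidence(results: list[dict], rules_by_id: dict[str, dict]) -> list[str]:
    """Return the rule_ids whose sample result scored below the rule's threshold."""
    flagged = []
    for result in results:
        rule = rules_by_id[result["rule_id"]]
        # Rules without a threshold are treated as always passing.
        threshold = rule.get("confidence_threshold", 0.0)
        if result["confidence"] < threshold:
            flagged.append(result["rule_id"])
    return flagged


rules_by_id = {
    "rule-1": {"rule_type": "extraction", "confidence_threshold": 0.85},
    "rule-2": {"rule_type": "summary"},
}
sample_results = [
    {"rule_id": "rule-1", "confidence": 0.72},  # hypothetical scores
    {"rule_id": "rule-2", "confidence": 0.40},
]

needs_review = low_confidence(sample_results, rules_by_id)
```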