# The Revision Engine

## A Platform for Cognitive Prosthesis Narrative Development

**Document ID:** CNL-TN-2026-010  
**Version:** 1.0  
**Date:** January 25, 2026  
**Author:** Michael P. Hamilton, Ph.D.  
**Derivation:** Extends CNL-TN-2025-022 (The Novelization Engine)

---

**AI Assistance Disclosure:** This technical note was developed collaboratively with Claude (Anthropic, claude-opus-4-5-20250514). The platform described herein was built through iterative human-AI collaboration, with Claude contributing to architecture design, code generation, and documentation. The author takes full responsibility for the content, technical decisions, and conclusions.

---

## Abstract

This technical note documents the Revision Engine—a web-based platform for systematic manuscript revision through human-AI collaboration. The platform extends the Novelization Engine methodology (CNL-TN-2025-022) from drafting into revision, implementing quantified diagnostic tools that transform subjective editorial intuition into actionable data. Core innovations include: a four-dimension engagement scoring system with computed aggregate metrics; voice fingerprinting and similarity detection across characters; automated dropout zone identification; visual heatmap interfaces for manuscript-level pattern recognition; and a complete export-process-import workflow for AI-assisted revision with version control. Applied to *Hot Water* (218,681 words across 101 chapters), the platform enabled identification of 47 dropout zones, quantification of voice blur across 15 characters, and systematic triage of revision priorities. We introduce the term "cognitive prosthesis narrative development" to describe this approach: extending human cognitive capacity for holding entire manuscripts in working memory while tracking consistency, voice, and engagement across novel-scale texts. The platform demonstrates that AI collaboration in creative work may be most valuable not for generating content but for generating diagnostic infrastructure that makes revision tractable at scale.

---

## 1. Introduction

### 1.1 The Revision Problem

The Novelization Engine (CNL-TN-2025-022) documented a methodology for completing long-incubated fiction through structured human-AI collaboration. That methodology addressed the *drafting* problem: how to synthesize accumulated creative material into a coherent manuscript. A companion problem remained unaddressed: how to *revise* that manuscript systematically when it exceeds human working memory capacity.

Novel-length fiction presents a cognitive challenge that intensifies during revision. A 60,000-word manuscript contains approximately 200 pages; a 200,000-word trilogy approaches 700 pages. Traditional revision approaches rely on the author's memory, supplemented by notes and multiple reading passes. These methods are vulnerable to several failure modes:

**Consistency drift** — The author's mental model of the story evolves during revision, introducing new contradictions while attempting to fix old ones.

**Voice blur** — Character voices converge toward the author's default register as revision homogenizes prose.

**Local optimization** — Scene-level improvements may degrade manuscript-level pacing or reader engagement patterns.

**Revision fatigue** — Repeated passes through the same material produce diminishing returns as the author loses fresh perspective.

The Revision Engine addresses these challenges by externalizing diagnostic functions that authors typically perform intuitively. Rather than relying on memory and instinct to identify problem areas, the platform generates quantified metrics that make manuscript-level patterns visible and tractable.

### 1.2 Cognitive Prosthesis Narrative Development

We introduce the term "cognitive prosthesis narrative development" to describe the approach implemented in this platform. The term draws on Andy Clark and David Chalmers' Extended Mind thesis [1]: cognitive processes extend beyond the brain when external resources are reliably available, automatically endorsed, and easily accessible.

The Revision Engine functions as a cognitive prosthesis in a specific sense: it extends the author's capacity to hold the entire manuscript in working memory while simultaneously tracking multiple dimensions of quality. No human author can maintain awareness of engagement scores, voice consistency, crutch word density, and reader state across 100+ chapters while making local revision decisions. The platform makes this possible by:

1. **Quantifying** subjective editorial intuitions into comparable metrics
2. **Visualizing** manuscript-level patterns that exceed human perception
3. **Tracking** changes and their effects across revision iterations
4. **Exporting** diagnostic context for AI-assisted revision processing
5. **Importing** revised content with version control and staging

The cognitive load distribution model from CNL-TN-2025-022 extends to revision:

| Load Category | Human Contribution | AI Contribution | Platform Output |
|--------------|-------------------|-----------------|-----------------|
| Quality judgment | Primary | Scoring assistance | Validated metrics |
| Pattern recognition | Manuscript knowledge | Context capacity | Diagnostic visualizations |
| Voice consistency | Ear for authenticity | Profile matching | Voice fingerprints |
| Revision execution | Editorial control | Prose generation | Staged revisions |
| Version control | Approval decisions | Tracking automation | Revision history |

### 1.3 Relationship to the Novelization Engine

The Novelization Engine (CNL-TN-2025-022) established documentation infrastructure for drafting: living story bible, character templates, reader state tracking, place documentation, and the eleven-component scene schema. The Revision Engine builds on this foundation by adding:

**Diagnostic layer** — Automated analysis tools that transform prose into quantified metrics

**Visualization layer** — Heatmap and dashboard interfaces for manuscript-level pattern recognition

**Triage layer** — Classification systems that convert diagnosis into actionable revision priorities

**Workflow layer** — Export/import pipeline for AI-assisted revision with version control

The Serialization Engine (CNL-TN-2025-023) subsequently demonstrated that this combined infrastructure produces format-agnostic story systems rather than format-specific manuscripts. The three documents form a trilogy:

1. **Novelization Engine** — How to draft (methodology)
2. **Revision Engine** — How to revise (platform)
3. **Serialization Engine** — What the methodology produces (theory)

### 1.4 Scope

This technical note covers:

* Theoretical foundations for quantified revision
* Database architecture and schema design
* The four-dimension engagement scoring system
* Voice analysis and fingerprinting
* Heatmap visualization and dropout zone detection
* The triage classification system
* Export-process-import workflow for AI collaboration
* Results from application to *Hot Water*
* Implications for creative AI applications

---

## 2. Theoretical Framework

### 2.1 The Quantification Principle

Traditional editorial feedback operates in qualitative registers: "this scene feels slow," "the pacing drags in the middle," "these characters sound too similar." Such feedback identifies problems but provides limited guidance for systematic repair. The author must translate qualitative intuition into specific revision decisions through trial and error.

The Revision Engine implements a quantification principle: every qualitative editorial intuition can be decomposed into measurable dimensions that, aggregated, approximate the intuition's signal. "This scene feels slow" might decompose into:

- Low stakes (nothing at risk)
- Low resistance (no conflict)
- Low change (static situation)
- Low question pull (no reason to continue)

Each dimension becomes a 0-3 scale. The aggregate (0-12) provides a comparable engagement score across all scenes in the manuscript. Scenes scoring below threshold become visible revision targets.

This approach does not claim that numbers capture the full complexity of literary quality. Rather, it claims that quantified proxies are *useful* for identifying patterns that exceed human working memory. The author retains full editorial judgment; the platform surfaces candidates for that judgment to evaluate.

### 2.2 The Heatmap Metaphor

Thermal imaging reveals temperature patterns invisible to the naked eye. A building inspector uses infrared cameras to identify heat loss through insulation failures. The patterns exist; the technology makes them visible.

The manuscript heatmap applies this metaphor to engagement patterns. Each chapter receives a color-coded cell based on diagnostic metrics. Viewed individually, any chapter might seem acceptable. Viewed as a heatmap, patterns emerge:

- Consecutive red cells indicate dropout zones where readers will quit
- Yellow clusters reveal pacing problems spanning multiple chapters
- Voice scores trending downward suggest character convergence
- Crutch word density peaks identify prose requiring attention

The heatmap makes visible what the author cannot perceive through sequential reading: the manuscript's engagement topography.

### 2.3 The Triage Model

Emergency medicine developed triage to allocate limited resources efficiently: identify patients who will survive without intervention, patients who cannot be saved, and patients where intervention matters most. Resources flow to the third category.

Manuscript revision benefits from similar prioritization. Not all chapters require equal attention:

**KEEP** — Scene works. Intervention unnecessary.

**TRIM** — Cut 20-40%. Specific problems identifiable and fixable.

**COMPRESS** — Major reduction (50%+). Preserve only essential beats.

**CONVERT** — Format change required (journal → scene, summary → action).

**MERGE** — Combine with adjacent material.

**DELETE** — Remove entirely; relocate essential plot information.

Triage classification emerges from diagnostic data: engagement scores, voice metrics, crutch word density, and structural analysis. The classification provides actionable guidance that sequential reading cannot.

### 2.4 The Export-Process-Import Cycle

AI language models excel at prose transformation within specified parameters but lack persistent memory across sessions. The Revision Engine addresses this limitation through a structured workflow:

**Export** — Generate a revision package containing:
- Complete manuscript text with chapter boundaries
- Character voice profiles with vocabulary signatures
- Engagement scores and triage classifications
- Crutch word alerts and revision guidelines
- Voice analysis summary with similarity warnings

**Process** — Submit to a large-context AI (Gemini 1M, Claude) for targeted revision following embedded guidelines

**Import** — Paste revised content into staging system with:
- Side-by-side diff view against original
- Word count tracking (original → revised → delta)
- Revision history logging
- Publish/unpublish toggle for A/B comparison

This cycle enables AI collaboration at novel scale while maintaining human editorial control over all revision decisions.

---

## 3. Platform Architecture

### 3.1 Technology Stack

The Revision Engine is implemented as a PHP/MySQL web application:

- **Server**: Apache 2.4 on macOS (Mac Mini M4 Pro)
- **Database**: MySQL 8.4 with InnoDB storage engine
- **Backend**: PHP 8.3 with mysqli (no PDO abstraction)
- **Frontend**: Vanilla HTML/CSS/JavaScript (no framework dependencies)
- **CLI Tools**: PHP scripts for batch operations and API integration

Design principles:

1. **Simplicity** — No framework overhead; readable, maintainable code
2. **Direct database access** — mysqli prepared statements; phpMyAdmin for administration
3. **File-based content** — Markdown import/export; version-controlled externally
4. **Progressive enhancement** — Core functionality works without JavaScript

### 3.2 Database Schema Overview

The schema comprises 23 tables organized into functional groups:

**Content Tables**
- `parts` — Trilogy volumes (SIGNAL, CHRONICLE, ANCESTOR)
- `chapters` — Individual chapters with content, metadata, and revision staging
- `images` — Artwork and illustrations
- `chapter_assets` — Image-chapter associations

**Diagnostic Tables**
- `chapter_diagnostics` — Aggregate metrics per chapter
- `scene_analysis` — Scene-level engagement scores and voice metrics
- `tic_word_config` — Configurable crutch word detection
- `tic_word_occurrences` — Per-chapter crutch word counts
- `crutch_word_totals` — Manuscript-level aggregates
- `dropout_zones` — Detected reader abandonment risk areas

**Voice Analysis Tables**
- `voice_fingerprints` — Character voice signatures per chapter
- `voice_analysis_summary` — Character-level dialogue statistics
- `voice_similarity` — Pairwise voice confusion detection
- `chapter_prose_summary` — Telling/showing analysis per chapter

**Revision Tables**
- `revision_history` — Complete revision audit trail
- `chapters.revised_content` — Staging field for pending revisions
- `chapters.show_revised` — Toggle for A/B content display

**Analytics Tables**
- `chapter_views` — Privacy-respecting read tracking (IP hashed)
- `subscribers` — Email list with double opt-in
- `downloads` — EPUB/PDF generation tracking

### 3.3 Key Schema Innovations

**Computed Engagement Score**

The `scene_analysis` table uses a MySQL generated column for engagement scoring:

```sql
engagement_score TINYINT UNSIGNED GENERATED ALWAYS AS (
    stakes + resistance + change_level + question_pull
) STORED
```

This ensures score consistency: changing any component automatically updates the aggregate. The `STORED` attribute enables indexing for efficient sorting and filtering.

**Revision Staging**

The `chapters` table implements a dual-content architecture:

```sql
content LONGTEXT,           -- Original/current content
revised_content LONGTEXT,   -- Staged revision (pending)
show_revised TINYINT(1)     -- Display toggle (0=original, 1=revised)
```

This enables:
- Non-destructive revision (original preserved)
- A/B comparison by toggling `show_revised`
- Incremental publishing (some chapters revised, others not)
- Rollback capability (clear `revised_content`, set `show_revised=0`)
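
The display logic reduces to a single conditional. A minimal Python sketch of the toggle (the platform itself is PHP/MySQL; field names mirror the schema above):

```python
def effective_content(chapter: dict) -> str:
    """Return the text a reader should see: the staged revision when
    it is published, otherwise the original content."""
    if chapter.get("show_revised") and chapter.get("revised_content"):
        return chapter["revised_content"]
    return chapter["content"]

def rollback(chapter: dict) -> None:
    """Rollback is non-destructive: clearing the staged fields
    restores the preserved original."""
    chapter["revised_content"] = None
    chapter["show_revised"] = 0
```

Because `content` is never overwritten, toggling `show_revised` is sufficient for A/B comparison and rollback never loses data.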

**JSON-Stored Analysis Data**

Complex analysis results use JSON columns for flexibility:

```sql
telling_instances JSON,      -- Array of {text, paragraph}
crutch_words JSON,          -- Object {word: count}
dialogue_by_character JSON,  -- Object {character: word_count}
voice_bleed JSON,           -- Array of {text, paragraph, expected_voice}
```

JSON storage enables schema evolution without migration and efficient storage of variable-structure data.
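
The roll-up from per-chapter JSON to manuscript-level totals (as in `crutch_word_totals`) can be sketched in Python; the sample rows below are hypothetical stand-ins for values stored in the `crutch_words` column:

```python
import json
from collections import Counter

def aggregate_crutch_words(rows: list[str]) -> Counter:
    """Sum per-chapter crutch_words JSON objects ({word: count})
    into manuscript-level totals."""
    totals = Counter()
    for raw in rows:
        totals.update(json.loads(raw))
    return totals

# Hypothetical per-chapter values as stored in the crutch_words column:
rows = ['{"just": 14, "really": 9}', '{"just": 11, "suddenly": 5}']
```

The same pattern applies to the other JSON columns: decode per chapter, aggregate in application code, write the result back to the summary table.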

**Database Views for Common Queries**

Four views simplify common access patterns:

```sql
v_chapter_heatmap      -- Joins chapters, parts, diagnostics for heatmap display
v_triage_queue         -- Chapters requiring revision, sorted by priority
v_revision_progress    -- Per-part revision completion statistics
v_dropout_zones_active -- Unresolved dropout zones with chapter titles
```

---

## 4. The Engagement Scoring System

### 4.1 Theoretical Basis

Reader engagement in fiction correlates with specific narrative qualities identifiable at scene level. Drawing on Robert McKee's *Story* [2] and craft knowledge accumulated through the Novelization Engine collaboration, we identify four orthogonal dimensions:

**Stakes** — What is at risk if things go wrong?

**Resistance** — What pushes back against what the character wants?

**Change** — How different is the situation at the end versus the beginning?

**Question Pull** — Does the scene ending compel the reader to continue?

Each dimension operates independently: a scene can have high stakes but low resistance, or high change but low question pull. The four-dimension model captures distinct failure modes invisible to single-metric approaches.

### 4.2 Scoring Rubrics

Each dimension uses a 0-3 scale with explicit anchors:

**STAKES (S)**
| Score | Definition | Example |
|-------|------------|---------|
| 0 | Nothing at stake | Characters chatting, pure exposition |
| 1 | Mild discomfort | Awkward social moment, minor inconvenience |
| 2 | Real consequences | Relationship damage, reputation at risk |
| 3 | Survival or identity | Physical danger, existential threat |

**RESISTANCE (R)**
| Score | Definition | Example |
|-------|------------|---------|
| 0 | Nothing pushes back | Character gets what they want easily |
| 1 | Internal doubt | Self-questioning, hesitation |
| 2 | Interpersonal conflict | Disagreement, competing goals |
| 3 | Active antagonism | Direct opposition, blocked path |

**CHANGE (C)**
| Score | Definition | Example |
|-------|------------|---------|
| 0 | Static | Same mental/physical state as start |
| 1 | Information gained | Character learns something new |
| 2 | Decision made | Character commits to a path |
| 3 | Irreversible shift | Point of no return, permanent change |

**QUESTION PULL (Q)**
| Score | Definition | Example |
|-------|------------|---------|
| 0 | Resolved | Natural stopping point, reader satisfied |
| 1 | Mild curiosity | Slight interest in what happens next |
| 2 | Need to know | Unanswered question that nags |
| 3 | Hooked | Cannot stop reading, must continue |

### 4.3 Engagement Thresholds

Aggregate scores (0-12) map to reader engagement risk levels:

| Score Range | Classification | Reader Behavior |
|-------------|----------------|-----------------|
| 10-12 | Gripping | Cannot put down |
| 7-9 | Solid | Engaged, will continue |
| 4-6 | Vulnerable | At risk of skimming |
| 0-3 | Dropout | Reader will quit |

The **minimum viable scene** target is 8/12, equivalent to an average score of 2 (the moderate level) across the four dimensions.
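
The rubric and thresholds above can be sketched as follows (an illustration in Python; in the platform the aggregate is a MySQL generated column and the classification drives heatmap coloring):

```python
def engagement_score(stakes: int, resistance: int,
                     change: int, question_pull: int) -> int:
    """Aggregate the four 0-3 dimensions into a 0-12 score."""
    for d in (stakes, resistance, change, question_pull):
        if not 0 <= d <= 3:
            raise ValueError("each dimension is scored 0-3")
    return stakes + resistance + change + question_pull

def classify(score: int) -> str:
    """Map an aggregate score to its reader-engagement risk level."""
    if score >= 10:
        return "gripping"
    if score >= 7:
        return "solid"
    if score >= 4:
        return "vulnerable"
    return "dropout"
```

A scene scoring 2 on every dimension lands exactly at the minimum viable target of 8 ("solid"); dropping any single dimension to 0 or 1 pushes it into "vulnerable" territory.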

### 4.4 Scoring Implementation

Engagement scoring operates through two complementary mechanisms:

**Manual Scoring (Scene Audit Interface)**

The `scene-audit.php` interface presents chapter content alongside scoring controls. Human scorers evaluate each dimension against the rubric, with scores saved to `scene_analysis`. This approach provides the highest accuracy but requires significant time investment.

**AI-Assisted Scoring (CLI Tool)**

The `score-scenes.php` CLI tool submits chapters to Claude API with embedded scoring rubrics and voice profiles. The AI returns JSON-structured scores that populate `scene_analysis`. This approach enables batch processing of entire manuscripts.

```bash
php score-scenes.php --model=sonnet              # Score all unscored
php score-scenes.php --model=opus --chapter=42   # Score specific chapter
php score-scenes.php --model=sonnet --rescore    # Re-score everything
```

AI scoring includes linguistic analysis beyond engagement:
- Voice distinctiveness (1-5)
- Profile adherence (1-5)
- Telling instances with locations
- Crutch word counts
- Dialogue distribution by character
- Round-robin monologue detection

### 4.5 Score Aggregation

Chapter-level diagnostics aggregate from scene-level scores:

```sql
UPDATE chapter_diagnostics SET
    avg_engagement_score = (SELECT AVG(engagement_score) FROM scene_analysis WHERE chapter_id = ?),
    min_engagement_score = (SELECT MIN(engagement_score) FROM scene_analysis WHERE chapter_id = ?),
    max_engagement_score = (SELECT MAX(engagement_score) FROM scene_analysis WHERE chapter_id = ?)
WHERE chapter_id = ?
```

The `avg_engagement_score` drives heatmap coloring; `min_engagement_score` identifies chapters with weak sections even when the average is acceptable.

---

## 5. Voice Analysis System

### 5.1 The Voice Blur Problem

Character voice is among the most difficult qualities to maintain across novel-length fiction. Each character should sound distinct through vocabulary choices, sentence patterns, and metaphor domains. In practice, characters often converge toward the author's default voice, particularly during revision when the author is focused on other concerns.

The *Hot Water* manuscript presents acute voice challenges:
- Six primary POV characters (Amara, David, Susan, Margaret, Jennifer, Starseed)
- Three temporal strands (modern, Darwin 1830s, Pictish 570 CE)
- Technical vocabularies spanning physics, geology, biology, engineering, archaeology
- 15+ speaking characters requiring distinct voices

### 5.2 Voice Profiles

Each major character receives a documented voice profile specifying:

**Vocabulary Signature** — Domain-specific terms the character naturally uses:
- Amara: tolerances, load-bearing, thermal gradient, structural integrity
- David: coherence, eigenstate, superposition, probability amplitude
- Susan: taxonomy, phylogeny, adaptation, ecological niche
- Margaret: crystalline matrix, stratification, metamorphic, grain boundaries

**Sentence Patterns** — Characteristic syntactic structures:
- Amara: Direct and declarative, precise but warm
- David: Short questioning sentences, trails off when confused
- Susan: Patient observation building to insight
- Margaret: Scottish rhythm, unhurried geological perspective

**Metaphor Domains** — Where the character draws comparisons:
- Amara: Architecture, materials science, electrical systems
- David: Physics, mathematics, uncertainty
- Susan: Evolution, organic systems, deep time
- Margaret: Rocks, minerals, slow processes

### 5.3 Voice Metrics

The scoring system evaluates voice quality on two dimensions:

**Voice Distinctiveness (1-5)** — Does the POV character sound distinct from other characters?
| Score | Definition |
|-------|------------|
| 1 | Indistinguishable from other characters |
| 2 | Occasional distinctive moments |
| 3 | Moderately distinct voice |
| 4 | Clearly recognizable voice |
| 5 | Unmistakably unique |

**Profile Adherence (1-5)** — Does the character use vocabulary and metaphors from their profile?
| Score | Definition |
|-------|------------|
| 1 | None of expected vocabulary present |
| 2 | Rare use of profile vocabulary |
| 3 | Moderate use of profile vocabulary |
| 4 | Strong use of profile vocabulary |
| 5 | Vocabulary fully consistent with profile |

### 5.4 Voice Analysis Aggregation

The `aggregate-voice-analysis.php` CLI tool processes scored scenes to generate manuscript-level voice statistics:

**Dialogue Distribution** — Word count and percentage by character:
```
| Character | Words   | %     | Distinctiveness | Adherence |
|-----------|---------|-------|-----------------|-----------|
| David     | 19,581  | 21.1% | 2.8             | 2.6       |
| Susan     | 12,226  | 13.2% | 2.7             | 2.5       |
| Margaret  | 11,538  | 12.5% | 3.6             | 3.4       |
```

**Voice Similarity Warnings** — Character pairs with high confusion risk:
```
| Character A | Character B | Similarity | Shared Crutch Words |
|-------------|-------------|------------|---------------------|
| David       | Susan       | 0.72       | pattern, structure  |
```

**Prose Problem Scores** — Chapters ranked by telling constructions, restatements, and voice issues

### 5.5 Voice Bleed Detection

AI scoring identifies specific instances where POV characters use vocabulary belonging to other characters:

```json
"voice_bleed": [
    {
        "text": "crystalline matrix",
        "paragraph": 12,
        "expected_voice": "Margaret"
    }
]
```

When Susan (biologist) thinks in geological vocabulary, the scene needs revision to restore voice integrity.
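
In the platform this detection is performed by the AI scoring pass; a simple lexical version of the same check can be sketched in Python (the profile entries below are drawn from §5.2 but truncated for illustration):

```python
# Hypothetical vocabulary signatures, abridged from the §5.2 profiles.
PROFILES = {
    "Margaret": ["crystalline matrix", "stratification", "metamorphic"],
    "David": ["eigenstate", "superposition", "coherence"],
}

def detect_voice_bleed(text: str, pov: str) -> list[dict]:
    """Flag paragraphs where the POV character uses another
    character's signature vocabulary."""
    hits = []
    for i, para in enumerate(text.split("\n\n"), start=1):
        lowered = para.lower()
        for owner, terms in PROFILES.items():
            if owner == pov:
                continue
            for term in terms:
                if term in lowered:
                    hits.append({"text": term, "paragraph": i,
                                 "expected_voice": owner})
    return hits
```

A lexical scan catches only literal signature phrases; the AI pass also flags paraphrased intrusions, which is why both exist.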

---

## 6. Heatmap Visualization

### 6.1 Design Principles

The heatmap interface (`heatmap.php`) transforms diagnostic data into a visual representation enabling manuscript-level pattern recognition. Design principles:

**Color Psychology** — Engagement maps to intuitive thermal scale:
- Green (10-12): Cool/safe, no intervention needed
- Yellow (7-9): Warm/caution, monitor but acceptable
- Orange (4-6): Hot/warning, revision candidate
- Red (0-3): Critical/danger, priority revision target

**Information Density** — Each cell encodes multiple dimensions:
- Background color: Engagement score
- Text: Chapter title (truncated)
- Badge: Triage action (if assigned)
- Icon: Revision status (pending/revised/published)

**Spatial Organization** — Chapters ordered by reading sequence:
- Grouped by Part (SIGNAL, CHRONICLE, ANCESTOR)
- Sorted by display_order within Part
- Visual breaks between Parts

### 6.2 Heatmap Data Structure

The `v_chapter_heatmap` view joins necessary tables:

```sql
CREATE VIEW v_chapter_heatmap AS
SELECT 
    c.id, c.title, c.chapter_type, c.word_count, c.display_order,
    p.id AS part_id, p.name AS part_name, p.display_order AS part_order,
    COALESCE(cd.avg_engagement_score, 0) AS engagement_score,
    COALESCE(cd.overall_skim_risk, 'medium') AS skim_risk,
    COALESCE(cd.avg_exposition_density, 0) AS exposition_density,
    COALESCE(cd.total_tic_words, 0) AS tic_words,
    cd.triage_action, cd.triage_priority
FROM chapters c
JOIN parts p ON c.part_id = p.id
LEFT JOIN chapter_diagnostics cd ON c.id = cd.chapter_id
ORDER BY p.display_order, c.display_order
```

### 6.3 Dropout Zone Detection

The system automatically identifies contiguous sequences of low-engagement chapters:

```php
function detectDropoutZones($conn, $threshold = 5, $minLength = 2) {
    // Query chapters ordered by position
    // Identify sequences where avg_engagement_score < threshold
    // Return zones with: start_chapter, end_chapter, scene_count, 
    //                    avg_score, diagnosis, recommendation
}
```

Detected zones are classified by severity:
- **Warning**: 2 consecutive low-engagement chapters
- **Danger**: 3-4 consecutive low-engagement chapters  
- **Critical**: 5+ consecutive low-engagement chapters

Zone types capture specific patterns:
- `consecutive_low_engagement`: Generic dropout risk
- `exposition_cluster`: Multiple explanation-heavy chapters
- `voice_blur`: Character distinctiveness declining
- `pacing_stall`: Static scenes without change
- `journal_sequence`: Multiple document-type chapters
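
The run-detection and severity logic can be sketched in Python (a minimal illustration of the algorithm the PHP function implements; input is assumed to be `(chapter_id, avg_engagement_score)` pairs in reading order):

```python
def detect_dropout_zones(scores: list[tuple], threshold: int = 5,
                         min_length: int = 2) -> list[dict]:
    """Find runs of consecutive chapters scoring below threshold and
    classify each run's severity by its length."""
    zones, run = [], []
    # Sentinel at exactly threshold flushes any trailing run.
    for chapter_id, score in scores + [(None, threshold)]:
        if score < threshold:
            run.append(chapter_id)
            continue
        if len(run) >= min_length:
            severity = ("warning" if len(run) == 2
                        else "danger" if len(run) <= 4 else "critical")
            zones.append({"start": run[0], "end": run[-1],
                          "length": len(run), "severity": severity})
        run = []
    return zones
```

A single low-scoring chapter never triggers a zone; the diagnosis targets sustained lulls, which is where readers actually abandon books.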

### 6.4 Statistics Dashboard

The heatmap page displays aggregate statistics:

```
Total Chapters: 101
Total Words: 218,681
Scored: 89/101 (88%)
Avg Engagement: 6.4/12
Dropout Zones: 4 active

Revision Progress:
- Revised: 23
- Published: 18
- Pending: 60
```

These metrics provide manuscript health indicators at a glance.

---

## 7. Triage System

### 7.1 Triage Actions

Six triage actions classify revision requirements:

| Action | Definition | Target Reduction |
|--------|------------|------------------|
| KEEP | Scene works; preserve structure and length | 0% |
| TRIM | Cut without losing essential content | 20-40% |
| COMPRESS | Major reduction; preserve only key beats | 50%+ |
| CONVERT | Change format entirely | Variable |
| MERGE | Combine with adjacent material | Consolidation |
| DELETE | Remove entirely; relocate plot info | 100% |
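
The percentage-based actions translate directly into target word-count ranges; a small Python sketch (CONVERT and MERGE are excluded because their outcomes are not fixed percentages):

```python
# Target-reduction fractions per triage action, per the table above.
REDUCTION = {
    "keep": (0.0, 0.0),
    "trim": (0.20, 0.40),
    "compress": (0.50, 1.0),
    "delete": (1.0, 1.0),
}

def target_word_range(word_count: int, action: str) -> tuple[int, int]:
    """Translate a triage action into a (min, max) target word count."""
    lo_cut, hi_cut = REDUCTION[action]
    return (round(word_count * (1 - hi_cut)),
            round(word_count * (1 - lo_cut)))
```

For a 1,000-word TRIM chapter this yields a 600-800 word target, a concrete goal the revision import interface can check against the pasted word count.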

### 7.2 Triage Assignment

Triage actions can be assigned through:

**Manual Assignment** — Scene audit interface includes triage dropdown and notes field

**AI-Assisted Assignment** — Scoring prompt requests triage recommendation based on engagement metrics and prose analysis

**Algorithmic Rules** — Automated assignment based on thresholds:
```php
if ($engagement_score <= 3) {
    $triage_action = 'delete';
} elseif ($engagement_score <= 5 && $exposition_density > 0.3) {
    $triage_action = 'compress';
} elseif ($tic_word_count > 20) {
    $triage_action = 'trim';
}
```

### 7.3 Triage Queue

The `v_triage_queue` view surfaces chapters requiring attention:

```sql
CREATE VIEW v_triage_queue AS
SELECT c.id, c.title, c.word_count, p.name AS part_name,
       cd.avg_engagement_score, cd.triage_action, cd.triage_priority
FROM chapters c
JOIN parts p ON c.part_id = p.id
LEFT JOIN chapter_diagnostics cd ON c.id = cd.chapter_id
WHERE cd.triage_action IS NOT NULL 
  AND cd.triage_action != 'keep'
ORDER BY cd.triage_priority ASC, cd.avg_engagement_score ASC
```

The queue prioritizes:
1. Critical triage actions (delete, compress)
2. Lower engagement scores within action category
3. Earlier chapters (readers reach them first)

### 7.4 Triage Notes

Each triage assignment includes explanatory notes:

```
Scene 42 (triage: COMPRESS): "Three consecutive monologues without 
conflict. David and Susan explain the same discovery to each other 
that readers already understand. Keep only Margaret's reaction and 
the final decision to proceed."
```

Notes provide revision guidance beyond the action classification.

---

## 8. Export-Process-Import Workflow

### 8.1 Revision Package Export

The `export-revision-package.php` CLI tool generates a comprehensive revision package:

```bash
php export-revision-package.php                    # Full export
php export-revision-package.php --part=2           # Part 2 only
php export-revision-package.php --triage-only      # Just manifest
```

The package includes:

**Header Section**
- Generation timestamp
- Total chapters and word count
- Export parameters

**Voice Differentiation Section**
- Complete voice profiles for all POV characters
- Vocabulary signatures
- Sentence patterns
- Metaphor domains

**Crutch Word Alert**
- High-frequency terms (100+ occurrences)
- Medium-frequency terms (25-99 occurrences)
- Phrase patterns to eliminate

**Revision Guidelines**
- Triage action definitions
- Common problems to fix (with detection flags)
- Engagement score targets
- Voice score targets

**Voice Analysis Summary**
- Dialogue distribution by character
- Voice similarity warnings
- Chapters with worst prose problem scores

**Triage Manifest**
- Per-chapter: engagement score, triage action, triage notes, voice scores

**Full Manuscript Content**
- Chapter headers with metadata
- Complete text in Markdown format
- Scene boundaries where applicable

### 8.2 AI Processing

The revision package is designed for large-context language models:

**Gemini 1M Context** — Can process entire trilogy (218K words) plus guidelines

**Claude with Projects** — Revision guidelines as project knowledge; chapters processed individually or in batches

The embedded guidelines provide the AI with:
- What to fix (triage actions)
- How to fix it (revision guidelines)  
- What to preserve (voice profiles)
- What to eliminate (crutch words)

### 8.3 Revision Import

The `revision-import.php` interface handles revised content:

**Chapter Selection**
- Dropdown listing all chapters with triage status
- Color-coded by revision state (pending/revised/published)

**Side-by-Side View**
- Original content (left panel)
- Revised content or input area (right panel)
- Rendered Markdown for readability

**Word Count Tracking**
- Original word count
- Revised word count
- Delta (positive or negative)
- Real-time calculation as content is pasted

**Revision Actions**
- **Save to Staging** — Store in `revised_content` without publishing
- **Publish Revision** — Set `show_revised = 1` to display revised version
- **Unpublish** — Revert to original (`show_revised = 0`)
- **Clear Revision** — Delete staged content entirely

**Revision History**
- Timestamped log of all revisions
- Original and revised content preserved
- Word delta tracked
- Notes field for revision description

### 8.4 Version Control for Prose

The staging system enables sophisticated version control:

**A/B Testing** — Toggle `show_revised` to compare reader engagement between original and revised versions

**Incremental Publishing** — Some chapters can show revised content while others remain original

**Rollback Capability** — Original content always preserved; can revert any chapter

**Audit Trail** — Complete history of revisions with timestamps and word deltas

---

## 9. Results

### 9.1 Application to *Hot Water*

The Revision Engine was developed iteratively alongside the *Hot Water* manuscript:

| Metric | Value |
|--------|-------|
| Total chapters | 101 |
| Total words | 218,681 |
| Parts | 3 (SIGNAL, CHRONICLE, ANCESTOR) |
| Chapters scored | 89 (88%) |
| Average engagement | 6.4/12 |
| Dropout zones detected | 47 |
| Characters with voice data | 15 |
| Revision history entries | 156 |

### 9.2 Diagnostic Findings

**Engagement Distribution**
- Gripping (10-12): 12 chapters (12%)
- Solid (7-9): 34 chapters (34%)
- Vulnerable (4-6): 31 chapters (31%)
- Dropout (0-3): 12 chapters (12%)
- Unscored: 12 chapters (12%)

**Triage Classification**
- KEEP: 23 chapters
- TRIM: 31 chapters
- COMPRESS: 18 chapters
- CONVERT: 8 chapters
- DELETE: 4 chapters
- Unassigned: 17 chapters

**Voice Analysis**
- Highest distinctiveness: ARCHIE (AI character), 4.0/5
- Lowest distinctiveness: David, 2.8/5
- Highest similarity pair: David/Susan, 0.72

### 9.3 Pattern Discovery

The heatmap revealed patterns invisible to sequential reading:

**Journal Cluster Problem** — Four consecutive journal chapters in Part 2 created a documentation slog. Triage: convert two to dramatized scenes, merge two others.

**Darwin Interlude Pacing** — Darwin sections consistently scored higher engagement than modern timeline. The contrast highlighted modern sections needing more conflict.

**Voice Convergence in Act 3** — Distinctiveness scores declined in final chapters as revision pressure increased. Systematic voice restoration required.

**Exposition Front-Loading** — The first chapters of each Part scored lower due to setup exposition. Triage: structural revision to embed exposition in conflict.

### 9.4 Revision Workflow Results

The export-process-import workflow enabled:

- Batch processing of 12 chapters in single Gemini session
- Consistent application of voice profiles across revisions
- Word count reduction of 23% in targeted chapters
- Voice distinctiveness improvement averaging 0.8 points post-revision

---

## 10. Implementation Details

### 10.1 Text Analysis Functions

The `heatmap_functions.php` library provides core analysis utilities:

**Exposition Density**
```php
function calcExpositionDensity($text) {
    $markers = ['which means', 'in other words', 'because',
                'therefore', 'essentially', 'basically'];
    $sentenceCount = countSentences($text);  // Helper defined elsewhere in the library
    if ($sentenceCount === 0) {
        return 0.0;  // Guard against division by zero on empty text
    }
    $markerCount = 0;
    foreach ($markers as $marker) {
        $markerCount += substr_count(strtolower($text), $marker);
    }
    return round($markerCount / $sentenceCount, 3);
}
```

**Dialogue Ratio**
```php
function calcDialogueRatio($text) {
    // Extract content within straight or curly quotation marks
    preg_match_all('/["“”]([^"“”]+)["“”]/u', $text, $matches);
    $dialogueWords = str_word_count(implode(' ', $matches[1]));
    $totalWords = str_word_count(strip_tags($text));
    if ($totalWords === 0) {
        return 0.0;  // Guard against division by zero on empty text
    }
    return round($dialogueWords / $totalWords, 3);
}
```

**Tic Word Detection**
```php
function countTicWords($text, $ticWords) {
    $results = [];
    foreach ($ticWords as $word) {
        $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
        preg_match_all($pattern, $text, $matches);
        if (count($matches[0]) > 0) {
            $results[$word] = count($matches[0]);
        }
    }
    return $results;
}
```
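
A usage sketch for the tic-word counter, with the function reproduced so the example is self-contained; the sample sentence is illustrative:

```php
// Reproduced from heatmap_functions.php for a self-contained example.
function countTicWords($text, $ticWords) {
    $results = [];
    foreach ($ticWords as $word) {
        $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
        preg_match_all($pattern, $text, $matches);
        if (count($matches[0]) > 0) {
            $results[$word] = count($matches[0]);
        }
    }
    return $results;
}

$text = 'She just smiled. It was just a moment, really. Really just a moment.';
print_r(countTicWords($text, ['just', 'really', 'suddenly']));
// 'just' => 3, 'really' => 2; 'suddenly' is absent, so omitted
```

The `\b` word boundaries prevent false hits inside longer words (e.g., "just" in "justice"), and the case-insensitive flag catches sentence-initial occurrences.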

### 10.2 API Integration

The `score-scenes.php` tool integrates with Claude API:

```php
function scoreChapterWithAPI($content, $model, $voiceProfiles) {
    $prompt = buildScoringPrompt($content, $voiceProfiles);
    
    $response = callClaudeAPI([
        'model' => MODELS[$model],
        'max_tokens' => MAX_TOKENS,
        'messages' => [
            ['role' => 'user', 'content' => $prompt]
        ]
    ]);
    
    return parseJSONResponse($response);
}
```

The prompt includes:
- Voice profiles for all characters
- Scoring rubrics with explicit anchors
- Linguistic analysis instructions
- JSON output format specification
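
The output-format portion of the prompt might look like the following. This is a hypothetical sketch, with field names mirroring the `scene_analysis` schema rather than the actual prompt text:

```php
// Hypothetical sketch of the JSON output specification appended to
// the scoring prompt; field names mirror the scene_analysis schema.
function jsonOutputSpec(): string {
    return <<<'SPEC'
Return ONLY a JSON object of this shape, one per scene:
{
  "scene_number": <int>,
  "stakes": <0-3>,
  "resistance": <0-3>,
  "change_level": <0-3>,
  "question_pull": <0-3>,
  "voice_distinctiveness": <1-5>,
  "triage_notes": "<one-sentence rationale>"
}
SPEC;
}
```

Constraining the model to a strict JSON shape lets `parseJSONResponse()` write scores directly into `scene_analysis` without manual cleanup.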

### 10.3 Privacy-Respecting Analytics

Reader engagement tracking uses IP hashing:

```php
function hashIP($ip) {
    return hash('sha256', $ip . date('Y-m'));  // Monthly rotation
}
```

This design provides:
- Unique visitor counting without storing raw IPs
- Reading-pattern analysis without individual tracking
- Limited tracking duration via monthly hash rotation
- No stored personally identifiable information
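
These properties follow directly from the hash construction; a quick demonstration, with the function reproduced from above and example IPs drawn from the documentation range:

```php
function hashIP($ip) {
    return hash('sha256', $ip . date('Y-m'));  // Monthly rotation
}

// Within a month, the same IP always maps to the same token,
// so unique-visitor counts work without storing the raw address.
assert(hashIP('203.0.113.7') === hashIP('203.0.113.7'));

// Different IPs map to different tokens (collisions are negligible).
assert(hashIP('203.0.113.7') !== hashIP('203.0.113.8'));

// The stored token is a 64-character hex digest, not the IP itself.
assert(strlen(hashIP('203.0.113.7')) === 64);
```

Salting with the current month means a token cannot be correlated across months, bounding how long any reading pattern can be linked to one visitor.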

---

## 11. Discussion

### 11.1 Cognitive Prosthesis Effectiveness

The Revision Engine demonstrates that systematic diagnostic infrastructure extends human cognitive capacity for manuscript revision. Specific benefits:

**Pattern Visibility** — The heatmap reveals engagement topography invisible to sequential reading. Authors can see the manuscript as readers experience it rather than as they wrote it.

**Quantified Prioritization** — Triage classification converts intuitive "this needs work" into actionable priority queues. Limited revision time flows to highest-impact chapters.

**Voice Maintenance** — Profile-based voice analysis catches convergence before it becomes endemic. Revision can strengthen distinctiveness rather than homogenize further.

**Systematic Workflow** — The export-process-import cycle enables AI collaboration at scale while maintaining human editorial control. The author directs; the system executes.

### 11.2 Limitations

**Single-Author Development** — The platform was built for a specific author's workflow. Generalization to other authors remains untested.

**Scoring Subjectivity** — Engagement dimensions are proxies for reader experience. Scores reflect systematic approximation, not ground truth.

**AI Dependency** — Batch scoring requires API access and incurs costs. Voice analysis quality depends on model capability.

**Integration Overhead** — The full platform requires MySQL database, PHP server, and CLI access. Simpler tools might serve authors with different technical backgrounds.

### 11.3 Implications for Creative AI

The Revision Engine suggests that AI collaboration in creative work may be most valuable for **infrastructure generation** rather than content generation:

- Diagnostic systems that quantify editorial intuition
- Visualization tools that reveal manuscript-level patterns
- Workflow automation that maintains human control
- Version control that enables experimentation without risk

This complements the Novelization Engine finding that AI collaboration produces format-agnostic story systems. Together, the engines demonstrate a paradigm: **AI as cognitive prosthesis for creative work**, extending human capacity rather than replacing human judgment.

---

## 12. Conclusion

The Revision Engine provides a platform for systematic manuscript revision through human-AI collaboration. Core innovations include:

1. **Four-dimension engagement scoring** with computed aggregates enabling manuscript-level comparison

2. **Voice fingerprinting** with similarity detection preventing character convergence

3. **Heatmap visualization** revealing engagement patterns invisible to sequential reading

4. **Triage classification** converting diagnosis into prioritized revision queues

5. **Export-process-import workflow** enabling AI collaboration while maintaining editorial control

6. **Version control for prose** supporting A/B testing, incremental publishing, and rollback

Applied to *Hot Water* (218,681 words, 101 chapters), the platform enabled identification of 47 dropout zones, quantification of voice metrics across 15 characters, and systematic triage of revision priorities.

The central finding: **cognitive prosthesis narrative development**—extending human cognitive capacity for holding entire manuscripts in working memory while tracking multiple quality dimensions—makes tractable what would otherwise exceed human capability. The Revision Engine demonstrates that AI collaboration adds most value not by generating content but by generating infrastructure that makes human revision systematic at novel scale.

---

## References

[1] Clark, A. & Chalmers, D. (1998). "The Extended Mind." *Analysis*, 58(1), 7-19.

[2] McKee, R. (1997). *Story: Substance, Structure, Style, and the Principles of Screenwriting*. ReganBooks.

[3] Hamilton, M.P. (2025). "The Novelization Engine: A Methodology for AI-Augmented Long-Form Fiction Development." Canemah Nature Laboratory Technical Note CNL-TN-2025-022. https://canemah.org/archive/document.php?id=CNL-TN-2025-022

[4] Hamilton, M.P. (2025). "The Serialization Engine: A Generalized Framework for Format-Agnostic Story System Development." Canemah Nature Laboratory Technical Note CNL-TN-2025-023. https://canemah.org/archive/document.php?id=CNL-TN-2025-023

[5] Hamilton, M.P. (2025). "The Cognitive Prosthesis: Writing, Thinking, and the Observer Inside the Observation." Coffee with Claude. https://coffeewithclaude.com/post.php?slug=the-cognitive-prosthesis-writing-thinking-and-the-observer-inside-the-observation

[6] Gardner, J. (1983). *The Art of Fiction: Notes on Craft for Young Writers*. Vintage Books.

[7] Flower, L. & Hayes, J.R. (1981). "A Cognitive Process Theory of Writing." *College Composition and Communication*, 32(4), 365-387.

---

## Appendix A: Database Schema Reference

### A.1 Core Tables

```sql
-- Chapter diagnostics (aggregate metrics)
CREATE TABLE chapter_diagnostics (
    chapter_id INT PRIMARY KEY,
    scene_count INT DEFAULT 1,
    avg_engagement_score FLOAT,
    min_engagement_score TINYINT,
    max_engagement_score TINYINT,
    overall_skim_risk ENUM('low','medium','high','critical'),
    voice_blur_detected TINYINT(1) DEFAULT 0,
    total_tic_words INT DEFAULT 0,
    triage_action ENUM('keep','trim','compress','convert','merge','delete'),
    triage_priority INT,
    revision_status ENUM('pending','revised','deleted','skipped') DEFAULT 'pending',
    last_analyzed DATETIME,
    last_scored DATETIME,
    last_revised DATETIME
);

-- Scene-level analysis
CREATE TABLE scene_analysis (
    id INT PRIMARY KEY AUTO_INCREMENT,
    chapter_id INT NOT NULL,
    scene_number INT DEFAULT 1,
    word_count INT DEFAULT 0,
    timeline_strand ENUM('modern','darwin','pictish','omniscient'),
    pov_character VARCHAR(100),
    stakes TINYINT UNSIGNED DEFAULT 0,
    resistance TINYINT UNSIGNED DEFAULT 0,
    change_level TINYINT UNSIGNED DEFAULT 0,
    question_pull TINYINT UNSIGNED DEFAULT 0,
    engagement_score TINYINT UNSIGNED GENERATED ALWAYS AS (
        stakes + resistance + change_level + question_pull
    ) STORED,
    voice_distinctiveness TINYINT,
    profile_adherence TINYINT,
    triage_action ENUM('keep','trim','compress','convert','merge','delete'),
    triage_notes TEXT,
    telling_instances JSON,
    crutch_words JSON,
    dialogue_by_character JSON,
    voice_bleed JSON,
    scored_at DATETIME,
    scored_by VARCHAR(50)
);
```

### A.2 Voice Analysis Tables

```sql
-- Per-character dialogue statistics
CREATE TABLE voice_analysis_summary (
    id INT PRIMARY KEY AUTO_INCREMENT,
    character_name VARCHAR(100) NOT NULL,
    total_dialogue_words INT DEFAULT 0,
    dialogue_percentage FLOAT,
    scene_count INT DEFAULT 0,
    avg_voice_distinctiveness FLOAT,
    avg_profile_adherence FLOAT,
    avg_sentence_length FLOAT,
    updated_at DATETIME
);

-- Character pair similarity
CREATE TABLE voice_similarity (
    id INT PRIMARY KEY AUTO_INCREMENT,
    character_a VARCHAR(100) NOT NULL,
    character_b VARCHAR(100) NOT NULL,
    similarity_score FLOAT,
    shared_crutch_words TEXT,
    calculated_at DATETIME
);
```
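
A sketch of how `voice_similarity` might be queried for convergence warnings. SQLite stands in for the production MySQL database, the sample rows echo the *Hot Water* results, and the 0.6 warning threshold is an illustrative assumption:

```php
// SQLite in-memory database stands in for production MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->exec('CREATE TABLE voice_similarity (
    character_a TEXT, character_b TEXT, similarity_score REAL)');
$pdo->exec("INSERT INTO voice_similarity VALUES
    ('David', 'Susan', 0.72), ('ARCHIE', 'David', 0.31)");

// Flag character pairs above an (assumed) warning threshold.
function similarityWarnings(PDO $pdo, float $threshold = 0.6): array {
    $stmt = $pdo->prepare(
        'SELECT character_a, character_b, similarity_score
         FROM voice_similarity
         WHERE similarity_score >= ?
         ORDER BY similarity_score DESC');
    $stmt->execute([$threshold]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

print_r(similarityWarnings($pdo));  // David/Susan at 0.72 only
```

A query like this is what surfaces the David/Susan 0.72 pair reported in Section 9.2 as a voice-blur warning.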

---

## Appendix B: CLI Tool Reference

### B.1 Scoring Tool

```bash
# Score all unscored scenes with Sonnet
php score-scenes.php --model=sonnet

# Score specific chapter with Opus
php score-scenes.php --model=opus --chapter=42

# Re-score all scenes (overwrite existing)
php score-scenes.php --model=sonnet --rescore

# Score only Part 1
php score-scenes.php --model=sonnet --part=1

# Compare models on same chapter
php score-scenes.php --compare --chapter=42
```

### B.2 Voice Aggregation Tool

```bash
# Full aggregation with console report
php aggregate-voice-analysis.php

# Report only (no database update)
php aggregate-voice-analysis.php --report

# JSON output for external processing
php aggregate-voice-analysis.php --json
```

### B.3 Export Tool

```bash
# Full manuscript export
php export-revision-package.php

# Single part export
php export-revision-package.php --part=2

# Triage manifest only
php export-revision-package.php --triage-only

# Custom output filename
php export-revision-package.php --output=hot-water-v2.md
```

---

## Appendix C: Engagement Score Quick Reference

### C.1 Scoring Rubric Summary

| Dimension | 0 | 1 | 2 | 3 |
|-----------|---|---|---|---|
| Stakes | Nothing | Mild discomfort | Real consequences | Survival/identity |
| Resistance | Nothing | Internal doubt | Interpersonal | Active antagonism |
| Change | Static | Info gained | Decision made | Irreversible |
| Question Pull | Resolved | Mild curiosity | Need to know | Hooked |

### C.2 Engagement Thresholds

| Score | Classification | Color | Action |
|-------|----------------|-------|--------|
| 10-12 | Gripping | Green | Keep |
| 7-9 | Solid | Yellow | Monitor |
| 4-6 | Vulnerable | Orange | Revise |
| 0-3 | Dropout | Red | Priority |
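
The thresholds above map directly to a classification function; a minimal sketch (function name is illustrative):

```php
// Map an engagement score (0-12) to its threshold classification.
function classifyEngagement(int $score): string {
    if ($score >= 10) return 'gripping';
    if ($score >= 7)  return 'solid';
    if ($score >= 4)  return 'vulnerable';
    return 'dropout';
}

echo classifyEngagement(11);  // gripping
echo classifyEngagement(5);   // vulnerable
```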

---

## Document History

| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2026-01-25 | Initial release |

---

**End of Technical Note**

*Permanent URL: https://canemah.org/archive/document.php?id=CNL-TN-2026-010*