The Revision Engine
A Platform for Cognitive Prosthesis Narrative Development
Document ID: CNL-TN-2026-010
Version: 1.0
Date: January 25, 2026
Author: Michael P. Hamilton, Ph.D.
Derivation: Extends CNL-TN-2025-022 (The Novelization Engine)
AI Assistance Disclosure: This technical note was developed collaboratively with Claude (Anthropic, claude-opus-4-5-20250514). The platform described herein was built through iterative human-AI collaboration, with Claude contributing to architecture design, code generation, and documentation. The author takes full responsibility for the content, technical decisions, and conclusions.
Abstract
This technical note documents the Revision Engine—a web-based platform for systematic manuscript revision through human-AI collaboration. The platform extends the Novelization Engine methodology (CNL-TN-2025-022) from drafting into revision, implementing quantified diagnostic tools that transform subjective editorial intuition into actionable data. Core innovations include: a four-dimension engagement scoring system with computed aggregate metrics; voice fingerprinting and similarity detection across characters; automated dropout zone identification; visual heatmap interfaces for manuscript-level pattern recognition; and a complete export-process-import workflow for AI-assisted revision with version control. Applied to Hot Water (218,681 words across 101 chapters), the platform enabled identification of 47 dropout zones, quantification of voice blur across 15 characters, and systematic triage of revision priorities. We introduce the term "cognitive prosthesis narrative development" to describe this approach: extending human cognitive capacity for holding entire manuscripts in working memory while tracking consistency, voice, and engagement across novel-scale texts. The platform demonstrates that AI collaboration in creative work may be most valuable not for generating content but for generating diagnostic infrastructure that makes revision tractable at scale.
1. Introduction
1.1 The Revision Problem
The Novelization Engine (CNL-TN-2025-022) documented a methodology for completing long-incubated fiction through structured human-AI collaboration. That methodology addressed the drafting problem: how to synthesize accumulated creative material into a coherent manuscript. A companion problem remained unaddressed: how to revise that manuscript systematically when it exceeds human working memory capacity.
Novel-length fiction presents a cognitive challenge that intensifies during revision. A 60,000-word manuscript contains approximately 200 pages; a 200,000-word trilogy approaches 700 pages. Traditional revision approaches rely on the author's memory, supplemented by notes and multiple reading passes. These methods are vulnerable to several failure modes:
Consistency drift — The author's mental model of the story evolves during revision, introducing new contradictions while attempting to fix old ones.
Voice blur — Character voices converge toward the author's default register as revision homogenizes prose.
Local optimization — Scene-level improvements may degrade manuscript-level pacing or reader engagement patterns.
Revision fatigue — Repeated passes through the same material produce diminishing returns as the author loses fresh perspective.
The Revision Engine addresses these challenges by externalizing diagnostic functions that authors typically perform intuitively. Rather than relying on memory and instinct to identify problem areas, the platform generates quantified metrics that make manuscript-level patterns visible and tractable.
1.2 Cognitive Prosthesis Narrative Development
We introduce the term "cognitive prosthesis narrative development" to describe the approach implemented in this platform. The term draws on Andy Clark and David Chalmers' Extended Mind thesis [1]: cognitive processes extend beyond the brain when external resources are reliably available, automatically endorsed, and easily accessible.
The Revision Engine functions as a cognitive prosthesis in a specific sense: it extends the author's capacity to hold the entire manuscript in working memory while simultaneously tracking multiple dimensions of quality. No human author can maintain awareness of engagement scores, voice consistency, crutch word density, and reader state across 100+ chapters while making local revision decisions. The platform makes this possible by:
- Quantifying subjective editorial intuitions into comparable metrics
- Visualizing manuscript-level patterns that exceed human perception
- Tracking changes and their effects across revision iterations
- Exporting diagnostic context for AI-assisted revision processing
- Importing revised content with version control and staging
The cognitive load distribution model from CNL-TN-2025-022 extends to revision:
| Load Category | Human Contribution | AI Contribution | Platform Output |
|---|---|---|---|
| Quality judgment | Primary | Scoring assistance | Validated metrics |
| Pattern recognition | Manuscript knowledge | Context capacity | Diagnostic visualizations |
| Voice consistency | Ear for authenticity | Profile matching | Voice fingerprints |
| Revision execution | Editorial control | Prose generation | Staged revisions |
| Version control | Approval decisions | Tracking automation | Revision history |
1.3 Relationship to the Novelization Engine
The Novelization Engine (CNL-TN-2025-022) established documentation infrastructure for drafting: living story bible, character templates, reader state tracking, place documentation, and the eleven-component scene schema. The Revision Engine builds on this foundation by adding:
Diagnostic layer — Automated analysis tools that transform prose into quantified metrics
Visualization layer — Heatmap and dashboard interfaces for manuscript-level pattern recognition
Triage layer — Classification systems that convert diagnosis into actionable revision priorities
Workflow layer — Export/import pipeline for AI-assisted revision with version control
The Serialization Engine (CNL-TN-2025-023) subsequently demonstrated that this combined infrastructure produces format-agnostic story systems rather than format-specific manuscripts. The three documents form a trilogy:
- Novelization Engine — How to draft (methodology)
- Revision Engine — How to revise (platform)
- Serialization Engine — What the methodology produces (theory)
1.4 Scope
This technical note covers:
- Theoretical foundations for quantified revision
- Database architecture and schema design
- The four-dimension engagement scoring system
- Voice analysis and fingerprinting
- Heatmap visualization and dropout zone detection
- The triage classification system
- Export-process-import workflow for AI collaboration
- Results from application to Hot Water
- Implications for creative AI applications
2. Theoretical Framework
2.1 The Quantification Principle
Traditional editorial feedback operates in qualitative registers: "this scene feels slow," "the pacing drags in the middle," "these characters sound too similar." Such feedback identifies problems but provides limited guidance for systematic repair. The author must translate qualitative intuition into specific revision decisions through trial and error.
The Revision Engine implements a quantification principle: every qualitative editorial intuition can be decomposed into measurable dimensions that, aggregated, approximate the intuition's signal. "This scene feels slow" might decompose into:
- Low stakes (nothing at risk)
- Low resistance (no conflict)
- Low change (static situation)
- Low question pull (no reason to continue)
Each dimension becomes a 0-3 scale. The aggregate (0-12) provides a comparable engagement score across all scenes in the manuscript. Scenes scoring below threshold become visible revision targets.
This approach does not claim that numbers capture the full complexity of literary quality. Rather, it claims that quantified proxies are useful for identifying patterns that exceed human working memory. The author retains full editorial judgment; the platform surfaces candidates for that judgment to evaluate.
2.2 The Heatmap Metaphor
Thermal imaging reveals temperature patterns invisible to the naked eye. A building inspector uses infrared cameras to identify heat loss through insulation failures. The patterns exist; the technology makes them visible.
The manuscript heatmap applies this metaphor to engagement patterns. Each chapter receives a color-coded cell based on diagnostic metrics. Viewed individually, any chapter might seem acceptable. Viewed as a heatmap, patterns emerge:
- Consecutive red cells indicate dropout zones where readers will quit
- Yellow clusters reveal pacing problems spanning multiple chapters
- Voice scores trending downward suggest character convergence
- Crutch word density peaks identify prose requiring attention
The heatmap makes visible what the author cannot perceive through sequential reading: the manuscript's engagement topography.
2.3 The Triage Model
Emergency medicine developed triage to allocate limited resources efficiently: identify patients who will survive without intervention, patients who cannot be saved, and patients where intervention matters most. Resources flow to the third category.
Manuscript revision benefits from similar prioritization. Not all chapters require equal attention:
KEEP — Scene works. Intervention unnecessary.
TRIM — Cut 20-40%. Specific problems identifiable and fixable.
COMPRESS — Major reduction (50%+). Preserve only essential beats.
CONVERT — Format change required (journal → scene, summary → action).
MERGE — Combine with adjacent material.
DELETE — Remove entirely; relocate essential plot information.
Triage classification emerges from diagnostic data: engagement scores, voice metrics, crutch word density, and structural analysis. The classification provides actionable guidance that sequential reading cannot.
2.4 The Export-Process-Import Cycle
AI language models excel at prose transformation within specified parameters but lack persistent memory across sessions. The Revision Engine addresses this limitation through a structured workflow:
Export — Generate a revision package containing:
- Complete manuscript text with chapter boundaries
- Character voice profiles with vocabulary signatures
- Engagement scores and triage classifications
- Crutch word alerts and revision guidelines
- Voice analysis summary with similarity warnings
Process — Submit to large-context AI (Gemini 1M, Claude) for targeted revision following embedded guidelines
Import — Paste revised content into staging system with:
- Side-by-side diff view against original
- Word count tracking (original → revised → delta)
- Revision history logging
- Publish/unpublish toggle for A/B comparison
This cycle enables AI collaboration at novel scale while maintaining human editorial control over all revision decisions.
3. Platform Architecture
3.1 Technology Stack
The Revision Engine is implemented as a PHP/MySQL web application:
- Server: Apache 2.4 on macOS (Mac Mini M4 Pro)
- Database: MySQL 8.4 with InnoDB storage engine
- Backend: PHP 8.3 with mysqli (no PDO abstraction)
- Frontend: Vanilla HTML/CSS/JavaScript (no framework dependencies)
- CLI Tools: PHP scripts for batch operations and API integration
Design principles:
- Simplicity — No framework overhead; readable, maintainable code
- Direct database access — mysqli prepared statements; phpMyAdmin for administration
- File-based content — Markdown import/export; version-controlled externally
- Progressive enhancement — Core functionality works without JavaScript
3.2 Database Schema Overview
The schema comprises 23 tables organized into functional groups:
Content Tables
- `parts` — Trilogy volumes (SIGNAL, CHRONICLE, ANCESTOR)
- `chapters` — Individual chapters with content, metadata, and revision staging
- `images` — Artwork and illustrations
- `chapter_assets` — Image-chapter associations
Diagnostic Tables
- `chapter_diagnostics` — Aggregate metrics per chapter
- `scene_analysis` — Scene-level engagement scores and voice metrics
- `tic_word_config` — Configurable crutch word detection
- `tic_word_occurrences` — Per-chapter crutch word counts
- `crutch_word_totals` — Manuscript-level aggregates
- `dropout_zones` — Detected reader abandonment risk areas
Voice Analysis Tables
- `voice_fingerprints` — Character voice signatures per chapter
- `voice_analysis_summary` — Character-level dialogue statistics
- `voice_similarity` — Pairwise voice confusion detection
- `chapter_prose_summary` — Telling/showing analysis per chapter
Revision Tables
- `revision_history` — Complete revision audit trail
- `chapters.revised_content` — Staging field for pending revisions
- `chapters.show_revised` — Toggle for A/B content display
Analytics Tables
- `chapter_views` — Privacy-respecting read tracking (IP hashed)
- `subscribers` — Email list with double opt-in
- `downloads` — EPUB/PDF generation tracking
3.3 Key Schema Innovations
Computed Engagement Score
The scene_analysis table uses a MySQL generated column for engagement scoring:
engagement_score TINYINT UNSIGNED GENERATED ALWAYS AS (
stakes + resistance + change_level + question_pull
) STORED
This ensures score consistency: changing any component automatically updates the aggregate. The STORED attribute enables indexing for efficient sorting and filtering.
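In context, a minimal version of the table might look like this (a sketch; only the scoring columns are shown, and the surrounding column names are illustrative rather than the platform's exact schema):

```sql
-- Sketch of scene_analysis with the generated engagement column.
CREATE TABLE scene_analysis (
  id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  chapter_id INT UNSIGNED NOT NULL,
  stakes TINYINT UNSIGNED NOT NULL DEFAULT 0,
  resistance TINYINT UNSIGNED NOT NULL DEFAULT 0,
  change_level TINYINT UNSIGNED NOT NULL DEFAULT 0,
  question_pull TINYINT UNSIGNED NOT NULL DEFAULT 0,
  engagement_score TINYINT UNSIGNED GENERATED ALWAYS AS (
    stakes + resistance + change_level + question_pull
  ) STORED,
  INDEX idx_engagement (engagement_score)  -- STORED permits indexing
);
```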
Revision Staging
The chapters table implements a dual-content architecture:
content LONGTEXT, -- Original/current content
revised_content LONGTEXT, -- Staged revision (pending)
show_revised TINYINT(1) -- Display toggle (0=original, 1=revised)
This enables:
- Non-destructive revision (original preserved)
- A/B comparison by toggling `show_revised`
- Incremental publishing (some chapters revised, others not)
- Rollback capability (clear `revised_content`, set `show_revised = 0`)
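Serving the active version then reduces to a single expression. A sketch of the display query (column list abbreviated):

```sql
-- Serve the staged revision when published, else the original.
SELECT id, title,
       CASE WHEN show_revised = 1 THEN revised_content ELSE content END AS body
FROM chapters
WHERE id = ?;
```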
JSON-Stored Analysis Data
Complex analysis results use JSON columns for flexibility:
telling_instances JSON, -- Array of {text, paragraph}
crutch_words JSON, -- Object {word: count}
dialogue_by_character JSON, -- Object {character: word_count}
voice_bleed JSON, -- Array of {text, paragraph, expected_voice}
JSON storage enables schema evolution without migration and efficient storage of variable-structure data.
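MySQL 8's JSON functions keep these columns queryable despite their variable structure. For example, ranking chapters by the count of one crutch word (a sketch; "just" is a hypothetical tracked word, and the path assumes the `{word: count}` shape above):

```sql
-- Pull the per-chapter count for one crutch word out of the JSON object.
SELECT chapter_id,
       JSON_EXTRACT(crutch_words, '$.just') AS just_count
FROM scene_analysis
WHERE JSON_EXTRACT(crutch_words, '$.just') > 10
ORDER BY just_count DESC;
```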
Database Views for Common Queries
Four views simplify common access patterns:
v_chapter_heatmap -- Joins chapters, parts, diagnostics for heatmap display
v_triage_queue -- Chapters requiring revision, sorted by priority
v_revision_progress -- Per-part revision completion statistics
v_dropout_zones_active -- Unresolved dropout zones with chapter titles
4. The Engagement Scoring System
4.1 Theoretical Basis
Reader engagement in fiction correlates with specific narrative qualities identifiable at scene level. Drawing on Robert McKee's Story [2] and craft knowledge accumulated through the Novelization Engine collaboration, we identify four orthogonal dimensions:
Stakes — What is at risk if things go wrong?
Resistance — What pushes back against what the character wants?
Change — How different is the situation at the end versus the beginning?
Question Pull — Does the scene ending compel the reader to continue?
Each dimension operates independently: a scene can have high stakes but low resistance, or high change but low question pull. The four-dimension model captures distinct failure modes invisible to single-metric approaches.
4.2 Scoring Rubrics
Each dimension uses a 0-3 scale with explicit anchors:
STAKES (S)

| Score | Definition | Example |
|---|---|---|
| 0 | Nothing at stake | Characters chatting, pure exposition |
| 1 | Mild discomfort | Awkward social moment, minor inconvenience |
| 2 | Real consequences | Relationship damage, reputation at risk |
| 3 | Survival or identity | Physical danger, existential threat |
RESISTANCE (R)

| Score | Definition | Example |
|---|---|---|
| 0 | Nothing pushes back | Character gets what they want easily |
| 1 | Internal doubt | Self-questioning, hesitation |
| 2 | Interpersonal conflict | Disagreement, competing goals |
| 3 | Active antagonism | Direct opposition, blocked path |
CHANGE (C)

| Score | Definition | Example |
|---|---|---|
| 0 | Static | Same mental/physical state as start |
| 1 | Information gained | Character learns something new |
| 2 | Decision made | Character commits to a path |
| 3 | Irreversible shift | Point of no return, permanent change |
QUESTION PULL (Q)

| Score | Definition | Example |
|---|---|---|
| 0 | Resolved | Natural stopping point, reader satisfied |
| 1 | Mild curiosity | Slight interest in what happens next |
| 2 | Need to know | Unanswered question that nags |
| 3 | Hooked | Cannot stop reading, must continue |
4.3 Engagement Thresholds
Aggregate scores (0-12) map to reader engagement risk levels:
| Score Range | Classification | Reader Behavior |
|---|---|---|
| 10-12 | Gripping | Cannot put down |
| 7-9 | Solid | Engaged, will continue |
| 4-6 | Vulnerable | At risk of skimming |
| 0-3 | Dropout | Reader will quit |
The minimum viable scene target is 8/12, requiring at least moderate engagement across all four dimensions.
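The banding above is mechanical enough to express as a helper. A minimal sketch (the function name is illustrative, not from the platform; bands follow the table above):

```php
// Map a 0-12 aggregate engagement score to its risk classification.
// Illustrative helper; the platform stores component scores in
// scene_analysis and derives the aggregate via a generated column.
function classifyEngagement(int $score): string {
    if ($score >= 10) return 'gripping';    // 10-12: cannot put down
    if ($score >= 7)  return 'solid';       // 7-9: engaged, will continue
    if ($score >= 4)  return 'vulnerable';  // 4-6: at risk of skimming
    return 'dropout';                       // 0-3: reader will quit
}
```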
4.4 Scoring Implementation
Engagement scoring operates through two complementary mechanisms:
Manual Scoring (Scene Audit Interface)
The scene-audit.php interface presents chapter content alongside scoring controls. Human scorers evaluate each dimension against the rubric, with scores saved to scene_analysis. This approach provides highest accuracy but requires significant time investment.
AI-Assisted Scoring (CLI Tool)
The score-scenes.php CLI tool submits chapters to Claude API with embedded scoring rubrics and voice profiles. The AI returns JSON-structured scores that populate scene_analysis. This approach enables batch processing of entire manuscripts.
php score-scenes.php --model=sonnet # Score all unscored
php score-scenes.php --model=opus --chapter=42 # Score specific chapter
php score-scenes.php --model=sonnet --rescore # Re-score everything
AI scoring includes linguistic analysis beyond engagement:
- Voice distinctiveness (1-5)
- Profile adherence (1-5)
- Telling instances with locations
- Crutch word counts
- Dialogue distribution by character
- Round-robin monologue detection
4.5 Score Aggregation
Chapter-level diagnostics aggregate from scene-level scores:
UPDATE chapter_diagnostics SET
avg_engagement_score = (SELECT AVG(engagement_score) FROM scene_analysis WHERE chapter_id = ?),
min_engagement_score = (SELECT MIN(engagement_score) FROM scene_analysis WHERE chapter_id = ?),
max_engagement_score = (SELECT MAX(engagement_score) FROM scene_analysis WHERE chapter_id = ?)
WHERE chapter_id = ?
The avg_engagement_score drives heatmap coloring; min_engagement_score identifies chapters with weak sections even if average is acceptable.
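The three correlated subqueries can also be collapsed into a single pass over `scene_analysis`. An equivalent sketch using a derived table:

```sql
-- One-pass aggregation: compute avg/min/max per chapter, then join.
UPDATE chapter_diagnostics cd
JOIN (
  SELECT chapter_id,
         AVG(engagement_score) AS avg_s,
         MIN(engagement_score) AS min_s,
         MAX(engagement_score) AS max_s
  FROM scene_analysis
  GROUP BY chapter_id
) s ON s.chapter_id = cd.chapter_id
SET cd.avg_engagement_score = s.avg_s,
    cd.min_engagement_score = s.min_s,
    cd.max_engagement_score = s.max_s;
```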
5. Voice Analysis System
5.1 The Voice Blur Problem
Character voice is among the most difficult qualities to maintain across novel-length fiction. Each character should sound distinct through vocabulary choices, sentence patterns, and metaphor domains. In practice, characters often converge toward the author's default voice, particularly during revision when the author is focused on other concerns.
The Hot Water manuscript presents acute voice challenges:
- Six primary POV characters (Amara, David, Susan, Margaret, Jennifer, Starseed)
- Three temporal strands (modern, Darwin 1830s, Pictish 570 CE)
- Technical vocabularies spanning physics, geology, biology, engineering, archaeology
- 15+ speaking characters requiring distinct voices
5.2 Voice Profiles
Each major character receives a documented voice profile specifying:
Vocabulary Signature — Domain-specific terms the character naturally uses:
- Amara: tolerances, load-bearing, thermal gradient, structural integrity
- David: coherence, eigenstate, superposition, probability amplitude
- Susan: taxonomy, phylogeny, adaptation, ecological niche
- Margaret: crystalline matrix, stratification, metamorphic, grain boundaries
Sentence Patterns — Characteristic syntactic structures:
- Amara: Direct and declarative, precise but warm
- David: Short questioning sentences, trails off when confused
- Susan: Patient observation building to insight
- Margaret: Scottish rhythm, unhurried geological perspective
Metaphor Domains — Where the character draws comparisons:
- Amara: Architecture, materials science, electrical systems
- David: Physics, mathematics, uncertainty
- Susan: Evolution, organic systems, deep time
- Margaret: Rocks, minerals, slow processes
5.3 Voice Metrics
The scoring system evaluates voice quality on two dimensions:
Voice Distinctiveness (1-5) — Does the POV character sound distinct from other characters?

| Score | Definition |
|---|---|
| 1 | Indistinguishable from other characters |
| 2 | Occasional distinctive moments |
| 3 | Moderately distinct voice |
| 4 | Clearly recognizable voice |
| 5 | Unmistakably unique |
Profile Adherence (1-5) — Does the character use vocabulary and metaphors from their profile?

| Score | Definition |
|---|---|
| 1 | None of expected vocabulary present |
| 2 | Rare use of profile vocabulary |
| 3 | Moderate use of profile vocabulary |
| 4 | Strong use of profile vocabulary |
| 5 | Vocabulary fully consistent with profile |
5.4 Voice Analysis Aggregation
The aggregate-voice-analysis.php CLI tool processes scored scenes to generate manuscript-level voice statistics:
Dialogue Distribution — Word count and percentage by character:
| Character | Words | % | Distinctiveness | Adherence |
|-----------|---------|-------|-----------------|-----------|
| David | 19,581 | 21.1% | 2.8 | 2.6 |
| Susan | 12,226 | 13.2% | 2.7 | 2.5 |
| Margaret | 11,538 | 12.5% | 3.6 | 3.4 |
Voice Similarity Warnings — Character pairs with high confusion risk:
| Character A | Character B | Similarity | Shared Crutch Words |
|-------------|-------------|------------|---------------------|
| David | Susan | 0.72 | pattern, structure |
Prose Problem Scores — Chapters ranked by telling constructions, restatements, and voice issues
5.5 Voice Bleed Detection
AI scoring identifies specific instances where POV characters use vocabulary belonging to other characters:
"voice_bleed": [
{
"text": "crystalline matrix",
"paragraph": 12,
"expected_voice": "Margaret"
}
]
When Susan (biologist) thinks in geological vocabulary, the scene needs revision to restore voice integrity.
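A naive lexical version of this check can be sketched in a few lines (hypothetical helper; the platform's detection runs inside the AI scoring prompt, and paragraph locations are omitted here). It assumes vocabulary signatures keyed by character, as in §5.2:

```php
// Flag occurrences of other characters' signature vocabulary in a
// POV character's text. Sketch only; substring matching is cruder
// than the model-based detection the platform actually uses.
function detectVoiceBleed(string $text, string $povCharacter, array $signatures): array {
    $bleed = [];
    foreach ($signatures as $character => $terms) {
        if ($character === $povCharacter) continue; // own vocabulary is fine
        foreach ($terms as $term) {
            if (stripos($text, $term) !== false) {
                $bleed[] = ['text' => $term, 'expected_voice' => $character];
            }
        }
    }
    return $bleed;
}
```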
6. Heatmap Visualization
6.1 Design Principles
The heatmap interface (heatmap.php) transforms diagnostic data into a visual representation enabling manuscript-level pattern recognition. Design principles:
Color Psychology — Engagement maps to intuitive thermal scale:
- Green (10-12): Cool/safe, no intervention needed
- Yellow (7-9): Warm/caution, monitor but acceptable
- Orange (4-6): Hot/warning, revision candidate
- Red (0-3): Critical/danger, priority revision target
Information Density — Each cell encodes multiple dimensions:
- Background color: Engagement score
- Text: Chapter title (truncated)
- Badge: Triage action (if assigned)
- Icon: Revision status (pending/revised/published)
Spatial Organization — Chapters ordered by reading sequence:
- Grouped by Part (SIGNAL, CHRONICLE, ANCESTOR)
- Sorted by display_order within Part
- Visual breaks between Parts
6.2 Heatmap Data Structure
The v_chapter_heatmap view joins necessary tables:
CREATE VIEW v_chapter_heatmap AS
SELECT
c.id, c.title, c.chapter_type, c.word_count, c.display_order,
p.id AS part_id, p.name AS part_name, p.display_order AS part_order,
COALESCE(cd.avg_engagement_score, 0) AS engagement_score,
COALESCE(cd.overall_skim_risk, 'medium') AS skim_risk,
COALESCE(cd.avg_exposition_density, 0) AS exposition_density,
COALESCE(cd.total_tic_words, 0) AS tic_words,
cd.triage_action, cd.triage_priority
FROM chapters c
JOIN parts p ON c.part_id = p.id
LEFT JOIN chapter_diagnostics cd ON c.id = cd.chapter_id
ORDER BY p.display_order, c.display_order
6.3 Dropout Zone Detection
The system automatically identifies contiguous sequences of low-engagement chapters:
function detectDropoutZones($conn, $threshold = 5, $minLength = 2) {
// Query chapters ordered by position
// Identify sequences where avg_engagement_score < threshold
// Return zones with: start_chapter, end_chapter, scene_count,
// avg_score, diagnosis, recommendation
}
Detected zones are classified by severity:
- Warning: 2 consecutive low-engagement chapters
- Danger: 3-4 consecutive low-engagement chapters
- Critical: 5+ consecutive low-engagement chapters
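The scan itself is a straightforward run-length pass. A minimal in-memory sketch (assuming scores have already been fetched in reading order; thresholds and severity bands as above, helper names illustrative):

```php
// Scan ordered chapter scores for contiguous runs below the engagement
// threshold and classify each run's severity. DB fetch omitted.
function findDropoutZones(array $scores, float $threshold = 5, int $minLength = 2): array {
    $zones = [];
    $run = [];
    foreach ($scores as $i => $score) {
        if ($score < $threshold) {
            $run[] = $i;            // extend the current low-engagement run
            continue;
        }
        if (count($run) >= $minLength) $zones[] = classifyZone($run);
        $run = [];                   // run broken by an acceptable chapter
    }
    if (count($run) >= $minLength) $zones[] = classifyZone($run);
    return $zones;
}

function classifyZone(array $run): array {
    $n = count($run);
    $severity = $n >= 5 ? 'critical' : ($n >= 3 ? 'danger' : 'warning');
    return ['start' => $run[0], 'end' => end($run), 'severity' => $severity];
}
```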
Zone types capture specific patterns:
- `consecutive_low_engagement`: Generic dropout risk
- `exposition_cluster`: Multiple explanation-heavy chapters
- `voice_blur`: Character distinctiveness declining
- `pacing_stall`: Static scenes without change
- `journal_sequence`: Multiple document-type chapters
6.4 Statistics Dashboard
The heatmap page displays aggregate statistics:
Total Chapters: 101
Total Words: 218,681
Scored: 89/101 (88%)
Avg Engagement: 6.4/12
Dropout Zones: 4 active
Revision Progress:
- Revised: 23
- Published: 18
- Pending: 60
These metrics provide manuscript health indicators at a glance.
7. Triage System
7.1 Triage Actions
Six triage actions classify revision requirements:
| Action | Definition | Target Reduction |
|---|---|---|
| KEEP | Scene works; preserve structure and length | 0% |
| TRIM | Cut without losing essential content | 20-40% |
| COMPRESS | Major reduction; preserve only key beats | 50%+ |
| CONVERT | Change format entirely | Variable |
| MERGE | Combine with adjacent material | Consolidation |
| DELETE | Remove entirely; relocate plot info | 100% |
7.2 Triage Assignment
Triage actions can be assigned through:
Manual Assignment — Scene audit interface includes triage dropdown and notes field
AI-Assisted Assignment — Scoring prompt requests triage recommendation based on engagement metrics and prose analysis
Algorithmic Rules — Automated assignment based on thresholds:
if ($engagement_score <= 3) {
$triage_action = 'delete';
} elseif ($engagement_score <= 5 && $exposition_density > 0.3) {
$triage_action = 'compress';
} elseif ($tic_word_count > 20) {
$triage_action = 'trim';
}
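Wrapped into a function with an explicit fallback, the rules read as follows (a sketch using the thresholds above; the platform combines this with manual and AI-assisted assignment):

```php
// Apply the threshold rules above, falling back to 'keep' when no
// rule fires. Sketch only; real assignments also weigh triage notes.
function assignTriage(int $engagement, float $expositionDensity, int $ticWords): string {
    if ($engagement <= 3) return 'delete';
    if ($engagement <= 5 && $expositionDensity > 0.3) return 'compress';
    if ($ticWords > 20) return 'trim';
    return 'keep';
}
```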
7.3 Triage Queue
The v_triage_queue view surfaces chapters requiring attention:
CREATE VIEW v_triage_queue AS
SELECT c.id, c.title, c.word_count, p.name AS part_name,
cd.avg_engagement_score, cd.triage_action, cd.triage_priority
FROM chapters c
JOIN parts p ON c.part_id = p.id
LEFT JOIN chapter_diagnostics cd ON c.id = cd.chapter_id
WHERE cd.triage_action IS NOT NULL
AND cd.triage_action != 'keep'
ORDER BY cd.triage_priority ASC, cd.avg_engagement_score ASC
The queue prioritizes:
- Critical triage actions (delete, compress)
- Lower engagement scores within action category
- Earlier chapters (reader reaches them first)
7.4 Triage Notes
Each triage assignment includes explanatory notes:
Scene 42 (triage: COMPRESS): "Three consecutive monologues without
conflict. David and Susan explain the same discovery to each other
that readers already understand. Keep only Margaret's reaction and
the final decision to proceed."
Notes provide revision guidance beyond the action classification.
8. Export-Process-Import Workflow
8.1 Revision Package Export
The export-revision-package.php CLI tool generates a comprehensive revision package:
php export-revision-package.php # Full export
php export-revision-package.php --part=2 # Part 2 only
php export-revision-package.php --triage-only # Just manifest
The package includes:
Header Section
- Generation timestamp
- Total chapters and word count
- Export parameters
Voice Differentiation Section
- Complete voice profiles for all POV characters
- Vocabulary signatures
- Sentence patterns
- Metaphor domains
Crutch Word Alert
- High-frequency terms (100+ occurrences)
- Medium-frequency terms (25-99 occurrences)
- Phrase patterns to eliminate
Revision Guidelines
- Triage action definitions
- Common problems to fix (with detection flags)
- Engagement score targets
- Voice score targets
Voice Analysis Summary
- Dialogue distribution by character
- Voice similarity warnings
- Chapters with worst prose problem scores
Triage Manifest
- Per-chapter: engagement score, triage action, triage notes, voice scores
Full Manuscript Content
- Chapter headers with metadata
- Complete text in Markdown format
- Scene boundaries where applicable
8.2 AI Processing
The revision package is designed for large-context language models:
Gemini 1M Context — Can process entire trilogy (218K words) plus guidelines
Claude with Projects — Revision guidelines as project knowledge; chapters processed individually or in batches
The embedded guidelines provide the AI with:
- What to fix (triage actions)
- How to fix it (revision guidelines)
- What to preserve (voice profiles)
- What to eliminate (crutch words)
8.3 Revision Import
The revision-import.php interface handles revised content:
Chapter Selection
- Dropdown listing all chapters with triage status
- Color-coded by revision state (pending/revised/published)
Side-by-Side View
- Original content (left panel)
- Revised content or input area (right panel)
- Rendered Markdown for readability
Word Count Tracking
- Original word count
- Revised word count
- Delta (positive or negative)
- Real-time calculation as content is pasted
Revision Actions
- Save to Staging — Store in `revised_content` without publishing
- Publish Revision — Set `show_revised = 1` to display revised version
- Unpublish — Revert to original (`show_revised = 0`)
- Clear Revision — Delete staged content entirely
Revision History
- Timestamped log of all revisions
- Original and revised content preserved
- Word delta tracked
- Notes field for revision description
8.4 Version Control for Prose
The staging system enables sophisticated version control:
A/B Testing — Toggle show_revised to compare reader engagement between original and revised versions
Incremental Publishing — Some chapters can show revised content while others remain original
Rollback Capability — Original content always preserved; can revert any chapter
Audit Trail — Complete history of revisions with timestamps and word deltas
9. Results
9.1 Application to Hot Water
The Revision Engine was developed iteratively alongside the Hot Water manuscript:
| Metric | Value |
|---|---|
| Total chapters | 101 |
| Total words | 218,681 |
| Parts | 3 (SIGNAL, CHRONICLE, ANCESTOR) |
| Chapters scored | 89 (88%) |
| Average engagement | 6.4/12 |
| Dropout zones detected | 47 |
| Characters with voice data | 15 |
| Revision history entries | 156 |
9.2 Diagnostic Findings
Engagement Distribution
- Gripping (10-12): 12 chapters (12%)
- Solid (7-9): 34 chapters (34%)
- Vulnerable (4-6): 31 chapters (31%)
- Dropout (0-3): 12 chapters (12%)
- Unscored: 12 chapters (12%)
Triage Classification
- KEEP: 23 chapters
- TRIM: 31 chapters
- COMPRESS: 18 chapters
- CONVERT: 8 chapters
- DELETE: 4 chapters
- Unassigned: 17 chapters
Voice Analysis
- Highest distinctiveness: ARCHIE (AI character), 4.0/5
- Lowest distinctiveness: David, 2.8/5
- Highest similarity pair: David/Susan, 0.72
9.3 Pattern Discovery
The heatmap revealed patterns invisible to sequential reading:
Journal Cluster Problem — Four consecutive journal chapters in Part 2 created a documentation slog. Triage: convert two to dramatized scenes, merge two others.
Darwin Interlude Pacing — Darwin sections consistently scored higher engagement than modern timeline. The contrast highlighted modern sections needing more conflict.
Voice Convergence in Act 3 — Distinctiveness scores declined in final chapters as revision pressure increased. Systematic voice restoration required.
Exposition Front-Loading — First chapters of each Part scored lower due to setup exposition. Structural revision to embed exposition in conflict.
9.4 Revision Workflow Results
The export-process-import workflow enabled:
- Batch processing of 12 chapters in single Gemini session
- Consistent application of voice profiles across revisions
- Word count reduction of 23% in targeted chapters
- Voice distinctiveness improvement averaging 0.8 points post-revision
10. Implementation Details
10.1 Text Analysis Functions
The heatmap_functions.php library provides core analysis utilities:
Exposition Density
function calcExpositionDensity($text) {
    // Explanatory connectives that signal summary/telling prose
    $markers = ['which means', 'in other words', 'because',
                'therefore', 'essentially', 'basically'];
    $sentenceCount = countSentences($text); // helper defined elsewhere in the library
    if ($sentenceCount === 0) {
        return 0.0; // guard against division by zero on empty input
    }
    $markerCount = 0;
    foreach ($markers as $marker) {
        $markerCount += substr_count(strtolower($text), $marker);
    }
    return round($markerCount / $sentenceCount, 3);
}
Dialogue Ratio
function calcDialogueRatio($text) {
    // Extract content within straight or curly quotation marks
    preg_match_all('/["“]([^"“”]+)["”]/u', $text, $matches);
    $dialogueWords = str_word_count(implode(' ', $matches[1]));
    $totalWords = str_word_count(strip_tags($text));
    if ($totalWords === 0) {
        return 0.0; // guard against division by zero on empty input
    }
    return round($dialogueWords / $totalWords, 3);
}
Tic Word Detection
function countTicWords($text, $ticWords) {
    $results = [];
    foreach ($ticWords as $word) {
        // Whole-word, case-insensitive match
        $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
        preg_match_all($pattern, $text, $matches);
        if (count($matches[0]) > 0) {
            $results[$word] = count($matches[0]);
        }
    }
    return $results;
}
10.2 API Integration
The score-scenes.php tool integrates with Claude API:
function scoreChapterWithAPI($content, $model, $voiceProfiles) {
    $prompt = buildScoringPrompt($content, $voiceProfiles);
    $response = callClaudeAPI([
        'model'      => MODELS[$model],
        'max_tokens' => MAX_TOKENS,
        'messages'   => [
            ['role' => 'user', 'content' => $prompt]
        ]
    ]);
    return parseJSONResponse($response);
}
The prompt includes:
- Voice profiles for all characters
- Scoring rubrics with explicit anchors
- Linguistic analysis instructions
- JSON output format specification
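The four components above can be assembled mechanically. A sketch of what `buildScoringPrompt()` might look like; the function name comes from the code above, but the body and rubric wording here are illustrative assumptions, not the platform's actual prompt:

```php
// Sketch of buildScoringPrompt(): concatenates voice profiles, rubric,
// analysis instructions, and output-format spec ahead of the chapter text.
// The rubric and instruction wording below is illustrative only.
function buildScoringPrompt(string $content, array $voiceProfiles): string {
    $profileBlock = '';
    foreach ($voiceProfiles as $name => $profile) {
        $profileBlock .= "## $name\n$profile\n\n";
    }
    return "# Voice Profiles\n" . $profileBlock
         . "# Scoring Rubric\n"
         . "Score stakes, resistance, change_level, question_pull on 0-3 anchors.\n"
         . "# Linguistic Analysis\n"
         . "Report tic words, telling instances, and dialogue by character.\n"
         . "# Output\n"
         . "Return a single JSON object containing the fields above.\n"
         . "# Chapter Text\n" . $content;
}
```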
10.3 Privacy-Respecting Analytics
Reader engagement tracking uses IP hashing:
function hashIP($ip) {
    return hash('sha256', $ip . date('Y-m')); // Monthly rotation
}
This enables:
- Unique visitor counting without storing raw IPs
- Reading-pattern analysis without individual tracking
- Tracking duration bounded to one month by hash rotation
- No storage of personally identifiable information
11. Discussion
11.1 Cognitive Prosthesis Effectiveness
The Revision Engine demonstrates that systematic diagnostic infrastructure extends human cognitive capacity for manuscript revision. Specific benefits:
Pattern Visibility — The heatmap reveals engagement topography invisible to sequential reading. Authors can see the manuscript as readers experience it rather than as they wrote it.
Quantified Prioritization — Triage classification converts intuitive "this needs work" into actionable priority queues. Limited revision time flows to highest-impact chapters.
Voice Maintenance — Profile-based voice analysis catches convergence before it becomes endemic. Revision can strengthen distinctiveness rather than homogenize further.
Systematic Workflow — The export-process-import cycle enables AI collaboration at scale while maintaining human editorial control. The author directs; the system executes.
11.2 Limitations
Single-Author Development — The platform was built for a specific author's workflow. Generalization to other authors remains untested.
Scoring Subjectivity — Engagement dimensions are proxies for reader experience. Scores reflect systematic approximation, not ground truth.
AI Dependency — Batch scoring requires API access and incurs costs. Voice analysis quality depends on model capability.
Integration Overhead — The full platform requires MySQL database, PHP server, and CLI access. Simpler tools might serve authors with different technical backgrounds.
11.3 Implications for Creative AI
The Revision Engine suggests that AI collaboration in creative work may be most valuable for infrastructure generation rather than content generation:
- Diagnostic systems that quantify editorial intuition
- Visualization tools that reveal manuscript-level patterns
- Workflow automation that maintains human control
- Version control that enables experimentation without risk
This complements the Novelization Engine finding that AI collaboration produces format-agnostic story systems. Together, the engines demonstrate a paradigm: AI as cognitive prosthesis for creative work, extending human capacity rather than replacing human judgment.
12. Conclusion
The Revision Engine provides a platform for systematic manuscript revision through human-AI collaboration. Core innovations include:
- Four-dimension engagement scoring with computed aggregates enabling manuscript-level comparison
- Voice fingerprinting with similarity detection preventing character convergence
- Heatmap visualization revealing engagement patterns invisible to sequential reading
- Triage classification converting diagnosis into prioritized revision queues
- Export-process-import workflow enabling AI collaboration while maintaining editorial control
- Version control for prose supporting A/B testing, incremental publishing, and rollback
Applied to Hot Water (218,681 words, 101 chapters), the platform enabled identification of 47 dropout zones, quantification of voice metrics across 15 characters, and systematic triage of revision priorities.
The central finding: cognitive prosthesis narrative development—extending human cognitive capacity for holding entire manuscripts in working memory while tracking multiple quality dimensions—makes tractable what would otherwise exceed human capability. The Revision Engine demonstrates that AI collaboration adds most value not by generating content but by generating infrastructure that makes human revision systematic at novel scale.
References
[1] Clark, A. & Chalmers, D. (1998). "The Extended Mind." Analysis, 58(1), 7-19.
[2] McKee, R. (1997). Story: Substance, Structure, Style, and the Principles of Screenwriting. ReganBooks.
[3] Hamilton, M.P. (2025). "The Novelization Engine: A Methodology for AI-Augmented Long-Form Fiction Development." Canemah Nature Laboratory Technical Note CNL-TN-2025-022. https://canemah.org/archive/document.php?id=CNL-TN-2025-022
[4] Hamilton, M.P. (2025). "The Serialization Engine: A Generalized Framework for Format-Agnostic Story System Development." Canemah Nature Laboratory Technical Note CNL-TN-2025-023. https://canemah.org/archive/document.php?id=CNL-TN-2025-023
[5] Hamilton, M.P. (2025). "The Cognitive Prosthesis: Writing, Thinking, and the Observer Inside the Observation." Coffee with Claude. https://coffeewithclaude.com/post.php?slug=the-cognitive-prosthesis-writing-thinking-and-the-observer-inside-the-observation
[6] Gardner, J. (1983). The Art of Fiction: Notes on Craft for Young Writers. Vintage Books.
[7] Flower, L. & Hayes, J.R. (1981). "A Cognitive Process Theory of Writing." College Composition and Communication, 32(4), 365-387.
Appendix A: Database Schema Reference
A.1 Core Tables
-- Chapter diagnostics (aggregate metrics)
CREATE TABLE chapter_diagnostics (
    chapter_id INT PRIMARY KEY,
    scene_count INT DEFAULT 1,
    avg_engagement_score FLOAT,
    min_engagement_score TINYINT,
    max_engagement_score TINYINT,
    overall_skim_risk ENUM('low','medium','high','critical'),
    voice_blur_detected TINYINT(1) DEFAULT 0,
    total_tic_words INT DEFAULT 0,
    triage_action ENUM('keep','trim','compress','convert','merge','delete'),
    triage_priority INT,
    revision_status ENUM('pending','revised','deleted','skipped') DEFAULT 'pending',
    last_analyzed DATETIME,
    last_scored DATETIME,
    last_revised DATETIME
);
-- Scene-level analysis
CREATE TABLE scene_analysis (
    id INT PRIMARY KEY AUTO_INCREMENT,
    chapter_id INT NOT NULL,
    scene_number INT DEFAULT 1,
    word_count INT DEFAULT 0,
    timeline_strand ENUM('modern','darwin','pictish','omniscient'),
    pov_character VARCHAR(100),
    stakes TINYINT UNSIGNED DEFAULT 0,
    resistance TINYINT UNSIGNED DEFAULT 0,
    change_level TINYINT UNSIGNED DEFAULT 0,
    question_pull TINYINT UNSIGNED DEFAULT 0,
    engagement_score TINYINT UNSIGNED GENERATED ALWAYS AS (
        stakes + resistance + change_level + question_pull
    ) STORED,
    voice_distinctiveness TINYINT,
    profile_adherence TINYINT,
    triage_action ENUM('keep','trim','compress','convert','merge','delete'),
    triage_notes TEXT,
    telling_instances JSON,
    crutch_words JSON,
    dialogue_by_character JSON,
    voice_bleed JSON,
    scored_at DATETIME,
    scored_by VARCHAR(50)
);
A.2 Voice Analysis Tables
-- Per-character dialogue statistics
CREATE TABLE voice_analysis_summary (
    id INT PRIMARY KEY AUTO_INCREMENT,
    character_name VARCHAR(100) NOT NULL,
    total_dialogue_words INT DEFAULT 0,
    dialogue_percentage FLOAT,
    scene_count INT DEFAULT 0,
    avg_voice_distinctiveness FLOAT,
    avg_profile_adherence FLOAT,
    avg_sentence_length FLOAT,
    updated_at DATETIME
);
-- Character pair similarity
CREATE TABLE voice_similarity (
    id INT PRIMARY KEY AUTO_INCREMENT,
    character_a VARCHAR(100) NOT NULL,
    character_b VARCHAR(100) NOT NULL,
    similarity_score FLOAT,
    shared_crutch_words TEXT,
    calculated_at DATETIME
);
Appendix B: CLI Tool Reference
B.1 Scoring Tool
# Score all unscored scenes with Sonnet
php score-scenes.php --model=sonnet
# Score specific chapter with Opus
php score-scenes.php --model=opus --chapter=42
# Re-score all scenes (overwrite existing)
php score-scenes.php --model=sonnet --rescore
# Score only Part 1
php score-scenes.php --model=sonnet --part=1
# Compare models on same chapter
php score-scenes.php --compare --chapter=42
B.2 Voice Aggregation Tool
# Full aggregation with console report
php aggregate-voice-analysis.php
# Report only (no database update)
php aggregate-voice-analysis.php --report
# JSON output for external processing
php aggregate-voice-analysis.php --json
B.3 Export Tool
# Full manuscript export
php export-revision-package.php
# Single part export
php export-revision-package.php --part=2
# Triage manifest only
php export-revision-package.php --triage-only
# Custom output filename
php export-revision-package.php --output=hot-water-v2.md
Appendix C: Engagement Score Quick Reference
C.1 Scoring Rubric Summary
| Dimension | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| Stakes | Nothing | Mild discomfort | Real consequences | Survival/identity |
| Resistance | Nothing | Internal doubt | Interpersonal | Active antagonism |
| Change | Static | Info gained | Decision made | Irreversible |
| Question Pull | Resolved | Mild curiosity | Need to know | Hooked |
C.2 Engagement Thresholds
| Score | Classification | Color | Action |
|---|---|---|---|
| 10-12 | Gripping | Green | Keep |
| 7-9 | Solid | Yellow | Monitor |
| 4-6 | Vulnerable | Orange | Revise |
| 0-3 | Dropout | Red | Priority |
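The thresholds above map directly to a small classification helper. A sketch of how the platform might implement the mapping; the function name is illustrative:

```php
// Map a 0-12 engagement score to its classification per the
// thresholds table: 10-12 gripping, 7-9 solid, 4-6 vulnerable,
// 0-3 dropout.
function classifyEngagement(int $score): string {
    if ($score >= 10) return 'gripping';
    if ($score >= 7)  return 'solid';
    if ($score >= 4)  return 'vulnerable';
    return 'dropout';
}
```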
Document History
| Version | Date | Changes |
|---|---|---|
| 1.0 | 2026-01-25 | Initial release |
End of Technical Note
Permanent URL: https://canemah.org/archive/document.php?id=CNL-TN-2026-010