Quotes Collection Infrastructure Protocol
Quotes Collection Infrastructure Protocol
A Personal Knowledge Management System with AI-Assisted Tagging
Document ID: CNL-PR-2025-012
Version: 1.0
Date: December 8, 2025
Author: Michael P. Hamilton, Ph.D.
AI Assistance Disclosure: This protocol was developed with assistance from Claude (Anthropic, claude-opus-4-20250514). The AI contributed to system architecture, code development, prompt engineering, and documentation drafting through an iterative collaborative process spanning multiple sessions. The author takes full responsibility for the content, accuracy, and conclusions.
Abstract
This protocol documents the Quotes Collection infrastructure, a LAMP-based personal knowledge management system for curating, organizing, and exploring a lifetime collection of quotations. The system transforms an unstructured text file of 617 quotes spanning decades of intellectual exploration into a searchable, browsable, and AI-augmented knowledge base. Key innovations include a visual import tool for meditative manual curation, a dual-tool AI tagging methodology (mechanical bulk processing and conversational refinement), and a contemplative public interface designed for reflective engagement. The architecture supports future evolution toward a knowledge graph connecting concepts across science, philosophy, nature, ecology, and the human condition. This document provides specifications for database schema, file architecture, AI integration patterns, and deployment methodology.
1. Introduction
1.1 Purpose
The Quotes Collection addresses a specific need: preserving and organizing a lifetime of intellectual accumulation—quotes collected across decades as breadcrumbs marking a journey through ideas. The system transforms a simple text file into a navigable knowledge structure while honoring the contemplative nature of the content.
Design goals:
- Single-curator workflow with administrative control
- Quote-centric content with attribution and tagging
- Full-text search with MySQL FULLTEXT indexing
- AI-assisted tagging via multiple LLM providers
- Contemplative public interface emphasizing readability
- Markdown support for formatted quote content
- Future extensibility toward AI chat integration
1.2 Scope
This protocol covers:
- Database schema design
- PHP file architecture
- Visual import methodology
- AI integration layer (provider abstraction)
- Two-tool tagging methodology
- Public and administrative interfaces
- Deployment procedures
1.3 Technology Stack
| Component | Specification |
|---|---|
| Web Server | Apache 2.x |
| PHP | 8.3+ (mysqli, no PDO) |
| Database | MySQL 8.4+ with FULLTEXT indexing |
| Content Format | Markdown with custom parser |
| Styling | Custom CSS (no frameworks) |
| JavaScript | Vanilla JS |
| AI Integration | Ollama (local) and Claude API |
| LLM Models | GPT-OSS 20B, Gemma 3, Claude Sonnet |
1.4 Domain Context
This system is part of the Macroscope SELF domain—personal health, work, reading, writing, and social dimensions of the broader Macroscope research program integrating EARTH, LIFE, HOME, and SELF paradigms.
2. Database Schema
2.1 Overview
The system uses six primary tables supporting quotes, tags, many-to-many relationships, authentication, and future AI chat sessions.
2.2 Core Tables
2.2.1 quotes
Primary content storage.
| Column | Type | Description |
|---|---|---|
| id | int, PK, auto | Primary key |
| sequence | int unsigned | Collection order (original sequence) |
| quote_text | text | Quote content (supports Markdown) |
| attribution | varchar(500) | Author, source, date |
| date_added | date | Import/entry date |
| notes | text | Private curator notes |
| is_favorite | tinyint(1) | Favorite flag |
| created_date | datetime | Record creation |
| updated_date | datetime | Last modification |
Index: FULLTEXT on (quote_text, attribution) for search.
2.2.2 tags
Content categorization.
| Column | Type | Description |
|---|---|---|
| id | int, PK, auto | Primary key |
| tag_name | varchar(100) | Display name (lowercase) |
| slug | varchar(100), unique | URL-safe identifier |
| description | text | Tag description |
| created_date | datetime | Record creation |
2.2.3 quote_tags
Junction table for quote-tag relationships.
| Column | Type | Description |
|---|---|---|
| quote_id | int, FK | References quotes.id |
| tag_id | int, FK | References tags.id |
| assigned_date | datetime | Assignment timestamp |
| assigned_by | varchar(50) | Source: 'manual', 'llm', 'import' |
Note: The assigned_by field tracks provenance—whether tags were assigned manually, by AI suggestion, or during import.
2.2.4 admin_users
Authentication for administrative interface.
| Column | Type | Description |
|---|---|---|
| id | int, PK, auto | Primary key |
| username | varchar(50), unique | Login username |
| password_hash | varchar(255) | bcrypt hash |
| varchar(255) | Admin email | |
| last_login | datetime | Last access |
| created_date | datetime | Account creation |
2.2.5 chat_sessions (Future)
AI conversation sessions for contemplative quote engagement.
| Column | Type | Description |
|---|---|---|
| id | int, PK, auto | Primary key |
| quote_id | int, FK | Associated quote (nullable) |
| session_token | varchar(64), unique | Session identifier |
| created_date | datetime | Session start |
| last_activity | datetime | Last message |
2.2.6 chat_messages (Future)
Individual messages within chat sessions.
| Column | Type | Description |
|---|---|---|
| id | int, PK, auto | Primary key |
| session_id | int, FK | References chat_sessions.id |
| role | enum('user','assistant','system') | Message role |
| content | text | Message content |
| created_date | datetime | Message timestamp |
2.3 Views
2.3.1 quotes_with_tags
Convenience view joining quotes with concatenated tag names.
CREATE VIEW quotes_with_tags AS
SELECT q.*, GROUP_CONCAT(t.tag_name ORDER BY t.tag_name SEPARATOR ', ') as tags
FROM quotes q
LEFT JOIN quote_tags qt ON q.id = qt.quote_id
LEFT JOIN tags t ON qt.tag_id = t.id
GROUP BY q.id
ORDER BY q.sequence ASC;
2.3.2 tag_counts
Tag listing with quote counts for navigation.
CREATE VIEW tag_counts AS
SELECT t.id, t.tag_name, t.slug, COUNT(qt.quote_id) as quote_count
FROM tags t
LEFT JOIN quote_tags qt ON t.id = qt.tag_id
GROUP BY t.id
ORDER BY quote_count DESC;
2.3.3 random_quote
Single random quote for homepage feature.
CREATE VIEW random_quote AS
SELECT * FROM quotes ORDER BY RAND() LIMIT 1;
2.4 Foreign Key Constraints
All junction tables use CASCADE DELETE to maintain referential integrity:
quote_tags.quote_id→quotes.id(CASCADE)quote_tags.tag_id→tags.id(CASCADE)chat_sessions.quote_id→quotes.id(SET NULL)chat_messages.session_id→chat_sessions.id(CASCADE)
3. File Architecture
3.1 Directory Structure
/Library/WebServer/Documents/michaelphamilton/quotes/
├── css/
│ └── style.css # Public interface styles
├── includes/
│ ├── config.php # Site configuration
│ ├── search.php # Search and query functions
│ └── markdown.php # Markdown parser
├── templates/
│ ├── header.php # Public header template
│ └── footer.php # Public footer template
├── admin/
│ ├── index.php # Admin dashboard (quote listing)
│ ├── login.php # Authentication
│ ├── logout.php # Session termination
│ ├── edit.php # Quote editor with Markdown toolbar
│ ├── delete.php # Quote deletion
│ ├── tags.php # Tag management
│ ├── ai_tagger.php # Conversational AI tagging
│ ├── bulk_tagger.php # Batch AI tagging
│ └── admin.css # Admin interface styles
├── import/
│ └── visual_import.php # Click-to-select import tool
├── index.php # Homepage with featured quote
├── browse.php # Paginated quote listing
├── quote.php # Single quote view
├── search.php # Search results
├── tag.php # Quotes by tag
├── tags.php # All tags listing
└── random.php # Random quote display
3.2 Credential Storage
Database and AI credentials stored outside web root:
/Library/WebServer/secure/credentials/quotes_db_config.php
/Library/WebServer/secure/quotes/ai-config.php
3.3 Configuration (includes/config.php)
<?php
// Database credentials (external file)
require_once('/Library/WebServer/secure/credentials/quotes_db_config.php');
// Site configuration
define('SITE_NAME', 'Epigrammatical Stream of Consciousness');
define('SITE_TAGLINE', 'A Lifetime Collection of Quotes');
define('SITE_AUTHOR', 'Dr. Michael P. Hamilton');
define('ADMIN_EMAIL', 'mike@canemah.org');
define('BASE_PATH', ''); // Empty for domain root
define('DOC_ROOT', '/Library/WebServer/Documents/michaelphamilton/quotes');
// Database connection
$conn = mysqli_connect(DB_HOST, DB_USER, DB_PASS, DB_NAME);
mysqli_set_charset($conn, 'utf8mb4');
// Session management
session_start();
// Include Markdown parser
require_once(__DIR__ . '/markdown.php');
4. Visual Import Tool
4.1 Purpose
The Visual Import Tool addresses a fundamental challenge: parsing an unstructured text file containing quotes accumulated over decades with inconsistent formatting. Rather than attempting automated parsing (which proved error-prone), the tool enables meditative manual curation—click to select quote boundaries, verify attribution, and import with intention.
4.2 Interface Design
Split-panel interface:
- Left panel: Scrolling text content with paragraph-level selection
- Right panel: Controls, preview, attribution field, and action buttons
4.3 Workflow
- Load source file: Text file parsed into paragraphs
- Click start: First click marks quote beginning (green left border)
- Click end: Second click marks quote end (red right border, yellow fill for span)
- Auto-detect attribution: Last line analyzed for attribution patterns
- Verify and save: Review preview, adjust attribution if needed, save to database
- Visual feedback: Saved quotes dimmed with strikethrough
4.4 Attribution Detection
The tool auto-detects attributions using heuristics:
- Lines starting with em-dash, en-dash, or hyphen
- Short lines (≤80 characters) following quote content
- Lines with capital letters typical of names
- Exclusion of sentence-like patterns
function detectAttribution($line) {
// Check for dash-prefixed attribution
if (preg_match('/^[\u2014\u2013\-]\s*(.+)$/', $line, $m)) {
return trim($m[1]);
}
// Heuristics for standalone name lines
if (strlen($line) <= 60 && preg_match('/^[A-Z]/', $line)) {
return $line;
}
return null;
}
4.5 Attribution Stripping
When saving, the attribution line is automatically stripped from the quote text to avoid duplication, since it's stored separately in the attribution field.
5. AI Integration Layer
5.1 Architecture Overview
The AI layer provides a provider-agnostic interface supporting multiple LLM backends:
- Claude API: Anthropic's cloud models (Opus, Sonnet, Haiku)
- Ollama: Local models via Ollama server (GPT-OSS, Gemma, Ministral)
5.2 Configuration (ai-config.php)
<?php
return [
'provider' => 'ollama', // Default: 'claude' or 'ollama'
'claude' => [
'api_key' => 'sk-ant-...',
'base_url' => 'https://api.anthropic.com/v1/messages',
'max_tokens' => 2048,
'available_models' => [
'claude-opus-4-20250514' => 'Claude Opus 4 - Most capable',
'claude-sonnet-4-20250514' => 'Claude Sonnet 4 - Balanced',
'claude-haiku-4-5-20251001' => 'Claude Haiku 4.5 - Fast'
],
'default_model' => 'claude-sonnet-4-20250514'
],
'ollama' => [
'base_url' => 'http://localhost:11434',
'max_tokens' => 2048,
'available_models' => [
'gemma3:12b' => 'Gemma 3 12B - Balanced',
'gemma3:4b' => 'Gemma 3 4B - Fast',
'ministral-3' => 'Ministral 3 - General',
'gpt-oss:20b' => 'GPT-OSS 20B - Large'
],
'default_model' => 'gemma3:12b'
],
'tagging_prompt' => '...', // Conversational prompt
'bulk_prompt' => '...' // Terse JSON-only prompt
];
5.3 Provider Abstraction
The chat() function routes requests to the appropriate provider:
function chat($userMessage, $conversationHistory = [], $context = []) {
global $aiConfig;
if ($aiConfig['provider'] === 'claude') {
return chatClaude($userMessage, $conversationHistory, $context);
} else {
return chatOllama($userMessage, $conversationHistory, $context);
}
}
5.4 Tag Suggestion Parsing
LLM responses are parsed for JSON tag suggestions with flexible format handling:
function parseTagSuggestions($response) {
// Strip markdown code blocks
$clean = preg_replace('/```json\s*|\s*```/', '', $response);
// Try multiple JSON patterns
// {"action": "suggest_tags", "tags": [...]}
// {"action": "add_tags", "tags": [...]}
// {"tags": [...]}
// Bare array [...]
// Normalize tags: lowercase, spaces for multi-word
foreach ($tags as &$tag) {
$tag = normalizeTag($tag);
}
return $tags;
}
5.5 Tag Normalization
Tags are normalized to ensure consistency:
function normalizeTag($tag) {
$tag = strtolower(trim($tag));
$tag = str_replace(['_', '-'], ' ', $tag);
// Split concatenated words (e.g., "artificialintelligence")
// Using dictionary of common word boundaries
$commonWords = ['intelligence', 'learning', 'thinking', 'making',
'condition', 'development', 'consciousness', ...];
foreach ($commonWords as $word) {
$tag = preg_replace('/(\w)(' . $word . ')/i', '$1 $2', $tag);
}
return preg_replace('/\s+/', ' ', trim($tag));
}
6. Two-Tool Tagging Methodology
6.1 Design Rationale
Early experiments revealed that a single AI approach couldn't serve both efficiency and depth. The solution: two complementary tools optimized for different purposes.
6.2 Tool Comparison
| Aspect | Bulk Tagger | AI Tagger |
|---|---|---|
| Purpose | Baseline tagging at scale | Contemplative refinement |
| Model | GPT-OSS 20B (mechanical) | Gemma 3 / Claude (conversational) |
| Prompt | Terse, JSON-only | Rich, contextual |
| Output | Pure JSON | Natural conversation |
| Interaction | Batch process, review, save | Dialogue, suggest, discuss |
| Speed | ~2 sec/quote | Variable (conversation) |
| Use case | Initial pass on collection | Quotes deserving reflection |
6.3 Bulk Tagger (bulk_tagger.php)
Features:
- Batch processing (10, 25, 50, 100, or all quotes)
- Filter by: untagged, all, or sequence range
- Provider/model selection
- Progress tracking with success/failure indicators
- Existing tags displayed as context
- Batch save approved suggestions
Prompt Engineering:
The bulk prompt is optimized for mechanical reliability:
Suggest tags for this quote. Output ONLY a JSON object:
{"action": "suggest_tags", "tags": ["tag one", "tag two"]}
Rules:
- 3-7 lowercase tags
- IMPORTANT: Use spaces between words (e.g., "artificial intelligence")
- Specific concepts over generic (e.g., "mortality" not "life")
Results: GPT-OSS 20B achieved 100% success rate on 617 quotes with properly spaced multi-word tags.
6.4 AI Tagger (ai_tagger.php)
Features:
- Single-quote focus with full text display
- Conversational interface with message history
- Manual suggestion trigger (not automatic)
- Existing tags shown as context for complementary suggestions
- Apply selected tags with one click
- Previous/Next navigation between quotes
Prompt Engineering:
The conversational prompt encourages intellectual engagement:
You are a thoughtful companion helping Dr. Michael Hamilton tag
quotations from his personal collection spanning decades. Your role
is to suggest relevant tags and discuss the meaning, context, and
connections of each quote.
When suggesting tags, output a JSON block:
{"action": "suggest_tags", "tags": ["tag1", "tag2", "tag3"]}
Be conversational and intellectually engaged. The collection spans
science, philosophy, nature, ecology, relationships, AI, writing,
and the human condition.
6.5 Workflow Integration
Recommended workflow:
- Bulk pass: Process entire collection with GPT-OSS for baseline tags
- Review: Scan results in bulk interface, approve/reject
- Refinement: Use AI Tagger for quotes deserving contemplation
- Iteration: Existing tags inform new suggestions for complementary concepts
7. Public Interface
7.1 Design Philosophy
The public interface emphasizes contemplative reading:
- Warm color palette: Cream backgrounds (#f7f3ed), sage green accents (#6b7c5e)
- Serif typography: Georgia for readability and intellectual tone
- Generous whitespace: Unhurried visual rhythm
- Minimal navigation: Focus on content, not chrome
7.2 CSS Variables
:root {
--color-text: #3d3632;
--color-bg: #f7f3ed;
--color-bg-warm: #f0ebe3;
--color-paper: #fffef9;
--color-accent: #6b7c5e;
--color-accent-dark: #4a5840;
--font-serif: Georgia, 'Times New Roman', serif;
--font-sans: -apple-system, BlinkMacSystemFont, sans-serif;
}
7.3 Page Structure
| Page | Purpose |
|---|---|
| index.php | Featured random quote, search box, recent additions |
| browse.php | Paginated listing with sort options |
| quote.php | Single quote with tags, prev/next navigation |
| search.php | Full-text search results |
| tags.php | All tags with counts |
| tag.php | Quotes filtered by single tag |
| random.php | Random quote display |
7.4 Markdown Rendering
Quotes support Markdown formatting for structured content:
- Headings:
#,##,### - Emphasis:
**bold**,*italic* - Links:
[text](url) - Blockquotes:
> quoted text - Lists:
- itemor* item - Code:
`inline code`
The custom parser (markdown.php) renders these to HTML while preserving line breaks for poetry and multi-paragraph quotes.
8. Administrative Interface
8.1 Dashboard (admin/index.php)
- Paginated quote listing with search
- Columns: sequence, quote preview, attribution, tags, actions
- Quick links to edit, delete, view on site
- Flash messages for operation feedback
8.2 Quote Editor (admin/edit.php)
- Full-width textarea with Markdown toolbar
- Toolbar buttons: Bold, Italic, H1-H3, Blockquote, List, Link
- Attribution field
- Sequence number (auto-increment or manual)
- Favorite checkbox
- Tag input (comma-separated)
- Private notes field
- Previous/Next navigation between quotes
8.3 Tag Manager (admin/tags.php)
- Add new tags with optional description
- Edit existing tags (name, description)
- Delete unused tags (protected if quotes assigned)
- Quote count per tag
9. Search Implementation
9.1 MySQL FULLTEXT
The quotes table uses FULLTEXT indexing for natural language search:
ALTER TABLE quotes ADD FULLTEXT KEY ft_quote_search (quote_text, attribution);
9.2 Search Function
function searchQuotes($query, $limit = 50, $offset = 0) {
$escaped = mysqli_real_escape_string($conn, $query);
$sql = "SELECT q.*,
MATCH(quote_text, attribution) AGAINST('$escaped' IN NATURAL LANGUAGE MODE) as relevance,
GROUP_CONCAT(t.tag_name ORDER BY t.tag_name SEPARATOR ', ') as tags
FROM quotes q
LEFT JOIN quote_tags qt ON q.id = qt.quote_id
LEFT JOIN tags t ON qt.tag_id = t.id
WHERE MATCH(quote_text, attribution) AGAINST('$escaped' IN NATURAL LANGUAGE MODE)
GROUP BY q.id
ORDER BY relevance DESC
LIMIT $limit OFFSET $offset";
return mysqli_query($conn, $sql);
}
10. Deployment Checklist
10.1 New Installation
- [ ] Create MySQL database with utf8mb4 charset
- [ ] Import schema SQL (quotes_db.sql)
- [ ] Create database user with appropriate privileges
- [ ] Create credential file outside web root
- [ ] Copy PHP files to web directory
- [ ] Create AI config file with provider settings
- [ ] Set directory permissions for web server
- [ ] Create admin user via setup.php (then delete setup.php)
- [ ] Configure Apache virtual host
- [ ] Test login and all admin functions
- [ ] Import quotes via Visual Import Tool
- [ ] Run Bulk Tagger for initial tagging
- [ ] Verify public interface
10.2 Ollama Setup (for local AI)
- [ ] Install Ollama on server
- [ ] Pull required models:
ollama pull gpt-oss:20b,ollama pull gemma3:12b - [ ] Configure firewall for localhost:11434
- [ ] Test connection:
curl http://localhost:11434/api/tags - [ ] Update ai-config.php with correct base_url
10.3 Security Considerations
- [ ] Credential files outside web root
- [ ] Admin area requires authentication
- [ ] Input sanitization on all user inputs
- [ ] Prepared statements for database queries
- [ ] Session management with secure settings
- [ ] Delete setup.php after initial configuration
11. Known Limitations
- Single-curator design: No multi-user permissions or roles
- No WYSIWYG editor: Markdown-only content editing
- Local AI dependency: Ollama requires server with sufficient resources
- Basic caching: No Redis/Memcached integration
- Simple authentication: No OAuth, password reset, or 2FA
- Manual tag assignment: No automated tag inheritance or hierarchy
12. Future Directions
12.1 Knowledge Graph Evolution
The tag system provides foundation for knowledge graph development:
- Tag relationships: Hierarchical (parent/child) and associative (related) links
- Quote connections: Explicit links between quotes on similar themes
- Author networks: Connections between thinkers who influenced each other
- Temporal mapping: Timeline visualization of ideas across history
12.2 AI Chat Integration
The chat_sessions and chat_messages tables support planned features:
- Quote contemplation: Conversational exploration of individual quotes
- Collection dialogue: AI-assisted discovery across the collection
- Synthesis: Generate insights connecting disparate quotes
- Personal oracle: Ask questions answered from collection wisdom
12.3 Export and Portability
- JSON export of full collection with tags
- Markdown export for static site generation
- API endpoints for external integration
13. References
[1] PHP Documentation. "mysqli Extension." https://www.php.net/manual/en/book.mysqli.php (accessed December 8, 2025).
[2] MySQL Documentation. "Full-Text Search Functions." https://dev.mysql.com/doc/refman/8.4/en/fulltext-search.html (accessed December 8, 2025).
[3] Ollama Documentation. "API Reference." https://github.com/ollama/ollama/blob/main/docs/api.md (accessed December 8, 2025).
[4] Anthropic Documentation. "Claude API Reference." https://docs.anthropic.com/claude/reference (accessed December 8, 2025).
[5] Gruber, J. "Markdown: Syntax." https://daringfireball.net/projects/markdown/syntax (accessed December 8, 2025).
Appendix A: Complete SQL Schema
See accompanying file: CNL-PR-2025-XXX_schema.sql (quotes_db.sql)
Appendix B: Color Scheme Reference
B.1 Public Interface (Contemplative)
| Element | Color | Hex |
|---|---|---|
| Background | Warm cream | #f7f3ed |
| Paper | Off-white | #fffef9 |
| Text | Warm brown | #3d3632 |
| Accent | Sage green | #6b7c5e |
| Accent dark | Forest | #4a5840 |
| Border | Tan | #d9d2c7 |
B.2 Admin Interface (Functional)
| Element | Color | Hex |
|---|---|---|
| Header | Forest gradient | #3d5a35 → #2a3d25 |
| Background | Light gray | #f5f5f5 |
| Primary | Green | #4a6741 |
| Danger | Red | #c0392b |
| Success | Green | #27ae60 |
Appendix C: Prompt Engineering Notes
C.1 Bulk Tagging Prompt Evolution
Problem: Initial prompts produced concatenated multi-word tags ("artificialintelligence").
Solution: Explicit instruction with examples:
IMPORTANT: Use spaces between words (e.g., "artificial intelligence" NOT "artificialintelligence")
Multi-word tags must have spaces: "climate change", "human condition", "self reflection"
Result: 100% compliance with GPT-OSS 20B.
C.2 Conversational Prompt Design
Key elements for effective tag discussion:
- Role establishment: "thoughtful companion"
- Context provision: Collection spans science, philosophy, nature...
- Output format: JSON block for machine parsing
- Guidelines: Specific over generic, consider themes/disciplines/emotions
- Tone: "Conversational and intellectually engaged"
Document History
| Version | Date | Changes |
|---|---|---|
| 1.0 | 2025-12-08 | Initial release |
End of Protocol
Permanent URL: https://canemah.org/archive/document.php?id=CNL-PR-2025-012
Cite This Document
BibTeX
Permanent URL: https://canemah.org/archive/document.php?id=CNL-PR-2025-012