CNL-SP-2026-015 Specification

THRML Proof-of-Concept Specification: Embodied Sensing Experiments

Published: February 1, 2026
Version: 1

Document ID: CNL-SP-2026-015
Version: 1.0
Date: February 1, 2026
Status: Draft
Author: Michael P. Hamilton, Ph.D.
Project: Macroscope Ecological Observatory
Reference Doc: CNL-TN-2026-014

AI Assistance Disclosure: This specification was developed with assistance from Claude (Anthropic, Opus 4.5). The AI contributed to technical architecture design and document drafting. The author takes full responsibility for the content and technical decisions.

1. Abstract

This specification defines two proof-of-concept experiments for validating the embodied sensing framework described in CNL-TN-2026-014. Using the THRML library on Apple M4 Max hardware, we implement minimal Boltzmann machine meshes trained on Macroscope sensor data. Experiment A uses Tempest weather station readings (continuous environmental variables). Experiment B uses BirdWeather detections (discrete species presence/absence with relational structure). Success criteria: measurable energy differential between normal and anomalous input states.

2. Development Environment

2.1 Hardware

System	Role	Specs
Data	Development, iteration	MacBook Pro M4 Max, 128GB unified memory
Galatea	Production data source	Mac Mini M4 Pro, 1Gb fiber, continuous streams

2.2 Software Dependencies

# Python environment (Python 3.12+)
python3 -m venv ~/thrml-poc
source ~/thrml-poc/bin/activate

# Core dependencies
pip install jax jaxlib
pip install thrml
pip install mysql-connector-python
pip install numpy pandas

# Optional visualization
pip install matplotlib seaborn

2.3 JAX Configuration for Apple Silicon

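# Set memory allocation before importing JAX; XLA reads these variables
# when the backend initializes. They are documented for GPU backends and
# may have no effect on Metal or CPU.
import os
os.environ['XLA_PYTHON_CLIENT_PREALLOCATE'] = 'false'
os.environ['XLA_PYTHON_CLIENT_MEM_FRACTION'] = '0.8'

# Verify backend
import jax
print(jax.devices())  # Metal device if the jax-metal plugin is installed; otherwise CPU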

2.4 Database Connection

# config.py
DB_CONFIG = {
    'host': 'localhost',
    'database': 'macroscope',
    'user': 'mikehamilton',
    'password': '***',  # From secure config
    'charset': 'utf8mb4'
}
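
One way to keep the password out of config.py is to read it from the environment at startup. The sketch below is illustrative only: the variable name MACROSCOPE_DB_PASSWORD and the helper load_db_config are assumptions, not part of the existing codebase.

```python
import os

# Non-secret settings, mirroring DB_CONFIG above
BASE_DB_CONFIG = {
    'host': 'localhost',
    'database': 'macroscope',
    'user': 'mikehamilton',
    'charset': 'utf8mb4',
}

def load_db_config(env=os.environ, var_name='MACROSCOPE_DB_PASSWORD'):
    """Return a copy of the config with the password read from the environment.

    var_name is a hypothetical variable name chosen for this sketch.
    """
    password = env.get(var_name)
    if not password:
        raise RuntimeError(f"{var_name} is not set")
    return {**BASE_DB_CONFIG, 'password': password}
```

The resulting dictionary can be passed as keyword arguments to mysql.connector.connect().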

3. Experiment A: Tempest Environmental Mesh

3.1 Objective

Train a Boltzmann machine on multivariate environmental time series. Validate that the mesh settles to low energy on typical readings and exhibits measurable tension on anomalous inputs.

3.2 Data Extraction

Source Table: tempest_readings

Selected Variables:

Field	Description	Range	Encoding
`temperature_f`	Air temperature	20-100°F	8 bits
`humidity`	Relative humidity	0-100%	7 bits
`pressure_inhg`	Barometric pressure	28-32 inHg	6 bits
`wind_mph`	Wind speed	0-50 mph	6 bits
`solar_radiation`	Solar irradiance	0-1200 W/m²	8 bits

Total visible nodes: 35 bits
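
The bit widths in the table fix the resolution of each variable. As a quick check, assuming uniform quantization with 2^n levels spread across each stated range, the step per level and the total bit count can be computed as follows (FIELDS and quantization_step are names introduced for this sketch):

```python
# (min, max, bits) per field, copied from the table above
FIELDS = {
    'temperature_f': (20, 100, 8),
    'humidity': (0, 100, 7),
    'pressure_inhg': (28, 32, 6),
    'wind_mph': (0, 50, 6),
    'solar_radiation': (0, 1200, 8),
}

def quantization_step(min_val, max_val, n_bits):
    """Width of one level when the range is split into 2^n_bits - 1 intervals."""
    return (max_val - min_val) / (2**n_bits - 1)

steps = {name: quantization_step(*spec) for name, spec in FIELDS.items()}
total_bits = sum(bits for _, _, bits in FIELDS.values())

for name, step in steps.items():
    print(f"{name}: {step:.4f} per level")
print(f"total visible bits: {total_bits}")
```

Under these assumptions one level corresponds to roughly 0.31°F, 0.79% relative humidity, 0.063 inHg, 0.79 mph, and 4.7 W/m².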

Query:

SELECT 
    recorded_at,
    temperature_f,
    humidity,
    pressure_inhg,
    wind_mph,
    solar_radiation
FROM tempest_readings
WHERE recorded_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
ORDER BY recorded_at ASC;

3.3 Binary Encoding

Continuous values are discretized into binary representations using uniform quantization:

import numpy as np

def encode_continuous(value, min_val, max_val, n_bits):
    """Encode a continuous value as an n_bits binary array (least significant bit first)."""
    # Clamp to range
    value = max(min_val, min(max_val, value))
    # Normalize to [0, 1]
    normalized = (value - min_val) / (max_val - min_val)
    # Convert to integer level in [0, 2^n_bits - 1]; int() truncates toward zero
    int_val = int(normalized * (2**n_bits - 1))
    # Unpack bits, least significant first
    return np.array([(int_val >> i) & 1 for i in range(n_bits)], dtype=np.int32)

# Encoding specification
TEMPEST_ENCODING = {
    'temperature_f': {'min': 20, 'max': 100, 'bits': 8},
    'humidity': {'min': 0, 'max': 100, 'bits': 7},
    'pressure_inhg': {'min': 28, 'max': 32, 'bits': 6},
    'wind_mph': {'min': 0, 'max': 50, 'bits': 6},
    'solar_radiation': {'min': 0, 'max': 1200, 'bits': 8},
}
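
To show how the per-field codes combine into the 35-node visible vector, here is a minimal sketch. encode_reading and to_spins are hypothetical helpers, and the sample reading is invented for illustration. Whether THRML's SpinNode expects ±1 values or booleans should be checked against its documentation; to_spins covers only the ±1 case.

```python
import numpy as np

TEMPEST_ENCODING = {
    'temperature_f': {'min': 20, 'max': 100, 'bits': 8},
    'humidity': {'min': 0, 'max': 100, 'bits': 7},
    'pressure_inhg': {'min': 28, 'max': 32, 'bits': 6},
    'wind_mph': {'min': 0, 'max': 50, 'bits': 6},
    'solar_radiation': {'min': 0, 'max': 1200, 'bits': 8},
}

def encode_continuous(value, min_val, max_val, n_bits):
    """Uniform quantization to an LSB-first bit array (same logic as above)."""
    value = max(min_val, min(max_val, value))
    normalized = (value - min_val) / (max_val - min_val)
    int_val = int(normalized * (2**n_bits - 1))
    return np.array([(int_val >> i) & 1 for i in range(n_bits)], dtype=np.int32)

def encode_reading(reading, encoding=TEMPEST_ENCODING):
    """Concatenate the per-field codes, in dictionary order, into one vector."""
    parts = [
        encode_continuous(reading[name], spec['min'], spec['max'], spec['bits'])
        for name, spec in encoding.items()
    ]
    return np.concatenate(parts)

def to_spins(bits):
    """Map {0, 1} bits to {-1, +1} spins."""
    return 2 * bits - 1

# Invented reading, for illustration only
reading = {
    'temperature_f': 100, 'humidity': 0, 'pressure_inhg': 30,
    'wind_mph': 50, 'solar_radiation': 0,
}
visible = encode_reading(reading)
print(visible.shape, visible)
```

The field order in TEMPEST_ENCODING determines which visible node carries which bit, so the same dictionary must be used for training and inference.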

3.4 Mesh Architecture

from thrml import SpinNode, Block, SamplingSchedule, sample_states
from thrml.models import IsingEBM, IsingSamplingProgram, hinton_init
import jax.numpy as jnp

# Configuration
N_VISIBLE = 35      # Input nodes (encoded sensor values)
N_HIDDEN = 100      # Hidden layer nodes
N_TOTAL = N_VISIBLE + N_HIDDEN

# Create nodes
visible_nodes = [SpinNode() for _ in range(N_VISIBLE)]
hidden_nodes = [SpinNode() for _ in range(N_HIDDEN)]
all_nodes = visible_nodes + hidden_nodes

# Bipartite connectivity: each visible connects to all hidden
edges = [(v, h) for v in visible_nodes for h in hidden_nodes]

# Initialize weights (will be learned)
biases = jnp.zeros((N_TOTAL,))
weights = jnp.zeros((len(edges),))  # Initialize flat, learn structure
beta = jnp.array(1.0)  # Inverse temperature

model = IsingEBM(all_nodes, edges, biases, weights, beta)
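
The success criteria compare energies of visible inputs, but the mesh also has hidden nodes. One common way to score a visible vector in an RBM is its free energy, which sums out the hidden layer. The sketch below assumes ±1 spins, a dense (n_visible × n_hidden) weight matrix W, and the convention P(v, h) ∝ exp(-β E(v, h)). These conventions and function names are assumptions of this sketch; THRML's own sign and scaling conventions should be confirmed before comparing numbers.

```python
import itertools
import numpy as np

def joint_energy(v, h, b_v, b_h, W):
    """E(v, h) = -(b_v . v + b_h . h + v^T W h) for +/-1 spins."""
    return -(b_v @ v + b_h @ h + v @ W @ h)

def free_energy(v, b_v, b_h, W, beta=1.0):
    """F(v) = -(1/beta) * log sum_h exp(-beta * E(v, h)), in closed form."""
    field = beta * (b_h + v @ W)              # scaled field on each hidden spin
    log_2cosh = np.logaddexp(field, -field)   # log(2 * cosh(field)), stable
    return -(b_v @ v) - log_2cosh.sum() / beta

def free_energy_bruteforce(v, b_v, b_h, W, beta=1.0):
    """Same quantity by enumerating every hidden state (small meshes only)."""
    energies = np.array([
        joint_energy(v, np.array(h), b_v, b_h, W)
        for h in itertools.product([-1.0, 1.0], repeat=len(b_h))
    ])
    return -np.logaddexp.reduce(-beta * energies) / beta
```

The brute-force version is exponential in the number of hidden nodes and is useful only as a correctness check on toy meshes.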

3.5 Training Procedure

Using Contrastive Divergence (CD-k):

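def train_rbm(model, data, n_epochs=100, learning_rate=0.01, k=1):
    """Train RBM using CD-k.

    data_loader, sample_hidden, gibbs_sample, outer, and compute_energy are
    project helpers still to be written; they are not part of THRML.
    """
    weights, biases = model.weights, model.biases
    for epoch in range(n_epochs):
        for batch in data_loader(data, batch_size=32):
            # Positive phase: clamp visible, sample hidden
            v_pos = batch
            h_pos = sample_hidden(model, v_pos)

            # Negative phase: k steps of Gibbs sampling
            v_neg, h_neg = gibbs_sample(model, v_pos, k=k)

            # Update weights. JAX arrays are immutable, so build new arrays
            # instead of using +=. The (N_VISIBLE, N_HIDDEN) gradient is
            # flattened to match the row-major order of `edges`.
            grad_w = (outer(v_pos, h_pos) - outer(v_neg, h_neg)).mean(axis=0)
            weights = weights + learning_rate * grad_w.reshape(-1)

            # Update biases (visible first, then hidden)
            biases = biases.at[:N_VISIBLE].add(learning_rate * (v_pos - v_neg).mean(axis=0))
            biases = biases.at[N_VISIBLE:].add(learning_rate * (h_pos - h_neg).mean(axis=0))

            # Rebuild the model so the next batch samples from the updated parameters
            model = IsingEBM(all_nodes, edges, biases, weights, beta)

        # Log energy statistics
        energy = compute_energy(model, data)
        print(f"Epoch {epoch}: mean_energy={energy.mean():.4f}")
    return model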
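
For reference, the CD-k update can be written out in plain NumPy, independent of THRML. This is a minimal sketch assuming ±1 spins and a dense weight matrix; all function names are hypothetical, and the factor β in the gradient is absorbed into the learning rate.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_spins(field, beta, rng):
    """Sample +/-1 spins with P(s = +1) = sigmoid(2 * beta * field)."""
    p_up = sigmoid(2.0 * beta * field)
    return np.where(rng.random(field.shape) < p_up, 1.0, -1.0)

def cd_k_update(v_pos, W, b_v, b_h, rng, k=1, lr=0.01, beta=1.0):
    """One CD-k step on a batch of +/-1 visible vectors, shape (batch, n_visible)."""
    h_pos = sample_spins(b_h + v_pos @ W, beta, rng)      # positive phase
    v_neg, h_neg = v_pos, h_pos
    for _ in range(k):                                    # negative phase
        v_neg = sample_spins(b_v + h_neg @ W.T, beta, rng)
        h_neg = sample_spins(b_h + v_neg @ W, beta, rng)
    n = len(v_pos)
    W = W + lr * (v_pos.T @ h_pos - v_neg.T @ h_neg) / n
    b_v = b_v + lr * (v_pos - v_neg).mean(axis=0)
    b_h = b_h + lr * (h_pos - h_neg).mean(axis=0)
    return W, b_v, b_h

def train(data, n_hidden, n_epochs=10, batch_size=32, seed=0, **kwargs):
    """Run CD-k over shuffled mini-batches, starting from zero parameters."""
    rng = np.random.default_rng(seed)
    n_visible = data.shape[1]
    W = np.zeros((n_visible, n_hidden))
    b_v = np.zeros(n_visible)
    b_h = np.zeros(n_hidden)
    for _ in range(n_epochs):
        order = rng.permutation(len(data))
        for start in range(0, len(data), batch_size):
            batch = data[order[start:start + batch_size]]
            W, b_v, b_h = cd_k_update(batch, W, b_v, b_h, rng, **kwargs)
    return W, b_v, b_h
```

Returning new arrays rather than mutating in place keeps the same structure usable with JAX arrays, which do not support in-place updates.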

3.6 Success Criteria

Baseline Energy: After training, compute the mean and standard deviation (σ) of energy on held-out February data
Anomaly Injection: Test with:
- July temperatures (85°F) injected into February context
- Pressure at 29.0 inHg (storm) with solar radiation at 1000 W/m² (clear sky)
- Wind at 40 mph with humidity at 10% (unusual combination)
Threshold: Anomalous inputs should produce energy more than 2σ above the baseline mean
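
The threshold test reduces to a z-score against the baseline distribution. A minimal sketch using only the standard library follows; the function names are hypothetical and the energy values are made up for illustration.

```python
from statistics import mean, stdev

def baseline_stats(baseline_energies):
    """Mean and sample standard deviation of energies on held-out normal data."""
    return mean(baseline_energies), stdev(baseline_energies)

def energy_zscore(energy, baseline_mean, baseline_std):
    """Number of baseline standard deviations above (or below) the baseline mean."""
    return (energy - baseline_mean) / baseline_std

def is_anomalous(energy, baseline_mean, baseline_std, threshold=2.0):
    """True when the energy exceeds the baseline mean by more than `threshold` sigma."""
    return energy_zscore(energy, baseline_mean, baseline_std) > threshold

# Made-up energies, for illustration only
baseline = [-52.0, -50.0, -51.0, -49.0, -48.0]
mu, sigma = baseline_stats(baseline)
print(is_anomalous(-46.0, mu, sigma), is_anomalous(-48.0, mu, sigma))
```

The test is one-sided: only energies above the baseline count as tension, so unusually low energies are not flagged.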

4. Experiment B: BirdWeather Species Mesh

4.1 Objective

Train a Boltzmann machine on species co-occurrence patterns. Validate that the mesh encodes relational structure and exhibits tension when expected species are absent.

4.2 Data Extraction

Source Table: birdweather_detections

Species Selection: Rather than top-N by count, we explicitly select:

Category	Species	Rationale
Raptors (predators)	Cooper's Hawk, Red-tailed Hawk, Merlin	Trigger community-wide alarm/silence
Resident songbirds	Spotted Towhee, Song Sparrow, Dark-eyed Junco, Bewick's Wren, Black-capped Chickadee, Chestnut-backed Chickadee, Bushtit, American Robin, Northern Flicker, California Scrub-Jay, American Crow	React to raptor presence
Additional context	Lesser Goldfinch, House Finch, Anna's Hummingbird, Golden-crowned Sparrow	Seasonal/behavioral variation

Total species nodes: 18 (3 raptors + 15 songbirds)

Aggregation: 15-minute presence/absence windows (finer than hourly to capture silence events)

Query:

-- Species list with categories
SELECT 
    s.id as species_id,
    s.common_name,
    CASE 
        WHEN s.common_name IN ('Cooper''s Hawk', 'Red-tailed Hawk', 'Merlin') THEN 'raptor'
        ELSE 'songbird'
    END as category,
    COUNT(*) as detection_count
FROM birdweather_detections bd
JOIN species s ON bd.species_id = s.id
WHERE bd.detected_at >= DATE_SUB(NOW(), INTERVAL 90 DAY)
  AND bd.confidence >= 0.7
  AND s.common_name IN (
    'Cooper''s Hawk', 'Red-tailed Hawk', 'Merlin',
    'Spotted Towhee', 'Song Sparrow', 'Dark-eyed Junco', 
    'Bewick''s Wren', 'Black-capped Chickadee', 'Chestnut-backed Chickadee',
    'Bushtit', 'American Robin', 'Northern Flicker',
    'California Scrub-Jay', 'American Crow',
    'Lesser Goldfinch', 'House Finch', 'Anna''s Hummingbird',
    'Golden-crowned Sparrow'
  )
GROUP BY s.id, s.common_name
ORDER BY category, detection_count DESC;

-- 15-minute presence matrix (one row per bucket and species with at least one detection)
SELECT 
    CONCAT(
        DATE_FORMAT(detected_at, '%Y-%m-%d %H:'),
        LPAD(FLOOR(MINUTE(detected_at) / 15) * 15, 2, '0')
    ) as time_bucket,
    species_id,
    1 as present
FROM birdweather_detections
WHERE detected_at >= DATE_SUB(NOW(), INTERVAL 90 DAY)
  AND confidence >= 0.7
  AND species_id IN (/* selected species_ids */)
GROUP BY time_bucket, species_id;
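
The 15-minute bucketing can also be done client-side on raw detection timestamps. The sketch below uses only the standard library; floor_to_bucket and presence_by_bucket are hypothetical helpers, and the window width is assumed to divide 60 evenly.

```python
from datetime import datetime

def floor_to_bucket(ts, minutes=15):
    """Round a timestamp down to the start of its `minutes`-wide window."""
    return ts.replace(minute=(ts.minute // minutes) * minutes, second=0, microsecond=0)

def presence_by_bucket(detections, minutes=15):
    """Map each bucket start to the set of species detected in that window.

    `detections` is an iterable of (detected_at, species_id) pairs.
    """
    presence = {}
    for detected_at, species_id in detections:
        bucket = floor_to_bucket(detected_at, minutes)
        presence.setdefault(bucket, set()).add(species_id)
    return presence
```

Windows with no detections produce no key here, just as they produce no rows in SQL. Since silent windows are exactly what Experiment B tests, the full list of bucket start times has to be generated separately so that those windows appear as all-zero rows.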

4.3 Binary Encoding

Species presence is naturally binary. Raptors and songbirds are encoded separately to facilitate analysis:

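import numpy as np

def build_presence_matrix(detections_df, species_list, time_buckets):
    """Build binary presence matrix [time_buckets × species].

    Buckets with no detections stay all-zero, which is how silence is represented.
    """
    matrix = np.zeros((len(time_buckets), len(species_list)), dtype=np.int32)
    # Column index for each species_id, in species_list order
    col = {sp_id: j for j, sp_id in enumerate(species_list)}
    # Species detected in each bucket
    present = detections_df.groupby('time_bucket')['species_id'].unique()

    for idx, bucket in enumerate(time_buckets):
        for sp_id in present.get(bucket, ()):
            if sp_id in col:
                matrix[idx, col[sp_id]] = 1

    return matrix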

# Node allocation
N_RAPTORS = 3       # Cooper's Hawk, Red-tailed Hawk, Merlin
N_SONGBIRDS = 15    # Resident and seasonal songbirds
N_SPECIES = N_RAPTORS + N_SONGBIRDS  # 18 total
HOUR_OF_DAY = 5     # 5 bits for hour (0-23)
MONTH = 4           # 4 bits for month (1-12)
N_VISIBLE = N_SPECIES + HOUR_OF_DAY + MONTH  # 27 visible nodes
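
The node allocation above implies a 27-element visible vector per window. A minimal sketch of assembling it, assuming the hour and month use the same least-significant-bit-first order as the Tempest encoding (int_to_bits and encode_bird_window are hypothetical names):

```python
import numpy as np

N_SPECIES, HOUR_BITS, MONTH_BITS = 18, 5, 4

def int_to_bits(value, n_bits):
    """LSB-first binary code for a non-negative integer."""
    return np.array([(value >> i) & 1 for i in range(n_bits)], dtype=np.int32)

def encode_bird_window(species_presence, hour, month):
    """Concatenate species bits, hour bits, and month bits into one vector."""
    species_presence = np.asarray(species_presence, dtype=np.int32)
    if species_presence.shape != (N_SPECIES,):
        raise ValueError(f"expected {N_SPECIES} species flags")
    if not (0 <= hour <= 23 and 1 <= month <= 12):
        raise ValueError("hour must be 0-23 and month 1-12")
    return np.concatenate([
        species_presence,
        int_to_bits(hour, HOUR_BITS),
        int_to_bits(month, MONTH_BITS),
    ])

# "Unexplained silence" at 8 AM in February: no species present
silent_morning = encode_bird_window(np.zeros(N_SPECIES), hour=8, month=2)
print(silent_morning)
```

Species flags occupy indices 0-17 (raptors first), so the raptor and songbird index lists used for analysis apply directly to this vector.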

4.4 Mesh Architecture

# Configuration
N_VISIBLE = 27      # 18 species + 5 hour + 4 month
N_HIDDEN = 50       # Hidden layer
N_TOTAL = N_VISIBLE + N_HIDDEN

# Node indices for analysis
RAPTOR_INDICES = [0, 1, 2]  # First 3 nodes
SONGBIRD_INDICES = list(range(3, 18))  # Next 15 nodes

# Create nodes
visible_nodes = [SpinNode() for _ in range(N_VISIBLE)]
hidden_nodes = [SpinNode() for _ in range(N_HIDDEN)]
all_nodes = visible_nodes + hidden_nodes

# Bipartite edges (visible-hidden)
edges = [(v, h) for v in visible_nodes for h in hidden_nodes]

# Raptor-songbird lateral connections (capture suppression relationship)
for raptor_idx in RAPTOR_INDICES:
    for songbird_idx in SONGBIRD_INDICES:
        edges.append((visible_nodes[raptor_idx], visible_nodes[songbird_idx]))

# Songbird-songbird lateral connections (capture co-occurrence)
for i in SONGBIRD_INDICES:
    for j in SONGBIRD_INDICES:
        if i < j:
            edges.append((visible_nodes[i], visible_nodes[j]))

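# Parameters sized for this mesh (the Experiment A arrays have different shapes)
biases = jnp.zeros((N_TOTAL,))
weights = jnp.zeros((len(edges),))
beta = jnp.array(1.0)

model = IsingEBM(all_nodes, edges, biases, weights, beta)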

Edge count:

Bipartite: 27 × 50 = 1,350
Raptor-songbird: 3 × 15 = 45
Songbird-songbird: C(15,2) = 105
Total: 1,500 edges
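
The edge arithmetic can be checked by building the same edge list over integer indices instead of node objects. This is a verification sketch only; hidden nodes are given indices offset by N_VISIBLE for the purpose of counting.

```python
from itertools import combinations

N_VISIBLE, N_HIDDEN = 27, 50
RAPTOR_INDICES = [0, 1, 2]
SONGBIRD_INDICES = list(range(3, 18))

# Integer indices stand in for node objects; hidden nodes start at N_VISIBLE
bipartite = [(v, N_VISIBLE + h) for v in range(N_VISIBLE) for h in range(N_HIDDEN)]
raptor_songbird = [(r, s) for r in RAPTOR_INDICES for s in SONGBIRD_INDICES]
songbird_songbird = list(combinations(SONGBIRD_INDICES, 2))

edges = bipartite + raptor_songbird + songbird_songbird
print(len(bipartite), len(raptor_songbird), len(songbird_songbird), len(edges))
```

The hour and month nodes (indices 18-26) take part only in the bipartite edges, so temporal context reaches the species nodes through the hidden layer alone.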

4.5 Training Procedure

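Same CD-k procedure as Experiment A, but with:

Batch size: 64 (more samples per update)
k=5 (more Gibbs steps for discrete structure)
Learning rate: 0.001 (smaller for stability)

Because of the lateral visible-visible edges, this mesh is not a strictly bipartite RBM. Given the hidden layer, the species nodes are no longer conditionally independent, so the negative phase cannot resample all visible nodes in one parallel step. Laterally connected nodes must be updated sequentially or in blocks with no mutual edges, and the weight update needs a matching term for the lateral weights.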
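
With lateral edges among the species nodes, each visible node's conditional distribution depends on its visible neighbors as well as on the hidden layer. The following is a minimal NumPy sketch of one sequential Gibbs sweep over the visible layer, assuming ±1 spins and a dense lateral weight matrix L (symmetric, zero diagonal). The function name and the dense layout are illustrative and are not THRML API.

```python
import numpy as np

def gibbs_sweep_visible(v, h, b_v, W, L, rng, beta=1.0):
    """Resample each visible +/-1 spin in turn, given the hidden spins
    and the current values of the other visible spins.

    W has shape (n_visible, n_hidden); L has shape (n_visible, n_visible),
    is symmetric, and has a zero diagonal.
    """
    v = v.copy()
    for i in range(len(v)):
        # The zero diagonal of L keeps v[i] out of its own field
        field = b_v[i] + W[i] @ h + L[i] @ v
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * field))
        v[i] = 1.0 if rng.random() < p_up else -1.0
    return v
```

Nodes with no lateral edges between them, such as the hour and month bits, could still be updated in parallel; the sequential loop is simply the easiest form to verify.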

4.6 Success Criteria

Baseline Energy: Compute mean energy for:
- Typical February morning (7-9 AM): songbirds active, no raptors
- Typical raptor event: raptor=1, most songbirds=0 (learned silence)
Relational Tests:

Scenario	Raptor	Songbirds	Expected Energy	Interpretation
Normal morning	0	Active (many=1)	Low	Equilibrium
Raptor hunting	1	Silent (most=0)	Low	Learned normal response
Unexplained silence	0	Silent (most=0)	High	Tension: why silent?
Unusual boldness	1	Active (many=1)	High	Tension: should be hiding

Threshold: Anomalous scenarios (unexplained silence, unusual boldness) should produce energy > 2σ above the appropriate baseline
Absence-as-Signal Validation: The mesh should register higher tension for "raptor=0, songbirds=0" than for simple low-activity periods (e.g., nighttime). The context of silence matters.

5. Database Integration

5.1 New Tables

-- Model registry
CREATE TABLE thrml_models (
    id INT NOT NULL AUTO_INCREMENT,
    model_name VARCHAR(100) NOT NULL,
    model_type ENUM('tempest', 'birdweather', 'combined') NOT NULL,
    description TEXT,
    n_visible INT NOT NULL,
    n_hidden INT NOT NULL,
    n_edges INT NOT NULL,
    training_start DATETIME,
    training_end DATETIME,
    training_samples INT,
    weights_path VARCHAR(500),
    baseline_energy_mean DECIMAL(10,4),
    baseline_energy_std DECIMAL(10,4),
    status ENUM('training', 'ready', 'archived') DEFAULT 'training',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id),
    KEY idx_model_type (model_type),
    KEY idx_status (status)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

-- Inference results (time series)
CREATE TABLE thrml_inference (
    id BIGINT NOT NULL AUTO_INCREMENT,
    model_id INT NOT NULL,
    inferred_at DATETIME(6) NOT NULL,
    energy DECIMAL(12,6) NOT NULL,
    energy_zscore DECIMAL(8,4),
    mixing_time_ms INT,
    tension_level ENUM('normal', 'elevated', 'high', 'critical'),
    input_hash CHAR(32),
    input_summary JSON,
    notes TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id),
    KEY idx_model_inferred (model_id, inferred_at),
    KEY idx_tension (tension_level),
    CONSTRAINT fk_inference_model FOREIGN KEY (model_id) 
        REFERENCES thrml_models(id) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

-- Anomaly events (flagged for review)
CREATE TABLE thrml_anomalies (
    id INT NOT NULL AUTO_INCREMENT,
    inference_id BIGINT NOT NULL,
    model_id INT NOT NULL,
    detected_at DATETIME NOT NULL,
    anomaly_type VARCHAR(50),
    energy DECIMAL(12,6),
    energy_zscore DECIMAL(8,4),
    description TEXT,
    input_snapshot JSON,
    reviewed TINYINT(1) DEFAULT 0,
    review_notes TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id),
    KEY idx_model_detected (model_id, detected_at),
    KEY idx_reviewed (reviewed),
    CONSTRAINT fk_anomaly_inference FOREIGN KEY (inference_id) 
        REFERENCES thrml_inference(id) ON DELETE CASCADE,
    CONSTRAINT fk_anomaly_model FOREIGN KEY (model_id) 
        REFERENCES thrml_models(id) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
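
As an illustration of how an inference result might be shaped for thrml_inference, the sketch below builds the parameter dictionary for an INSERT. The use of MD5 for input_hash is an assumption (its 32-character hex digest happens to fit CHAR(32)); the specification does not name a hash function. The helper names are hypothetical.

```python
import hashlib
import json

def input_hash(bits):
    """32-character hex digest of a bit vector (MD5 is an assumption of this sketch)."""
    payload = ''.join(str(int(b)) for b in bits).encode('ascii')
    return hashlib.md5(payload).hexdigest()

def inference_row(model_id, inferred_at, energy, baseline_mean, baseline_std, bits, summary):
    """Build the parameter dict for an INSERT into thrml_inference."""
    return {
        'model_id': model_id,
        'inferred_at': inferred_at,
        'energy': round(energy, 6),
        'energy_zscore': round((energy - baseline_mean) / baseline_std, 4),
        'input_hash': input_hash(bits),
        'input_summary': json.dumps(summary, sort_keys=True),
    }

# Named placeholders in the style accepted by mysql-connector-python
INSERT_INFERENCE = (
    "INSERT INTO thrml_inference "
    "(model_id, inferred_at, energy, energy_zscore, input_hash, input_summary) "
    "VALUES (%(model_id)s, %(inferred_at)s, %(energy)s, %(energy_zscore)s, "
    "%(input_hash)s, %(input_summary)s)"
)
```

A call such as cursor.execute(INSERT_INFERENCE, inference_row(...)) would then write one row; the baseline mean and standard deviation come from the matching thrml_models record.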

5.2 Integration Pattern

┌─────────────────────────────────────────────────────────────────┐
│                         MySQL                                    │
│  ┌──────────────┐  ┌──────────────┐  ┌───────────────────────┐  │
│  │ tempest_     │  │ birdweather_ │  │ thrml_models          │  │
│  │ readings     │  │ detections   │  │ thrml_inference       │  │
│  └──────┬───────┘  └──────┬───────┘  │ thrml_anomalies       │  │
│         │                 │          └───────────┬───────────┘  │
└─────────┼─────────────────┼──────────────────────┼──────────────┘
          │ read            │ read                 │ write/read
          ▼                 ▼                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Python Layer                                  │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐                 │
│  │ Training   │  │ Inference  │  │ Anomaly    │                 │
│  │ (manual)   │  │ (cron)     │  │ Detection  │                 │
│  └────────────┘  └────────────┘  └────────────┘                 │
│         │              │               │                         │
│         └──────────────┴───────────────┘                         │
│                        │                                         │
│              THRML / JAX / M4 Max                                │
└─────────────────────────────────────────────────────────────────┘
          ▲
          │ read only
          ▼
┌─────────────────────────────────────────────────────────────────┐
│                    PHP/LAMP Layer                                │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐                 │
│  │ Dashboard  │  │ Energy     │  │ Anomaly    │                 │
│  │ Overview   │  │ Time Series│  │ Review     │                 │
│  └────────────┘  └────────────┘  └────────────┘                 │
└─────────────────────────────────────────────────────────────────┘

5.3 Operational Flow

Training (manual, ~monthly):

cd ~/Macroscope/THRML_POC
source ~/thrml-poc/bin/activate
python training/train_tempest.py --days 90
python training/train_bird.py --days 90

Writes to thrml_models, stores weights in filesystem.

Inference (cron, every 15 minutes):

*/15 * * * * /Users/mikehamilton/thrml-poc/bin/python /Users/mikehamilton/Macroscope/THRML_POC/inference/run_inference.py

Reads latest sensor data, runs mesh, writes to thrml_inference. If energy > 2σ, also writes to thrml_anomalies.

Display (PHP, on request):

Query thrml_inference for recent energy time series
Query thrml_anomalies for unreviewed events
Display tension level, trends, flagged events

5.4 Tension Level Thresholds

Level	Z-Score	Interpretation
normal	< 1.5	Typical conditions
elevated	1.5 - 2.0	Minor deviation
high	2.0 - 3.0	Notable anomaly
critical	> 3.0	Significant event, flag for review
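
The mapping from z-score to tension_level can be expressed as a small function. A z-score of exactly 2.0 falls on the boundary shared by two rows; this sketch assigns it to the higher level, which is an assumption rather than something the table specifies.

```python
def tension_level(zscore):
    """Map an energy z-score to one of the tension_level ENUM values.

    A z-score of exactly 2.0 is treated as 'high' (assumption of this sketch).
    """
    if zscore > 3.0:
        return 'critical'
    if zscore >= 2.0:
        return 'high'
    if zscore >= 1.5:
        return 'elevated'
    return 'normal'
```

Negative z-scores (energy below baseline) map to normal, in keeping with the one-sided interpretation of tension.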

6. Implementation Files

6.1 Directory Structure

~/Macroscope/THRML_POC/
├── config.py              # Database credentials, encoding params
├── data/
│   ├── tempest_loader.py  # Tempest data extraction
│   └── bird_loader.py     # BirdWeather data extraction
├── models/
│   ├── tempest_rbm.py     # Experiment A mesh
│   └── bird_rbm.py        # Experiment B mesh
├── training/
│   ├── train_tempest.py   # Training script A
│   └── train_bird.py      # Training script B
├── evaluation/
│   ├── anomaly_test.py    # Anomaly injection tests
│   └── energy_plots.py    # Visualization
└── notebooks/
    └── poc_exploration.ipynb  # Interactive development

6.2 Deliverables

Artifact	Description
`tempest_model.pkl`	Trained Tempest RBM weights
`bird_model.pkl`	Trained BirdWeather RBM weights
`baseline_energies.json`	Normal state energy statistics
`anomaly_results.json`	Anomaly injection test results
`energy_distribution.png`	Histogram of normal vs anomaly energies

7. Scaling Path

7.1 Phase 1: Local Validation (This Spec)

Data source: the Data development machine (real-time) or a recent Galatea export
Training window: 30-90 days
Mesh size: 135 nodes (Tempest), 77 nodes (BirdWeather)

7.2 Phase 2: Full Baseline Training

Data source: Galatea (full archive)
Training window: 12+ months (capture seasonal structure)
Mesh size: Scale to 1,000-5,000 nodes

7.3 Phase 3: Multi-Stream Integration

Combine Tempest + BirdWeather into unified mesh
Add temporal hierarchy (daily/seasonal/annual layers)
Target: 50,000+ nodes as described in CNL-TN-2026-014

8. Risk Factors

Risk	Mitigation
JAX/Metal compatibility issues	Fall back to CPU; THRML supports both
Insufficient training data on Data	Pull 90-day export from Galatea
Encoding loses ecological signal	Experiment with thermometer vs binary encoding
Mesh fails to learn structure	Start with known-good Ising parameters; tune beta

9. References

[1] Jelinčič, A., et al. (2025). "An efficient probabilistic hardware architecture for diffusion-like models." arXiv:2510.23972.

[2] Extropic Corp. (2025). "THRML: Thermodynamic Hypergraphical Model Library." https://github.com/extropic-ai/thrml

[3] Hamilton, M. P. (2026). "Embodied Ecological Sensing via Denoising Thermodynamic Models." CNL-TN-2026-014.

10. Document History

Version	Date	Changes
1.0	2026-02-01	Initial specification

Cite This Document

Hamilton, M. P. (2026). "THRML Proof-of-Concept Specification: Embodied Sensing Experiments." Canemah Nature Laboratory Specification CNL-SP-2026-015. https://canemah.org/archive/CNL-SP-2026-015

BibTeX

@manual{cnl2026thrml,
  author = {Hamilton, Michael P.},
  title = {THRML Proof-of-Concept Specification: Embodied Sensing Experiments},
  institution = {Canemah Nature Laboratory},
  year = {2026},
  number = {CNL-SP-2026-015},
  month = {february},
  url = {https://canemah.org/archive/document.php?id=CNL-SP-2026-015},
  abstract = {This specification defines two proof-of-concept experiments for validating the embodied sensing framework described in CNL-TN-2026-014. Using the THRML library on Apple M4 Max hardware, we implement minimal Boltzmann machine meshes trained on Macroscope sensor data. Experiment A uses Tempest weather station readings (continuous environmental variables). Experiment B uses BirdWeather detections (discrete species presence/absence with relational structure). Success criteria: measurable energy differential between normal and anomalous input states.}
}

Permanent URL: https://canemah.org/archive/document.php?id=CNL-SP-2026-015