CNL-TN-2026-047 Technical Note

The Macroscope Collaboratory

Michael P. Hamilton, Ph.D.
Published: April 9, 2026 Version: 2


Investigation Wizard Architecture and First Light

Canemah Nature Laboratory Technical Note Series

Document ID: CNL-TN-2026-047
Version: 0.1 (Draft)
Date: April 9, 2026
Author: Michael P. Hamilton, Ph.D.
Affiliation: Canemah Nature Laboratory, Oregon City, Oregon


AI Assistance Disclosure: This technical note was developed collaboratively with Claude (Anthropic, claude-opus-4-6) via Cowork. Claude contributed to system architecture, code implementation, browser-based testing, bug identification and repair, and document drafting. The author takes full responsibility for the content, accuracy, and conclusions.


Abstract

The Macroscope Collaboratory is a structured scientific investigation platform that guides human-AI collaborative research through a seven-phase workflow: Seed, Priors, Proposal, Workflow, Testing, Conclusions, and Reflections. Built as part of STRATA 2.0, the Collaboratory integrates MNG's ecological address system with real-time sensor data to provide grounded, place-based context for AI-assisted environmental investigations.

This technical note documents the architecture and first complete operational test of the Investigation Wizard -- the interactive interface where an investigator and an LLM work through each phase collaboratively. The system assembles a structured system prompt from ecological context priors (geology, climate, ecoregion, land cover, biodiversity), selected instruments (live sensor platforms from the macroscope database), and a persistent lab notebook that serves as the investigation's memory across phases.

First-light testing on April 9, 2026 exercised the complete seven-phase pipeline from Seed through Reflections: investigation creation with site selection, automated ecological context extraction from seven lookup_cache sources plus iNaturalist species data, context injection into the LLM system prompt, and full wizard execution against live Tempest weather station data. The test investigation (STR-001) analyzed diurnal temperature range consistency over a 7-day window, producing 42 notebook entries across all seven phases at a total cost of $0.94 using primarily Haiku 4.5.

The ecological context priors successfully grounded the AI's reasoning in validated place data -- referencing the correct Csb climate classification, NLCD land cover, geological substrate, and 35-year climate normals without hallucination. The Testing phase demonstrated the critical role of investigator oversight: Dr. Hamilton identified a timestamp anomaly (daily maxima reported at 23:xx instead of the expected 4 PM), a partial-day artifact inflating variability metrics, and a mischaracterized statistical test. The AI corrected its analysis in response, revising the coefficient of variation from 29.7% ("highly variable") to 17.1% ("moderately consistent") -- a materially different conclusion that validates the seven-phase workflow's built-in error correction mechanism.

Several architectural bugs were identified and resolved during testing, including model selection persistence from admin to wizard, phase name consistency in the system prompt, NLCD data extraction format mismatch, and mysqli protocol errors.


1. Introduction

1.1 Context

The Macroscope Collaboratory emerges from a series of architectural decisions documented in CNL-TN-2026-042 through CNL-TN-2026-046. Report 042 established the convergence plan between MNG (the place-based observatory) and STRATA (the temporal intelligence platform). Report 043 specified the distributed intelligence architecture. Report 044 addressed the sensor plugin abstraction layer. Report 045 demonstrated empirically that LLMs hallucinate geography when given sensor data without place context, and proposed the three-layer context architecture (STRATA IQ). Report 046 specified the Substrate -- a continuously maintained ecological context layer.

The Collaboratory is where these architectural threads converge into a user-facing system. It provides the investigation workflow that turns sensor data and ecological context into structured scientific inquiry, with the AI serving as a research partner rather than an oracle.

1.2 The Seven-Phase Workflow

The investigation workflow follows seven phases, each with a distinct role in the scientific process:

Phase Name Purpose
1 Seed Frame a research question grounded in available data streams. The AI surveys site context and instrument availability to help scope a testable hypothesis.
2 Priors Gather existing knowledge and establish baselines. Query sensor history, review ecological context, identify expected patterns.
3 Proposal Design the investigation methodology. Specify analyses, data queries, statistical tests, and deliverables.
4 Workflow Execute the investigation plan. Pull data, compute metrics, generate visualizations, record all findings.
5 Testing Validate and cross-check findings against baselines and alternative explanations. Challenge conclusions.
6 Conclusions Synthesize findings into a clear narrative. State what was found, what it means ecologically, and confidence level.
7 Reflections Meta-analysis of the investigation process. What worked, what to improve, follow-up questions.

Table 1. The seven-phase investigation workflow.

Each phase is recorded in the investigation's lab notebook. The notebook persists across sessions and serves as the AI's memory -- there is no conversation history. When the wizard advances to a new phase, the context builder reads all prior notebook entries and phase summaries, assembling a complete system prompt that gives the AI full awareness of the investigation's trajectory.

1.3 Scope of This Report

This report documents the architecture of the Investigation Wizard as implemented on April 9, 2026, the results of first-light testing with a complete seven-phase meteorological investigation (STR-001), bugs discovered and resolved during testing, and the architectural distinction between static context priors and dynamic virtual instruments.


2. Ecological Context Priors Architecture

2.1 The Problem

When an investigator creates a new investigation and selects a site, the system needs to provide the AI with a comprehensive ecological characterization of that place. CNL-TN-2026-045 demonstrated that without place context, LLMs fabricate geography. The ecological context priors solve this by extracting validated, cached environmental data at investigation creation time and storing it as part of the investigation record.

2.2 Data Sources

The ecological context API (eco_context.php) extracts priors from seven sources cached in the macroscope_nexus.lookup_cache table, keyed by geographic coordinates rounded to three decimal places. Each source was originally populated by MNG's habitat profiling system during Observatory visits:

Source Panel Data Extracted
resolve_eco Ecological Setting RESOLVE ecoregion, biome, realm; EPA Omernik ecoregion hierarchy
macrostrat Physical Place Geological unit, lithology (major/minor), age period
koppen Physical Place Koppen-Geiger climate classification code and description
climate_archive Physical Place 35-year climate normals: mean temperature, annual precipitation, water balance surplus/deficit months, growing degree days
nlcd Ecological Setting National Land Cover Database class (PALETTE_INDEX field)
landfire_evt Ecological Setting LANDFIRE Existing Vegetation Type code and name
ecoregion Ecological Setting EPA ecoregion (redundant with resolve_eco, used as fallback)

Table 2. Lookup cache sources used by the ecological context API.

In addition to the lookup_cache sources, the API queries the macroscope_nexus.species_cache table for biodiversity data from iNaturalist, grouped by iconic taxon (Plantae, Insecta, Aves, Mammalia, Fungi, Arachnida). The species_cache is populated automatically when MNG's Observatory page is visited for a site.
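The extraction flow described above can be sketched as follows. The three-decimal coordinate rounding and the seven source names come from the text; the function names and the in-memory cache stand-in are assumptions for illustration, not the actual eco_context.php implementation.

```python
# Illustrative sketch of lookup_cache keying and prior extraction.
# Source names follow Table 2; everything else is hypothetical.
SOURCES = ["resolve_eco", "macrostrat", "koppen", "climate_archive",
           "nlcd", "landfire_evt", "ecoregion"]

def cache_key(lat: float, lon: float) -> str:
    # Coordinates are rounded to three decimal places so repeat
    # visits to the same site resolve to the same cached rows.
    return f"{round(lat, 3):.3f},{round(lon, 3):.3f}"

def extract_priors(cache: dict, lat: float, lon: float) -> dict:
    """Collect whatever cached rows exist for this site, per source."""
    key = cache_key(lat, lon)
    priors = {}
    for source in SOURCES:
        row = cache.get((source, key))   # stand-in for a lookup_cache query
        if row is not None:
            priors[source] = row
    return priors
```

Missing sources simply produce no card, which matches the fallback behavior described for the redundant ecoregion source.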

2.3 Panel Structure

The extracted priors are organized into four panels that mirror the MNG Observatory's information architecture:

Panel Cards Source Type
Identity Location, Organizations, Site Types MNG places table
Physical Place Terrain, Geology, Climate, Climate History, Water Balance, Coordinates MNG places + lookup_cache
Ecological Setting Ecoregion & Biome, Land Cover, Vegetation Type lookup_cache
Living Systems Biodiversity species_cache (iNaturalist)

Table 3. Ecological context prior panels and their data sources.

For the Canemah Nature Laboratory test site, the API returned 13 cards across all four panels. The override system allows curated values in the places table (biome_override, ecoregion_override, etc.) to supersede API-derived values, reflecting field knowledge that may contradict coarse-resolution datasets.

2.4 Static Priors vs. Virtual Instruments

A critical architectural decision emerged during implementation: not all ecological context is static. Geology, climate classification, and ecoregion characterize a site on the scale of decades or centuries -- they are genuine priors. But current weather conditions, recent bird detections, and today's biodiversity observations are time-sensitive and need to be current at investigation runtime.

The distinction maps to two system components. Static priors (geology, climate zone, ecoregion, land cover) are captured at investigation creation time and stored in the investigation_context_priors table. Dynamic queries (current conditions, recent detections, species activity) will be implemented as virtual instruments -- tools the AI can invoke during any wizard phase to get current data. This parallels how the MNG Observatory already handles the difference: the habitat profile cards are cached, while the monitoring widgets query live sensor streams.

The virtual instrument architecture is documented in CNL-TN-2026-044 (Sensor Plugin Architecture). The Collaboratory will register query endpoints as instruments alongside physical sensor platforms, making "check current weather" and "pull recent bird detections" available as tool calls within the wizard.
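A minimal sketch of how a query endpoint might be registered alongside physical platforms in such a tool registry. All names here are hypothetical; the actual plugin design is specified in CNL-TN-2026-044.

```python
# Hypothetical virtual-instrument registry: query endpoints exposed as
# tool calls the wizard AI can invoke during any phase.
VIRTUAL_INSTRUMENTS = {}

def register_virtual_instrument(name, description, handler):
    """Expose a query endpoint as a callable tool, like a sensor platform."""
    VIRTUAL_INSTRUMENTS[name] = {"description": description,
                                 "handler": handler}

def call_instrument(name, **kwargs):
    """Dispatch a tool call from the wizard to the registered handler."""
    return VIRTUAL_INSTRUMENTS[name]["handler"](**kwargs)

# Stub handler standing in for a live Tempest query.
register_virtual_instrument(
    "check_current_weather",
    "Return the latest weather observation for the site",
    lambda site_id: {"site_id": site_id, "temp_f": 57.0},
)
```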


3. The Context Builder

3.1 System Prompt Assembly

The context builder (context_builder.php) assembles the complete system prompt for each wizard step. The prompt is rebuilt fresh on every interaction -- there is no persistent conversation history. Instead, the lab notebook serves as the investigation's memory.

The system prompt is assembled from six components in order:

# Component Content
1 Investigation Identity ID, title, domain, hypothesis, seed source, investigator name
2 Site Context Site name, coordinates, type (from observed_sites)
3 Ecological Context Priors All selected priors from investigation_context_priors, organized by panel
4 Selected Instruments Confirmed sensor platforms and fields with platform_id, domain, units, aggregation type
5 Phase Workflow + Current Phase Complete 7-phase overview plus detailed instructions for the active phase
6 Lab Notebook All notebook entries from prior phases plus current phase, including tool results, AI analysis, and investigator observations

Table 4. System prompt components assembled by the context builder.

The behavioral guidelines section appended to the prompt includes directives to use exact sensor values, ground reasoning in ecological context, avoid emojis, and refer to phases by their exact canonical names. These constraints were added during testing in response to observed AI behavior.
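The six-component assembly in Table 4 can be sketched as below. Python is used for illustration; the section titles, field names, and function signature are assumptions, not the actual context_builder.php markup.

```python
# Sketch of ordered system-prompt assembly per Table 4: identity first,
# notebook last, rebuilt fresh on every wizard interaction.
def build_system_prompt(inv: dict) -> str:
    sections = [
        ("INVESTIGATION", f"{inv['id']}: {inv['title']} ({inv['domain']})"),
        ("SITE CONTEXT", inv["site_summary"]),
        ("ECOLOGICAL CONTEXT PRIORS", "\n".join(inv["priors"])),
        ("SELECTED INSTRUMENTS", "\n".join(inv["instruments"])),
        ("PHASE WORKFLOW AND CURRENT PHASE", inv["phase_instructions"]),
        ("LAB NOTEBOOK", "\n".join(inv["notebook_entries"])),
    ]
    return "\n\n".join(f"{title}\n{body}" for title, body in sections)
```

Because the prompt is a pure function of the investigation record, any model can be handed the same context at any point, which is what makes the mid-investigation model switch in Section 7.1 possible.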

3.2 Notebook as Memory

The lab notebook replaces conversation history. Each wizard interaction records the user's prompt, the AI's response, tool calls and results, and any investigator annotations. When the AI enters a new phase, it receives summaries of all completed phases plus the full notebook entries up to the current phase. This design means the investigation can span multiple sessions, models, or even AI providers without losing context -- the notebook is the canonical record, not the chat log.

3.3 Token Economics

The notebook-as-memory architecture has a direct cost implication: as the investigation progresses, the system prompt grows with each accumulated notebook entry. By Phase 7 of STR-001, the context included 39 prior entries spanning all six completed phases, resulting in input token counts exceeding 70,000. For Haiku at $0.001/1K input tokens this is manageable ($0.047 for the Reflections phase), but the same context with Opus at $0.015/1K would cost $1.05 per query. This validates the tiered model strategy and suggests that notebook summarization (compressing earlier phases to summaries while preserving recent phases in full) will be important for longer investigations.
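The context-growth arithmetic can be sanity-checked with the quoted per-1K input prices. This is a naive input-only estimate; billed phase costs also include output tokens.

```python
# Input-token cost at the per-1K prices quoted in the text.
def input_cost(tokens: int, price_per_1k: float) -> float:
    return tokens / 1000 * price_per_1k

opus = input_cost(70_000, 0.015)     # the $1.05 figure in the text
haiku = input_cost(70_000, 0.001)    # same context on Haiku
assert abs(opus - 1.05) < 1e-9
assert abs(opus / haiku - 15) < 1e-9  # the 15x multiplier cited in Section 7.3
```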


4. First Light: STR-001

4.1 Test Design

The first operational test was deliberately simple: analyze diurnal temperature range (DTR) consistency over a 7-day window (April 2-9, 2026) using the Canemah Nature Weather Station (WeatherFlow Tempest, platform_id=1). The investigation was configured as STR-001, EARTH domain, with Haiku 4.5 as the default model to test the cost-performance boundary.

4.2 Results by Phase

Seed Phase ($0.057, Sonnet 4.6): The AI correctly surveyed site context via the query_place_context and discover_site_instruments tools, identifying the Tempest weather station and framing a testable hypothesis about DTR consistency. The ecological context priors were prominently featured -- the AI referenced Csb climate classification, 52.8 degrees F 35-year mean, NLCD 23 Developed Medium Intensity land cover, and volcanic bluff terrain without any prompting to use this information. Note: The Seed phase used Sonnet because the model selection bug (see Section 5) caused the wizard to ignore the investigation's default model.

Priors Phase ($0.042, Haiku 4.5): The AI pulled 2,093 raw temperature readings via get_sensor_history (7 days at approximately 5-minute intervals) and computed preliminary statistics: 36-83 degrees F range, 57.04 degrees F mean, 11.39 degrees F standard deviation. It correctly identified that the large range suggested day-to-day variation inconsistent with the hypothesis.

Proposal Phase ($0.152, Haiku 4.5): The AI proposed a four-step methodology: extract daily max/min temperatures, compute consistency metrics (mean DTR, standard deviation, coefficient of variation), assess visual patterns, and perform a statistical hypothesis test. It set interpretation thresholds: CV < 10% = highly consistent, 10-20% = moderately consistent, > 20% = highly variable.

Workflow Phase ($0.176, Haiku 4.5): The AI executed the proposed methodology, computing daily DTR values ranging from 14 degrees F (April 2, partial day) to 37 degrees F (April 9). The coefficient of variation was 29.7%, exceeding the high-variability threshold. The AI identified a synoptic weather pattern: cold air mass (April 2-3), warm high-pressure ridge (April 4-6), frontal passage (April 6-7), and recovery (April 7-9). The hypothesis was rejected: daily temperature ranges are not consistent.

Testing Phase ($0.340, Haiku 4.5): This was the most consequential phase. Dr. Hamilton identified three errors in the Workflow analysis:

  1. Daily maximum temperatures reported at 23:xx (11 PM) -- physically implausible for a mid-latitude April diurnal cycle where solar heating peaks around 4 PM. Likely a UTC/local timezone conversion issue in the sensor history tool.

  2. April 2 included as a complete day despite containing only evening data (starting at 17:01), artificially depressing the DTR and inflating the CV.

  3. The "ANOVA" claimed in the statistical test was invalid -- with one DTR value per day there is no within-group variance, making it descriptive statistics mischaracterized as a hypothesis test.

The AI acknowledged all three errors and recomputed: excluding April 2 and using 7 complete days (April 3-9), the revised CV dropped to 17.1% ("moderately consistent") from 29.7% ("highly variable"). This is a materially different scientific conclusion, demonstrating that the Testing phase's error-correction mechanism works as designed.
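The direction of the correction can be illustrated numerically. The daily DTR values below are stand-ins (the text reports only the endpoints, 14 degrees F for the partial April 2 and 37 degrees F for April 9), not the actual STR-001 series, so the computed percentages are illustrative rather than a reproduction of the 29.7% and 17.1% figures.

```python
# Coefficient of variation (population form) over daily DTR values,
# with and without the partial first day. Data are illustrative.
from statistics import mean, pstdev

def cv_percent(values):
    return pstdev(values) / mean(values) * 100

dtr_all = [14, 30, 33, 35, 25, 22, 28, 37]   # includes partial April 2
dtr_complete = dtr_all[1:]                   # April 3-9 only

# Dropping the depressed partial-day value raises the mean and shrinks
# the spread, so the CV falls -- the same direction as 29.7% -> 17.1%.
assert cv_percent(dtr_complete) < cv_percent(dtr_all)
```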

Conclusions Phase ($0.080, Haiku 4.5): The AI synthesized a clear finding: the hypothesis is rejected, daily DTR varies moderately (CV = 17.1%, range = 16 degrees F) reflecting typical spring synoptic activity. Confidence was rated moderate to high in the descriptive finding, but explicitly caveated by the unresolved timestamp anomaly. The AI correctly noted that the DTR values themselves are likely valid (max/min are correctly identified regardless of timestamp accuracy) but the provenance question undermines publication readiness.

Reflections Phase ($0.094, Haiku 4.5): The AI produced a thoughtful meta-analysis identifying four strengths (structured workflow, investigator oversight, data completeness, ecological contextualization) and four weaknesses (unresolved timestamp anomaly, statistical overreach, limited temporal scope, incomplete mechanistic understanding). It explicitly credited the investigator's domain expertise as essential for catching errors that automated analysis propagated, and proposed five follow-up investigations including timestamp resolution, seasonal comparison, mechanistic attribution, urban heat island quantification, and sensor calibration verification.

4.3 Cost Summary

Phase Model Cost Input Tokens Duration
1. Seed Sonnet 4.6 $0.028 -- 19.5s
2. Priors Haiku 4.5 $0.021 -- 9.4s
3. Proposal Haiku 4.5 $0.076 -- 14.3s
4. Workflow Haiku 4.5 $0.088 73,868 23.9s
5. Testing Haiku 4.5 $0.093 -- 16.2s
6. Conclusions Haiku 4.5 $0.040 -- 18.1s
7. Reflections Haiku 4.5 $0.047 -- 25.1s
Total $0.94

Table 5. Per-phase cost breakdown for STR-001.

Note: The total cost ($0.94) includes additional Testing phase queries where the AI pulled focused date-range data to investigate the timestamp anomaly, accounting for the higher Testing phase cost ($0.340 total including three sub-queries). The complete investigation produced 42 notebook entries.

4.4 Key Observations

Ecological context priors work. References to Csb climate, urban heat island effects from NLCD land cover, and the 35-year climate baseline appeared naturally throughout the analysis without explicit prompting. This confirms the STRATA IQ hypothesis from CNL-TN-2026-045: grounded place context prevents hallucination and enriches ecological interpretation.

Investigator oversight is essential. The Testing phase demonstrated that AI-generated analysis, even when plausible, can contain fundamental errors that only domain expertise catches. The timestamp anomaly, the partial-day artifact, and the statistical mischaracterization were all invisible to the AI but obvious to an experienced field scientist. The seven-phase workflow's separation of Workflow (execution) from Testing (validation) creates the structural space for this oversight.

The notebook captures the correction. Because the investigation's memory is the notebook rather than a chat log, the error correction is permanently recorded. A future reader (or AI) can see exactly what was wrong, who caught it, and how it was fixed. This transparency is fundamental to the Collaboratory's design as a scientific research tool.

Haiku is adequate for data-heavy work. At $0.94 total for a complete seven-phase investigation with 42 notebook entries and thousands of sensor readings, Haiku proves that meaningful environmental analysis can run at commodity cost. The reasoning limitations (missed timestamp anomaly, statistical overreach) are real but manageable with investigator oversight.


5. Bugs Identified and Resolved

Testing revealed several bugs, all resolved during the session:

Bug Root Cause Fix
Model selection not persisting wizard.php hardcoded 'selected' on Sonnet 4.6, never reading investigation.default_model from database PHP-driven selected attribute based on $investigation['default_model'], with Ollama re-selection after async model load
Phase name hallucination System prompt only described the current phase; AI invented names for other phases (e.g., "Phase 3: Observe") Added complete 7-phase workflow overview with explicit "never rename phases" directive
Emojis in AI responses No prohibition in system prompt guidelines Added "NEVER use emojis -- this is a scientific research environment" to behavioral guidelines
NLCD land cover not extracting Code expected GRAY_INDEX in GeoJSON properties; actual NLCD response uses PALETTE_INDEX Null coalesce both keys: $props['GRAY_INDEX'] ?? $props['PALETTE_INDEX']. Fixed in both eco_context.php and MNG habitat_api.php
mysqli "commands out of sync" Calling $stmt->get_result() inside while loop on every iteration violates mysqli protocol Capture result object once before loop: $result = $stmt->get_result()

Table 6. Bugs identified and resolved during first-light testing.

The NLCD fix is notable because it affected both the Collaboratory's ecological context API and MNG's production habitat profiler. The PALETTE_INDEX field name appears to be the current NLCD WCS response format, replacing the previously documented GRAY_INDEX.


6. Files Modified

File Changes
investigations/admin/api/eco_context.php Complete rewrite: 7 lookup_cache sources, species_cache biodiversity, 4-panel output matching MNG Observatory
investigations/wizard/context_builder.php Added 7-phase workflow overview, no-emojis directive, Living Systems panel label
investigations/wizard.php Model dropdown reads investigation.default_model; Ollama re-selection logic
investigations/admin/index.php Added Living Systems panel label to investigation detail view
Galatea/MNG/admin/lab/api/habitat_api.php NLCD PALETTE_INDEX fallback fix

Table 7. Files created or modified during the session.

Additionally, the investigation_context_priors table was created in strata_db via context_priors.sql.


7. Architectural Decisions

7.1 Notebook Over Conversation History

The decision to use the lab notebook as the AI's memory rather than conversation history has several implications. The investigation becomes model-agnostic: an investigator could use Haiku for data-heavy phases and Opus for synthesis, or switch providers entirely, without losing context. The notebook also serves as a permanent scientific record -- every tool call, data query, and analytical conclusion is preserved with attribution and timestamps. This aligns with the transparent attribution philosophy documented in the Science with Claude (SWC) platform.

STR-001 demonstrated this in practice: the Seed phase ran on Sonnet 4.6 (due to the model selection bug), while Phases 2-7 ran on Haiku 4.5. The transition was seamless -- Haiku picked up the investigation context from the notebook entries written by Sonnet without any loss of continuity.

7.2 Static Priors vs. Virtual Instruments

The separation of static priors (captured at creation time) from dynamic instruments (queried at runtime) reflects an ecological reality: the geological substrate of Canemah Bluff does not change between Monday and Thursday, but the bird community does. Storing static characterizations in investigation_context_priors ensures they survive even if the upstream API changes, while virtual instruments ensure that time-sensitive data is always current. This is the same pattern the MNG Observatory uses: cached habitat profiles for stable characterizations, live monitoring widgets for sensor streams.

7.3 Cost-Aware Model Selection

The investigation admin form allows specifying a default model per investigation. STR-001's complete seven-phase run cost $0.94 with Haiku 4.5. The same investigation with Opus at current pricing would cost approximately $14 -- a 15x multiplier for what is fundamentally a data retrieval and descriptive statistics task. The architecture supports per-phase model selection, enabling a strategy where data acquisition runs on Haiku, analysis on Sonnet, and synthesis on Opus.

7.4 The Testing Phase as Error Correction

The most important architectural validation from STR-001 is that the Testing phase works as a structural error-correction mechanism. By separating Workflow (do the analysis) from Testing (challenge the analysis), the system creates space for the investigator to apply domain expertise before conclusions are drawn. The AI's initial finding (CV = 29.7%, "highly variable") was corrected to a materially different conclusion (CV = 17.1%, "moderately consistent") because the workflow forced a validation step. Without the Testing phase, the investigation would have published the wrong conclusion.


8. Next Steps

The first-light test validated the core pipeline. The following work remains:

Timestamp timezone resolution. The daily max temperature timestamps reported at 23:xx must be investigated to determine whether this is a UTC/local conversion issue in the sensor history tool, a Tempest configuration issue, or a data storage artifact. This is the highest priority -- it affects data trustworthiness across all EARTH domain investigations.

Virtual instrument registration for current conditions (weather, bird detections, iNaturalist observations) as tool-callable endpoints within the wizard, replacing the need for manual sensor history queries.

Visualization generation within the Workflow phase -- the AI proposed charts and graphs but lacks a rendering tool. Integration with a server-side plotting capability (Python matplotlib or PHP-based charting) would close this gap.

SOMA integration for anomaly detection priors -- feeding RBM mesh tension states into the Priors phase as a "what is unusual right now" signal.

Multi-site investigation support, leveraging MNG's place registry to compare ecological patterns across monitoring sites (e.g., Canemah vs. Owl Farm temperature differentials).

Notebook summarization to manage token growth in longer investigations. Earlier phases could be compressed to summaries while preserving recent phases in full, keeping context costs manageable.


9. Relationship to Other Documents

Document Relationship
CNL-TN-2026-042 STRATA/MNG Convergence Plan -- the Collaboratory implements the convergence of place-based context with temporal intelligence
CNL-TN-2026-043 STRATA 2.0 Architecture -- the Collaboratory is a primary consumer of the distributed intelligence services
CNL-TN-2026-044 Sensor Plugin Architecture -- virtual instruments extend this to include query-based data sources
CNL-TN-2026-045 STRATA IQ -- the ecological context priors implement the place-based context architecture specified here
CNL-TN-2026-046 The Substrate -- the continuously maintained context layer that will eventually replace per-request context assembly

Table 8. Related documents in the CNL technical note series.


Appendix A: STR-001 Investigation Summary

Investigation ID: STR-001
Title: Trends in daily outdoor temperature over the past 7 days
Domain: EARTH
Site: Canemah Nature Laboratory (place_id=37)
Instrument: Canemah Nature Weather Station (platform_id=1, WeatherFlow Tempest)
Hypothesis: Daily temperatures are similar every day over the past 7 days
Result: REJECTED. Daily diurnal temperature ranges vary moderately (CV = 17.1%, range = 16 degrees F) across April 3-9, 2026, reflecting typical spring synoptic activity in the Pacific Northwest.
Confidence: Moderate to high in the descriptive finding, caveated by unresolved timestamp anomaly.
Total Cost: $0.94 (42 notebook entries, 7 phases)
Models Used: Claude Sonnet 4.6 (Phase 1), Claude Haiku 4.5 (Phases 2-7)
Duration: Approximately 30 minutes investigator time


Document History

Version Date Changes
0.1 2026-04-09 Initial draft. Investigation Wizard architecture, ecological context priors, context builder, first-light test results (STR-001 complete seven-phase run), bug inventory.

Cite This Document

Michael P. Hamilton, Ph.D. (2026). "The Macroscope Collaboratory." Canemah Nature Laboratory Technical Note CNL-TN-2026-047. https://canemah.org/archive/CNL-TN-2026-047

BibTeX

@techreport{hamilton2026macroscope,
  author      = {Hamilton, Michael P.},
  title       = {The Macroscope Collaboratory},
  institution = {Canemah Nature Laboratory},
  year        = {2026},
  number      = {CNL-TN-2026-047},
  month       = apr,
  url         = {https://canemah.org/archive/document.php?id=CNL-TN-2026-047}
}

Permanent URL: https://canemah.org/archive/document.php?id=CNL-TN-2026-047