CNL-TN-2025-001 Technical Note

LLM Knowledge Cartography: Parameter Scaling and Factual Accuracy in Small Language Models

Michael Hamilton, Ph.D. (Canemah Nature Laboratory)
Published: November 29, 2025
Version: 1

Abstract

This technical note documents an experimental investigation into factual accuracy across language models of varying parameter counts. Using a structured protocol of 25 questions spanning geography, science, history, culture, and technical domains, we assessed whether smaller language models could serve as reliable factual knowledge bases for constrained computational environments. Results reveal a clear scaling threshold: models below approximately 3 billion parameters exhibited systematic confabulation patterns, while larger models demonstrated reliable factual retrieval. These findings inform architectural decisions for the Macroscope environmental intelligence system, specifically regarding local versus cloud-based model deployment for sensor data interpretation.
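The protocol summarized above can be sketched as a simple evaluation loop: pose a fixed, domain-tagged question set to a model and score per-domain and overall accuracy. The sketch below is illustrative only; the sample questions, the lenient substring scoring, and the `stub_model` callable are assumptions standing in for the actual 25-question protocol and model endpoints used in the study.

```python
# Hypothetical sketch of the evaluation protocol: a domain-tagged question
# set, a model callable, and exact/substring accuracy scoring.
# Questions and stub model are illustrative, not the study's actual set.

QUESTIONS = [
    {"domain": "geography", "q": "What is the capital of Australia?", "a": "canberra"},
    {"domain": "science",   "q": "What is the chemical symbol for gold?", "a": "au"},
    {"domain": "history",   "q": "In what year did World War II end?", "a": "1945"},
]

def score_model(ask, questions):
    """Return (overall accuracy, per-domain accuracy) for a model callable `ask`."""
    by_domain = {}
    correct = 0
    for item in questions:
        answer = ask(item["q"]).strip().lower()
        hit = item["a"] in answer  # lenient substring match on the reference answer
        correct += hit
        tally = by_domain.setdefault(item["domain"], [0, 0])
        tally[0] += hit   # hits in this domain
        tally[1] += 1     # questions in this domain
    overall = correct / len(questions)
    return overall, {d: hits / total for d, (hits, total) in by_domain.items()}

# Stub standing in for a local or cloud-hosted model endpoint.
def stub_model(question):
    canned = {
        "What is the capital of Australia?": "Canberra",
        "What is the chemical symbol for gold?": "Au",
        "In what year did World War II end?": "1945",
    }
    return canned.get(question, "I don't know")

overall, per_domain = score_model(stub_model, QUESTIONS)
print(f"overall accuracy: {overall:.2f}")
```

In practice `ask` would wrap each candidate model (local or cloud), and the same question set would be replayed across parameter scales to locate the accuracy threshold the note reports.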



AI Collaboration Disclosure

Claude (Anthropic, claude-sonnet-4-20250514) — Analysis

The AI collaboratively designed the experimental protocol, executed model queries across the test subjects, and assisted in drafting the technical note. The human researcher directed all methodological decisions and verified all factual claims.

Human review: full

Cite This Document

Michael Hamilton, Ph.D. (2025). "LLM Knowledge Cartography: Parameter Scaling and Factual Accuracy in Small Language Models." Canemah Nature Laboratory Technical Note CNL-TN-2025-001. https://canemah.org/archive/CNL-TN-2025-001

BibTeX

@techreport{hamilton2025llm,
  author      = {Hamilton, Michael},
  title       = {LLM Knowledge Cartography: Parameter Scaling and Factual Accuracy in Small Language Models},
  institution = {Canemah Nature Laboratory},
  year        = {2025},
  number      = {CNL-TN-2025-001},
  month       = nov,
  url         = {https://canemah.org/archive/document.php?id=CNL-TN-2025-001},
  abstract    = {This technical note documents an experimental investigation into factual accuracy across language models of varying parameter counts. Using a structured protocol of 25 questions spanning geography, science, history, culture, and technical domains, we assessed whether smaller language models could serve as reliable factual knowledge bases for constrained computational environments. Results reveal a clear scaling threshold: models below approximately 3 billion parameters exhibited systematic confabulation patterns, while larger models demonstrated reliable factual retrieval. These findings inform architectural decisions for the Macroscope environmental intelligence system, specifically regarding local versus cloud-based model deployment for sensor data interpretation.}
}

Permanent URL: https://canemah.org/archive/document.php?id=CNL-TN-2025-001