Single-Image 3D Reconstruction from 360° Imagery: Experimental Findings Using Apple SHARP
Document ID: CNL-FN-2026-XXX (auto-assigned)
Date: January 17, 2026
Author: Michael P. Hamilton, Ph.D.
Version: 0.2 (Post-experiment)
Status: Initial findings documented
AI Assistance Disclosure: This field note was developed with assistance from Claude (Anthropic, Opus 4.5). The AI contributed to literature review of the SHARP repository documentation, experimental design discussion, protocol development, software development, and manuscript drafting. The author takes full responsibility for the content, accuracy, and conclusions.
Abstract
This field note documents experimental findings from testing Apple's SHARP model for 3D reconstruction from 360° imagery. The original hypothesis—that cubemap faces from a spherical capture could be processed independently through SHARP and merged into a unified scene model—proved incorrect. SHARP generates independent coordinate systems for each input image with inconsistent depth scales, making geometric fusion infeasible. However, the experiment yielded a productive reframing: each cubemap face produces a valid "terrarium"—a measurable 3D frustum suitable for per-view analysis. We developed a complete toolkit including high-resolution cubemap extraction, a custom WebGL2 Gaussian splatting renderer matching commercial quality, and format converters for GIS/CAD integration. Key finding: input image resolution significantly impacts splat quality; 1536px cubemap faces (from 6080×3040 source imagery) produce substantially better results than 512px extractions.
1. Background and Motivation
The original experimental design proposed using GPS-tagged 360° imagery as a registration framework for SHARP-based 3D reconstruction. The reasoning: cubemap decomposition extracts perspective views with mathematically defined angular relationships, potentially enabling deterministic alignment without feature matching.
This hypothesis required testing before field deployment.
2. Experimental Process
2.1 Pipeline Development
We built a processing pipeline with the following components:
Cubemap Extraction (equirect_to_cubemap.py): Extracts six perspective faces from equirectangular 360° images. Supports automatic resolution detection: for a 6080×3040 source, we use 1536px faces (approximately source width ÷ 4, matching SHARP's internal 1536×1536 input size).
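The core of the extraction is the mapping from a face-pixel ray to equirectangular coordinates; a minimal sketch of that mapping (function name and axis conventions are illustrative, not the script's actual API):

```python
import numpy as np

def equirect_uv(direction):
    """Map a unit 3D direction to (u, v) in an equirectangular image.

    u spans longitude (0..1 around the sphere), v spans latitude (0..1
    top to bottom). Illustrative convention: +z forward, +y up.
    """
    x, y, z = direction
    lon = np.arctan2(x, z)               # -pi..pi, 0 = straight ahead
    lat = np.arcsin(np.clip(y, -1, 1))   # -pi/2..pi/2, +y = up
    u = lon / (2 * np.pi) + 0.5
    v = 0.5 - lat / np.pi
    return u, v

# The center of the "front" face looks straight along +z,
# which lands at the center of the equirectangular image:
u, v = equirect_uv((0.0, 0.0, 1.0))
```

Each face pixel is converted to such a direction via the face's rotation, mapped to (u, v), and sampled with bilinear interpolation.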
SHARP Processing: Each cubemap face processed independently:
sharp predict -i front.jpg -o front.ply
Output: ~1.18 million 3D Gaussians per face (fixed by the SHARP architecture, regardless of input resolution).
Visualization: Custom WebGL2 Gaussian splatting renderer (splat_viewer.html) achieving visual quality comparable to SuperSplat.
Format Conversion (splat_convert.py): Export to .splat (web), .xyz/.las (GIS), .obj (CAD), .csv (analysis).
2.2 The Merge Attempt
Initial attempts to merge six cubemap faces into a unified spherical model failed. Investigation revealed the core problem:
Each SHARP prediction exists in its own coordinate system with independent depth scale.
| Face | Z Range (meters) | Span |
|---|---|---|
| down | 0.8 – 3.4 | 2.6m |
| up | 1.4 – 17.5 | 16.1m |
| left | 1.1 – 66.7 | 65.6m |
| back | 0.9 – 80.2 | 79.3m |
| right | 1.3 – 132.3 | 131.0m |
| front | 1.2 – 135.9 | 134.7m |
The side faces look through forest to distant points 100+ meters away; the up and down faces have compact ranges. These scales cannot be reconciled by rotation alone; the depth predictions are fundamentally incompatible at face boundaries.
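The incompatibility can be shown directly: rotation is an isometry, so it can reorient a face's point cloud but never change its extent. A small numpy demonstration with toy clouds mimicking the table's down and front faces (values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy point clouds standing in for two SHARP faces: one compact
# ("down", z = 0.8-3.4 m) and one deep ("front", z = 1.2-135.9 m).
down  = rng.uniform([-1, -1, 0.8], [1, 1, 3.4],   size=(1000, 3))
front = rng.uniform([-1, -1, 1.2], [1, 1, 135.9], size=(1000, 3))

# A 90-degree rotation about y, as a naive merge step would apply:
R = np.array([[0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [-1.0, 0.0, 0.0]])
rotated = down @ R.T

# Rotation preserves every point's distance from the origin, so it can
# reorient the ~2.6 m "down" frustum but never stretch it to meet the
# ~135 m "front" frustum: the scale mismatch survives any rotation.
dist_before = np.linalg.norm(down, axis=1)
dist_after  = np.linalg.norm(rotated, axis=1)
```

Reconciling the faces would require per-face, depth-dependent rescaling that SHARP does not expose.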
2.3 Reframing: The Terrarium Model
Rather than a failed merge, we reframed the output as six independent measurement windows. Each cubemap face produces a valid 3D frustum—a "terrarium" with analyzable structure:
- Ground plane detection
- Height distribution analysis
- Vegetation strata classification
- Canopy cover estimation (for "up" face)
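As a sketch of the strata analysis above (the bin edges and percentile-based ground estimate are illustrative assumptions, not calibrated thresholds):

```python
import numpy as np

def classify_strata(heights, edges=(0.5, 2.0, 10.0)):
    """Bucket point heights (m above estimated ground) into strata.

    Illustrative edges: ground cover <0.5 m, shrub 0.5-2 m,
    understory 2-10 m, canopy >10 m. Returns fraction per stratum.
    """
    labels = ["ground cover", "shrub", "understory", "canopy"]
    idx = np.digitize(heights, edges)
    counts = np.bincount(idx, minlength=len(labels))
    return dict(zip(labels, counts / len(heights)))

# Ground plane as a robust low percentile of z, then heights above it:
z = np.array([0.1, 0.2, 1.5, 3.0, 12.0, 15.0])   # toy z values (m)
ground = np.percentile(z, 5)
fractions = classify_strata(z - ground)
```

The same logic applies unchanged to the .xyz exports from the converter.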
This reframing aligns with how SHARP was designed: single-image view synthesis for nearby camera movements, not omnidirectional scene reconstruction.
2.4 Resolution Discovery
A critical finding: input resolution significantly impacts output quality.
Initial tests used 512px cubemap faces (from a downscaled 1568×784 test image). Results were visibly inferior to the SuperSplat reference viewer.
Re-extracting at 1536px (from the full 6080×3040 source) produced dramatically sharper splats, matching SuperSplat's rendering quality in our custom viewer.
SHARP internally resizes all inputs to 1536×1536 and outputs a fixed ~1.18M Gaussians. However, starting with higher-resolution input preserves finer detail before this internal resize.
Recommendation: extract cubemap faces at approximately source width ÷ 4, with a minimum of 1024px; 1536px (SHARP's internal input size) is optimal.
3. Toolkit Developed
3.1 Cubemap Extractor
equirect_to_cubemap.py
python3 equirect_to_cubemap.py source.jpg ./output --size 1536
python3 equirect_to_cubemap.py source.jpg ./output --size max # auto-detect
Features:
- Bilinear interpolation for quality extraction
- Metadata JSON with rotation parameters
- Support for standard cubemap (6 faces) or denser grids
3.2 Gaussian Splat Viewer
splat_viewer.html
Custom WebGL2 renderer implementing proper Gaussian splatting:
- Vertex shader projects 3D covariance → 2D screen-space ellipse
- Fragment shader computes Gaussian alpha falloff
- Back-to-front depth sorting for correct blending
- Uses all Gaussian attributes: position, scale (3D), rotation (quaternion), color, opacity
- Retina/HiDPI support via devicePixelRatio
- Orbit controls with zoom range 0.1–50
Performance: 60 FPS on M4 Max with 1.18M Gaussians.
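The projection step in the vertex shader follows the standard 3D Gaussian splatting math (Kerbl et al. 2023): build the 3D covariance from scale and quaternion, then push it through the Jacobian of the perspective projection. A numpy sketch of that math (variable names illustrative):

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix from a (w, x, y, z) unit quaternion."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def screen_covariance(scale, quat, mean_cam, fx, fy):
    """Project a Gaussian's 3D covariance to a 2x2 screen-space covariance.

    Sigma = R S S^T R^T; the perspective Jacobian J linearizes the
    projection at the Gaussian's camera-space mean.
    """
    R = quat_to_rot(quat)
    S = np.diag(scale)
    sigma = R @ S @ S.T @ R.T                 # 3D covariance
    x, y, z = mean_cam
    J = np.array([[fx / z, 0,      -fx * x / z**2],
                  [0,      fy / z, -fy * y / z**2]])
    return J @ sigma @ J.T                    # 2x2 screen-space ellipse

cov2d = screen_covariance(scale=(0.05, 0.05, 0.05),
                          quat=(1.0, 0.0, 0.0, 0.0),
                          mean_cam=(0.0, 0.0, 2.0),
                          fx=1000.0, fy=1000.0)
```

The fragment shader then evaluates the Gaussian falloff against this 2×2 covariance to get per-pixel alpha.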
3.3 Format Converter
splat_convert.py
python3 splat_convert.py input.ply output.splat # Web viewer format
python3 splat_convert.py input.ply output.xyz # Point cloud (ASCII)
python3 splat_convert.py input.ply output.las # LIDAR/GIS format
python3 splat_convert.py input.ply output.obj # Blender/CAD
python3 splat_convert.py input.ply output.csv # Full data for analysis
python3 splat_convert.py input.ply output.ply --standard # MeshLab compatible
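For reference, the .splat web format packs each Gaussian into a fixed 32-byte record; a minimal packing sketch following the widely used web-viewer layout (confirm against splat_convert.py before relying on it):

```python
import struct

def pack_gaussian(pos, scale, rgba, quat):
    """Pack one Gaussian into a 32-byte .splat record:
    3 float32 position, 3 float32 scale, 4 uint8 RGBA, 4 uint8
    quaternion (each component mapped from [-1, 1] to [0, 255]).
    """
    q8 = [min(255, max(0, int(round(c * 128 + 128)))) for c in quat]
    return struct.pack("<3f3f4B4B", *pos, *scale, *rgba, *q8)

record = pack_gaussian(pos=(1.0, 2.0, 3.0),
                       scale=(0.05, 0.05, 0.05),
                       rgba=(200, 180, 160, 255),
                       quat=(1.0, 0.0, 0.0, 0.0))
```

The fixed record size is what lets the web viewer stream and sort millions of Gaussians without parsing a PLY header.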
4. Key Findings
4.1 What Works
- Per-face 3D reconstruction: Each cubemap face produces valid, analyzable 3D structure
- Rapid processing: <1 second per face on Apple Silicon (MPS)
- High visual quality: Custom renderer matches commercial SuperSplat quality
- Format flexibility: Export chain to GIS, CAD, and web platforms
- Resolution scaling: Higher input resolution → finer splat detail
4.2 What Doesn't Work
- Spherical merge: Independent depth scales prevent geometric fusion
- Cross-face consistency: Adjacent faces don't align at boundaries
- Metric accuracy: Depth is plausible but not calibrated to real-world scale without reference objects
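If a reference object of known size is visible in a face, a single scalar can calibrate that face after the fact, assuming SHARP's depth is internally consistent within the face; a hedged sketch:

```python
import numpy as np

def calibrate_scale(points, measured_len, true_len):
    """Rescale one face's point cloud so a reference object of known
    size (e.g. a survey rod) measures correctly. One scalar suffices
    only under the assumption that relative depth within the face is
    consistent, which SHARP appears to provide per view.
    """
    return points * (true_len / measured_len)

pts = np.array([[0.0, 0.0, 2.0],
                [0.0, 0.0, 4.0]])
# Rod spans 0.8 model units but is 1.0 m in reality:
calibrated = calibrate_scale(pts, measured_len=0.8, true_len=1.0)
```

Cross-face calibration still fails, since each face needs its own factor.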
4.3 Implications for Field Use
SHARP + 360° imagery is not a replacement for photogrammetric reconstruction of unified scene models. However, it offers:
- Rapid single-view 3D documentation
- Per-direction habitat structure analysis
- Visual exploration of forest structure
- Educational/outreach 3D content
- Preliminary site assessment before committing to full photogrammetry
5. Future Directions
5.1 WebXR Integration
The custom viewer could be extended with Three.js WebXR support for immersive exploration. Alternatively, GaussianSplats3D library provides ready-made WebXR compatibility with our PLY files.
5.2 Analysis Tools
Height profiling, vegetation strata classification, and canopy cover analysis could be integrated into the viewer or developed as CLI tools operating on the Gaussian data.
5.3 Multi-Station Capture
While single-sphere merge fails, between-station registration via GPS + ICP might still enable corridor or transect documentation, treating each station as a separate 6-terrarium sampling point rather than attempting unified reconstruction.
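A plausible seed for that between-station registration is the GPS baseline, converted to local meters with a flat-earth approximation before ICP refinement; a sketch (constants are standard; the approximation is only adequate for short transect baselines):

```python
import math

def gps_offset_m(lat0, lon0, lat1, lon1):
    """Approximate east/north offset in meters between two GPS fixes,
    using a local flat-earth (equirectangular) approximation. Good
    enough for stations tens of meters apart, as an ICP initial guess.
    """
    m_per_deg_lat = 111_320.0                               # ~constant
    m_per_deg_lon = 111_320.0 * math.cos(math.radians(lat0))
    east = (lon1 - lon0) * m_per_deg_lon
    north = (lat1 - lat0) * m_per_deg_lat
    return east, north

# Two hypothetical stations ~11 m apart along a north-south transect:
east, north = gps_offset_m(45.0, -122.0, 45.0001, -122.0)
```

Consumer GPS error (meters) would dominate at this spacing, which is why ICP refinement on the overlapping ground-plane points would still be needed.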
6. Files Delivered
| File | Purpose |
|---|---|
| equirect_to_cubemap.py | Extract perspective faces from 360° images |
| splat_viewer.html | WebGL2 Gaussian splat renderer |
| splat_convert.py | Format conversion (PLY → splat/xyz/las/obj/csv) |
7. Connection to Macroscope
This experiment exemplifies the Macroscope approach: rapid prototyping to evaluate emerging tools, honest documentation of both successes and limitations, and productive reframing when initial hypotheses fail.
The "terrarium" model—six measurable windows rather than one unified sphere—may prove more useful for certain ecological questions than the originally envisioned seamless reconstruction. Vertical structure analysis (down face), canopy openness (up face), and directional habitat characterization (cardinal faces) are valid measurement targets even without geometric fusion.
The toolkit developed here joins the Macroscope instrument collection: sensors and software for technology-mediated environmental observation.
References
[1] Apple Machine Learning Research (2025). "SHARP: Single-image High-Accuracy Real-time Parallax." GitHub repository. https://github.com/apple/ml-sharp
[2] Kerbl, B., Kopanas, G., Leimkühler, T., & Drettakis, G. (2023). "3D Gaussian Splatting for Real-Time Radiance Field Rendering." SIGGRAPH 2023.
Document History
| Version | Date | Changes |
|---|---|---|
| 0.1 | 2026-01-17 | Initial draft, pre-field trial |
| 0.2 | 2026-01-17 | Post-experiment: documented merge failure, terrarium reframing, toolkit development, resolution findings |
Cite This Document
Permanent URL: https://canemah.org/archive/document.php?id=CNL-TN-2026-005