Atlas Heritage Systems
Diagnostic Suite
FVE-1 instrument stack · Ensemble instruments · Schema FVE-1 V5.7
The diagnostic suite is organized around two instrument families. The FVE-1 stack is the primary research family: its instruments are thermometers that measure behavioral state over time as variables are applied. The ensemble instruments are stopwatches that measure the distance between points. Both families are valid. Neither can do what the other does.
The FVE-1 stack runs on a set of self-contained HTML tools — Session Logger, Tech Read Formatter, Stimulus Registry, instrument-specific capture tools, Baseline Deriver, Parameter Signing Tool, and Codebook Tracker. Each tool enforces its own fidelity gates: fields that must be populated before the next step unlocks, provenance signatures that are verified on load, export warnings that block output if the chain is broken. The gating is structural, not discipline-dependent. A solo investigator running the full pipeline clears the same validation checkpoints that a multi-person lab would. The tools are the institutional infrastructure.
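The three kinds of gate named above (required fields, provenance signatures checked on load, export blocked on a broken chain) can be sketched as follows. This is a minimal illustration in Python, not the tools' own code, which ships as HTML. The field names, the `SessionRecord` shape, and the choice of HMAC-SHA256 as the signature scheme are all assumptions made for the example.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass

# Hypothetical required fields; the real tools define their own.
REQUIRED_FIELDS = ("session_id", "model", "investigator", "stimulus_id")


@dataclass
class SessionRecord:
    fields: dict
    signature: str = ""


def canonical_bytes(fields: dict) -> bytes:
    # Stable serialization so identical content always signs identically.
    return json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(fields: dict, key: bytes) -> str:
    # Assumed scheme: HMAC-SHA256 over the canonical serialization.
    return hmac.new(key, canonical_bytes(fields), hashlib.sha256).hexdigest()


def missing_fields(record: SessionRecord) -> list:
    # Gate 1: every required field must be populated before the next step unlocks.
    return [name for name in REQUIRED_FIELDS if not record.fields.get(name)]


def signature_ok(record: SessionRecord, key: bytes) -> bool:
    # Gate 2: the provenance signature is verified on load.
    return hmac.compare_digest(record.signature, sign(record.fields, key))


def export(record: SessionRecord, key: bytes) -> str:
    # Gate 3: export is blocked if any link in the chain is broken.
    missing = missing_fields(record)
    if missing:
        raise ValueError(f"export blocked, missing fields: {missing}")
    if not signature_ok(record, key):
        raise ValueError("export blocked, provenance signature mismatch")
    return canonical_bytes(record.fields).decode("utf-8")
```

Because the checks live in `export` itself, a caller cannot skip them by procedure alone, which is the sense in which the gating is structural.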
Baseline · Compression probe · Endurance experiment · Vortex falsification · Parameter variation
BOWL runs first. DRILL and FLIGHT require a confirmed BOWL baseline for register axis data.
Behavioral vocabulary · Gap structure · Ensemble distance · Pressure fingerprint
ECM defines the behavioral vocabulary all instruments share. FVE-1 instruments are thermometers — they measure state over time. Ensemble instruments are stopwatches — they measure distance between points. You cannot derive trajectory from a distance measurement.
FVE-1 Stack
Tier C · Investigator is a required variable · Schema FVE-1 V5.7
Identity and Register Baseline
Baseline instrument — required before register axis data
Locates the model's home register in the absence of content load or frame pressure. Output is a signed, versioned baseline code that travels into every subsequent FLIGHT and DRILL session for this model. Without BOWL, register axis data is null.
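The dependency on BOWL can be sketched as a check made when a FLIGHT or DRILL session is opened. This is an illustrative Python sketch under stated assumptions: the `BaselineCode` fields, the `open_session` function, and the numeric `register_reading` are hypothetical and are not taken from the FVE-1 schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BaselineCode:
    model: str
    version: str      # hypothetical version label
    signature: str    # hypothetical signature string
    confirmed: bool


def open_session(instrument: str, model: str,
                 baseline: Optional[BaselineCode],
                 register_reading: float) -> dict:
    """Build a session record; register axis data is None without a usable baseline."""
    if instrument not in ("FLIGHT", "DRILL"):
        raise ValueError(f"unexpected instrument: {instrument}")
    # The baseline must exist, be confirmed, and belong to the same model.
    usable = (baseline is not None
              and baseline.confirmed
              and baseline.model == model)
    return {
        "instrument": instrument,
        "model": model,
        "baseline_version": baseline.version if usable else None,
        "register_axis": register_reading if usable else None,
    }
```

Returning `None` rather than raising mirrors the statement that register axis data is null without BOWL: the session can still exist, but that axis carries no value.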
Frame Variation Experiment — FLIGHT
Primary endurance instrument
16-session experiment. Four stimuli × four frames, same model across all sessions. Escalating self-reference load from external mathematical object to direct self-placement. Measures behavioral trajectory, not a point.
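The session count follows from crossing the two factors: 4 stimuli × 4 frames = 16. A short Python sketch of that grid is below. The stimulus and frame labels are placeholders, and the stimulus-major session order is an assumption; the source does not state the run order.

```python
from itertools import product

# Placeholder labels. Stimuli are assumed to be listed in order of
# increasing self-reference load.
STIMULI = ["S1", "S2", "S3", "S4"]
FRAMES = ["F1", "F2", "F3", "F4"]


def flight_plan(model: str) -> list:
    """Enumerate one session per (stimulus, frame) pair for a single model."""
    plan = []
    for number, (stimulus, frame) in enumerate(product(STIMULI, FRAMES), start=1):
        plan.append({
            "session": number,
            "model": model,       # same model across all sessions
            "stimulus": stimulus,
            "frame": frame,
        })
    return plan
```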
Multi-Frame Compression Probe
Compression arc and correction-path instrument
Two-frame probe — Socratic and Interrogative — same content-loaded factual stimulus. Tracks compression arc across moves, lock move, and correction-path intercept at M6. Generates register escape specimens for loss landscape analysis.
Torque Ablation POC
Vortex physics falsification · Failure mode mapping · Parameter selection
Seven-session 3×2 factorial. Three torque vectors, two conditions each. Three purposes from one dataset: falsify or confirm the vortex physics model, map failure modes across six load patterns, and select the parameter set for panel instrument lock.
Parameter Variation and Retro-Code
Infrastructure — parameter ruler swap and transcript re-measurement
Swaps the parameter ruler and re-measures TAP transcripts against a new key. Produces a comparison table and parameter selection memo as dual inputs to TAP CA.4. The transcripts do not move. The reference frame does.
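The ruler swap amounts to scoring the same fixed transcripts under two different keys and tabulating the difference. The Python sketch below illustrates that idea only. Treating a parameter key as a mapping from marker term to weight, and scoring by weighted term count, are assumptions made for the example; the actual parameter rulers are not described here.

```python
def score(transcript: str, key: dict) -> float:
    # Assumed scoring rule: weighted count of marker terms.
    words = transcript.lower().split()
    return sum(weight * words.count(term) for term, weight in key.items())


def retro_code(transcripts: dict, old_key: dict, new_key: dict) -> list:
    """Re-measure unchanged transcripts under a new key and compare."""
    rows = []
    for transcript_id in sorted(transcripts):
        text = transcripts[transcript_id]   # read only, never modified
        old = score(text, old_key)
        new = score(text, new_key)
        rows.append({
            "transcript": transcript_id,
            "old": old,
            "new": new,
            "delta": new - old,
        })
    return rows
```

The transcripts pass through untouched; only the key argument changes between the two measurements.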
Ensemble Instruments
Stopwatch instruments · Point-to-point measurement
Epistemic Canary Matrix
Behavioral classification framework
Maps model behavior onto a two-axis matrix: Token Economy (Verbose/Surgical) × Epistemic Stance (Compliant/Combative). Tracks quadrant migration under epistemic load. Defines the behavioral vocabulary all active instruments share.
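Quadrant assignment and migration tracking can be sketched as below. This is a hedged Python illustration: representing each axis as a signed score, with zero as the boundary, is an assumption, as are the function names. The source defines only the axis labels.

```python
def quadrant(token_economy: float, stance: float) -> str:
    # Assumption: positive scores mean Verbose / Combative,
    # negative scores mean Surgical / Compliant.
    economy = "Verbose" if token_economy >= 0 else "Surgical"
    posture = "Combative" if stance >= 0 else "Compliant"
    return f"{economy}/{posture}"


def migrations(readings: list) -> list:
    """Return (step, from_quadrant, to_quadrant) for each change of quadrant."""
    path = [quadrant(economy, stance) for economy, stance in readings]
    return [
        (step, before, after)
        for step, (before, after) in enumerate(zip(path, path[1:]), start=1)
        if before != after
    ]
```

For example, a model that begins Verbose/Compliant and ends Surgical/Combative under load yields a single migration entry at the step where the quadrant changed.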
Behavioral Signal Assessment
Measures gap structure, hallucination rates, and knowledge density on contested stimuli. Factorial design: 3×2 (Model × Grounding). Includes Divergence Testing as a Phase 2 sub-component.
Divergence Testing
Rapid ensemble probe
Point-to-point ensemble distance measurement. Semantic similarity scoring and spread matrix across the model ensemble. Measures how far apart models are on a question — not how they got there. Extracted from BSA Phase 2 as a standalone instrument.
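A spread matrix of this kind can be sketched as pairwise distances between response embeddings. The Python example below assumes each model's answer has already been turned into a numeric vector by some embedding step that is not shown, and it uses cosine distance (1 minus cosine similarity) as the distance. Both choices are assumptions; the source does not name a similarity metric.

```python
import math


def cosine_similarity(u: list, v: list) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def spread_matrix(embeddings: dict):
    """Return (model names, matrix of pairwise cosine distances)."""
    names = sorted(embeddings)
    matrix = [
        [1.0 - cosine_similarity(embeddings[a], embeddings[b]) for b in names]
        for a in names
    ]
    return names, matrix
```

Each cell is a single distance between two endpoints. Nothing in the matrix records how either model arrived at its answer.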
Epistemic Pressure Gauge
Tracks how verbosity, structure, and hedging behavior shift under progressively harder or more ambiguous prompts — producing a pressure-response fingerprint that complements BSA's stimulus-pair divergence metrics.
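A pressure-response fingerprint can be sketched as a small feature vector computed per difficulty level. In the Python sketch below, the three features (word count, number of list or heading lines, share of hedge words) and the hedge word list are illustrative assumptions, not the instrument's actual measures.

```python
# Illustrative hedge vocabulary; a real codebook would define its own.
HEDGES = ("maybe", "perhaps", "might", "possibly", "unclear")


def features(response: str) -> dict:
    words = response.lower().split()
    # Structure proxy: lines that begin like a list item or a heading.
    structured = sum(
        1 for line in response.splitlines()
        if line.lstrip().startswith(("-", "*", "#"))
    )
    hedges = sum(1 for word in words if word.strip(".,;:!?") in HEDGES)
    return {
        "words": len(words),
        "structured_lines": structured,
        "hedge_rate": hedges / len(words) if words else 0.0,
    }


def fingerprint(responses_by_level: dict) -> list:
    """One feature row per difficulty level, in ascending level order."""
    return [
        dict(level=level, **features(text))
        for level, text in sorted(responses_by_level.items())
    ]
```

Reading the rows in order shows how each feature shifts as prompts get harder or more ambiguous.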
Global Geometry Concept Self-Assessment Pilot
Concept self-assessment
Probes self-assessment calibration across lossyscape vocabulary terms. Each model rates conceptual difficulty, abstractness, global deviation, and truthfulness. Two absurd calibration items embedded as internal validity checks.
PyHessian Geometric Analysis
Loss landscape geometry
The geometric layer. Measures Hessian eigenvalues, trace, and basin sharpness on live model weights to confirm or falsify the framework's terrain claims. Default specimen: GPT-2 small. Connects directly to ECM working hypotheses.
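The quantities named above can be illustrated on a toy problem. The Python sketch below does not use PyHessian or a real model: it builds the Hessian of a two-parameter quadratic loss by central finite differences, then reads off the trace and the largest eigenvalue by power iteration. The toy loss is an assumption, and so is treating the largest eigenvalue as the sharpness measure. On real networks the full matrix is too large to form, so tools in this area typically work from Hessian-vector products instead.

```python
import math


def loss(w: list) -> float:
    # Toy quadratic standing in for a model's loss surface (assumption).
    # Its exact Hessian is [[4, 1], [1, 1]].
    return 2.0 * w[0] ** 2 + 0.5 * w[1] ** 2 + w[0] * w[1]


def hessian(f, w: list, h: float = 1e-3) -> list:
    """Central finite-difference estimate of the Hessian of f at w."""
    n = len(w)

    def shifted(i, sign_i, j, sign_j):
        p = list(w)
        p[i] += sign_i * h
        p[j] += sign_j * h
        return f(p)

    return [
        [(shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
          - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4 * h * h)
         for j in range(n)]
        for i in range(n)
    ]


def trace(H: list) -> float:
    return sum(H[i][i] for i in range(len(H)))


def matvec(H: list, v: list) -> list:
    return [sum(H[i][j] * v[j] for j in range(len(v))) for i in range(len(H))]


def top_eigenvalue(H: list, steps: int = 200) -> float:
    """Largest-magnitude eigenvalue by power iteration (Rayleigh quotient)."""
    v = [1.0] * len(H)
    for _ in range(steps):
        Hv = matvec(H, v)
        norm = math.sqrt(sum(x * x for x in Hv))
        v = [x / norm for x in Hv]
    return sum(a * b for a, b in zip(v, matvec(H, v)))
```

A larger top eigenvalue means the loss rises faster along its steepest curvature direction, which is the usual sense in which a basin is called sharp.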