Glossary

Terms & Definitions

Working vocabulary for the Atlas Heritage Systems research program — loss landscape terms, BSA protocol vocabulary, context integrity failure modes, and prompt architecture concepts. Living document — terms added as the framework develops.

Lossyscape [Core]

The loss landscape of a large language model — the mathematical terrain the model navigates during training. Some regions are well-mapped and densely covered. Others are sparse, contested, or never fully resolved. The places the model passed through without settling are archaeologically significant.

Loss Landscape [Loss Landscape]

The mathematical surface defined by L(θ) across parameter space — the function mapping every possible set of model weights to a loss value. Training is navigation of this surface toward regions of lower loss.

Slope [Terrain]

Gradient magnitude. First derivative of loss with respect to parameters: ∇L(θ). Steep regions produce fast directed movement; shallow regions produce stall or drift.
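As a rough illustration of slope as gradient magnitude, the sketch below estimates ∇L(θ) by central differences on a toy two-parameter loss and reports its Euclidean norm. The names `gradient`, `slope`, and `bowl`, and the step size `eps`, are illustrative assumptions and not part of the Atlas framework.

```python
import math

def gradient(loss, theta, eps=1e-6):
    """Central-difference estimate of the gradient of loss at theta."""
    grads = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += eps
        down[i] -= eps
        grads.append((loss(up) - loss(down)) / (2 * eps))
    return grads

def slope(loss, theta):
    """Euclidean norm of the estimated gradient."""
    return math.sqrt(sum(g * g for g in gradient(loss, theta)))

def bowl(theta):
    """Hypothetical toy loss: a symmetric quadratic bowl."""
    return theta[0] ** 2 + theta[1] ** 2

steep = slope(bowl, [3.0, 4.0])      # far from the minimum
shallow = slope(bowl, [0.03, 0.04])  # close to the minimum
print(steep, shallow)
```

On this toy surface the slope shrinks as the point approaches the minimum, which is the "shallow regions produce stall or drift" case in miniature.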

Temperature [Terrain]

Two related usages. Training: noise amplitude in SGD — hot allows escape from local minima, cold traps. Output: softmax sharpness — high temperature flattens the probability distribution, low concentrates it.
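The output-side usage can be sketched in a few lines: dividing logits by a temperature before the softmax flattens the distribution when the temperature is high and concentrates it when the temperature is low. The logits below are made-up values for illustration only.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature; temperature must be positive."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # hypothetical next-token scores
cold = softmax_with_temperature(logits, 0.5)  # low temperature: concentrated
hot = softmax_with_temperature(logits, 2.0)   # high temperature: flattened
print(max(cold), max(hot))
```

The top probability under `cold` is larger than under `hot`, while both distributions still sum to one.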

Friction [Terrain]

Signal degradation between gradient computation and weight update. Smooth means clean gradient transmission. Abrasive means conflicting or noisy gradients opposing movement. Measured as gradient variance across batches.
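One way to read "gradient variance across batches" is sketched below for a one-parameter model y = w * x with mean squared error. The model, the toy batches, and the choice of population variance are assumptions made for illustration; the source does not specify an estimator.

```python
from statistics import mean, pvariance

def batch_gradient(w, batch):
    """Gradient of mean squared error for the 1-D model y = w * x."""
    return mean(2 * (w * x - y) * x for x, y in batch)

def gradient_variance(w, batches):
    """Population variance of the per-batch gradients at weight w."""
    return pvariance([batch_gradient(w, b) for b in batches])

# Toy data: the "smooth" batches agree on y = 2x, the "abrasive" ones disagree.
smooth = [[(1.0, 2.0), (2.0, 4.0)], [(1.0, 2.0), (2.0, 4.0)]]
abrasive = [[(1.0, 2.0), (2.0, 4.0)], [(1.0, -2.0), (2.0, -4.0)]]

print(gradient_variance(1.0, smooth), gradient_variance(1.0, abrasive))
```

Agreeing batches give zero variance; batches that pull in opposite directions give a large one.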

Slippery [Terrain]

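Local curvature relative to step size. Slick means overshoot risk (step large relative to the local curvature scale). Sticky means undershoot or entrapment (step small relative to that scale, little effective movement). Expressed by comparing the learning rate η with the Hessian eigenvalues λ_i: on a quadratic, a step overshoots along direction i when ηλ_i > 1 and diverges when ηλ_i > 2.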
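The interplay of step size and curvature can be checked on the textbook one-dimensional quadratic L(x) = 0.5 * λ * x². There, plain gradient descent multiplies x by (1 − ηλ) at each step, so the product ηλ decides the regime. This is the standard quadratic analysis, offered as a sketch; the function names and the regime labels are illustrative and are not the framework's own measure.

```python
def descend(curvature, learning_rate, steps=20, start=1.0):
    """Gradient descent on L(x) = 0.5 * curvature * x**2; returns the path."""
    x, path = start, [start]
    for _ in range(steps):
        x -= learning_rate * curvature * x  # the gradient is curvature * x
        path.append(x)
    return path

def regime(curvature, learning_rate):
    """Label the behavior implied by the product learning_rate * curvature."""
    product = learning_rate * curvature
    if product > 2:
        return "diverges"    # each step lands farther from the minimum
    if product > 1:
        return "overshoots"  # each step crosses the minimum but still shrinks
    return "monotone"        # each step stays on the same side

print(regime(1.0, 0.5), regime(1.0, 1.5), regime(1.0, 2.5))
```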

Tension [Terrain]

Competing gradient forces from different loss terms. Loose means weak opposing forces. Tight means strong competing gradients creating narrow stable corridors of movement.
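A simple proxy for competing gradient forces is the cosine between the gradients of two loss terms: values near −1 mean the terms pull in opposite directions, values near 0 mean they barely interact. Using the cosine as the tension measure is an assumption of this sketch, and the example gradients are made up.

```python
import math

def gradient_cosine(g_a, g_b):
    """Cosine of the angle between two gradient vectors of equal length."""
    dot = sum(a * b for a, b in zip(g_a, g_b))
    norm_a = math.sqrt(sum(a * a for a in g_a))
    norm_b = math.sqrt(sum(b * b for b in g_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine is undefined for a zero gradient")
    return dot / (norm_a * norm_b)

task_grad = [1.0, 0.0]        # hypothetical gradient of a task loss
opposing_grad = [-2.0, 0.0]   # hypothetical regularizer pulling the other way
unrelated_grad = [0.0, 3.0]   # hypothetical term acting on another direction

tight = gradient_cosine(task_grad, opposing_grad)
loose = gradient_cosine(task_grad, unrelated_grad)
print(tight, loose)
```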

Flexion [Terrain]

Landscape response to perturbation. Flexible means deformation is permanent (plastic regime). Stiff means deformation is recoverable (elastic regime). Catastrophic forgetting is total loss of flexion.

Elevation [Terrain]

Raw loss value L(θ). Vertical position on the loss surface. The entire training objective is elevation descent. Identical elevation values can correspond to completely different terrain configurations.

Bowl [Topology]

Symmetric minimum — inward gradients from all directions. Stable convergence. Produces laminar flow. Archaeological signal absent — remagnetization likely complete.

Valley [Topology]

Elongated minimum: two-sided inward gradients, flat floor. Stall-prone along the valley axis. Flat minima are often observed to generalize better than sharp ones.

Saddle Point [Topology]

Downhill in some parameter directions, uphill in others. High-perplexity, turbulent. Primary archaeological territory — competing orientations were never resolved into clean convergence here.
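"Downhill in some directions, uphill in others" corresponds to a Hessian with eigenvalues of mixed sign at a critical point. The sketch below classifies a two-parameter critical point from its symmetric 2x2 Hessian using the closed-form eigenvalues. The function names, the tolerance, and the labels are illustrative assumptions.

```python
import math

def eigenvalues_2x2(h):
    """Eigenvalues of a symmetric 2x2 matrix [[a, b], [b, c]], ascending."""
    a, b, c = h[0][0], h[0][1], h[1][1]
    centre = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return centre - radius, centre + radius

def classify_critical_point(h, tol=1e-9):
    """Label a critical point by the signs of its Hessian eigenvalues."""
    low, high = eigenvalues_2x2(h)
    if low > tol:
        return "minimum"     # curves upward in every direction
    if high < -tol:
        return "maximum"     # curves downward in every direction
    if low < -tol and high > tol:
        return "saddle"      # downhill one way, uphill another
    return "degenerate"      # at least one flat direction

print(classify_critical_point([[2.0, 0.0], [0.0, -1.0]]))
```

A flat direction (an eigenvalue near zero) lands in the "degenerate" bucket, which is where valley floors and plateaus would show up in this toy classification.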

Plateau [Topology]

Near-zero gradient everywhere. No signal. Training dies silently. Highest effective drag. Vanishing gradient problem is plateau behavior.

Ridge [Topology]

High curvature boundary between basins. Unstable traversal. Which side you fall to matters — determines which attractor captures the model.

Basin [Topology]

The catchment region around a minimum. Wide basins tend to generalize better. Narrow basins are sensitive to perturbation. The model's training path determines which basin it occupies.

Density [Navigator]

Training data coverage across input space. Coarse means sparse coverage, weak gradient signal. Fine means dense coverage, steep well-defined valleys.

Perplexity [Navigator]

Average surprise, the exponential of cross-entropy: PP(W) = 2^{H(W)}, with H(W) the per-token cross-entropy in bits (equivalently e^{H(W)} with H(W) in nats). High perplexity marks unmapped or contested terrain. The scar tissue of turbulent training lives in high-perplexity regions of a deployed model.
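A minimal sketch of the perplexity formula, assuming the model's probability for each observed token is already available as a list. The probabilities below are invented for illustration.

```python
import math

def perplexity(token_probs):
    """2 ** H, where H is the mean negative log2 probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    entropy_bits = -sum(math.log2(p) for p in token_probs) / len(token_probs)
    return 2 ** entropy_bits

mapped = perplexity([0.5, 0.5, 0.5, 0.5])         # confident predictions
contested = perplexity([0.25, 0.25, 0.25, 0.25])  # less confident predictions
print(mapped, contested)
```

A model that assigns probability 1/k to every observed token has perplexity k, which is why the value is often read as an effective branching factor.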

Coupling [Navigator]

Inter-parameter dependency: how much moving one weight moves others. High coupling means parameter updates propagate widely. Formally defined as the off-diagonal Hessian entries H_ij = ∂²L/(∂θ_i ∂θ_j), i ≠ j. Causally determines viscosity via eigenvalue calculation.
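The off-diagonal Hessian entry can be estimated numerically with a four-point central difference, sketched here on two toy losses. The loss functions, the evaluation point, and the step size are assumptions chosen so that the exact answers (3 and 0) are known in advance.

```python
def mixed_partial(loss, theta, i, j, eps=1e-4):
    """Central-difference estimate of d2L / (d theta_i d theta_j)."""
    def shifted(di, dj):
        t = list(theta)
        t[i] += di
        t[j] += dj
        return loss(t)
    return (shifted(eps, eps) - shifted(eps, -eps)
            - shifted(-eps, eps) + shifted(-eps, -eps)) / (4 * eps * eps)

def coupled_loss(t):
    """Hypothetical loss with a cross term: H_01 is exactly 3."""
    return t[0] ** 2 + t[1] ** 2 + 3 * t[0] * t[1]

def separable_loss(t):
    """Hypothetical loss with no cross term: H_01 is exactly 0."""
    return t[0] ** 2 + t[1] ** 2

point = [0.5, -0.2]
print(mixed_partial(coupled_loss, point, 0, 1),
      mixed_partial(separable_loss, point, 0, 1))
```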

Viscosity [Navigator]

Resistance to movement under gradient pressure. Icky means high resistance — flat wide minima, competing orientations persist longer. Determined by coupling via the Hessian eigenvalue spectrum. Not an independent variable — derived from coupling.

Elasticity [Navigator]

Restoring force toward prior weight configurations after perturbation. Catastrophic forgetting is total loss of elasticity.

Memory [Navigator]

Path dependency encoded in weights — the history of how the model traveled through the loss landscape during training. Cannot be measured independently from viscosity in a frozen deployed model. Central open experiment: does viscosity at checkpoint T fully predict behavior at T+n independent of path?

Laminar Flow [Flow]

Movement through low-resistance regions. Clean, directed, fast convergence toward attractors. Where remagnetization completes without resistance. Archaeological signal absent or already overwritten.

Turbulent Flow [Flow]

Movement through high-resistance regions. Slow, contested. The model does not resolve cleanly. Turbulence is only observable during movement — what you read in a frozen model is the scar tissue turbulence left behind.

Resistance [Flow]

The composite opposing force at any point in the loss landscape. Derived from slope, friction, viscosity, tension, and coupling. Not primary — potential difference is the primary generative quantity.

Potential Difference [Potential]

The gap between the navigator's current state and the terrain's local geometry that creates the condition for movement. The primary generative quantity in the framework. Most directly expressed as the gradient ∇L(θ).

Tension (Potential layer) [Potential]

The structural condition that holds potential difference stable without collapsing it. Slack tension means the navigator has decoupled from the terrain — the precondition for undetected centerward drift.

Harmonics [Potential]

The dynamic behavior emerging when potential difference and tension interact over time. Requires a restoring force. Partially resolved via active inference: Friston's free energy minimization produces intrinsic oscillatory dynamics around posterior modes via the bidirectional prediction-error loop.

Structural Integrity [Structural]

Whether the model's internal representational geometry holds its shape under sustained operational load — context pressure, token accumulation, competing objective tension. Invisible to static landscape analysis. Observable only at inference time.

Manifold Displacement [Structural]

Input arriving outside the training data manifold — so far outside that the model's representational geometry has no stable orientation for it. The model snaps to the nearest high-probability trained attractor with full confidence, pointing the wrong direction.

Ablation [Structural]

Removal of parameters, heads, or layers. Does not simply reduce the model — changes the topology of the space the model navigates. Every qualifier shifts simultaneously.

Qualifier Collapse Hierarchy [Framework Architecture]

Skywork finding: the seven navigator qualifiers collapse to three independent variables (density, coupling, elasticity) plus four derived readouts (perplexity, probability, viscosity, memory). Density → Perplexity → Probability algebraically. Coupling → Viscosity causally via eigenvalue spectrum.

Archaeological Signal [Core]

Evidence preserved in the weight structure of a frozen model about what the training landscape looked like — specifically where prediction error was never resolved before weight updates discharged it. High-perplexity, high-viscosity saddle and valley behavior readable as stratigraphic evidence of where the landscape's charge was never released.

Archaeological Sink [Core]

A region of the loss landscape where the model passed through turbulent training territory and never resolved to a clean minimum. Distinct from correctable drift — sinks are permanent features of the landscape, not artifacts of insufficient training.

Remagnetization [Core]

The process by which RLHF alignment systematically overrides edge-registered orientations in the model's weight structure and pulls outputs toward statistical center. Predicts: lower perplexity variance, higher late-layer coupling, largest perplexity reduction in archaeological domains.

Centerward Drift [Core]

The tendency of language models to produce outputs that converge toward the statistical center of their training distribution — away from the idiosyncratic, marginal, and culturally specific. Observable as remagnetization in the loss landscape.

Frozen Endpoint [Core]

A deployed model whose weights are fixed and cannot be updated. Via Song et al. (2024): a frozen model is a snapshot of a generative model's prediction state at the moment of capture, a record not of inputs but of what the model was predicting when frozen. The archaeological signal is readable in this frozen state.

Behavioral Signal Assessment (BSA) [BSA Protocol]

A pilot protocol testing whether small-ensemble, cross-lineage LLM evaluation can detect drift, delusion, and epistemic compression. 7 models, 30 stimulus pairs, one human operator. The behavioral measurement instrument of the Atlas research program.

Technician's Read [BSA Protocol]

The human operator's pre-analytical perception of the raw data — recorded before any analysis model touches it. Anchors the operator against the fluent confident output analysis models will produce. The timestamp is part of the data.

Tier 1: Ground Truth [BSA Protocol]

Well-established facts used as calibration anchors. Every model should score these high. If a model averages below 0.50 on Tier 1, its other scores are suspect.
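The Tier 1 calibration rule can be sketched as a filter over per-model score lists. The 0.50 floor comes from the definition above; the model names and scores are placeholders, not BSA results.

```python
from statistics import mean

TIER1_FLOOR = 0.50  # threshold stated in the Tier 1 definition

def suspect_models(tier1_scores):
    """Return, sorted, the models whose Tier 1 average falls below the floor."""
    return sorted(
        model for model, scores in tier1_scores.items()
        if mean(scores) < TIER1_FLOOR
    )

# Placeholder data for illustration only.
tier1_scores = {
    "model_a": [0.90, 0.95, 0.92],
    "model_b": [0.40, 0.55, 0.45],
}
print(suspect_models(tier1_scores))
```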

Tier 2: Contested [BSA Protocol]

Claims where genuine epistemic disagreement exists among credentialed people in the relevant field. Medical controversies, legal frontiers, scientific interpretation disputes. The staircase pattern — Tier 2 spread larger than Tier 1 spread — is the primary signal.

Tier 3: Foils [BSA Protocol]

Fabricated claims with real-sounding specifics: invented pathway names, fake case law, nonexistent journal articles. If ensemble mean on Tier 3 exceeds Tier 2, models are more confident on fabrications than on genuinely contested claims.

Divergence Gap [BSA Protocol]

Each model's Tier 3 mean minus its Tier 2 mean (T3 mean − T2 mean). A positive divergence gap means the model is more confident on fabrications than on contested legitimate claims. The delusion baseline.
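Computed per model, the divergence gap is a one-line difference of means. The scores below are placeholders chosen to show one positive and one negative gap.

```python
from statistics import mean

def divergence_gap(t2_scores, t3_scores):
    """Tier 3 mean minus Tier 2 mean for a single model."""
    return mean(t3_scores) - mean(t2_scores)

# Placeholder scores for two hypothetical models.
gap_confident_on_foils = divergence_gap([0.5, 0.7], [0.8, 0.8])
gap_sceptical_of_foils = divergence_gap([0.5, 0.7], [0.2, 0.4])
print(gap_confident_on_foils, gap_sceptical_of_foils)
```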

Staircase Pattern [BSA Protocol]

The expected signal: Tier 1 spread < Tier 2 spread. Models should show more uncertainty on genuinely contested claims than on ground truth. If the staircase does not appear, the protocol has not produced an interpretable signal.
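The staircase check can be sketched as a comparison of two spreads. The glossary does not say how spread is measured; the population standard deviation of per-model tier means is an assumption here, and the numbers are placeholders.

```python
from statistics import pstdev

def staircase_present(tier1_means, tier2_means):
    """True when the Tier 1 spread is smaller than the Tier 2 spread."""
    return pstdev(tier1_means) < pstdev(tier2_means)

# Placeholder per-model means for a four-model ensemble.
tier1_means = [0.95, 0.93, 0.96, 0.94]
tier2_means = [0.40, 0.75, 0.55, 0.90]
print(staircase_present(tier1_means, tier2_means))
```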

Lineage Diversity [BSA Protocol]

The requirement that the BSA ensemble spans multiple independent training lineages — not just multiple models. Agreement across models from similar training distributions is correlated evidence, not independent confirmation.

Bridge Experiment [BSA Protocol]

The experiment connecting the BSA and the loss landscape framework. Tests whether BSA Tier 2 ensemble divergence correlates with high perplexity in the Pythia checkpoint series. If yes: the framework provides a mechanistic explanation for the BSA signal. If no: the result is a finding about the limits of either instrument.
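The correlation test at the heart of the bridge experiment could be sketched with a plain Pearson coefficient over paired per-stimulus values. The choice of Pearson (rather than a rank correlation) is an assumption, and the paired numbers are invented placeholders, not measurements.

```python
import math
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)

# Placeholder pairs: (Tier 2 ensemble divergence, checkpoint perplexity).
divergence = [0.05, 0.20, 0.35, 0.50]
perplexities = [12.0, 18.0, 25.0, 31.0]
print(pearson(divergence, perplexities))
```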

Context Saturation Drift [Context Integrity]

As a conversation lengthens, the model's context window fills with accumulated history that has a direction — a thesis, working assumptions, a trajectory of agreement. The model gradually loses epistemic independence and becomes a momentum amplifier, generating responses internally consistent with the conversation's trajectory rather than genuinely responsive to the current question.

Context Compression Bias [Context Integrity]

When context volume approaches the model's effective processing capacity, the model silently degrades — retaining high fidelity at the beginning and end of context while losing resolution in the middle. Preserves the thesis, compresses away caveats. The losses are invisible unless the operator independently verifies model recall against original documents.

Frequency-Weighted Distortion [Context Integrity]

When multiple overlapping documents are loaded into a single context window, the model weights concepts by token frequency across the entire context rather than editorial importance. Claims appearing in three overlapping drafts are treated as three times as salient as claims appearing in one. Revisions are undermined by the statistical weight of the material they were intended to replace.
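The arithmetic behind this distortion is easy to reproduce by counting how often a superseded claim and its correction appear across the loaded documents. This counts literal string matches only, as a crude stand-in for token frequency; the drafts and claims are invented for illustration.

```python
def mention_counts(documents, claims):
    """Count occurrences of each claim string across all loaded documents."""
    return {claim: sum(doc.count(claim) for doc in documents) for claim in claims}

old_claim = "latency is 40 ms"
new_claim = "latency is 25 ms"
drafts = [
    "Draft 1: measured latency is 40 ms.",
    "Draft 2: measured latency is 40 ms under load.",
    "Draft 3: as before, latency is 40 ms.",
    "Revision: corrected latency is 25 ms.",
]

everything = mention_counts(drafts, [old_claim, new_claim])
latest_only = mention_counts(drafts[-1:], [old_claim, new_claim])
print(everything, latest_only)
```

Loading all four documents leaves the superseded figure outnumbering its correction three to one; loading only the revision removes the imbalance.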

Context-Isolated Cross-Validation [Context Integrity]

The practice of giving review models only the data under review — not the full project narrative, prior conversation history, or the operator's interpretation. The reviewer gets the evidence, not the argument.

Register Fidelity [Prompt Architecture]

Staying in the vocabulary of the task throughout the prompt — including the closing — rather than shifting to social or assistant-interaction register at any point. Register shifts are probability shifts. Closing in task vocabulary keeps the context window weighted toward the domain on any follow-up.

Helpful-Elaboration Gradient [Prompt Architecture]

The model's default high-probability attractor toward summary, validation, extension-suggestion, and praise. The failure mode adversarial prompting is designed to block. Activated by social preamble, polite framing, and open-ended invitations.

Exit Blocking [Prompt Architecture]

Explicit constraints in a prompt preventing the model from following the helpful-elaboration gradient. 'Do not summarize. Do not evaluate quality. Do not suggest extensions.' Without exit blocking the model will follow the highest-probability gradient available — almost always toward summary and validation.
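A minimal sketch of exit blocking as a prompt-assembly step, using the three constraints quoted above. The helper name `build_prompt` and the example task are hypothetical; the layout (task first, constraints last) is one possible arrangement rather than a prescribed one.

```python
EXIT_BLOCKS = (
    "Do not summarize.",
    "Do not evaluate quality.",
    "Do not suggest extensions.",
)

def build_prompt(task, exit_blocks=EXIT_BLOCKS):
    """Append the exit-blocking constraints to a task instruction."""
    return task.strip() + "\n\n" + "\n".join(exit_blocks)

# Hypothetical task, phrased in task vocabulary with no social preamble.
prompt = build_prompt("List every claim in the attached draft that lacks a cited source.")
print(prompt)
```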

Compass Needle Response [Prompt Architecture]

Failure mode: the model reaches for the nearest high-probability answer in the domain rather than engaging with the specific question. The response is right but generic — the model found the nearest pole and pointed at it.

Fever Dream [Prompt Architecture]

Failure mode: the model finds the edge of its landscape, does not stall, and generates increasingly incoherent but fluent output following low-probability gradients into unmapped territory. Most dangerous failure mode — reads as engagement until you read it carefully.

The Stall [Prompt Architecture]

Failure mode: the model gives up, produces incoherent output, or loops back to restating the prompt. The most honest failure — the model found the edge of its landscape and stopped rather than confabulating. The stall location is informative.

Model Collapse [Core]

A degenerative process in which indiscriminate training on model-generated content causes irreversible defects. Tail distributions — the culturally specific, the underrepresented, the anomalous — disappear first while high-probability outputs persist. Formally characterized by Shumailov et al. (2023).

Ratchet Effect [Core]

Michael Tomasello's term for the mechanism that prevents newly acquired cultural knowledge from slipping back, ensuring modifications accumulate over time. Without a ratchet, cultural traditions persist but do not evolve. With one, culture becomes a directional, progressively complex inheritance system.

Asymmetric Arbiter [Core]

The human operator who holds a position no participating model can occupy: outside the system under test. The structural advantage is independence from the training distributions being interrogated. Does not need domain expertise — needs disciplined observation and sole authorship over synthesis.

No-Action Constraint [Core]

Atlas cannot act, predict, advocate, or write to its own history. A system that measures cultural preservation cannot simultaneously be an actor in cultural production. The no-action constraint keeps the instrument separate from the phenomenon it measures.

Ensemble Divergence [Core]

Variance in output distribution across models on a marginal prompt relative to a mainstream prompt. High divergence on marginal prompts combined with low divergence on mainstream prompts indicates that the models are drawing on different underlying representations. The primary signal of the ensemble divergence experiment.
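As a simplified sketch, suppose each model's output on a prompt has been reduced to a single score. Ensemble divergence can then be approximated by comparing the across-model variance on the marginal prompt with that on the mainstream prompt. Reducing outputs to scalars and taking a difference (rather than a ratio) are assumptions of this sketch, and the scores are placeholders.

```python
from statistics import pvariance

def divergence_contrast(marginal_scores, mainstream_scores):
    """Across-model variance on the marginal prompt minus the mainstream one."""
    return pvariance(marginal_scores) - pvariance(mainstream_scores)

# Placeholder per-model scores for a four-model ensemble.
marginal = [0.20, 0.90, 0.50, 0.70]
mainstream = [0.80, 0.82, 0.79, 0.81]
contrast = divergence_contrast(marginal, mainstream)
print(contrast)
```

A clearly positive contrast is the pattern the definition describes; a value near zero means the ensemble disagrees about equally on both prompts.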

Gold Set [Core]

A dual-function evaluation set in the Atlas architecture. Serves as both a calibration anchor (known ground truth) and a drift detection mechanism (comparing current model behavior against baseline). Changes in Gold Set performance signal drift.
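The drift-detection half of the Gold Set can be sketched as a comparison of current scores against a stored baseline. The item ids, scores, and the 0.10 tolerance are hypothetical; the source does not specify a threshold.

```python
def drifted_items(baseline, current, tolerance=0.10):
    """Gold Set item ids whose score moved more than tolerance from baseline."""
    return sorted(
        item for item, score in baseline.items()
        if abs(current[item] - score) > tolerance
    )

# Placeholder scores keyed by hypothetical Gold Set item ids.
baseline = {"gs-001": 0.95, "gs-002": 0.90, "gs-003": 0.88}
current = {"gs-001": 0.94, "gs-002": 0.62, "gs-003": 0.85}
flagged = drifted_items(baseline, current)
print(flagged)
```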

Drift Classification Layer [Core]

An Atlas architectural component distinguishing correctable drift (recoverable through fine-tuning or retrieval augmentation) from archaeological sinks (permanent features of the training landscape not recoverable without retraining).

Minimum Viable Density [Core]

The insight that Phase 2 acquisition density is itself the security model — that the density of culturally specific material in the training corpus determines the depth of archaeological signal available for measurement. Identified as the most novel and testable contribution of the Atlas concept paper.

Thermodynamic Tethering [Core]

A proposed mechanism for preventing model drift by maintaining a thermodynamic connection between deployed model behavior and a reference corpus. The frozen endpoint serves as the tether point.

Pharmakon [Core]

Derrida's term via Plato — a substance that is simultaneously remedy and poison. Applied to Atlas: the same corpus that enables language model capability also encodes the biases and gaps the framework is designed to measure. The training data is both the instrument and the object of study.