Loss Landscape Vocabulary Framework

v12 · April 2026 · Atlas Heritage Systems Inc. · Working document — not a finished product

Terrain Properties

Properties of the loss surface itself: the fixed mathematical landscape that training navigates, formally defined by L(θ) and its derivatives across parameter space. The landscape exists as a mathematical object but is readable only through dynamics, that is, by probing it with movement.

Slope: steep / shallow

Gradient magnitude. First derivative of loss with respect to parameters. Steep regions produce fast directed movement; shallow regions produce stall or drift.

∇L(θ)

Cauchy (1847) gradient descent; Rumelhart et al. (1986) backpropagation
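
As a minimal sketch of reading slope by probing, the snippet below estimates ‖∇L(θ)‖ with central finite differences. The quadratic `loss`, the probe points, and the helper names are hypothetical and chosen only for illustration; they are not part of the framework.

```python
import math

def loss(theta):
    """Hypothetical toy loss: an elongated quadratic bowl."""
    x, y = theta
    return 0.5 * (10.0 * x * x + 0.1 * y * y)

def numerical_gradient(f, theta, eps=1e-6):
    """Central finite-difference estimate of the gradient of f at theta."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return grad

def slope(f, theta):
    """Gradient magnitude ||grad L(theta)||, the 'slope' reading at theta."""
    return math.sqrt(sum(g * g for g in numerical_gradient(f, theta)))

steep = slope(loss, [1.0, 0.0])    # probe on the high-curvature axis
shallow = slope(loss, [0.0, 1.0])  # probe on the low-curvature axis
print(steep, shallow)
```

The gradient is never read off directly here; it is inferred from two nearby loss evaluations per coordinate, which matches the idea that terrain is accessible only through probing.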

Temperature: hot / cold

Two related usages. Training: noise amplitude in SGD — hot allows escape from local minima, cold traps. Output: softmax sharpness — high temperature flattens probability distribution, low concentrates it.

SGD noise: σ schedule
Softmax: P(x_i) = exp(z_i/T) / Σ_j exp(z_j/T)

Hinton et al. (2015) knowledge distillation temperature
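
The output-side usage can be sketched as a temperature-scaled softmax. The logits and the two temperature values below are assumed example inputs, and the max-subtraction is a standard numerical-stability step rather than anything specific to this framework.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / T); T must be positive."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical example logits
hot = softmax_with_temperature(logits, 5.0)   # flatter distribution
cold = softmax_with_temperature(logits, 0.2)  # concentrated distribution
print(hot, cold)
```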

Friction: smooth / abrasive

Signal degradation between gradient computation and weight update. Smooth means clean gradient transmission. Abrasive means conflicting or noisy gradients opposing movement.

Var[∇L(θ)] — gradient variance across batches
Flagged: metaphor (degradation) and math (variance) are adjacent but not identical

Kingma & Ba (2014) Adam optimizer
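
One way to make the variance reading concrete is to compute one gradient per mini-batch and take the variance of those values. This is a sketch under strong assumptions: a hypothetical one-parameter model y_hat = w * x, synthetic data, and a scalar gradient, so the variance is a single number rather than a covariance matrix.

```python
import random
import statistics

def batch_gradient(w, batch):
    """Gradient of mean squared error for the model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def gradient_variance(w, data, batch_size):
    """Variance of the per-batch gradient across consecutive mini-batches."""
    batches = [data[i:i + batch_size] for i in range(0, len(data), batch_size)]
    grads = [batch_gradient(w, b) for b in batches]
    return statistics.pvariance(grads)

def make_data(noise_std, n=512, seed=0):
    """Synthetic data y = 3x + Gaussian noise (hypothetical example)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 3.0 * x + rng.gauss(0.0, noise_std)))
    return data

smooth = gradient_variance(3.0, make_data(noise_std=0.01), batch_size=32)
abrasive = gradient_variance(3.0, make_data(noise_std=1.0), batch_size=32)
print(smooth, abrasive)
```

Noisier labels give batches that disagree more about the gradient, which is the abrasive end of the scale in the variance reading.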

Slippery: slick / sticky

Local curvature relative to step size. Slick means overshoot risk (low curvature, large steps). Sticky means undershoot or entrapment (high curvature, small effective movement).

η / λ (ratio of learning rate η to Hessian eigenvalue λ)
Flagged: high λ defines sticky here and also defines sharp regions under viscosity, so the two terms partially overlap

Keskar et al. (2016) sharp minima and generalization
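
The ratio can be computed per curvature direction once the Hessian eigenvalues are known. The sketch below assumes a hypothetical 2x2 symmetric Hessian with strictly positive eigenvalues and an assumed learning rate; larger ratios sit toward the slick end and smaller ratios toward the sticky end, as those terms are used above.

```python
import math

def symmetric_2x2_eigenvalues(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], smallest first."""
    mean = 0.5 * (a + c)
    radius = math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    return mean - radius, mean + radius

def step_to_curvature_ratios(learning_rate, hessian):
    """Return eta / lambda for each eigenvalue (assumes every lambda > 0)."""
    (a, b), (_, c) = hessian
    return [learning_rate / lam for lam in symmetric_2x2_eigenvalues(a, b, c)]

hessian = [[10.0, 0.0], [0.0, 0.1]]  # hypothetical curvature: one sharp axis, one flat axis
ratios = step_to_curvature_ratios(0.05, hessian)
print(ratios)  # first entry: flat direction, second entry: sharp direction
```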

Tension: loose / tight

Competing gradient forces from different loss terms. Loose means weak opposing forces. Tight means strong competing gradients creating narrow stable corridors of movement.

‖∇L_task − ∇L_regularization‖
Flagged: involves dynamic training elements, blurs terrain/navigator distinction

Sener & Koltun (2018) multi-task learning
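
The norm above can be evaluated directly for a toy case. The sketch assumes a hypothetical squared-error task loss on a single example and an L2 penalty α‖θ‖² as the regularization term; both choices, and all input values, are illustrative.

```python
import math

def task_gradient(theta, x, y):
    """Gradient of the squared error (theta . x - y)^2 with respect to theta."""
    residual = sum(t * xi for t, xi in zip(theta, x)) - y
    return [2.0 * residual * xi for xi in x]

def regularization_gradient(theta, alpha):
    """Gradient of the L2 penalty alpha * ||theta||^2."""
    return [2.0 * alpha * t for t in theta]

def tension(theta, x, y, alpha):
    """Norm of the difference between task and regularization gradients."""
    g_task = task_gradient(theta, x, y)
    g_reg = regularization_gradient(theta, alpha)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g_task, g_reg)))

theta = [1.0, -2.0]
loose = tension(theta, x=[0.5, 0.5], y=-0.4, alpha=0.01)  # weak opposing forces
tight = tension(theta, x=[0.5, 0.5], y=3.0, alpha=1.0)    # strong competing gradients
print(loose, tight)
```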

Flexion: flexible / stiff

Landscape response to perturbation. Flexible means deformation is permanent (plastic regime). Stiff means deformation is recoverable (elastic regime).

elastic: λ‖θ‖² restoring term
EWC: λ Σ_i F_i (θ_i − θ_i*)²

Kirkpatrick et al. (2017) elastic weight consolidation
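
Both penalty terms can be sketched in a few lines, following the formulas as written above. The anchor parameters `theta_star`, the diagonal Fisher values, and λ are hypothetical inputs; in practice the Fisher terms would be estimated from data, which is not shown here.

```python
def l2_restoring_term(theta, lam):
    """Elastic restoring term lam * ||theta||^2."""
    return lam * sum(t * t for t in theta)

def ewc_penalty(theta, theta_star, fisher, lam):
    """EWC term lam * sum_i F_i * (theta_i - theta_star_i)^2."""
    return lam * sum(
        f * (t - ts) ** 2 for f, t, ts in zip(fisher, theta, theta_star)
    )

theta_star = [1.0, -1.0]  # hypothetical parameters after an earlier task
fisher = [5.0, 0.01]      # hypothetical importance: first weight matters far more

# Move each weight by the same amount and compare the penalty.
move_important = ewc_penalty([1.5, -1.0], theta_star, fisher, lam=1.0)
move_unimportant = ewc_penalty([1.0, -0.5], theta_star, fisher, lam=1.0)
print(move_important, move_unimportant)
```

The same displacement costs far more along the high-Fisher weight, so the restoring pull is strongest exactly where the earlier task depends on the parameter.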

Elevation: high / low

Raw loss value. Vertical position on the loss surface. The entire training objective is elevation descent. Identical elevation values can correspond to completely different terrain configurations.

L(θ)

Choromanska et al. (2015) loss surface topology
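
The point that equal elevation does not imply equal terrain can be illustrated with a hypothetical one-dimensional double-well loss L(θ) = (θ² − 1)². The function and the two probe points are assumptions chosen so that the loss values match while slope and curvature do not.

```python
import math

def loss(theta):
    """Hypothetical 1D double-well loss."""
    return (theta * theta - 1.0) ** 2

def gradient(theta):
    """Analytic first derivative of the double-well loss."""
    return 4.0 * theta * (theta * theta - 1.0)

def curvature(theta):
    """Analytic second derivative of the double-well loss."""
    return 12.0 * theta * theta - 4.0

flat_top = 0.0               # local maximum between the two wells
steep_wall = math.sqrt(2.0)  # outer wall of the right-hand well

for theta in (flat_top, steep_wall):
    print(theta, loss(theta), gradient(theta), curvature(theta))
```

Both points sit at a loss of approximately 1, yet one is a flat hilltop with negative curvature and the other is a steep, positively curved wall.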