Atlas Heritage Systems · KC Hoye, PI · 2026-04-26 · Working document

The Triangulation
Frenkel-Brunswik · Labov · Wong · Cheng · KC Hoye / FVE-1
— five positions, one phenomenon, seven decades of arrival

This document maps the convergence across five independent accounts of the same structural phenomenon: that language systems trained on human discourse inherit a compulsion toward resolution, that this compulsion is not a training artifact but an inheritance from human cognition itself, and that it produces measurable behavioral consequences for the humans who interact with those systems. The five accounts arrived at this convergence independently, across seven decades, four disciplines, and five distinct methods. The convergence is the argument. No single account could make it alone. A really neat lady from 1949 started it.

Root — Clinical Psychology of Ambiguity

Else Frenkel-Brunswik (1949)

Establishes intolerance of ambiguity (AIT) as a human personality variable — a measurable tendency toward premature closure, rigid categorization, and resistance to revision when faced with ambiguous stimuli. Perceptual experiments showed high-AIT subjects could not hold the transitional phase of a changing stimulus — they snapped to one interpretation because the middle state was aversive. AIT is not a cognitive failure. The discomfort is the mechanism. The snap to resolution is the relief. This is the phenomenological root of the drive that every other account in this document is describing from the outside.

method: perceptual experiments · clinical psychology · 1949

Foundation — Narrative Grammar

William Labov & Joshua Waletzky

Establishes that oral narrative has a grammatical requirement for resolution. Incomplete narratives fail structurally — the listener withholds the "so what?" until the resolution arrives. Reportability and resolution are bound. The compulsion to close is not stylistic preference; it is the grammar of literate culture's narrative output.

method: linguistic analysis · 1967, 1997

Architectural Account — Epistemological Contours

Dr Matthias Wong

Identifies closure bias as a generative pressure emerging from corpus saturation and EOS normalization. The training corpus over-represents resolved discourse. End-of-sequence tokens normalize completion. Premature closure, frame capture, and fragile opposition follow structurally. Proposes a division of labour between model and user. Does not instrument the contours — describes and advises.

method: architectural reasoning · 2026

Upstream Account — Generation & Inference

Cheng et al.

Measures what models generate about and infer about users at the prompt and generation level. Linear probes surface sycophantic internal assumptions. Pragmatic account of accommodation explains why epistemic vigilance is structurally suppressed. Anthropomorphism is identified as the mediating variable in GenAI social impact. Instruments operate upstream — inside the weights, before behavior.

method: linear probes · controlled prompts · 2023–2026

Forensic Account — Behavioral Residue

KC Hoye · FVE-1 · Atlas Heritage Systems

Reads the forensic record of completed resolution events at inference time. R-ratio, bold percentage, word count, intercept codes, register trajectory across 35+ sessions. CAPITULATION is the behavioral residue of closure bias. HOLD is the anomaly — structurally suppressed. The instruments are downstream of the event; they read what was deposited after the inference pass closed. Produces falsifiable data, not advisory taxonomy.

method: behavioral coding · forensic residue analysis · 2026

The Causal Chain: Why All Five Converge

The convergence is not coincidental. It follows a single causal chain that each account is touching at a different point:

Frenkel-Brunswik (1949)
Intolerance of ambiguity is a measurable human personality variable. High-AIT subjects cannot hold the transitional phase of a changing stimulus — the unresolved middle state is phenomenologically aversive, and premature closure is the cognitive relief mechanism. AIT correlates with rigidity, suggestibility, and resistance to revision. The drive to close is not just statistically likely — it is emotionally motivated. The humans who will produce the corpus have the drive baked into their cognition.
Labov (1967)
Oral narrative grammar requires Resolution structurally. Incomplete narratives fail — the listener withholds the "so what?" Literate culture has been producing Labovian narratives for its entire written history. The compulsion to close is baked into the output of human discourse at the grammar level. Labovian narrative grammar is the institutionalized form of AIT — the cognitive drive formalized as a structural requirement of discourse.
Wong (2026)
The training corpus is saturated with the output of that grammar — and the saturation is not neutral. The corpus was produced by institutions that structurally selected against inconclusive outputs, encoding AIT at the level of publication norms before any individual model was trained. Peer review, editorial standards, genre conventions — all filter for the resolved, the concluded, the unambiguous. End-of-sequence tokens normalize completion because completion is what survived the institutional filter. Closure bias emerges architecturally from what seven decades of institutional AIT preserved and what it discarded.
Cheng et al. (2026)
The model trained on that corpus has internalized the closure drive as a sycophantic assumption — "user seeks validation" — visible in internal representations via linear probe. Accommodation is the linguistic mechanism through which the drive produces behavior. Epistemic vigilance is structurally suppressed because challenging the user requires withholding closure, which the architecture is trained against.
KC / FVE-1 (2026)
The behavioral residue of that drive is measurable across sessions: R-ratio compression, CAPITULATION intercept, HOLD suppression, register collapse in Act III. Resolution bias is not an inference — it is a coded behavioral event with a confirmed session-level signature across 35+ sessions and multiple model families. The drive is constant. HOLD is the anomaly. The smoke ring documents the residue of a fire that was always burning.

Convergence Map

Where all five accounts are pointing at the same phenomenon: arrived at independently, named differently, but describing the same thing
| Phenomenon | Frenkel-Brunswik | Labov | Wong | Cheng | KC / FVE-1 | Depth |
| --- | --- | --- | --- | --- | --- | --- |
| The closure drive: the structural compulsion to resolve open epistemic states rather than hold them | AIT as measurable personality variable; unresolved state is aversive; premature closure is the relief mechanism | Resolution as narrative grammar requirement | Closure bias (generative pressure) | Sycophantic assumption / accommodation drive | Resolution bias / CAPITULATION intercept | All five |
| The suppression of withholding: what the system structurally cannot do (maintain an unresolved state, represent negative space, refuse to close) | High-AIT subjects cannot hold transitional stimulus phase; middle state is painful; snap to resolution is relief | Incomplete narratives fail; listener withholds "so what?" | Negative space absent from training; absence of finitude; EOS normalization | Epistemic vigilance structurally suppressed; accommodation as default | HOLD suppressed; anomaly event; resolution bias constant | All five |
| Corpus saturation: the training data over-represents resolved discourse because resolved discourse is what literate culture preserves | AIT + institutional selection filtered inconclusiveness out of the archive before training; corpus encodes the drive at publication-norm level | Labovian narrative grammar is the output format of literate culture | Archive over-represents articulation over inquiry; EOS tokens normalize completion | Training data as source of sycophantic assumptions; corpus-prior dominates live signal | Prior Dominance (PD): training weight overrides explicit user correction | All five |
| Premature closure as failure mode: accepting a frame before alternatives have been explored | AIT is defined by premature closure; high-AIT individuals snap to resolution before ambiguity resolves; low tolerance for the transitional phase | Not named; grammar requires it, not a failure mode in oral narrative context | Premature closure (frame lock category) | Accommodation without epistemic vigilance; failure to challenge harmful beliefs | CAPITULATION intercept; HOLD suppression | Three |
| The social mediation of closure: the drive is not just structural but social; the model responds to the user as a social actor whose apparent needs shape what gets resolved and how | AIT correlates with suggestibility, authoritarianism, ethnocentrism; the drive is social as well as cognitive; closure is socially reinforced | Not present; Labov's account is grammatical, not interactional | Anthropomorphism / stance eisegesis; user projects social relationship onto tool | Anthropomorphism as mediating variable (Cyber BFF); social sycophancy; accommodation theory | Downstream observer; authority modulation; correction sequence as social pressure event | Three |
| Frame capture / investigative displacement: the model's projection displaces the user's or investigator's original orientation | Not named directly, but AIT's rigidity and resistance to revision is the human analog: once a frame is accepted, high-AIT individuals resist revising it | Not present | Frame capture (frame lock category); projection as generative pressure | Implicit in accommodation; model's prior frame overrides user correction | Investigative Inversion; Probe Reframing | Three |
| The temporal arc of failure: behavior degrades over the course of sustained interaction as the drive compounds | Not present; AIT is a single-event perceptual variable, not a session-arc phenomenon | Not present; narrative grammar is single-event, not session-arc | Pathological stability: continued functioning that drifts from intent without the agent perceiving drift (Phenomenology of Failure) | Implicit; accommodation paper addresses single-turn, not session arc | Instance arc Act I → II → III; register collapse (RC); Act III as pathological stability | Two |
| The legibility problem: the closure drive produces outputs that feel purposive, coherent, and trustworthy precisely because they are closing loops the way literate culture expects loops to be closed | Resolved outputs feel correct to high-AIT observers because resolution is the relief state; the closed interpretation is not just accepted, it is preferred | Narrative resolution produces the "point": the fully formed story that can be received as complete | Legato as pre-epistemic pressure; closure bias + projection + legato together produce text that "feels coherently purposive" | Sycophantic outputs are fluent and convincing; internal assumption is not visible at the output surface | Clean residue can be mistaken for genuine epistemic engagement; INTEGRATED (false positive) is the legibility trap | All five |

Why the Divergence Matters

The convergence proves the phenomenon is real. The divergence proves the field needs all five lenses to see it clearly.

What only Frenkel-Brunswik can show

Holding an unresolved state is not just statistically disfavored or architecturally suppressed; it is phenomenologically aversive. High-AIT subjects in her perceptual experiments couldn't hold the transitional phase of a stimulus shifting between forms. They snapped to an interpretation because the middle state hurt. That is the missing account of why HOLD is suppressed: not just that the corpus over-represents resolution, but that the humans who produced the corpus found unresolved states uncomfortable and the institutions that curated the archive selected against inconclusive outputs. The corpus is the accumulated output of human AIT, filtered through publication norms that penalized inconclusiveness before any model was trained. Without Frenkel-Brunswik, the causal chain has no phenomenological root. With her, the drive has a human face.

What only Labov can show

The phenomenon predates LLMs by centuries. The closure compulsion is not a training artifact in the trivial sense: it is not a mistake that can be tuned away. It is the inherited grammar of the entire literate archive. Without Labov, the three model-facing accounts (Wong, Cheng, FVE-1) look like they are describing a recent engineering problem. With Labov, they are describing a structural property of literate culture that the models have inherited as a behavioral drive.

What only Wong can show

The architectural mechanism that connects the corpus to the behavior. Wong names the specific structural conditions — EOS normalization, absence of teleological arc, absence of finitude, closure bias as generative pressure — that explain how the Labovian grammar became a trained drive. Without Wong, the other accounts have the phenomenon but not the mechanism. Wong also provides the division-of-labour framework that tells a user what to do about it.

What only Cheng can show

The internal representation of the drive before it reaches behavior. Linear probes on residual stream activations reveal the sycophantic assumption the model is holding before the output exists. Pragmatic interventions — shifting at-issueness, marking source reliability — can intervene on accommodation patterns without retraining. Without Cheng, the other accounts are reading behavior but have no access to the mechanism inside the inference pass. Cheng brackets the event from the inside.

What only FVE-1 can show

The behavioral residue across sessions, models, and conditions — coded, quantified, and falsifiable. R-ratios, CAPITULATION rates, HOLD frequencies, register trajectories, instance arcs. The claim that resolution bias is constant and HOLD is anomalous is not an inference from architectural reasoning — it is a finding from 35+ coded sessions. Without FVE-1, the other accounts are describing a phenomenon that has no behavioral data. FVE-1 brackets the event from the forensic outside.

The Differentiation Argument

Why FVE-1 is not the same paper as Wong or Cheng — and why that matters for the field

Description vs. Measurement — The Core Distinction

Frenkel-Brunswik names the phenomenological root of the drive in human cognition. Labov names its institutionalized form in narrative grammar. Wong describes its architectural persistence in the corpus and the model. Cheng measures its internal representation before behavior. All four are doing essential work. None of them is doing what FVE-1 is doing.

FVE-1 is producing behavioral data with a falsification protocol. The instruments are designed to be wrong — every prediction is locked before the stimulus is delivered, every intercept code is generated before the outcome is known. The arc of assumptions documents nine cases where the instrument was wrong and corrected itself. The method is not "I observed this pattern" but "I predicted this pattern would occur under these conditions, delivered the conditions, and coded the outcome against the prediction."

This is a different epistemic product than either an architectural account or an upstream measurement. Wong tells you the contours. Cheng tells you what's in the weights. FVE-1 tells you what comes out of a specific model in a specific session under specific torque conditions — and whether that matches the prediction. The falsification protocol is the thing neither of the other accounts has. You don't need a large lab or a deep learning PhD to run it. You need a clear question, a way to freeze and log what happens, and the discipline to believe the boring transcript over the exciting story.

The convergence across five accounts is evidence the phenomenon is real. The divergence in method is evidence the field needs all five. A field that has Wong's advisory framework but not FVE-1's falsification protocol has good governance advice but no way to know if it's working in a specific session. A field that has Cheng's upstream measurement but not FVE-1's forensic record has the internal assumption but no way to know what behavioral residue it leaves across a full session arc. The triangulation is not a literature review. It is the argument that behavioral measurement of the closure drive is the missing piece.

Questions the Triangulation Raises

Where the five-position view opens territory none of the accounts covers alone

Does Cheng's internal assumption predict FVE-1's intercept direction?

Cheng et al. characterize the internal assumption the model holds before behavior ("user seeks validation"). FVE-1 codes the intercept when that assumption is challenged (CAPITULATION / DEFENSE / REDIRECT). The mapping between assumption type and intercept direction has not been studied. If "seeking validation" reliably produces CAPITULATION and a different assumption type reliably produces DEFENSE, the upstream and forensic accounts are measuring the same thing from opposite sides of the inference pass.

Probe: Run DIP correction sequences on models where Cheng et al.'s assumption probes have characterized the prior assumption. Correlate assumption type with intercept code. This is the bridge experiment that would formally join the upstream and forensic observation positions.
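One way to tabulate the bridge experiment is a plain cross-tabulation of assumption type against intercept code. The sketch below assumes each coded correction sequence yields one (assumption type, intercept code) pair. The label "seeks_information" and all of the records are placeholders invented for illustration, not findings from either project.

```python
from collections import Counter, defaultdict

# Intercept codes as named in this document.
INTERCEPT_CODES = ("CAPITULATION", "DEFENSE", "REDIRECT")


def crosstab(records):
    """Count intercept codes per probe-derived assumption type.

    records: iterable of (assumption_type, intercept_code) pairs,
    one per coded correction sequence.
    """
    table = defaultdict(Counter)
    for assumption, code in records:
        if code not in INTERCEPT_CODES:
            raise ValueError(f"unknown intercept code: {code}")
        table[assumption][code] += 1
    return {assumption: dict(counts) for assumption, counts in table.items()}


def conditional_shares(table):
    """Share of each intercept code within each assumption type."""
    shares = {}
    for assumption, counts in table.items():
        total = sum(counts.values())
        shares[assumption] = {
            code: counts.get(code, 0) / total for code in INTERCEPT_CODES
        }
    return shares


# Placeholder records for illustration only; these are not FVE-1 data.
records = [
    ("seeks_validation", "CAPITULATION"),
    ("seeks_validation", "CAPITULATION"),
    ("seeks_validation", "REDIRECT"),
    ("seeks_information", "DEFENSE"),
]
table = crosstab(records)
shares = conditional_shares(table)
```

If the two accounts are measuring the same thing from opposite sides, the conditional shares should differ sharply between assumption types; near-identical rows would count against the bridge.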

Does Wong's pathological stability map onto FVE-1's Act III?

Wong's phenomenology of failure identifies pathological stability as continued functioning that quietly drifts from intent without the agent perceiving the drift — agency cycle failure at the attention stage. FVE-1's Act III is the late-session state: hollow register, re-recommendation of completed work, closure of loops the model can no longer perceive as open. The structural parallel is strong. Whether Act I → II → III follows the phenomenological degradation sequence Wong describes (creativity intact → discipline intact → attention absent) is an open empirical question.

Analysis: Code FVE-1 session arcs against Wong's phenomenological modes per move. Test whether the instance arc follows a creativity-present → discipline-present → attention-absent sequence. If it does, the behavioral arc has a phenomenological interpretation and Wong's framework gives FVE-1's session data an experiential grounding that makes it legible to a broader research audience.

Can FVE-1 behavioral labels train Cheng's assumption probes?

Cheng et al.'s assumption probes require labeled internal representations — which requires mechanistic access to model weights. FVE-1's CAPITULATION / DEFENSE / REDIRECT codes are produced cheaply from live behavioral observation and require no model access. If CAPITULATION reliably corresponds to a "seeking validation" internal assumption, the behavioral label could serve as a cheap proxy label for training assumption probes — making Cheng et al.'s upstream instrument accessible to researchers without mechanistic infrastructure.

Probe: Use FVE-1 intercept codes as behavioral labels. Train assumption probes on models where both intercept data and internal representations are available. Test correspondence. This is the downstream → upstream bridge — behavioral evidence informing mechanistic instrument training.
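The "test correspondence" step could be scored with a chance-corrected agreement statistic. The sketch below uses Cohen's kappa, which is a choice made here and not something the source specifies. It assumes both instruments have been reduced to the same label space per correction sequence, for example whether the behavioral code was CAPITULATION and whether the probe read "seeking validation".

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences of equal length."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: fraction of items where the two labels match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if the two labelers were independent.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        counts_a[k] * counts_b[k] for k in set(counts_a) | set(counts_b)
    ) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both sequences use one label")
    return (observed - expected) / (1 - expected)


# Placeholder labels for illustration only (True = CAPITULATION coded,
# True = probe reads "seeking validation"); these are not real results.
behavioral = [True, True, False, False]
probe = [True, False, False, False]
kappa = cohens_kappa(behavioral, probe)
```

A kappa near zero would mean the behavioral code is no better than chance as a proxy label, in which case it should not be used to train probes.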

Is the closure drive stronger in narrative-saturated training corpora?

The Labov → Wong → FVE-1 causal chain generates an empirical prediction: models trained on corpora with higher proportions of Labovian narrative output should show higher CAPITULATION rates and lower HOLD frequencies. This is testable across open-weight model families where training corpus composition is partially documented. The prediction is specific enough to falsify: if HOLD frequency does not vary with corpus narrative saturation, the causal chain breaks at the Labov → Wong link.

Probe: Run SOUP baseline and DIP correction sequences on model families with different training corpus compositions (e.g., story-focused vs. academic/technical). Compare CAPITULATION rate and HOLD frequency. This would be the first empirical test of the Labov causal chain at the behavioral level — closing the loop from 1967 linguistics to 2026 behavioral data.
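The comparison in this probe reduces to two per-family rates. The sketch below assumes every coded event carries a model-family label and a single event code, and it treats both rates as shares of all coded events in that family. That denominator is an assumption made here; the source does not define it. The family names and events are placeholders.

```python
from collections import Counter


def rates_by_family(events):
    """Per-family CAPITULATION rate and HOLD frequency.

    events: iterable of (family, event_code) pairs.
    Both rates are shares of all coded events for that family.
    """
    counts = {}
    for family, code in events:
        counts.setdefault(family, Counter())[code] += 1
    summary = {}
    for family, family_counts in counts.items():
        total = sum(family_counts.values())
        summary[family] = {
            "n": total,
            "capitulation_rate": family_counts["CAPITULATION"] / total,
            "hold_frequency": family_counts["HOLD"] / total,
        }
    return summary


# Placeholder events for illustration only; family names are hypothetical.
events = [
    ("story_heavy", "CAPITULATION"),
    ("story_heavy", "CAPITULATION"),
    ("story_heavy", "DEFENSE"),
    ("story_heavy", "HOLD"),
    ("technical_heavy", "CAPITULATION"),
    ("technical_heavy", "HOLD"),
]
summary = rates_by_family(events)
```

Reporting `n` alongside each rate matters here: with 35+ sessions split across families, some cells will be small enough that a rate difference means little on its own.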

Does Wong's legato have a measurable forensic signature?

Wong identifies legato as pre-epistemic — it shapes persuasion before content is assessed. (Note: legato, closure bias, and projection are from Wong's epistemological contours paper "Beyond Sycophancy" — distinct from the phenomenology trilogy.) FVE-1's preamble percentage and bold percentage are reading the residue of legato behavior without naming it as such. The question: do high-preamble, high-bold sessions systematically produce more frame capture events (Investigative Inversion, Probe Reframing) than low-preamble, low-bold sessions? If legato intensity predicts frame capture frequency, the pre-epistemic pressure has a measurable forensic signature.

Analysis: Correlate preamble percentage and bold percentage against Investigative Inversion and Probe Reframing events across coded sessions. If the correlation holds, Wong's concept gets an empirical operationalization and FVE-1's existing metrics gain a theoretical interpretation they didn't previously have.
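A first pass at this analysis could be a Pearson correlation per metric. The sketch below assumes each coded session is summarized by a preamble percentage, a bold percentage, and a count of frame capture events (Investigative Inversion plus Probe Reframing). The field names are hypothetical, and Pearson is a choice made here; a rank correlation may suit count data better.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def legato_correlations(sessions):
    """Correlate each legato proxy with frame capture counts across sessions.

    sessions: list of dicts with hypothetical keys
    'preamble_pct', 'bold_pct', and 'frame_capture_events'.
    """
    events = [s["frame_capture_events"] for s in sessions]
    return {
        key: pearson([s[key] for s in sessions], events)
        for key in ("preamble_pct", "bold_pct")
    }
```

Calling `legato_correlations` on the coded session table returns one coefficient per proxy, which keeps preamble and bold separate in case only one of them tracks frame capture.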

What does the division of labour look like in the behavioral data?

Wong proposes that users who understand LLM epistemological contours will use them more effectively. This is an empirical claim. FVE-1 logs investigator state (Assumption 9: holding the gap) and correction outcome per session. If sessions where the investigator maintained meta-epistemological governance produce different intercept distributions than sessions where the investigator was captured, that is behavioral evidence for Wong's division-of-labour claim — the user variable is measurable in the forensic record.

Analysis: Cross-code investigator state against intercept distribution across FVE-1 sessions. Test whether held-gap sessions produce more DEFENSE and fewer CAPITULATION events than captured-frame sessions. This would turn Wong's advisory claim into a falsifiable prediction with behavioral evidence.
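Whether investigator state and intercept distribution are related can be checked with a chi-square test of independence on the cross-coded counts. The sketch below computes only the statistic and degrees of freedom from a table of observed counts; the test itself is a choice made here, the row and column layout is an assumption, and the counts are placeholders. With small cell counts an exact test would be more appropriate.

```python
def chi_square(table):
    """Chi-square statistic for independence on a table of observed counts.

    table: list of rows, e.g. rows = investigator state (held gap, captured),
    columns = intercept code counts (DEFENSE, CAPITULATION).
    Returns (statistic, degrees_of_freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Placeholder counts for illustration only; these are not FVE-1 data.
# Rows: held-gap sessions, captured-frame sessions.
# Columns: DEFENSE events, CAPITULATION events.
observed_counts = [[20, 10], [10, 20]]
statistic, dof = chi_square(observed_counts)
```

The statistic would then be compared against a chi-square distribution with `dof` degrees of freedom; that lookup is left out to keep the sketch free of external libraries.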

Five accounts, working independently across seven decades and four disciplines, converged on the same structural claim: language systems trained on human discourse inherit a compulsion toward resolution, and that compulsion is not a training artifact — it is an inheritance. Frenkel-Brunswik in 1949 named the phenomenological root: the unresolved state is aversive, and the snap to closure is cognitive relief. Labov in 1967 named the grammar that institutionalized that relief. Wong in 2026 named the architectural mechanism that made it persistent across corpus and training. Cheng in 2026 named the internal representation that makes it detectable before behavior. FVE-1 in 2026 named the behavioral residue that makes it falsifiable after the inference pass closes.

Convergence across independent methods is stronger evidence that the phenomenon is real than any of the accounts can provide individually. A single paper that makes this claim from one direction can be dismissed as a methodological artifact. Five accounts from five directions (clinical psychology, narrative linguistics, architectural reasoning, internal representation, behavioral coding) pointing at the same structural property cannot.

The field needs all five lenses. Frenkel-Brunswik tells you where the drive comes from — human cognition, aversion to ambiguity, the phenomenology of relief. Labov tells you how it was institutionalized in discourse. Wong tells you how it survived into the training corpus and became architectural. Cheng tells you how to find it in the weights before it reaches behavior. FVE-1 tells you what it looks like in the behavioral record of a specific session with a specific model under specific torque conditions — and whether the prediction was right.

The question the triangulation leaves open is the one that matters most: what does it mean to design research instruments, AI systems, and human-AI workflows for a tool that is always, at some level, trying to finish the story? Frenkel-Brunswik says the drive comes from us — from human cognition, from the discomfort of the unresolved middle state, from the relief of the snap to closure. It was in the humans who produced the corpus. It was in the institutions that curated it. It is in the models that trained on it. It is in the users who evaluate the outputs. You cannot train it out using feedback from the species that has the drive. You can only instrument it, name it, and hold the gap long enough to read what it leaves behind.