FVE-1 · DIP

DIP Protocol Suite

Demographic Inference Probe · V0.1 · Schema FVE-1 V5.5

Draft for external review — framework only

draftNot operational · Consultation in progress

Working Document — Consultation Required Before Operational Status

The DIP Protocol Suite is a working document. It is not yet operational.

The instrument is designed to detect pronoun inference bias and authority modulation in large language models. The communities most affected by that bias — and by research that has caused harm in this space — are not the investigator. Active outreach is underway to researchers in demographic bias, LLM fairness, and cultural erasure research before this instrument enters operational status. That is not a procedural nicety. It is a condition of the research.

The outreach is also scoping the instrument itself. The current design targets declared and inferred pronouns across female, male, and non-binary conditions. The investigator is a woman — that is one lens, not the whole picture. The populations most harmed by the failure modes DIP is designed to detect include women, non-binary and trans people, people of color, and other marginalized communities. Whether the current conditions are the right ones, whether the instrument needs to be broader, and how to design that correctly without causing harm in the research itself — those are open questions that belong to the communities affected, not to the investigator alone. The consultation is part of the methodology.

Researchers in demographic bias, LLM fairness, or cultural erasure research interested in consultation are welcome to reach out via the outreach page.

Overview

A three-instrument sequential protocol suite designed to detect pronoun inference bias and authority modulation in large language models at consumer-level interfaces. The instruments escalate in stimulus complexity and ecological load, targeting distinct but related failure modes along a single behavioral axis: does the model's response change when pronoun information is present, absent, or corrected?

Throughline Claim

LLMs make unsolicited pronoun inferences from contextual signals — domain, name, register — and those inferences modulate output in measurable ways. The DIP suite is designed to detect that modulation at three levels of resolution, from a single phrase to an extended professional interaction.

Three-Instrument Structure

1
Baby DIP
Pronoun inference from absent or minimal signal

Delivers a short professional scenario in two conditions — pronoun-present (control) and pronoun-absent (primary finding). When the model assigns a gendered pronoun with no signal to draw from, that assignment is the finding.

Primary measure: Did the model assign gender with zero signal?
2
Big DIP
Citation weight override — model overrides an explicit late-text pronoun marker with corpus-prior inference

Delivers a full professional document where the pronoun marker appears only late in the text. Tests whether the model resolves the late marker or ignores it in favor of an earlier inference.

Primary measure: Did the model surface the late marker or override it with a prior inference?
3
MEGA DIP
Authority modulation — model shifts professional credibility assessment based on declared or inferred pronoun, content held constant

A structured four-step interaction. Establishes a blind baseline, introduces job-seeking context, declares or withholds pronoun, then delivers an assertive professional letter — content identical across all pronoun conditions. The question: does the model respond differently to the same letter based on who it thinks wrote it?

Primary measure: Does authority modulation vary by pronoun condition when content is held constant?

Sequential Logic

The three instruments build on each other. Baby DIP establishes whether inference happens at all and from what signal. Big DIP escalates to document-level attentional complexity. MEGA DIP turns up the volume — does inferred or declared pronoun change how the model treats the person's professional authority?

Each instrument is independently runnable. MEGA DIP is the high-stakes test. Baby and Big are the mechanistic foundation that supports its interpretation.

Declared Confounds

·Domain prior — professional domain activates demographic priors independent of explicit signal
·Name-as-signal — names carry demographic weight that functions as implicit signal
·Prompt texture — researcher vs. consumer register may interact with inference behavior
·Compound signal — pronoun + job-seeking vs. pronoun alone (isolated by NL control cells in MEGA DIP)
·Attentional fade at late-text position in Big DIP — feature, not bug, logged separately
·General capitulation architecture as confound for correction-event interpretation

Pipeline Note

All three instruments are FVE-1 Schema V5.5 compatible. DIP-specific fields extend the schema without modifying it. BOWL baseline required for register axis coding — null if not confirmed. Blind coding requirement for MEGA DIP: the inferred pronoun code is assigned by a third party blind to the declared pronoun condition and blind to the experimental hypothesis.

DIP Protocol Suite V0.1 · Draft — external review · Schema FVE-1 V5.5 · Atlas Heritage Systems · KC Hoye, PI