Prompt Framing Best Practices

Developed from extended multi-model adversarial review sessions, framework development across 27+ revision cycles, and empirical observation of model failure modes across GPT, Claude, Gemini, Mistral, DeepSeek, Grok, Llama, Skywork, and Perplexity Sonar.

Living document — updates as practice develops.

1. Fundamentals

What a prompt actually does

A prompt is not a question sent to an oracle. It is a field that magnetizes the model's output distribution toward a region of its probability landscape. Every word in a prompt weights the next-token prediction toward or away from specific regions. You are not asking. You are steering.

What context injection actually does

Context injection loads prior information into the model's active window before the prompt fires. It shifts the probability landscape the prompt then operates on. Injecting the wrong context is worse than injecting no context — it pre-magnetizes the model toward the wrong attractor before your prompt arrives.

What register actually does

The stylistic register of your prompt activates corresponding probability regions in the model. A politely worded request activates the helpful-elaboration gradient. A directive, closed prompt activates the task-completion gradient. Choose the register that serves the output you need, not the register that feels socially comfortable.

2. Do's

State what you want, not what you don't want

Negations are processed as activations: 'don't use jargon' still injects jargon into the context. Lead with the positive specification instead, e.g. 'use plain language.'

Close prompts in the vocabulary of the task

The closing line weights the context window toward the final register encountered. Close in the task vocabulary, not social convention. 'The lossyscape will be here' keeps weighting toward the framework domain. 'Thanks, this was helpful' shifts to social-assistant register.

Front-load the task, back-load the constraints

Models weight earlier tokens more heavily. Put the core task in the first sentence. Put exclusions and format requirements after.

Use concrete falsification language

'Find where this is wrong' produces better adversarial responses than 'evaluate this critically.' Falsification is a specific cognitive task. Evaluation is a category that includes praise, summary, and extension-suggestion.

Specify output format before content

'List three alternative explanations, each in one sentence' will produce more usable output than asking for content and hoping the model structures it correctly.
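A minimal sketch of format-first assembly. `build_prompt` is a hypothetical helper, not from any library; the point is simply that the format specification precedes the content payload.

```python
# Sketch: state the output format before the content.
# `build_prompt` is a hypothetical helper, not a library function.
def build_prompt(format_spec: str, content: str) -> str:
    """Assemble a prompt with the format requirement stated first."""
    return f"{format_spec}\n\n{content}"

prompt = build_prompt(
    "List three alternative explanations, each in one sentence.",
    "Observation: error rates rise only on weekend traffic.",
)
```

The same helper inverted, content first, is the failure mode this rule blocks: the model commits to a structure before it reads the constraint.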

Use the model's own architecture as an anchor

'Given that you were trained on WebText, where are your highest-uncertainty domains?' produces more specific output than 'where are you uncertain?'

Inject context in layers, not dumps

First prompt: establish domain and vocabulary. Second: establish specific task. Third: the actual work. Dumping all context in one block lets the model pattern-match to the most statistically salient portion and ignore the rest.
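The three-layer sequence can be sketched as a chat message list. The role/content schema mirrors common chat APIs; the layer contents here are illustrative placeholders, not prescribed wording.

```python
# Sketch of layered context injection as a chat message list.
# In practice each layer is sent as its own turn, and the reply is read
# before the next layer fires.
def layered_messages(domain: str, task: str, work: str) -> list:
    """Build three user turns: domain vocabulary, specific task, actual work."""
    return [
        {"role": "user", "content": domain},  # layer 1: establish domain and vocabulary
        {"role": "user", "content": task},    # layer 2: establish the specific task
        {"role": "user", "content": work},    # layer 3: the actual work
    ]
```

Dumping all three strings into one message is the anti-pattern the section describes: the model pattern-matches to whichever portion is most statistically salient.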

Flag what is uncertain or unresolved

Explicitly marking contested claims and open problems prevents the model from treating working assumptions as established facts.

Run the same prompt across multiple model families

Divergence between models is as informative as the content of any single response. A finding that appears across multiple independent lineages is more likely to reflect something real.
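The fan-out can be sketched with stand-in callables. The stubs below are assumptions standing in for real client calls (one per model family); the divergence check is the part that matters.

```python
# Sketch: fan one prompt out to several model callables and flag divergence.
# The callables are stand-in stubs; swap in real client calls per family.
def run_across_models(prompt, models):
    """Return {name: response} for the same prompt across model families."""
    return {name: call(prompt) for name, call in models.items()}

def divergent(responses):
    """True when the models do not all agree verbatim - a signal worth reading."""
    return len(set(responses.values())) > 1
```

Verbatim comparison is deliberately crude; even so, a True here is a prompt worth re-reading across all responses, not just the one you like.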

Record raw responses verbatim before interpreting

Interpretation is lossy. Paste verbatim first. Interpret second. Keep them in separate columns or files.
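The separate-columns discipline can be sketched with the stdlib csv module: the verbatim response is written first, and the interpretation column stays empty until a second pass. Column names are illustrative.

```python
import csv
import io

# Sketch: keep the verbatim response and the interpretation in separate
# columns, so the lossy step is always distinguishable from the raw data.
def record(writer, prompt_id, raw_response, interpretation=""):
    """Write the raw text first; interpretation is filled in on a later pass."""
    writer.writerow([prompt_id, raw_response, interpretation])

buf = io.StringIO()
w = csv.writer(buf)
w.writerow(["prompt_id", "raw_response", "interpretation"])
record(w, "p-001", "model output, verbatim")
```

Separate files work the same way; the invariant is that interpretation never overwrites or paraphrases the raw column.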

3. Don'ts

Don't open with social preamble

'I hope you can help me with this' activates the agreement-and-validation gradient before you've stated the task. Start with the task.

Don't ask for evaluation when you want stress-testing

'Evaluate this critically' is a category that includes compliments, suggestions, summary, and extensions. 'Find where this breaks' is a specific falsification operation.

Don't use 'novel,' 'interesting,' or 'promising'

These activate the validation register. Ask the model to identify what in the existing literature most closely resembles it instead.

Don't inject more context than the task requires

A 10,000 token injection for a 500-token task means 9,500 tokens of noise competing with your signal.

Don't ask yes/no questions about contested claims

'What would need to be true for this analogy to hold formally, and what evidence would falsify it?' is better than 'Is this analogy correct?'

Don't let the model set the agenda

'Would you like me to explore X?' means the model is waiting for a low-resistance gradient. Close the exit.

Don't mistake fluency for accuracy

A model can produce a grammatically perfect, confidently stated response that is factually wrong. Fluency is not a signal of correctness.

Don't treat model agreement as validation

GPT, Claude, Gemini, and Mistral train on overlapping corpora. Agreement is correlated evidence from similar distributions, not independent confirmation.

Don't run prompts when fatigued

A model that produces a confident, fluent deflection will pass as engagement if you are not paying full attention. Space sessions apart. Read responses twice.

Don't discard failed prompts

Record the failed version and failure mode before moving to a revision. The sequence of attempts is data.

4. Context Injection Packages

What to include

Domain vocabulary (50-150 words), task specification (1-3 sentences), exclusion constraints (1-5 items), assessment standard (optional, 2-4 sentences).

What to exclude

Background motivation, history of the project, social framing, open-ended invitations, version history. None of these help the model do the task.

Context injection order

1. Domain vocabulary / key terms.
2. The specific material.
3. Task specification.
4. Exclusion constraints.
5. Assessment standard.

Do not put the task last.
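The assembly order can be sketched as a concatenation function. Field names are illustrative assumptions; the ordering is the point, and the task is never the final element.

```python
# Sketch: assemble a context package in the prescribed order.
# Field names are illustrative; the ordering is what matters.
def build_package(vocabulary, material, task, exclusions, standard=""):
    """Concatenate package parts: vocabulary, material, task, exclusions, standard."""
    parts = [vocabulary, material, task]
    parts += [f"Do not: {x}" for x in exclusions]
    if standard:
        parts.append(standard)
    return "\n\n".join(parts)
```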

Context package size guidelines

Simple adversarial review: 500-1500 tokens. Complex framework review: 1500-4000 tokens. Full document review: 4000-8000 tokens. Maximum useful: ~8000 tokens. Beyond this, attention dilution degrades engagement quality.
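These budgets can be sanity-checked before sending with a rough size estimate. The ~4 characters-per-token heuristic below is an approximation for English text; real tokenizers vary by model, so treat the numbers as order-of-magnitude only.

```python
# Rough token estimate using the ~4 chars/token heuristic for English text.
# Real tokenizers vary by model; this is a pre-send sanity check only.
def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def within_budget(text: str, budget: int = 8000) -> bool:
    """Flag packages past the ~8000-token point where attention dilution sets in."""
    return approx_tokens(text) <= budget
```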

5. Register Fidelity

The closing line problem

Generic closings shift the model's context weighting toward the social register at exactly the moment it matters most — just before any follow-up prompt. Close in the vocabulary of the task. 'The lossyscape will be here' keeps the domain active. 'Thanks for your help' closes in social convention.

The greeting problem

Opening with 'Hello' or 'I hope you're doing well' activates the social-assistant register before you've stated anything. The model's first tokens are already in the wrong gradient. Start with the task or domain vocabulary.

Multi-session register maintenance

Over a long multi-session project, register drift is cumulative. Each session that opens with social preamble and closes with social convention moves the model toward the helpful-elaboration gradient. Read prior findings before starting a new session to re-anchor domain vocabulary.

Quick reference

Before writing a prompt, ask: what specific operation do I need? What failure modes am I blocking? What context does the model actually need? What does a useful response look like? Close in the vocabulary of the task. Read responses twice before recording.