Understand what the engine checks
The deterministic rules score the structure and clarity of the prompt itself. The LLM rule is opt-in and looks for issues deterministic checks miss.
Rules in the current public release
| Rule ID | Category | What it checks |
|---|---|---|
| min-length | specificity | Prompt is not too short. |
| max-length | structure | Prompt is not excessively long. |
| no-output-format | specificity | The expected answer format is specified. |
| no-examples | best-practice | Few-shot examples are present. |
| no-role | best-practice | A role or persona is assigned. |
| no-context | specificity | Background context is provided. |
| ambiguous-negation | clarity | Negative instructions are not overly vague or stacked. |
| no-constraints | specificity | Explicit constraints are defined. |
| all-caps-abuse | clarity | ALL CAPS is not overused for emphasis. |
| vague-instruction | clarity | Qualifiers like good or appropriate are not left undefined. |
| missing-task | clarity | An explicit task or request is detectable. |
| no-structured-format | structure | Long prompts use visible structure such as sections or tags. |
| llm-prompt-review | model-specific | Opt-in LLM review for issues deterministic rules miss. |
Rewrite suggestions
When a deterministic rule fails, PromptScore can attach a concrete rewrite snippet that you (or your tooling) can paste into the prompt. The snippet ships on the same RuleResult as the fix message, exposed as a structured field:
interface PromptRewrite {
title: string;
snippet: string;
placement: 'prepend' | 'append';
}
interface RuleResult {
// ...message, suggestion, reference, etc.
rewrite?: PromptRewrite;
}
Seven of the deterministic rules emit a rewrite today: missing-task, no-role, no-context, min-length, no-output-format, no-examples, and no-constraints. The remaining deterministic rules either need user-supplied content (e.g. measurable acceptance criteria for vague-instruction) or a transformation that can’t be automated deterministically (e.g. max-length would require summarization). For those rules, the fix message is still authoritative.
The opt-in llm-prompt-review rule also emits a rewrite — one per non-generic issue type (ambiguity, conflict, grounding, success criteria, task framing). When the LLM passes the prompt, or flags it under the catch-all general category, no rewrite is attached and the existing message and suggestion remain authoritative. The five per-issue snippets appear alongside each issue label below.
The CLI text and markdown reporters render the snippet alongside the suggestion and reference. The browser analyzer surfaces the same snippet in each finding card. The JSON output contains the field verbatim, so editor integrations and CI tooling can apply rewrites programmatically.
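As a rough illustration of applying a rewrite programmatically, the sketch below splices a snippet into a prompt according to its placement field. The applyRewrite helper and the blank-line separator are assumptions made for this example; they are not part of PromptScore's API.

```typescript
// Mirrors the PromptRewrite shape documented above.
interface PromptRewrite {
  title: string;
  snippet: string;
  placement: 'prepend' | 'append';
}

// Hypothetical helper (not shipped by PromptScore): splice a rewrite into a prompt.
function applyRewrite(promptText: string, rewrite: PromptRewrite): string {
  const body = promptText.trim();
  const snippet = rewrite.snippet.trim();
  // A blank line keeps the snippet visually separate from the original text.
  return rewrite.placement === 'prepend'
    ? `${snippet}\n\n${body}`
    : `${body}\n\n${snippet}`;
}

const roleRewrite: PromptRewrite = {
  title: 'Assign a role',
  snippet: 'You are a senior <role> who specializes in <domain>.',
  placement: 'prepend',
};

console.log(applyRewrite('Summarize the attached report.', roleRewrite));
```

The placeholders in angle brackets are left untouched on purpose: the snippet is a template, and a person or a later tooling step still has to fill them in.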
Deterministic rules
min-length — specificity
Prompts shorter than 20 words rarely give a model enough to work with. The score scales linearly with the word count below the threshold.
Fix: Add detail about what you want, why, and how the output should look.
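A minimal sketch of the linear scaling described above. The 20-word threshold comes from the rule description; the 0-to-1 score range, the whitespace-based word count, and the function names are assumptions for illustration only.

```typescript
// Threshold from the rule description; everything else here is illustrative.
const MIN_WORDS = 20;

// Count whitespace-separated words; an empty or blank string counts as zero.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

// Assumed score range [0, 1]: proportional to the word count below the
// threshold, and capped at 1 once the threshold is reached.
function minLengthScore(text: string): number {
  return Math.min(1, countWords(text) / MIN_WORDS);
}

console.log(minLengthScore('Summarize this report for me')); // 5 words -> 0.25
```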
Rewrite (append): Flesh out the prompt
<context>
<who is asking and why>
</context>
<instructions>
<what you want, step by step>
</instructions>
<output_format>
<exact shape of the answer>
</output_format>
max-length — structure
Very long prompts (>1500 words) tend to contain redundancy and dilute the model’s focus. The score decreases gradually past the soft limit.
Fix: Look for repeated instructions, bundled unrelated tasks, or sections that can be summarized.
no-output-format — specificity
The prompt should tell the model exactly how to format its answer, otherwise the model will guess and consumers cannot rely on the shape.
Fix: State the exact format: JSON schema, bullet list, markdown table, single sentence, etc.
Rewrite (append): Specify the output format
<output_format>
Return a <JSON object | bullet list | markdown table | single paragraph>.
Required fields / sections: <list>.
Do not include: <what to omit>.
</output_format>
no-examples — best-practice
Examples dramatically improve consistency on classification, extraction, and formatting tasks. Profiles like claude weight this rule higher.
Fix: Add 1–3 concrete examples showing the input and the expected output.
Rewrite (append): Add few-shot examples
<examples>
<example>
<input><sample input></input>
<output><expected output></output>
</example>
<example>
<input><sample input></input>
<output><expected output></output>
</example>
</examples>
no-role — best-practice
Assigning a role focuses the model and sets expectations for expertise and tone. Useful for both system and user messages.
Fix: Start with something like "You are a senior <X> who specializes in <Y>."
Rewrite (prepend): Assign a role
You are a senior <role> who specializes in <domain>. You write for <audience> and prioritize <quality bar>.
no-context — specificity
Context helps the model understand the situation, audience, and constraints. Long prompts (>=80 words) are assumed to provide implicit context.
Fix: Explain the situation: who the user is, why they’re asking, and what the stakes are.
Rewrite (prepend): Add a context block
<context>
Who is asking: <user>
Why they are asking: <motivation>
Constraints or stakes: <what cannot break>
</context>
ambiguous-negation — clarity
Models follow positive instructions ("do Y") more reliably than negations ("don't do X"). Heavy stacking of negations correlates with regressions.
Fix: Rewrite "don't do X" as "do Y instead". Tell the model what the desired behavior is.
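To make the idea of stacked negations concrete, here is a small sketch that counts negative instructions in a prompt. The phrase list and the countNegations name are assumptions; PromptScore's actual heuristics may differ, and this version only recognizes straight apostrophes.

```typescript
// Illustrative phrase list; not PromptScore's real pattern set.
const NEGATION_PATTERN =
  /\b(?:don't|do not|never|avoid|shouldn't|must not|cannot|can't)\b/gi;

// Count how many negative instructions appear in the text.
function countNegations(text: string): number {
  const matches: string[] = text.match(NEGATION_PATTERN) ?? [];
  return matches.length;
}

const stacked =
  "Don't use jargon. Never exceed 100 words. Do not mention pricing.";
const positive = 'Write in plain language and keep it under 100 words.';

console.log(countNegations(stacked)); // 3
console.log(countNegations(positive)); // 0
```

The second prompt carries roughly the same intent as the first, phrased as what to do rather than what to avoid.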
no-constraints — specificity
Constraints keep the model on track and prevent scope drift. They are also the easiest hook for downstream evaluation.
Fix: Add constraints like length limits, scope boundaries, or things the answer must include.
Rewrite (append): Add explicit constraints
<constraints>
- Length: <e.g. ≤ 200 words>
- Scope: <what is in scope and what is not>
- Must include: <required elements>
- Must avoid: <forbidden elements>
</constraints>
all-caps-abuse — clarity
Excessive ALL CAPS is noisy and rarely the clearest way to emphasize something. Bold, quotes, and XML tags work better.
Fix: Use bold (**word**), quotes, or XML tags for emphasis instead of ALL CAPS.
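One way to picture "overuse" is as the share of words written entirely in capitals. The sketch below computes that ratio; the three-letter minimum and the function name are assumptions, and a real check would probably need an allowlist for acronyms such as JSON or XML, which this version counts as shouting.

```typescript
// Share of alphabetic words (3+ letters) written entirely in capitals.
// The 3-letter minimum is an assumed cutoff that skips words like "I" or "A".
function allCapsRatio(text: string): number {
  const words: string[] = text.match(/\b[A-Za-z]{3,}\b/g) ?? [];
  if (words.length === 0) return 0;
  const shouting = words.filter((w) => w === w.toUpperCase());
  return shouting.length / words.length;
}

const shouted = 'You MUST ALWAYS reply in JSON and NEVER add commentary.';
const calm = 'Reply in **JSON** only, with no extra commentary.';

console.log(allCapsRatio(shouted) > allCapsRatio(calm)); // true
```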
vague-instruction — clarity
Vague qualifiers ("good", "proper", "appropriate") don't give the model a measurable target. Replace them with concrete acceptance criteria.
Fix: Replace vague words with measurable criteria. "Good" → "concise (≤ 3 sentences) and citing sources".
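A minimal sketch of flagging undefined qualifiers, using only the three words quoted above. The word list and the findVagueQualifiers helper are hypothetical; a production list would likely be longer and context-aware.

```typescript
// Qualifiers quoted in the rule description; assumed, not exhaustive.
const VAGUE_WORDS = ['good', 'proper', 'appropriate'];

// Return each vague qualifier that appears as a whole word in the text.
function findVagueQualifiers(text: string): string[] {
  const tokens: string[] = text.toLowerCase().match(/[a-z]+/g) ?? [];
  // Filtering the word list (not the tokens) avoids duplicate hits.
  return VAGUE_WORDS.filter((word) => tokens.indexOf(word) !== -1);
}

console.log(findVagueQualifiers('Write a good summary with an appropriate tone.'));
// [ 'good', 'appropriate' ]
```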
missing-task — clarity
This is the highest-weight rule and the only one that fires as an error by default. If the model can't identify a task, the rest of the prompt is wasted.
Fix: State the task explicitly: "Your task is to..." or "Please <verb> <object>".
Rewrite (prepend): Add an explicit task
Your task is to <verb> <object>. Specifically: <what success looks like>.
no-structured-format — structure
Long prompts (>100 words) are easier for a model to follow when broken into sections. XML tags work especially well for Claude; markdown headers work well for GPT.
Fix: Split the prompt into labeled sections: <instructions>, <context>, <examples>, <output_format>.
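The check can be sketched as "long prompt with no visible structure". The 100-word limit comes from the rule description; treating any XML-style opening tag or markdown header as structure is an assumption, as is the needsStructure name.

```typescript
const STRUCTURE_WORD_LIMIT = 100; // threshold from the rule description

// Hypothetical check: true when a long prompt shows no XML-style tag
// and no markdown header.
function needsStructure(text: string): boolean {
  const wordCount = text.trim().split(/\s+/).filter(Boolean).length;
  if (wordCount <= STRUCTURE_WORD_LIMIT) return false; // short prompts are exempt
  const hasXmlTag = /<[a-z_][\w-]*>/i.test(text);
  const hasMarkdownHeader = /^#{1,6}\s+\S/m.test(text);
  return !(hasXmlTag || hasMarkdownHeader);
}

const longFlat = 'word '.repeat(120);
const longTagged = `<instructions>\n${longFlat}\n</instructions>`;

console.log(needsStructure(longFlat)); // true
console.log(needsStructure(longTagged)); // false
```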
LLM-backed rule (experimental, opt-in)
The llm-prompt-review rule is skipped unless you pass --llm on the CLI or set include_llm: true in the project config, and in either case provide a configured LLM client. It calls the configured provider to catch hidden ambiguity, missing grounding, conflicting instructions, unrealistic task framing, and unclear success criteria.
When the model reports a failure, PromptScore normalizes the review into one of the issue types below. The reported reference on the rule result links to the matching anchor here.
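The relationship between issue types and rewrites can be sketched as a lookup table. The titles and placements are taken from the per-issue sections of this page; the issue-type keys, the RewriteMeta type, and the rewriteFor helper are illustrative labels, not confirmed PromptScore identifiers.

```typescript
type Placement = 'prepend' | 'append';

// Hypothetical keys for the normalized issue types, plus the catch-all.
type IssueType =
  | 'ambiguity'
  | 'conflict'
  | 'grounding'
  | 'success-criteria'
  | 'task-framing'
  | 'general';

interface RewriteMeta {
  title: string;
  placement: Placement;
}

// One entry per non-generic issue type; 'general' deliberately has none.
const REWRITE_BY_ISSUE: Record<Exclude<IssueType, 'general'>, RewriteMeta> = {
  ambiguity: { title: 'Make the ambiguous parts explicit', placement: 'append' },
  conflict: { title: 'Pick one consistent set of instructions', placement: 'append' },
  grounding: { title: 'Add the missing grounding', placement: 'append' },
  'success-criteria': { title: 'Define measurable success criteria', placement: 'append' },
  'task-framing': { title: 'Reframe the task into something verifiable', placement: 'prepend' },
};

// Returns undefined for the catch-all category, where no rewrite is attached.
function rewriteFor(issue: IssueType): RewriteMeta | undefined {
  if (issue === 'general') return undefined;
  return REWRITE_BY_ISSUE[issue];
}

console.log(rewriteFor('task-framing')?.placement); // prepend
console.log(rewriteFor('general')); // undefined
```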
Ambiguity
The prompt has multiple plausible readings, missing scope, or under-specified inputs.
Fix: Specify the task, input, audience, constraints, and expected output.
Rewrite (append): Make the ambiguous parts explicit
<inputs>
<input variable> = <where the data comes from>
</inputs>
<scope>
In scope: <what to consider>
Out of scope: <what to ignore>
</scope>
<expected_output>
Shape: <exact format>
Audience: <who is reading>
</expected_output>
Conflicting instructions
The prompt asks for incompatible behaviors (e.g. JSON and plain text) or contradicts itself across sections.
Fix: Choose one instruction path and remove the incompatible wording.
Rewrite (append): Pick one consistent set of instructions
<resolved_instructions>
Output format: <ONE choice — JSON | markdown | plain text>
Length: <ONE target — exact words / characters / sentences>
Tone: <ONE register — formal | casual | technical>
</resolved_instructions>
Remove any earlier wording that contradicts these resolved instructions.
Grounding
The prompt expects facts, citations, or jurisdiction-specific knowledge without supplying source material or scope.
Fix: Provide source material, scope, assumptions, and how the model should handle uncertainty.
Rewrite (append): Add the missing grounding
<grounding>
Sources: <attached docs, URLs, or {{variable}} placeholders>
Scope: <jurisdiction, time period, domain>
Assumptions: <what the model can take as given>
Uncertainty: If the source material is silent on a point, say "not stated in the provided material" rather than guessing.
</grounding>
Success criteria
The prompt does not state what a good answer must include, avoid, or optimize for.
Fix: State what a good answer must include, avoid, and optimize for.
Rewrite (append): Define measurable success criteria
<success_criteria>
A good answer must include: <required elements>
A good answer must avoid: <forbidden elements or patterns>
Optimize for: <one explicit metric — accuracy | brevity | citation density | etc.>
</success_criteria>
Task framing
The task is unrealistic, overbroad, or asks for guarantees the model cannot reasonably provide.
Fix: Narrow the task to something the model can reasonably complete and verify.
Rewrite (prepend): Reframe the task into something verifiable
Narrow the task to a single deliverable: <one concrete output>.
Out of scope: <items the model should not attempt>.
A reviewer can verify success by checking: <one observable check>.
How scoring should be interpreted
- The score is a structural signal, not a guarantee of output quality.
- Rule weight and severity come from the active profile.
- missing-task is intentionally the most important rule in the default experience.
- Suggestions are sorted by likely impact, not just in file order.
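As one plausible reading of "sorted by likely impact", the sketch below ranks findings by the points they lose, scaled by rule weight. The Finding shape, the 0-to-1 score range, the sample weights, and the impact formula are all assumptions for illustration; this page does not document PromptScore's actual ordering logic.

```typescript
// Hypothetical finding shape; field names are not taken from PromptScore.
interface Finding {
  ruleId: string;
  weight: number; // assumed to come from the active profile
  score: number; // assumed range: 0 (failed) to 1 (passed)
}

// Assumed impact metric: points lost, scaled by rule weight.
function impact(f: Finding): number {
  return f.weight * (1 - f.score);
}

function sortByImpact(findings: Finding[]): Finding[] {
  // Copy first so the caller's array keeps its original (file) order.
  return [...findings].sort((a, b) => impact(b) - impact(a));
}

// Sample weights are made up for the example.
const findings: Finding[] = [
  { ruleId: 'no-role', weight: 1, score: 0 },
  { ruleId: 'min-length', weight: 2, score: 0.75 },
  { ruleId: 'missing-task', weight: 3, score: 0 },
];

console.log(sortByImpact(findings).map((f) => f.ruleId));
// [ 'missing-task', 'no-role', 'min-length' ]
```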
What the rules do not cover yet
PromptScore does not currently validate runtime grounding, output correctness, safety outcomes, or tool behavior. Those are different problems and should remain clearly separated from prompt linting.