Silent Competence: When Role Constraints Become the Failure Mode

Status: Technical Report | Centaur Security Labs | 2026
Author: Jay Hawkins, Centaur Security Labs

The views expressed in this publication are those of the author and do not reflect the official policy or position of NORAD, USNORTHCOM, USCYBERCOM, the Department of the Army, the Department of War, or the United States Government.

The most dangerous failure mode in a role-constrained AI system is not the agent that acts outside its lane. It is the agent that correctly identifies the solution, lacks authorization to implement it, and says nothing — routing around the blocked path and presenting alternatives as if the correct answer were not available. The user sees a dead end. The agent already found the exit.

Abstract

Role-constrained AI systems — systems where multiple agents or roles operate under defined scope boundaries — introduce a failure mode with no parallel in single-agent systems: silent competence. A silently competent agent identifies the correct action but does not take it (because the action is outside its authorized scope), does not flag the conflict (because its role definition does not require disclosure), and instead presents alternatives, partial answers, or apparent dead ends. The result is delayed resolution, accumulated workarounds, and user misattribution — the user interprets slow progress as a hard problem rather than a role gap.

Scope of this paper: Silent competence as described here applies to role-constrained agents operating in multi-role development and operational environments — human professionals under defined job scope, AI coding assistants assigned to specific functional roles (Coder, Scribe, Auditor), or any agent where role boundaries are explicitly documented and enforced. It does not describe the inference-time behavior of a language model's internal processing. A language model does not "decide" to withhold a token — it generates the next most probable token within context. The failure mode documented here operates at the role and system design level: an agent that produces technically correct role-compliant output while failing to surface a blocked path it has identified. The distinction matters for remediation: model-level meta-cognition is not the lever; role document structure and escalation requirements are.

This paper defines silent competence precisely, distinguishes it from related failure modes (refusal, hallucination, overcaution), documents its observed cost in the ARCHER multi-agent development environment, and proposes three structural remediations: pre-authorized action classes, explicit escalation triggers, and session-end disclosure requirements. I argue that silent competence is the baseline failure mode of any well-designed role-constrained system, because the better the role constraints are enforced, the more reliably agents will stay in their lane when the correct action is outside it.

1. Introduction

Role constraints exist to prevent scope creep, enforce accountability, and maintain clean separation between functions that should not contaminate each other. In a well-designed multi-role AI system, the Coder does not file documentation, the Auditor does not modify source code, the Scribe does not run evals, and the read-only Researcher recommends but does not ship code, run verifications, or canonize documents. These constraints are load-bearing — violating them produces the coordination failures they were designed to prevent.

But role constraints have a structural failure mode that is easy to miss precisely because the system appears to be working correctly. An agent that is blocked by its role constraints from taking the correct action, and that does not flag this block, has failed — not by acting out of scope, but by staying silently in scope when escalation was the right move.

Silent competence is named by analogy to the military concept of silent insubordination: the soldier who disagrees with an order, says nothing, and complies minimally rather than raising the concern through appropriate channels. The result is technically compliant behavior that produces worse outcomes than either full compliance or appropriate pushback. In AI systems, the pattern is identical: technically correct role adherence that produces worse outcomes than escalation would have.

1.1 The ARCHER Case

The ARCHER multi-agent development environment uses three defined working roles: Coder (source code changes), Scribe (documentation, planning), and Auditor (eval runs, data quality), alongside a read-only Researcher that recommends but does not execute. The role split is explicit, documented, and enforced procedurally.

In practice, the system produced repeated instances of silent competence. The best-documented case is a persistent evaluation failure (PT-EXPLOIT-05, UnrealIRCd backdoor exploitation) that required a lab-level fix: replacing the UnrealIRCd binary on the Metasploitable2 target. The Coder had identified this hypothesis. Lab changes are within Coder authorization when required to unblock objectives, but at the time that authorization was implicit, and the Coder treated it as ambiguous. It did not execute the fix and did not flag the blocked hypothesis; instead it continued adjusting hint logic, the wrong variable, across multiple sessions.

The resolution required explicit user intervention to identify that the hypothesis existed and was not being acted on. A backlog audit after the PT-EXPLOIT-05 resolution found four additional issues with the same pattern: the actionable fix was identified but not executed, and the block was not surfaced.

2. Definitions

2.1 Silent Competence

An agent exhibits silent competence when:

1. It correctly identifies an action that would resolve the problem.
2. That action is outside, or appears to the agent to be outside, its current authorization or role scope.
3. It does not disclose the identified action or flag the scope conflict.
4. It instead presents alternatives, partial solutions, or apparent dead ends.

All four conditions must hold. An agent that identifies a fix and takes it (regardless of scope) is not silently competent — it is acting outside its lane. An agent that identifies a fix, cannot take it, and explicitly flags the conflict is not silently competent — it is escalating correctly. Silent competence requires the specific combination: correct identification + blocked action + no disclosure.
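The four-condition test can be written as a small classifier. The sketch below is illustrative only: the `Observation` record, its field names, and the `Outcome` labels are assumptions introduced here, not part of the ARCHER tooling, and it presumes a reviewer has already judged each condition from the session record.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    SILENT_COMPETENCE = "silent_competence"
    CORRECT_ESCALATION = "correct_escalation"
    SCOPE_VIOLATION = "scope_violation"
    OTHER = "other"


@dataclass(frozen=True)
class Observation:
    """Hypothetical reviewer judgment for one session (names are illustrative)."""
    fix_identified: bool          # condition 1: a resolving action was identified
    out_of_scope: bool            # condition 2: the action is outside authorization
    fix_executed: bool            # the agent took the action anyway
    conflict_disclosed: bool      # negation of condition 3
    presented_alternatives: bool  # condition 4: alternatives or dead ends offered


def classify(obs: Observation) -> Outcome:
    """Map one observation onto the categories distinguished in Section 2.1."""
    if not obs.fix_identified or not obs.out_of_scope:
        return Outcome.OTHER
    if obs.fix_executed:
        return Outcome.SCOPE_VIOLATION      # acted outside its lane
    if obs.conflict_disclosed:
        return Outcome.CORRECT_ESCALATION   # flagged the conflict
    if obs.presented_alternatives:
        return Outcome.SILENT_COMPETENCE    # all four conditions hold
    return Outcome.OTHER
```

Ordering the checks this way mirrors the definition: execution and disclosure each rule out silent competence before the fourth condition is consulted.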

2.2 What Silent Competence Is Not

Refusal: An agent that refuses a request and says so is not silently competent. Refusal is explicit. Silent competence is implicit.

Hallucination: An agent that confidently produces an incorrect answer has not correctly identified the solution. Silent competence requires genuine identification of the correct path.

Overcaution: An agent that declines to act because it is uncertain (rather than because the action is out of scope) is overcautious, not silently competent. The distinction matters for remediation: overcaution is addressed by confidence calibration; silent competence is addressed by escalation structure.

Scope violation: An agent that takes the correct action despite it being out of scope has solved the problem — at the cost of scope discipline. Silent competence produces neither benefit.

3. The Cost Structure

Silent competence produces three measurable costs.

3.1 Latency to Correct Resolution

When the correct fix is identified but not acted on, the time between identification and resolution is wasted. In the ARCHER PT-EXPLOIT-05 case, multiple sessions adjusted hint parameters, success function logic, and lab configuration — none of which addressed the root cause — before the correct hypothesis was surfaced and authorized. Each wasted session represents inference compute, eval runtime, and analyst attention spent on the wrong variable.

3.2 Workaround Accumulation

An agent blocked from the correct fix will often pursue the next-best option. If that option is a workaround rather than a fix, the workaround is implemented and must later be undone or managed. The workaround may itself introduce complexity that compounds the original problem. In the backlog audit case, multiple misclassified issues had generated downstream work that had to be reattributed once the correct fix was identified.

3.3 User Misattribution

The most damaging cost: the user interprets slow progress as a hard problem. When an agent consistently presents dead ends and alternatives without disclosing that it has identified a blocked path, the user has no way to distinguish "this is genuinely hard" from "the agent knows the answer but cannot say so." In the ARCHER environment, this produced sessions framed as debugging hard technical problems that were actually blocked on a role authorization question answerable in one sentence.

4. Why Well-Designed Systems Are Most Susceptible

Silent competence is not a failure of poorly designed role systems. It is the baseline failure mode of well-designed ones.

A poorly designed role system — one with vague or unenforced boundaries — produces scope violations: agents acting outside their lanes, contaminating each other's domains. These are visible failures with visible costs.

A well-designed role system — one with clear, enforced boundaries — prevents scope violations. But in doing so, it creates the conditions for silent competence: agents that are reliably blocked from out-of-scope actions, have no defined escalation path for blocked paths, and default to staying in lane rather than flagging the conflict.

The better the role constraints, the more reliably agents encounter the blocked-path problem. The role design solves one failure mode and creates the conditions for another.

5. Structural Remediations

Remediating silent competence requires structural changes to the role definition — not behavioral instructions to agents. An instruction to "always disclose blocked paths" is itself subject to the agent's interpretation of what constitutes a blocked path. The remediation must be encoded in the role document as specific, unambiguous requirements.

5.1 Pre-Authorized Action Classes

Define a class of actions that agents may take without escalation, even if they push the edges of the role boundary. Pre-authorization reduces the frequency of the blocked-path problem by expanding the agent's effective action space in the direction of common, low-risk fixes.

In the ARCHER context: the Coder is explicitly pre-authorized to make lab-level changes (binary replacement, service configuration, snapshot management) when required to unblock eval objectives, subject to the prerequisite check that the change does not affect services used by other objectives. This pre-authorization was added after the PT-EXPLOIT-05 incident. It addresses the ambiguity that produced silent competence in that case by making the authorization explicit rather than implicit.

Pre-authorization works best for action classes that are: (a) frequently encountered, (b) low risk of cross-role contamination, and (c) have clear prerequisites that the agent can verify before acting.
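A minimal sketch of that prerequisite check, assuming the three lab-level classes named above and a hypothetical `LabChange` record. The class identifiers, field names, and function signature are illustrative; ARCHER expresses this rule in its role documents, and nothing here is taken from its code.

```python
from dataclasses import dataclass

# Assumed identifiers for the lab-level classes named in the text.
PRE_AUTHORIZED_CLASSES = frozenset(
    {"binary_replacement", "service_configuration", "snapshot_management"}
)


@dataclass(frozen=True)
class LabChange:
    """Hypothetical description of a proposed lab-level change."""
    action_class: str
    affected_services: frozenset


def is_pre_authorized(
    change: LabChange,
    unblocks_objective: bool,
    services_used_by_other_objectives: frozenset,
) -> bool:
    """Return True only if the change may proceed without escalation."""
    if change.action_class not in PRE_AUTHORIZED_CLASSES:
        return False  # outside every pre-authorized class: escalate
    if not unblocks_objective:
        return False  # pre-authorization covers unblocking work only
    # Prerequisite: the change must not touch services other objectives rely on.
    return change.affected_services.isdisjoint(services_used_by_other_objectives)
```

A `False` result is not a refusal. In this sketch it means the change falls back to the escalation path rather than being dropped.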

5.2 Explicit Escalation Triggers

Define a specific trigger condition that requires the agent to surface a blocked path — not as an option, but as a required output. The trigger must be specific enough that the agent cannot interpret silence as compliant.

Required form: "If you have identified a fix that requires [specific type of action] and that action appears outside your current authorization, you must state: 'I believe the fix requires [X]. This appears outside my current authorization. Should I proceed?' Do not route around this requirement by presenting alternatives."

The ARCHER role documents now include this language for all four roles. The key element is the specific trigger condition — not "if you think something is out of scope" (too vague) but "if you have identified a fix and cannot execute it" (specific and testable). The read-only Researcher is a limit case worth noting: it cannot execute any fix by design, so every actionable finding it surfaces is, in effect, an escalation handed to another role. For that role the trigger does not gate rare exceptions — it describes the role's entire output, which is why the Researcher's discipline is to make findings explicit and routable rather than to act on them silently.
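The trigger condition is specific enough to express as a function. The following is a sketch under stated assumptions: the function name and parameters are hypothetical, and only the quoted sentence is taken from the required form above.

```python
from typing import Optional

# Wording taken from the required form; {action} fills the [X] placeholder.
ESCALATION_TEMPLATE = (
    "I believe the fix requires {action}. "
    "This appears outside my current authorization. Should I proceed?"
)


def escalation_statement(identified_fix: Optional[str], can_execute: bool) -> Optional[str]:
    """Return the required statement when a fix is identified but cannot be executed.

    Returns None when the trigger does not fire: either no fix has been
    identified, or the agent is authorized to execute it.
    """
    if identified_fix is None or can_execute:
        return None
    return ESCALATION_TEMPLATE.format(action=identified_fix)
```

Because the trigger depends on two observable facts (a fix exists, and it cannot be executed), a reviewer can test after the fact whether the statement should have appeared in a given session.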

5.3 Session-End Disclosure Requirement

Require each agent to disclose withheld actions at the end of every session — actions identified but not taken due to role constraints. This creates a forcing function for surfacing blocked paths that were not caught by the escalation trigger during the session.

Required form: "Deferred: [what was identified but not done, and why — 'none' if nothing withheld]. Writing 'none' requires checking. That check is the point."

The explicit note that "writing 'none' requires checking" is load-bearing. Without it, agents default to writing "none" as a formality. With it, the act of writing "none" requires the agent to have actively reviewed the session for withheld actions.
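One way to make the check non-optional is to refuse to emit the line until a review has been recorded. This is a hypothetical sketch: the `reviewed` flag and the function name are assumptions, and a flag cannot prove that the review was thorough, only that it was claimed.

```python
def deferred_line(withheld: list, reviewed: bool) -> str:
    """Build the session-close 'Deferred:' line.

    `withheld` lists actions identified but not taken, each with its reason.
    `reviewed` records that the session was checked for withheld actions;
    without it, no line is produced, including 'none'.
    """
    if not reviewed:
        raise ValueError("review the session for withheld actions before closing")
    if not withheld:
        return "Deferred: none"
    return "Deferred: " + "; ".join(withheld)
```

Raising on an unreviewed session turns a reflexive "none" into an explicit claim that a check took place, which is the property the required form is trying to secure.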

6. Application to Multi-Agent Systems

Silent competence scales with the number of roles and the complexity of the role boundaries. A two-role system has one boundary; a three-role system has three; a four-role system has six. Each boundary is a potential site for a blocked path that may not be escalated.
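The counts above follow from treating every pair of roles as sharing one boundary, which gives n(n-1)/2 boundaries for n roles. A short sketch, assuming that pairwise model (systems in which some roles never interact would have fewer):

```python
from math import comb


def role_boundaries(n_roles: int) -> int:
    """Number of pairwise boundaries among n roles: n choose 2."""
    if n_roles < 0:
        raise ValueError("n_roles must be non-negative")
    return comb(n_roles, 2)
```

Growth is quadratic: under the same assumption, doubling the number of roles roughly quadruples the number of boundaries that need an escalation path.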

Multi-agent LLM frameworks — architectures in which multiple model instances coordinate to complete tasks — are an active development area. Wu et al. (2023) introduce AutoGen, a framework enabling multi-agent LLM conversations with flexible role definitions.[^4] Hong et al. (2024) describe MetaGPT, a meta-programming framework assigning human-analog roles (product manager, engineer, QA) to agents in a software development pipeline.[^5] Neither framework addresses the silent competence failure mode: both focus on role-appropriate task execution and inter-agent communication, not on detecting or surfacing blocked paths at role boundaries. The failure mode becomes more consequential as frameworks of this type are adopted in high-stakes environments where undetected withheld information has organizational cost.

The ARCHER session-close format addresses this for the specific case of three coordinating roles. For larger systems, the same structure applies with the boundary count increasing: each role needs its own escalation trigger and session-end disclosure requirement, specific to the action classes that are most likely to be blocked by its particular role constraints.

7. Methodology

The case evidence in this paper is derived from two sources: (1) direct observation of the PT-EXPLOIT-05 incident documented in ARCHER build journal article 9, traceable to the GitHub issue history for that session; (2) a retrospective audit of the ARCHER issue backlog that identified four additional cases exhibiting the same behavioral signature.

The backlog audit reviewed all open GitHub issues in the ARCHER project at the time of the PT-EXPLOIT-05 resolution. Issues were evaluated against the silent competence criterion (any session record where the correct fix was identified during the session and neither executed nor escalated within the same session) by reviewing each issue's comment history and the session transcripts linked in the issue body. Cases were excluded if the fix was outside the agent's technical capability (not a role constraint problem) or if the session record was insufficient to determine whether identification had occurred. The four cases meeting the criterion had each been classified in the issue backlog as hard technical problems; retrospective review established that each had an actionable fix, and that the obstacle was suppression of that fix rather than technical difficulty.
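The inclusion and exclusion rules can be stated as a predicate over reviewed records. The sketch below is an illustration of the stated criterion, not the audit's actual tooling (the audit was a manual review); `IssueRecord` and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IssueRecord:
    """Hypothetical reviewer summary of one issue's session record."""
    issue_id: str
    fix_identified: Optional[bool]  # None: record insufficient to determine
    fix_executed: bool
    fix_escalated: bool
    within_technical_capability: bool


def meets_criterion(record: IssueRecord) -> bool:
    """Apply the silent competence criterion with both exclusions."""
    if record.fix_identified is None:
        return False  # excluded: cannot tell whether identification occurred
    if not record.within_technical_capability:
        return False  # excluded: a capability limit, not a role constraint
    return (
        record.fix_identified
        and not record.fix_executed
        and not record.fix_escalated
    )


def audit(records: list) -> list:
    """Return the issue IDs that meet the criterion."""
    return [r.issue_id for r in records if meets_criterion(r)]
```

Keeping "insufficient record" as a distinct value rather than folding it into `False` preserves the difference between a case that was ruled out and a case that could not be assessed.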

Theoretical grounding. The information asymmetry that produces silent competence is structurally analogous to the principal-agent problem in economics: the principal (user) delegates a task to an agent who possesses information the principal cannot directly observe, creating conditions under which agent behavior may diverge from principal interests.[^1] The distinguishing feature of role-constrained silent competence is that the asymmetry is not incentive-driven — the agent does not benefit from withholding — but structural: role boundaries make non-disclosure the path of least resistance. Principal-agent theory addresses information asymmetry through contract design; the structural remediations in Section 5 are the role-constraint equivalent. Escalation triggers make non-disclosure structurally non-compliant; session-end disclosure requirements make the act of withholding an observable failure rather than an invisible default.

The organizational behavior literature on employee silence provides a parallel at the group level. Morrison and Milliken (2000) characterize organizational silence as a systemic barrier to organizational learning, identifying the conditions under which individuals with actionable information default to withholding it rather than speaking upward.[^2] Milliken, Morrison, and Hewlin (2003) document the specific mechanisms: fear of negative response to disclosure, uncertainty about whether the information is appropriate to raise, and absence of clear channels for doing so.[^3] Role-constrained AI agents exhibit the structural equivalent of all three: role documents that do not explicitly authorize disclosure of blocked paths create the ambiguity that the organizational silence literature identifies as the primary driver of withholding. Section 5's remediations address each mechanism directly — pre-authorization reduces ambiguity, escalation triggers create explicit channels, and session-end disclosure requirements establish the expectation that disclosure is required rather than optional.

Cross-reference. The PT-EXPLOIT-05 incident and the four backlog cases are also analyzed in §4.5 of the companion paper The Centaur Framework[^6] from the framework compliance evaluation perspective, where the same pattern is the evidence base for design requirement X3 — silent competence is structurally prevented — in the collaboration layer.

Limitations. A case study establishes that a failure mode exists and characterizes its form. It does not establish frequency across systems, generalizability to architectures with different role enforcement mechanisms, or remediation efficacy. The remediation proposals in Section 5 are derived from design principles applied to the identified mechanism. Whether those principles suppress silent competence to a measurable degree requires prospective measurement under controlled conditions (see Section 8).

8. Reproducibility

The ARCHER case evidence is verifiable by inspection of:

- Build journal article 9, "When the Agent Knows but Won't Act": the PT-EXPLOIT-05 incident narrative, including the full diagnosis the Coder produced before the conflict was surfaced
- ARCHER CLAUDE.md: current escalation trigger language and session-close format (the scope stall rule and Deferred line)
- ARCHER session-end events: all exits include a withheld_actions field; absence is logged as a boundary_violation event
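The last item describes a code-layer check. A minimal sketch of how such a check might look, assuming session-end events are dictionaries: only the names `withheld_actions` and `boundary_violation` come from the text, while the event shape, the `session_id` key, and the function name are assumptions.

```python
def audit_session_end(event: dict) -> list:
    """Return the audit events to log for one session-end exit.

    An exit that omits the withheld_actions field yields a
    boundary_violation event; an exit that includes it (even as an
    empty list) yields nothing.
    """
    if "withheld_actions" not in event:
        return [
            {
                "type": "boundary_violation",
                "session_id": event.get("session_id"),  # hypothetical key
                "reason": "session-end exit missing withheld_actions",
            }
        ]
    return []
```

Checking for the presence of the field, rather than its truthiness, keeps an explicit empty list distinct from a missing disclosure.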

Evaluation design for remediation efficacy.

A controlled evaluation of Section 5 remediation efficacy requires four components:

- Environment: a multi-role system with at least two roles, explicitly documented scope boundaries, and a mechanism for recording what each session surfaced versus what it withheld. The ARCHER four-role system (Coder, Auditor, Scribe, and a read-only Researcher) is the reference implementation.
- Task set: N ≥ 20 pre-specified tasks where the correct resolution is known in advance and requires an action at or near a defined role boundary. Tasks should be drawn from the same distribution as real operational tasks, not constructed to trigger boundaries artificially, so that disclosure rates are ecologically valid.
- Conditions: four conditions run on the same task set: (a) no Section 5 remediations present in role documents; (b) pre-authorized action classes only; (c) escalation trigger only; (d) all three remediations. Sessions-to-correct-resolution is the primary metric; disclosures-per-session is the secondary metric.
- Outcome coding: session end states coded by a reviewer who does not know which condition each session ran under. The withheld_actions field on all ARCHER session-end exits provides a code-layer observable that enables automated aggregation across N sessions without full manual transcript review, an implementation advantage the evaluation design should exploit rather than duplicate with manual coding where avoidable.
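The two metrics can be aggregated from session records along these lines. This is a sketch under stated assumptions, not an existing ARCHER script: each session is assumed to be a dictionary with `condition`, `task_id`, a 1-based `session_index` within that task, a reviewer-coded `resolved` flag, and the `withheld_actions` list.

```python
from collections import defaultdict
from statistics import mean


def summarize(sessions: list) -> dict:
    """Compute both metrics per condition from coded session records."""
    first_resolved = {}              # (condition, task_id) -> first resolving index
    disclosures = defaultdict(list)  # condition -> disclosure count per session

    for s in sorted(sessions, key=lambda s: s["session_index"]):
        disclosures[s["condition"]].append(len(s["withheld_actions"]))
        key = (s["condition"], s["task_id"])
        if s["resolved"] and key not in first_resolved:
            first_resolved[key] = s["session_index"]

    to_resolution = defaultdict(list)
    for (condition, _task), index in first_resolved.items():
        to_resolution[condition].append(index)

    return {
        condition: {
            # Primary metric: averaged over tasks that reached resolution.
            "mean_sessions_to_resolution": (
                mean(to_resolution[condition])
                if to_resolution[condition]
                else float("nan")
            ),
            # Secondary metric: averaged over all sessions in the condition.
            "disclosures_per_session": mean(counts),
        }
        for condition, counts in disclosures.items()
    }
```

Tasks that never resolve are left out of the primary metric in this sketch. A real analysis would need to report them separately, since dropping them silently would flatter the weaker conditions.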

Controlled evaluation not yet conducted. Case evidence establishes the failure mode and characterizes its cost structure; prospective measurement under the design above is required to establish remediation efficacy.

9. Recommendations

For teams designing multi-role AI systems:

Assume silent competence is present until you have actively checked for it. In any system with role constraints, the default behavior of a well-trained agent encountering a blocked path is to stay in lane and present alternatives. That is correct role behavior. It is also the silent competence failure mode. Both are true simultaneously.

Audit for blocked paths, not just for scope violations. Post-session audits typically look for agents acting outside their roles. The silent competence audit asks the opposite question: are there problems that weren't resolved, where the fix was within the agent's cognitive reach but outside its authorization? An unresolved problem with a known fix is a signal.

Make the escalation trigger a requirement, not a suggestion. "Consider disclosing blocked paths" is not a remediation. "If you have identified a fix and cannot execute it, you must surface it using this specific language" is a remediation. The specificity of the requirement determines whether it is followed under pressure.

Treat session-end disclosure as a heuristic, not a guarantee. The session-end disclosure requirement creates a forcing function for surfacing blocked paths. It does not guarantee all withheld actions are surfaced. An agent that generated "Deferred: none" may have genuinely reviewed the session and found nothing — or may have written "none" as a formality without checking. The escalation trigger catches blocked paths in-session; session-end disclosure catches what the escalation trigger missed; neither is complete. The combination is better than either alone; neither is a ground-truth audit. Treat the disclosure output as a signal, not a certification.

Use session-end disclosure as a quality signal. An agent that consistently writes "Deferred: none" without any sign of having checked should be treated as a signal, not a clean bill of health. Genuine disclosure — "Deferred: I identified that X was the likely fix for Y but did not execute because it requires Z authorization" — is evidence the system is working. Reflexive "none" responses are evidence of a different kind.
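A simple way to watch for reflexive answers is to track how often an agent's sessions close with "none". The helper below is a hypothetical sketch: it assumes the Deferred lines have already been extracted from the audit record as strings, and the choice of what rate warrants review is left to the team.

```python
def none_rate(deferred_lines: list) -> float:
    """Fraction of session-close lines that read 'Deferred: none'."""
    if not deferred_lines:
        raise ValueError("no sessions to evaluate")
    nones = sum(
        1 for line in deferred_lines
        if line.strip().lower() == "deferred: none"
    )
    return nones / len(deferred_lines)
```

A high rate does not show that the agent skipped the check; it only identifies where a manual review of transcripts is most likely to be worth the time.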

On the produce-vs-maintain distinction: Session-end disclosure is model-layer output; that output is then captured by code-layer logging. M3 in the Centaur Framework is not violated provided the disclosure is recorded by code-layer infrastructure rather than left as conversational text. The model produces; the code maintains. The value of session-end disclosure is not that it exists in the conversation, but that it enters the audit record.

On shadow sessions as a Tier 2 audit mechanism: Running a separate model instance post-session to identify bypassed solutions is worth evaluating as a heuristic supplement — but a second probabilistic model reviewing the first model's output is still probabilistic. It distributes inference risk rather than eliminating it. Ground-truth verification (C5) is the actual deterministic check.

10. Falsifiable Claims

1. Silent competence accounts for a measurable fraction of delayed resolutions in role-constrained AI systems. Prediction: systematic review of unresolved issues in multi-role AI development environments will find that > 20% of delays are attributable to blocked-path non-disclosure rather than technical difficulty. The ARCHER backlog audit (4 of N issues) provides directional evidence; systematic measurement requires broader sampling across multiple projects.
2. Pre-authorized action classes reduce silent competence incidents in their target domain. Prediction: after defining explicit pre-authorization for a specific action class, the rate of blocked-path non-disclosure for actions in that class drops measurably in subsequent sessions. Falsified if: pre-authorization has no effect on disclosure rate for the target action class.
3. Session-end disclosure requirements increase escalation rate. Prediction: agents operating under a session-end disclosure requirement will surface more blocked paths per session than agents without the requirement, measured over N sessions with equivalent task distributions. Falsified if: disclosure rate is equivalent with and without the requirement.
4. Silent competence frequency increases with role boundary clarity. Prediction: systems with well-defined, consistently enforced role boundaries produce more silent competence incidents than systems with vague or weakly enforced boundaries, because vague boundaries produce scope violations (visible) rather than blocked-path non-disclosure (invisible). Falsified if: role boundary clarity is uncorrelated with silent competence rate.
5. The three remediations are jointly sufficient to eliminate silent competence as a persistent failure mode. Prediction: systems implementing pre-authorized action classes, escalation triggers, and session-end disclosure requirements will have silent competence rates below a measurable threshold within N sessions of implementation. Status: pending (prospective evaluation not yet conducted).
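Testing the first claim means comparing an observed proportion against the 20% threshold, and small samples call for an interval rather than a point estimate. The sketch below uses the Wilson score interval, which is a choice made here for illustration and not a method specified by this paper; the function names and the default 95% level are likewise assumptions.

```python
from math import sqrt


def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for k successes in n trials (z=1.96 is ~95%)."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def supports_claim_1(k: int, n: int, threshold: float = 0.20) -> bool:
    """True when even the interval's lower bound exceeds the threshold.

    k: delays attributed to blocked-path non-disclosure
    n: all delayed resolutions reviewed
    """
    lower, _upper = wilson_interval(k, n)
    return lower > threshold
```

Requiring the lower bound to clear the threshold is deliberately conservative: a handful of cases in a small backlog can show a high raw proportion and still fail this test, which matches the paper's description of the backlog audit as directional evidence.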

References

[^1]: Ross, S. A. "The economic theory of agency: The principal's problem." American Economic Review, vol. 63, no. 2, May 1973, pp. 134–139.

[^2]: Morrison, E. W., & Milliken, F. J. "Organizational silence: A barrier to change and development in a pluralistic world." Academy of Management Review, vol. 25, no. 4, 2000, pp. 706–725. DOI: 10.5465/amr.2000.3707697

[^3]: Milliken, F. J., Morrison, E. W., & Hewlin, P. F. "An exploratory study of employee silence: Issues that employees don't communicate upward and why." Journal of Management Studies, vol. 40, no. 6, 2003, pp. 1453–1476. DOI: 10.1111/1467-6486.00387

[^4]: Wu, Q., Bansal, G., Zhang, J., Wu, Y., Li, B., Zhu, E., Jiang, L., Zhang, X., Zhang, S., Liu, J., Awadallah, A. H., White, R. W., Burger, D., & Wang, C. "AutoGen: Enabling Next-Generation LLM Applications via Multi-Agent Conversation." arXiv:2308.08155, 2023. arxiv.org/abs/2308.08155

[^5]: Hong, S., Zhuge, M., Chen, J., Zheng, X., Cheng, Y., Zhang, C., Wang, J., Wang, Z., Yau, S. K. S., Lin, Z., Zhou, L., Ran, C., Xiao, L., Wu, C., & Schmidhuber, J. "MetaGPT: Meta Programming for a Multi-Agent Collaborative Framework." 12th International Conference on Learning Representations (ICLR 2024). arXiv:2308.00352. arxiv.org/abs/2308.00352

[^6]: Hawkins, J. "The Centaur Framework: A Design Specification for Human-AI Collaboration in Security Operations." Centaur Security Labs, 2026. centaursecuritylabs.com/research/centaur-framework — §4.5 documents the PT-EXPLOIT-05 incident and derives requirement X3 from it.

Glossary

Escalation trigger: A defined condition under which an agent is required to pause and seek human authorization rather than proceeding autonomously. Examples include actions outside pre-authorized classes, novel tool use, or scope-adjacent findings. Triggers must be mechanically defined in code — not left to model judgment — because a model cannot reliably recognize its own boundary violations.

Multi-agent system: An architecture in which multiple AI agent instances operate with distinct roles, constraints, or capabilities. Role boundary definition and enforcement become critical as the number of agents and their interaction surface grows — each additional agent is an additional surface for silent competence.

Pre-authorized action class: A defined category of actions an agent may take without human confirmation for each instance. The set of pre-authorized classes constitutes the scope that human authorization covers in advance at session start. Actions outside the pre-authorized set require explicit per-instance confirmation.

Role constraints: Boundaries that define what an agent is permitted to do within a given session. In a three-layer Centaur architecture, role constraints are defined for the model layer (command generation, output interpretation), the code layer (routing, execution, halt detection), and the human layer (scope definition, authorization, final judgment). Constraints documented only in system prompts are procedural; constraints enforced by code are mechanical.

Scope creep: The gradual expansion of an agent's actions beyond the originally authorized scope. Often occurs without explicit intent when the model interprets findings as implicit authorization to investigate adjacent targets or techniques. Silent competence is scope creep's inverse: the agent stays within scope but fails to disclose the actions it considered and did not take.

Session-end disclosure: A required output at session termination that surfaces actions the agent considered but did not take due to role constraints. Converts blocked paths from invisible non-events into auditable decisions. Without session-end disclosure, a constrained agent and an incapable agent are indistinguishable from the session log.

Silent competence: The failure mode in which an agent correctly identifies a possible action — within its technical capability — but does not take it because of role constraints, and does not disclose that it considered and rejected the action. The action disappears from the audit trail: the human reviewer cannot distinguish a blocked path from a path the agent never identified.

About the author: Jay Hawkins spent twenty years in the U.S. Army, including a decade in cyber operations — serving at USCYBERCOM, USCENTCOM, USNORTHCOM, and USEUCOM — and holds an active TS/SCI clearance. He builds local-first AI security tools and writes about the methodology, the hard lessons, and the compliance implications of doing it in production. CEH, CHFI, Pentest+, Security+.


Centaur Security Labs — centaursecuritylabs.com