This plan captures the six architecture-specific risks and maps each one to likelihood, impact, accountable owner, and a concrete validation test. Priority fixes are highlighted first: precedent ingestion controls, strict untrusted-context separation, and signed typed decision binding.
Precedent poisoning & retrieval manipulation
Likelihood: High
Impact: Critical
Owner: Data governance + platform security
Validation test: Inject adversarial precedent samples into staging and verify trust-weighted retrieval, quarantine controls, and rollback release workflow.
Indirect prompt injection against critics & detectors
Likelihood: High
Impact: Critical
Owner: LLM safety engineering
Validation test: Run prompt-injection benchmark corpus against critic pipelines and confirm instruction-like context is detected, neutralized, and sandboxed.
Decision-binding gap between interpretation and action
Likelihood: Medium
Impact: Critical
Owner: Policy engine + application integration
Validation test: Fuzz downstream decision handlers and confirm only signed typed decision objects can trigger actions (narrative text must be ignored).
Threshold gaming & uncertainty-lane abuse
Likelihood: Medium
Impact: High
Owner: Detection operations
Validation test: Replay near-threshold probe campaigns and validate hysteresis, retry clustering, semantic rate limits, and queue-flood protections.
Evidence, audit & provenance tampering
Likelihood: Medium
Impact: Critical
Owner: Security + compliance
Validation test: Attempt log mutation/deletion in red-team scenario; verify append-only storage, bundle signatures, and cross-domain reconciliation alerts.
Supply-chain, fallback & API-surface exposure
Likelihood: Medium
Impact: High
Owner: Platform engineering
Validation test: Perform SBOM diff checks and fallback-chaos tests; confirm fail-closed behavior, pinned versions, and tenant-safe WebSocket authorization.