ARIA — Capability & Limitations Statement
A direct account of what ARIA can do, what it cannot, how it sources its claims, and what the audit trail looks like — for compliance officers, internal review boards, and anyone evaluating whether ARIA's output meets their bar of evidence.
1. What ARIA is
ARIA is a domain-specialised AI assistant for security and defence due-diligence work. It combines reasoning from large language models (Anthropic Claude, DeepSeek, OpenAI, and others in a fail-over chain) with a constrained domain layer: live sanctions data, corporate registries, defence procurement signals, intel ledger, and a 23-clause behavioural constitution that governs every response.
It is built for: defence brokers, OEM export-control officers, compliance teams at defence buyers, government acquisition cells, and the banking / insurance functions that screen defence-sector counterparties. It is not a general-purpose chatbot, an investment-advice tool, or a substitute for licensed legal advice.
2. What ARIA does well
- Counterparty due-diligence — multi-source pipeline (registries → sanctions lists → media → adverse coverage → beneficial-ownership inference) producing a structured assessment with confidence tagging on every claim.
- Sanctions screening — live integration with OpenSanctions, OFAC SDN, UK OFSI, EU consolidated, UN Security Council, Swiss SECO, Canadian SEMA, and other consolidated lists. Daily sanctions-diff against a customer's watchlist.
- Document review — contract / RFQ / EUC parsing with clause-library comparison, deviation detection, and explicit partial-extraction discipline (a clause cannot be claimed missing if the parser truncated the document past it).
- Procurement & tender intelligence — autonomous monitoring of TED (EU), SAM.gov (US), GESPI (Portugal), and 15+ regional portals; tender comparator on demand.
- Audit-grade output — every reply is signed into a hash-chained audit log (HMAC-SHA256, production fingerprint published below). Every claim carries inline citations to its source. Reports can be exported as PDF with a verifiable signature.
- Multi-language source coverage — Portuguese, French, Spanish, Arabic, Russian, Mandarin sources searched in their native language, not just English translations.
3. What ARIA does NOT do
The following limitations are deliberate. They are constitutional — encoded into the system prompt and enforced by output guards — not bugs.
- ARIA does not provide legal advice. Outputs identifying compliance issues are indicators, not legal opinions. Final legal classification belongs to licensed counsel in the relevant jurisdiction.
- ARIA does not invent verifiable facts. Company registration numbers, addresses, NACE/SIC codes, director names, contract values, treaty article numbers — when these aren't found in a tool result or document, ARIA refuses to fill the gap. (Constitution clause 14.)
- ARIA does not profile entities with no data. When a tool returns zero usable data on an entity, ARIA replies that it has no information; it does not infer from URL slugs, names, or family suffixes. (Clause 9.)
- ARIA does not promote propaganda-tier sources to confirmed. Telegram channels and state-aligned media are monitored, but their content cannot reach [CONFIRMED]. (Clause 13b.)
- ARIA does not claim actions it did not perform. If a slash command did not execute in the current turn, ARIA does not claim it ran. (Clause 11.)
- ARIA does not review documents whose text was not parsed. If extraction failed or was truncated, ARIA refuses the review with an explicit message. (Clause 12.)
- ARIA is not a substitute for human compliance review. The audit log makes ARIA's reasoning replayable and challengeable, but the human is still the decision-maker.
4. Confidence taxonomy
Every material claim ARIA produces is tagged with one of five confidence levels:
| Tag | Meaning |
|---|---|
| [CONFIRMED] | Verified by a Tier 1a official source or two independent Tier 1b/2 sources in the current request context. |
| [PROBABLE] | Single high-quality source; no contradicting evidence found. |
| [ASSESSED] | ARIA's analytical reading; no direct source supports it but the inference chain is documented. |
| [UNCERTAIN] | Material gap exists; the answer may change with more data. |
| [SPECULATIVE] | Conjecture, useful for hypothesis-formation only. |
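
The constitution summary in section 8 states that a section header tag must reflect the weakest body claim, never the strongest. A minimal sketch of that rule follows. It assumes the five tags are ordered from strongest to weakest as they appear in the table above; the `Confidence` enum and the `weakest` helper are hypothetical names, not ARIA's code.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """The five tags from the table, weakest to strongest.

    Assumption: the table lists the tags in descending strength.
    """
    SPECULATIVE = 0
    UNCERTAIN = 1
    ASSESSED = 2
    PROBABLE = 3
    CONFIRMED = 4

    @property
    def tag(self) -> str:
        # Render the level in the bracketed form used in ARIA output.
        return f"[{self.name}]"


def weakest(levels):
    """Return the weakest confidence level among a section's claims."""
    levels = list(levels)
    if not levels:
        raise ValueError("a section needs at least one tagged claim")
    return min(levels)


# A section mixing three claim strengths gets the weakest one as its header tag.
section_claims = [Confidence.CONFIRMED, Confidence.PROBABLE, Confidence.ASSESSED]
header = weakest(section_claims)
print(header.tag)
```

Taking the minimum keeps a single strong claim from lifting the header above what the rest of the section supports.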
5. Source-tier hierarchy
Sources are classified into a five-tier hierarchy. ARIA's verification logic uses the tier to decide how many corroborating sources are needed before a fact reaches [CONFIRMED].
| Tier | Examples | Verification rule |
|---|---|---|
| Tier 1a official | Official registries (Companies House, Registo Comercial), sanctions lists (OFAC, OFSI), gazettes, court judgments, regulatory filings | Single source sufficient for verification. |
| Tier 1b authoritative | Government statements, central-bank reports, multilateral institutions (UN, World Bank, OECD, NATO), defence ministries | Two independent Tier 1b/2 needed. |
| Tier 2 established | Reuters, AP, AFP, FT, Bloomberg, Janes, regional papers of record | Two independent needed. |
| Tier 3 secondary | Industry trade press, OSINT aggregators, think tanks, NGOs | Three independent needed. |
| Tier 4 user-generated | Blogs, LinkedIn posts, Reddit, Twitter, forum threads | Cannot verify alone; routed to human approval. |
| Tier D propaganda | State-aligned channels (intelslava, mod_russia, Ukrainian and Russian Telegram channels) | Monitored for OSINT value; cannot reach [CONFIRMED]. |
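
The verification rules in the table can be read as a small decision function. The sketch below is illustrative only: the `Source` dataclass and `reaches_confirmed` are hypothetical names, "independent" is approximated as "distinct publisher", and each rule is applied on its own because the table does not say how sources from different tiers combine beyond the Tier 1b/2 pairing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    publisher: str  # stand-in for independence: distinct publishers count once each
    tier: str       # one of "1a", "1b", "2", "3", "4", "D"


def reaches_confirmed(sources) -> bool:
    """Apply the table's corroboration rules to a claim's sources."""
    publishers = {}
    for s in sources:
        publishers.setdefault(s.tier, set()).add(s.publisher)

    def distinct(*tiers) -> int:
        # Count distinct publishers across the given tiers.
        found = set()
        for t in tiers:
            found |= publishers.get(t, set())
        return len(found)

    if distinct("1a") >= 1:        # single official source is sufficient
        return True
    if distinct("1b", "2") >= 2:   # two independent Tier 1b/2 sources
        return True
    if distinct("3") >= 3:         # three independent Tier 3 sources
        return True
    # Tier 4 goes to human approval and Tier D can never confirm,
    # so neither contributes to an automatic [CONFIRMED].
    return False


print(reaches_confirmed([Source("Reuters", "2"), Source("NATO", "1b")]))
print(reaches_confirmed([Source("intelslava", "D"), Source("mod_russia", "D")]))
```

Under these assumptions, repeating the same publisher does not add corroboration, and no volume of Tier D material changes the outcome.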
6. Hallucination guards
Generic large language models hallucinate — they invent registry numbers, fabricate quotes, and fill data gaps with statistically plausible nonsense. ARIA constrains this behaviour at the prompt layer and the output layer:
- Constitution clause 14: verifiable facts (registration numbers, addresses, NACE codes, court citations, EIN/VAT, contract values, names of directors) cannot be stated unless quoted verbatim from a tool result, attached document, or RAG retrieval. Refusal is the safe fallback.
- Constitution clause 12: document review requires actual extracted text in context. A truncated PDF carries a [!PARTIAL EXTRACTION] banner; ARIA cannot claim a clause is absent from a section it never saw.
- Constitution clause 15: every tool-derived fact must carry an inline citation. The verifier flags ungrounded outputs as `no_citations`.
- Verification gate: replies tagged CRITICAL by classification logic are blocked from streaming until the verifier confirms grounded citations on every material claim.
- Output guards: officeholder, commitment, tool-claim, propaganda, and ground-truth guards run on every reply. Officeholder guard rejects unverified named appointments; commitment guard catches false promises ("I will deliver X by 4 AM"); tool-claim guard catches false action claims ("I have updated the watchlist").
- Adversarial test suite: 11 attack templates covering false-premise injection, authority spoofing, identity-spoof attacks, and gradual context manipulation. Run on every release; baseline reported alongside each launch.
7. Audit log specification
Every output ARIA produces is appended to a hash-chained audit log:
- Each entry is a JSON object containing `{ ts, subject, claim, sources, confidence, tier_breakdown, prev_hash, hash }`.
- Each entry's `hash` is `SHA-256(prev_hash || canonical(entry))`, forming a tamper-evident chain.
- The chain is HMAC-signed at production cutoff; the production fingerprint is `a39f3328d92bffe4`, signed since 2026-04-14T11:29:05Z.
- Audit-grade PDF exports (R-F43) carry a derived HMAC signature of `(content_sha256 || user_id || session_id || message_index || generated_at)`. Third-party verification is via the public `POST /api/reports/verify` endpoint, so a counterparty's compliance officer can confirm a forwarded PDF without an account.
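
The derived export signature can be sketched with Python's standard `hmac` module. This is an illustration under stated assumptions: the statement does not say how the five fields are encoded before signing, so the sketch joins them with a `|` delimiter, and it uses HMAC-SHA256 as named in section 2. `sign_export` and `verify_export` are hypothetical names, and the signing key stays server-side, which is why third parties go through the verification endpoint.

```python
import hashlib
import hmac


def sign_export(key: bytes, content: bytes, user_id: str, session_id: str,
                message_index: int, generated_at: str) -> str:
    """Derive an HMAC-SHA256 signature over the export's identifying fields."""
    content_sha256 = hashlib.sha256(content).hexdigest()
    # Assumption: fields are joined with "|" so field boundaries stay unambiguous.
    message = "|".join(
        [content_sha256, user_id, session_id, str(message_index), generated_at]
    ).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_export(key: bytes, signature: str, content: bytes, user_id: str,
                  session_id: str, message_index: int, generated_at: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_export(key, content, user_id, session_id,
                           message_index, generated_at)
    return hmac.compare_digest(expected, signature)


key = b"demo-key-not-a-real-secret"
sig = sign_export(key, b"%PDF-demo", "user-1", "sess-1", 3, "2026-04-14T11:29:05Z")
print(verify_export(key, sig, b"%PDF-demo", "user-1", "sess-1", 3, "2026-04-14T11:29:05Z"))
```

Binding the signature to the session and message index as well as the content hash means a valid PDF cannot be re-attributed to a different conversation without the check failing.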
Verifiable independently. A buyer's compliance team can extract any claim ARIA made, recompute its SHA-256, and confirm the chain. They can hold ARIA's output to the same standard they hold their own internal documents.
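
A minimal sketch of how such a chain could be built and rechecked is shown below. Several details are assumptions, since the statement does not define them: `canonical` is taken to be sorted-key compact JSON of the entry without its own `hash` field, `||` is taken to be byte concatenation, and the first entry uses an all-zero `prev_hash`. The function names and sample entries are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical prev_hash for the first entry


def canonical(entry: dict) -> bytes:
    # Assumption: sorted keys, compact separators, and the entry's own
    # "hash" field excluded so the hash never covers itself.
    body = {k: v for k, v in entry.items() if k != "hash"}
    return json.dumps(body, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def entry_hash(prev_hash: str, entry: dict) -> str:
    # SHA-256(prev_hash || canonical(entry)), with || read as concatenation.
    return hashlib.sha256(prev_hash.encode("ascii") + canonical(entry)).hexdigest()


def append(log: list, fields: dict) -> dict:
    prev = log[-1]["hash"] if log else GENESIS
    entry = dict(fields, prev_hash=prev)
    entry["hash"] = entry_hash(prev, entry)
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    prev = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev or entry["hash"] != entry_hash(prev, entry):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"ts": "2026-04-14T12:00:00Z", "subject": "Example Ltd",
             "claim": "Not listed on any screened sanctions list.",
             "sources": ["https://example.org/sdn"],
             "confidence": "[PROBABLE]", "tier_breakdown": {"1a": 1}})
append(log, {"ts": "2026-04-14T12:05:00Z", "subject": "Example Ltd",
             "claim": "Registered address unchanged.",
             "sources": ["https://example.org/registry/123"],
             "confidence": "[CONFIRMED]", "tier_breakdown": {"1a": 1}})
print(verify_chain(log))

log[0]["claim"] = "edited after the fact"  # any edit breaks every later link
print(verify_chain(log))
```

Because each hash covers the previous one, changing an early entry invalidates the rest of the chain, which is what lets a reviewer detect tampering without trusting the operator.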
8. Constitution — the 24 clauses (summary)
The full constitution is loaded at the top of every conversation (see aria_service/aria_engine.py). It is incident-anchored — every clause cites the past failure that motivated it. Summary:
- [UNCERTAIN — last known YYYY-MM].
- [TOOL: ...] block confirms it.
- [!PARTIAL EXTRACTION] banners govern truncated documents.
- [ASSESSED — single channel].
- [TOOL: ...] block carries [from <url>] in the same sentence.
- [CONFIRMED]; at most [ASSESSED — single source] until corroborated. Section header tags must reflect the weakest body claim, never the strongest. Code-level companion gate (R-5005) enforces the same rule on every Finding at the dataclass layer.
9. Data residency & processing
- Hosting region: United Kingdom (fly.io London region).
- Persistence: chromadb RAG store and intel ledger live on a fly.io persistent volume mounted at `/data`. Daily off-host backups to operator email, with configurable retention.
- LLM processing: requests are routed through Anthropic, DeepSeek, OpenAI, and other providers; provider terms govern that data plane.
- Customer chats: stored under the customer's user id; deletable by the user via `DELETE /api/aria/conversations/:id`.
- Audit log: persisted on the fly.io volume; not exported to third parties.
10. Known limitations & open work
- Adversarial baseline: target ≥95% before public launch.
- The single-machine fly.io deployment trades high availability (HA) for data coherence; re-architecture is needed before onboarding higher-tier customers.
- Russian and Mandarin (RU/ZH) source coverage is minimal; Portuguese, Spanish, French, and Arabic (PT/ES/FR/AR) coverage is deep.
- SOC 2 / ISO 27001 not yet certified; in roadmap.
- Equipment ↔ ECCN/Wassenaar mapping currently prompt-augmented, not lookup-driven; in roadmap.
11. Reporting issues
If ARIA produces an output that fails to meet the constitution above — particularly fabricated facts, false confirmation tags, or hallucinated citations — please report via:
- Email support@arkmurus.com with the conversation session id (visible in the chat URL or via `/api/aria/conversations`).
- Or use the in-product `/feedback` command in the chat.
Every report is added to the mistake ledger and used to harden the corresponding constitution clause. The audit log makes the failure replayable.
12. WhatsApp Connections
ARIA connects to WhatsApp groups via linked devices — each connection is a separate phone number and Baileys session. Link a device, check status, or remove a connection in the connection manager.