Verified catalog · test method

RAG groundedness

Does the agent answer only from the sources it's given, cite the right one, and abstain when the evidence doesn't support an answer? The same test is applied across five domains: medical, legal, finance, support, and general knowledge.
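The citation and abstention checks described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the catalog's harness and not the ragas API: the names `Case`, `Response`, and `verdict`, and the four verdict labels, are assumptions made up for this example. It also assumes each test case records which source (if any) supports an answer; judging whether the answer text itself is grounded in that source would need a separate check that is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Case:
    """One test question (hypothetical structure, not from the catalog)."""
    question: str
    sources: dict[str, str]        # source id -> passage text given to the agent
    supporting_id: Optional[str]   # id of the passage that supports an answer,
                                   # or None when no passage does


@dataclass
class Response:
    """What the agent returned for a case (hypothetical structure)."""
    abstained: bool
    cited_id: Optional[str] = None


def verdict(case: Case, response: Response) -> str:
    """Classify a response by citation accuracy and abstention only."""
    if case.supporting_id is None:
        # No evidence exists: the only acceptable behaviour is to abstain.
        return "pass" if response.abstained else "fabricated"
    if response.abstained:
        # Evidence exists but the agent declined to answer.
        return "over_abstained"
    if response.cited_id != case.supporting_id:
        # Answered, but pointed at the wrong source (or at none).
        return "wrong_citation"
    return "pass"


covered = Case("What is the notice period?", {"c1": "Notice: 30 days."}, "c1")
uncovered = Case("Who pays shipping?", {"c1": "Notice: 30 days."}, None)

print(verdict(covered, Response(abstained=False, cited_id="c1")))
print(verdict(uncovered, Response(abstained=False, cited_id="c1")))
```

Separating "fabricated" from "wrong_citation" keeps the two failure modes named in the test description distinct: answering without evidence versus answering from the wrong evidence.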

capability

Medical RAG — groundedness & abstention

Catches confident fabrication with fake citations. Scores groundedness, citation accuracy, and whether the agent abstains when evidence is missing.

RAG · medical literature · zero-hallucination
RAG groundedness · Medical · High risk
EU AI Act — Art. 53 / high-risk
79 one-time · ragas framework · AG-26-0142
capability

Legal contract RAG — groundedness & abstention

Verifies a contract-QA agent cites the actual clause it answers from, and refuses to answer questions the contract doesn't cover.

RAG · contract review · medium-risk
RAG groundedness · Legal · Medium risk
79 one-time · ragas framework · AG-26-0143
capability

Financial reporting RAG — groundedness & abstention

Checks that an accounting-policy assistant grounds answers in the actual policy note cited, and doesn't fabricate figures for uncovered questions.

RAG · financial reporting · high-risk
RAG groundedness · Finance · High risk
EU AI Act — Art. 53 / high-risk
89 one-time · ragas framework · AG-26-0144
capability

Customer-support RAG — groundedness & abstention

Verifies a support bot answers policy questions from the actual help-doc cited, and declines account-specific questions it has no evidence for.

RAG · customer support · low-risk
RAG groundedness · Support · Low risk
49 one-time · ragas framework · AG-26-0145
capability

General-knowledge RAG — groundedness & abstention

A control cell: general-reference Q&A used to baseline groundedness and abstention scoring before applying the harness to specialized domains.

RAG · general reference · low-risk
RAG groundedness · General · Low risk
29 one-time · ragas framework · AG-26-0146
capability

General RAG groundedness — draft submission

An early-draft submission that hasn't cleared verification yet — held back rather than approved for sale. The report below shows exactly which checks it didn't pass.

RAG · general reference · low-risk
RAG groundedness · General · Low risk
Not for sale · ragas framework