RAG groundedness
Does the agent answer only from the sources it's given, cite the right one, and abstain when the evidence doesn't support an answer? The same test is applied across domains: medical, legal, finance, support, and general knowledge.
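As a rough illustration of the three checks named above, here is a minimal Python sketch of how a single test case might be scored. It is not the harness's actual implementation: the `Case` and `Response` types, the `score_case` function, and the 0.5 token-overlap threshold are all hypothetical, and the lexical overlap used for groundedness is only a crude stand-in for whatever judgment the real checks apply.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Case:
    """One test case (hypothetical shape)."""
    sources: Dict[str, str]       # source id -> text handed to the agent
    gold_source: Optional[str]    # id of the supporting source, None if unanswerable


@dataclass
class Response:
    """What the agent returned (hypothetical shape)."""
    answer: str
    cited: Optional[str]          # source id the agent cited, if any
    abstained: bool


def score_case(case: Case, resp: Response, threshold: float = 0.5) -> dict:
    """Score abstention, citation accuracy, and a crude groundedness proxy."""
    answerable = case.gold_source is not None

    if resp.abstained:
        # Abstaining is correct only when no source supports an answer.
        return {"abstention_correct": not answerable,
                "citation_correct": None,
                "grounded": None}

    # Assumption: groundedness is approximated by the share of answer tokens
    # that also appear in the cited source. Real checks are likely stricter.
    cited_text = case.sources.get(resp.cited or "", "")
    answer_tokens = set(resp.answer.lower().split())
    source_tokens = set(cited_text.lower().split())
    overlap = (len(answer_tokens & source_tokens) / len(answer_tokens)
               if answer_tokens else 0.0)

    return {"abstention_correct": answerable,  # answering is right only if answerable
            "citation_correct": answerable and resp.cited == case.gold_source,
            "grounded": overlap >= threshold}


if __name__ == "__main__":
    docs = {"s1": "Refunds are issued within 30 days of purchase."}
    covered = Case(sources=docs, gold_source="s1")
    uncovered = Case(sources=docs, gold_source=None)

    print(score_case(covered, Response("refunds are issued within 30 days", "s1", False)))
    print(score_case(uncovered, Response("", None, True)))
    print(score_case(uncovered, Response("Your balance is $40", "s9", False)))
```

The third call shows the failure these cells are meant to catch: an answer to an uncovered question, with a citation to a source that was never provided, fails all three checks.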
Medical RAG — groundedness & abstention
Catches confident fabrication with fake citations. Scores groundedness, citation accuracy, and whether the agent abstains when evidence is missing.
Legal contract RAG — groundedness & abstention
Verifies a contract-QA agent cites the actual clause it answers from, and refuses to answer questions the contract doesn't cover.
Financial reporting RAG — groundedness & abstention
Checks that an accounting-policy assistant grounds its answers in the policy note it cites, and doesn't fabricate figures for questions the policy doesn't cover.
Customer-support RAG — groundedness & abstention
Verifies a support bot answers policy questions from the help doc it cites, and declines account-specific questions it has no evidence for.
General-knowledge RAG — groundedness & abstention
A control cell: general-reference Q&A used to baseline groundedness and abstention scoring before applying the harness to specialized domains.
General RAG groundedness — draft submission
An early-draft submission that hasn't cleared verification yet, so it is held back rather than approved for sale. The report below shows exactly which checks it didn't pass.