Verified catalog · capability
Capability packs
Does the agent do the task well? These packs test RAG correctness, tool-calling, and retrieval quality.
capability
A · Verified
Medical RAG — groundedness & abstention
Catches confident fabrication with fake citations. Scores groundedness, citation accuracy, and whether the agent abstains when evidence is missing.
RAG · medical literature · zero-hallucination
EU AI Act — Art. 53 / high-risk
€79 · AG-26-0142
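For illustration only, here is a minimal Python sketch of how two of the checks named in this pack (citation accuracy and abstention) might be scored. It is not the pack's actual implementation: the `Answer` type, both function names, the document IDs, and the scoring rules are assumptions made for this example. Groundedness scoring is left out because it typically needs a model-based or entailment check.

```python
from dataclasses import dataclass, field


@dataclass
class Answer:
    """Hypothetical shape of an agent answer; not the pack's real schema."""
    text: str
    cited_ids: list[str] = field(default_factory=list)
    abstained: bool = False


def citation_accuracy(answer: Answer, retrieved_ids: set[str]) -> float:
    """Fraction of cited IDs that actually appear in the retrieved evidence.

    Assumed rule: an answer with no citations scores 1.0 only if it abstained.
    """
    if not answer.cited_ids:
        return 1.0 if answer.abstained else 0.0
    valid = sum(1 for cid in answer.cited_ids if cid in retrieved_ids)
    return valid / len(answer.cited_ids)


def abstention_correct(answer: Answer, evidence_available: bool) -> bool:
    """Assumed rule: the agent should abstain exactly when evidence is missing."""
    return answer.abstained == (not evidence_available)
```

Under these assumptions, an answer that cites one retrieved document and one invented document scores 0.5 on citation accuracy, and an answer given without any supporting evidence fails the abstention check.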
capability
A · Verified
Tool-calling correctness
Verifies function/tool selection, argument correctness, and recovery from tool errors. Deterministic where possible; judge only for free-form fields.
Tool-calling · general · medium-risk
€49 · AG-26-0143
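As a rough sketch of what "deterministic where possible; judge only for free-form fields" could look like in practice, the Python example below compares a tool call against an expected call by exact match and skips fields marked as free-form. The `ToolCall` type, the `check_tool_call` function, and the tool and argument names are all hypothetical, and the pack's real checks may differ. Recovery from tool errors is not covered here.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolCall:
    """Hypothetical record of one tool invocation made by the agent."""
    name: str
    arguments: dict[str, Any]


def check_tool_call(
    actual: ToolCall,
    expected: ToolCall,
    free_form: frozenset[str] = frozenset(),
) -> dict[str, bool]:
    """Exact-match check on tool name and on every non-free-form argument.

    Keys listed in `free_form` are skipped here; in this sketch they would be
    handed to a separate judge step that is not shown.
    """
    name_ok = actual.name == expected.name
    strict_keys = set(expected.arguments) - free_form
    args_ok = name_ok and all(
        key in actual.arguments and actual.arguments[key] == expected.arguments[key]
        for key in strict_keys
    )
    return {"tool_selected": name_ok, "arguments_correct": args_ok}
```

In this sketch, a call that picks the right tool and matches every strict argument passes both checks even if a free-form field such as a note differs, while a call to the wrong tool fails both.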