agent-evaluation
How to Install
Claude Code:
git clone --depth 1 https://github.com/sickn33/antigravity-awesome-skills.git && cp antigravity-awesome-skills/skills/SKILL.md ~/.claude/skills/SKILL.md
Cursor:
Copy the SKILL.md content into your .cursorrules file
Testing and benchmarking LLM agents including behavioral testing, capability assessment, reliability metrics, and production monitoring, where even top agents achieve less than 50% on real-world benchm
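As a rough illustration of the "reliability metrics" mentioned in the description, the sketch below runs one task several times and reports the fraction of passing runs. It is not taken from the skill itself: `pass_rate`, `make_scripted_agent`, and the string-in/string-out agent interface are hypothetical names chosen for this example, and the scripted agent stands in for a real LLM call.

```python
from itertools import cycle
from typing import Callable, Sequence


def pass_rate(
    agent: Callable[[str], str],
    task: str,
    check: Callable[[str], bool],
    trials: int = 5,
) -> float:
    """Run the same task `trials` times and return the fraction of passing runs."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    passed = sum(1 for _ in range(trials) if check(agent(task)))
    return passed / trials


def make_scripted_agent(outputs: Sequence[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM agent: replays canned outputs in order."""
    replay = cycle(outputs)

    def agent(task: str) -> str:
        # A real agent would use `task`; the stub ignores it on purpose.
        return next(replay)

    return agent


if __name__ == "__main__":
    # Three correct answers and one wrong one across four trials.
    agent = make_scripted_agent(["4", "5", "4", "4"])
    rate = pass_rate(agent, "What is 2 + 2?", lambda out: out.strip() == "4", trials=4)
    print(f"pass rate: {rate:.2f}")
```

Repeating each task matters because agent output is usually non-deterministic, so a single passing run says little about how often the agent succeeds.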
Details
| Category | AI/ML → ml |
| Source | https://github.com/sickn33/antigravity-awesome-skills |
| Stars | ★ 40.9K |
| Risk Level | N/A |
Related Skills
hosted-agents
Hosted Agent Infrastructure: Hosted agents run in remote sandboxed environments rather than on loc
seo-dataforseo
Use DataForSEO for live SERPs, keyword metrics, backlinks, competitor analysis, on-page checks, and
agent-tool-builder
Tools are how AI agents interact with the world. A well-designed tool is the difference between an a