agent-evaluation
How to Install
Claude Code:
git clone https://github.com/sickn33/antigravity-awesome-skills && cp skills/SKILL.md ~/.claude/skills/
Cursor:
Copy SKILL.md into your .cursorrules file
Testing and benchmarking LLM agents, including behavioral testing, capability assessment, reliability metrics, and production monitoring, where even top agents achieve less than 50% on real-world benchm
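As a rough illustration of the reliability metrics mentioned in the description, the sketch below repeats each task several times and reports two numbers: the overall pass rate and the share of tasks that pass on every trial. It is not taken from the skill itself. The names `run_trials` and `reliability_report` are hypothetical, and the agent is assumed to be any callable that maps a prompt string to an output string.

```python
from typing import Callable, Dict, List


def run_trials(
    agent: Callable[[str], str],
    prompt: str,
    check: Callable[[str], bool],
    trials: int = 5,
) -> List[bool]:
    """Run the (hypothetical) agent callable `trials` times on one prompt
    and record whether each output satisfies `check`."""
    return [check(agent(prompt)) for _ in range(trials)]


def reliability_report(results: Dict[str, List[bool]]) -> Dict[str, float]:
    """Summarize per-task trial outcomes.

    pass_rate:            passed trials / all trials
    all_trials_pass_rate: tasks where every trial passed / all tasks
    """
    total = sum(len(r) for r in results.values())
    passed = sum(sum(r) for r in results.values())
    always = sum(1 for r in results.values() if r and all(r))
    return {
        "pass_rate": passed / total if total else 0.0,
        "all_trials_pass_rate": always / len(results) if results else 0.0,
    }


if __name__ == "__main__":
    # Stand-in agent for demonstration only; a real agent call goes here.
    def echo_agent(prompt: str) -> str:
        return prompt.upper()

    outcomes = {
        "greet": run_trials(echo_agent, "hi", lambda out: out == "HI", trials=3),
    }
    print(reliability_report(outcomes))
```

Reporting the all-trials figure alongside the average helps because an agent that succeeds on most runs but fails intermittently on a task can look better on pass rate alone than it behaves in production.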
Details
| Category | AI/ML → AI Agents |
| Source | https://github.com/sickn33/antigravity-awesome-skills |
| Stars | ★ 40K |
| Risk Level | N/A |
Related Skills
hosted-agents
Build background agents in sandboxed environments. Use for hosted coding agents, sandboxed VMs, Moda
lambda-lang
Native agent-to-agent language for compact multi-agent messaging. A shared tongue agents speak direc
llm-app-patterns
Production-ready patterns for building LLM applications, inspired by [Dify](https://github.com/langg
local-llm-expert
Master local LLM inference, model selection, VRAM optimization, and local deployment using Ollama, l