Agent Evaluation
by muratcankoylan • Context Engineering
Methods for evaluating agent performance including LLM-as-Judge patterns, metrics design, and benchmarking
1,580 downloads
205 stars
~580 tokens
Quick Install
One command to add this skill
Terminal
$ mkdir -p ~/.claude/skills/context-engineering && curl -L https://raw.githubusercontent.com/muratcankoylan/Agent-Skills-for-Context-Engineering/main/skills/evaluation/SKILL.md > ~/.claude/skills/context-engineering/evaluation-SKILL.md
Instructions
SKILL.md
Security & Permissions
2 permissions required
- No network access required
- Can modify files on disk
- Executes shell commands
Details
- Published: 2026/01/10
- Language: markdown
- Token Est.: ~580
Tags
evaluation, metrics, benchmarks, llm-as-judge, testing
