
EvoSkill: A Self-Evolving Framework for AI Agent Skill Discovery

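📅 2026-03-11 21:44 elvis Artificial Intelligence 2 min read 1317 words Score: 84
EvoSkill AI Agent Multi-Agent Systems LLM Autonomous Agents
📌 One-sentence summary: EvoSkill is a multi-agent framework that automatically discovers and refines agent skills through iterative failure analysis and Pareto-frontier selection. 📝 Detailed summary: This tweet introduces EvoSkill, a new research framework that aims to automate the creation of AI agent skills rather than relying on hand-crafted design. The system uses three collaborating agents (Executor, Proposer, and Skill-Builder) that diagnose execution failures and materialize new skills as structured folders. It shows a notable gain on benchmarks such as OfficeQA (from 60.6% to 67.9%) and, while keeping the base model frozen, demonstrates strong…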

A self-evolving framework to discover and refine agent skills. Most agent skills I see today are hand-crafted or poorly designed by an agent.

Multi-agent systems for building skills look promising.

This paper introduces EvoSkill, a self-evolving framework that automatically discovers and refines agent skills through iterative failure analysis.

EvoSkill analyzes execution failures, proposes new skills or edits to existing ones, and materializes them into structured, reusable skill folders.

Three collaborating agents drive the entire process: an Executor that runs tasks, a Proposer that diagnoses failures, and a Skill-Builder that creates concrete skill folders.
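To make the division of labor concrete, here is a minimal Python sketch of one way the three roles could be wired into a loop. Everything in it is an assumption for illustration: the names `executor`, `proposer`, `skill_builder`, `evolve`, and `toy_solve` are hypothetical, the Proposer and Skill-Builder are plain functions standing in for LLM calls, and a skill is an in-memory object rather than a folder on disk. It also leaves out the selection step, so it is not the paper's implementation.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    # Hypothetical stand-in for a skill folder (name plus instructions).
    name: str
    instructions: str


@dataclass
class Failure:
    task: str
    expected: str
    got: str


def executor(tasks, skills, solve):
    """Run every task with the current skills and collect the failures."""
    failures = []
    for question, expected in tasks:
        got = solve(question, skills)
        if got != expected:
            failures.append(Failure(question, expected, got))
    return failures


def proposer(failures):
    """Diagnose failures and propose one new skill (stub for an LLM call)."""
    if not failures:
        return None
    return f"handle:{failures[0].task}"


def skill_builder(proposal):
    """Turn a proposal into a concrete skill (stub for writing a folder)."""
    return Skill(name=proposal, instructions=f"Guidance for {proposal}")


def evolve(tasks, solve, max_rounds=5):
    """Iterate: execute, diagnose failures, build a skill, repeat."""
    skills = []
    for _ in range(max_rounds):
        failures = executor(tasks, skills, solve)
        proposal = proposer(failures)
        if proposal is None:  # nothing left to fix
            break
        skills.append(skill_builder(proposal))
    return skills


def toy_solve(question, skills):
    """Toy solver: answers correctly only when a matching skill exists."""
    known = {s.name for s in skills}
    return question.upper() if f"handle:{question}" in known else "?"


tasks = [("a", "A"), ("b", "B")]
skills = evolve(tasks, toy_solve)
print([s.name for s in skills])
```

In this toy setup each round fixes one failing task, so the loop stops after two skills have been added and the Executor reports no remaining failures.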

A Pareto frontier governs selection, retaining only skills that improve held-out validation performance while keeping the underlying model frozen.
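The thread does not say which objectives the frontier is computed over, so the following is only a generic sketch of Pareto-frontier selection. It assumes each candidate carries a tuple of held-out validation scores where higher is better; the candidate names and numbers are made up for illustration.

```python
from typing import Dict, List, Tuple

Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """a dominates b if it is at least as good everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_frontier(candidates: Dict[str, Scores]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# Hypothetical validation scores on two held-out splits (illustrative only).
candidates = {
    "baseline": (0.60, 0.50),
    "skill_a": (0.66, 0.50),
    "skill_b": (0.58, 0.62),
    "skill_c": (0.55, 0.45),
}
frontier = pareto_frontier(candidates)
print(frontier)
```

Here `skill_a` and `skill_b` both survive because each is best on one split, while `baseline` and `skill_c` are dropped because another candidate matches or beats them on every split.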

On OfficeQA, EvoSkill improves Claude Code with Opus 4.5 from 60.6% to 67.9% exact-match accuracy. On SealQA, it yields a 12.1% gain. Skills evolved on SealQA transfer zero-shot to BrowseComp, improving accuracy by 5.3% without modification.
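The thread does not state whether the SealQA and BrowseComp figures are percentage points or relative improvements. For OfficeQA, where both endpoints are given, the two readings can be computed directly; the helper name `gains` is just for illustration.

```python
def gains(before: float, after: float):
    """Return (absolute gain in points, relative gain in percent), rounded to 0.1."""
    absolute = round(after - before, 1)
    relative = round(100 * (after - before) / before, 1)
    return absolute, relative


officeqa = gains(60.6, 67.9)
print(officeqa)  # (7.3, 12.0)
```

So the OfficeQA result is a 7.3-point absolute gain, or roughly a 12.0% relative improvement over the baseline.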

I will continue to track this line of research closely. I think it's really important.

Paper: arxiv.org/abs/2603.02766

Learn to build effective AI agents in our academy: academy.dair.ai

Published: 2026-03-11 21:44:05 Indexed: 2026-03-12 00:01:10
