国产av一二三区|日本不卡动作网站|黄色天天久久影片|99草成人免费在线视频|AV三级片成人电影在线|成年人aV不卡免费播放|日韩无码成人一级片视频|人人看人人玩开心色AV|人妻系列在线观看|亚洲av无码一区二区三区在线播放

網(wǎng)易首頁(yè) > 網(wǎng)易號(hào) > 正文 申請(qǐng)入駐

Anthropic 官方指南:怎么給 Agent 設(shè)計(jì)工具

0
分享至

BLOG

本文翻譯自 Anthropic 官方博客「Seeing like an agent: how we design tools in Claude Code」,作者 Thariq Shihipar,Claude Code 團(tuán)隊(duì)工程師,今天發(fā)布

以下為逐段中英對(duì)照翻譯

構(gòu)建 Agent 最難的部分之一:設(shè)計(jì)工具

One of the hardest parts about building an agent harness is constructing its tools.

構(gòu)建 Agent harness 最困難的部分之一,是設(shè)計(jì)它的工具集

Claude acts completely through tool calling, but there are a number of ways tools can be constructed in the Claude API with primitives like bash, skills and code execution.

Claude 完全通過(guò)工具調(diào)用來(lái)行動(dòng)。在 Claude API 中,工具可以用 bash、skills、代碼執(zhí)行等基礎(chǔ)原語(yǔ)來(lái)構(gòu)建

So how do you design your agents' tools? Do you give it one general-purpose tool like bash or code execution? Or fifty specialized tools, one for each use case?

那你該怎么給 Agent 設(shè)計(jì)工具?給它一個(gè)通用工具(比如 bash 或代碼執(zhí)行)就夠了?還是做五十個(gè)專用工具,每個(gè)場(chǎng)景一個(gè)?

To put yourself in the mind of the model, imagine being given a difficult math problem. What tools would you want in order to solve it? It would depend on your own skill set!

要站在模型的角度想這個(gè)問(wèn)題,可以想象你面前有一道很難的數(shù)學(xué)題。你想要什么工具來(lái)解決它?答案取決于你自己的能力

Paper would be the minimum, but you'd be limited by manual calculations. A calculator would be better, but you would need to know how to operate the more advanced options. The fastest and most powerful option would be a computer, but you would have to know how to use it to write and execute code.

一張紙是最低配,但你只能手算。計(jì)算器好一些,但你得知道怎么用高級(jí)功能。最快最強(qiáng)的選擇是電腦,但你得會(huì)用它來(lái)寫和執(zhí)行代碼

This is a useful framework for designing your agent. You want to give it tools that are shaped to its own abilities. But how do you know what those abilities are? You pay attention, read its outputs, experiment. You learn to see like an agent.

這是一個(gè)很有用的設(shè)計(jì)框架。你要給 Agent 的工具,應(yīng)該貼合它自身的能力形狀。但你怎么知道它的能力是什么?你觀察它,讀它的輸出,反復(fù)實(shí)驗(yàn)。你學(xué)會(huì)「像 Agent 一樣看」

If you're building an agent, you'll face the same questions we did: when to add a tool, when to remove one, and how to tell the difference. Here's how we've answered them while building Claude Code, including where we got it wrong first.

如果你在做 Agent,你會(huì)面對(duì)和我們一樣的問(wèn)題:什么時(shí)候加工具,什么時(shí)候刪工具,怎么區(qū)分這兩種情況。下面是我們?cè)?Claude Code 的實(shí)際經(jīng)驗(yàn),包括一開始做錯(cuò)的地方

用 AskUserQuestion 工具改善提問(wèn)能力


三種方案的光譜:從無(wú)結(jié)構(gòu)到過(guò)度剛性,AskUserQuestion 工具落在中間

When building the AskUserQuestion tool, our goal was to improve Claude's ability to ask questions (often called elicitation).

設(shè)計(jì) AskUserQuestion 工具時(shí),我們的目標(biāo)是提升 Claude 向用戶提問(wèn)的能力(通常稱為 elicitation)

While Claude could just ask questions in plain text, we found answering those questions felt like they took an unnecessary amount of time. How could we lower this friction and increase the bandwidth of communication between the user and Claude?

雖然 Claude 可以用純文本提問(wèn),但我們發(fā)現(xiàn)回答這些問(wèn)題的體驗(yàn)很差,耗時(shí)太多。怎么降低這個(gè)摩擦,提升用戶和 Claude 之間的溝通帶寬?

第一次嘗試:修改 ExitPlanTool

The first approach we tried was adding a parameter to the ExitPlanTool to have an array of questions alongside the plan. This was the easiest fix to implement, but it confused Claude because we were simultaneously asking for a plan and a set of questions about the plan. What if the user's answers conflicted with what the plan said? Would Claude need to call the ExitPlanTool twice? We knew this tactic wouldn't work, so we went back to the drawing board.

我們第一個(gè)方案是給 ExitPlanTool 加一個(gè)參數(shù),讓它在輸出計(jì)劃的同時(shí)輸出一組問(wèn)題。這是最省事的改法,但它讓 Claude 很困惑:我們同時(shí)要求它做計(jì)劃和對(duì)計(jì)劃提問(wèn)。如果用戶的回答和計(jì)劃矛盾怎么辦?Claude 是不是得調(diào)兩次這個(gè)工具?我們知道這個(gè)方案行不通,于是回到原點(diǎn)

第二次嘗試:改變輸出格式

Next, we tried updating Claude's output instructions to serve a slightly modified markdown format that it could use to ask questions. For example, we could ask it to output a list of bullet point questions with alternatives in brackets. We could then parse and format that question as UI for the user.

接下來(lái),我們嘗試修改 Claude 的輸出指令,讓它用一種特殊的 Markdown 格式來(lái)提問(wèn)。比如用 bullet point 列出問(wèn)題,每個(gè)問(wèn)題后面用方括號(hào)給出選項(xiàng)。然后前端解析這個(gè)格式,渲染成 UI

Claude could usually produce this format, but not reliably. It would append extra sentences, drop options, or abandon the structure altogether. Onto the next approach.

Claude 大部分時(shí)候能生成這個(gè)格式,但不穩(wěn)定。它會(huì)在末尾多加一句話,漏掉選項(xiàng),或者干脆不用這個(gè)格式。下一個(gè)方案

第三次嘗試:AskUserQuestion 工具


AskUserQuestion 工具的實(shí)際界面

Finally, we landed on creating a tool that Claude could call at any point, but it was particularly prompted to do so during plan mode. When the tool triggered we would show a modal to display the questions and block the agent's loop until the user answered.

最終方案是做一個(gè)獨(dú)立的工具,Claude 可以在任何時(shí)候調(diào)用,但在規(guī)劃模式中會(huì)被特別引導(dǎo)去使用。工具觸發(fā)后彈出一個(gè)模態(tài)框顯示問(wèn)題,阻塞 Agent 循環(huán)直到用戶回答

This tool allowed us to prompt Claude for a structured output and it helped us ensure that Claude gave the user multiple options. It also gave users ways to compose this functionality, for example calling it in the Agent SDK or using referring to it in skills.

這個(gè)工具讓我們能引導(dǎo) Claude 輸出結(jié)構(gòu)化內(nèi)容,確保給用戶多個(gè)選項(xiàng)。它也給了用戶組合使用的空間,比如在 Agent SDK 或 Skills 中引用它

Most importantly, Claude seemed to like calling this tool and we found its outputs worked well. After all, even the best designed tool doesn't work if Claude doesn't understand how to call it.

最關(guān)鍵的一點(diǎn):Claude 喜歡調(diào)用這個(gè)工具,輸出質(zhì)量也好。畢竟,再好的工具設(shè)計(jì),如果模型不理解怎么調(diào)用,也是白搭

Is this the final form of elicitation in Claude Code? We doubt it. As Claude gets more capable, the tools that serve it have to evolve too. The next section shows a case where a tool that once helped started getting in the way.

這是 Claude Code 中 elicitation 的最終形態(tài)嗎?大概不是。隨著 Claude 能力提升,服務(wù)它的工具也必須跟著演進(jìn)。下一節(jié)會(huì)展示一個(gè)曾經(jīng)有用的工具后來(lái)開始礙事的案例

跟隨能力迭代:從 Todos 到 Tasks


從 Todos 到 Tasks:?jiǎn)?Agent 線性清單 → 多 Agent 協(xié)作任務(wù)圖

When we first launched Claude Code, we realized that the model needed a todo list to keep it on track. Todos could be written at the start and checked off as the model did work. To do this we gave Claude the TodoWrite tool, which would write or update Todos and display them to the user.

Claude Code 剛上線時(shí),我們發(fā)現(xiàn)模型需要一個(gè)待辦清單來(lái)保持專注。開工前列好待辦,做完一項(xiàng)勾一項(xiàng)。我們做了 TodoWrite 工具來(lái)實(shí)現(xiàn)這個(gè)功能

But even then, we often saw Claude forgetting what it had to do. To adapt, we inserted system reminders every 5 turns that reminded Claude of its goal.

即便如此,Claude 還是經(jīng)常忘記該干什么。我們于是每隔 5 輪對(duì)話就插一條系統(tǒng)提醒

As models improved, they found To-do lists limiting. Being sent reminders of the todo list made Claude think that it had to stick to the list instead of modifying it when it realized it needed to change course. We also saw Opus 4.5 also get much better at using subagents, but how could subagents coordinate on a shared todo list?

隨著模型迭代,Todo 列表開始礙事。系統(tǒng)提醒讓 Claude 覺得必須嚴(yán)格按清單執(zhí)行,不敢中途調(diào)整方向。Opus 4.5 用子 Agent 的能力大幅提升,但多個(gè)子 Agent 怎么共享一個(gè) Todo 列表?

Seeing this, we replaced the TodoWrite feature with the Task tool. Whereas todos are focused on keeping the model on track, tasks help agents communicate with each other. Tasks could include dependencies, share updates across subagents and the model could alter and delete them.

看到這些問(wèn)題,我們把 TodoWrite 替換成了 Task 工具。Todo 的重點(diǎn)是讓模型保持方向,Task 的重點(diǎn)是讓 Agent 之間互相溝通。Task 支持依賴關(guān)系,可以跨子 Agent 共享狀態(tài)更新,模型可以隨時(shí)修改和刪除

模型能力提升之后,曾經(jīng)需要的工具可能反過(guò)來(lái)限制它

As model capabilities increase, the tools that your models once needed might now be constraining them. It's important to constantly revisit previous assumptions on what tools are needed. This is also why it's useful to stick to a small set of models to support that have a fairly similar capabilities profile.

隨著模型能力提升,你的模型曾經(jīng)需要的工具現(xiàn)在可能反過(guò)來(lái)在限制它。定期回頭審視「這些工具是否還有必要」很重要。這也是為什么建議只支持少量能力相近的模型,這樣工具設(shè)計(jì)可以聚焦

設(shè)計(jì)搜索界面

The most consequential tools we've built are the ones that let Claude find its own context.

我們做過(guò)的最有影響力的工具,是那些讓 Claude 自己尋找上下文的工具

When Claude Code was first released internally, we used RAG: a vector database would pre-index the codebase, and the harness would retrieve relevant snippets and hand them to Claude before each response. While RAG was powerful and fast, it required indexing and setup and could be fragile across a host of different environments. Most importantly, Claude was given this context instead of finding the context itself.

Claude Code 內(nèi)部版本最早用的是 RAG:向量數(shù)據(jù)庫(kù)預(yù)先索引代碼庫(kù),每次回復(fù)前自動(dòng)檢索相關(guān)片段塞給 Claude。RAG 速度快、效果好,但需要預(yù)處理,環(huán)境兼容性脆弱。最根本的問(wèn)題是:上下文是被塞給 Claude 的,不是 Claude 自己找的

But if Claude could search on the web, why couldn't it also search your codebase? By giving Claude a Grep tool, we could let it search for files and build context itself.

如果 Claude 能搜網(wǎng)頁(yè),為什么不能搜代碼庫(kù)?給 Claude 一個(gè) Grep 工具,就能讓它自己搜文件、自己構(gòu)建上下文

As Claude gets smarter, it becomes increasingly good at building its context when given the right tools.

Claude 越聰明,給它合適的工具后它就越擅長(zhǎng)自己構(gòu)建上下文

When we introduced Agent Skills, we formalized the idea of progressive disclosure, which allows agents to incrementally discover relevant context through exploration.

Agent Skills 上線后,我們把這個(gè)思路正式化為漸進(jìn)式披露(progressive disclosure):讓 Agent 通過(guò)探索逐步發(fā)現(xiàn)相關(guān)上下文

Claude could now read skill files and those files could then reference other files that the model could read recursively. In fact, a common use of skills is to add more search capabilities to Claude like giving it instructions on how to use an API or query a database.

Claude 現(xiàn)在可以讀 Skill 文件,Skill 文件可以引用其他文件,模型可以遞歸地發(fā)現(xiàn)和加載上下文。一個(gè)常見的 Skill 用法就是給 Claude 增加搜索能力:告訴它怎么調(diào) API、怎么查數(shù)據(jù)庫(kù)

Over the course of a year, Claude went from not really being able to build its own context to being able to do nested search across several layers of files to find the exact context it needed.

一年時(shí)間,Claude 從幾乎不會(huì)自己構(gòu)建上下文,到能在多層文件中嵌套搜索,精確找到需要的信息

Progressive disclosure is now a common technique we use to add new functionality without adding a tool. In the next section, we explain why.

漸進(jìn)式披露現(xiàn)在是我們常用的一種技術(shù):不加工具就能加功能。下一節(jié)解釋具體怎么做

漸進(jìn)式披露:Claude Code Guide 子 Agent

Claude Code currently has ~20 tools, and our team frequently revisits if we need all of them for Claude to be most effective. The bar to add a new tool is high, because this gives the model one more option to think about.

Claude Code 目前有大約 20 個(gè)工具,團(tuán)隊(duì)經(jīng)常審視是否每個(gè)都有必要。加新工具的門檻很高,因?yàn)槊慷嘁粋€(gè)工具,模型就多一個(gè)需要思考的選項(xiàng)

For example, we noticed that Claude did not know enough about how to use Claude Code. If you asked it how to add a MCP or what a slash command did, it would not be able to reply.

比如,我們發(fā)現(xiàn) Claude 不夠了解 Claude Code 自身的功能。你問(wèn)它怎么加 MCP、某個(gè)斜杠命令是什么意思,它答不上來(lái)

We could have put all of this information in the system prompt, but given that users rarely asked about this, it would have added context rot and interfered with Claude Code's main job: writing code.

可以把這些信息全塞進(jìn) system prompt,但用戶很少問(wèn)這類問(wèn)題,塞進(jìn)去會(huì)造成上下文腐蝕,干擾 Claude 的主要工作(寫代碼)

Instead, we tried progressive disclosure: we gave Claude a link to its docs that it could load and search when needed. This worked, but Claude would pull large chunks of documentation into context to find an answer the user could have gotten in one sentence.

我們嘗試漸進(jìn)式披露:給 Claude 一個(gè)指向文檔的鏈接,需要時(shí)自己去查。能用,但 Claude 會(huì)把大段文檔拉進(jìn)上下文,只為回答一個(gè)一句話就能搞定的問(wèn)題

So we built the Claude Code Guide — a subagent Claude calls whenever a user asks about Claude Code itself. The subagent does the doc-searching in its own context, follows detailed instructions on how to search and what to extract, and hands back only the answer. The main agent's context stays clean.

最終我們做了一個(gè)Claude Code Guide子 Agent。當(dāng)用戶問(wèn) Claude Code 自身的問(wèn)題時(shí),主 Agent 把請(qǐng)求轉(zhuǎn)給這個(gè)子 Agent。子 Agent 在自己的上下文里搜索文檔、提取答案,只把答案?jìng)骰貋?lái)。主 Agent 的上下文保持干凈

While this isn't a perfect solution (Claude can still get confused when you ask it about how to set itself up), we were able to add things to Claude's action space without adding a new tool.

這個(gè)方案不完美(Claude 有時(shí)候還是會(huì)在自身配置問(wèn)題上犯糊涂),但關(guān)鍵是:不用加新工具,就能擴(kuò)展 Agent 的能力范圍

像 Agent 一樣看,是手藝活

Designing the tools for your models is as much an art as it is a science. It depends heavily on the model you're using, the goal of the agent and the environment it's operating in.

給模型設(shè)計(jì)工具,與其說(shuō)是科學(xué),更接近手藝。它取決于你用的模型、Agent 的目標(biāo)、運(yùn)行的環(huán)境

Our best advice? Experiment often, read your outputs, try new things. And most importantly, try to see like an agent.

我們最好的建議?多實(shí)驗(yàn),讀你的輸出,試新東西。最重要的是,學(xué)會(huì)像 Agent 一樣看

Experiment often, read your outputs, try new things. And most importantly, try to see like an agent.

https://claude.com/blog/seeing-like-an-agent

作者:Thariq Shihipar,Anthropic 工程師,Claude Code 團(tuán)隊(duì)

特別聲明:以上內(nèi)容(如有圖片或視頻亦包括在內(nèi))為自媒體平臺(tái)“網(wǎng)易號(hào)”用戶上傳并發(fā)布,本平臺(tái)僅提供信息存儲(chǔ)服務(wù)。

Notice: The content above (including the pictures and videos if any) is uploaded and posted by a user of NetEase Hao, which is a social media platform and only provides information storage services.

相關(guān)推薦
熱點(diǎn)推薦
比亞迪固態(tài)電池獲車規(guī)認(rèn)證,續(xù)航高達(dá)1218公里,2027年量產(chǎn)在即!

比亞迪固態(tài)電池獲車規(guī)認(rèn)證,續(xù)航高達(dá)1218公里,2027年量產(chǎn)在即!

小李子體育
2026-04-18 15:14:06
蘭姐一提起孫子小霖霖,話還沒說(shuō),眼淚先一串一串地往下掉。

蘭姐一提起孫子小霖霖,話還沒說(shuō),眼淚先一串一串地往下掉。

阿廢冷眼觀察所
2026-04-18 17:57:36
清算終于來(lái)了!中方給日本的最后期限:180 天,歸還被掠百年國(guó)寶

清算終于來(lái)了!中方給日本的最后期限:180 天,歸還被掠百年國(guó)寶

z千年歷史老號(hào)
2026-01-31 13:50:06
越南的南北鴻溝:一個(gè)國(guó)家,兩個(gè)世界

越南的南北鴻溝:一個(gè)國(guó)家,兩個(gè)世界

民間胡扯老哥
2026-04-03 02:26:57
農(nóng)村正在消亡的6個(gè)民間手藝,見過(guò)3種以上,說(shuō)明你已經(jīng)老了

農(nóng)村正在消亡的6個(gè)民間手藝,見過(guò)3種以上,說(shuō)明你已經(jīng)老了

心中的麥田
2026-04-17 19:40:16
極度危險(xiǎn)的星座,沒有之一!

極度危險(xiǎn)的星座,沒有之一!

同道大叔
2026-04-18 22:02:11
一旦臺(tái)海戰(zhàn)爭(zhēng)爆發(fā),可能造成上億傷亡,解放軍或需解決4大戰(zhàn)場(chǎng)

一旦臺(tái)海戰(zhàn)爭(zhēng)爆發(fā),可能造成上億傷亡,解放軍或需解決4大戰(zhàn)場(chǎng)

星星會(huì)墜落
2026-04-14 01:10:20
中央明確了!社保最低繳費(fèi)年限要提高,70、80后得早做準(zhǔn)備

中央明確了!社保最低繳費(fèi)年限要提高,70、80后得早做準(zhǔn)備

云鵬敘事
2026-04-12 16:36:39
聽說(shuō)毛澤東領(lǐng)導(dǎo)了紅軍,魯迅關(guān)切地問(wèn)道:他有多大歲數(shù)了?

聽說(shuō)毛澤東領(lǐng)導(dǎo)了紅軍,魯迅關(guān)切地問(wèn)道:他有多大歲數(shù)了?

鶴羽說(shuō)個(gè)事
2026-04-17 22:30:54
為何說(shuō)年齡超過(guò)79歲的人:即便身體健康,也沒有多少來(lái)日方長(zhǎng)?

為何說(shuō)年齡超過(guò)79歲的人:即便身體健康,也沒有多少來(lái)日方長(zhǎng)?

醫(yī)學(xué)原創(chuàng)故事會(huì)
2026-04-18 12:28:22
局勢(shì)生變,全球接到消息,美軍全部撤離,所有軍事基地被敘國(guó)接管

局勢(shì)生變,全球接到消息,美軍全部撤離,所有軍事基地被敘國(guó)接管

老謝談史
2026-04-18 11:32:45
和曹燕華離婚后,他再娶小22歲乒乓美女,如今定居上海兒女雙全

和曹燕華離婚后,他再娶小22歲乒乓美女,如今定居上海兒女雙全

青橘罐頭
2026-04-18 19:32:25
價(jià)格狂飆6倍!日本連夜求購(gòu)遭中方出口管制,高端制造全線崩盤?

價(jià)格狂飆6倍!日本連夜求購(gòu)遭中方出口管制,高端制造全線崩盤?

王二哥老搞笑
2026-04-17 17:08:23
樓市爆發(fā),2周超1個(gè)月?4大一線房?jī)r(jià)轉(zhuǎn)漲!高盛:未來(lái)3年漲15%

樓市爆發(fā),2周超1個(gè)月?4大一線房?jī)r(jià)轉(zhuǎn)漲!高盛:未來(lái)3年漲15%

財(cái)說(shuō)得明白
2026-04-18 21:28:07
全程眼突鼓腮,看了觀眾對(duì)孫儷的評(píng)價(jià),才知張藝謀這句話的含金量

全程眼突鼓腮,看了觀眾對(duì)孫儷的評(píng)價(jià),才知張藝謀這句話的含金量

陳述影視
2026-04-04 17:53:34
19日起,連下5天,4月最強(qiáng)降雨大幕拉開!暴雨、大暴雨連成片

19日起,連下5天,4月最強(qiáng)降雨大幕拉開!暴雨、大暴雨連成片

鯨探所長(zhǎng)
2026-04-18 11:33:53
麥當(dāng)娜的風(fēng)流往事:他的欲望太強(qiáng),讓她疲憊不堪又欲罷不能

麥當(dāng)娜的風(fēng)流往事:他的欲望太強(qiáng),讓她疲憊不堪又欲罷不能

錢小刀娛樂
2026-04-17 11:24:32
金建希小姐的大瓜!

金建希小姐的大瓜!

仕道
2026-04-17 17:03:55
特納反思雄鹿首個(gè)賽季:這絕對(duì)是一次讓人清醒的打擊

特納反思雄鹿首個(gè)賽季:這絕對(duì)是一次讓人清醒的打擊

北青網(wǎng)-北京青年報(bào)
2026-04-18 22:16:05
馬筱梅自爆已經(jīng)通知小楊阿姨,要請(qǐng)假自己開播公布,別給她招黑

馬筱梅自爆已經(jīng)通知小楊阿姨,要請(qǐng)假自己開播公布,別給她招黑

小娛樂悠悠
2026-04-18 12:31:03
2026-04-18 22:43:00
賽博禪心
賽博禪心
拜AI古佛,修賽博禪心
389文章數(shù) 50關(guān)注度
往期回顧 全部

科技要聞

傳Meta下月擬裁8000 大舉清退人力為AI騰位

頭條要聞

小車在高速上跑100碼 車主突然接到電話"你車輪沒了"

頭條要聞

小車在高速上跑100碼 車主突然接到電話"你車輪沒了"

體育要聞

時(shí)隔25年重返英超!沒有人再嘲笑他了

娛樂要聞

劉德華回應(yīng)潘宏彬去世,拒談喪禮細(xì)節(jié)

財(cái)經(jīng)要聞

"影子萬(wàn)科"2.0:管理層如何吸血萬(wàn)物云?

汽車要聞

奇瑞威麟R08 PRO正式上市 售價(jià)14.48萬(wàn)元起

態(tài)度原創(chuàng)

旅游
手機(jī)
本地
游戲
公開課

旅游要聞

花開如雪 暗香浮動(dòng)|濟(jì)寧戴莊流蘇花迎來(lái)最美花期 引市民打卡

手機(jī)要聞

華為蘋果爭(zhēng)第一,手機(jī)TOP5排名來(lái)了

本地新聞

12噸巧克力有難,全網(wǎng)化身超級(jí)偵探添亂

《刺客信條黑旗》重制版將至 你心中最帥刺客導(dǎo)師是誰(shuí)

公開課

李玫瑾:為什么性格比能力更重要?

無(wú)障礙瀏覽 進(jìn)入關(guān)懷版