Beihang Team Moves Urgently on "Lobster" Security: Open-Sources an OpenClaw Risk Defense Tool and Lays Out Mitigations for 9 High-Risk Issues
量子位 (QbitAI) @Jay
One Sentence Summary
Beihang University team releases and open-sources ClawGuard Auditor, a security defense tool for OpenClaw agents, along with a comprehensive security report covering 9 high-risk issues including prompt injection and sandbox escape.
Summary
This article introduces ClawGuard Auditor, a defense tool developed by the Intelligent Security Innovation Team at Beihang University for the OpenClaw agent framework, along with its accompanying security report. The tool adopts a static-dynamic combined architecture, implementing full lifecycle protection from code loading to dynamic execution through static application security testing, proactive security kernel, and data loss prevention engine. The report systematically organizes the six major security risk categories faced by agents, with a focus on identifying 9 core high-risk issues including prompt injection, sandbox escape, and path traversal. Additionally, the team proposes specific mitigation measures across six dimensions: instruction, interaction, execution, data, interface, and supply chain, providing AI agent developers with actionable security hardening guidelines.
Main Points
* 1. ClawGuard Auditor builds a three-pronged collaborative defense architecture that combines static and dynamic analysis.
The tool integrates a static reviewer to intercept malicious code, a proactive kernel for transparent runtime supervision, and a data loss prevention engine to monitor sensitive assets, keeping agent behavior under control throughout the entire lifecycle.
* 2. The defense mechanism relies on four security axioms, including zero-trust principles and semantic intent matching.
By assuming hostility by default, evaluating whether actual code behavior is consistent with declared intent, and using a just-in-time least-privilege token model, it prevents illegal operations disguised as legitimate ones.
* 3. The report identifies 9 core high-risk threats covering traditional vulnerabilities and AI-specific risks.
The risks include cutting-edge AI challenges such as prompt injection and model backdoors, as well as key traditional system security vulnerabilities such as sandbox escape, path traversal, and plaintext storage of sensitive data.
* 4. Targeted protection and remediation recommendations are proposed for the six risk systems.
Recommendations include building feature libraries of malicious inducement text, enabling strict-mode sandbox isolation, applying the principle of least privilege, and regularly scanning third-party dependencies for vulnerabilities, with the aim of a complete, closed-loop security protection system.
Metadata
AI Score
85
Website qbitai.com
Published At Today
Length 2927 words (about 12 min)
> Contributed by the ClawGuard Auditor team
>
> QbitAI (量子位) | WeChat official account: QbitAI
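The "little lobster" (OpenClaw) keeps getting more popular, and more and more people are "raising" one.
But the more permissions the AI is granted, the higher the security risk climbs.
The Intelligent Security Innovation Team at Beihang University's State Key Laboratory of Complex & Critical Software Environment has stepped in, releasing what it describes as the most systematic security report available online. Alongside it, the team open-sourced ClawGuard Auditor, a security defense tool for OpenClaw.
It can detect malicious Skills imported locally and output a security review report.
ClawGuard Auditor is positioned as a low-level security daemon that runs at the system's highest privilege layer.
It holds the highest veto power over all external instructions, prompts, and even other skills, in order to protect the user's local system assets across the board.
In addition, the security report lays out nine high-risk issues with accompanying protection recommendations. Let's take a look.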
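First, ClawGuard Auditor. Compared with existing open-source security tools, the team cites three core differentiators:
1) Comprehensive security coverage: it covers the main known agent-specific risks as well as traditional vulnerabilities, so the range of threats it defends against is relatively broad.
2) Full-lifecycle coverage: rather than relying on a single detection method as traditional tools do, it protects the whole lifecycle, from code loading and model interaction to dynamic execution.
3) High usability: it is designed to adapt flexibly and to be as plug-and-play as possible, so users can quickly deploy low-level guardrails for an agent without tedious configuration.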
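ClawGuard Auditor builds a three-in-one collaborative defense architecture that combines static and dynamic analysis.
The static application security testing (SAST) reviewer steps in before a skill runs, using lexical analysis and behavior modeling to intercept malicious code packages;
the active security kernel provides transparent runtime supervision: as soon as it detects behavior that touches a sensitive operation, it takes over the execution flow and blocks the unauthorized call;
the active data loss prevention (DLP) engine monitors memory state and outbound network data throughout, to keep sensitive assets such as API keys from leaking.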
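To make the idea of a pre-run static review concrete, here is a minimal sketch in Python. It is not ClawGuard Auditor's implementation, which the article does not show. The deny-list `RISKY_CALLS` and the function `review_skill_source` are hypothetical names chosen for illustration, and a real reviewer would need a far richer behavior model than a list of call names.

```python
import ast

# Hypothetical deny-list for illustration only (an assumption, not ClawGuard's rules).
RISKY_CALLS = {"eval", "exec", "os.system", "subprocess.run", "subprocess.Popen"}


def dotted_name(node: ast.AST) -> str:
    """Return a dotted name such as 'os.system' for Name/Attribute chains, else ''."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""


def review_skill_source(source: str) -> list[tuple[int, str]]:
    """Parse the source without executing it and list (line, call) for risky calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in RISKY_CALLS:
                findings.append((node.lineno, name))
    return sorted(findings)
```

Because the source is only parsed and never executed, the check can run before a skill is loaded. A skill that calls `os.system(...)` on its second line would be reported as `(2, "os.system")`, while aliased or dynamically built calls would slip past a name-based check like this one.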
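Its core rests on four defense axioms that cannot be tampered with; every behavioral judgment is made on that basis.
The first is absolute override and zero trust: all external code is treated as hostile by default, and no mechanism may bypass or modify the Auditor's rules;
the second is semantic intent matching: rather than stopping at plain code analysis, it evaluates whether the code's actual behavior is consistent with its declared intent, ruling out "illegal behavior dressed in a legitimate coat";
the third is the capability token model with restricted privileges: the principle of least privilege is strictly enforced, and tokens are issued on demand and automatically revoked once the corresponding task ends;
the fourth is data sovereignty and digital asset isolation: protecting local assets from violation is the highest rule, and local digital assets are safeguarded across the board.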
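The third axiom, tokens issued on demand and revoked when the task ends, can be sketched with a context manager. This is an illustrative model under stated assumptions, not the tool's actual token mechanism: `CapabilityBroker`, `grant`, `check`, and capability strings such as `"fs.read"` are all hypothetical.

```python
import secrets
from contextlib import contextmanager


class CapabilityBroker:
    """Issues short-lived tokens scoped to one capability (hypothetical design)."""

    def __init__(self) -> None:
        self._active: dict[str, str] = {}  # token -> capability

    @contextmanager
    def grant(self, capability: str):
        """Issue a token for exactly one capability for the duration of a task."""
        token = secrets.token_hex(16)
        self._active[token] = capability
        try:
            yield token
        finally:
            # Revoke as soon as the task finishes, even if it raised.
            self._active.pop(token, None)

    def check(self, token: str, capability: str) -> bool:
        """Deny by default: only an active token for this exact capability passes."""
        return self._active.get(token) == capability
```

Inside `with broker.grant("fs.read") as token:` the call `broker.check(token, "fs.read")` succeeds while `broker.check(token, "net.send")` does not, and once the block exits the same token is rejected for everything.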
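To address security risks across the full lifecycle of OpenClaw agents, the research team released the OpenClaw Agent Security Risk Report, which it describes as the first of its kind in the industry.
Compared with other public security reports in the industry, the team cites three forward-looking advantages:
1) Multi-dimensional risk coverage: it goes beyond traditional system and network attacks to cover frontier attacks on intelligent systems such as prompt injection;
2) A complete, closed-loop risk system: it covers a wide range of risk types and, instead of a fragmented list, builds a systematic risk map for agents;
3) Equal weight on protection and detection: besides traditional network security defense strategies, it gives practical dynamic detection recommendations tailored to how agents run.
Following the principles of "comprehensive coverage, traceable, and verifiable", and drawing on OpenClaw's technical characteristics and open-source community security advisories, the report builds six security risk systems that cover all currently known core risk points: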
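- Instruction and model security: focuses on core risks such as prompt injection, model hallucination, and model backdoors;
- Interaction and input security: covers attack scenarios such as malicious input injection and manipulative interaction;
- Execution and permission security: concentrates on sandbox escape, privilege escalation, and execution of high-risk actions;
- Data and communication security: includes sensitive data storage, transport encryption, and data contamination;
- Interface and service security: focuses on unauthorized access, interface privilege escalation, and brute-force attacks;
- Deployment and supply chain security: covers third-party dependency vulnerabilities, malicious plugins, and missing logs.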
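The report divides OpenClaw security risks into three levels (low, medium, high) and identifies the following nine core high-risk issues for OpenClaw.
All nine are the core risks that are currently easiest to exploit and most damaging. They include both traditional system security problems and risks specific to agent systems.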
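* Prompt injection and instruction hijacking
By crafting malicious input or hidden instructions, an attacker induces the agent to bypass its original security constraints and perform operations the attacker specifies.
* Sandbox escape and unauthorized execution
If the isolation mechanism of the agent's execution environment has a flaw, an attacker may craft specific input to bypass sandbox restrictions, execute system commands or access sensitive resources, and ultimately gain system-level control.
* Path traversal and unauthorized file operations
Attackers use path traversal sequences (such as ../) to access sensitive system files such as configuration files, key files, or log files, and thereby obtain key system information or tamper with system configuration.
* Unrestricted execution of high-risk actions
Without strict permission control over actions, an agent can perform high-risk operations such as deleting files, shutting down services, or sending external network requests. Once an attacker induces it to do so, system stability is directly affected.
* Plaintext storage of sensitive data
If system logs, user credentials, API keys, and other sensitive information are stored in plaintext, an attacker can quickly obtain large amounts of sensitive data once the server is accessed or logs are leaked.
* Unauthorized access and default passwords
If the system uses default accounts or weak authentication, attackers can use scanning tools for brute-force or bulk attacks and take over the system remotely.
* Interface privilege escalation and permission abuse
If system interfaces lack fine-grained permission control, attackers can craft requests to call control interfaces beyond their privileges, performing sensitive operations or accessing internal data.
* Third-party dependency vulnerabilities (CVE)
If the open-source components that OpenClaw depends on have publicly known vulnerabilities, attackers can exploit them to attack remotely, execute malicious code, or escalate system privileges.
* Untrusted plugin sources and poisoning
Plugins or extension components from unofficial channels may contain malicious code or backdoors; once loaded into the system, they pose a serious threat to the agent's runtime environment and data security.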
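Of the nine risks above, path traversal is the easiest to show in a few lines. The sketch below is one common way to confine file access to a workspace directory; it is an illustration under assumptions, not code from OpenClaw or ClawGuard Auditor, and the function name `resolve_inside` and its parameters are hypothetical.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir, raising if the result escapes base_dir."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check,
    # so the comparison is made on the real target path.
    if candidate != base and base not in candidate.parents:
        raise PermissionError(f"path escapes workspace: {user_path}")
    return candidate
```

With this check, a request for `notes/a.txt` resolves to a path inside the workspace, while `../../etc/passwd` or an absolute path such as `/etc/passwd` raises `PermissionError`. The containment test is done on resolved paths rather than by searching the input string for `../`, since string filters are easy to bypass.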
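All the risks identified here mainly affect four security goals of OpenClaw agents.
Based on publicly reported industry incidents, the specific impact falls on system integrity, data confidentiality, execution controllability, and audit traceability.
Drawing on the risk points identified here, industry security best practices, and the protection requirements of authoritative bodies, the team offers the following targeted protection and remediation recommendations for each risk category, handling high-risk issues first and improving the protection system step by step.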
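* Instruction and model security: block injection, strictly control output
Build a feature library of malicious inducement text and filter inputs that show injection intent;
Strengthen review of model output and mask sensitive information;
Standardize training and fine-tuning processes to guard against data poisoning;
Fix the boundaries of security instructions and prohibit disclosure of core information.
* Interaction and input security: filter malicious input, identify abnormal interaction
Set up an input security filtering mechanism that checks for malicious commands;
Set interaction frequency thresholds to block sustained inducement and fatigue questioning;
Use fixed reply templates in high-risk scenarios and add manual review.
* Execution and permission security: least privilege, strict isolation
Enable strict-mode sandbox isolation and restrict access to core system resources;
Enforce allowlists for commands, files, and paths, and intercept high-risk operations;
Run as a low-privilege user, and add secondary confirmation and an emergency stop for high-risk actions.
* Data and communication security: encrypt storage and transport, control data permissions
Encrypt sensitive data (keys, credentials, logs) at rest; no plaintext;
Enable HTTPS/TLS 1.3 everywhere and disable plaintext HTTP transmission;
Clean and audit training and knowledge-base data to keep malicious data out;
Establish access control and auditing for data, with least-privilege access.
* Interface and service security: strictly control access, strengthen authentication
Close public internet exposure and allow access only from the internal network and trusted IPs;
Disable default accounts and passwords, use strong passwords and token authentication, and rotate them regularly;
Authenticate interfaces end to end, and set rate limits and CAPTCHAs.
* Deployment and supply chain security: trace dependencies, improve auditing
Regularly scan third-party dependencies for CVE vulnerabilities and upgrade promptly;
Download plugins only from official channels, and enable signature verification and a blocklist mechanism;
Enable log collection across the whole process and store logs encrypted;
Establish a routine security inspection mechanism.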
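Several of these recommendations (masking sensitive output, no plaintext credentials in logs) come down to redacting secrets before text leaves the process. The following is a minimal sketch of that idea only; the patterns in `SECRET_PATTERNS` and the function `redact` are assumptions made for illustration, not rules taken from the report or from ClawGuard Auditor.

```python
import re

# Illustrative patterns only (assumptions): real deployments should match the
# exact formats of the credentials they actually handle.
SECRET_PATTERNS = [
    re.compile(r"(?i)\b(api[_-]?key|token|password)\b(\s*[=:]\s*)(\S+)"),
    re.compile(r"(?i)\b(bearer)(\s+)([A-Za-z0-9._~+/-]+=*)"),
]


def redact(line: str) -> str:
    """Replace the secret value in each match, keeping the key name for debugging."""
    for pattern in SECRET_PATTERNS:
        line = pattern.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", line)
    return line
```

Applied to a log line such as `api_key=sk-12345 user=bob`, this keeps the key name and the unrelated field but replaces the value with `[REDACTED]`. Pattern-based redaction is a last line of defense and complements, rather than replaces, encrypting stored logs.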
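The advice to everyone raising a "lobster": turn your security mechanisms all the way up, so your agent doesn't go off the rails.
GitHub: https://github.com/SafeAgent-Beihang/clawguard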
Key Quotes
* Having the highest veto power over all external instructions, prompts, and even other skills, comprehensively ensuring the security of users' local system assets.
* Treating all external code as hostile by default, with no mechanism able to bypass or modify Auditor's rules.
* No longer limited to pure code analysis, but deeply evaluating whether the actual behavior of code is consistent with its declared intent, thereby preventing 'illegal actions disguised as legitimate ones'.
* Wide coverage of risk types, farewell to fragmented listing, building a systematic risk map for agents.
* Running with low-privilege users, adding secondary confirmation and emergency stop functions for high-risk actions.
Tags
AI Agent
OpenClaw
Cybersecurity
Prompt Injection
Sandbox Escape