Redis Author's Hands-On Test: GPT 5.4 Far Outperforms Claude 4.6 in Complex Reverse Engineering

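📅 2026-04-07 21:38 · Viking · Artificial Intelligence · 4 min · 3779 characters · Score: 88
GPT 5.4 · Claude 4.6 · Redis Author · Model Evaluation · Reverse Engineering
📌 One-sentence summary: Redis author antirez ran a reverse engineering experiment on a 90s Unix disk image, and the comparison shows GPT 5.4's deep reasoning to be significantly stronger than that of Claude 4.6.

📝 Detailed summary: The tweet relays a real experimental comparison by Redis author antirez. The task was to reverse engineer a SCSI controller and its ROM from a 90s Unix disk image, which calls for very demanding engineering skills such as hardware knowledge and assembly/disassembly. The results show that GPT 5.4 (max thinking) made all of the major progress during long autonomous runs, while Claude Opus 4.6 could barely handle a task of this difficulty. The conclusion points out that in deep reasoning and complex engineering, the two ...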

Title: Redis Author's Benchmark: GPT 5.4 Far Outperforms Claude ...

URL Source: https://www.bestblogs.dev/status/2041510879506903270

Published Time: 2026-04-07 13:38:17

Redis Author's Benchmark: GPT 5.4 Far Outperforms Claude 4.6 in Complex Reverse Engineering

### Viking

@vikingmute

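A post well worth reading: a real experimental comparison shared by the author of Redis.

Over the past week, he ran long autonomous sessions with Claude Code Opus 4.6 and Codex GPT 5.4 (max thinking), testing repeatedly in separate directory environments.

The task is very complex: starting from an early-90s Unix disk image, reverse engineer a long-vanished SCSI controller and its integrated ROM. It is a project done in collaboration with a museum for the sake of computer history, and it requires combining hardware knowledge with deep engineering skills such as assembly and disassembly.

Results:

GPT 5.4: made all of the major progress across multiple long runs, effectively combining hardware knowledge, disassembly techniques, and more to carry out complex reverse engineering work.

Claude Opus 4.6: made only a small amount of minor progress and was almost entirely useless on the high-difficulty task.

His conclusion: for high-difficulty engineering work, the gap between the two is brutal. GPT 5.4 is clearly stronger, especially on tasks that require deep reasoning and long horizons.

The original post also includes comparison charts.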

#### antirez

@antirez · 18h ago

During the last week I executed very long autonomous sessions of Claude Code Opus 4.6 and Codex GPT 5.4 (both at max thinking budget), in cloned directories (refreshed every time one was behind). I burned a lot of (flat rate, my OSS free account + my PRO account) of tokens...

Apr 7, 2026, 1:38 PM

14 Replies

16 Retweets

166 Likes

32.3K Views

One Sentence Summary

Redis creator antirez conducted a reverse engineering experiment on a 90s Unix disk image, proving GPT 5.4's deep reasoning capabilities are significantly superior to Claude 4.6.

Summary

The tweet recounts a real-world experimental comparison by antirez, the author of Redis. The task involved reverse engineering a long-lost SCSI controller and its integrated ROM from an early-90s Unix disk image, which requires demanding engineering skills in hardware knowledge and assembly/disassembly. Results showed that GPT 5.4 (max thinking) made all of the major progress during long autonomous sessions, whereas Claude Opus 4.6 was almost useless on a task of this difficulty. The conclusion highlights a brutal gap between the two in deep reasoning and complex engineering.

AI Score: 88

Influence Score: 51

Published At: Today

Language: Chinese

Published: 2026-04-07 21:38:17 · Collected: 2026-04-08 00:01:01