Title: Redis Author's Benchmark: GPT 5.4 Far Outperforms Claude ...

URL Source: https://www.bestblogs.dev/status/2041510879506903270

Published Time: 2026-04-07 13:38:17

Markdown Content:

Redis Author's Benchmark: GPT 5.4 Far Outperforms Claude 4.6 in Complex Reverse Engineering

### Viking

@vikingmute

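A post well worth reading: a real-world experimental comparison shared by the author of Redis.

Over the past week, he ran long autonomous sessions of Claude Code Opus 4.6 and Codex GPT 5.4 (max thinking), testing them repeatedly in separate directory environments.

The task is very complex: starting from an early-90s Unix disk image, reverse engineer a long-vanished SCSI controller and its integrated ROM. The project is a collaboration with a museum for the sake of computer history, and it calls for deep engineering skills that combine hardware knowledge with assembly and disassembly.

Results:

GPT 5.4: made all the major progress across multiple long runs, effectively combining hardware knowledge, disassembly techniques, and more to complete complex reverse-engineering work.

Claude Opus 4.6: made only a small amount of minor progress and was almost useless on high-difficulty tasks.

His conclusion: for high-difficulty engineering work, the gap between the two is brutal. GPT 5.4 is clearly stronger, especially on tasks that require deep reasoning and long horizons.

The original post also includes a comparison chart.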

#### antirez

@antirez · 18h ago

During the last week I executed very long autonomous sessions of Claude Code Opus 4.6 and Codex GPT 5.4 (both at max thinking budget), in cloned directories (refreshed every time one was behind). I burned a lot of (flat rate, my OSS free account + my PRO account) of tokens...

Apr 7, 2026, 1:38 PM

14 Replies

16 Retweets

166 Likes

32.3K Views

One Sentence Summary

Redis creator antirez ran a reverse-engineering experiment on an early-90s Unix disk image and reported that GPT 5.4 far outperformed Claude Opus 4.6 on this deep-reasoning task.

Summary

The tweet recounts a real-world comparison by antirez, the author of Redis. The task was to start from an early-90s Unix disk image and reverse engineer a long-lost SCSI controller and its integrated ROM, work that demands hardware knowledge and strong assembly and disassembly skills. According to antirez, GPT 5.4 (max thinking) made all the major progress during long autonomous sessions, whereas Claude Opus 4.6 was almost useless on a task this difficult. He concludes that the gap between the two is brutal for hard engineering work that needs deep reasoning.

AI Score

Influence Score 51

Published At Today

Language

Chinese
