
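Windows stack limit checking retrospective: arm64, also known as AArch64

2026-03-21 06:11, Raymond Chen

Tags: Windows internals, ARM64, AArch64, stack management, assembly programming

Summary: This article reviews how Windows stack limit checking is implemented on ARM64 (AArch64), with an analysis of the assembly code and a cross-architecture comparison table. It closes out the series on Windows stack limit checking. It presents the assembly code of the `chkstk` function and explains how x15 holds the number of 16-byte units (paragraphs) to allocate, which supports stack frames of up to roughly 1MB. It describes the stack probing mechanism, which uses memory reads to verify that stack is available, including the page boundary calculation and the clamp on underflow. The article ends with a detailed comparison table covering six architectures (x86-32, MIPS, PowerPC, Alpha AXP, x86-64, AArch64).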


Our survey of stack limit checking wraps up with arm64, also known as AArch64.

The stack limit checking takes two forms, one simple version for pure arm64 processes, and a more complex version for Arm64EC. I’m going to look at the simple version. The complex version differs in that it has to check whether the code is running on the native arm64 stack or the emulation stack before calculating the stack limit. That part isn’t all that interesting.

    ; on entry, x15 is the number of paragraphs to allocate
    ; (bytes divided by 16)
    ; on exit, stack has been validated (but not adjusted)
    ; modifies x16, x17

    chkstk:
        subs    x16, sp, x15, lsl #4    ; x16 = sp - x15 * 16
                                        ; x16 = desired new stack pointer
        csello  x16, xzr, x16           ; clamp to 0 on underflow

        mov     x17, sp
        and     x17, x17, #-PAGE_SIZE   ; round down to nearest page
        and     x16, x16, #-PAGE_SIZE   ; round down to nearest page

        cmp     x16, x17                ; on the same page?
        beq     done                    ; Y: nothing to do

    probe:
        sub     x17, x17, #PAGE_SIZE    ; move to next page¹
        ldr     xzr, [x17]              ; probe
        cmp     x17, x16                ; done?
        bne     probe                   ; N: keep going

    done:
        ret
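To make the control flow easier to follow, here is a rough Python model of the probe loop. It is only a sketch: the function name `chkstk_probes` is made up for illustration, the 4 KiB page size is an assumption, and it records which page addresses would be read rather than touching any memory. It also reads the `csello` line as a conditional select on the unsigned-lower condition, which is what the "clamp to 0 on underflow" comment suggests.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def chkstk_probes(sp, paragraphs, page_size=PAGE_SIZE):
    """Return the page addresses the probe loop would read, in order.

    Hypothetical model of the listing: sp and paragraphs are
    non-negative integers standing in for sp and x15.
    """
    target = sp - paragraphs * 16      # subs x16, sp, x15, lsl #4
    if target < 0:                     # borrow occurred: clamp to 0
        target = 0
    current_page = sp & -page_size     # and x17, x17, #-PAGE_SIZE
    target_page = target & -page_size  # and x16, x16, #-PAGE_SIZE
    probes = []
    while current_page != target_page:  # beq done / bne probe
        current_page -= page_size       # sub x17, x17, #PAGE_SIZE
        probes.append(current_page)     # ldr xzr, [x17]
    return probes


if __name__ == "__main__":
    # Allocation that stays on the current page: nothing is probed.
    print(chkstk_probes(0x10FF0, 1))
    # Allocation of 17328 bytes starting at a page boundary.
    print([hex(p) for p in chkstk_probes(0x20000, 17328 // 16)])
```

Note that the model probes every page strictly below the current one, down to and including the page that will hold the new stack pointer, and never re-probes the current page.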

The inbound value in x15 is the number of bytes desired _divided by 16_. Since the arm64 stack must be kept 16-byte aligned, we know that the division by 16 will not produce a remainder. Passing the amount in paragraphs expands the number of bytes expressible in a single constant load from 0xFFF0 to 0xFFFF0 (via the movz instruction), allowing convenient allocation of stack frames up to just shy of a megabyte in size. Since the default stack size is a megabyte, this is sufficient to cover all typical usages.
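The two limits follow from the 16-bit immediate that a single movz can load. The short calculation below is a sketch of that arithmetic; the constant names are made up for illustration.

```python
MOVZ_MAX = 0xFFFF   # largest value a single unshifted movz can load
PARAGRAPH = 16      # bytes per paragraph (arm64 stack alignment)
MEGABYTE = 1 << 20

# If x15 held bytes, the largest 16-byte-aligned value would be:
max_frame_in_bytes = MOVZ_MAX & ~(PARAGRAPH - 1)

# Because x15 holds paragraphs, the largest frame is instead:
max_frame_in_paragraphs = MOVZ_MAX * PARAGRAPH

print(hex(max_frame_in_bytes))             # 0xfff0
print(hex(max_frame_in_paragraphs))        # 0xffff0
print(MEGABYTE - max_frame_in_paragraphs)  # 16
```

So the largest frame reachable with one constant load is one paragraph short of a megabyte.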

Here’s an example of how a function might use chkstk in its prologue:

    mov     x15, #17328/16          ; desired stack frame size divided by 16
    bl      chkstk                  ; ensure enough stack space available
    sub     sp, sp, x15, lsl #4     ; reserve the stack space
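As a worked version of that prologue, the sketch below tracks the two registers involved. The function name `prologue` and the starting stack pointer are hypothetical. The call to chkstk is modelled as a no-op on sp, since it validates the stack but leaves the adjustment to the caller.

```python
def prologue(sp, frame_bytes):
    """Model the three-instruction prologue; return (x15, new_sp)."""
    if frame_bytes % 16 != 0:
        raise ValueError("frame size must be a multiple of 16")
    x15 = frame_bytes // 16    # mov x15, #frame_bytes/16
    if x15 > 0xFFFF:
        raise ValueError("paragraph count does not fit a single movz")
    # bl chkstk: probes the stack; sp itself is left unchanged.
    new_sp = sp - (x15 << 4)   # sub sp, sp, x15, lsl #4
    return x15, new_sp


x15, new_sp = prologue(0x20000, 17328)
print(x15)          # 1083
print(hex(new_sp))  # 0x1bc50
```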

Okay, so let’s summarize all of the different stack limit checks into a table, because people like tables.

| | x86-32 | MIPS | PowerPC | Alpha AXP | x86-64 | AArch64 |
| --- | --- | --- | --- | --- | --- | --- |
| unit requested | Bytes | Bytes | Negative bytes | Bytes | Bytes | Paragraphs |
| adjusts stack pointer before returning | Yes | No | No | No | No | No |
| detects stack placement at runtime | No | Yes | Yes | Yes | Yes | Yes |
| short-circuits | No | Yes | Yes | Yes | Yes | No |
| probe operation | Read | Write | Read | Write | Either | Read |

As we discussed earlier, if the probe operation is a write, then short-circuiting is mandatory.

¹ If you’re paying close attention, you may have noticed that PAGE_SIZE is too large to fit in a 12-bit immediate constant. No problem, because the assembler rewrites it as

    sub     x17, x17, #PAGE_SIZE/4096, lsl #12
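The rewrite works because the arm64 add/sub immediate form takes a 12-bit unsigned value that can optionally be shifted left by 12. The helper below is a small sketch of that encodability rule. The name `encode_addsub_imm` is made up for illustration and is not an assembler API.

```python
def encode_addsub_imm(value):
    """Return (imm12, shift) if value fits an add/sub immediate, else None."""
    if 0 <= value <= 0xFFF:
        return value, 0            # fits directly in 12 bits
    if value & 0xFFF == 0 and 0 < (value >> 12) <= 0xFFF:
        return value >> 12, 12     # fits as imm12, lsl #12
    return None                    # needs a register or multiple instructions


print(encode_addsub_imm(4096))   # (1, 12)
print(encode_addsub_imm(16384))  # (4, 12)
print(encode_addsub_imm(4097))   # None
```

Any page size that is a multiple of 4096 and below 16 MB is therefore expressible in a single sub instruction.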

Published: 2026-03-21 06:11:23. Indexed: 2026-03-21 10:00:45.
