Fix stack corruption in rt_sdhci_init_host() #11542
When msh tries to execute a non-ELF path, lwp_execve() may allocate a PID before lwp_load() fails. The old error path only dropped the LWP reference, leaving the PID tree entry pointing to a freed LWP.

In an init-less boot flow, this can poison pid 1 after a failed command from msh. A later LWP launch may then treat the stale pid 1 entry as a valid parent LWP, resulting in invalid pgrp/session state and a job-control assertion during process exit.

Add lwp_pid_rollback() for exec/spawn failures before the process becomes runnable. Unlike lwp_pid_put(), it always releases the PID lock and does not enter the "no more pid allocation" state when the PID tree becomes empty.

Use the rollback helper in lwp_execve() failure paths after PID allocation.

Signed-off-by: zhangyang <gaoshanliukou@163.com>
Code Review Assignment
Tag: components
Reviewers: @Maihuanyi
Current Review Status (Last Updated: 2026-07-01 14:28 CST)
yang.zhang does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
rt_sdhci_init_host() allocated a 32-byte local buffer for
the device name:
char dev_name[32];
However, sdio_host_set_name() copies RT_NAME_MAX bytes into
the output buffer:
rt_strncpy(out_devname, host->name, RT_NAME_MAX);
When RT_NAME_MAX is configured larger than 32 (for example
64), the copy overruns the stack buffer and corrupts nearby
local variables. This may corrupt the local mmc pointer and
lead to a data abort when accessing mmc->caps2.
Fix this by sizing the local buffer with RT_NAME_MAX.
Signed-off-by: zhangyangysu <zhangyangysu0928@gmail.com>
This change is probably still problematic, because later code also calls rt_sprintf(&dev_name[len], "-timer");
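The reviewer's concern can be illustrated with a small sketch: if a suffix such as "-timer" is appended to a name that may already be RT_NAME_MAX characters long, the buffer needs room for the name, the suffix, and the terminating NUL. The code below is an assumption-laden illustration, not the driver's code: RT_NAME_MAX is defined locally as a stand-in for the RT-Thread config value, TIMER_SUFFIX, TIMER_NAME_SIZE and make_timer_name() are hypothetical names, and standard snprintf() is used in place of rt_sprintf() so that the write is bounded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the RT-Thread config value; 64 mirrors the reproduction config. */
#define RT_NAME_MAX 64

/* Suffix quoted in the review comment. */
#define TIMER_SUFFIX "-timer"

/* sizeof(TIMER_SUFFIX) includes the terminating NUL, so this size holds a
 * full-length name, the suffix, and the terminator. Hypothetical macro. */
#define TIMER_NAME_SIZE (RT_NAME_MAX + sizeof(TIMER_SUFFIX))

/* Hypothetical helper: writes "<name>-timer" into out.
 * Returns 0 on success, -1 if the result would not fit in out_size bytes. */
static int make_timer_name(const char *name, char *out, size_t out_size)
{
    assert(name != NULL && out != NULL);

    /* snprintf never writes more than out_size bytes and returns the
     * length the full string would have had. */
    int n = snprintf(out, out_size, "%s%s", name, TIMER_SUFFIX);

    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With a buffer of TIMER_NAME_SIZE bytes the helper succeeds even for a name of RT_NAME_MAX characters, while a buffer of only RT_NAME_MAX bytes is reported as too small for such a name instead of being overrun.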
Pull/merge request description (PR description):
[
Tested on qemu-virt64-aarch64.
Reproduce:
Enable SDHCI, set RT_NAME_MAX to 64, build, and run qemu.py. The kernel then crashes with the following log:
\ | /
/ | \ 5.3.0 build Jul 1 2026 11:09:46
2006 - 2024 Copyright by RT-Thread team
[I/rtdm.pci] Bus I/O region(0):
[I/rtdm.pci] cpu: [0x000000003eff0000, 0x000000003effffff]
[I/rtdm.pci] physical: [0x0000000000000000, 0x000000000000ffff]
[I/rtdm.pci] Bus Memory region(1):
[I/rtdm.pci] cpu: [0x0000000010000000, 0x000000003efeffff]
[I/rtdm.pci] physical: [0x0000000010000000, 0x000000003efeffff]
[I/rtdm.pci] Bus Memory region(2):
[I/rtdm.pci] cpu: [0x0000008000000000, 0x000000ffffffffff]
[I/rtdm.pci] physical: [0x0000008000000000, 0x000000ffffffffff]
[I/audio.hda] Found codec at address 0
[I/audio.hda] Intel HD Audio v1.0 codec=0 dac=2 pin=3
[I/rtdm.nvme] NVM Express v1.0 (PCI, QEMU NVMe Ctrl, 1.0)
exception info:
esr.EC :0x25
esr.IL :0x01
esr.ISS:0x00000006
epc :0x00000000401176e8
Data abort
fault addr = 0x0000000000000228
abort caused by read instruction
Translation fault, second level
Execption:
X00:0x0000000000000000 X01:0x00000000402298e0 X02:0x00000000402298a4 X03:0x000000000000000a
X04:0x0000000000000000 X05:0x00000000ffffffff X06:0x00000000ffffffff X07:0x0000000000000000
X08:0x000000004020ffc8 X09:0x0000000000000000 X10:0x0000000000000000 X11:0x0000000000000000
X12:0x0000000000000000 X13:0x0000000000000000 X14:0x0000000000000000 X15:0x0000000000000000
X16:0x0000000000000001 X17:0x0000000000000d9e X18:0x0000000000000000 X19:0x000000004013dda0
X20:0x000000004014cba8 X21:0x0000000000000000 X22:0x0000000000000016 X23:0x0000000000000017
X24:0x0000000000000018 X25:0x0000000000000019 X26:0x000000000000001a X27:0x000000000000001b
X28:0x000000000000001c X29:0x0000000040229880 X30:0x00000000401176e0
SP_EL0:0x0000000000000000
SPSR :0x0000000060000005
EPC :0x00000000401176e8
...
please use: addr2line -e rtthread.elf -a -f
0x401176e8 0x40117928 0xfffffffffffffffc
addr2line -e rtthread.elf -a -f 0x401176e8 0x40117928 0xfffffffffffffffc
0x00000000401176e8
rt_sdhci_init_host
/home/yangzhang/code/github/rt-thread/components/drivers/sdio/dev_sdhci.c:3543
0x0000000040117928
rt_sdhci_set_and_add_host
/home/yangzhang/code/github/rt-thread/components/drivers/sdio/dev_sdhci.c:3605
0xfffffffffffffffc
??
??:0
rt_sdhci_init_host() allocated a 32-byte local buffer for the device name:
char dev_name[32];
However, sdio_host_set_name() copies RT_NAME_MAX bytes into the output buffer:
rt_strncpy(out_devname, host->name, RT_NAME_MAX);
When RT_NAME_MAX is configured larger than 32 (for example 64), the copy overruns
the stack buffer and corrupts nearby local variables. This may corrupt the local mmc pointer
and lead to a data abort when accessing mmc->caps2.
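The size mismatch described above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the driver's code: RT_NAME_MAX is defined locally as a stand-in for the config value, standard strncpy() stands in for rt_strncpy() (assumed here to write the full requested length, as the description says), and bytes_past_end(), set_name_sketch() and name_roundtrip_ok() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the RT-Thread config value; 64 mirrors the reproduction config. */
#define RT_NAME_MAX 64

/* Hypothetical helper: number of bytes written past the end of a dst_size-byte
 * buffer by a strncpy-style copy of copy_len bytes. strncpy always writes
 * exactly copy_len bytes, padding with '\0' after a shorter source string. */
static size_t bytes_past_end(size_t dst_size, size_t copy_len)
{
    return copy_len > dst_size ? copy_len - dst_size : 0;
}

/* Hypothetical stand-in for sdio_host_set_name(): always writes RT_NAME_MAX
 * bytes, so the caller's buffer must be at least that large. */
static void set_name_sketch(const char *host_name, char *out_devname)
{
    assert(host_name != NULL && out_devname != NULL);
    strncpy(out_devname, host_name, RT_NAME_MAX);
}

/* Caller sized with RT_NAME_MAX, as in the proposed fix: the copy stays
 * inside the buffer regardless of how RT_NAME_MAX is configured. */
static int name_roundtrip_ok(void)
{
    char dev_name[RT_NAME_MAX];

    set_name_sketch("sd0", dev_name);
    return strcmp(dev_name, "sd0") == 0;
}
```

Under these assumptions, bytes_past_end(32, RT_NAME_MAX) evaluates to 32 when RT_NAME_MAX is 64, which is the amount of stack the original 32-byte buffer would let the copy overwrite, and bytes_past_end(RT_NAME_MAX, RT_NAME_MAX) is 0 for the resized buffer.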
Why submit this PR
The kernel crashes when SDHCI is enabled and RT_NAME_MAX is configured larger than 32, so eMMC/SD cards cannot be used.
What is your solution
Fix this by sizing the local buffer dev_name[] with RT_NAME_MAX.
Provide the BSP and config used for verification
BSP:
bsp/qemu-virt64-aarch64
.config:
Just enable SDHCI in menuconfig and change RT_NAME_MAX.
CONFIG_RT_NAME_MAX=64
CONFIG_RT_USING_SDIO=y
CONFIG_RT_SDIO_STACK_SIZE=8192
CONFIG_RT_SDIO_THREAD_PRIORITY=15
CONFIG_RT_MMCSD_STACK_SIZE=8192
CONFIG_RT_MMCSD_THREAD_PRIORITY=22
CONFIG_RT_MMCSD_MAX_PARTITION=16
CONFIG_RT_USING_SDHCI=y
CONFIG_RT_SDIO_SDHCI_PCI=y
action:
N/A (build verified locally on bsp/qemu-virt64-aarch64)
]
Intent for your PR
Choose one (Mandatory):
Code Quality:
As part of this pull request, I've considered the following:
No #if 0 code or commented-out code is included; all redundant code is removed and cleaned up