blk-cgroup: fix races and deadlocks #540

blktests-ci[bot] wants to merge 7 commits into for-next_base
Conversation
… blkcg_mutex

blkg_destroy_all() iterates q->blkg_list without holding blkcg_mutex, which can race with blkg_free_workfn() that removes blkgs from the list while holding blkcg_mutex. Add blkcg_mutex protection around the q->blkg_list iteration to prevent potential list corruption or use-after-free issues.

Signed-off-by: Yu Kuai <[email protected]>
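A minimal sketch of the intended shape (not the exact upstream diff; it assumes the q->blkcg_mutex field introduced by f1c006f and the current blkg_destroy_all() structure, with its batching logic elided):

static void blkg_destroy_all(struct gendisk *disk)
{
	struct request_queue *q = disk->queue;
	struct blkcg_gq *blkg, *n;

	/*
	 * Serialize against blkg_free_workfn(), which deletes blkgs
	 * from q->blkg_list under the same mutex.
	 */
	mutex_lock(&q->blkcg_mutex);
	spin_lock_irq(&q->queue_lock);
	list_for_each_entry_safe(blkg, n, &q->blkg_list, q_node) {
		struct blkcg *blkcg = blkg->blkcg;

		spin_lock(&blkcg->lock);
		blkg_destroy(blkg);
		spin_unlock(&blkcg->lock);
	}
	q->root_blkg = NULL;
	spin_unlock_irq(&q->queue_lock);
	mutex_unlock(&q->blkcg_mutex);
}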
…mutex

bfq_end_wr_async() iterates q->blkg_list while only holding bfqd->lock, but not blkcg_mutex. This can race with blkg_free_workfn() that removes blkgs from the list while holding blkcg_mutex. Add blkcg_mutex protection in bfq_end_wr() before taking bfqd->lock to ensure proper synchronization when iterating q->blkg_list.

Signed-off-by: Yu Kuai <[email protected]>
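A hedged sketch of the resulting lock order in bfq_end_wr() (simplified; current BFQ iterates a per-actuator array of active lists, which is collapsed here):

static void bfq_end_wr(struct bfq_data *bfqd)
{
	struct bfq_queue *bfqq;

	/*
	 * Take blkcg_mutex before bfqd->lock so the q->blkg_list walk
	 * in bfq_end_wr_async() cannot race with blkg_free_workfn().
	 */
	mutex_lock(&bfqd->queue->blkcg_mutex);
	spin_lock_irq(&bfqd->lock);

	list_for_each_entry(bfqq, &bfqd->active_list, bfqq_list)
		bfq_bfqq_end_wr(bfqq);
	list_for_each_entry(bfqq, &bfqd->idle_list, bfqq_list)
		bfq_bfqq_end_wr(bfqq);
	bfq_end_wr_async(bfqd);		/* iterates q->blkg_list */

	spin_unlock_irq(&bfqd->lock);
	mutex_unlock(&bfqd->queue->blkcg_mutex);
}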
When switching an IO scheduler on a block device, blkcg_activate_policy() allocates blkg_policy_data (pd) for all blkgs attached to the queue. However, blkcg_activate_policy() may race with concurrent blkcg deletion, leading to use-after-free and memory leak issues.

The use-after-free occurs in the following race:

T1 (blkcg_activate_policy):
- Successfully allocates pd for blkg1 (loop0->queue, blkcgA)
- Fails to allocate pd for blkg2 (loop0->queue, blkcgB)
- Enters the enomem rollback path to release blkg1 resources

T2 (blkcg deletion):
- blkcgA is deleted concurrently
- blkg1 is freed via blkg_free_workfn()
- blkg1->pd is freed

T1 (continued):
- Rollback path accesses blkg1->pd->online after pd is freed
- Triggers use-after-free

In addition, blkg_free_workfn() frees pd before removing the blkg from q->blkg_list. This allows blkcg_activate_policy() to allocate a new pd for a blkg that is being destroyed, leaving the newly allocated pd unreachable when the blkg is finally freed.

Fix these races by extending blkcg_mutex coverage to serialize blkcg_activate_policy() rollback and blkg destruction, ensuring pd lifecycle is synchronized with blkg list visibility.

Link: https://lore.kernel.org/all/[email protected]/
Fixes: f1c006f ("blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
Signed-off-by: Zheng Qixing <[email protected]>
Signed-off-by: Yu Kuai <[email protected]>
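A sketch of what the enomem rollback looks like with the extended mutex coverage (assumed shape, modeled on the existing rollback loop in blkcg_activate_policy(); not the literal diff):

enomem:
	/*
	 * Take down everything allocated so far. Hold q->blkcg_mutex so
	 * a concurrent blkg_free_workfn() cannot free a pd that the
	 * rollback still dereferences (e.g. pd->online below).
	 */
	mutex_lock(&q->blkcg_mutex);
	spin_lock_irq(&q->queue_lock);
	list_for_each_entry(blkg, &q->blkg_list, q_node) {
		struct blkcg *blkcg = blkg->blkcg;
		struct blkg_policy_data *pd;

		spin_lock(&blkcg->lock);
		pd = blkg->pd[pol->plid];
		if (pd) {
			if (pd->online && pol->pd_offline_fn)
				pol->pd_offline_fn(pd);
			pd->online = false;
			pol->pd_free_fn(pd);
			blkg->pd[pol->plid] = NULL;
		}
		spin_unlock(&blkcg->lock);
	}
	spin_unlock_irq(&q->queue_lock);
	mutex_unlock(&q->blkcg_mutex);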
When switching IO schedulers on a block device, blkcg_activate_policy()
can race with concurrent blkcg deletion, leading to a use-after-free in
rcu_accelerate_cbs.
T1:                                     T2:
                                        blkg_destroy
                                         kill(&blkg->refcnt)
                                         // blkg->refcnt = 1 -> 0
                                         blkg_release
                                         // call_rcu(__blkg_release)
                                        ...
                                        blkg_free_workfn
                                         ->pd_free_fn(pd)
elv_iosched_store
 elevator_switch
 ...
  iterate blkg list
   blkg_get(blkg)
   // blkg->refcnt = 0 -> 1
                                         list_del_init(&blkg->q_node)
   blkg_put(pinned_blkg)
   // blkg->refcnt = 1 -> 0
    blkg_release
    // call_rcu again
     rcu_accelerate_cbs // uaf
Fix this by checking hlist_unhashed(&blkg->blkcg_node) before getting
a reference to the blkg. This is the same check used in blkg_destroy()
to detect if a blkg has already been destroyed. If the blkg is already
unhashed, skip processing it since it's being destroyed.
Link: https://lore.kernel.org/all/[email protected]/
Fixes: f1c006f ("blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
Signed-off-by: Zheng Qixing <[email protected]>
Signed-off-by: Yu Kuai <[email protected]>
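A sketch of the check in the blkcg_activate_policy() iteration (the placement is illustrative and the surrounding pinned_blkg handling is abbreviated):

	list_for_each_entry_reverse(blkg, &q->blkg_list, q_node) {
		/*
		 * blkg_destroy() unhashes blkg->blkcg_node first; a blkg
		 * that is already unhashed is on its way to being freed,
		 * so reviving its refcount here would end up issuing a
		 * second call_rcu() on the same rcu_head.
		 */
		if (hlist_unhashed(&blkg->blkcg_node))
			continue;

		if (blkg->pd[pol->plid])
			continue;

		/* ... allocate pd; on GFP_NOWAIT failure, blkg_get(blkg)
		 * and retry with GFP_KERNEL ... */
	}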
Move the teardown sequence which offlines and frees per-policy blkg_policy_data (pd) into a helper for readability. No functional change intended.

Signed-off-by: Zheng Qixing <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Yu Kuai <[email protected]>
Signed-off-by: Yu Kuai <[email protected]>
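The helper might look like this (name and exact signature are illustrative, not taken from the patch):

/* offline and free one policy's pd on @blkg, if it exists */
static void blkg_free_policy_data(struct blkcg_gq *blkg,
				  const struct blkcg_policy *pol)
{
	struct blkg_policy_data *pd = blkg->pd[pol->plid];

	if (!pd)
		return;
	if (pd->online && pol->pd_offline_fn)
		pol->pd_offline_fn(pd);
	pd->online = false;
	pol->pd_free_fn(pd);
	blkg->pd[pol->plid] = NULL;
}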
…cy()

Some policies like iocost and iolatency perform percpu allocation in pd_alloc_fn(). Percpu allocation with the queue frozen can deadlock because percpu memory reclaim may issue IO.

Now that q->blkg_list is protected by blkcg_mutex, restructure blkcg_activate_policy() to allocate all pds before freezing the queue:

1. Allocate all pds with GFP_KERNEL before freezing the queue
2. Freeze the queue
3. Initialize and online all pds

Note: future work is to remove all queue freezing before blkcg_activate_policy() to fix the deadlocks thoroughly.

Signed-off-by: Yu Kuai <[email protected]>
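In outline, the restructured flow could look like this (a sketch under the assumption that blk_mq_freeze_queue() returns memalloc flags as in recent kernels; error handling and the enomem path are elided):

	mutex_lock(&q->blkcg_mutex);

	/* 1) allocate every pd with GFP_KERNEL, queue not frozen yet */
	list_for_each_entry(blkg, &q->blkg_list, q_node) {
		pd = pol->pd_alloc_fn(disk, blkg->blkcg, GFP_KERNEL);
		if (!pd)
			goto enomem;
		blkg->pd[pol->plid] = pd;
	}

	/* 2) freeze only after all allocations have succeeded */
	memflags = blk_mq_freeze_queue(q);

	/* 3) initialize and online the pds */
	list_for_each_entry(blkg, &q->blkg_list, q_node) {
		if (pol->pd_init_fn)
			pol->pd_init_fn(blkg->pd[pol->plid]);
		if (pol->pd_online_fn)
			pol->pd_online_fn(blkg->pd[pol->plid]);
		blkg->pd[pol->plid]->online = true;
	}

	__set_bit(pol->plid, q->blkcg_pols);
	blk_mq_unfreeze_queue(q, memflags);
	mutex_unlock(&q->blkcg_mutex);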
The current rq_qos_mutex handling has an awkward pattern where callers must acquire the mutex before calling rq_qos_add()/rq_qos_del(), and blkg_conf_open_bdev_frozen() had to release and re-acquire the mutex around queue freezing to maintain proper locking order (freeze queue before mutex).

On the other hand, with rq_qos_mutex held after blkg_conf_prep(), there are many possible deadlocks:

- allocating memory with GFP_KERNEL, like blk_throtl_init();
- allocating percpu memory, like pd_alloc_fn() for iocost/iolatency;

This patch refactors the locking by:

1. Moving queue freeze and rq_qos_mutex acquisition inside rq_qos_add()/rq_qos_del(), with the correct order: freeze first, then acquire mutex.
2. Removing external mutex handling from wbt_init() since rq_qos_add() now handles it internally.
3. Removing rq_qos_mutex handling from blkg_conf_open_bdev() entirely, making it only responsible for parsing MAJ:MIN and opening the bdev.
4. Removing blkg_conf_open_bdev_frozen() and blkg_conf_exit_frozen() functions which are no longer needed.
5. Updating ioc_qos_write() to use the simpler blkg_conf_open_bdev() and blkg_conf_exit() functions.

This eliminates the release-and-reacquire pattern and makes rq_qos_add()/rq_qos_del() self-contained, which is cleaner and reduces complexity. Each function now properly manages its own locking with the correct order: queue freeze → mutex acquire → modify → mutex release → queue unfreeze.

Signed-off-by: Yu Kuai <[email protected]>
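A sketch of the self-contained ordering in rq_qos_add() (simplified; debugfs registration and some details of the real function are abbreviated):

int rq_qos_add(struct rq_qos *rqos, struct gendisk *disk, enum rq_qos_id id,
	       const struct rq_qos_ops *ops)
{
	struct request_queue *q = disk->queue;
	unsigned int memflags;
	int ret = 0;

	rqos->disk = disk;
	rqos->id = id;
	rqos->ops = ops;

	/* correct order: freeze the queue first, then take the mutex */
	memflags = blk_mq_freeze_queue(q);
	mutex_lock(&q->rq_qos_mutex);

	if (rq_qos_id(q, rqos->id)) {
		ret = -EBUSY;
	} else {
		rqos->next = q->rq_qos;
		q->rq_qos = rqos;
	}

	mutex_unlock(&q->rq_qos_mutex);
	blk_mq_unfreeze_queue(q, memflags);
	return ret;
}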
Pull request for series with
subject: blk-cgroup: fix races and deadlocks
version: 2
url: https://patchwork.kernel.org/project/linux-block/list/?series=1050146