sched/fair: Do not special case tasks in throttled hierarchy

With the introduction of the task-based throttle model, a task in a
throttled hierarchy is allowed to keep running until it gets throttled
on its ret2user path.
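
For reference, this ret2user throttle can be pictured as a task_work
callback, roughly like the sketch below. The sched_throttle_work field
and the throttle_task_work()/task_throttle_setup_work()/
dequeue_throttled_task() helpers are illustrative names under this
assumption, not necessarily what the series uses; init_task_work(),
task_work_add() and TWA_RESUME are the real kernel task_work API:

#include <linux/sched.h>
#include <linux/task_work.h>

/* Sketch: runs via task_work on the task's next return to user space. */
static void throttle_task_work(struct callback_head *work)
{
	struct task_struct *p =
		container_of(work, struct task_struct, sched_throttle_work);

	/*
	 * At this point the task holds no kernel resources, so it can
	 * safely park itself until its cfs_rq is unthrottled.
	 * dequeue_throttled_task() is a hypothetical helper here.
	 */
	dequeue_throttled_task(p);
}

/* Sketch: called when the task's hierarchy gets throttled. */
static void task_throttle_setup_work(struct task_struct *p)
{
	init_task_work(&p->sched_throttle_work, throttle_task_work);
	/* TWA_RESUME fires the callback on exit to user space. */
	task_work_add(p, &p->sched_throttle_work, TWA_RESUME);
}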

For this reason, remove the throttled_hierarchy() checks from the
following functions so that such tasks get their turn like normal
tasks: dequeue_entities(), check_preempt_wakeup_fair() and
yield_to_task_fair().

The benefit of doing it this way: if such tasks get the chance to run
earlier and hold any kernel resources, they can release those resources
earlier. The downside: if they hold no kernel resources, all they can
do is throttle themselves on their way back to user space, so the favor
of letting them run is of little use, and for
check_preempt_wakeup_fair(), that favor may even be bad for curr.

K Prateek Nayak pointed out that prio_changed_fair() can send a
throttled task to check_preempt_wakeup_fair(), and further tests showed
that the affinity change path through move_queued_task() can also send
a throttled task there; hence the task_is_throttled() check in that
function.
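
As for task_is_throttled(), it can be thought of as a simple membership
test; a sketch under the assumption that tasks throttled on ret2user
are linked on a per-cfs_rq list via a throttle_node field (both the
predicate body and the field name are illustrative):

#include <linux/list.h>
#include <linux/sched.h>

/* Sketch: a task is throttled iff it sits on a throttled-tasks list. */
static inline bool task_is_throttled(struct task_struct *p)
{
	return !list_empty(&p->throttle_node);
}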

Signed-off-by: Aaron Lu <ziqianlu@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/sched/fair.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

@@ -7081,7 +7081,7 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
 			 * Bias pick_next to pick a task from this cfs_rq, as
 			 * p is sleeping when it is within its sched_slice.
 			 */
-			if (task_sleep && se && !throttled_hierarchy(cfs_rq))
+			if (task_sleep && se)
 				set_next_buddy(se);
 			break;
 		}
@@ -8735,7 +8735,7 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
 	 * lead to a throttle). This both saves work and prevents false
 	 * next-buddy nomination below.
 	 */
-	if (unlikely(throttled_hierarchy(cfs_rq_of(pse))))
+	if (task_is_throttled(p))
 		return;
 
 	if (sched_feat(NEXT_BUDDY) && !(wake_flags & WF_FORK) && !pse->sched_delayed) {
@@ -9009,8 +9009,8 @@ static bool yield_to_task_fair(struct rq *rq, struct task_struct *p)
 {
 	struct sched_entity *se = &p->se;
 
-	/* throttled hierarchies are not runnable */
-	if (!se->on_rq || throttled_hierarchy(cfs_rq_of(se)))
+	/* !se->on_rq also covers throttled task */
+	if (!se->on_rq)
 		return false;
 
 	/* Tell the scheduler that we'd really like se to run next. */