← back

Greedy

Task / CPU Scheduling

Use a max-heap of task frequencies with a cooldown window. Always execute the most frequent available task. Fill idle slots when no task is available.

The Problem: Scheduling with Cooldowns

Imagine you are a CPU scheduler. You have a list of tasks represented by characters, say ["A","A","A","B","B","B"], and a cooldown constraint n. Between any two executions of the same task, at least n intervals must pass. You can insert idle intervals if needed. The goal: find the minimum number of intervals required to execute all tasks.

This is LeetCode 621, Task Scheduler, and it is one of the best problems for building intuition about greedy scheduling. The key insight is deceptively simple: the most frequent task dictates everything.

Building Intuition: Frames Around the Dominant Task

Let's work through the example. We have tasks ["A","A","A","B","B","B"] with n = 2. Task A appears 3 times and task B appears 3 times.

Start by focusing only on A, the most frequent task (tied with B, but pick either). With a cooldown of 2 between consecutive A's, the earliest possible schedule for A alone looks like this:

1
A _ _ A _ _ A

Each A is followed by n = 2 slots before the next A can run. Think of this as (max_freq - 1) frames, each of width (n + 1). We have 2 frames of width 3, plus the final A at the end.

Now fill in B's. B also has frequency 3, so we slot one B into each frame and one at the tail:

1
A B _ A B _ A B

That's 8 intervals total. No idle time was needed because we had enough distinct tasks to fill the gaps.

This visual reasoning leads directly to a formula.

The Formula

Count the frequency of every task. Let max_freq be the highest frequency, and let max_count be the number of tasks that share that highest frequency. The minimum intervals are:

1
(max_freq - 1) * (n + 1) + max_count

Why does this work? The expression (max_freq - 1) * (n + 1) accounts for the first max_freq - 1 frames, each containing one execution of the dominant task plus n cooldown slots. The + max_count tacks on the final execution of every task that appears max_freq times.

In our example: max_freq = 3, max_count = 2 (both A and B appear 3 times), n = 2.

1
(3 - 1) * (2 + 1) + 2 = 2 * 3 + 2 = 8

That matches the 8 intervals we found by hand.

Why We Need max(formula, len(tasks))

The formula can undercount when there are many distinct tasks. Consider tasks ["A","A","A","B","B","B","C","C","C","D","D","E","E","F","G"] with n = 2. Here max_freq = 3, max_count = 3, so the formula gives (3-1)*(2+1) + 3 = 9. But we have 15 tasks! You cannot execute 15 tasks in 9 intervals.

When you have enough variety to fill every gap and then some, no idle time is needed at all. You just run tasks back to back. The answer is simply the total number of tasks. So the final answer is:

1
max((max_freq - 1) * (n + 1) + max_count, len(tasks))

The first term wins when idle slots are inevitable; the second term wins when the schedule is packed with distinct work.

An Example with Zero Idle Time

Tasks: ["A","A","B","B","C","C","D","D"], n = 1.

Frequencies: A=2, B=2, C=2, D=2. So max_freq = 2, max_count = 4.

Formula: (2-1) * (1+1) + 4 = 2 + 4 = 6. But len(tasks) = 8. The answer is max(6, 8) = 8.

One valid schedule: A B C D A B C D. Every cooldown is satisfied, and there are zero idle slots. The sheer number of distinct tasks eliminated any need to wait.

Code: The Formula Approach

1
2
3
4
5
6
7
from collections import Counter

def leastInterval(tasks, n):
    freq = Counter(tasks)
    max_freq = max(freq.values())
    max_count = sum(1 for f in freq.values() if f == max_freq)
    return max(len(tasks), (max_freq - 1) * (n + 1) + max_count)

This runs in O(n)O(n) time and O(k)O(k) space where k is the number of unique tasks (at most 26 for uppercase letters). It is the optimal approach when you only need the count of intervals.

When You Need the Actual Schedule: The Heap Approach

Sometimes the problem asks not just how many intervals, but what order the tasks run in. The formula cannot tell you that. For this, we simulate the schedule with a max-heap.

How the Heap Simulation Works

The strategy is greedy: at each time step, pick the task with the highest remaining frequency that is not on cooldown. This is optimal because scheduling the most frequent task as early as possible minimizes future idle slots.

The data structures:

  • A max-heap holding (-remaining_count, task_id) for every task still needing execution.
  • A cooldown queue (a regular FIFO queue) holding (-remaining_count, available_at_time) for tasks that were just executed and are waiting for their cooldown to expire.

At each time step:

  1. Check the cooldown queue. If the front entry's available_at_time equals the current time, pop it and push it back onto the heap.
  2. If the heap is non-empty, pop the most frequent task, execute it (decrement its count), and if it still has remaining executions, add it to the cooldown queue with available_at_time = current_time + n + 1.
  3. If the heap is empty and the cooldown queue is non-empty, this is an idle interval.
  4. Increment time.

Code: Heap Simulation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
from collections import Counter, deque
import heapq

def leastInterval(tasks, n):
    freq = Counter(tasks)
    heap = [-f for f in freq.values()]
    heapq.heapify(heap)

    cooldown = deque()  # entries: (neg_count, available_at)
    time = 0

    while heap or cooldown:
        time += 1

        # Return tasks from cooldown when ready
        if cooldown and cooldown[0][1] == time:
            heapq.heappush(heap, cooldown.popleft()[0])

        if heap:
            neg_count = heapq.heappop(heap)
            remaining = neg_count + 1  # executed one (remember counts are negative)
            if remaining < 0:  # still has more executions
                cooldown.append((remaining, time + n + 1))
        # else: idle interval

    return time

Let's trace through ["A","A","A","B","B","B"] with n = 2:

TimeHeapCooldownAction
1[-3, -3][]Pop A (freq 3), execute. Cooldown: (A, -2, t=4)
2[-3][(A,-2,4)]Pop B (freq 3), execute. Cooldown: (B,-2,4), wait...
3[][(A,-2,4),(B,-2,5)]Heap empty, idle
4[-2][(B,-2,5)]A returns. Pop A, execute.
5[-2][(A,-1,7)]B returns. Pop B, execute.
6[][(A,-1,7),(B,-1,8)]Idle
7[-1][(B,-1,8)]A returns. Pop A, execute. Done with A.
8[-1][]B returns. Pop B, execute. Done with B.

Total: 8 intervals. Same answer as the formula, but now we know the order: A B idle A B idle A B.

The heap approach runs in O(nlogk)O(n \log k) time, where n is the total number of intervals (including idle) and k is the number of unique tasks. Each heap operation is O(logk)O(\log k), and we do at most one per time step.

Variation: Reorganize String

A closely related problem is LeetCode 767, Reorganize String, which applies the same core idea. Given a string like "aab", rearrange it so that no two adjacent characters are the same. Return "" if impossible.

This is task scheduling with n = 1: every character must have at least one different character between consecutive occurrences. The feasibility check is: the most frequent character can appear at most ceil(len(s) / 2) times. If it exceeds that, there is no valid arrangement.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
from collections import Counter
import heapq

def reorganizeString(s):
    freq = Counter(s)
    max_freq = max(freq.values())

    if max_freq > (len(s) + 1) // 2:
        return ""

    heap = [(-count, char) for char, count in freq.items()]
    heapq.heapify(heap)

    result = []
    prev_count, prev_char = 0, ""

    while heap:
        neg_count, char = heapq.heappop(heap)
        # Re-add the previously used character if it still has remaining count
        if prev_count < 0:
            heapq.heappush(heap, (prev_count, prev_char))

        result.append(char)
        prev_count = neg_count + 1  # used one occurrence
        prev_char = char

    return "".join(result)

The trick here is holding the previously-used character aside for one round so it cannot be picked again immediately. This is equivalent to a cooldown queue of size 1.

Why "Most Frequent Dominates": The Greedy Argument

Why is it always correct to greedily pick the most frequent available task? Consider the alternative: suppose at some point you pick a less frequent task when a more frequent one is available. The more frequent task still needs all its executions, and delaying it only pushes those executions into a tighter window later. This can only increase idle time, never decrease it.

More formally, the most frequent task creates the "frame" that everything else must fit into. If you have a task with frequency f, you need at least (f - 1) * n cooldown slots dedicated to it. Executing it as early as possible at each opportunity gives other tasks the maximum room to fill those cooldown slots.

This greedy choice, always picking the highest-frequency available task, is what the max-heap naturally implements. And the formula is the closed-form consequence of that same logic: the dominant frequency determines the skeleton, and everything else either fills the gaps or overflows them.

Complexity Analysis

Formula approach:

  • Time: O(n)O(n): one pass to count frequencies, one pass to find the max.
  • Space: O(k)O(k): the frequency map, where k is the number of unique tasks (at most 26).

Heap approach:

  • Time: O(nlogk)O(n \log k): each of the n total intervals involves at most one heap push and one heap pop, each O(logk)O(\log k).
  • Space: O(k)O(k): the heap and cooldown queue together hold at most k entries.

For most interview problems, the formula approach is preferred because it is simpler and faster. Use the heap when you need the actual task ordering or when the problem has additional constraints (like task dependencies) that the formula cannot handle.

The formula is the fast path; the heap is the general-purpose engine. Both rest on the same foundation: the most frequent task defines the structure, and everything else fills in around it.

Recognizing this pattern

You're probably looking at task CPU scheduling when:

  • You're given a multiset of items (tasks, characters, jobs) and asked to arrange or count time slots under a spacing constraint between identical items.
  • The constraint is "no two identical items within nn steps" or "adjacent items must differ", a cooldown gap rather than arbitrary precedence.
  • The answer is a count (min total intervals, min length) or a valid arrangement, not a complete schedule with timestamps.
  • The most frequent element clearly dominates the bound, so duplicating one more of it would push the answer up by exactly n+1n+1.
  • O((n+1)maxFreq)O((n+1) \cdot maxFreq) feels like the right shape for the bound, hinting at a closed-form answer.

Common templates:

  • Cooldown formula ((maxFreq1)(n+1)+tieCount)((maxFreq - 1) \cdot (n + 1) + tieCount): closed-form when you only need the total length. Example: Task Scheduler.
  • Max-heap with cooldown queue: simulate by popping the highest-frequency available task, parking it for nn steps. Example: Task Scheduler II.
  • Reorganize so no two adjacent are equal: special case where n=1n = 1, and the same greedy/heap construction reveals feasibility. Example: Reorganize String.
  • Distance-k rearrangement: generalized cooldown with k>1k > 1, requiring the heap simulation for the actual arrangement. Example: Rearrange String k Distance Apart.
  • Build the longest valid prefix / leftover counting: pair the most-frequent with everything else and check if it fits. Example: Longest Happy String.