Sorting Techniques

Custom Comparator Sort

Define a comparator that determines the final order. For largest number, compare str(a)+str(b) vs str(b)+str(a). Reduces a complex ordering problem to a pairwise comparison.

Custom Comparator Sorting

Python's built-in sorted() and .sort() methods order elements in ascending order by default: integers numerically, strings lexicographically. That works perfectly for most problems, but a surprising number of interview questions require a completely different ordering. The moment you catch yourself thinking "I need to sort these, but not in the normal way," a custom comparator is probably your tool.

The classic trigger problem is Largest Number (LC 179): given a list of integers, arrange them to form the largest possible number. With [3, 30, 34, 5, 9], standard numeric sort gives [3, 5, 9, 30, 34], which concatenates to "3593034", nowhere near optimal. The correct answer is "9534330".

The Key Insight: Compare Concatenations

The ordering rule for largest number is: number a should come before number b if the string a+b is lexicographically greater than b+a. For example, comparing "9" and "34": "934" > "349", so 9 goes first. Comparing "3" and "30": "330" > "303", so 3 goes before 30.

This comparison relation is transitive, meaning it defines a valid total order and can be used as a sort key. The tricky part is that Python 3 removed the old-style cmp argument from sorted(). Instead, you convert a two-argument comparator into a key function using functools.cmp_to_key.

A comparator function takes two arguments a and b and must return:

a negative number if a should come first
a positive number if b should come first
zero if they are equal

Walking Through the Example

Starting with [3, 30, 34, 5, 9], convert everything to strings: ["3", "30", "34", "5", "9"].

The comparator checks a+b versus b+a. A few representative comparisons:

"9" vs "5": "95" > "59", so 9 comes first.
"5" vs "34": "534" > "345", so 5 comes first.
"3" vs "30": "330" > "303", so 3 comes first.
"30" vs "34": "3034" < "3430", so 34 comes first.

After sorting: ["9", "5", "34", "3", "30"]. Concatenated: "9534330". One edge case: if every number is zero, the result is all zeros, so return "0" rather than "000...".

from functools import cmp_to_key

def largest_number(nums: list[int]) -> str:
    strs = list(map(str, nums))

    def compare(a, b):
        if a + b > b + a:
            return -1   # a should come first
        elif a + b < b + a:
            return 1    # b should come first
        return 0

    strs.sort(key=cmp_to_key(compare))
    result = "".join(strs)
    return "0" if result[0] == "0" else result

The lambda version you'll often see in competitive code condenses this to a one-liner: key=cmp_to_key(lambda a, b: (1 if a+b < b+a else -1 if a+b > b+a else 0)). Both are correct; the explicit function is easier to read under pressure.

Multi-Key Sorting with Tuples

When you need to sort by multiple criteria, say by priority descending and then by name ascending, Python's tuple comparison gives you multi-key sorting for free. Because tuples are compared element by element, returning a tuple from a key function lets you express mixed ascending/descending orderings cleanly.

1

2

3

4

5

tasks = [("task_a", 2), ("task_b", 1), ("task_c", 2)]

# Sort by priority descending, then by name ascending
tasks.sort(key=lambda t: (-t[1], t[0]))
# Result: [("task_a", 2), ("task_c", 2), ("task_b", 1)]

Negating a numeric field flips its sort direction without needing a custom comparator at all. This tuple-key approach is almost always preferable to cmp_to_key when a simple key expression can express the ordering.

Interval Sorting for Greedy Problems

Interval scheduling problems often live or die on how you sort before applying the greedy pass. The two most common orderings are:

Sort by end time (earliest deadline first) is used in the classic activity selection / non-overlapping intervals problem. Sorting intervals by their end time lets the greedy algorithm always pick the interval that leaves the most room for future choices.

Sort by start time is used when merging overlapping intervals. After sorting by start, a single left-to-right sweep can merge intervals because any overlap must involve the previous interval's end and the current interval's start.

1

2

3

4

5

# Sort intervals by end time for greedy scheduling
intervals.sort(key=lambda x: x[1])

# Sort intervals by start time for merging
intervals.sort(key=lambda x: x[0])

These are so common that recognizing which ordering a greedy interval problem needs is itself a key skill.

Queue Reconstruction by Height

LeetCode 406. Queue Reconstruction by Height is a beautiful example of how sorting order determines algorithm correctness. You are given a list of people described by [height, k], where k is the number of people in front who are taller or equal in height. Reconstruct the queue.

The insight: sort people by height descending, then by k ascending. Then insert each person into the result list at index k. Because taller people are placed first, when you insert a shorter person at position k, all previously placed (taller) people are already in the list and will correctly be "in front" of them. Shorter people inserted later never affect the k-value of taller people already placed.

1

2

3

4

5

6

7

def reconstructQueue(people):
    # Sort: tallest first; among same height, smaller k first
    people.sort(key=lambda p: (-p[0], p[1]))
    queue = []
    for person in people:
        queue.insert(person[1], person)
    return queue

Walkthrough: [[7,0], [4,4], [7,1], [5,0], [6,1], [5,2]]

After sorting by (-height, k): [[7,0], [7,1], [6,1], [5,0], [5,2], [4,4]]

Insert step by step:

Insert [7,0] at index 0: [[7,0]]
Insert [7,1] at index 1: [[7,0], [7,1]]
Insert [6,1] at index 1: [[7,0], [6,1], [7,1]]
Insert [5,0] at index 0: [[5,0], [7,0], [6,1], [7,1]]
Insert [5,2] at index 2: [[5,0], [7,0], [5,2], [6,1], [7,1]]
Insert [4,4] at index 4: [[5,0], [7,0], [5,2], [6,1], [4,4], [7,1]]

Verify the final queue [[5,0], [7,0], [5,2], [6,1], [4,4], [7,1]] by counting people in front who are taller or equal in height. [5,0]: nobody in front. [7,0]: nobody in front. [5,2]: [5,0] (equal) and [7,0] (taller), that's 2. [6,1]: [7,0] (taller), that's 1. [4,4]: [5,0], [7,0], [5,2], [6,1] are all at least 4, that's 4. [7,1]: [7,0] (equal), that's 1. All k-values check out.

The sorting order is the entire algorithm. Without the right comparator, the greedy insertion strategy breaks down completely.

Complexity

Both sorted() and .sort() use Timsort, so the time complexity is $O(n \log n)$ regardless of how the comparator is defined. Each comparison call is $O(1)$ for numeric keys and $O(k)$ for string concatenation comparisons, where k is the average number of digits. Space complexity is $O(n)$ for the auxiliary array produced by sorted(), or $O(\log n)$ for the in-place .sort() (stack space for Timsort's merge passes).

When the comparison itself is expensive, for instance comparing long strings, the total cost becomes $O(n \log n · k)$ . In most interview problems, k is small enough that this doesn't matter, but it's worth noting if strings can be arbitrarily long.

Recognizing this pattern

You're probably looking at custom comparator sort when:

The natural sort order does not produce the desired result, so you need a non-obvious key combining two values.
The greedy strategy depends entirely on the ordering, like queue reconstruction or schedule assembly.
The comparison rule is asymmetric or non-numeric, such as concatenation order between strings.
Phrasing mentions "reconstruct the queue", "largest number you can form", or "order by custom rule".

Common templates:

Pairwise concatenation rule: sort by a + b vs b + a comparison. Example: Largest Number.
Tall-first insertion: sort by (-height, k) then insert at index k. Example: Queue Reconstruction by Height.
Sort by deadline or end time: greedy scheduling by interval finish or due date. Example: Course Schedule III.
Stable secondary key: sort by primary key, break ties with a meaningful secondary like input order. Example: Sort the People.