← back

Hash Map / Set

Frequency Counting

Build a frequency map (Counter) to count occurrences of each element. Used for anagram detection, character counts, majority element, finding duplicates, and grouping by frequency.

You are given a list of strings and asked to group together all words that are anagrams of each other. "eat", "tea", and "ate" go in one group; "tan" and "nat" in another; "bat" by itself. This is LeetCode 49, Group Anagrams, and it is a perfect entry point into one of the most broadly useful patterns in algorithm problems: frequency counting.

The grouping problem and canonical forms

The challenge with grouping anagrams is defining what makes two words "the same." You cannot compare them directly, since "eat" and "tea" are different strings. But they share a property: they contain exactly the same characters in exactly the same quantities. If you could reduce each word to a canonical form that captures this property, then all anagrams would map to the same key, and you could group them with a hash map.

There are two natural canonical forms for anagrams. The first is the sorted string: sort the characters of each word alphabetically, and "eat", "tea", and "ate" all become "aet". The second is a character count tuple: count the frequency of each letter and use that count vector as the key. For "eat", the count tuple would be (0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0): one 'e', one 'a', one 't', zeros elsewhere.

Sorted strings are simpler to implement. Character count tuples avoid the O(klogk)O(k \log k) sorting cost per word (where kk is the word length) and instead run in O(k)O(k), but they produce longer keys. In practice, since words are typically short, sorted strings are the most common choice.

Code for Group Anagrams

1
2
3
4
5
6
7
8
from collections import defaultdict

def groupAnagrams(strs):
    groups = defaultdict(list)
    for s in strs:
        key = tuple(sorted(s))
        groups[key].append(s)
    return list(groups.values())

For each string, sort its characters to produce a canonical key, then append the original string to the list stored at that key. At the end, return all the lists. The time complexity is O(nklogk)O(n \cdot k \log k) where nn is the number of strings and kk is the maximum string length. Space is O(nk)O(n \cdot k) to store all the strings in the hash map.

The fundamental building block: counting frequencies

At the heart of Group Anagrams is a more primitive operation: counting how often each element appears. Python's collections.Counter makes this trivial, but the concept is language-independent. You iterate through a collection, and for each element, increment its count in a hash map.

1
2
3
4
5
6
from collections import Counter

def count_frequencies(arr):
    freq = Counter(arr)
    # freq is a dict-like object: {element: count, ...}
    return freq

This O(n)O(n) operation is the starting point for a surprising number of problems. Once you have a frequency map, you can answer questions like: What is the most common element? Are two collections permutations of each other? How many elements appear exactly k times? Can you make all frequencies unique by deleting characters?

Top K Frequent Elements

Consider LeetCode 347, Top K Frequent Elements: given an array and an integer k, return the k most frequent elements. For example, given nums = [1,1,1,2,2,3] and k = 2, the answer is [1, 2] because 1 appears three times and 2 appears twice.

The most intuitive approach is to count frequencies and then sort by frequency. That works but costs O(nlogn)O(n \log n). You can do better with a heap: push all (frequency, element) pairs into a min-heap of size k, which gives O(nlogk)O(n \log k). But the slickest approach is bucket sort, which runs in O(n)O(n).

The bucket sort idea is this: the maximum possible frequency of any element is n (the array length). Create an array of n+1 buckets, where bucket i holds all elements that appear exactly i times. Then walk the buckets from right to left (highest frequency first) and collect elements until you have k of them.

Walked example: bucket sort for Top K

Given nums = [1,1,1,2,2,3] and k = 2:

Step 1, count frequencies: {1: 3, 2: 2, 3: 1}.

Step 2, build buckets. The array length is 6, so create 7 buckets (indices 0 through 6). Bucket 3 gets [1] (because 1 appears 3 times). Bucket 2 gets [2]. Bucket 1 gets [3]. All other buckets are empty.

Step 3, collect from right to left. Start at bucket 6 (empty), bucket 5 (empty), bucket 4 (empty), bucket 3 (contains [1]), and add 1 to result. Bucket 2 (contains [2]) adds 2 to result. We now have k = 2 elements: [1, 2]. Done.

Code for Top K Frequent Elements

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
from collections import Counter

def topKFrequent(nums, k):
    freq = Counter(nums)
    buckets = [[] for _ in range(len(nums) + 1)]
    for num, count in freq.items():
        buckets[count].append(num)

    result = []
    for i in range(len(buckets) - 1, -1, -1):
        for num in buckets[i]:
            result.append(num)
            if len(result) == k:
                return result
    return result

This is O(n)O(n) time and O(n)O(n) space. The bucket array has at most n+1 entries, and each element is placed in exactly one bucket.

Valid Anagram

The simplest frequency comparison problem is LeetCode 242, Valid Anagram. Given two strings s and t, determine if t is an anagram of s. The approach is direct: count the character frequencies of both strings and compare the two frequency maps. If they are identical, the strings are anagrams.

1
2
3
4
from collections import Counter

def isAnagram(s, t):
    return Counter(s) == Counter(t)

That is the entire solution. Under the hood, this is O(n)O(n) time where nn is the combined length of the strings, and O(1)O(1) space if the character set is fixed (26 lowercase letters). You could also do this by sorting both strings and comparing, but that costs O(nlogn)O(n \log n), which is unnecessary when frequency counting gives you a linear solution.

The same frequency comparison idea applies to LeetCode 383, Ransom Note. Given a ransom note string and a magazine string, determine if the ransom note can be constructed from the letters in the magazine. This is a one-sided anagram check: every character in the ransom note must appear in the magazine at least as many times.

1
2
3
4
5
6
7
8
9
from collections import Counter

def canConstruct(ransomNote, magazine):
    mag_freq = Counter(magazine)
    for ch in ransomNote:
        if mag_freq[ch] <= 0:
            return False
        mag_freq[ch] -= 1
    return True

A natural extension is LeetCode 1347, Minimum Number of Steps to Make Two Strings Anagram. Given two strings of equal length, count how many characters in t must be replaced so that t becomes an anagram of s. Count frequencies for both, then sum the positive differences where s has more of a character than t.

1
2
3
4
5
6
from collections import Counter

def minSteps(s, t):
    freq_s = Counter(s)
    freq_t = Counter(t)
    return sum(max(0, freq_s[ch] - freq_t[ch]) for ch in freq_s)

Majority Element and Boyer-Moore voting

For LeetCode 169, Majority Element, the task is to find the element that appears more than n/2n/2 times. You could solve it with a frequency map: count everything, then find the element with count greater than n/2n/2. That is O(n)O(n) time and O(n)O(n) space.

But when you only need the winner and not the full frequency distribution, you can do better on space. The Boyer-Moore voting algorithm finds the majority element in O(n)O(n) time and O(1)O(1) space. The idea is to maintain a candidate and a count. Walk through the array: if the count is zero, set the current element as the candidate. If the current element equals the candidate, increment the count; otherwise, decrement it. At the end, the candidate is the majority element (assuming one exists).

The intuition is that the majority element can "survive" being canceled out by all other elements combined, because it appears more than half the time. For a deeper treatment of Boyer-Moore voting, see the dedicated article on that strategy.

Frequency of frequencies

Some problems operate not on the raw frequency map but on the distribution of frequencies. A good illustration is LeetCode 1647, Minimum Deletions to Make Character Frequencies Unique. You are given a string and need to delete the minimum number of characters so that no two distinct characters have the same frequency.

The approach is to build a frequency map, then look at the set of frequency values. If two characters share the same frequency, you must reduce one of them (by deleting characters from the string) until its frequency is unique. A greedy strategy works: sort the frequencies in descending order, and for each frequency, if it is already taken, reduce it to the largest available value below it.

1
2
3
4
5
6
7
8
9
10
11
12
13
from collections import Counter

def minDeletions(s):
    freq = Counter(s)
    counts = sorted(freq.values(), reverse=True)
    deletions = 0
    used = set()
    for count in counts:
        while count > 0 and count in used:
            count -= 1
            deletions += 1
        used.add(count)
    return deletions

This pattern, computing statistics over the frequency map itself, shows up in several problems. Whenever you see a problem that talks about "making frequencies unique" or "characters appearing the same number of times," think about building the frequency map first and then reasoning about the distribution of its values.

A simpler variant is LeetCode 1207, Unique Number of Occurrences, where you determine whether the number of occurrences of each value in an array is unique. Count frequencies, then check if the set of frequency values has the same size as the list of frequency values.

1
2
3
4
5
6
from collections import Counter

def uniqueOccurrences(arr):
    freq = Counter(arr)
    counts = freq.values()
    return len(counts) == len(set(counts))

Even more direct is LeetCode 1941, Check if All Characters Have Equal Number of Occurrences. Count frequencies and verify they are all the same.

1
2
3
4
5
from collections import Counter

def areOccurrencesEqual(s):
    freq = Counter(s)
    return len(set(freq.values())) == 1

First unique and k-th distinct

Some problems ask you to find elements that appear exactly once. The strategy is always the same two-pass approach: count frequencies first, then scan in the original order to find the element you need.

Take LeetCode 387, First Unique Character in a String, which asks for the index of the first character that does not repeat. Count all character frequencies, then walk the string from left to right and return the index of the first character with count 1.

1
2
3
4
5
6
7
8
from collections import Counter

def firstUniqChar(s):
    freq = Counter(s)
    for i, ch in enumerate(s):
        if freq[ch] == 1:
            return i
    return -1

This idea generalizes in LeetCode 2053, Kth Distinct String in an Array. An element is distinct if it appears exactly once. Count frequencies, then walk the array and collect distinct elements in order. Return the k-th one.

1
2
3
4
5
6
7
8
9
10
from collections import Counter

def kthDistinct(arr, k):
    freq = Counter(arr)
    for s in arr:
        if freq[s] == 1:
            k -= 1
            if k == 0:
                return s
    return ""

Reconstruct from frequencies

A more creative use of frequency counting is reconstruction: given a scrambled or encoded input, use character frequencies to deduce what elements must be present. The key insight is that certain characters uniquely identify certain elements, so you can peel them off in the right order.

With LeetCode 423, Reconstruct Original Digits from English, you are given a string containing the English words for digits 0 through 9 in jumbled order. You must recover which digits are present. The trick is that some letters appear in only one digit word: 'z' only appears in "zero", 'w' only in "two", 'u' only in "four", 'x' only in "six", 'g' only in "eight". Count those first, subtract their letters, then handle the remaining digits whose letters are now unique.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
from collections import Counter

def originalDigits(s):
    freq = Counter(s)
    count = [0] * 10
    # Unique-letter digits first
    count[0] = freq['z']
    count[2] = freq['w']
    count[4] = freq['u']
    count[6] = freq['x']
    count[8] = freq['g']
    # Then digits identifiable after subtraction
    count[1] = freq['o'] - count[0] - count[2] - count[4]
    count[3] = freq['h'] - count[8]
    count[5] = freq['f'] - count[4]
    count[7] = freq['s'] - count[6]
    count[9] = freq['i'] - count[5] - count[6] - count[8]
    return ''.join(str(d) * count[d] for d in range(10))

The same idea applies to LeetCode 1419, Minimum Number of Frogs Croaking, but with a sequence of characters. Each frog says "croak" in order, and multiple frogs can overlap. Track the frequency of each stage character (c, r, o, a, k) as a pipeline: when you see a character, the previous stage count decreases and the current stage count increases. The maximum number of frogs active simultaneously is the answer.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
def minNumberOfFrogs(croakOfFrogs):
    stages = {'c': 0, 'r': 1, 'o': 2, 'a': 3, 'k': 4}
    count = [0] * 5
    frogs = 0
    max_frogs = 0
    for ch in croakOfFrogs:
        idx = stages[ch]
        if idx == 0:
            count[0] += 1
            frogs += 1
            max_frogs = max(max_frogs, frogs)
        else:
            if count[idx - 1] == 0:
                return -1
            count[idx - 1] -= 1
            count[idx] += 1
            if idx == 4:
                count[4] -= 1
                frogs -= 1
    return max_frogs if all(c == 0 for c in count) else -1

Sliding window with frequency map

When a problem asks you to find all substrings that are anagrams of a pattern, or to check whether any permutation of a pattern exists as a substring, you combine frequency counting with a sliding window. Maintain a frequency map over a window of size equal to the pattern, and compare it against the pattern's frequency map as you slide.

A classic application is LeetCode 438, Find All Anagrams in a String, which asks for all start indices where an anagram of string p begins in string s. The approach: count the frequencies of p, then slide a window of length len(p) across s. As you add a character on the right and remove one on the left, update a running frequency map. When the running map matches the target, record the index.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
from collections import Counter

def findAnagrams(s, p):
    if len(p) > len(s):
        return []
    p_freq = Counter(p)
    window = Counter(s[:len(p)])
    result = []
    if window == p_freq:
        result.append(0)
    for i in range(len(p), len(s)):
        window[s[i]] += 1
        left = s[i - len(p)]
        window[left] -= 1
        if window[left] == 0:
            del window[left]
        if window == p_freq:
            result.append(i - len(p) + 1)
    return result

Its close relative, LeetCode 567, Permutation in String, is the boolean version of the same problem: return True if any permutation of s1 is a substring of s2. The code is nearly identical. You just return True the first time the window matches instead of collecting all indices.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
from collections import Counter

def checkInclusion(s1, s2):
    if len(s1) > len(s2):
        return False
    target = Counter(s1)
    window = Counter(s2[:len(s1)])
    if window == target:
        return True
    for i in range(len(s1), len(s2)):
        window[s2[i]] += 1
        left = s2[i - len(s1)]
        window[left] -= 1
        if window[left] == 0:
            del window[left]
        if window == target:
            return True
    return False

An optimization for both problems is to track a "matches" counter, the number of characters whose window count equals the target count, instead of comparing full dictionaries each time. This brings the per-step work from O(26)O(26) to O(1)O(1).

Complexity analysis

Frequency counting with a hash map is always O(n)O(n) time and O(n)O(n) space, where nn is the number of elements. The space can be considered O(1)O(1) when the domain of elements is fixed and small (like 26 lowercase letters).

For anagram grouping, the time depends on how you compute canonical keys. Sorted keys give O(nklogk)O(n \cdot k \log k), and character count keys give O(nk)O(n \cdot k), where kk is the maximum element length. For Top K Frequent with bucket sort, the time is O(n)O(n). For problems that involve sorting or heaping the frequency values, the time is O(n+dlogd)O(n + d \log d) where dd is the number of distinct elements.

The versatility of frequency counting makes it one of the first tools to reach for when a problem involves counting, grouping, or comparing elements. Build the frequency map, and then ask: what question am I trying to answer about these frequencies?

Recognizing this pattern

You're probably looking at frequency counting when:

  • The problem mentions "anagram," "permutation of," "rearrangement," or "same characters." These are all questions about multisets, which frequency maps describe exactly.
  • You need to group elements by an equivalence relation (anagrams, same shape, same multiset), and count vectors or sorted keys give you the canonical form.
  • The question is "most frequent," "kth most frequent," "first non-repeating," or "appears exactly once."
  • The element domain is small and fixed (26 lowercase letters, digits, byte values), making counting effectively O(1)O(1) space.
  • A naive solution sorts the whole input just to compare or group, and frequency counts replace the sort with a single linear pass.

Common templates:

  • Canonical form group-by: map each element to a sorted/count-vector key and bucket into a hash map. Example: Group Anagrams.
  • Frequency equality check: build counters for two collections and compare directly. Example: Valid Anagram.
  • Bucket sort by count: index elements by their frequency and walk buckets from high to low. Example: Top K Frequent Elements.
  • Two-pass scan with frequencies: count first, then re-scan in original order to find the first/kth element matching a frequency condition. Example: First Unique Character in a String.
  • Sliding window with frequency map: maintain counts over a moving window and compare against a target counter. Example: Find All Anagrams in a String.

Practice Problems (173)

49 Group Anagrams
242 Valid Anagram
299 Bulls and Cows
347 Top K Frequent Elements
383 Ransom Note
387 First Unique Character in a String
409 Longest Palindrome
423 Reconstruct Original Digits from English
447 Number of Boomerangs
451 Sort Characters By Frequency
532 K-diff Pairs in an Array
575 Distribute Candies
594 Longest Harmonious Subsequence
692 Top K Frequent Words
697 Degree of an Array
748 Shortest Completing Word
781 Rabbits in Forest
811 Subdomain Visit Count
819 Most Common Word
884 Uncommon Words from Two Sentences
893 Groups of Special-Equivalent Strings
916 Word Subsets
929 Unique Email Addresses
953 Verifying an Alien Dictionary
961 N-Repeated Element in Size 2N Array
966 Vowel Spellchecker
1002 Find Common Characters
1007 Minimum Domino Rotations For Equal Row
1054 Distant Barcodes
1072 Flip Columns For Maximum Number of Equal Rows
1128 Number of Equivalent Domino Pairs
1160 Find Words That Can Be Formed by Characters
1189 Maximum Number of Balloons
1224 Maximum Equal Frequency
1282 Group the People Given the Group Size They Belong To
1296 Divide Array in Sets of K Consecutive Numbers
1338 Reduce Array Size to The Half
1347 Minimum Number of Steps to Make Two Strings Anagram
1363 Largest Multiple of Three
1370 Increasing Decreasing String
1394 Find Lucky Integer in an Array
1399 Count Largest Group
1400 Construct K Palindrome Strings
1419 Minimum Number of Frogs Croaking
1460 Make Two Arrays Equal by Reversing Subarrays
1497 Check If Array Pairs Are Divisible by k
1540 Can Convert String in K Moves
1647 Minimum Deletions to Make Character Frequencies Unique
1657 Determine if Two Strings Are Close
1704 Determine if String Halves Are Alike
1726 Tuple with Same Product
1733 Minimum Number of People to Teach
1737 Change Minimum Characters to Satisfy One of Three Conditions
1742 Maximum Number of Balls in a Box
1748 Sum of Unique Elements
1781 Sum of Beauty of All Substrings
1790 Check if One String Swap Can Make Strings Equal
1791 Find Center of Star Graph
1817 Finding the Users Active Minutes
1864 Minimum Number of Swaps to Make the Binary String Alternating
1897 Redistribute Characters to Make All Strings Equal
1941 Check if All Characters Have Equal Number of Occurrences
2001 Number of Pairs of Interchangeable Rectangles
2007 Find Original Array From Doubled Array
2013 Detect Squares
2032 Two Out of Three
2053 Kth Distinct String in an Array
2062 Count Vowel Substrings of a String
2068 Check Whether Two Strings are Almost Equivalent
2085 Count Common Words With One Occurrence
2094 Finding 3-Digit Even Numbers
2103 Rings and Rods
2122 Recover the Original Array
2131 Longest Palindrome by Concatenating Two Letter Words
2150 Find All Lonely Numbers in the Array
2170 Minimum Operations to Make the Array Alternating
2176 Count Equal and Divisible Pairs in an Array
2182 Construct String With Repeat Limit
2183 Count Array Pairs Divisible by K
2186 Minimum Number of Steps to Make Two Strings Anagram II
2190 Most Frequent Number Following Key In an Array
2206 Divide Array Into Equal Pairs
2225 Find Players With Zero or One Losses
2244 Minimum Rounds to Complete All Tasks
2248 Intersection of Multiple Arrays
2273 Find Resultant Array After Removing Anagrams
2283 Check if Number Has Equal Digit Count and Digit Value
2284 Sender With Largest Word Count
2287 Rearrange Characters to Make Target String
2306 Naming a Company
2341 Maximum Number of Pairs in Array
2344 Minimum Deletions to Make Array Divisible
2347 Best Poker Hand
2352 Equal Row and Column Pairs
2357 Make Array Zero by Subtracting Equal Amounts
2363 Merge Similar Items
2374 Node With Highest Edge Score
2384 Largest Palindromic Number
2404 Most Frequent Even Element
2423 Remove Letter To Equalize Frequency
2453 Destroy Sequential Targets
2456 Most Popular Video Creator
2475 Number of Unequal Triplets in Array
2499 Minimum Total Cost to Make Arrays Unequal
2531 Make Number of Distinct Characters Equal
2561 Rearranging Fruits
2598 Smallest Missing Non-negative Integer After Operations
2610 Convert an Array Into a 2D Array With Conditions
2671 Frequency Tracker
2748 Number of Beautiful Pairs
2763 Sum of Imbalance Numbers of All Subarrays
2768 Number of Black Blocks
2784 Check if Array is Good
2808 Minimum Seconds to Equalize a Circular Array
2815 Max Pair Sum in an Array
2840 Check if Strings Can be Made Equal With Operations II
2856 Minimum Array Length After Pair Removals
2857 Count Pairs of Points With Distance k
2870 Minimum Number of Operations to Make Array Empty
2910 Minimum Number of Groups to Create a Valid Assignment
2913 Subarrays Distinct Element Sum of Squares I
2933 High-Access Employees
3005 Count Elements With Maximum Frequency
3014 Minimum Number of Pushes to Type Word I
3016 Minimum Number of Pushes to Type Word II
3020 Find the Maximum Number of Elements in Subset
3035 Maximum Palindromes After Operations
3039 Apply Operations to Make String Empty
3046 Split the Array
3081 Replace Question Marks in String to Minimize Its Value
3085 Minimum Deletions to Make String K-Special
3128 Right Triangles
3137 Minimum Number of Operations to Make Word K-Periodic
3138 Minimum Length of Anagram Concatenation
3153 Sum of Digit Differences of All Pairs
3162 Find the Number of Good Pairs I
3164 Find the Number of Good Pairs II
3223 Minimum Length of String After Operations
3238 Find the Number of Winning Players
3265 Count Almost Equal Pairs I
3267 Count Almost Equal Pairs II
3289 The Two Sneaky Numbers of Digitville
3365 Rearrange K Substrings to Form Target String
3371 Identify the Largest Outlier in an Array
3404 Count Special Subsequences
3438 Find Valid Pair of Adjacent Digits in String
3442 Maximum Difference Between Even and Odd Frequency I
3527 Find the Most Common Response
3531 Count Covered Buildings
3541 Find Most Frequent Vowel and Consonant
3545 Minimum Deletions for At Most K Distinct Characters
3623 Count Number of Trapezoids I
3625 Count Number of Trapezoids II
3636 Threshold Majority Queries
3659 Partition Array Into K-Distinct Groups
3663 Find The Least Frequent Digit
3664 Two-Letter Card Game
3684 Maximize Sum of At Most K Distinct Elements
3692 Majority Frequency Characters
3712 Sum of Elements With Frequency Divisible by K
3713 Longest Balanced Substring I
3720 Lexicographically Smallest Permutation Greater Than Target
3760 Maximum Substrings With Distinct Start
3784 Minimum Deletion Cost to Make All Characters Equal
3785 Minimum Swaps to Avoid Forbidden Values
3803 Count Residue Prefixes
3804 Number of Centered Subarrays
3805 Count Caesar Cipher Pairs
3810 Minimum Operations to Reach Target Array
3843 First Element with Unique Frequency
3848 Check Digitorial Permutation
3852 Smallest Pair With Different Frequencies
3854 Minimum Operations to Make Array Parity Alternating