49. Group Anagrams
mediumAsked at TwilioGroup Anagrams is Twilio's canonical hash-key-design probe: given an array of strings, group the anagrams together. The grading rubric weighs whether you choose the sorted-string key (simple) or the character-count key (asymptotically faster) and can articulate why each tradeoff matters.
By Sam K., Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Twilio loops.
- Glassdoor (2026-Q1)— Listed in Twilio backend SWE on-site reports as the 'hash-design' coding question.
- LeetCode Discuss (2025-12)— Recurring in Twilio interview reports across SDK-engineering and platform teams.
Problem
Given an array of strings strs, group the anagrams together. You can return the answer in any order. An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
Constraints
1 <= strs.length <= 10^40 <= strs[i].length <= 100strs[i] consists of lowercase English letters.
Examples
Example 1
strs = ["eat","tea","tan","ate","nat","bat"][["bat"],["nat","tan"],["ate","eat","tea"]]Example 2
strs = [""][[""]]Example 3
strs = ["a"][["a"]]Approaches
1. Sorted-string key (simple optimal)
Use the sorted version of each string as the hash-map key.
- Time
- O(n * k log k)
- Space
- O(n * k)
function groupAnagrams(strs) {
const groups = new Map();
for (const s of strs) {
const key = s.split('').sort().join('');
if (!groups.has(key)) groups.set(key, []);
groups.get(key).push(s);
}
return Array.from(groups.values());
}Tradeoff: k is the max string length. Sorting each string is O(k log k). Simple, clear, and fast enough for the constraints — most Twilio panels accept this as the optimal.
2. Character-count key (faster for large k)
Hash by a 26-int frequency vector (encoded as a string) — drops the log k factor.
- Time
- O(n * k)
- Space
- O(n * k)
function groupAnagramsCount(strs) {
const groups = new Map();
for (const s of strs) {
const counts = new Array(26).fill(0);
for (const ch of s) counts[ch.charCodeAt(0) - 97]++;
const key = counts.join('#'); // delimiter prevents collisions like [1,11] vs [11,1]
if (!groups.has(key)) groups.set(key, []);
groups.get(key).push(s);
}
return Array.from(groups.values());
}Tradeoff: O(n * k) time vs O(n * k log k). Marginal for k=100 but the right answer when the interviewer raises k to 10^5. The '#' delimiter in the join is the subtle correctness bit — without it [1,11] and [11,1] hash to the same key.
Twilio-specific tips
Twilio's bar on this question is whether you can articulate the asymptotic difference between the two hash keys. The sorted-string key is fine for k <= 100; the count key wins as k grows. Mention both up front, then implement whichever feels cleaner; the rubric explicitly grades the analysis narration. The '#' delimiter bug is one of the most-asked follow-up questions — preempt it.
Common mistakes
- Using `counts.join('')` without a delimiter — collides on [1,11] vs [11,1] for strings with double-digit character counts.
- Forgetting that the empty string is a valid input that forms its own group.
- Sorting the input array first instead of grouping by key — that's O(n^2 k log k) and breaks the contract.
Follow-up questions
An interviewer at Twilio may pivot to one of these next:
- What if the alphabet were Unicode instead of 26 lowercase letters? (Use a Map<char, int> for the count vector, serialize sorted-by-char.)
- What if you only needed to KNOW the number of anagram groups, not return them? (Same hash, count distinct keys.)
- How would you parallelize this across machines? (Hash partition by key — anagrams of the same word always land on the same partition.)
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Is the count-vector key always asymptotically better?
Yes for time (O(n*k) beats O(n*k log k)) but no for constant factors — the array allocation per string costs more than a single sort + join for small k. For LC's k <= 100 the sort approach often wins wall-clock; count wins as k grows past ~10^3.
Why does the '#' delimiter matter?
Because two distinct count vectors can produce the same flat string if you concatenate raw integers — [1,11,0,...] joins to '1110...' and [11,1,0,...] also joins to '1110...'. A delimiter that can't appear in an integer makes the encoding unambiguous.
Free learning resources
Curated free links for this problem.