Skip to main content

49. Group Anagrams

mediumAsked at Twilio

Group Anagrams is Twilio's canonical hash-key-design probe: given an array of strings, group the anagrams together. The grading rubric weighs whether you choose the sorted-string key (simple) or the character-count key (asymptotically faster) and can articulate why each tradeoff matters.

By Sam K., Founder, InterviewChamp.AI · Last verified

Source citations

Public interview reports confirming this problem appears in Twilio loops.

  • Glassdoor (2026-Q1)Listed in Twilio backend SWE on-site reports as the 'hash-design' coding question.
  • LeetCode Discuss (2025-12)Recurring in Twilio interview reports across SDK-engineering and platform teams.

Problem

Given an array of strings strs, group the anagrams together. You can return the answer in any order. An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

Constraints

  • 1 <= strs.length <= 10^4
  • 0 <= strs[i].length <= 100
  • strs[i] consists of lowercase English letters.

Examples

Example 1

Input
strs = ["eat","tea","tan","ate","nat","bat"]
Output
[["bat"],["nat","tan"],["ate","eat","tea"]]

Example 2

Input
strs = [""]
Output
[[""]]

Example 3

Input
strs = ["a"]
Output
[["a"]]

Approaches

1. Sorted-string key (simple optimal)

Use the sorted version of each string as the hash-map key.

Time
O(n * k log k)
Space
O(n * k)
function groupAnagrams(strs) {
  const groups = new Map();
  for (const s of strs) {
    const key = s.split('').sort().join('');
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(s);
  }
  return Array.from(groups.values());
}

Tradeoff: k is the max string length. Sorting each string is O(k log k). Simple, clear, and fast enough for the constraints — most Twilio panels accept this as the optimal.

2. Character-count key (faster for large k)

Hash by a 26-int frequency vector (encoded as a string) — drops the log k factor.

Time
O(n * k)
Space
O(n * k)
function groupAnagramsCount(strs) {
  const groups = new Map();
  for (const s of strs) {
    const counts = new Array(26).fill(0);
    for (const ch of s) counts[ch.charCodeAt(0) - 97]++;
    const key = counts.join('#'); // delimiter prevents collisions like [1,11] vs [11,1]
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(s);
  }
  return Array.from(groups.values());
}

Tradeoff: O(n * k) time vs O(n * k log k). Marginal for k=100 but the right answer when the interviewer raises k to 10^5. The '#' delimiter in the join is the subtle correctness bit — without it [1,11] and [11,1] hash to the same key.

Twilio-specific tips

Twilio's bar on this question is whether you can articulate the asymptotic difference between the two hash keys. The sorted-string key is fine for k <= 100; the count key wins as k grows. Mention both up front, then implement whichever feels cleaner; the rubric explicitly grades the analysis narration. The '#' delimiter bug is one of the most-asked follow-up questions — preempt it.

Common mistakes

  • Using `counts.join('')` without a delimiter — collides on [1,11] vs [11,1] for strings with double-digit character counts.
  • Forgetting that the empty string is a valid input that forms its own group.
  • Sorting the input array first instead of grouping by key — that's O(n^2 k log k) and breaks the contract.

Follow-up questions

An interviewer at Twilio may pivot to one of these next:

  • What if the alphabet were Unicode instead of 26 lowercase letters? (Use a Map<char, int> for the count vector, serialize sorted-by-char.)
  • What if you only needed to KNOW the number of anagram groups, not return them? (Same hash, count distinct keys.)
  • How would you parallelize this across machines? (Hash partition by key — anagrams of the same word always land on the same partition.)

Solve it now

Free. No sign-up. Python and JavaScript run instantly in your browser.

Output

Press Run or Cmd+Enter to execute

FAQ

Is the count-vector key always asymptotically better?

Yes for time (O(n*k) beats O(n*k log k)) but no for constant factors — the array allocation per string costs more than a single sort + join for small k. For LC's k <= 100 the sort approach often wins wall-clock; count wins as k grows past ~10^3.

Why does the '#' delimiter matter?

Because two distinct count vectors can produce the same flat string if you concatenate raw integers — [1,11,0,...] joins to '1110...' and [11,1,0,...] also joins to '1110...'. A delimiter that can't appear in an integer makes the encoding unambiguous.

Free learning resources

Curated free links for this problem.

Companies that also ask Group Anagrams