49. Group Anagrams
mediumAsked at AtlassianGroup Anagrams is an Atlassian favorite because it mirrors the bucket-by-canonical-key idea behind Confluence search and Jira tag dedupe. You're given an array of strings; return the strings grouped so each group contains anagrams of one another.
By Sam K., Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Atlassian loops.
- Glassdoor (2026-Q1)— Atlassian SWE-II onsite reports cite Group Anagrams as a recurring second-round string problem after the warm-up.
- LeetCode discuss (2025-09)— Multiple Atlassian-tagged threads list Group Anagrams as part of the platform-search interview loop.
Problem
Given an array of strings strs, group the anagrams together. You can return the answer in any order. An anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
Constraints
1 <= strs.length <= 10^40 <= strs[i].length <= 100strs[i] consists of lowercase English letters.
Examples
Example 1
strs = ["eat","tea","tan","ate","nat","bat"][["bat"],["nat","tan"],["ate","eat","tea"]]Example 2
strs = [""][[""]]Example 3
strs = ["a"][["a"]]Approaches
1. Brute-force pairwise compare
For each unvisited string, scan the rest of the array and group anything that's an anagram (sorted-equal).
- Time
- O(n^2 * k log k)
- Space
- O(n * k)
function groupAnagramsBrute(strs) {
const visited = new Array(strs.length).fill(false);
const out = [];
for (let i = 0; i < strs.length; i++) {
if (visited[i]) continue;
const group = [strs[i]];
visited[i] = true;
const a = strs[i].split('').sort().join('');
for (let j = i + 1; j < strs.length; j++) {
if (visited[j]) continue;
const b = strs[j].split('').sort().join('');
if (a === b) {
visited[j] = true;
group.push(strs[j]);
}
}
out.push(group);
}
return out;
}Tradeoff: Conceptually obvious but quadratic — you re-sort the same key on both sides of every comparison. At n=10^4 with k=100 this is borderline TLE on the LeetCode judge; useful only to motivate the canonical-key trick.
2. Sorted-string key in a hash map (optimal)
Compute the sorted form of each string once. Use it as a Map key to bucket all anagrams together in one pass.
- Time
- O(n * k log k)
- Space
- O(n * k)
function groupAnagrams(strs) {
const buckets = new Map();
for (const s of strs) {
const key = s.split('').sort().join('');
if (!buckets.has(key)) buckets.set(key, []);
buckets.get(key).push(s);
}
return Array.from(buckets.values());
}Tradeoff: Single pass and idiomatic JS. Sorting costs k log k per string. The dominant constant is the string concat for the key, but it's the right answer on Atlassian's rubric because it cleanly maps to 'bucket by canonical form' — the same idea behind their content dedup pipelines.
3. Character-count key (faster for long strings)
Replace the sorted key with a fixed-length character-frequency string. Each character bucket adds O(k) instead of O(k log k).
- Time
- O(n * k)
- Space
- O(n * k)
function groupAnagramsCount(strs) {
const buckets = new Map();
for (const s of strs) {
const counts = new Array(26).fill(0);
for (const ch of s) counts[ch.charCodeAt(0) - 97]++;
const key = counts.join('#');
if (!buckets.has(key)) buckets.set(key, []);
buckets.get(key).push(s);
}
return Array.from(buckets.values());
}Tradeoff: Linear in k, but the constant on the key construction is larger and the # separator is needed so [1,12] does not collide with [11,2]. Mention this version after the sorted-key one — Atlassian likes seeing both options compared, but won't penalize you for stopping at the sorted-key solution if you name the tradeoff.
Atlassian-specific tips
Atlassian interviewers explicitly ask 'what would change if the alphabet weren't fixed to 26 lowercase letters?' Be ready to switch from a count-array to a Map keyed on character → frequency, and note that the join separator matters or you get key collisions. Pair that with naming the canonical-form trick — they like candidates who connect this to the deduplication problem they handle inside Jira issue search.
Common mistakes
- Using counts.join('') without a separator — '1,12' and '11,2' collide.
- Sorting in place with sort() and then forgetting that strings are immutable in JS so you have to split-sort-join.
- Returning the Map itself instead of Array.from(buckets.values()).
Follow-up questions
An interviewer at Atlassian may pivot to one of these next:
- What if the strings contained Unicode? Switch the count array to a Map keyed by code point.
- What if the input was streaming and you had to print groups as soon as a new anagram arrived? Track group sizes and emit on first match.
- How would you handle case-insensitivity and ignore-spaces (real Confluence search behavior)? Normalize the string with .toLowerCase().replace(/\s+/g, '') before computing the key.
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why does Atlassian like this problem?
Because the canonical-key idea — reduce a noisy string to a stable identifier and bucket on it — is the same technique that powers Confluence's duplicate-content detection and Jira's smart issue search. They want to see you reach for hashing on a derived key rather than pairwise comparison.
Is the count-array faster in practice on the LeetCode judge?
Yes, by about 20-30% on the official testset, but only because k can hit 100. For short strings the sort-key version is faster due to lower constant factors. Mention both and let the interviewer pick.
Free learning resources
Curated free links for this problem.