127. Word Ladder
hardAsked at AppleWord Ladder is Apple's BFS-on-a-virtual-graph hard. The trick is generating neighbors implicitly — for each word, try replacing each letter with each of 26 alternatives. BFS guarantees the shortest path. Bidirectional BFS is the optimization for the 'now make it faster' follow-up.
By Sam K., Founder, InterviewChamp.AI · Last verified
Source citations
Public interview reports confirming this problem appears in Apple loops.
- Glassdoor (2026-Q1)— Apple SWE onsite reports list Word Ladder as a recurring 45-minute graph BFS hard.
- Blind (2025-12)— Apple ICT4/ICT5 reports cite Word Ladder with the bidirectional-BFS follow-up.
Problem
A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence of words beginWord -> s_1 -> s_2 -> ... -> s_k such that every adjacent pair of words differs by a single letter, every s_i for 1 <= i <= k is in wordList. Note that beginWord does not need to be in wordList. Also, s_k == endWord. Given two words, beginWord and endWord, and a dictionary wordList, return the number of words in the shortest transformation sequence from beginWord to endWord, or 0 if no such sequence exists.
Constraints
1 <= beginWord.length <= 10endWord.length == beginWord.length1 <= wordList.length <= 5000wordList[i].length == beginWord.lengthbeginWord, endWord, and wordList[i] consist of lowercase English letters.beginWord != endWordAll the words in wordList are unique.
Examples
Example 1
beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]5Explanation: One shortest transformation sequence is 'hit' -> 'hot' -> 'dot' -> 'dog' -> 'cog', which is 5 words long.
Example 2
beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]0Explanation: The endWord 'cog' is not in wordList, therefore there is no valid transformation sequence.
Approaches
1. BFS with implicit neighbors (optimal-ish)
Put wordList in a set. BFS from beginWord. At each step, generate every single-letter mutation of the current word and check membership.
- Time
- O(L^2 * N * 26) where L is word length, N is wordList size
- Space
- O(N)
function ladderLength(beginWord, endWord, wordList) {
const set = new Set(wordList);
if (!set.has(endWord)) return 0;
let queue = new Set([beginWord]);
let steps = 1;
const A = 'a'.charCodeAt(0);
while (queue.size) {
const next = new Set();
for (const word of queue) {
if (word === endWord) return steps;
const chars = word.split('');
for (let i = 0; i < chars.length; i++) {
const orig = chars[i];
for (let c = 0; c < 26; c++) {
chars[i] = String.fromCharCode(A + c);
const candidate = chars.join('');
if (set.has(candidate)) {
next.add(candidate);
set.delete(candidate);
}
}
chars[i] = orig;
}
}
queue = next;
steps++;
}
return 0;
}Tradeoff: Implicit graph saves the O(N^2) edge construction. For each visited word we do L * 26 mutations, each an O(L) set lookup. The set.delete prevents revisiting.
2. Bidirectional BFS (optimal for tight constraints)
BFS from both beginWord and endWord; expand the smaller frontier each step. When the frontiers intersect, sum their depths.
- Time
- Roughly sqrt of the unidirectional BFS work
- Space
- O(N)
function ladderLength(beginWord, endWord, wordList) {
const set = new Set(wordList);
if (!set.has(endWord)) return 0;
let front = new Set([beginWord]);
let back = new Set([endWord]);
let steps = 1;
const A = 'a'.charCodeAt(0);
while (front.size && back.size) {
if (front.size > back.size) [front, back] = [back, front];
const next = new Set();
for (const word of front) {
const chars = word.split('');
for (let i = 0; i < chars.length; i++) {
const orig = chars[i];
for (let c = 0; c < 26; c++) {
chars[i] = String.fromCharCode(A + c);
const candidate = chars.join('');
if (back.has(candidate)) return steps + 1;
if (set.has(candidate)) {
next.add(candidate);
set.delete(candidate);
}
}
chars[i] = orig;
}
}
front = next;
steps++;
}
return 0;
}Tradeoff: Roughly square root of the unidirectional work because both frontiers grow much slower than one to the full distance. Apple's preferred answer when the dictionary is large.
Apple-specific tips
Apple loves Word Ladder because it tests whether you can model a problem as BFS on a graph WITHOUT building the graph. Say: 'I'll model each word as a node; the edge to a neighbor is a single-letter substitution. I won't build the graph explicitly — for each frontier word I'll generate the 26L candidate neighbors and check the dict for membership.' That's the entire interview.
Common mistakes
- Building the full adjacency list — O(N^2 * L) and times out for N=5000.
- Forgetting to delete the candidate from the set after queueing — visits the same word multiple times, blows up BFS.
- Returning the path-length instead of word-count (off-by-one).
Follow-up questions
An interviewer at Apple may pivot to one of these next:
- Word Ladder II (LC 126) — return ALL shortest paths.
- Open the Lock (LC 752) — same BFS-on-implicit-graph pattern.
- What if the cost varied per substitution? (Dijkstra instead of BFS.)
Solve it now
Free. No sign-up. Python and JavaScript run instantly in your browser.
FAQ
Why BFS and not DFS?
BFS explores by depth; the first time you reach endWord you've found the shortest path. DFS would have to enumerate all paths.
Implicit graph or explicit?
Implicit. Building all N^2 edges (or even N * L * 26) is wasteful when many candidates aren't in the dict. Generate-on-the-fly is faster on the average case.
Free learning resources
Curated free links for this problem.