Concurrency Interview Questions
Concurrency interview questions assess your ability to design and troubleshoot multi-threaded systems. Senior engineers and platform teams are often grilled on thread safety, locking strategies, and performance implications. Master these questions to demonstrate deep system-level thinking.
What Concurrency interviews cover
Thread Safety & Atomicity
Ensuring shared state is accessed correctly under concurrent access, using locks, atomic operations, or immutable data.
Deadlocks & Livelocks
Recognizing and preventing conditions where threads block indefinitely (deadlock) or keep running without making progress (livelock), including lock ordering and timeout mechanisms.
Memory Model & Visibility
Understanding happens-before relationships, volatile, and how CPU caches affect multithreaded correctness.
Concurrency Patterns
Producer-consumer, reader-writer lock, thread pool, and actor model implementations.
Sample Concurrency interview questions
- Explain the difference between process and thread. Give an example where threading is better than multiprocessing.
What a strong answer covers
- Processes have separate memory spaces; threads share memory within a process.
- Context switching between threads is faster than between processes.
- Threads allow easier data sharing but require synchronization.
- Threading is better for I/O-bound tasks; multiprocessing for CPU-bound tasks with high isolation needs.
Sample answer
A process is an independent program execution unit with its own memory space, file descriptors, and system resources. A thread is a lightweight unit of execution within a process, sharing the same memory and resources. Context switching between threads is cheaper because they share the same address space, whereas processes require more overhead to switch memory mappings. Threads are preferred when tasks need to share data frequently, such as in a web server handling concurrent requests, because shared memory allows efficient communication. However, threads require careful synchronization to avoid races. Multiprocessing is better for CPU-intensive tasks that benefit from parallelism without shared state, like image processing on separate images, and when fault isolation is critical—a crash in one process does not affect others. For I/O-bound workloads like a chat server, threading is better because threads can handle many connections with lower overhead than processes.
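To make the shared-memory point concrete, here is a minimal sketch of a fixed thread pool handling simulated I/O-bound requests. The class name `SharedMemoryDemo`, the pool size of 4, and the 10 ms sleep standing in for a blocking call are all illustrative assumptions, not part of the original answer.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SharedMemoryDemo {
    /** Runs simulated I/O-bound tasks on a small pool; every task writes to one shared map. */
    static Map<Integer, String> handleRequests(int requests) {
        // A single heap object visible to all worker threads: no copying or IPC needed.
        Map<Integer, String> responses = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is an arbitrary choice
        for (int i = 0; i < requests; i++) {
            final int id = i;
            pool.execute(() -> {
                try {
                    Thread.sleep(10); // stand-in for a blocking network or disk call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                responses.put(id, "handled by " + Thread.currentThread().getName());
            });
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return responses;
    }

    public static void main(String[] args) {
        System.out.println(handleRequests(20).size() + " requests handled");
    }
}
```

The workers communicate only through the shared map, which is what makes threads convenient here. The same sharing is why the map has to be a thread-safe collection.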
- What is a race condition? Show a simple code snippet in Java/C++ that contains one and fix it using a lock.
What a strong answer covers
- A race condition occurs when multiple threads access shared data without synchronization, leading to unpredictable results.
- The classic example is a counter increment: read, modify, write sequence interleaved.
- Fixing it requires a lock (e.g., mutex) to ensure mutual exclusion.
Sample answer
A race condition is a bug where the outcome depends on the non-deterministic ordering of thread execution. For example, two threads incrementing a shared counter each perform a read-modify-write. Without synchronization, the increments may interleave and one update gets lost, resulting in a final count less than expected. To fix, use a lock (e.g., Java's synchronized block or ReentrantLock) to serialize access to the critical section. The lock ensures that only one thread can execute the increment at a time, guaranteeing correctness. However, excessive locking can degrade performance, so fine-grained locks or lock-free techniques may be preferred in high-throughput scenarios.
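The lost-update effect described above can be observed with a small harness. The sketch below is illustrative: `RaceDemo` and its method names are hypothetical, and the unsafe total varies from run to run, so no specific value is claimed for it.

```java
class RaceDemo {
    private int unsafeCount = 0; // incremented without synchronization
    private int lockedCount = 0; // incremented only while holding the monitor

    void unsafeIncrement() { unsafeCount++; }              // read, modify, write: can interleave
    synchronized void lockedIncrement() { lockedCount++; } // mutual exclusion via the object's monitor

    /** Runs `body` `iterations` times on each of two threads and waits for both to finish. */
    private static void runOnTwoThreads(Runnable body, int iterations) {
        Runnable loop = () -> {
            for (int i = 0; i < iterations; i++) body.run();
        };
        Thread t1 = new Thread(loop);
        Thread t2 = new Thread(loop);
        t1.start();
        t2.start();
        try {
            t1.join(); // join() also makes the workers' writes visible to this thread
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns {unsafe total, locked total}; the expected total is 2 * iterations. */
    static int[] run(int iterations) {
        RaceDemo d = new RaceDemo();
        runOnTwoThreads(d::unsafeIncrement, iterations);
        runOnTwoThreads(d::lockedIncrement, iterations);
        return new int[] { d.unsafeCount, d.lockedCount };
    }

    public static void main(String[] args) {
        int[] totals = run(1_000_000);
        System.out.println("unsafe=" + totals[0] + " locked=" + totals[1]);
    }
}
```

The locked total always equals `2 * iterations`. The unsafe total can come out lower whenever two increments interleave.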
Reference solution
```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Race condition example (unsafe)
class Counter {
    private int count = 0;

    public void increment() { count++; } // read, modify, write

    public int getCount() { return count; }
}

// Fixed version using synchronized
class SafeCounter {
    private int count = 0;

    public synchronized void increment() { count++; }

    public synchronized int getCount() { return count; }
}

// Alternative with ReentrantLock
class LockCounter {
    private int count = 0;
    private final Lock lock = new ReentrantLock();

    public void increment() {
        lock.lock();
        try {
            count++;
        } finally {
            lock.unlock();
        }
    }

    // The read takes the lock too: an unlocked read of a non-volatile
    // field has no visibility guarantee.
    public int getCount() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }
}
// Time complexity: O(1) per operation. Space: O(1).
```
- Implement a thread-safe bounded queue (blocking queue) in C++ or Java with condition variables.
What a strong answer covers
- A bounded queue has a fixed capacity; producers block if full, consumers block if empty.
- Condition variables (wait/notify) are used to signal state changes.
- ReentrantLock with Condition provides efficient blocking in Java.
Sample answer
A thread-safe bounded queue (blocking queue) supports put and take operations that block when the queue is full or empty, respectively. In Java, we use ReentrantLock with two Conditions: notFull and notEmpty. The put method acquires the lock, waits while the queue is full (notFull.await()), then enqueues and signals notEmpty. The take method similarly acquires the lock, waits while empty (notEmpty.await()), then dequeues and signals notFull. Waiting threads are parked rather than spinning, so the design avoids busy-waiting. The wait condition must be re-checked in a while loop because await() can return spuriously. Elements are dequeued in FIFO order. Time complexity: O(1) per put/take, excluding time spent blocked. Space: O(capacity).
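For comparison with the ReentrantLock design described above, the same blocking behavior can be sketched with Java's intrinsic monitors (`synchronized`, `wait`, `notifyAll`). This is a minimal sketch under stated assumptions: the class name `MonitorBoundedQueue` and the `demo` helper are hypothetical, and `notifyAll` is used because a monitor has a single wait set shared by producers and consumers.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class MonitorBoundedQueue<T> {
    private final Deque<T> items = new ArrayDeque<>();
    private final int capacity;

    MonitorBoundedQueue(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    synchronized void put(T item) throws InterruptedException {
        while (items.size() == capacity) wait(); // loop guards against spurious wakeups
        items.addLast(item);
        notifyAll(); // one shared wait set: wake everyone, each waiter re-checks its condition
    }

    synchronized T take() throws InterruptedException {
        while (items.isEmpty()) wait();
        T item = items.removeFirst();
        notifyAll();
        return item;
    }

    /** Demo: a producer sends 1..n through a capacity-2 queue; returns the sum the consumer saw. */
    static long demo(int n) {
        MonitorBoundedQueue<Integer> queue = new MonitorBoundedQueue<>(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) queue.put(i); // blocks whenever the queue is full
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        long sum = 0;
        try {
            for (int i = 0; i < n; i++) sum += queue.take(); // blocks whenever the queue is empty
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;
    }
}
```

Two separate Conditions let a producer wake only consumers and vice versa. The monitor version trades that precision for less code.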
Reference solution
```java
import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class BoundedBlockingQueue<T> {
    private final Queue<T> queue = new LinkedList<>();
    private final int capacity;
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();
    private final Condition notEmpty = lock.newCondition();

    public BoundedBlockingQueue(int capacity) {
        this.capacity = capacity;
    }

    public void put(T item) throws InterruptedException {
        lock.lock();
        try {
            while (queue.size() == capacity) {
                notFull.await(); // block until not full
            }
            queue.add(item);
            notEmpty.signal(); // wake up one waiting consumer
        } finally {
            lock.unlock();
        }
    }

    public T take() throws InterruptedException {
        lock.lock();
        try {
            while (queue.isEmpty()) {
                notEmpty.await(); // block until not empty
            }
            T item = queue.poll();
            notFull.signal(); // wake up one waiting producer
            return item;
        } finally {
            lock.unlock();
        }
    }

    public int size() {
        lock.lock();
        try {
            return queue.size();
        } finally {
            lock.unlock();
        }
    }
}
// Time: O(1) per operation; Space: O(capacity)
```
- How would you detect a deadlock programmatically? Write pseudocode for a deadlock detection algorithm.
What a strong answer covers
- Deadlock detection relies on building a wait-for graph (WFG) showing which thread holds which resource and which thread waits for which resource.
- A cycle in the WFG indicates a deadlock.
- The algorithm periodically checks for cycles by traversing the graph or using a matrix approach.
Sample answer
Programmatic deadlock detection typically uses a wait-for graph (WFG), where nodes are threads and an edge from T1 to T2 means T1 is waiting for a resource held by T2. A cycle in this graph means a deadlock exists, so the detector periodically rebuilds the graph from current lock ownership and pending requests and runs a cycle detection algorithm such as DFS or topological sort. When a resource type has multiple instances, a cycle alone is not conclusive. The matrix-based variant instead keeps allocation and request matrices and simulates which threads could still finish with the available resources; any threads left over are deadlocked. In Java, ThreadMXBean can detect thread deadlocks. The graph search is O(V + E), which is at most O(N^2) for N threads. However, deadlock detection adds overhead and may require recovery (e.g., aborting threads). It's more common to use prevention (e.g., lock ordering) or avoidance (e.g., banker's algorithm) instead.
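As a sketch of the ThreadMXBean route mentioned above, the following deliberately deadlocks two daemon threads and then asks the JVM to report them. `DeadlockProbe` and its helper methods are hypothetical names; `ManagementFactory.getThreadMXBean()` and `findDeadlockedThreads()` are standard `java.lang.management` calls. It assumes a JVM that supports monitoring of ownable synchronizers, which mainstream HotSpot builds do.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockProbe {
    /** Returns IDs of threads deadlocked on monitors or ownable synchronizers (empty if none). */
    static long[] findDeadlocks() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.findDeadlockedThreads(); // null when nothing is deadlocked
        return ids == null ? new long[0] : ids;
    }

    /** Deliberately deadlocks two daemon threads by taking two monitors in opposite order. */
    static void createDeadlock() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startDaemon(() -> lockInOrder(lockA, lockB, bothHoldFirst));
        startDaemon(() -> lockInOrder(lockB, lockA, bothHoldFirst));
    }

    private static void startDaemon(Runnable body) {
        Thread t = new Thread(body);
        t.setDaemon(true); // daemon threads do not keep the JVM alive after the demo
        t.start();
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await(); // wait until the other thread holds its first lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // never reached: the other thread holds `second` and waits for `first`
            }
        }
    }

    /** Polls the detector until a deadlock is reported or the timeout elapses. */
    static boolean awaitDeadlock(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (findDeadlocks().length > 0) return true;
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

Calling `DeadlockProbe.createDeadlock()` followed by `DeadlockProbe.awaitDeadlock(5000)` should report the two stuck threads. A production watchdog would typically run `findDeadlocks()` on a timer and log `ThreadMXBean.getThreadInfo` for the returned IDs.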
Reference solution
```text
// Pseudocode for deadlock detection using wait-for graph
// Assume we have lists of held resources and pending requests per thread.
Function detectDeadlock(Threads, Resources):
    graph = empty directed graph
    For each thread t:
        For each resource r held by t:
            For each thread w waiting for r:
                add edge w -> t in graph (w waits for t)
    // Detect cycle using DFS
    visited = set()
    recStack = set()
    For each node in graph:
        if DFS(node, visited, recStack):
            return true (deadlock detected)
    return false

Function DFS(node, visited, recStack):
    if node in recStack: return true
    if node in visited: return false
    visited.add(node)
    recStack.add(node)
    For each neighbor of node:
        if DFS(neighbor, visited, recStack): return true
    recStack.remove(node)
    return false
// Time: O(V+E) for building and searching graph. Space: O(V+E).
```
- What is the ABA problem in lock-free programming? Provide a concrete scenario and solution.
What a strong answer covers
- ABA problem occurs with CAS operations when a value changes from A to B and back to A, causing CAS to succeed incorrectly.
- Common in lock-free data structures like the Treiber stack.
- Solutions: use double-word CAS (tagged pointers) or hazard pointers.
Sample answer
The ABA problem arises in compare-and-swap (CAS) based lock-free algorithms. A thread reads a memory location as value A; other threads then change it to B and later back to A. The first thread's CAS sees A and succeeds as if nothing had changed, even though the underlying state may differ (e.g., a node in a linked list was freed and reused). A concrete scenario is a lock-free stack holding A -> B: thread T1 starts a pop, reads head = A and A.next = B, then gets preempted. Another thread T2 pops A and B and pushes C, so head now points to C. Then T2 pushes A (a reused node) back, so head points to A again and the stack is A -> C. When T1 resumes, its CAS sees head == A and sets head to B, a node that is no longer in the stack and may have been freed, and C is lost. One solution is to pair the pointer with a tag or generation number that is incremented on every modification and to update both with a double-word CAS, so the stale CAS fails. Alternatively, hazard pointers ensure nodes are not freed (and therefore not reused) while another thread may still be accessing them.
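The tag-based fix can be illustrated in Java with `AtomicStampedReference`, which pairs a reference with an integer stamp. The sketch below simulates the interleaving on a single thread for determinism, so the "T1"/"T2" labels in comments are only a narrative device. `AbaDemo` and its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

class AbaDemo {
    /** Plain CAS: the stale update succeeds because only the reference is compared. */
    static boolean plainCasAfterAba() {
        String a = "A", b = "B";
        AtomicReference<String> head = new AtomicReference<>(a);
        String observed = head.get();              // "T1" reads A, then is preempted
        head.compareAndSet(a, b);                  // "T2" changes A -> B
        head.compareAndSet(b, a);                  // "T2" changes B -> A
        return head.compareAndSet(observed, "C");  // "T1" resumes: CAS cannot tell anything happened
    }

    /** Stamped CAS: every modification bumps the stamp, so the stale update is rejected. */
    static boolean stampedCasAfterAba() {
        String a = "A", b = "B";
        AtomicStampedReference<String> head = new AtomicStampedReference<>(a, 0);
        int[] stampHolder = new int[1];
        String observed = head.get(stampHolder);   // "T1" reads (A, stamp 0)
        int observedStamp = stampHolder[0];
        head.compareAndSet(a, b, 0, 1);            // "T2": A -> B, stamp 0 -> 1
        head.compareAndSet(b, a, 1, 2);            // "T2": B -> A, stamp 1 -> 2
        // "T1" resumes with stamp 0, but the current stamp is 2, so this fails.
        return head.compareAndSet(observed, "C", observedStamp, observedStamp + 1);
    }

    public static void main(String[] args) {
        System.out.println("plain CAS succeeded:   " + plainCasAfterAba());   // true
        System.out.println("stamped CAS succeeded: " + stampedCasAfterAba()); // false
    }
}
```

In garbage-collected Java a node cannot be reused while it is still referenced, so ABA mostly bites when nodes or values are recycled deliberately. In C or C++ the same idea is usually implemented as a tagged pointer updated with a double-word CAS.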
- Design a concurrent web crawler that respects robots.txt and uses a thread pool. Show the main components.
What a strong answer covers
- The crawler uses a thread pool for concurrent fetching.
- A robots.txt parser caches rules per domain and checks before download.
- A URL frontier prioritizes URLs and enforces politeness delays.
- Components: URL frontier, fetcher, parser, robots cache, and storage.
Sample answer
A concurrent web crawler consists of several components. A thread pool (e.g., fixed-size pool) executes fetch tasks from a URL frontier. The frontier manages URLs, often with a priority queue, and ensures politeness by not fetching from the same domain faster than a configurable delay (e.g., 1 second). The fetcher downloads the page and parses HTML to extract new links. Before fetching any URL, the robots.txt file for that domain is fetched (if not cached) and parsed to check allowed paths. The robots cache stores parsed rules per domain to avoid re-downloading. New discovered URLs are normalized, filtered (e.g., same-origin, depth limit), and added to the frontier. The system also handles duplicate detection using a bloom filter or set. The thread pool handles concurrency while the frontier ensures politeness and fair scheduling.
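The components above can be wired together roughly as follows. This is a sketch under several assumptions, not a production design: `CrawlerSketch` is a hypothetical name, and `fetchLinks` and `loadRobots` are injected stand-ins for real HTTP fetching and robots.txt parsing. Politeness delays and URL prioritization are omitted, and the thread pool's internal queue stands in for the frontier.

```java
import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.function.Function;
import java.util.function.Predicate;

class CrawlerSketch {
    private final ExecutorService pool;
    private final Function<String, List<String>> fetchLinks;      // assumed: URL -> links found on that page
    private final Function<String, Predicate<String>> loadRobots; // assumed: host -> "is this path allowed?"
    private final Map<String, Predicate<String>> robotsCache = new ConcurrentHashMap<>();
    private final Set<String> seen = ConcurrentHashMap.newKeySet();    // duplicate detection
    private final Set<String> fetched = ConcurrentHashMap.newKeySet(); // stands in for storage
    private final Phaser pending = new Phaser(1); // party 1 is the caller of crawl()

    CrawlerSketch(int threads, Function<String, List<String>> fetchLinks,
                  Function<String, Predicate<String>> loadRobots) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.fetchLinks = fetchLinks;
        this.loadRobots = loadRobots;
    }

    private void enqueue(String url) {
        if (!seen.add(url)) return; // already queued or fetched
        URI uri = URI.create(url);
        String host = uri.getHost();
        if (host == null) return;   // skip URLs without a host
        // Rules are loaded at most once per host and then served from the cache.
        Predicate<String> allowed = robotsCache.computeIfAbsent(host, loadRobots);
        if (!allowed.test(uri.getPath())) return; // disallowed by the host's rules
        pending.register(); // count the task before it is handed to the pool
        pool.execute(() -> {
            try {
                fetched.add(url);
                for (String link : fetchLinks.apply(url)) enqueue(link);
            } finally {
                pending.arriveAndDeregister(); // task done, including enqueueing its children
            }
        });
    }

    /** Crawls everything reachable from `seed` and returns the set of fetched URLs. */
    Set<String> crawl(String seed) {
        enqueue(seed);
        pending.arriveAndAwaitAdvance(); // returns once every registered task has finished
        pool.shutdown();
        return fetched;
    }
}
```

Each task registers its children before it deregisters itself, so the Phaser cannot advance while work remains. A Phaser supports at most 65535 registered parties, so a real crawler would track outstanding work with its own frontier queue instead.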
- Explain the Java memory model: what guarantees does 'volatile' provide? Use an example of a shared flag.
What a strong answer covers
- volatile guarantees visibility and prevents reordering of reads/writes on the variable.
- It does not provide atomicity for compound operations (e.g., increment).
- Example: a shared boolean flag to stop a thread; without volatile, the thread may never see the update.
Sample answer
In the Java Memory Model, a write to a volatile variable happens-before every subsequent read of that variable by any thread. This guarantees visibility: a reader sees the most recent write, along with everything the writing thread did before that write. Volatile accesses also cannot be reordered with respect to other volatile accesses by the compiler or processor. However, volatile does not ensure atomicity for compound operations like i++ (read-modify-write). For a simple shared flag scenario, like a stop flag in a loop, declaring the flag volatile ensures the loop observes the change promptly. Without volatile, the JIT compiler may hoist the read out of the loop or the thread may keep using a stale value, so the loop may never terminate even after another thread updates the flag. A common pattern is using volatile for status flags, while for counters, atomic classes or locks are needed.
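The happens-before guarantee also covers ordinary writes made before the volatile write, which is what makes a volatile flag usable for publishing data. The following minimal sketch shows that pattern. `VolatilePublication` and its members are hypothetical names, and `Thread.onSpinWait()` assumes Java 9 or later.

```java
class VolatilePublication {
    private int payload;            // plain field, written before the flag
    private volatile boolean ready; // volatile flag that publishes the payload

    void publish(int value) {
        payload = value; // (1) ordinary write
        ready = true;    // (2) volatile write: (1) happens-before any read that sees true
    }

    int awaitPayload() {
        while (!ready) {         // volatile read on every iteration, never hoisted
            Thread.onSpinWait(); // hint to the runtime that this is a spin loop
        }
        return payload;          // guaranteed to observe the value written in (1)
    }

    /** Publishes 42 from a second thread and returns what the reader observed. */
    static int demo() {
        VolatilePublication box = new VolatilePublication();
        Thread writer = new Thread(() -> box.publish(42));
        writer.start();
        return box.awaitPayload();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

If `ready` were a plain boolean, the reader could spin forever or, on some hardware and JIT combinations, see `ready == true` with a stale `payload`.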
Reference solution
```java
// Example: volatile flag for stopping a thread
public class VolatileFlagExample {
    private static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) {
                // do work
            }
            System.out.println("Worker stopped.");
        });
        worker.start();
        Thread.sleep(1000);
        running = false; // volatile write: the worker's next read of the flag sees it
        worker.join();
    }
}
// Without volatile, the worker might never see the update (stale or hoisted read).
```
- If you have a high-contention counter, would you use a mutex, atomic, or a lock-free approach? Compare performance and trade-offs.
What a strong answer covers
- Mutex provides mutual exclusion but can cause contention and context switches.
- Atomic operations (e.g., std::atomic) avoid locks but can suffer from CAS loops and cache line bouncing.
- Lock-free techniques like combining or thread-local accumulation reduce contention.
- Performance depends on contention level and hardware.
Sample answer
For a high-contention counter, the choice between mutex, atomic, or lock-free approach involves trade-offs. A mutex (e.g., std::mutex) is simple but under high contention, threads block and cause context switches, reducing throughput. An atomic counter (e.g., std::atomic<int>) using fetch_add is lock-free in the sense that no OS mutex is involved, but under high contention, the underlying CAS retries (or hardware lock prefix) cause cache line bouncing between cores, limiting scalability. Lock-free techniques like thread-local accumulation (each thread maintains a local counter and periodically merges) reduce contention dramatically but increase memory usage and merging overhead. Another approach is using a combining tree where threads combine increments. In practice, for very high contention, a mutex may be worse than atomic due to context switches; atomic can scale moderately but eventually saturate the cache coherence protocol. The best choice depends on specific workload: for a simple counter with many threads, thread-local with periodic flush often gives the best throughput. Profile to determine the threshold.
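The answer above is phrased in C++ terms. As a rough Java analogue, the sketch below puts the three options side by side: a `synchronized` block for the mutex, `AtomicLong` for the single atomic, and `LongAdder`, which spreads updates across internal cells and sums them on read, similar in spirit to thread-local accumulation. `CounterComparison` and its method names are hypothetical, and the code only checks correctness. Throughput has to be measured on the target hardware, ideally with a benchmarking harness such as JMH.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

class CounterComparison {
    /** Runs `body` `iterations` times on each of `threads` threads and waits for all of them. */
    static void runConcurrently(int threads, int iterations, Runnable body) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) body.run();
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Mutex-style counter: every increment serializes on one lock. */
    static long withLock(int threads, int iterations) {
        long[] count = {0};
        Object lock = new Object();
        runConcurrently(threads, iterations, () -> {
            synchronized (lock) { count[0]++; }
        });
        return count[0];
    }

    /** Single atomic: no blocking, but all threads contend on one cache line. */
    static long withAtomic(int threads, int iterations) {
        AtomicLong counter = new AtomicLong();
        runConcurrently(threads, iterations, counter::incrementAndGet);
        return counter.get();
    }

    /** Striped counter: contended updates land in separate cells; sum() merges them. */
    static long withAdder(int threads, int iterations) {
        LongAdder adder = new LongAdder();
        runConcurrently(threads, iterations, adder::increment);
        return adder.sum(); // exact here because all writers have finished
    }
}
```

All three return `threads * iterations`. The difference is in how the cost grows with contention, and in the fact that `LongAdder.sum()` is not an atomic snapshot while writers are still active.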
How to prepare
- Practice implementing classic concurrency primitives (mutex, semaphore, condition variable) from scratch.
- Study real-world bugs: deadlock, livelock, and priority inversion. Know how to diagnose with thread dumps.
- Understand the difference between concurrency and parallelism, and when to use each.
- Learn lock-free techniques: CAS, transactional memory, and hazard pointers.
- Be ready to discuss performance metrics (throughput vs latency) under contention.
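As one example of the first exercise in the list above, a counting semaphore can be built from a monitor in a few lines. This is a minimal sketch, assuming no fairness guarantee is required. `SimpleSemaphore` is a hypothetical name, and production code would normally use `java.util.concurrent.Semaphore` instead.

```java
class SimpleSemaphore {
    private int permits; // guarded by this object's monitor

    SimpleSemaphore(int permits) {
        if (permits < 0) throw new IllegalArgumentException("permits must be non-negative");
        this.permits = permits;
    }

    /** Blocks until a permit is available, then takes it. */
    synchronized void acquire() throws InterruptedException {
        while (permits == 0) wait(); // re-check after every wakeup
        permits--;
    }

    /** Takes a permit only if one is available right now; never blocks. */
    synchronized boolean tryAcquire() {
        if (permits == 0) return false;
        permits--;
        return true;
    }

    /** Returns a permit and wakes waiters so that one of them can take it. */
    synchronized void release() {
        permits++;
        notifyAll();
    }

    public static void main(String[] args) {
        SimpleSemaphore s = new SimpleSemaphore(2);
        System.out.println(s.tryAcquire()); // true
        System.out.println(s.tryAcquire()); // true
        System.out.println(s.tryAcquire()); // false: both permits are taken
        s.release();
        System.out.println(s.tryAcquire()); // true again after the release
    }
}
```

A semaphore created with one permit behaves like a non-reentrant mutex, which is a useful follow-up point in an interview.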
Frequently asked questions
Do I need to know every concurrency API in Java/C++?
Focus on core patterns: locks, semaphores, thread pools, and atomic operations. Depth over breadth.
How do I prepare for a concurrency design question?
Practice designing systems like thread pools, rate limiters, and concurrent hash tables. Outline trade-offs.
What is the trickiest concurrency concept?
Memory model quirks, like reordering and visibility without proper synchronization.
Will I be asked to write code on a whiteboard?
Yes, expect to implement thread-safe data structures or fix concurrent bugs manually.
How important is lock-free programming?
It's advanced but shows deep understanding. Know basic CAS and how to address the ABA problem.