Python Interview Questions
45 questions — 13 easy · 22 medium · 10 hard
Fundamentals (10)
Lists are ordered, mutable sequences. Tuples are ordered, immutable sequences. Sets are unordered collections of unique elements. Dictionaries are key-value mappings with O(1) lookup.
Performance characteristics matter when choosing: sets and dicts provide O(1) membership testing via hashing, while lists require O(n) scans. Tuples are hashable (when all elements are hashable) and can serve as dict keys or set members, while lists cannot.
my_list = [1, 2, 3] # ordered, mutable, allows duplicates
my_tuple = (1, 2, 3) # ordered, immutable, allows duplicates
my_set = {1, 2, 3} # unordered, mutable, no duplicates
my_dict = {'a': 1, 'b': 2} # key-value pairs, O(1) lookup
Use lists for ordered collections you need to modify. Use tuples for fixed data (coordinates, return values, dict keys). Use sets for membership testing and deduplication. Use dicts for key-value associations. A list cannot be a dictionary key because lists are mutable and therefore unhashable.
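The hashability rule can be demonstrated directly (a minimal sketch):

```python
# Tuples are hashable (when their elements are), so they work as dict keys.
coords = {(40.7, -74.0): 'New York'}
assert coords[(40.7, -74.0)] == 'New York'

# Lists are mutable and unhashable, so they cannot be keys.
try:
    bad = {[40.7, -74.0]: 'New York'}
except TypeError as e:
    print(e)  # unhashable type: 'list'
```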
Follow-up: Can you use a list as a dictionary key? Why or why not?
== checks value equality via the __eq__ method. is checks identity — whether two references point to the exact same object in memory.
CPython interns small integers (-5 to 256) and certain strings, so is may return True for equal small integers. But this is an implementation detail, not a language guarantee.
a = 256
b = 256
a is b # True in CPython — interned
a = 257
b = 257
a is b # False in most contexts — not interned
a == b # True — value equality
is should only be used for singletons like None, True, and False. The idiomatic pattern is if x is None: rather than if x == None:. Using is for value comparison leads to subtle bugs that depend on CPython internals.
Follow-up: What is CPython's integer interning and why should you never rely on it?
Default arguments are evaluated once at function definition time, not at each call. Mutable defaults like lists or dicts are shared across all calls to the function.
def append_to(item, target=[]):
target.append(item)
return target
append_to(1) # [1]
append_to(2) # [1, 2] — not [2]!
The idiomatic fix uses None as a sentinel:
def append_to(item, target=None):
if target is None:
target = []
target.append(item)
return target
This is one of Python's most well-known gotchas. The behavior exists because default values are attributes of the function object (func.__defaults__). In rare cases, mutable defaults are used intentionally — for example, as a simple cache or memo between calls — but this is generally considered an anti-pattern.
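As an illustration of the intentional case, a mutable default can serve as a simple per-function memo (the fib example here is hypothetical; in modern code functools.lru_cache is the clearer choice):

```python
def fib(n, _memo={0: 0, 1: 1}):
    # _memo is created once at definition time and shared across calls,
    # which is deliberate here: computed values persist between invocations
    if n not in _memo:
        _memo[n] = fib(n - 1) + fib(n - 2)
    return _memo[n]

print(fib(50))  # 12586269025, computed instantly thanks to the shared memo
```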
Follow-up: Is there ever a case where a mutable default argument is intentionally useful?
pass is a no-op placeholder — it does nothing and is used where a statement is syntactically required but no action is needed. continue skips to the next iteration of a loop. break exits the loop entirely.
for i in range(10):
if i == 5:
break # exits the loop
if i % 2 == 0:
continue # skips to next iteration
print(i) # prints 1, 3
Python has an else clause on loops that runs only if the loop completes without hitting break. This is a Python-specific feature most developers from other languages don't know:
for item in items:
if item.is_target():
print('Found it')
break
else:
print('Not found') # runs only if break was never hit
The for...else pattern is a clean alternative to using a boolean flag to track whether a loop completed naturally.
Follow-up: What does the else clause on a for loop do?
*args collects extra positional arguments into a tuple. **kwargs collects extra keyword arguments into a dict. They allow functions to accept arbitrary numbers of arguments.
On the caller side, * unpacks iterables and ** unpacks dicts into function arguments.
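A minimal sketch showing both sides, collecting on the function side and unpacking on the caller side:

```python
def describe(*args, **kwargs):
    # args arrives as a tuple, kwargs as a dict
    return f'args={args}, kwargs={kwargs}'

print(describe(1, 2, flag=True))  # args=(1, 2), kwargs={'flag': True}

nums = [1, 2, 3]
opts = {'flag': True}
# * unpacks the list into positionals, ** unpacks the dict into keywords
print(describe(*nums, **opts))    # args=(1, 2, 3), kwargs={'flag': True}
```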
Python 3.8 (PEP 570) added positional-only parameters (before /); keyword-only parameters (after *) have existed since Python 3.0:
def example(pos_only, /, normal, *, kw_only):
pass
example(1, 2, kw_only=3) # valid
example(pos_only=1, normal=2) # TypeError — pos_only is positional-only
The full parameter order is: positional-only, /, regular, *args, keyword-only, **kwargs:
def full_example(a, /, b, *args, c, **kwargs):
pass
Positional-only parameters prevent callers from using the parameter name, which lets library authors change internal parameter names without breaking compatibility.
Follow-up: What do the / and * separators in a function signature mean?
Python has four string formatting approaches:
f-strings (Python 3.6+) — the preferred modern approach. Evaluated at runtime, can contain arbitrary expressions:
name = 'Alice'
f'Hello, {name.upper()}!' # 'Hello, ALICE!'
f'{3.14159:.2f}' # '3.14'
.format() method — useful when the template is defined separately from the values:
template = 'Hello, {name}!'
template.format(name='Bob') # 'Hello, Bob!'
% formatting — legacy C-style formatting, still seen in older codebases:
'Hello, %s! You are %d.' % ('Carol', 30)
string.Template — safe for user-supplied templates because it doesn't evaluate expressions:
from string import Template
t = Template('Hello, $name!')
t.substitute(name='Dave') # 'Hello, Dave!'
Prefer f-strings for most cases. Use .format() when templates are stored as data. Use Template when the format string comes from untrusted user input. Avoid % formatting in new code.
Type hints add static type information to Python code without affecting runtime behavior. They serve as documentation and enable static analysis tools like mypy to catch type errors before execution.
def greet(name: str, times: int = 1) -> str:
return (f'Hello, {name}! ' * times).strip()
def find_user(user_id: int) -> dict[str, str] | None:
pass
Common type hint constructs:
from typing import Optional, Union, TypeVar, Generic
from collections.abc import Callable, Sequence
x: list[int] = [1, 2, 3] # generic built-in (3.9+)
y: dict[str, list[int]] = {} # nested generics
fn: Callable[[int, str], bool] # function type
opt: str | None = None # union syntax (3.10+)
opt_legacy: Optional[str] = None # equivalent, older syntax
mypy is a static type checker that analyzes code without running it. It catches type mismatches, missing return types, incorrect argument types, and None safety issues:
$ mypy app.py
app.py:5: error: Argument 1 to "greet" has incompatible type "int"; expected "str"
Optional[str] and str | None are semantically identical. The | syntax (PEP 604, Python 3.10+) is preferred in modern code for readability. Type hints are not enforced at runtime — they require tools like mypy, pyright, or IDE integration.
Follow-up: What is the difference between Optional[str] and str | None?
Dataclasses (Python 3.7+) automatically generate __init__, __repr__, __eq__, and other boilerplate methods based on class attributes with type annotations:
from dataclasses import dataclass, field
@dataclass
class Point:
x: float
y: float
label: str = 'origin'
tags: list[str] = field(default_factory=list)
p = Point(1.0, 2.0)
print(p) # Point(x=1.0, y=2.0, label='origin', tags=[])
p == Point(1.0, 2.0) # True — __eq__ compares all fields
Use dataclasses when your class is primarily a data container. Use regular classes when you need complex initialization logic, custom __init__ signatures, or heavy behavioral methods.
Key options:
- frozen=True — makes instances immutable (enables hashing, prevents accidental mutation)
- slots=True (Python 3.10+) — generates __slots__ for lower memory usage
- order=True — generates comparison methods (__lt__, __le__, etc.)
@dataclass(frozen=True, slots=True)
class Coordinate:
lat: float
lon: float
c = Coordinate(52.23, 21.01)
c.lat = 0 # FrozenInstanceError
{c} # works — frozen dataclasses are hashable
field(default_factory=list) avoids the mutable default argument pitfall by creating a new list for each instance.
Follow-up: How do frozen=True and slots=True options change behavior?
The walrus operator (:=), introduced in Python 3.8 (PEP 572), assigns a value to a variable as part of an expression. It eliminates the need for separate assignment and condition lines.
# Without walrus operator
line = input()
while line != 'quit':
process(line)
line = input()
# With walrus operator
while (line := input()) != 'quit':
process(line)
Common use cases:
# Filtering with computed value
results = [
stripped
for line in lines
if (stripped := line.strip())
]
# Regex matching
import re
if m := re.match(r'(\d+)-(\w+)', text):
number, word = m.groups()
# Avoiding redundant function calls
if (n := len(data)) > 10:
print(f'Processing {n} items')
Avoid the walrus operator when:
- The expression is already simple enough without it
- Nesting makes the line hard to read
- It's used in a context where side effects are confusing
# Too clever — and in fact broken: the condition (y := check(x)) is
# evaluated first, before x is assigned, so this raises NameError
result = (x := expensive()) if (y := check(x)) else default
The walrus operator is most valuable in while loops, comprehension filters, and if statements where you need both the test and the value.
Follow-up: When should you avoid using the walrus operator for readability?
Virtual environments isolate project dependencies, preventing conflicts between projects that need different versions of the same package.
Built-in venv:
python -m venv .venv
source .venv/bin/activate # Linux/macOS
.venv\Scripts\activate # Windows
pip install requests
pip freeze > requirements.txt
Modern tools:
- pip — standard package installer, uses requirements.txt
- poetry — dependency management with lock files and pyproject.toml
- uv — ultra-fast Rust-based pip/venv replacement
- pipx — install CLI tools in isolated environments
- conda — popular for data science, manages non-Python dependencies too
Package configuration evolution:
setup.py — legacy, imperative configuration (still used but discouraged for new projects)
requirements.txt — flat list of pinned dependencies, no metadata
pyproject.toml — modern standard (PEP 621), declarative configuration:
[project]
name = "myapp"
version = "1.0.0"
dependencies = [
"requests>=2.28",
"pydantic>=2.0",
]
[project.optional-dependencies]
dev = ["pytest", "mypy"]
Best practices: always use virtual environments, pin dependencies with lock files, prefer pyproject.toml for new projects, and separate production and development dependencies.
Follow-up: What is the difference between requirements.txt, pyproject.toml, and setup.py?
OOP (6)
Python uses C3 linearization to determine the Method Resolution Order — the sequence in which base classes are searched when looking up a method. This algorithm ensures each class appears exactly once and respects the order in which parent classes are listed.
The diamond problem occurs when a class inherits from two classes that share a common ancestor:
class A:
def method(self):
return 'A'
class B(A):
def method(self):
return 'B'
class C(A):
def method(self):
return 'C'
class D(B, C):
pass
D().method() # 'B' — MRO is D -> B -> C -> A
C3 linearization guarantees that B is checked before C (left-to-right) and that A appears only once at the end. You can inspect the MRO using D.__mro__ or D.mro(). Understanding MRO is essential for super() calls in cooperative multiple inheritance — super() follows the MRO, not the direct parent.
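The MRO for the diamond above can be inspected at runtime:

```python
class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

# __mro__ lists the classes in lookup order, ending at object
print([cls.__name__ for cls in D.__mro__])
# ['D', 'B', 'C', 'A', 'object']
```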
Follow-up: How can you inspect the MRO of a class at runtime?
Instance methods receive self as the first argument and operate on the instance. Class methods receive cls and operate on the class itself. Static methods receive neither and are essentially namespaced functions.
class Date:
def __init__(self, year, month, day):
self.year = year
self.month = month
self.day = day
def display(self): # instance method
return f'{self.year}-{self.month}-{self.day}'
@classmethod
def from_string(cls, s): # factory method
year, month, day = map(int, s.split('-'))
return cls(year, month, day)
@staticmethod
def is_valid(s): # utility function
parts = s.split('-')
return len(parts) == 3
date = Date.from_string('2026-01-15')
Use @classmethod for factory methods and alternate constructors — they work correctly with inheritance because cls refers to the subclass. Use @staticmethod for utility functions that logically belong to the class but don't need access to instance or class state. Use regular methods for everything that operates on instance data.
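A sketch of why cls matters for inheritance: a subclass reusing the inherited factory gets instances of itself (Date is repeated in compact form; EuropeanDate is a hypothetical subclass):

```python
class Date:
    def __init__(self, year, month, day):
        self.year, self.month, self.day = year, month, day

    @classmethod
    def from_string(cls, s):
        # cls is whichever class the method was called on
        return cls(*map(int, s.split('-')))

class EuropeanDate(Date):
    def display(self):
        return f'{self.day}.{self.month}.{self.year}'

d = EuropeanDate.from_string('2026-01-15')
print(type(d).__name__)  # EuropeanDate, not Date
print(d.display())       # 15.1.2026
```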
Follow-up: Give an example of a factory method using @classmethod.
Magic methods (dunder methods) are special methods with double-underscore names that Python calls implicitly in response to operations. Key ones include:
- __init__ — constructor
- __repr__ — unambiguous string for developers, used by repr()
- __str__ — readable string for users, used by str() and print()
- __eq__ / __hash__ — equality and hashing
- __len__ — len() support
- __iter__ / __next__ — iteration protocol
- __enter__ / __exit__ — context manager protocol
- __getattr__ / __setattr__ — attribute access hooks
class Point:
def __init__(self, x, y):
self.x, self.y = x, y
def __repr__(self):
return f'Point({self.x}, {self.y})'
def __eq__(self, other):
return self.x == other.x and self.y == other.y
def __hash__(self):
return hash((self.x, self.y))
Defining __eq__ makes instances unhashable by default (Python sets __hash__ to None) unless you also define __hash__. This prevents subtle bugs when putting custom objects in sets or using them as dict keys after overriding equality.
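A quick demonstration of this behavior:

```python
class EqOnly:
    def __init__(self, v):
        self.v = v
    def __eq__(self, other):
        return isinstance(other, EqOnly) and self.v == other.v

assert EqOnly(1) == EqOnly(1)  # equality works
try:
    {EqOnly(1)}                # but hashing is gone: __hash__ was set to None
except TypeError as e:
    print(e)  # unhashable type: 'EqOnly'
```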
Follow-up: What happens to __hash__ when you define __eq__?
Abstract Base Classes enforce interface contracts at instantiation time — you cannot create an instance of a class that doesn't implement all required abstract methods. Duck typing relies on runtime behavior: if an object has the right methods, it works.
from abc import ABC, abstractmethod
class Shape(ABC):
@abstractmethod
def area(self):
pass
@abstractmethod
def perimeter(self):
pass
class Circle(Shape):
def __init__(self, radius):
self.radius = radius
def area(self):
return 3.14159 * self.radius ** 2
def perimeter(self):
return 2 * 3.14159 * self.radius
Shape() # TypeError: Can't instantiate abstract class
Circle(5) # works fine
ABCs are worthwhile in large codebases and public APIs where explicit contracts prevent integration errors. Duck typing is sufficient for smaller codebases and internal code where flexibility matters more than safety. The collections.abc module provides standard ABCs like Iterable, Mapping, and Sequence that you can use for type checking and isinstance() tests.
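A few isinstance() checks against the standard ABCs (a minimal sketch):

```python
from collections.abc import Iterable, Mapping, Sequence

print(isinstance([1, 2], Sequence))   # True
print(isinstance('abc', Sequence))    # True, strings are sequences too
print(isinstance({'a': 1}, Mapping))  # True
print(isinstance(42, Iterable))       # False, ints are not iterable
```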
Follow-up: When are ABCs worthwhile vs when is duck typing sufficient?
Descriptors are objects that define __get__, __set__, or __delete__ methods. They power Python's attribute access protocol and are the mechanism behind property, classmethod, and staticmethod.
There are two types:
- Data descriptors define __set__ or __delete__ (e.g., property)
- Non-data descriptors define only __get__ (e.g., classmethod, staticmethod, regular functions)
The lookup order matters: data descriptors take priority over instance __dict__, which takes priority over non-data descriptors.
class Validated:
def __set_name__(self, owner, name):
self.name = name
def __get__(self, obj, objtype=None):
return obj.__dict__.get(self.name)
def __set__(self, obj, value):
if not isinstance(value, int):
raise TypeError(f'{self.name} must be int')
obj.__dict__[self.name] = value
class Order:
quantity = Validated()
o = Order()
o.quantity = 5 # works
o.quantity = 'x' # TypeError: quantity must be int
Understanding descriptors explains how Python's entire attribute access system works under the hood.
__slots__ replaces the per-instance __dict__ with a fixed set of attribute slots, reducing memory footprint significantly for classes with many instances — roughly 40-50% for simple objects.
class Point:
__slots__ = ('x', 'y')
def __init__(self, x, y):
self.x = x
self.y = y
p = Point(1, 2)
p.z = 3 # AttributeError: 'Point' object has no attribute 'z'
Use __slots__ when you're creating millions of instances of a class and memory is a concern — for example, data points in a scientific application or nodes in a graph.
Trade-offs:
- No dynamic attribute assignment — you can only use declared attributes
- Complications with multiple inheritance — both parent classes need compatible __slots__
- __slots__ doesn't inherit automatically — subclasses get __dict__ unless they also define __slots__
- Cannot use __slots__ with __dict__ unless you explicitly include '__dict__' in slots
- Slightly faster attribute access due to descriptor-based lookup instead of dict lookup
Follow-up: What are the trade-offs and limitations of using __slots__?
Patterns (9)
A decorator is a function that takes a function and returns a modified function. It's syntactic sugar for func = decorator(func).
import functools
import time
def timer(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
start = time.perf_counter()
result = func(*args, **kwargs)
elapsed = time.perf_counter() - start
print(f'{func.__name__} took {elapsed:.4f}s')
return result
return wrapper
@timer
def slow_function():
time.sleep(1)
slow_function() # slow_function took 1.0012s
functools.wraps copies the original function's metadata (__name__, __doc__, __module__, etc.) to the wrapper. Without it, introspection and debugging tools see the wrapper's name instead of the original function's name. This matters for logging, documentation generation, and frameworks that inspect function metadata.
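Comparing a wrapper with and without functools.wraps shows what is lost (the decorator names here are hypothetical):

```python
import functools

def without_wraps(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def with_wraps(func):
    @functools.wraps(func)  # copies __name__, __doc__, etc. onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@without_wraps
def first():
    """First docstring."""

@with_wraps
def second():
    """Second docstring."""

print(first.__name__, first.__doc__)    # wrapper None (metadata lost)
print(second.__name__, second.__doc__)  # second Second docstring.
```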
Follow-up: Why is functools.wraps important and what happens if you omit it?
A decorator with arguments requires a three-level nested function pattern. The outermost function accepts the decorator arguments and returns the actual decorator.
import functools
import time
def retry(max_attempts=3, delay=1):
def decorator(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
for attempt in range(max_attempts):
try:
return func(*args, **kwargs)
except Exception:
if attempt == max_attempts - 1:
raise
time.sleep(delay)
return wrapper
return decorator
@retry(max_attempts=5, delay=0.5)
def fetch_data():
pass
The three levels are necessary because @retry(max_attempts=3) first calls retry(max_attempts=3), which returns decorator. Then Python applies decorator to the function, which returns wrapper. The outer call is evaluated at decoration time, the middle level receives the function, and the inner level handles each call. Candidates who implement this without hesitation have strong closure fundamentals.
Follow-up: Why does @retry(max_attempts=3) require three levels of nesting?
List comprehensions build the entire list in memory. Generators yield items lazily, one at a time, using constant memory regardless of the data size.
squares_list = [x**2 for x in range(1_000_000)] # ~8MB in memory
squares_gen = (x**2 for x in range(1_000_000)) # negligible memory
Choose generators when:
- Processing large datasets that don't fit in memory
- Working with pipelines where you only need one item at a time
- The consumer might stop early (short-circuiting)
Choose list comprehensions when:
- You need random access or multiple passes over the data
- The dataset is small enough to fit in memory
- You need len(), indexing, or slicing
Generators are single-use — once exhausted, they cannot be iterated again. Lists support unlimited re-iteration. A common mistake is assigning a generator to a variable and trying to iterate it twice, getting an empty result the second time.
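The single-use behavior in action:

```python
gen = (x**2 for x in range(3))
print(list(gen))  # [0, 1, 4]
print(list(gen))  # [] (exhausted; a second pass yields nothing)

lst = [x**2 for x in range(3)]
print(list(lst))  # [0, 1, 4]
print(list(lst))  # [0, 1, 4] (lists can be re-iterated freely)
```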
yield from delegates to another iterable, forwarding values, send() calls, and exceptions transparently. Without it, you need an explicit loop with manual send()/throw() forwarding.
def flatten(nested):
for item in nested:
if isinstance(item, list):
yield from flatten(item)
else:
yield item
list(flatten([1, [2, [3, 4], 5]])) # [1, 2, 3, 4, 5]
yield from is more than just for x in iterable: yield x. It also:
- Forwards send() values to the sub-generator
- Propagates exceptions into the sub-generator
- Captures the sub-generator's return value (via StopIteration.value)
def accumulate():
total = 0
while True:
value = yield total
if value is None:
return total
total += value
def main():
result = yield from accumulate()
print(f'Total: {result}')
This makes yield from essential for composing complex generator pipelines and coroutine delegation.
The iterator protocol requires two methods: __iter__ (returns the iterator object) and __next__ (returns the next value or raises StopIteration).
An iterable defines __iter__, which returns an iterator. An iterator defines both __iter__ (returning self) and __next__. Because iterators return self from __iter__, for loops work whether you pass them the iterable or the iterator.
class Countdown:
def __init__(self, start):
self.start = start
def __iter__(self):
return CountdownIterator(self.start)
class CountdownIterator:
def __init__(self, current):
self.current = current
def __iter__(self):
return self
def __next__(self):
if self.current <= 0:
raise StopIteration
self.current -= 1
return self.current + 1
for n in Countdown(3):
print(n) # 3, 2, 1
Separating the iterable from the iterator allows multiple independent iterations over the same data. If the class were its own iterator, you couldn't iterate it twice concurrently.
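Because Countdown.__iter__ returns a fresh iterator each time, two loops over the same object don't interfere (the classes are repeated here for completeness):

```python
class Countdown:
    def __init__(self, start):
        self.start = start
    def __iter__(self):
        return CountdownIterator(self.start)  # fresh iterator per loop

class CountdownIterator:
    def __init__(self, current):
        self.current = current
    def __iter__(self):
        return self
    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

c = Countdown(2)
# Two nested, concurrent iterations over the same Countdown object
pairs = [(a, b) for a in c for b in c]
print(pairs)  # [(2, 2), (2, 1), (1, 2), (1, 1)]
```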
Follow-up: What is the difference between an iterable and an iterator?
Context managers implement the __enter__ and __exit__ protocol, used with with statements for resource management.
Class-based approach:
class ManagedFile:
def __init__(self, path, mode):
self.path = path
self.mode = mode
def __enter__(self):
self.file = open(self.path, self.mode)
return self.file
def __exit__(self, exc_type, exc_val, exc_tb):
self.file.close()
return False # don't suppress exceptions
Decorator-based approach using contextlib:
from contextlib import contextmanager
@contextmanager
def managed_file(path, mode):
f = open(path, mode)
try:
yield f
finally:
f.close()
Returning True from __exit__ suppresses the exception — the with block won't propagate it. This is rarely used but useful for specific error-handling patterns. The try/finally in the generator approach ensures cleanup happens even if the body raises an exception.
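A sketch of a suppressing context manager; note the stdlib already provides contextlib.suppress for exactly this:

```python
class Suppress:
    def __init__(self, *exc_types):
        self.exc_types = exc_types

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Returning True tells Python to swallow the exception;
        # returning False (or None) lets it propagate.
        return exc_type is not None and issubclass(exc_type, self.exc_types)

with Suppress(KeyError):
    {}['missing']  # raises KeyError, swallowed by __exit__
print('still running')
```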
Follow-up: What does returning True from __exit__ do?
BaseException is the root of all exceptions. Exception inherits from it. KeyboardInterrupt and SystemExit inherit directly from BaseException, not Exception.
This design is intentional: except Exception won't catch Ctrl+C (KeyboardInterrupt) or sys.exit() (SystemExit), allowing programs to shut down cleanly.
BaseException
├── SystemExit
├── KeyboardInterrupt
├── GeneratorExit
└── Exception
├── StopIteration
├── ValueError
├── TypeError
├── KeyError
├── OSError
└── ...
Best practices:
- Catch specific exceptions, not broad ones
- Never use bare except: — it catches KeyboardInterrupt and SystemExit
- except Exception: pass silently hides bugs — always log or re-raise
- Use except Exception as e: to capture the exception for logging
try:
result = process(data)
except ValueError as e:
logger.warning(f'Invalid data: {e}')
result = default_value
Writing except Exception without re-raising also catches StopIteration, which can silently break generator-based code.
Follow-up: Why should you almost never write except Exception without re-raising?
Python has four comprehension types, each providing concise syntax for creating collections:
squares = [x**2 for x in range(10)] # list
evens = {x for x in range(10) if x % 2 == 0} # set
mapping = {x: x**2 for x in range(5)} # dict
lazy = (x**2 for x in range(10)) # generator expression
Comprehensions become unreadable when nested or when they combine multiple conditions:
# Hard to read — use a regular loop instead
result = [
transform(x, y)
for x in range(10)
if x > 3
for y in range(x)
if y % 2 == 0
]
Nested comprehensions read inside-out, which confuses most developers. A good rule: if a comprehension needs more than one for clause or more than one condition, use an explicit loop.
Generator expressions use parentheses and are lazy — they don't build the entire collection in memory. When passed directly as the sole argument to a function, the extra parentheses can be omitted: sum(x**2 for x in range(10)).
Python 3.10 introduced structural pattern matching (PEP 634) via match/case statements. Unlike switch in C or Java, Python's match does structural decomposition — it can match and unpack complex data structures.
def handle_command(command):
match command:
case {'action': 'move', 'direction': d}:
print(f'Moving {d}')
case {'action': 'attack', 'target': t, 'weapon': w}:
print(f'Attacking {t} with {w}')
case {'action': 'quit'}:
print('Goodbye')
case _:
print('Unknown command')
Pattern types:
match value:
case 42: # literal pattern
pass
case str(s): # class pattern with capture
pass
case [x, y, *rest]: # sequence pattern with star
pass
case {'key': v}: # mapping pattern
pass
case Point(x=0, y=y): # class pattern — matches if x==0
pass
case x if x > 0: # guard clause
pass
case int() | float(): # OR pattern
pass
Key differences from switch:
- No fall-through — each case is independent
- Patterns destructure and bind variables simultaneously
- Guards (if clauses) enable conditional matching
- Works with custom classes via __match_args__
Pattern matching is most useful for parsing commands, handling protocol messages, and processing ASTs — anywhere you need to match and decompose structured data.
Follow-up: What is structural pattern matching and how does it destructure complex objects?
Concurrency (5)
The GIL is a mutex in CPython that prevents multiple native threads from executing Python bytecode simultaneously. It exists because CPython's memory management — specifically reference counting — is not thread-safe. Without the GIL, simple operations like incrementing a reference count could corrupt memory in a multi-threaded program.
The GIL simplifies CPython's implementation and makes C extension development easier, since extensions don't need to worry about thread-safe reference counting.
The GIL only affects CPU-bound threads. I/O-bound threads release the GIL during system calls (network requests, file reads, sleep), allowing true concurrency for I/O workloads.
import threading
import time
def io_bound():
time.sleep(1) # GIL is released during sleep
threads = [threading.Thread(target=io_bound) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join() # completes in ~1 second, not 10
The GIL does NOT prevent all race conditions. Operations that span multiple bytecodes (like counter += 1, which is LOAD, ADD, STORE) can still be interleaved between threads.
Follow-up: Does the GIL prevent all race conditions in Python?
Each concurrency model maps to a specific workload type:
threading — I/O-bound work where the GIL is released during system calls. Good for network requests, file I/O, database queries. Threads share memory, making data sharing easy but requiring locks for thread safety.
multiprocessing — CPU-bound work that needs true parallelism. Each process has its own GIL, so they run on separate cores. Trade-off: higher memory overhead (full process per worker) and data must be serialized (pickled) to pass between processes.
asyncio — high-concurrency I/O-bound work with a single-threaded event loop. Excellent for thousands of concurrent connections (web servers, crawlers). Lower overhead than threads, but requires async/await syntax throughout, and a single blocking call stalls everything.
# I/O-bound: use threading or asyncio
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor() as pool:
results = pool.map(fetch_url, urls)
# CPU-bound: use multiprocessing
with concurrent.futures.ProcessPoolExecutor() as pool:
results = pool.map(heavy_computation, data)
For most applications, concurrent.futures provides a clean high-level API that works with both threads and processes.
Follow-up: What are the trade-offs of each approach in terms of memory, complexity, and debugging?
import asyncio
import aiohttp
async def fetch_url(session, url):
async with session.get(url) as response:
return await response.text()
async def fetch_all(urls):
async with aiohttp.ClientSession() as session:
tasks = [fetch_url(session, url) for url in urls]
return await asyncio.gather(*tasks)
results = asyncio.run(fetch_all([
'https://example.com',
'https://example.org',
]))
asyncio.gather launches all tasks concurrently and collects results in order. By default, if one task raises an exception, the others continue running but gather propagates the first exception.
asyncio.TaskGroup (Python 3.11+) provides structured concurrency — if any task fails, all remaining tasks in the group are cancelled:
async def fetch_all(urls):
async with aiohttp.ClientSession() as session:
async with asyncio.TaskGroup() as tg:
tasks = [tg.create_task(fetch_url(session, url)) for url in urls]
return [t.result() for t in tasks]
async with is needed because aiohttp.ClientSession is an async context manager. Using a regular with would not properly clean up the session.
Follow-up: What is the difference between asyncio.gather and asyncio.TaskGroup?
A race condition occurs when the outcome of a program depends on the timing of thread execution. Even with the GIL, race conditions exist in Python because the GIL only guarantees atomic bytecode execution, not atomic compound operations.
import threading
counter = 0
def increment():
global counter
for _ in range(100_000):
counter += 1 # NOT atomic: LOAD_GLOBAL, ADD, STORE_GLOBAL
threads = [threading.Thread(target=increment) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter) # usually less than 400,000 — race condition!
counter += 1 involves three bytecodes: LOAD, ADD, STORE. The GIL can release between any of them, causing threads to overwrite each other's increments.
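You can see the separate bytecodes with the dis module (exact opcode names vary across CPython versions, but the load and store are always distinct steps):

```python
import dis

counter = 0

def increment():
    global counter
    counter += 1

# List the opcode names for the function body
ops = [ins.opname for ins in dis.get_instructions(increment)]
print(ops)  # LOAD_GLOBAL ... STORE_GLOBAL appear as separate instructions
```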
Solutions:
- threading.Lock() for mutual exclusion
- queue.Queue for thread-safe producer-consumer patterns
- asyncio.Lock() for async code
- Synchronization primitives from threading (e.g., Event, Semaphore)
- Redesigning with immutable data or message passing
lock = threading.Lock()
def safe_increment():
global counter
with lock:
counter += 1
Follow-up: Give an example of a race condition that exists despite the GIL.
asyncio.run() creates a new event loop, runs a coroutine to completion, and closes the loop. It's the entry point for async programs — typically called once from synchronous code:
async def main():
result = await some_async_function()
print(result)
asyncio.run(main())
asyncio.create_task() schedules a coroutine for concurrent execution within an existing event loop. The task starts running as soon as the current coroutine yields control:
async def main():
task1 = asyncio.create_task(fetch('url1'))
task2 = asyncio.create_task(fetch('url2'))
result1 = await task1
result2 = await task2
await suspends the current coroutine until the awaitable completes. It doesn't create tasks or start concurrent execution — it just waits.
The key distinction: await coroutine() runs it sequentially. create_task(coroutine()) followed by await task runs it concurrently. Without create_task, multiple await calls execute one after another, losing the benefit of async.
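A timing sketch makes the difference concrete (durations are approximate):

```python
import asyncio
import time

async def work():
    await asyncio.sleep(0.1)

async def sequential():
    await work()  # runs to completion...
    await work()  # ...before this one even starts

async def concurrent():
    t1 = asyncio.create_task(work())  # both tasks scheduled immediately
    t2 = asyncio.create_task(work())
    await t1
    await t2

start = time.perf_counter()
asyncio.run(sequential())
print(f'sequential: ~{time.perf_counter() - start:.1f}s')  # ~0.2s

start = time.perf_counter()
asyncio.run(concurrent())
print(f'concurrent: ~{time.perf_counter() - start:.1f}s')  # ~0.1s
```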
Memory & Internals (5)
CPython uses reference counting as its primary garbage collection mechanism. Every object has a count of references pointing to it. When the count drops to zero, the object is immediately deallocated.
Reference counting alone cannot handle circular references — when objects reference each other, their counts never reach zero:
class Node:
def __init__(self):
self.ref = None
a = Node()
b = Node()
a.ref = b
b.ref = a
del a, b # reference counts are still 1 — not collected!
Python's generational garbage collector supplements reference counting by periodically scanning for reference cycles. It uses three generations (0, 1, 2) based on object age. Young objects are scanned more frequently because most objects are short-lived.
You can interact with the collector via the gc module:
import gc
gc.collect() # force a collection
gc.disable() # disable automatic collection
gc.get_threshold() # (700, 10, 10) — default thresholds
gc.get_referrers(obj) # find what references an object
Use gc.collect() explicitly after deleting large data structures in memory-sensitive applications. Use weakref to break cycles without preventing garbage collection.
Follow-up
Follow-up: How can you interact with the garbage collector programmatically?
Interning is an optimization where CPython reuses existing objects instead of creating new ones for certain values. This saves memory and speeds up comparisons (identity checks are faster than value comparisons).
Integer interning: CPython pre-creates and caches integers from -5 to 256. Any variable assigned a value in this range points to the same object:
a = 256
b = 256
a is b # True — same object
a = 257
b = 257
a is b # False (in most contexts) — different objects
String interning: CPython automatically interns strings that look like identifiers (alphanumeric characters and underscores). You can manually intern strings with sys.intern():
import sys
a = 'hello'
b = 'hello'
a is b # True — automatically interned
a = 'hello world'
b = 'hello world'
a is b # False — contains space, not auto-interned
a = sys.intern('hello world')
b = sys.intern('hello world')
a is b # True — manually interned
These are CPython implementation details, not language guarantees. Code should never rely on interning behavior — always use == for value comparison, reserving is for None checks.
Every regular Python object has a __dict__ dictionary that stores its instance attributes. Classes also have their own __dict__ for class-level attributes and methods.
Attribute lookup follows a specific chain:
1. Data descriptors on the class (and its MRO) — objects with __get__ and __set__
2. Instance __dict__ — the object's own attributes
3. Non-data descriptors and class attributes — objects with only __get__, or plain class variables
4. __getattr__ — called as a fallback if defined, only when normal lookup fails
class MyClass:
    class_attr = 'class level'

    def __init__(self):
        self.instance_attr = 'instance level'

    def __getattr__(self, name):
        return f'fallback for {name}'

obj = MyClass()
obj.instance_attr  # 'instance level' — from obj.__dict__
obj.class_attr  # 'class level' — from MyClass.__dict__
obj.anything  # 'fallback for anything' — __getattr__ fallback
Note the distinction: __getattr__ is called only when normal lookup fails, while __getattribute__ is called for every attribute access and can override the entire lookup chain. Understanding this protocol explains how property, classmethod, and custom descriptors work.
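A minimal sketch (the class name Logged is ours) makes the contrast visible:

```python
class Logged:
    def __init__(self):
        self.x = 1

    def __getattribute__(self, name):
        # runs for EVERY attribute access; delegate to the default
        # implementation to avoid infinite recursion
        print(f'accessing {name}')
        return object.__getattribute__(self, name)

    def __getattr__(self, name):
        # runs only after normal lookup raises AttributeError
        return f'missing: {name}'

obj = Logged()
obj.x        # prints 'accessing x', finds it in obj.__dict__
obj.missing  # prints 'accessing missing', then __getattr__ answers
```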
copy.copy() creates a new object but inserts references to the same nested objects. copy.deepcopy() recursively copies everything, creating fully independent copies at every level.
import copy
original = [[1, 2], [3, 4]]
shallow = copy.copy(original)
deep = copy.deepcopy(original)
shallow[0].append(5)
print(original[0]) # [1, 2, 5] — shared reference!
print(deep[0]) # [1, 2] — independent copy
Shallow copy is sufficient when:
- The structure is flat (no nested mutables)
- All nested elements are immutable (strings, tuples, frozensets)
Deep copy is needed when:
- Nested mutable objects exist and must be independently modifiable
- You're creating snapshots of complex state
deepcopy handles circular references by maintaining a memo dictionary that tracks already-copied objects, preventing infinite recursion. You can customize copying behavior by defining __copy__ and __deepcopy__ methods on your classes.
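As a sketch of that customization hook (the Config class and its shared resource attribute are invented for illustration), a __deepcopy__ override can deep-copy some attributes while deliberately sharing others:

```python
import copy

class Config:
    def __init__(self, values, resource=None):
        self.values = values      # per-instance data: must be copied
        self.resource = resource  # shared handle: copy the reference only

    def __deepcopy__(self, memo):
        new = Config.__new__(Config)
        # register in memo first so circular references resolve to `new`
        memo[id(self)] = new
        new.values = copy.deepcopy(self.values, memo)
        new.resource = self.resource  # intentionally shared
        return new

original = Config({'a': [1, 2]}, resource=object())
clone = copy.deepcopy(original)
clone.values['a'].append(3)  # does not touch original.values
assert original.values['a'] == [1, 2]
assert clone.resource is original.resource
```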
Follow-up
Follow-up: How does deepcopy handle circular references?
The weakref module provides references that don't prevent garbage collection. A weak reference to an object doesn't increment its reference count, so the object can be collected when no strong references remain.
import weakref
class ExpensiveObject:
    def __init__(self, name):
        self.name = name

obj = ExpensiveObject('data')
weak = weakref.ref(obj)
print(weak())  # <ExpensiveObject ...> — object still alive
del obj
print(weak())  # None — object was garbage collected
Use cases:
- Caching — objects stay cached only while used elsewhere, with WeakValueDictionary
- Observer patterns — observers don't keep subjects alive
- Avoiding circular reference leaks — break cycles without preventing collection
cache = weakref.WeakValueDictionary()
def get_or_create(key):
    obj = cache.get(key)
    if obj is None:
        obj = ExpensiveObject(key)
        cache[key] = obj
    return obj
WeakSet is useful for tracking all instances of a class without preventing their collection. Note that not all objects support weak references — built-in types like int, str, and tuple cannot be weakly referenced.
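A sketch of that instance tracking (the Widget class is invented; the immediate reclamation shown relies on CPython's reference counting):

```python
import gc
import weakref

class Widget:
    # registry that does not keep instances alive
    _instances = weakref.WeakSet()

    def __init__(self, name):
        self.name = name
        Widget._instances.add(self)

a = Widget('a')
b = Widget('b')
print(len(Widget._instances))  # 2
del b
gc.collect()  # immediate in CPython anyway; makes the point portable
print(len(Widget._instances))  # 1: the collected instance dropped out
```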
Coding Challenges
(5)An LRU (Least Recently Used) cache evicts the oldest unused entry when full. Python 3.7+ guarantees dict insertion order, which simplifies the implementation:
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = {}

    def get(self, key):
        if key not in self.cache:
            return -1
        self.cache[key] = self.cache.pop(key)
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            self.cache.pop(key)
        elif len(self.cache) >= self.capacity:
            oldest = next(iter(self.cache))
            del self.cache[oldest]
        self.cache[key] = value
Both get and put are O(1) operations. The trick is that pop followed by re-insertion moves the key to the end of the ordered dict.
Alternatively, collections.OrderedDict provides move_to_end():
from collections import OrderedDict
class LRUCache(OrderedDict):
    def __init__(self, capacity):
        super().__init__()
        self.capacity = capacity

    def get(self, key):
        if key not in self:
            return -1
        self.move_to_end(key)
        return self[key]

    def put(self, key, value):
        if key in self:
            self.move_to_end(key)
        self[key] = value
        if len(self) > self.capacity:
            self.popitem(last=False)
For thread safety, wrap operations with threading.Lock(). For production use, consider functools.lru_cache which handles all of this.
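That thread-safe wrapping might be sketched like this, guarding the OrderedDict variant with a single lock (class name ours):

```python
import threading
from collections import OrderedDict

class ThreadSafeLRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()
        self.lock = threading.Lock()  # one lock guards every operation

    def get(self, key):
        with self.lock:
            if key not in self.cache:
                return -1
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]

    def put(self, key, value):
        with self.lock:
            if key in self.cache:
                self.cache.move_to_end(key)
            self.cache[key] = value
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)  # evict least recently used
```

Using composition rather than subclassing OrderedDict avoids exposing unlocked dict methods to callers.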
Follow-up
Follow-up: How would you make this thread-safe? What is the time complexity?
Use sorted characters as a hash key to group words that are anagrams of each other:
from collections import defaultdict
def group_anagrams(words):
    groups = defaultdict(list)
    for word in words:
        key = tuple(sorted(word.lower()))
        groups[key].append(word)
    return list(groups.values())
group_anagrams(['eat', 'tea', 'tan', 'ate', 'nat', 'bat'])
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
The tuple(sorted(word)) approach has O(k log k) per word where k is the word length. For very long strings, a character frequency tuple is O(k):
def group_anagrams_fast(words):
    groups = defaultdict(list)
    for word in words:
        counts = [0] * 26
        for c in word.lower():
            counts[ord(c) - ord('a')] += 1
        groups[tuple(counts)].append(word)
    return list(groups.values())
Using defaultdict over manual key checking is idiomatic Python. The overall complexity is O(n * k log k) for the sorted approach or O(n * k) for the frequency approach, where n is the number of words.
Follow-up
Follow-up: What alternative key strategy would work better for very long strings?
This combines decorators, closures, and caching — three intermediate concepts in one problem:
import functools
import time
def memoize(ttl=None):
    def decorator(func):
        cache = {}
        @functools.wraps(func)
        def wrapper(*args):
            now = time.time()
            if args in cache:
                result, timestamp = cache[args]
                if ttl is None or now - timestamp < ttl:
                    return result
            result = func(*args)
            cache[args] = (result, now)
            return result
        return wrapper
    return decorator

@memoize(ttl=60)
def expensive_query(query_id):
    pass
The TTL expiration check ensures stale entries are recomputed. The cache key is the args tuple, which works because tuples are hashable.
**kwargs makes caching harder because dicts are not hashable and can't be used as cache keys. One solution is to convert kwargs to a frozen set of items: key = (args, frozenset(kwargs.items())). However, this still fails when any keyword argument's value is unhashable (a list or dict, for example).
For production use, functools.lru_cache handles most cases. For TTL support, consider cachetools.TTLCache. For thread safety, add a threading.Lock around cache access.
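A sketch of a kwargs-aware cache key (decorator name ours; assumes every argument value is hashable):

```python
import functools

def memoize_kwargs(func):
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # sorting the items makes f(a=1, b=2) and f(b=2, a=1) hit one entry
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    return wrapper
```

Note that f(1, 2) and f(a=1, b=2) still produce distinct keys; fully normalizing positional against keyword arguments requires binding them with inspect.signature().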
Follow-up
Follow-up: Why does **kwargs make caching harder?
This tests the context manager pattern applied to temporary state changes — a common real-world pattern:
import os
from contextlib import contextmanager
@contextmanager
def change_dir(path):
    original = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(original)

with change_dir('/tmp'):
    print(os.getcwd())  # /tmp
print(os.getcwd())  # back to original
The key elements are:
- Store original state before making changes
- try/finally guarantees cleanup even if the body raises an exception
- yield without a value, since the user doesn't need a reference
This pattern applies broadly to any temporary state change: environment variables, database transactions, monkey-patching, locale settings. The finally block is critical — without it, an exception in the with body would leave the process in the wrong directory.
A class-based version would store original in __enter__ and restore in __exit__, with the same try/finally guarantee built into the protocol.
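That class-based version might look like this (class name ours):

```python
import os
import tempfile

class ChangeDir:
    def __init__(self, path):
        self.path = path
        self.original = None

    def __enter__(self):
        self.original = os.getcwd()  # store original state
        os.chdir(self.path)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        os.chdir(self.original)  # restore even if the body raised
        return False             # never suppress exceptions

with tempfile.TemporaryDirectory() as tmp:
    with ChangeDir(tmp):
        pass  # cwd is tmp in here
# cwd restored here
```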
A pipeline chains functions where each function's output feeds into the next function's input:
from functools import reduce
def pipeline(*funcs):
    def apply(value):
        return reduce(lambda acc, fn: fn(acc), funcs, value)
    return apply

process = pipeline(
    str.strip,
    str.lower,
    lambda s: s.replace(' ', '_'),
)
process('  Hello World  ')  # 'hello_world'
For error handling, wrap each step:
def safe_pipeline(*funcs):
    def apply(value):
        for fn in funcs:
            try:
                value = fn(value)
            except Exception as e:
                raise ValueError(
                    f'Pipeline failed at {fn.__name__}: {e}'
                ) from e
        return value
    return apply
For lazy evaluation, use generators:
def lazy_pipeline(*funcs):
    def apply(iterable):
        for fn in funcs:
            iterable = map(fn, iterable)
        return iterable
    return apply
This pattern appears in data processing frameworks, middleware stacks, and build tools. Understanding function composition and reduce shows comfort with functional programming concepts.
Follow-up
Follow-up: How would you add error handling or lazy evaluation to this pipeline?
Pitfalls
(5)The output is 8, 8, 8, 8, 8 — not 0, 2, 4, 6, 8 as most developers expect.
This is Python's most infamous closure gotcha. The lambdas all capture the same variable i by reference, not by value. By the time any lambda is called, the loop has completed and i has the value 4. So every lambda computes x * 4.
The fix is to capture i as a default argument, which evaluates at definition time:
def create_multipliers():
    return [lambda x, i=i: x * i for i in range(5)]

for multiplier in create_multipliers():
    print(multiplier(2))  # 0, 2, 4, 6, 8
Alternatively, use functools.partial:
from functools import partial
def multiply(x, i):
    return x * i

def create_multipliers():
    return [partial(multiply, i=i) for i in range(5)]
This is the same late-binding closure issue that exists in JavaScript and other languages with closures. Candidates who recognize it instantly have debugged this in production code.
Follow-up
def create_multipliers():
    return [lambda x: x * i for i in range(5)]

for multiplier in create_multipliers():
    print(multiplier(2))
settings is a class attribute, shared across all instances. Mutating it via any instance affects all of them because self.settings[key] = value modifies the existing dict object — it doesn't create a new instance attribute.
a = Config()
b = Config()
a.settings is b.settings # True — same dict object
The fix is to initialize mutable attributes in __init__ so each instance gets its own copy:
class Config:
    def __init__(self):
        self.settings = {}  # instance attribute — unique per instance

    def set(self, key, value):
        self.settings[key] = value

a = Config()
b = Config()
a.set('debug', True)
print(b.settings)  # {} — independent copy
This is the class-level version of the mutable default argument gotcha. The underlying principle is the same: mutable objects defined at the class level (or as default arguments) are shared references. Immutable class attributes (strings, numbers, tuples) don't have this problem because they can't be mutated in place.
Follow-up
class Config:
    settings = {}

    def set(self, key, value):
        self.settings[key] = value

a = Config()
b = Config()
a.set('debug', True)
print(b.settings)  # {'debug': True}
This silently swallows every exception — including unexpected ones like TypeError from a bug in process(), MemoryError, or StopIteration from generator code. The program continues with undefined state, making bugs extremely difficult to diagnose.
Problems with except Exception: pass:
- Hides genuine bugs by catching exceptions you didn't anticipate
- Makes debugging nearly impossible — no error message, no traceback
- Can mask StopIteration, silently breaking generator-based code
- The program continues in an unknown state after the failure
The fix is to catch specific exceptions and handle them explicitly:
import logging
try:
    data = fetch_remote_data()
    result = process(data)
except ConnectionError:
    logging.warning('Failed to fetch data, using cached version')
    result = get_cached_data()
except ValueError as e:
    logging.error(f'Invalid data format: {e}')
    raise
If you genuinely need to catch all exceptions (rare), always log the error:
except Exception:
    logging.exception('Unexpected error during processing')
    raise  # re-raise after logging
Never use bare except: — it catches KeyboardInterrupt and SystemExit, preventing clean shutdown.
Follow-up
try:
    data = fetch_remote_data()
    result = process(data)
except Exception:
    pass
This is a circular import. When module_a starts loading, it tries to import helper_b from module_b. Python starts executing module_b, which tries to import helper_a from module_a. But module_a hasn't finished loading yet — helper_a hasn't been defined — so the import fails with ImportError.
Python's import machinery caches modules in sys.modules as they load. During circular imports, a partially-loaded module is returned, which may not have all its attributes defined yet.
Solutions:
1. Restructure to break the cycle — move shared code to a third module:
# shared.py
def helper_a(): return 'A'
def helper_b(): return helper_a()
2. Use local imports inside functions — defers the import until call time:
# module_b.py
def helper_b():
    from module_a import helper_a
    return helper_a()
3. Import the module, not the name — import module_a instead of from module_a import helper_a. Module-level imports succeed because the module object exists in sys.modules even while partially loaded.
Prefer restructuring over local imports. Circular dependencies usually indicate a design problem.
Follow-up
# module_a.py
from module_b import helper_b

def helper_a():
    return 'A'

# module_b.py
from module_a import helper_a

def helper_b():
    return helper_a()
print(b) # [1, 2, 3]
print(d) # [1, 2, 3, 4, 5]
a = a + [4, 5] creates a new list via list.__add__ and rebinds a to it. b still references the original list.
c += [4, 5] calls list.__iadd__, which mutates c in place by extending it. Since d references the same object as c, d sees the change.
This is a critical distinction:
- __add__ returns a new object (non-mutating)
- __iadd__ modifies the object in place (mutating) for mutable types
For immutable types like tuples and strings, += creates a new object because in-place mutation is impossible:
a = (1, 2)
b = a
a += (3,)
# a is a new tuple, b is unchanged
This catches developers who assume += is always shorthand for = ... +. Understanding the difference between rebinding a name and mutating an object is fundamental to Python's data model.
Follow-up
a = [1, 2, 3]
b = a
a = a + [4, 5]
print(b) # ?
c = [1, 2, 3]
d = c
c += [4, 5]
print(d) # ?