Backend Developer Interview Questions
31 questions — 7 easy · 17 medium · 7 hard
Programming Fundamentals (5)
SOLID:
- S — Single Responsibility: A class should have only one reason to change
- O — Open/Closed: Open for extension, closed for modification
- L — Liskov Substitution: Subtypes must be substitutable for their base types
- I — Interface Segregation: Many specific interfaces are better than one general-purpose interface
- D — Dependency Inversion: Depend on abstractions, not concretions
SRP example:
# Bad — one class does too much
class UserService:
    def create_user(self, data): ...
    def send_welcome_email(self, user): ...
    def generate_report(self, user): ...

# Good — each class has one responsibility
class UserService:
    def create_user(self, data): ...

class EmailService:
    def send_welcome_email(self, user): ...

class ReportService:
    def generate_report(self, user): ...
SRP makes code easier to test, maintain, and reason about. Each class changes only when its specific responsibility changes.
Follow-up: Which SOLID principle do you find most difficult to apply in practice?
The testing pyramid organizes tests by speed, cost, and scope:
      /  E2E  \      Few, slow, expensive
     / Integ.  \     Medium amount
    /   Unit    \    Many, fast, cheap
Unit tests (base):
- Test individual functions/classes in isolation
- Fast, deterministic, many of them
- Mock external dependencies
- Example: test a calculateTotal() function
Integration tests (middle):
- Test how components work together
- May involve real database, API calls between services
- Slower than unit tests, fewer of them
- Example: test an API endpoint with a real database
End-to-end tests (top):
- Test complete user workflows through the entire system
- Slowest, most brittle, fewest of them
- Example: test the full signup flow in a browser
Mocking vs stubbing vs faking:
- Mock — records calls and verifies interactions ("was this method called with these args?")
- Stub — returns predefined responses ("when called, return this value")
- Fake — a working simplified implementation (e.g., in-memory database instead of real one)
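The three kinds of test doubles can be sketched with Python's unittest.mock plus a hand-rolled fake (the gateway, mailer, and repository names here are illustrative, not from any specific codebase):

```python
from unittest.mock import Mock

# Stub — returns a canned value; we don't care how it was called
stub_gateway = Mock()
stub_gateway.charge.return_value = {"status": "ok"}
assert stub_gateway.charge(100)["status"] == "ok"

# Mock — records calls so we can verify the interaction afterwards
mock_mailer = Mock()
mock_mailer.send_welcome_email("alice@example.com")
mock_mailer.send_welcome_email.assert_called_once_with("alice@example.com")

# Fake — a real but simplified implementation (in-memory "database")
class FakeUserRepo:
    def __init__(self):
        self._rows = {}

    def save(self, user_id, data):
        self._rows[user_id] = data

    def find(self, user_id):
        return self._rows.get(user_id)

repo = FakeUserRepo()
repo.save(1, {"name": "Alice"})
assert repo.find(1) == {"name": "Alice"}
```

Note that in unittest.mock a single Mock object can act as either a stub or a mock; the distinction is in how the test uses it.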
Follow-up: What is the difference between mocking, stubbing, and faking?
Common strategies:
Cache-Aside (Lazy Loading):
- Application checks cache first, on miss reads from DB and populates cache
- Best for read-heavy workloads
- Risk: cache miss penalty, stale data
Write-Through:
- Write to cache and DB simultaneously
- Data is always consistent
- Slower writes, good for read-heavy + consistency needs
Write-Behind (Write-Back):
- Write to cache, asynchronously write to DB
- Fast writes, risk of data loss if cache fails
TTL (Time-To-Live):
- Data expires after a set time
- Simple, good for data that changes infrequently
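Cache-aside combined with a TTL is straightforward to sketch in Python; the dict cache and fetch_user_from_db function below are stand-ins for Redis and a real database query:

```python
import time

cache = {}  # stands in for Redis: key -> (value, expires_at)
TTL_SECONDS = 300

def fetch_user_from_db(user_id):
    # placeholder for a real SELECT against the database
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    entry = cache.get(user_id)
    if entry and entry[1] > time.time():
        return entry[0]                                  # cache hit
    user = fetch_user_from_db(user_id)                   # cache miss: read DB
    cache[user_id] = (user, time.time() + TTL_SECONDS)   # populate cache
    return user
```

The first call pays the miss penalty and populates the cache; subsequent calls within the TTL are served from memory.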
Cache levels:
- Application-level: In-memory (HashMap, LRU cache)
- Distributed: Redis, Memcached
- HTTP: Browser cache, CDN,
Cache-Controlheaders - Database: Query cache, materialized views
Cache invalidation approaches:
- TTL expiration (simplest)
- Event-driven invalidation (publish on write, subscribers clear cache)
- Version keys (change key when data changes)
Phil Karlton: "There are only two hard things in computer science: cache invalidation and naming things."
Follow-up: How do you handle cache invalidation?
Error handling best practices:
- Use a global error handler — catch unhandled errors in one place
- Distinguish error types:
- Operational errors (expected: invalid input, not found) — handle gracefully
- Programmer errors (bugs: null reference, type errors) — log and restart
- Return consistent error responses:
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Email is required",
    "status": 400
  }
}
- Never expose internal details to clients (stack traces, DB queries, file paths)
- Use appropriate HTTP status codes (400, 401, 403, 404, 422, 500)
Logging best practices:
- Structured logging — use JSON format for machine-readable logs
- Log levels: ERROR (failures), WARN (potential issues), INFO (key events), DEBUG (development)
- Include context: request ID, user ID, timestamp, operation name
- Centralized logging: aggregate logs with ELK stack, Datadog, or similar
- Correlation IDs: trace a request across multiple services
- Never log: passwords, tokens, personal data, credit card numbers
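A minimal structured-logging helper along these lines (field names such as request_id are illustrative):

```python
import json
import sys
import time

def format_record(level, message, **context):
    """Build one JSON object per line: machine-readable, easy to ship to ELK or Datadog."""
    return json.dumps({"ts": time.time(), "level": level,
                       "message": message, **context})

def log_event(level, message, **context):
    sys.stdout.write(format_record(level, message, **context) + "\n")

log_event("INFO", "user created", request_id="req-123", user_id=42)
```

Keeping the formatting in a pure function makes the log shape easy to unit-test; a real setup would route through the logging module with a JSON formatter instead of writing to stdout directly.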
Follow-up: How do you avoid exposing internal error details to the client?
Concurrency means dealing with multiple tasks at the same time — tasks make progress by interleaving (one pauses while another runs). A single CPU can be concurrent.
Parallelism means executing multiple tasks simultaneously on different CPU cores. Requires multiple processors.
Concurrency is about structure; parallelism is about execution.
Handling concurrent requests in a backend:
Thread-based (Java, .NET):
- Each request gets a thread from a thread pool
- Thread blocks while waiting for I/O (DB, HTTP calls)
- Simple mental model, but threads are expensive in memory (~1MB each)
Event loop / async I/O (Node.js, Python asyncio):
- Single-threaded event loop handles many requests
- I/O operations are non-blocking — the loop handles other requests while waiting
- CPU-bound work blocks the loop (offload to worker threads)
// Non-blocking — event loop continues while waiting for DB
async function getUser(id) {
  const user = await db.query('SELECT * FROM users WHERE id = $1', [id]);
  return user;
}
Go goroutines:
- Lightweight coroutines (~2KB) managed by the Go runtime
- True parallelism across CPU cores
- Channels for safe communication between goroutines
Race condition — when the outcome depends on the unpredictable timing of concurrent operations:
// Race condition — two requests read balance=100, both subtract 50
// Final balance is 50 instead of 0
const balance = await getBalance(userId); // both read 100
await setBalance(userId, balance - 50); // both write 50
Prevention strategies:
- Database transactions with row locking: SELECT ... FOR UPDATE
- Optimistic locking: version column, retry on conflict
- Atomic operations: UPDATE accounts SET balance = balance - 50 WHERE balance >= 50
- Distributed locks: Redis SET NX EX for cross-service locks
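The optimistic-locking strategy can be sketched in plain Python; the in-memory accounts dict stands in for a table with a version column, and the store lock simulates the database's atomic conditional UPDATE:

```python
import threading

accounts = {"acct-1": {"balance": 100, "version": 0}}
_store_lock = threading.Lock()  # simulates the DB's atomic compare-and-set

def try_withdraw(acct_id, amount):
    """One optimistic attempt: read, compute, then write only if the version is unchanged."""
    row = dict(accounts[acct_id])                # read a snapshot (no lock held)
    new_balance = row["balance"] - amount
    if new_balance < 0:
        return False                             # insufficient funds
    with _store_lock:                            # like UPDATE ... WHERE version = :read_version
        current = accounts[acct_id]
        if current["version"] != row["version"]:
            return None                          # conflict: someone else wrote first
        accounts[acct_id] = {"balance": new_balance,
                             "version": row["version"] + 1}
        return True

def withdraw(acct_id, amount, retries=5):
    for _ in range(retries):
        result = try_withdraw(acct_id, amount)
        if result is not None:
            return result                        # committed, or rejected on funds
    return False                                 # gave up after repeated conflicts
```

Two concurrent withdrawals can both read balance=100, but only the first write commits; the second sees a version mismatch and retries against the new balance, so the lost-update race above cannot occur.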
Follow-up: What is a race condition and how do you prevent it?
Databases (7)
Main join types:
- INNER JOIN — returns only rows with matching values in both tables
- LEFT JOIN — returns all rows from left table + matching rows from right (NULL if no match)
- RIGHT JOIN — returns all rows from right table + matching rows from left
- FULL OUTER JOIN — returns all rows from both tables (NULL where no match)
- CROSS JOIN — returns the Cartesian product of both tables
-- Get all users with their orders (only users who have orders)
SELECT u.name, o.total
FROM users u
INNER JOIN orders o ON u.id = o.user_id;
-- Get all users, including those without orders
SELECT u.name, o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id;
NULL behavior: NULLs never match in join conditions (NULL = NULL evaluates to NULL, not true, so the row is excluded). Use IS NOT DISTINCT FROM or COALESCE if you need NULL-safe comparisons.
When to use:
- INNER JOIN — when you only want rows that exist in both tables
- LEFT JOIN — when you want all records from the primary table regardless of matches
Follow-up: What happens with NULL values in joins?
An index is a data structure (typically B-tree) that speeds up data retrieval at the cost of additional storage and slower writes.
When to add an index:
- Columns used frequently in WHERE clauses
- Columns used in JOIN conditions
- Columns used in ORDER BY or GROUP BY
- Columns with high cardinality (many unique values)
- Foreign key columns
When to avoid:
- Small tables (full scan is fast enough)
- Columns with low cardinality (e.g., boolean columns)
- Tables with heavy write operations (inserts/updates)
- Columns rarely used in queries
Types:
- B-tree — general purpose, supports range queries (<, >, BETWEEN), ordered
- Hash — exact match only (=), faster for equality lookups, no range support
- Composite — index on multiple columns, order matters (leftmost prefix rule)
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
Trade-off: Indexes speed up reads but slow down writes because the index must be updated on every insert/update/delete.
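The leftmost-prefix rule is easy to observe with SQLite's EXPLAIN QUERY PLAN (a sketch; the exact wording of the plan output varies by SQLite version):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id INTEGER, created_at TEXT, total REAL)")
conn.execute("CREATE INDEX idx_orders_user_date ON orders(user_id, created_at)")

def plan(sql):
    # the last column of each EXPLAIN QUERY PLAN row describes the access path
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Filtering on the leftmost column of the composite index -> index is usable
print(plan("SELECT * FROM orders WHERE user_id = 1"))

# Filtering on the second column alone -> the index cannot be used, full scan
print(plan("SELECT * FROM orders WHERE created_at > '2024-01-01'"))
```

The first plan reports a search using idx_orders_user_date; the second falls back to scanning the table, which is exactly the leftmost-prefix rule in action.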
Follow-up: What is the difference between a B-tree index and a hash index?
SQL (Relational):
- Structured data in tables with rows and columns
- Fixed schema, enforced by the database
- ACID transactions (Atomicity, Consistency, Isolation, Durability)
- Examples: PostgreSQL, MySQL, SQLite
NoSQL (Non-relational):
- Flexible data models: document, key-value, column-family, graph
- Schema-less or flexible schema
- BASE model (Basically Available, Soft state, Eventually consistent)
- Examples: MongoDB (document), Redis (key-value), Cassandra (column), Neo4j (graph)
Choose SQL when:
- Data has clear relationships and structure
- You need complex queries with joins
- ACID compliance is critical (financial data, transactions)
- Data integrity and consistency are top priority
Choose NoSQL when:
- Schema changes frequently or is unpredictable
- Horizontal scaling is needed (distributed systems)
- High write throughput is required
- Data is hierarchical or document-like (e.g., JSON)
Polyglot persistence: Yes, many projects use both — for example, PostgreSQL for user accounts and transactions, Redis for caching and sessions, Elasticsearch for search.
Follow-up: Can you use both SQL and NoSQL in the same project?
A transaction is a sequence of database operations that are treated as a single unit. Either all operations succeed (commit) or none of them apply (rollback). Transactions ensure data integrity even in the presence of failures or concurrent access.
ACID properties:
- Atomicity — all operations in a transaction succeed or none do. No partial updates.
- Consistency — a transaction brings the database from one valid state to another. Constraints and rules are always satisfied.
- Isolation — concurrent transactions don't interfere with each other. Each transaction sees a consistent view of the data.
- Durability — once committed, the transaction persists even if the system crashes (written to disk).
Example:
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
-- If either update fails, both are rolled back
Isolation levels (weakest to strongest):
| Level | Dirty Read | Non-Repeatable Read | Phantom Read |
|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible |
| Read Committed | Prevented | Possible | Possible |
| Repeatable Read | Prevented | Prevented | Possible |
| Serializable | Prevented | Prevented | Prevented |
- Dirty read — reading uncommitted data from another transaction
- Non-repeatable read — same row returns different values in the same transaction
- Phantom read — a query returns different rows if run twice (another transaction inserted/deleted)
Most databases default to Read Committed. PostgreSQL's default is Read Committed; MySQL InnoDB uses Repeatable Read.
Follow-up: What is an isolation level and what problems does each level prevent?
The N+1 query problem occurs when an application fetches N records and then executes one additional query for each record, resulting in N+1 total queries instead of the expected 1 or 2.
Example — N+1 problem:
// 1 query to get all posts
const posts = await Post.findAll();
// N queries — one per post to get its author
for (const post of posts) {
  const author = await User.findById(post.authorId); // N queries!
  console.log(post.title, author.name);
}
// Total: 1 + N queries
How to detect it:
- Enable SQL query logging in development and look for repeating patterns
- Use ORM tools like Django Debug Toolbar, Bullet gem (Rails), or Hibernate statistics
- Query count spikes when list size grows
Fix 1 — JOIN query:
SELECT posts.*, users.name AS author_name
FROM posts
JOIN users ON users.id = posts.author_id;
Fix 2 — Eager loading (ORM):
// Prisma — include related data in one query
const posts = await prisma.post.findMany({
  include: { author: true },
});
// Sequelize
const posts = await Post.findAll({ include: [User] });
Fix 3 — DataLoader pattern (batching):
Used in GraphQL — collect all IDs, then fetch in a single query:
const userLoader = new DataLoader(async (ids) => {
  const users = await User.findMany({ where: { id: { in: ids } } });
  return ids.map(id => users.find(u => u.id === id));
});
The N+1 problem is one of the most common causes of poor API performance in applications using ORMs.
Follow-up: How does eager loading solve the N+1 problem?
A database migration is a version-controlled script that describes a change to the database schema (adding a table, renaming a column, adding an index) or to the data itself. Migrations allow schema evolution to be tracked in version control and applied consistently across all environments.
Why migrations matter:
- Schema changes are reproducible and reversible
- All environments (dev, staging, prod) stay in sync
- Team members share the same schema history
- Rollback is possible if something goes wrong
Migration file example (SQL):
-- migration: 20260401_add_email_verified_to_users.sql
ALTER TABLE users ADD COLUMN email_verified BOOLEAN NOT NULL DEFAULT FALSE;
CREATE INDEX idx_users_email_verified ON users(email_verified);
Tools: Flyway, Liquibase, Prisma Migrate, Alembic (Python), Rails ActiveRecord Migrations.
Safe production practices:
- Never modify migrations that have already been applied — create a new one
- Test migrations against a production-size data clone before applying
- Always have a rollback migration ready
- Use transactions for DDL operations where the database supports it
Zero-downtime migrations are needed when tables are large and the migration would lock the table for a long time (e.g., adding a non-nullable column, building an index).
Strategy — expand/contract (also called parallel change):
- Expand — add new column as nullable, deploy app that writes to both old and new column
- Backfill — populate existing rows in batches
- Contract — make column non-nullable, remove old column, deploy app that only uses new column
For index creation, use CREATE INDEX CONCURRENTLY in PostgreSQL to build without locking.
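The backfill step of expand/contract can be sketched as a batched loop (SQLite here as a stand-in; the table and column names are illustrative). Small batches keep each transaction, and therefore each lock, short:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, email_verified INTEGER)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [(f"u{i}@example.com",) for i in range(1000)])

BATCH = 100  # tune so each batch commits in well under a second

while True:
    cur = conn.execute(
        """UPDATE users SET email_verified = 0
           WHERE id IN (SELECT id FROM users
                        WHERE email_verified IS NULL LIMIT ?)""",
        (BATCH,),
    )
    conn.commit()
    if cur.rowcount == 0:  # nothing left to backfill
        break
```

In production you would also sleep briefly between batches and monitor replication lag, but the core idea is the same: many small writes instead of one table-locking UPDATE.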
Follow-up: What is a zero-downtime migration and when do you need one?
A deadlock occurs when two or more transactions are each waiting for a lock held by the other, creating a cycle where none can proceed.
Classic example:
Transaction A: Transaction B:
LOCK row 1 (success) LOCK row 2 (success)
Wait for lock on row 2... Wait for lock on row 1...
← DEADLOCK →
-- Transaction A
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1; -- locks row 1
UPDATE accounts SET balance = balance + 100 WHERE id = 2; -- waits for row 2
-- Transaction B (concurrent)
BEGIN;
UPDATE accounts SET balance = balance - 50 WHERE id = 2; -- locks row 2
UPDATE accounts SET balance = balance + 50 WHERE id = 1; -- waits for row 1 → DEADLOCK
How databases resolve deadlocks: The database periodically checks for wait-for cycles. When detected, it picks a victim transaction (usually the one that has done the least work) and rolls it back, allowing the other to proceed. The application must retry the rolled-back transaction.
Prevention strategies:
- Consistent lock ordering — always acquire locks in the same order across all transactions (e.g., always lock the lower ID first)
- Keep transactions short — minimize the time locks are held
- Use lower isolation levels when strict isolation is not needed
- Optimistic concurrency — don't lock at read time, check for conflicts at write time
- Avoid user input during transactions — never hold a lock while waiting for user response
- SELECT FOR UPDATE SKIP LOCKED — skip locked rows instead of waiting, useful for job queues
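Consistent lock ordering can be demonstrated with plain threading locks; this is an application-level sketch of the same idea the database applies to rows:

```python
import threading

locks = {1: threading.Lock(), 2: threading.Lock()}
balances = {1: 100, 2: 100}

def transfer(src, dst, amount):
    # Always acquire the lower-ID lock first, regardless of transfer direction.
    # Without this ordering, opposite-direction transfers could each grab one
    # lock and wait forever on the other — the classic deadlock cycle.
    first, second = sorted((src, dst))
    with locks[first], locks[second]:
        balances[src] -= amount
        balances[dst] += amount

# Opposite-direction transfers now run safely in parallel
t1 = threading.Thread(target=transfer, args=(1, 2, 30))
t2 = threading.Thread(target=transfer, args=(2, 1, 10))
t1.start(); t2.start()
t1.join(); t2.join()
print(balances)  # {1: 80, 2: 120} — money conserved, no deadlock
```

The same discipline applies to SQL: if every transaction updates account rows in ascending ID order, the wait-for graph can never form a cycle.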
Follow-up: How does a database detect and resolve a deadlock?
API Design (3)
REST principles:
- Resources identified by URLs (/users, /users/1)
- HTTP methods map to operations: GET (read), POST (create), PUT (full update), PATCH (partial update), DELETE (remove)
- Stateless — each request contains all information needed
- Proper status codes — 200, 201, 204, 400, 401, 403, 404, 500
Good API design:
GET /api/users — list users
GET /api/users/1 — get user by ID
POST /api/users — create user
PUT /api/users/1 — replace user
PATCH /api/users/1 — update user fields
DELETE /api/users/1 — delete user
GET /api/users/1/orders — nested resources
Best practices:
- Use plural nouns for resources (/users, not /user)
- Use query parameters for filtering (/users?role=admin)
- Paginate list endpoints (?page=1&limit=20)
- Return consistent error format with message and error code
- Use HATEOAS links for discoverability (optional)
Versioning strategies: URL path (/api/v2/users), header (Accept: application/vnd.api.v2+json), or query parameter (?version=2). URL path is most common.
Follow-up: How do you handle API versioning?
Authentication (AuthN) — verifying who you are. Authorization (AuthZ) — verifying what you can do.
| Aspect | Authentication | Authorization |
|---|---|---|
| Question | Who are you? | What can you access? |
| Example | Login with password | Admin vs. regular user |
| Happens | First | After authentication |
| Methods | Password, OAuth, biometrics | Roles, permissions, ACLs |
JWT-based authentication flow:
- User sends credentials (email + password) to /auth/login
- Server validates credentials, generates a JWT containing user ID and roles
- Server returns the JWT to the client
- Client stores the JWT (typically in memory or httpOnly cookie)
- Client sends JWT in Authorization: Bearer <token> header with each request
- Server verifies the JWT signature and extracts user info
JWT structure: header.payload.signature (Base64-encoded)
- Header: algorithm and token type
- Payload: claims (user ID, roles, expiration)
- Signature: ensures the token has not been tampered with
Security considerations: Use short expiration times, refresh tokens for long sessions, httpOnly cookies to prevent XSS.
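The header.payload.signature structure can be reproduced with only the Python standard library. This is a teaching sketch of HS256 signing (it skips claim validation such as expiration; use a vetted library like PyJWT in production):

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # JWT uses URL-safe base64 with padding stripped
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(payload: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (b64url(json.dumps(header).encode()) + "." +
                     b64url(json.dumps(payload).encode()))
    sig = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)

def verify_jwt(token: str, secret: bytes) -> bool:
    signing_input, _, sig = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    # constant-time comparison prevents timing attacks on the signature
    return hmac.compare_digest(b64url(expected), sig)

token = sign_jwt({"sub": 42, "role": "admin"}, b"server-secret")
assert verify_jwt(token, b"server-secret")
assert not verify_jwt(token, b"wrong-secret")
```

Note that the payload is only base64-encoded, not encrypted: anyone holding the token can read the claims, and only the signature stops tampering.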
Follow-up: Describe how JWT-based authentication works.
REST is an architectural style that uses HTTP methods and resource-oriented URLs. Each resource has a fixed response shape.
GraphQL is a query language for APIs that lets clients specify exactly what data they need in a single request.
Key differences:
| Aspect | REST | GraphQL |
|---|---|---|
| Endpoints | Multiple (/users, /posts) | Single (/graphql) |
| Response shape | Fixed by server | Defined by client |
| Over/under-fetching | Common | Solved |
| Versioning | URL or headers | Schema evolution (deprecation) |
| Type system | OpenAPI (optional) | Built-in, enforced |
| Caching | HTTP cache (CDN-friendly) | Complex (POST requests) |
| Learning curve | Low | Higher |
GraphQL example:
query {
  user(id: 42) {
    name
    email
    posts(limit: 5) {
      title
      publishedAt
    }
  }
}
This fetches exactly what is needed in one request — no over-fetching (extra fields) or under-fetching (multiple requests).
Disadvantages of GraphQL:
- Caching is hard — queries are POST requests; HTTP caching doesn't work without extra tooling (persisted queries)
- N+1 problem — without DataLoader, deeply nested queries trigger many DB queries
- Schema complexity — requires maintaining a typed schema and resolvers
- Security — clients can craft expensive queries; need query depth/complexity limits
- File uploads — not natively supported, requires workarounds
- Overkill for simple APIs — REST is simpler when you control both client and server
When to prefer GraphQL: Multiple client types (web, mobile, third-party) with different data needs; data-heavy apps with complex, nested relationships.
Follow-up: What are the main disadvantages of GraphQL compared to REST?
DevOps (3)
Docker is a platform for building, running, and shipping applications in containers — lightweight, isolated environments that package code with all its dependencies.
Container vs Virtual Machine:
| Aspect | Container | Virtual Machine |
|---|---|---|
| Isolation | Process-level (shared kernel) | Full OS (own kernel) |
| Size | Megabytes | Gigabytes |
| Startup | Seconds | Minutes |
| Overhead | Minimal | Significant |
| Portability | Very high | High |
Image vs Container:
- Image — a read-only template (blueprint) with the app, dependencies, and config
- Container — a running instance of an image (like an object from a class)
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --production
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]
Key concepts:
- Dockerfile — instructions to build an image
- docker-compose — define and run multi-container applications
- Layers — each instruction creates a cached layer (speeds up rebuilds)
- Volumes — persist data outside the container lifecycle
Follow-up: Explain the difference between a Docker image and a container.
CI (Continuous Integration) is the practice of frequently merging code changes into a shared branch and automatically running tests and quality checks on each merge.
CD (Continuous Delivery/Deployment) extends CI by automatically preparing and optionally deploying the application to production after tests pass.
Typical backend CI/CD pipeline:
Code Push → Pull Request
↓
[CI Pipeline]
1. Install dependencies
2. Lint & type check
3. Unit tests
4. Integration tests
5. Build Docker image
6. Security scan (Trivy, Snyk)
↓
PR Approved → Merge to main
↓
[CD Pipeline]
7. Build & tag production image
8. Deploy to staging
9. Run E2E tests against staging
10. Deploy to production (canary → full)
11. Run smoke tests
12. Notify team
Continuous Delivery vs Continuous Deployment:
- Continuous Delivery — every commit is releasable, but a human approves the production deployment. The pipeline automates everything up to staging.
- Continuous Deployment — every commit that passes all tests is automatically deployed to production. No human gate.
Best practices:
- Keep pipelines fast (< 10 minutes) — developers wait for feedback
- Fail fast — run the cheapest checks first
- Deploy the same artifact across all environments (build once, deploy many)
- Use feature flags to decouple deployment from feature release
- Maintain deployment rollback capability
Popular tools: GitHub Actions, GitLab CI, CircleCI, Jenkins, ArgoCD (GitOps).
Follow-up: What is the difference between continuous delivery and continuous deployment?
Vertical scaling (scale up) means adding more resources (CPU, RAM, disk) to an existing server.
Horizontal scaling (scale out) means adding more server instances and distributing load across them using a load balancer.
Comparison:
| Aspect | Vertical | Horizontal |
|---|---|---|
| How | Bigger machine | More machines |
| Cost | Exponentially expensive at high end | Linear cost |
| Downtime | Often requires restart | Zero-downtime with rolling deploys |
| Limit | Hardware maximum | Theoretically unlimited |
| Complexity | Simple | Requires stateless design, load balancing |
| Single point of failure | Yes | No |
When to choose vertical:
- Stateful applications that are hard to distribute (legacy monoliths)
- Database servers (easier to manage, ACID on one node)
- Quick fix for immediate capacity needs
- Cost is still roughly linear at lower tiers (cloud instance upgrades)
When to choose horizontal:
- Stateless services (web servers, APIs) — trivial to distribute
- Need fault tolerance (one instance failure doesn't affect availability)
- Traffic patterns have large peaks (auto-scale up and back down)
- Cost optimization at scale
Challenges of horizontal scaling:
- Session management — sessions can't live in server memory; use Redis or JWTs
- Distributed state — caches, locks, counters must be shared (Redis, ZooKeeper)
- Data consistency — multiple writers to a distributed database
- Sticky sessions — some protocols require the same server for a session
- Network overhead — inter-service communication adds latency
- Observability — logs and metrics from many instances need aggregation
Follow-up: What challenges does horizontal scaling introduce?
Security (4)
SQL injection is an attack where malicious SQL code is inserted into an input field and executed by the database. It is consistently in the OWASP Top 10 most critical web application vulnerabilities.
Example of vulnerable code:
# DANGEROUS — never do this
query = f"SELECT * FROM users WHERE email = '{user_input}'"
# Attacker input: ' OR '1'='1
# Resulting query: SELECT * FROM users WHERE email = '' OR '1'='1'
# Returns all rows!
Prevention — use parameterized queries:
# SAFE — parameterized query
cursor.execute("SELECT * FROM users WHERE email = ?", (user_input,))
# Node.js / PostgreSQL
const result = await pool.query(
  'SELECT * FROM users WHERE email = $1',
  [userInput]
);
Additional defenses:
- ORM/query builders — most ORMs (Prisma, SQLAlchemy, Hibernate) use parameterized queries by default
- Input validation — validate and sanitize all user input before processing
- Least privilege — database users should have only the permissions they need
- Web Application Firewall (WAF) — as a last line of defense, not a primary control
- Stored procedures — can help but are not foolproof if they concatenate strings internally
Prepared statements and parameterized queries are equivalent: both separate SQL code from data so user input is always treated as a value, never executable code.
Follow-up: What is the difference between prepared statements and parameterized queries?
CORS (Cross-Origin Resource Sharing) is a browser security mechanism that restricts HTTP requests made from a different origin (protocol + domain + port) than the server's origin. It prevents malicious websites from making authenticated requests to your API using the visitor's credentials.
How it works:
- Browser sends an HTTP request with an
Originheader - Server responds with
Access-Control-Allow-Originheader - Browser allows or blocks the response based on the header
Configuring CORS in Node.js (Express):
import cors from 'cors';
app.use(cors({
  origin: ['https://myapp.com', 'https://staging.myapp.com'],
  methods: ['GET', 'POST', 'PUT', 'DELETE', 'PATCH'],
  allowedHeaders: ['Content-Type', 'Authorization'],
  credentials: true,
  maxAge: 86400,
}));
Common mistakes:
- Using origin: '*' with credentials: true — browsers block this combination
- Allowing all origins in production — exposes the API to any website
- Forgetting to allow custom headers like Authorization
Preflight requests: The browser automatically sends an OPTIONS request before cross-origin requests that use non-simple methods (PUT, DELETE, PATCH) or custom headers. The server must respond with appropriate CORS headers for the actual request to proceed. Set maxAge to cache the preflight response and reduce OPTIONS requests.
Follow-up: What is a preflight request and when does the browser send one?
The OWASP Top 10 is a standard list of the most critical web application security risks, updated periodically by the Open Web Application Security Project.
Key vulnerabilities:
1. Broken Access Control (A01)
Users can access resources or perform actions beyond their permissions. Example: a regular user accessing /admin/users or modifying another user's data by changing the user_id in a request.
Prevention: enforce authorization on every endpoint, validate ownership of resources server-side, deny by default.
2. Cryptographic Failures (A02)
Sensitive data (passwords, credit cards, health data) exposed due to weak encryption or no encryption. Example: storing passwords in plaintext or using MD5 for password hashing.
Prevention: hash passwords with bcrypt/Argon2, use TLS everywhere, never store sensitive data you don't need.
3. Injection (A03)
Untrusted data sent to an interpreter as part of a command. SQL injection is the classic example; others include command injection, LDAP injection, and NoSQL injection.
Prevention: parameterized queries, input validation, allowlists.
4. Security Misconfiguration (A05)
Default credentials, unnecessary features enabled, verbose error messages, missing security headers.
Prevention: security hardening checklists, infrastructure-as-code, regular audits.
5. Identification and Authentication Failures (A07)
Weak passwords allowed, no rate limiting on login, sessions not invalidated on logout.
Prevention: MFA, rate limiting, secure session management, strong password policies.
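The broken-access-control prevention above (validate ownership server-side, deny by default) can be sketched as a small framework-agnostic guard; the ORDERS dict and role names are illustrative:

```python
class Forbidden(Exception):
    """Raised when the current user may not access the resource."""

ORDERS = {7: {"id": 7, "owner_id": 1, "total": 99.0}}  # stand-in for a DB table

def get_order(order_id: int, current_user: dict) -> dict:
    order = ORDERS.get(order_id)
    if order is None:
        # Deny by default; whether to return 404 or 403 here is a policy
        # choice, but either way don't leak which IDs exist.
        raise Forbidden()
    # Server-side ownership check: never trust an owner_id sent by the client
    if order["owner_id"] != current_user["id"] and current_user.get("role") != "admin":
        raise Forbidden()
    return order
```

Changing the user_id in the request no longer helps an attacker: the ownership comparison happens against the authenticated identity, not request data.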
Follow-up: How would you protect against broken access control?
Never store passwords in plaintext. If your database is compromised, plaintext passwords expose users on every service where they reused that password.
Correct approach — use a password hashing algorithm:
Password hashing algorithms are specifically designed to be slow (computationally expensive), making brute-force attacks impractical.
Recommended algorithms (in order of preference):
- Argon2id — winner of the Password Hashing Competition (2015), best current choice
- bcrypt — widely supported, battle-tested since 1999
- scrypt — memory-hard, good alternative to bcrypt
Implementation with bcrypt (Node.js):
import bcrypt from 'bcrypt';
const SALT_ROUNDS = 12;
async function hashPassword(plaintext) {
  return bcrypt.hash(plaintext, SALT_ROUNDS);
}

async function verifyPassword(plaintext, hash) {
  return bcrypt.compare(plaintext, hash);
}
// Usage
const hash = await hashPassword('myP@ssw0rd');
await verifyPassword('myP@ssw0rd', hash); // true
await verifyPassword('wrongpassword', hash); // false
Why MD5/SHA-256 are unsuitable for passwords:
- They are designed to be fast — a GPU can compute billions of SHA-256 hashes per second
- No built-in salting — identical passwords produce identical hashes, enabling rainbow table attacks
- bcrypt with cost factor 12 takes ~250ms; SHA-256 takes nanoseconds
Additional best practices:
- Use a unique random salt per password (bcrypt/Argon2 do this automatically)
- Set a minimum password length (12+ characters)
- Check passwords against breach databases (HaveIBeenPwned API)
- Never log passwords, even hashed ones
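Python's standard library ships hashlib.scrypt, one of the recommended memory-hard algorithms, so a salted-hash sketch needs no third-party code (the cost parameters shown are common defaults; check current guidance before fixing them in production):

```python
import hashlib
import hmac
import os

def hash_password(plaintext: str) -> bytes:
    salt = os.urandom(16)  # unique random salt per password
    digest = hashlib.scrypt(plaintext.encode(), salt=salt, n=2**14, r=8, p=1)
    return salt + digest   # store the salt alongside the hash

def verify_password(plaintext: str, stored: bytes) -> bool:
    salt, digest = stored[:16], stored[16:]
    candidate = hashlib.scrypt(plaintext.encode(), salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison

stored = hash_password("myP@ssw0rd")
assert verify_password("myP@ssw0rd", stored)
assert not verify_password("wrongpassword", stored)
```

With bcrypt or Argon2 libraries the salt handling is done for you; the manual salt-prefix scheme here just makes the mechanism visible.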
Follow-up: Why is MD5 or SHA-256 not suitable for password hashing?
Caching (3)
Redis is an in-memory data structure store that can be used as a database, cache, and message broker. Because all data lives in RAM, read and write operations are typically completed in under a millisecond.
Core data structures:
- String — simple key-value, counters (INCR, DECR)
- Hash — field-value pairs within a key (user objects)
- List — ordered sequences, stacks and queues
- Set — unique unordered members (tags, unique visitors)
- Sorted Set — members with scores, leaderboards
- Stream — append-only log for event sourcing
Common use cases:
# Session storage
SET session:abc123 '{"userId":1,"role":"admin"}' EX 3600
# Rate limiting (sliding window counter)
INCR rate:user:42
EXPIRE rate:user:42 60
# Leaderboard
ZADD leaderboard 1500 "player:42"
ZRANGE leaderboard 0 9 REV WITHSCORES
# Pub/sub for real-time notifications
PUBLISH notifications '{"type":"order_shipped","orderId":99}'
Persistence modes:
- RDB (Redis Database) — periodic snapshots to disk. Compact, fast restarts, but may lose data since last snapshot.
- AOF (Append-Only File) — logs every write operation. More durable (can configure fsync per second or always), larger files, slower restarts.
- No persistence — pure cache mode, fastest, all data lost on restart.
For production caches choose RDB or no persistence. For durable storage combine both modes.
Follow-up: What is the difference between Redis persistence modes RDB and AOF?
Rate limiting controls how many requests a client can make in a given time window. It protects the API from abuse, brute-force attacks, and accidental overload.
Common algorithms:
Fixed Window:
- Count requests in fixed time buckets (e.g., 100 requests per minute)
- Simple and memory-efficient
- Weakness: burst traffic at window boundaries (200 requests in 2 seconds spanning two windows)
Sliding Window:
- Count requests in a rolling time window from each request's perspective
- More accurate, prevents boundary bursts
- Slightly more complex to implement
Token Bucket:
- Tokens are added at a constant rate, each request consumes a token
- Allows short bursts up to bucket capacity
- Good for APIs that want to allow occasional bursts
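The token bucket algorithm can be sketched in a few lines of plain Python. The `TokenBucket` class below is illustrative, not from any library:

```python
import time

class TokenBucket:
    """Toy token bucket: refills at `rate` tokens/sec, up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start full
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)
# A burst of 10 requests drains the bucket; the 11th is rejected until refill
results = [bucket.allow() for _ in range(11)]
```

Note how `capacity` bounds the burst while `rate` bounds the sustained throughput — the two knobs the bullet points above describe.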
Implementation with Redis (fixed window counter):
async function isRateLimited(userId, limit = 100, windowSec = 60) {
const key = `rate:${userId}:${Math.floor(Date.now() / (windowSec * 1000))}`;
const count = await redis.incr(key);
if (count === 1) {
await redis.expire(key, windowSec);
}
return count > limit;
}
Response headers to include:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1700000060
Retry-After: 30 (when 429 is returned)
Return HTTP 429 Too Many Requests when the limit is exceeded. Rate limit by IP for unauthenticated endpoints and by user ID for authenticated ones.
Follow-up: What is the difference between the fixed window and sliding window algorithms?
A CDN (Content Delivery Network) is a geographically distributed network of servers (edge nodes) that cache and serve content from the location closest to the user, reducing latency and offloading traffic from the origin server.
How it helps:
- Reduced latency — user fetches assets from a nearby edge node instead of a distant origin server
- Reduced origin load — cached responses are served by the CDN, not your server
- DDoS protection — CDNs can absorb large traffic spikes and filter malicious traffic
- Automatic compression — Gzip/Brotli compression without origin server work
- TLS termination — CDN handles HTTPS handshakes, reducing origin CPU load
What to cache on a CDN:
- Static assets: images, CSS, JavaScript bundles, fonts
- Publicly accessible API responses that change infrequently
- Server-rendered HTML pages with appropriate Cache-Control headers
Cache control headers:
Cache-Control: public, max-age=31536000, immutable # Static assets with hash
Cache-Control: public, max-age=300, stale-while-revalidate=60 # API responses
Cache-Control: no-store # Sensitive/private data
NOT suitable for CDN caching:
- Authenticated or personalized content (user dashboards, account pages)
- Real-time data that must be fresh (live stock prices, chat messages)
- POST/PUT/DELETE requests — CDNs only cache GET/HEAD by default
- Responses with Set-Cookie headers — the CDN may strip cookies
Popular CDNs include Cloudflare, AWS CloudFront, Fastly, and Akamai.
Follow-up: What types of content are NOT suitable for CDN caching?
Messaging (3)
A message queue is a form of asynchronous communication between services where a producer sends a message to a queue and a consumer reads and processes it independently. The producer does not wait for the consumer to finish.
When to use a message queue instead of a direct API call:
- Decoupling — producer and consumer don't need to be running at the same time
- Absorbing traffic spikes — queue buffers bursts so consumers process at a steady rate
- Reliability — if the consumer is down, messages wait in the queue rather than being lost
- Long-running tasks — image processing, email sending, PDF generation — don't block HTTP responses
- Fan-out — one event needs to trigger multiple independent consumers (order placed → send email + update inventory + notify warehouse)
Real-world example:
User places order → API returns 200 immediately
↓
[Order Queue]
↓
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Email Worker │ │Inventory Svc │ │Warehouse Svc │
└──────────────┘ └──────────────┘ └──────────────┘
Delivery guarantees:
- At-most-once — message delivered 0 or 1 times, possible loss (fire and forget)
- At-least-once — message delivered 1 or more times, possible duplicates (most queues)
- Exactly-once — delivered exactly once, hardest to achieve, requires idempotency
Popular tools: RabbitMQ, AWS SQS, Apache Kafka, Google Pub/Sub, Redis Streams.
Always design consumers to be idempotent — processing the same message twice should produce the same result as processing it once.
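Idempotency can be as simple as tracking processed message IDs. A minimal in-memory sketch — a real consumer would persist the seen-ID set in a database or Redis:

```python
processed_ids = set()        # in production: a DB table or Redis set
inventory = {"sku-1": 10}    # toy state mutated by the consumer

def handle_order_shipped(message):
    """Safe under at-least-once delivery: duplicate messages are no-ops."""
    if message["id"] in processed_ids:
        return "skipped"      # duplicate delivery — already handled
    inventory[message["sku"]] -= message["qty"]
    processed_ids.add(message["id"])
    return "processed"

msg = {"id": "m-1", "sku": "sku-1", "qty": 2}
first = handle_order_shipped(msg)    # decrements stock
second = handle_order_shipped(msg)   # duplicate: no double decrement
```

Processing the same message twice leaves the inventory exactly as if it had been processed once — which is the idempotency guarantee the consumer needs.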
Follow-up: What guarantees does a message queue provide around message delivery?
Apache Kafka is a distributed event streaming platform designed for high-throughput, fault-tolerant, real-time data pipelines. It is fundamentally different from traditional message queues like RabbitMQ.
Key differences:
| Aspect | Traditional Queue (RabbitMQ) | Kafka |
|---|---|---|
| Model | Push to consumer | Pull by consumer |
| Message retention | Deleted after consumption | Retained for configurable period |
| Ordering | Per-queue | Per-partition |
| Replay | Not supported | Supported (seek to any offset) |
| Throughput | Millions/day | Millions/second |
| Consumer | Competes for messages | Each consumer group gets all messages |
Core concepts:
- Topic — a named stream of messages (like a table in a DB)
- Partition — topics are split into partitions for parallelism. Messages within a partition are ordered.
- Offset — sequential ID for each message in a partition. Consumers track their position.
- Producer — writes messages to topics
- Consumer — reads messages from topics
- Broker — a Kafka server. A cluster has multiple brokers for redundancy.
Consumer groups: A consumer group is a set of consumers that cooperate to consume a topic. Kafka distributes partitions across consumers in the group — each partition is consumed by exactly one consumer in the group at a time. Multiple groups can read the same topic independently, enabling fan-out.
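The partition-to-consumer mapping can be illustrated with a toy round-robin assignor. This shows the idea only — Kafka's actual assignment strategies (range, round-robin, sticky) live in the client library:

```python
def assign_partitions(partitions, consumers):
    """Round-robin: each partition goes to exactly one consumer in the group."""
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment

# A topic with 6 partitions consumed by a group of 2
groups = assign_partitions(list(range(6)), ["consumer-a", "consumer-b"])
```

With more consumers than partitions, the extra consumers sit idle — which is why partition count caps a group's parallelism.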
When to choose Kafka over a queue:
- You need event replay or audit logs
- Multiple independent systems need the same events
- Very high throughput (millions of events per second)
- Event sourcing or CQRS architecture
Follow-up: What is a consumer group in Kafka?
Synchronous communication means the caller waits for the response before continuing. HTTP REST and gRPC calls are typical examples. The caller is blocked until the callee responds.
Asynchronous communication means the caller sends a message and continues without waiting. The response (if any) arrives later. Message queues and event buses are common mechanisms.
Comparison:
| Aspect | Synchronous | Asynchronous |
|---|---|---|
| Coupling | Tight (caller needs callee available) | Loose (independent availability) |
| Latency | Response time = callee processing time | Near-instant acknowledgment |
| Complexity | Simpler mental model | More complex (ordering, retries, idempotency) |
| Failure handling | Easy — callee returns an error | Harder — dead letter queues, retries |
| Use case | Queries, real-time results needed | Background work, fan-out, high throughput |
Event-driven architecture trade-offs:
Advantages:
- Services are decoupled and independently deployable
- Natural scalability — add more consumers to handle load
- Resilience — services can tolerate each other's downtime
- Easy audit trail if events are stored durably
Disadvantages:
- Eventual consistency — data may be temporarily inconsistent across services
- Debugging is harder — tracing a request across multiple async steps requires distributed tracing
- Ordering guarantees are complex in distributed systems
- At-least-once delivery requires idempotent consumers
- Increased infrastructure complexity (queue management, dead letter handling)
Follow-up: What are the trade-offs of event-driven architecture?
Monitoring (3)
Monitoring tells you whether a system is working — it answers predefined questions using dashboards and alerts. You know what to look for ahead of time.
Observability is the ability to understand the internal state of a system from its external outputs — it answers unknown questions. A system is observable if you can diagnose new problems without deploying new instrumentation.
The three pillars of observability:
1. Metrics — numerical measurements over time. Aggregated, low-cardinality data ideal for dashboards and alerting.
http_request_duration_seconds{method="GET", route="/users", status="200"} 0.045
error_rate 0.02
queue_depth 142
Tools: Prometheus, Datadog, CloudWatch
2. Logs — immutable, timestamped records of events. High detail but high volume. Best for debugging specific requests.
{"level":"error","time":"2026-04-01T10:00:00Z","requestId":"abc-123","userId":42,"msg":"Payment failed","reason":"card_declined"}
Tools: ELK Stack, Loki, Datadog Logs
3. Traces — end-to-end records of a request's journey across services. Each step is a span with timing, metadata, and parent-child relationships.
[request abc-123]
├── API Gateway 5ms
├── Auth Service 12ms
├── Order Service 38ms
│ ├── DB query 25ms
│ └── Redis get 3ms
└── Payment Service 95ms ← bottleneck
Tools: Jaeger, Zipkin, Datadog APM, OpenTelemetry
The key enabling standard is OpenTelemetry — a vendor-neutral SDK for emitting all three signals from your application code.
Follow-up: What are the three pillars of observability?
SLO (Service Level Objective) is a target value for a service reliability metric. It defines what "good enough" means for your service from the user's perspective.
Related terms:
- SLI (Service Level Indicator) — the actual measurement (e.g., % of requests under 200ms)
- SLO — the target for that measurement (e.g., 99.5% of requests under 200ms)
- SLA (Service Level Agreement) — a contractual commitment to customers with consequences for violation
Common SLIs and SLOs:
| SLI | Example SLO |
|---|---|
| Availability | 99.9% of requests return non-5xx responses |
| Latency | 95% of requests complete in < 200ms, 99% < 1s |
| Error rate | < 0.1% of requests result in errors |
| Throughput | Process > 1000 events per second |
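Computing a latency SLI from raw request durations is simple arithmetic. A sketch with synthetic numbers — `latency_sli` is a hypothetical helper, not a standard API:

```python
def latency_sli(durations_ms, threshold_ms):
    """Fraction of requests completing under the latency threshold."""
    fast = sum(1 for d in durations_ms if d < threshold_ms)
    return fast / len(durations_ms)

# 100 synthetic requests: 95 fast, 5 slow
durations = [50] * 95 + [500] * 5
sli = latency_sli(durations, threshold_ms=200)  # 0.95 — meets a 95% SLO
```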
Defining a good SLO:
- Start from user experience — what degradation do users actually notice?
- Look at historical data — what have you actually achieved?
- Set a target slightly below your historical best to leave headroom
- Measure at the user-facing boundary, not internal components
Error budget: If your SLO is 99.9% availability, you have 0.1% allowed downtime — about 43 minutes per month. This is your error budget.
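The 43-minute figure follows from simple arithmetic; a small helper (a month is approximated as 30 days):

```python
def error_budget_minutes(slo_percent, days=30):
    """Allowed downtime per period for a given availability SLO."""
    total_minutes = days * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (1 - slo_percent / 100)

budget = error_budget_minutes(99.9)  # roughly 43 minutes per month
```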
Teams use error budgets to balance reliability and velocity:
- Budget remaining → deploy freely, experiment, take risks
- Budget nearly exhausted → freeze deployments, focus on reliability
- Budget exhausted → postmortem required, no new features until reliability improves
This framework (from Google's SRE book) gives engineering teams a rational, data-driven way to make reliability decisions.
Follow-up: What is an error budget and how do teams use it?
Structured logging means emitting logs as machine-readable key-value pairs (typically JSON) rather than unstructured text strings.
Unstructured log (hard to query):
[2026-04-01 10:23:15] ERROR Failed to process payment for user 42, order 99, reason: card_declined
Structured log (queryable, filterable):
{
"timestamp": "2026-04-01T10:23:15Z",
"level": "error",
"message": "Payment processing failed",
"userId": 42,
"orderId": 99,
"reason": "card_declined",
"service": "payment-service",
"requestId": "req-abc-123",
"durationMs": 145
}
Why structured logging matters:
- Log aggregation platforms (Datadog, Splunk, Loki) can index and query fields
- You can filter level:error AND service:payment-service instantly
- Metrics can be derived from logs (error rate, p99 latency)
- Consistent format makes automated alerting reliable
Correlation IDs in distributed systems:
When a user request flows through multiple microservices (API gateway → order service → payment service → notification service), each service generates its own logs. Without a shared identifier, it is impossible to stitch together the full trace of a single request.
A correlation ID (also called request ID or trace ID) is a unique identifier generated at the entry point and propagated through every downstream call via HTTP headers:
X-Request-ID: req-abc-123
Each service includes this ID in every log line. When debugging an issue, you filter all logs by the correlation ID and instantly see the complete request journey across all services — timestamps, durations, errors — without guessing which log entries belong together.
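A minimal sketch of propagating and logging a correlation ID. The function names and the fallback ID prefix are illustrative — a real service would read the header in middleware and emit logs through a logging library:

```python
import json
import uuid

def get_request_id(headers):
    """Reuse the incoming X-Request-ID, or generate one at the entry point."""
    return headers.get("X-Request-ID") or f"req-{uuid.uuid4()}"

def log_event(level, message, request_id, **fields):
    """Emit one structured log line carrying the correlation ID."""
    return json.dumps({"level": level, "message": message,
                       "requestId": request_id, **fields})

rid = get_request_id({"X-Request-ID": "req-abc-123"})
line = log_event("error", "Payment failed", rid, userId=42)
```

Every downstream call forwards the same ID in its own X-Request-ID header, so each service's log lines share the field you later filter on.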
Follow-up: How do correlation IDs help with debugging across microservices?