Scaling NestJS APIs with Caching, Queues, and Connection Pooling
Learn architectural patterns to optimize NestJS API performance, scale throughput, and reduce PostgreSQL database latency under heavy request loads.
As your backend applications grow, simple monolithic setups can quickly run into scaling bottlenecks. In this deep dive, we'll cover key performance patterns to supercharge your NestJS APIs.
1. Fast Cache Layer (Redis Cache-Aside Pattern)
Database calls are expensive. Caching heavy read requests (such as YouTube aggregates or user portfolios) lets repeated reads skip the database entirely; on a cache hit, latency can drop from around 200 ms to under 5 ms.
In NestJS, we implement cache-aside cleanly with an interceptor or by directly wrapping services:
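import { Inject, Injectable } from '@nestjs/common';
import Redis from 'ioredis'; // Assumes the ioredis client, which matches the set(..., 'EX', ttl) call below

@Injectable()
export class CacheService {
  constructor(@Inject('REDIS_CLIENT') private readonly redis: Redis) {}

  // Cache-aside: return the cached value if present, otherwise fetch, store, and return it
  async getOrFetch<T>(key: string, ttl: number, fetchFn: () => Promise<T>): Promise<T> {
    const cached = await this.redis.get(key);
    // Compare against null so that only a genuinely missing key counts as a miss
    if (cached !== null) return JSON.parse(cached) as T;

    const fresh = await fetchFn();
    // 'EX' sets the expiry in seconds
    await this.redis.set(key, JSON.stringify(fresh), 'EX', ttl);
    return fresh;
  }
}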
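To see the cache-aside flow without a running Redis instance, here is a minimal sketch that applies the same getOrFetch logic to an in-memory stand-in. The StringStore interface, InMemoryStore class, loadPortfolio function, and the portfolio:42 key are all hypothetical names introduced for illustration; they are not part of NestJS or ioredis.

```typescript
// Minimal subset of the Redis commands that getOrFetch relies on.
interface StringStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, mode: 'EX', ttlSeconds: number): Promise<unknown>;
}

// Hypothetical in-memory stand-in for Redis, for local experiments only.
class InMemoryStore implements StringStore {
  private readonly entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expiresAt <= Date.now()) {
      this.entries.delete(key); // Expired entries behave like a miss
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, _mode: 'EX', ttlSeconds: number): Promise<unknown> {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
    return 'OK';
  }
}

// Cache-aside: check the store first, fall back to fetchFn on a miss, then store the result.
async function getOrFetch<T>(
  store: StringStore,
  key: string,
  ttlSeconds: number,
  fetchFn: () => Promise<T>,
): Promise<T> {
  const cached = await store.get(key);
  if (cached !== null) return JSON.parse(cached) as T;
  const fresh = await fetchFn();
  await store.set(key, JSON.stringify(fresh), 'EX', ttlSeconds);
  return fresh;
}

// Hypothetical slow query; the counter records how often the "database" is hit.
let dbCalls = 0;
async function loadPortfolio(userId: number): Promise<{ userId: number; total: number }> {
  dbCalls += 1;
  return { userId, total: 1250 };
}

async function demo(): Promise<number> {
  const store = new InMemoryStore();
  const key = 'portfolio:42';
  await getOrFetch(store, key, 60, () => loadPortfolio(42)); // Miss: runs the query
  await getOrFetch(store, key, 60, () => loadPortfolio(42)); // Hit: served from the store
  return dbCalls;
}
```

Calling demo() runs the query only once for the two reads. Because the store is hidden behind a small interface, the same function can be exercised in unit tests and pointed at a real Redis client in production.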
2. Background Queue Processing (BullMQ)
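Never make the client wait on slow work inside the HTTP request handler, and never tie up the Node.js event loop that serves other requests. If a user triggers a heavy process (like generating a PDF invoice or sending a transactional email), dispatch it to a persistent background queue and respond right away; a separate worker picks the job up later.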
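import { Injectable } from '@nestjs/common';
import { InjectQueue } from '@nestjs/bullmq';
import { Queue } from 'bullmq';

@Injectable()
export class OrderService {
  constructor(
    @InjectQueue('orders') private readonly orderQueue: Queue,
    // Assumption: the original omits where `db` comes from; OrderRepository is a placeholder type
    private readonly db: OrderRepository,
  ) {}

  async placeOrder(dto: PlaceOrderDto) {
    const order = await this.db.save(dto);
    // Offload heavy notifications/sync to the queue instead of doing them in the request
    await this.orderQueue.add('process-sync', { orderId: order.id }, {
      attempts: 3,   // Try the job up to 3 times in total
      backoff: 5000, // Wait 5 seconds between retries
    });
    return { status: 'QUEUED', orderId: order.id };
  }
}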
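The attempts and backoff options are easier to reason about with a small model of what they mean. The sketch below is not BullMQ code (BullMQ handles retries internally inside its workers); it is a plain TypeScript illustration of "up to 3 attempts with a fixed 5000 ms delay between them". The names runWithRetries, retryDemo, and flakySync are hypothetical, the sleep function is injected so the demo does not actually wait, and attempts is assumed to be at least 1.

```typescript
type Sleep = (ms: number) => Promise<void>;

// Illustrative only: mirrors the semantics of { attempts: 3, backoff: 5000 }
// using a fixed delay between attempts.
async function runWithRetries<T>(
  task: () => Promise<T>,
  attempts: number,
  backoffMs: number,
  sleep: Sleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final one
      if (attempt < attempts) await sleep(backoffMs);
    }
  }
  // All attempts failed: surface the last error to the caller
  throw lastError;
}

async function retryDemo(): Promise<{ result: string; delays: number[] }> {
  const delays: number[] = [];
  // Fake sleep that records the requested delay instead of waiting
  const fakeSleep: Sleep = async (ms) => {
    delays.push(ms);
  };

  let calls = 0;
  // Hypothetical job that fails twice before succeeding
  const flakySync = async (): Promise<string> => {
    calls += 1;
    if (calls < 3) throw new Error('downstream unavailable');
    return 'synced';
  };

  const result = await runWithRetries(flakySync, 3, 5000, fakeSleep);
  return { result, delays };
}
```

In this model the job succeeds on its third and final attempt after two 5-second pauses. A job that also failed the third time would be marked as failed rather than retried again, which is why the work a job performs should be safe to repeat.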
3. Database Connection Pooling with pg-pool
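Under heavy traffic, opening and closing a TCP connection for every single PostgreSQL query is disastrous: each new connection pays for a TCP handshake, authentication, and a fresh backend process on the server. A connection pool keeps a bounded set of connections open and hands them out to queries as needed, so that setup cost is paid once rather than on every query.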
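Always configure the max pool size, connection timeout, and idle timeout based on your CPU/RAM limits and on PostgreSQL's max_connections setting, which is shared by every application instance. The options below use pg-pool's names. TypeORM passes them to the driver through the extra field of its data source options, while Prisma manages its own pool, which is typically tuned through the connection_limit and pool_timeout parameters of the connection URL: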
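const poolConfig = {
  max: 20,                       // Upper bound on clients the pool will open
  idleTimeoutMillis: 30000,      // Close clients that sit idle for 30 seconds
  connectionTimeoutMillis: 2000, // Give up if a client cannot be acquired within 2 seconds
};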
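One way to pick max is to divide the server's connection budget across the API instances that share the database. The sketch below is a rough sizing heuristic, not an official formula; maxPoolSizePerInstance and its parameters are hypothetical names, and the figures (100 server connections, 10 reserved, 4 instances) are example inputs only.

```typescript
// Hypothetical helper: split the server's connection budget across app instances.
function maxPoolSizePerInstance(
  serverMaxConnections: number, // PostgreSQL max_connections
  reservedConnections: number,  // Kept free for migrations, admin tools, monitoring
  appInstances: number,         // Number of API processes sharing the database
): number {
  if (appInstances < 1) throw new Error('appInstances must be at least 1');
  const available = serverMaxConnections - reservedConnections;
  // Round down so the combined pools never exceed the budget; keep at least 1
  return Math.max(1, Math.floor(available / appInstances));
}

const poolConfig = {
  max: maxPoolSizePerInstance(100, 10, 4), // (100 - 10) / 4 = 22.5, floored to 22
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
};
```

Treat the result as an upper bound: a smaller pool often performs just as well, because the database can only run a limited number of queries in parallel.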
By applying these three concepts (caching reads, queuing heavy background work, and pooling connections), your backend can absorb far more concurrent requests before the database becomes the bottleneck.