NestJS · May 15, 2026 · 2 min read

Scaling NestJS APIs with Caching, Queues, and Connection Pooling

Learn architectural patterns to optimize NestJS API performance, scale throughput, and reduce PostgreSQL database latency under heavy request loads.

As your backend applications grow, simple monolithic setups can quickly run into scaling bottlenecks. In this deep dive, we'll cover key performance patterns to supercharge your NestJS APIs.

1. Fast Cache Layer (Redis Cache-Aside Pattern)

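Database calls are expensive. With the cache-aside pattern, the application checks the cache first, queries the database only on a miss, and then stores the result with a TTL. For heavy read requests (like YouTube aggregates or user portfolios), this can cut latency from around 200 ms to under 5 ms on a cache hit.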

In NestJS, we implement cache-aside cleanly with an interceptor or by directly wrapping services:

typescript read-only
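import { Inject, Injectable } from '@nestjs/common';
import Redis from 'ioredis'; // assumes the ioredis client

@Injectable()
export class CacheService {
  constructor(@Inject('REDIS_CLIENT') private readonly redis: Redis) {}

  async getOrFetch<T>(key: string, ttl: number, fetchFn: () => Promise<T>): Promise<T> {
    // Cache hit: skip the database entirely.
    const cached = await this.redis.get(key);
    if (cached !== null) return JSON.parse(cached) as T;

    // Cache miss: load the value, then keep it for `ttl` seconds.
    const fresh = await fetchFn();
    await this.redis.set(key, JSON.stringify(fresh), 'EX', ttl);
    return fresh;
  }
}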
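To see the hit and miss paths without a running Redis server, here is a minimal sketch of the same cache-aside logic against an in-memory stand-in. `KeyValueStore`, `InMemoryStore`, `loadPortfolio`, and the `portfolio:42` key are hypothetical names made up for illustration; only the `get` and `set(key, value, 'EX', ttl)` call shapes mirror the service above.

```typescript
// Hypothetical stand-in for the two Redis commands the service relies on.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, mode: 'EX', ttlSeconds: number): Promise<unknown>;
}

// In-memory implementation, for illustration only (single process, no eviction policy).
class InMemoryStore implements KeyValueStore {
  private readonly entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expiresAt <= Date.now()) {
      this.entries.delete(key); // expired: behave like a miss
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, _mode: 'EX', ttlSeconds: number): Promise<'OK'> {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
    return 'OK';
  }
}

// Same cache-aside flow as CacheService.getOrFetch, with the store passed in.
async function getOrFetch<T>(
  store: KeyValueStore,
  key: string,
  ttlSeconds: number,
  fetchFn: () => Promise<T>,
): Promise<T> {
  const cached = await store.get(key);
  if (cached !== null) return JSON.parse(cached) as T;

  const fresh = await fetchFn();
  await store.set(key, JSON.stringify(fresh), 'EX', ttlSeconds);
  return fresh;
}

// Two reads of the same key: only the first one should reach the "database".
async function demo(): Promise<{ first: unknown; second: unknown; dbCalls: number }> {
  const store = new InMemoryStore();
  let dbCalls = 0;
  const loadPortfolio = async () => {
    dbCalls += 1; // stands in for an expensive query
    return { userId: 42, total: 1000 };
  };

  const first = await getOrFetch(store, 'portfolio:42', 60, loadPortfolio);
  const second = await getOrFetch(store, 'portfolio:42', 60, loadPortfolio);
  return { first, second, dbCalls };
}
```

The second read is served from the cache, so the loader runs once. The trade-off is staleness: until the TTL expires, or the key is deleted when the underlying data changes, readers may see the old value.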

2. Background Queue Processing (BullMQ)

Never block the request/response cycle with slow work. If a user triggers a heavy process (like generating a PDF invoice or sending a transactional email), dispatch it to a persistent background queue and respond right away.

typescript read-only
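import { InjectQueue } from '@nestjs/bullmq';
import { Injectable } from '@nestjs/common';
import { Queue } from 'bullmq';

@Injectable()
export class OrderService {
  constructor(
    @InjectQueue('orders') private readonly orderQueue: Queue,
    private readonly db: OrderRepository, // hypothetical persistence layer
  ) {}

  async placeOrder(dto: PlaceOrderDto) {
    const order = await this.db.save(dto);

    // Offload heavy notifications/sync to the queue: up to 3 attempts, 5 s apart.
    await this.orderQueue.add('process-sync', { orderId: order.id }, { attempts: 3, backoff: 5000 });

    return { status: 'QUEUED', orderId: order.id };
  }
}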
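The `attempts` and `backoff` options above are easier to reason about with the retry loop spelled out. The following is a self-contained sketch of fixed-delay retries, assuming `attempts` counts the total number of tries and a numeric `backoff` is a fixed delay in milliseconds. `runWithRetries` and `Sleep` are hypothetical names, and this is not BullMQ's own implementation; in a real setup the queue's worker process applies these rules for you.

```typescript
// Injectable sleep so the loop can be exercised without real delays.
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs `handler` up to `options.attempts` times in total,
// waiting `options.backoff` milliseconds between failed attempts.
async function runWithRetries<T>(
  handler: (attemptNumber: number) => Promise<T>,
  options: { attempts: number; backoff: number },
  sleep: Sleep = realSleep,
): Promise<T> {
  let lastError: unknown;

  for (let attempt = 1; attempt <= options.attempts; attempt += 1) {
    try {
      return await handler(attempt);
    } catch (error) {
      lastError = error;
      // No point in waiting after the final attempt.
      if (attempt < options.attempts) await sleep(options.backoff);
    }
  }

  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

With `{ attempts: 3, backoff: 5000 }`, a job that fails twice and then succeeds waits 5 seconds twice before completing. Because a job can run more than once, the handler should be idempotent: sending the same email or syncing the same order twice must be safe.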

3. Database Connection Pooling with pg-pool

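Under heavy traffic, opening and closing a connection for every single PostgreSQL query is expensive: each new connection pays for a TCP handshake, authentication, and a new backend process on the database server. A connection pool keeps a bounded set of connections open and reuses them across queries, so that setup cost is paid once rather than per request.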

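Always configure the max pool size, connection timeout, and idle timeout explicitly. The option names below are pg-pool's; TypeORM can forward them to the driver through its `extra` option, while Prisma manages its own pool, sized with the `connection_limit` and `pool_timeout` connection-string parameters. Choose the values based on your CPU/RAM limits and on how many connections the PostgreSQL server can accept across all API instances: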

typescript read-only
const poolConfig = {
  max: 20, // Max active clients
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
};
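To make `max` and `connectionTimeoutMillis` concrete, here is a toy in-memory pool that hands out at most `max` connections and rejects callers who wait longer than the timeout. `TinyPool` and its methods are hypothetical and only approximate the behaviour described above; this is not pg-pool's implementation, and idle-timeout eviction is left out to keep the sketch short.

```typescript
interface PoolOptions {
  max: number; // upper bound on connections ever created
  connectionTimeoutMillis: number; // how long acquire() may wait when the pool is exhausted
}

class TinyPool<C> {
  private readonly idle: C[] = [];
  private created = 0;
  private waiters: Array<(conn: C) => void> = [];

  constructor(
    private readonly create: () => C,
    private readonly options: PoolOptions,
  ) {}

  async acquire(): Promise<C> {
    // 1. Reuse an idle connection if one is available.
    const reusable = this.idle.pop();
    if (reusable !== undefined) return reusable;

    // 2. Otherwise open a new one while still under the cap.
    if (this.created < this.options.max) {
      this.created += 1;
      return this.create();
    }

    // 3. Pool exhausted: wait for a release, or give up after the timeout.
    return new Promise<C>((resolve, reject) => {
      const timer = setTimeout(() => {
        this.waiters = this.waiters.filter((w) => w !== waiter);
        reject(new Error('timeout waiting for a connection'));
      }, this.options.connectionTimeoutMillis);

      const waiter = (conn: C) => {
        clearTimeout(timer);
        resolve(conn);
      };
      this.waiters.push(waiter);
    });
  }

  release(conn: C): void {
    // Hand the connection straight to the oldest waiter, else park it as idle.
    const next = this.waiters.shift();
    if (next) next(conn);
    else this.idle.push(conn);
  }
}
```

The failure mode to watch for is a caller that never releases its connection: once `max` connections are checked out, every further `acquire()` waits and then times out, which is why a short `connectionTimeoutMillis` surfaces leaks quickly instead of letting requests hang.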

By applying these three patterns (caching reads, queuing slow background work, and pooling connections), your backend will be much better placed to handle thousands of concurrent queries without breaking a sweat.

Pratik Sangani
Backend Developer & Architect