By the end of this chapter you will:
User-supplied files are the single highest-risk input your backend will ever accept. A .jpg that contains a PHP shell, a .xlsx with a 1 KB body that decompresses into 5 GB, a filename that navigates to ../../etc/passwd: all of these come in through your upload endpoint. This chapter is how you survive them.
The browser or mobile app does not send "data" — it sends bytes with a Content-Type header that tells your server how to read those bytes.
| Content-Type | What it looks like | When it is used |
|---|---|---|
| application/json | {"name":"Om","age":29} | Default for SPAs / mobile apps. Use this 95% of the time. |
| application/x-www-form-urlencoded | name=Om&age=29 | Old <form> without enctype, some webhooks. |
| multipart/form-data | Each field separated by a boundary string; can carry files | Anything that uploads a file. |
If a request contains a file, it MUST be multipart/form-data. JSON cannot carry binary efficiently — base64-ing a 5 MB image inside JSON inflates it to ~6.7 MB and is a known anti-pattern.
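The ~6.7 MB figure follows from how base64 works: every 3 input bytes become 4 output characters. A minimal sketch of that arithmetic, where `base64Length` is a helper name introduced here for illustration:

```typescript
// Base64 encodes every 3 input bytes as 4 output characters (padding included).
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

const fiveMB = 5 * 1024 * 1024;        // 5,242,880 bytes
const encoded = base64Length(fiveMB);  // 6,990,508 characters
const overhead = encoded / fiveMB;     // roughly 1.33x

// Cross-check the formula against Node's own encoder on a small buffer.
const sample = Buffer.alloc(1000);
const actual = sample.toString('base64').length; // 1336, same as base64Length(1000)
```

The encoded string is about a third larger before the JSON parser has even allocated it, which is why multipart is the better fit for files.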
NestJS uses multer under the hood via @nestjs/platform-express.
import {
  Controller, Post, UploadedFile, UseInterceptors, BadRequestException, Body,
} from '@nestjs/common';
import { FileInterceptor } from '@nestjs/platform-express';
import { memoryStorage } from 'multer';

- FileInterceptor('document', ...): the form field name MUST be document.
- FilesInterceptor('docs', 5, ...): multiple files in one field (max 5).
- FileFieldsInterceptor([{name:'photo'},{name:'pdf'}]): different fields, each with their own files.

Always set limits.fileSize on the interceptor. Without it, a malicious client can stream a 50 GB file at you and hold the connection open until your server runs out of memory or disk.
File size is already enforced by limits.fileSize. Rejecting at the limit means the server never buffers a huge file.
The original filename is attacker-controlled. Do not save it as-is, do not use it in S3 keys, do not echo it back unsanitised in HTML.
// ❌ Path traversal — attacker uploads "../../etc/passwd"
const dest = path.join('/uploads', file.originalname);
// ✅ Generate a new name; preserve only the extension after sanitising it
import { extname } from 'path';
import { v4 as
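
One way to generate the stored name is sketched below. It is an illustration, not the chapter's exact code: it uses Node's built-in `randomUUID` instead of the `uuid` package, and `safeStoredName` and `ALLOWED_EXTENSIONS` are hypothetical names with an assumed allow-list.

```typescript
import { randomUUID } from 'node:crypto';
import { extname } from 'node:path';

// Hypothetical allow-list; adjust to the formats your endpoint accepts.
const ALLOWED_EXTENSIONS = new Set(['.jpg', '.jpeg', '.png', '.pdf']);

function safeStoredName(originalName: string): string {
  // extname only inspects the last path segment, so "../../etc/passwd" yields "".
  const ext = extname(originalName).toLowerCase();
  if (!ALLOWED_EXTENSIONS.has(ext)) {
    throw new Error(`Extension not allowed: "${ext}"`);
  }
  // Nothing from the client-supplied name survives except the vetted extension.
  return `${randomUUID()}${ext}`;
}

const stored = safeStoredName('Holiday Photo.JPG'); // "<random uuid>.jpg"
```

The extension kept here is only a label for convenience. It says nothing about the real content, which is what the magic-byte check is for.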
file.mimetype is what the client claims. An attacker can send whatever they want. This is not sufficient — move to step 4.
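Most binary file formats start with a known byte signature. PDFs start with %PDF-. JPEGs start with FF D8 FF. A .jpg file whose contents are really a PHP script will fail this check because it does not begin with FF D8 FF. A file that starts with a valid signature and hides a payload further in will still pass, which is why the later steps and a separate serving origin still matter.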
import { fileTypeFromBuffer } from 'file-type';
const detected = await fileTypeFromBuffer(file.buffer);
if (!detected) throw new BadRequestException('Unrecognised file');
const allowed =
file-type is a widely used npm library for magic-byte sniffing. Run it on every upload endpoint.
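
To show what a magic-byte check does internally, here is a hand-rolled sketch covering three formats. It is for illustration only; in the endpoint itself a maintained library is the better choice. `sniffMime` and `SIGNATURES` are names introduced here, and the PNG signature is added alongside the two mentioned above.

```typescript
type SniffedType = 'application/pdf' | 'image/jpeg' | 'image/png';

// Leading bytes that identify each format.
const SIGNATURES: Array<{ mime: SniffedType; bytes: number[] }> = [
  { mime: 'application/pdf', bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
  { mime: 'image/jpeg', bytes: [0xff, 0xd8, 0xff] },
  { mime: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
];

// Returns the detected type, or undefined when no signature matches.
function sniffMime(buffer: Buffer): SniffedType | undefined {
  const match = SIGNATURES.find(({ bytes }) =>
    buffer.length >= bytes.length && bytes.every((b, i) => buffer[i] === b),
  );
  return match?.mime;
}

const pdf = sniffMime(Buffer.from('%PDF-1.7\n'));
const jpeg = sniffMime(Buffer.from([0xff, 0xd8, 0xff, 0xe0]));
const shell = sniffMime(Buffer.from('<?php system($_GET["c"]); ?>'));
```

Here `pdf` and `jpeg` are recognised while `shell` is `undefined`, regardless of what extension or Content-Type the client attached.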
A 100 KB PNG can decode into a 1 GB bitmap (a "decompression bomb") and crash your image processor. Cap dimensions before any processing.
import sharp from 'sharp';
const meta = await sharp(file.buffer).metadata();
if ((meta.width ?? 0) > 8000 || (meta.height ?? 0
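
The reason for the cap is simple arithmetic: a decoded bitmap costs roughly width × height × bytes per pixel, no matter how small the compressed file was. The sketch below assumes 4 bytes per pixel (RGBA) and adds a hypothetical total-pixel budget, `MAX_PIXELS`, on top of the per-side limit of 8000 used above.

```typescript
const MAX_SIDE = 8000;          // per-side cap, matching the check above
const MAX_PIXELS = 40_000_000;  // hypothetical total-pixel budget (assumption)
const BYTES_PER_PIXEL = 4;      // RGBA at 8 bits per channel (assumption)

// Approximate memory needed to hold the fully decoded image.
function decodedBytes(width: number, height: number): number {
  return width * height * BYTES_PER_PIXEL;
}

function withinImageBudget(width: number, height: number): boolean {
  return width > 0 && height > 0 &&
    width <= MAX_SIDE && height <= MAX_SIDE &&
    width * height <= MAX_PIXELS;
}

const bomb = decodedBytes(16384, 16384);   // 1,073,741,824 bytes, i.e. 1 GiB
const square = decodedBytes(8000, 8000);   // 256,000,000 bytes
```

A per-side cap alone still admits an 8000 × 8000 image, which decodes to about 256 MB under these assumptions. A total-pixel budget closes that gap.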
Run uploaded files through ClamAV (free, self-hosted) or a cloud AV service. Run the scan before the file becomes accessible to other users. Enqueue the scan as a background job (Chapter 15) and mark virus_scan_status when it completes.
A common need: upload a document AND submit a structured object alongside it. With multipart/form-data, every field is a string — you cannot send nested JSON natively.
Frontend builds a FormData like this:
document: <File>
payload: '{"customer_id":"abc","remarks":"front of PAN"}' ← stringified

Backend parses it:
@Post('document')
@UseInterceptors(FileInterceptor('document'))
async upload(
@UploadedFile() file: Express
It is tempting to skip class-validator because the data came from a string. Don't. NestJS's global ValidationPipe only sees payload as an opaque string, so nothing inside it gets validated. After JSON.parse you must build the DTO instance yourself and explicitly call validateOrReject.
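
The parse step itself deserves the same care. Below is a dependency-free sketch of defensive parsing for the payload field. The field names come from the example above; which fields are required and the error messages are assumptions, and `parsePayload` is a hypothetical helper. In the NestJS handler you would typically throw BadRequestException and replace the hand-written checks with class-transformer's plainToInstance followed by class-validator's validateOrReject.

```typescript
interface UploadPayload {
  customer_id: string;
  remarks?: string;
}

function parsePayload(raw: unknown): UploadPayload {
  // Multipart text fields arrive as strings; anything else is a client error.
  if (typeof raw !== 'string') {
    throw new Error('payload must be a JSON string');
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error('payload is not valid JSON');
  }
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    throw new Error('payload must be a JSON object');
  }
  const { customer_id, remarks } = parsed as Record<string, unknown>;
  if (typeof customer_id !== 'string' || customer_id.length === 0) {
    throw new Error('customer_id is required');
  }
  if (remarks !== undefined && typeof remarks !== 'string') {
    throw new Error('remarks must be a string');
  }
  // Return only the known fields so unexpected keys are dropped.
  return { customer_id, remarks };
}

const ok = parsePayload('{"customer_id":"abc","remarks":"front of PAN"}');
```

Returning a fresh object with only the expected keys means stray properties in the client's JSON never reach the rest of the handler.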
| | memoryStorage() | diskStorage() |
|---|---|---|
| Where the file goes | RAM (file.buffer) | Local disk (file.path) |
| Good for | Files you forward to S3 immediately | Files you process line-by-line and discard |
| Bad for | Files larger than ~10 MB (eats heap) | Containerised apps (disk is ephemeral) |
In this project: always use memoryStorage() and upload the buffer straight to S3. The local disk on an app container vanishes on every deploy.
{
  fileSize: 5 * 1024 * 1024, // 5 MB per file
  files: 5,                  // max 5 files per request
  fields: 20,                // max 20 non-file fields
  fieldNameSize: 100,        // max 100 bytes per field name
  fieldSize: 1024 * 100,     // 100 KB per text field
}
Without these, a single request can DoS your server with a million tiny fields.
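
These limits also give you a number to plan capacity around. The sketch below estimates the most payload a single request can make the server buffer under memoryStorage(), using the values above. `worstCaseRequestBytes` is a hypothetical helper, and the estimate ignores parser overhead and file field names.

```typescript
interface UploadLimits {
  fileSize: number;
  files: number;
  fields: number;
  fieldNameSize: number;
  fieldSize: number;
}

const limits: UploadLimits = {
  fileSize: 5 * 1024 * 1024,
  files: 5,
  fields: 20,
  fieldNameSize: 100,
  fieldSize: 1024 * 100,
};

// Upper bound on payload bytes one request can hold in memory at once.
function worstCaseRequestBytes(l: UploadLimits): number {
  const fileBytes = l.files * l.fileSize;                        // 26,214,400
  const fieldBytes = l.fields * (l.fieldNameSize + l.fieldSize); //  2,050,000
  return fileBytes + fieldBytes;
}

const perRequest = worstCaseRequestBytes(limits); // 28,264,400 bytes, about 27 MiB
```

Multiply that figure by the number of concurrent uploads you expect to see how much heap the endpoint can demand at peak.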
Once validation passes, the flow is always the same:
1. Validate file (size, magic bytes, AV)
2. Compute a key: kyc/{user_id}/{uuid}.{ext}
3. Upload to S3 (server-side encrypted, private)
4. INSERT a row into the `documents` table:
id, owner_id, type, s3_key, content_type,
size_bytes, sha256_hash, virus_scan_status
5. Return the document ID to the client (NOT the S3 URL)

Why store metadata in the DB and bytes in S3? Because databases are slow and expensive at storing blobs, you want to query documents with ordinary database queries, and you may want different retention for the DB row vs. the S3 object.
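
Steps 2 and 4 can be sketched without touching S3 or the database. The column names below come from the list above; the `virus_scan_status` values, the reuse of the document ID as the UUID in the key, and the `buildDocumentRow` helper are assumptions made for illustration. `ext` is expected to be already validated and passed without a leading dot.

```typescript
import { createHash, randomUUID } from 'node:crypto';

interface DocumentRow {
  id: string;
  owner_id: string;
  type: string;
  s3_key: string;
  content_type: string;
  size_bytes: number;
  sha256_hash: string;
  virus_scan_status: 'pending' | 'clean' | 'infected'; // assumed status values
}

function buildDocumentRow(
  ownerId: string,
  type: string,
  buffer: Buffer,
  contentType: string, // the sniffed type, not the client-claimed one
  ext: string,
): DocumentRow {
  const id = randomUUID();
  return {
    id,
    owner_id: ownerId,
    type,
    s3_key: `kyc/${ownerId}/${id}.${ext}`,
    content_type: contentType,
    size_bytes: buffer.length,
    sha256_hash: createHash('sha256').update(buffer).digest('hex'),
    virus_scan_status: 'pending', // updated when the AV job completes
  };
}

const row = buildDocumentRow('user-1', 'pan_card', Buffer.from('abc'), 'application/pdf', 'pdf');
```

Storing the hash alongside the key lets you detect duplicate uploads and verify later that the object in S3 is still the one you validated.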
| Mistake | Why it matters |
|---|---|
| Trusting file.mimetype | Attacker-controlled. Verify with magic bytes. |
| Saving the original filename | Path traversal or content injection. Generate your own. |
| No fileSize limit | Request buffers 2 GB into heap — crash. |
| Allowing .html, .svg, .exe | SVG can carry XSS; HTML served back is stored XSS. |
| Serving uploads from the same origin as the app | Auth cookies are readable via XSS in uploaded content. |
| Keeping the file on the app server disk | Container restarts delete the file. |
| Skipping AV scan because "it's only PDFs" | PDFs carry payloads. Scan everything. |
Validate in this order: size limit → magic bytes → dimensions → AV scan. Each step is cheap relative to the one after it. Fail early and fail with a specific error. Never save the original filename anywhere.
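
The "cheapest first, stop at the first failure" ordering can be expressed as a small runner. This is a simplified synchronous sketch: `Check` and `firstFailure` are hypothetical names, the individual checks are stand-ins, and the AV scan is left out because it runs as a separate job.

```typescript
// A check returns null when it passes, or a specific problem description.
type Check = { name: string; run: () => string | null };

function firstFailure(checks: Check[]): string | null {
  for (const check of checks) {
    const problem = check.run();
    if (problem !== null) {
      return `${check.name}: ${problem}`; // stop here; later checks never run
    }
  }
  return null;
}

// Stand-in checks that record whether they were executed.
const ran: string[] = [];
const result = firstFailure([
  { name: 'size', run: () => { ran.push('size'); return null; } },
  { name: 'magic bytes', run: () => { ran.push('magic bytes'); return 'not a PDF, JPEG or PNG'; } },
  { name: 'dimensions', run: () => { ran.push('dimensions'); return null; } },
]);
```

In this run `result` names the magic-byte failure and the dimensions check is never executed, so the expensive image decode is skipped for a file that was never an image.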