Processing Documents Without Storing Them
· 4 min · docsdingo.com
The Privacy Problem
Many online PDF tools have a dirty secret: they keep your files. You upload a contract, a tax return, or a medical form, and it can sit on their servers indefinitely. Some use uploaded documents for analytics or training data. Others just never bother to delete them. When I built Docs Dingo, the first design principle was that files should exist on our infrastructure only for the minimum time required to process them, and then be destroyed automatically.
The Zero-Retention Approach
Every file uploaded to Docs Dingo's servers is tagged with an expiration time at upload. Processed results get a short-lived download link. Once the link expires, both the uploaded file and the processed output are deleted. No document content is logged, indexed, or retained in any form. For lightweight operations like page counting or text extraction, the file never leaves the user's browser at all: those tasks run entirely client-side.
This approach means there is no growing archive of user documents, no compliance burden around stored personal data, and no risk of a breach exposing files we never should have kept in the first place.
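To make the expiry idea concrete, here is a minimal sketch in Python. It is not Docs Dingo's actual code: the in-memory EphemeralStore stands in for a real storage bucket, and the 30-minute RETENTION window is an assumed value, since the real lifetime is not stated here. The upload and its processed output get the same deadline, reads refuse anything past that deadline, and a periodic sweep removes whatever has expired.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed retention window, for illustration only.
RETENTION = timedelta(minutes=30)


@dataclass
class StoredObject:
    key: str
    data: bytes
    expires_at: datetime


@dataclass
class EphemeralStore:
    """Hypothetical in-memory stand-in for a bucket with per-object expiry."""

    objects: dict = field(default_factory=dict)

    def put(self, key: str, data: bytes, now: datetime) -> StoredObject:
        # Every object is stamped with its deadline at write time.
        obj = StoredObject(key, data, expires_at=now + RETENTION)
        self.objects[key] = obj
        return obj

    def get(self, key: str, now: datetime) -> Optional[bytes]:
        obj = self.objects.get(key)
        if obj is None or now >= obj.expires_at:
            # Expired objects are treated as gone even if no sweep has run yet.
            self.objects.pop(key, None)
            return None
        return obj.data

    def sweep(self, now: datetime) -> list:
        # Delete everything past its deadline and report the removed keys.
        expired = [k for k, o in self.objects.items() if now >= o.expires_at]
        for k in expired:
            del self.objects[k]
        return expired


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
    store = EphemeralStore()
    store.put("upload/contract.pdf", b"%PDF-in", t0)
    store.put("result/contract.pdf", b"%PDF-out", t0)
    print(store.sweep(t0 + RETENTION))
```

In a real deployment the sweep would more likely be a storage lifecycle rule than application code, but the invariant is the same: nothing is readable after its deadline, and nothing survives the next cleanup pass.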
Handling 100MB Files Serverlessly
Serverless functions have strict payload size limits, which creates an obvious problem for a tool that needs to handle large PDFs. The solution is to separate the upload path from the processing path. Files are uploaded directly to cloud storage using temporary credentials, bypassing the function's size restrictions entirely. The processing function then reads the file from storage, does its work, and writes the result back. The user's browser never sends the file through the API layer at all.
This design handles documents up to 100MB without running into the function payload limits, because the file itself never passes through a function invocation. It also means uploads are faster because the file goes straight to storage rather than passing through an intermediary.
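The split between the upload path and the processing path can be sketched as follows. Cloud providers offer this pattern as presigned URLs or temporary credentials; the code below is only a standard-library illustration of the idea, not any provider's API. The names issue_upload_grant and storage_accepts, the 5-minute grant lifetime, and the grant format are all assumptions made for the example. The API layer signs a small grant, and the storage side checks the signature, the deadline, and the size cap before accepting bytes.

```python
import hashlib
import hmac

MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # the 100MB ceiling mentioned above
GRANT_TTL_SECONDS = 300               # assumed grant lifetime, for illustration


def _sign(secret: bytes, key: str, expires: int, max_bytes: int) -> str:
    # Sign the fields the storage side must be able to trust.
    message = f"{key}|{expires}|{max_bytes}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def issue_upload_grant(secret: bytes, object_key: str, now: float) -> dict:
    """API layer: hand the browser a short-lived grant, never the file itself."""
    expires = int(now) + GRANT_TTL_SECONDS
    return {
        "key": object_key,
        "expires": expires,
        "max_bytes": MAX_UPLOAD_BYTES,
        "signature": _sign(secret, object_key, expires, MAX_UPLOAD_BYTES),
    }


def storage_accepts(secret: bytes, grant: dict, size: int, now: float) -> bool:
    """Storage side: accept the upload only if the grant is intact and in date."""
    expected = _sign(secret, grant["key"], grant["expires"], grant["max_bytes"])
    return (
        hmac.compare_digest(expected, grant["signature"])
        and now < grant["expires"]
        and size <= grant["max_bytes"]
    )


if __name__ == "__main__":
    secret = b"demo-secret"
    grant = issue_upload_grant(secret, "uploads/report.pdf", now=1000.0)
    print(storage_accepts(secret, grant, 50 * 1024 * 1024, now=1100.0))
```

The function that issues the grant only ever handles a few hundred bytes of JSON, which is why the payload limit stops mattering: the large file travels from the browser to storage directly.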
What It Actually Does
Docs Dingo covers the PDF tasks people hit constantly: merging multiple PDFs into one (preserving bookmarks and links), splitting out specific page ranges, reducing file sizes by 40-70% without visible quality loss, and converting between PDF and image formats. Every operation follows the same pattern: upload, process, download, delete. Nothing persists.
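As one small illustration of the splitting step, a page-range request has to be turned into concrete page indices before any PDF library can act on it. The sketch below assumes a user-facing syntax like "1-3, 7" with one-based page numbers; that syntax and the parse_page_ranges helper are hypothetical, not taken from Docs Dingo.

```python
def parse_page_ranges(spec: str, page_count: int) -> list:
    """Turn a spec such as '1-3, 7' into zero-based page indices."""
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if sep else start
        # Reject reversed ranges and pages outside the document.
        if not 1 <= start <= end <= page_count:
            raise ValueError(f"range {part!r} is outside 1..{page_count}")
        pages.extend(range(start - 1, end))
    return pages


if __name__ == "__main__":
    print(parse_page_ranges("1-3, 7", page_count=10))  # [0, 1, 2, 6]
```

Validating against the real page count up front means a bad request fails before any processing work is done on the file.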
The platform runs serverlessly, so there are no idle servers and costs scale with actual usage. Whether one person processes a file today or a thousand do, the infrastructure handles it without any manual intervention.