Limits & Accuracy Details

What to know about limits and accuracy

An honest list of where RedactVault's automated detection and export can fall short, why each limitation exists, and what to do about it. Read this before you trust any redacted file you release.

Four habits that catch almost everything

  • Treat automated detection as a first pass that gets you most of the way, not as a final answer.
  • Always read the document and confirm what was caught and what was missed before exporting.
  • Open the exported file in a different reader and try to select text under the redactions. If you can select it, it was not removed.
  • For high-sensitivity material, default to rasterized PDF export and check the metadata in Document Properties after export.
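
The select-text check in the third habit can be partly scripted. The sketch below assumes you have already pulled the text out of the exported file with a separate reader or tool (that step is not shown). `find_leaked_terms` is a hypothetical helper written for illustration; it is not part of RedactVault.

```python
import re

def find_leaked_terms(extracted_text: str, sensitive_terms: list[str]) -> list[str]:
    """Return the sensitive terms that still appear in text pulled from an exported file."""
    # Collapse whitespace and lowercase so line wraps and casing do not hide a match.
    normalized = re.sub(r"\s+", " ", extracted_text).lower()
    leaked = []
    for term in sensitive_terms:
        needle = re.sub(r"\s+", " ", term).strip().lower()
        if needle and needle in normalized:
            leaked.append(term)
    return leaked

# Hypothetical text, as another reader might extract it from an export
# where the redaction boxes only covered the words instead of removing them.
exported_text = "Statement of Jane\nDoe, account 4415 2290, filed in March."
print(find_leaked_terms(exported_text, ["Jane Doe", "4415 2290", "John Roe"]))
# prints ['Jane Doe', '4415 2290']
```

An empty result only shows that these terms are absent from the extracted text. It does not replace the manual selection check or the metadata check.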

Automated detection misses things, especially context

The detection engines are good at patterns: names that look like names, numbers shaped like phone numbers or account numbers, addresses, dates of birth. They are not good at meaning. A phrase like "the claimant's eldest daughter" identifies a real person but contains no pattern to match. Any content whose sensitivity comes from context rather than format needs a human pass.

What to do about it: Read the document yourself before exporting. Use the search-and-redact tool to sweep for names and terms you know are sensitive but might not be flagged as patterns.
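
To make the pattern-versus-meaning gap concrete, here is a minimal sketch using two simplified regular expressions. The patterns and the `pattern_hits` helper are assumptions chosen for illustration and are not RedactVault's detection rules.

```python
import re

# Illustrative patterns only; these are assumptions, not RedactVault's rules.
PATTERNS = {
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}

def pattern_hits(text: str) -> list[tuple[str, str]]:
    """Return (label, matched text) for every pattern match in the text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return hits

formatted = "Call 555-013-2290 regarding the claimant, born 04/12/1987."
contextual = "The claimant's eldest daughter was present at the hearing."

print(pattern_hits(formatted))   # prints [('phone', '555-013-2290'), ('date', '04/12/1987')]
print(pattern_hits(contextual))  # prints []
```

The second sentence identifies a person just as surely as the first, yet nothing in it has a shape a pattern can latch onto. That is the kind of passage the human read-through is for.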

OCR quality bounds everything else

For scanned PDFs and images, detection only sees what OCR extracts. If the scan is faint, skewed, or low-resolution, or if it contains handwriting, OCR will miss text, and anything OCR misses, the detector cannot flag. The tool cannot redact text it cannot read.

What to do about it: For scanned legal exhibits, work from the highest-quality scan you can obtain. After redaction, scroll the document and visually check anywhere OCR might have struggled (margins, footers, stamps, signatures).

PDF, DOCX, and image workflows are not interchangeable

Each file type has its own export behavior. PDFs have two export modes (rasterized and native) with different tradeoffs. DOCX preserves a virtualized layout, and its text-to-position mapping occasionally needs progressive refinement. Images are pixel-based, so redactions are burned directly into the image. A workflow you tested on PDFs will not behave identically on DOCX or images.

What to do about it: Test each file type you actually use, not just the dominant one. The verification steps that catch a leak in one format may not be the right ones for another.

Native PDF export and rasterized PDF export make different bets

Native PDF export keeps the text layer for unredacted parts of the document, which means smaller files and searchable output. It removes the content under each redaction and runs verification before download. If a page cannot be verified, it falls back to rasterizing that page rather than letting an uncertain export through. Rasterized export converts every page to an image up front, which gives the strongest assurance but produces larger, non-searchable files.

What to do about it: Default to rasterized when assurance matters more than file size. Use native when downstream searchability matters and the document is straightforward.
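
The verify-then-fall-back behavior of native export can be pictured as a per-page decision. The sketch below is a simplified model under stated assumptions: `Page`, `page_verifies`, and `plan_export` are hypothetical names, and the check shown (no redacted string survives in the page's text layer) is only one plausible form of verification, not a description of RedactVault's internals.

```python
from dataclasses import dataclass

@dataclass
class Page:
    number: int
    text_layer: str            # text left on the page after content removal
    redacted_terms: list[str]  # strings that were supposed to be removed

def page_verifies(page: Page) -> bool:
    """A page passes only if none of its redacted strings survive in the text layer."""
    remaining = page.text_layer.lower()
    return not any(term.lower() in remaining for term in page.redacted_terms)

def plan_export(pages: list[Page]) -> dict[int, str]:
    """Keep pages that verify as 'native'; fall back to 'rasterized' for the rest."""
    return {p.number: "native" if page_verifies(p) else "rasterized" for p in pages}

# Hypothetical two-page document: removal succeeded on page 1 but not on page 2.
pages = [
    Page(1, "Filed on behalf of the claimant.", ["Jane Doe"]),
    Page(2, "Witness: Jane Doe, residing in Leeds.", ["Jane Doe"]),
]
print(plan_export(pages))  # prints {1: 'native', 2: 'rasterized'}
```

The point of the design is that uncertainty resolves toward the safer mode: a page that fails the check loses searchability rather than shipping with recoverable text.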

Device resources cap what is possible in the browser

Because processing happens in your browser, the limits on file size, page count, and detection model speed are set by the device you are using: its memory, its CPU, and how much your browser is willing to allocate to one tab. Very large documents on modest hardware can be slow, and the AI-based accurate detector loads a model into the tab that takes memory of its own.

What to do about it: For unusually large documents, use a desktop browser rather than mobile, close other heavy tabs, and run the fast detector first to scope the work before enabling the accurate detector.
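
A rough calculation shows why page count runs into memory limits. The sketch below estimates the uncompressed size of one page bitmap. The page size, the 200 DPI resolution, and the 4 bytes per pixel are assumptions chosen for illustration; the resolution RedactVault actually renders at, and how many pages it holds in memory at once, are not stated here.

```python
def raster_page_bytes(width_in: float, height_in: float, dpi: int,
                      bytes_per_pixel: int = 4) -> int:
    """Uncompressed size in bytes of one page rendered as a bitmap."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel

# Assumed values: US Letter page, 200 DPI, RGBA pixels.
per_page = raster_page_bytes(8.5, 11, 200)
print(per_page)  # prints 14960000

# Upper bound if every page bitmap of a 300-page file were held at the same time.
print(f"{per_page * 300 / 1024**2:.0f} MiB")
```

At roughly 15 MB per uncompressed page, a few hundred pages would exceed what many browser tabs will allocate if they were all kept in memory together, which is why large documents are better handled on a desktop browser with other heavy tabs closed.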

Offline support is partial, not absolute

Offline support works inside an already-active paid session for some workflows, after the page and assets have been loaded. It does not cover every detection engine, and you cannot start a fresh session without connectivity, because sign-in, plan checks, and the initial asset load all need the network.

What to do about it: If you need offline reliability, load the app and the document while connected, then do not refresh the tab until you have finished and exported.

Read this with the security architecture page

In-browser processing keeps the file from ever leaving your device. This page covers the other side of the picture: where the automated work falls short and what your team still has to do by hand. You need both pages for an honest read on what the tool can and cannot do.