Can Redacted Text Still Be Recovered From a PDF?

Yes, more often than people think. A practical look at why "redacted" text survives in PDFs, real incidents where it has happened, and a two-minute test you can run on any file before you share it.

Published April 7, 2026 | 9 min read
RedactVault Support
Tags: pdf redaction, recover redacted text, redaction failure, pdf privacy, document security

Short answer: yes, surprisingly often. The longer answer is more interesting, because it tells you exactly which kinds of "redactions" are safe and which are not, and how to tell the difference in about two minutes without any special tools.

If you have ever sent a PDF with sensitive information blacked out and felt that small flicker of doubt afterwards, you were right to feel it. The way most people redact PDFs has a specific weakness, and it has caused some genuinely embarrassing public incidents. Let me walk you through what is actually happening inside the file, why it goes wrong, and how to check your own documents before you find out the hard way.

Why a PDF is not what you think it is

When you look at a PDF on screen, it feels like a picture of a page. That is the trap. A PDF is not really a picture. It is a set of instructions that tells your reader where to draw text, where to place images, what fonts to use, and what objects sit on top of what. The text is stored as text, not as pixels, which is why you can search a PDF and copy from it.

That last detail is the whole problem. When someone "redacts" a name by drawing a black rectangle over it, what they have usually done is added one new instruction to the file: "draw a black rectangle here." The instruction that says "draw the name Jane Smith here" is still in the file. The reader follows both instructions in order. The name gets drawn first. The black rectangle gets drawn on top. Visually, the name disappears. Inside the file, it never moved.

The next person who opens that PDF can click and drag across the black bar, copy what is "underneath" it, and paste the original text into any other document. They have not hacked anything. They have not used a tool. They just used the normal copy-and-paste that PDFs have always supported.
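
To make the mechanism concrete, here is a minimal sketch in Python. It is a toy model, not a real PDF parser: the `DrawText` and `FillRect` classes and the `extract_text` function are hypothetical names invented for this illustration. It shows why painting a rectangle later in the instruction list does nothing to the text instruction that came before it.

```python
from dataclasses import dataclass


@dataclass
class DrawText:
    x: int
    y: int
    text: str


@dataclass
class FillRect:
    x: int
    y: int
    width: int
    height: int


# A toy "page": instructions run in order, so later ones paint over earlier ones.
page = [
    DrawText(x=72, y=700, text="Settlement paid to Jane Smith"),
    FillRect(x=160, y=695, width=80, height=14),  # the black bar over the name
]


def extract_text(instructions):
    """What copy, paste, and search work from: every text instruction, covered or not."""
    return " ".join(op.text for op in instructions if isinstance(op, DrawText))


print(extract_text(page))  # prints "Settlement paid to Jane Smith"
```

The rectangle changes what gets painted, but text extraction never looks at rectangles at all.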

Real incidents this has caused

This is not theoretical. It has been happening to large, well-resourced organisations for over twenty years, and it keeps happening because the failure mode is invisible until someone else points it out.

In January 2019, lawyers for Paul Manafort filed a court document in a high-profile US federal case with portions visibly redacted. Within hours, journalists discovered they could simply select the blacked-out text and copy it out. The hidden passages described meetings and contacts that were not supposed to be public. A corrected version had to be filed, but by then the original, recoverable version was already mirrored across the internet. The filing came from a major law firm using mainstream PDF software.

In 2009, the US Transportation Security Administration posted an operating manual on a government procurement website with sections "redacted." The redactions were black rectangles on top of the original text. Researchers and journalists copied the underlying content within a day, and the manual's sensitive screening procedures were widely circulated. Different organisation, same mistake.

Then there is the variant where the redaction "works" visually but the document still contains the same information somewhere else inside the file. A name is redacted on page seven, but the same name appears in the document's metadata, or in a bookmark that names the section, or in an embedded comment from the original drafting process. Everyone reviewed the visible page. Nobody opened the document properties dialog.

These are not the actions of careless amateurs. These are professional teams who genuinely thought the file was safe. The reason it keeps happening is that the failure does not look like a failure when you check it on the same machine you redacted it on.

The four ways "redacted" text survives

Roughly in order of how often each one shows up in real incidents:

  1. Visual covering. The most common by a wide margin. A shape is drawn on top of the text, but the underlying text is never removed. It stays selectable, copyable, and searchable.
  2. Metadata leakage. The document properties (author, title, subject, keywords, original filename) often carry information that mirrors what was redacted on the page. The XMP metadata block in particular can contain edit history and earlier versions of the title.
  3. Structural surfaces. Bookmarks, the document outline, comments, form field default values, embedded attachments, and named destinations all contain text that the redaction tool may not have touched. A bookmark titled "Settlement with Acme Corp" is just as revealing as the redacted name on page three.
  4. Layered objects. Some PDFs use optional content groups (layers) that can be turned on and off. A redaction might hide a layer rather than removing it. Toggling the layer in another reader brings it back.

Each of these has caused at least one real-world leak that we know about, and probably many more that nobody noticed before the file was archived and forgotten.
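
Several of these surfaces leave recognisable dictionary names in the raw file, so a rough first pass can flag which ones a document has. The sketch below uses only the Python standard library. The `SURFACES` table and `surfaces_present` function are illustrative names, not part of any tool. It is a hint about where to look, not a verdict: names stored inside compressed object streams will not be seen, so an empty result proves nothing.

```python
import re

# PDF dictionary names that mark places where text can live outside the page content.
SURFACES = {
    b"/Outlines": "bookmarks / document outline",
    b"/Annots": "comments and other annotations",
    b"/AcroForm": "form fields",
    b"/EmbeddedFiles": "embedded attachments",
    b"/OCProperties": "optional content groups (layers)",
    b"/Metadata": "XMP metadata stream",
    b"/Info": "document information dictionary",
}


def surfaces_present(pdf_bytes: bytes) -> list[str]:
    """Return a label for each surface whose name appears in the raw bytes."""
    found = []
    for name, label in SURFACES.items():
        # Require that the name is not just the prefix of a longer name.
        if re.search(re.escape(name) + rb"(?![A-Za-z0-9])", pdf_bytes):
            found.append(label)
    return found


# A hand-written fragment standing in for a real file.
sample = b"<< /Type /Catalog /Outlines 5 0 R /OCProperties << >> >>\ntrailer << /Info 9 0 R >>"
for label in surfaces_present(sample):
    print("review:", label)
```

On a real document you would read the bytes with `open(path, "rb").read()` and then review each flagged surface by hand in a PDF reader.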

The two-minute test for any PDF you are about to share

You do not need any special software to check whether a redaction held. You need a different PDF reader than the one you used to create it, and about two minutes. Here is the routine:

  1. Open the redacted file in a different reader. If you used Adobe Acrobat to redact it, open the export in Edge, Chrome, or Preview. If you used a browser-based tool, open it in Acrobat Reader. The reader that produced the file is the worst place to test it because it sees the file the way you intended, not the way a fresh reader does.
  2. Click and drag across one of the redaction bars, slowly, as if you were trying to highlight text. If anything responds (a faint highlight, or a cursor that turns into a text caret), the underlying text is probably still there. Press Ctrl+C (or Cmd+C on a Mac) and then paste into any other application to confirm.
  3. Press Ctrl+F (or Cmd+F on a Mac) and search for a word you know was redacted. A name, a number, a single distinctive phrase. The search should return zero results. If a match is highlighted on or under a bar, the text is in the file.
  4. Open Document Properties (in most readers, File → Properties or File → Document Info). Read the Description, Custom, and Advanced tabs. Look at the title, the author, the keywords, the original filename. None of those fields should contain anything you meant to redact.
  5. If the document has bookmarks or a navigation pane, expand the whole tree. Read every bookmark title. Then expand any comments or annotations panel and read those too.

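Five steps. About two minutes for a normal-sized document. This single routine catches the most common leaks (covered text, metadata, bookmarks, and comments) before the file leaves your machine. If the document has layers, form fields, or attachments, give those a separate look.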
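
If you check many files, the search step can be roughly approximated in a script. The following is a sketch using only the Python standard library, and the names `readable_chunks` and `find_leaks` are hypothetical. It inflates Flate-compressed streams and looks for your terms inside literal strings. It is deliberately naive: text stored as hex strings, in UTF-16, or through a custom font encoding will be missed. Treat a hit as proof of a leak, and treat an empty result as no proof of anything. The manual test above still applies.

```python
import re
import zlib

# Naive patterns: a stream body, and a literal string such as (Jane Smith).
STREAM = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
LITERAL = re.compile(rb"\((.*?)\)", re.DOTALL)


def readable_chunks(pdf_bytes):
    """Yield the raw file, then every stream that inflates with zlib."""
    yield pdf_bytes
    for match in STREAM.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the end-of-line bytes before "endstream".
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate-compressed; this sketch skips it


def find_leaks(pdf_bytes, terms):
    """Return the terms that still appear in literal strings in the file."""
    found = set()
    for chunk in readable_chunks(pdf_bytes):
        pieces = [p.decode("latin-1").lower() for p in LITERAL.findall(chunk)]
        # Check both joins: text may be split across strings with or without spaces.
        haystacks = (" ".join(pieces), "".join(pieces))
        for term in terms:
            if any(term.lower() in h for h in haystacks):
                found.add(term)
    return sorted(found)


# A tiny hand-built stand-in for a PDF: a title string plus one compressed
# content stream that draws text and then fills a rectangle over it.
content = zlib.compress(
    b"BT /F1 12 Tf 72 700 Td (Paid to Jane Smith) Tj ET\n"
    b"0 0 0 rg 110 695 80 14 re f"
)
header = b"%PDF-1.4\n1 0 obj\n<< /Title (Acme settlement) >>\nendobj\n"
stream_dict = (
    b"2 0 obj\n<< /Filter /FlateDecode /Length "
    + str(len(content)).encode()
    + b" >>\n"
)
sample = header + stream_dict + b"stream\n" + content + b"\nendstream\nendobj\n"

print(find_leaks(sample, ["Jane Smith", "Acme", "Bob Jones"]))
```

In this stand-in file, the name survives in the compressed page content and the company name survives in the title, which is the same pair of failures that steps 3 and 4 look for by hand.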

What if the test fails?

First, do not panic. The file has not gone anywhere yet — that is the whole point of testing before you share. The fix is to redo the redaction with a workflow that actually removes the underlying content rather than covering it.

There are two reliable approaches. The safer one is a rasterized export, where each page is rendered into an image and the redactions are burned into the image. There is no longer a text layer to leak from, because there is no text. The trade-off is that the file is larger and the output is no longer searchable. For high-sensitivity material, this is usually the right call.

The other approach is a native redaction that removes or rewrites the underlying text and objects properly, then verifies the result before producing the final file. This keeps the document searchable and produces smaller files, but it requires a tool that takes the structural cleanup seriously. The two-minute test is the only honest way to confirm it actually worked.
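
The difference between covering and removing can be sketched with a toy page model in Python. Everything here is hypothetical and invented for illustration (`Box`, `TextRun`, `FilledRect`, `redact`, `verified_export`), and it is not how any particular tool works. A real tool also has to split text runs that only partly overlap the redaction area, which this model skips by dropping whole runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x: float
    y: float
    width: float
    height: float

    def overlaps(self, other: "Box") -> bool:
        return (
            self.x < other.x + other.width
            and other.x < self.x + self.width
            and self.y < other.y + other.height
            and other.y < self.y + self.height
        )


@dataclass(frozen=True)
class TextRun:
    box: Box
    text: str


@dataclass(frozen=True)
class FilledRect:
    box: Box


def extract_text(page):
    """Everything a reader could select, copy, or search."""
    return " ".join(op.text for op in page if isinstance(op, TextRun))


def redact(page, area):
    """Remove every text run that touches the area, then paint the bar."""
    kept = [
        op for op in page
        if not (isinstance(op, TextRun) and op.box.overlaps(area))
    ]
    return kept + [FilledRect(area)]


def verified_export(page, area, secrets):
    """Redact, then refuse to return a page that still contains a secret."""
    result = redact(page, area)
    remaining = extract_text(result).lower()
    leaked = [s for s in secrets if s.lower() in remaining]
    if leaked:
        raise ValueError(f"redaction left text behind: {leaked}")
    return result


page = [
    TextRun(Box(72, 700, 95, 12), "Settlement paid to"),
    TextRun(Box(170, 700, 60, 12), "Jane Smith"),
]
area = Box(168, 698, 64, 16)

covered = page + [FilledRect(area)]             # cosmetic: bar only
clean = verified_export(page, area, ["Jane Smith"])

print(extract_text(covered))  # prints "Settlement paid to Jane Smith"
print(extract_text(clean))    # prints "Settlement paid to"
```

The verification step matters as much as the removal step: it turns "I think the text is gone" into a check that fails loudly when it is not.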

Whichever approach you take, run the test again on the new export. Trust the test, not your memory of having "definitely fixed it this time."

Can someone recover redacted text from a file you have already shared?

Once a file is out, anyone with a copy can run the same test you should have run, and if the redaction was cosmetic, the underlying text is right there. There is no clever recovery technique required. They open the file, they select the bar, they paste. That is the entire process.

If you discover an old file with bad redactions has already been shared, the realistic options are limited. You can ask the recipient to delete it, but you cannot make them. You can issue a corrected version, which is good practice but does not unmake the original. For regulated material, you may have notification obligations under data protection law. The unhappy lesson here is that the test is much more useful before you share than after.

A note on the tools we use

We build RedactVault, a redaction tool designed around the failure modes described above. The relevant detail for this article is that its native PDF export verifies the result before producing the file, and if it cannot verify a page safely, it falls back to rasterizing that page rather than letting an uncertain export through. That is one way to handle the verification problem. We chose it because the alternative is trusting the user to remember the two-minute test every single time, and that is what has caused most of the public failures of the last twenty years.

You do not need our tool to apply the lessons in this post. The two-minute test works on any PDF from any source. The most important thing is to do the test, on every file, before it leaves your hands.

If you want the technique side of this, read "How to redact a PDF properly so the hidden text is actually gone." For the deeper explanation of why a black box is not enough, see "Why drawing black boxes over a PDF is not real redaction." If you want to try a workflow where the file is processed in your browser and never uploaded, open RedactVault.

The bottom line

Redacted text can be recovered from a PDF whenever the redaction was cosmetic — when something was drawn on top of the text without the text itself being removed from the file. That covers a depressingly large fraction of "redacted" PDFs in the wild, including some from organisations that absolutely should know better.

The good news is that you can tell whether your own files are safe in about two minutes, with no special software, before you share them. The better news is that once you have done the test a few times, you will start testing every PDF you receive from anyone else, and you will be the first person in the room to spot the next one that fails.

FAQ

Common questions

Is this still a problem in 2026, or has it been fixed?

It is still a problem. The underlying behaviour of PDFs has not changed, and the tools that handle redaction badly are still in widespread use. Public incidents involving improperly redacted PDFs continue to happen every year.

Do you need special software to recover redacted text?

No. If the redaction is cosmetic, you can recover the text by clicking and dragging across the black bar in any normal PDF reader, then copying and pasting. There is no hacking involved. That is exactly why this problem is so dangerous.

Is rasterized PDF export always safer than native?

In terms of pure assurance against text-layer leaks, yes. Rasterized export turns each page into an image, so there is no text layer to leak from. The trade-off is larger files and loss of searchability. For high-sensitivity material, the trade-off is usually worth it.

What is the fastest way to check a PDF is actually redacted?

Open it in a different PDF reader than the one that produced it, try to select text under the redaction bars, search for a redacted term, and check Document Properties for metadata. The whole routine takes about two minutes.

What if I have already sent a badly redacted file?

Issue a corrected version, ask the recipient to delete the original (knowing they may not), and check whether your situation triggers any breach notification obligations. The realistic message is that the test is much more valuable before sharing than after.
