The Best Way to Redact Medical Documents Without Uploading Them to the Cloud
A practical guide for healthcare professionals who need to redact PHI from clinical, billing, and research documents without sending them to a cloud vendor. Covers what actually counts as PHI in a document, what a Business Associate Agreement does and does not do, the specific mistakes that lead to OCR enforcement, and the realistic workflow options for clinical and administrative staff.
If you have ever stared at a chart, an explanation of benefits, or a clinical narrative and wondered whether the redaction tool you are about to use is one your privacy officer would be comfortable with, this post is for you. Healthcare redaction looks like ordinary document redaction — you pick a tool, you cover the names, you save the file — but the rules around it are different in ways that matter, and the consequences of getting it wrong have a regulator with subpoena power attached.
We recently published a sister piece for lawyers on cloud-based redaction. The surface question is the same here: does this document need to leave my device? The underlying analysis is not a copy of the legal one with the words swapped. PHI is its own category, HIPAA has its own machinery, and the Office for Civil Rights enforces breaches with specific patterns. The most useful thing this post can do is name those patterns and the workflows that avoid them.
What we will cover: what actually counts as protected health information in a document, the parts of HIPAA's de-identification standard people most often get wrong, what a Business Associate Agreement covers and what it does not, the specific mistakes that lead to OCR settlements, and the realistic workflow options for clinical and administrative staff who are not in IT and do not have the budget to be.
What actually counts as PHI in a document
PHI — protected health information — is any health information that can be linked, directly or indirectly, to a specific person. The legal definition lives in 45 CFR §160.103 and is broader than people expect. The Privacy Rule's Safe Harbor de-identification standard at §164.514(b)(2) gives the working list of identifiers that have to be removed before a document is treated as de-identified.
There are eighteen Safe Harbor categories. Removing them all is one of two routes to a de-identified data set under HIPAA; the other is Expert Determination, which requires a documented statistical analysis showing the risk of re-identification is very small. The eighteen categories are these:
- Names
- Geographic subdivisions smaller than a state — street address, city, county, precinct, and most ZIP codes
- All elements of dates (other than year) related to an individual, including birth date, admission date, discharge date, and date of death; and all ages over 89 along with elements of dates indicating such age
- Telephone numbers
- Fax numbers
- Email addresses
- Social Security numbers
- Medical record numbers
- Health plan beneficiary numbers
- Account numbers
- Certificate or license numbers
- Vehicle identifiers and serial numbers, including license plates
- Device identifiers and serial numbers
- Web URLs
- IP addresses
- Biometric identifiers, including finger and voice prints
- Full-face photographs and any comparable images
- Any other unique identifying number, characteristic, or code
Three things on this list catch people out, and they catch out the same people every time.
The first is the date rule. It is not just dates of birth. Dates of admission, dates of discharge, dates of death, dates of service, and dates of any specific procedure are all PHI when associated with an individual. A radiology report with the patient's name removed but the exam date left in is not de-identified.
The second is the ZIP code rule. The first three digits of a ZIP code are usually allowed under Safe Harbor, but only if the population of the geographic area covered by all ZIP codes sharing those three digits is greater than 20,000 people. There are several three-digit ZIPs in the United States that fail this test, and HHS publishes the list of restricted prefixes. Five-digit ZIPs are always treated as PHI.
The third is item 18 — the catch-all for any other unique identifying characteristic. This is what gets you when you carefully strip the seventeen named identifiers and leave behind a free-text note that says "patient is the only female cardiothoracic surgeon at the affiliated teaching hospital." One sentence and the file is no longer de-identified, even if every formal identifier has been removed.
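The date, ZIP, and age rules above are mechanical enough to sketch in code. The helper below generalizes one common date format to year only, applies the three-digit ZIP rule, and collapses ages per Safe Harbor. This is an illustrative sketch, not a de-identification tool: the restricted-prefix set shown is a placeholder (the authoritative list comes from HHS guidance and tracks census data), and real documents use far more date formats than the one handled here.

```python
import re

# Illustrative subset of three-digit ZIP prefixes whose combined population
# falls at or below 20,000. The authoritative list is published in HHS's
# Safe Harbor guidance and changes with census data; treat this as a placeholder.
RESTRICTED_ZIP_PREFIXES = {"036", "059", "102", "203", "556", "692", "893"}

def generalize_zip(zip_code: str) -> str:
    """Safe Harbor ZIP rule: keep the first three digits only if the prefix
    covers more than 20,000 people; otherwise the prefix becomes 000."""
    prefix = zip_code[:3]
    return "000" if prefix in RESTRICTED_ZIP_PREFIXES else prefix + "00"

# Handles MM/DD/YYYY only, for illustration. Clinical text also carries
# '14 Mar 2019', 'March 14th', ISO dates, and free-text date references.
DATE_PATTERN = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")

def generalize_dates(text: str) -> str:
    """Reduce MM/DD/YYYY dates to the year alone, per the Safe Harbor date rule."""
    return DATE_PATTERN.sub(lambda m: m.group(3), text)

def generalize_age(age: int) -> str:
    """Ages over 89 collapse into a single '90 or older' category."""
    return "90 or older" if age >= 90 else str(age)
```

So `generalize_dates("Admitted 03/14/2019.")` yields `"Admitted 2019."`, and a five-digit ZIP in a restricted prefix comes back as `"000"`. Note what the sketch cannot do: nothing here catches item 18, the free-text unique characteristic, which is exactly why automated generalization alone does not produce a de-identified data set.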
Beyond Safe Harbor, the practical PHI surface in a typical clinical document is wider than the formal list suggests. DICOM image headers carry patient name, patient ID, accession number, and referring physician fields by default. PDFs of EHR exports often carry the user account name of whoever generated the export in the document metadata. Scanned consult letters carry letterhead and fax cover sheets that name the patient on the cover even when the body is redacted. Each of these is a place PHI hides outside the page area you are looking at.
Redaction in healthcare is actually two different jobs
Most posts treat redaction as a single activity. In healthcare it is two different activities with different rules, and confusing them is one of the easier ways to do the wrong thing.
The first activity is sharing a single document with someone who is allowed to see most of it. A patient's attorney has requested chart copies as part of a personal injury case. A workers' comp carrier needs progress notes for a claim. A family member is picking up paperwork the patient signed an authorization for. In each case the recipient is identified, the disclosure is authorized, and the redaction job is to remove the parts they are not authorized to see — typically other patients mentioned in the chart, internal billing notes, or staff personal information.
This is the easier job, and it is the one most clinical and admin staff are doing day to day. The Privacy Rule's minimum necessary standard at §164.502(b) applies: even with a valid authorization, you should disclose only what is needed for the purpose. A good redaction tool is the difference between a copy that is minimum-necessary and a copy that overshares because nobody had the patience to black out three lines.
The second activity is producing a de-identified data set, typically for research, public health reporting, or sharing with a third party who does not have a direct treatment or payment relationship with the patient. This is the harder job. Safe Harbor applies in full, or the alternative Expert Determination method has to be documented. The bar is much higher, and a lot of redaction work that calls itself "de-identification" is really job one wearing job two's clothes.
Knowing which job you are doing matters because the stopping point is different. Job one stops when the recipient has what they are authorized to see and nothing more. Job two stops when no one looking at the document, armed with reasonable external information, could plausibly identify the individual. Tooling that is sufficient for the first is often insufficient for the second.
What HIPAA actually says about sending PHI to a cloud vendor
Here is the part that surprises people. If you upload a document containing PHI to a cloud-based redaction service, that service is, in HIPAA's eyes, a business associate of your organization. The Office for Civil Rights said this directly in its 2016 guidance on cloud computing. A cloud service provider that creates, receives, maintains, or transmits PHI on behalf of a covered entity or another business associate is a business associate, regardless of whether the provider can read the data. Encryption does not change the relationship. The status is defined by what the vendor does with the data, not by what the vendor can see.
There is a narrow exception called the conduit exception, which covers entities that merely transmit PHI without storing it in any meaningful way. The classic examples are the US Postal Service, courier services, and internet service providers. The 2013 Omnibus rule preamble was explicit that the exception is intentionally narrow and does not cover services that store PHI even temporarily. A redaction service that receives a file, processes it on a server, and returns a result is not a conduit.
Practically, this means three things. If you send PHI to a cloud-based vendor for redaction or any other processing, you must have a Business Associate Agreement with that vendor before the data leaves your network. If the vendor uses sub-processors — and most do — those sub-processors must be bound by appropriate agreements as well. And if either of those agreements is missing, the upload itself is a HIPAA violation, separate from anything that happens to the data afterwards.
OCR has enforced this directly. The 2016 Raleigh Orthopaedic Clinic settlement of $750,000 followed the release of X-ray films containing PHI to an entity that would harvest the silver from them, with no BAA in place. The 2017 Center for Children's Digestive Health settlement of $31,000 turned on a missing BAA covering a paper records storage vendor. And the 2016 Catholic Health Care Services settlement of $650,000, the first OCR settlement with a business associate, followed the theft of an unencrypted mobile device and findings that no risk analysis or risk management process existed. The pattern is consistent: the missing piece of paper, or the missing analysis, is itself an enforceable violation, regardless of whether any data was actually misused.
What a BAA actually does and what it does not
A Business Associate Agreement is a contract required by §164.504(e) that binds the vendor to safeguard PHI in specified ways. It is a real and important instrument. It is also commonly misunderstood as something stronger than it is.
A BAA requires the vendor to use PHI only for the purposes you specify, to safeguard it with appropriate administrative, physical, and technical controls, to report breaches to you within a defined timeframe, to bind any subcontractors with equivalent terms, to make PHI available for amendment and access requests under the Privacy Rule, and to return or destroy PHI at the end of the engagement when feasible. Since the 2013 Omnibus rule, a BAA also makes business associates directly liable under HIPAA for the parts of the rules that apply to them.
What a BAA does not do is more important. It does not prevent the vendor from being breached. It does not shrink the size of a reportable breach if one happens — if a vendor exposes 50,000 of your patients' records, you have a 50,000-record breach to notify on, BAA or no BAA. It does not satisfy your Security Rule risk analysis obligation, which requires you to evaluate the risks and vulnerabilities to PHI throughout your environment, including the parts you have outsourced. And it does not relieve you of the duty to actually pick a vendor that can do the job competently.
The misreading goes like this: "we have a BAA, so the vendor is HIPAA-compliant, so uploading is fine." This is wrong on two counts. A BAA does not make a vendor HIPAA-compliant — a BAA is a contractual instrument; compliance is a question of what the vendor actually does. And the question is not whether the upload is permitted under HIPAA. It is whether the upload is the right call given the alternatives.
The mistakes that get healthcare organizations in trouble
OCR's enforcement record over the last decade is, to a degree people do not appreciate, a list of the same handful of mistakes repeated. Reading through the resolution agreements for the largest settlements is the closest thing the industry has to a study guide for what not to do.
Anthem's 2018 settlement of $16 million — the largest HIPAA settlement to date — followed a 2014–2015 cyberattack that exposed PHI of nearly 79 million people. The OCR finding was not just that the breach happened. It was that Anthem had failed to conduct an enterprise-wide risk analysis, failed to implement procedures to regularly review system activity, and failed to respond to a known security incident in time. The breach was the trigger; the underlying compliance gaps were the reason for the size of the settlement.
Premera Blue Cross settled for $6.85 million in 2020 over a 2014 attack affecting 10.4 million individuals. The OCR press release specifically called out failures of risk analysis and risk management — not novel exotic security failures, but the foundational requirements of the Security Rule that the organization had not done well.
Memorial Healthcare System settled for $5.5 million in 2017 after the login credentials of a former employee of an affiliated physician's office were used to access the PHI of more than 100,000 individuals. The credentials had not been disabled when the employment ended. There was no exotic attacker. There was a process gap and an access control failure that went undetected for about a year.
Touchstone Medical Imaging settled for $3 million in 2019 after an FTP server allowing uncontrolled access exposed PHI of approximately 300,000 patients. The server was discoverable through standard search engines. The exposure had been ongoing and no one had noticed until a security researcher reported it.
These are not redaction cases specifically. They are something more useful: a picture of how healthcare organizations actually fail. The pattern is rarely a sophisticated attack. It is a missing risk analysis, a credential that should have been disabled, a server that should not have been internet-facing, a vendor that should have had a BAA in place. The redaction-specific failures fit the same shape.
The redaction-specific failure patterns
A clinical staffer redacts a chart in Microsoft Word by changing the font color to white over the patient name, and the recipient pastes the document into a new file and the name reappears. A billing specialist exports an EOB to PDF, draws black rectangles using a markup tool, and the underlying text is still extractable — we covered this exact failure mode in our post on why drawing black boxes over a PDF is not real redaction. A research coordinator strips the seventeen named Safe Harbor identifiers from a clinical narrative and leaves a free-text comment that says "patient is the 53-year-old wife of the chief of surgery," and the data set is now identifiable to anyone in the building.
And the workflow failure that causes the most quiet damage is sending the wrong file. The redacted version and the original sit in the same folder with similar names. The original is attached to the email by mistake. The recipient opens it. There is now a HIPAA breach, and there is no way to unsend it. The 60-day clock for notifying affected individuals starts running, and if 500 or more individuals are affected, HHS must be notified within that same window, with media notification required in some cases; smaller breaches go into an annual log submitted to HHS.
The metadata category is its own quiet hazard. PDFs exported from EHR systems regularly contain author names, software versions, and creation timestamps in the document properties. Scanned documents converted to PDF often carry the source scanner's identifier and the workstation account name. DICOM images carry an extensive set of patient and operational tags by default; the DICOM Part 15 Basic Confidentiality Profile lists the tags that need to be cleaned, and most viewers do not do it for you.
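A quick triage check for the PDF side of this hazard is to scan the raw bytes of an export for the classic document-information keys. This is a detector, not a scrubber, and it is deliberately conservative: metadata can also live in XMP packets and compressed object streams that a byte scan will miss, so an empty result does not prove the file is clean. A stdlib-only sketch:

```python
import re

# Classic PDF Info-dictionary keys that commonly leak workstation details
# such as the exporting user's account name or the EHR software version.
INFO_ENTRY = re.compile(
    rb"/(Author|Creator|Producer|Title|Subject|CreationDate|ModDate)"
    rb"\s*\(([^)]*)\)"
)

def scan_pdf_info(pdf_bytes: bytes) -> dict:
    """Return plainly visible Info-dictionary entries in a PDF's raw bytes.

    Misses XMP metadata and entries inside compressed object streams, so an
    empty result does NOT mean the file carries no metadata. Use a proper
    sanitizer before release; use this only to spot obvious leaks quickly.
    """
    found = {}
    for key, value in INFO_ENTRY.findall(pdf_bytes):
        found[key.decode("ascii")] = value.decode("latin-1", errors="replace")
    return found
```

Finding `Author` set to a staff account name on an EHR export is the signal that the sanitize step was skipped, which is precisely the failure mode described above.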
What actually happens when you upload PHI to a typical cloud redaction tool
Because the framing of this post is about avoiding uploads, it is worth being concrete about what an upload actually involves on a typical SaaS redaction service. The mechanics matter for the Security Rule risk analysis you are obligated to perform.
The file leaves your browser over HTTPS and arrives at the vendor's load balancer. That load balancer logs the request — usually IP address, file size, timestamp, and a request identifier. Those logs are infrastructure logs and are typically subject to a different retention policy from the document content itself. They are also typically not what the BAA's deletion language refers to when it talks about "returning or destroying PHI." Filenames containing patient identifiers — Smith_John_DOB1962.pdf is a common pattern — end up in those logs and can survive document deletion by months or years.
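The filename leak, at least, is easy to guard against before anything is sent anywhere. The sketch below flags filenames that look like they carry a patient name or date of birth and generates an opaque replacement. The patterns are illustrative, not exhaustive: no pattern list will catch every naming habit in a real office, so the safe default is opaque names for anything leaving the network.

```python
import re
import uuid
from pathlib import PurePath

# Illustrative patterns for identifier-bearing filenames such as
# 'Smith_John_DOB1962.pdf'. Real naming conventions vary; tune per site.
SUSPICIOUS = [
    re.compile(r"dob\s*\d{2,4}", re.IGNORECASE),   # DOB fragments
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),           # SSN-shaped numbers
    re.compile(r"^[A-Z][a-z]+[_-][A-Z][a-z]+"),     # Last_First style
]

def looks_identifying(filename: str) -> bool:
    """Heuristic check for filenames that likely embed patient identifiers."""
    stem = PurePath(filename).stem
    return any(p.search(stem) for p in SUSPICIOUS)

def opaque_name(filename: str) -> str:
    """Replace an identifying filename with a random opaque one,
    keeping the extension so downstream tooling still works."""
    suffix = PurePath(filename).suffix
    return f"doc-{uuid.uuid4().hex[:12]}{suffix}"
```

Renaming `Smith_John_DOB1962.pdf` to something like `doc-3f9a1c72b0d4.pdf` before upload means that even if infrastructure logs outlive the document, the log entries carry no PHI.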
The processing server receives the file, writes it to disk in a staging area, runs detection, and produces a redacted result. Modern cloud architectures rely on sub-processors for many of these stages: OCR may go to a third-party API, AI-based name detection may go to a different inference service, image processing may go elsewhere again. Each sub-processor is a place the file or a derivative crosses a system boundary. Each is a place that needs to be on your list of business associates and sub-business-associates if you want the chain of custody to be defensible.
Backups, snapshots, and disaster recovery copies are the next surface. Most cloud vendors maintain backups of their processing infrastructure, and those backups capture whatever was on the relevant disks at backup time. Even when the vendor deletes the document promptly after processing, the backup may persist for the standard retention period — often 30, 60, or 90 days. A vendor's BAA usually addresses this in the deletion clause, but the actual deletion timeline for backups is rarely "immediate." If you have ever asked a vendor "when exactly is the document gone," you have probably noticed how many caveats the answer has.
Then there is the employee access surface. Vendor employees with operational responsibilities — engineers, support staff, infrastructure operators — typically have some access to the processing environment. Reputable vendors restrict that access tightly with audit logs, just-in-time access controls, and least-privilege roles. Less reputable vendors do not, and the Security Rule risk analysis you are obligated to perform on the vendor is supposed to surface that distinction. In practice, most covered entities accept the vendor's SOC 2 report and move on, which is a defensible shortcut for low-sensitivity processing and a problem for high-sensitivity processing.
None of this means cloud-based redaction is wrong. It means cloud-based redaction is a real choice with a real risk profile that needs a real risk analysis. The default of "we have a BAA, send the file" is not a risk analysis. It is the absence of one, and OCR knows the difference.
Realistic workflow options for clinical and administrative staff
Most of the people doing day-to-day redaction in healthcare are not security engineers. They are HIM staff, medical records clerks, release-of-information coordinators, billing specialists, clinical research associates, and front-desk admins. The right workflow for them needs to be operationally simple, defensible to a privacy officer, and not require IT to be in the loop for every document.
Option 1: Redaction inside the EHR itself
Most modern EHRs include some form of redaction or release-of-information module. Epic, Cerner, Meditech, and Allscripts all offer ways to mark portions of a chart as not-for-disclosure when generating an export for a specific recipient. The advantage is that the document never leaves the EHR's already-secured environment, the audit trail is built in, and the disclosure is logged for the patient's accounting of disclosures.
The disadvantage is that EHR redaction tools are usually limited in two ways. They tend to be coarse — operating at the encounter or note level rather than allowing redaction of individual paragraphs or words within a note. And they often do not handle attached PDFs, scanned documents, or external records, which are precisely the documents that most need careful redaction. For chart-export use cases that fit the EHR's model, this is the right tool. For everything else, you need something on top.
Option 2: A local desktop tool like Acrobat Pro
Acrobat Pro's redaction tools are widely deployed in healthcare and they work, with the same caveat that applies in every other domain: the two-step process of marking and then applying redactions, plus the third step of running Sanitize Document to remove hidden metadata, has to be completed every time without skipping. Healthcare staff who use Acrobat regularly are usually trained to do this. Healthcare staff who use it occasionally — which is most of them — are the source of the failure pattern where "redacted" Acrobat exports contain extractable text underneath the bars.
The data-handling case for desktop Acrobat is strong. The file stays on the endpoint. There is no vendor copy, no BAA to negotiate, no sub-processor chain. The endpoint security and access controls become the limiting factor — which is normal in healthcare, where workstation lockdown, encryption, and access logging are usually in place anyway. The failure mode to watch is the procedural one, not the architectural one.
Option 3: A client-side, browser-based tool
A browser-based tool that processes the document inside the browser, without uploading the file to a server, occupies a useful spot in this workflow stack. From a HIPAA perspective, if the file does not leave the browser, the tool is not a business associate in OCR's sense — there is no PHI being created, received, maintained, or transmitted on behalf of the covered entity. The tool is closer to a piece of installed software, conceptually, than to a SaaS vendor. The BAA question simply does not arise in the same way.
The honest caveat: this only works if the tool actually does what it says. A browser-based tool that quietly sends document content to a backend for OCR or AI processing has the same regulatory profile as any other cloud service. The diligence question is not "does the tool have a nice client-side story" but "does the tool actually keep the document inside the browser end-to-end." This is verifiable. Open the browser's developer tools, switch to the Network tab, drop a non-sensitive test document onto the tool, and watch what gets sent. A genuinely client-side tool will show no upload of document content. Any tool worth using will document the answer clearly and will be willing to walk a privacy officer through the verification.
Option 4: A cloud-based SaaS service with a signed BAA
A managed cloud redaction service with a properly negotiated BAA is a legitimate choice for many healthcare workflows. Large-scale redaction across millions of documents — for example, building a de-identified research data warehouse — is genuinely a job that fits a managed service better than any desktop tool. The trade-off is the one this whole post has been about: you are accepting the upload, the vendor's processing chain, and the residual risk that the BAA allocates but does not eliminate.
If this is the path you take, the diligence checklist is non-trivial. You need to verify the vendor will sign a BAA without unusual carve-outs, understand the sub-processor chain, get clarity on backup and log retention specifically for PHI, satisfy yourself the vendor's access controls are appropriate, and document the whole analysis in your Security Rule risk assessment. None of this is exotic — it is the same diligence required for any business associate — but it is real work, and it is the work that distinguishes "using a cloud tool defensibly" from "using a cloud tool because the alternative seemed harder."
Option 5: An air-gapped workstation for the most sensitive data sets
For research data sets containing especially sensitive information — substance use disorder records under 42 CFR Part 2, HIV records, mental health records subject to state-level enhanced protections, genetic data — an air-gapped workstation is sometimes the right answer. No network connection, no telemetry, no possibility of accidental upload. Documents move in and out on encrypted physical media.
This is operationally inconvenient and overkill for most clinical workflows. It is the right answer when the wrong answer is a reportable breach of a record category that carries enhanced state penalties or 42 CFR Part 2 exposure. Most large healthcare organizations have at least one such workstation somewhere; smaller practices usually do not, which is one of the reasons sensitive specialty data sets often live with the institutions that can support them.
Match the tool to the document
Most healthcare organizations use a single redaction workflow for every document. The sensible approach is tiered. Not every document carries the same risk, and not every document needs the same tool.
A rough tiering that works for most healthcare settings:
- Routine internal sharing — administrative documents, policy drafts, training materials with no PHI — can use whatever tool is convenient.
- Standard release of information — chart copies for attorneys, EOBs for patients, progress notes for insurers — should use a tool whose processing model you can describe to a privacy officer in one sentence. Client-side or local desktop is the straightforward answer.
- Sensitive specialty records — psychotherapy notes, substance use disorder records, HIV status, genetic information — should not leave the device unless there is a specific operational reason and a current BAA covering the receiving party. Client-side, on-premise, or air-gapped depending on the specific sensitivity.
- Research and de-identified data sets — workloads that genuinely require batch processing — usually justify a managed cloud service with a properly negotiated BAA, and require Expert Determination or Safe Harbor work that is out of scope for routine clinical staff.
Tiering is not a statement that cloud services are bad. It is a statement that "what tool do we use for this document" is a real question, and the answer depends on the document. Treating every document the same is how organizations end up uploading the wrong thing.
A four-question framework for the document in front of you
When a staffer is staring at a specific document and trying to decide whether to upload it, these four questions will usually give a clean answer in under a minute.
- Does this document contain PHI? (If you are unsure, treat the answer as yes — the cost of a false negative is a notifiable breach.)
- Is the recipient of any planned disclosure both authorized and identified, and does the disclosure satisfy minimum necessary?
- Is there a tool that can do this redaction without the document leaving my device?
- If I had to explain this choice to my privacy officer or to OCR, could I?
If question 1 is yes and question 3 is yes, the default is to keep the document on the device. That is the simplest version of reasonable safeguards, and it is defensible. Everything else is a judgment call based on the specifics of the document, the recipient, and the alternatives available.
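The default above is simple enough to state as code. This is a decision aid for illustration only; the function name, parameters, and recommendation strings are our own, and real decisions belong with your privacy officer, not a script.

```python
def redaction_route(contains_phi: bool,
                    local_tool_available: bool,
                    vendor_baa_in_place: bool = False) -> str:
    """Illustrative encoding of the four-question default:
    PHI that can be handled locally should stay on the device."""
    if not contains_phi:
        return "any convenient tool"
    if local_tool_available:
        return "keep on device: client-side or desktop tool"
    if vendor_baa_in_place:
        return "cloud tool permissible: document the risk analysis"
    return "stop: no compliant route without a BAA"
```

Note that the BAA branch only arises after the local-tool branch fails, which mirrors the point of this post: the BAA is the fallback position, not the starting one.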
Where RedactVault fits, and where it does not
Short and factual. RedactVault is a browser-based redaction tool. The document is processed inside the browser — text extraction, detection, redaction, and export all run on the device. The file does not travel to our servers for processing, which is verifiable in the network tab and which we are happy to walk through with anyone whose privacy officer wants to see it.
From a HIPAA workflow standpoint, this means RedactVault sits outside the business associate question for the document itself. We do not receive, maintain, or transmit PHI on behalf of a covered entity in the course of redaction. That is a clean story for a Security Rule risk analysis — "no upload, no vendor copy, no sub-processor chain" is a much shorter paragraph than the equivalent for any cloud service.
It is not the right fit for every healthcare workflow. We are not built for batch de-identification of multi-million-record research warehouses; that is a different tool category with different scaling requirements. We are not a replacement for the EHR's release-of-information module if your workflow already routes everything through the EHR cleanly. And we are not a substitute for the proper diligence of choosing whatever tool you choose; we will tell you when we are not the right answer.
We want to be the right answer for the everyday clinical and administrative redaction job: a chart copy for an attorney, an EOB for a patient, a research extract for a colleague, a consult letter being shared with a referring physician. Documents that do not need to leave the clinical desktop, and where the redaction needs to be quick, accurate, and not generate a chain of vendor exposure for every single use.
The through-line
HIPAA is not a checklist. It is a framework that asks healthcare organizations to make defensible decisions about how PHI is handled, document those decisions, and be able to explain them when a regulator asks. The cloud-versus-not-cloud question for redaction is one of those decisions. The answer is not "cloud bad" — cloud is fine for many workflows, including some redaction workflows — but it is not "cloud default" either.
The defensible position for most clinical and administrative redaction is the one that requires the smallest explanation: the document did not leave the device, so the BAA question does not arise, the sub-processor chain does not exist, the backup retention does not apply, and the risk analysis section for this workflow is one paragraph long. That is what client-side and on-premise tools give you, and it is the position that ages well as regulations evolve and OCR enforcement priorities shift.
Match the tool to the document. Most documents do not need to leave the clinical desktop. Choose a workflow that reflects that, and your privacy officer's next audit binder will be shorter for it.
FAQ
Common questions
Does encryption in transit and at rest mean the cloud vendor is not a business associate?
No. OCR's 2016 cloud computing guidance is explicit on this point. A cloud service provider that creates, receives, maintains, or transmits PHI on behalf of a covered entity is a business associate regardless of whether the provider can read the data. Encryption is a security control that the BAA may require; it is not a substitute for the BAA itself, and it does not change the regulatory status of the vendor.
We have a BAA with our cloud vendor. Are we covered if they get breached?
A BAA allocates responsibility — the vendor is required to notify you of breaches and may bear costs under the contract — but it does not shrink your reportable footprint. If the vendor exposes 25,000 of your patients' records, you have a 25,000-record breach for HIPAA notification purposes. You will notify the affected individuals within 60 days, you will notify HHS through the breach reporting portal, and depending on the state you may have additional state attorney general or media notifications to make. The BAA helps with the recovery; it does not prevent the obligation.
What about the conduit exception? Does that not cover cloud services that just process data briefly?
No, and this is a common misreading. The conduit exception is intentionally narrow. It covers entities like the US Postal Service, courier services, and ISPs that merely transmit information without accessing it except as a random or infrequent incident. The 2013 Omnibus rule preamble made clear that cloud service providers that store ePHI — even temporarily — are not conduits. A redaction service that receives a file, processes it, and returns a result is doing more than transmission, regardless of how briefly the file is held.
I am a solo physician or small practice. Does any of this apply at the same scale as a hospital?
HIPAA applies to covered entities of all sizes. The practical risk profile is different — OCR's largest settlements have been with large organizations because the breaches affected more people — but the rules are the same. Small practices benefit disproportionately from workflows that minimize the BAA-and-vendor-management overhead, which is one reason browser-based or local desktop tools are particularly well-suited to small-practice redaction needs. The compliance story is shorter and the dependency on outside vendors is smaller.
How do I verify that a "browser-based" tool actually keeps the document in the browser?
Open the browser's developer tools, switch to the Network tab, and use the tool with a non-sensitive test document. Watch what gets sent. A genuinely client-side tool will show no upload of the document content — only requests for the application code, fonts, and any analytics endpoints the tool uses. If you see a POST request with the file in the body, the document is leaving the browser. Any reputable client-side tool will have published documentation explaining this and will be willing to walk a privacy officer through the verification.
What about the dates issue specifically? If I leave the year on a clinical note, am I OK for de-identification?
For Safe Harbor de-identification, year is acceptable. Anything more granular than year — month, day, specific date of admission, specific date of procedure — is PHI when associated with an individual. Note that ages over 89 are also restricted; under Safe Harbor they can only be expressed as a single category of "90 or older," because in a small population the very elderly are easier to re-identify. If your use case requires real dates, you are typically operating outside Safe Harbor and need either a valid authorization, a treatment-payment-operations purpose, or an expert determination under §164.514(b)(1) that the re-identification risk is very small.
We use Acrobat Pro for everything and it has worked fine for years. Why change?
If your team consistently completes the mark-apply-sanitize sequence and your endpoint security is good, Acrobat is a defensible workflow. The reason to look at alternatives is usually one of two things: the failure pattern of staff who skip the sanitize step has bitten the organization at least once, or the volume of redaction has grown to the point where the procedural overhead is creating bottlenecks. Tools that fail closed by default — where the only export option is one that has actually removed the data — eliminate the most common Acrobat failure mode by design.
Are there special rules for substance use disorder, mental health, or HIV records?
Yes, and they are stricter than baseline HIPAA. 42 CFR Part 2 governs substance use disorder records held by federally assisted programs and requires specific patient consent for many disclosures that HIPAA would otherwise allow. State laws often add enhanced protections for mental health records, HIV status, and genetic information. The practical implication for redaction is that the documents most in need of careful handling are also the ones where the regulatory penalties for mishandling are highest, which strengthens the case for keeping those documents on the device when there is a workable client-side tool available.
RedactVault
Redact PHI without leaving your browser
RedactVault processes medical documents entirely client-side. The file does not travel to our servers, which keeps the BAA and vendor diligence story for everyday clinical redaction short.
See how it works
Related articles
The Best Way to Redact Financial Documents Without Uploading Them to the Cloud
A 16-digit card number is not 16 random digits. The first six to eight identify the issuer. The last digit is a checksum determined by the others. Visual redaction that leaves "just the last four" or "just the middle" exposed is doing far less work than it looks like, and the regulators that care about this know the math.
The Best Way to Redact Legal Documents Without Uploading Them to the Cloud
"Is the cloud safe" is the wrong question for legal work. The question that matters is whether the document needs to leave your device at all, and what changes about your duties under the rules of professional conduct if it does.
Why Drawing Black Boxes Over a PDF Is Not Real Redaction
A black rectangle may hide text from view, but it often does not remove the underlying data. Here is why that matters, how fake redaction fails, and what a proper PDF redaction workflow looks like.