PHI vs PII: Understanding the Key Differences for Compliance

February 9, 2026

[Figure: PHI vs PII comparison diagram showing the relationship between Protected Health Information and Personally Identifiable Information]

PHI (Protected Health Information) and PII (Personally Identifiable Information) are often confused, even by compliance professionals. Both involve sensitive personal data, but they fall under different regulations with distinct requirements. Confusing them can lead to HIPAA fines of up to $1.5 million per year per violation category, or to failed audits that could have been avoided.

This guide clarifies the PHI vs PII distinction, explains the 18 HIPAA identifiers, shows when these categories overlap, and provides practical guidance for redacting PHI from images and documents.

PII Defined: Scope and Examples

Personally Identifiable Information (PII) is any data that can identify an individual, either alone or when combined with other information. This broad definition encompasses many categories of data.

Key Characteristics of PII

  • Direct identifiers: Data that uniquely identifies someone alone (SSN, driver's license, email)
  • Indirect identifiers: Data that identifies when combined (ZIP code + DOB + gender)
  • Governed by multiple laws: GDPR, CCPA, FERPA, and various state regulations
  • Context-dependent: Some data becomes PII only in certain contexts
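The difference between direct and indirect identifiers can be illustrated with a small uniqueness count. The sketch below uses made-up records and hypothetical field names (`zip`, `dob`, `gender`); it only shows how a combination of fields can single out people that no single field does.

```python
from collections import Counter

# Hypothetical toy records; the field names and values are illustrative assumptions.
RECORDS = [
    {"zip": "02139", "dob": "1985-03-12", "gender": "F"},
    {"zip": "02139", "dob": "1985-03-12", "gender": "M"},
    {"zip": "02139", "dob": "1990-07-01", "gender": "F"},
    {"zip": "02140", "dob": "1990-07-01", "gender": "F"},
]


def unique_fraction(records, fields):
    """Return the share of records whose values for `fields` occur exactly once."""
    if not records:
        return 0.0
    keys = [tuple(record[field] for field in fields) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)


# ZIP code alone singles out few records; the combination singles out all of them.
zip_only = unique_fraction(RECORDS, ["zip"])
combined = unique_fraction(RECORDS, ["zip", "dob", "gender"])
print(f"unique by ZIP: {zip_only:.0%}, unique by ZIP + DOB + gender: {combined:.0%}")
```

In this toy dataset only one of four records is unique by ZIP code, while every record is unique once date of birth and gender are added.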

Common PII Examples

  • Full name
  • Social Security number
  • Email address
  • Phone number
  • Home address
  • Date of birth
  • Driver's license
  • Passport number
  • Financial accounts
  • Biometric data
  • IP address
  • License plate

For a comprehensive list of 50+ PII types, see our detailed guide: PII Examples: 50+ Types of Personal Identifiable Information

PHI Defined: HIPAA-Specific Rules

Protected Health Information (PHI) is individually identifiable health information created, received, maintained, or transmitted by a HIPAA covered entity or business associate.

What Makes Data "Protected"

For information to qualify as PHI under HIPAA, it must meet three criteria:

  1. Health information: Relates to past, present, or future physical/mental health, healthcare provision, or payment for healthcare
  2. Individually identifiable: Can identify or reasonably be used to identify the individual
  3. Healthcare context: Created or received by a covered entity (healthcare provider, health plan, or healthcare clearinghouse) or by a business associate acting on its behalf
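The three criteria can be read as a simple conjunction: all of them must hold. The following is a minimal sketch of that logic, not a legal test. The `Record` type, its field names, and the holder categories are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Holder categories that fall under HIPAA, following the definitions in this guide.
# The string labels themselves are illustrative assumptions.
HIPAA_HOLDERS = {
    "healthcare_provider",
    "health_plan",
    "healthcare_clearinghouse",
    "business_associate",
}


@dataclass
class Record:
    has_health_info: bool        # criterion 1: relates to health, care, or payment
    identifies_individual: bool  # criterion 2: can identify the individual
    holder_type: str             # criterion 3: who created or received the data


def is_phi(record: Record) -> bool:
    """Return True only when all three criteria are met."""
    return (
        record.has_health_info
        and record.identifies_individual
        and record.holder_type in HIPAA_HOLDERS
    )


lab_result = Record(True, True, "healthcare_provider")
purchase_history = Record(False, True, "retailer")
fitness_app_data = Record(True, True, "consumer_app")

print(is_phi(lab_result), is_phi(purchase_history), is_phi(fitness_app_data))
```

The fitness app case shows why the third criterion matters: identifiable health data held outside a covered entity or business associate is still PII, but it is not PHI under HIPAA.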

Who Must Comply with HIPAA

  • Covered entities: Healthcare providers, health plans, healthcare clearinghouses
  • Business associates: Third parties that handle PHI on behalf of covered entities (IT vendors, billing companies, cloud storage providers)

Important: If you're not a covered entity or business associate, HIPAA doesn't apply to you — even if you handle health-related data. However, other laws like GDPR or state privacy laws may still apply.

PHI vs PII: Side-by-Side Comparison

Table: PHI vs PII by definition, governing law, scope, penalties, and who must comply

| Aspect | PII | PHI |
| --- | --- | --- |
| Stands for | Personally Identifiable Information | Protected Health Information |
| Definition | Any data that can identify an individual | Health data linked to an identifiable individual |
| Governing law | GDPR, CCPA, FERPA, state laws | HIPAA (US healthcare) |
| Scope | Broad: any identifying data | Narrow: healthcare context required |
| Who complies | Most organizations handling personal data | HIPAA covered entities and business associates |
| Maximum penalty | €20M or 4% of global annual revenue, whichever is higher (GDPR) | $1.5M per year per violation category, plus criminal penalties |
| Relationship | Broader category; includes non-health data | Subset of PII; always involves health context |

Key Insight

All PHI is PII, but not all PII is PHI. A patient's name combined with their diagnosis is PHI. The same name combined with their purchase history is PII but not PHI — there's no health information involved.

When PHI is Also PII: Overlap Scenarios

Understanding when these categories overlap is critical for compliance. Here are common scenarios where data qualifies as both PHI and PII:

Medical Records with Patient Names

Any medical record containing a patient identifier is simultaneously PHI (health info + identifier) and PII (the identifier itself). Examples:

  • Lab results with patient name and DOB
  • Prescription records with patient address
  • Diagnostic images with patient identifiers
  • Clinical notes referencing the patient

Insurance Information

Health insurance data combines multiple identifiers:

  • Insurance ID numbers (identifier + healthcare context = PHI)
  • Claims data showing treatments received
  • Explanation of Benefits (EOB) documents
  • Prior authorization records

Appointment Records

Even scheduling information can be PHI:

  • "John Smith, oncology appointment 3/15" — reveals health condition
  • "Patient #12345, psychiatry follow-up" — mental health indicator
  • Appointment reminders with clinic names that reveal conditions

The 18 HIPAA Identifiers Explained

HIPAA specifies 18 types of identifiers that must be removed to de-identify PHI under the Safe Harbor method. These are the elements that, when combined with health information, create protected data:

[Figure: the 18 HIPAA identifiers organized by category: names, dates, contact info, ID numbers, biometrics, and other unique identifiers]

| # | Identifier | Description / Notes |
| --- | --- | --- |
| 1 | Names | Full name, maiden name, aliases, initials |
| 2 | Geographic data | Any location smaller than a state: street, city, ZIP (first 3 digits OK if the area's population is over 20,000) |
| 3 | Dates (except year) | Birth date, admission date, discharge date, death date; ages over 89 must be aggregated |
| 4 | Phone numbers | All telephone numbers |
| 5 | Fax numbers | All fax numbers |
| 6 | Email addresses | Personal and work email |
| 7 | Social Security numbers | Full or partial SSN |
| 8 | Medical record numbers | Healthcare provider patient IDs |
| 9 | Health plan beneficiary numbers | Insurance member/subscriber IDs |
| 10 | Account numbers | Financial account numbers |
| 11 | Certificate/license numbers | Driver's license, professional licenses |
| 12 | Vehicle identifiers | License plates, VINs, serial numbers |
| 13 | Device identifiers | Medical device serial numbers, implant IDs |
| 14 | Web URLs | Personal URLs that could identify patients |
| 15 | IP addresses | Internet protocol addresses |
| 16 | Biometric identifiers | Fingerprints, voiceprints, retina scans |
| 17 | Full-face photographs | Full-face photographs and any comparable images showing identifiable features |
| 18 | Any other unique identifier | Any number, code, or characteristic that could identify an individual |
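Several of these identifiers are generalized rather than deleted outright: dates are reduced to the year, ZIP codes to their first three digits, and ages over 89 to a single category. The sketch below shows those three rules under stated assumptions. The function names are hypothetical, and `LOW_POPULATION_ZIP3` is a placeholder set; a real implementation would derive it from current Census data.

```python
from datetime import date

# Placeholder set of 3-digit ZIP prefixes covering 20,000 or fewer people.
# Assumption for illustration only; build the real list from current Census data.
LOW_POPULATION_ZIP3 = {"036", "059"}


def generalize_zip(zip_code: str) -> str:
    """Keep the first three digits, or use 000 for low-population areas."""
    prefix = zip_code[:3]
    return "000" if prefix in LOW_POPULATION_ZIP3 else prefix


def generalize_date(value: date) -> str:
    """Keep only the year of a date.

    Caveat: a year that reveals an age over 89 (such as a birth year)
    needs further suppression, which this sketch does not handle.
    """
    return str(value.year)


def generalize_age(age: int) -> str:
    """Aggregate all ages over 89 into a single category."""
    return "90+" if age > 89 else str(age)


print(generalize_zip("02139"), generalize_date(date(1985, 3, 12)), generalize_age(93))
```

Generalizing instead of deleting keeps some analytic value (region, year, age band) while removing the precision that makes a record identifying.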

Redaction Requirements for PHI in Images

HIPAA provides two methods for de-identifying PHI. For image redaction, the Safe Harbor method is most commonly used:

Safe Harbor Method

Remove or redact all 18 HIPAA identifiers, and ensure the covered entity has no actual knowledge that the remaining information could identify an individual.

  • Pros: Clear, prescriptive checklist; doesn't require statistical expertise
  • Cons: May remove more data than necessary; can limit research utility
  • Best for: Most healthcare organizations; standard compliance approach

Expert Determination Method

A qualified statistical expert determines that the risk of identifying any individual is "very small" and documents their methods and results.

  • Pros: Can retain more data for research; flexible approach
  • Cons: Requires expert analysis; more expensive; must be documented
  • Best for: Research institutions; complex datasets

How Healthcare Organizations Use Image Redaction

Healthcare organizations handle images containing PHI in many contexts. Here's where redaction is typically required:

Patient Intake Forms

Scanned intake forms contain multiple HIPAA identifiers: names, DOB, SSN, addresses, phone numbers. Before sharing for training, audits, or research, these must be de-identified.

Medical Imaging

X-rays, MRIs, CT scans, and other diagnostic images often contain patient identifiers burned into the image or embedded in metadata. For research publications or teaching materials, this information must be removed.
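For the metadata half of that problem, the idea can be sketched as filtering out identifying fields. This example assumes the metadata has already been read into a plain dictionary keyed by DICOM-style attribute names. The tag list is illustrative rather than a complete de-identification profile, and the code does nothing about identifiers burned into the pixels.

```python
# Illustrative subset of identifying attributes; not a complete profile.
IDENTIFYING_TAGS = {
    "PatientName",
    "PatientID",
    "PatientBirthDate",
    "PatientAddress",
    "AccessionNumber",
}


def scrub_metadata(metadata: dict) -> dict:
    """Return a copy of the metadata with identifying tags removed."""
    return {key: value for key, value in metadata.items() if key not in IDENTIFYING_TAGS}


# Hypothetical metadata for a single image.
original = {
    "PatientName": "DOE^JANE",
    "PatientID": "12345",
    "Modality": "MR",
    "Rows": 512,
}
print(scrub_metadata(original))
```

Returning a new dictionary leaves the original untouched, which makes it easier to compare before and after during a compliance review.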

Insurance Documents

EOBs, claims forms, and authorization documents contain beneficiary numbers, treatment codes, and provider information. These require redaction before any use outside treatment, payment, or healthcare operations.

Research Publications

Clinical images included in papers, presentations, or case studies must have all patient identifiers removed. This includes visible text in images, patient faces, and any unique characteristics that could identify the individual.

Redacting PHI in Medical Images with PixBlur AI

PixBlur's AI-powered redaction automatically detects PHI in medical images, helping healthcare organizations achieve HIPAA compliance efficiently:

What PixBlur AI Detects

  • Personal names — First, last, and full names in any document context
  • Dates of birth — DOB in various formats
  • Phone numbers — Mobile, landline, fax numbers
  • Email addresses — Personal and work email
  • Physical addresses — Street addresses, cities, ZIP codes
  • ID numbers / SSNs — Social Security numbers, insurance IDs, account numbers
  • Credit card numbers — Full or partial card numbers
  • License plates — Vehicle registration numbers
  • Medical/financial information — Sensitive data in healthcare and financial contexts
  • Faces — Automatic face detection with >98% accuracy
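Some of the text categories above follow predictable formats, which is why pattern matching is a common first pass on text extracted from an image. The sketch below is not PixBlur's detection method. It is a minimal regex illustration for US-style SSNs, phone numbers, and email addresses, and it assumes OCR has already produced the text. Names and free-form addresses cannot be found this way.

```python
import re

# Simplified patterns for illustration; real documents need broader coverage.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "phone": re.compile(r"\(?\b\d{3}\)?[-. ]\d{3}[-. ]\d{4}\b"),
}


def find_identifiers(text: str) -> list:
    """Return (label, matched text) pairs for every pattern hit."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits


def redact(text: str, mask: str = "[REDACTED]") -> str:
    """Replace every pattern hit with a fixed mask."""
    for pattern in PATTERNS.values():
        text = pattern.sub(mask, text)
    return text


sample = "Patient SSN 123-45-6789, call (555) 123-4567 or jane.doe@example.org."
print(find_identifiers(sample))
print(redact(sample))
```

Pattern matching misses anything that does not fit a fixed format, so it complements rather than replaces a manual review step.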

Multi-language support: PixBlur AI supports 100+ languages, making it suitable for international healthcare organizations processing documents in multiple languages.

HIPAA-Compliant Workflow

  1. Upload medical images — JPEG, PNG, WebP (max 30 MB), or use PDF Converter for scanned documents
  2. AI scans for PHI — Automatically detects faces, names, phone numbers, addresses, ID numbers, and other sensitive text
  3. Review before export — Critical for compliance: verify all PHI is masked, add manual redactions for any missed items
  4. Choose redaction style — Select from blur, pixelate, emoji overlay, or solid color for faces
  5. Export de-identified image — EXIF metadata automatically removed; download in original quality
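The review step in this workflow amounts to checking that every detected region is covered by a mask before export. The sketch below illustrates that check with hypothetical bounding boxes. `Box`, `covers`, and `unmasked` are names invented for this example and do not represent PixBlur's API.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle in pixel coordinates."""
    left: int
    top: int
    right: int
    bottom: int


def covers(mask: Box, target: Box) -> bool:
    """Return True when the mask fully contains the target region."""
    return (
        mask.left <= target.left
        and mask.top <= target.top
        and mask.right >= target.right
        and mask.bottom >= target.bottom
    )


def unmasked(detections: list, masks: list) -> list:
    """Return detections that no single mask fully covers."""
    return [d for d in detections if not any(covers(m, d) for m in masks)]


detections = [Box(10, 10, 110, 30), Box(10, 50, 90, 70)]
masks = [Box(5, 5, 120, 35)]
print(unmasked(detections, masks))
```

The check is deliberately conservative: a region covered only by the union of several masks is still reported, so a reviewer looks at it again rather than assuming it is safe.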

Privacy note: Images are processed temporarily and never stored on our servers. The original image data is discarded immediately after processing.

Try the Manual & AI Editor → for single-image redaction with full manual control.

Batch Processing for Large Datasets

For healthcare organizations processing hundreds or thousands of images, PixBlur's batch mode processes up to 10 images per batch with AI detection. Each image can be reviewed and edited individually before downloading the batch as a ZIP file.

Cost: 1 credit per image for AI redaction. Manual editor and PDF Converter are completely free.
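For planning purposes, the batch size and per-image cost translate into simple arithmetic. The sketch below uses the figures stated above (10 images per batch, 1 credit per image); the function names are hypothetical and the code does not call any PixBlur service.

```python
import math

BATCH_SIZE = 10        # images per batch, as stated above
CREDITS_PER_IMAGE = 1  # AI redaction cost per image, as stated above


def plan_batches(image_count: int) -> dict:
    """Return the number of batches and credits needed for a dataset."""
    if image_count < 0:
        raise ValueError("image_count must be non-negative")
    return {
        "batches": math.ceil(image_count / BATCH_SIZE),
        "credits": image_count * CREDITS_PER_IMAGE,
    }


def chunk(items: list, size: int = BATCH_SIZE) -> list:
    """Split a list of file names into consecutive batches."""
    return [items[i:i + size] for i in range(0, len(items), size)]


print(plan_batches(1234))
print([len(batch) for batch in chunk([f"scan_{n}.png" for n in range(25)])])
```

A dataset of 1,234 images would therefore take 124 batches and 1,234 credits.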

Try Batch Processing → for bulk medical image de-identification (desktop only).

Frequently Asked Questions

What is the difference between PHI and PII?

PII (Personally Identifiable Information) is any data that can identify an individual. PHI (Protected Health Information) is health information linked to an individual under HIPAA. The key difference: PHI requires a healthcare context. All PHI is PII, but not all PII is PHI.

What are the 18 HIPAA identifiers?

The 18 HIPAA identifiers are: names, geographic data smaller than a state, dates (except year), phone numbers, fax numbers, email addresses, Social Security numbers, medical record numbers, health plan beneficiary numbers, account numbers, certificate/license numbers, vehicle identifiers, device identifiers, URLs, IP addresses, biometric identifiers, full-face photographs and comparable images, and any other unique identifying number, characteristic, or code.

Is a patient name alone considered PHI?

A patient name alone is PII, but it becomes PHI when linked to health information. For example, "John Smith" on its own is PII. "John Smith has diabetes," as recorded by a covered entity, is PHI because it combines an identifier with health data in a healthcare context.

What is the Safe Harbor method for de-identification?

The Safe Harbor method is a HIPAA de-identification standard that requires removing all 18 specified identifiers from health information, with the covered entity having no actual knowledge that the remaining information could identify an individual. Data de-identified this way is no longer considered PHI and falls outside HIPAA regulations. This is the most commonly used de-identification approach.

What are the penalties for PHI violations?

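HIPAA civil penalties range from $100 to $50,000 per violation under the statutory tiers, with annual maximums of up to $1.5 million per violation category; HHS adjusts these amounts for inflation. Criminal penalties can include fines of up to $250,000 and imprisonment of up to 10 years for offenses committed with intent to sell or use PHI for commercial advantage, personal gain, or malicious harm.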

Why PixBlur for PHI Redaction?

  • AI-Powered Detection — Automatically finds faces, names, phone numbers, email addresses, physical addresses, ID numbers, and other sensitive text with >98% accuracy
  • Review Before Export — Critical for compliance: AI results are editable; verify and adjust masks before downloading
  • 4 Redaction Styles — Choose blur, pixelate, emoji overlay, or solid color for face and text masks
  • Batch Processing — Redact up to 10 images per batch for large datasets; download as ZIP
  • 100+ Languages — Multi-language OCR support for international healthcare documents
  • Images Never Stored — Processed temporarily in memory; automatically discarded after redaction
  • Privacy-First Manual Mode — 100% local processing, no uploads, completely free
  • Free PDF Converter — Convert scanned documents to images for redaction, then convert back. 100% in browser.
  • EXIF Metadata Removed — GPS, device info, timestamps automatically stripped from exports
Try PixBlur Free

Manual editor requires no login. AI features give new users 5 free credits to try. Batch processing and PDF Converter require desktop browser.
