- Personally identifiable information (PII) is any data that can identify a specific individual, either on its own or when combined with other data points.
- PII divides into direct identifiers (Social Security numbers, passport numbers) that identify someone alone, and indirect identifiers (birth dates, ZIP codes) that become identifying in combination.
- Sensitive PII carries a higher risk of harm if exposed and requires stronger controls than non-sensitive PII, but context determines which category applies in practice.
- Regulations and standards including GDPR, HIPAA, CCPA, and PCI DSS impose distinct requirements for how organizations collect, store, and disclose PII.
- Effective PII protection requires data discovery, classification, access controls, data loss prevention (DLP) monitoring, and a tested incident response plan.
What Is Personally Identifiable Information (PII)?
Personally identifiable information (PII) is any data that can be used to identify, locate, or contact a specific individual, either on its own or when combined with other information. PII includes obvious identifiers like Social Security numbers and full names as well as indirect data points, such as birth dates or location records, that become identifying when paired with other details. The term is foundational to both cybersecurity practice and data privacy law globally.
The U.S. Office of Management and Budget (OMB) defines PII as information that can distinguish or trace an individual's identity, alone or in combination with other linked or linkable data. The European Union's General Data Protection Regulation (GDPR) uses the broader phrase "personal data," which covers any information relating to an identified or identifiable natural person, including political opinions, physical characteristics, and online identifiers.
For security practitioners, understanding what constitutes PII matters because PII is among the most targeted categories of data in breaches, fraud campaigns, and insider incidents. When PII is exposed, the fallout ranges from regulatory fines and litigation for the organization to identity theft with lasting effects on real people.
Direct vs. Indirect PII Identifiers
PII falls into two structural categories based on how reliably a single data point can establish someone's identity.
Direct identifiers are unique to one individual and can confirm identity on their own. A Social Security number (SSN) is the canonical example: government agencies and financial institutions use it as a primary identity key, so an exposed SSN can by itself give a threat actor a path into sensitive records and accounts. Other direct identifiers include:
- Passport numbers
- Driver's license numbers
- Biometric data (fingerprints, retinal scans, facial recognition templates)
- Other government-issued ID numbers
Indirect identifiers are not unique by themselves but become identifying when combined. A ZIP code, a date of birth, and a gender marker are each individually non-identifying, yet research by Latanya Sweeney using 1990 U.S. Census data estimated that approximately 87% of U.S. residents can be uniquely identified from those three data points alone. Other common indirect identifiers include full name, email address, phone number, IP address, race or ethnicity, and employment information.
The distinction matters for risk prioritization. Security programs typically apply stricter controls to direct identifiers while treating indirect identifiers as lower risk. That logic breaks down when indirect identifiers are stored together in ways that enable re-identification, or when attackers can cross-reference them with publicly available datasets.
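The re-identification risk described above can be estimated by counting how many records share each combination of indirect identifiers. The following is a minimal sketch, assuming records are dictionaries with the hypothetical field names `zip`, `dob`, and `gender`; it reports the share of records that are unique on that combination and is not a full anonymization assessment.

```python
from collections import Counter

# Hypothetical field names; adjust to the dataset being assessed.
QUASI_IDENTIFIERS = ("zip", "dob", "gender")

def unique_fraction(records, keys=QUASI_IDENTIFIERS):
    """Return the fraction of records whose combination of `keys` is unique."""
    if not records:
        return 0.0
    # Count how many records share each combination of indirect identifiers.
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)

# Made-up sample data for illustration.
sample = [
    {"zip": "02139", "dob": "1985-03-14", "gender": "F"},
    {"zip": "02139", "dob": "1985-03-14", "gender": "F"},
    {"zip": "02139", "dob": "1990-07-02", "gender": "M"},
    {"zip": "94105", "dob": "1978-11-30", "gender": "F"},
]
print(unique_fraction(sample))  # 0.5
```

A high fraction suggests that the indirect identifiers, stored together, behave much like a direct identifier and may warrant the stricter controls.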
Sensitive vs. Non-Sensitive PII
Data privacy regulations and security frameworks further categorize PII by the level of harm its exposure could cause.
Sensitive PII
Sensitive PII is information that directly identifies an individual and carries significant risk of harm if disclosed or stolen. Most data privacy laws require organizations to protect sensitive PII through encryption, access controls, and documented retention policies. Examples include:
- Social Security numbers and national ID numbers
- Biometric data (e.g. fingerprints, retinal scans, DNA profiles)
- Medical records and health insurance identifiers
- Financial account numbers and credit card data
- Passport and driver's license numbers
Sensitive PII is typically not publicly available, and its exposure can directly enable identity theft, account takeover, or financial fraud.
Non-sensitive PII
Non-sensitive PII can identify an individual but in isolation poses a lower risk of direct harm. It is often publicly available. Examples include full name, phone number, email address, date of birth, ZIP code, social media handles, and mailing address.
The boundary between sensitive and non-sensitive PII is not fixed. Context determines classification. A full name is non-sensitive on its own, but a list of names associated with a specific medical condition or religious affiliation is highly sensitive. Similarly, a phone number may appear in a public directory, but a database of phone numbers linked to two-factor authentication credentials is sensitive PII.
Personally identifiable financial information
A specific subset is personally identifiable financial information (PIFI), defined broadly as any PII used in connection with financial products or services. The Gramm-Leach-Bliley Act (GLBA) in the U.S. governs how financial institutions must handle this category, which includes account numbers, transaction histories, credit scores, and other information collected in the course of providing a financial service.
Common Examples of PII Data
The table below maps common data types to their PII category and typical sensitivity level. Organizations use this kind of taxonomy as the foundation for data governance and classification programs.
Location data deserves specific attention. GPS coordinates captured by mobile apps or IoT devices may carry no name but can be matched with property records and public databases to identify an individual's home address and daily routine. The Federal Trade Commission (FTC) has taken enforcement action against data brokers selling this type of information on the grounds that it qualifies as PII under federal standards.
Why PII Matters for Data Security and Compliance
PII is a primary target of many of the most costly and disruptive cyber attacks. Identity theft, account takeover, business email compromise, and ransomware campaigns frequently depend on acquiring PII to succeed. Understanding the stakes is the starting point for any PII security program.
Regulatory and legal exposure
Most jurisdictions now impose legal obligations to safeguard PII and notify affected individuals when it is compromised. GDPR fines can reach 4% of a company's annual global revenue or 20 million euros, whichever is greater. In one of the most high-profile cases, Luxembourg's data protection authority fined Amazon 746 million euros (roughly 888 million dollars at the time) for GDPR violations in 2021. HIPAA civil penalties in the U.S. can accumulate to millions of dollars per violation category per year, and criminal charges are possible for knowingly obtaining or disclosing protected health information.
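The GDPR ceiling described above is simply the larger of two amounts. As a rough sketch of that arithmetic (the function name is hypothetical, and the result is only the upper bound stated above, not a prediction of what a regulator would actually impose):

```python
# Upper-tier GDPR cap as stated above: 4% of annual global revenue
# or 20 million euros, whichever is greater.
FIXED_CAP_EUR = 20_000_000
REVENUE_SHARE_PERCENT = 4

def gdpr_max_fine(annual_global_revenue_eur):
    """Return the maximum possible fine in euros for a given annual revenue."""
    revenue_based = annual_global_revenue_eur * REVENUE_SHARE_PERCENT / 100
    return max(revenue_based, FIXED_CAP_EUR)

print(gdpr_max_fine(100_000_000))    # 20000000 (fixed cap applies)
print(gdpr_max_fine(1_000_000_000))  # 40000000.0 (4% of revenue applies)
```

The fixed amount dominates for companies with less than 500 million euros in annual revenue; above that, the percentage-based figure sets the ceiling.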
Operational and financial costs
A PII data breach triggers incident response costs, legal fees, credit monitoring obligations for affected individuals, and potential civil litigation. Ransomware attacks that target PII carry an average breach cost of over 5.6 million dollars per incident, according to IBM's 2024 Cost of a Data Breach report. Beyond immediate response costs, organizations face increased cyber insurance premiums and long-term customer churn.
The distributed PII problem
The scale of the challenge grows as PII accumulates across more systems. Customer records in CRM platforms, employee data in HR systems, patient records in clinical applications, and transaction logs in financial software create distributed exposure that no single control can address. Without visibility into where PII lives and how it moves, organizations cannot protect it adequately.
PII Data Privacy Laws and Regulatory Requirements
PII is governed by an overlapping set of data compliance regulations that vary by jurisdiction, industry, and data type. Security and legal teams must map their specific PII holdings to the frameworks that apply.
Research indicates that organizations consistently struggle with compliance. According to ESG, 66% of companies that have undergone data privacy audits in the last three years have failed at least once, and 23% have failed three or more times. The core challenge is not understanding what regulations require but mapping those requirements to where PII actually exists across distributed systems.
Approximately 75% of countries now have data privacy laws governing PII, according to McKinsey, but these frameworks often impose conflicting requirements. The rise of cloud computing and remote work compounds this: PII may be collected in one jurisdiction, stored in another, and processed in a third, each potentially governed by different rules.
PII Security Controls: How to Protect PII
Protecting PII requires a layered approach that covers discovery, classification, access, monitoring, and response. The NIST Cybersecurity Framework provides a widely adopted structure for PII protection programs. The following steps reflect that approach.
1. Discover and inventory PII
Organizations cannot protect what they cannot find. Data security posture management (DSPM) tools and automated classification systems identify where PII exists across databases, cloud storage, endpoint devices, email, and collaboration platforms. This inventory is the prerequisite for every subsequent control.
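To make the discovery step concrete, the sketch below scans text for two PII patterns with regular expressions. It is an illustration only: the pattern set and the `find_pii` helper are assumptions, and production discovery tools rely on much richer detection than two regexes.

```python
import re

# Illustrative patterns only; not an exhaustive or production-grade set.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
}

def find_pii(text):
    """Return a dict mapping each PII type to the matches found in `text`."""
    hits = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[name] = matches
    return hits

doc = "Contact jane.doe@example.com, SSN 123-45-6789."
print(find_pii(doc))
# {'ssn': ['123-45-6789'], 'email': ['jane.doe@example.com']}
```

Running a function like this over file contents or database exports gives a first, coarse inventory that later classification can refine.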
2. Classify PII by sensitivity
Map discovered data against sensitivity categories and apply proportionate safeguards. Applying maximum controls uniformly across all PII creates operational friction without a proportional security benefit. Sensitive PII warrants stricter protection than non-sensitive PII, and classification policies should reflect that distinction.
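One way to express proportionate safeguards is a lookup from data type to sensitivity tier to required controls. The sketch below assumes a hypothetical taxonomy and placeholder control names; the `linked_to_sensitive_context` flag stands in for cases where context raises sensitivity, such as a name stored alongside a medical condition.

```python
# Hypothetical taxonomy for illustration; real programs define their own.
SENSITIVE_TYPES = {"ssn", "biometric", "medical_record", "financial_account"}

# Assumed control sets per tier; the specific controls are placeholders.
CONTROLS = {
    "sensitive": ["encryption", "restricted access", "documented retention policy"],
    "non_sensitive": ["standard access controls"],
}

def classify(data_type, linked_to_sensitive_context=False):
    """Return 'sensitive' or 'non_sensitive' for a data type."""
    if data_type in SENSITIVE_TYPES or linked_to_sensitive_context:
        return "sensitive"
    return "non_sensitive"

def required_controls(data_type, linked_to_sensitive_context=False):
    """Look up the controls for the tier a data type falls into."""
    return CONTROLS[classify(data_type, linked_to_sensitive_context)]

print(classify("ssn"))                                         # sensitive
print(classify("full_name"))                                   # non_sensitive
print(classify("full_name", linked_to_sensitive_context=True)) # sensitive
```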
3. Minimize collection and retention
Collect only the PII necessary for specific, documented business purposes. Establish retention schedules and delete PII that is no longer needed. Data minimization reduces the exposure surface available to attackers and simplifies compliance with deletion rights under GDPR and CCPA.
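A retention schedule can be enforced with a periodic job that flags records older than the limit for their category. The following is a minimal sketch; the category names, retention periods, and record layout are all hypothetical.

```python
from datetime import date, timedelta

# Hypothetical retention periods in days, keyed by record category.
RETENTION_DAYS = {"support_ticket": 365, "job_applicant": 180}

def expired_records(records, today):
    """Return records whose age exceeds the retention period for their category."""
    expired = []
    for record in records:
        limit = timedelta(days=RETENTION_DAYS[record["category"]])
        if today - record["created"] > limit:
            expired.append(record)
    return expired

records = [
    {"id": 1, "category": "support_ticket", "created": date(2023, 6, 1)},
    {"id": 2, "category": "job_applicant", "created": date(2024, 10, 1)},
]
to_delete = expired_records(records, today=date(2025, 1, 1))
print([r["id"] for r in to_delete])  # [1]
```

Passing `today` explicitly keeps the check deterministic and easy to test; a scheduled job would pass `date.today()`.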
4. Encrypt PII at rest and in transit
Encryption renders PII unreadable to unauthorized parties even when systems are breached. Apply encryption to databases, cloud storage, email, and endpoint devices that hold PII. For systems handling financial or health data, encryption is typically a regulatory requirement, not an optional control.
5. Enforce access controls
Implement role-based access control (RBAC) and the principle of least privilege so employees access only the PII their job functions require. Multi-factor authentication (MFA) adds a further barrier for systems storing sensitive PII and is explicitly required under several major compliance frameworks.
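The least-privilege idea above can be sketched as a mapping from roles to the PII fields each role may read, with everything else redacted. The role names, field names, and helper functions below are hypothetical; a real deployment would enforce this in the identity provider or data layer rather than in application code alone.

```python
# Hypothetical role-to-field permissions; unknown roles get no access.
ROLE_PERMISSIONS = {
    "hr_specialist": {"name", "email", "ssn"},
    "support_agent": {"name", "email"},
}

def can_access(role, field):
    """Return True if `role` is permitted to read `field`."""
    return field in ROLE_PERMISSIONS.get(role, set())

def redact(record, role):
    """Return a copy of `record` with fields the role may not read masked."""
    return {
        field: (value if can_access(role, field) else "[REDACTED]")
        for field, value in record.items()
    }

employee = {"name": "Jane Doe", "email": "jane.doe@example.com", "ssn": "123-45-6789"}
print(redact(employee, "support_agent"))
# {'name': 'Jane Doe', 'email': 'jane.doe@example.com', 'ssn': '[REDACTED]'}
```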
6. Monitor for unauthorized data movement
Data loss prevention (DLP) tools track how PII moves within and outside the organization, flagging anomalous transfers to unauthorized destinations, personal cloud storage, or external email addresses. Monitoring PII movement in real time is the control layer most likely to catch both insider mishandling and external exfiltration attempts before they complete.
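As a simplified illustration of this kind of monitoring, the sketch below flags an outbound message when the recipient is outside an approved domain and the body contains a number that passes the Luhn checksum used by payment cards. The allowlist, the regex, and the `should_flag` helper are assumptions; this is plain pattern matching, not a description of how any particular DLP product works.

```python
import re

ALLOWED_DOMAINS = {"example.com"}  # hypothetical corporate domain
# 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")

def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def should_flag(recipient, body):
    """Flag messages to non-approved domains that contain a likely card number."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    if domain in ALLOWED_DOMAINS:
        return False
    return any(luhn_valid(match) for match in CARD_CANDIDATE.findall(body))

print(should_flag("me@gmail.com", "card 4111 1111 1111 1111"))    # True
print(should_flag("me@example.com", "card 4111 1111 1111 1111"))  # False
```

The checksum step cuts down false positives from order numbers and other long digit strings that merely look like card numbers.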
7. Train employees on PII handling
Human error causes a significant proportion of PII exposures. Training programs covering phishing awareness, proper data handling procedures, and social engineering recognition reduce this risk vector. Employees who handle PII routinely, including HR, finance, and customer service teams, warrant role-specific guidance beyond general security awareness.
8. Plan and test incident response
Establish a breach response plan before an incident occurs. Under GDPR, organizations have 72 hours to notify supervisory authorities following discovery of a qualifying breach. U.S. state laws have their own notification timelines. Test the plan through tabletop exercises annually and update it when regulations or systems change.
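The 72-hour GDPR window mentioned above is easy to track programmatically once the time of discovery is recorded. The following is a small sketch with hypothetical function names; it covers only the GDPR window, since U.S. state timelines differ.

```python
from datetime import datetime, timedelta, timezone

GDPR_NOTIFICATION_WINDOW = timedelta(hours=72)

def notification_deadline(aware_at):
    """Return the latest time to notify the supervisory authority."""
    return aware_at + GDPR_NOTIFICATION_WINDOW

def hours_remaining(aware_at, now):
    """Return hours left before the deadline (negative if already missed)."""
    return (notification_deadline(aware_at) - now).total_seconds() / 3600

aware_at = datetime(2025, 3, 1, 9, 0, tzinfo=timezone.utc)
print(notification_deadline(aware_at))  # 2025-03-04 09:00:00+00:00
print(hours_remaining(aware_at, aware_at + timedelta(hours=24)))  # 48.0
```

Using timezone-aware timestamps avoids ambiguity when the incident team and the regulator sit in different time zones.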
How Cyberhaven Protects PII
Cyberhaven addresses PII protection through a unified data security platform that combines data discovery, behavioral monitoring, and policy enforcement to give security teams full visibility into where PII exists and how it moves.
Cyberhaven's DSPM capability continuously scans structured and unstructured data across cloud environments, endpoint devices, and SaaS applications to identify and classify PII at scale. Rather than requiring manual tagging or periodic audits, it builds a live inventory of sensitive data including names, SSNs, financial records, and health information across the organization's entire data estate.
On top of that discovery layer, Cyberhaven's DLP tracks the actual movement of PII in real time. Cyberhaven traces data back to its origin, following PII as it moves through editing, sharing, uploading, and exfiltration channels. This lineage-based approach catches PII leaving the organization even when it has been renamed, reformatted, or embedded in a new file, cases that pattern-based DLP tools often miss.
When a user attempts to send PII to an unauthorized destination, whether a personal email account, a cloud storage service, or a generative AI tool, Cyberhaven can block, prompt, or alert based on policy, giving security teams the context to respond proportionately rather than blocking legitimate workflows.
Frequently Asked Questions
What does PII mean?
PII stands for personally identifiable information. It refers to any data that can be used to identify a specific individual, either on its own or when combined with other information. Examples include Social Security numbers, full names, email addresses, and biometric records.
What is considered PII?
Any information that can identify, locate, or contact a specific person qualifies as PII. This includes direct identifiers like passport numbers and SSNs, which identify someone on their own, and indirect identifiers like dates of birth or IP addresses, which can identify someone when combined with other data points.
What is the difference between PII and sensitive data?
Sensitive data is a broader category that includes PII but also covers other confidential information, such as intellectual property, trade secrets, and business-critical data that does not relate to individuals. PII specifically refers to data that can identify a natural person. Within PII, sensitive PII is information whose disclosure could cause significant harm, such as financial account numbers or medical records, while non-sensitive PII carries a lower immediate risk.
What are the main types of PII?
PII types fall into two main frameworks. By identification strength: direct identifiers (SSNs, biometrics, passport numbers) identify someone alone, while indirect identifiers (name, birth date, ZIP code) become identifying in combination. By sensitivity: sensitive PII (financial data, health records, biometrics) requires stronger controls, while non-sensitive PII (name, email, employer) poses lower standalone risk but can still be combined to re-identify individuals.
What are the most common PII security controls?
The most widely used PII security controls include data discovery and classification to find where PII exists, encryption to protect it at rest and in transit, role-based access controls to limit who can access it, DLP tools to monitor and block unauthorized movement, employee training to reduce human error, and incident response plans to contain and disclose breaches within regulatory timelines.
How is PII regulated?
PII is regulated by a patchwork of laws and standards that vary by jurisdiction and industry. The GDPR covers the personal data of individuals in the EU. In the U.S., HIPAA governs health-related PII and GLBA covers financial PII, while PCI DSS, an industry standard rather than a law, applies to payment card data. California's CCPA and CPRA give residents rights over how their PII is collected and used. Most other major economies have enacted national privacy laws with similar frameworks.

