Microsoft Purview Information Protection: The Enterprise Guide to Sensitivity Labels, Data Classification, and DLP
Data breaches cost enterprises an average of $4.88 million in 2024, and 82% involve data that was improperly classified or unprotected. Microsoft Purview Information Protection gives organizations the ability to discover, classify, label, and protect sensitive data across their entire digital estate. This guide covers enterprise deployment strategies, sensitivity label taxonomies, auto-labeling configurations, DLP policies, and compliance mappings for HIPAA, SOC 2, and GDPR, based on 500+ deployments by EPC Group.
What Is Microsoft Purview Information Protection?
Microsoft Purview Information Protection is the unified data protection platform within the Microsoft 365 ecosystem. It replaces what was previously known as Azure Information Protection (AIP), Microsoft Information Protection (MIP), and Office 365 DLP — consolidating them into a single compliance framework managed through the Microsoft Purview compliance portal.
The platform operates on a three-step model: Know your data (discover and classify sensitive information), Protect your data (apply sensitivity labels with encryption and access controls), and Prevent data loss (enforce DLP policies across email, Teams, SharePoint, endpoints, and third-party apps).
At EPC Group, our Microsoft 365 consulting practice has deployed Purview Information Protection for over 500 enterprise organizations, including healthcare systems managing HIPAA-regulated PHI, financial institutions with SOC 2 requirements, and government agencies operating under FedRAMP. The technology is critical to any serious data governance strategy.
Core Components
| Component | Function | License Required |
|---|---|---|
| Sensitivity Labels | Classify and protect documents, emails, containers | M365 E3 (manual), E5 (auto-labeling) |
| Sensitive Info Types (SITs) | Pattern-based detection of SSNs, credit cards, PHI | M365 E3 |
| Trainable Classifiers | ML-based classification for contracts, resumes, invoices | M365 E5 |
| Exact Data Match (EDM) | Match against exact values from your database (patient IDs, account numbers) | M365 E5 |
| DLP Policies | Block or warn on sharing sensitive data via email, Teams, SharePoint | M365 E3 (basic), E5 (advanced) |
| Endpoint DLP | Monitor and control sensitive data on Windows/macOS devices | M365 E5 |
| Data Map & Catalog | Scan and classify data across Azure, AWS, GCP, on-premises | Purview Governance (separate billing) |
Designing Your Sensitivity Label Taxonomy
The sensitivity label taxonomy is the foundation of your entire information protection strategy. A poorly designed taxonomy leads to user confusion, inconsistent classification, and gaps in protection. Based on hundreds of enterprise deployments, EPC Group recommends a four-tier taxonomy with sub-labels for regulatory categories.
Recommended Enterprise Label Taxonomy
Public
├── No protection applied
├── Watermark: "Public" in header
└── Use: Marketing materials, press releases, public-facing docs
General (Internal)
├── No encryption
├── Footer: "Internal Use Only"
└── Use: Day-to-day business documents, internal memos
Confidential
├── Confidential - All Employees
│ ├── Encrypt: All authenticated users in tenant
│ ├── Restrict: No external sharing
│ └── Use: HR policies, financial reports, strategy docs
├── Confidential - Project [Name]
│ ├── Encrypt: Specific group/team only
│ ├── Restrict: No copy, no print, no forward
│ └── Use: M&A documents, board materials
└── Confidential - HIPAA / PCI / GDPR
  ├── Encrypt: Designated compliance group
  ├── Restrict: No external sharing, watermark, audit trail
  └── Use: PHI, payment card data, EU personal data
Highly Confidential
├── Highly Confidential - Executive
│ ├── Encrypt: C-suite distribution list only
│ ├── Restrict: No copy, no print, no forward, no extract
│ └── Use: Board presentations, acquisition targets
└── Highly Confidential - Regulated
  ├── Encrypt: Compliance team + legal only
  ├── Restrict: All restrictions, offline access 7 days max
  └── Use: Legal holds, active litigation, audit findings
Label Priority and Downgrade Policies
Labels must have a defined priority order (highest number = highest sensitivity). Configure downgrade justification to require users to provide a reason when lowering a document's classification. For regulated industries, EPC Group recommends blocking downgrades entirely for HIPAA and Highly Confidential labels — only compliance officers should be able to reduce classification.
- Priority 0: Public — no restrictions on downgrade
- Priority 1: General — justification required for downgrade
- Priority 2: Confidential — justification required, logged to audit
- Priority 3: Highly Confidential — downgrade blocked (admin override only)
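The priority and downgrade rules above can be sketched as a small decision function. This is an illustrative model only: the `Label`, `DowngradeRule`, and `check_relabel` names are hypothetical and are not Purview API objects, and the rule attached to each tier simply mirrors the list above.

```python
from dataclasses import dataclass
from enum import Enum


class DowngradeRule(Enum):
    ALLOW = "allow"                      # no restriction
    JUSTIFY = "justify"                  # reason required
    JUSTIFY_AUDIT = "justify_and_audit"  # reason required, written to audit log
    BLOCK = "block"                      # admin override only


@dataclass(frozen=True)
class Label:
    name: str
    priority: int  # higher number = higher sensitivity
    downgrade_rule: DowngradeRule


# Hypothetical encoding of the four-tier taxonomy; not real Purview objects.
LABELS = {
    label.name: label
    for label in (
        Label("Public", 0, DowngradeRule.ALLOW),
        Label("General", 1, DowngradeRule.JUSTIFY),
        Label("Confidential", 2, DowngradeRule.JUSTIFY_AUDIT),
        Label("Highly Confidential", 3, DowngradeRule.BLOCK),
    )
}


def check_relabel(current: str, new: str, justification: str = "",
                  is_admin: bool = False) -> str:
    """Decide whether a label change is allowed under the rules above."""
    old, target = LABELS[current], LABELS[new]
    if target.priority >= old.priority:
        return "allowed"  # same level or upgrade is never restricted
    rule = old.downgrade_rule
    if rule is DowngradeRule.BLOCK:
        return "allowed (admin override)" if is_admin else "blocked"
    if rule is DowngradeRule.ALLOW:
        return "allowed"
    if not justification.strip():
        return "justification required"
    if rule is DowngradeRule.JUSTIFY_AUDIT:
        return "allowed and audited"
    return "allowed"
```

The rule is looked up on the label being removed, not the label being applied, so the restriction always follows the more sensitive classification.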
Container Labels for SharePoint and Teams
Sensitivity labels extend beyond files to SharePoint sites, Teams, and Microsoft 365 Groups. Container labels control the privacy setting (public/private), external sharing policy, unmanaged device access, and Conditional Access requirements for the entire site or team. This is essential for organizations using SharePoint as their primary collaboration platform.
- General Sites: Private, allow external sharing with authenticated guests, full access from unmanaged devices
- Confidential Sites: Private, block external sharing, limited access from unmanaged devices (web-only, no download)
- Highly Confidential Sites: Private, block external sharing, block unmanaged devices entirely, require compliant device via Conditional Access
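As a rough illustration, the three site tiers above can be written down as data, with a helper that answers what kind of access a given device gets. The class and function names are hypothetical, and the access values ("full", "web_only", "blocked") are shorthand for the settings described in the list, not Purview setting names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerLabel:
    privacy: str                 # "private" or "public"
    external_sharing: bool       # may authenticated guests be invited?
    unmanaged_access: str        # "full", "web_only", or "blocked"
    require_compliant_device: bool = False


# Hypothetical encoding of the three site tiers listed above.
CONTAINER_LABELS = {
    "General": ContainerLabel("private", True, "full"),
    "Confidential": ContainerLabel("private", False, "web_only"),
    "Highly Confidential": ContainerLabel("private", False, "blocked", True),
}


def site_access(label_name: str, managed: bool, compliant: bool = False) -> str:
    """Return the access level a device gets on a site with this label."""
    label = CONTAINER_LABELS[label_name]
    if not managed:
        return label.unmanaged_access
    if label.require_compliant_device and not compliant:
        return "blocked"
    return "full"
```

Keeping the tiers in one table like this makes it easy to review with stakeholders before any site is labeled.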
Auto-Labeling Policies and Trainable Classifiers
Manual labeling relies on user judgment, which is inconsistent at best. Enterprise organizations need automated classification that applies labels based on content inspection. Purview offers two auto-labeling approaches: client-side (real-time as users create documents) and service-side (bulk scanning of existing content in SharePoint and OneDrive).
Sensitive Information Types (SITs)
Microsoft provides 300+ built-in sensitive information types that detect patterns like Social Security numbers, credit card numbers, passport numbers, and medical terms. Each SIT uses a combination of regular expressions, keyword dictionaries, checksums, and proximity rules to reduce false positives.
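To make the combination of techniques concrete, here is a minimal sketch of how a pattern, a checksum, and keyword proximity can work together for a credit card detector. It is not Microsoft's implementation: the regular expression, keyword list, and 300-character window are assumptions chosen for illustration, and only the Luhn checksum is a standard algorithm.

```python
import re
from typing import List, Tuple

# Assumed pattern: 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")
# Assumed supporting keywords and proximity window (in characters).
KEYWORDS = ("credit card", "card number", "visa", "mastercard", "cvv")
PROXIMITY = 300


def luhn_valid(number: str) -> bool:
    """Standard Luhn checksum over the digits in `number`."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> List[Tuple[str, str]]:
    """Return (match, confidence) pairs for checksum-valid card numbers."""
    lowered = text.lower()
    hits = []
    for m in CARD_PATTERN.finditer(text):
        if not luhn_valid(m.group()):
            continue  # pattern matched but checksum failed: discard
        window = lowered[max(0, m.start() - PROXIMITY): m.end() + PROXIMITY]
        confidence = "high" if any(k in window for k in KEYWORDS) else "low"
        hits.append((m.group(), confidence))
    return hits
```

The checksum removes random 16-digit strings such as order numbers, and the keyword check separates a likely card number from a digit string that merely happens to pass the checksum.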
Best Practice: Custom SITs for Your Organization
Built-in SITs cover common patterns, but every enterprise has unique identifiers: patient MRN formats, internal account numbering schemes, project codes, or proprietary data classifications. Create custom sensitive information types that match your specific patterns. For healthcare clients, EPC Group typically creates custom SITs for the organization's MRN format, internal physician IDs, and facility-specific coding systems.
Trainable Classifiers
Trainable classifiers use machine learning to identify document types based on content, not just patterns. Microsoft provides pre-trained classifiers for common categories (resumes, source code, financial statements, legal agreements) and allows organizations to train custom classifiers on their own document corpus.
- Pre-trained classifiers: Contracts, invoices, resumes, source code, financial statements, threats/harassment, profanity — ready to use immediately
- Custom classifiers: Provide 50-500 positive samples of your document type, and Purview trains a classifier from them. Common use cases include proprietary report formats, engineering specifications, and clinical trial documents
- Accuracy tuning: Review classifier predictions, mark false positives/negatives, retrain. Target 95%+ precision before deploying in enforcement mode
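The 95% precision target can be checked with simple arithmetic once reviewers have marked a sample of predictions. The sketch below assumes the review results are exported as pairs of booleans; the function names and the input shape are hypothetical.

```python
from typing import Iterable, Tuple


def precision_recall(reviews: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """reviews: (classifier_said_match, reviewer_said_match) pairs."""
    tp = fp = fn = 0
    for predicted, actual in reviews:
        if predicted and actual:
            tp += 1          # correct match
        elif predicted:
            fp += 1          # false positive
        elif actual:
            fn += 1          # false negative (missed document)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def ready_for_enforcement(reviews: Iterable[Tuple[bool, bool]],
                          min_precision: float = 0.95) -> bool:
    """True when measured precision meets the enforcement threshold."""
    return precision_recall(reviews)[0] >= min_precision
```

Precision is the number to gate enforcement on because a false positive applies protection to the wrong document, while recall indicates how much sensitive content the classifier still misses.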
Auto-Labeling Configuration Strategy
Deploy auto-labeling in three phases: simulation, recommendation, enforcement. The simulation phase runs policies against your existing content without applying labels — use this to validate accuracy and estimate the volume of affected files. Recommendation mode suggests labels to users (yellow bar in Office apps) without forcing application. Enforcement mode applies labels automatically with no user interaction.
| Phase | Duration | Action | User Impact |
|---|---|---|---|
| Simulation | 2-4 weeks | Policy runs in simulation, no labels applied | None — invisible to users |
| Recommendation | 4-6 weeks | Suggests label in Office apps, user decides | Low — tooltip recommendations |
| Enforcement | Ongoing | Labels applied automatically, no user action | Minimal — labels appear automatically |
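The difference between the three phases comes down to what happens to a document when a policy matches it. A simplified sketch, assuming a single matched label and ignoring label precedence rules (the `Phase` enum and `label_outcome` function are hypothetical names):

```python
from enum import Enum
from typing import Optional, Tuple


class Phase(Enum):
    SIMULATION = 1
    RECOMMENDATION = 2
    ENFORCEMENT = 3


def label_outcome(phase: Phase, current: Optional[str], matched: str,
                  user_accepts: bool = False) -> Tuple[Optional[str], str]:
    """Return (label after the policy runs, what the user sees)."""
    if phase is Phase.SIMULATION:
        # Match is recorded for review; the document is left untouched.
        return current, "nothing (match recorded for review only)"
    if phase is Phase.RECOMMENDATION:
        if user_accepts:
            return matched, "recommendation accepted"
        return current, "recommendation shown"
    # Enforcement: label is applied without asking the user.
    return matched, "label applied automatically"
```

Only the enforcement branch changes a document without user involvement, which is why it comes last in the rollout.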
Data Loss Prevention (DLP) Policies
DLP policies are the enforcement layer of information protection. While sensitivity labels classify and protect data at the file level, DLP policies monitor data in motion — preventing sensitive information from leaving the organization through email, Teams chat, SharePoint sharing, or cloud app uploads.
DLP Policy Architecture
Design DLP policies around three dimensions: what sensitive data to protect (SITs, labels, classifiers), where to monitor (Exchange, Teams, SharePoint, OneDrive, endpoints, Defender for Cloud Apps), and what action to take (notify, block, override with justification, encrypt).
Warning: Start with Audit-Only Mode
Never deploy DLP policies in blocking mode from day one. Start with audit-only (detect and log, no user action) for a minimum of two weeks. Review the DLP alerts to identify false positives, overly broad rules, and legitimate business processes that your policy would disrupt. Then move to notify mode (user sees a policy tip but can override) for another two weeks before enabling block mode. Skipping this process causes user backlash and undermines the entire program.
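The staged rollout above amounts to a gate: a policy moves to the next mode only after it has spent at least two weeks in the current one and the alerts have been reviewed. A minimal sketch of that gate, with hypothetical stage names and function signature:

```python
from datetime import date

STAGES = ("audit", "notify", "block")
MIN_DAYS_PER_STAGE = 14  # the two-week minimum from the guidance above


def next_stage(current: str, entered_on: date, today: date,
               alerts_reviewed: bool) -> str:
    """Return the stage a policy should be in today."""
    index = STAGES.index(current)
    if index == len(STAGES) - 1:
        return current  # already blocking; nothing to promote to
    days_in_stage = (today - entered_on).days
    if days_in_stage >= MIN_DAYS_PER_STAGE and alerts_reviewed:
        return STAGES[index + 1]
    return current  # too early, or alerts not yet reviewed
```

Requiring both conditions keeps a policy from being promoted on the calendar alone when nobody has looked at its false positives.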
Enterprise DLP Policy Examples
- HIPAA PHI Protection: Detect content containing 2+ HIPAA-related SITs (patient names + MRN, diagnosis codes + SSN). Block external sharing via email and Teams. Allow override for care coordination with logged justification. Apply to Exchange Online, Teams chat, SharePoint/OneDrive.
- Financial Data Protection: Detect content with credit card numbers, bank account numbers, or financial statement classifiers. Block sharing with external domains except approved partners. Require encryption for any external email containing financial SITs.
- Source Code Protection: Detect source code using the pre-trained classifier. Block uploads to consumer cloud storage (personal OneDrive, Dropbox, Google Drive). Allow sharing within approved development repositories and partner portals.
- Executive Communications: Detect content labeled "Highly Confidential - Executive." Block all external sharing without exception. Block copy to USB drives via endpoint DLP. Alert compliance team on any match.
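The HIPAA example above can be modeled as a rule with the three dimensions of a DLP policy: what to detect, where to monitor, and what action to take. This sketch is a simplification under stated assumptions: the SIT names are placeholders, any two distinct SITs count as a match (the example pairs specific ones), and the `DlpRule` and `evaluate` names are hypothetical rather than Purview objects.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class DlpRule:
    name: str
    sit_names: FrozenSet[str]     # what: sensitive info types to look for
    min_distinct_sits: int        # how many different SITs must match
    locations: FrozenSet[str]     # where: workloads the rule covers
    allow_override: bool          # action: block, optionally with override


# Hypothetical encoding of the HIPAA example; SIT names are placeholders.
HIPAA_RULE = DlpRule(
    name="HIPAA PHI Protection",
    sit_names=frozenset({"patient_name", "mrn", "diagnosis_code", "ssn"}),
    min_distinct_sits=2,
    locations=frozenset({"exchange", "teams", "sharepoint", "onedrive"}),
    allow_override=True,
)


def evaluate(rule: DlpRule, detected_sits: Set[str], location: str,
             external: bool, justification: str = "") -> str:
    """Return the action for one sharing event."""
    if location not in rule.locations or not external:
        return "allow"  # out of scope, or internal sharing
    matched = rule.sit_names & detected_sits
    if len(matched) < rule.min_distinct_sits:
        return "allow"  # a single SIT alone does not trigger the rule
    if rule.allow_override and justification.strip():
        return "allow (override logged)"
    return "block"
```

Requiring two distinct SITs is what keeps a lone patient name or a lone number from triggering the rule on ordinary correspondence.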
Endpoint DLP and Device Protection
Endpoint DLP extends data loss prevention to the device level, monitoring and controlling sensitive data on Windows 10/11 and macOS devices. This is critical for organizations with remote workforces where data moves beyond the Microsoft 365 boundary. Endpoint DLP monitors file activities including copying to USB drives, printing, uploading to cloud services via browser, copying to clipboard, and access by unallowed applications.
- USB and removable media: Block or audit copies of files containing sensitive data to USB drives. Configure allowed USB device groups by hardware ID for approved encrypted drives.
- Printing: Block printing of documents with "Highly Confidential" labels. Allow printing to approved network printers for "Confidential" documents with audit logging.
- Browser upload: Block upload of labeled documents to non-approved cloud services. Integrate with Microsoft Defender for Cloud Apps for shadow IT detection.
- Application access: Define unallowed apps (personal email clients, consumer cloud sync tools). Block these apps from opening files with "Confidential" or higher labels.
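The four activity rules above can be read as one decision table. The sketch below is a hedged illustration: the allow lists contain made-up placeholder values, the activity names are hypothetical, and it assumes that only "Confidential" and higher labels are restricted, which is one reasonable reading of the list.

```python
LABEL_RANK = {"Public": 0, "General": 1, "Confidential": 2, "Highly Confidential": 3}

# Hypothetical allow/deny lists; real values come from your own environment.
APPROVED_USB_IDS = {"approved-encrypted-drive-01"}
APPROVED_PRINTERS = {"hq-secure-printer"}
APPROVED_CLOUD_DOMAINS = {"contoso.sharepoint.com"}
UNALLOWED_APPS = {"personal-mail.exe", "consumer-sync.exe"}


def endpoint_decision(activity: str, label: str, target: str) -> str:
    """Return "allow", "audit", or "block" for one file activity."""
    if LABEL_RANK[label] < LABEL_RANK["Confidential"]:
        return "allow"  # assumption: lower tiers are not restricted
    if activity == "copy_to_usb":
        return "audit" if target in APPROVED_USB_IDS else "block"
    if activity == "print":
        if label == "Highly Confidential":
            return "block"
        return "audit" if target in APPROVED_PRINTERS else "block"
    if activity == "browser_upload":
        return "allow" if target in APPROVED_CLOUD_DOMAINS else "block"
    if activity == "open_in_app":
        return "block" if target in UNALLOWED_APPS else "allow"
    raise ValueError(f"unknown activity: {activity}")
```

Writing the rules this way exposes the gaps early, for example what should happen when a "Confidential" file is printed to a printer that is not on the approved list.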
Prerequisites for Endpoint DLP
Devices must be onboarded to Microsoft Purview (via Intune, Configuration Manager, or local script). Windows devices must run Windows 10 version 1809 or later, or Windows 11. macOS devices require Catalina 10.15+. Devices must be Microsoft Entra joined or Microsoft Entra hybrid joined (formerly Azure AD joined and hybrid Azure AD joined). A Microsoft 365 E5 or E5 Compliance add-on license is required per user.
Compliance Mapping: HIPAA, SOC 2, GDPR, FedRAMP
One of the most valuable aspects of Purview Information Protection is its direct mapping to regulatory compliance requirements. EPC Group maps every sensitivity label and DLP policy to specific compliance controls, creating an auditable evidence trail. This is central to our data governance consulting practice.
| Regulation | Requirement | Purview Control |
|---|---|---|
| HIPAA 164.312(a) | Access control for ePHI | Sensitivity labels with encryption + RMS permissions |
| HIPAA 164.312(e) | Transmission security | Mandatory encryption on Confidential-HIPAA labels |
| SOC 2 CC6.1 | Logical access controls | Label-based access restrictions + Conditional Access |
| SOC 2 CC6.7 | Data movement restrictions | DLP policies blocking unauthorized data transfer |
| GDPR Article 32 | Security of processing | Encryption, pseudonymization via sensitivity labels |
| GDPR Article 17 | Right to erasure | Content search + eDiscovery for labeled GDPR data |
| FedRAMP AC-3 | Access enforcement | Label-based encryption + DLP + Conditional Access |
Phased Deployment Strategy
Enterprise Purview Information Protection deployments should follow a phased approach that builds organizational readiness while progressively expanding coverage. Rushing deployment leads to misconfigured policies, user resistance, and compliance gaps. Our approach at EPC Group, refined across 500+ deployments, follows four phases.
Phase 1: Foundation (Weeks 1-4)
- Define sensitivity label taxonomy with stakeholder alignment
- Configure label policies for pilot group (50-100 users from IT, legal, compliance)
- Enable audit logging and content search
- Deploy basic DLP policies in audit-only mode for top 3 sensitive data types
- Train pilot users on manual label selection
- Deliverable: Label taxonomy document, pilot deployment report, baseline classification metrics
Phase 2: Expansion (Weeks 5-10)
- Expand label policies to all users organization-wide
- Enable auto-labeling in recommendation mode for top sensitive info types
- Promote DLP policies from audit-only to notify mode
- Configure container labels for SharePoint sites and Teams
- Deploy default labeling (auto-apply "General" if user does not choose)
- Deliverable: Organization-wide rollout complete, auto-labeling simulation report
Phase 3: Enforcement (Weeks 11-16)
- Enable auto-labeling enforcement for validated SITs and classifiers
- Promote DLP policies to block mode with override justification
- Deploy endpoint DLP on managed Windows and macOS devices
- Configure Defender for Cloud Apps integration for third-party SaaS protection
- Enable service-side auto-labeling to scan existing SharePoint/OneDrive content
- Deliverable: Full enforcement operational, endpoint DLP active, third-party coverage
Phase 4: Optimization (Ongoing)
- Review DLP alert volumes and tune false positive rates below 5%
- Train custom classifiers for organization-specific document types
- Deploy exact data match (EDM) for precise identification of proprietary data
- Integrate with Microsoft Purview Data Map for multi-cloud data classification
- Generate compliance reports for audit readiness (HIPAA, SOC 2, GDPR)
- Deliverable: Monthly optimization reports, compliance audit evidence package
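Tracking the false positive target per policy is straightforward once reviewed alerts are exported. A minimal sketch, assuming each reviewed alert is reduced to a policy name and a reviewer verdict (the input shape and function names are hypothetical):

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

FP_TARGET = 0.05  # the "below 5%" goal from the optimization phase


def false_positive_rates(alerts: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """alerts: (policy_name, is_false_positive) pairs from reviewed alerts."""
    totals: Dict[str, int] = defaultdict(int)
    false_positives: Dict[str, int] = defaultdict(int)
    for policy, is_fp in alerts:
        totals[policy] += 1
        if is_fp:
            false_positives[policy] += 1
    return {p: false_positives[p] / totals[p] for p in totals}


def policies_needing_tuning(alerts: Iterable[Tuple[str, bool]]) -> List[str]:
    """Policies whose false positive rate is not yet below the target."""
    rates = false_positive_rates(alerts)
    return sorted(p for p, rate in rates.items() if rate >= FP_TARGET)
```

Reporting the rate per policy rather than as one overall number shows which rule to tune first.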
Licensing and Cost Analysis
Understanding Purview licensing is critical for budget planning. Microsoft bundles information protection features across multiple SKUs, and choosing the wrong license tier means either paying for unused capabilities or lacking essential features. Here is the breakdown relevant to information protection.
| Feature | M365 E3 | M365 E5 | E5 Compliance Add-on |
|---|---|---|---|
| Manual sensitivity labels | Included | Included | Included |
| Client-side auto-labeling | Not included | Included | Included |
| Service-side auto-labeling | Not included | Included | Included |
| Trainable classifiers | Not included | Included | Included |
| Endpoint DLP | Not included | Included | Included |
| Exact Data Match | Not included | Included | Included |
| Basic DLP (Exchange, SPO) | Included | Included | Included |
| Price per user/month | $36 | $57 | $36 + $12 |
For most enterprise organizations, EPC Group recommends Microsoft 365 E5 for users who create and classify sensitive content (typically 30-50% of the workforce) and E3 for general users who only consume protected content. At the list prices above, this tiered approach reduces licensing costs by roughly 18-26% compared to full E5 deployment.
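To sanity-check a tiered licensing plan, the blended cost can be computed directly from the per-user list prices in the table. The function names are hypothetical, and negotiated or regional pricing will change the result.

```python
E3_PRICE = 36.0  # USD per user/month, from the table above
E5_PRICE = 57.0  # USD per user/month, from the table above


def blended_cost(users: int, e5_share: float) -> float:
    """Monthly cost when `e5_share` of users get E5 and the rest get E3."""
    e5_users = round(users * e5_share)
    return e5_users * E5_PRICE + (users - e5_users) * E3_PRICE


def savings_vs_full_e5(users: int, e5_share: float) -> float:
    """Fraction saved compared with licensing every user on E5."""
    full = users * E5_PRICE
    return (full - blended_cost(users, e5_share)) / full
```

For 1,000 users at these list prices, a 30% E5 share costs $42,300 per month against $57,000 for full E5, a saving of about 26%; a 50% E5 share costs $46,500, a saving of about 18%.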
Common Deployment Mistakes to Avoid
After 500+ deployments, EPC Group has identified the recurring mistakes that derail Purview Information Protection projects. Avoiding these issues accelerates time to value and ensures lasting adoption.
- Too many labels: Organizations with 15+ sensitivity labels see adoption rates drop below 30%. Keep the top level to 4-5 labels. Use sub-labels for granularity. Users should be able to choose the right label in under 5 seconds.
- Skipping the simulation phase: Deploying auto-labeling or DLP in enforcement mode without simulation generates hundreds of false positives, overwhelms the SOC team, and causes users to request exemptions. Always simulate first, validate accuracy, then enforce.
- No executive sponsorship: Information protection changes how every employee works with data. Without visible C-suite sponsorship and clear communication about why classification matters, adoption stalls. The CISO or CIO must champion the program.
- Ignoring change management: Technical deployment is 40% of the effort. The remaining 60% is training, communication, help desk preparation, and user support. Budget for at least 3 rounds of training (pilot, general, refresher) and dedicated support resources for the first 90 days.
- Not integrating with existing workflows: If users must leave their workflow to classify data, they won't do it. Leverage default labels, auto-labeling, and Office integration so protection happens within the tools users already use daily.
- Failing to monitor and tune: A deployed policy is not a finished policy. Review DLP alerts weekly, track false positive rates, and adjust SIT confidence levels and keyword dictionaries quarterly. Information protection is an ongoing program, not a project with an end date.
Partner with EPC Group
EPC Group has deployed Microsoft Purview Information Protection for over 500 enterprise organizations across healthcare, financial services, and government. Our Microsoft 365 consulting team combines deep technical expertise with compliance domain knowledge to deliver information protection programs that achieve high adoption rates and pass regulatory audits. We also integrate Purview with broader data governance frameworks and AI governance strategies for organizations deploying Copilot and other AI tools.
Frequently Asked Questions
What is Microsoft Purview Information Protection?
Microsoft Purview Information Protection (formerly Microsoft Information Protection or MIP) is a suite of tools within the Microsoft 365 compliance ecosystem that helps organizations discover, classify, label, and protect sensitive data across emails, documents, SharePoint sites, Teams messages, and third-party cloud applications. It includes sensitivity labels, auto-labeling policies, data loss prevention (DLP), and encryption — all managed from the Microsoft Purview compliance portal. EPC Group has deployed Purview Information Protection for over 500 enterprise clients.
How do sensitivity labels work in Microsoft Purview?
Sensitivity labels are metadata tags applied to documents, emails, and containers (SharePoint sites, Teams, Microsoft 365 Groups) that define the classification level and enforce protection actions. When a user applies a "Confidential" label, Purview can automatically encrypt the file, add watermarks, restrict copy/paste, prevent forwarding, and control who can access the content. Labels can be applied manually by users, recommended by Purview based on content inspection, or automatically enforced through auto-labeling policies that scan for sensitive data patterns like SSNs, credit card numbers, or HIPAA identifiers.
How long does it take to deploy Microsoft Purview Information Protection?
A phased Purview Information Protection deployment typically takes 12-20 weeks for enterprise organizations. Phase 1 (weeks 1-4) covers planning, taxonomy design, and pilot group deployment. Phase 2 (weeks 5-10) involves auto-labeling policies, DLP rules, and expanded user rollout. Phase 3 (weeks 11-16) includes endpoint DLP, third-party app integration, and compliance validation. Organizations with HIPAA or FedRAMP requirements should add 4-6 weeks for additional audit documentation and validation testing.
What is the difference between sensitivity labels and retention labels?
Sensitivity labels protect data by controlling access and applying encryption; they answer "who can see this data and what can they do with it." Retention labels govern the data lifecycle by defining how long data must be kept and when it should be deleted; they answer "how long must we keep this and when do we dispose of it." Both label types can coexist on the same document. For example, a healthcare record might have a "Confidential - HIPAA" sensitivity label (encrypts, restricts access) and a "Retain 7 Years" retention label (prevents deletion for regulatory compliance).
Can Microsoft Purview protect data in non-Microsoft applications?
Yes. Microsoft Purview extends protection beyond Microsoft 365 through several mechanisms. Microsoft Defender for Cloud Apps applies sensitivity labels to files in Box, Dropbox, Google Workspace, and Salesforce. The Microsoft Purview Information Protection client (the successor to the Azure Information Protection unified labeling client) protects PDFs and non-Office file types. Microsoft Purview Data Map scans and classifies data in Azure SQL, Amazon S3, Google Cloud Storage, on-premises SQL Server, and SAP. Endpoint DLP policies protect sensitive data on Windows and macOS devices regardless of the application being used.
What licenses are required for Microsoft Purview Information Protection?
Basic sensitivity labels (manual application) are included in Microsoft 365 E3/A3/G3 and Microsoft 365 Business Premium. Advanced features require Microsoft 365 E5/A5/G5 or the Microsoft 365 E5 Compliance add-on ($12/user/month): automatic labeling, trainable classifiers, exact data match, endpoint DLP, and Defender for Cloud Apps integration. For organizations needing only specific features, standalone add-ons include Microsoft 365 E5 Information Protection & Governance ($10/user/month) and Microsoft 365 E5 Insider Risk Management ($10/user/month).
How does Microsoft Purview support HIPAA compliance?
Microsoft Purview supports HIPAA compliance through multiple layers: sensitivity labels encrypt Protected Health Information (PHI) at rest and in transit, DLP policies prevent unauthorized sharing of patient data via email or Teams, auto-labeling identifies PHI patterns (medical record numbers, diagnosis codes, patient names) and applies protection automatically, and audit logs provide the access trail required by HIPAA Security Rule Section 164.312. EPC Group has implemented HIPAA-compliant Purview configurations for over 100 healthcare organizations.