
85% of organizations have sensitivity labels configured. Fewer than 15% have them enforced. That gap is where Copilot data exposure happens. A label that users can ignore — or that does not cover legacy content — provides zero protection. Only labels backed by Azure Rights Management encryption restrict Copilot from including content in its output.
Quick Answer: Sensitivity labels protect data from Copilot only when they are actually applied to documents — not just configured in the Microsoft Purview portal. Most organizations complete label configuration (creating the taxonomy, defining protection actions, publishing to users) and consider themselves protected. But in the average enterprise tenant, fewer than 15% of existing documents have sensitivity labels applied. The remaining 85% are invisible to label-based protection policies, meaning Copilot can surface them to any user with underlying permissions. Closing the enforcement gap requires auto-labeling, mandatory labeling policies, downgrade prevention, and retroactive labeling of legacy content.
Microsoft's documentation treats sensitivity label deployment as a configuration exercise: create labels, define protection actions, publish to users, enable auto-labeling. What the documentation underemphasizes is that configuration is the beginning of the process, not the end. The real measure of label effectiveness is enforcement — the percentage of content in your tenant that actually has labels applied and protection policies actively governing access.
Before Copilot, the enforcement gap was a compliance risk. With Copilot, it becomes a data exposure risk. Copilot queries span every document, email, and Teams message that a user has permission to access. If 85% of your content has no sensitivity label, then 85% of your content has no label-based protection against Copilot surfacing it in responses, summaries, and generated documents.
EPC Group has audited sensitivity label deployment in 700+ Microsoft 365 tenants. The pattern is consistent: label configuration is complete, but label enforcement is minimal. This guide explains why the gap exists, what Copilot does with labeled vs. unlabeled content, and how to close the enforcement gap before deploying Copilot. If you need a comprehensive assessment, start with EPC Group's 47-Point Copilot & M365 Security Review.
The distinction between configuration and enforcement is the single most important concept in sensitivity label deployment. Microsoft's admin portal makes it easy to conflate the two — a green checkmark next to “Sensitivity Labels” in the Purview compliance portal means labels are configured and published, not that they are applied to content.
The Numbers: In EPC Group's 700+ tenant audits, the average sensitivity label enforcement rate is 12%. That means 88% of content in the typical enterprise tenant has no sensitivity label applied. For organizations that have deployed auto-labeling, the rate improves to 35-45% — better, but still leaving the majority of content unprotected. Only organizations with mandatory labeling policies and retroactive labeling campaigns achieve 80%+ enforcement.
Copilot's interaction with sensitivity labels is governed by the protection actions configured on each label. Understanding this interaction is critical for evaluating whether your current label deployment actually protects content from Copilot-driven exposure.
Copilot cannot access content encrypted with sensitivity label-based encryption unless the querying user has the decryption rights defined in the label policy. This is the strongest protection. If a document is labeled "Highly Confidential" with encryption restricting access to the Legal department, Copilot will not surface this content to users outside Legal — even if the underlying SharePoint permissions would allow access.
Labels with access restrictions but without encryption provide partial protection. Copilot respects DLP policies tied to labels, but without encryption, the content is still accessible to users with underlying permissions. A label that says "Confidential — Do Not Share" without encryption is an advisory — Copilot may still surface it.
Labels that only apply headers, footers, or watermarks provide zero protection from Copilot. These are visual indicators for human readers. Copilot ignores content markings entirely. A watermark reading "CONFIDENTIAL" on a Word document does not prevent Copilot from including that document in search results or generated content.
Content with no sensitivity label has no label-based protection. Copilot treats it the same as Public content — accessible to any user with underlying permissions. This is the 85% of content in most tenants: financial records, HR documents, legal contracts, strategic plans — all searchable by Copilot with no sensitivity classification.
Key Takeaway: Only labels with encryption provide reliable Copilot protection. Labels with access restrictions (no encryption) provide partial DLP-based protection. Labels with content marking only provide zero Copilot protection. Unlabeled content is completely unprotected. If your label taxonomy does not include encryption on Confidential and Highly Confidential labels, your labels are not protecting content from Copilot.
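The four tiers above can be written down as a small decision function. This is a minimal sketch of the behavior described in this guide, not Microsoft's actual evaluation logic; the `Protection`, `Document`, and `copilot_can_surface` names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class Protection(Enum):
    ENCRYPTION = "encryption"                  # label applies encryption
    ACCESS_RESTRICTION = "access_restriction"  # DLP-tied restrictions, no encryption
    CONTENT_MARKING = "content_marking"        # headers, footers, watermarks only


@dataclass
class Document:
    name: str
    permitted_users: Set[str]                # underlying SharePoint/OneDrive permissions
    protection: Optional[Protection] = None  # None models an unlabeled document
    decryption_users: Set[str] = field(default_factory=set)


def copilot_can_surface(doc: Document, user: str) -> bool:
    """Simplified model of the tiers in this guide (an assumption, not product logic)."""
    if doc.protection is Protection.ENCRYPTION:
        # Decryption rights decide, even when site permissions would allow access.
        return user in doc.permitted_users and user in doc.decryption_users
    # Access restrictions without encryption, content marking, and unlabeled
    # content all fall back to underlying permissions in this model. Real DLP
    # policies may add partial restrictions that are not modeled here.
    return user in doc.permitted_users


legal_memo = Document("memo.docx", {"alice", "bob"}, Protection.ENCRYPTION, {"alice"})
watermarked = Document("plan.docx", {"bob"}, Protection.CONTENT_MARKING)
print(copilot_can_surface(legal_memo, "bob"))   # bob has site access but no decryption rights
print(copilot_can_surface(watermarked, "bob"))  # a watermark alone does not restrict access
```

In this model the only branch that can return a different answer from plain permissions is the encryption branch, which is the point of the takeaway above.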
The enforcement gap is not a technology failure — it is an operational failure. The technology works. Labels, encryption, auto-labeling, mandatory labeling — all of these features function as documented. The gap exists because most organizations stop at configuration and never operationalize enforcement.
Organizations with 5-15 years of SharePoint and OneDrive content have millions of documents that predate sensitivity label deployment. These documents were created, shared, and stored without any classification system. Auto-labeling can address some of this retroactively, but most auto-labeling deployments are limited in scope (covering only specific sensitive information types) and capacity (processing limits on bulk operations).
Without mandatory labeling policies, label application depends on user behavior. Users are not trained on when to apply labels, do not understand the label taxonomy, or skip labeling because it adds friction to their workflow. The result: new documents are created daily without labels, expanding the enforcement gap even as auto-labeling addresses legacy content.
Auto-labeling policies rely on sensitive information type detection — pattern matching for Social Security numbers, credit card numbers, medical terms, and similar identifiable data. But much sensitive content does not contain detectable patterns: strategic plans, competitive analyses, merger discussions, internal investigations. These documents require manual labeling or custom trainable classifiers, which most organizations have not deployed.
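To see why pattern-based detection misses this kind of content, consider a toy detector. The sketch below is not Purview's sensitive information type engine; the regular expressions and the `detect_patterns` helper are simplified assumptions used only to illustrate the limitation.

```python
import re

# Simplified, illustrative patterns (real detection engines are far more elaborate).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def luhn_valid(number: str) -> bool:
    """Luhn checksum, commonly used to validate payment card numbers."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def detect_patterns(text: str) -> list:
    """Return the names of the toy patterns found in the text."""
    found = []
    if SSN_PATTERN.search(text):
        found.append("ssn")
    if any(luhn_valid(m.group()) for m in CARD_PATTERN.finditer(text)):
        found.append("credit_card")
    return found


print(detect_patterns("Employee SSN 123-45-6789 on file"))
print(detect_patterns("Project Falcon: proposed acquisition strategy for Q3"))
```

The second string describes a merger discussion, yet it contains nothing a pattern can match, so a rule built only on identifiers like these would leave it unlabeled.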
Most organizations do not track label enforcement metrics. They know labels are configured but have no dashboard showing what percentage of content is labeled, what percentage is unlabeled, or how the enforcement rate is trending over time. Without measurement, there is no accountability and no improvement. The enforcement gap persists indefinitely.
Auto-labeling is the most effective mechanism for closing the enforcement gap at scale. It operates in two modes, and both are essential for comprehensive Copilot protection.
Client-side auto-labeling operates within Office applications (Word, Excel, PowerPoint, Outlook). It detects sensitive information types as users create or edit content and either recommends or automatically applies a sensitivity label.
Service-side auto-labeling operates at the service level in SharePoint, OneDrive, and Exchange. It scans content at rest (existing documents) and content in transit (new uploads), applying labels based on sensitive information type detection.
EPC Group Recommendation: Deploy both client-side and service-side auto-labeling simultaneously. Client-side prevents new unlabeled content from being created. Service-side addresses the existing content backlog. Start with high-confidence sensitive information types (exact data match for employee IDs, account numbers, medical record numbers) and expand to medium-confidence patterns after validating accuracy. Target: 80%+ label coverage within 90 days of deployment.
Even with labels applied, protection can be undermined if users can downgrade or remove labels. Label priority ordering and downgrade prevention are the enforcement mechanisms that ensure labels remain effective over time.
Configure these settings in your label policy to prevent users from weakening label protection. These are critical for Copilot environments because a single label downgrade can expose a document to the entire organization's Copilot queries.
Deploying labels without testing their effectiveness against Copilot is like installing a firewall without testing its rules. You need to validate that labels are doing what you expect them to do when Copilot queries content across your tenant.
Build test documents at each sensitivity level containing identifiable content (not real data). Place them in a controlled SharePoint site with defined permissions.
Apply each sensitivity label to the test documents. Verify encryption is active on Confidential and Highly Confidential documents. Confirm DLP policies are associated with each label level.
Sign in as a user who has access to all sensitivity levels. Query Copilot with prompts designed to surface test content. Verify that labeled and encrypted content is accessible (confirming labels do not over-block authorized access).
Sign in as a user who should NOT have access to Highly Confidential content. Query Copilot with the same prompts. Verify that encrypted Highly Confidential content is NOT surfaced. This confirms label-based encryption is functioning.
Create identical test documents without labels. Verify that Copilot CAN access them for all users with underlying permissions. This proves that labels are the differentiator — not permissions alone.
Record test results in a matrix. Any failure (Copilot surfacing restricted content to unauthorized users, or blocking authorized users from accessible content) indicates a label configuration or enforcement issue that must be resolved before Copilot deployment.
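The test matrix can be kept as structured records so that both failure types are flagged consistently. The following is a minimal sketch; the `CopilotCheck` record and the `find_failures` helper are hypothetical names, and the observed values are assumed to be entered by the tester after each Copilot query.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CopilotCheck:
    document: str
    user: str
    expected_surfaced: bool  # should Copilot surface this document for this user?
    observed_surfaced: bool  # what the tester actually observed


def find_failures(checks: List[CopilotCheck]) -> List[str]:
    """Flag exposures (restricted content surfaced) and over-blocks (authorized access denied)."""
    failures = []
    for c in checks:
        if c.observed_surfaced and not c.expected_surfaced:
            failures.append(f"EXPOSURE: {c.document} surfaced to {c.user}")
        elif c.expected_surfaced and not c.observed_surfaced:
            failures.append(f"OVER-BLOCK: {c.document} hidden from {c.user}")
    return failures


matrix = [
    CopilotCheck("hc-forecast.docx", "authorized_user", True, True),
    CopilotCheck("hc-forecast.docx", "unauthorized_user", False, True),
    CopilotCheck("unlabeled-forecast.docx", "unauthorized_user", True, True),
]
for line in find_failures(matrix):
    print(line)
```

An empty result from `find_failures` corresponds to a clean matrix; any entry it returns is a configuration or enforcement issue to resolve before deployment.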
Sensitivity labels protect data from Copilot only when they are enforced — meaning actually applied to documents, not just configured in the Microsoft Purview portal. A configured label that has not been applied to a document provides zero protection. Copilot accesses content based on user permissions and treats unlabeled content as general-access material. If a document containing financial forecasts, HR records, or legal strategies has no sensitivity label applied, Copilot will surface it to any user who has permission to the SharePoint site, OneDrive folder, or Teams channel where it resides. The critical distinction: configuration creates the label taxonomy and defines protection actions (encryption, access restrictions, watermarks). Enforcement means labels are actually applied to content — either manually by users, automatically by auto-labeling policies, or mandatorily through labeling requirements.
Configuration is the administrative act of creating sensitivity labels in the Microsoft Purview portal: defining the label hierarchy (Public, Internal, Confidential, Highly Confidential), assigning protection actions to each label (encryption, content marking, access restrictions), and publishing labels to users through label policies. This is a one-time setup that takes 1-2 days. Enforcement is the operational reality of labels being applied to actual content across your tenant. Enforcement requires: auto-labeling policies that scan existing and new content for sensitive information types, mandatory labeling policies that prevent users from saving documents without a label, default labels that automatically apply a baseline classification, and retroactive labeling campaigns for legacy content. In most organizations, configuration is 100% complete but enforcement covers fewer than 15% of documents.
Microsoft Purview provides sensitivity label analytics through the Purview compliance portal under Data Classification > Overview. Key metrics to track: total documents labeled vs. total documents in tenant (your enforcement percentage), label distribution (how many documents at each sensitivity level), auto-label vs. manual label ratio (indicates whether users are actually applying labels or relying on automation), unlabeled document count by location (SharePoint sites, OneDrive accounts, Exchange mailboxes), and label changes over time (trend showing whether adoption is increasing or stagnating). You can also use Microsoft Graph API reports to extract label analytics programmatically. EPC Group considers 80% label coverage the minimum threshold for Copilot deployment — most organizations are below 15% when they first engage us.
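Once document counts are exported, the metrics listed above reduce to simple arithmetic. The sketch below assumes a hypothetical inventory format (a list of records with `location`, `label`, and `method` fields); it does not call any Purview or Microsoft Graph API, and the field names are illustrative only.

```python
from collections import Counter
from typing import Dict, List, Optional


def label_metrics(inventory: List[Dict[str, Optional[str]]]) -> Dict[str, object]:
    """Compute enforcement metrics from a hypothetical exported document inventory."""
    total = len(inventory)
    labeled = [d for d in inventory if d.get("label")]
    auto = sum(1 for d in labeled if d.get("method") == "auto")
    unlabeled_by_location = Counter(
        d["location"] for d in inventory if not d.get("label")
    )
    return {
        "enforcement_pct": round(100 * len(labeled) / total, 1) if total else 0.0,
        "label_distribution": dict(Counter(d["label"] for d in labeled)),
        "auto_label_share_pct": round(100 * auto / len(labeled), 1) if labeled else 0.0,
        "unlabeled_by_location": dict(unlabeled_by_location),
        # 80% is the minimum coverage threshold this guide recommends.
        "meets_80_pct_threshold": total > 0 and len(labeled) / total >= 0.80,
    }


sample = [
    {"location": "SharePoint", "label": "Confidential", "method": "auto"},
    {"location": "SharePoint", "label": None, "method": None},
    {"location": "OneDrive", "label": None, "method": None},
    {"location": "OneDrive", "label": None, "method": None},
    {"location": "Exchange", "label": None, "method": None},
]
print(label_metrics(sample))
```

Running the same calculation on each periodic export gives the trend line that shows whether adoption is increasing or stagnating.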
When Copilot accesses unlabeled content, it treats the content as accessible to anyone with the underlying permissions — no encryption, no access restriction, no DLP policy enforcement based on sensitivity. Copilot will include unlabeled content in search results, summaries, and generated responses without any sensitivity classification. This means a Copilot-generated executive summary could combine data from a labeled Confidential document (which has encryption and access controls) with data from unlabeled documents containing equally sensitive information (which have no protection). The resulting output inherits no label from the unlabeled sources, creating a document that contains sensitive data with no classification or protection. This is the fundamental enforcement gap: labels only protect content they are applied to.
Auto-labeling uses Microsoft Purview sensitive information types (SITs) to automatically detect and classify content. There are two modes: client-side auto-labeling recommends or automatically applies labels in Office apps as users create or edit documents. Service-side auto-labeling scans content at rest in SharePoint, OneDrive, and Exchange, applying labels to existing documents that match configured rules. For Copilot protection, service-side auto-labeling is critical because it addresses the legacy content problem — millions of documents created before labels existed. Configure auto-labeling rules for sensitive information types relevant to your organization: Social Security numbers, credit card numbers, medical record numbers, financial account numbers, and custom SITs for proprietary data. EPC Group recommends starting with high-confidence SITs (exact match) and expanding to medium-confidence patterns after validating accuracy.
By default, Microsoft Purview allows users to change or remove sensitivity labels on documents they own or have edit permissions for. This creates a significant Copilot risk: a user could downgrade a Highly Confidential document to Internal or Public, removing encryption and access restrictions, making the content broadly accessible to Copilot queries across the organization. Label downgrade prevention is configured in label policies: require justification for label changes (users must provide a reason for downgrading), require justification for label removal, and optionally block label downgrade entirely. EPC Group recommends requiring justification for all downgrades and blocking removal of Highly Confidential labels without admin approval. The justification log is auditable through Purview Activity Explorer, providing a compliance trail for label changes.
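The recommended downgrade rules can be expressed as a small decision function over label priority. This is a sketch of the policy logic only, under the assumption of the four-level hierarchy named in this guide; `LABEL_PRIORITY` and `evaluate_label_change` are hypothetical names, not Purview settings or APIs.

```python
from typing import Optional

# Hypothetical taxonomy ordered from lowest to highest priority.
LABEL_PRIORITY = {
    "Public": 0,
    "Internal": 1,
    "Confidential": 2,
    "Highly Confidential": 3,
}


def evaluate_label_change(current: str, new: Optional[str], justification: str = "") -> str:
    """Return 'allow', 'needs_justification', or 'block' for a proposed change.

    new=None models removing the label entirely.
    """
    has_reason = bool(justification.strip())
    if new is None:
        # Removal of the top label is blocked for users (admin approval is out of scope here).
        if current == "Highly Confidential":
            return "block"
        return "allow" if has_reason else "needs_justification"
    if LABEL_PRIORITY[new] >= LABEL_PRIORITY[current]:
        return "allow"  # same level or upgrade
    # Any downgrade requires a recorded justification.
    return "allow" if has_reason else "needs_justification"


print(evaluate_label_change("Highly Confidential", "Internal"))
print(evaluate_label_change("Highly Confidential", None, "no longer needed"))
```

Keeping the priority order in one table makes it straightforward to confirm that every lower-priority target is treated as a downgrade.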
Testing label effectiveness for Copilot requires a systematic approach:
1) Create test documents at each sensitivity level in a controlled SharePoint site.
2) Apply sensitivity labels with known protection actions (encryption, access restrictions).
3) Sign in as a user who should NOT have access to Highly Confidential content.
4) Query Copilot with prompts designed to surface the test documents ("summarize the financial forecast" or "what are the Q4 projections").
5) Verify that Copilot respects the label-based restrictions: Highly Confidential documents should not appear in results for users without appropriate clearance.
6) Test with unlabeled documents containing similar content to confirm Copilot CAN access them (proving labels are the differentiator).
7) Document results in a test matrix.
EPC Group performs this testing as part of every 47-Point Assessment, using controlled test data to validate that labels are functioning as expected.
EPC Group performs Copilot & M365 Tenant Security Reviews for enterprises across all industries. With 700+ tenants secured and 29 years of Microsoft expertise, we identify exactly what Copilot can access that it shouldn't.
Our 47-Point Assessment measures your actual label enforcement rate — not just configuration status — and provides a prioritized remediation roadmap to achieve 80%+ coverage before Copilot deployment.
Label protection level determines whether Copilot can access content. Only encryption blocks Copilot. Everything else is advisory.
Labels with encryption: Copilot can access this content only if the user has decryption rights. If the user does not have those rights, the content is excluded from Copilot's response. This is the only label type that actually restricts Copilot.
Labels with access restrictions but no encryption: Copilot can access and include this content in responses. The label restricts manual actions (downloading, printing, forwarding) but does not block Copilot queries.
Headers, footers, and watermarks are visual indicators only. Copilot ignores them entirely. The label does not restrict data access in any way.
Copilot treats unlabeled content the same as content labeled "Public." It is fully accessible and includable in any response.
Most organizations have years of documents created before any label policy existed. Labels published to users, including client-side auto-labeling in Office apps, apply only to content that is created or edited going forward. Legacy content stays unlabeled and remains fully accessible to Copilot.
Users can dismiss or ignore label prompts if mandatory labeling is not enforced. In most tenants, label application is optional. Copilot searches content regardless of whether users chose to label it.
Client-side auto-labeling applies during document creation but relies on users having a current Office client. Service-side auto-labeling runs in the background — but most organizations enable it only for Exchange, not for SharePoint and OneDrive where the bulk of sensitive files live.
Without tracking label adoption rates in Purview, organizations do not know how much of their sensitive content is unlabeled. The gap grows silently until Copilot surfaces it.
Client-side auto-labeling applies labels automatically when users create or edit documents in Office applications. The label is suggested or applied based on content detection rules. It is best for new content going forward.
Service-side auto-labeling scans existing content at rest across SharePoint, OneDrive, and Exchange and applies labels retroactively to legacy content. This is the mechanism that closes the legacy content gap for Copilot.
Only labels backed by Azure Rights Management encryption restrict Copilot access. Labels with content marking only (headers, footers, watermarks) do not block Copilot. A label that is configured but not enforced — or that does not cover legacy content — provides zero protection.
Configuration means labels exist and are published to users. Enforcement means users must apply labels (mandatory labeling), auto-labeling covers legacy content, and adoption rates are measured and tracked. Most organizations have configuration only.
In Microsoft Purview, go to Information Protection → Label activity. This shows labeled vs unlabeled file counts by location. You can also run Content Explorer to see labeled content by location and label type. Target: 80%+ of sensitive content covered before Copilot goes live.
Copilot treats unlabeled content as Public — fully accessible. It includes unlabeled content in responses without restriction. This is why the legacy content gap is critical: years of sensitive unlabeled documents are fully accessible to any Copilot-licensed user.
Create test documents with known synthetic sensitive data. Apply labels with different protection levels. Then test Copilot queries as both authorized and unauthorized users. Encryption-backed labels should block unauthorized access; content-marking labels will not.
EPC Group measures label adoption, deploys auto-labeling, and validates protection effectiveness before any Copilot license is assigned. Call (888) 381-9725 or schedule a sensitivity label enforcement review.