
Power BI Copilot 'Prep Data for AI' 2026: Git-Friendly Metadata Architecture for Regulated Industries
Power BI Copilot Prep Data for AI tooling format: governance, Git-friendly metadata architecture, sensitivity-label gating, audit patterns for HIPAA, SOC 2, FedRAMP.

For three years, regulated-industry enterprises have approached Microsoft Copilot in Power BI with caution. The capability is compelling — automatic natural-language summarization of report data, conversational analysis of complex models — but the audit and governance questions have been substantial. What does the auditor see when they ask "show me every Copilot response generated last quarter that summarized data with a Confidential sensitivity label"? Who approved the Copilot synonym that changed how Net Revenue is described to consumers? How do we know the language model is not surfacing detail to a consumer who should not see it?
The May 2026 Microsoft Fabric and Power BI release answers most of these questions through three converging capabilities:
The Copilot Tooling Format ("Prep Data for AI") puts Copilot metadata into Git-managed text files alongside the TMDL semantic-model definition, so changes to synonyms and descriptions go through the same code-review process as changes to DAX measures.
Microsoft Purview sensitivity labels propagate end-to-end through Fabric, gating Copilot behavior on confidentiality and triggering Microsoft Sentinel audit events that flow into existing SIEM pipelines.
Fabric audit logging at the capacity level captures Copilot prompts and responses for downstream review.
This guide is for enterprise data and security leaders responsible for a Power BI Copilot rollout in a regulated environment. We cover the architecture, the governance patterns, and the implementation framework EPC Group has refined across hundreds of regulated-industry engagements.
Three factors converge in mid-2026 to make Copilot rollout decisions urgent:
The Copilot Summarize feature shipped in May 2026 puts AI-generated descriptions of report data directly in front of every report consumer who clicks the Summarize button. This is the first Copilot capability with that surface area. Tenants that have not completed the sensitivity-label coverage work need to do it now.
The Copilot Tooling Format closes the source-control gap. Previously, Copilot metadata was stored in a format that made Git workflows awkward. The new format is text-based, diff-able, and merge-friendly. Enterprises that held back broader Copilot rollout because of source-control friction can now proceed.
Regulator scrutiny on AI-generated content has increased. HIPAA's 2026 access-control updates, SR 11-7 model risk management expectations for AI in financial services, and FedRAMP's emerging AI governance expectations all add weight to the audit-trail and explainability requirements for Copilot-generated summaries.
A Power BI semantic model that works well with Copilot has three layers of metadata:
The TMDL (Tabular Model Definition Language) file describes the model — tables, columns, measures, relationships, RLS rules, OLS rules, perspectives. This is the authoritative model definition and lives under version control in the team's Git repository.
Within the TMDL, each table, column, and measure has a Description property. Copilot reads these descriptions and uses them in its summaries. A measure named [Net Revenue] with a description "Net revenue after returns, allowances, and trade discounts, in reporting currency" gives Copilot the context it needs to summarize that metric correctly.
This layer is part of the semantic model definition and is versioned along with the model. It is the foundation of Copilot quality.
The Copilot Tooling Format covers three additional concerns that the TMDL Description property cannot cleanly express:
Synonyms. Alternate business terms that should resolve to the same model concept. The synonym file maps ["Net Sales", "Topline Revenue", "Gross Revenue After Returns"] to the [Net Revenue] measure. When a user asks Copilot about "topline revenue trends," Copilot understands the user means Net Revenue.
Description overrides. Sometimes the technical description in the TMDL is correct for engineers but wrong for business users. The description override file provides the audience-appropriate phrasing that Copilot should use in summaries shown to consumers.
Sample questions. The canonical questions Copilot should be ready to answer for this model. These guide the language model and help users discover what they can ask.
The three layers work together. Layers 1 and 2 (the TMDL model definition and its Description properties) are the foundation; Layer 3 (the Copilot Tooling Format) tunes Copilot's behavior on top of that foundation.
EPC Group's recommended repository structure for a governed enterprise Power BI / Fabric environment:
/fabric-tenant-repo/
├── semantic-models/
│   ├── sales-finance/
│   │   ├── definition.tmdl
│   │   ├── model.bim (legacy fallback)
│   │   ├── perspectives/
│   │   │   ├── executive.tmdl
│   │   │   └── operations.tmdl
│   │   └── copilot/
│   │       ├── synonyms.json
│   │       ├── descriptions.json
│   │       └── sample-questions.json
│   ├── operations/
│   ├── compliance/
│   └── _shared/
│       └── common-dimensions.tmdl
├── reports/
│   ├── certified/
│   │   └── finance-executive-summary.pbip/
│   └── self-service/
├── governance/
│   ├── sensitivity-label-map.yaml
│   ├── capacity-allocation.yaml
│   ├── rls-rules.md
│   └── copilot-policy.md
├── pipelines/
│   ├── ci-build.yaml
│   ├── cd-deploy.yaml
│   └── pre-commit-hooks.yaml
└── docs/
    ├── README.md
    └── runbooks/
The Copilot metadata lives in /semantic-models/<model>/copilot/. A change to a synonym creates a diff in synonyms.json that goes through pull-request review the same way a DAX measure change does.
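A CI step can enforce this layout before review starts. The following is a minimal sketch, assuming the folder names from the tree above; the function name and the rule that folders beginning with an underscore are skipped are our illustrative choices, not part of the Tooling Format.

```python
from pathlib import Path

# The three Copilot metadata files shown in the repository layout.
REQUIRED_COPILOT_FILES = ("synonyms.json", "descriptions.json", "sample-questions.json")


def find_missing_copilot_files(repo_root: Path) -> dict[str, list[str]]:
    """Return {model_name: [missing file names]} for every model folder.

    Assumption: folders starting with "_" (such as _shared) hold shared
    fragments rather than deployable models, so they are skipped.
    """
    missing: dict[str, list[str]] = {}
    models_dir = repo_root / "semantic-models"
    for model_dir in sorted(p for p in models_dir.iterdir() if p.is_dir()):
        if model_dir.name.startswith("_"):
            continue
        copilot_dir = model_dir / "copilot"
        absent = [name for name in REQUIRED_COPILOT_FILES
                  if not (copilot_dir / name).is_file()]
        if absent:
            missing[model_dir.name] = absent
    return missing
```

A pipeline job could call `find_missing_copilot_files(Path("."))` and fail the build when the result is non-empty, so a model never reaches review without its Copilot metadata.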
Example synonyms.json:
{
  "model": "sales-finance",
  "version": "1.4.0",
  "lastReviewed": "2026-05-14",
  "synonyms": [
    {
      "concept": "measures/Net Revenue",
      "terms": [
        "Net Sales",
        "Topline Revenue",
        "Revenue After Returns",
        "NR"
      ],
      "deprecated": [],
      "owner": "finance-bi-team"
    },
    {
      "concept": "tables/Customer",
      "terms": ["Account", "Client", "Buyer"],
      "deprecated": ["Customer Master"],
      "owner": "customer-data-team"
    }
  ]
}
The deprecated array tracks synonyms that were previously valid but are being retired. This matters for audit purposes — the auditor's question "what was the synonym definition for Customer as of January 1, 2026" is answered by looking at the file at the Git tag for that date.
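Because the synonyms file is plain JSON, a pre-commit or pull-request check can catch common mistakes before a reviewer sees them. The sketch below assumes the field names from the example file (`concept`, `terms`, `deprecated`, `owner`); the specific rules it applies (no term that is both active and deprecated, no term claimed by two concepts, every entry has an owner) are illustrative governance choices rather than requirements of the format.

```python
def validate_synonyms(doc: dict) -> list[str]:
    """Return human-readable problems found in a parsed synonyms document."""
    problems: list[str] = []
    seen: dict[str, str] = {}  # lower-cased active term -> concept that owns it
    for entry in doc.get("synonyms", []):
        concept = entry.get("concept", "<missing concept>")
        deprecated = {t.lower() for t in entry.get("deprecated", [])}
        if not entry.get("owner"):
            problems.append(f"{concept}: no owner set")
        for term in entry.get("terms", []):
            key = term.lower()
            # A retired term must not also be listed as active.
            if key in deprecated:
                problems.append(f"{concept}: '{term}' is both active and deprecated")
            # An active term should resolve to exactly one concept.
            if key in seen and seen[key] != concept:
                problems.append(f"'{term}' maps to both {seen[key]} and {concept}")
            seen.setdefault(key, concept)
    return problems
```

The function takes the document as already parsed (for example with `json.load`), so it can run against the working copy in a hook or against the file content at any historical Git tag.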
Example descriptions.json:
{
  "model": "sales-finance",
  "version": "1.4.0",
  "descriptions": [
    {
      "concept": "measures/Net Revenue",
      "default": "Net revenue after returns, allowances, and trade discounts",
      "audience": {
        "executive": "Total revenue we recognize after subtracting returns and discounts",
        "analyst": "SUM(Sales) - SUM(Returns) - SUM(Discounts), in reporting currency"
      }
    }
  ]
}
The audience-specific descriptions let Copilot use different phrasing for executive summaries vs. analyst-facing summaries. Power BI Copilot can be configured to select the audience description based on the consumer's group membership.
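The selection logic can be pictured as a lookup with a fallback. This is a sketch of the idea only, not Copilot's actual resolution code: the `group_to_audience` mapping and the group names are hypothetical and would come from the tenant's own Copilot policy.

```python
def resolve_description(entry: dict, user_groups: set[str],
                        group_to_audience: dict[str, str]) -> str:
    """Pick the audience-specific description for a user, else the default.

    group_to_audience is a hypothetical mapping from directory group name
    to audience key; it is not part of the descriptions file itself.
    """
    audiences = entry.get("audience", {})
    # Walk the mapping in declaration order so precedence is predictable
    # when a user belongs to several mapped groups.
    for group, audience in group_to_audience.items():
        if group in user_groups and audience in audiences:
            return audiences[audience]
    return entry["default"]
```

Falling back to `default` means a user in no mapped group still gets a reviewed description rather than none.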
Example sample-questions.json:
{
  "model": "sales-finance",
  "version": "1.4.0",
  "sampleQuestions": [
    "What is our net revenue this quarter compared to last quarter?",
    "Which product line had the largest revenue decline last month?",
    "Show me the top 10 customers by net revenue year-to-date",
    "How has the gross margin trended over the past 12 months?"
  ]
}
Sample questions guide both the Copilot model (helping it understand the typical query patterns for this model) and the user interface (Power BI Copilot can surface these as suggested prompts).
Microsoft Purview sensitivity labels (Public, Internal, Confidential, Highly Confidential, and any custom labels the tenant has defined) apply across the Microsoft 365, Azure, and Fabric environment. For Power BI semantic models and reports, the label appears on the item and on derived items.
The Copilot behavior is gated as follows:
| Label | Copilot Summarize behavior |
|---|---|
| Public | Generates summary, no restriction |
| Internal | Generates summary, audit log entry created |
| Confidential | Generates summary, audit log entry, may include label disclaimer in response |
| Highly Confidential (with "block Copilot processing" flag) | Refuses to summarize, returns label-aware message |
| Custom labels | Behavior defined per label policy |
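For policy reviews and test cases, it can help to write the table down as an explicit decision function. The sketch below mirrors the rows above and nothing more. Two choices in it are our assumptions, not documented Copilot behavior: refusals are treated as audited, and any label not covered by the table or by a tenant-supplied `custom_policy` fails closed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SummarizeDecision:
    generate: bool      # does Copilot produce a summary?
    audit_entry: bool   # does the table list an audit log entry?
    disclaimer: bool    # may the response carry a label disclaimer?


# Rows of the gating table for labels without a block flag.
BUILT_IN = {
    "Public": SummarizeDecision(generate=True, audit_entry=False, disclaimer=False),
    "Internal": SummarizeDecision(generate=True, audit_entry=True, disclaimer=False),
    "Confidential": SummarizeDecision(generate=True, audit_entry=True, disclaimer=True),
}

# Assumption: refusals are audited, and unknown labels fail closed.
REFUSE = SummarizeDecision(generate=False, audit_entry=True, disclaimer=False)


def summarize_decision(label: str, blocks_copilot: bool = False,
                       custom_policy: Optional[dict] = None) -> SummarizeDecision:
    """Return the expected Summarize behavior for a label."""
    if blocks_copilot:
        return REFUSE
    if label in BUILT_IN:
        return BUILT_IN[label]
    # Custom labels: behavior is defined per label policy.
    return (custom_policy or {}).get(label, REFUSE)
```

A function like this is useful mainly as a test oracle: the security team can compare what Copilot actually does for each label against what the policy says it should do.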
The exact behavior depends on the tenant's Microsoft Purview label policy configuration. The standard pattern in our regulated-industry deployments:
Before enabling Copilot Summarize tenant-wide, the data security team should validate:
Microsoft Fabric capacity-level audit logging captures Copilot interactions:
These events flow into the Microsoft Purview Audit log (Standard) and, for tenants that have configured the routing, into Microsoft Sentinel.
For regulated-industry tenants with an established SIEM, the audit-log routing pattern is:
Microsoft Fabric (Power BI Copilot interactions)
↓
Microsoft Purview Audit (Standard)
↓
Microsoft Sentinel (via the Microsoft Defender for Cloud Apps connector
or the direct Microsoft Purview connector)
↓
Analytic rules (regulated-industry rule pack):
- Copilot prompt against Highly Confidential data
- Anomalous Copilot prompt volume per user
- Copilot prompt containing PII-like patterns
- Copilot prompt from outside the expected geographic region
The analytic rule pack is industry-specific. Healthcare tenants extend with HIPAA-aligned rules; financial services tenants extend with SR 11-7 and SOX-aligned rules; federal tenants extend with FedRAMP and NIST 800-53 aligned rules.
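Production Sentinel analytic rules are written in KQL against the ingested tables. The Python sketch below only illustrates the logic of the four baseline rules listed above so that thresholds can be discussed with the security team. The event field names (`user`, `label`, `prompt`, `region`), the regular expressions, and the volume threshold are all hypothetical placeholders, not the real audit schema.

```python
import re
from collections import Counter

# Deliberately simple PII-like patterns for illustration only.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_LIKE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")


def evaluate_events(events: list[dict], expected_regions: set[str],
                    max_prompts_per_user: int) -> list[tuple[str, str]]:
    """Return (rule_name, user) alerts for a batch of Copilot audit events."""
    alerts: list[tuple[str, str]] = []
    for event in events:
        if event["label"] == "Highly Confidential":
            alerts.append(("highly_confidential_prompt", event["user"]))
        if SSN_LIKE.search(event["prompt"]) or EMAIL_LIKE.search(event["prompt"]):
            alerts.append(("pii_like_prompt", event["user"]))
        if event["region"] not in expected_regions:
            alerts.append(("unexpected_region", event["user"]))
    # Volume rule: a fixed threshold stands in for real anomaly detection.
    per_user = Counter(event["user"] for event in events)
    for user, count in sorted(per_user.items()):
        if count > max_prompts_per_user:
            alerts.append(("anomalous_volume", user))
    return alerts
```

Industry-specific extensions would add further rules in the same shape, each mapped to the control it supports.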
Audit retention windows vary by regulatory framework:
Microsoft Purview Audit (Standard) retains records for 180 days by default (the default was 90 days before October 2023); Audit (Premium) supports retention of 1 year. For longer retention, archive the audit logs to Azure Storage, or export them from Microsoft Sentinel to Azure Data Explorer for cost-effective long-term retention.
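The archive decision is simple date arithmetic: if the framework requires a longer window than the audit tier retains natively, each event has to be copied out before it expires. A minimal sketch follows; the retention and requirement values are parameters to be filled in from the tenant's licensing and the applicable framework, and the seven-day safety margin is an arbitrary illustrative default.

```python
from datetime import date, timedelta
from typing import Optional


def expires_on(event_date: date, retention_days: int) -> date:
    """Date on which the audit tier stops retaining the event."""
    return event_date + timedelta(days=retention_days)


def archive_by(event_date: date, retention_days: int, required_days: int,
               margin_days: int = 7) -> Optional[date]:
    """Latest date to archive the event, or None if native retention suffices."""
    if required_days <= retention_days:
        return None
    return expires_on(event_date, retention_days) - timedelta(days=margin_days)
```

For example, with a one-year native window (365 days) and a placeholder requirement of 2,190 days, an event logged on 2026-05-14 would need to be archived by 2027-05-07.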
For healthcare enterprises rolling out Copilot in Power BI on PHI-touching data:
For financial services enterprises:
For federal-sector enterprises:
For regulated-industry enterprises deploying Power BI Copilot, the implementation pattern that delivers consistent results without compliance friction:
Weeks 1–2: Discovery and gap analysis.
Weeks 3–6: Foundation.
Weeks 7–10: Metadata population.
Weeks 11–12: Pilot.
Weeks 13–14: Governance update.
Weeks 15–16: Broad rollout.
Across the regulated-industry Copilot rollouts we have guided in 2026, the recurring problem patterns:
Enabling Copilot before completing sensitivity-label coverage. A model with no explicitly assigned label falls back to the tenant's default sensitivity label, which is usually too permissive for regulated data. Complete label coverage first, then enable Copilot broadly.
Treating the Copilot Tooling Format as optional. Copilot will function without it, but the quality of summaries is substantially better with it. The investment is modest (typically 2–4 hours per semantic model) and pays back quickly in user adoption.
Skipping the audit-log routing setup. Tenants that enable Copilot without routing audit events into Sentinel (or the equivalent SIEM) discover the gap during the first regulatory review. Set up routing before broad enablement.
Underestimating the workforce training burden. Especially in healthcare, the HIPAA Security Rule workforce training requirement extends to Copilot. The training is not heavy (typically 20–30 minutes of content) but it needs to happen.
Letting business units self-author synonyms without governance. Synonyms are powerful and change Copilot's behavior in ways that surprise users. Synonym changes should go through the same code-review process as DAX measure changes.
Forgetting to update the model risk inventory. For financial services, the SR 11-7 model risk inventory must include Copilot. Banks that have not done this discover it during the next model risk audit.
The Copilot Tooling Format is the May 2026 GA storage format for Power BI Copilot metadata — synonyms (alternate business terms for model concepts), description overrides (audience-appropriate phrasing for Copilot to use), and sample questions (canonical questions Copilot should be ready to answer). The format is text-based, Git-friendly, and integrates cleanly into existing TMDL-based development pipelines.
No. Power BI Copilot will work without the Tooling Format using only the TMDL descriptions. The Tooling Format improves Copilot quality substantially by providing synonyms, audience-specific descriptions, and sample questions. We recommend implementing it for any model where Copilot will be exposed to broad user populations.
Microsoft Purview sensitivity labels can include a "block Copilot processing" policy. Labels with that flag prevent Copilot from generating summaries for the labeled content. The label propagates from the semantic model to the reports built on it. The tenant's label policy defines which labels block Copilot processing.
Power BI Copilot interactions generate audit events at the Fabric capacity level. The events include: user identity, timestamp, semantic model and report context, prompt text, response generated, and sensitivity label context. The events flow into Microsoft Purview Audit (Standard) and can be routed to Microsoft Sentinel.
The HIPAA Security Rule applies to Copilot the same way it applies to any other workforce member or system that accesses PHI. The covered entity must verify that the Microsoft BAA covers the Copilot service (it does for the current Fabric Copilot offering) and must establish appropriate access controls, audit logging, and workforce training. De-identified data is outside HIPAA scope and is generally the simpler path for Copilot rollout.
Most banks classify Power BI Copilot as a model under SR 11-7. The model risk function creates an inventory entry, performs periodic effective challenge, and documents a governance review. The risk classification is typically moderate given the contained surface area (summarization rather than decision-making).
When a report visual is backed by multiple models (typically through composite models or DirectQuery to a remote semantic model), Copilot Summarize uses the metadata from each model. The visual's effective sensitivity label is the highest sensitivity of the contributing models.
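The "highest sensitivity wins" rule can be expressed as a maximum over an ordered label list. The sketch below ranks the four standard labels named earlier in this guide. The numeric ranks are our own illustrative encoding, and a tenant with custom labels would need to assign them a position in the order.

```python
# Rank of each standard label, lowest to highest sensitivity.
LABEL_RANK = {
    "Public": 0,
    "Internal": 1,
    "Confidential": 2,
    "Highly Confidential": 3,
}


def effective_label(model_labels: list[str]) -> str:
    """Return the most sensitive label among the contributing models.

    Raises KeyError for a label that has no rank, so an unranked custom
    label cannot silently be treated as low sensitivity.
    """
    if not model_labels:
        raise ValueError("at least one contributing model label is required")
    return max(model_labels, key=lambda label: LABEL_RANK[label])
```

A visual combining an Internal sales model with a Confidential HR model would therefore be treated as Confidential for gating purposes.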
Yes. The Copilot Tooling Format supports audience-specific descriptions. The audience is typically determined by the user's group membership at the time of the Copilot interaction. The configuration is in the description override file and the tenant's Copilot policy.
Mark the synonym as deprecated in the synonyms file (move it from the terms array to the deprecated array). Monitor usage through the Copilot audit logs. After a stable absence-of-use window (typically 30–90 days depending on the model's user base), remove the deprecated synonym entirely in a subsequent release.
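Both steps of that process are easy to script. The sketch below assumes the entry shape from the synonyms file (`terms` and `deprecated` arrays). The `last_used` date would come from the Copilot audit logs; how it is extracted is outside this example, and the 90-day default is simply the upper end of the window mentioned above.

```python
from datetime import date
from typing import Optional


def deprecate_term(entry: dict, term: str) -> dict:
    """Return a copy of the entry with term moved from terms to deprecated."""
    if term not in entry["terms"]:
        raise ValueError(f"'{term}' is not an active term for {entry['concept']}")
    return {
        **entry,
        "terms": [t for t in entry["terms"] if t != term],
        "deprecated": [*entry["deprecated"], term],
    }


def safe_to_remove(deprecated_on: date, last_used: Optional[date],
                   today: date, quiet_days: int = 90) -> bool:
    """True once the term has gone unused for the whole quiet window."""
    # The window starts at the later of the deprecation and the last use.
    reference = max(deprecated_on, last_used) if last_used else deprecated_on
    return (today - reference).days >= quiet_days
```

Because `deprecate_term` returns a new entry instead of mutating the input, the change shows up as a clean two-line diff in the pull request.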
TMDL descriptions are part of the semantic model definition. They are shown in tooltips and used by Copilot as the default description. The Copilot description override is part of the Copilot Tooling Format and provides audience-specific phrasing that Copilot uses in summaries. The override is layered on top of the TMDL description.
For a model the team is familiar with, allow 2–4 hours per model for the initial Tooling Format population (synonyms, descriptions, sample questions). Subsequent refinement based on production usage feedback is ongoing but lightweight.
Yes. Power BI Desktop's Copilot setup experience can edit the metadata files through a UI. For team-based development, we recommend the text-editor + Git workflow because it preserves the diff history and code-review process.
Microsoft Sentinel includes a Microsoft Defender for Cloud Apps connector that captures Power BI activity events, including Copilot interactions. Additional analytic rules can be authored against these events. Microsoft has published a Copilot-specific analytic rule library that tenants can deploy as a starting point.
EPC Group works with healthcare, financial services, and federal-sector enterprises on Power BI Copilot rollouts aligned to HIPAA, SOC 2, SOX, SR 11-7, and FedRAMP frameworks. The standard pattern is a 16-week engagement covering discovery, foundation, metadata population, pilot, governance update, and broad rollout. Our consultants — including Microsoft Press bestselling author Errin O'Connor — bring direct experience across hundreds of regulated-industry Copilot deployments.
Capacity consumption depends on adoption rate. A typical pattern after broad enablement is 1,000–2,500 Summarize invocations per week in a 5,000-user tenant. The CU consumption is workload-specific but typically requires F-SKU sizing at F4 or larger for the Copilot workload alone, beyond the existing Power BI workload.
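For a first planning estimate, the observed range can be scaled to another tenant size. The sketch below assumes usage scales linearly with user count, which is a simplification; real adoption varies by role mix and report portfolio, so treat the output as a starting point for capacity monitoring rather than a forecast.

```python
# Observed range quoted above: 1,000 to 2,500 invocations per week
# in a 5,000-user tenant.
REFERENCE_USERS = 5_000
REFERENCE_LOW = 1_000
REFERENCE_HIGH = 2_500


def estimate_weekly_invocations(users: int) -> tuple[int, int]:
    """Scale the reference range linearly to a tenant with `users` users."""
    if users < 0:
        raise ValueError("users must be non-negative")
    low = users * REFERENCE_LOW // REFERENCE_USERS
    high = users * REFERENCE_HIGH // REFERENCE_USERS
    return low, high
```

Under this linear assumption, a 12,000-user tenant would plan for roughly 2,400 to 6,000 Summarize invocations per week.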
If your enterprise is preparing to roll out Power BI Copilot in a regulated environment, the practical next steps:
EPC Group has 29 years of enterprise Microsoft consulting experience and is a Microsoft Solutions Partner with the core designations. We were the oldest continuous Microsoft Gold Partner in North America from 2016 until the program's retirement. Our consultants, including Microsoft Press bestselling author Errin O'Connor, bring direct experience across hundreds of regulated-industry Copilot deployments in healthcare, financial services, and government. To discuss your Copilot rollout, contact EPC Group for a 30-minute discovery call.
CEO & Chief AI Architect
Microsoft Press bestselling author with 29 years of enterprise consulting experience.