EPC Group - Enterprise Microsoft AI, SharePoint, Power BI, and Azure Consulting

EPC Group

Enterprise Microsoft consulting with 29 years serving Fortune 500 companies.

(888) 381-9725
contact@epcgroup.net
4900 Woodway Drive, Suite 830
Houston, TX 77056

About EPC Group

EPC Group is a Microsoft consulting firm founded in 1997 (originally Enterprise Project Consulting, renamed EPC Group in 2005). 29 years of enterprise Microsoft consulting experience. EPC Group historically held the distinction of being the oldest continuous Microsoft Gold Partner in North America from 2016 until the program's retirement. Because Microsoft officially deprecated the Gold/Silver tiering framework, EPC Group transitioned to the modern Microsoft Solutions Partner ecosystem and currently holds the core Microsoft Solutions Partner designations.

Headquartered at 4900 Woodway Drive, Suite 830, Houston, TX 77056. Public clients include NASA, FBI, Federal Reserve, Pentagon, United Airlines, PepsiCo, Nike, and Northrop Grumman. 6,500+ SharePoint implementations, 1,500+ Power BI deployments, 500+ Microsoft Fabric implementations, 70+ Fortune 500 organizations served, 11,000+ enterprise engagements, 200+ Microsoft Power BI and Microsoft 365 consultants on staff.

About Errin O'Connor

Errin O'Connor is the Founder, CEO, and Chief AI Architect of EPC Group. Microsoft MVP multiple years, first awarded 2003. 4× Microsoft Press bestselling author of Windows SharePoint Services 3.0 Inside Out (MS Press 2007), Microsoft SharePoint Foundation 2010 Inside Out (MS Press 2011), SharePoint 2013 Field Guide (Sams/Pearson 2014), and Microsoft Power BI Dashboards Step by Step (MS Press 2018).

Original SharePoint Beta Team member (Project Tahoe). Original Power BI Beta Team member (Project Crescent). FedRAMP framework contributor. Worked with U.S. CIO Vivek Kundra on the Obama administration's 25-Point Plan to reform federal IT, and with NASA CIO Chris Kemp as Lead Architect on the NASA Nebula Cloud project. Speaker at Microsoft Ignite, SharePoint Conference, KMWorld, and DATAVERSITY.

© 2026 EPC Group. All rights reserved. Microsoft, SharePoint, Power BI, Azure, Microsoft 365, Microsoft Copilot, Microsoft Fabric, and Microsoft Dynamics 365 are trademarks of the Microsoft group of companies.

February 27, 2026|24 min read|Power BI Consulting

Power BI Dataflows: The Enterprise Guide to Self-Service Data Preparation, Gen2, and Shared Dataflows

Power BI Dataflows are the foundation of scalable enterprise analytics, enabling centralized, governed, reusable data preparation that eliminates duplicate transformation logic across reports. This guide covers the complete Dataflow architecture -- Gen1 vs. Gen2, incremental refresh configuration, linked and computed entities, Microsoft Fabric lakehouse integration, governance frameworks, and implementation strategies -- based on 150+ enterprise deployments by EPC Group.

Table of Contents

  • Why Enterprise Analytics Needs Dataflows
  • Dataflows Gen1 vs. Gen2 Comparison
  • Three-Layer Enterprise Dataflow Architecture
  • Incremental Refresh Deep Dive
  • Linked Entities and Computed Entities
  • Microsoft Fabric Lakehouse Integration
  • Data Transformation Best Practices
  • Governance and Monitoring
  • Implementation Roadmap
  • Partner with EPC Group

Power BI Dataflows Enterprise Guide 2026

Last updated: 2026 · Read time: 10 min

Power BI Dataflows are the centralized data preparation layer between source systems and Power BI semantic models. Dataflows Gen2 in Microsoft Fabric extends this with OneLake output, Fabric compute, and 150+ connectors. This guide covers Gen1 vs. Gen2 architecture, incremental refresh, linked entities, governance, and EPC Group's implementation patterns from 150+ deployments.

Key facts

  • EPC Group: 150+ enterprise Power BI Dataflow implementations across healthcare, financial services, education, and government.
  • Microsoft Gold Partner 2003–2022 (oldest continuous in North America). Now a Microsoft Solutions Partner.
  • Incremental refresh reduces average enterprise Dataflow refresh times from 35 minutes to 4 minutes (88% reduction).
  • Incremental refresh reduces source system query load by 90%.
  • Dataflows Gen2 outputs to any Fabric destination: lakehouse, warehouse, or KQL database — not just CDM folders.

What Power BI Dataflows do

Without Dataflows, every Power BI report author connects directly to source systems and applies their own ad-hoc transformations. This creates inconsistency, duplicated logic, and source system overload.

With Dataflows, a central team extracts data from sources, applies standardized business logic, and stores the results in a certified location. Any report then consumes the pre-transformed, certified data — not the raw source.

Gen1 vs. Gen2: key differences

The critical difference is destination flexibility. Choose your generation based on your current platform (Power BI Premium vs. Microsoft Fabric).

  • Gen1 output — writes to CDM (Common Data Model) folders. Consumed by Power BI datasets via import.
  • Gen2 output — writes directly to Fabric lakehouse Delta tables. Available immediately via Spark notebooks, SQL analytics endpoint, and Power BI Direct Lake.
  • Gen2 compute — uses Fabric compute engines for faster transformation performance.
  • Gen2 orchestration — integrates with Fabric data pipelines for complex dependency management.
  • Gen2 connectors — 150+ data connectors including all Gen1 sources plus Fabric-native sources.

Incremental refresh: the enterprise performance lever

Incremental refresh is the highest-impact Dataflow configuration for enterprise deployments. Configure it for any Dataflow that refreshes more than 1 GB of data or runs more than twice daily.

  • Refresh time reduction: from 35 minutes to 4 minutes average (88% reduction based on EPC Group deployments).
  • Source system load reduction: 90% reduction in source system queries per refresh cycle.
  • Refresh frequency: incremental refresh enables schedules as frequent as every 30 minutes for near-real-time scenarios.
  • Configuration: define a rolling window (e.g., keep 3 years of data, refresh only the last 14 days on each run).
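
The rolling window in the last bullet can be sketched in a few lines of Python. This is an illustration only, not how the Power BI service computes partitions internally: the function name, the calendar-year approximation of "3 years", and the sample run date are all assumptions.

```python
from datetime import date, timedelta

def refresh_windows(run_date: date, archive_years: int = 3, refresh_days: int = 14):
    """Return (archive_start, refresh_start, refresh_end) for one refresh run.

    Rows in [archive_start, refresh_start) are kept but not re-queried.
    Rows in [refresh_start, refresh_end) are reloaded from the source.
    """
    refresh_end = run_date + timedelta(days=1)  # exclusive upper bound
    refresh_start = refresh_end - timedelta(days=refresh_days)
    # Assumption: "N years" means the same calendar day N years earlier.
    try:
        archive_start = run_date.replace(year=run_date.year - archive_years)
    except ValueError:  # run_date is Feb 29 and the target year has no Feb 29
        archive_start = run_date.replace(year=run_date.year - archive_years, day=28)
    return archive_start, refresh_start, refresh_end

# Hypothetical run date, used only for illustration.
archive_start, refresh_start, refresh_end = refresh_windows(date(2026, 2, 27))
stored_days = (refresh_end - archive_start).days
reloaded_days = (refresh_end - refresh_start).days
print(archive_start, refresh_start, refresh_end)
print(f"{reloaded_days} of {stored_days} stored days are re-queried per run")
```

Only the days inside the refresh window touch the source system on each run, which is where the reduction in source query load comes from.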

Linked entities and shared Dataflows

Linked entities let multiple Dataflows consume the output of a single "certified" Dataflow without re-running the transformation. This is the Dataflow equivalent of a certified semantic model.

  • Certified Dataflow — master transformation owned by the CoE. Refreshes once daily from source.
  • Linked Dataflow — department-level Dataflow that reads from the certified Dataflow and applies department-specific filters or calculations without re-querying the source.
  • Computed entities — transform Linked entity output further using Power Query in a child Dataflow. Enables 3-tier architectures (Bronze → Silver → Gold).
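
One way to picture the certified, linked, and computed tiers is as a dependency graph that is refreshed upstream first. The sketch below uses Python's standard graphlib module; the Dataflow names are hypothetical, and in practice the Power BI service or a Fabric pipeline handles the ordering.

```python
from graphlib import TopologicalSorter

# Hypothetical Dataflow names. Each key lists the Dataflows it reads from.
depends_on = {
    "certified_sales": set(),                           # queries the source system
    "finance_sales_linked": {"certified_sales"},        # linked entity plus filters
    "marketing_sales_linked": {"certified_sales"},      # linked entity plus filters
    "finance_kpis_computed": {"finance_sales_linked"},  # computed entity
}

# static_order() yields every Dataflow after the Dataflows it depends on.
refresh_order = list(TopologicalSorter(depends_on).static_order())

# Only Dataflows with no upstream Dataflow hit the source system.
source_queries = sum(1 for upstream in depends_on.values() if not upstream)

print(refresh_order)
print(f"Dataflows that query the source: {source_queries}")
```

With four Dataflows in this example, only the certified one queries the source. The other three read its stored output.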

Dataflows governance framework

Ungoverned Dataflows create the same problems as ungoverned Power BI reports — duplication, inconsistency, and orphaned transformations with no owner.

  • Every Dataflow must have a named owner and a documented refresh schedule.
  • Use workspace separation — Certified Dataflows in a governed workspace, experimental Dataflows in a dev workspace.
  • Enable endorsement — mark production Dataflows as "Certified" in the Power BI service.
  • Quarterly audit — identify Dataflows with no downstream dependents and decommission them.
  • Connect to Microsoft Purview — Dataflow lineage surfaces in the Purview Data Catalog for regulated environments.
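
The ownership, workspace, and audit rules above can be checked mechanically against an inventory export. The following is a minimal sketch, assuming a hand-built inventory: the record fields, workspace names, and sample entries are hypothetical and do not correspond to any Power BI API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class DataflowRecord:
    """One row of a hypothetical Dataflow inventory (all fields are assumptions)."""
    name: str
    workspace: str
    owner: Optional[str] = None
    refresh_schedule: Optional[str] = None
    certified: bool = False
    downstream: List[str] = field(default_factory=list)

def audit(inventory: List[DataflowRecord], governed_workspace: str) -> Dict[str, List[str]]:
    """Return {Dataflow name: [rule violations]} for the rules listed above."""
    findings: Dict[str, List[str]] = {}
    for record in inventory:
        issues = []
        if not record.owner:
            issues.append("no named owner")
        if not record.refresh_schedule:
            issues.append("no documented refresh schedule")
        if record.certified and record.workspace != governed_workspace:
            issues.append("certified outside the governed workspace")
        if not record.downstream:
            issues.append("no downstream dependents: decommission candidate")
        if issues:
            findings[record.name] = issues
    return findings

inventory = [
    DataflowRecord("certified_sales", "Governed", "coe@example.com", "daily 02:00",
                   certified=True, downstream=["finance_sales_linked"]),
    DataflowRecord("finance_sales_linked", "Governed", "finance@example.com",
                   "daily 03:00", downstream=["Finance KPI report"]),
    DataflowRecord("old_experiment", "Dev"),
]

findings = audit(inventory, governed_workspace="Governed")
for name, issues in findings.items():
    print(name, "->", "; ".join(issues))
```

Running a check like this once a quarter gives the audit a concrete worklist instead of a manual review.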

Dataflows vs. Azure Data Factory

Both tools move and transform data. The choice depends on complexity and persona.

  • Power BI Dataflows — designed for Power BI-centric transformations. Power Query UI. No code required. Best for analysts who own their data preparation.
  • Azure Data Factory (ADF) — designed for enterprise-scale ETL with complex orchestration, error handling, and multi-system dependencies. Best for data engineering teams building production pipelines.
  • Fabric Data Factory — the 2026 replacement for ADF in Fabric environments. Same capabilities, unified with OneLake governance.

Many enterprise deployments use both: ADF for source-to-landing-zone pipelines, and Dataflows Gen2 for the landing-zone-to-semantic-model transformation layer.

Frequently asked questions

What is the difference between Power BI Dataflows and datasets?

Dataflows prepare and store data in a shared data store (CDM folder or OneLake). Datasets (semantic models) consume that prepared data to build the analytical layer (measures, relationships, RLS). Dataflows are the ETL layer. Datasets are the analytical layer.

Do I need Dataflows if I already have Azure Data Factory?

You can use both. ADF handles source system extraction and complex orchestration. Dataflows handle the final transformation into the shape that semantic models consume. Many enterprises use ADF to land data in Azure SQL or OneLake, then Dataflows for the Power Query transformations consumed by semantic models.

What is Dataflows Gen2 and do I need Fabric to use it?

Dataflows Gen2 is available in Microsoft Fabric. It requires a Fabric capacity (F2+). Gen1 Dataflows continue to work in Power BI Premium workspaces without Fabric. Move to Gen2 when you need to output data directly to a Fabric lakehouse or warehouse.

How do I handle schema changes in source systems?

Set data type detection to "Do not detect column types" when connecting to sources whose schemas change. Add explicit type conversion steps in Power Query after the source connection step. Monitor Dataflow refresh failure logs: schema changes often surface there as type mismatch errors before they break reports.
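
The same idea (load raw values, then convert explicitly and report mismatches) can be illustrated outside Power Query. The Python sketch below is an analogy, not Power Query M: the expected schema, column names, and sample rows are assumptions made for the example.

```python
from datetime import date
from decimal import Decimal

# Hypothetical expected schema: column name -> explicit converter.
EXPECTED = {
    "order_id": int,
    "order_date": date.fromisoformat,
    "amount": Decimal,
}

def check_schema(columns):
    """Return (missing, unexpected) column names compared with EXPECTED."""
    missing = sorted(set(EXPECTED) - set(columns))
    unexpected = sorted(set(columns) - set(EXPECTED))
    return missing, unexpected

def convert_row(raw):
    """Convert raw text values explicitly; collect errors instead of failing."""
    converted, errors = {}, []
    for column, caster in EXPECTED.items():
        try:
            converted[column] = caster(raw[column])
        except (KeyError, TypeError, ValueError, ArithmeticError) as exc:
            errors.append(f"{column}: {exc!r}")
    return converted, errors

good = {"order_id": "42", "order_date": "2026-02-27", "amount": "19.90"}
drifted = {"order_id": "42", "order_date": "27/02/2026", "amount": "19.90"}

print(check_schema(["order_id", "amount", "region"]))
print(convert_row(good))
print(convert_row(drifted)[1])
```

Because every conversion is explicit, a changed source column shows up as a named error on a specific column rather than as a silently wrong type downstream.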

Schedule a Dataflows architecture review

EPC Group has completed 150+ enterprise Power BI Dataflow implementations. Talk to an architect about Dataflows Gen1 vs. Gen2 migration, incremental refresh configuration, or governance framework design. Call (888) 381-9725 or request a 30-minute discovery call.

Frequently Asked Questions

What are Power BI Dataflows and why should enterprises use them?

Power BI Dataflows are a self-service data preparation technology that enables business analysts and data engineers to extract, transform, and load (ETL) data using Power Query Online without writing code. Dataflows store transformed data in Azure Data Lake Storage Gen2 (Common Data Model format) or in a Microsoft Fabric lakehouse, making the data reusable across multiple Power BI datasets, reports, and other Azure services. For enterprises, Dataflows solve a critical problem: without them, every Power BI report author duplicates data transformation logic, leading to inconsistent business definitions, wasted processing resources, and maintenance nightmares. With Dataflows, you define the transformation once and every downstream report consumes the same certified data. EPC Group has implemented enterprise Dataflow architectures for 150+ organizations, typically reducing data preparation effort by 60% and eliminating business logic inconsistencies across reports.

What is the difference between Dataflows Gen1 and Dataflows Gen2?

Dataflows Gen1 is the original Power BI service feature that transforms data using Power Query Online and stores results in Azure Data Lake Storage Gen2 in CDM format. Dataflows Gen2 is the next-generation version in Microsoft Fabric that adds significant capabilities: output to any Fabric destination (lakehouse, warehouse, KQL database), faster performance through Fabric compute engines, data pipeline integration for orchestration, staging lakehouse for intermediate data, and 150+ data connectors. The key architectural difference is destination flexibility: Gen1 outputs only to CDM folders, while Gen2 lands data directly in Fabric lakehouse Delta tables for immediate availability via Spark notebooks, SQL analytics endpoints, and Power BI Direct Lake datasets. EPC Group recommends Gen2 for all new implementations and provides migration services for organizations moving from Gen1 to Gen2.

How does incremental refresh work in Power BI Dataflows?

Incremental refresh optimizes data refresh by only processing new or changed data rather than reloading the entire dataset. The Dataflow partitions data by a date/time column and maintains a rolling window: during refresh, only recent partitions (the refresh period) are reprocessed while historical partitions (the archive period) remain untouched. Configuration involves defining RangeStart and RangeEnd parameters in Power Query, filtering the source query using these parameters, and specifying archive and refresh periods. For example, a Dataflow with 3 years of sales data and a 7-day refresh period only refreshes the last 7 days on each run, reducing refresh time from 45 minutes to 3 minutes. EPC Group implements incremental refresh for all enterprise Dataflows processing more than 1 million rows, typically reducing refresh times by 80-95%.
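
The RangeStart and RangeEnd filter is easiest to get wrong at partition boundaries. The sketch below, with hypothetical daily partitions and timestamps, shows why the filter is usually written as a half-open interval (inclusive start, exclusive end): a row that falls exactly on a boundary is loaded once, not twice.

```python
from datetime import datetime

def in_partition(row_timestamp, range_start, range_end):
    """Half-open filter: RangeStart <= timestamp < RangeEnd."""
    return range_start <= row_timestamp < range_end

# Hypothetical rows, including one exactly on a partition boundary.
rows = [
    datetime(2026, 2, 20, 0, 0),
    datetime(2026, 2, 20, 23, 59),
    datetime(2026, 2, 21, 0, 0),
]

# Two adjacent daily partitions as (RangeStart, RangeEnd) pairs.
partitions = [
    (datetime(2026, 2, 20), datetime(2026, 2, 21)),
    (datetime(2026, 2, 21), datetime(2026, 2, 22)),
]

half_open_counts = [sum(in_partition(r, s, e) for r in rows) for s, e in partitions]
closed_counts = [sum(s <= r <= e for r in rows) for s, e in partitions]

print("half-open:", half_open_counts, "total", sum(half_open_counts))
print("closed:   ", closed_counts, "total", sum(closed_counts))
```

With the closed interval the boundary row is counted in both partitions, so the loaded total exceeds the number of source rows.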

What are linked entities and computed entities in Power BI Dataflows?

Linked entities and computed entities are enterprise features (requiring Premium or Fabric capacity) that enable data reuse without duplication. A linked entity references an entity from another Dataflow, consuming its output without re-extracting from the source system. A computed entity references other entities within the same Dataflow for further transformation, using the enhanced compute engine (SQL-backed) for dramatically faster joins, aggregations, and complex operations. The enterprise pattern is: Layer 1 Dataflows extract raw data, Layer 2 Dataflows use linked entities to reference Layer 1 and apply business transformations, and Layer 3 Dataflows use computed entities for final metrics. EPC Group implements this layered architecture to reduce total processing time by 50% and create a reusable data preparation layer.

How do Power BI Dataflows integrate with Microsoft Fabric lakehouses?

In Microsoft Fabric, Dataflows Gen2 output directly to lakehouse Delta tables. The Dataflow connects to source systems via Power Query Online, applies transformations, and writes results as Delta format to the lakehouse. The data is immediately available via SQL analytics endpoints for T-SQL queries, Spark notebooks for data science, and Power BI Direct Lake mode for zero-copy analytics. This integration enables business analysts to contribute to the enterprise lakehouse without learning Spark or Python. EPC Group designs hybrid architectures where Dataflows Gen2 handle structured business data ingestion while Spark notebooks handle complex data engineering, creating a unified lakehouse serving all analytics needs.

How much do Power BI Dataflows cost and what licensing is required?

Basic Dataflows Gen1 are available with Power BI Pro ($10/user/month) but with limitations: no linked entities, no computed entities, no enhanced compute engine. Enterprise features require Power BI Premium ($4,995/month for P1 capacity) or Premium Per User ($20/user/month). Dataflows Gen2 in Fabric require Fabric capacity (F2 starting at approximately $260/month, F64 at approximately $8,300/month). For a 500-user enterprise analytics team, EPC Group typically recommends Fabric F64 ($8,300/month) providing compute for 50-100 Dataflows with daily refresh. Total annual cost: approximately $100,000 for capacity plus $60,000 for Pro licenses. Implementation services range from $50,000 to $150,000 depending on data source complexity.
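
The annual totals quoted above follow from simple arithmetic. The sketch below reproduces them using only the prices stated in this answer. Those prices are inputs taken from the text, not verified list prices, so substitute current figures before budgeting.

```python
# All figures are the ones quoted in the answer above (USD).
USERS = 500
PRO_PER_USER_MONTH = 10        # Power BI Pro price as quoted
F64_PER_MONTH = 8_300          # approximate Fabric F64 price as quoted
SERVICES_RANGE = (50_000, 150_000)  # one-time implementation services

capacity_annual = F64_PER_MONTH * 12
licenses_annual = USERS * PRO_PER_USER_MONTH * 12
recurring_annual = capacity_annual + licenses_annual
first_year_low = recurring_annual + SERVICES_RANGE[0]
first_year_high = recurring_annual + SERVICES_RANGE[1]

print(f"Capacity per year:  ${capacity_annual:,}")
print(f"Licenses per year:  ${licenses_annual:,}")
print(f"Recurring per year: ${recurring_annual:,}")
print(f"First year total:   ${first_year_low:,} to ${first_year_high:,}")
```

The capacity line works out to $99,600, which the text rounds to approximately $100,000.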

Ready to get started?

EPC Group has completed over 10,000 implementations across Power BI, Microsoft Fabric, SharePoint, Azure, Microsoft 365, and Copilot. Let's talk about your project.

contact@epcgroup.net | (888) 381-9725 | www.epcgroup.net
Schedule a Free Consultation