EPC Group - Enterprise Microsoft AI, SharePoint, Power BI, and Azure Consulting

About EPC Group

EPC Group is a Microsoft consulting firm founded in 1997 as Enterprise Project Consulting and renamed EPC Group in 2005, giving it 29 years of enterprise Microsoft consulting experience. The firm held Microsoft Gold Partner status from 2003 to 2022, making it the oldest Microsoft Gold Partner in North America, and is currently a Microsoft Solutions Partner with six designations: Data & AI, Modern Work, Infrastructure, Security, Digital & App Innovation, and Business Applications.

EPC Group is headquartered at 4900 Woodway Drive, Suite 830, Houston, TX 77056. Public clients include NASA, the FBI, the Federal Reserve, the Pentagon, United Airlines, PepsiCo, Nike, and Northrop Grumman. The firm's track record includes 6,500+ SharePoint implementations, 1,500+ Power BI deployments, 500+ Microsoft Fabric implementations, 70+ Fortune 500 organizations served, 11,000+ enterprise engagements, and 200+ Microsoft Power BI and Microsoft 365 consultants on staff.

About Errin O'Connor

Errin O'Connor is the Founder, CEO, and Chief AI Architect of EPC Group, and a multi-year Microsoft MVP first awarded in 2002–2003. He is a four-time bestselling author: Windows SharePoint Services 3.0 Inside Out (Microsoft Press, 2007), Microsoft SharePoint Foundation 2010 Inside Out (Microsoft Press, 2011), SharePoint 2013 Field Guide (Sams/Pearson, 2014), and Microsoft Power BI Dashboards Step by Step (Microsoft Press, 2018).

He was an original member of the SharePoint beta team (Project Tahoe) and the Power BI beta team (Project Crescent), and a contributor to the FedRAMP framework. He worked with U.S. CIO Vivek Kundra on the Obama administration's 25-Point Plan to reform federal IT, and with NASA CIO Chris Kemp as Lead Architect on the NASA Nebula cloud project. He has spoken at Microsoft Ignite, the SharePoint Conference, KMWorld, and DATAVERSITY.

© 2026 EPC Group. All rights reserved. Microsoft, SharePoint, Power BI, Azure, Microsoft 365, Microsoft Copilot, Microsoft Fabric, and Microsoft Dynamics 365 are trademarks of the Microsoft group of companies.


Dataflow Gen1 to Microsoft Fabric Migration: The Complete Enterprise Playbook

By Errin O'Connor | Published April 15, 2026 | 12 min read

Microsoft has announced the deprecation timeline for Dataflow Gen1. If your enterprise relies on dozens — or hundreds — of Gen1 dataflows for ETL, this playbook gives you the step-by-step framework EPC Group uses to migrate clients to Dataflow Gen2 in Microsoft Fabric without disrupting production pipelines.

Why Dataflow Gen1 Migration Cannot Wait

Dataflow Gen1 was introduced as Power BI's self-service ETL tool — a way for business analysts to prepare data without writing code. It worked well for departmental use cases, but enterprises quickly outgrew its limitations: no staging layer, limited orchestration, dataset-only output, and gateway-bound connectivity.

Microsoft Fabric changes the game. Dataflow Gen2 is not just an upgrade — it is a re-architecture. Gen2 dataflows run on Fabric compute, output to any Fabric destination (Lakehouse, Warehouse, KQL Database, Azure SQL), support staging lakehouses for intermediate transformations, and integrate natively with Fabric pipelines for enterprise orchestration.

The business case is straightforward: Gen1 dataflows will eventually lose support, Gen2 offers 2–5x better performance through query folding and fast copy, and Fabric's unified capacity model eliminates the separate Power BI Premium Per User or Per Capacity licensing overhead. Organizations that migrate early gain access to Copilot in Data Factory, enhanced monitoring, and the full Fabric governance stack through Microsoft Purview integration.

Phase 1: Migration Assessment and Inventory

Every successful migration starts with a complete inventory. EPC Group's assessment framework catalogs every Gen1 dataflow across your tenant, classifies migration complexity, and builds the dependency map that drives sequencing.

Dataflow Inventory Checklist

  • Total dataflow count — Use the Power BI REST API (Groups/Dataflows endpoint) to enumerate all dataflows across all workspaces. Do not rely on manual workspace audits.
  • Data source inventory — Catalog every data source connection: SQL Server, Oracle, REST APIs, SharePoint lists, Excel files, OData feeds. Flag any sources using on-premises data gateways.
  • M query complexity scoring — Parse each dataflow's M queries for custom functions, nested let expressions, error handling patterns, and Value.NativeQuery calls. Score each as Low (direct migration), Medium (minor refactoring), or High (rewrite required).
  • Downstream dependency mapping — Identify every Power BI dataset, report, and dashboard that consumes each dataflow. This determines migration sequencing — you cannot migrate a dataflow until its consumers are ready.
  • Refresh schedule and SLA documentation — Record current refresh frequencies, windows, and business SLAs. Gen2 refresh times may differ, so baseline metrics are essential for post-migration validation.
  • Incremental refresh configurations — Document every dataflow using RangeStart/RangeEnd parameters. These require specific migration handling in Gen2.
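The API-driven inventory and complexity scoring above can be sketched as follows. This is an illustrative Python sketch, not EPC Group's actual tooling: it assumes you already hold a valid Azure AD bearer token with Power BI tenant read scope, and the Low/Medium/High heuristics are deliberately crude stand-ins for a real M parser.

```python
import json
import re
import urllib.request

API = "https://api.powerbi.com/v1.0/myorg"

def list_dataflows(token: str) -> list:
    """Enumerate every Gen1 dataflow across all workspaces via the
    Power BI REST API (Groups and Dataflows endpoints)."""
    def get(path):
        req = urllib.request.Request(
            f"{API}/{path}", headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["value"]

    inventory = []
    for ws in get("groups"):
        for df in get(f"groups/{ws['id']}/dataflows"):
            inventory.append({"workspace": ws["name"],
                              "dataflow": df["name"],
                              "objectId": df["objectId"]})
    return inventory

def score_m_complexity(m_source: str) -> str:
    """Rough migration-complexity score for one M query, using the
    signals from the checklist above (heuristics are illustrative)."""
    # Native SQL and nested let expressions usually need a rewrite.
    if re.search(r"Value\.NativeQuery", m_source) or \
            len(re.findall(r"\blet\b", m_source)) > 1:
        return "High"
    # Error handling and custom functions usually need minor refactoring.
    if re.search(r"\btry\b[\s\S]*\botherwise\b", m_source) or \
            re.search(r"\(.*\)\s*=>", m_source):
        return "Medium"
    return "Low"
```

Run the enumeration once per tenant, persist the results, and score each dataflow's M source as it is exported; the two functions are independent so the scoring can also run offline against saved `model.json` exports.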

EPC Group delivers this assessment as a 2-week engagement with a detailed migration plan, effort estimate, and risk register. For organizations with complex Power BI environments, the assessment alone prevents costly surprises during execution.

Phase 2: Dataflow Gen2 Architecture Design

Gen2 is not a 1:1 replacement for Gen1. The architecture is fundamentally different, and a successful migration requires deliberate design decisions before writing a single M query.

Key Architecture Decisions

Output Destination: Lakehouse vs. Warehouse vs. Semantic Model

Gen1 dataflows output only to Power BI datasets. Gen2 gives you choices. For raw/bronze data, output to a Fabric Lakehouse (Delta tables). For curated/gold data consumed by SQL analysts, output to a Fabric Warehouse. For direct BI consumption, output to a Power BI semantic model. Most enterprises use a tiered approach: dataflow outputs to Lakehouse, a downstream pipeline transforms into Warehouse, and Fabric's Direct Lake mode connects Power BI to the gold layer.

Staging Lakehouse Strategy

Gen2 introduces staging lakehouses — intermediate storage for dataflow processing. Enable staging for high-volume dataflows (over 1 million rows) to leverage fast copy and improve query folding. Disable staging for small-volume, low-latency dataflows where the overhead is not justified. Design a naming convention: we recommend stg_df_[domain]_[source] for staging lakehouses and lh_[domain]_[layer] for output lakehouses.

Workspace and Capacity Planning

Organize Gen2 dataflows into Fabric workspaces by domain (Sales, Finance, Operations) rather than by source system. Each workspace maps to a Fabric capacity for cost allocation and performance isolation. Start with a shared development capacity (F64) and isolated production capacities per domain. Monitor CU consumption during the pilot phase using the Fabric Capacity Metrics app before committing to production SKUs.

Phase 3: M Query Conversion and Testing

The core migration work is converting M queries from Gen1 to Gen2. While most M code is compatible, several patterns require attention.

Common M Query Migration Patterns

  • Linked entities — Gen1 supports linked entities (referencing another dataflow's output). Gen2 replaces this with staging lakehouses. Convert linked entity references to Lakehouse.Table() calls pointing to the staging lakehouse.
  • Computed entities — Similar to linked entities, computed entities in Gen1 are replaced by reading from the staging lakehouse in Gen2. Refactor computed entity queries to read from Delta tables.
  • Custom connectors — Gen2 supports a growing but not yet complete set of connectors. Audit your custom connectors against the Gen2 compatibility list. For unsupported connectors, use a Fabric pipeline with a Web activity or Azure Function as a bridge.
  • Gateway-dependent sources — Gen2 supports on-premises data gateways, but the configuration differs. Re-bind gateway connections in the Gen2 dataflow settings. Test connectivity before migrating M queries.
  • Error handling patterns — M queries using try/otherwise patterns and custom error tables migrate directly. However, Gen2's error handling in the output configuration (row-level error capture) provides a better alternative. Consider refactoring error handling to use native Gen2 capabilities.
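A pattern scan over exported M source catches most of the issues above before migration starts. The sketch below is illustrative: the blocker categories mirror the list above, but the exact connector and function names to flag should be verified against your own tenant and the current Gen2 compatibility list.

```python
import re

# Patterns that typically need remediation when moving M from Gen1 to Gen2.
# Function names here are illustrative; confirm against your environment.
GEN2_BLOCKERS = {
    "linked_entity":  r"PowerPlatform\.Dataflows|PowerBI\.Dataflows",
    "buffer_call":    r"Table\.Buffer|List\.Buffer",      # prevents query folding
    "native_query":   r"Value\.NativeQuery",              # re-test folding in Gen2
    "gateway_source": r"Sql\.Database|Odbc\.DataSource",  # re-bind gateway in Gen2
}

def scan_m_query(name: str, m_source: str) -> dict:
    """Return the blocker categories found in one M query."""
    hits = [cat for cat, pat in GEN2_BLOCKERS.items()
            if re.search(pat, m_source)]
    return {"query": name, "blockers": hits, "clean": not hits}
```

Feeding every dataflow's queries through this scan produces the remediation backlog that drives the Phase 3 work plan.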

Testing Methodology

EPC Group uses a three-stage testing protocol for every migrated dataflow: (1) row count validation — Gen2 output row count must match Gen1 within 0.01%, (2) hash-based data comparison — generate SHA-256 hashes of key columns in both Gen1 and Gen2 output and compare, (3) refresh performance benchmarking — Gen2 refresh time must be within 120% of Gen1 baseline (most achieve 50–80% of Gen1 time). Document results in a migration validation matrix and obtain business owner sign-off before cutover.
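The first two validation stages can be sketched in a few lines. This is a minimal illustration of the approach, assuming both outputs have been exported to lists of records; production validation would stream and chunk rather than hold everything in memory.

```python
import hashlib

def row_hash(row: dict, key_columns: list) -> str:
    """SHA-256 over the key columns of one row, joined with a stable delimiter."""
    payload = "|".join(str(row[c]) for c in key_columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def validate_migration(gen1_rows, gen2_rows, key_columns, tolerance=0.0001):
    """Stage 1: Gen2 row count within tolerance (0.01%) of Gen1.
    Stage 2: set comparison of per-row hashes of the key columns."""
    c1, c2 = len(gen1_rows), len(gen2_rows)
    count_ok = c1 > 0 and abs(c1 - c2) / c1 <= tolerance
    h1 = {row_hash(r, key_columns) for r in gen1_rows}
    h2 = {row_hash(r, key_columns) for r in gen2_rows}
    return {"count_ok": count_ok,
            "missing_in_gen2": len(h1 - h2),
            "extra_in_gen2": len(h2 - h1),
            "passed": count_ok and h1 == h2}
```

The `missing_in_gen2` / `extra_in_gen2` counts go straight into the migration validation matrix that the business owner signs off on.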

Phase 4: Incremental Refresh Migration

Incremental refresh is the most complex migration topic. In Gen1, you configure RangeStart and RangeEnd parameters, and Power BI manages partition creation automatically. In Gen2, the approach depends on your output destination.

Key difference: If your Gen2 dataflow outputs to a Lakehouse, incremental refresh is managed through Fabric pipeline scheduling with date-parameterized M queries. If outputting to a semantic model, you can use the built-in incremental refresh policy similar to Gen1 — but configured at the semantic model level, not the dataflow level.

For Lakehouse destinations, EPC Group implements a sliding-window pattern: the M query accepts StartDate and EndDate parameters, the Fabric pipeline passes dynamic date values on each scheduled run, and the Lakehouse table is configured for merge (upsert) rather than overwrite. This gives you true incremental loading with full control over the refresh window.
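The window computation the pipeline performs on each run can be sketched as below. The `StartDate`/`EndDate` parameter names match the pattern described above; the seven-day default lookback is an illustrative assumption, not a recommendation.

```python
from datetime import date, timedelta

def sliding_window(run_date: date, lookback_days: int = 7) -> dict:
    """Compute the StartDate/EndDate values a Fabric pipeline would pass
    to the Gen2 dataflow on each scheduled run. The dataflow's M query
    filters its source on [StartDate, EndDate), and the Lakehouse table
    is written with a merge (upsert) so re-processed days overwrite
    cleanly instead of duplicating."""
    start = run_date - timedelta(days=lookback_days)
    return {"StartDate": start.isoformat(), "EndDate": run_date.isoformat()}
```

Because the window deliberately overlaps previous runs, late-arriving rows inside the lookback period are picked up automatically; the merge keys on the Lakehouse table prevent duplicates.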

Phase 5: Capacity Planning and Performance Optimization

Fabric capacity planning for Dataflow Gen2 workloads requires understanding CU consumption patterns. Unlike Gen1, where refresh capacity was part of your Power BI Premium SKU, Gen2 dataflows compete for CUs with every other Fabric workload in the same capacity.

Capacity Sizing Guidelines

Workload Size | Dataflow Count | Recommended SKU | Estimated Monthly Cost
--------------|----------------|-----------------|-----------------------
Small         | 10–30          | F32             | $4,200
Medium        | 30–100         | F64             | $8,400
Large         | 100–300        | F128            | $16,800
Enterprise    | 300+           | F256+           | $33,600+

These are starting estimates. Actual CU consumption depends on data volume, transformation complexity, connector type (fast copy vs. standard), and concurrency. EPC Group's capacity planning engagement includes a 2-week instrumented pilot with detailed CU consumption analysis and right-sizing recommendations.

Performance Optimization Checklist

  • Enable staging for dataflows processing over 1 million rows to leverage fast copy
  • Maximize query folding — push filters, column selections, and joins to the source system
  • Avoid Table.Buffer() and List.Buffer() in M queries — these prevent query folding and consume memory
  • Use native query (Value.NativeQuery) for SQL sources to ensure server-side execution
  • Schedule high-volume dataflows during off-peak hours to avoid CU contention
  • Monitor using the Fabric Capacity Metrics app — set alerts for CU utilization above 80%
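The 80% alert threshold in the last item reduces to a simple check over exported metrics. This is an illustrative sketch; the `(window_label, cus_consumed)` sample shape is a hypothetical export format, not the Fabric Capacity Metrics app's actual schema.

```python
def cu_alert(samples, capacity_cus, threshold=0.80):
    """Flag time windows where CU consumption exceeds the alert threshold.
    `samples` is a list of (window_label, cus_consumed) pairs, e.g. from
    a capacity metrics export; `capacity_cus` is the SKU size (F64 -> 64)."""
    return [label for label, used in samples
            if used / capacity_cus > threshold]
```

Wiring this into a scheduled job gives early warning of CU contention before refreshes start queuing.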

Cutover Strategy: Parallel Run and Decommission

EPC Group never recommends a big-bang cutover. Instead, we use a parallel-run approach: Gen1 and Gen2 dataflows run simultaneously for 2–4 weeks, outputs are compared daily using automated validation scripts, and downstream consumers are switched in waves. Only after all consumers are migrated and validation passes for 5 consecutive business days do we decommission the Gen1 dataflow.
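The five-consecutive-day decommission gate can be sketched as a small tracker over the daily validation results. This is an illustration of the rule as stated, not EPC Group's actual tooling.

```python
def ready_to_decommission(daily_results, required_streak=5):
    """daily_results: ordered booleans, one per business day, True when
    that day's automated Gen1-vs-Gen2 comparison passed. The Gen1
    dataflow is retired only after `required_streak` consecutive passes;
    any failed day resets the streak."""
    streak = 0
    for passed in daily_results:
        streak = streak + 1 if passed else 0
    return streak >= required_streak
```

A single failed comparison anywhere in the window restarts the clock, which is exactly the conservatism the parallel-run approach is buying.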

This approach adds capacity cost during the parallel period, but it eliminates the risk of data disruption. For regulated industries — healthcare, financial services, government — where data accuracy SLAs are contractual, parallel-run is not optional. It is a requirement.

Frequently Asked Questions

What is the difference between Dataflow Gen1 and Dataflow Gen2 in Microsoft Fabric?

Dataflow Gen1 runs inside Power BI Service and outputs to Power BI datasets. Dataflow Gen2 runs inside Microsoft Fabric and can output to Lakehouses, Warehouses, KQL databases, or Azure SQL — in addition to Power BI semantic models. Gen2 supports staging lakehouses for intermediate data, uses Fabric compute (CU-based), and integrates with Fabric pipelines for orchestration. Gen2 also adds fast copy for high-volume connectors and improved M query performance through query folding enhancements.

How long does a typical Dataflow Gen1 to Fabric migration take?

For an enterprise with 50–200 dataflows, EPC Group typically completes migration in 6–12 weeks. The timeline depends on M query complexity, custom connector usage, incremental refresh configurations, and downstream dependency mapping. A simple lift-and-shift of compatible dataflows can happen in 2–4 weeks, but enterprises with complex transformations, gateway dependencies, or regulatory requirements should plan for the full 12-week cycle including UAT and parallel-run validation.

Will my existing M queries work in Dataflow Gen2 without changes?

Most standard M queries migrate without modification. However, certain patterns require updates: custom connectors not yet supported in Gen2, queries using Power BI-specific functions (like Value.NativeQuery with gateway-specific syntax), and dataflows that rely on linked or computed entities — which are replaced by staging lakehouses in Gen2. EPC Group runs an automated compatibility scan on every M query before migration to identify breaking changes and estimate remediation effort.

How do I migrate incremental refresh from Gen1 to Gen2?

Incremental refresh in Gen2 works differently. In Gen1, incremental refresh is configured at the dataflow level with RangeStart/RangeEnd parameters. In Gen2, incremental refresh is handled through Fabric pipeline scheduling with watermark columns or through the built-in incremental refresh settings in the dataflow output configuration. You need to re-implement your refresh logic using Gen2's approach — typically by adding date-based filters in the M query and configuring the pipeline schedule to pass dynamic date parameters.

What capacity (SKU) do I need for Dataflow Gen2 in Fabric?

Dataflow Gen2 workloads consume Fabric Capacity Units (CUs). For enterprises running 50–100 dataflows with moderate complexity, an F64 capacity (equivalent to P1) is the starting recommendation — providing 64 CUs for shared use across Data Factory, warehousing, and BI workloads. EPC Group recommends running a 2-week capacity pilot: migrate your top 10 highest-volume dataflows, monitor CU consumption via the Fabric Capacity Metrics app, and extrapolate to size the production SKU. Over-provisioning is safer than under-provisioning during migration.

Ready to Migrate Your Dataflows to Microsoft Fabric?

EPC Group has migrated hundreds of enterprise Dataflow Gen1 environments to Fabric. Our 2-week assessment gives you a complete migration plan, effort estimate, risk register, and capacity sizing — so you can move with confidence. Call us at (888) 381-9725 or request an assessment below.

Request a Dataflow Migration Assessment

Microsoft Fabric Architecture: 2026 Considerations

OneLake, Microsoft Fabric's unified data lake, uses a shortcut model that lets a single physical Parquet dataset serve both Fabric Lakehouse queries (Spark) and Fabric Warehouse queries (T-SQL) without copying data. This eliminates the historical lakehouse-versus-warehouse pick-one decision and reduces the typical enterprise data-platform footprint by 30–50% versus comparable dual-vendor Snowflake-plus-Databricks deployments.

Fabric vs. Snowflake in 2026 is not a feature war; it is a stack-consolidation play. Enterprises already on Microsoft 365 and Power BI typically see 30–50% lower TCO by consolidating onto Fabric (a single licensing relationship, OneLake-native semantic models, and native Power BI Direct Lake integration) versus maintaining Snowflake as a separate analytics warehouse. The migration runbook is a 12–26 week project, depending on workload count and downstream consumer migration complexity.

Decision factors EPC Group evaluates

  • OneLake shortcut strategy for cross-workload data sharing
  • Real-Time Intelligence vs Power BI streaming deployment patterns
  • Fabric vs Snowflake/Databricks consolidation TCO analysis
  • F-SKU capacity sizing (F2 to F2048) with Direct Lake compatibility
  • Microsoft Purview lineage tracking across Fabric workloads

EPC Group covers these architecture decisions across its engagement portfolio. Reach the firm at contact@epcgroup.net for a 30-minute architect conversation.