
Dataflow Gen1 to Microsoft Fabric Migration: The Complete Enterprise Playbook

By Errin O'Connor | Published April 15, 2026 | 12 min read

Microsoft has announced the deprecation timeline for Dataflow Gen1. If your enterprise relies on dozens — or hundreds — of Gen1 dataflows for ETL, this playbook gives you the step-by-step framework EPC Group uses to migrate clients to Dataflow Gen2 in Microsoft Fabric without disrupting production pipelines.

Why Dataflow Gen1 Migration Cannot Wait

Dataflow Gen1 was introduced as Power BI's self-service ETL tool — a way for business analysts to prepare data without writing code. It worked well for departmental use cases, but enterprises quickly outgrew its limitations: no staging layer, limited orchestration, dataset-only output, and gateway-bound connectivity.

Microsoft Fabric changes the game. Dataflow Gen2 is not just an upgrade — it is a re-architecture. Gen2 dataflows run on Fabric compute, output to any Fabric destination (Lakehouse, Warehouse, KQL Database, Azure SQL), support staging lakehouses for intermediate transformations, and integrate natively with Fabric pipelines for enterprise orchestration.

The business case is straightforward: Gen1 dataflows will eventually lose support, Gen2 offers 2–5x better performance through query folding and fast copy, and Fabric's unified capacity model eliminates the separate Power BI Premium Per User or Per Capacity licensing overhead. Organizations that migrate early gain access to Copilot in Data Factory, enhanced monitoring, and the full Fabric governance stack through Microsoft Purview integration.

Phase 1: Migration Assessment and Inventory

Every successful migration starts with a complete inventory. EPC Group's assessment framework catalogs every Gen1 dataflow across your tenant, classifies migration complexity, and builds the dependency map that drives sequencing.

Dataflow Inventory Checklist

  • Total dataflow count — Use the Power BI REST API (Groups/Dataflows endpoint) to enumerate all dataflows across all workspaces. Do not rely on manual workspace audits.
  • Data source inventory — Catalog every data source connection: SQL Server, Oracle, REST APIs, SharePoint lists, Excel files, OData feeds. Flag any sources using on-premises data gateways.
  • M query complexity scoring — Parse each dataflow's M queries for custom functions, nested let expressions, error handling patterns, and Value.NativeQuery calls. Score each as Low (direct migration), Medium (minor refactoring), or High (rewrite required).
  • Downstream dependency mapping — Identify every Power BI dataset, report, and dashboard that consumes each dataflow. This determines migration sequencing — you cannot migrate a dataflow until its consumers are ready.
  • Refresh schedule and SLA documentation — Record current refresh frequencies, windows, and business SLAs. Gen2 refresh times may differ, so baseline metrics are essential for post-migration validation.
  • Incremental refresh configurations — Document every dataflow using RangeStart/RangeEnd parameters. These require specific migration handling in Gen2.

EPC Group delivers this assessment as a 2-week engagement with a detailed migration plan, effort estimate, and risk register. For organizations with complex Power BI environments, the assessment alone prevents costly surprises during execution.

Phase 2: Dataflow Gen2 Architecture Design

Gen2 is not a 1:1 replacement for Gen1. The architecture is fundamentally different, and a successful migration requires deliberate design decisions before writing a single M query.

Key Architecture Decisions

Output Destination: Lakehouse vs. Warehouse vs. Semantic Model

Gen1 dataflows output only to Power BI datasets. Gen2 gives you choices. For raw/bronze data, output to a Fabric Lakehouse (Delta tables). For curated/gold data consumed by SQL analysts, output to a Fabric Warehouse. For direct BI consumption, output to a Power BI semantic model. Most enterprises use a tiered approach: dataflow outputs to Lakehouse, a downstream pipeline transforms into Warehouse, and Fabric's Direct Lake mode connects Power BI to the gold layer.

Staging Lakehouse Strategy

Gen2 introduces staging lakehouses — intermediate storage for dataflow processing. Enable staging for high-volume dataflows (over 1 million rows) to leverage fast copy and improve query folding. Disable staging for small-volume, low-latency dataflows where the overhead is not justified. Design a naming convention: we recommend stg_df_[domain]_[source] for staging lakehouses and lh_[domain]_[layer] for output lakehouses.

Workspace and Capacity Planning

Organize Gen2 dataflows into Fabric workspaces by domain (Sales, Finance, Operations) rather than by source system. Each workspace maps to a Fabric capacity for cost allocation and performance isolation. Start with a shared development capacity (F64) and isolated production capacities per domain. Monitor CU consumption during the pilot phase using the Fabric Capacity Metrics app before committing to production SKUs.

Phase 3: M Query Conversion and Testing

The core migration work is converting M queries from Gen1 to Gen2. While most M code is compatible, several patterns require attention.

Common M Query Migration Patterns

  • Linked entities — Gen1 supports linked entities (referencing another dataflow's output). Gen2 replaces this with staging lakehouses. Convert linked entity references to reads from the staging lakehouse's Delta tables (via the Lakehouse connector in M).
  • Computed entities — Similar to linked entities, computed entities in Gen1 are replaced by reading from the staging lakehouse in Gen2. Refactor computed entity queries to read from Delta tables.
  • Custom connectors — Gen2 supports a growing but not yet complete set of connectors. Audit your custom connectors against the Gen2 compatibility list. For unsupported connectors, use a Fabric pipeline with a Web activity or Azure Function as a bridge.
  • Gateway-dependent sources — Gen2 supports on-premises data gateways, but the configuration differs. Re-bind gateway connections in the Gen2 dataflow settings. Test connectivity before migrating M queries.
  • Error handling patterns — M queries using try/otherwise patterns and custom error tables migrate directly. However, Gen2's error handling in the output configuration (row-level error capture) provides a better alternative. Consider refactoring error handling to use native Gen2 capabilities.

Testing Methodology

EPC Group uses a three-stage testing protocol for every migrated dataflow: (1) row count validation — Gen2 output row count must match Gen1 within 0.01%, (2) hash-based data comparison — generate SHA-256 hashes of key columns in both Gen1 and Gen2 output and compare, (3) refresh performance benchmarking — Gen2 refresh time must be within 120% of Gen1 baseline (most achieve 50–80% of Gen1 time). Document results in a migration validation matrix and obtain business owner sign-off before cutover.

Phase 4: Incremental Refresh Migration

Incremental refresh is the most complex migration topic. In Gen1, you configure RangeStart and RangeEnd parameters, and Power BI manages partition creation automatically. In Gen2, the approach depends on your output destination.

Key difference: If your Gen2 dataflow outputs to a Lakehouse, incremental refresh is managed through Fabric pipeline scheduling with date-parameterized M queries. If outputting to a semantic model, you can use the built-in incremental refresh policy similar to Gen1 — but configured at the semantic model level, not the dataflow level.

For Lakehouse destinations, EPC Group implements a sliding-window pattern: the M query accepts StartDate and EndDate parameters, the Fabric pipeline passes dynamic date values on each scheduled run, and the Lakehouse table is configured for merge (upsert) rather than overwrite. This gives you true incremental loading with full control over the refresh window.

Phase 5: Capacity Planning and Performance Optimization

Fabric capacity planning for Dataflow Gen2 workloads requires understanding CU consumption patterns. Unlike Gen1, where refresh capacity was part of your Power BI Premium SKU, Gen2 dataflows compete for CUs with every other Fabric workload in the same capacity.

Capacity Sizing Guidelines

Workload Size | Dataflow Count | Recommended SKU | Estimated Monthly Cost
Small         | 10–30          | F32             | $4,200
Medium        | 30–100         | F64             | $8,400
Large         | 100–300        | F128            | $16,800
Enterprise    | 300+           | F256+           | $33,600+

These are starting estimates. Actual CU consumption depends on data volume, transformation complexity, connector type (fast copy vs. standard), and concurrency. EPC Group's capacity planning engagement includes a 2-week instrumented pilot with detailed CU consumption analysis and right-sizing recommendations.

Performance Optimization Checklist

  • Enable staging for dataflows processing over 1 million rows to leverage fast copy
  • Maximize query folding — push filters, column selections, and joins to the source system
  • Avoid Table.Buffer() and List.Buffer() in M queries — these prevent query folding and consume memory
  • Use native query (Value.NativeQuery) for SQL sources to ensure server-side execution
  • Schedule high-volume dataflows during off-peak hours to avoid CU contention
  • Monitor using the Fabric Capacity Metrics app — set alerts for CU utilization above 80%

Cutover Strategy: Parallel Run and Decommission

EPC Group never recommends a big-bang cutover. Instead, we use a parallel-run approach: Gen1 and Gen2 dataflows run simultaneously for 2–4 weeks, outputs are compared daily using automated validation scripts, and downstream consumers are switched in waves. Only after all consumers are migrated and validation passes for 5 consecutive business days do we decommission the Gen1 dataflow.

This approach adds capacity cost during the parallel period, but it eliminates the risk of data disruption. For regulated industries — healthcare, financial services, government — where data accuracy SLAs are contractual, parallel-run is not optional. It is a requirement.

Frequently Asked Questions

What is the difference between Dataflow Gen1 and Dataflow Gen2 in Microsoft Fabric?

Dataflow Gen1 runs inside Power BI Service and outputs to Power BI datasets. Dataflow Gen2 runs inside Microsoft Fabric and can output to Lakehouses, Warehouses, KQL databases, or Azure SQL — in addition to Power BI semantic models. Gen2 supports staging lakehouses for intermediate data, uses Fabric compute (CU-based), and integrates with Fabric pipelines for orchestration. Gen2 also adds fast copy for high-volume connectors and improved M query performance through query folding enhancements.

How long does a typical Dataflow Gen1 to Fabric migration take?

For an enterprise with 50–200 dataflows, EPC Group typically completes migration in 6–12 weeks. The timeline depends on M query complexity, custom connector usage, incremental refresh configurations, and downstream dependency mapping. A simple lift-and-shift of compatible dataflows can happen in 2–4 weeks, but enterprises with complex transformations, gateway dependencies, or regulatory requirements should plan for the full 12-week cycle including UAT and parallel-run validation.

Will my existing M queries work in Dataflow Gen2 without changes?

Most standard M queries migrate without modification. However, certain patterns require updates: custom connectors not yet supported in Gen2, queries using Power BI-specific functions (like Value.NativeQuery with gateway-specific syntax), and dataflows that rely on linked or computed entities — which are replaced by staging lakehouses in Gen2. EPC Group runs an automated compatibility scan on every M query before migration to identify breaking changes and estimate remediation effort.

How do I migrate incremental refresh from Gen1 to Gen2?

Incremental refresh in Gen2 works differently. In Gen1, incremental refresh is configured at the dataflow level with RangeStart/RangeEnd parameters. In Gen2, incremental refresh is handled through Fabric pipeline scheduling with watermark columns or through the built-in incremental refresh settings in the dataflow output configuration. You need to re-implement your refresh logic using Gen2's approach — typically by adding date-based filters in the M query and configuring the pipeline schedule to pass dynamic date parameters.

What capacity (SKU) do I need for Dataflow Gen2 in Fabric?

Dataflow Gen2 workloads consume Fabric Capacity Units (CUs). For enterprises running 50–100 dataflows with moderate complexity, an F64 capacity (equivalent to P1) is the starting recommendation — providing 64 CUs for shared use across Data Factory, warehousing, and BI workloads. EPC Group recommends running a 2-week capacity pilot: migrate your top 10 highest-volume dataflows, monitor CU consumption via the Fabric Capacity Metrics app, and extrapolate to size the production SKU. Over-provisioning is safer than under-provisioning during migration.

Ready to Migrate Your Dataflows to Microsoft Fabric?

EPC Group has migrated hundreds of enterprise Dataflow Gen1 environments to Fabric. Our 2-week assessment gives you a complete migration plan, effort estimate, risk register, and capacity sizing — so you can move with confidence. Call us at (888) 381-9725 or request an assessment below.

Request a Dataflow Migration Assessment