EPC Group - Enterprise Microsoft AI, SharePoint, Power BI, and Azure Consulting
G2 High Performer Summer 2025, Momentum Leader Spring 2025, Leader Winter 2025, Leader Spring 2026
About EPC Group

EPC Group is a Microsoft consulting firm founded in 1997 (originally Enterprise Project Consulting, renamed EPC Group in 2005). 29 years of enterprise Microsoft consulting experience. EPC Group historically held the distinction of being the oldest continuous Microsoft Gold Partner in North America from 2016 until the program's retirement. Because Microsoft officially deprecated the Gold/Silver tiering framework, EPC Group transitioned to the modern Microsoft Solutions Partner ecosystem and currently holds the core Microsoft Solutions Partner designations.

Headquartered at 4900 Woodway Drive, Suite 830, Houston, TX 77056. Public clients include NASA, FBI, Federal Reserve, Pentagon, United Airlines, PepsiCo, Nike, and Northrop Grumman. 6,500+ SharePoint implementations, 1,500+ Power BI deployments, 500+ Microsoft Fabric implementations, 70+ Fortune 500 organizations served, 11,000+ enterprise engagements, 200+ Microsoft Power BI and Microsoft 365 consultants on staff.

About Errin O'Connor

Errin O'Connor is the Founder, CEO, and Chief AI Architect of EPC Group. Microsoft MVP multiple years, first awarded 2003. 4× Microsoft Press bestselling author of Windows SharePoint Services 3.0 Inside Out (MS Press 2007), Microsoft SharePoint Foundation 2010 Inside Out (MS Press 2011), SharePoint 2013 Field Guide (Sams/Pearson 2014), and Microsoft Power BI Dashboards Step by Step (MS Press 2018).

Original SharePoint Beta Team member (Project Tahoe). Original Power BI Beta Team member (Project Crescent). FedRAMP framework contributor. Worked with U.S. CIO Vivek Kundra on the Obama administration's 25-Point Plan to reform federal IT, and with NASA CIO Chris Kemp as Lead Architect on the NASA Nebula Cloud project. Speaker at Microsoft Ignite, SharePoint Conference, KMWorld, and DATAVERSITY.

© 2026 EPC Group. All rights reserved. Microsoft, SharePoint, Power BI, Azure, Microsoft 365, Microsoft Copilot, Microsoft Fabric, and Microsoft Dynamics 365 are trademarks of the Microsoft group of companies.

Azure Data Factory Enterprise Guide | EPC Group - EPC Group enterprise consulting

Enterprise Microsoft consulting insights from EPC Group — 29 years serving Fortune 500.

Azure Data Factory (ADF) is Microsoft's fully managed, serverless data integration service. It orchestrates data movement and transformation across 100+ built-in connectors. EPC Group has implemented ADF for 150+ enterprise organizations — from mid-market companies integrating a dozen sources to Fortune 500 enterprises running thousands of daily pipelines across multiple Azure regions.

Key Facts

  • ADF supports 100+ built-in connectors: Azure SQL, Synapse, Blob Storage, Amazon S3, Snowflake, Salesforce, SAP, Oracle, and on-premises SQL Server.
  • ADF pricing — four components: pipeline orchestration at $1.00/1,000 activity runs; data movement at $0.25/DIU-hour (minimum 4 DIUs per copy activity); mapping data flows at $0.274/vCore-hour; SSIS Integration Runtime at $0.274/vCore-hour.
  • Typical enterprise spend: $800–$2,000/month for 500 pipelines with 5,000 daily activity runs and 100 GB daily data movement.
  • EPC Group is a Microsoft Solutions Partner with 150+ ADF implementations across healthcare, financial services, manufacturing, and government.
  • EPC Group recommendation: use ELT (not ETL) for most enterprise scenarios — load raw data into OneLake or ADLS Gen2, then transform using Synapse, Databricks, or Microsoft Fabric.
February 26, 2026 | 22 min read | Azure Cloud Services

Azure Data Factory: The Enterprise Guide to ETL/ELT Pipelines, Data Integration, and CI/CD

Azure Data Factory is the backbone of enterprise data integration on the Microsoft platform, orchestrating data movement and transformation across hundreds of source and destination systems. This guide covers enterprise ADF architecture, pipeline design patterns, mapping data flows, CI/CD implementation with Azure DevOps, monitoring and alerting, SSIS migration strategies, and cost optimization — based on 150+ ADF implementations delivered by EPC Group.

Table of Contents

  • Why Azure Data Factory for Enterprise Data Integration
  • Enterprise ADF Architecture Patterns
  • Pipeline Design Best Practices
  • Mapping Data Flows for Transformation
  • ETL vs. ELT: Choosing the Right Pattern
  • CI/CD for ADF Pipelines
  • Security and Compliance
  • Monitoring, Alerting, and Observability
  • Migrating from SSIS to ADF
  • Cost Optimization Strategies
  • Partner with EPC Group

Why Azure Data Factory for enterprise data integration

ADF is Microsoft's strategic replacement for on-premises SSIS. It is the recommended data integration platform for organizations running workloads on Azure.

Key enterprise advantages

  • Serverless and fully managed: no infrastructure to provision, patch, or scale. ADF allocates compute resources per pipeline run and charges only for consumption.
  • 100+ native connectors: Azure services, other cloud platforms (AWS, GCP), SaaS applications (Salesforce, Dynamics 365, SAP, ServiceNow), and on-premises databases via Self-Hosted Integration Runtime.
  • Visual pipeline designer: a drag-and-drop pipeline builder that accelerates development and makes pipelines accessible to analysts with minimal coding experience.
  • Enterprise reliability: built-in retry policies, fault tolerance, dependency management, and a 99.9% uptime SLA. Retries are only safe when pipelines are designed to be idempotent; ADF does not guarantee this on its own.
  • Native Azure integration: Azure Key Vault, Azure Monitor, Microsoft Entra ID, Azure DevOps, and Microsoft Purview (data lineage).

Enterprise ADF architecture patterns

A well-designed ADF architecture separates concerns into four layers: ingestion, transformation, orchestration, and monitoring. This approach lets each layer scale independently and gives clear ownership.

Integration Runtime configuration

  • Azure Integration Runtime: managed by Microsoft and auto-scaling. Used for cloud-to-cloud data movement and mapping data flows. Use a managed virtual network for private endpoint connectivity.
  • Self-Hosted Integration Runtime: installed on-premises or on an Azure VM to access data sources behind firewalls. Deploy at least two nodes for high availability. Use a dedicated service account with least-privilege access.
  • Azure-SSIS Integration Runtime: a managed SSIS environment in Azure for running existing SSIS packages unchanged. Use Standard_D8_v3 (8 vCPU, 32 GB RAM) as a baseline for most enterprise workloads.

ETL vs. ELT: choosing the right pattern

ETL (Extract-Transform-Load) transforms data inside ADF using mapping data flows before loading to the destination. Transformation runs on ADF-managed Spark clusters. Use this when you need to cleanse or enrich data before it reaches the data warehouse.

ELT (Extract-Load-Transform) uses ADF to extract and load raw data into the destination first, then transforms using the destination's compute engine. EPC Group recommends ELT for most enterprise scenarios. Load raw data into OneLake or ADLS Gen2, then transform using Synapse SQL pools, Databricks, or Microsoft Fabric.

Pipeline design best practices

Naming conventions

  • Pipelines: PL_Ingest_Salesforce_Accounts, PL_Transform_Silver_Customer
  • Datasets: DS_ADLS_Bronze_Salesforce_Accounts_Parquet, DS_AzSQL_Gold_DimCustomer
  • Linked Services: LS_AzSQL_DW_Production, LS_ADLS_DataLake, LS_KeyVault_Production
  • Triggers: TR_Schedule_Daily_0600UTC, TR_Event_BlobCreated_Raw
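
A convention like this is easier to keep when it is checked automatically, for example during a pull-request build. The following is a minimal sketch, assuming the prefixes shown above (PL, DS, LS, TR) and a hypothetical rule that every name has at least two underscore-separated segments after the prefix. The function names are illustrative and are not part of any ADF tooling.

```python
import re

# Prefixes taken from the example names in this guide.
PREFIX_BY_KIND = {
    "pipeline": "PL",
    "dataset": "DS",
    "linked_service": "LS",
    "trigger": "TR",
}


def check_name(kind: str, name: str) -> bool:
    """Return True if `name` starts with the prefix for `kind` and is followed
    by two or more underscore-separated alphanumeric segments (assumed rule)."""
    prefix = PREFIX_BY_KIND[kind]
    return re.fullmatch(rf"{prefix}(_[A-Za-z0-9]+){{2,}}", name) is not None


def find_violations(resources: dict[str, list[str]]) -> list[str]:
    """List every resource name that breaks the convention, as 'kind: name'."""
    return [
        f"{kind}: {name}"
        for kind, names in resources.items()
        for name in names
        if not check_name(kind, name)
    ]


if __name__ == "__main__":
    sample = {
        "pipeline": ["PL_Ingest_Salesforce_Accounts", "CopyAccounts"],
        "trigger": ["TR_Schedule_Daily_0600UTC"],
    }
    for violation in find_violations(sample):
        print("Naming violation:", violation)
```

In a real repository the names would come from the JSON files that ADF stores in Git, one per resource; how those files are discovered is left out of the sketch.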

Pipeline composition patterns

  • Master-child pattern — master orchestration pipeline calls child pipelines via Execute Pipeline activity. Child pipelines perform single responsibilities. This enables independent testing, reuse, and parallel execution.
  • Metadata-driven ingestion — a single parameterized pipeline reads metadata from a control table (source system, schema, table name, watermark column, destination path). A ForEach activity iterates over the metadata. This reduces hundreds of pipelines to one.
  • Incremental load with watermarks — for large tables, use high-watermark patterns to load only changed data. Store the last successful watermark in a control table. Update it only after successful load to maintain idempotency.

CI/CD for ADF pipelines

Production ADF environments must use CI/CD pipelines for all changes. Manual publishing in production bypasses code review, has no rollback capability, and creates inconsistencies between environments.

Recommended CI/CD workflow using Azure DevOps or GitHub Actions:

  1. Connect ADF to a Git repository. Each developer works in feature branches.
  2. Develop and test pipelines in the ADF Studio UI connected to a development ADF instance.
  3. Merge changes to the collaboration branch (main) via pull requests.
  4. When changes are published from the collaboration branch, ADF automatically generates ARM templates in the publish branch (adf_publish).
  5. An Azure DevOps or GitHub Actions pipeline deploys ARM/Bicep templates to staging and production with environment-specific parameters.

The golden rule: every value that differs between environments must be a parameter. Never hard-code environment-specific values in pipeline definitions.
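
One way to apply this rule is to keep a small table of per-environment values and generate an ARM parameter file for each target from it. The sketch below assumes hypothetical parameter names (factoryName, keyVaultBaseUrl) and made-up resource names. The parameters an actual factory exposes depend on its generated template, so treat these as placeholders.

```python
import json

ARM_PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2019-04-01/"
    "deploymentParameters.json#"
)

# Hypothetical environments and values, for illustration only.
ENVIRONMENTS = {
    "staging": {
        "factoryName": "adf-contoso-stg",
        "keyVaultBaseUrl": "https://kv-contoso-stg.vault.azure.net/",
    },
    "production": {
        "factoryName": "adf-contoso-prd",
        "keyVaultBaseUrl": "https://kv-contoso-prd.vault.azure.net/",
    },
}

# Every environment must supply all of these; nothing is hard-coded elsewhere.
REQUIRED_PARAMETERS = {"factoryName", "keyVaultBaseUrl"}


def build_parameter_file(environment: str) -> dict:
    """Return an ARM deployment parameter document for one environment."""
    values = ENVIRONMENTS[environment]
    missing = REQUIRED_PARAMETERS - values.keys()
    if missing:
        raise ValueError(f"{environment} is missing parameters: {sorted(missing)}")
    return {
        "$schema": ARM_PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": value} for name, value in sorted(values.items())},
    }


if __name__ == "__main__":
    print(json.dumps(build_parameter_file("staging"), indent=2))
```

The release pipeline would then pass the generated file to the template deployment step for the matching environment. Failing fast on a missing parameter keeps a half-configured environment from being deployed.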

Security and compliance

Network security

  • Enable Managed Virtual Network on Azure Integration Runtime — all data flows execute within a Microsoft-managed VNet with no public internet exposure.
  • Use private endpoints for all data stores (Azure SQL, ADLS Gen2, Key Vault, Synapse).
  • Deploy Self-Hosted Integration Runtime in a dedicated subnet with NSG rules restricting outbound to required ADF service endpoints only.

Secrets management

  • Store all connection strings, passwords, and API keys in Azure Key Vault — never in linked service definitions.
  • Use managed identity authentication for Azure-to-Azure connections (no credentials to manage).
  • Rotate credentials automatically using Key Vault rotation policies.

Data protection

  • Enable encryption at rest with customer-managed keys (CMK) for HIPAA and FedRAMP compliance.
  • Enable encryption in transit (TLS 1.2+) — ADF enforces this by default.
  • Use Microsoft Purview to scan ADF pipelines and provide end-to-end data lineage from source to consumption.

Monitoring and observability

The ADF Monitor hub shows pipeline runs, activity runs, trigger runs, and integration runtime status with up to 45 days of history. For enterprise monitoring, integrate ADF with Azure Monitor.

  • Tier 1 — ADF Monitor Hub: real-time visibility into pipeline runs. Use for ad-hoc troubleshooting and debugging.
  • Tier 2 — Azure Monitor: send diagnostic logs to Log Analytics for custom KQL queries, cross-pipeline analytics, and alerting (pipeline failures, duration anomalies, high DIU consumption, IR errors).
  • Tier 3 — Data quality monitoring: implement row count checks (source vs. destination), null value counts, referential integrity validations, and business rule assertions. Write validation results to a quality metrics table.
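
The Tier 3 checks can be expressed as small functions whose results are written to the quality metrics table. The following is a minimal sketch, assuming rows are already available as Python dictionaries. The CheckResult type and the function names are hypothetical, and a production version would more likely run the equivalent SQL against the source and destination.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    check: str
    passed: bool
    detail: str


def row_count_check(source_rows: int, destination_rows: int) -> CheckResult:
    """Pass when source and destination row counts match exactly."""
    return CheckResult(
        "row_count",
        source_rows == destination_rows,
        f"source={source_rows}, destination={destination_rows}",
    )


def null_count_check(rows: list[dict], column: str, max_nulls: int = 0) -> CheckResult:
    """Pass when the number of missing values in `column` is within the limit."""
    nulls = sum(1 for row in rows if row.get(column) is None)
    return CheckResult(f"nulls:{column}", nulls <= max_nulls, f"nulls={nulls}, allowed={max_nulls}")


def referential_integrity_check(rows: list[dict], foreign_key: str, parent_keys: set) -> CheckResult:
    """Pass when every foreign key value exists in the parent key set."""
    orphans = [row.get(foreign_key) for row in rows if row.get(foreign_key) not in parent_keys]
    return CheckResult(f"fk:{foreign_key}", not orphans, f"orphans={len(orphans)}")


if __name__ == "__main__":
    orders = [
        {"order_id": 1, "customer_id": 10},
        {"order_id": 2, "customer_id": None},
    ]
    results = [
        row_count_check(source_rows=2, destination_rows=len(orders)),
        null_count_check(orders, "customer_id"),
        referential_integrity_check(orders, "customer_id", parent_keys={10, 11}),
    ]
    for result in results:
        print(result)
```

Each result carries a name, a pass flag, and a detail string, which maps naturally onto one row of a quality metrics table per check per run.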

Migrating from SSIS to ADF

ADF is Microsoft's strategic replacement for SSIS. EPC Group has run SSIS-to-ADF migrations for 60+ enterprise clients. Two paths exist.

  • Path 1: Lift-and-shift with SSIS Integration Runtime — run existing SSIS packages unchanged on an ADF-managed SSIS runtime. Fastest migration path. Does not modernize the packages. Minimum IR cost ~$275/month. Timeline: 2–4 weeks for environment setup, 1–2 weeks for package deployment and testing.
  • Path 2: Re-architect to native ADF pipelines — redesign SSIS packages as ADF pipelines with mapping data flows. Full cloud-native benefits: serverless scaling, consumption pricing, built-in monitoring. Requires development effort. Timeline: 8–16 weeks depending on package count and complexity.

EPC Group recommendation: lift-and-shift critical packages first to get off on-premises infrastructure. Then re-architect high-value packages to native ADF pipelines based on ROI analysis.

Cost optimization

  • Right-size DIUs: ADF defaults to auto-DIU allocation (up to 256 DIUs per copy). For most enterprise scenarios, 16–32 DIUs provide optimal throughput without overspending.
  • Use TTL for data flow clusters: set a 10–15 minute TTL to keep Spark clusters warm during batch windows. This eliminates the 3–5 minute cold-start penalty on each run.
  • Batch windows: group pipeline executions into concentrated batch windows rather than spreading them through the day. This maximizes TTL effectiveness.
  • ELT over ETL: offload transformations to existing Synapse or Fabric compute rather than running ADF Spark clusters. When that compute is already provisioned and paid for, the incremental cost of the transformation is close to $0.
  • Incremental loads: watermark-based incremental loads typically move only 1–5% of the data a full load would, with a roughly proportional reduction in copy activity duration, DIU consumption, and data flow processing time.

Frequently asked questions

What is Azure Data Factory and what is it used for?

ADF is Microsoft's fully managed, serverless data integration service. It orchestrates and automates data movement and transformation across 100+ connectors — including Azure SQL, Synapse, Blob Storage, Amazon S3, Snowflake, Salesforce, SAP, Oracle, and on-premises SQL Server. ADF is the successor to on-premises SSIS and the recommended data integration platform for Azure environments.

How much does Azure Data Factory cost?

ADF pricing has four components: pipeline orchestration at $1.00 per 1,000 activity runs; data movement at $0.25 per DIU-hour (minimum 4 DIUs per copy activity); mapping data flows at $0.274/vCore-hour; SSIS Integration Runtime at $0.274/vCore-hour. For a typical enterprise running 500 pipelines with 5,000 daily activity runs and 100 GB of daily data movement, expect $800–$2,000/month.
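
To see how the four components combine, the rates above can be put into a small estimator. This is a rough sketch: the rates are the ones quoted in this guide, while the daily DIU-hours and vCore-hours in the example are assumptions chosen for illustration, since the guide states only the activity run count and the data volume.

```python
# Rates quoted in this guide (USD). Check the current Azure price list
# before relying on them.
RATE_PER_1000_ACTIVITY_RUNS = 1.00
RATE_PER_DIU_HOUR = 0.25
RATE_PER_VCORE_HOUR = 0.274  # quoted for both mapping data flows and the SSIS IR


def estimate_monthly_cost(
    activity_runs_per_day: float,
    copy_diu_hours_per_day: float,
    data_flow_vcore_hours_per_day: float,
    ssis_vcore_hours_per_day: float = 0.0,
    days: int = 30,
) -> dict[str, float]:
    """Return the estimated monthly cost per billing component and in total."""
    orchestration = activity_runs_per_day * days / 1000 * RATE_PER_1000_ACTIVITY_RUNS
    data_movement = copy_diu_hours_per_day * days * RATE_PER_DIU_HOUR
    data_flows = data_flow_vcore_hours_per_day * days * RATE_PER_VCORE_HOUR
    ssis = ssis_vcore_hours_per_day * days * RATE_PER_VCORE_HOUR
    total = orchestration + data_movement + data_flows + ssis
    return {
        "orchestration": round(orchestration, 2),
        "data_movement": round(data_movement, 2),
        "data_flows": round(data_flows, 2),
        "ssis": round(ssis, 2),
        "total": round(total, 2),
    }


if __name__ == "__main__":
    # 5,000 daily activity runs as in the example above; the 40 DIU-hours and
    # 80 vCore-hours per day are assumed values, not figures from the guide.
    print(estimate_monthly_cost(5000, 40, 80))
```

With 5,000 daily activity runs over 30 days, orchestration alone comes to $150. Under the assumed 40 DIU-hours and 80 data flow vCore-hours per day, the total is about $1,108 per month, with data flows as the largest component.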

What is the difference between ETL and ELT in Azure Data Factory?

ETL transforms data inside ADF using mapping data flows before loading to the destination — ADF manages the Spark clusters. ELT loads raw data into the destination first and transforms it using the destination's compute engine (Synapse, Databricks, or Fabric). EPC Group recommends ELT for most enterprise scenarios. It's more cost-effective and keeps transformation logic close to the consumption layer.

How do I implement CI/CD for Azure Data Factory pipelines?

Connect ADF to Azure DevOps Repos or GitHub. Developers work in feature branches. Changes merge to main via pull requests. On publish, ADF generates ARM templates in the publish branch (adf_publish). An Azure DevOps or GitHub Actions pipeline deploys those templates to staging and production with environment-specific parameters. Never manually publish to production; all changes flow through the CI/CD pipeline.

Can Azure Data Factory replace SSIS?

Yes. ADF is Microsoft's strategic replacement for SSIS. There are two migration paths: lift-and-shift with SSIS Integration Runtime (fastest, 2–6 weeks, but not cloud-native) or re-architect to native ADF pipelines (8–16 weeks, full cloud-native benefits). EPC Group recommends a phased approach: lift-and-shift first, then re-architect high-value packages.

Schedule a consultation

EPC Group has completed 10,000+ implementations across Azure, Power BI, Microsoft Fabric, SharePoint, and Copilot. Talk to an Azure architect about your ADF implementation or SSIS migration. Call (888) 381-9725 or request a discovery call.

Frequently Asked Questions

What is Azure Data Factory and what is it used for?

Azure Data Factory (ADF) is Microsoft's fully managed, serverless data integration service in Azure. It orchestrates and automates the movement and transformation of data at scale across 100+ built-in connectors — including Azure SQL Database, Azure Synapse Analytics, Azure Blob Storage, Amazon S3, Snowflake, Salesforce, SAP, Oracle, and on-premises SQL Server. ADF supports both ETL (Extract-Transform-Load) and ELT (Extract-Load-Transform) patterns. Enterprises use ADF to build data pipelines that ingest raw data from source systems, transform it using mapping data flows or external compute (Databricks, HDInsight, Synapse), and load it into data warehouses, data lakes, or analytical stores. ADF is the successor to on-premises SSIS (SQL Server Integration Services) and is the recommended data integration platform for organizations running workloads on Azure.

How much does Azure Data Factory cost?

ADF pricing is consumption-based with four billing components: (1) Pipeline orchestration and execution at $1.00 per 1,000 activity runs, (2) Data movement at $0.25 per DIU-hour (Data Integration Unit), with a minimum of 4 DIUs per copy activity, (3) Mapping data flow execution at $0.274/vCore-hour for compute-optimized clusters, and (4) SSIS Integration Runtime at $0.274/vCore-hour. For a typical enterprise running 500 pipelines with 5,000 daily activity runs and 100 GB of daily data movement, expect $800-$2,000/month. Mapping data flows are the most expensive component — optimize by right-sizing cluster configurations and using time-to-live (TTL) settings to keep clusters warm during batch windows rather than cold-starting for each execution.

What is the difference between ETL and ELT in Azure Data Factory?

ETL (Extract-Transform-Load) transforms data within ADF using mapping data flows before loading it into the destination. The transformation runs on ADF-managed Spark clusters. This pattern is ideal when you need to cleanse, reshape, or enrich data before it reaches the data warehouse. ELT (Extract-Load-Transform) uses ADF to extract and load raw data into the destination (typically Azure Synapse, Databricks, or a data lake), then transforms it using the destination's compute engine. ELT leverages the destination's processing power, which is often more cost-effective for large-scale transformations. EPC Group recommends ELT for most enterprise scenarios — load raw data into OneLake or Azure Data Lake Storage Gen2, then transform using Synapse SQL pools, Databricks, or Microsoft Fabric for better performance and lower cost than running transformations on ADF's Spark clusters.

How do I implement CI/CD for Azure Data Factory pipelines?

ADF supports Git integration with Azure DevOps Repos or GitHub. The recommended CI/CD workflow is: (1) Connect ADF to a Git repository (each developer works in feature branches), (2) Develop and test pipelines in the ADF Studio UI connected to a development ADF instance, (3) Merge changes to the collaboration branch (typically main) via pull requests, (4) On publish, ADF automatically generates ARM templates in the publish branch (adf_publish), (5) An Azure DevOps or GitHub Actions pipeline deploys the templates to staging and production ADF instances with environment-specific parameters (linked services, connection strings, key vault references). Use parameterized linked services and global parameters to manage environment differences. Never manually publish changes in production; all changes flow through the CI/CD pipeline.

Can Azure Data Factory replace SSIS (SQL Server Integration Services)?

Yes. ADF is Microsoft's strategic replacement for SSIS. Organizations can migrate SSIS packages to ADF using two approaches: (1) Lift-and-shift with SSIS Integration Runtime — run existing SSIS packages unchanged on an ADF-managed SSIS runtime in Azure. This is the fastest migration path but does not modernize the packages. (2) Re-architect to native ADF pipelines — redesign SSIS packages as ADF pipelines with mapping data flows. This provides full cloud-native benefits (serverless scaling, consumption pricing, built-in monitoring) but requires development effort. EPC Group recommends a phased approach: lift-and-shift critical packages first to migrate off on-premises infrastructure, then gradually re-architect high-value packages to native ADF pipelines based on ROI analysis.

How do I monitor and troubleshoot Azure Data Factory pipelines?

ADF provides built-in monitoring through the ADF Monitor hub, which shows pipeline runs, activity runs, trigger runs, and integration runtime status with up to 45 days of history. For enterprise monitoring, integrate ADF with Azure Monitor by sending diagnostic logs to a Log Analytics workspace — this enables custom KQL queries, alerting on pipeline failures, and long-term log retention (beyond 45 days). Set up Azure Monitor alerts for: pipeline failure (immediate notification), long-running pipelines (exceeding expected duration by 2x), high DIU consumption (cost anomaly), and integration runtime errors. EPC Group configures a centralized monitoring dashboard in Azure Monitor Workbooks that tracks pipeline SLAs, data freshness metrics, and cost trends across all ADF instances in the environment.
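
The alert conditions listed above reduce to simple rules over run records. The sketch below shows only that logic, assuming run data has already been pulled out of the logs into a hypothetical PipelineRun record. In practice the rules would be defined as Azure Monitor alert rules rather than in application code, and the expected durations would come from your own baselines.

```python
from dataclasses import dataclass


@dataclass
class PipelineRun:
    pipeline: str
    status: str  # for example "Succeeded" or "Failed"
    duration_minutes: float


def alerts_for(
    run: PipelineRun,
    expected_minutes: dict[str, float],
    factor: float = 2.0,
) -> list[str]:
    """Return the alert names a run should raise.

    A run is long-running when it exceeds `factor` times the expected
    duration for its pipeline; pipelines without a baseline are skipped.
    """
    alerts = []
    if run.status == "Failed":
        alerts.append("pipeline_failure")
    expected = expected_minutes.get(run.pipeline)
    if expected is not None and run.duration_minutes > factor * expected:
        alerts.append("long_running")
    return alerts


if __name__ == "__main__":
    baselines = {"PL_Ingest_Salesforce_Accounts": 10.0}
    run = PipelineRun("PL_Ingest_Salesforce_Accounts", "Succeeded", 25.0)
    print(alerts_for(run, baselines))
```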
