EPC Group - Enterprise Microsoft AI, SharePoint, Power BI, and Azure Consulting


Microsoft Graph Data Connect: Copy Graph Datasets into Azure Data Factory

Errin O'Connor
December 2025
8 min read

Microsoft Graph Data Connect enables enterprises to copy Microsoft 365 organizational data -- including email metadata, calendar events, Teams messages, OneDrive file metadata, and user profiles -- into Azure storage through Azure Data Factory pipelines for large-scale analytics, machine learning, and business intelligence workloads. Unlike the standard Microsoft Graph API, which is designed for real-time, per-user queries, Graph Data Connect provides bulk access to organizational datasets with built-in privacy controls and data governance.

Understanding Microsoft Graph Data Connect

Microsoft Graph Data Connect bridges the gap between the rich organizational data in Microsoft 365 and the analytical capabilities of Azure. The standard Microsoft Graph API serves individual requests well (fetch a user's calendar, read a specific email), but it is not designed for bulk data extraction. Attempting to pull millions of email records or user profiles through the API would hit throttling limits and take days.

Graph Data Connect solves this by providing a pipeline-based approach that delivers Microsoft 365 data in bulk to your Azure storage, where it can be processed by Azure Data Factory, Azure Synapse Analytics, Azure Databricks, or any other Azure analytics service. Key capabilities include:

  • Bulk data delivery -- Extract millions of records from Microsoft 365 datasets (emails, calendar events, contacts, Teams messages, OneDrive activity) in a single pipeline run, with data delivered in JSON format to Azure Blob Storage or Azure Data Lake Storage.
  • Built-in data governance -- All data extraction requests go through an approval workflow in the Microsoft 365 admin center. A designated admin must approve each pipeline before data is delivered, preventing unauthorized data extraction.
  • Privacy controls -- Graph Data Connect supports column-level filtering (exclude sensitive fields), row-level filtering (extract data for specific groups only), and data pseudonymization (replace user IDs with pseudonyms for privacy-preserving analytics).
  • Incremental extraction -- Pipelines can be configured to extract only new or changed data since the last run, reducing data transfer volumes and processing costs for recurring analytics workloads.
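One way to picture incremental extraction is as a sliding time window: each run asks only for records whose timestamp falls after the end of the previous run. The sketch below is a minimal illustration under that assumption. The names `next_window` and `iso_utc`, and the idea of storing a watermark between runs, are hypothetical and are not part of Graph Data Connect itself.

```python
from datetime import datetime, timedelta, timezone


def next_window(last_end, now, max_span=timedelta(days=7)):
    """Return the (start, end) time range for the next incremental run.

    last_end: end of the previously extracted window (the watermark).
    now:      current time; the window never extends past it.
    max_span: cap on window size so a long gap is caught up in chunks.
    Returns None when there is nothing new to extract.
    """
    if last_end >= now:
        return None
    end = min(now, last_end + max_span)
    return last_end, end


def iso_utc(ts):
    """Format a timezone-aware datetime as an ISO 8601 UTC string."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Example: the last run ended on 1 December, and the next run starts two days later.
watermark = datetime(2025, 12, 1, tzinfo=timezone.utc)
now = datetime(2025, 12, 3, 6, 30, tzinfo=timezone.utc)
start, end = next_window(watermark, now)
print(iso_utc(start), iso_utc(end))
```

The resulting start and end strings are the kind of values a pipeline could pass as its date-range filter. After a successful run, the stored watermark would advance to `end`.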

Available Datasets

Graph Data Connect provides access to a growing catalog of Microsoft 365 datasets:

  • Mail messages and events -- Email metadata and content, calendar events, and meeting details. Useful for communication pattern analysis, meeting culture assessment, and collaboration network mapping.
  • Users and groups -- User profile data (name, title, department, location, manager) and group membership. Powers organizational analytics, reporting hierarchies, and people analytics.
  • Teams messages -- Teams channel and chat messages for compliance archiving, sentiment analysis, and collaboration analytics.
  • OneDrive and SharePoint -- File metadata, sharing activity, and access patterns. Useful for content governance, data sprawl analysis, and collaboration reporting.
  • Contacts -- Organizational contact data for relationship mapping and communication analysis.
  • Viva Insights signals -- Aggregated productivity signals from Viva Insights for workforce analytics (requires appropriate licensing).

Setting Up the Pipeline: Step by Step

Here is how to configure a Graph Data Connect pipeline that uses Azure Data Factory to copy Microsoft 365 data into your Azure storage:

  • Step 1: Prerequisites -- Enable Microsoft Graph Data Connect in your Microsoft 365 tenant (via the Microsoft 365 admin center under Settings > Org settings). Provision an Azure subscription in the same Microsoft Entra ID (formerly Azure AD) tenant. Create an Azure Data Factory instance and an Azure Storage account (Blob or Data Lake Gen2).
  • Step 2: Register a Microsoft Entra ID application -- Create an app registration in Microsoft Entra ID with the appropriate Graph Data Connect permissions. The app does not require individual user consent -- permissions are approved by the M365 admin through the approval workflow.
  • Step 3: Configure the Azure Data Factory pipeline -- In Azure Data Factory, create a new pipeline with a Copy Data activity. Configure the source with the "Microsoft 365 (Office 365)" connector, select the dataset (e.g., BasicDataSet_v0.Message_v1 for emails), and configure the sink as your Azure Storage account.
  • Step 4: Configure data filtering -- Specify column filters to exclude sensitive fields you do not need. Configure row filters to limit data extraction to specific Microsoft Entra ID groups or time ranges. Enable pseudonymization if your analytics use case does not require real user identities.
  • Step 5: Submit for approval -- When you trigger the pipeline, Graph Data Connect generates an approval request in the Microsoft 365 admin center. The designated approver reviews the requested dataset, columns, filters, and destination before approving or denying the extraction.
  • Step 6: Process the data -- Once approved, data flows to your Azure Storage account in JSON format. Use Azure Synapse, Databricks, or Data Factory data flows to transform, aggregate, and analyze the data for your business intelligence or machine learning workloads.
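To make steps 3 and 4 more concrete, the sketch below builds a Copy activity definition as a Python dictionary and prints it as JSON. The property names (`Office365Source`, `dateFilterColumn`, `startTime`, `endTime`, `allowedGroups`, `outputColumns`) follow our understanding of the Azure Data Factory Microsoft 365 connector and should be checked against the current connector documentation. The activity name, the dataset reference names, the column list, the group ID, and the sink type are placeholders chosen for illustration. The input dataset is assumed to point at BasicDataSet_v0.Message_v1.

```python
import json


def build_copy_activity(columns, start, end, group_ids):
    """Build a Copy activity definition as a plain dict (serializable to JSON)."""
    return {
        "name": "CopyMessagesToStorage",  # hypothetical activity name
        "type": "Copy",
        # Hypothetical dataset names; the input dataset would reference the
        # Microsoft 365 table, the output dataset the storage location.
        "inputs": [{"referenceName": "M365Messages", "type": "DatasetReference"}],
        "outputs": [{"referenceName": "RawMessagesJson", "type": "DatasetReference"}],
        "typeProperties": {
            "source": {
                "type": "Office365Source",
                # Row filters: a time range and a set of groups.
                "dateFilterColumn": "CreatedDateTime",
                "startTime": start,
                "endTime": end,
                "allowedGroups": list(group_ids),
                # Column filter: only the listed columns are requested.
                "outputColumns": [{"name": c} for c in columns],
            },
            # Assumed sink type; the right value depends on the sink dataset.
            "sink": {"type": "BlobSink"},
        },
    }


activity = build_copy_activity(
    columns=["Id", "CreatedDateTime", "Sender", "ToRecipients"],
    start="2025-11-01T00:00:00Z",
    end="2025-12-01T00:00:00Z",
    group_ids=["00000000-0000-0000-0000-000000000000"],  # placeholder group ID
)
print(json.dumps(activity, indent=2))
```

Requesting only the columns an analysis needs keeps the approval request narrow and reduces the volume of data delivered.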

Enterprise Use Cases

Organizations leverage Graph Data Connect for sophisticated analytics that would be impractical with the standard Graph API:

  • Organizational network analysis (ONA) -- Map communication patterns across the organization to identify collaboration bottlenecks, information silos, and key influencers. This data powers informed decisions about organizational restructuring and team composition.
  • Meeting culture optimization -- Analyze calendar data at scale to identify meeting overload patterns, fragmented schedules, and back-to-back meeting chains. Provide data-driven recommendations to leadership for meeting policy changes.
  • Compliance and risk analytics -- Extract email and Teams data for compliance pattern detection, insider risk signals, and regulatory reporting in financial services and healthcare organizations.
  • Employee experience analytics -- Combine Graph Data Connect signals with HR data to build comprehensive employee experience dashboards that correlate digital work patterns with engagement, retention, and performance metrics.
  • Custom Power BI dashboards -- Feed Graph Data Connect output into Power BI through Azure Synapse to create rich organizational analytics dashboards that go far beyond what Viva Insights provides out of the box.
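As a rough illustration of the organizational network analysis use case, the sketch below reads message records in JSON Lines form and counts how often each sender writes to each recipient. The record layout is a simplified assumption made for this example. Real Graph Data Connect output nests sender and recipient details differently and would need a mapping step first. The addresses are made up.

```python
import json
from collections import Counter

# Sample records in JSON Lines form (one JSON object per line).
# Assumed, simplified layout; not the exact Graph Data Connect schema.
SAMPLE = """\
{"Sender": "ana@contoso.com", "ToRecipients": ["bo@contoso.com", "cy@contoso.com"]}
{"Sender": "bo@contoso.com", "ToRecipients": ["ana@contoso.com"]}
{"Sender": "ana@contoso.com", "ToRecipients": ["bo@contoso.com"]}
"""


def edge_counts(lines):
    """Count messages per (sender, recipient) pair from JSON Lines text."""
    edges = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        for recipient in record.get("ToRecipients", []):
            edges[(record["Sender"], recipient)] += 1
    return edges


edges = edge_counts(SAMPLE.splitlines())
for (sender, recipient), n in edges.most_common():
    print(f"{sender} -> {recipient}: {n}")
```

The resulting edge list is the usual input for graph tooling or a Power BI network visual. At production scale the same aggregation would typically run in Synapse or Databricks rather than in a single Python process.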

How EPC Group Can Help

With 28+ years of enterprise Microsoft consulting experience and deep expertise in both Microsoft 365 and Azure analytics, EPC Group is uniquely positioned to help organizations leverage Graph Data Connect. Our services include:

  • Use case assessment -- We identify the highest-value analytics use cases for your organization and determine which Microsoft 365 datasets to extract, what filters to apply, and how to structure the analytical output.
  • Pipeline design and implementation -- We design and build Azure Data Factory pipelines with proper data filtering, incremental extraction, error handling, and monitoring for production reliability.
  • Privacy and governance configuration -- We configure data pseudonymization, column filtering, and group-based row filtering to ensure your analytics respect employee privacy and meet GDPR, HIPAA, and organizational data governance policies.
  • Analytics and visualization -- We build Power BI dashboards, Azure Synapse notebooks, and machine learning models that transform raw Graph Data Connect output into actionable business intelligence.
  • Approval workflow design -- We establish the admin approval workflow, define approver roles, and create governance documentation that ensures data extraction requests are reviewed and approved by authorized personnel.

Unlock Your Microsoft 365 Data

Ready to extract actionable insights from your Microsoft 365 organizational data? Our analytics team can design and implement a Graph Data Connect pipeline tailored to your business intelligence needs.

Schedule a Consultation | Call (888) 381-9725

Frequently Asked Questions

How is Graph Data Connect different from the Microsoft Graph API?

The Microsoft Graph API is designed for real-time, per-user queries -- fetching a single user's calendar or reading specific emails. It has throttling limits that make bulk data extraction impractical. Graph Data Connect is designed for bulk, organizational-scale data extraction, delivering millions of records in batch via Azure Data Factory pipelines. It includes data governance features (admin approval, pseudonymization) not available in the standard API.

What Azure services does Graph Data Connect require?

At minimum, you need an Azure subscription, Azure Data Factory (for pipeline orchestration), and Azure Blob Storage or Azure Data Lake Storage Gen2 (as the data destination). For analytics, you will typically add Azure Synapse Analytics or Azure Databricks, or connect the data to Power BI. The Azure subscription must be in the same Microsoft Entra ID (formerly Azure AD) tenant as your Microsoft 365 organization.

Does Graph Data Connect support GDPR and employee privacy?

Yes, privacy controls are a core feature. You can pseudonymize user identities (replacing email addresses with opaque IDs), exclude specific columns from extraction, limit data to specific Microsoft Entra ID (formerly Azure AD) groups, and filter by date range. All extraction requests require admin approval before data is delivered. For GDPR compliance, ensure your data processing agreements cover the extracted data and that your retention policies are applied in Azure Storage.
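To show what pseudonymization and column exclusion mean in practice, the sketch below applies both to a single record after extraction. It uses a keyed hash (HMAC-SHA256) so the same address always maps to the same opaque token. This is a swapped-in, generic technique for illustration and is not a description of how Graph Data Connect pseudonymizes data internally. The function names, column names, and key are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(user_id, secret_key):
    """Map an identifier to a stable opaque token using HMAC-SHA256."""
    normalized = user_id.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability


def scrub(record, drop_columns, id_columns, secret_key):
    """Return a copy of record without dropped columns and with IDs replaced."""
    cleaned = {}
    for name, value in record.items():
        if name in drop_columns:
            continue  # column-level exclusion
        if name in id_columns:
            value = pseudonymize(value, secret_key)
        cleaned[name] = value
    return cleaned


# Placeholder key; a real key would live in a secret store, not in code.
key = b"example-key-keep-in-a-vault"
record = {"Sender": "Ana@contoso.com", "Subject": "Q4 plan", "SentDateTime": "2025-11-03T09:15:00Z"}
print(scrub(record, drop_columns={"Subject"}, id_columns={"Sender"}, secret_key=key))
```

Because the token depends on the key, analysts can still join records for the same person across datasets, while re-identification requires access to the key.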

How much does Graph Data Connect cost?

Graph Data Connect pricing is based on the number of Microsoft Graph objects extracted per pipeline run. Microsoft charges per 1,000 objects extracted, with pricing varying by dataset type. Additionally, you pay standard Azure costs for Data Factory pipeline runs, storage, and any analytics services you use. For organizations extracting data for millions of users, EPC Group helps optimize extraction filters and incremental loading strategies to minimize costs.
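The per-1,000-objects model makes a quick estimate straightforward. The sketch below compares a full extraction with an incremental one. The price used is a placeholder for illustration only, not a published Microsoft rate, and the object counts are invented. Azure costs for Data Factory runs, storage, and analytics services are not included.

```python
def estimate_extraction_cost(object_count, price_per_1000):
    """Estimate the extraction charge for a given number of objects."""
    if object_count < 0 or price_per_1000 < 0:
        raise ValueError("object_count and price_per_1000 must be non-negative")
    return round(object_count / 1000 * price_per_1000, 2)


PLACEHOLDER_PRICE = 0.40  # hypothetical price per 1,000 objects

# Hypothetical volumes: a full reload versus only the records changed since the last run.
full = estimate_extraction_cost(2_500_000, PLACEHOLDER_PRICE)
incremental = estimate_extraction_cost(125_000, PLACEHOLDER_PRICE)
print(f"full: {full:.2f}  incremental: {incremental:.2f}")
```

Running the same estimate with your actual dataset prices and expected object counts shows how much filters and incremental loading reduce the recurring extraction charge.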

Can we extract Teams channel and chat messages?

Yes. Graph Data Connect provides datasets for Teams channel messages and chat messages. This data is valuable for compliance analytics, sentiment analysis, and collaboration pattern studies. Be aware that extracting message content has significant privacy implications -- ensure you have appropriate legal basis (GDPR), employee notice, and data governance policies in place. In many organizations, extracting message metadata (timestamps, participants) is preferred over extracting full message content.