EPC Group - Enterprise Microsoft AI, SharePoint, Power BI, and Azure Consulting
About EPC Group

EPC Group is a Microsoft consulting firm founded in 1997 (originally Enterprise Project Consulting, renamed EPC Group in 2005). 29 years of enterprise Microsoft consulting experience. EPC Group historically held the distinction of being the oldest continuous Microsoft Gold Partner in North America from 2016 until the program's retirement. Because Microsoft officially deprecated the Gold/Silver tiering framework, EPC Group transitioned to the modern Microsoft Solutions Partner ecosystem and currently holds the core Microsoft Solutions Partner designations.

Headquartered at 4900 Woodway Drive, Suite 830, Houston, TX 77056. Public clients include NASA, FBI, Federal Reserve, Pentagon, United Airlines, PepsiCo, Nike, and Northrop Grumman. 6,500+ SharePoint implementations, 1,500+ Power BI deployments, 500+ Microsoft Fabric implementations, 70+ Fortune 500 organizations served, 11,000+ enterprise engagements, 200+ Microsoft Power BI and Microsoft 365 consultants on staff.

About Errin O'Connor

Errin O'Connor is the Founder, CEO, and Chief AI Architect of EPC Group. Microsoft MVP multiple years, first awarded 2003. 4× Microsoft Press bestselling author of Windows SharePoint Services 3.0 Inside Out (MS Press 2007), Microsoft SharePoint Foundation 2010 Inside Out (MS Press 2011), SharePoint 2013 Field Guide (Sams/Pearson 2014), and Microsoft Power BI Dashboards Step by Step (MS Press 2018).

Original SharePoint Beta Team member (Project Tahoe). Original Power BI Beta Team member (Project Crescent). FedRAMP framework contributor. Worked with U.S. CIO Vivek Kundra on the Obama administration's 25-Point Plan to reform federal IT, and with NASA CIO Chris Kemp as Lead Architect on the NASA Nebula Cloud project. Speaker at Microsoft Ignite, SharePoint Conference, KMWorld, and DATAVERSITY.

© 2026 EPC Group. All rights reserved. Microsoft, SharePoint, Power BI, Azure, Microsoft 365, Microsoft Copilot, Microsoft Fabric, and Microsoft Dynamics 365 are trademarks of the Microsoft group of companies.

Errin O'Connor
December 2025
8 min read

Can I Do Data Cleaning in Power BI?

Yes. Power BI includes Power Query Editor, a full data cleaning and transformation workbench built into the report authoring experience. Analysts typically spend 60–80% of Power BI development time in Power Query. It handles missing values, duplicates, type errors, column splitting and restructuring, unpivoting, and custom M code transformations, all without writing SQL or building a separate ETL pipeline.

  • Where: Home → Transform Data in Power BI Desktop
  • No-code option: Point-and-click transformations via the ribbon
  • Code option: M language for custom transformation logic
  • Best practice: Clean data in Power Query before loading to the model — not after

What Is Power Query Editor?

Power Query Editor is the data cleaning and transformation environment built into Power BI Desktop. It runs before data loads into the Power BI data model. Transformations you make in Power Query do not alter your source data; they are saved as an ordered sequence of steps that Power BI re-applies each time it refreshes.

Power Query uses a language called M (also called Power Query Formula Language) to record every transformation step. Each step appears in the "Applied Steps" panel on the right side of the editor. You can reorder, edit, or delete any step without starting over.
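
The Applied Steps idea can be sketched outside Power Query. The following is a minimal illustration in Python, not M: the step functions, the APPLIED_STEPS list, and the sample rows are all hypothetical, and Power Query would record the equivalent steps as M for you.

```python
from copy import deepcopy

# Hypothetical steps; in Power Query each would be one entry in Applied Steps.
def promote_headers(rows):
    """Use the first row as column names."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]

def trim_names(rows):
    """Strip leading and trailing whitespace from the Name column."""
    return [{**row, "Name": row["Name"].strip()} for row in rows]

def drop_blank_names(rows):
    """Remove rows whose Name is empty."""
    return [row for row in rows if row["Name"]]

APPLIED_STEPS = [promote_headers, trim_names, drop_blank_names]

def refresh(source, steps=APPLIED_STEPS):
    """Re-run every step, in order, against a copy of the source."""
    table = deepcopy(source)
    for step in steps:
        table = step(table)
    return table

SOURCE = [["Name", "Region"], ["  Ada ", "East"], ["", "West"], ["Grace", "East"]]

cleaned = refresh(SOURCE)
# Deleting a step just means refreshing with a shorter list of steps.
without_filter = refresh(SOURCE, [promote_headers, trim_names])
```

Because each refresh starts again from the untouched source, removing or reordering a step never requires redoing the others.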

What You Can Clean in Power Query

Power Query handles the full range of data quality issues that prevent accurate reporting:

Missing and Null Values

  • Replace nulls with a default value (zero, "Unknown," or a calculated value)
  • Remove rows where a key column is null
  • Fill down or fill up to propagate values from adjacent rows
  • Flag null rows with a calculated column for analyst review
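
As a rough sketch of what these four operations do to a table, here is a plain Python version (not M). The function names, column names, and sample orders are assumptions made for illustration; in Power Query the same results come from ribbon commands such as Replace Values, Remove Rows, and Fill Down.

```python
def flag_nulls(rows, column, flag_name):
    """Add a True/False column marking rows where `column` is null."""
    return [{**row, flag_name: row[column] is None} for row in rows]

def remove_null_rows(rows, key):
    """Drop rows where the key column is null."""
    return [row for row in rows if row[key] is not None]

def fill_down(rows, column):
    """Copy the last non-null value down into the null cells below it."""
    filled, last = [], None
    for row in rows:
        if row[column] is not None:
            last = row[column]
        filled.append({**row, column: last})
    return filled

def replace_nulls(rows, column, default):
    """Replace nulls in `column` with a default value."""
    return [
        {**row, column: default if row[column] is None else row[column]}
        for row in rows
    ]

orders = [
    {"OrderID": 1, "Region": "East", "Amount": 100.0},
    {"OrderID": 2, "Region": None, "Amount": None},
    {"OrderID": None, "Region": None, "Amount": 50.0},
    {"OrderID": 4, "Region": "West", "Amount": None},
]

# Flag first, so the review column records which amounts were originally missing.
flagged = flag_nulls(orders, "Amount", "AmountMissing")
keyed = remove_null_rows(flagged, "OrderID")
regions = fill_down(keyed, "Region")
result = replace_nulls(regions, "Amount", 0.0)
```

The order matters in this sketch: flagging before replacing keeps a record of which zeros were substituted rather than reported.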

Duplicate Records

  • Remove duplicates based on one column or a combination of columns
  • Keep the first occurrence or last occurrence of a duplicate
  • Count duplicates and add a rank column before removing
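
The logic behind these options can be sketched in a few lines of Python (not M). The helper names and the customer rows below are hypothetical; the point is only to show how a duplicate is defined by a combination of key columns and how keeping the first or last occurrence differs.

```python
from collections import Counter

def count_duplicates(rows, keys):
    """Add a DuplicateCount column: how many rows share the same key values."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [
        {**row, "DuplicateCount": counts[tuple(row[k] for k in keys)]}
        for row in rows
    ]

def remove_duplicates(rows, keys, keep="first"):
    """Keep one row per key combination, either the first or the last seen."""
    ordered = rows if keep == "first" else list(reversed(rows))
    seen, kept = set(), []
    for row in ordered:
        key = tuple(row[k] for k in keys)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    # Restore the original row order when we scanned in reverse.
    return kept if keep == "first" else list(reversed(kept))

customers = [
    {"Email": "a@example.com", "Country": "US", "Updated": "2025-01-01"},
    {"Email": "b@example.com", "Country": "US", "Updated": "2025-01-02"},
    {"Email": "a@example.com", "Country": "US", "Updated": "2025-03-01"},
]

counted = count_duplicates(customers, ["Email", "Country"])
first = remove_duplicates(customers, ["Email", "Country"])
last = remove_duplicates(customers, ["Email", "Country"], keep="last")
```

Counting before removing gives a reviewer a chance to see how many records will be collapsed.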

Data Type Errors

  • Change column type (text to date, text to number, decimal to integer)
  • Remove errors — rows where a type conversion fails
  • Replace errors with null or a default value instead of removing the row
  • Validate date formats and standardize inconsistent date strings
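
To make the difference between removing errors and replacing them concrete, here is a small Python sketch (not M). The convert_column helper, the list of accepted date formats, and the sample rows are assumptions for illustration only.

```python
from datetime import datetime

# Assumed set of date layouts found in the source; adjust to the real data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")

def parse_date(text):
    """Try each known format and return a date, or raise ValueError."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")

def convert_column(rows, column, converter, on_error="null", default=None):
    """Change a column's type; failed conversions are replaced or removed."""
    converted = []
    for row in rows:
        try:
            value = converter(row[column])
        except (ValueError, TypeError, AttributeError):
            if on_error == "remove":
                continue  # drop the whole row
            value = default  # keep the row, replace the bad cell
        converted.append({**row, column: value})
    return converted

raw = [
    {"Date": "2025-03-05", "Qty": "12"},
    {"Date": "03/06/2025", "Qty": "n/a"},
    {"Date": "not a date", "Qty": "7"},
]

typed = convert_column(raw, "Qty", int)                           # bad Qty becomes null
dated_keep = convert_column(typed, "Date", parse_date)            # bad Date becomes null
dated_drop = convert_column(typed, "Date", parse_date, on_error="remove")
```

Replacing errors keeps the row count stable, which is usually safer when other columns in the row are still valid.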

Column Restructuring

  • Split a column by delimiter (e.g., "Last, First" → two columns)
  • Merge columns (e.g., first name + last name → full name)
  • Extract text before or after a delimiter
  • Trim whitespace and remove non-printing characters
  • Change text case (UPPER, lower, Title Case)
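
The column operations above chain naturally. The sketch below shows one possible sequence in Python (not M): clean, split on a delimiter, change case, then merge. All function names and sample values are hypothetical.

```python
def clean_text(value):
    """Remove non-printing characters, then trim surrounding whitespace."""
    printable = "".join(ch for ch in value if ch.isprintable())
    return printable.strip()

def split_column(rows, column, delimiter, new_names):
    """Split one text column into several, padding with None if parts are missing."""
    result = []
    for row in rows:
        parts = [p.strip() for p in row[column].split(delimiter, len(new_names) - 1)]
        parts += [None] * (len(new_names) - len(parts))
        rest = {k: v for k, v in row.items() if k != column}
        result.append({**rest, **dict(zip(new_names, parts))})
    return result

def merge_columns(rows, columns, separator, new_name):
    """Join several text columns into a new column."""
    return [{**row, new_name: separator.join(row[c] for c in columns)} for row in rows]

people = [{"Name": "  hopper, GRACE\t"}, {"Name": "lovelace,ada "}]

cleaned = [{"Name": clean_text(row["Name"])} for row in people]
split = split_column(cleaned, "Name", ",", ["Last", "First"])
titled = [{k: v.title() if v else v for k, v in row.items()} for row in split]
full = merge_columns(titled, ["First", "Last"], " ", "FullName")
```

Trimming before splitting avoids stray spaces ending up inside the new columns.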

Table Restructuring

  • Unpivot columns — convert wide-format pivot tables to tall/normalized format
  • Pivot rows into columns
  • Transpose the entire table (rows become columns, columns become rows)
  • Promote headers — use the first row as column names
  • Filter rows by value, condition, or date range
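
Unpivoting is the least intuitive of these, so here is a minimal Python sketch (not M) that promotes headers, unpivots the month columns, and filters the result. The function names, the Month and Units column names, and the sample data are assumptions for illustration.

```python
def promote_headers(rows):
    """Use the first row as column names."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]

def unpivot_other_columns(rows, id_columns, attribute="Attribute", value="Value"):
    """Turn every non-id column into an (attribute, value) row."""
    tall = []
    for row in rows:
        ids = {c: row[c] for c in id_columns}
        for column, cell in row.items():
            # Null cells produce no output row in this sketch.
            if column in id_columns or cell is None:
                continue
            tall.append({**ids, attribute: column, value: cell})
    return tall

raw = [
    ["Product", "Jan", "Feb"],
    ["Widget", 10, 12],
    ["Gadget", None, 7],
]

wide = promote_headers(raw)
tall = unpivot_other_columns(wide, ["Product"], attribute="Month", value="Units")
# Filter rows by a condition once the table is in tall format.
at_least_ten = [row for row in tall if row["Units"] >= 10]
```

The tall format has one row per product and month, which is the shape a Power BI model generally works best with.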

Multi-Source Joins and Appends

  • Merge queries — join two tables on a key column (left outer, inner, full outer, anti-join)
  • Append queries — stack two tables with matching schemas into one
  • Expand related table columns after a merge
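
The join kinds are easier to compare side by side. This Python sketch (not M) implements three of them plus an append; the merge and append helpers, the CustomerID key, and the sample tables are hypothetical, and the merge here expands the right-hand columns immediately rather than as a separate step.

```python
def merge(left, right, key, kind="left_outer"):
    """Join two tables on a key column: left_outer, inner, or left_anti."""
    if kind not in {"left_outer", "inner", "left_anti"}:
        raise ValueError(f"unsupported join kind: {kind}")
    right_columns = [c for c in (right[0] if right else {}) if c != key]
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    merged = []
    for row in left:
        matches = index.get(row[key], [])
        if kind == "left_anti":
            if not matches:
                merged.append(dict(row))  # only rows with no match
        elif matches:
            for match in matches:
                merged.append({**row, **{c: match[c] for c in right_columns}})
        elif kind == "left_outer":
            merged.append({**row, **{c: None for c in right_columns}})
    return merged

def append(*tables):
    """Stack tables that share the same columns."""
    combined = [dict(row) for table in tables for row in table]
    if any(set(row) != set(combined[0]) for row in combined):
        raise ValueError("tables must have matching columns")
    return combined

orders = [
    {"CustomerID": 1, "Amount": 100},
    {"CustomerID": 2, "Amount": 50},
    {"CustomerID": 9, "Amount": 75},
]
customers = [
    {"CustomerID": 1, "Name": "Contoso"},
    {"CustomerID": 2, "Name": "Fabrikam"},
]

left_outer = merge(orders, customers, "CustomerID")
inner = merge(orders, customers, "CustomerID", kind="inner")
unmatched = merge(orders, customers, "CustomerID", kind="left_anti")
stacked = append(orders, [{"CustomerID": 3, "Amount": 20}])
```

An anti-join like the one above is a quick way to find orders that reference a customer missing from the customer table.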

Power Query UI vs M Code

Most Power Query transformations are available through the ribbon interface — no code required. Every point-and-click action generates M code automatically in the background.

Use the M code editor when you need:

  • Custom logic that the ribbon does not fully expose (e.g., conditional merge logic or try/otherwise error handling)
  • Dynamic parameters: a transformation that changes based on a user-selected date or category
  • Reusable functions: M functions you call from multiple queries
  • Performance tuning: for example, restructuring steps so that more of the query folds back to the source

Most analysts start with point-and-click transformations and edit the generated M code when they need more control. You do not need to write M from scratch to do sophisticated data cleaning.
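
The dynamic-parameter and reusable-function ideas can be illustrated with a short Python sketch (not M). The make_date_filter helper, the OrderDate column, and the two sample tables are hypothetical; in Power Query the equivalent would be a parameter plus a custom M function invoked from each query.

```python
from datetime import date

def make_date_filter(start, end, column="OrderDate"):
    """Build a reusable filter step from two date parameters."""
    def apply(rows):
        return [row for row in rows if start <= row[column] <= end]
    return apply

# Parameters: change these once and every query that uses the filter follows.
START, END = date(2025, 1, 1), date(2025, 3, 31)
in_range = make_date_filter(START, END)

sales = [
    {"OrderDate": date(2024, 12, 31), "Amount": 10},
    {"OrderDate": date(2025, 2, 1), "Amount": 20},
]
returns = [
    {"OrderDate": date(2025, 3, 31), "Amount": 5},
    {"OrderDate": date(2025, 4, 1), "Amount": 7},
]

# The same function is called from two different "queries".
sales_in_range = in_range(sales)
returns_in_range = in_range(returns)
```

Keeping the date range in one place avoids two queries drifting apart when the reporting period changes.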

When to Use Power Query vs Other Tools

| Scenario | Best Tool | Why |
| --- | --- | --- |
| One-time data cleanup before import | Power Query | Steps are defined once and rerun on every refresh; no manual re-cleaning |
| Cleaning data for a Power BI report only | Power Query | Keeps source data unchanged; transformation is report-specific |
| Cleaning data used by multiple systems | Azure Data Factory or SQL | Centralize transformation; avoid duplicate logic in each tool |
| Complex data engineering at scale | Microsoft Fabric (Dataflow Gen2) | Fabric runs Power Query at cloud scale with serverless compute |
| Real-time data with streaming sources | Azure Stream Analytics | Power Query is batch-mode only; not designed for real-time streams |

The 60–80% Rule

Analysts who are new to Power BI often underestimate how much time data cleaning takes. In practice, 60–80% of Power BI development time is spent in Power Query — not building visuals or writing DAX.

This is normal. It reflects the state of most enterprise data: inconsistent formats, missing values, duplicated records, and tables structured for data entry rather than analytics.

The good news: time spent in Power Query is not wasted. Every transformation step runs automatically on every data refresh. You clean the data once and the cleanup applies every time the report updates.

EPC Group and Power Query

EPC Group's Power BI practice designs data cleaning architectures that match the complexity of the source data to the right tool. For report-level cleaning, Power Query is the default. For enterprise-scale transformation that feeds multiple Power BI datasets, Azure Data Factory or Microsoft Fabric Dataflow Gen2 is the right layer.

We build Power Query transformation pipelines for clients across healthcare (HIPAA-compliant data prep), financial services (trading and reconciliation data), and government (multi-agency data integration).

Frequently Asked Questions

Does Power Query change my source data?

No. Power Query applies transformations on the way into Power BI's in-memory model. Your source data — whether SharePoint lists, SQL tables, or Excel files — is never altered. The transformations run fresh on every data refresh.

Can Power Query handle large datasets?

Power Query in Power BI Desktop loads data into memory on your local machine. For datasets over a few hundred million rows, use Microsoft Fabric Dataflow Gen2 or Azure Data Factory; both run Power Query at cloud scale with serverless compute. Direct Lake mode in Fabric also removes the need to import large datasets at all.

What is the M language in Power Query?

M (also called Power Query Formula Language) is a functional language that Power Query uses to record and run transformations. Every point-and-click action you take in the Power Query Editor generates M code automatically.

You can view and edit the M code directly in the Advanced Editor. You do not need to learn M to use Power Query, but understanding it gives you more transformation control.

Should I clean data in Power Query or in DAX?

Clean data in Power Query, before the data loads into the model. DAX is for calculations and measures on clean data — not for data cleaning. Using DAX to work around data quality issues (e.g., IFERROR formulas on dirty data) creates slow reports and maintenance debt. Fix the data in Power Query first.

Get Power Query Architecture Help

EPC Group helps enterprise organizations design Power Query transformation pipelines that clean data reliably, refresh automatically, and scale to match the data volume. Fixed-scope engagements with documented architecture before any build begins.

Call (888) 381-9725 or contact us online to discuss your Power BI data preparation challenge. You can also book directly with our Power BI practice.

Why Organizations Choose EPC Group

EPC Group is a Houston-based Microsoft consulting firm with 29 years of enterprise implementation experience and over 10,000 successful deployments across Power BI, Microsoft Fabric, SharePoint, Azure, Microsoft 365, and Copilot. We serve organizations across all industries including Fortune 500, federal agencies, healthcare, financial services, government, manufacturing, energy, education, retail, technology, and global enterprises.

What sets EPC Group apart is our governance-first approach. Every engagement begins with a security and compliance assessment. Our team of senior architects brings hands-on delivery experience across HIPAA, SOC 2, FedRAMP, and CMMC environments. We own outcomes, not hours.

  • Fixed-fee accelerators with predictable pricing and defined deliverables
  • Senior architect engagement on every project, not rotating juniors
  • Compliance-native delivery for regulated industries
  • End-to-end coverage from strategy through 24/7 managed services
  • 11,000+ enterprise engagements refined into repeatable, risk-controlled patterns

Call (888) 381-9725 or email contact@epcgroup.net for a free assessment.

Power BI Strategy: 2026 Considerations for Data Cleaning in Power BI

Power BI Copilot grounds itself on the semantic model, not the underlying source data. That means Copilot answers are only as accurate as the DAX measure definitions, the field metadata (display folders, descriptions, hierarchies), and the synonyms taxonomy. In practice, the difference between a Copilot deployment that drives 32% time savings and one that users abandon within 90 days is whether the semantic model was prepared for Copilot.

Power BI capacity sizing in 2026 starts with the F-SKU economics: F2 ($263/mo) covers small workloads with up to 4 GB of memory and roughly 30 reports, F4 ($526/mo) handles a typical mid-market deployment with semantic-model refresh windows under 10 minutes, and F64 ($5,257/mo) is the sweet spot for enterprises consuming Power BI alongside Microsoft Fabric data engineering, lakehouse storage, and real-time intelligence. Capacity right-sizing should be revisited every 90 days because Microsoft adjusts F-SKU memory allocations, paginated report performance, and Direct Lake mode availability with each major service update.

Decision factors EPC Group evaluates

  • Capacity sizing decision (F2/F4/F64+) tied to peak concurrent users and refresh window
  • Copilot grounding quality assessment of semantic-model metadata
  • Direct Lake mode adoption for Fabric-resident semantic models
  • License optimization audit (Pro vs Premium Per User vs F-SKU)
  • Row-level security via service principal authentication

See related EPC Group services at /services or schedule a discovery call at /contact.