
Windows Becomes a Local AI Platform: What Build 2026 Means for Secure Enterprise AI Development
NVIDIA RTX Spark silicon and the DGX Station for Windows redefine local AI for Windows. Aion runs on-device, MAI runs in Foundry, and MXC ties them together. EPC Group lays out the secure-AI-dev architecture for healthcare, finance, and government workloads that cannot leave the boundary.

This article is part of the EPC Group Microsoft Build 2026 series. For the full strategic read on Project Solara, the Copilot Super App tease, MAI, Scout, MDASH, and RTX Spark — see the pillar: Project Solara, the Death of Apps, and the One Copilot That Wasn't.
There's a version of the "local AI" story that gets told as a convenience narrative: inference on the device, no latency, no cloud bill, no data leaving the building. That story is real, but it's not the most important part of what Microsoft announced at Build 2026. The more important part is the part that doesn't make for a clean consumer headline.
Local agents running on enterprise devices create a containment problem. The whole value proposition of an autonomous agent — the thing that makes it useful — is also the thing that makes it dangerous without proper architecture: it can take actions, access files, read email, interact with services, and do things you didn't explicitly authorize on a step-by-step basis. When that agent runs in a cloud service, you at least have the cloud provider's infrastructure controls around it. When it runs natively on a Windows device in your organization, the containment question becomes yours to answer.
Microsoft's Build 2026 Windows announcements are, at their core, an attempt to provide an architectural answer to that containment question. Some of those answers are mature and ready to evaluate now. Others are previews and "coming months" commitments. And one of them — the new silicon story — signals a change in the economics of local AI that I think enterprises are underestimating.
I've been through enough Windows platform shifts to distinguish between a marketing repositioning and a genuine architectural pivot. This is closer to the latter, though the governance work it creates is as significant as the capability it delivers.
One naming clarification before we go further, because Build 2026 coverage has blurred this in multiple outlets: the Aion models announced in the Windows track are local, on-device, in-box Windows models. They run on the endpoint hardware, with no cloud call, no data egress, and no per-token billing. They are entirely separate from the MAI model family, Microsoft's cloud-hosted Foundry models (reasoning, image, voice, transcription, coding). Different execution environments, different governance profiles, different cost structures. We'll cover MAI in our cloud AI article; this article is about what happens on the device.
The centerpiece of the Windows AI security story at Build 2026 is Microsoft Execution Containers — the MXC SDK. MXC is a cross-platform, policy-driven execution layer for agents running across Windows and WSL. The model it provides is declarative containment: when you deploy an agent, you declare what it can access — which files, which network endpoints, which services — and that containment is enforced at runtime.
That's a fundamentally different security posture from the default behavior of most current agent deployments, where the agent has whatever permissions the process inherits. MXC is still in early preview, but the architectural pattern it establishes is the right one, and enterprise security teams should pay attention to it now — not when it reaches GA — because it changes how you design local agent deployments from the ground up.
What makes MXC relevant at enterprise scale is its native integration with Agent 365, expected in preview in July 2026. That is a July delivery, not something that shipped at Build. Agent 365 itself reached GA on May 1, before Build; the MXC integration follows in July. When it arrives, it will deliver what every enterprise CISO actually wants: Defender, Entra, Intune, and Purview protections applied to locally running agents. Not just to cloud agents. Not just to agents in Foundry. To agents running on the Windows device in your employee's hands.
That's the conversation shift that matters here. When a local agent has an Entra identity, Defender monitoring it, Purview classifying the data it touches, and Intune governing the device it runs on — that's not a convenience feature. That's a security architecture. The same security architecture you'd want for a human employee accessing the same resources.
Think of MXC not as a container in the Docker sense, but as a policy envelope. You're not virtualizing the agent. You're declaring its rights and enforcing those declarations at execution time. The distinction is important because it means the containment doesn't require significant infrastructure overhead — it requires careful policy design, which is a different kind of work but exactly the work enterprise security teams know how to do.
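To make the policy-envelope idea concrete, here is a minimal Python sketch of declarative, deny-by-default containment. It is not the MXC SDK, whose API is not described in this article. The `AgentPolicy` class, its field names, the agent ID, the host name, and the paths are all hypothetical, and POSIX-style paths are used only to keep the example short.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical declaration of what one local agent may touch.

    Illustrative only: this is NOT the MXC SDK API.
    Anything not declared here is denied.
    """
    agent_id: str
    readable_dirs: frozenset = frozenset()
    network_hosts: frozenset = frozenset()
    services: frozenset = frozenset()

    def allows_file(self, path: str) -> bool:
        # Normalize first so "allowed/../other" cannot escape an allowed directory.
        normalized = posixpath.normpath(path)
        return any(
            normalized == d or normalized.startswith(d.rstrip("/") + "/")
            for d in self.readable_dirs
        )

    def allows_host(self, host: str) -> bool:
        # Host names are case-insensitive, so compare in lower case.
        return host.lower() in self.network_hosts

    def allows_service(self, name: str) -> bool:
        return name in self.services


# Example declaration for one agent (all values are made up).
policy = AgentPolicy(
    agent_id="invoice-triage-agent",
    readable_dirs=frozenset({"/data/invoices"}),
    network_hosts=frozenset({"erp.internal.example"}),
    services=frozenset({"mail.read"}),
)

print(policy.allows_file("/data/invoices/2026/march.pdf"))
print(policy.allows_file("/data/invoices/../hr/salaries.xlsx"))
print(policy.allows_host("api.unknown.example"))
```

The point of the sketch is the shape of the work: the declaration is data that a security team can review and version, and the enforcement is a small set of checks that refuse anything the declaration does not name.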
For agents that need isolation beyond what local containment provides, Windows 365 for Agents is now generally available within Agent 365. This is the managed Cloud PC model applied to agent workloads: a secure, managed environment in the cloud where computer-using agents operate, extending the containment story beyond the physical device boundary.
The practical use case: an agent that needs to browse the web, interact with external services, or perform actions that you'd never want a local process running with device-level access. Windows 365 for Agents gives that agent a dedicated, managed, network-isolated environment — provisioned, monitored, and decommissioned on demand. For regulated industries where the principle of least privilege needs to apply to AI agents just as strictly as it applies to human users, this is a meaningful architectural option.
Microsoft announced two on-device models at Build 2026: Aion 1.0 Instruct and Aion 1.0 Plan, both coming in the months ahead. These are local Windows models — not the same as Microsoft's MAI cloud models in Foundry. Aion runs entirely on device. No cloud call. No data leaves the endpoint.
Aion 1.0 Instruct is positioned as a faster, more capable on-device SLM for responsive local inference — the kind of model that handles immediate, conversational, task-execution scenarios where latency matters more than deep reasoning depth.
Aion 1.0 Plan is the more architecturally interesting of the two. At 14 billion parameters with a 32,000-token context window and fully local agentic capability including tool-calling, it represents something that didn't really exist in the Windows on-device story before: a local reasoning model capable of multi-step planning and tool use, running entirely on device with no cloud dependency.
Why does that matter for enterprise architecture? Because it changes the workload placement decision. With Aion 1.0 Plan available locally, architects can now design hybrid AI systems where routine planning and tool-calling run on device (near-zero latency, no data egress, no per-token cloud cost) while complex reasoning, specialized tasks, or outputs requiring centralized logging escalate to Foundry-hosted models. That's not an edge case optimization. That's a new architectural pattern for the majority of enterprise AI workflows.
The governance implication is immediate: which workloads belong on-device, which belong in the cloud, and who makes that decision? The answer is not purely technical. It involves data sensitivity, compliance requirements, latency expectations, cost envelopes, and the maturity of your local device management posture. Enterprises that get ahead of that workload placement policy now will have a significant operational advantage when these models ship.
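One way to get ahead of that decision is to write the placement policy as code, so it can be reviewed and tested like any other policy. The sketch below is an assumption-heavy illustration, not a Microsoft API. The `Workload` fields, the rule ordering, and the choice to raise an error on conflicting requirements are all hypothetical; the only figure taken from the announcements is the 32,000-token context window of Aion 1.0 Plan.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    ON_DEVICE = "on-device (Aion)"
    CLOUD_PC = "Windows 365 for Agents"
    FOUNDRY = "Foundry-hosted model"


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of one AI workload."""
    name: str
    data_may_leave_device: bool
    needs_external_web: bool
    context_tokens: int
    needs_central_logging: bool


# Context window of Aion 1.0 Plan as stated in the article.
LOCAL_CONTEXT_LIMIT = 32_000


def place(w: Workload) -> Placement:
    """Apply illustrative placement rules; the ordering is an assumption."""
    if not w.data_may_leave_device:
        # Data locality wins. If the task cannot be done locally,
        # surface the conflict instead of silently sending data out.
        if w.needs_external_web or w.context_tokens > LOCAL_CONTEXT_LIMIT:
            raise ValueError(f"{w.name}: requirements conflict with data locality")
        return Placement.ON_DEVICE
    if w.needs_external_web:
        # Web-browsing agents get an isolated, managed environment.
        return Placement.CLOUD_PC
    if w.context_tokens > LOCAL_CONTEXT_LIMIT or w.needs_central_logging:
        return Placement.FOUNDRY
    return Placement.ON_DEVICE


print(place(Workload("summarize chart notes", False, False, 8_000, False)).value)
print(place(Workload("vendor research", True, True, 4_000, False)).value)
print(place(Workload("contract portfolio review", True, False, 120_000, False)).value)
```

The error branch is deliberate: a workload that cannot leave the device but also cannot run on it is a policy gap for a person to resolve, not something a router should decide quietly.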
Two hardware announcements from Build 2026 deserve more attention than they typically get in the enterprise IT conversation.
The Surface RTX Spark Dev Box is built on NVIDIA RTX Spark silicon and delivers up to 1 petaflop of AI compute with 128GB of unified memory shared across CPU and GPU. It ships with VS Code and GitHub Copilot preconfigured, targets developer workflows, and arrives in the US later this year.
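For enterprise AI development teams, the unified memory architecture is the significant specification. With 128GB of memory shared by the CPU and GPU, a developer can run a 70-billion-parameter model locally, provided the weights are quantized: at 16 bits per weight a 70B model needs about 140GB for weights alone, while 8-bit and 4-bit quantization bring that to roughly 70GB and 35GB. That is the class of model that previously required either cloud infrastructure or a dedicated GPU server, now on a device that fits on a desk and ships from Microsoft.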
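The arithmetic behind the 70B-on-128GB claim is worth checking before you size hardware. This short Python sketch estimates memory for model weights only, under the assumption that every parameter is stored at the same bit width. It ignores the KV cache, activations, and whatever the operating system needs, so treat the results as lower bounds. The function name is mine, not part of any toolkit.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Memory for model weights only, in decimal gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


UNIFIED_MEMORY_GB = 128  # Surface RTX Spark Dev Box figure from the article

for bits in (16, 8, 4):
    need = weight_memory_gb(70, bits)
    verdict = "fits" if need < UNIFIED_MEMORY_GB else "does not fit"
    print(f"70B at {bits}-bit: {need:.0f} GB of weights, {verdict} in {UNIFIED_MEMORY_GB} GB")
```

Under these assumptions a 70B model only fits once the weights are quantized to 8 bits or fewer. The same function gives about 28 GB at 16-bit and 7 GB at 4-bit for a 14B model such as Aion 1.0 Plan.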
The 1-petaflop figure is not a marketing abstraction. It's the threshold at which the device becomes a legitimate alternative to remote development infrastructure for AI workloads that are currently cloud-dependent out of necessity, not by architectural choice. When that threshold moves to a desktop device, your development infrastructure decisions change.
The DGX Station for Windows is a different category of announcement entirely. It's a deskside AI supercomputer based on the NVIDIA GB300 Grace Blackwell Ultra Superchip, capable of running frontier models with up to one trillion parameters locally. Expected in Q4 this year.
Let me put that in context: the trillion-parameter threshold is roughly the scale that has been widely reported, though never officially confirmed, for GPT-4-class models, the models that have until now required hyperscale cloud infrastructure or dedicated data center hardware to run. The DGX Station for Windows brings that compute to a deskside form factor under Windows, which means it's Intune-manageable, Entra-joinable, and Defender-protected by default.
For enterprises with highly sensitive data — government contractors, healthcare systems, financial institutions, legal firms — the possibility of running a frontier-class model on a device that never leaves your physical security perimeter, managed through your existing Microsoft endpoint management stack, is not a small thing. It changes the entire calculus of "we can't use AI for this because we can't send that data to a cloud service."
Two additional capabilities that enterprise architecture teams should catalog now: Windows AI APIs and Windows ML.
Windows AI APIs have expanded beyond NPU-only execution to include CPUs and GPUs, which matters because it removes the NPU availability requirement from the equation. You don't need next-generation Copilot+ PC hardware to leverage AI APIs in enterprise applications. The Speech Recognition API (real-time and batch on-device speech-to-text, English first, currently in public preview) is the first capability in this expanded stack that teams can evaluate today.
Windows ML is Microsoft's description of the platform layer that "unlocks unmetered intelligence on Windows" — the ability to build, optimize, and deploy AI across all silicon types without rewriting the application for each hardware target. For enterprise developers building internal tools, line-of-business applications, or custom agents on Windows, this is the abstraction layer that makes hardware diversity manageable.
Assembling these announcements into a coherent architecture picture — MXC containment, Windows 365 for Agents, Aion on-device models, new silicon, Windows ML, expanding AI APIs — the thesis that emerges is Microsoft's argument for what I'd call the hybrid AI architecture imperative.
The argument is this: the question of where an AI workload runs — local device, cloud PC, cloud service, on-premises data center — should be answered by policy, not by default. Today, most enterprises answer it by default, because local compute wasn't powerful enough, local models weren't capable enough, and local containment wasn't mature enough to be a real choice. Those limitations are changing. Build 2026 is the clearest signal yet that they're changing on Windows.
The corollary is what makes this a security and architecture conversation, not just a developer tools conversation: as local AI capability increases, local AI risk increases proportionally. An agent with Aion 1.0 Plan that can plan, reason, and call tools locally — but without MXC containment, without Entra identity, without Defender monitoring, without Purview-classified data scope — is a more capable threat surface than anything that existed on Windows endpoints 12 months ago.
The organizations that are ahead of this understand that "local AI" is not inherently safer than "cloud AI." It's just differently exposed. Local containment via MXC, identity via Entra, monitoring via Defender, data governance via Purview — those aren't optional additions to a local AI deployment. They are the local AI deployment, in the sense that without them you've added capability without adding accountability.
When we engage with enterprise clients on their Windows AI architecture — whether that's planning a local agent deployment, evaluating MXC-based containment design, or hardening an existing Copilot+ PC rollout — we focus on five areas that most Windows AI conversations don't reach until something goes wrong.
Endpoint governance before model deployment. Your Intune configuration, your Entra conditional access policies, and your Defender for Endpoint baseline need to be current before any AI capability runs locally. The AI capability doesn't compensate for a weak endpoint posture. It depends on a strong one.
Workload placement policy. Which tasks run on-device, which run in Windows 365 for Agents, and which escalate to Foundry? This policy needs to be written before the architecture is built, because retrofitting it is significantly harder. Data sensitivity, compliance requirements, latency, and cost all factor in.
MXC policy design for local agents. When MXC reaches broader availability, the containment policy you configure for each local agent is the security document for that agent. Enterprises that wait for GA to start thinking about it will be behind. Start the policy conversation now.
Identity for every agent. Agent 365's model of binding each agent to an Entra identity is the right architecture. Every local agent should have an attributed, auditable identity — not inherit the credentials of the user process that spawned it.
Purview classification of local model inputs and outputs. When a local agent reads a sensitive document, generates a summary, or calls an external API, Purview needs to classify those interactions the same way it classifies human user interactions with the same content. This isn't automatic. It's configuration work, and it's the work our 30-Day Copilot/Purview/M365 Tenant Hardening Accelerator is built to do.
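The five areas above can be turned into a simple deployment gate. The Python sketch below is illustrative only: the `AgentDeployment` fields and the wording of each control are hypothetical, and in practice each flag would be populated from your own Intune, Entra, Defender, and Purview reporting rather than set by hand.

```python
from dataclasses import dataclass

# Attribute name -> the control it stands for. The grouping follows the five
# focus areas in the article; the attribute names themselves are hypothetical.
REQUIRED_CONTROLS = {
    "endpoint_baseline_current": "Intune, Entra conditional access, and Defender baseline are current",
    "placement_documented": "workload placement policy is written down",
    "containment_policy": "containment policy is declared for the agent",
    "entra_identity": "agent has its own Entra identity",
    "purview_classification": "Purview classifies the agent's inputs and outputs",
}


@dataclass(frozen=True)
class AgentDeployment:
    """Hypothetical readiness record; every control defaults to not met."""
    name: str
    endpoint_baseline_current: bool = False
    placement_documented: bool = False
    containment_policy: bool = False
    entra_identity: bool = False
    purview_classification: bool = False


def missing_controls(deployment: AgentDeployment) -> list:
    """Return the controls that are not yet in place, in checklist order."""
    return [
        label
        for attr, label in REQUIRED_CONTROLS.items()
        if not getattr(deployment, attr)
    ]


def ready_to_deploy(deployment: AgentDeployment) -> bool:
    return not missing_controls(deployment)


draft = AgentDeployment(
    name="claims-intake-agent",
    endpoint_baseline_current=True,
    placement_documented=True,
    containment_policy=True,
)

print(ready_to_deploy(draft))
for gap in missing_controls(draft):
    print("missing:", gap)
```

Defaulting every control to "not met" mirrors the argument of this article: capability ships only after accountability is demonstrated, never the other way around.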
Windows is becoming a local AI platform. The question isn't whether your organization will run AI locally — it's whether you'll run it with governance or without it. The platform capabilities announced at Build 2026 make the first option genuinely available for the first time. Choosing it is still a decision someone has to make.
Q: What is Microsoft Execution Containers (MXC) and when will it be production-ready?
MXC is a policy-driven execution layer that declares and enforces what a local AI agent can access at runtime — files, network, services. It's currently in early preview on Windows and WSL. Note the timeline: the MXC SDK itself is in early preview now; the Agent 365 integration that delivers Defender, Entra, Intune, and Purview protections is expected in preview in July 2026. That July integration was announced at Build but does not ship at Build. Full GA timelines haven't been specified by Microsoft; monitor the Windows Developer Blog for updates.
Q: What is the difference between Aion models and MAI models?
Aion models (Aion 1.0 Instruct, Aion 1.0 Plan) are local, on-device Windows models — they run on the endpoint hardware, require no cloud call, produce no data egress, and carry no per-token billing. MAI models are Microsoft's cloud-hosted Foundry models (reasoning, image, voice, transcription, coding) that run on Microsoft's cloud infrastructure. They are architecturally separate. In a hybrid AI architecture, Aion handles local tasks and MAI handles cloud-scale workloads — different execution environments, different governance profiles, different cost structures.
Q: What is Aion 1.0 Plan and what use cases is it designed for?
Aion 1.0 Plan is a 14-billion-parameter, 32,000-token-context on-device reasoning model with tool-calling capability, designed to run fully locally on capable Windows devices. It's built for multi-step planning and agentic tasks that need low latency and data locality. It's expected in the coming months.
Q: How does Windows 365 for Agents differ from a regular Windows 365 Cloud PC?
Windows 365 for Agents is a managed Cloud PC environment provisioned specifically for AI agent workloads rather than human users. It extends containment beyond the local device, provides a network-isolated execution environment, and is managed within Agent 365. It's now generally available.
Q: Does the DGX Station for Windows mean we can run frontier AI models on premises without cloud?
Yes — that is the explicit capability Microsoft announced. The DGX Station for Windows, based on the NVIDIA GB300 Grace Blackwell Ultra Superchip, is designed to run frontier models up to one trillion parameters on a deskside device without cloud dependency. It's expected in Q4 2026.
Q: How does EPC Group help enterprises prepare for local AI deployments?
EPC Group provides architecture design, endpoint governance hardening, Purview/Entra/Defender/Intune integration, MXC policy design, and Virtual Chief AI Officer advisory services for enterprises deploying AI on Windows. Our 30-Day Tenant Hardening Accelerator is the fastest path to a production-ready foundation. Contact us at contact@epcgroup.net or 888-381-9725.
Microsoft Build 2026 raised the ceiling on what agentic AI can do across the Microsoft estate — and the floor on what your tenant has to be to deploy it safely. EPC Group has been doing this work for 29 years across Fortune 500 and federal organizations, with six Microsoft Solutions Partner designations and a perfect 100 NPS on G2.
Email contact@epcgroup.net, call 888-381-9725, or request a consultation. Senior architects only — no offshore handoff, no junior account managers.
"Local AI" is being sold as a convenience story. Move inference to the device, cut latency, eliminate cloud cost, keep data on premises. All of that is real.
But it's the wrong starting point for enterprise architecture.
At Build 2026, Microsoft made the most significant set of Windows AI announcements I've seen in years — not primarily because the hardware is impressive (though some of it genuinely is), but because the security architecture underneath it is a real attempt to solve a problem most enterprises haven't fully named yet.
When a local AI agent can plan, reason, call tools, and take actions on a device in your employee's hands, the question isn't "how do I get AI running locally." The question is "how do I contain, govern, and attribute what it does."
Here's what Microsoft announced and why it matters for enterprise security and architecture teams.
MXC: CONTAINMENT BY DECLARATION
Microsoft Execution Containers (MXC SDK) is a cross-platform, policy-driven execution layer for agents running across Windows and WSL. You declare what the agent can access — files, network endpoints, services — and that containment is enforced at runtime. Currently in early preview.
The Agent 365 native integration with MXC arrives in preview in July, and it delivers what enterprise CISOs actually want: Defender, Entra, Intune, and Purview protections applied to locally running agents. Not just cloud agents. Not just agents in Foundry. Agents on the device.
Think of MXC as a policy envelope, not a container in the infrastructure sense. You're not virtualizing the agent. You're declaring its rights and enforcing them. The work involved is policy design — exactly what your security and compliance teams already know how to do.
Windows 365 for Agents is now GA within Agent 365: managed Cloud PCs provisioned for agent workloads, not human users. For agents that interact with external services or need strict network isolation, this is the right containment model.
AION ON-DEVICE: A NEW WORKLOAD PLACEMENT DECISION
Two on-device Windows models are coming in the months ahead: Aion 1.0 Instruct (fast, responsive inference) and Aion 1.0 Plan (14B params, 32K context, fully local reasoning + tool-calling). These are local models — not the MAI cloud family. Aion runs on the device. No cloud call. No data egress.
Aion 1.0 Plan is the architecturally important one. A 14B-parameter model with tool-calling running entirely locally (no cloud dependency, no data egress, no per-token cost) means you can now design hybrid AI systems where routine planning and tool-calling run on device while complex reasoning escalates to Foundry-hosted MAI or other cloud models. That's a new architectural pattern, not just a hardware feature.
The governance question it immediately creates: which workloads belong on-device vs. cloud, based on data sensitivity, compliance requirements, and latency needs? That policy should be written before the models ship.
THE SILICON STORY IS NOT JUST SPEC SHEET NEWS
Surface RTX Spark Dev Box: up to 1 petaflop AI compute, 128GB unified memory shared across CPU and GPU. Ships with VS Code + GitHub Copilot, Windows 11 Pro, arrives in US later this year. The 128GB unified memory means running a 70B-parameter model locally on a desktop device — no cloud dependency.
DGX Station for Windows: NVIDIA GB300 Grace Blackwell Ultra Superchip, runs frontier models up to 1 trillion parameters locally. Q4 this year. Deskside. Windows-managed. Intune-manageable. Entra-joinable. Defender-protected.
For organizations in highly regulated industries where certain data simply cannot leave the physical perimeter — this changes the calculus. Frontier-class inference, on premises, under your endpoint management stack.
THE HYBRID AI ARCHITECTURE IMPERATIVE
The through-line across all of these announcements is what I'd call intelligent workload placement: the question of where AI runs should be answered by policy, not by default. Today most enterprises answer it by default because local capability wasn't sufficient. Build 2026 changes that.
But here's the corollary: as local AI capability increases, local AI risk increases proportionally. An agent that can plan and act locally without MXC containment, without Entra identity, without Defender monitoring — that's a more capable threat surface than anything that ran on Windows endpoints a year ago.
"Local AI" is not inherently safer than "cloud AI." It's just differently exposed.
How is your organization thinking about the containment architecture for agents that will run on Windows endpoints — and is MXC on your security team's radar yet?
#MicrosoftBuild #WindowsAI #EnterpriseAI #AIGovernance #LocalAI #CyberSecurity #EPCGroup #MicrosoftSecurity
Thread 🧵
1/ Build 2026 reframed Windows as a local AI platform. MXC containment (early preview), Aion 1.0 Plan (14B/32K, fully local tool-calling), Windows 365 for Agents (GA), Surface RTX Spark Dev Box (1 petaflop/128GB), DGX Station (frontier models, 1T params, local).
2/ "Local AI" isn't inherently safer than cloud AI. It's differently exposed. The architecture answer: MXC policy containment + Entra identity per agent + Defender monitoring + Purview classification. That's the governance stack, not an add-on.
3/ Hybrid AI = intelligent workload placement by policy, not by default. Full breakdown: https://www.epcgroup.net/windows-local-ai-platform-build-2026-enterprise-development/ #WindowsAI #EnterpriseAI
Founder & Chief AI Architect, EPC Group
Microsoft Press bestselling author with 29 years of enterprise consulting experience.