Microsoft Solutions Partner — Code-first AI · 11,000+ engagements

Azure AI Foundry Agent Service Enterprise Guide (2026)

The code-first Microsoft platform for building governed enterprise AI agents — agent runtime, function calling, multi-agent orchestration, Assistants API migration, evaluation, and Defender for Cloud AI hardening. Six capability areas, six enterprise agent patterns, three orchestration topologies — shipped by a senior-architect-led 29-year Microsoft Solutions Partner.


What is Azure AI Foundry Agent Service and how do enterprises build code-first AI agents?

The Azure AI Foundry Agent Service is the code-first agent runtime inside Azure AI Foundry (the unified platform replacing the standalone Azure OpenAI Service portal). It manages agent lifecycle, thread state, parallel function calling, knowledge grounding through managed vector stores, multi-agent orchestration (handoff, supervisor, planner-executor), evaluation, and observability, all developer-facing through the Azure SDK and a contract-compatible Assistants API. Enterprises deploy Foundry agents through a five-phase program (Assess, Design, Build + Ground, Governance Harden, Operate) that produces the responsible AI baseline, Defender for Cloud AI monitoring, and capacity model required for production rollout. They choose Foundry over Microsoft Copilot Studio when the build team is code-first and the agent lives inside a custom application rather than Microsoft 365 surfaces.

Azure AI Foundry Agent Service is the code-first Microsoft agent runtime inside Azure AI Foundry: threads, runs, parallel tool calling, managed vector stores, evaluation, Content Safety, and Defender for Cloud AI, exposed to developers through the Azure SDK with an Assistants-API-compatible contract. Six capability areas span orchestration, tools, knowledge, models, evaluation, and observability. Three orchestration patterns (handoff, supervisor, planner-executor) support multi-agent workflows. EPC Group ships the five-phase Foundry Accelerator, covering use case selection and Assistants API migration, tool and knowledge design, build, responsible AI hardening, and managed operation, for a fixed fee of $200K to $700K.

Key Facts

Six capability areas: orchestration, tools, knowledge sources, models, evaluation harness, observability
Code-first developer surface — Azure SDK (Python, .NET, JavaScript, Java) plus REST API
Assistants API contract compatibility — threads, runs, messages, steps migrate without rewrites
Three orchestration patterns: handoff, supervisor, planner-executor for multi-agent workflows
Full Foundry model catalog — Azure OpenAI GPT-4o/4.1/o-series/GPT-5-class plus open weights
Built-in evaluation harness — groundedness, relevance, F1, retrieval; CI/CD gating
Content Safety + Defender for Cloud AI + Purview audit — production responsible AI baseline
29-year Microsoft Solutions Partner, 70+ Fortune 500 clients, 11,000+ engagements

Foundry Agent Service vs Copilot Studio vs Azure OpenAI Assistants API

The 2026 Microsoft agent stack has three developer entry points. Microsoft Copilot Studio is the low-code platform for Microsoft 365 surface agents. The Foundry Agent Service is the code-first platform for product-grade and custom-application agents. The legacy Azure OpenAI Assistants API is the contract Foundry now hosts — most customers are migrating off the standalone Azure OpenAI portal into a Foundry project.

Dimension	Copilot Studio	Foundry Agent Service	Azure OpenAI Assistants API (legacy)
Build skill	Low-code maker	Code-first developer	Code-first developer
Where it runs	Teams, M365 Copilot, web, SMS	Custom application, product UX	Custom application (legacy portal)
Knowledge surface	SharePoint, Dataverse, Graph	Vector stores + connectors	Vector store + file search
Orchestration	Built-in multi-agent (low-code)	Handoff, supervisor, planner-executor	Single agent + function tools
Evaluation	Topic-level CSAT	Built-in harness, CI/CD gating	Custom / external
Governance	Purview + Defender for AI	Content Safety + Defender for Cloud AI + Purview	Content Safety + custom
Best use	Departmental + line-of-business agents	Product-grade + custom-app agents	Migrate to Foundry

Foundry Agent Service is the developer-facing successor to both the standalone Azure OpenAI Service portal and the Assistants API surface. Customers running Assistants-API workloads in 2025 are migrating into Foundry projects in 2026 because evaluation, observability, knowledge connectors, and Defender for Cloud AI integration are only available inside Foundry. See the Copilot Studio enterprise guide for the parallel low-code platform and the Azure OpenAI Service enterprise guide for the underlying model layer.

The six Foundry Agent Service capability areas

Every Foundry agent composes from six capability areas exposed through the SDK. The difference between a passable prototype and a production agent is whether the build team has wired all six — orchestration, tools, knowledge, models, evaluation, and observability — before shipping.

Agent orchestration — code-first runtime

What it does: The Foundry Agent Service runtime manages agent lifecycle, thread state, tool execution, run scheduling, and message ordering. Developers create an agent with a model, instructions, and a tool list — then create threads and runs through the Azure SDK (Python, .NET, JavaScript, Java) or the REST API. Foundry handles the reasoning loop, parallel tool calls, retry semantics, run cancellation, and crash-safe resume. State persists in a Foundry-managed thread store backed by Azure storage in the customer subscription.

Threads, runs, messages, and steps as first-class persistent objects in the runtime
Parallel tool calling with deterministic ordering and per-call timeout governance
Crash-safe resume — interrupted runs resume from the last completed step on restart
Bring-your-own-storage option pinning thread state to the customer Azure subscription
Compatible with the OpenAI Assistants API contract for migration without rewriting the agent loop
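The parallel tool calling behavior listed above can be illustrated with a minimal local sketch. This is not the Foundry runtime, which handles this server-side; it only shows the idea of running tool calls concurrently while returning results in request order, with a wait limit per call. The function name `run_tool_calls` and the call and result dictionary shapes are assumptions made for this illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_tool_calls(calls, tools, timeout_s=2.0):
    """Run tool calls concurrently and return results in request order.

    calls: list of {"id": str, "name": str, "args": dict} (hypothetical shape)
    tools: dict mapping tool name to a Python callable
    """
    results = []
    with ThreadPoolExecutor(max_workers=max(1, len(calls))) as pool:
        # Submit everything first so the calls overlap in time.
        futures = [pool.submit(tools[c["name"]], **c["args"]) for c in calls]
        # Collect in submission order, so output order never depends on
        # which tool happens to finish first.
        for call, future in zip(calls, futures):
            try:
                output = future.result(timeout=timeout_s)
                results.append({"id": call["id"], "output": output})
            except FutureTimeout:
                # The wait is abandoned; the worker thread itself is not killed.
                results.append({"id": call["id"], "error": "timeout"})
            except Exception as exc:
                results.append({"id": call["id"], "error": str(exc)})
    return results
```

A timed-out call is reported as an error entry rather than raised, so one slow tool does not discard the results of the others.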

Tools — function calling, code interpreter, file search

What it does: Foundry agents expose three categories of tools to the model — function tools (developer-defined functions invoked via JSON schema), built-in tools (Code Interpreter, File Search, Bing Grounding, OpenAPI tools, Azure Function tools, Logic Apps tools), and Microsoft connector tools (Microsoft Graph, Fabric, SharePoint, Azure AI Search, Azure SQL, Cosmos DB). Tool calls execute inside the run loop and the result is fed back to the model for reasoning over multiple steps.

Function tools defined with JSON Schema parameters and developer-owned execution
Code Interpreter sandbox for Python execution against uploaded files and data
File Search with vector-store-backed retrieval across uploaded enterprise documents
OpenAPI tools that import a REST API spec and expose every operation as a tool
Azure Function and Logic App tools for serverless and integration-platform action surfaces
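As a rough sketch of a function tool with JSON Schema parameters and developer-owned execution, the example below pairs a tool definition with a local dispatcher. The definition uses the OpenAI-style function-tool shape; the tool name `get_order_status`, its stubbed implementation, and the `dispatch` helper are hypothetical and exist only for illustration.

```python
import json

# Tool definition in the OpenAI-style function-tool shape.
# The tool itself (get_order_status) is a hypothetical example.
GET_ORDER_STATUS_TOOL = {
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the fulfilment status of an order.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}


def get_order_status(order_id: str) -> dict:
    # Stub: a real implementation would query the order system.
    return {"order_id": order_id, "status": "shipped"}


REGISTRY = {"get_order_status": (GET_ORDER_STATUS_TOOL, get_order_status)}


def dispatch(name: str, arguments_json: str) -> str:
    """Check model-supplied arguments against the schema, then execute."""
    schema, fn = REGISTRY[name]
    args = json.loads(arguments_json)
    params = schema["function"]["parameters"]
    missing = [k for k in params["required"] if k not in args]
    unknown = [k for k in args if k not in params["properties"]]
    if missing or unknown:
        # Return the problem as tool output so the model can correct itself.
        return json.dumps({"error": {"missing": missing, "unknown": unknown}})
    return json.dumps(fn(**args))
```

Returning validation failures as tool output, rather than raising, keeps the run loop alive and gives the model a chance to retry with corrected arguments.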

Knowledge sources — vector stores + grounding

What it does: Knowledge grounding in Foundry uses managed vector stores (chunking, embedding, indexing, retrieval) backed by Azure AI Search under the hood, with file-based ingest from blob storage and connector-based ingest from SharePoint, Fabric OneLake, Azure SQL, Cosmos DB, and Graph-connected sources. Grounding is composable — an agent can ground on multiple stores, filter by metadata, and honor sensitivity labels surfaced through the Graph connector layer when reading from Microsoft 365 content.

Managed vector stores with automatic chunking, embedding, refresh, and re-ranking
Native ingest from Azure Blob, Azure Data Lake, SharePoint, Fabric OneLake, Graph connectors
Sensitivity label honored when grounding through Graph connectors into Microsoft 365 content
Hybrid retrieval — semantic plus keyword plus filter-by-metadata for precision tuning
Citation surfaces in agent responses with link-back to source document or row
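The hybrid retrieval bullet above (semantic plus keyword plus metadata filter) can be sketched with a toy fusion step. The example assumes the two rankings have already been produced elsewhere and merges them with reciprocal rank fusion, a common fusion technique; it is an illustration, not a description of how the managed vector store is implemented. All function and field names are hypothetical.

```python
def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of document ids with reciprocal rank fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def hybrid_search(semantic_ranking, keyword_ranking, metadata, required):
    """Apply a metadata filter to both rankings, then fuse them.

    metadata: dict of doc_id -> {field: value}
    required: dict of field -> value that every returned document must match
    """
    def allowed(doc_id):
        fields = metadata.get(doc_id, {})
        return all(fields.get(f) == v for f, v in required.items())

    filtered = [
        [d for d in semantic_ranking if allowed(d)],
        [d for d in keyword_ranking if allowed(d)],
    ]
    return rrf_fuse(filtered)
```

Filtering before fusion means a document outside the permitted scope can never surface, regardless of how well it scores semantically.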

Models — full Azure OpenAI catalog plus open weights

What it does: Foundry Agent Service runs on the full Azure AI Foundry model catalog, including Azure OpenAI (GPT-4o, GPT-4.1, o-series reasoning, and GPT-5-class models as Microsoft releases them), Microsoft Phi, Meta Llama, Mistral, Cohere, and the open-weight catalog. Agents specify a model at creation and can swap models without rewriting tools or threads. Capacity comes as Provisioned Throughput Units (PTU) for predictable latency and reserved capacity, or as pay-as-you-go for variable workloads. Reasoning models support extended-thinking budgets exposed through the run options.

Azure OpenAI GPT-4o, GPT-4.1, o-series reasoning, GPT-5-class as available
Open-weight catalog — Phi, Llama, Mistral, Cohere, plus partner and customer fine-tunes
PTU for reserved capacity with predictable latency, PAYG for variable workload elasticity
Reasoning-model extended-thinking budgets exposed at the run-options level
Model swap at agent edit time — tools, threads, and instructions remain intact

Evaluation harness — pre-prod and continuous

What it does: Foundry ships a built-in evaluation harness for AI quality testing — predefined evaluators (groundedness, relevance, coherence, fluency, similarity, retrieval, F1) plus custom evaluators authored as prompts or as Python functions. Evaluations run pre-production against test datasets, continuously in production against sampled live conversations, and gate deployments through the Foundry CI/CD integrations. Results land in the Foundry observability surface and can route to Azure Monitor and Application Insights.

Predefined evaluators — groundedness, relevance, coherence, fluency, F1, retrieval quality
Custom prompt-based or function-based evaluators authored by the build team
Pre-prod evaluation against test datasets gating CI/CD promotion to higher environments
Continuous in-production evaluation against sampled live conversations with PII redaction
Evaluation results streamed to Azure Monitor and Application Insights for trend analysis
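To make the F1 evaluator and the CI/CD gating idea concrete, here is a minimal sketch of a token-overlap F1 score and a threshold gate over a test dataset. It is a hand-rolled stand-in written for illustration, not the built-in harness; the row field names (`response`, `ground_truth`) and the threshold value are assumptions.

```python
from collections import Counter


def f1_score(response: str, ground_truth: str) -> float:
    """Token-overlap F1 between a response and a reference answer."""
    pred = response.lower().split()
    gold = ground_truth.lower().split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both texts.
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def gate(rows, threshold):
    """Return (passed, mean_f1) for a list of evaluation rows."""
    scores = [f1_score(r["response"], r["ground_truth"]) for r in rows]
    mean = sum(scores) / len(scores)
    return mean >= threshold, mean
```

In a pipeline, a failed gate would typically translate into a non-zero exit code so the promotion step does not run.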

Observability + governance — tracing, content safety, Defender

What it does: Foundry agents emit OpenTelemetry traces for every run, tool call, and model call, with token-level cost attribution and latency breakdown surfaced in Azure Monitor and Application Insights. Azure AI Content Safety runs pre-call and post-call moderation for jailbreak attempts, protected material detection, groundedness checks, and PII detection. Defender for Cloud AI workload protection monitors for prompt injection, model theft, data exfiltration, and adversarial inputs. Purview audit logs every conversation when grounding touches Microsoft 365 content.

OpenTelemetry traces per run, tool call, and model call with token and latency attribution
Azure Monitor + Application Insights integration with prebuilt agent observability workbooks
Azure AI Content Safety — jailbreak, protected material, PII, groundedness, harmful content
Defender for Cloud AI workload protection — prompt injection, model theft, exfiltration
Purview audit log integration when grounding touches Microsoft 365 content surfaces

Six enterprise Foundry agent patterns

EPC Group has shipped Foundry-class code-first agents across capital markets, life sciences, legal operations, IT security, manufacturing supply chain, and finance close. Each pattern has a different grounding profile, a different orchestration topology, and a different governance bar. Most customers start with one regulated workload and reuse the platform baseline for subsequent agents.

Pattern 1 — Trading floor research + execution agent

The trading floor agent runs inside a custom Foundry-powered application embedded in the trader workstation. It grounds on Bloomberg and Refinitiv news feeds via OpenAPI tools, the firm’s internal research library in SharePoint via Graph connector with MNPI segmentation enforced by sensitivity label, the position book in Azure SQL, and the historical trade store in Fabric OneLake. Function tools wrap the order management system for staged-order creation, the risk system for pre-trade limit checks, and the compliance engine for restricted-list verification. The agent uses a planner-executor orchestration pattern: a reasoning model plans the multi-step research path, a faster model executes each tool call, and a final reasoning pass summarizes the recommendation with full citation. Defender for Cloud AI monitors for any cross-segment data leakage. FINRA-aligned audit logging captures every prompt, tool call, and recommendation through Purview audit and the firm’s WORM-compliant archive.

Pattern 2 — Clinical trial protocol agent for life sciences

The clinical trial agent grounds on the protocol library, the case report form (CRF) data captured in the EDC system, the adverse event database, and the regulatory submission archive. EPC Group wires it to Azure Health Data Services for FHIR-conformant patient data access and to function tools for site enrollment status lookup, eCRF query generation, deviation reporting, and adverse event coding. The orchestration pattern is supervisor: a parent supervisor agent classifies the incoming request (protocol question, eligibility check, AE coding, regulatory question) and routes it to a specialist child agent with the right tool list and grounding scope. GxP-aligned validation is enforced through the evaluation harness with locked test datasets per release, electronic signature on every model and prompt change, and a Part 11 audit trail through Purview. A HIPAA BAA covers the underlying Azure OpenAI and Foundry services. See /microsoft-purview-data-governance-enterprise-2026 for the Purview baseline.

Pattern 3 — Contract review agent for legal operations

The contract review agent ingests an inbound third-party contract, grounds against the firm’s playbook (approved clauses, prohibited clauses, fallback positions), and produces a structured redline with rationale per change. Knowledge sources include the playbook library in SharePoint with sensitivity-label gating, the precedent contract corpus in the vector store, and the regulatory-clause library covering data protection (GDPR, CCPA), security (SOC 2 attestation flow-through), and indemnification standards. Function tools wrap the contract lifecycle management (CLM) system for clause-by-clause checkout, the e-signature platform for routing, and the conflicts-check system. The agent uses a handoff orchestration pattern — a triage agent classifies the contract type, hands off to a specialist agent per contract family (MSA, DPA, NDA, SOW, vendor) with the right playbook scope. Human-in-the-loop sign-off is mandatory on every redline before send.

Pattern 4 — IT incident commander agent on Sentinel + Defender XDR

The IT incident commander agent runs inside a custom SOC console powered by Foundry. It grounds on the Sentinel incident feed, the Defender XDR alert graph, the asset inventory in Microsoft Defender for Endpoint, the configuration management database (CMDB) in ServiceNow, and the historical incident runbook library. Function tools wrap Sentinel for KQL execution, Defender for Cloud for posture queries, Entra ID for token revocation and conditional access policy changes (under approval gate), and ServiceNow for incident creation and update. The orchestration pattern is supervisor with specialist children: an Identity child for Entra-side response, an Endpoint child for Defender-side response, a Network child for Azure Firewall and NSG changes, and a Communications child for stakeholder notification drafting. Human-in-the-loop gates govern every state-changing action. See /microsoft-defender-xdr-enterprise-2026 for the Defender XDR baseline.

Pattern 5 — Supply chain planner agent for manufacturing

The supply chain agent grounds on the demand forecast in Fabric OneLake, the inventory position from D365 Supply Chain Management, the supplier lead-time history, and the logistics-route library. Function tools wrap the MRP run for scenario simulation, the supplier portal for purchase order issuance under approval gate, the freight carrier APIs for booking, and the warehouse management system for slotting changes. The agent uses a planner-executor pattern: a reasoning model plans the multi-step replenishment scenario across constraints (lead time, MOQ, dual-source, transportation cost), an executor model runs each function tool in sequence with backpressure-aware retry, and the result is a costed plan handed to the human supply planner for sign-off. The Foundry observability surface tracks per-run cost and latency so the planning team can balance reasoning quality against decision speed during a supply disruption.

Pattern 6 — Finance close agent on Dynamics 365 + Fabric

The finance close agent automates the recurring month-end and quarter-end close. It grounds on the trial balance in D365 Finance, the consolidation worksheet in Fabric, the close calendar in Project Operations, and the SOX control library in the GRC platform. Function tools wrap the journal entry creation in D365, the close task certification in the close-management platform, the variance analysis prompt against prior-period and forecast data, and the auditor evidence package generation. Orchestration is supervisor with a Reconciliation child agent per balance-sheet account family, a Variance Analysis child for material-deviation explanation, a Disclosure child for footnote drafting, and a Controls child for SOX control attestation. Every artifact produced is signed off by a human controller before posting, and the agent surfaces the SOX control map (see /standards-alignment) as the documentation evidence layer for the external audit.

Multi-agent orchestration

Three multi-agent orchestration topologies

Foundry exposes multi-agent orchestration through SDK primitives rather than a no-code orchestrator. Developers select the topology per use case — and the choice has material implications for response composition, token cost, evaluation strategy, and operational complexity.

Handoff — triage and specialize

A triage agent classifies the incoming request and hands off the conversation to a specialist child agent with the right tool list, grounding scope, and instructions. The child agent owns the thread for the rest of the conversation unless it hands back. Best for use cases with clearly separable subdomains — contract type families, support ticket categories, customer segment differentiation. The handoff pattern is the easiest to operate because each child is independently testable and independently governed.
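The handoff flow described above can be sketched in a few lines of plain Python. This is a toy model of the control flow only: the keyword classifier stands in for what would be a model call in a real triage agent, and the `Specialist`, `Thread`, and `handle` names are hypothetical rather than SDK types.

```python
from dataclasses import dataclass, field


@dataclass
class Specialist:
    name: str
    keywords: tuple  # hypothetical routing hints standing in for a model call

    def respond(self, message: str) -> str:
        return f"[{self.name}] handling: {message}"


@dataclass
class Thread:
    owner: str = "triage"  # which agent currently owns the conversation
    history: list = field(default_factory=list)


SPECIALISTS = {
    "nda": Specialist("nda", ("nda", "non-disclosure")),
    "msa": Specialist("msa", ("msa", "master services")),
}


def classify(message: str):
    text = message.lower()
    for specialist in SPECIALISTS.values():
        if any(keyword in text for keyword in specialist.keywords):
            return specialist.name
    return None


def handle(thread: Thread, message: str) -> str:
    if thread.owner == "triage":
        target = classify(message)
        if target is None:
            reply = "[triage] Which contract type is this?"
        else:
            thread.owner = target  # the handoff: the child now owns the thread
            reply = SPECIALISTS[target].respond(message)
    else:
        # After a handoff, the triage step is skipped entirely.
        reply = SPECIALISTS[thread.owner].respond(message)
    thread.history.append((message, reply))
    return reply
```

Because ownership is stored on the thread, each specialist can be tested in isolation by constructing a thread with `owner` already set.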

Supervisor — parent orchestrates child specialists

A parent supervisor agent retains the conversation thread and delegates discrete subtasks to specialist child agents through function-tool calls. The parent reasons over the consolidated result and continues the conversation. Best for use cases that require composition across subdomains within a single response — a finance close agent reasoning across reconciliation, variance, and disclosure children. The supervisor pattern requires careful child-agent tool-list scoping to prevent cascading failure modes.
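A minimal sketch of the supervisor shape follows, assuming the parent has already decided which children to call. In a real agent that delegation list would come from the parent model's function-tool calls; here it is passed in directly. The child functions and the `supervisor` helper are hypothetical stubs, and the point of the sketch is that one failing child is contained rather than failing the whole response.

```python
def reconciliation_child(task: str) -> dict:
    # Stub specialist: a real child would be its own agent with its own tools.
    return {"agent": "reconciliation", "finding": f"reconciled {task}"}


def variance_child(task: str) -> dict:
    return {"agent": "variance", "finding": f"explained variance in {task}"}


CHILDREN = {
    "reconciliation": reconciliation_child,
    "variance": variance_child,
}


def supervisor(request: str, delegations):
    """Delegate subtasks to children and compose one consolidated result.

    delegations: list of (child_name, subtask) pairs.
    """
    findings = []
    for child_name, subtask in delegations:
        child = CHILDREN.get(child_name)
        if child is None:
            findings.append({"agent": child_name, "error": "unknown child"})
            continue
        try:
            findings.append(child(subtask))
        except Exception as exc:
            # Contain the failure so the other children still contribute.
            findings.append({"agent": child_name, "error": str(exc)})
    summary = "; ".join(f["finding"] for f in findings if "finding" in f)
    failed = [f["agent"] for f in findings if "error" in f]
    return {"request": request, "summary": summary, "failed_children": failed}
```

Surfacing `failed_children` in the result lets the parent tell the user which part of the answer is missing instead of silently omitting it.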

Planner-executor — reason once, execute many

A planner model produces a multi-step execution plan in structured output. An executor model — typically a faster, cheaper model — runs each step against the available tools, with backpressure-aware retry and partial-failure handling. A final reasoning pass over the executor output produces the user-facing response with full citation. Best for high-step-count agentic workflows where the planning reasoning cost is amortized across many executor calls. Token cost optimization is material because planner reasoning is concentrated and executor calls run on a cheaper model.
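The executor half of this pattern can be sketched as a loop over a structured plan. The plan format, the `execute_plan` name, and the retry policy are assumptions for illustration; exponential backoff is used here as a simple stand-in for backpressure-aware retry, and a failed step is recorded rather than aborting the remaining steps.

```python
import time


def execute_plan(plan, tools, max_attempts=3, base_delay_s=0.0):
    """Run each planned step in order, retrying failures with backoff.

    plan: list of {"id": int, "tool": str, "args": dict} (hypothetical shape,
          standing in for the planner model's structured output)
    tools: dict mapping tool name to a Python callable
    """
    results = []
    for step in plan:
        for attempt in range(1, max_attempts + 1):
            try:
                output = tools[step["tool"]](**step["args"])
                results.append(
                    {"step": step["id"], "status": "ok",
                     "output": output, "attempts": attempt}
                )
                break
            except Exception as exc:
                if attempt == max_attempts:
                    # Partial failure: record it and move on to the next step.
                    results.append(
                        {"step": step["id"], "status": "failed",
                         "error": str(exc), "attempts": attempt}
                    )
                else:
                    # Wait longer after each failed attempt.
                    time.sleep(base_delay_s * 2 ** (attempt - 1))
    return results
```

The per-step results, including attempt counts, are what a final reasoning pass would consume to write the user-facing response.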

Capacity + economics

Sizing Foundry capacity — PTU vs PAYG, tool calls, vector storage

Foundry billing decomposes across four axes — model inference, runtime tool execution, vector-store ingest and storage, and platform protection. EPC Group sizes the capacity model in Phase 1 using projected run volume, average tool-call count per run, model mix (reasoning vs non-reasoning), vector-store size, and Defender for Cloud AI scope. The recommended posture is a hybrid PTU-plus-PAYG model — PTU for the primary user-facing workload where latency is predictable and PAYG for variable batch and exploratory work.

Run volume + model mix

Estimate runs per month by use case and tool-call count per run. Multiply by per-run token cost (input + output) at the chosen model. Reasoning models cost materially more per run but reduce tool-call iteration — net cost is often comparable.
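The multiplication described above can be written out as a small calculator. All prices in the usage note are placeholders chosen for easy arithmetic, not published rates, and the parameter names are assumptions for this sketch.

```python
def monthly_inference_cost(
    runs_per_month,
    model_calls_per_run,
    input_tokens_per_call,
    output_tokens_per_call,
    price_in_per_1k,
    price_out_per_1k,
):
    """Estimate monthly model spend from run volume and per-call token counts."""
    per_call = (
        input_tokens_per_call / 1000 * price_in_per_1k
        + output_tokens_per_call / 1000 * price_out_per_1k
    )
    return runs_per_month * model_calls_per_run * per_call
```

For example, with placeholder prices of 0.002 per 1K input tokens and 0.008 per 1K output tokens, `monthly_inference_cost(10_000, 4, 2_000, 500, 0.002, 0.008)` works out to about 320 per month. Running the same function with a reasoning model's higher prices and a lower `model_calls_per_run` shows how the two effects offset each other.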

Vector storage + ingest

File Search billing is per-GB-month for storage and per-token for embedding at ingest. Refresh cadence matters: if each refresh re-embeds the full corpus, a daily-refresh corpus incurs roughly 30x the embedding cost of a monthly-refresh corpus, while storage cost stays the same. Tune refresh to source-change rate, not calendar.
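The roughly 30x figure follows from simple arithmetic, sketched below under the assumption that every refresh re-embeds the same fraction of the corpus. The prices are placeholders, and `changed_fraction` is a hypothetical parameter modeling incremental refresh, where only changed documents are re-embedded.

```python
def monthly_vector_cost(
    corpus_gb,
    corpus_tokens,
    refreshes_per_month,
    storage_price_per_gb_month,
    embed_price_per_1k_tokens,
    changed_fraction=1.0,
):
    """Split monthly vector-store spend into storage and embedding parts."""
    storage = corpus_gb * storage_price_per_gb_month
    embedding = (
        corpus_tokens / 1000
        * embed_price_per_1k_tokens
        * refreshes_per_month
        * changed_fraction  # 1.0 means a full re-embed on every refresh
    )
    return {"storage": storage, "embedding": embedding}


# Placeholder inputs: 20 GB corpus, 50M tokens, illustrative prices.
daily = monthly_vector_cost(20, 50_000_000, 30, 0.10, 0.0001)
monthly = monthly_vector_cost(20, 50_000_000, 1, 0.10, 0.0001)
incremental = monthly_vector_cost(20, 50_000_000, 30, 0.10, 0.0001, changed_fraction=0.05)
```

Comparing `daily` with `monthly` reproduces the 30x embedding ratio at identical storage cost, while `incremental` shows why re-embedding only changed content narrows the gap considerably.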

PTU reservation

PTU reservation discounts (monthly or annual) make sense once predictable peak throughput exceeds the breakeven point against PAYG. EPC Group runs the breakeven model with 30-90 day production telemetry rather than guessing pre-launch.
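A first-pass breakeven check can be sketched as follows. It deliberately simplifies: it uses one blended PAYG price per 1K tokens instead of separate input and output prices, and it ignores the throughput ceiling of a PTU reservation. The function names and all figures are assumptions for illustration.

```python
def ptu_breakeven_tokens(ptu_monthly_cost, payg_price_per_1k_tokens):
    """Monthly token volume at which PAYG spend equals the PTU reservation."""
    return ptu_monthly_cost / payg_price_per_1k_tokens * 1000


def cheaper_option(monthly_tokens, ptu_monthly_cost, payg_price_per_1k_tokens):
    """Compare a flat PTU reservation with PAYG at a given monthly volume."""
    payg_cost = monthly_tokens / 1000 * payg_price_per_1k_tokens
    return "PTU" if ptu_monthly_cost < payg_cost else "PAYG"
```

With a placeholder reservation of 10,000 per month and a blended PAYG price of 0.005 per 1K tokens, the breakeven sits at about 2 billion tokens per month. Feeding measured monthly volumes from production telemetry into `cheaper_option` is what turns this from a guess into a decision.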

Three-year TCO modeling is the Phase 1 deliverable. EPC Group benchmarks customer Foundry agent unit economics against the 11,000+ engagement portfolio across regulated industries — including scenarios where vector-store storage dominates run-time inference cost and the optimization lever is refresh policy rather than model swap.

The make-or-break layer

Responsible AI is what separates a Foundry demo from a production agent

Code-first agents fail compliance review for the same reasons low-code agents fail — uncovered prompt injection, missing groundedness checks, no audit trail on tool calls, no abuse-monitoring posture aligned to data residency commitments. The difference is that Foundry exposes the controls to the build team to configure explicitly — which is the right model for product-grade agents and a foot-gun if the controls are skipped. EPC Group runs governance hardening as Phase 4 of every Foundry engagement, not as an afterthought.

Azure AI Content Safety

Pre-call and post-call moderation for jailbreak attempts, protected material, groundedness checks, PII detection, and harmful content across four harm categories (hate, sexual, violence, and self-harm). Each control has tunable severity thresholds per use case, so the build team owns the trade-off between false positives and risk acceptance.
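The threshold trade-off can be illustrated with a small decision helper. It assumes the moderation step has already returned an integer severity per category, where higher means more severe; the category names, the default threshold, and the `blocked_categories` function are hypothetical and are not the Content Safety API.

```python
def blocked_categories(severities, thresholds, default_threshold=4):
    """Return the categories whose severity meets or exceeds its threshold.

    severities: dict of category -> integer severity from the moderation step
    thresholds: dict of category -> per-use-case block threshold
    """
    return sorted(
        category
        for category, severity in severities.items()
        if severity >= thresholds.get(category, default_threshold)
    )


def should_block(severities, thresholds, default_threshold=4):
    return bool(blocked_categories(severities, thresholds, default_threshold))
```

Lowering a threshold for one category blocks more borderline content there, at the price of more false positives; keeping the thresholds in configuration per use case makes that choice explicit and reviewable.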

Defender for Cloud AI workload protection

Monitors the Foundry project for prompt injection, model theft, sensitive-data exposure, and adversarial conversations. Alerts route into Microsoft Defender XDR and Sentinel — see the Defender XDR enterprise guide for the integrated SOC posture.

Purview audit + abuse monitoring posture

Purview audit covers Microsoft 365 content surfaced through Foundry knowledge sources. Customer-managed abuse monitoring lets regulated workloads opt out of human review of stored prompts. Both are aligned to the Purview enterprise data governance baseline.

Evaluation harness as a gate

Pre-prod evaluation against locked test datasets gates CI/CD promotion. Continuous in-production evaluation against sampled live conversations surfaces regression before the user complaint queue does. Evaluation thresholds become the SLA contract with the business sponsor and the audit evidence for the compliance officer.

The EPC Group five-phase Foundry Agent Accelerator

Fixed-fee delivery in 10 to 18 weeks for the first agent, anchored on the EPC Group Lifecycle. The engagement covers Assess (use case + Assistants API migration + capacity model), Design (tools + knowledge + orchestration topology), Build + Ground, Governance Harden (Content Safety + Defender for Cloud AI + Purview), and Operate (managed service). Pricing ranges from $200K to $700K depending on agent complexity, multi-agent orchestration scope, knowledge-source breadth, and evaluation-suite depth.

Phase 1 — Assess

Use case selection, Assistants API migration plan, capacity model

Phase one selects the right first Foundry agent for the customer maturity profile, inventories any existing Azure OpenAI Assistants API surface that should migrate to Foundry, and produces a capacity model sized against projected run volume, tool-call volume, and model mix. EPC Group runs a two-week assessment ending in a costed roadmap, an agent backlog ranked by ROI and risk, an Assistants-API-to-Foundry migration sequence (if applicable), and a PTU-versus-PAYG capacity recommendation aligned to the Assess stage of the EPC Group Lifecycle.

Use case backlog with ROI, complexity, and risk weighting
Inventory of existing Azure OpenAI Assistants API usage with a Foundry migration sequence
Capacity model — run volume, tool-call volume, model mix, PTU vs PAYG, three-year TCO
Code-first Foundry vs Copilot Studio decision per agent in the backlog

Phase 2 — Design

Tool taxonomy, knowledge architecture, orchestration topology

Phase two designs the function-tool catalog, the knowledge architecture (vector stores, connectors, refresh cadence), the orchestration topology (single agent, handoff, supervisor, planner-executor), the evaluation strategy (predefined plus custom evaluators with test datasets), and the observability plan (OpenTelemetry destinations, Content Safety rules, Defender for Cloud AI scope). EPC Group ships the design document with sequence diagrams, tool-by-tool schemas, knowledge-source matrices, and the responsible AI risk register reviewed by the customer compliance function before any code is written.

Tool catalog with JSON Schema definitions and developer-owned execution contracts
Knowledge architecture — vector stores, ingest sources, refresh, sensitivity label scope
Orchestration topology — single, handoff, supervisor, or planner-executor per use case
Evaluation plan and responsible AI risk register, signed off pre-build

Phase 3 — Build + ground

Agent build with SDK, vector ingest, and evaluation gates

Phase three is the code-first build sprint — Azure SDK (Python, .NET, JavaScript) for agent and tool implementation, vector-store ingest pipelines for knowledge sources, function-tool execution endpoints (Azure Functions, Container Apps, AKS, or on-prem call-out via Azure Relay), evaluation suites running in CI/CD, and the observability instrumentation. EPC Group ships in two-week increments to a controlled tester cohort, captures evaluation metrics and live telemetry, and tunes against measured groundedness, relevance, and tool-call accuracy rather than building in a vacuum and launching cold.

Two-week sprints with controlled tester cohort and live evaluation feedback
Vector-store ingest pipelines with refresh, deduplication, and chunking tuning
CI/CD-gated evaluation suites — pre-merge groundedness, relevance, F1 thresholds
OpenTelemetry instrumentation wired to Azure Monitor and Application Insights from day one

Phase 4 — Governance harden

Content Safety, Defender for Cloud AI, Purview audit, abuse monitoring

Phase four anchors the agent in the Azure responsible AI baseline — Azure AI Content Safety configured for jailbreak, protected material, PII, and groundedness checks; Defender for Cloud AI workload protection for prompt injection, model theft, and exfiltration; Purview audit log capture when grounding touches Microsoft 365 content; and customer-managed abuse monitoring settings consistent with the customer data residency and privacy posture. The customer security and compliance functions sign off against a documented control map before production cutover.

Azure AI Content Safety rules tuned per use case with severity thresholds
Defender for Cloud AI workload protection covering the Foundry project and dependencies
Purview audit log integration when grounding touches Microsoft 365 content surfaces
Customer-managed abuse monitoring posture aligned to data residency and privacy commitments

Phase 5 — Operate

Managed Foundry agents with senior-architect escalation

Phase five is steady-state operation. EPC Group provides managed Foundry agent services: model upgrade evaluation, evaluation harness maintenance, vector-store ingest health, tool-execution-endpoint SRE, capacity rebalancing across PTU and PAYG, and senior-architect escalation for incident response. The service includes a quarterly governance review with the customer security and compliance officers and continuous responsible AI monitoring, with Content Safety and Defender for Cloud AI signal triage routed into the broader Defender XDR incident graph.

Monthly agent health — evaluation trend, tool-call accuracy, capacity burn, model cost
Quarterly governance review with customer security and compliance officers
Continuous responsible AI signal triage routed into Defender XDR for incident response
Senior-architect on-call escalation tied to incident severity matrix

Why enterprises choose EPC Group for Azure AI Foundry agents

Senior-architect-led delivery from a 29-year Microsoft Solutions Partner with six Microsoft designations, four-time Microsoft Press authorship by founder Errin O’Connor, and a Fortune 500 portfolio across regulated industries.

11,000+

Microsoft engagements

70+

Fortune 500 clients

216+

M&A tenant consolidations

29

Years Microsoft consulting

Microsoft Solutions Partner — six designations

Data & AI (Azure), Digital & App Innovation, Infrastructure, Modern Work, Security, and Business Applications — the full Microsoft AI stack coverage required for code-first Foundry agent delivery.

Senior-architect-led delivery

Every engagement is led by a senior Microsoft architect — no junior staff on client-facing design, build, evaluation, or governance hardening work. The hands designing your agent ship the production deployment.

Microsoft Press authorship

Founder Errin O’Connor is a four-time Microsoft Press author across Power BI, SharePoint, Azure, and large-scale Microsoft 365 migrations — and brings nearly three decades of Microsoft consulting leadership to every Foundry engagement.

Compliance-native delivery

Foundry engagements aligned to HIPAA, SOC 2, FedRAMP, FINRA, CMMC, GxP, with Content Safety, Defender for Cloud AI, Purview audit, and abuse-monitoring posture mapped to regulatory profile — see the standards alignment map.

Frequently asked questions — Azure AI Foundry Agent Service

What is Azure AI Foundry Agent Service and how is it different from Azure OpenAI Service?

Azure AI Foundry is the unified platform that replaces the standalone Azure OpenAI Service portal experience. The Foundry Agent Service is the agent-runtime capability inside AI Foundry — a code-first, server-managed runtime that handles agent lifecycle, thread state, parallel tool calling, run scheduling, evaluation, and observability. Azure OpenAI Service models (GPT-4o, GPT-4.1, o-series, GPT-5-class) are still the underlying model layer; Foundry adds the agent runtime, the unified Foundry catalog (open-weight models alongside Azure OpenAI), the evaluation harness, the responsible AI tooling, and project-level governance. Customers using the legacy Azure OpenAI standalone portal are migrating to Foundry projects as Microsoft consolidates the experience. The model endpoints remain Azure OpenAI; the developer surface is now Foundry.

How do I migrate from the Azure OpenAI Assistants API to the Foundry Agent Service?

Foundry Agent Service maintains a compatible API contract with the Azure OpenAI Assistants API — threads, runs, messages, and steps remain first-class objects with the same semantics, and the SDK call shapes are nearly identical. A typical migration is three steps. First, point the SDK base URL at the Foundry project endpoint and update the authentication to use Azure RBAC against the Foundry project (rather than Azure OpenAI resource RBAC). Second, recreate agents in the Foundry project — the create-agent payload accepts the same model, instructions, and tool list with minor schema additions for Foundry-specific features like knowledge sources and connector tools. Third, migrate any existing thread state if continuity matters; new threads typically just flow into the Foundry runtime. EPC Group has run this migration across customers with multi-agent Assistants-API deployments in 2025 and ships the migration plan as part of the Phase 1 Foundry Accelerator deliverable.

When should I use Azure AI Foundry Agent Service versus Microsoft Copilot Studio?

Use Foundry Agent Service when any of the following apply:

The build team is code-first (Python, .NET, JavaScript developers)
The agent will run inside a custom application or product rather than inside Microsoft 365 surfaces
Fine-grained control over the model call (model, temperature, function-calling shape, streaming, reasoning budget) is required
The integration surface is dominated by non-Microsoft systems or by APIs that need OpenAPI-tool import
Token-level cost optimization and PTU reservation matter
Multi-agent orchestration patterns (planner-executor, supervisor-with-specialists) require explicit programmatic control

Use Microsoft Copilot Studio (see /microsoft-copilot-studio-agents-enterprise-2026) when the build team is low-code, when the agent lives primarily inside Teams, Microsoft 365 Copilot, or web chat, and when out-of-the-box Purview governance is preferred over custom controls. Many enterprises run both: Foundry for the product-grade and custom-application agents, Copilot Studio for the Microsoft-surface departmental agents.

What is the cost model for Foundry Agent Service — PTU versus pay-as-you-go?

Foundry Agent Service runtime is billed on the underlying model and tool surface. Model calls follow Azure OpenAI pricing: Provisioned Throughput Units (PTU) for reserved capacity with predictable latency and a monthly or annual reservation discount, or pay-as-you-go (PAYG) per input and output token for variable workloads. Code Interpreter is billed per sandbox session, with each session staying active for a limited window (one hour by default). File Search vector-store storage is billed per gigabyte per day, and ingest is billed per token embedded. Function tools execute in the customer Azure subscription (Functions, Container Apps, AKS) and bill against those services directly. Defender for Cloud AI workload protection has its own SKU. EPC Group sizes the capacity model in Phase 1 using projected run volume, average tool-call count per run, model mix, and channel mix to recommend a PTU-plus-PAYG hybrid that minimizes idle reservation cost while protecting latency for the primary user-facing workload.
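The PTU-plus-PAYG trade-off can be sketched as simple arithmetic. The example below is a minimal model, assuming a flat monthly PTU fee that absorbs a fixed share of runs while the remainder is billed per token. Every price, volume, and the `ptu_share` split is a placeholder rather than a published Azure rate, and real sizing also has to account for tool, storage, and Defender charges.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    runs_per_month: int
    input_tokens_per_run: int
    output_tokens_per_run: int


def payg_monthly_cost(w: Workload, price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Pay-as-you-go cost: tokens consumed times the per-1K-token price."""
    tokens_in = w.runs_per_month * w.input_tokens_per_run
    tokens_out = w.runs_per_month * w.output_tokens_per_run
    return tokens_in / 1000 * price_in_per_1k + tokens_out / 1000 * price_out_per_1k


def hybrid_monthly_cost(w: Workload, ptu_share: float, ptu_monthly_fee: float,
                        price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Flat PTU fee covers `ptu_share` of runs; overflow runs are billed PAYG."""
    overflow = Workload(
        runs_per_month=round(w.runs_per_month * (1 - ptu_share)),
        input_tokens_per_run=w.input_tokens_per_run,
        output_tokens_per_run=w.output_tokens_per_run,
    )
    return ptu_monthly_fee + payg_monthly_cost(overflow, price_in_per_1k, price_out_per_1k)


# Placeholder numbers chosen only to make the comparison concrete.
workload = Workload(runs_per_month=100_000, input_tokens_per_run=2_000, output_tokens_per_run=500)
payg_only = payg_monthly_cost(workload, price_in_per_1k=0.005, price_out_per_1k=0.015)
hybrid = hybrid_monthly_cost(workload, ptu_share=0.8, ptu_monthly_fee=1_200.0,
                             price_in_per_1k=0.005, price_out_per_1k=0.015)
print(f"PAYG only: {payg_only:.2f}  hybrid: {hybrid:.2f}")
```

With these placeholder inputs the hybrid comes out cheaper, but the result flips as soon as the reservation sits idle, which is why the split is sized against projected run volume rather than assumed.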

How does multi-agent orchestration work in Foundry Agent Service?

Foundry supports multi-agent orchestration through three composable patterns. The handoff pattern uses a triage agent that classifies the incoming request and transfers thread ownership to a specialist child agent. The supervisor pattern keeps a parent agent in conversational control and uses function tools (each tool wraps a child agent invocation) to delegate discrete subtasks; the parent reasons over the consolidated result. The planner-executor pattern uses a reasoning model to produce a structured multi-step plan and a faster executor model to run each step, with a final reasoning pass over the consolidated output. Foundry exposes these patterns through SDK primitives rather than a no-code orchestrator, which gives developers explicit control over the cost-versus-quality trade-offs at every step. EPC Group selects the pattern per agent in the Phase 2 Design deliverable based on subdomain separability, response composition needs, and token cost profile.
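The three patterns differ mainly in who holds control of the conversation. The sketch below shows that control flow with plain Python functions standing in for agents and model calls. The agent names, the keyword-based `classify` stub, and the "then"-splitting planner are hypothetical placeholders, not Foundry SDK primitives.

```python
from typing import Callable, Dict, List

Agent = Callable[[str], str]


def billing_agent(message: str) -> str:
    return f"[billing] handled: {message}"


def tech_agent(message: str) -> str:
    return f"[tech] handled: {message}"


SPECIALISTS: Dict[str, Agent] = {"billing": billing_agent, "tech": tech_agent}


def classify(message: str) -> str:
    """Stand-in for the triage model call."""
    return "billing" if "invoice" in message.lower() else "tech"


def handoff(message: str) -> str:
    """Handoff: triage picks one specialist, which then owns the reply."""
    return SPECIALISTS[classify(message)](message)


def supervisor(message: str) -> str:
    """Supervisor: the parent calls each child like a tool and consolidates."""
    results = [SPECIALISTS[name](message) for name in sorted(SPECIALISTS)]
    return " | ".join(results)


def planner_executor(goal: str) -> List[str]:
    """Planner-executor: plan once, then run each step with a cheaper executor."""
    plan = [f"step {i}: {part.strip()}" for i, part in enumerate(goal.split(" then "), 1)]
    return [f"done {step}" for step in plan]


print(handoff("Where is my invoice?"))
print(supervisor("Reset my password"))
print(planner_executor("pull the ledger then reconcile accounts"))
```

In a real deployment each stub would be a model or agent invocation, so the number of calls per pattern (one specialist, every specialist, or one per plan step) is what drives the cost-versus-quality trade-off.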

How does Foundry handle responsible AI, content safety, and prompt injection defense?

Foundry agents run inside the Azure responsible AI layer with multiple defenses. Azure AI Content Safety runs pre-call and post-call moderation for jailbreak attempts, protected material detection, groundedness violations, PII surfacing, and harmful content across four harm categories (hate, sexual, violence, and self-harm), each with severity thresholds the build team tunes per use case. Defender for Cloud AI workload protection monitors the Foundry project for prompt injection, model theft attempts, sensitive-data exposure, and adversarial input patterns, and routes alerts into Microsoft Defender XDR and Sentinel. Customer-managed abuse monitoring lets enterprises opt out of human review of stored prompts for highly regulated workloads. Purview audit covers Microsoft 365 content access through Foundry knowledge sources. The combined layer is the production-grade equivalent of what Copilot Studio bundles in low-code, but exposed to developers for explicit configuration and tunable thresholds.
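Per-use-case severity tuning amounts to comparing each category's reported severity against a threshold the build team owns. The sketch below assumes the moderation result has already been reduced to a category-to-severity mapping. The category keys, the threshold values, and the function name are illustrative assumptions, and the call to the moderation service itself is deliberately left out.

```python
# Illustrative thresholds: block when severity is at or above the value.
THRESHOLDS = {"hate": 2, "sexual": 2, "violence": 4, "self_harm": 2}


def blocked_categories(severities: dict, thresholds: dict = THRESHOLDS) -> list:
    """Return the categories whose severity meets or exceeds its threshold."""
    return sorted(
        category
        for category, severity in severities.items()
        if severity >= thresholds[category]
    )


# A made-up moderation result for one message.
result = {"hate": 0, "sexual": 0, "violence": 4, "self_harm": 0}
blocked = blocked_categories(result)
print("block" if blocked else "allow", blocked)
```

The same check can run on the user input before the model call and on the model output after it, with different threshold tables for each direction if the use case calls for it.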

Can I deploy Foundry agents in HIPAA, FedRAMP, FINRA, or CMMC environments?

Yes. Azure AI Foundry and the Foundry Agent Service are covered by the Microsoft Business Associate Agreement (BAA) for HIPAA, are on the FedRAMP High path inside Azure Government, support FINRA-aligned audit through Purview and customer-managed retention, and align to CMMC 2.0 Level 2 boundary controls inside Azure Government for the defense industrial base. The compliance posture is anchored on the customer-chosen Foundry project region, data residency commitments, abuse-monitoring opt-out, and the broader Azure platform compliance baseline. EPC Group runs the Phase 4 governance hardening against the customer regulatory profile, produces the auditor evidence package mapped to the relevant control framework, and cross-links to HIPAA, SOC 2, FedRAMP, FINRA, CMMC, and GxP as documented in the standards alignment map.

What does an EPC Group Foundry Agent engagement cost and how long does it take?

EPC Group Foundry Agent engagements are fixed-fee, ranging from $200K to $700K for a first agent depending on complexity, multi-agent orchestration scope, knowledge-source breadth, and evaluation-suite depth. Typical timeline is 10 to 18 weeks across the five-phase accelerator (Assess, Design, Build + Ground, Governance Harden, Operate), with managed Foundry agent services available on a quarterly or annual basis once production is live. The first agent is the most expensive because the platform baseline (Foundry project provisioning, Defender for Cloud AI enablement, Content Safety tuning, evaluation-harness wiring, observability plumbing) is set up once and reused. Subsequent agents land at 40 to 60 percent of first-agent cost. EPC Group is a 29-year Microsoft Solutions Partner with senior-architect-led delivery across 70+ Fortune 500 clients and 11,000+ engagements; the Foundry Accelerator is the standard delivery vehicle for code-first agent programs.

Related EPC Group enterprise guides

Ship governed Foundry agents — fixed fee, senior-architect-led

Book a Foundry Agent briefing with a senior EPC Group architect. Two-week assessment, costed five-phase roadmap, Assistants API migration plan (if applicable), responsible AI baseline review, and capacity model, all delivered against The EPC Group Lifecycle.

Book a Foundry Agent briefing Call 888-381-9725

Key Facts

Six capability areas: orchestration, tools, knowledge sources, models, evaluation harness, observability
Code-first developer surface — Azure SDK (Python, .NET, JavaScript, Java) plus REST API
Assistants API contract compatibility — threads, runs, messages, steps migrate without rewrites
Three orchestration patterns: handoff, supervisor, planner-executor for multi-agent workflows
Full Foundry model catalog — Azure OpenAI GPT-4o/4.1/o-series/GPT-5-class plus open weights
Built-in evaluation harness — groundedness, relevance, F1, retrieval; CI/CD gating
Content Safety + Defender for Cloud AI + Purview audit — production responsible AI baseline
29-year Microsoft Solutions Partner, 70+ Fortune 500 clients, 11,000+ engagements

Foundry Agent Service vs Copilot Studio vs Azure OpenAI Assistants API

Dimension	Copilot Studio	Foundry Agent Service	Azure OpenAI Assistants API (legacy)
Build skill	Low-code maker	Code-first developer	Code-first developer
Where it runs	Teams, M365 Copilot, web, SMS	Custom application, product UX	Custom application (legacy portal)
Knowledge surface	SharePoint, Dataverse, Graph	Vector stores + connectors	Vector store + file search
Orchestration	Built-in multi-agent (low-code)	Handoff, supervisor, planner-executor	Single agent + function tools
Evaluation	Topic-level CSAT	Built-in harness, CI/CD gating	Custom / external
Governance	Purview + Defender for AI	Content Safety + Defender for Cloud AI + Purview	Content Safety + custom
Best use	Departmental + line-of-business agents	Product-grade + custom-app agents	Migrate to Foundry

The six Foundry Agent Service capability areas

Agent orchestration — code-first runtime

Threads, runs, messages, and steps as first-class persistent objects in the runtime
Parallel tool calling with deterministic ordering and per-call timeout governance
Crash-safe resume — interrupted runs resume from the last completed step on restart
Bring-your-own-storage option pinning thread state to the customer Azure subscription
Compatible with the OpenAI Assistants API contract for migration without rewriting the agent loop
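Parallel tool calling with deterministic ordering and a per-call timeout can be approximated on the client side with the standard library. The sketch below is one way to do it, assuming each tool is an ordinary Python callable. The function name, the tuple layout, and the sample tools are hypothetical and do not describe how the Foundry runtime schedules calls internally.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def run_tools_in_parallel(calls, timeout_s):
    """Run (name, fn, kwargs) calls concurrently; report results in request order."""
    results = []
    pool = ThreadPoolExecutor(max_workers=max(1, len(calls)))
    try:
        started = time.monotonic()
        futures = [(name, pool.submit(fn, **kwargs)) for name, fn, kwargs in calls]
        for name, future in futures:
            # Each call gets the same deadline, measured from submission time.
            remaining = max(0.0, started + timeout_s - time.monotonic())
            try:
                results.append((name, "ok", future.result(timeout=remaining)))
            except FutureTimeout:
                results.append((name, "timeout", None))
    finally:
        # Do not block on stragglers; a timed-out thread still runs to completion.
        pool.shutdown(wait=False)
    return results


def slow_lookup():
    time.sleep(0.3)
    return "late"


def double(x):
    return x * 2


results = run_tools_in_parallel(
    [("slow_lookup", slow_lookup, {}), ("double", double, {"x": 2})],
    timeout_s=0.05,
)
print(results)
```

Results come back in the order the calls were requested, not the order they finished, so a slow first tool never reorders the output. Note that Python threads cannot be cancelled, which means the timeout bounds how long the caller waits, not how long the tool runs.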

Tools — function calling, code interpreter, file search

Function tools defined with JSON Schema parameters and developer-owned execution
Code Interpreter sandbox for Python execution against uploaded files and data
File Search with vector-store-backed retrieval across uploaded enterprise documents
OpenAPI tools that import a REST API spec and expose every operation as a tool
Azure Function and Logic App tools for serverless and integration-platform action surfaces
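A function tool pairs a JSON Schema description, which the model sees, with developer-owned code that actually runs. The sketch below uses the function-tool shape familiar from the Assistants API (`type`, `function.name`, `function.parameters`). The `get_order_status` tool, its stub handler, and the minimal required-field check are assumptions for illustration, not a full JSON Schema validator.

```python
import json

# Tool definition the model would see (tool and field names are made up).
GET_ORDER_STATUS_TOOL = {
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the fulfilment status of an order.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "Order identifier"},
            },
            "required": ["order_id"],
        },
    },
}


def get_order_status(order_id: str) -> dict:
    """Developer-owned execution; a stub standing in for a real backend call."""
    return {"order_id": order_id, "status": "shipped"}


# Registry mapping tool name to (definition, handler).
TOOLS = {"get_order_status": (GET_ORDER_STATUS_TOOL, get_order_status)}


def execute_tool_call(name: str, arguments_json: str) -> str:
    """Check required arguments, run the handler, return a JSON string."""
    definition, handler = TOOLS[name]
    arguments = json.loads(arguments_json)
    required = definition["function"]["parameters"]["required"]
    missing = [key for key in required if key not in arguments]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    return json.dumps(handler(**arguments))


output = execute_tool_call("get_order_status", '{"order_id": "A1"}')
print(output)
```

Keeping the definition and the handler in one registry makes it harder for the schema the model sees to drift from the code that executes.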

Knowledge sources — vector stores + grounding

Managed vector stores with automatic chunking, embedding, refresh, and re-ranking
Native ingest from Azure Blob, Azure Data Lake, SharePoint, Fabric OneLake, Graph connectors
Sensitivity label honored when grounding through Graph connectors into Microsoft 365 content
Hybrid retrieval — semantic plus keyword plus filter-by-metadata for precision tuning
Citation surfaces in agent responses with link-back to source document or row
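Hybrid retrieval can be pictured as a metadata filter followed by a weighted blend of a semantic score and a keyword score. The sketch below is a toy version with two-dimensional vectors. The `alpha` weight, the document fields, and the scoring functions are assumptions chosen for readability, and the managed vector store's actual ranking may work differently.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(terms, text):
    """Fraction of query terms that appear as whole words in the text."""
    if not terms:
        return 0.0
    words = set(text.lower().split())
    return sum(1 for term in terms if term.lower() in words) / len(terms)


def hybrid_search(terms, query_vec, docs, metadata_filter, alpha=0.7, top_k=3):
    """Filter by metadata, then rank by alpha*semantic + (1-alpha)*keyword."""
    candidates = [
        d for d in docs
        if all(d["meta"].get(key) == value for key, value in metadata_filter.items())
    ]
    scored = [
        (alpha * cosine(query_vec, d["vec"]) + (1 - alpha) * keyword_score(terms, d["text"]), d["id"])
        for d in candidates
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, id breaks ties
    return [doc_id for _, doc_id in scored[:top_k]]


docs = [
    {"id": "d1", "text": "refund policy for enterprise contracts", "vec": [1.0, 0.0], "meta": {"dept": "legal"}},
    {"id": "d2", "text": "refund workflow in finance", "vec": [0.6, 0.8], "meta": {"dept": "finance"}},
    {"id": "d3", "text": "holiday schedule", "vec": [0.0, 1.0], "meta": {"dept": "legal"}},
]
hits = hybrid_search(["refund", "policy"], [1.0, 0.0], docs, {"dept": "legal"})
print(hits)
```

Raising `alpha` favours semantic similarity and lowering it favours exact term matches, which is the precision-tuning lever the list above refers to.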

Models — full Azure OpenAI catalog plus open weights

Azure OpenAI GPT-4o, GPT-4.1, o-series reasoning, GPT-5-class as available
Open-weight catalog — Phi, Llama, Mistral, Cohere, plus partner and customer fine-tunes
PTU for reserved capacity with predictable latency, PAYG for variable workload elasticity
Reasoning-model extended-thinking budgets exposed at the run-options level
Model swap at agent edit time — tools, threads, and instructions remain intact

Evaluation harness — pre-prod and continuous

Predefined evaluators — groundedness, relevance, coherence, fluency, F1, retrieval quality
Custom prompt-based or function-based evaluators authored by the build team
Pre-prod evaluation against test datasets gating CI/CD promotion to higher environments
Continuous in-production evaluation against sampled live conversations with PII redaction
Evaluation results streamed to Azure Monitor and Application Insights for trend analysis
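An evaluation gate reduces to scoring a test dataset and failing the pipeline when the score drops below a threshold. The sketch below uses one common token-overlap definition of F1 and a mean-score gate. The threshold, the sample cases, and the function names are assumptions, and the predefined F1 evaluator in Foundry may tokenize or aggregate differently.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def passes_gate(cases, threshold):
    """Return (passed, mean_f1) for a list of (prediction, reference) pairs."""
    mean_f1 = sum(token_f1(pred, ref) for pred, ref in cases) / len(cases)
    return mean_f1 >= threshold, mean_f1


# Made-up test cases: one exact match, one complete miss.
cases = [
    ("the cat sat", "the cat sat"),
    ("a b", "c d"),
]
passed, mean_f1 = passes_gate(cases, threshold=0.6)
print(passed, mean_f1)
```

In a CI/CD pipeline the boolean would typically be turned into a non-zero exit code so that a failing evaluation blocks promotion to the next environment.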

Observability + governance — tracing, content safety, Defender

OpenTelemetry traces per run, tool call, and model call with token and latency attribution
Azure Monitor + Application Insights integration with prebuilt agent observability workbooks
Azure AI Content Safety — jailbreak, protected material, PII, groundedness, harmful content
Defender for Cloud AI workload protection — prompt injection, model theft, exfiltration
Purview audit log integration when grounding touches Microsoft 365 content surfaces

Six enterprise Foundry agent patterns

Pattern 1 — Trading floor research + execution agent

Pattern 2 — Clinical trial protocol agent for life sciences

Pattern 3 — Contract review agent for legal operations

Pattern 4 — IT incident commander agent on Sentinel + Defender XDR

Pattern 5 — Supply chain planner agent for manufacturing

Pattern 6 — Finance close agent on Dynamics 365 + Fabric

Multi-agent orchestration

Three multi-agent orchestration topologies

Handoff — triage and specialize

Supervisor — parent orchestrates child specialists

Planner-executor — reason once, execute many

Capacity + economics

Sizing Foundry capacity — PTU vs PAYG, tool calls, vector storage

Run volume + model mix

Vector storage + ingest

PTU reservation

The make-or-break layer

Responsible AI is what separates a Foundry demo from a production agent

Azure AI Content Safety

Defender for Cloud AI workload protection

Purview audit + abuse monitoring posture

Evaluation harness as a gate

The EPC Group five-phase Foundry Agent Accelerator

Phase 1 — Assess

Use case selection, Assistants API migration plan, capacity model

Use case backlog with ROI, complexity, and risk weighting
Inventory of existing Azure OpenAI Assistants API usage with a Foundry migration sequence
Capacity model — run volume, tool-call volume, model mix, PTU vs PAYG, three-year TCO
Code-first Foundry vs Copilot Studio decision per agent in the backlog

Phase 2 — Design

Tool taxonomy, knowledge architecture, orchestration topology

Tool catalog with JSON Schema definitions and developer-owned execution contracts
Knowledge architecture — vector stores, ingest sources, refresh, sensitivity label scope
Orchestration topology — single, handoff, supervisor, or planner-executor per use case
Evaluation plan and responsible AI risk register, signed off pre-build

Phase 3 — Build + ground

Agent build with SDK, vector ingest, and evaluation gates

Two-week sprints with controlled tester cohort and live evaluation feedback
Vector-store ingest pipelines with refresh, deduplication, and chunking tuning
CI/CD-gated evaluation suites — pre-merge groundedness, relevance, F1 thresholds
OpenTelemetry instrumentation wired to Azure Monitor and Application Insights from day one

Phase 4 — Governance harden

Content Safety, Defender for Cloud AI, Purview audit, abuse monitoring

Azure AI Content Safety rules tuned per use case with severity thresholds
Defender for Cloud AI workload protection covering the Foundry project and dependencies
Purview audit log integration when grounding touches Microsoft 365 content surfaces
Customer-managed abuse monitoring posture aligned to data residency and privacy commitments

Phase 5 — Operate

Managed Foundry agents with senior-architect escalation

Monthly agent health — evaluation trend, tool-call accuracy, capacity burn, model cost
Quarterly governance review with customer security and compliance officers
Continuous responsible AI signal triage routed into Defender XDR for incident response
Senior-architect on-call escalation tied to incident severity matrix

Why enterprises choose EPC Group for Azure AI Foundry agents

11,000+ Microsoft engagements

70+ Fortune 500 clients

216+ M&A tenant consolidations

29 years of Microsoft consulting

Microsoft Solutions Partner — six designations

Data & AI (Azure), Digital & App Innovation, Infrastructure, Modern Work, Security, and Business Applications — the full Microsoft AI stack coverage required for code-first Foundry agent delivery.

Senior-architect-led delivery

Microsoft Press authorship

Compliance-native delivery
