Azure · AKS · Kubernetes · Infrastructure · Network Security · Platform Engineering

Enterprise Agent Infrastructure on Azure

Bridge the gap between building agents and running them in production. This advanced course covers the full Azure infrastructure stack for hosting AI agents at enterprise scale. Design AKS clusters with tiered node pools and a 4-tier storage architecture. Build a 3-layer egress defense system with Envoy sidecar proxies, Cilium FQDN policies, and Azure Firewall Premium. Implement a custom Kubernetes Operator (AgentFilterPolicy CRD) that reconciles per-agent filtering rules across all security layers. Wire up Entra ID workload identity, per-tenant Key Vault isolation, hub-and-spoke networking, and a GitOps CI/CD pipeline with Flux and blue-green deployments.

Duration: 5 weeks
Level: Advanced
Lessons: 20
Instructor: Rafael Kemish, Microsoft Certified Trainer · Azure AI & Platform Engineering

Curriculum

Azure compute options for AI agents: AKS vs. ACA vs. Functions vs. ACI
AKS cluster design: node pools, reserved instances & sizing for 200+ agents
Per-agent resource budgets: vCPU, memory, disk & runtime requirements
Deployment patterns: ephemeral, long-running & hybrid agent sessions
4-tier storage architecture: Azure Files SSD, Blob Hot, Blob Cool & tmpfs scratch
Entra ID workload identity & Kubernetes service account binding
Per-tenant Key Vault isolation with CSI Secrets Store Driver
Tiered tenant isolation: dedicated cluster, namespace & shared models
Hub-and-spoke networking: Azure Firewall, Private Endpoints & NSGs
Layer 1 — Envoy sidecar proxy: forward proxy, ext_authz & URL categorization
Layer 2 — Cilium FQDN policies: Azure CNI with ACNS & CiliumNetworkPolicy per agent
Layer 3 — Azure Firewall Premium: threat intelligence, web categories & TLS inspection
AgentFilterPolicy CRD: designing the custom resource schema
Operator reconciliation: syncing policy across sidecar, Cilium & Firewall layers
Identity-to-policy binding: Entra ID → Kubernetes labels → filtering rules
Per-node-pool routing & the decision guide: minimum viable to full enterprise
Observability: Application Insights, OpenTelemetry & Azure AI Foundry integration
Unified filtering audit trail: KQL queries across all egress layers
GitOps CI/CD: GitHub Actions, ACR, Helm, Flux & blue-green deployments
Cost modeling: infrastructure budgets, API token economics & model routing savings
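
To give a feel for the sizing lessons (node pools for 200+ agents and per-agent resource budgets), here is a minimal capacity sketch in Python. The figures in the usage note are placeholders, not course recommendations, and the names `AgentBudget`, `NodeSize`, and `nodes_needed` are hypothetical helpers invented for this illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentBudget:
    """Per-agent resource request (illustrative values only)."""
    vcpu: float
    memory_gib: float


@dataclass(frozen=True)
class NodeSize:
    """Allocatable capacity of one node, after system reservations."""
    allocatable_vcpu: float
    allocatable_memory_gib: float


def nodes_needed(agents: int, budget: AgentBudget, node: NodeSize,
                 headroom: float = 0.2) -> int:
    """Return how many nodes fit `agents`, keeping `headroom` of each node free.

    The tighter of the CPU and memory limits decides how many agents
    fit on a single node.
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_vcpu = node.allocatable_vcpu * (1 - headroom)
    usable_mem = node.allocatable_memory_gib * (1 - headroom)
    agents_per_node = min(
        math.floor(usable_vcpu / budget.vcpu),
        math.floor(usable_mem / budget.memory_gib),
    )
    if agents_per_node < 1:
        raise ValueError("a single agent does not fit on this node size")
    return math.ceil(agents / agents_per_node)
```

With an assumed budget of 0.5 vCPU and 1 GiB per agent on nodes with 16 vCPU and 56 GiB allocatable, `nodes_needed(200, AgentBudget(0.5, 1.0), NodeSize(16, 56))` gives 8 nodes at 20% headroom. Raising the memory budget to 4 GiB makes memory the binding limit and the count rises to 19.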

What You'll Build

  • A production AKS cluster with tiered tenant isolation, node pools, and a 4-tier storage architecture
  • A 3-layer egress defense system with Envoy sidecar proxy, Cilium FQDN policies, and Azure Firewall Premium
  • A custom Kubernetes Operator (AgentFilterPolicy CRD) that reconciles per-agent filtering across all three security layers
  • A GitOps CI/CD pipeline with GitHub Actions, ACR, Helm charts, and blue-green deployments via Flux
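
As a rough illustration of the operator idea above, the sketch below shows one policy object being rendered into a configuration fragment for each of the three egress layers. It is a simplified Python model, not the course's operator: the `AgentFilterPolicy` fields, the `agent` pod label, the Envoy allowlist shape, and the trimmed firewall rule are all assumptions made for this example. Only the `CiliumNetworkPolicy` keys (`endpointSelector`, `toFQDNs`, `matchName`, `toPorts`) follow the real Cilium schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentFilterPolicy:
    """Hypothetical, minimal stand-in for the AgentFilterPolicy custom resource."""
    agent: str
    namespace: str
    allowed_fqdns: tuple[str, ...]


def render_envoy_allowlist(policy: AgentFilterPolicy) -> dict:
    # Layer 1: host allowlist a sidecar proxy could load (shape is an assumption).
    return {"agent": policy.agent, "allowed_hosts": sorted(policy.allowed_fqdns)}


def render_cilium_policy(policy: AgentFilterPolicy) -> dict:
    # Layer 2: CiliumNetworkPolicy limiting egress to the listed FQDNs on 443.
    # A working policy also needs a DNS rule so Cilium can observe lookups;
    # that rule is omitted here to keep the sketch short.
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": f"{policy.agent}-egress", "namespace": policy.namespace},
        "spec": {
            "endpointSelector": {"matchLabels": {"agent": policy.agent}},
            "egress": [{
                "toFQDNs": [{"matchName": f} for f in sorted(policy.allowed_fqdns)],
                "toPorts": [{"ports": [{"port": "443", "protocol": "TCP"}]}],
            }],
        },
    }


def render_firewall_rule(policy: AgentFilterPolicy, source_cidr: str) -> dict:
    # Layer 3: simplified application rule for an Azure Firewall policy.
    return {
        "name": f"allow-{policy.agent}",
        "ruleType": "ApplicationRule",
        "protocols": [{"protocolType": "Https", "port": 443}],
        "sourceAddresses": [source_cidr],
        "targetFqdns": sorted(policy.allowed_fqdns),
    }


def reconcile(policy: AgentFilterPolicy, source_cidr: str) -> dict:
    """Render the desired state for all three layers from one policy."""
    return {
        "envoy": render_envoy_allowlist(policy),
        "cilium": render_cilium_policy(policy),
        "firewall": render_firewall_rule(policy, source_cidr),
    }
```

Rendering every layer from a single object keeps the three allowlists from drifting apart. A real operator would additionally watch the custom resource, apply these fragments through the Kubernetes and Azure APIs, and report status back on the resource.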

Prerequisites

  • Working knowledge of Kubernetes (pods, deployments, services, namespaces) and Helm
  • An Azure subscription with permissions to create AKS clusters, VNets, and Firewall resources
  • Familiarity with networking fundamentals (DNS, HTTP proxies, TLS) and Linux command line

Frequently Asked Questions

Invoices: After purchase you'll receive a detailed invoice by email. For custom invoicing or purchase orders, reach out via the contact form on our Teams page.

Team discounts: We offer 15–25% discounts for teams of 2 or more. Visit our Teams page and fill out the contact form for a custom quote.

Refunds: We offer a full refund up to 7 days before the first session. If you're not satisfied after session 1, we'll work with you on a case-by-case basis.

Certificate: Upon completing all sessions, you'll receive a verifiable certificate of completion that you can share on LinkedIn or with your employer.

Recordings: All live sessions are recorded and remain available for 6 months after the cohort ends, so you can review the material at your own pace.

Missed sessions: If you miss a live session, the recording is available within 24 hours. You can also ask questions asynchronously in the private community channel.

Prerequisites: Each course lists its specific prerequisites on its page. In general, you should be comfortable programming and have basic familiarity with APIs and cloud services.

About Your Instructor

Rafael Kemish is a Microsoft Certified Trainer with 25 years in the Microsoft ecosystem and 23 active certifications. As a Senior Technical Consultant at Microsoft, he has trained over 5,000 professionals at organizations including NASA, Microsoft, Honda, and Citrix. He specializes in Azure AI, OpenAI integration, and enterprise platform engineering — teaching what actually works in real-world production deployments.