Phase 1: Public Cloud¶
Chapter Summary
Phase 1 represents Contoso Insurance's initial cloud-native deployment on Azure public cloud. This architecture maximizes operational simplicity by leveraging fully managed Azure PaaS services. This chapter details the architecture, deployment approach, operational model, and cost profile of the Azure public cloud implementation.
Architecture Overview¶
The Phase 1 architecture fully embraces Azure's managed services philosophy. Every component runs on Azure PaaS offerings, minimizing operational overhead while maximizing scalability, reliability, and integration with Azure's ecosystem.
Component Mapping — Azure PaaS Services¶
| Application Component | Azure Service | Configuration |
|---|---|---|
| Frontend (React SPA) | Azure Kubernetes Service (AKS) | NGINX ingress controller, 3-10 replicas, autoscaling |
| API Backend (.NET 8) | Azure Kubernetes Service (AKS) | 5-20 replicas, autoscaling on CPU/latency |
| Background Workers (.NET 8) | Azure Kubernetes Service (AKS) | Separate node pool, 2-8 replicas per worker type |
| Database (SQL Server) | Azure SQL Database | Business Critical tier, 8 vCores, zone-redundant |
| File Storage (Objects) | Azure Blob Storage | Hot tier, geo-redundant storage (GRS), lifecycle policies |
| Message Queue | Azure Service Bus | Standard tier, 3 queues, partitioning enabled |
| Identity (Customers) | Azure AD B2C | Custom policies, social identity providers |
| Identity (Employees) | Microsoft Entra ID | Conditional access, MFA enforcement |
| Secrets Management | Azure Key Vault | Premium tier (HSM-backed), RBAC-based access |
| Container Registry | Azure Container Registry | Premium tier, geo-replication, Defender scanning |
| Ingress & CDN | Azure Front Door | Premium tier, WAF enabled, SSL/TLS termination |
| Monitoring & Logging | Azure Monitor + App Insights | Distributed tracing, custom metrics, log analytics |
| CI/CD Pipeline | GitHub Actions | Infrastructure as Code (Bicep), Helm chart deployments |
Infrastructure Architecture¶
The Phase 1 deployment uses a hub-spoke network topology with Azure as the central hub:
- Hub Virtual Network (
10.0.0.0/16) - Shared services including Azure Firewall, Azure Bastion for secure VM access, VPN Gateway for administrative access. This VNet hosts management and monitoring infrastructure.
- Spoke Virtual Network (
10.1.0.0/16) - Application workloads running on AKS. Subnets include: -
aks-system(10.1.0.0/24) — System pods, ingress controllers, cluster monitoring -aks-frontend(10.1.1.0/24) — Frontend NGINX pods
-aks-backend(10.1.2.0/24) — API backend pods -aks-workers(10.1.3.0/24) — Background worker pods - Private Endpoints
- Azure SQL Database, Azure Blob Storage, and Azure Service Bus are accessed via private endpoints injected into the spoke VNet. This ensures all data plane traffic remains within the Azure backbone network, never traversing the public internet.
- Azure Front Door
- Sits at the internet edge, provides global load balancing, SSL/TLS termination, and Web Application Firewall (WAF) protection. Routes traffic to the AKS ingress controller via private link.
graph TB
Internet([🌐 Internet])
subgraph Edge["Edge Services"]
AFD[Azure Front Door<br/>+ WAF]
end
Internet --> AFD
subgraph Hub["Hub VNet (10.0.0.0/16)"]
Firewall[Azure Firewall]
Bastion[Azure Bastion]
VPN[VPN Gateway]
end
subgraph Spoke["Spoke VNet (10.1.0.0/16)"]
subgraph AKS["AKS Cluster"]
direction TB
Ingress[Ingress Controller<br/>10.1.0.0/24]
Frontend[Frontend Pods<br/>NGINX<br/>10.1.1.0/24]
Backend[Backend Pods<br/>.NET API<br/>10.1.2.0/24]
Workers[Worker Pods<br/>Background<br/>10.1.3.0/24]
end
subgraph PrivateEndpoints["Private Endpoints"]
PE_SQL[SQL PE]
PE_Blob[Blob PE]
PE_SBus[Service Bus PE]
end
end
AFD -.->|Private Link| Ingress
subgraph PaaS["Azure PaaS Services"]
SQL[(Azure SQL<br/>Database)]
Blob[(Azure Blob<br/>Storage)]
SBus[Azure Service<br/>Bus]
KV[Azure Key<br/>Vault]
ACR[Azure Container<br/>Registry]
end
subgraph Identity["Identity & Auth"]
B2C[Azure AD B2C<br/>Customers]
EntraID[Microsoft Entra ID<br/>Employees]
end
subgraph Management["Management & Monitoring"]
Monitor[Azure Monitor<br/>+ App Insights]
GitHub[GitHub Actions<br/>CI/CD]
end
Ingress --> Frontend
Frontend --> Backend
Backend --> Workers
Backend --> PE_SQL
Backend --> PE_Blob
Backend --> PE_SBus
Workers --> PE_SQL
Workers --> PE_Blob
Workers --> PE_SBus
PE_SQL -.-> SQL
PE_Blob -.-> Blob
PE_SBus -.-> SBus
Backend -.->|Secrets| KV
Workers -.->|Secrets| KV
Frontend -.->|Auth| B2C
Backend -.->|Auth| EntraID
AKS -.->|Telemetry| Monitor
AKS -.->|Images| ACR
GitHub -.->|Deploy| AKS
Hub -.->|Peering| Spoke
style Edge fill:#0078d4,stroke:#005a9e,stroke-width:2px,color:#fff
style Hub fill:#50e6ff,stroke:#0078d4,stroke-width:2px
style Spoke fill:#50e6ff,stroke:#0078d4,stroke-width:2px
style AKS fill:#326ce5,stroke:#1a4d8c,stroke-width:2px,color:#fff
style PaaS fill:#0078d4,stroke:#005a9e,stroke-width:2px,color:#fff
style Identity fill:#dc3545,stroke:#a71d2a,stroke-width:2px,color:#fff
style Management fill:#107c10,stroke:#004b1c,stroke-width:2px,color:#fff graph TB
subgraph Cluster["AKS Cluster - Phase 1"]
direction TB
subgraph SystemPool["System Node Pool<br/>3x Standard_D4s_v5<br/>Ubuntu 22.04"]
SysNode1[System Node 1<br/>Core DNS, Metrics Server]
SysNode2[System Node 2<br/>Ingress Controller]
SysNode3[System Node 3<br/>Azure Monitor Agent]
end
subgraph FrontendPool["Frontend Node Pool<br/>2-8x Standard_D2s_v5<br/>Autoscaling"]
FENode1[Frontend Pods<br/>NGINX Static]
FENode2[Frontend Pods<br/>NGINX Static]
end
subgraph BackendPool["Backend Node Pool<br/>3-12x Standard_D4s_v5<br/>Autoscaling"]
BENode1[API Pods<br/>.NET 8]
BENode2[API Pods<br/>.NET 8]
BENode3[API Pods<br/>.NET 8]
end
subgraph WorkerPool["Worker Node Pool<br/>2-6x Standard_D4s_v5<br/>Autoscaling"]
WNode1[Document Worker<br/>OCR Processing]
WNode2[Notification Worker<br/>Email/SMS]
end
subgraph Network["Networking"]
CNI[Azure CNI<br/>+ Calico Policies]
LB[Azure Load<br/>Balancer]
end
SysNode2 -.->|Routes| LB
LB --> FENode1
LB --> FENode2
FENode1 --> BENode1
FENode2 --> BENode2
FENode2 --> BENode3
BENode1 -.->|Messages| WNode1
BENode2 -.->|Messages| WNode2
end
subgraph External["External Connectivity"]
PrivateEP[Private Endpoints<br/>SQL, Blob, Service Bus]
AzureServices[Azure Monitor<br/>Azure Key Vault<br/>Azure AD]
end
Cluster -.->|Azure CNI| CNI
Cluster -.->|Private| PrivateEP
Cluster -.->|Management| AzureServices
style SystemPool fill:#326ce5,stroke:#1a4d8c,stroke-width:2px,color:#fff
style FrontendPool fill:#50e6ff,stroke:#0078d4,stroke-width:2px
style BackendPool fill:#0078d4,stroke:#005a9e,stroke-width:2px,color:#fff
style WorkerPool fill:#107c10,stroke:#004b1c,stroke-width:2px,color:#fff
style Network fill:#ffc107,stroke:#f57c00,stroke-width:2px
style External fill:#dc3545,stroke:#a71d2a,stroke-width:2px,color:#fff Azure Kubernetes Service (AKS) Configuration¶
Contoso chose AKS as the container orchestration platform for all application workloads. While alternatives like Azure App Service or Azure Container Apps were considered, AKS provided the portability needed for future hybrid scenarios.
Cluster Specifications¶
Kubernetes Version: 1.28 (automatically upgraded via maintenance windows)
Node Pools:
- System Node Pool: 3 nodes, Standard_D4s_v5 (4 vCPU, 16 GB RAM), Ubuntu 22.04
- Frontend Node Pool: 2-8 nodes, Standard_D2s_v5 (2 vCPU, 8 GB RAM), autoscaling
- Backend Node Pool: 3-12 nodes, Standard_D4s_v5 (4 vCPU, 16 GB RAM), autoscaling
- Worker Node Pool: 2-6 nodes, Standard_D4s_v5 (4 vCPU, 16 GB RAM), autoscaling
Networking: Azure CNI with Calico network policies for pod-level segmentation
Monitoring: Azure Monitor Container Insights with Log Analytics integration
Security: Azure AD workload identity for pod authentication, Azure Policy for governance
Application Deployment Model¶
All application components deploy as Kubernetes Deployments with accompanying Services and Ingresses:
- Frontend Deployment (
web-frontend) - 3 replicas minimum, horizontal pod autoscaler (HPA) targets 70% CPU utilization, maxes at 10 replicas. NGINX serves static React assets. Liveness probe checks HTTP 200 on
/health. Readiness probe checks/ready. - Backend Deployment (
api-backend) - 5 replicas minimum, HPA targets 70% CPU and p95 latency <500ms, maxes at 20 replicas. .NET 8 Web API with Entity Framework Core. Connection pooling to Azure SQL Database. Managed identity for Key Vault and Blob Storage access.
- Worker Deployments (3 separate deployments)
- Each worker type (
document-processor,premium-calculator,notification-sender) runs 2-4 replicas with HPA based on Azure Service Bus queue depth. Managed identities for queue access.
All deployments use rolling update strategy with maxUnavailable: 1 and maxSurge: 1 to ensure zero-downtime deployments.
Azure SQL Database Configuration¶
Contoso uses Azure SQL Database in the Business Critical tier to ensure high availability and performance:
Service Tier: Business Critical (Gen5, 8 vCores)
Storage: 500 GB allocated, auto-growth enabled, max 1 TB
High Availability: Zone-redundant configuration with automatic failover
Backup: Automated backups with 35-day retention, point-in-time restore
Security: Private endpoint access only, TDE (Transparent Data Encryption) enabled, Azure AD authentication
- Connection Model
- API backend and workers connect using managed identities (no passwords). Connection strings retrieved from Azure Key Vault at startup. Connection pooling with max pool size of 100 per replica.
- Performance Profile
- DTU consumption averages 60-70% during business hours, spiking to 85% during end-of-month reporting. Query Performance Insight identifies slow queries for optimization.
Azure Blob Storage Configuration¶
Document storage uses Azure Blob Storage with lifecycle management policies:
Storage Account: contosoclaimsstorage (General Purpose v2)
Redundancy: Geo-redundant storage (GRS) with read access (RA-GRS)
Access Tier Strategy:
- Hot tier: Documents from active claims (last 90 days) — ~200 GB
- Cool tier: Documents from closed claims (90 days - 3 years) — ~2 TB
- Archive tier: Documents from claims older than 3 years — ~6 TB
- Lifecycle Policies
- Automated transitions from Hot → Cool after 90 days, Cool → Archive after 3 years. Documents retained for 10 years per regulatory requirements, then deleted.
- Access Model
- API backend generates SAS (Shared Access Signature) tokens with 15-minute expiration for direct client uploads. Eliminates API bandwidth consumption for document uploads.
Azure Service Bus Configuration¶
Azure Service Bus provides reliable message queuing with at-least-once delivery guarantees:
Namespace: contosoinsurance-servicebus (Standard tier)
Queues:
document-processing: Max delivery count 5, message TTL 24 hours, dead-letter queue enabledpremium-calculation: Max delivery count 3, message TTL 12 hours, sessions enabled for ordered processingnotification-sender: Max delivery count 10, message TTL 1 hour, partitioning enabled for scale
- Access Control
- Managed identities for all consumers and producers. No shared access keys used. Azure RBAC assignments grant
Azure Service Bus Data SenderandAzure Service Bus Data Receiverroles. - Monitoring
- Azure Monitor tracks queue depth, message count, dead-letter count. Alerts trigger when queue depth exceeds thresholds (500 messages for document-processing, 1000 for notifications).
Identity & Access Management¶
Customer Identity — Azure AD B2C¶
Contoso uses Azure AD B2C for customer authentication with custom branding and policies:
- Sign-Up/Sign-In Flow
- Email verification required, password complexity enforced (12+ characters, mixed case, numbers, symbols). Optional social identity providers (Google, Facebook, Microsoft Account).
- MFA Configuration
- SMS-based MFA recommended but not required for customers. Required for sensitive operations (claim submission above €10,000, bank account changes).
- Token Lifetime
- Access tokens valid for 1 hour, refresh tokens valid for 14 days. Silent token renewal via refresh token rotation.
Employee Identity — Microsoft Entra ID¶
Insurance agents and administrators authenticate via corporate Entra ID:
- Conditional Access Policies
- MFA required for all users, block access from unmanaged devices, require compliant devices for privileged access (administrators).
- Privileged Identity Management (PIM)
- Just-in-time access for administrative roles. Database administrators request time-limited access to Azure SQL Database with approval workflow.
Secrets Management — Azure Key Vault¶
All application secrets, connection strings, and certificates are stored in Azure Key Vault:
Key Vault: contoso-insurance-kv (Premium tier with HSM-backed keys)
Stored Secrets:
- Azure SQL Database connection strings
- Azure Service Bus connection strings (backup for managed identity failures)
- SMTP server credentials for email notifications
- SMS API keys for notification worker
- Encryption keys for sensitive data at rest
- Access Model
- AKS pods use Azure workload identity to access secrets. Secrets loaded at pod startup and cached in memory. No secrets in environment variables or ConfigMaps.
Monitoring & Observability¶
Azure Monitor & Application Insights¶
Comprehensive observability powered by Azure's native monitoring stack:
- Application Insights
- Distributed tracing across frontend → API → workers → external services. Custom metrics for business KPIs (claims submitted/hour, average processing time, settlement amounts). Dependency tracking for Azure SQL, Service Bus, and Blob Storage calls.
- Log Analytics Workspace
- Centralized log aggregation from AKS (container logs), Azure SQL (audit logs), Application Insights (application logs). Retention: 90 days for query performance, 1 year in cold storage.
- Alerts & Action Groups
- Alerts configured for critical scenarios — API p95 latency >1s, Service Bus queue depth >1000, Azure SQL DTU >90%, pod crash loops, certificate expiration <30 days. Action groups trigger PagerDuty for on-call engineers.
- Dashboards
- Custom Azure Dashboard showing real-time metrics — claims submitted today, average claim processing time, system health (green/yellow/red), cost trends.
CI/CD Pipeline — GitHub Actions¶
Contoso's deployment pipeline leverages GitHub Actions with GitOps principles:
Pipeline Stages¶
- 1. Build & Test (triggered on pull request)
- Compile .NET projects, run unit tests, run integration tests (Testcontainers for SQL Server), build Docker images, scan images with Trivy for vulnerabilities.
- 2. Publish Images (triggered on merge to main)
- Tag images with git commit SHA and
latest, push to Azure Container Registry, sign images with Notation (CNCF supply chain security). - 3. Deploy to Staging (automatic after publish)
- Deploy to AKS staging namespace using Helm charts, run smoke tests, run end-to-end tests with Playwright.
- 4. Deploy to Production (manual approval)
- Requires approval from two engineers, deploy to AKS production namespace using Helm charts with blue/green deployment strategy, automated rollback on health check failures.
Infrastructure as Code¶
All Azure resources defined in Bicep templates stored in infrastructure/ directory:
infrastructure/
├── main.bicep # Root template orchestrating all resources
├── modules/
│ ├── aks.bicep # AKS cluster configuration
│ ├── sql.bicep # Azure SQL Database
│ ├── storage.bicep # Blob Storage account
│ ├── servicebus.bicep # Service Bus namespace
│ ├── keyvault.bicep # Key Vault
│ └── monitoring.bicep # Azure Monitor, App Insights
└── parameters/
├── dev.bicepparam # Development environment
├── staging.bicepparam # Staging environment
└── production.bicepparam # Production environment
Bicep deployments run via GitHub Actions using Azure CLI (az deployment group create). Deployment state stored in Azure (Resource Manager tracks state).
Cost Profile — Phase 1¶
Monthly Azure costs for the production environment (as of Q1 2023):
| Service | Configuration | Monthly Cost (EUR) |
|---|---|---|
| Azure Kubernetes Service | 4 node pools, avg 20 nodes | €12,000 |
| Azure SQL Database | Business Critical, 8 vCores | €8,500 |
| Azure Blob Storage | 8 TB (hot/cool/archive mix) | €1,200 |
| Azure Service Bus | Standard tier, 3 queues | €800 |
| Azure Front Door | Premium tier, ~50 TB egress | €6,500 |
| Azure Container Registry | Premium tier, geo-replication | €600 |
| Azure Key Vault | Premium tier, HSM operations | €400 |
| Azure Monitor & App Insights | Log Analytics ingestion (~500 GB/mo) | €3,200 |
| Azure AD B2C | 50K MAU (Monthly Active Users) | €200 |
| Azure Firewall | Premium tier for hub VNet | €1,600 |
| Data Egress | Cross-region, internet egress | €8,000 |
| Support Plan | Premier support | €2,000 |
| Total | €45,000/month |
- Cost Drivers
- The largest cost components are compute (AKS nodes), database (Azure SQL Business Critical tier), and data egress (Front Door and cross-region traffic). Data egress alone accounts for 18% of monthly costs.
3-Year TCO Projection
At €45,000/month, the 3-year total cost of ownership for Phase 1 is €1,620,000. This baseline drives the business case for hybrid/on-premises alternatives.
Identifying Continuum Constraints¶
While Phase 1 provides operational simplicity, certain Azure PaaS services create dependencies that must be addressed in later phases:
| Azure Service | Continuum Constraint | Future Replacement |
|---|---|---|
| Azure SQL Database | Managed service, no on-premises equivalent | Arc-enabled SQL MI → SQL Server on VMs |
| Azure Service Bus | Cloud-only messaging | RabbitMQ or NATS on Kubernetes |
| Azure AD B2C / Entra ID | Cloud-dependent identity | AD DS + ADFS for on-premises |
| Azure Key Vault | Cloud-only secrets management | HashiCorp Vault or similar |
| Azure Monitor / App Insights | Cloud-based observability | Prometheus + Grafana + Loki |
| Azure Front Door | Cloud CDN and WAF | NGINX or Traefik ingress + local CDN |
| Azure Container Registry | Cloud registry | Harbor on-premises |
Design for Portability
To enable future hybrid scenarios, Contoso made key architectural decisions in Phase 1:
- Containerize everything — All workloads run in containers, not on VMs or PaaS compute (App Service, Functions)
- Use Kubernetes — AKS provides a portable orchestration layer vs. Azure-specific services like Container Apps
- Abstract cloud services — Database access via Entity Framework Core, queue access via abstraction interfaces, storage access via Azure SDK with interface wrappers
- Avoid Azure-specific code — No Azure Functions bindings, no Azure-specific queue message formats, no Azure Cosmos DB Change Feed
These decisions slightly increase complexity in Phase 1 but dramatically simplify migrations in Phase 2 and Phase 3.
Operational Model — Phase 1¶
Day-to-day operations in Phase 1 leverage Azure's managed services:
- Routine Tasks Automated by Azure
- OS patching on AKS nodes, Kubernetes version upgrades, Azure SQL Database backups, certificate rotation for Azure services, security updates for Azure PaaS components.
- Contoso Operations Team Responsibilities
- Application deployments via CI/CD, monitoring alert response, incident management, cost optimization, capacity planning, disaster recovery testing (quarterly).
- Team Structure (Phase 1)
- 2 DevOps engineers, 1 Database administrator, 1 Security engineer, 1 SRE (Site Reliability Engineer). Team works standard business hours with on-call rotation for P1 incidents.
- Skills Required
- Azure platform knowledge, Kubernetes administration, .NET development (for troubleshooting), SQL Server performance tuning, GitHub Actions, Bicep/Terraform.
Phase 1 Summary — Baseline Established¶
Phase 1 successfully establishes Contoso Insurance Platform on Azure public cloud with:
✅ High availability — 99.95% uptime over 12 months, exceeding 99.9% SLA
✅ Scalability — Handles 10,000 concurrent users with autoscaling, headroom to 20,000
✅ Security — Zero security incidents, passed Q4 penetration test with no critical findings
✅ Operational simplicity — Small team manages enterprise platform with Azure's managed services
❌ Cost concerns — €45,000/month exceeds budget, driving cost optimization initiatives
❌ Regulatory risk — Azure's multi-region architecture creates data residency concerns
❌ Cloud dependency — Complete reliance on Azure SLAs and service availability
These challenges set the stage for Phase 2 — migration to Azure Local with hybrid connectivity.
References¶
- AKS Best Practices
- Azure SQL Database
- Azure Service Bus
- Azure Front Door
- Azure Blob Storage
- Azure AD B2C
- Azure Monitor Container Insights
- GitHub Actions for Azure
🔗 Working Example: Contoso Insurance Sample Application
See the complete Phase 1 cloud-native implementation at ContosoInsurances-NativeToLocal (main branch) — including infra/ Bicep templates for AKS, ACR, SQL, Key Vault, App Gateway, and networking modules, plus k8s/ Kubernetes manifests for deployments, services, and network policies.