Connected → Disconnected Azure Local¶
Introduction¶
Transitioning from connected Azure Local to a fully disconnected, air-gapped deployment represents the final and most complex stage of cloud exit. Every dependency on Azure cloud services—identity, management, monitoring, container registries, secrets management, and DNS—must be severed and replaced with self-hosted alternatives. This chapter guides you through the systematic process of achieving true independence from cloud infrastructure.
The disconnection journey requires meticulous planning because there's no safety net once connectivity is removed. Unlike the connected-to-Azure Local migration where Azure management remains accessible, disconnected mode demands complete self-sufficiency. Organizations typically pursue this path for stringent regulatory requirements, data sovereignty mandates, defense and intelligence use cases, or operation in environments lacking reliable internet connectivity.
Point of No Return
Once disconnected, re-establishing Azure connectivity requires significant reconfiguration. Ensure all local infrastructure is stable, tested, and operational before severing cloud connections. Plan for a minimum 4-6 week validation period in isolated mode before true disconnection.
The Disconnection Checklist¶
Before disconnecting, every Azure cloud dependency must be identified and replaced. This checklist provides a comprehensive inventory of services to address:
Identity and Access Management¶
| Azure Service | Disconnected Replacement | Migration Complexity | Notes |
|---|---|---|---|
| Azure AD (Entra ID) | Active Directory Domain Services (AD DS) | High | User accounts, groups, authentication infrastructure |
| Azure AD B2C | Custom identity solution (Keycloak, IdentityServer) | Very High | Requires application code changes |
| Managed Identities | Service accounts with certificate-based auth | Medium | Reconfigure all services using managed identities |
| Azure RBAC | Local RBAC (Kubernetes RBAC, AD groups) | Medium | Role assignments must be recreated locally |
Management and Orchestration¶
| Azure Service | Disconnected Replacement | Migration Complexity | Notes |
|---|---|---|---|
| Azure Portal | Local management tools (kubectl, PowerShell, Windows Admin Center) | Medium | Train teams on new tooling |
| Azure Arc | Native Kubernetes management | Low | Disconnect Arc agents |
| Azure Resource Manager | Infrastructure-as-Code (Terraform, Ansible) | Medium | Declarative infrastructure management |
| Azure Policy | Open Policy Agent (OPA) / Gatekeeper | Medium | Policy definitions must be converted |
Monitoring and Observability¶
| Azure Service | Disconnected Replacement | Migration Complexity | Notes |
|---|---|---|---|
| Azure Monitor | Prometheus + Grafana + Loki | Medium | Metrics, dashboards, alerting |
| Application Insights | Jaeger + Zipkin (distributed tracing) | Medium | Application instrumentation changes |
| Log Analytics | Elasticsearch / Loki + Grafana | Medium | Log aggregation and search |
| Alerts | Prometheus Alertmanager | Low | Alert rules must be recreated |
Security and Compliance¶
| Azure Service | Disconnected Replacement | Migration Complexity | Notes |
|---|---|---|---|
| Azure Key Vault | HashiCorp Vault | Medium | Secrets, certificates, keys migration |
| Defender for Cloud | Falco, Trivy, local vulnerability scanning | High | Security posture management |
| Azure Firewall | pfSense, FortiGate, or native OS firewalls | Low | Network security policies |
| DDoS Protection | On-premises DDoS appliances | Low | Typically not needed in disconnected environments |
Container and Application Services¶
| Azure Service | Disconnected Replacement | Migration Complexity | Notes |
|---|---|---|---|
| Azure Container Registry | Harbor | Low | Mirror all images before disconnection |
| Azure Service Bus | RabbitMQ / NATS | Medium | Message queue migration |
| Azure Event Hubs | Apache Kafka | Medium | Event streaming platform |
| Azure Functions | KEDA on Kubernetes / Knative | High | Function runtime migration |
Networking and DNS¶
| Azure Service | Disconnected Replacement | Migration Complexity | Notes |
|---|---|---|---|
| Azure DNS | BIND / Windows DNS Server | Low | Zone transfers and record migration |
| Azure Load Balancer | HAProxy / NGINX / MetalLB (Kubernetes) | Low | Already handled in connected mode |
| Azure VPN Gateway | pfSense, FortiGate, or Linux-based VPN | Low | Site-to-site VPN for branch connectivity only |
Data Services¶
| Azure Service | Disconnected Replacement | Migration Complexity | Notes |
|---|---|---|---|
| Azure SQL Database | SQL Server on VMs/containers | Low | Already migrated in connected phase |
| Cosmos DB | MongoDB / PostgreSQL cluster | Medium | Already handled if migrated previously |
| Azure Storage | MinIO / Ceph | Low | Object storage migration |
Infrastructure Preparation for Disconnected Mode¶
Local Identity Infrastructure Deployment¶
Active Directory Domain Services Setup:
# Deploy AD DS on Windows Server
Install-WindowsFeature -Name AD-Domain-Services -IncludeManagementTools
# Promote server to domain controller
Install-ADDSForest `
-DomainName "corp.azurelocal.local" `
-DomainNetbiosName "CORP" `
-ForestMode "WinThreshold" `
-DomainMode "WinThreshold" `
-InstallDns:$true `
-SafeModeAdministratorPassword (ConvertTo-SecureString "ComplexP@ssw0rd!" -AsPlainText -Force) `
-Force:$true
Synchronize Users from Azure AD Before Disconnection:
# Install Azure AD Connect for final sync
# Configure writeback if needed
# Perform final synchronization run
Start-ADSyncSyncCycle -PolicyType Delta
# Export user list for validation
Get-ADUser -Filter * | Export-Csv -Path "C:\Migration\AD-Users.csv"
DNS Configuration:
# Create forward lookup zone
Add-DnsServerPrimaryZone -Name "azurelocal.local" -ZoneFile "azurelocal.local.dns"
# Add A records for key services
Add-DnsServerResourceRecordA -Name "harbor" -ZoneName "azurelocal.local" -IPv4Address "192.168.10.50"
Add-DnsServerResourceRecordA -Name "vault" -ZoneName "azurelocal.local" -IPv4Address "192.168.10.51"
Add-DnsServerResourceRecordA -Name "prometheus" -ZoneName "azurelocal.local" -IPv4Address "192.168.10.52"
Add-DnsServerResourceRecordA -Name "grafana" -ZoneName "azurelocal.local" -IPv4Address "192.168.10.53"
# Configure forwarders for external resolution (if any external connectivity remains)
Add-DnsServerForwarder -IPAddress "8.8.8.8"
Local Monitoring Stack Deployment¶
Deploy Prometheus for Metrics:
# prometheus-values.yaml for Helm deployment
apiVersion: v1
kind: Namespace
metadata:
name: monitoring
---
# Deploy using Helm
# helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
# helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring -f prometheus-values.yaml
alertmanager:
enabled: true
config:
route:
receiver: 'team-emails'
group_by: ['alertname', 'cluster']
group_wait: 30s
group_interval: 5m
repeat_interval: 12h
receivers:
- name: 'team-emails'
email_configs:
- to: 'ops-team@company.local'
from: 'alertmanager@azurelocal.local'
smarthost: 'smtp.company.local:25'
prometheus:
prometheusSpec:
retention: 30d
storageSpec:
volumeClaimTemplate:
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 100Gi
grafana:
adminPassword: "admin-changeme"
ingress:
enabled: true
hosts:
- grafana.azurelocal.local
persistence:
enabled: true
size: 10Gi
Deploy Loki for Log Aggregation:
helm repo add grafana https://grafana.github.io/helm-charts
helm install loki grafana/loki-stack -n monitoring \
--set loki.persistence.enabled=true \
--set loki.persistence.size=50Gi \
--set promtail.enabled=true
Configure Applications to Send Logs and Metrics:
# Update application deployment to include Prometheus annotations
apiVersion: apps/v1
kind: Deployment
metadata:
name: backend-api
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/metrics"
spec:
template:
spec:
containers:
- name: api
image: harbor.azurelocal.local/app/backend:v2.1
env:
- name: OTEL_EXPORTER_OTLP_ENDPOINT
value: "http://jaeger-collector.monitoring.svc.cluster.local:4317"
Local Container Registry Setup¶
Deploy Harbor Registry:
# Add Harbor Helm repository
helm repo add harbor https://helm.goharbor.io
# Deploy Harbor with persistent storage
helm install harbor harbor/harbor -n harbor --create-namespace \
--set expose.type=loadBalancer \
--set expose.tls.enabled=true \
--set externalURL=https://harbor.azurelocal.local \
--set persistence.enabled=true \
--set persistence.persistentVolumeClaim.registry.size=500Gi \
--set persistence.persistentVolumeClaim.database.size=20Gi \
--set harborAdminPassword="HarborAdmin123!"
Mirror All Container Images from Azure Container Registry:
# Authenticate to both registries
az acr login --name myazureacr
docker login harbor.azurelocal.local -u admin -p HarborAdmin123!
# Create image mirror script
#!/bin/bash
SOURCE_REGISTRY="myazureacr.azurecr.io"
TARGET_REGISTRY="harbor.azurelocal.local"
IMAGES=$(az acr repository list --name myazureacr --output tsv)
for IMAGE in $IMAGES; do
TAGS=$(az acr repository show-tags --name myazureacr --repository $IMAGE --output tsv)
for TAG in $TAGS; do
echo "Mirroring $IMAGE:$TAG"
docker pull $SOURCE_REGISTRY/$IMAGE:$TAG
docker tag $SOURCE_REGISTRY/$IMAGE:$TAG $TARGET_REGISTRY/$IMAGE:$TAG
docker push $TARGET_REGISTRY/$IMAGE:$TAG
done
done
Configure Kubernetes to Use Local Registry:
# Create image pull secret
kubectl create secret docker-registry harbor-registry \
--docker-server=harbor.azurelocal.local \
--docker-username=admin \
--docker-password=HarborAdmin123! \
--docker-email=admin@azurelocal.local \
-n default
# Update deployments to reference local registry
spec:
imagePullSecrets:
- name: harbor-registry
containers:
- name: backend
image: harbor.azurelocal.local/app/backend:v2.1 # Changed from ACR
Local PKI and Certificate Infrastructure¶
Deploy Private Certificate Authority:
# Install AD Certificate Services
Install-WindowsFeature -Name ADCS-Cert-Authority -IncludeManagementTools
# Configure CA
Install-AdcsCertificationAuthority -CAType EnterpriseRootCa `
-CryptoProviderName "RSA#Microsoft Software Key Storage Provider" `
-KeyLength 4096 `
-HashAlgorithmName SHA256 `
-ValidityPeriod Years `
-ValidityPeriodUnits 10
Issue Certificates for Internal Services:
# Install cert-manager in Kubernetes for automated certificate management
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager -n cert-manager --create-namespace \
--set installCRDs=true
# Configure ClusterIssuer for internal CA
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: internal-ca-issuer
spec:
ca:
secretName: internal-ca-key-pair
Generate certificates for services:
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: harbor-tls
namespace: harbor
spec:
secretName: harbor-tls-secret
issuerRef:
name: internal-ca-issuer
kind: ClusterIssuer
dnsNames:
- harbor.azurelocal.local
- "*.harbor.azurelocal.local"
Local Secrets Management¶
Deploy HashiCorp Vault:
helm repo add hashicorp https://helm.releases.hashicorp.com
helm install vault hashicorp/vault -n vault --create-namespace \
--set server.ha.enabled=true \
--set server.ha.replicas=3 \
--set ui.enabled=true \
--set server.dataStorage.size=10Gi
# Initialize Vault
kubectl exec -it vault-0 -n vault -- vault operator init
# Unseal Vault on all replicas
kubectl exec -it vault-0 -n vault -- vault operator unseal <key-1>
kubectl exec -it vault-0 -n vault -- vault operator unseal <key-2>
kubectl exec -it vault-0 -n vault -- vault operator unseal <key-3>
Migrate Secrets from Azure Key Vault:
# Export secrets from Azure Key Vault before disconnection
az keyvault secret list --vault-name my-keyvault --query "[].name" -o tsv > secret-names.txt
# Download each secret
while read SECRET_NAME; do
az keyvault secret show --vault-name my-keyvault --name $SECRET_NAME \
--query "value" -o tsv > "secrets/${SECRET_NAME}.txt"
done < secret-names.txt
# Import into Vault
export VAULT_ADDR='http://vault.azurelocal.local:8200'
export VAULT_TOKEN='<root-token>'
while read SECRET_NAME; do
vault kv put secret/$SECRET_NAME value=@secrets/${SECRET_NAME}.txt
done < secret-names.txt
Configure Applications to Use Vault:
# Vault Agent Injector for Kubernetes
apiVersion: apps/v1
kind: Deployment
metadata:
name: backend-api
spec:
template:
metadata:
annotations:
vault.hashicorp.com/agent-inject: "true"
vault.hashicorp.com/role: "backend-role"
vault.hashicorp.com/agent-inject-secret-db-password: "secret/data/database/password"
spec:
serviceAccountName: backend
containers:
- name: api
image: harbor.azurelocal.local/app/backend:v2.1
Local Backup and Disaster Recovery¶
Deploy Backup Solution (Velero for Kubernetes):
# Install Velero CLI
# Configure backup storage location (NFS, MinIO, or SAN)
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.7.0 \
--bucket velero-backups \
--secret-file ./credentials-velero \
--use-volume-snapshots=true \
--backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://minio.azurelocal.local:9000
# Create backup schedule
velero schedule create daily-backup --schedule="0 2 * * *"
Database Backup Strategy:
-- SQL Server automated backups to local storage
BACKUP DATABASE ProductionDB
TO DISK = '\\nas.azurelocal.local\sqlbackups\ProductionDB_Full.bak'
WITH COMPRESSION, CHECKSUM, STATS = 10;
-- Configure automated backup jobs via SQL Agent
Application Changes Required¶
Remove Azure SDK Dependencies¶
Before (Azure-specific):
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;
using Azure.Storage.Blobs;
// Using Managed Identity
var credential = new DefaultAzureCredential();
var secretClient = new SecretClient(new Uri("https://my-keyvault.vault.azure.net/"), credential);
var secret = await secretClient.GetSecretAsync("DatabasePassword");
After (Disconnected-compatible):
using VaultSharp;
using VaultSharp.V1.AuthMethods.Token;
using VaultSharp.V1.AuthMethods;
// Using Vault with token authentication
var vaultUri = "http://vault.azurelocal.local:8200";
var authMethod = new TokenAuthMethodInfo(Environment.GetEnvironmentVariable("VAULT_TOKEN"));
var vaultClientSettings = new VaultClientSettings(vaultUri, authMethod);
var vaultClient = new VaultClient(vaultClientSettings);
var secret = await vaultClient.V1.Secrets.KeyValue.V2.ReadSecretAsync("database/password");
var password = secret.Data.Data["value"].ToString();
Replace Azure PaaS Client Libraries¶
Service Bus → RabbitMQ:
// Before: Azure Service Bus
using Azure.Messaging.ServiceBus;
ServiceBusClient client = new ServiceBusClient(connectionString);
ServiceBusSender sender = client.CreateSender(queueName);
await sender.SendMessageAsync(new ServiceBusMessage("Hello"));
// After: RabbitMQ with MassTransit
using MassTransit;
var busControl = Bus.Factory.CreateUsingRabbitMq(cfg =>
{
cfg.Host("rabbitmq.azurelocal.local", "/", h =>
{
h.Username("admin");
h.Password(vaultPassword);
});
});
await busControl.Publish(new MessageContract { Text = "Hello" });
Application Insights → OpenTelemetry:
// Before: Application Insights
using Microsoft.ApplicationInsights;
TelemetryClient telemetryClient = new TelemetryClient();
telemetryClient.TrackEvent("OrderPlaced");
// After: OpenTelemetry to Jaeger/Prometheus
using OpenTelemetry;
using OpenTelemetry.Trace;
var tracerProvider = Sdk.CreateTracerProviderBuilder()
.AddSource("OrderService")
.AddJaegerExporter(options =>
{
options.AgentHost = "jaeger.azurelocal.local";
options.AgentPort = 6831;
})
.Build();
using var activity = ActivitySource.StartActivity("OrderPlaced");
Update Configuration for Local Endpoints¶
Configuration File Updates:
// appsettings.json - Before (Azure)
{
"ConnectionStrings": {
"Database": "Server=myserver.database.windows.net;Database=ProductionDB;Authentication=Active Directory Default;"
},
"KeyVault": {
"Url": "https://my-keyvault.vault.azure.net/"
},
"ServiceBus": {
"ConnectionString": "Endpoint=sb://my-servicebus.servicebus.windows.net/..."
},
"Storage": {
"AccountName": "mystorageaccount",
"Endpoint": "https://mystorageaccount.blob.core.windows.net/"
}
}
// appsettings.json - After (Disconnected)
{
"ConnectionStrings": {
"Database": "Server=sql-mi.azurelocal.local,1433;Database=ProductionDB;User Id=sa;Password=FromVault;"
},
"Vault": {
"Url": "http://vault.azurelocal.local:8200",
"Token": "FromEnvironment"
},
"MessageQueue": {
"Host": "rabbitmq.azurelocal.local",
"Username": "admin",
"Password": "FromVault"
},
"Storage": {
"Endpoint": "http://minio.azurelocal.local:9000",
"AccessKey": "FromVault",
"SecretKey": "FromVault"
}
}
Implement Offline-Capable Health Checks¶
Remove Cloud Dependency Checks:
// Before: Health check includes Azure dependencies
public class HealthCheck : IHealthCheck
{
public async Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context)
{
// This will fail in disconnected mode
var azureHealth = await CheckAzureKeyVaultConnection();
return azureHealth ? HealthCheckResult.Healthy() : HealthCheckResult.Unhealthy();
}
}
// After: Health check only validates local dependencies
public class HealthCheck : IHealthCheck
{
public async Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context)
{
var dbHealth = await CheckDatabaseConnection();
var vaultHealth = await CheckVaultConnection();
var queueHealth = await CheckRabbitMQConnection();
return (dbHealth && vaultHealth && queueHealth)
? HealthCheckResult.Healthy()
: HealthCheckResult.Unhealthy();
}
}
Testing in Disconnected Mode¶
Network Isolation Testing¶
Create Isolated Test Environment:
# Simulate network disconnection by blocking Azure endpoints
# Add firewall rules to block *.azure.com, *.microsoft.com
# Linux/pfSense firewall rules
iptables -A OUTPUT -d management.azure.com -j DROP
iptables -A OUTPUT -d login.microsoftonline.com -j DROP
iptables -A OUTPUT -d *.azurecr.io -j DROP
# Validate applications continue to function
kubectl get pods -A # Should show all pods running
curl -k https://app.azurelocal.local/health # Should return 200 OK
Test Scenarios:
- Application startup without Azure connectivity: All services start successfully
- User authentication: Login works using local AD
- Database operations: CRUD operations function normally
- Message queue operations: Messages publish and consume successfully
- Monitoring and alerting: Metrics collected, dashboards display data, alerts fire correctly
- Certificate validation: TLS connections establish with internal CA certificates
- Backup operations: Automated backups execute successfully
Failover and Recovery Testing¶
Simulate Node Failure:
# Stop a Kubernetes node
kubectl drain node-02 --ignore-daemonsets --delete-emptydir-data
# Verify pods reschedule to other nodes
kubectl get pods -o wide
# Bring node back online
kubectl uncordon node-02
Simulate Database Failure:
-- Stop SQL instance
SHUTDOWN;
-- Verify application connection failover to replica
-- Restart SQL instance
-- Verify replication catches up
Operational Procedure Testing¶
Validate Day 2 Operations:
- [ ] Deploy application update using Harbor registry
- [ ] Rotate TLS certificate using internal CA
- [ ] Update secrets in Vault
- [ ] Scale application deployment up/down
- [ ] Restore from backup
- [ ] Review monitoring dashboards and validate metrics
- [ ] Trigger alert and validate notification delivery
- [ ] Add new user to AD and validate application access
- [ ] Execute database maintenance tasks
The Disconnection Event: Execution Plan¶
Pre-Disconnection Preparation (Days 1-14)¶
Week 1: Final Infrastructure Validation - [ ] Validate all local services operational (AD, DNS, PKI, Vault, Harbor, Prometheus, Grafana) - [ ] Perform final image mirror from Azure Container Registry - [ ] Export final secrets from Azure Key Vault - [ ] Backup all Azure resources for rollback capability - [ ] Document network topology and configurations - [ ] Train operations team on local management tools
Week 2: Application Validation - [ ] Deploy applications in isolated test environment - [ ] Run full integration test suite - [ ] Perform load testing - [ ] Validate all health checks pass without Azure connectivity - [ ] Review and update runbooks for disconnected operations - [ ] Brief stakeholders on disconnection timeline
Infrastructure Hardening (Days 15-21)¶
- [ ] Enable high availability for all critical services (AD, DNS, Vault, Harbor)
- [ ] Configure automated backups with verified restore procedures
- [ ] Implement monitoring alerting thresholds
- [ ] Deploy redundant network connections
- [ ] Establish break-glass access procedures
- [ ] Create offline installation media for critical software
Service Migration (Days 22-28)¶
- [ ] Migrate monitoring from Azure Monitor to Prometheus/Grafana
- [ ] Migrate authentication from Azure AD to local AD (phased user migration)
- [ ] Switch secrets management from Key Vault to Vault
- [ ] Update DNS to resolve to local services
- [ ] Configure applications to use local endpoints
- [ ] Test application functionality after each migration step
Validation Testing (Days 29-35)¶
- [ ] Execute full regression test suite
- [ ] Perform user acceptance testing
- [ ] Load test with production-like traffic
- [ ] Chaos engineering tests (kill nodes, services, network)
- [ ] Disaster recovery drill
- [ ] Documentation review and updates
- [ ] Final go/no-go decision
Disconnection Event (Day 36)¶
Execution Timeline:
Hour 0 (00:00): Maintenance window begins
- Notify users of planned maintenance
- Enable read-only mode on Azure resources
Hour 1 (01:00): Final synchronization
- Sync AD users/groups (if incremental sync still active)
- Verify no data replication lag
- Export final monitoring dashboards
Hour 2 (02:00): Disconnect Azure Arc agents
- Remove Arc agents from Kubernetes cluster
- Disable Azure Policy integrations
- Deactivate Azure Monitor extensions
Hour 3 (03:00): Sever network connectivity
- Disconnect ExpressRoute or VPN connection
- Validate no Azure endpoints accessible
- Confirm all applications continue running
Hour 4 (04:00): Validation phase
- Run smoke tests on all applications
- Verify authentication via local AD
- Confirm monitoring dashboards show metrics
- Test alert delivery
Hour 5 (05:00): User validation
- Enable small user group for testing
- Collect feedback and resolve issues
Hour 6 (06:00): Full cutover
- Enable all users
- Monitor for issues
- Stand by for rapid response
Hour 12 (12:00): Maintenance window ends
- Declare disconnection complete
- Enter hypercare period (24/7 monitoring for 72 hours)
Post-Disconnect Validation (Days 37-42)¶
Daily Validation Activities:
- Monitor application performance and error rates
- Review backup execution logs
- Validate certificate expiration tracking
- Check storage capacity and growth trends
- Review security logs for anomalies
- User feedback collection and issue resolution
Go-Live Checklist:
- [ ] All applications accessible and functional
- [ ] User authentication working via local AD
- [ ] Monitoring dashboards displaying accurate data
- [ ] Alerts configured and delivering notifications
- [ ] Backups running successfully
- [ ] No critical errors in logs
- [ ] Operations team comfortable with new tools
- [ ] Stakeholders satisfied with performance
Post-Disconnection Operations and Maintenance¶
Update Management Without Cloud Connectivity¶
Operating System Updates:
- Establish local WSUS (Windows Server Update Services) for Windows updates
- Configure local repository mirrors for Linux (YUM/APT mirrors)
- Schedule maintenance windows for update application
- Test updates in non-production before deploying to production
Application Updates:
# Build container images locally or on connected build server
docker build -t harbor.azurelocal.local/app/backend:v2.2 .
docker push harbor.azurelocal.local/app/backend:v2.2
# Deploy via GitOps (Flux/ArgoCD) pulling from local Harbor
kubectl set image deployment/backend backend=harbor.azurelocal.local/app/backend:v2.2
Kubernetes Cluster Updates:
- Maintain local repository of Kubernetes binaries
- Test cluster upgrades in dev environment first
- Perform rolling upgrades during maintenance windows
- Document upgrade procedures and rollback steps
Monitoring and Incident Response¶
24/7 Operations:
- On-call rotation using local paging system (PagerDuty self-hosted or similar)
- Runbooks for common issues readily accessible
- War room procedures for critical incidents
- Escalation paths clearly documented
Capacity Planning:
- Weekly capacity reviews: CPU, memory, storage, network
- Growth trend analysis and forecasting
- Procurement lead times accounted for (hardware orders can take weeks/months)
- Capacity alerts configured at 70% and 85% thresholds
Break-Glass Procedures¶
Emergency Access:
- Root/Administrator passwords stored in physical safe
- Break-glass accounts not connected to AD for emergency access
- Physical console access available for infrastructure
- Out-of-band management network independent of production
Long-Term Operational Considerations¶
Skill Development:
- Team training on Kubernetes, Prometheus, Grafana, Vault, Harbor
- Cross-training to reduce single points of failure
- Documentation of tribal knowledge
- Regular tabletop exercises and DR drills
Vendor Support:
- Establish support contracts with hardware vendors
- Maintain relationships with software vendors (Red Hat, SUSE, etc.)
- Consider Microsoft support for Windows/SQL Server if applicable
- Plan for vendor site visits for major issues
Hardware Lifecycle:
- Track warranty expiration dates
- Maintain spare parts inventory (drives, RAM, NICs, power supplies)
- Plan hardware refresh cycles (typically 5 years for servers)
- Budget for periodic infrastructure upgrades
graph TB
subgraph Before["🔗 Connected State - Azure Local with Cloud Dependencies"]
direction TB
AKS_C[AKS on Azure Local]
SQL_C[Arc-enabled SQL MI]
Storage_C[Local Storage]
Queue_C[RabbitMQ]
subgraph CloudDeps["☁️ Cloud Dependencies"]
AAD_C[Azure AD<br/>🔴 CLOUD DEP]
KV_C[Azure Key Vault<br/>🔴 CLOUD DEP]
Monitor_C[Azure Monitor<br/>🔴 CLOUD DEP]
ACR_C[Azure Container Registry<br/>🔴 CLOUD DEP]
end
AKS_C -.->|ExpressRoute| AAD_C
AKS_C -.->|ExpressRoute| KV_C
AKS_C -.->|Arc Agent| Monitor_C
AKS_C -.->|Image Pulls| ACR_C
AKS_C --> SQL_C
AKS_C --> Storage_C
AKS_C --> Queue_C
end
Before ==>|TRANSITION| After
subgraph After["🔒 Disconnected State - Fully Self-Contained"]
direction TB
K3s_D[K3s/RKE2 Cluster]
SQL_D[SQL Server on VMs]
Storage_D[MinIO Object Storage]
Queue_D[RabbitMQ]
subgraph LocalServices["🏠 Local Services"]
ADDS_D[AD DS + ADFS<br/>✅ LOCAL]
Vault_D[HashiCorp Vault<br/>✅ LOCAL]
Prom_D[Prometheus + Grafana<br/>✅ LOCAL]
Harbor_D[Harbor Registry<br/>✅ LOCAL]
end
K3s_D --> ADDS_D
K3s_D --> Vault_D
K3s_D --> Prom_D
K3s_D --> Harbor_D
K3s_D --> SQL_D
K3s_D --> Storage_D
K3s_D --> Queue_D
end
AAD_C -.->|Replace with| ADDS_D
KV_C -.->|Replace with| Vault_D
Monitor_C -.->|Replace with| Prom_D
ACR_C -.->|Replace with| Harbor_D
style Before fill:#50e6ff,stroke:#0078d4,stroke-width:3px
style After fill:#107c10,stroke:#004b1c,stroke-width:3px,color:#fff
style CloudDeps fill:#dc3545,stroke:#a71d2a,stroke-width:2px,color:#fff
style LocalServices fill:#107c10,stroke:#004b1c,stroke-width:2px,color:#fff gantt
title Disconnection Execution Timeline (42-Day Plan)
dateFormat YYYY-MM-DD
axisFormat %d
section Pre-Staging
Deploy Harbor registry :p1, 2024-01-01, 7d
Deploy HashiCorp Vault :p2, 2024-01-03, 5d
Deploy AD DS & ADFS :p3, 2024-01-05, 7d
Deploy Prometheus/Grafana :p4, 2024-01-07, 5d
Sync container images :p5, 2024-01-08, 6d
section Infrastructure Hardening
Configure certificate authority :h1, 2024-01-15, 4d
Implement local DNS :h2, 2024-01-16, 3d
Deploy WSUS server :h3, 2024-01-18, 3d
Test backup/restore procedures :h4, 2024-01-19, 2d
section Service Migration
Migrate identity (AAD→ADDS) :crit, m1, 2024-01-22, 4d
Migrate secrets (KV→Vault) :crit, m2, 2024-01-24, 3d
Migrate monitoring :m3, 2024-01-26, 2d
Update app configurations :m4, 2024-01-27, 1d
section Validation Testing
Functional testing :v1, 2024-01-29, 3d
Performance testing :v2, 2024-01-31, 2d
DR failover test :v3, 2024-02-01, 2d
User acceptance testing :v4, 2024-02-03, 2d
section Disconnection Event
Final data sync :crit, d1, 2024-02-05, 4h
Disable ExpressRoute :crit, milestone, d2, 2024-02-05, 0d
Switch DNS to local :crit, d3, 2024-02-05, 2h
section Post-Disconnect
Monitor for issues :pd1, 2024-02-06, 5d
Validate all services :pd2, 2024-02-06, 3d
Performance validation :pd3, 2024-02-08, 2d
Document lessons learned :pd4, 2024-02-10, 2d Conclusion¶
Transitioning to fully disconnected Azure Local is a significant undertaking requiring meticulous planning, comprehensive testing, and skilled operations teams. The reward is complete independence from cloud infrastructure, meeting the most stringent regulatory, security, and operational requirements.
Organizations that successfully complete this journey achieve true sovereignty over their technology stack, but must also accept the responsibility of self-sufficiency—there's no cloud provider to call for support, no automatic updates, and no elastic scalability. Success depends on disciplined operational practices, proactive monitoring, and continuous investment in infrastructure and team skills.
References¶
- Azure Stack Hub Disconnected Deployment
- Active Directory Domain Services
- HashiCorp Vault
- Harbor Registry