Cross-Cluster Object Replication (HA) Architecture Guide
Written: 2026-03-24 | Updated: 2026-03-24 | Reading time: ~12 min
Reference Environment: EKS 1.32+, ArgoCD 2.13+, Flux v2.4+, Velero 1.15+
1. Overview
Relying on a single EKS cluster in production means a cluster failure brings down the entire service. Cross-Cluster Object Replication is a strategy that ensures high availability by consistently replicating Kubernetes objects (ConfigMaps, Secrets, RBAC, CRDs, NetworkPolicies, etc.) across multiple clusters.
Current State
EKS does not provide managed Cross-Cluster Object Replication. Therefore, you must implement it yourself by combining open-source tools and architecture patterns. This guide compares the pros and cons of each pattern and provides selection criteria based on workload types.
Scope of This Guide
| Included | Not Included |
|---|---|
| K8s object replication (ConfigMap, Secret, CRD, RBAC, etc.) | Application data replication (DB replicas) |
| GitOps-based declarative synchronization | Service mesh-based traffic routing |
| Stateful object backup/restore (Velero) | Storage layer replication (EBS, EFS) |
| DNS failover strategies | Application-level HA patterns |
2. Multi-Cluster Architecture Pattern Comparison
There are three core patterns for implementing Cross-Cluster Object Replication.
Pattern 1: API Proxy (Push Model)
A central routing layer directly proxies CRUD requests to each cluster's API Server.
- How it works: Direct API calls from a central point to each cluster
- Pros: Lightweight and intuitive
- Cons: The central layer must hold credentials for every cluster (a single high-value attack target), there is no multi-cluster Watch support, and connection complexity grows with each added cluster
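The push model above can be sketched as a minimal in-memory simulation. `FakeCluster`, `push_to_all`, the token values, and the object key are all hypothetical names invented for illustration. No real Kubernetes client is used. The sketch only shows why the central layer needs a credential per cluster and what happens when one target is unreachable.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCluster:
    """In-memory stand-in for a cluster API server (hypothetical, not a real client)."""
    name: str
    token: str
    reachable: bool = True
    objects: dict = field(default_factory=dict)

    def apply(self, token: str, key: str, manifest: dict) -> None:
        if not self.reachable:
            raise ConnectionError(f"{self.name}: API server unreachable")
        if token != self.token:
            raise PermissionError(f"{self.name}: invalid credential")
        self.objects[key] = manifest


def push_to_all(clusters, credentials, key, manifest):
    """Push one object to every cluster; return {cluster_name: error message or None}."""
    results = {}
    for cluster in clusters:
        try:
            # The central layer needs a valid credential for every target cluster.
            cluster.apply(credentials[cluster.name], key, manifest)
            results[cluster.name] = None
        except (ConnectionError, PermissionError) as exc:
            # A failed target stays out of date until the push is retried.
            results[cluster.name] = str(exc)
    return results


clusters = [
    FakeCluster("primary", "tok-a"),
    FakeCluster("standby", "tok-b", reachable=False),
]
credentials = {"primary": "tok-a", "standby": "tok-b"}
manifest = {"kind": "ConfigMap", "data": {"LOG_LEVEL": "info"}}
results = push_to_all(clusters, credentials, "default/app-config", manifest)
```

In this sketch the unreachable standby cluster simply misses the update. Nothing on the standby side notices the gap, which is the main operational weakness of a pure push design.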
Pattern 2: Multi-cluster Controller (Kubefed-style)
A central controller monitors each cluster's state via Informer-based List-Watch and synchronizes through CRDs.
- How it works: Central controller monitors and synchronizes each cluster's state
- Pros: Dynamic cluster discovery, federation policies
- Cons: Watch event overflow beyond roughly 10 clusters, Informer cache size limits, and the risk of member-cluster credentials being stored in plaintext
KubeFed (v2) is no longer actively developed by Kubernetes SIG Multicluster, and its upstream repository has been archived. It is not recommended for new projects.
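As a rough illustration of the central-controller pattern, the toy model below keeps one cache per member cluster and updates it from watch-style events. `CentralController`, its methods, and the event tuples are assumptions made for this sketch. Real informers come from a Kubernetes client library and behave differently in detail.

```python
from collections import defaultdict


class CentralController:
    """Toy model of a hub controller that caches every member cluster's objects."""

    def __init__(self, desired):
        self.desired = desired            # key -> manifest that should exist everywhere
        self.cache = defaultdict(dict)    # cluster -> key -> last observed manifest

    def on_event(self, cluster, event_type, key, manifest=None):
        """Apply one watch event (ADDED, MODIFIED, DELETED) to the central cache."""
        if event_type == "DELETED":
            self.cache[cluster].pop(key, None)
        else:
            self.cache[cluster][key] = manifest

    def drift(self):
        """Return sorted (cluster, key) pairs whose cached state differs from desired."""
        out = []
        for cluster, objects in self.cache.items():
            for key, manifest in self.desired.items():
                if objects.get(key) != manifest:
                    out.append((cluster, key))
        return sorted(out)

    def cache_size(self):
        """Total cached objects across all clusters."""
        return sum(len(objects) for objects in self.cache.values())


desired = {"default/app-config": {"data": {"LOG_LEVEL": "info"}}}
controller = CentralController(desired)
controller.on_event("primary", "ADDED", "default/app-config", {"data": {"LOG_LEVEL": "info"}})
controller.on_event("standby", "ADDED", "default/app-config", {"data": {"LOG_LEVEL": "debug"}})
controller.on_event("standby", "ADDED", "default/other", {"data": {}})
drifted = controller.drift()
```

Because every event from every cluster lands in one process, the cache and the event volume both grow with the number of clusters multiplied by the number of objects, which is where the scaling limits listed above come from.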
Pattern 3: Agent-based Pull Model (Recommended)
Agents in each cluster pull the desired state from a central source (Git or hub cluster) and reconcile locally. This follows the same principle as kubelet receiving Pod specs and running them locally.
- How it works: Each cluster agent independently pulls the desired state and reconciles locally
- Pros: High scalability, eventual consistency, local operation continues even during central failures
- Cons: Requires agent deployment on all clusters
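The pull-and-reconcile loop described above can be sketched as follows. `PullAgent`, `fetch`, `state`, and the object keys are hypothetical names for illustration. A real agent such as one provided by ArgoCD or Flux pulls from Git or a hub cluster and applies through the Kubernetes API, neither of which is modeled here.

```python
import copy


class PullAgent:
    """Toy per-cluster agent that pulls desired state and reconciles locally."""

    def __init__(self, cluster_name, fetch_desired):
        self.cluster_name = cluster_name
        self.fetch_desired = fetch_desired  # callable returning {key: manifest}
        self.local = {}                     # objects currently applied in this cluster
        self.last_error = None

    def reconcile_once(self):
        """Pull desired state and converge local state; return the actions taken."""
        try:
            desired = self.fetch_desired()
        except ConnectionError as exc:
            # Central source is down: keep running with the last applied state.
            self.last_error = str(exc)
            return []
        self.last_error = None
        actions = []
        for key, manifest in sorted(desired.items()):
            if self.local.get(key) != manifest:
                actions.append(("apply", key))
                self.local[key] = manifest
        for key in sorted(set(self.local) - set(desired)):
            actions.append(("prune", key))
            del self.local[key]
        return actions


# Hypothetical central source, standing in for a Git repository or hub cluster.
source = {"default/app-config": {"data": {"LOG_LEVEL": "info"}}}
state = {"up": True}


def fetch():
    if not state["up"]:
        raise ConnectionError("central source unreachable")
    return copy.deepcopy(source)


agent = PullAgent("standby", fetch)
first = agent.reconcile_once()   # applies the object that is missing locally
second = agent.reconcile_once()  # nothing to do once local state matches
```

Each agent holds only its own cluster's state and only needs outbound access to the source, so adding a cluster adds no load or credentials to a central component. When the source is unreachable, the agent in this sketch records the error and leaves the local objects untouched.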