Skip to main content

VPC CNI vs Cilium CNI Performance Comparison Benchmark

📅 Created: 2026-02-09 | Updated: 2026-02-14 | ⏱️ Reading time: ~16 min

Overview

This is a benchmark report that quantitatively compares the performance of VPC CNI and Cilium CNI across 5 scenarios in an Amazon EKS 1.31 environment.

Benchmark Conclusion

=
≈ Identical
TCP Throughput
NIC bandwidth saturated (12.5 Gbps), CNI-independent
⚙️
Feature Gap
UDP Packet Loss
Bandwidth Manager availability (iperf3 extreme conditions, unlikely in normal environments)
🔗
-36% RTT
Multi-hop Inference Pipeline
RTT savings compound across Gateway→Router→vLLM→RAG→VectorDB hops, meaningful for real-time HTTP/gRPC inference serving
🔍
Features > Perf
Key Selection Criteria
L7 policies, FQDN filtering, Hubble observability, kube-proxy-less
The performance gap between the two CNIs is negligible in practice — the real differentiator is features like eBPF observability, policies, and architecture. However, cumulative RTT reduction across multi-hop inference pipelines can be meaningful.

5 Scenarios:

  • A VPC CNI Default (Baseline)
  • B Cilium + kube-proxy (Measuring transition impact)
  • C Cilium kube-proxy-less (Effect of removing kube-proxy)
  • D Cilium ENI Mode (Overlay vs Native Routing)
  • E Cilium ENI + Full Tuning (Cumulative optimization effect)

Key Takeaways:

MetricVPC CNI (A)Cilium ENI+Tuning (E)Improvement
TCP Throughput12.41 Gbps12.40 GbpsIdentical (NIC-saturated)
UDP Packet Loss20.39%0.03%Bandwidth Manager
Pod-to-Pod RTT4,894 µs3,135 µs36% lower
HTTP p99 @QPS=100010.92 ms8.75 ms*20% lower
Service Scaling (1000 svc)101× iptables growth, +16%/connO(1) constant performanceO(1) vs O(n)

* HTTP p99 improvements reflect optimized network path and reduced latency

🤖AI/ML Workload Relevance Guide
How this CNI benchmark applies to training and inference on EKS
-20%
HTTP p99 Latency
Inference serving
-36%
Pod-to-Pod RTT
Pipeline hops
O(1)
Service Scaling
Model endpoints
N/A
EFA Training
Bypasses CNI
Relevance by Workload Type
Real-time Inference Serving
High
HTTP/gRPC based · Service scaling directly applies
🔗Inference Pipeline (multi-hop)
High
RTT improvement compounds per hop
📦Batch Inference
Medium
Throughput-oriented · CNI gap is small
🧠Distributed Training (no EFA)
Medium
NCCL TCP/UDP partially affected
🚀Distributed Training (EFA)
Low
Kernel network stack bypassed entirely
💻Single-node Training
None
No network dependency
Note: This benchmark used m6i.xlarge (12.5 Gbps, no GPU). GPU instances (g5, p4d, p5, inf2) have 25–400 Gbps NICs and optional EFA. A dedicated AI/ML benchmark on GPU instances is recommended for production sizing.

Test Environment

🖥️ Test Environment
EKS 1.31 · Single AZ · Median of 3+ runs
EKS Version
1.31 (Platform: eks.51)
Cilium Version
1.16.5 (Helm Chart: cilium-1.16.5)
Node Type
m6i.xlarge (4 vCPU, 16 GB RAM, ENA NIC)
Node Count
3 (single AZ: ap-northeast-2a)
OS / Kernel
Amazon Linux 2023 · 6.1.159-182.297.amzn2023
Container Runtime
containerd 2.1.5
Service CIDR
10.100.0.0/16
Tools
kubectl 1.31+, Cilium CLI 0.16+, Helm 3.16+, Fortio 1.65+, iperf3 3.17+
Measurement
Median of 3+ repeated runs

Cluster configuration: See scripts/benchmarks/cni-benchmark/cluster.yaml Workload deployment: See scripts/benchmarks/cni-benchmark/workloads.yaml


Test Scenarios

The 5 scenarios are designed to measure the independent impact of each variable by combining CNI type, kube-proxy mode, IP allocation method, and tuning options.

🧪 Test Scenarios
5 configurations isolating CNI, kube-proxy, IP allocation, and tuning variables
AVPC CNI BaselineBaseline
CNI: VPC CNIkube-proxy: iptablesIP Allocation: ENI Secondary IPTuning: Default
BCilium + kube-proxyMigration impact
CNI: Ciliumkube-proxy: iptablesIP Allocation: Overlay (VXLAN)Tuning: Default
CCilium kube-proxy-lesskube-proxy removal
CNI: Ciliumkube-proxy: eBPFIP Allocation: Overlay (VXLAN)Tuning: Default
DCilium ENI ModeOverlay vs Native
CNI: Ciliumkube-proxy: eBPFIP Allocation: AWS ENI (native)Tuning: Default
ECilium ENI + Full TuningCumulative tuning
CNI: Ciliumkube-proxy: eBPFIP Allocation: AWS ENI (native)Tuning: All applied

Scenario E Tuning Points

Tuning ItemHelm ValueEffectApplied
BPF Host Routingbpf.hostLegacyRouting=falseHost NS iptables bypassYes
DSRloadBalancer.mode=dsrNodePort/LB direct returnNo
(ENA compat)
Bandwidth ManagerbandwidthManager.enabled=trueEDT rate limitingYes
BPF Masqueradebpf.masquerade=trueiptables MASQUERADE → eBPFYes
Socket-level LBsocketLB.enabled=trueLB at connect()Yes
XDP AccelerationloadBalancer.acceleration=nativeNIC driver processingNo
(ENA bpf_link)
BBRbandwidthManager.bbr=trueGoogle BBR congestionYes
Native RoutingroutingMode=nativeRemove VXLANYes
CT Table Expansionbpf.ctGlobalAnyMax/TcpMaxExpand Connection TrackingYes
Hubble Disabledhubble.enabled=falseRemove observability overheadYes
XDP and DSR Compatibility Constraints

The ENA driver on m6i.xlarge instances does not support XDP bpf_link functionality, making XDP acceleration (native/best-effort) unusable. DSR mode also caused Pod crashes, requiring a fallback to the default SNAT mode. Scenario E applies the remaining 8 tuning options.


Architecture

Packet Path Comparison: VPC CNI vs Cilium

Compares the packet path differences for Pod-to-Service traffic between VPC CNI (kube-proxy) and Cilium (eBPF).

Cilium Architecture Overview

The Cilium Daemon manages BPF programs in the kernel, injecting eBPF programs into each container and network interface (eth0).

Cilium Architecture Source: Cilium Component Overview

Cilium eBPF Packet Path

In Pod-to-Pod communication, eBPF programs are attached to veth pairs (lxc), completely bypassing iptables. The diagram below shows the direct communication path between Endpoints.

Cilium eBPF Endpoint-to-Endpoint Source: Cilium - Life of a Packet

Cilium Native Routing (ENI Mode)

In Native Routing mode, Pod traffic is forwarded directly through the host's routing table without VXLAN encapsulation. In ENI mode, Pod IPs are allocated directly from the VPC CIDR.

Cilium Native Routing Source: Cilium Routing

Cilium ENI IPAM Architecture

The Cilium Operator allocates IPs from ENIs via the EC2 API and provides an IP Pool to each node's Agent through CiliumNode CRDs.

Cilium ENI Architecture Source: Cilium ENI Mode

Data Plane Stack Across 5 Scenarios

Compares the Service LB, CNI Agent, network layer configuration, and key performance metrics for each scenario.

Data Plane Stack Comparison


Test Methodology

Test Workloads

  • httpbin: HTTP echo server (2 replicas)
  • Fortio: HTTP load generator
  • iperf3: Network throughput measurement server/client

Measured Metrics

  1. Network Performance: TCP/UDP Throughput, Pod-to-Pod Latency (p50/p99), Connection Setup Rate
  2. HTTP Performance: Throughput and latency per QPS (Fortio → httpbin)
  3. DNS Performance: Resolution latency (p50/p99), QPS Capacity
  4. Resource Usage: CNI CPU/memory overhead
  5. Tuning Effect: Performance contribution of individual tuning points

Running Benchmarks

Run all scenarios at once:

./scripts/benchmarks/cni-benchmark/run-all-scenarios.sh

Run individual scenario:

./scripts/benchmarks/cni-benchmark/run-benchmark.sh <scenario-name>

See in-script comments for detailed test procedures.


Benchmark Results

Data Collection Complete

The results below were measured on 2026-02-09 in an EKS 1.31 environment (m6i.xlarge, Amazon Linux 2023, single AZ). Each metric is the median of at least 3 repeated measurements.

Network Performance

TCP/UDP Throughput

TCP Throughput (Gbps)

NIC limit: 12.5 Gbps
A: VPC CNI
12.41 Gbps
B: Cilium+kp
12.34 Gbps
C: kp-less
12.34 Gbps
D: ENI
12.41 Gbps
E: ENI+Tuned
12.40 Gbps
All scenarios saturated at NIC bandwidth (~12.4 Gbps). TCP throughput is not a differentiator across CNI configurations.

UDP Throughput (Gbps)

Higher ≠ better · check loss rate
A: VPC CNI
10.00 Gbps
⚠ 20% loss
B: Cilium+kp
7.92 Gbps
C: kp-less
7.92 Gbps
D: ENI
10.00 Gbps
⚠ 20% loss
E: ENI+Tuned
7.96 Gbps
Red bars indicate high throughput with 20%+ packet loss (no Bandwidth Manager). Effective data transfer is lower despite higher raw throughput.

iperf3 · 10s duration · m6i.xlarge (12.5 Gbps baseline) · Median of 3+ runs

UDP Packet Loss Difference Is a Feature Difference, Not a Performance Difference

TCP is saturated at NIC bandwidth (12.5 Gbps) across all scenarios with no differences, which represents actual CNI performance. The UDP packet loss rate differences should be understood in the following context:

  • iperf3 test specificity: iperf3 sends UDP packets at the maximum possible rate, intentionally saturating the network. This is an extreme condition rarely occurring in production workloads.
  • Buffer overflow is the cause: In Scenarios A (VPC CNI) and D (Cilium ENI default), 20% packet loss occurred because the kernel UDP buffer overflowed under high-speed transmission.
  • Bandwidth Manager is a feature: In Scenario E, the loss rate dropped to 0.03% because the Bandwidth Manager (EDT-based rate limiting) throttled the send rate to match the receiver's processing capacity. This is an additional feature of Cilium, not an inherent CNI performance advantage.

Conclusion: In typical production workloads, UDP packet loss differences are unlikely to be noticeable. The Bandwidth Manager feature of Cilium is only meaningful for extreme UDP workloads (e.g., high-volume media streaming).

Pod-to-Pod Latency

Pod-to-Pod RTT (µs)

Lower is better
A: VPC CNI
4,894 µs
B: Cilium+kp
4,955 µs
C: kp-less
5,092 µs
D: ENI
4,453 µs
E: ENI+Tuned
3,135 µs
✓ Lowest
-36% vs Baseline (A→E)
-9% ENI native routing (A→D)
-30% Tuning effect (D→E)

Median of 3+ measurements · Single AZ (ap-northeast-2a) · m6i.xlarge nodes

UDP Packet Loss Rate

UDP Packet Loss (%)

Lower is better
A: VPC CNI
20.39%
B: Cilium+kp
0.94%
C: kp-less
0.69%
D: ENI
20.42%
⚠ High
E: ENI+Tuned
0.03%
✓ Lowest
Bandwidth Manager + BBR enabled (E)
20%+ loss without Bandwidth Manager (A, D)

UDP Throughput (Gbps)

Higher is better · but check loss rate
A: VPC CNI
10.00 Gbps
B: Cilium+kp
7.92 Gbps
C: kp-less
7.92 Gbps
D: ENI
10.00 Gbps
E: ENI+Tuned
7.96 Gbps
⚠ High throughput with high loss (A, D) means lower effective data transfer. Bandwidth Manager + BBR (E) optimizes for reliable delivery.

iperf3 UDP test · 10s duration · Median of 3+ measurements

Detailed Data Table
MetricA: VPC CNIB: Cilium+kpC: kp-lessD: ENIE: ENI+Tuning
TCP Throughput (Gbps)12.4112.3412.3412.4112.40
UDP Throughput (Gbps)10.007.927.9210.007.96
UDP Loss (%)20.390.940.6920.420.03
Pod-to-Pod RTT p50 (us)48944955509244533135
Pod-to-Pod RTT p99 (us)48944955509244533135
TCP Throughput Saturation

The baseline network bandwidth for m6i.xlarge is 12.5 Gbps. TCP throughput reached this limit across all scenarios, showing no difference between CNIs.

HTTP Application Performance

HTTP p99 Latency @QPS=1000 (ms)

Lower is better
A: VPC CNI
10.92
B: Cilium+kp
9.87
C: kp-less
8.91
D: ENI
8.75
Lowest
E: ENI+Tuned
9.89

Maximum Achieved QPS

Higher is better
A: VPC CNI
4,104
B: Cilium+kp
4,045
C: kp-less
4,019
D: ENI
4,026
E: ENI+Tuned
4,182
Highest

Measurements at QPS=1000 · Optimal configurations vary by workload

Detailed Data Table
Target QPSMetricA: VPC CNIB: Cilium+kpC: kp-lessD: ENIE: ENI+Tuning
1,000Actual QPS999.6999.6999.7999.7999.7
1,000p50 (ms)4.394.364.454.294.21
1,000p99 (ms)10.929.878.918.759.89
5,000Actual QPS4071.14012.03986.53992.64053.2
5,000p99 (ms)440.4521.60358.3823.0124.44
MaxActual QPS4103.94044.74019.34026.44181.9
Maxp99 (ms)28.0725.2528.5026.6728.45
Variability at QPS=5000+ Load

Scenarios A and C showed abnormally high p99 values (440ms, 358ms) during QPS=5000 tests. This is suspected to be temporary network congestion, as Max QPS tests (~4000 QPS actual) reverted to normal levels (25-28ms). We recommend using QPS=1000 results as the primary metric for reproducible comparison.

Impact of Service Count Scaling on Performance (Scenario E)

To validate Cilium eBPF's O(1) service lookup performance, we compared performance with 4 vs 104 services in the same Scenario E environment.

4 Services vs 104 Services — eBPF O(1) hash map lookup
4 Services
104 Services
HTTP p99 @QPS=1000
4 Svc
3.94ms
104 Svc
3.64ms
-8% (within noise)
Max Achieved QPS
4 Svc
4,405
104 Svc
4,221
-4.2%
TCP Throughput (Gbps)
4 Svc
12.3
104 Svc
12.4
~identical
DNS Resolution p50
4 Svc
2ms
104 Svc
2ms
identical
Key Insight
eBPF maintains O(1) performance regardless of service count. With iptables (kube-proxy), service lookup degrades O(n) as rules increase — significant at 500+ services.
Detailed Data Table
Metric4 Services104 ServicesDifference
HTTP p99 @QPS=1000 (ms)3.943.64-8% (within measurement error)
Max Achieved QPS4,4054,221-4.2%
TCP Throughput (Gbps)12.312.4~Same
DNS Resolution p50 (ms)22Same
eBPF O(1) Service Lookup Confirmed

In a Cilium eBPF environment, even after increasing services from 4 to 104 (26x increase), all metrics remained the same within measurement error (within 5%). This confirms that eBPF's hash map-based O(1) lookup maintains consistent performance regardless of service count. However, as shown in the kube-proxy scaling test below, iptables overhead is also practically negligible at the 1,000-service level, so this difference is unlikely to affect real-world performance unless service counts scale to several thousand or more.

kube-proxy (iptables) Service Scaling: 4 → 104 → 1,000

To cross-validate the O(1) advantage of eBPF, we measured performance changes while scaling services from 4 → 104 → 1,000 under VPC CNI + kube-proxy (Scenario A).

kube-proxy (iptables) Service Scaling
4 → 104 → 1,000 Services — iptables rule growth vs eBPF O(1)
4 Svc
104 Svc
1,000 Svc
Lower is better (except Max QPS)

iptables Rule Growth

NAT Rules+101×
4 Svc
99
104 Svc
699
1,000 Svc
10,059
kube-proxy Sync+31%
4 Svc
~130ms
104 Svc
~160ms
1,000 Svc
~170ms

keepalive HTTP Performance

HTTP p99 @QPS=1000~noise
4 Svc
5.86ms
104 Svc
5.99ms
1,000 Svc
2.96ms
Max QPS~same
4 Svc
4,197
104 Svc
4,231
1,000 Svc
4,178

Connection:close Overhead

Conn Setup Avg+16% (+26µs)
4 Svc
164 µs
1,000 Svc
190 µs
HTTP p99+5%
4 Svc
8.11ms
1,000 Svc
8.53ms
HTTP Total Avg+5%
4 Svc
4.399ms
1,000 Svc
4.621ms

Cilium eBPF (comparison)

HTTP p99 @QPS=1000-8% (noise)
4 Svc
3.94ms
104 Svc
3.64ms
eBPF Map Lookupconstant
4 Svc
O(1)
104 Svc
O(1)
1,000 Svc
O(1)
Key Insight
At 1,000 services, iptables NAT rules grow 101× (99→10,059) while per-connection setup adds +26µs (+16%). keepalive connections are unaffected due to conntrack caching. Cilium eBPF maintains O(1) hash map lookups regardless of service count.
⚠️ Scaling Threshold
At 5,000+ services, kube-proxy sync exceeds 500ms and chain traversal adds hundreds of µs per connection.

keepalive vs Connection:close Analysis

keepalive mode (reusing existing connections): Even with a 101x increase in iptables rules, there is no impact on HTTP performance. This is because conntrack caches packets from established connections, bypassing the iptables chain traversal.

Connection:close mode (new TCP connection per request): Every SYN packet traverses the KUBE-SERVICES iptables chain to evaluate DNAT rules. At 1,000 services, an overhead of +26us (+16%) per connection was measured.

Why the Connection:close Test Matters

In production environments, workloads that don't use keepalive (legacy services without gRPC, one-shot HTTP requests, TCP-based microservices, etc.) pay the iptables chain traversal cost for every request. The KUBE-SERVICES chain uses probability-based matching (-m statistic --mode random), so the average traversal length is O(n/2), increasing proportionally with service count.

iptables Scaling Characteristics

At the 1,000-service scale, the per-connection overhead is measurable at +26us (+16%), but in absolute terms this is very minimal. This difference is hard to perceive in most production environments. While it theoretically has O(n) characteristics that increase linearly with service count and could have cumulative impact at thousands of services, for typical EKS clusters (hundreds of services), it is difficult to experience a practical performance difference between iptables and eBPF. Cilium eBPF's O(1) lookup is meaningful as future-proofing for large-scale service environments.

Full kube-proxy vs Cilium Comparison Data
Metrickube-proxy 4kube-proxy 104kube-proxy 1000Change (4→1000)Cilium 4Cilium 104Change
HTTP p99 @QPS=10005.86ms5.99ms2.96ms-49%3.94ms3.64ms-8%
HTTP avg @QPS=10002.508ms2.675ms1.374ms-45%---
Max QPS (keepalive)4,1974,2314,178~0%4,4054,221-4.2%
TCP Throughput12.4 Gbps12.4 Gbps--12.3 Gbps12.4 Gbps~0%
iptables NAT Rules9969910,059+101xN/A (eBPF)N/A (eBPF)-
Sync Cycle Time~130ms~160ms~170ms+31%N/AN/A-
Connection Setup Time (Connection:close)164us-190us+16%N/AN/A-
HTTP avg (Connection:close)4.399ms-4.621ms+5%N/AN/A-
HTTP p99 (Connection:close)8.11ms-8.53ms+5%N/AN/A-

DNS Resolution Performance and Resource Usage

🌐 DNS Resolution Performancedig · 100 queries · median
A: VPC CNI
p50:2 ms
p99:4 ms
B: Cilium+kp
p50:2 ms
p99:4 ms
C: kp-less
p50:2 ms
p99:2 ms
D: ENI
p50:2 ms
p99:4 ms
E: ENI+Tuned
p50:2 ms
p99:3 ms
DNS resolution latency is 2–4 ms across all scenarios — CNI choice has negligible impact.
📊 CNI Resource Usage (per node, under load)Cilium agent · during iperf3/Fortio benchmark
A: VPC CNI
CPU: N/M
Memory: N/M
B: Cilium+kp
CPU: 4-6m
Memory: 83Mi
C: kp-less
CPU: 4-6m
Memory: 129Mi
D: ENI
CPU: 5-6m
Memory: 81Mi
E: ENI+Tuned
CPU: 4-5m
Memory: 82Mi
Scenario C (kp-less) uses more memory because it combines VXLAN overlay eBPF maps (tunnel endpoints, encapsulation state) with kube-proxy replacement eBPF maps (Service endpoints). Scenarios D/E also replace kube-proxy, but ENI native routing eliminates overlay maps, resulting in lower memory.

Impact by Tuning Point

Individual Tuning Effects Not Measured

This benchmark measured the cumulative effect of applying all tuning simultaneously in Scenario E. The task of applying each tuning option individually to measure its standalone contribution was not performed. The overall performance improvement in Scenario E (RTT 36%, p99 20%) is the combined result of 8 tuning options.

Tuning applied in Scenario E:

  • Socket-level LB, BPF Host Routing, BPF Masquerade, Bandwidth Manager, BBR, Native Routing, CT Table Expansion, Hubble Disabled
  • XDP Acceleration, DSR (Not applied due to ENA driver compatibility constraints)

ENA Driver XDP Constraints: The ENA driver on m6i.xlarge does not support bpf_link functionality, causing both XDP native and best-effort modes to fail. DSR mode also caused Pod crashes, requiring a fallback to SNAT mode. Retesting is needed when NIC driver updates become available.


Key Conclusion: Performance Difference vs Feature Difference

The most important conclusion from this benchmark is that there is virtually no practical performance difference between VPC CNI and Cilium CNI.

ItemResultInterpretation
TCP ThroughputSame across all scenarios (12.4 Gbps)Saturated at NIC bandwidth, CNI-independent
HTTP p99 @QPS=10008.75~10.92ms (varies by scenario)Within measurement error
UDP Packet LossVPC CNI 20% vs Cilium Tuned 0.03%Bandwidth Manager feature difference (extreme iperf3 conditions)
Service Scalingiptables +26us/connection @1,000Measurable but negligible in practice
Implications for AI/ML Real-Time Inference Workloads

However, in HTTP/gRPC-based real-time inference serving environments, the RTT improvement (4,894→3,135us, ~36%) and HTTP p99 latency reduction (10.92→8.75ms, ~20%) can accumulate to become meaningful. In agentic AI workloads with multi-hop communication patterns where a single request traverses multiple microservices (e.g., Gateway → Router → vLLM → RAG → Vector DB), the latency saved at each hop accumulates, potentially creating a perceptible difference in overall end-to-end response time. This should be considered for real-time inference serving requiring ultra-low latency.

The real differentiators when choosing between the two CNIs are features:

  • L7 Network Policies (HTTP path/method-based filtering)
  • FQDN-based Egress Policies (domain name-based external access control)
  • eBPF-based Observability (real-time network flow visibility through Hubble)
  • Hubble Network Map — Since it collects packet metadata at the kernel level using eBPF, it has extremely low overhead compared to sidecar proxy approaches while providing real-time visualization of inter-service communication flows, dependencies, and policy verdicts (ALLOWED/DENIED). The ability to obtain a network topology map without a separate service mesh is a significant advantage for operational visibility.
  • kube-proxy-less Architecture (reduced operational complexity, future-proofing for large-scale environments)
  • Bandwidth Manager (QoS control for extreme UDP workloads)

If performance optimization is the goal, application tuning, instance type selection, and network topology optimization have a much greater impact than CNI selection. However, in environments where multi-hop inference pipelines or network visibility is important, Cilium's functional advantages can translate into performance improvements.


Analysis and Recommendations

Benchmark Key Results Summary

EKS 1.31 · m6i.xlarge × 3 Nodes · Real-world measurements across 5 scenarios

-36%
RTT Latency Improvement
Scenario E vs A
BW Mgr
UDP Loss Mitigation
Bandwidth Manager + BBR
101×
iptables Rule Growth
99 → 10,059 (1000 svc)
O(1)
eBPF Service Lookup
vs iptables O(n)

Key Findings

1

TCP Throughput Saturated by NIC Bandwidth

All scenarios achieved 12.34-12.41 Gbps, limited by m6i.xlarge's 12.5 Gbps baseline bandwidth. TCP throughput is effectively identical across all configurations.

2

UDP Loss Rate: Key Differentiator

Key Differentiator
ScenarioUDP LossReason
A (VPC CNI)20.39%Native ENI, no eBPF rate limiting
B (Cilium+kp)0.94%eBPF Bandwidth Manager
C (kp-less)0.69%eBPF Bandwidth Manager
D (ENI)20.42%No tuning
E (ENI+Tuning)0.03%Bandwidth Manager + BBR
Insight: eBPF Bandwidth Manager enforces pod-level rate limits without kernel qdisc overhead, preventing UDP packet drops at high throughput.
3

RTT Improvement Through Tuning

36% Improvement
A: VPC CNI
4,894µs
D: ENI
4,453µs
-9%
E: ENI+Tuning
3,135µs
-36%
Key contributors: BPF Host Routing (bypasses iptables), Socket LB (direct connection), BBR congestion control
4

kube-proxy Removal Effect

B vs C
TCP/UDP Throughput
No difference
RTT
+3% worse
4955→5092µs
HTTP p99@1000
-10% better
9.87→8.91ms
DNS p99
-50% better
4ms→2ms
Insight: At small scale (~10 services), kube-proxy removal shows minimal benefit. At 1,000 services, iptables rules grew 101× (99→10,059) with +16% per-connection overhead, while Cilium eBPF maintained O(1) lookup performance regardless of service count.
5

ENI Mode vs Overlay Mode

C vs D
MetricC (VXLAN)D (ENI)Change
TCP Throughput12.34 Gbps12.41 Gbps+0.6%
RTT5,092 µs4,453 µs-12.5%
HTTP p99@10008.91 ms8.75 ms-1.8%
UDP Loss0.69%20.42%Needs tuning
Insight: VXLAN overlay adds ~640µs RTT overhead due to encapsulation. ENI mode provides lower latency but requires UDP tuning.
6

Tuning Cumulative Effect

D → E
MetricD (ENI)E (ENI+Tuning)Change
RTT4,453 µs3,135 µs-30%
UDP Loss20.42%0.03%-99.9%
HTTP QPS@max4,0264,182+3.9%
HTTP p99@10008.75 ms9.89 ms+13%
Most Impactful Tunings:
  1. Bandwidth Manager + BBR — UDP loss 20% → 0.03%
  2. Socket LB — Direct connection
  3. BPF Host Routing — Bypasses iptables
XDP acceleration and DSR unavailable in ENI mode
7

1,000-Service Scaling: iptables O(n) vs eBPF O(1)

Scaling
Metric4 Services1,000 ServicesChange
iptables NAT Rules9910,059+101×
Sync Cycle Time~130ms~170ms+31%
Per-conn Setup (close)164µs190µs+16%
Max QPS (keepalive)4,1974,178~0%
Insight: With keepalive connections, conntrack caching bypasses iptables chain traversal, hiding the O(n) cost. Without keepalive, every SYN packet traverses the full KUBE-SERVICES chain. At 5,000+ services, this overhead becomes critical, adding hundreds of µs per connection.
Workload CharacteristicsRecommendedRationale
Small, Simple (<100 Services)A: VPC CNIMinimal complexity
UDP-heavy (streaming, gaming)E: ENI+Tuning0.03% UDP loss
Network Policies RequiredC or DL3/L4/L7 policies
Large Scale (500+ Services)D: Cilium ENIeBPF O(1) vs iptables +16%/conn @1000svc
Latency Sensitive (Finance, Real-time)E: ENI+Tuning36% RTT improvement
IP ConstraintsC: kp-lessVXLAN Overlay
Multi-tenant, ObservabilityD + HubbleENI + visibility
A: VPC CNI
Dev/Staging
Complexity: Low
Performance: Baseline
D: Cilium ENI
General Production
Complexity: Medium
Performance: High
E: ENI+Tuning
High-Perf/Latency-Sensitive
Complexity: High
Performance: Maximum
C: kp-less
Network Policies/IP Constraints
Complexity: Medium
Performance: Moderate-High
XDP Support Verification

To leverage XDP Acceleration and DSR, verify that the NIC driver of the instance type supports bpf_link functionality. The ENA driver on m6i.xlarge does not currently support it. Re-verification is needed when considering future driver updates or other instance types (C6i, C7i, etc.).


Configuration Notes

Issues and solutions discovered during benchmark environment setup are documented here. Refer to these when introducing Cilium to EKS or reproducing the benchmark.

eksctl Cluster Creation

  • Minimum 2 AZs required: eksctl requires at least 2 AZs in availabilityZones. Even if you want a single-AZ node group, you must specify 2 or more AZs at the cluster level.

    # Cluster level: 2 AZs required
    availabilityZones:
    - ap-northeast-2a
    - ap-northeast-2c
    # Node group level: Single AZ possible
    managedNodeGroups:
    - availabilityZones: [ap-northeast-2a]

Cilium Helm Chart Compatibility

  • tunnel option removed (Cilium 1.15+): --set tunnel=vxlan or --set tunnel=disabled are no longer valid. Use routingMode and tunnelProtocol instead.

    # Previous (Cilium 1.14 and below)
    --set tunnel=vxlan

    # Current (Cilium 1.15+)
    --set routingMode=tunnel --set tunnelProtocol=vxlan

    # Native Routing (ENI mode)
    --set routingMode=native

XDP Acceleration Requirements

XDP (eXpress Data Path) processes packets at the NIC driver level, bypassing the kernel network stack. To use XDP with Cilium, all of the following conditions must be met.

RequirementConditionNotes
Linux Kernel>= 5.10bpf_link support >= 5.7
NIC DriverXDP Native-capableSee compatibility below
Cilium ConfigkubeProxyReplacement=truekube-proxy replacement required
InterfacePhysical NICNo bond/VLAN
DriverXDP NativeMin KernelEnvironmentNotes
mlx5 (Mellanox ConnectX-4/5/6)Full>= 4.9Bare MetalBest, recommended
i40e (Intel XL710)Full>= 4.12Bare MetalStable
ixgbe (Intel 82599)Full>= 4.12Bare Metal10GbE
bnxt_en (Broadcom)Supported>= 4.11Bare Metal-
ena (AWS ENA)⚠️ Limited>= 5.6AWS EC2See AWS below
virtio-net⚠️ Generic only>= 4.10KVM/QEMUNo native
Instance TypeXDP NativeReason
Bare Metal (c5.metal, m6i.metal...)SupportedDirect hardware access
Virtualized (m6i.xlarge, c6i...)❌ UnsupportedENA lacks bpf_link
ENA Express (c6in, m6in...)❌ UnsupportedSRD protocol, unrelated to XDP
Graviton (m7g, c7g...)❌ UnsupportedSame ENA constraint
Tuning ItemEffect
Socket-level LBDirect connection at connect(), no per-packet NAT
BPF Host RoutingComplete host iptables bypass
BPF Masqueradeiptables MASQUERADE → eBPF
Bandwidth Manager + BBREDT rate limiting + BBR
Native Routing (ENI)VXLAN encap removed
36% RTT improvement achieved without XDP or DSR

DSR (Direct Server Return) Compatibility

  • Setting loadBalancer.mode=dsr may cause Cilium Agent Pod crashes
  • mode=snat (default) is recommended in AWS ENA environments
  • DSR is only stable in environments where XDP works correctly (Bare Metal + mlx5/i40e, etc.)
Checking XDP Support
# Check Cilium XDP activation status
kubectl -n kube-system exec ds/cilium -- cilium-dbg status | grep XDP
# "Disabled" indicates XDP is not supported on the instance

# Check NIC driver
ethtool -i eth0 | grep driver

Workload Deployment

  • Fortio container image constraints: The fortio/fortio image does not include sleep, sh, or nslookup binaries. Use Fortio's own server mode for idle waiting instead of sleep infinity.

    command: ["fortio", "server", "-http-port", "8080"]
  • Pod selection for DNS testing: Use images that include sh (e.g., iperf3) with getent hosts for DNS resolution tests. nslookup requires separate installation.

CNI Transition Pod Restarts

  • CPU shortage during Rolling Update: When restarting workloads after VPC CNI → Cilium transition, Rolling Update strategy temporarily doubles the Pod count. CPU shortage may occur on small nodes.

    # Safe restart method: Delete existing Pods and let them recreate
    kubectl delete pods -n bench --all
    kubectl rollout status -n bench deployment --timeout=120s
  • Cilium DaemonSet restart: If the DaemonSet doesn't automatically restart after changing Cilium Helm values, trigger it manually.

    kubectl -n kube-system rollout restart daemonset/cilium
    kubectl -n kube-system rollout status daemonset/cilium --timeout=300s

AWS Authentication

  • SSO token expiration: AWS SSO tokens may expire during long-running benchmark executions. Check the token validity period before execution or refresh with aws sso login.

Reference: VPC CNI vs Cilium Network Policy Comparison

Both VPC CNI and Cilium support network policies on EKS, but there are significant differences in supported scope and capabilities.

FeatureVPC CNI (EKS Network Policy)Cilium
Kubernetes NetworkPolicy APISupportedSupported
L3/L4 FilteringSupportedSupported
L7 Filtering (HTTP/gRPC/Kafka)Not supportedCiliumNetworkPolicy CRD
FQDN-based PoliciesNot supportedtoFQDNs rules
Identity-based MatchingIP-basedCilium Identity (eBPF, O(1))
Cluster-wide PoliciesNamespace-scoped onlyCiliumClusterwideNetworkPolicy
Host-level PoliciesPod traffic onlyHost traffic control
Policy Enforcement VisibilityCloudWatch Logs (limited)Hubble (real-time)
Policy Editor/UINot supportedCilium Network Policy Editor
ImplementationeBPF (AWS agent)eBPF (Cilium agent)
Performance ImpactLowLow

Highlighted rows indicate features where Cilium provides capabilities not available in VPC CNI.

Key Differences

L7 Policies (Cilium only): Filtering is possible at the HTTP request path, method, and header level. For example, you can create a policy that allows GET /api/public but blocks DELETE /api/admin.

# Cilium L7 Policy Example
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
name: allow-get-only
spec:
endpointSelector:
matchLabels:
app: api-server
ingress:
- fromEndpoints:
- matchLabels:
role: frontend
toPorts:
- ports:
- port: "80"
protocol: TCP
rules:
http:
- method: GET
path: "/api/public.*"

FQDN-based Policies (Cilium only): Access to external domains can be controlled by DNS name. Policies automatically update even when IPs change.

# Allow only specific AWS services
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
name: allow-aws-only
spec:
endpointSelector:
matchLabels:
app: backend
egress:
- toFQDNs:
- matchPattern: "*.amazonaws.com"
- matchPattern: "*.eks.amazonaws.com"

Policy Enforcement Visibility: Cilium's Hubble displays policy verdicts (ALLOWED/DENIED) in real-time for all network flows. VPC CNI provides only limited logging via CloudWatch Logs.

Selection Guide
  • Only basic L3/L4 policies needed: VPC CNI's EKS Network Policy is sufficient.
  • L7 filtering, FQDN policies, real-time visibility needed: Cilium is the only option.
  • Multi-tenant environments: Cilium's CiliumClusterwideNetworkPolicy and Host-level policies are powerful.

References