Scaling & Performance (Kubernetes)
This page explains how to scale DSX-Connect infrastructure in Kubernetes.
Use Operations → Performance Tuning to measure throughput. Use this page to apply those tuning decisions safely in Kubernetes.
Architecture Scaling Model
DSX-Connect scales horizontally. Throughput is controlled by:
- dsx-connect-scan-request-worker concurrency
- dsx-connect-scan-request-worker replica count
- Connector sharding via DSXCONNECTOR_ASSET
- DSXA capacity
- Infrastructure limits (CPU, memory, network)
```mermaid
flowchart LR
    subgraph Connector Shards
        C1[Connector Shard A]
        C2[Connector Shard B]
        C3[Connector Shard C]
    end
    Queue[(Scan Request Queue)]
    C1 --> Queue
    C2 --> Queue
    C3 --> Queue
    Queue --> Worker[Scan Request Workers]
    Worker --> DSXA
    DSXA --> Results
    Results --> Redis
```
Primary scaling hierarchy:
- scan_request concurrency
- scan_request replicas
- connector sharding (assets)
- Redis capacity
- DSXA capacity
- network bandwidth
Scaling should follow this order and be validated using Job Comparisons.
Scaling scan_request Workers
dsx-connect-scan-request-worker controls how many files are processed concurrently.
Tuning must follow the same order described in Performance Tuning.
Step 1: Increase Concurrency
Increase concurrency first when:
- Worker CPU and memory have headroom.
- DSXA is not saturated.
- Network bandwidth is available.
Example:
```yaml
dsx-connect-scan-request-worker:
  replicaCount: 1
  celery:
    concurrency: 4
```
Increase gradually (for example 2 → 4 → 6) and measure after each change.
Watch for:
- CPU saturation
- Memory pressure
- Growing latency
- Queue backlog growth
Stop increasing concurrency when throughput gains flatten or instability appears.
Step 2: Increase Replicas
Increase replicas when:
- Concurrency gains plateau.
- Per-pod memory pressure increases.
- You need better distribution across nodes.
- You want to enable autoscaling.
Example:
```yaml
dsx-connect-scan-request-worker:
  replicaCount: 4
  celery:
    concurrency: 4
```
Replicas add horizontal capacity without increasing per-pod memory pressure.
Do not increase concurrency and replicas simultaneously during tuning. Adjust one variable at a time and validate using Job Comparisons.
Scaling Connectors
Connectors are typically I/O-bound and scale differently than core workers.
Increasing replicaCount alone does not always improve performance and may cause:
- Duplicate enumeration
- Excessive list operations
- API rate-limit amplification
- Network contention
Connector throughput depends on:
- Number of concurrent scan_request workers
- Connector enumeration speed
- Source system throughput
- Network latency and bandwidth
- Sharding strategy using DSXCONNECTOR_ASSET
Preferred Strategy: Shard with DSXCONNECTOR_ASSET
DSXCONNECTOR_ASSET defines the logical root of a connector's scan scope. It enables partitioning a large repository into smaller shards so multiple connector instances can operate in parallel.
For very large repositories (millions or billions of items), sharding is the primary scaling mechanism.
Examples:
- S3: bucket/A/*, bucket/B/*, or time partitions like bucket/2025-01/*
- Filesystem: /data/shard1, /data/shard2
- SharePoint: split by library or folder scope
Deploy one connector instance per shard.
In Kubernetes, this typically means:
- Multiple Helm releases of the connector, each with a distinct DSXCONNECTOR_ASSET
- Or multiple deployments configured with unique asset values
Simply increasing replica count while all replicas point at the same asset may not improve throughput and can increase API pressure.
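As a sketch, sharding could be expressed as two values files for two Helm releases of the same connector chart. The connector.env key path is an assumption; check your chart's values schema for where connector environment variables are set.

```yaml
# values-shard-a.yaml -- first connector release (key path is an assumption)
connector:
  env:
    DSXCONNECTOR_ASSET: "bucket/A/*"
---
# values-shard-b.yaml -- second connector release, distinct asset root
connector:
  env:
    DSXCONNECTOR_ASSET: "bucket/B/*"
```

Each release enumerates only its own shard, so the two instances scan in parallel without duplicating list operations against the source system.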
Assets vs Filters
- Use Assets for coarse partitioning (sharding).
- Use Filters for fine-grained inclusion/exclusion inside a shard.
- Combine both for large-scale environments.
Filters alone are not sufficient for billion-item repositories because enumeration still occurs inside the asset boundary.
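Combining both might look like the sketch below for a single shard. DSXCONNECTOR_ASSET is the sharding variable described above; the filter key shown here is hypothetical, so substitute your connector's actual filter configuration.

```yaml
connector:
  env:
    # Coarse partition: this instance owns only the 2025-01 time slice
    DSXCONNECTOR_ASSET: "bucket/2025-01/*"
    # Fine-grained exclusion inside the shard (hypothetical key name)
    DSXCONNECTOR_FILTER: "exclude:*.tmp"
```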
Redis Scaling
Redis affects:
- Queue depth
- Backlog handling
- Retry behavior
- Burst tolerance
Increase Redis memory if:
- You observe evictions
- Backlog persists
- Workers are idle due to queue issues
Example:
```yaml
redis:
  resources:
    requests:
      cpu: 500m
      memory: 2Gi
    limits:
      cpu: "1"
      memory: 4Gi
```
Redis should be sized to tolerate bursts without eviction.
DSXA Capacity Planning
DSXA often becomes the true bottleneck.
If increasing concurrency and replicas does not increase throughput:
- Check DSXA CPU usage
- Review DSXA scan bytes/sec
- Verify concurrent scan capacity
If DSXA is saturated, scale DSXA before increasing workers further.
Adding workers beyond DSXA capacity increases queue depth without improving scan rate.
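If DSXA is deployed from the same Helm values, scaling it might look like the following sketch. The dsxa key path and the resource numbers are assumptions; adjust them to your chart and measured scan load.

```yaml
dsxa:
  replicaCount: 2        # add a second DSXA instance for concurrent scans
  resources:
    requests:
      cpu: "2"
      memory: 4Gi
    limits:
      cpu: "4"
      memory: 8Gi
```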
Horizontal Pod Autoscaler (HPA)
CPU-based autoscaling works well for dsx-connect-scan-request-worker.
Example:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: dsx-scan-request
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: dsx-connect-scan-request-worker
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
Guidance:
- Start conservative (min 2, max 5–10).
- Ensure cluster capacity supports maxReplicas.
- Monitor DSXA before raising limits.
- Avoid scaling API unless it is CPU-bound.
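To keep autoscaling conservative, the autoscaling/v2 behavior field can slow scale-down so brief lulls between jobs do not tear workers down prematurely. The window and policy values below are illustrative, not recommendations from this product:

```yaml
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 minutes before scaling down
      policies:
        - type: Pods
          value: 1                      # remove at most one pod per period
          periodSeconds: 60
```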
Node Capacity Planning
Plan cluster capacity before enabling autoscaling.
If each scan_request worker requests:

- 1 vCPU
- 1 Gi memory

and maxReplicas is 10, your cluster must support:

- 10 vCPU
- 10 Gi memory
- plus connectors, Redis, the API, and system overhead
Include ingress, logging, monitoring, and system pods in capacity planning.
For production environments, dedicate node pools for scan workloads when possible.
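One way to pin scan workers to a dedicated node pool is a nodeSelector plus a matching toleration, assuming the chart exposes these standard Kubernetes pod fields. The label and taint names here are examples, not defaults shipped with DSX-Connect:

```yaml
dsx-connect-scan-request-worker:
  nodeSelector:
    workload: dsx-scan          # schedule only onto labeled scan nodes
  tolerations:
    - key: workload
      operator: Equal
      value: dsx-scan
      effect: NoSchedule        # tolerate the node pool's taint
```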
Recommended Scaling Workflow
Follow the same order used in Performance Tuning.
- Establish a baseline using Job Comparisons.
- Increase dsx-connect-scan-request-worker.celery.concurrency.
- When gains flatten, increase replicaCount.
- Shard large connectors using DSXCONNECTOR_ASSET.
- Increase Redis memory if backlog appears.
- Scale DSXA if scan time dominates.
- Validate network bandwidth and latency.
After each change:
- Re-run the same workload.
- Compare results.
- Modify one variable at a time.
The goal is sustained, predictable throughput without saturating your infrastructure.