Test Parallelization Guide
Overview
The test suite can run in parallel using test sharding: the tests are split across multiple isolated Docker Compose stacks. This significantly shortens test execution while keeping every shard isolated from the others.
Quick Start
# Run with 5 shards (default)
yarn test:sharded
# Run with 3 shards (faster startup, less parallelization)
yarn test:sharded:3
# Run with 10 shards (more parallelization, more resource usage)
yarn test:sharded:10
# Custom shard count
./scripts/test-e2e-sharded.sh 8
How It Works
Architecture
- Multiple Docker Stacks: Each shard runs in its own isolated Docker Compose stack
  - Different port: 3069, 3070, 3071, etc.
  - Different database: matchzy_tournament_shard1, matchzy_tournament_shard2, etc.
  - Different Docker project: matchzy-test-shard-1, matchzy-test-shard-2, etc.
- Playwright Sharding: Uses Playwright's built-in --shard=X/Y feature
  - Splits all tests into Y shards
  - Shard X runs its subset of tests
  - Tests are distributed deterministically
- Parallel Execution: All shards run simultaneously
  - Each shard uses 1 worker (no internal parallelism)
  - N shards running in parallel give up to an Nx speedup (stack startup adds some overhead)
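The per-shard isolation described above can be sketched as a few naming helpers. This is only an illustration: the function names and BASE_PORT are hypothetical, and the values simply follow the examples in this guide, so the real script may derive them differently.

```shell
#!/bin/sh
# Hypothetical helpers: derive a shard's isolated resources from its 1-based index.
# BASE_PORT and the name prefixes are taken from the examples in this guide.
BASE_PORT=3069

shard_port()    { echo $(( BASE_PORT + $1 - 1 )); }
shard_db()      { echo "matchzy_tournament_shard$1"; }
shard_project() { echo "matchzy-test-shard-$1"; }

# Print the resources each of 3 shards would use.
total=3
i=1
while [ "$i" -le "$total" ]; do
  echo "shard $i/$total: port=$(shard_port "$i") db=$(shard_db "$i") project=$(shard_project "$i")"
  i=$((i + 1))
done
```

Because every value is a pure function of the shard index, two shards can never collide on a port, database, or Compose project name.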
Example: 5 Shards
Shard 1: Tests 1-50 → Docker stack on port 3069 → Database: matchzy_tournament_shard1
Shard 2: Tests 51-100 → Docker stack on port 3070 → Database: matchzy_tournament_shard2
Shard 3: Tests 101-150 → Docker stack on port 3071 → Database: matchzy_tournament_shard3
Shard 4: Tests 151-200 → Docker stack on port 3072 → Database: matchzy_tournament_shard4
Shard 5: Tests 201-249 → Docker stack on port 3073 → Database: matchzy_tournament_shard5
All 5 shards run simultaneously in parallel processes.
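Running the shards as parallel processes could look roughly like the sketch below. It is not the actual contents of the sharded script: run_shard is a hypothetical placeholder for "start the shard's stack, then run Playwright with --shard=i/total", and here it only prints a line.

```shell
#!/bin/sh
# Hypothetical placeholder for starting one shard's stack and running its tests.
run_shard() {
  echo "shard $1/$2 finished"
}

# Start every shard as a background process, then wait for each one
# and count how many exited with a non-zero status.
run_all_shards() {
  total=$1
  failed=0
  pids=""
  i=1
  while [ "$i" -le "$total" ]; do
    run_shard "$i" "$total" &
    pids="$pids $!"
    i=$((i + 1))
  done
  for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))
  done
  echo "failed shards: $failed"
  [ "$failed" -eq 0 ]
}

run_all_shards 5
```

Waiting on each process ID separately, rather than issuing a single bare wait, lets the caller see which shards failed and return a non-zero status if any did.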
Performance
Before (Single Worker)
- 249 tests × 3 browsers = 747 test runs
- Sequential execution: ~18-20 minutes
- Resource usage: Low (1 Docker stack)
After (5 Shards)
- 249 tests split into 5 shards × 3 browsers
- Parallel execution: ~4-5 minutes (4x speedup)
- Resource usage: Medium (5 Docker stacks)
After (10 Shards)
- 249 tests split into 10 shards × 3 browsers
- Parallel execution: ~2-3 minutes (roughly 6-10x speedup over the ~18-20 minute sequential run)
- Resource usage: High (10 Docker stacks)
Resource Requirements
Each shard requires:
- ~500MB RAM (PostgreSQL + Application)
- 1 CPU core
- Port number (3069, 3070, etc.)
Recommended shard counts:
- 3 shards: Good balance, ~6-8 minutes
- 5 shards: Default, ~4-5 minutes
- 10 shards: Fast but high resource usage, ~2-3 minutes
System requirements:
- 8GB+ RAM recommended for 5 shards
- 16GB+ RAM recommended for 10 shards
- Ensure Docker has enough resources allocated
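As a rough sanity check, the per-shard figures above can be turned into an upper bound on the shard count for a given Docker allocation. The helper name max_shards is hypothetical, and the calculation only covers the stacks themselves. The RAM recommendations in this guide are considerably higher, presumably to leave room for the browsers and the host system, so treat the result as a ceiling rather than a recommendation.

```shell
#!/bin/sh
# Assumption from this guide: about 500 MB of RAM and 1 CPU core per shard.
PER_SHARD_MB=500

# Hypothetical helper: largest shard count that fits both the memory (in MB)
# and the CPU core budget given to Docker.
max_shards() {
  mem_mb=$1
  cores=$2
  by_mem=$(( mem_mb / PER_SHARD_MB ))
  if [ "$by_mem" -lt "$cores" ]; then
    echo "$by_mem"
  else
    echo "$cores"
  fi
}

max_shards 4096 6   # prints 6 (memory allows 8, cores cap it at 6)
max_shards 2048 8   # prints 4 (memory is the limit)
```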
Usage
Basic Usage
# Default (5 shards)
yarn test:sharded
# With filters (filters apply to all shards)
yarn test:sharded --grep "@api"
yarn test:sharded --project chromium
Advanced Usage
# Custom shard count with arguments
./scripts/test-e2e-sharded.sh 8 --grep "@ui" --project chromium
# Pass through any Playwright arguments
./scripts/test-e2e-sharded.sh 5 --reporter=dot --timeout=60000
Output
Individual Shard Logs
Each shard outputs to: test-output-shard-{N}.log
Individual Reports
Each shard generates: playwright-report-shard-{N}/
Merged Report (if available)
If @playwright/merge-reports is installed:
- Merged report: playwright-report/index.html
- Combines all shard results
To install merge-reports:
Troubleshooting
Port Conflicts
If you see port conflicts:
# Check what's using the ports
lsof -i :3069-3080
# Kill existing test containers
docker ps | grep matchzy-test-shard | awk '{print $1}' | xargs docker kill
Out of Memory
If Docker runs out of memory:
- Reduce shard count: yarn test:sharded:3
- Increase Docker Desktop memory limit
- Close other applications
Shard Startup Failures
If a shard fails to start:
- Check logs: docker compose -f docker/docker-compose.local.yml -p matchzy-test-shard-N logs
- Verify port is available
- Check Docker resources
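The first two checks can be collected into a small helper that prints the commands for a given shard. This is only a convenience sketch: the function name is hypothetical, and the port is assumed to follow the 3069 + N - 1 pattern shown in the examples in this guide.

```shell
#!/bin/sh
# Hypothetical helper: print the diagnostic commands for shard N.
# Assumes shard N listens on port 3069 + N - 1, as in this guide's examples.
shard_debug_commands() {
  n=$1
  port=$(( 3069 + n - 1 ))
  echo "docker compose -f docker/docker-compose.local.yml -p matchzy-test-shard-$n logs"
  echo "lsof -i :$port"
}

shard_debug_commands 3
```

The helper only prints the commands, so they can be reviewed before being run against a live stack.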
Comparison: Sharding vs Workers
Why Not Multiple Workers?
The system uses a single shared database and can't handle concurrent test operations. Multiple workers would cause:
- Database conflicts (duplicate keys, race conditions)
- Test interference (tests modifying shared state)
- Flaky tests
Why Sharding Works
Each shard has:
- Isolated database
- Isolated application instance
- No shared state
- Deterministic test distribution
CI/CD Integration
For CI systems (GitHub Actions, etc.), you can use native sharding:
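# GitHub Actions example
# Assumes the application stack for this job is already running;
# each matrix job runs one Playwright shard via the built-in --shard flag.
strategy:
  matrix:
    shard: [1, 2, 3, 4, 5]
steps:
  - run: npx playwright test --shard=${{ matrix.shard }}/5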
Or use CI's parallel job feature with the sharded script.
Future Improvements
Potential optimizations:
- [ ] Cache Docker images between shards
- [ ] Reuse build artifacts
- [ ] Smart test distribution (group slow tests separately)
- [ ] Automatic shard count based on available resources