Setting up matrix testing for cross-browser frontend builds
Configure a GitHub Actions matrix that runs Playwright tests across Chromium, Firefox, and WebKit in parallel — without cache collisions, port conflicts, or runaway artifact storage.
When to use this pattern
- Your frontend makes heavy use of CSS features or Web APIs that behave differently across browser engines (Grid subgrid, :has(), WebGL, clipboard API).
- You have discovered regressions that only surface in WebKit on macOS, and need a reproducible CI gate before those bugs reach production.
- Your team has adopted environment matrices in GitHub Actions for environment parity and now needs to extend that strategy to browser coverage.
Prerequisites
Complete working example
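The workflow below is self-contained. Drop it into .github/workflows/cross-browser.yml; it assumes a package-lock.json at the repository root and a Playwright config that defines projects named chromium, firefox, and webkit.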
# .github/workflows/cross-browser.yml
name: Cross-browser tests
on:
  push:
    branches: [main]
  pull_request:
  schedule:
    # Full matrix runs nightly at 02:00 UTC
    - cron: '0 2 * * *'
jobs:
  test:
    name: "${{ matrix.browser }} / ${{ matrix.os }}"
    runs-on: ${{ matrix.os }}
    strategy:
      # Do not abort passing browsers when one fails
      fail-fast: false
      # Cap parallel jobs to avoid runner queue starvation
      max-parallel: 10
      matrix:
        browser: [chromium, firefox, webkit]
        os: [ubuntu-latest, macos-latest]
        exclude:
          # WebKit on Linux lacks a stable GPU stack, so run it only on macOS
          - browser: webkit
            os: ubuntu-latest
    # One concurrency group per branch+browser+OS so re-triggered runs cancel the
    # previous in-flight job for that combination, not the whole workflow
    concurrency:
      group: "xbrowser-${{ github.ref }}-${{ matrix.browser }}-${{ matrix.os }}"
      cancel-in-progress: true
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          # npm ci reads package-lock.json; pnpm users swap this for pnpm/action-setup
          cache: 'npm'
      # Isolate the Playwright binary cache per browser+OS to prevent state leakage
      - name: Cache Playwright browsers
        uses: actions/cache@v4
        with:
          # Playwright stores browsers under ~/.cache on Linux and ~/Library/Caches on macOS
          path: |
            ~/.cache/ms-playwright
            ~/Library/Caches/ms-playwright
          # Key on OS, browser, and the lockfile hash so the cache busts on Playwright upgrades
          key: "${{ runner.os }}-pw-${{ matrix.browser }}-${{ hashFiles('package-lock.json') }}"
          restore-keys: |
            ${{ runner.os }}-pw-${{ matrix.browser }}-
      - name: Install dependencies
        run: npm ci
      # Install only the single browser binary this job needs, not all three
      - name: Install Playwright browser
        run: npx playwright install --with-deps ${{ matrix.browser }}
      # Allocate a free port so parallel jobs sharing a host never collide
      - name: Pick free port
        id: port
        run: |
          PORT=$(node -e "
            const net = require('net');
            const s = net.createServer();
            s.listen(0, () => { console.log(s.address().port); s.close(); });
          ")
          echo "port=$PORT" >> "$GITHUB_OUTPUT"
      - name: Run Playwright tests
        env:
          # playwright.config.ts reads PORT to start the dev server on the allocated port
          PORT: ${{ steps.port.outputs.port }}
          # Playwright config reads this to build the dev server URL
          BASE_URL: "http://localhost:${{ steps.port.outputs.port }}"
          # Scope the project filter to the current browser only
          PW_BROWSER: ${{ matrix.browser }}
        run: |
          npx playwright test \
            --project="${{ matrix.browser }}" \
            --reporter=github,html
      # Upload traces, screenshots, and videos only on failure to control storage
      - name: Upload test artifacts
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          # Unique name per permutation prevents cross-job clobbering
          name: "test-results-${{ matrix.browser }}-${{ matrix.os }}"
          path: test-results/
          retention-days: 7
Step-by-step walkthrough
Matrix definition and exclusions
The matrix block declares the combinatorial space: three browsers × two operating systems = six permutations, reduced to five by the exclude rule that drops WebKit on Ubuntu. WebKit on Linux relies on X11 and lacks Apple’s GPU driver stack; the test surface it exercises differs enough from Safari that the signal is misleading rather than useful.
fail-fast: false is mandatory. Without it, a single flaky WebKit timeout aborts all in-flight Chromium and Firefox jobs, wasting runner minutes and hiding their results.
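To make the arithmetic concrete, the following is a minimal sketch of how a two-axis matrix with an exclude list expands into jobs. The helper name expandMatrix and the Combo type are hypothetical and exist only for illustration. The sketch matches exclusions on both keys, which is a simplification of the matching GitHub Actions performs.

```typescript
// Hypothetical illustration: expand a browser x OS matrix and drop excluded pairs.
type Combo = { browser: string; os: string };

function expandMatrix(
  browsers: string[],
  oses: string[],
  exclude: Combo[],
): Combo[] {
  const combos: Combo[] = [];
  for (const browser of browsers) {
    for (const os of oses) {
      // Simplification: an exclusion applies only when both keys match.
      const excluded = exclude.some(
        (e) => e.browser === browser && e.os === os,
      );
      if (!excluded) combos.push({ browser, os });
    }
  }
  return combos;
}

const jobs = expandMatrix(
  ['chromium', 'firefox', 'webkit'],
  ['ubuntu-latest', 'macos-latest'],
  [{ browser: 'webkit', os: 'ubuntu-latest' }],
);

console.log(jobs.length); // 5
console.log(jobs.map((j) => `${j.browser} / ${j.os}`));
```

Running it lists the same five permutations that appear as job names in the workflow run.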
Cache isolation per browser
The cache key includes matrix.browser so each browser job saves and restores its own cache entry for the Playwright browser directory (~/.cache/ms-playwright on Linux, ~/Library/Caches/ms-playwright on macOS). Cache entries are immutable once saved, so with a shared key the first job to finish writes an entry that holds only its own browser; every other job then restores a cache that lacks the binary it needs and downloads it again on each run.
The npx playwright install --with-deps ${{ matrix.browser }} step installs only one binary. On a cold cache this saves roughly 200 MB of download per job compared to npx playwright install --with-deps (which installs all three).
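The sketch below shows the two pieces that cache isolation depends on: the default Playwright browser directory per platform, and a key that differs per browser. The function names playwrightCacheDir and cacheKey, the home directory, and the lockfile hash value are hypothetical; the directory layout follows Playwright's documented defaults for Linux and macOS.

```typescript
// Platforms used by the matrix in this guide (Windows omitted for brevity).
type Platform = 'linux' | 'darwin';

// Default Playwright browser cache location, relative to the home directory.
const CACHE_SUBDIR: Record<Platform, string> = {
  linux: '.cache/ms-playwright',
  darwin: 'Library/Caches/ms-playwright',
};

function playwrightCacheDir(platform: Platform, home: string): string {
  return `${home}/${CACHE_SUBDIR[platform]}`;
}

// Hypothetical key builder that mirrors the workflow's key layout:
// <runner.os>-pw-<browser>-<hash of package-lock.json>
function cacheKey(runnerOs: string, browser: string, lockHash: string): string {
  return `${runnerOs}-pw-${browser}-${lockHash}`;
}

console.log(playwrightCacheDir('darwin', '/Users/runner'));
// /Users/runner/Library/Caches/ms-playwright
console.log(cacheKey('Linux', 'chromium', 'abc123')); // Linux-pw-chromium-abc123
console.log(cacheKey('Linux', 'firefox', 'abc123')); // Linux-pw-firefox-abc123
```

Because the browser name is part of the key, two jobs on the same OS and lockfile never read or write the same cache entry.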
Dynamic port allocation
GitHub-hosted runners give each job its own virtual machine, but on self-hosted runners several matrix jobs can land on the same host and share its ports. If every job there spawns a dev server on port 3000, every job after the first fails with EADDRINUSE. The Node.js snippet in the Pick free port step asks the OS to bind to port 0, which the kernel replaces with an available ephemeral port, then reads it back before releasing the socket. The port is written to $GITHUB_OUTPUT and consumed downstream via steps.port.outputs.port. Because the socket is released before the dev server starts, another process can in principle claim the port in between; the window is small but not zero.
Concurrency groups
The concurrency key is scoped to github.ref + matrix.browser + matrix.os. A new push to the same branch cancels the in-flight job for that exact combination and replaces it with a job on the new commit. The matrix values must be part of the key: at job level, a group built from github.ref alone would be shared by all five matrix jobs of a single run, and with cancel-in-progress: true they would cancel one another.
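As a rough illustration of why the matrix values belong in the group name, the sketch below builds the group string for each permutation and counts the distinct groups. The names JobContext, concurrencyGroup, and refOnlyGroup are hypothetical; refOnlyGroup models a key that leaves the matrix values out.

```typescript
type JobContext = { ref: string; browser: string; os: string };

// Mirrors the workflow's group expression:
// "xbrowser-${{ github.ref }}-${{ matrix.browser }}-${{ matrix.os }}"
function concurrencyGroup(ctx: JobContext): string {
  return `xbrowser-${ctx.ref}-${ctx.browser}-${ctx.os}`;
}

// Hypothetical variant that omits the matrix values from the key.
function refOnlyGroup(ctx: JobContext): string {
  return `xbrowser-${ctx.ref}`;
}

const ref = 'refs/heads/main';
const matrixJobs: JobContext[] = [
  { ref, browser: 'chromium', os: 'ubuntu-latest' },
  { ref, browser: 'firefox', os: 'ubuntu-latest' },
  { ref, browser: 'chromium', os: 'macos-latest' },
  { ref, browser: 'firefox', os: 'macos-latest' },
  { ref, browser: 'webkit', os: 'macos-latest' },
];

const scopedGroups = new Set(matrixJobs.map(concurrencyGroup));
const sharedGroups = new Set(matrixJobs.map(refOnlyGroup));

console.log(scopedGroups.size); // 5: every permutation has its own group
console.log(sharedGroups.size); // 1: all permutations compete for one group
```

Two pushes to the same branch produce identical group names for the same permutation, which is what lets the newer job cancel the older one.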
Conditional artifact upload
if: failure() ensures that passing jobs never write to artifact storage. A full five-permutation run with traces and screenshots enabled can accumulate 1–3 GB per workflow run; gating on failure reduces typical storage consumption by 80–90%. The name field includes both matrix.browser and matrix.os so permutations never overwrite each other’s artifacts in the same workflow run.
Verification
After pushing the workflow, confirm it is wired correctly before trusting the results:
# Show the most recent run of this workflow
gh run list --workflow=cross-browser.yml --limit=1
# Watch all matrix jobs in real time
gh run watch
# Download artifacts from a failing run into a named directory to inspect locally
gh run download <run-id> --name test-results-webkit-macos-latest --dir test-results-webkit-macos-latest
# Open a recorded trace; the folder depends on the failing test, and tracing must be enabled
npx playwright show-trace test-results-webkit-macos-latest/<test-folder>/trace.zip
Expected output from gh run watch when all permutations pass:
✓ chromium / ubuntu-latest 2m 14s
✓ firefox / ubuntu-latest 2m 38s
✓ chromium / macos-latest 3m 01s
✓ firefox / macos-latest 3m 19s
✓ webkit / macos-latest 4m 07s
If a job sits in the “queued” state for more than two minutes, you have likely hit your account's concurrent-job limit (macOS runners have a lower limit than Linux runners) or the max-parallel cap. Lower max-parallel so queued jobs do not starve other workflows, or move WebKit to a separate workflow with its own schedule.
[Figure: matrix execution flow]
Common pitfalls
Shared browser binary cache across jobs
If the cache key omits matrix.browser, all jobs on the same OS compete for one cache entry. Cache entries are immutable, so the first job to save wins and the others restore a browser directory that lacks their binary, which forces a fresh download on every run. Always include ${{ matrix.browser }} in the key; runner.os already separates Ubuntu from macOS.
# Wrong — shared key poisons all jobs
key: "${{ runner.os }}-pw-${{ hashFiles('package-lock.json') }}"
# Correct — isolated per browser
key: "${{ runner.os }}-pw-${{ matrix.browser }}-${{ hashFiles('package-lock.json') }}"
Port collisions on shared runner hosts
GitHub-hosted runners give every job a fresh virtual machine, so this mainly affects self-hosted setups where several runner processes share one host and its network namespace. If your Playwright config hard-codes webServer.port: 3000 there, the second job to start fails with EADDRINUSE. Use the dynamic port step from the complete example, pass the allocated port to the config through the PORT environment variable, and set webServer.reuseExistingServer: false so a job never attaches to another job's server.
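// playwright.config.ts
import { defineConfig } from '@playwright/test';
// PORT comes from the CI port-allocation step; fall back to 3000 for local runs
const port = parseInt(process.env.PORT ?? '3000', 10);
export default defineConfig({
  webServer: {
    // Start the dev server on the allocated port and wait for it there
    command: `npm run dev -- --port ${port}`,
    port,
    // Never attach to a server started by another job on the same host
    reuseExistingServer: false,
  },
});
Artifact storage exhaustion from full matrix runs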
Playwright generates per-test traces, screenshots, and optional video for every run. Across five browser/OS permutations with 200+ tests, uncompressed artifacts can hit 4–8 GB per workflow run. Two guards prevent this: if: failure() on the upload step (keeps passing permutation artifacts off storage entirely) and retention-days: 7 to auto-purge. For WebKit video recordings specifically, add video: 'retain-on-failure' to playwright.config.ts rather than video: 'on'.
Frequently Asked Questions
How do I prevent cache collisions when running multiple browser contexts in parallel?
Scope the cache key to include matrix.browser and the runner OS. Never share a single cache entry for Playwright binaries across jobs: the first job to save it wins, and every other job restores a cache without its browser and downloads the binary again on each run.
What is the recommended max-parallel for cross-browser matrices on GitHub-hosted runners?
Start at 10–15. Monitor queue wait times in the Actions usage report and tune down if jobs sit pending longer than 90 seconds. If you manage pipeline concurrency and queue limits centrally, group concurrency by branch rather than workflow to avoid starving unrelated PRs.
Should I run the full browser matrix on every pull request?
No. Run a minimal subset — Chromium on Ubuntu only — on PRs to keep developer feedback fast. Reserve the full matrix for merges to main and nightly cron schedules. This pattern reduces per-PR runner minutes by ~80% while retaining full browser coverage before release.
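One way to implement this split is to generate the matrix from the triggering event. The sketch below is a minimal version under stated assumptions: the function selectMatrix and the constants are hypothetical, and it presumes a separate setup job that prints the JSON to its outputs so the test job can read it with the fromJSON expression function.

```typescript
type MatrixEntry = { browser: string; os: string };

// Full coverage for pushes to main and the nightly schedule.
const FULL_MATRIX: MatrixEntry[] = [
  { browser: 'chromium', os: 'ubuntu-latest' },
  { browser: 'firefox', os: 'ubuntu-latest' },
  { browser: 'chromium', os: 'macos-latest' },
  { browser: 'firefox', os: 'macos-latest' },
  { browser: 'webkit', os: 'macos-latest' },
];

// Fast feedback for pull requests: Chromium on Ubuntu only.
const PR_MATRIX: MatrixEntry[] = [
  { browser: 'chromium', os: 'ubuntu-latest' },
];

// eventName corresponds to the value of github.event_name in the workflow.
function selectMatrix(eventName: string): { include: MatrixEntry[] } {
  return { include: eventName === 'pull_request' ? PR_MATRIX : FULL_MATRIX };
}

console.log(selectMatrix('pull_request').include.length); // 1
console.log(selectMatrix('schedule').include.length); // 5
console.log(JSON.stringify(selectMatrix('pull_request')));
```

With one job instead of five on pull requests, the job count drops by the roughly 80% quoted above; the saving in billed minutes depends on how the runner types are priced.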
Related
- Managing Environment Matrices in GitHub Actions: runner allocation, dynamic matrix generation, and concurrency patterns that underpin this cross-browser setup.
- CI/CD Pipeline Architecture & Fundamentals: foundational pipeline design principles for frontend and full-stack applications.
- Optimizing Pipeline Concurrency and Queue Limits: tune max-parallel and concurrency groups to prevent queue starvation across matrix jobs.
- Artifact Management Strategies for Frontend Builds: storage policies and retention rules for build outputs and test reports.