Setting up matrix testing for cross-browser frontend builds

Configure a GitHub Actions matrix that runs Playwright tests across Chromium, Firefox, and WebKit in parallel — without cache collisions, port conflicts, or runaway artifact storage.

When to use this pattern

Your frontend makes heavy use of CSS features or Web APIs that behave differently across browser engines (Grid subgrid, :has(), WebGL, Clipboard API).
You have discovered regressions that only surface in WebKit on macOS, and need a reproducible CI gate before those bugs reach production.
Your team has adopted environment matrices in GitHub Actions for environment parity and now needs to extend that strategy to browser coverage.

Prerequisites

Node.js 20+ and npm or pnpm available in your runner image
@playwright/test ≥ 1.42.0 installed as a dev dependency
A playwright.config.ts at the repository root that reads process.env.BASE_URL for the dev server origin
Repository write permissions to upload workflow artifacts
Familiarity with the CI/CD pipeline architecture fundamentals — specifically job isolation and dependency graphs

Complete working example

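The workflow below is self-contained. Drop it into .github/workflows/cross-browser.yml and it runs on the next push to main, pull request, or nightly schedule, provided the prerequisites above are met and playwright.config.ts defines projects named chromium, firefox, and webkit (the --project flag selects them by name).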

# .github/workflows/cross-browser.yml
name: Cross-browser tests

on:
  push:
    branches: [main]
  pull_request:
  schedule:
    # Full matrix runs nightly at 02:00 UTC
    - cron: '0 2 * * *'

jobs:
  test:
    name: "${{ matrix.browser }} / ${{ matrix.os }}"
    runs-on: ${{ matrix.os }}

    strategy:
      # Do not abort passing browsers when one fails
      fail-fast: false
      # Cap parallel jobs to avoid runner queue starvation
      max-parallel: 10
      matrix:
        browser: [chromium, firefox, webkit]
        os: [ubuntu-latest, macos-latest]
        exclude:
          # WebKit on Linux lacks a stable GPU stack — run it only on macOS
          - browser: webkit
            os: ubuntu-latest

    # One concurrency group per branch+browser so re-triggered runs cancel the
    # previous in-flight job for that combination, not the whole workflow
    concurrency:
      group: "xbrowser-${{ github.ref }}-${{ matrix.browser }}-${{ matrix.os }}"
      cancel-in-progress: true

    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          # npm ci reads package-lock.json; pnpm users swap this for pnpm/action-setup
          cache: 'npm'

      # Isolate the Playwright binary cache per browser+OS to prevent state leakage
      - name: Cache Playwright browsers
        uses: actions/cache@v4
        with:
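          # Playwright stores browsers under ~/.cache on Linux and ~/Library/Caches on macOS
          path: |
            ~/.cache/ms-playwright
            ~/Library/Caches/ms-playwright
          # Key on OS + browser + lockfile hash so the cache busts whenever dependencies change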
          key: "${{ runner.os }}-pw-${{ matrix.browser }}-${{ hashFiles('package-lock.json') }}"
          restore-keys: |
            ${{ runner.os }}-pw-${{ matrix.browser }}-

      - name: Install dependencies
        run: npm ci

      # Install only the single browser binary this job needs, not all three
      - name: Install Playwright browser
        run: npx playwright install --with-deps ${{ matrix.browser }}

      # Allocate a free port so the dev server never collides with another process
      # on the runner (matters most on self-hosted machines that run several jobs)
      - name: Pick free port
        id: port
        run: |
          PORT=$(node -e "
            const net = require('net');
            const s = net.createServer();
            s.listen(0, () => { console.log(s.address().port); s.close(); });
          ")
          echo "port=$PORT" >> "$GITHUB_OUTPUT"

      - name: Run Playwright tests
        env:
          # Playwright config reads this to build the dev server URL
          BASE_URL: "http://localhost:${{ steps.port.outputs.port }}"
          # Scope the project filter to the current browser only
          PW_BROWSER: ${{ matrix.browser }}
        run: |
          npx playwright test \
            --project="${{ matrix.browser }}" \
            --reporter=github,html

      # Upload traces, screenshots, and videos only on failure to control storage
      - name: Upload test artifacts
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          # Unique name per permutation prevents cross-job clobbering
          name: "test-results-${{ matrix.browser }}-${{ matrix.os }}"
          path: test-results/
          retention-days: 7

Step-by-step walkthrough

Matrix definition and exclusions

The matrix block declares the combinatorial space: three browsers × two operating systems = six permutations, reduced to five by the exclude rule that drops WebKit on Ubuntu. WebKit on Linux relies on X11 and lacks Apple’s GPU driver stack; the test surface it exercises differs enough from Safari that the signal is misleading rather than useful.
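
The expansion described above can be sketched in a few lines of TypeScript. This is an illustration of the arithmetic only, not how GitHub Actions implements it: the expandMatrix helper is a hypothetical name, and it assumes each exclude rule names both axes, whereas GitHub also accepts partial rules.

```typescript
type Combo = { browser: string; os: string };

// Build every browser/OS pair, then drop pairs that match an exclude rule.
// Assumption: each exclude rule specifies both browser and os.
function expandMatrix(browsers: string[], oses: string[], exclude: Combo[]): Combo[] {
  const all: Combo[] = [];
  for (const browser of browsers) {
    for (const os of oses) {
      all.push({ browser, os });
    }
  }
  return all.filter(
    (c) => !exclude.some((e) => e.browser === c.browser && e.os === c.os),
  );
}

const expanded = expandMatrix(
  ["chromium", "firefox", "webkit"],
  ["ubuntu-latest", "macos-latest"],
  [{ browser: "webkit", os: "ubuntu-latest" }],
);

console.log(expanded.length); // 5
```

Six pairs are generated and one is filtered out, which matches the five jobs the workflow schedules.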

fail-fast: false is mandatory. Without it, a single flaky WebKit timeout aborts all in-flight Chromium and Firefox jobs, wasting runner minutes and hiding their results.

Cache isolation per browser

The cache key includes matrix.browser so Chromium, Firefox, and WebKit each get their own cache entry for the Playwright browser directory (~/.cache/ms-playwright on Linux). Each job installs a single browser, and a cache entry cannot be overwritten once it is saved. With a shared key, the first job to finish would store only its own browser, and every other job would restore a cache that lacks the binary it needs and download it again on each run.

The npx playwright install --with-deps ${{ matrix.browser }} step installs only one binary. On a cold cache this saves roughly 200 MB of download per job compared to npx playwright install --with-deps (which installs all three).
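
To see why the browser segment matters, the sketch below counts the distinct cache entries that two key schemes would produce for the five jobs. The lockHash value is a made-up stand-in for hashFiles('package-lock.json'), and the os values assume what runner.os reports (Linux, macOS).

```typescript
type Job = { os: string; browser: string };

// The five jobs left after the exclude rule.
const cacheJobs: Job[] = [
  { os: "Linux", browser: "chromium" },
  { os: "Linux", browser: "firefox" },
  { os: "macOS", browser: "chromium" },
  { os: "macOS", browser: "firefox" },
  { os: "macOS", browser: "webkit" },
];

// Hypothetical placeholder for the lockfile hash.
const lockHash = "abc123";

const sharedKey = (j: Job): string => `${j.os}-pw-${lockHash}`;
const isolatedKey = (j: Job): string => `${j.os}-pw-${j.browser}-${lockHash}`;

// Return the distinct keys a scheme produces across all jobs.
function distinctKeys(keyOf: (j: Job) => string): string[] {
  const keys = cacheJobs.map(keyOf);
  return keys.filter((k, i) => keys.indexOf(k) === i);
}

console.log(distinctKeys(sharedKey).length);   // 2: several browsers compete per OS
console.log(distinctKeys(isolatedKey).length); // 5: one entry per job
```

With the shared scheme, three macOS jobs and two Linux jobs each map to a single entry, so only one browser per OS can ever be cached.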

Dynamic port allocation

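On GitHub-hosted runners each job gets its own fresh virtual machine, so two hosted jobs never compete for a port. Collisions do happen on self-hosted runners that run several jobs on one machine, and on any runner where another process already holds the port. If every job spawns a dev server on port 3000 in that situation, every job after the first fails with EADDRINUSE. The Node.js snippet in the Pick free port step asks the OS to bind to port 0, which the kernel fills with an available ephemeral port, then reads it back before releasing the socket. The port is written to $GITHUB_OUTPUT and consumed downstream via steps.port.outputs.port. Because the socket is released before the dev server starts, another process could in principle claim the port in between; the window is small but not zero.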

Concurrency groups

The concurrency key is scoped to github.ref + matrix.browser + matrix.os. When a new push to the same ref starts a job, it cancels only the in-flight job for that exact combination rather than the entire previous workflow run. Jobs from the earlier run that have already finished keep their results, and a slow WebKit job keeps running until its replacement actually starts. Because the workflow also triggers on pushes to main, the same rule cancels in-flight jobs for an older commit on main when a newer one lands.
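
The grouping rule can be modelled as a small function. The sketch below is a simplified model of cancel-in-progress, not GitHub's scheduler: the cancelledBy helper, the branch names, and the sha values are all illustrative assumptions.

```typescript
type RunningJob = { ref: string; browser: string; os: string; sha: string };

// Mirrors the group expression used in the workflow.
function groupOf(j: RunningJob): string {
  return `xbrowser-${j.ref}-${j.browser}-${j.os}`;
}

// In-flight jobs that a newly started job would cancel: those in the same group.
function cancelledBy(incoming: RunningJob, inFlight: RunningJob[]): RunningJob[] {
  return inFlight.filter((j) => groupOf(j) === groupOf(incoming));
}

const inFlight: RunningJob[] = [
  { ref: "refs/heads/feature", browser: "webkit", os: "macos-latest", sha: "old" },
  { ref: "refs/heads/feature", browser: "firefox", os: "macos-latest", sha: "old" },
  { ref: "refs/heads/main", browser: "webkit", os: "macos-latest", sha: "old" },
];

const incoming: RunningJob = {
  ref: "refs/heads/feature",
  browser: "webkit",
  os: "macos-latest",
  sha: "new",
};

console.log(cancelledBy(incoming, inFlight).length); // 1
```

Only the WebKit job on the same branch shares a group with the incoming job; the Firefox job and the job on main are left alone.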

Conditional artifact upload

if: failure() ensures that passing jobs never write to artifact storage. A full five-permutation run with traces and screenshots enabled can accumulate 1–3 GB per workflow run; gating on failure reduces typical storage consumption by 80–90%. The name field includes both matrix.browser and matrix.os so permutations never overwrite each other’s artifacts in the same workflow run.

Verification

After pushing the workflow, confirm it is wiring correctly before trusting the results:

# Show the most recent run of this workflow
gh run list --workflow=cross-browser.yml --limit=1

# Watch all matrix jobs in real time
gh run watch

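# Download artifacts from a failing run to inspect locally
gh run download <run-id> --name test-results-webkit-macos-latest --dir test-results-webkit-macos-latest
# Open a trace from the downloaded folder (the workflow uploads test-results/, not the HTML report);
# replace <test-folder> with the directory of the failing test
npx playwright show-trace test-results-webkit-macos-latest/<test-folder>/trace.zip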

Expected output from gh run watch when all permutations pass:

✓  chromium / ubuntu-latest   2m 14s
✓  firefox / ubuntu-latest    2m 38s
✓  chromium / macos-latest    3m 01s
✓  firefox / macos-latest     3m 19s
✓  webkit / macos-latest      4m 07s

If a job appears stuck in the “queued” state for more than two minutes, you have likely hit the max-parallel cap or the concurrent-job limit of your GitHub plan, which is lower for macOS runners than for Linux. Lower max-parallel or move WebKit to a separate workflow with its own schedule.

[Figure: matrix execution flow]

Common pitfalls

Shared browser binary cache across jobs

If the cache key omits matrix.browser, every job on the same OS competes for one cache entry. Cache entries are immutable, so the first job to finish saves its single browser and the remaining jobs restore a cache that does not contain the binary they need, then download it again on every run. Always include ${{ matrix.browser }} in the key, alongside an OS component such as ${{ runner.os }} when you run both Ubuntu and macOS.

# Wrong: shared key, so only one browser per OS is ever cached
key: "${{ runner.os }}-pw-${{ hashFiles('package-lock.json') }}"

# Correct — isolated per browser
key: "${{ runner.os }}-pw-${{ matrix.browser }}-${{ hashFiles('package-lock.json') }}"

Port collisions on shared runner hosts

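Each GitHub-hosted job runs in its own fresh virtual machine, so hosted jobs do not share a network namespace. The collision risk applies to self-hosted runners, where several runner processes can share one machine. There, if your Playwright config hard-codes webServer.port: 3000, the second job to start fails immediately with EADDRINUSE. Use the dynamic port step from the complete example and have playwright.config.ts derive the port from BASE_URL. Do not pass port: 0 to webServer: Playwright uses that value to wait for the server, not to allocate a port. Keep reuseExistingServer: false so a job never attaches to a server started by another job.

// playwright.config.ts
import { defineConfig } from '@playwright/test';

// BASE_URL is set by the workflow; fall back to port 3000 for local runs
const baseURL = process.env.BASE_URL ?? 'http://localhost:3000';
const port = Number(new URL(baseURL).port);

export default defineConfig({
  use: { baseURL },
  webServer: {
    // Assumes the dev script accepts a --port flag
    command: `npm run dev -- --port ${port}`,
    port,
    reuseExistingServer: false,
  },
});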

Artifact storage exhaustion from full matrix runs

Playwright generates per-test traces, screenshots, and optional video for every run. Across five browser/OS permutations with 200+ tests, uncompressed artifacts can hit 4–8 GB per workflow run. Two guards prevent this: if: failure() on the upload step (keeps passing permutation artifacts off storage entirely) and retention-days: 7 to auto-purge. For WebKit video recordings specifically, add video: 'retain-on-failure' to playwright.config.ts rather than video: 'on'.

Frequently Asked Questions

How do I prevent cache collisions when running multiple browser contexts in parallel?

Scope the cache key to include matrix.browser and an OS component (runner.os or matrix.os). Never share a single cache entry for Playwright binaries across jobs: each job installs only one browser and a saved entry cannot be overwritten, so a shared key leaves most jobs restoring a cache without the binary they need.

What is the recommended max-parallel for cross-browser matrices on GitHub-hosted runners?

Start at 10–15. Monitor queue wait times in the Actions usage report and tune down if jobs sit pending longer than 90 seconds. If you manage pipeline concurrency and queue limits centrally, group concurrency by branch rather than workflow to avoid starving unrelated PRs.

Should I run the full browser matrix on every pull request?

No. Run a minimal subset — Chromium on Ubuntu only — on PRs to keep developer feedback fast. Reserve the full matrix for merges to main and nightly cron schedules. This pattern reduces per-PR runner minutes by ~80% while retaining full browser coverage before release.
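
The complete example above triggers the full matrix on pull_request as written. One way to apply the subset rule is to generate the matrix from the event name. The sketch below shows only the selection logic; the matrixFor helper is a hypothetical name, and wiring it into a workflow is left as an assumption.

```typescript
type Combo = { browser: string; os: string };

// The five permutations from the full matrix.
const fullMatrix: Combo[] = [
  { browser: "chromium", os: "ubuntu-latest" },
  { browser: "firefox", os: "ubuntu-latest" },
  { browser: "chromium", os: "macos-latest" },
  { browser: "firefox", os: "macos-latest" },
  { browser: "webkit", os: "macos-latest" },
];

// Pull requests get the fast subset; pushes to main and scheduled runs get everything.
function matrixFor(eventName: string): { include: Combo[] } {
  if (eventName === "pull_request") {
    return { include: [{ browser: "chromium", os: "ubuntu-latest" }] };
  }
  return { include: fullMatrix };
}

// Serialise so a workflow step could pass the result along as a job output.
console.log(JSON.stringify(matrixFor("pull_request")));
```

A setup job could write this JSON to $GITHUB_OUTPUT and the test job could read it with fromJSON(needs.setup.outputs.matrix); the job name setup and the output name matrix are assumptions, not part of the workflow above.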

Managing Environment Matrices in GitHub Actions — runner allocation, dynamic matrix generation, and concurrency patterns that underpin this cross-browser setup.
CI/CD Pipeline Architecture & Fundamentals — foundational pipeline design principles for frontend and full-stack applications.
Optimizing Pipeline Concurrency and Queue Limits — tune max-parallel and concurrency groups to prevent queue starvation across matrix jobs.
Artifact Management Strategies for Frontend Builds — storage policies and retention rules for build outputs and test reports.