Incremental Builds and Affected Detection in Monorepos
PR validation pipelines in a growing monorepo routinely rebuild every workspace regardless of what changed — burning runner minutes, delaying developer feedback, and masking legitimate failures in noise. Affected detection solves this by computing a directed acyclic graph (DAG) of workspace dependencies, diffing the change set against it, and dispatching only the tasks whose transitive inputs actually changed. This guide covers the full production path: graph construction, CI integration, remote cache synchronisation, and the failure modes that corrupt results at scale.
Prerequisites
Before wiring affected detection into CI, confirm the following are in place:
How the Dependency Graph and Affected Scope Work
Both Nx and Turborepo parse workspace manifests and build a DAG where each node is a project and each directed edge is a dependencies / devDependencies declaration pointing to another workspace package. When a change set arrives (a list of file paths from git diff base..HEAD), the tool walks the graph upward — any project that owns a changed file, plus any project that depends on it transitively, is marked affected.
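The upward walk described above can be sketched in a few lines. This is a minimal illustration, not how Nx or Turborepo implement it internally; the project names, the directory layout in PROJECT_ROOTS, and both helper functions are hypothetical.

```python
from collections import deque

# Hypothetical workspace graph: project -> internal packages it depends on.
DEPENDENCIES = {
    "@acme/web": ["@acme/ui", "@acme/api-client"],
    "@acme/admin": ["@acme/ui"],
    "@acme/ui": ["@acme/tokens"],
    "@acme/api-client": [],
    "@acme/tokens": [],
}

# Hypothetical project roots, used to map a changed file to its owner.
PROJECT_ROOTS = {
    "apps/web": "@acme/web",
    "apps/admin": "@acme/admin",
    "packages/ui": "@acme/ui",
    "packages/api-client": "@acme/api-client",
    "packages/tokens": "@acme/tokens",
}


def owning_project(path):
    """Return the project whose root directory contains the path, if any."""
    for root, project in PROJECT_ROOTS.items():
        if path.startswith(root + "/"):
            return project
    return None


def affected_projects(changed_files):
    """Changed projects plus everything that depends on them transitively."""
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents = {name: [] for name in DEPENDENCIES}
    for project, deps in DEPENDENCIES.items():
        for dep in deps:
            dependents[dep].append(project)

    owners = {owning_project(f) for f in changed_files}
    queue = deque(p for p in owners if p is not None)
    affected = set(queue)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return sorted(affected)


print(affected_projects(["packages/api-client/src/index.ts"]))
print(affected_projects(["packages/tokens/colors.json"]))
```

A change inside @acme/tokens reaches @acme/ui and, through it, both apps, while a change inside @acme/api-client reaches only @acme/web. A file outside every project root produces an empty set in this sketch; real tools usually treat root-level files as global inputs instead.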
The critical constraint: the graph contains only the edges the tool can resolve statically. Turborepo relies on manifest declarations, and Nx supplements them with import statements it can parse from source, but a dynamic import() whose specifier is computed at runtime, or a path alias that bypasses the manifest, stays invisible to the static analyser. This is why teams maintain strict eslint-plugin-import rules and explicit outputs declarations in turbo.json: undeclared edges are the root cause of most false-negative affected misses.
The base commit for the diff matters enormously. In a PR pipeline the correct base is git merge-base HEAD origin/main, not origin/main directly. Diffing against the tip of origin/main also picks up commits that landed on main after the branch diverged, producing a wider diff and triggering unnecessary rebuilds.
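To make the difference concrete, here is a toy model of a forked history. It is only a sketch under simplifying assumptions: every commit has a single parent, the commit names and file paths are hypothetical, and a tip-to-tip comparison is approximated as the union of what each side touched since the fork.

```python
# Toy single-parent history (hypothetical commits and paths):
#   main:    A - B - C - D
#   feature:      \- E - F
PARENT = {"A": None, "B": "A", "C": "B", "D": "C", "E": "B", "F": "E"}
TOUCHED = {
    "A": {"package.json"},
    "B": {"packages/ui/button.tsx"},
    "C": {"packages/api-client/index.ts"},
    "D": {"apps/admin/main.ts"},
    "E": {"apps/web/page.tsx"},
    "F": {"apps/web/page.test.tsx"},
}


def ancestry(commit):
    """The commit followed by its ancestors, newest first."""
    chain = []
    while commit is not None:
        chain.append(commit)
        commit = PARENT[commit]
    return chain


def merge_base(a, b):
    """Nearest common ancestor, the role git merge-base plays."""
    seen = set(ancestry(a))
    return next(c for c in ancestry(b) if c in seen)


def files_since(base, tip):
    """Paths touched by commits reachable from tip but not from base."""
    excluded = set(ancestry(base))
    changed = set()
    for commit in ancestry(tip):
        if commit not in excluded:
            changed |= TOUCHED[commit]
    return changed


base = merge_base("F", "D")
pr_changes = files_since(base, "F")
# Comparing the two tips directly also surfaces what main gained after the fork.
tip_to_tip = files_since(base, "F") | files_since(base, "D")

print(base)
print(sorted(pr_changes))
print(sorted(tip_to_tip - pr_changes))
```

The last line lists the paths that would be pulled into the affected calculation only because main moved on, which is exactly the noise the merge-base avoids.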
Step-by-Step Implementation
1. Initialize workspace dependency declarations
Verify that every internal dependency is declared explicitly in each workspace’s package.json:
{
  "name": "@acme/web",
  "dependencies": {
    "@acme/ui": "workspace:*",
    "@acme/api-client": "workspace:*"
  }
}

Run the graph visualiser to confirm edges are resolved:
# Nx
npx nx graph --file=graph.json && cat graph.json | jq '.graph.nodes | keys'
# Turborepo
npx turbo run build --dry=json | jq '.tasks[].package'

If a workspace you expected to appear is missing, the dependency declaration is absent from its manifest.
2. Configure base-commit tracking
Export the merge-base SHA as an environment variable that all subsequent steps share. This is the single most important configuration decision — an incorrect base produces unreliable results:
# Compute the true divergence point from the target branch
BASE=$(git merge-base HEAD origin/main)
echo "BASE_SHA=$BASE" >> "$GITHUB_ENV" # GitHub Actions
# or
echo "BASE_SHA=$BASE" >> build.env # GitLab artifact env

3. Wire GitHub Actions
name: Monorepo - Affected CI
on:
  pull_request:
    branches: [main]
jobs:
  affected:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # full history required for merge-base calculation
      # pnpm must be on PATH before setup-node can locate its store for `cache: pnpm`
      # (assumes package.json pins pnpm via the packageManager field)
      - run: corepack enable
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: pnpm
      - run: pnpm install --frozen-lockfile
      # Restore Nx / Turborepo local cache between runs
      - uses: actions/cache@v4
        with:
          path: |
            .nx/cache
            .turbo
          key: ${{ runner.os }}-nx-${{ hashFiles('**/pnpm-lock.yaml') }}-${{ github.sha }}
          restore-keys: |
            ${{ runner.os }}-nx-${{ hashFiles('**/pnpm-lock.yaml') }}-
            ${{ runner.os }}-nx-
      # Nx: run lint + test + build only on affected projects
      - name: Nx affected
        run: |
          BASE=$(git merge-base HEAD origin/${{ github.base_ref }})
          npx nx affected \
            --target=lint,test,build \
            --base="$BASE" \
            --head=${{ github.sha }} \
            --parallel=4
        env:
          NX_CLOUD_ACCESS_TOKEN: ${{ secrets.NX_CLOUD_ACCESS_TOKEN }}

Line-by-line explanation:
- fetch-depth: 0 fetches the full history; without it, git merge-base returns nothing.
- hashFiles('**/pnpm-lock.yaml') ties the cache key to the lockfile state. The trailing github.sha makes every commit save a fresh entry, and restore-keys fall back first to the most recent cache for the same lockfile, then to any cache for the same OS.
- --target=lint,test,build fans a single affected invocation out across all three targets; Nx respects dependsOn ordering automatically.
- --parallel=4 caps concurrent task processes; tune it to the available runner vCPUs.
- NX_CLOUD_ACCESS_TOKEN authenticates against Nx Cloud, which enables remote caching and distributed task execution.
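The key and restore-keys fallback used by the cache step can be modelled as an exact lookup followed by ordered prefix matches. The sketch below is a simplified model, not the actions/cache implementation; the restore function, the (key, created_at) tuples, and the sample keys are all hypothetical.

```python
def restore(saved, key, restore_keys):
    """Pick a cache entry using an exact key, then ordered prefix fallbacks.

    saved is a list of (key, created_at) tuples, a hypothetical stand-in
    for the cache entries already stored for the repository.
    """
    exact = [entry for entry in saved if entry[0] == key]
    if exact:
        return exact[0][0]
    for prefix in restore_keys:
        matches = [entry for entry in saved if entry[0].startswith(prefix)]
        if matches:
            # Among prefix matches, prefer the most recently created entry.
            return max(matches, key=lambda entry: entry[1])[0]
    return None


saved = [
    ("Linux-nx-abc123-sha1", 1),
    ("Linux-nx-abc123-sha2", 2),
    ("Linux-nx-old999-sha0", 3),
]

# Same lockfile, new commit: the lockfile-scoped prefix wins.
print(restore(saved, "Linux-nx-abc123-sha3", ["Linux-nx-abc123-", "Linux-nx-"]))
# New lockfile: only the OS-wide prefix matches, so the newest entry is used.
print(restore(saved, "Linux-nx-new000-sha4", ["Linux-nx-new000-", "Linux-nx-"]))
```

Ordering the restore keys from most to least specific is what keeps a lockfile change from silently reusing a cache built for a different dependency set when a closer match exists.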
Verify the step produces the expected project list before running tasks:
npx nx show projects --affected --base=$BASE --head=${{ github.sha }}

4. Wire GitLab CI
GitLab populates CI_MERGE_REQUEST_TARGET_BRANCH_SHA only in merged results pipelines, so an ordinary merge request pipeline has no ready-made target tip SHA. Use git merge-base against the fetched target branch:
stages: [affected, build]
.affected_base: &affected_base
  image: node:20-alpine
  before_script:
    # node:20-alpine ships without git, so install it before any git command
    - apk add --no-cache git
    # --unshallow fails on a clone that is already complete, hence the fallback
    - git fetch --unshallow origin "$CI_MERGE_REQUEST_TARGET_BRANCH_NAME" || git fetch origin "$CI_MERGE_REQUEST_TARGET_BRANCH_NAME"
    - export BASE=$(git merge-base HEAD "origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME")
  cache:
    key:
      files: [pnpm-lock.yaml]
    paths: [.nx/cache, node_modules/.cache]
lint-and-test:
  <<: *affected_base
  stage: affected
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  script:
    - corepack enable && pnpm install --frozen-lockfile
    - npx nx affected --target=lint,test --base=$BASE --head=$CI_COMMIT_SHA --parallel=4
build:
  <<: *affected_base
  stage: build
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  script:
    - corepack enable && pnpm install --frozen-lockfile
    - npx nx affected --target=build --base=$BASE --head=$CI_COMMIT_SHA --parallel=4
  artifacts:
    paths: [dist/]
    expire_in: 1 day

Line-by-line explanation:
- --unshallow restores the ancestry that GitLab shallow clones omit and that merge-base requires.
- cache.key.files generates a unique cache key per lockfile state, preventing cross-branch pollution.
- CI_PIPELINE_SOURCE == "merge_request_event" gates execution to MR pipelines only; branch pipelines on main should run a full build via a separate job.
- --parallel=4 suits GitLab shared runners, which provide 2 vCPUs; 4 keeps the queue saturated without OOM.
Verify affected output before committing to the full pipeline:
npx nx show projects --affected --base=$BASE --head=$CI_COMMIT_SHA

5. Integrate remote caching
For Turborepo remote caching, inject the remote cache environment variables (TURBO_TOKEN and TURBO_TEAM, plus TURBO_API when the cache is self-hosted) and enable remote-only mode on PR builds:
env:
  TURBO_TOKEN: ${{ secrets.TURBO_TOKEN }}
  TURBO_TEAM: ${{ vars.TURBO_TEAM }}
  TURBO_API: ${{ vars.TURBO_API }}
steps:
  - name: Build affected (Turborepo)
    run: |
      BASE=$(git merge-base HEAD origin/${{ github.base_ref }})
      npx turbo run build \
        --filter="...[$BASE]" \
        --remote-only \
        --cache-dir=.turbo

The --filter="...[<ref>]" syntax is Turborepo’s native affected scope: [<ref>] selects the workspaces with changes since <ref>, and the leading ... adds every workspace that depends on them. Passing the merge-base SHA instead of origin/main keeps the scope aligned with the base commit computed for the PR.
Configuration Reference
| Option | Type | Default | Effect |
|---|---|---|---|
| --base (Nx) | SHA string | defaultBase from nx.json (main if unset) | Oldest commit included in the diff; set to merge-base in PRs |
| --head (Nx) | SHA string | HEAD | Newest commit included in the diff |
| --parallel (Nx) | integer | 3 | Max concurrent task processes per runner |
| --target (Nx) | string / list | none | Task(s) to run across affected projects |
| --filter (Turborepo) | glob / range | none | Workspace scope; ...[origin/main] means changed since main, plus dependents |
| --remote-only (Turborepo) | flag | off | Skip local cache; enforce remote artifact reads |
| remoteCache.enabled (turbo.json) | boolean | true | Permits remote cache operations; a token and team are still required |
| NX_CLOUD_ACCESS_TOKEN | env var | none | Auth token for Nx Cloud distributed cache |
Integration with Sibling Topics
Incremental builds do not operate in isolation; they are most effective when combined with three adjacent capabilities from the Build Optimization & Caching Strategies pillar:
Remote Build Caching: Affected detection narrows which tasks run; remote caching eliminates re-running those tasks when artifacts already exist from a previous identical input hash. Without remote cache, every new runner restarts the affected tasks from scratch even if the source code is unchanged from a sibling branch.
Docker Layer Caching for Full-Stack Applications: When containerised runners produce Docker images as build artifacts, align Dockerfile COPY directives with workspace boundaries so that only the layers corresponding to affected workspaces are invalidated. Mount the .nx/cache or .turbo directory into the build container to avoid redundant dependency resolution steps.
Nx affected PR gating: The sibling deep-dive covers Nx-specific configuration in detail — branch protection rules, required status check wiring, and handling the edge case where no projects are affected (the empty-affected guard).
For tracking the financial impact of incremental builds, feed affected-count metrics into CI/CD compute cost tracking to demonstrate ROI to stakeholders.
Performance Benchmarks
Field data from platform teams running Nx or Turborepo affected pipelines on mid-sized monorepos (20–60 workspaces, active team of 10–30 engineers):
| Metric | Full-rebuild baseline | Affected pipeline | Change |
|---|---|---|---|
| Median PR build time | 14 min | 3.5 min | −75 % |
| p95 PR build time | 28 min | 9 min | −68 % |
| Runner minutes / day | ~840 | ~210 | −75 % |
| Remote cache hit rate (steady state) | n/a | 72–85 % | — |
| Graph init overhead | — | 8–15 s | added fixed cost |
The graph initialisation overhead (8–15 s) is a fixed cost that dominates on very small PRs. For single-file hotfixes touching a standalone utility package, the overhead can exceed the build time — acceptable given the savings across the broader PR queue.
Remote cache hit rates plateau at 72–85 % after ~2 weeks of steady-state traffic, once the most common input hashes have been populated. Hit rates below 60 % typically indicate environment drift (mismatched Node.js versions or OS between runners).
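The Change column in the table follows from simple arithmetic, and the same calculation is a convenient way to plug in figures from another repository. The helper name below is hypothetical; the numbers are the ones quoted in the table above.

```python
def percent_change(baseline, current):
    """Relative change from baseline to current, rounded to a whole percent."""
    return round((current - baseline) / baseline * 100)


# (full-rebuild baseline, affected pipeline), as listed in the table
median_build = (14, 3.5)      # minutes
p95_build = (28, 9)           # minutes
runner_minutes = (840, 210)   # per day

print(percent_change(*median_build))
print(percent_change(*p95_build))
print(percent_change(*runner_minutes))

# Runner minutes no longer spent each day
print(runner_minutes[0] - runner_minutes[1])
```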
Troubleshooting
Cannot find module '@acme/ui' after workspace install
Root cause: pnpm install --frozen-lockfile ran before internal packages were built; a consumer workspace imports the compiled output that does not yet exist.
Fix: Add an explicit dependsOn so the dependency’s build task runs before the consumer’s. The snippet uses the tasks key from turbo.json; in nx.json the same dependsOn entry goes under targetDefaults:
{
  "tasks": {
    "build": {
      "dependsOn": ["^build"]
    }
  }
}

The ^ prefix means “the build tasks of all upstream workspaces”.
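The ordering that "^build" asks for amounts to a topological sort of the workspace graph. The sketch below uses Python's standard graphlib module to show the idea; it is not how either tool schedules tasks internally, and the workspace names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workspace graph: project -> upstream workspaces it depends on.
UPSTREAM = {
    "@acme/web": {"@acme/ui", "@acme/api-client"},
    "@acme/ui": {"@acme/tokens"},
    "@acme/api-client": set(),
    "@acme/tokens": set(),
}


def build_order(upstream):
    """Order build tasks so every upstream build precedes its consumers."""
    return list(TopologicalSorter(upstream).static_order())


order = build_order(UPSTREAM)
print(order)
```

Several valid orders exist, since independent workspaces such as @acme/tokens and @acme/api-client can build in either order or in parallel. The only guarantee is that a consumer never builds before the packages it imports.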
merge-base returns empty string in GitLab
Error text: fatal: Not a valid object name ''
Root cause: The pipeline checked out a shallow clone. git merge-base requires ancestry data that shallow clones omit.
Fix:
git fetch --unshallow origin $CI_MERGE_REQUEST_TARGET_BRANCH_NAME

Add this to before_script before computing BASE.
Affected set includes every workspace on every PR
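Root cause: A root-level file (package.json, nx.json, turbo.json, .eslintrc.js) changed, causing the entire graph to be marked affected. Alternatively, --base points at the wrong commit (for example the repository's initial commit, or HEAD~1 after a squash merge), so the diff spans far more files than the PR actually touched.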
Fix: Exclude root config changes from the affected expansion using project-level tags and targetDefaults to scope impact. For release-config files, accept the full rebuild — correctness trumps speed here.
Cache hit rate drops below 50 % after adding a new runner pool
Root cause: The new runners run a different OS or Node.js patch version; the cache key includes the runner OS (runner.os in GitHub Actions), which differs.
Fix: Normalise all runners to the same base image. Include OS and Node version in the cache key explicitly:
key: ${{ runner.os }}-node20-nx-${{ hashFiles('**/pnpm-lock.yaml') }}

Frequently Asked Questions
How does affected detection handle dynamic imports or runtime-only dependencies?
Static analysis in both Nx and Turborepo misses dynamic import() calls with computed specifiers and path aliases that bypass workspace manifests. The practical mitigation is to enforce eslint-plugin-import rules at lint time: no-extraneous-dependencies flags imports of packages that are missing from the manifest, while no-unresolved and no-cycle catch broken and circular references. For high-risk modules (shared config, design tokens), configure a fallback by listing them under implicitDependencies (in each consumer's project configuration, or in nx.json on older Nx versions) so any change always marks all consumers affected.
What is the recommended cache TTL for monorepo build artifacts?
7–14 days with LRU eviction works well for active monorepos. Shorter TTLs (3 days) suit repos with high branch churn where stale artifacts from dead branches consume disproportionate storage. Longer TTLs (30 days) make sense for release branches where the same artifact hashes recur frequently. Pair TTL with a maximum total storage budget and an automated pruning script rather than relying on TTL alone.
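A pruning pass that combines a TTL with a storage budget might look like the sketch below. It is only an illustration: the Artifact record, its fields, and the prune function are hypothetical, and a real script would read sizes and access times from the cache backend.

```python
from dataclasses import dataclass


@dataclass
class Artifact:
    key: str
    size_mb: int
    last_access_day: int  # days since an arbitrary epoch


def prune(artifacts, today, ttl_days, budget_mb):
    """Return keys to delete: expired entries first, then LRU until under budget."""
    expired = [a for a in artifacts if today - a.last_access_day > ttl_days]
    kept = [a for a in artifacts if today - a.last_access_day <= ttl_days]
    doomed = [a.key for a in expired]

    # Evict the least recently used survivors while the total exceeds the budget.
    kept.sort(key=lambda a: a.last_access_day)
    total = sum(a.size_mb for a in kept)
    for artifact in kept:
        if total <= budget_mb:
            break
        doomed.append(artifact.key)
        total -= artifact.size_mb
    return doomed


artifacts = [
    Artifact("a", 400, 1),
    Artifact("b", 300, 20),
    Artifact("c", 500, 25),
    Artifact("d", 200, 29),
]
print(prune(artifacts, today=30, ttl_days=14, budget_mb=800))
```

In this run, entry a is removed because it is past the 14-day TTL, and entry b is evicted as the least recently used survivor to bring the remaining 1000 MB under the 800 MB budget.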
When should I disable incremental builds in CI?
Disable for merge commits to main or release/*: run a full build to guarantee artifact consistency and exercise all integration tests. Also disable when the dependency graph integrity is in question: after a major dependency upgrade, after a significant workspace restructure, or when you see divergence between local and CI build outputs. A scheduled full rebuild of main (weekly, for example) also serves as a canary for infrastructure drift.
How do I ensure environment parity between local and CI runners?
Pin Node.js via .nvmrc or .tool-versions and commit the file. Use a containerised runner image that hard-codes the same Node version. Include process.env.NODE_VERSION and process.env.npm_config_cache in the cache key fingerprint so a mismatch produces a cache miss rather than a poisoned hit. Validate parity regularly by comparing node --version output across local, PR, and release runner logs.
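One way to fold the environment into a cache key is to hash the values that must match for an artifact to be reusable. This is a sketch under assumptions: the cache_fingerprint function and its inputs are hypothetical, and the exact set of fields worth fingerprinting depends on the toolchain.

```python
import hashlib


def cache_fingerprint(os_name, node_version, lockfile_bytes):
    """Hash the inputs that must match for a cached artifact to be reusable."""
    digest = hashlib.sha256()
    for part in (os_name.encode(), node_version.encode(), lockfile_bytes):
        # Length-prefix each part so adjacent fields cannot blur together.
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return digest.hexdigest()[:16]


lockfile = b"lockfileVersion: '9.0'\n"
ci = cache_fingerprint("linux", "20.11.1", lockfile)
local_same = cache_fingerprint("linux", "20.11.1", lockfile)
local_drift = cache_fingerprint("linux", "20.12.0", lockfile)

print(ci == local_same)   # identical environments share a key
print(ci == local_drift)  # a different Node patch version yields a different key
```

Because a version mismatch changes the fingerprint, a drifted runner falls back to a cache miss and a rebuild rather than reusing an artifact produced under different conditions.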
Related
- Implementing Remote Build Caching with Turborepo — eliminate redundant task execution by sharing artifact hashes across all runners in your organisation.
- How to Configure Nx Affected Commands for Faster PR Checks — Nx-specific branch protection rules, required status check wiring, and the empty-affected guard pattern.
- Docker Layer Caching for Full-Stack Applications — align container build layers with workspace boundaries to skip redundant image rebuilds.
- Tracking CI/CD Compute Costs for Platform Teams — quantify the runner-minute savings from incremental builds and report ROI against infrastructure spend.