Automated Preview Deployments on Pull Requests
Without per-PR isolation, developers share a single staging slot, meaning a broken feature branch blocks everyone else's review, and reviewers cannot test UI or API changes without checking out code locally. The fix is a deterministic CI/CD workflow that provisions a fully isolated environment for every pull request, posts a live URL to the PR thread, and tears the environment down automatically when the branch closes. This page details production-grade automation for that lifecycle, from webhook trigger to namespace garbage collection, targeting platform teams running GitHub Actions or GitLab CI against Kubernetes or container-based infrastructure.
Prerequisites
How Per-PR Deployment Automation Works
Every PR preview pipeline follows a four-phase lifecycle: trigger → provision → deploy → teardown. Understanding each phase is essential for debugging the failure modes covered later.
Trigger. The CI system subscribes to pull request webhook events: opened, synchronize (new commit pushed), and closed. The synchronize event must cancel any in-flight build for the same PR before starting a new one; without this guard, rapid commits leave stale deployments pointing at old commits.
Provision. The pipeline creates an isolated namespace or container group keyed to the PR number. Parameterised Helm charts or Terraform modules render environment-specific manifests (namespace, service accounts, ingress rules, resource quotas) from a single template, with no hand-edited per-branch YAML. The ingress rule maps a dynamic subdomain (pr-<N>.preview.example.com) to the service that will be deployed into that namespace.
Deploy. Build artifacts (container images, static assets) produced by parallel frontend and backend jobs are pushed to the registry, then the Helm release or equivalent deployment object is applied to the provisioned namespace. A health-check probe against the app's readiness endpoint gates the URL-posting step so developers never receive a link to a crashed pod.
Teardown. A separate CI job, triggered only on the closed event, deletes the Helm release and namespace, releasing CPU, memory, and persistent volume claims. A TTL-based cron job runs daily to catch orphaned namespaces left by force-pushed branches or deleted webhooks.
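As a rough illustration of how the phases hang together, the sketch below maps a webhook action to the phase that should run and derives the per-PR names. The helper names (phase_for_action, preview_namespace, preview_host) are hypothetical and not part of any CI product; the naming scheme simply follows the pr-<N>.preview.example.com pattern described above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only.

# Map a pull request webhook action to the pipeline phase that should run.
phase_for_action() {
  case "$1" in
    opened|synchronize) echo "deploy" ;;    # provision (idempotent), then deploy
    closed)             echo "teardown" ;;  # delete release and namespace
    *)                  echo "ignore" ;;    # e.g. labeled, edited
  esac
}

# Derive the namespace and hostname keyed to a PR number.
preview_namespace() { echo "preview-$1"; }
preview_host()      { echo "pr-$1.preview.example.com"; }
```

Calling phase_for_action synchronize yields deploy, which is why a new commit re-runs provisioning and deployment against the same namespace rather than creating a second one.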
The diagram below shows the full lifecycle across CI, Kubernetes, and the PR thread:
Step-by-Step Implementation
Step 1: Configure the GitHub Actions workflow
The core workflow listens to three PR event types and uses a concurrency group to serialise runs per PR number. Place this file at .github/workflows/preview.yml.
```yaml
name: PR Preview Deployment

on:
  pull_request:
    types: [opened, synchronize, closed]

# One run per PR; cancel the in-flight job when a new commit arrives
concurrency:
  group: preview-pr-${{ github.event.pull_request.number }}
  cancel-in-progress: true

env:
  PR_NUMBER: ${{ github.event.pull_request.number }}
  PREVIEW_HOST: pr-${{ github.event.pull_request.number }}.preview.example.com

jobs:
  deploy:
    runs-on: ubuntu-latest
    if: github.event.action != 'closed'
    permissions:
      id-token: write        # OIDC token for cluster auth
      contents: read
      pull-requests: write   # needed to post the preview URL comment
    steps:
      - uses: actions/checkout@v4

      - name: Authenticate to cluster (OIDC)
        # Replace with your cloud provider action; the job also needs a
        # working kubeconfig before the helm and kubectl steps run.
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Build and push container images
        run: |
          docker build -t registry.example.com/app:pr-${PR_NUMBER} .
          docker push registry.example.com/app:pr-${PR_NUMBER}

      - name: Provision namespace and deploy via Helm
        run: |
          # pr_id is the value the chart in Step 3 templates on
          helm upgrade --install preview-${PR_NUMBER} ./charts/preview \
            --namespace preview-${PR_NUMBER} \
            --create-namespace \
            --set pr_id=${PR_NUMBER} \
            --set image.tag=pr-${PR_NUMBER} \
            --set ingress.host=${PREVIEW_HOST} \
            --wait --timeout 5m

      - name: Wait for readiness
        run: |
          kubectl rollout status deployment/app \
            -n preview-${PR_NUMBER} --timeout=3m

      - name: Post preview URL to PR
        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `**Preview deployed:** https://${process.env.PREVIEW_HOST}\n\nThis environment will be torn down when the PR is closed.`
            })
        env:
          PREVIEW_HOST: ${{ env.PREVIEW_HOST }}

  teardown:
    runs-on: ubuntu-latest
    if: github.event.action == 'closed'
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4

      - name: Authenticate to cluster (OIDC)
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Delete Helm release and namespace
        run: |
          helm uninstall preview-${PR_NUMBER} \
            --namespace preview-${PR_NUMBER} --ignore-not-found
          kubectl delete namespace preview-${PR_NUMBER} --ignore-not-found
```

Verify: After a PR is opened, run kubectl get namespace preview-<N>; the namespace should exist and show Active. After the PR is merged or closed, the namespace should be absent.
Step 2: Configure GitLab CI dynamic environments
GitLab's environment block with auto_stop_in handles TTL enforcement natively, removing the need for a separate teardown cron job.
```yaml
stages:
  - preview
  - cleanup

preview_deploy:
  image:
    name: alpine/helm:3.14.0
    entrypoint: [""]   # the image's default entrypoint is helm; clear it so the script can run
  stage: preview
  rules:
    # Only trigger on merge request pipelines, never on branch pushes
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: always
  variables:
    ENV_NAME: preview-$CI_COMMIT_REF_SLUG
    PREVIEW_URL: https://$CI_COMMIT_REF_SLUG.preview.example.com
  environment:
    name: $ENV_NAME
    url: $PREVIEW_URL
    # Hard lifecycle cap; GitLab stops the environment after 24 hours
    auto_stop_in: 24h
    on_stop: preview_teardown
  script:
    - helm upgrade --install $ENV_NAME ./charts/preview
        --namespace $ENV_NAME
        --create-namespace
        --set image.tag=$CI_COMMIT_SHA
        --set ingress.host=$CI_COMMIT_REF_SLUG.preview.example.com
        --wait

preview_teardown:
  image:
    # This image ships helm only; the kubectl step below needs an image
    # that also includes kubectl.
    name: alpine/helm:3.14.0
    entrypoint: [""]
  stage: cleanup
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: manual
  variables:
    ENV_NAME: preview-$CI_COMMIT_REF_SLUG   # job-level variables are not shared, so define it again
    GIT_STRATEGY: none                      # the source branch may already be deleted
  environment:
    name: preview-$CI_COMMIT_REF_SLUG
    action: stop
  script:
    - helm uninstall $ENV_NAME --namespace $ENV_NAME --ignore-not-found
    - kubectl delete namespace $ENV_NAME --ignore-not-found
```

Verify: Open GitLab → Deployments → Environments. The preview environment should appear with the correct URL and an auto_stop_in countdown badge.
Step 3: Author the Helm chart for per-PR namespaces
A single parameterised chart generates all namespace-scoped resources. This is the critical file to get right: misconfigured RBAC or resource quotas here cause the majority of PR preview failures.
```yaml
# charts/preview/templates/namespace.yaml
# Note: a release installed with --create-namespace has its namespace created
# by Helm before the chart is applied, which is likely to conflict with this
# template. Use one mechanism or the other.
apiVersion: v1
kind: Namespace
metadata:
  name: preview-{{ .Values.pr_id }}
  labels:
    app.kubernetes.io/managed-by: helm
    preview-env: "true"   # label used by the TTL cron job for orphan detection
---
# charts/preview/templates/resourcequota.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: preview-quota
  namespace: preview-{{ .Values.pr_id }}
spec:
  hard:
    requests.cpu: "500m"
    requests.memory: "512Mi"
    limits.cpu: "1000m"
    limits.memory: "1Gi"
    pods: "10"
---
# charts/preview/templates/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: preview-ingress
  namespace: preview-{{ .Values.pr_id }}
  annotations:
    # cert-manager issues a TLS cert per subdomain automatically
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - pr-{{ .Values.pr_id }}.preview.example.com
      secretName: preview-tls-{{ .Values.pr_id }}
  rules:
    - host: pr-{{ .Values.pr_id }}.preview.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app-service
                port:
                  number: 80
```

Verify: helm template preview-42 ./charts/preview --set pr_id=42 | grep "preview-42" should print the Namespace name and the namespace field of the ResourceQuota and Ingress.
Step 4: Orphan cleanup cron job
Force-pushes and deleted branches can leave namespaces behind if webhooks silently fail. A daily cron job reconciles reality against open PRs.
```yaml
# .github/workflows/preview-cleanup.yml
name: Cleanup Orphaned Preview Namespaces

on:
  schedule:
    # Runs at 03:00 UTC daily
    - cron: "0 3 * * *"
  workflow_dispatch:   # allow manual trigger for incident response

jobs:
  cleanup:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
      pull-requests: read   # lets the gh CLI look up PR state
    steps:
      - uses: actions/checkout@v4

      - name: Authenticate to cluster
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Delete namespaces older than 48h with no open PR
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          # List preview namespaces and their creation timestamps
          kubectl get ns -l preview-env=true \
            -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.creationTimestamp}{"\n"}{end}' \
          | while IFS=$'\t' read -r ns created; do
              pr_num="${ns#preview-}"
              age_hours=$(( ( $(date +%s) - $(date -d "$created" +%s) ) / 3600 ))
              # Never delete a namespace whose PR is still open
              pr_state=$(gh pr view "$pr_num" --json state --jq .state 2>/dev/null || echo UNKNOWN)
              if [ "$pr_state" = "OPEN" ]; then
                continue
              fi
              # Only delete if namespace is older than 48h (safety margin)
              if [ "$age_hours" -gt 48 ]; then
                echo "Deleting orphaned namespace $ns (${age_hours}h old)"
                helm uninstall "$ns" --namespace "$ns" --ignore-not-found || true
                kubectl delete namespace "$ns" --ignore-not-found
              fi
            done
```

Verify: Run kubectl get ns -l preview-env=true before and after a manual workflow_dispatch run; namespaces older than 48 hours with no open PR should disappear.
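The deletion rule in that step ("older than 48h with no open PR") can be isolated into a small predicate, which makes it easier to test without a cluster. The following is a minimal sketch; age_hours and should_delete are hypothetical helper names, timestamps are assumed to be epoch seconds, and the PR state strings (OPEN, CLOSED, MERGED) are assumed to match what your PR lookup returns.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only.

# Whole hours between two epoch-second timestamps (created, now).
age_hours() { echo $(( ($2 - $1) / 3600 )); }

# Succeeds (exit 0) only when the PR is not open AND the namespace
# is strictly older than the threshold (default 48 hours).
should_delete() {
  local created="$1" now="$2" pr_state="$3" max_age_hours="${4:-48}"
  [ "$pr_state" != "OPEN" ] && \
    [ "$(age_hours "$created" "$now")" -gt "$max_age_hours" ]
}
```

Inside the cleanup loop this could be used as `if should_delete "$(date -d "$created" +%s)" "$(date +%s)" "$pr_state"; then ...`, keeping the destructive helm and kubectl calls behind a single, testable condition.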
Configuration Reference
| Option | Type | Default | Effect |
|---|---|---|---|
| concurrency.cancel-in-progress | bool | false | Cancels the previous run for the same PR when a new commit arrives; set to true to avoid stale deploys |
| helm upgrade --wait | flag | off | Blocks the pipeline until all pods are ready; prevents posting a broken URL |
| helm upgrade --timeout | duration | 5m0s | Maximum wait for readiness; increase for slow cold-start images |
| ResourceQuota.limits.memory | quantity | unbounded | Cap per-namespace memory; prevents a single PR from starving shared compute |
| auto_stop_in (GitLab) | duration | none | Hard TTL on the environment; GitLab calls on_stop job automatically |
| cert-manager.io/cluster-issuer | annotation | none | Issues a TLS cert per preview subdomain; replace with letsencrypt-staging for non-production clusters |
| preview-env: "true" label | label | none | Marks namespaces for orphan detection by the cleanup cron job |
Integration with Sibling Topics
Automated deployments are the entry point, but they depend on and feed into the rest of the Preview Environments & Environment Parity workflow.
Configuration injection. Each Helm release needs environment-specific runtime values. Synchronizing Environment Variables Across Stages covers how to propagate scoped secrets and feature-flag overrides into ephemeral namespaces without leaking production credentials.
Data layer. Isolated namespaces alone are not enough when the app talks to a database. Database Mocking and Seeding for Ephemeral Environments covers schema-per-tenant Postgres provisioning, lightweight SQLite mocks, and anonymised snapshot restoration for high-fidelity previews.
Build artifact reuse. Preview pipelines benefit from implementing remote build caching with Turborepo or Docker layer caching so that each PR only rebuilds changed packages rather than the full monorepo. Typical monorepos see a 60-80% reduction in build minutes after enabling remote cache hits.
Pipeline structure. The concurrency and job-dependency patterns here are a specialised application of designing multi-stage CI/CD pipelines for React apps: the same fan-out/fan-in topology, applied to ephemeral infrastructure rather than a fixed staging slot.
Performance and Cost Impact
Running 20 concurrent feature branches on a mid-sized team has measurable infrastructure costs. These numbers are representative for a Node.js + PostgreSQL stack on a managed Kubernetes service (EKS, GKE, or AKS).
| Metric | Without per-PR isolation | With automated per-PR previews |
|---|---|---|
| Merge conflicts caused by shared staging | ~3-5 per sprint | 0 (each PR has its own slot) |
| Average review cycle time | 2-3 days (queue for staging) | 4-8 hours (URL in PR within 8 min) |
| Build time per PR (cold, no cache) | N/A | ~12 min (Node.js + Docker build) |
| Build time per PR (warm Turborepo remote cache) | N/A | ~3 min (75% cache hit rate) |
| Cost per preview environment per day | N/A | ~$0.80-$2.50 (0.5 vCPU / 512 MB) |
| Orphaned environment cleanup with cron | Manual (hours to days of waste) | Automatic (48h max overage) |
The dominant cost lever is cache hit rate: going from 0% to 75% remote cache hits cuts build time from ~12 min to ~3 min per PR in the table above (roughly 4×), and is the single highest-ROI optimisation for high-frequency PR workflows. Resource quotas capping each namespace at 1 vCPU / 1 Gi keep per-PR costs predictable regardless of team size.
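To make the table's figures concrete, the arithmetic can be worked through in a few lines of shell. This is only an illustrative sketch: the helper names are hypothetical, the 30-day month and the assumption that all 20 environments stay up around the clock are mine rather than the source's, and the build-time model assumes minutes scale linearly with the cache miss rate.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only. Money is kept in cents
# so that plain integer arithmetic is enough.

# environments × cents per environment per day × days
monthly_cost_cents() { echo $(( $1 * $2 * $3 )); }

# Linear model: cold build minutes × (100 − cache hit %) / 100
warm_build_minutes() { echo $(( $1 * (100 - $2) / 100 )); }

# Format an integer number of cents as dollars with two decimals.
cents_to_dollars() { printf '%d.%02d\n' $(( $1 / 100 )) $(( $1 % 100 )); }

# Example: 20 environments at $0.80 to $2.50 per day over 30 days.
low=$(monthly_cost_cents 20 80 30)
high=$(monthly_cost_cents 20 250 30)
echo "Monthly range: \$$(cents_to_dollars "$low") to \$$(cents_to_dollars "$high")"
echo "Warm build: $(warm_build_minutes 12 75) min"
```

Under those assumptions the monthly range comes out at $480.00 to $1500.00, and a 12-minute cold build at a 75% hit rate drops to 3 minutes, matching the table.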
Troubleshooting
ImagePullBackOff in preview namespace
Exact error: Failed to pull image "registry.example.com/app:pr-42": unauthorized
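Root cause: The imagePullSecret is missing from the preview namespace. Secrets are namespace-scoped, so a namespace created with --create-namespace starts empty and does not inherit secrets from the default namespace.
Fix: Add an imagePullSecret to the Helm chart templates:

```yaml
# charts/preview/templates/registry-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: registry-credentials
  namespace: preview-{{ .Values.pr_id }}
type: kubernetes.io/dockerconfigjson
data:
  # b64enc encodes the raw JSON, so the value must be passed unencoded
  .dockerconfigjson: {{ .Values.registryCredentials | b64enc }}
```

Then pass the raw Docker config with --set-file registryCredentials=$HOME/.docker/config.json in the Helm upgrade step. The template already base64-encodes the value, so encoding it again on the command line would produce an invalid secret. Reference registry-credentials under imagePullSecrets in the pod spec. Better still, sync the secret via your secrets manager.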
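If the CI runner has no ~/.docker/config.json to reuse, the payload can be assembled from a username and token. The sketch below is an assumption-laden example rather than a recommended credential flow: docker_auth and dockerconfigjson are hypothetical helper names, and the function does no JSON escaping, so it only suits credentials without quotes or backslashes.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only.

# base64("user:token") on a single line, as the "auth" field expects.
docker_auth() { printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'; }

# Emit a minimal .dockerconfigjson document for one registry.
dockerconfigjson() {
  local registry="$1" user="$2" token="$3"
  printf '{"auths":{"%s":{"auth":"%s"}}}\n' \
    "$registry" "$(docker_auth "$user" "$token")"
}
```

The output is raw JSON, so it should reach Helm as a file (for example via --set-file) and be base64-encoded exactly once, by the template's b64enc.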
Ingress returns 503 before pods are ready
Exact error: 503 Service Temporarily Unavailable immediately after helm upgrade completes.
Root cause: helm upgrade --wait returns once the Deployment's pods report Ready, but it does not check that the ingress controller has registered those pods as backends. For a short window after the release completes, the controller can still answer with 503.
Fix: Add an explicit readiness gate after the Helm step:

```shell
kubectl wait --for=condition=ready pod \
  -l app.kubernetes.io/instance=preview-${PR_NUMBER} \
  -n preview-${PR_NUMBER} \
  --timeout=120s
```

Then optionally verify the ingress backend end to end:

```shell
curl -sf --retry 5 --retry-delay 3 \
  https://pr-${PR_NUMBER}.preview.example.com/health
```

DNS propagation delay: URL unreachable for minutes after deploy
Exact error: curl: (6) Could not resolve host: pr-42.preview.example.com
Root cause: The wildcard DNS record has a TTL of 300+ seconds, and the initial DNS lookup for a new subdomain misses cache on the client's resolver.
Fix: Lower the wildcard record TTL to 60 seconds (most managed DNS providers allow this without extra cost). Alternatively, switch to path-based routing (preview.example.com/pr/42): a single root record with a low TTL eliminates per-subdomain propagation delays entirely. Update the ingress host and PREVIEW_HOST env var pattern accordingly.
exceeded quota: namespace creation fails after a team grows
Exact error: Error from server (Forbidden): namespaces "preview-57" is forbidden: exceeded quota: namespace-count, requested: count/namespaces=1, used: count/namespaces=50, limited: count/namespaces=50
Root cause: Kubernetes or the namespace tier has a hard namespace count limit. Stale preview namespaces from merged PRs have accumulated.
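Fix: Run the orphan cleanup workflow manually (workflow_dispatch), then reduce the cron interval from daily to every 6 hours. A LimitRange cannot help here, because it only constrains per-container and per-pod resources inside a namespace. Cap the number of preview namespaces with an admission policy instead (for example OPA Gatekeeper or Kyverno), and alert ops when the count exceeds 40.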
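The alerting part can be approximated with a small check in the cleanup workflow. In this sketch quota_status is a hypothetical helper, and the numbers are taken from the example above: a hard limit of 50 namespaces and an alert once the count exceeds 40.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only.

# Classify namespace usage: "blocked" at or above the hard limit,
# "warn" above the alert threshold, "ok" otherwise.
quota_status() {
  local used="$1" limit="$2" warn="$3"
  if   [ "$used" -ge "$limit" ]; then echo "blocked"
  elif [ "$used" -gt "$warn" ];  then echo "warn"
  else                                echo "ok"
  fi
}

# Possible usage against a live cluster (not executed here):
#   used=$(kubectl get ns -l preview-env=true -o name | wc -l | tr -d ' ')
#   quota_status "$used" 50 40
```

A warn result is the point at which to page ops or trigger the cleanup workflow early, before new PRs start failing at the limit.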
Frequently Asked Questions
How do I handle database migrations for preview environments?
Automate schema provisioning using your migration tool (Flyway, Liquibase, rails db:migrate, Prisma Migrate) as an init container or a post-install Helm hook. Use isolated ephemeral databases per namespace (either a lightweight Postgres container in the same namespace, or a schema-per-tenant pattern on a shared instance) to prevent cross-PR data leakage. Seed test data as part of the migration hook, not as a separate manual step.
What is the recommended TTL for PR preview URLs?
Set a default TTL of 24 hours for active PRs, extended automatically when a new commit triggers synchronize. Force teardown within 2 hours of PR merge or closure via the closed event webhook. For teams with overnight review cycles, 48 hours is a reasonable upper bound. Never rely solely on the closed event: the cron-based orphan cleaner is a mandatory safety net.
Can I reuse production secrets safely in preview deployments?
No. Inject scoped preview credentials instead: cloud provider OIDC federation lets the CI runner request short-lived, least-privilege tokens for the preview namespace without any long-lived secret stored in the repo. Mock or stub third-party APIs (payment processors, identity providers) at the application layer using environment-variable feature flags so previews never touch production external services.
How do I route multiple PRs without hitting DNS record limits?
Use a single wildcard subdomain (*.preview.example.com) with the ingress controller handling subdomain routing internally; this creates one DNS record regardless of how many PRs are open. If your DNS provider does not support wildcards or you have TLS cert-issuance latency, switch to path-based routing (preview.example.com/pr/<N>), which requires no DNS changes and can share a single TLS certificate.
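The two routing schemes differ only in how the preview URL is built, so a pipeline can switch between them with one parameter. The sketch below assumes a hypothetical preview_url helper and the preview.example.com base domain used throughout this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only.

# Build the preview URL for a PR under either routing scheme.
preview_url() {
  local mode="$1" pr="$2" base="${3:-preview.example.com}"
  case "$mode" in
    subdomain) echo "https://pr-${pr}.${base}" ;;   # one wildcard DNS record
    path)      echo "https://${base}/pr/${pr}" ;;   # one root record, one cert
    *)         echo "unknown routing mode: $mode" >&2; return 1 ;;
  esac
}
```

With path-based routing the ingress rule and the application's base path both have to account for the /pr/<N> prefix, which is the main cost of avoiding per-subdomain certificates.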
Related
- Synchronizing Environment Variables Across Stages: propagate scoped secrets and feature-flag overrides into ephemeral namespaces without leaking production credentials.
- Database Mocking and Seeding for Ephemeral Environments: provision deterministic data layers for short-lived preview namespaces.
- Designing Multi-Stage CI/CD Pipelines for React Apps: the fan-out/fan-in pipeline topology that underpins per-PR deployment jobs.
- Implementing Remote Build Caching with Turborepo: cut preview build time by 60-80% by sharing a remote cache across PRs in a monorepo.