Implement Advanced Caching Strategies - Reduce dependency installation time by 60%
Phase 4: Advanced Caching Strategies
🎯 Objective
Reduce npm dependency installation time from 90+ seconds to ~30 seconds through intelligent caching strategies.
📊 Current Caching Analysis
Current State
- npm ci execution: 90-120 seconds per job
- Cache hit rate: ~50% (key changes frequently)
- Cache size: Growing unbounded
- No layer caching for Docker operations
Problems Identified
- Cache key based only on package-lock.json, so the cache misses whenever the lock file changes
- No fallback cache strategies
- Downloading full npm registry metadata repeatedly
- No caching of built artifacts between stages
🚀 Implementation Strategy
1. Multi-Level Cache Strategy
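# .gitlab-ci.yml
variables:
  npm_config_cache: "$CI_PROJECT_DIR/.npm"
  YARN_CACHE_FOLDER: "$CI_PROJECT_DIR/.yarn"

.npm_cache_enhanced: &npm_cache_enhanced
  cache:
    # Primary cache, keyed on the lock file
    - key:
        files:
          - package-lock.json
      paths:
        - .npm/
        # npm ci deletes node_modules before installing, so caching it
        # only helps jobs that skip npm ci
        - node_modules/
      policy: pull-push
    # Fallback cache when lock file changes
    # (only useful if some job pushes to this key)
    - key: "$CI_COMMIT_REF_SLUG"
      paths:
        - .npm/
      policy: pull
    # Global fallback cache (populated by the scheduled warming job)
    - key: "npm-cache-global"
      paths:
        - .npm/
      policy: pull
    # Note: every cache in this list is restored, in order. On GitLab 16.0+,
    # cache:fallback_keys expresses a strict fallback chain instead.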
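To make the intended lookup order concrete, here is a minimal shell sketch that picks the first cache key with an archive available. The function name `resolve_cache_key` and the layout of one `<key>.zip` file per key in a local directory are assumptions for illustration only; GitLab Runner does its own lookup internally and stores caches differently.

```shell
# Hypothetical helper: print the first key from the argument list for which
# "<cache_dir>/<key>.zip" exists. The directory layout is an illustrative
# assumption, not GitLab Runner's real storage format.
resolve_cache_key() {
  cache_dir=$1
  shift
  for key in "$@"; do
    if [ -f "$cache_dir/$key.zip" ]; then
      printf '%s\n' "$key"
      return 0
    fi
  done
  return 1  # no cache found: the job would download everything from the registry
}
```

Called as `resolve_cache_key /tmp/caches lock-abc123 my-branch npm-cache-global`, it tries the lock-file key first, then the branch key, then the global key, which is the order this section aims for.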
2. Artifact Caching Between Stages
# .gitlab-ci.yml
build:
  stage: build
  script:
    - npm ci --cache .npm --prefer-offline
    - npm run build
  artifacts:
    paths:
      - dist/
      - node_modules/
      - .npm/
    expire_in: 2 hours
    # Upload artifacts only when the job succeeds
    when: on_success

test:coverage:
  stage: test
  dependencies: ["build"]  # Reuse build artifacts
  script:
    # Skip npm ci if node_modules exists from artifacts
    - |
      if [ ! -d "node_modules" ]; then
        npm ci --cache .npm --prefer-offline
      else
        echo "Using node_modules from build artifacts"
      fi
    - npm run test:coverage:ci
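The directory check above only tells us that node_modules exists, not that it matches the current lock file. One possible refinement, sketched below, records a checksum of package-lock.json next to the installed modules and reinstalls when it differs. The helper names `needs_install` and `write_stamp` and the stamp file name `.lock-checksum` are assumptions made for this sketch; `cksum` is used only because it is available everywhere, and a stronger hash could be swapped in.

```shell
# Hypothetical helper: exit 0 when dependencies must be (re)installed.
needs_install() {
  project_dir=$1
  stamp="$project_dir/node_modules/.lock-checksum"
  [ -d "$project_dir/node_modules" ] || return 0   # nothing restored from artifacts
  [ -f "$stamp" ] || return 0                      # no record of the lock file used
  current=$(cksum < "$project_dir/package-lock.json")
  [ "$current" != "$(cat "$stamp")" ]              # reinstall only if the lock file changed
}

# Hypothetical helper: record the lock file checksum after a successful install.
write_stamp() {
  project_dir=$1
  cksum < "$project_dir/package-lock.json" > "$project_dir/node_modules/.lock-checksum"
}
```

In a job script this could be used as `if needs_install "$CI_PROJECT_DIR"; then npm ci --cache .npm --prefer-offline && write_stamp "$CI_PROJECT_DIR"; fi`, with the build job calling `write_stamp` after its own install so the stamp travels with the artifacts.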
3. Docker Layer Caching
# .gitlab-ci.yml
variables:
  DOCKER_BUILDKIT: "1"
  # Note: BUILDKIT_INLINE_CACHE is normally passed as a --build-arg; the
  # registry cache export used below does not depend on it.
  BUILDKIT_INLINE_CACHE: "1"

before_script:
  # Create a BuildKit builder that supports registry cache export
  - docker buildx create --use --driver docker-container
  - docker buildx inspect --bootstrap

build:docker:
  script:
    - |
      docker buildx build \
        --cache-from type=registry,ref=$CI_REGISTRY_IMAGE:cache \
        --cache-to type=registry,ref=$CI_REGISTRY_IMAGE:cache,mode=max \
        --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA \
        --push .
4. NPM Registry Caching
# .gitlab-ci.yml
before_script:
  # Configure npm to use local cache aggressively
  - npm config set prefer-offline true
  # cache-min is deprecated in favor of prefer-offline and is rejected by
  # `npm config set` on recent npm versions, so it is not set here
  - npm config set fetch-retries 3
  - npm config set fetch-retry-mintimeout 15000
  - npm config set fetch-retry-maxtimeout 90000
5. Selective Dependency Installation
package.json (add CI-specific install scripts; --omit=dev replaces the deprecated --production flag):
{
  "scripts": {
    "ci:install": "npm ci --omit=optional --ignore-scripts --cache .npm --prefer-offline",
    "ci:install:production": "npm ci --omit=dev --cache .npm --prefer-offline"
  }
}
6. Cache Warming Strategy
# .gitlab-ci.yml - Scheduled pipeline to warm caches
cache:warm:
  only:
    - schedules
  script:
    - npm ci --cache .npm
    - npm run build
    - echo "Cache warmed at $(date)"
  cache:
    key: "npm-cache-global"
    paths:
      - .npm/
      - node_modules/
    policy: push
7. Intelligent Cache Invalidation
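# .gitlab-ci.yml
variables:
  # Version cache keys to allow controlled invalidation
  CACHE_VERSION: "v2"

.npm_cache: &npm_cache
  cache:
    # $CI_PIPELINE_ID is left out of the key: a per-pipeline key would
    # never be reused by a later pipeline
    key: "$CACHE_VERSION-$CI_COMMIT_REF_SLUG"
    paths:
      - .npm/
      - node_modules/
    # Cache only the listed paths, not untracked files
    # (this setting does not clean up old caches)
    untracked: false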
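To see which key a given branch resolves to, and how bumping CACHE_VERSION invalidates it, the key can be approximated locally. The sketch below follows the documented rules for CI_COMMIT_REF_SLUG (lowercase, non-alphanumerics replaced by `-`, at most 63 characters, no leading or trailing `-`); the function names `ref_slug` and `cache_key` are hypothetical, and this is an approximation for ASCII branch names rather than GitLab's own implementation.

```shell
# Approximate GitLab's CI_COMMIT_REF_SLUG for an ASCII branch name.
ref_slug() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed 's/[^a-z0-9]/-/g' \
    | cut -c1-63 \
    | sed 's/^-*//; s/-*$//'
}

# Compose the versioned cache key: <version>-<ref slug>.
cache_key() {
  version=$1
  branch=$2
  printf '%s-%s\n' "$version" "$(ref_slug "$branch")"
}
```

For example, `cache_key v2 feature/Add_Cache` prints `v2-feature-add-cache`; after bumping the version to `v3` the same branch resolves to `v3-feature-add-cache`, so the old cache is simply no longer looked up.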
8. Distributed Cache with GitLab
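# .gitlab-ci.yml
default:
  cache:
    # The distributed (shared) cache backend itself is configured on the
    # runner, not in this file; this block only defines the key and paths
    key:
      files:
        - package-lock.json
        - .gitlab-ci.yml  # Invalidate on CI changes
    paths:
      - .npm/
      - node_modules/
    # Cache only the listed paths, not untracked files
    # (this setting does not control compression)
    untracked: false
    when: on_success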
📈 Expected Improvements
Before
- npm ci: 90-120 seconds
- Cache hit rate: ~50%
- Total dependency time: ~100 seconds average
After
- npm ci (cache hit): 20-30 seconds
- npm ci (cache miss): 60-70 seconds
- Cache hit rate: ~85%
- Total dependency time: ~35 seconds average
Net Savings
- ~65 seconds per job on average (from ~100 seconds to ~35 seconds)
- Roughly 65% reduction in dependency installation time, which clears the 60% target
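The averages above can be sanity-checked with a weighted mean of the hit and miss times. The sketch below uses integer shell arithmetic; the midpoints of 25 seconds (hit) and 65 seconds (miss) are assumptions taken from the middle of the ranges listed, and the function names are hypothetical.

```shell
# Weighted average install time in whole seconds.
# Arguments: hit rate in percent, seconds on a cache hit, seconds on a miss.
expected_install_seconds() {
  hit_pct=$1
  hit_s=$2
  miss_s=$3
  echo $(( (hit_pct * hit_s + (100 - hit_pct) * miss_s) / 100 ))
}

# Percentage reduction relative to a baseline, rounded down.
reduction_pct() {
  before_s=$1
  after_s=$2
  echo $(( (before_s - after_s) * 100 / before_s ))
}
```

With an 85% hit rate, `expected_install_seconds 85 25 65` prints `31`, a little under the ~35 second estimate, and `reduction_pct 100 35` prints `65`. Even the more conservative 35 second figure therefore leaves some margin over the 60% goal.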
⚠️ Risk Assessment
Low Risk
- npm configuration changes
- Artifact reuse between stages
- Cache key strategies
Medium Risk
- Cache size growth (mitigated by expiration)
- Stale cache issues (mitigated by versioning)
- Network issues with distributed cache
Mitigation
- Regular cache pruning via scheduled pipelines
- Cache version bumping when needed
- Fallback strategies for cache misses
✅ Success Criteria
- npm ci < 30 seconds with cache hit
- Cache hit rate > 80%
- Cache size < 500MB per job
- No stale dependency issues
- Total CI time reduced by 2+ minutes
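The cache size criterion can be checked inside a job before the cache is uploaded. This is a minimal sketch, assuming the cache lives in a single directory such as `.npm`; the function name `cache_within_limit` is hypothetical, and `du` reports disk usage, which can differ slightly from the size of the compressed archive GitLab stores.

```shell
# Succeed when the directory's disk usage is at or below the limit (in MB).
cache_within_limit() {
  dir=$1
  limit_mb=$2
  used_kb=$(du -sk "$dir" | cut -f1)
  [ "$used_kb" -le $(( limit_mb * 1024 )) ]
}
```

A job could end with `cache_within_limit .npm 500 || echo "WARNING: npm cache exceeds 500 MB"` to surface growth early without failing the pipeline.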
📋 Implementation Steps
- Phase 1: Multi-level cache keys
- Phase 2: Artifact sharing between stages
- Phase 3: Docker layer caching
- Phase 4: Cache warming pipelines
- Phase 5: Monitoring and optimization
📊 Monitoring Metrics
- Cache hit/miss ratio
- npm ci execution time
- Cache transfer time
- Total cache size
- Job duration percentiles
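Execution time for a step such as npm ci can be collected with a small wrapper that prints one metric line per command. This is a sketch only: the `timed` function and the `metric <label> seconds=<n> exit=<code>` line format are assumptions, and parsing those lines out of the job log is left to whatever tooling collects them.

```shell
# Run a command, then print how long it took and its exit code.
# Usage: timed <label> <command> [args...]
timed() {
  label=$1
  shift
  start=$(date +%s)
  status=0
  "$@" || status=$?
  end=$(date +%s)
  echo "metric $label seconds=$(( end - start )) exit=$status"
  return "$status"
}
```

In a job script, `timed npm_ci npm ci --cache .npm --prefer-offline` would run the install as usual and append a line such as `metric npm_ci seconds=27 exit=0` to the log; the numbers here are illustrative, not measured.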
🏷️ Advanced Optimizations (Future)
npm Optimization
# Use pnpm for faster installs (70% faster)
pnpm install --frozen-lockfile --prefer-offline
# Or Yarn with PnP (Plug'n'Play)
yarn install --immutable --immutable-cache
Pre-built Docker Images
# Dockerfile.ci
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
# --omit=dev replaces the deprecated --production flag
RUN npm ci --omit=dev
# Push as base image for CI
Distributed npm Cache
# Using Artifactory or Nexus as npm proxy
npm config set registry https://npm-cache.company.internal/
Priority: Medium
Estimated Effort: 1 day
Labels: performance, ci-cd, caching