Cache Architecture

INFO

This page provides a technical overview of the Tuist cache service architecture. It is primarily intended for self-hosting users and contributors who need to understand the internal workings of the service. General users who only want to use the cache do not need to read this.

The Tuist cache service is a standalone service that provides Content Addressable Storage (CAS) for build artifacts and a key-value store for cache metadata.

Overview

The service uses a two-tier storage architecture plus local SQLite metadata:

  • Local disk: Primary storage for low-latency cache hits
  • S3: Durable storage that persists artifacts and allows recovery after eviction
  • SQLite: Local metadata for artifact access tracking, orphan cleanup, background jobs, and key-value cache data
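The read path implied by these tiers can be sketched as follows. This is a minimal illustration, not the service's actual implementation; the function names and the injected `fetch_from_s3` / `record_access` callbacks are hypothetical stand-ins for the real S3 client and SQLite access tracking:

```python
import os

def read_artifact(artifact_path, disk_root, fetch_from_s3, record_access):
    """Two-tier read: prefer local disk for low latency, fall back to S3.

    fetch_from_s3 and record_access are injected stand-ins for the
    real S3 client and the SQLite access-tracking write.
    """
    local = os.path.join(disk_root, artifact_path)
    if os.path.exists(local):
        record_access(artifact_path)      # update access metadata in SQLite
        with open(local, "rb") as f:
            return f.read()
    data = fetch_from_s3(artifact_path)   # durable tier; survives disk eviction
    record_access(artifact_path)
    return data
```

The key property is that a local eviction is never a cache miss for the client, only a slower read served from the durable tier.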

Components

Nginx

Nginx serves as the entry point and handles efficient file delivery using X-Accel-Redirect:

  • Downloads: The cache service validates authentication, then returns an X-Accel-Redirect header. Nginx serves the file directly from disk or proxies from S3.
  • Uploads: Nginx proxies requests to the cache service, which streams data to disk.
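The service's side of the X-Accel-Redirect contract can be sketched as a response builder. This is a hypothetical illustration (the URIs and function name are assumptions): the service returns an empty body with an `X-Accel-Redirect` header pointing at an `internal` Nginx location, and Nginx takes over delivery of the actual bytes:

```python
def download_response(artifact_exists_locally, local_uri, s3_presigned_uri):
    """Build the response the cache service hands back to Nginx.

    Nginx must map the returned URI to an `internal` location that
    either serves the file from disk or proxies the presigned S3 URL.
    """
    target = local_uri if artifact_exists_locally else s3_presigned_uri
    return {
        "status": 200,
        "headers": {"X-Accel-Redirect": target},
        "body": b"",  # Nginx replaces the body with the file contents
    }
```

This keeps large file transfers out of the application process entirely; the service only does authentication and path resolution.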

Content Addressable Storage

Artifacts are stored on local disk in a sharded directory structure:

  • Path: {account}/{project}/cas/{shard1}/{shard2}/{artifact_id}
  • Sharding: The first four characters of the artifact ID form a two-level shard (e.g., artifact ABCD1234 is stored under AB/CD/ABCD1234)
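The path construction is simple enough to show directly. A minimal sketch (the function name is an assumption; the path layout follows the scheme above):

```python
def cas_path(account, project, artifact_id):
    """Build the sharded on-disk path for a CAS artifact.

    The first four characters of the artifact ID form a two-level
    shard, which keeps per-directory fan-out bounded even with
    millions of artifacts.
    """
    shard1, shard2 = artifact_id[0:2], artifact_id[2:4]
    return f"{account}/{project}/cas/{shard1}/{shard2}/{artifact_id}"
```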

SQLite Metadata

The cache service uses two SQLite databases:

  • Primary metadata DB: Stores cache_artifacts, orphan scan cursors, Oban jobs, and other service metadata.
  • Key-value DB: Stores key_value_entries and key_value_entry_hashes in a dedicated SQLite file.

The key-value store is split into its own database so it can use SQLite incremental auto-vacuum without affecting artifact metadata and orphan cleanup state.
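The split can be reproduced with plain SQLite. A minimal sketch, assuming simplified schemas (the real tables have more columns): the dedicated key-value database enables incremental auto-vacuum, which must be configured before data is written, while the metadata database keeps SQLite's default:

```python
import sqlite3

def open_databases(meta_path, kv_path):
    """Open the two databases with their differing vacuum settings."""
    meta = sqlite3.connect(meta_path)
    meta.execute(
        "CREATE TABLE IF NOT EXISTS cache_artifacts "
        "(id TEXT PRIMARY KEY, last_access INTEGER)"
    )

    kv = sqlite3.connect(kv_path)
    # auto_vacuum must be set before the database has any content;
    # VACUUM applies the setting to the file.
    kv.execute("PRAGMA auto_vacuum = INCREMENTAL")
    kv.execute("VACUUM")
    kv.execute(
        "CREATE TABLE IF NOT EXISTS key_value_entries "
        "(key TEXT PRIMARY KEY, value BLOB)"
    )
    return meta, kv
```

Because `auto_vacuum` is a whole-database setting, isolating the key-value tables in their own file is what makes it possible to reclaim space there without touching the artifact metadata database.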

S3 Integration

S3 provides durable storage:

  • Background uploads: After writing to disk, artifacts are queued for upload to S3 via a background worker that runs every minute
  • On-demand hydration: When a local artifact is missing, the request is served immediately via a presigned S3 URL while the artifact is queued for background download to local disk
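The hydration path above can be sketched as follows. The function and the `presign` / `enqueue_hydration` callbacks are hypothetical stand-ins for the real S3 client and background job queue; the point is that the client is never blocked on the disk copy:

```python
def handle_local_miss(artifact_id, presign, enqueue_hydration):
    """On a local-disk miss: serve immediately via a presigned S3 URL,
    and queue a background job to pull the artifact back onto disk
    so the next request is a fast local hit.
    """
    url = presign(artifact_id)        # short-lived S3 GET URL
    enqueue_hydration(artifact_id)    # background download to local disk
    return {"redirect": url}
```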

Disk Eviction

The service manages disk space using multiple background processes:

  • CAS disk eviction uses LRU semantics backed by cache_artifacts
  • When disk usage exceeds 85%, the oldest artifacts are deleted until usage drops to 70%
  • Artifacts remain in S3 after local eviction
  • KV eviction removes old key-value entries by retention and can also shrink the dedicated KV database when it grows past its configured size budget

Orphan Cleanup

The service also runs an orphan cleanup worker for disk artifacts:

  • It scans the storage tree for files that exist on disk but have no corresponding cache_artifacts row.
  • This can happen if a file is written to disk but the metadata write is lost before the SQLite buffer flush completes.
  • Files newer than a safety window are ignored to avoid racing with in-flight uploads.
  • If an orphan is deleted and later requested again, the next cache miss causes it to be uploaded again, so the system self-heals.
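The scan above can be sketched as a walk over the storage tree. A minimal illustration with hypothetical names; the one-hour safety window here is an assumption for the example, not the service's configured value:

```python
import os
import time

def find_orphans(disk_root, known_ids, safety_window_s=3600, now=None):
    """Return on-disk files with no matching cache_artifacts row.

    Files younger than the safety window are skipped so in-flight
    uploads (file on disk, metadata not yet committed) are not
    mistaken for orphans.
    """
    now = time.time() if now is None else now
    orphans = []
    for dirpath, _, filenames in os.walk(disk_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if name in known_ids:
                continue
            if now - os.path.getmtime(path) < safety_window_s:
                continue  # possibly an upload whose metadata is still pending
            orphans.append(path)
    return orphans
```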

Authentication

The cache delegates authentication to the Tuist server by calling the /api/projects endpoint and caching results (10 minutes for success, 3 seconds for failure).
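The asymmetric TTLs can be sketched as a small cache in front of the upstream call. This is an illustrative sketch; the class and the injected `check_upstream` callback (standing in for the `/api/projects` call) are assumptions, but the 10-minute/3-second TTLs follow the text above:

```python
import time

class AuthCache:
    """Cache auth decisions from the Tuist server with asymmetric TTLs:
    successes for 10 minutes, failures for 3 seconds.
    """

    def __init__(self, check_upstream, clock=time.monotonic):
        self.check_upstream = check_upstream
        self.clock = clock
        self.entries = {}  # token -> (allowed, expires_at)

    def authorized(self, token):
        cached = self.entries.get(token)
        if cached and self.clock() < cached[1]:
            return cached[0]
        allowed = self.check_upstream(token)
        ttl = 600 if allowed else 3
        self.entries[token] = (allowed, self.clock() + ttl)
        return allowed
```

The short failure TTL means a token that was just granted access (or a transient server error) is retried within seconds, while valid tokens avoid hammering the Tuist server on every artifact request.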

Request Flows

Download

A client requests an artifact; the cache service authenticates the request against the Tuist server and responds with an X-Accel-Redirect header. Nginx then serves the file directly from local disk or, on a local miss, proxies a presigned S3 URL while the artifact is queued for hydration back to disk.

Upload

Nginx proxies the upload to the cache service, which streams the data to local disk, records metadata in SQLite, and queues the artifact for background upload to S3.

API Endpoints

  • GET /up: Health check
  • GET /metrics: Prometheus metrics
  • GET /api/cache/cas/:id: Download CAS artifact
  • POST /api/cache/cas/:id: Upload CAS artifact
  • GET /api/cache/keyvalue/:cas_id: Get key-value entry
  • PUT /api/cache/keyvalue: Store key-value entry
  • HEAD /api/cache/module/:id: Check if module artifact exists
  • GET /api/cache/module/:id: Download module artifact
  • POST /api/cache/module/start: Start multipart upload
  • POST /api/cache/module/part: Upload part
  • POST /api/cache/module/complete: Complete multipart upload

Released under the MIT License.