Change detection for incremental deployment.
This module implements a Dirty Propagation Algorithm to determine which database objects, schemas, and clusters need redeployment after changes.
§Algorithm Overview
The algorithm computes three result sets via fixed-point iteration:
- DirtyStmt(object) - All objects that must be reprocessed
- DirtyCluster(cluster) - All clusters that must be refreshed
- DirtySchema(database, schema) - All schemas containing dirty objects
§Seeds
The fixed-point starts from two caller-supplied inputs:
- ChangedStmt(O) - objects whose hashes differ between the old and new snapshots.
- ForcedSchema(Db, Sch) - schemas the caller marks dirty unconditionally (stage --redeploy-schema), redeployed even when nothing in them changed.
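As a rough illustration of the ChangedStmt seed, the sketch below diffs two snapshots of content hashes. The `Snapshot` alias and the `changed_stmts` function are hypothetical names, not the API of the diff module. It also assumes that objects present only in the new snapshot count as changed and that objects present only in the old snapshot are ignored; the text above does not say how added or removed objects are handled.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical snapshot shape: object name -> content hash.
type Snapshot = HashMap<String, String>;

/// Returns every object in `new` whose hash is missing from `old`
/// or differs from the hash recorded in `old`.
/// Assumption: objects that exist only in `old` are not reported.
fn changed_stmts(old: &Snapshot, new: &Snapshot) -> BTreeSet<String> {
    new.iter()
        .filter(|(name, hash)| old.get(*name) != Some(*hash))
        .map(|(name, _)| name.clone())
        .collect()
}
```

Returning a `BTreeSet` keeps the seed deterministic across runs, which makes verbose logs of the later fixed-point computation easier to compare.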
§Propagation Rules
§Rule Category 1 — Statement Dirtiness
DirtyStmt(O) :- ChangedStmt(O) # Changed objects are dirty
DirtyStmt(O) :- StmtUsesCluster(O, C), DirtyCluster(C) # Objects whose statement runs on a dirty cluster are dirty
DirtyStmt(O) :- DependsOn(O, P), DirtyStmt(P), NOT IsReplacement(P) # Downstream dependents are dirty, except through replacement MVs
DirtyStmt(O) :- DirtySchema(Db, Sch), ObjectInSchema(O, Db, Sch) # Every object in a dirty schema is dirty
Replacement MVs: A replacement MV (in a stable-API schema, redeployed in place) has exactly one special property: its dirtiness does not propagate downstream to dependents in other schemas. Otherwise it behaves like any other compute object: a dirty replacement MV dirties its schema, a dirty stable schema redeploys all of its MVs atomically, and a dirty cluster propagates normally.
Key Insight: Index clusters do NOT cause objects to be marked dirty. Indexes are physical optimizations that can be managed independently without redeploying the object’s statement. If object A’s index uses a dirty cluster, object A is NOT marked for redeployment.
§Rule Category 2 — Cluster Dirtiness
DirtyCluster(C) :- ChangedStmt(O), StmtUsesCluster(O, C), NOT IsSink(O), ClusterBoundary(C) # Clusters of changed statements are dirty within the boundary
DirtyCluster(C) :- ChangedStmt(O), IndexUsesCluster(O, _, C), NOT IsSink(O), ClusterBoundary(C) # Clusters of changed indexes are dirty within the boundary
Note: Clusters are marked dirty only when an object's STATEMENT itself changes (ChangedStmt), not when the object is dirty for other reasons (dependencies, schema propagation, etc.). Sinks are excluded because they write to external systems and are created after the swap.
ClusterBoundary is the set of clusters referenced by statements or indexes in the project. A cluster can become dirty only if it is both used by a changed object and present in that boundary.
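Both Category 2 rules read only base facts (ChangedStmt, the two cluster-usage relations, IsSink, ClusterBoundary), so as written they can be evaluated in a single pass with no iteration. The sketch below shows one way to do that. `ClusterFacts` and `dirty_clusters` are hypothetical names chosen for illustration, and the boundary is taken as an explicit input rather than derived here.

```rust
use std::collections::BTreeSet;

/// Hypothetical base facts; field names mirror the relations in the rules.
struct ClusterFacts {
    changed_stmt: BTreeSet<String>,                     // ChangedStmt(O)
    is_sink: BTreeSet<String>,                          // IsSink(O)
    cluster_boundary: BTreeSet<String>,                 // ClusterBoundary(C)
    stmt_uses_cluster: Vec<(String, String)>,           // StmtUsesCluster(O, C)
    index_uses_cluster: Vec<(String, String, String)>,  // IndexUsesCluster(O, Index, C)
}

/// Evaluates both DirtyCluster rules in one pass over the base facts.
fn dirty_clusters(f: &ClusterFacts) -> BTreeSet<String> {
    // An object contributes only if its statement changed and it is not a sink.
    let seeds = |o: &String| f.changed_stmt.contains(o) && !f.is_sink.contains(o);

    // Clusters reached through the object's own statement.
    let via_stmt = f
        .stmt_uses_cluster
        .iter()
        .filter(|(o, _)| seeds(o))
        .map(|(_, c)| c);
    // Clusters reached through one of the object's indexes.
    let via_index = f
        .index_uses_cluster
        .iter()
        .filter(|(o, _, _)| seeds(o))
        .map(|(_, _, c)| c);

    // Keep only clusters inside the project boundary.
    via_stmt
        .chain(via_index)
        .filter(|c| f.cluster_boundary.contains(*c))
        .cloned()
        .collect()
}
```

Because the result does not depend on DirtyStmt or DirtySchema, it could be computed once and passed into the fixed-point loop as a constant input.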
§Rule Category 3 — Schema Dirtiness
DirtySchema(Db, Sch) :- ForcedSchema(Db, Sch) # Forced schemas are dirty up front (seed)
DirtySchema(Db, Sch) :- DirtyStmt(O), ObjectInSchema(O, Db, Sch), NOT IsSink(O) # Dirty objects make their schemas dirty (excluding sinks)
Key Property: All dirty objects (except sinks) contribute to schema dirtiness, which triggers schema-level atomic redeployment. Sinks are excluded because they are created after the swap during apply and shouldn’t cause other objects to be redeployed.
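Categories 1 and 3 are mutually recursive: a dirty object dirties its schema, and a dirty schema dirties every object in it. A minimal sketch of a naive fixed-point loop over those rules follows. `Facts` and `propagate` are hypothetical names, DirtyCluster is taken as a precomputed input, and this simple re-scan strategy is an assumption for illustration rather than a description of how the datalog module evaluates the rules.

```rust
use std::collections::BTreeSet;

/// Hypothetical base facts; field names mirror the relations in the rules.
#[derive(Default)]
struct Facts {
    changed_stmt: BTreeSet<String>,                   // ChangedStmt(O)
    forced_schema: BTreeSet<(String, String)>,        // ForcedSchema(Db, Sch)
    dirty_cluster: BTreeSet<String>,                  // DirtyCluster(C), precomputed
    stmt_uses_cluster: Vec<(String, String)>,         // StmtUsesCluster(O, C)
    depends_on: Vec<(String, String)>,                // DependsOn(O, P)
    object_in_schema: Vec<(String, String, String)>,  // ObjectInSchema(O, Db, Sch)
    is_replacement: BTreeSet<String>,                 // IsReplacement(P)
    is_sink: BTreeSet<String>,                        // IsSink(O)
}

/// Returns (DirtyStmt, DirtySchema) once no rule adds anything new.
fn propagate(f: &Facts) -> (BTreeSet<String>, BTreeSet<(String, String)>) {
    // Seeds: changed objects and forced schemas.
    let mut dirty_stmt = f.changed_stmt.clone();
    let mut dirty_schema = f.forced_schema.clone();

    // Statement clusters only; index clusters never dirty an object.
    // DirtyCluster is fixed here, so this rule needs a single pass.
    for (o, c) in &f.stmt_uses_cluster {
        if f.dirty_cluster.contains(c) {
            dirty_stmt.insert(o.clone());
        }
    }

    loop {
        let before = (dirty_stmt.len(), dirty_schema.len());

        // Dependents of dirty objects, unless the parent is a replacement MV.
        for (o, p) in &f.depends_on {
            if dirty_stmt.contains(p) && !f.is_replacement.contains(p) {
                dirty_stmt.insert(o.clone());
            }
        }
        for (o, db, sch) in &f.object_in_schema {
            let key = (db.clone(), sch.clone());
            // Every object in a dirty schema is dirty.
            if dirty_schema.contains(&key) {
                dirty_stmt.insert(o.clone());
            }
            // Dirty non-sink objects make their schema dirty.
            if dirty_stmt.contains(o) && !f.is_sink.contains(o) {
                dirty_schema.insert(key);
            }
        }

        // Both sets only grow, so unchanged sizes mean a fixed point.
        if before == (dirty_stmt.len(), dirty_schema.len()) {
            return (dirty_stmt, dirty_schema);
        }
    }
}
```

In this sketch a dirty replacement MV still dirties its own schema, so its siblings are picked up through the schema rule, while a dependent in another schema stays clean unless something else dirties it. Termination follows from the sets being finite and growing monotonically.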
Modules§
- base_facts 🔒 - Base fact extraction from a planned project.
- datalog 🔒 - Datalog-style fixed-point computation of dirty objects, clusters, and schemas.
- diff 🔒 - Snapshot diff: finds objects whose hashes changed between two deployments.
- logging 🔒 - Verbose logging helpers for the Datalog fixed-point computation.
- types 🔒 - Core ChangeSet type and its display formatting.