Crate mz_transform


Transformations for relation expressions.

This crate contains traits, types, and methods suitable for transforming MirRelationExpr types in ways that preserve semantics and improve performance. The core trait is Transform, and many implementors of this trait can be boxed and iterated over. Some common transformation patterns are wrapped as Transform implementors themselves.

The crate also contains the beginnings of whole-dataflow optimization, which uses the same analyses but applies them across multiple dataflow elements.
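As a rough illustration of the "boxed and iterated over" pattern described above, the following sketch uses a toy expression type and a toy trait. `Expr`, `ToyTransform`, `FuseFilters`, `CancelDoubleNegate`, and this `Fixpoint` are hypothetical names made up for the example; the real `Transform` trait and `MirRelationExpr` have different definitions and signatures.

```rust
// Toy stand-in for a relation expression. This is NOT MirRelationExpr.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Get(String),
    Filter { input: Box<Expr>, predicates: Vec<String> },
    Negate(Box<Expr>),
}

// Hypothetical trait shape, assumed for illustration only.
trait ToyTransform {
    fn transform(&self, expr: &mut Expr);
}

// Fuses Filter(Filter(x, p1), p2) into Filter(x, p1 ++ p2), bottom-up.
struct FuseFilters;
impl ToyTransform for FuseFilters {
    fn transform(&self, expr: &mut Expr) {
        // Visit the child first so that nested filters collapse in one pass.
        match expr {
            Expr::Get(_) => {}
            Expr::Filter { input, .. } | Expr::Negate(input) => self.transform(input),
        }
        if let Expr::Filter { input, predicates } = expr {
            if let Expr::Filter { input: inner, predicates: inner_preds } = &mut **input {
                let mut fused = std::mem::take(inner_preds);
                fused.append(predicates);
                // Move the grandchild out, leaving a throwaway placeholder behind.
                let grandchild = std::mem::replace(&mut **inner, Expr::Get(String::new()));
                *predicates = fused;
                **input = grandchild;
            }
        }
    }
}

// Rewrites Negate(Negate(x)) to x, bottom-up.
struct CancelDoubleNegate;
impl ToyTransform for CancelDoubleNegate {
    fn transform(&self, expr: &mut Expr) {
        match expr {
            Expr::Get(_) => {}
            Expr::Filter { input, .. } | Expr::Negate(input) => self.transform(input),
        }
        if let Expr::Negate(outer) = expr {
            if let Expr::Negate(inner) = &mut **outer {
                let x = std::mem::replace(&mut **inner, Expr::Get(String::new()));
                *expr = x;
            }
        }
    }
}

// A common pattern wrapped as a transform itself: repeat a list of boxed
// transforms until the expression stops changing or a limit is reached.
struct Fixpoint {
    transforms: Vec<Box<dyn ToyTransform>>,
    limit: usize,
}
impl ToyTransform for Fixpoint {
    fn transform(&self, expr: &mut Expr) {
        for _ in 0..self.limit {
            let before = expr.clone();
            for t in &self.transforms {
                t.transform(expr);
            }
            if *expr == before {
                break;
            }
        }
    }
}
```

Because every rewrite sits behind the same trait object, a pipeline is just a `Vec<Box<dyn ToyTransform>>`, and a wrapper such as the toy `Fixpoint` can itself be placed in a larger pipeline.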



  • Traits and types for reusable expression analysis
  • Transformations that bring relation expressions to their canonical form.
  • Canonicalizes MFPs, e.g., performs CSE on the scalar expressions, eliminates identity MFPs.
  • Transformations based on pulling information about individual columns from sources.
  • Transformations that don’t fit into one of the canonicalization, fusion, movement, or ordering buckets.
  • Common subexpression elimination.
  • Whole-dataflow optimization.
  • Transformation based on pushing demand information about columns toward sources.
  • Propagates expression equivalence from leaves to root, and back down again.
  • Replace operators on constant collections with constant collections.
  • Transformations that fuse together others of their kind.
  • Determines the join implementation for join operators.
  • Looks for predicates of the form <expr> = literal that can be sped up using an index. More specifically, looks for an MFP on top of a Get, where the MFP has an appropriate filter and the Get has a matching index, and converts each such pattern to an IndexedFilter join, which is a semi-join with a constant collection.
  • Hoist literal values from maps wherever possible.
  • Analysis to identify monotonic collections, especially TopK inputs.
  • Transformations that move relation expressions up (lifting) and down (pushdown) the tree.
  • Push non-null requirements toward sources.
  • Harvests information about non-nullability of columns from sources.
  • Normalize the structure of Let and LetRec operators in expressions.
  • Normalize the structure of various operators.
  • Notices that the optimizer wants to show to users.
  • Transformations that impose a canonical order on the inputs of multi-input relation expressions.
  • Pushes predicates down through other operators.
  • Removes a Reduce when the keys of the Reduce are already unique keys of its input.
  • Tries to convert a reduce around a join to a join of reduces. Also absorbs Map operators into Reduce operators.
  • Remove redundant collections of distinct elements from joins.
  • Remove semijoins that are applied multiple times to no further effect.
  • Definition and trait instances for working with symbolic algebraic expressions (SymbolicExpression).
  • Remove Threshold operators when we are certain no records have negative multiplicity.
  • Check that the visible type of each query has not been changed.
  • Detects an input being unioned with its negation and cancels them out.
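To make one of the summaries above concrete, here is a minimal sketch of cancelling an input that is unioned with its own negation. `Rel` and `cancel_union_negate` are hypothetical names for a toy model; the sketch only compares inputs structurally and says nothing about how the crate detects or rewrites such pairs.

```rust
// Toy relation type, assumed for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Rel {
    Get(String),
    Negate(Box<Rel>),
}

// Given the inputs of a union, repeatedly remove pairs (x, Negate(x)).
// Whatever remains is the reduced list of union inputs; how an empty
// list should be represented is left to the caller.
fn cancel_union_negate(inputs: &mut Vec<Rel>) {
    let mut i = 0;
    while i < inputs.len() {
        let negated = Rel::Negate(Box::new(inputs[i].clone()));
        if let Some(j) = inputs.iter().position(|r| *r == negated) {
            // Remove the larger index first so the smaller one stays valid.
            let (hi, lo) = if i > j { (i, j) } else { (j, i) };
            inputs.remove(hi);
            inputs.remove(lo);
            // Restart the scan, since indices have shifted.
            i = 0;
        } else {
            i += 1;
        }
    }
}
```

The scan is quadratic in the number of inputs, which is acceptable for a sketch where unions have only a handful of branches.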


  • Compute the conjunction of a variadic number of expressions.
  • Compute the disjunction of a variadic number of expressions.
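One way to accept a variadic number of expressions in Rust is a declarative macro. The sketch below shows that general technique on a toy scalar type; `Scalar`, `toy_all!`, and `toy_any!` are hypothetical names, and the items documented above may be defined quite differently.

```rust
// Toy scalar expression type, assumed for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Column(usize),
    Literal(bool),
    And(Vec<Scalar>),
    Or(Vec<Scalar>),
}

// Builds the conjunction of one or more expressions (trailing comma allowed).
macro_rules! toy_all {
    ($($e:expr),+ $(,)?) => {
        Scalar::And(vec![$($e),+])
    };
}

// Builds the disjunction of one or more expressions (trailing comma allowed).
macro_rules! toy_any {
    ($($e:expr),+ $(,)?) => {
        Scalar::Or(vec![$($e),+])
    };
}
```

A call such as `toy_all!(Scalar::Column(0), toy_any!(Scalar::Column(1), Scalar::Literal(true)))` nests the two forms without the caller having to build the vectors by hand.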




  • A trait for a type that can answer questions about what indexes exist.
  • A trait for a type that can estimate statistics about a given GlobalId.
  • Types capable of transforming relation expressions.