Struct mz_expr::JoinInputMapper

pub struct JoinInputMapper {
    arities: Vec<usize>,
    input_relation: Vec<usize>,
    prior_arities: Vec<usize>,
}

Any column in a join expression exists in two contexts:

  1. It has a position relative to the result of the join (global).
  2. It has a position relative to the specific input it came from (local).

This utility focuses on taking expressions that are in terms of the local input and re-expressing them in global terms and vice versa.

Methods on this struct that take an equivalences argument are only guaranteed to return a correct answer if the equivalence classes are in canonical form. (See crate::relation::canonicalize::canonicalize_equivalences.)

Fields§

§arities: Vec<usize>

The number of columns per input. All other fields in this struct are derived using the information in this field.

§input_relation: Vec<usize>

Looks up which input each column belongs to. Derived from arities. Stored as a field to avoid recomputation.

§prior_arities: Vec<usize>

The sum of the arities of the previous inputs in the join. Derived from arities. Stored as a field to avoid recomputation.
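
As a rough illustration of how the two derived fields could follow from arities, here is a minimal standalone sketch. The function derive_fields is hypothetical and works on plain vectors; it is not the mz_expr implementation.

```rust
/// Hypothetical helper (not part of mz_expr): derives the two lookup
/// tables described above from the per-input arities.
fn derive_fields(arities: &[usize]) -> (Vec<usize>, Vec<usize>) {
    let mut input_relation = Vec::new();
    let mut prior_arities = Vec::new();
    let mut offset = 0;
    for (input, &arity) in arities.iter().enumerate() {
        // Every column contributed by this input maps back to `input`.
        input_relation.extend(std::iter::repeat(input).take(arity));
        // All columns of earlier inputs precede this input's columns.
        prior_arities.push(offset);
        offset += arity;
    }
    (input_relation, prior_arities)
}
```

For inputs with arities 2, 3, and 1, this sketch yields an input_relation of [0, 0, 1, 1, 1, 2] and prior_arities of [0, 2, 5].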

Implementations§

source§

impl JoinInputMapper

source

pub fn new(inputs: &[MirRelationExpr]) -> Self

Creates a new JoinInputMapper and calculates the mapping of global context columns to local context columns.

source

pub fn new_from_input_types(types: &[RelationType]) -> Self

Creates a new JoinInputMapper and calculates the mapping of global context columns to local context columns. Using this method is more efficient if the input types have been pre-calculated.

source

pub fn new_from_input_arities<I>(arities: I) -> Self
where I: IntoIterator<Item = usize>,

Creates a new JoinInputMapper and calculates the mapping of global context columns to local context columns. Using this method is more efficient if the input arities have been pre-calculated.

source

pub fn total_columns(&self) -> usize

Reports the sum of the number of columns of each input.

source

pub fn total_inputs(&self) -> usize

Reports the total number of inputs in the join.

source

pub fn global_keys<'a, I>( &self, local_keys: I, equivalences: &[Vec<MirScalarExpr>], ) -> Vec<Vec<usize>>
where I: Iterator<Item = &'a Vec<Vec<usize>>>,

Using the keys that came from each local input, figures out which keys remain unique in the larger join. Currently, we only figure out a small subset of the keys that can remain unique.

source

pub fn input_arity(&self, index: usize) -> usize

Returns the arity of a particular input.

source

pub fn local_columns(&self, index: usize) -> Range<usize>

All column numbers in order for a particular input in the local context

source

pub fn global_columns(&self, index: usize) -> Range<usize>

All column numbers in order for a particular input in the global context

source

pub fn map_expr_to_local(&self, expr: MirScalarExpr) -> MirScalarExpr

Takes an expression from the global context and creates a new version where column references have been remapped to the local context. Assumes that all columns in expr are from the same input.

source

pub fn map_expr_to_global( &self, expr: MirScalarExpr, index: usize, ) -> MirScalarExpr

Takes an expression from the local context of the indexth input and creates a new version where column references have been remapped to the global context.

source

pub fn map_column_to_local(&self, column: usize) -> (usize, usize)

Remap a column number from the global context to the local context. Returns a 2-tuple (<new column number>, <index of input>).

source

pub fn map_column_to_global(&self, column: usize, index: usize) -> usize

Remap a column number from a local context to the global context.
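
The two column remappings amount to adding or subtracting the number of columns that precede an input. The following is a minimal sketch under that reading; ToyMapper and its methods are hypothetical names over plain indexes, not the actual JoinInputMapper code.

```rust
/// Toy stand-in for the mapper (hypothetical, for illustration only).
struct ToyMapper {
    /// For each global column, the index of the input it belongs to.
    input_relation: Vec<usize>,
    /// For each input, the number of columns in all earlier inputs.
    prior_arities: Vec<usize>,
}

impl ToyMapper {
    fn new(arities: &[usize]) -> Self {
        let mut input_relation = Vec::new();
        let mut prior_arities = Vec::new();
        let mut offset = 0;
        for (input, &arity) in arities.iter().enumerate() {
            input_relation.extend(std::iter::repeat(input).take(arity));
            prior_arities.push(offset);
            offset += arity;
        }
        ToyMapper { input_relation, prior_arities }
    }

    /// Global -> local: returns (local column number, index of input).
    fn to_local(&self, column: usize) -> (usize, usize) {
        let index = self.input_relation[column];
        (column - self.prior_arities[index], index)
    }

    /// Local -> global: shifts by the columns of all earlier inputs.
    fn to_global(&self, column: usize, index: usize) -> usize {
        column + self.prior_arities[index]
    }
}
```

With three two-column inputs, global column 5 corresponds to local column 1 of input 2, and mapping that pair back to the global context gives 5 again.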

source

pub fn split_column_set_by_input<'a, I>( &self, columns: I, ) -> Vec<BTreeSet<usize>>
where I: Iterator<Item = &'a usize>,

Takes a sequence of columns in the global context and splits it into a Vec containing self.total_inputs() BTreeSets, each containing the localized columns that belong to the particular input.
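
A sketch of the splitting behavior described above, written against plain arities rather than a JoinInputMapper. The helper split_by_input is hypothetical, and ignoring out-of-range columns is an assumption of this sketch, not documented behavior.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: splits global column numbers into one set of
/// local column numbers per input.
fn split_by_input(arities: &[usize], columns: &[usize]) -> Vec<BTreeSet<usize>> {
    let mut result = vec![BTreeSet::new(); arities.len()];
    for &column in columns {
        // Walk the inputs until the global column falls inside one.
        // Columns beyond the last input are ignored in this sketch.
        let mut offset = 0;
        for (input, &arity) in arities.iter().enumerate() {
            if column < offset + arity {
                result[input].insert(column - offset);
                break;
            }
            offset += arity;
        }
    }
    result
}
```

For three two-column inputs and the global columns 0, 5, 4, and 2, the sketch returns the sets {0}, {0}, and {0, 1}.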

source

pub fn lookup_inputs(&self, expr: &MirScalarExpr) -> impl Iterator<Item = usize>

Finds the sorted, deduplicated set of inputs that an expression references.

source

pub fn single_input(&self, expr: &MirScalarExpr) -> Option<usize>

Returns the index of the only input referenced in the given expression.

source

pub fn is_localized(&self, expr: &MirScalarExpr, index: usize) -> bool

Returns whether the given expr refers to columns of only the indexth input.
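
The three lookups above can be pictured on a flat list of global column references instead of a MirScalarExpr. The toy_ functions below are hypothetical. Two choices are assumptions of this sketch rather than documented behavior: toy_single_input returns None for zero or several inputs, and an expression with no column references counts as localized.

```rust
use std::collections::BTreeSet;

/// Sorted, deduplicated inputs referenced by a list of global columns.
/// `input_relation[c]` is the input that global column `c` belongs to.
fn toy_lookup_inputs(input_relation: &[usize], columns: &[usize]) -> Vec<usize> {
    let inputs: BTreeSet<usize> = columns.iter().map(|&c| input_relation[c]).collect();
    // A BTreeSet iterates in ascending order, so the result is sorted.
    inputs.into_iter().collect()
}

/// Some(input) if exactly one input is referenced (assumption: None otherwise).
fn toy_single_input(input_relation: &[usize], columns: &[usize]) -> Option<usize> {
    match toy_lookup_inputs(input_relation, columns).as_slice() {
        [only] => Some(*only),
        _ => None,
    }
}

/// True if every referenced column belongs to input `index`.
fn toy_is_localized(input_relation: &[usize], columns: &[usize], index: usize) -> bool {
    toy_lookup_inputs(input_relation, columns)
        .iter()
        .all(|&input| input == index)
}
```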

source

pub fn find_bound_expr( &self, expr: &MirScalarExpr, bound_inputs: &[usize], equivalences: &[Vec<MirScalarExpr>], ) -> Option<MirScalarExpr>

Takes an expression in the global context and looks in equivalences for an equivalent expression (also expressed in the global context) that belongs to one or more of the inputs in bound_inputs.

§Examples
use mz_repr::{Datum, ColumnType, RelationType, ScalarType};
use mz_expr::{JoinInputMapper, MirRelationExpr, MirScalarExpr};

// A two-column schema common to each of the three inputs
let schema = RelationType::new(vec![
  ScalarType::Int32.nullable(false),
  ScalarType::Int32.nullable(false),
]);

// the specific data are not important here.
let data = vec![Datum::Int32(0), Datum::Int32(1)];
let input0 = MirRelationExpr::constant(vec![data.clone()], schema.clone());
let input1 = MirRelationExpr::constant(vec![data.clone()], schema.clone());
let input2 = MirRelationExpr::constant(vec![data.clone()], schema.clone());

// [input0(#0) = input2(#1)], [input0(#1) = input1(#0) = input2(#0)]
let equivalences = vec![
  vec![MirScalarExpr::Column(0), MirScalarExpr::Column(5)],
  vec![MirScalarExpr::Column(1), MirScalarExpr::Column(2), MirScalarExpr::Column(4)],
];

let input_mapper = JoinInputMapper::new(&[input0, input1, input2]);
assert_eq!(
  Some(MirScalarExpr::Column(4)),
  input_mapper.find_bound_expr(&MirScalarExpr::Column(2), &[2], &equivalences)
);
assert_eq!(
  None,
  input_mapper.find_bound_expr(&MirScalarExpr::Column(0), &[1], &equivalences)
);
source

pub fn try_localize_to_input_with_bound_expr( &self, expr: &mut MirScalarExpr, index: usize, equivalences: &[Vec<MirScalarExpr>], ) -> bool

Try to rewrite expr from the global context so that all the columns point to the indexth input by replacing subexpressions with their bound equivalents in the indexth input if necessary. Returns whether the rewriting was successful. If it returns true, then expr is in the context of the indexth input. If it returns false, some subexpressions might still have been rewritten, but expr remains in the global context.

source

pub fn consequence_for_input( &self, expr: &MirScalarExpr, index: usize, ) -> Option<MirScalarExpr>

Try to find a consequence c of the given expression e for the given input.

If we return Some(c), that means

  1. c uses only columns from the given input;
  2. if c doesn’t hold on a row of the input, then e also wouldn’t hold;
  3. if c holds on a row of the input, then e might or might not hold.

Points 1 and 2 mean that if we have a join with predicate e, then we can use c for pre-filtering a join input before the join. However, point 3 means that e shouldn’t be deleted from the join predicates, i.e., we can’t do a “traditional” predicate pushdown.

Note that “c is a consequence of e” is the same thing as point 2; see https://en.wikipedia.org/wiki/Contraposition

Example: For (t1.f2 = 3 AND t2.f2 = 4) OR (t1.f2 = 5 AND t2.f2 = 6) we find t1.f2 = 3 OR t1.f2 = 5 for t1, and t2.f2 = 4 OR t2.f2 = 6 for t2.

Further examples are in TPC-H Q07, Q19, and chbench Q07, Q19.

Parameters:

  • expr: The expression e from above. try_localize_to_input_with_bound_expr should be called on expr before calling this method!
  • index: The index of the join input whose columns we will use.
  • equivalences: Join equivalences that we can use for try_map_to_input_with_bound_expr.

If successful, the returned expression is in the local context of the specified input.
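
To make the t1/t2 example above concrete, here is a toy model of consequence extraction over a small predicate type. Pred and consequence are hypothetical and much simpler than MirScalarExpr. The sketch shows only the AND/OR reasoning and is not the mz_expr implementation.

```rust
/// Toy predicate type (hypothetical; not MirScalarExpr).
#[derive(Clone, Debug, PartialEq)]
enum Pred {
    /// An atomic condition that reads columns of exactly one input.
    Atom { input: usize, text: &'static str },
    And(Vec<Pred>),
    Or(Vec<Pred>),
}

/// Returns a predicate that uses only `input` and that must hold
/// whenever `e` holds, or None if this sketch cannot find one.
fn consequence(e: &Pred, input: usize) -> Option<Pred> {
    match e {
        Pred::Atom { input: i, .. } => (*i == input).then(|| e.clone()),
        // If the AND holds, every conjunct holds, so any conjunct's
        // consequence is also a consequence of the whole AND.
        Pred::And(parts) => {
            let found: Vec<Pred> =
                parts.iter().filter_map(|p| consequence(p, input)).collect();
            match found.len() {
                0 => None,
                1 => found.into_iter().next(),
                _ => Some(Pred::And(found)),
            }
        }
        // If the OR holds, we do not know which disjunct holds, so
        // every disjunct must contribute a consequence.
        Pred::Or(parts) => {
            let found: Option<Vec<Pred>> =
                parts.iter().map(|p| consequence(p, input)).collect();
            found.map(Pred::Or)
        }
    }
}
```

With t1 as input 0 and t2 as input 1, the predicate from the example yields t1.f2 = 3 OR t1.f2 = 5 for input 0. A predicate such as t1.f2 = 3 OR t2.f2 = 4 yields None for either input, because one disjunct says nothing about that input.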

Trait Implementations§

source§

impl Debug for JoinInputMapper

source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

Blanket Implementations§

source§

impl<T> Any for T
where T: 'static + ?Sized,

source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
source§

impl<T> Borrow<T> for T
where T: ?Sized,

source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
source§

impl<T, U> CastInto<U> for T
where U: CastFrom<T>,

source§

fn cast_into(self) -> U

Performs the cast.
source§

impl<T> CopyAs<T> for T

source§

fn copy_as(self) -> T

source§

impl<T> From<T> for T

source§

fn from(t: T) -> T

Returns the argument unchanged.

source§

impl<T> FutureExt for T

source§

fn with_context(self, otel_cx: Context) -> WithContext<Self>

Attaches the provided Context to this type, returning a WithContext wrapper.
source§

fn with_current_context(self) -> WithContext<Self>

Attaches the current Context to this type, returning a WithContext wrapper.
source§

impl<T> Instrument for T

source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.
source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.
source§

impl<T, U> Into<U> for T
where U: From<T>,

source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

source§

impl<T> IntoRequest<T> for T

source§

fn into_request(self) -> Request<T>

Wrap the input message T in a tonic::Request
source§

impl<T, U> OverrideFrom<Option<&T>> for U
where U: OverrideFrom<T>,

source§

fn override_from(self, layer: &Option<&T>) -> U

Override the configuration represented by Self with values from the given layer.
source§

impl<T> Pointable for T

source§

const ALIGN: usize = _

The alignment of pointer.
§

type Init = T

The type for initializers.
source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.
source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.
source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.
source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.
source§

impl<P, R> ProtoType<R> for P
where R: RustType<P>,

source§

impl<T> Same for T

§

type Output = T

Should always be Self
source§

impl<'a, S, T> Semigroup<&'a S> for T
where T: Semigroup<S>,

source§

fn plus_equals(&mut self, rhs: &&'a S)

The method of std::ops::AddAssign, for types that do not implement AddAssign.
source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

§

type Error = Infallible

The type returned in the event of a conversion error.
source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

source§

fn vzip(self) -> V

source§

impl<T> WithSubscriber for T

source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.
source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.
source§

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,