Struct mz_expr::JoinInputMapper
pub struct JoinInputMapper {
arities: Vec<usize>,
input_relation: Vec<usize>,
prior_arities: Vec<usize>,
}
Any column in a join expression exists in two contexts:
- It has a position relative to the result of the join (global)
- It has a position relative to the specific input it came from (local)
This utility focuses on taking expressions that are in terms of the local input and re-expressing them in global terms and vice versa.
Methods in this class that take an argument equivalences are only guaranteed to return a correct answer if equivalence classes are in canonical form. (See crate::relation::canonicalize::canonicalize_equivalences.)
Fields
arities: Vec<usize>
The number of columns per input. All other fields in this struct are derived using the information in this field.
input_relation: Vec<usize>
Looks up which input each column belongs to. Derived from arities. Stored as a field to avoid recomputation.
prior_arities: Vec<usize>
The sum of the arities of the previous inputs in the join. Derived from arities. Stored as a field to avoid recomputation.
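As a rough illustration of how the two derived fields could follow from arities, here is a minimal sketch. MapperSketch and derive_fields are hypothetical names invented for this example; this is not the crate's actual constructor.

```rust
/// Hypothetical stand-in for the struct above, with the three documented fields.
#[derive(Debug, PartialEq)]
pub struct MapperSketch {
    pub arities: Vec<usize>,
    pub input_relation: Vec<usize>,
    pub prior_arities: Vec<usize>,
}

/// Derives `input_relation` and `prior_arities` from `arities` (assumed logic).
pub fn derive_fields(arities: Vec<usize>) -> MapperSketch {
    let mut input_relation = Vec::new();
    let mut prior_arities = Vec::with_capacity(arities.len());
    let mut offset = 0;
    for (input, &arity) in arities.iter().enumerate() {
        // Global columns offset..offset + arity all belong to `input`.
        input_relation.extend(std::iter::repeat(input).take(arity));
        // Number of columns contributed by all earlier inputs.
        prior_arities.push(offset);
        offset += arity;
    }
    MapperSketch { arities, input_relation, prior_arities }
}
```

For inputs with 2, 3, and 1 columns, this sketch yields input_relation = [0, 0, 1, 1, 1, 2] and prior_arities = [0, 2, 5].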
Implementations
impl JoinInputMapper
pub fn new(inputs: &[MirRelationExpr]) -> Self
Creates a new JoinInputMapper and calculates the mapping of global context columns to local context columns.
pub fn new_from_input_types(types: &[RelationType]) -> Self
Creates a new JoinInputMapper and calculates the mapping of global context columns to local context columns. Using this method is more efficient if input types have been pre-calculated.
pub fn new_from_input_arities<I>(arities: I) -> Self
where
I: IntoIterator<Item = usize>,
Creates a new JoinInputMapper and calculates the mapping of global context columns to local context columns. Using this method is more efficient if input arities have been pre-calculated.
pub fn total_columns(&self) -> usize
Reports the sum of the number of columns of each input.
pub fn total_inputs(&self) -> usize
Reports the total number of inputs in the join.
pub fn global_keys<'a, I>(&self, local_keys: I, equivalences: &[Vec<MirScalarExpr>]) -> Vec<Vec<usize>>
Using the keys that came from each local input, figures out which keys remain unique in the larger join. Currently, we only figure out a small subset of the keys that can remain unique.
pub fn input_arity(&self, index: usize) -> usize
Returns the arity of a particular input.
pub fn local_columns(&self, index: usize) -> Range<usize>
All column numbers in order for a particular input in the local context.
pub fn global_columns(&self, index: usize) -> Range<usize>
All column numbers in order for a particular input in the global context.
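The difference between the two ranges can be sketched with free functions over a plain slice of arities. The functions below are illustrative stand-ins that take arities as a parameter, not the methods of this struct.

```rust
use std::ops::Range;

/// Local columns of input `index` always start at 0 (sketch).
pub fn local_columns(arities: &[usize], index: usize) -> Range<usize> {
    0..arities[index]
}

/// Global columns of input `index` are shifted by the arities of earlier inputs (sketch).
pub fn global_columns(arities: &[usize], index: usize) -> Range<usize> {
    let start: usize = arities[..index].iter().sum();
    start..start + arities[index]
}
```

With arities [2, 3, 1], the second input has local columns 0..3 and global columns 2..5.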
pub fn map_expr_to_local(&self, expr: MirScalarExpr) -> MirScalarExpr
Takes an expression from the global context and creates a new version where column references have been remapped to the local context. Assumes that all columns in expr are from the same input.
pub fn map_expr_to_global(&self, expr: MirScalarExpr, index: usize) -> MirScalarExpr
Takes an expression from the local context of the index-th input and creates a new version where column references have been remapped to the global context.
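To make the two expression remappings concrete, the sketch below uses a hypothetical miniature Expr enum in place of MirScalarExpr and passes arities explicitly. It only illustrates the column-shifting idea and makes no claim about how the crate traverses expressions.

```rust
/// Hypothetical miniature expression type standing in for MirScalarExpr.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(usize),
    Literal(i64),
    Call(Vec<Expr>),
}

/// Rebuilds `expr`, applying `f` to every column reference.
fn map_columns(expr: Expr, f: &dyn Fn(usize) -> usize) -> Expr {
    match expr {
        Expr::Column(c) => Expr::Column(f(c)),
        Expr::Literal(v) => Expr::Literal(v),
        Expr::Call(args) => Expr::Call(args.into_iter().map(|a| map_columns(a, f)).collect()),
    }
}

/// Local to global: add the total arity of all earlier inputs.
pub fn map_expr_to_global(expr: Expr, index: usize, arities: &[usize]) -> Expr {
    let offset: usize = arities[..index].iter().sum();
    map_columns(expr, &|c: usize| c + offset)
}

/// Global to local: subtract the offset of whichever input owns each column.
/// Panicking on an out-of-range column is an assumption of this sketch.
pub fn map_expr_to_local(expr: Expr, arities: &[usize]) -> Expr {
    map_columns(expr, &|c: usize| {
        let mut prior = 0;
        for &arity in arities {
            if c < prior + arity {
                return c - prior;
            }
            prior += arity;
        }
        panic!("column {} is out of range", c)
    })
}
```

Under these assumptions, mapping an expression to the global context of an input and back to the local context returns the original expression.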
pub fn map_column_to_local(&self, column: usize) -> (usize, usize)
Remap column numbers from the global to the local context. Returns a 2-tuple (<new column number>, <index of input>).
pub fn map_column_to_global(&self, column: usize, index: usize) -> usize
Remap a column number from a local context to the global context.
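The column remapping in both directions can be sketched as follows. These are free functions over a slice of arities, written for illustration only; the real methods use the precomputed fields, and the panic on an out-of-range column is an assumption of this sketch.

```rust
/// Global to local. Returns (local column, input index), matching the documented tuple order.
pub fn map_column_to_local(arities: &[usize], column: usize) -> (usize, usize) {
    let mut prior = 0;
    for (index, &arity) in arities.iter().enumerate() {
        if column < prior + arity {
            return (column - prior, index);
        }
        prior += arity;
    }
    panic!("column {} is out of range for the join", column);
}

/// Local to global: offset the column by the arities of all earlier inputs.
pub fn map_column_to_global(arities: &[usize], column: usize, index: usize) -> usize {
    arities[..index].iter().sum::<usize>() + column
}
```

With arities [2, 3, 1], global column 4 maps to (2, 1), and local column 2 of input 1 maps back to global column 4.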
pub fn split_column_set_by_input<'a, I>(&self, columns: I) -> Vec<BTreeSet<usize>>
Takes a sequence of columns in the global context and splits it into a Vec containing self.total_inputs() BTreeSets, each containing the localized columns that belong to the particular input.
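A minimal sketch of this splitting, assuming a plain slice of arities and a slice of global columns as parameters (the real method takes an iterator and uses the struct's fields). Silently skipping out-of-range columns is a choice made for this sketch only.

```rust
use std::collections::BTreeSet;

/// Splits global columns into one set of local columns per input (sketch).
pub fn split_column_set_by_input(arities: &[usize], columns: &[usize]) -> Vec<BTreeSet<usize>> {
    let mut result = vec![BTreeSet::new(); arities.len()];
    for &column in columns {
        let mut prior = 0;
        for (index, &arity) in arities.iter().enumerate() {
            if column < prior + arity {
                // Store the column relative to the start of its own input.
                result[index].insert(column - prior);
                break;
            }
            prior += arity;
        }
    }
    result
}
```

With arities [2, 3, 1] and global columns [0, 3, 4, 5], the result holds {0} for the first input, {1, 2} for the second, and {0} for the third.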
pub fn lookup_inputs(&self, expr: &MirScalarExpr) -> impl Iterator<Item = usize>
Finds the sorted, deduplicated set of inputs that an expression references.
pub fn single_input(&self, expr: &MirScalarExpr) -> Option<usize>
Returns the index of the only input referenced in the given expression.
pub fn is_localized(&self, expr: &MirScalarExpr, index: usize) -> bool
Returns whether the given expr refers to columns of only the index-th input.
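The three lookups above can be sketched over a hypothetical miniature Expr enum, with input_relation passed in explicitly. Two assumptions are made here and are not confirmed by the documentation: single_input yields None when zero or several inputs are referenced, and an expression with no column references does not count as localized.

```rust
use std::collections::BTreeSet;

/// Hypothetical miniature expression type; only column references matter here.
pub enum Expr {
    Column(usize),
    Literal(i64),
    Call(Vec<Expr>),
}

fn collect_inputs(expr: &Expr, input_relation: &[usize], out: &mut BTreeSet<usize>) {
    match expr {
        Expr::Column(c) => {
            out.insert(input_relation[*c]);
        }
        Expr::Literal(_) => {}
        Expr::Call(args) => {
            for arg in args {
                collect_inputs(arg, input_relation, out);
            }
        }
    }
}

/// Sorted, deduplicated inputs referenced by `expr` (BTreeSet iterates in order).
pub fn lookup_inputs(expr: &Expr, input_relation: &[usize]) -> Vec<usize> {
    let mut out = BTreeSet::new();
    collect_inputs(expr, input_relation, &mut out);
    out.into_iter().collect()
}

/// Some(input) only when exactly one input is referenced (assumed behavior).
pub fn single_input(expr: &Expr, input_relation: &[usize]) -> Option<usize> {
    let inputs = lookup_inputs(expr, input_relation);
    if inputs.len() == 1 { Some(inputs[0]) } else { None }
}

/// True when the only referenced input is `index` (assumed behavior).
pub fn is_localized(expr: &Expr, index: usize, input_relation: &[usize]) -> bool {
    single_input(expr, input_relation) == Some(index)
}
```

Here input_relation plays the role of the field of the same name, for example [0, 0, 1, 1, 1, 2] for inputs with 2, 3, and 1 columns.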
pub fn find_bound_expr(&self, expr: &MirScalarExpr, bound_inputs: &[usize], equivalences: &[Vec<MirScalarExpr>]) -> Option<MirScalarExpr>
Takes an expression in the global context and looks in equivalences for an equivalent expression (also expressed in the global context) that belongs to one or more of the inputs in bound_inputs.
Examples
use mz_repr::{Datum, ColumnType, RelationType, ScalarType};
use mz_expr::{JoinInputMapper, MirRelationExpr, MirScalarExpr};
// A two-column schema common to each of the three inputs
let schema = RelationType::new(vec![
ScalarType::Int32.nullable(false),
ScalarType::Int32.nullable(false),
]);
// the specific data are not important here.
let data = vec![Datum::Int32(0), Datum::Int32(1)];
let input0 = MirRelationExpr::constant(vec![data.clone()], schema.clone());
let input1 = MirRelationExpr::constant(vec![data.clone()], schema.clone());
let input2 = MirRelationExpr::constant(vec![data.clone()], schema.clone());
// [input0(#0) = input2(#1)], [input0(#1) = input1(#0) = input2(#0)]
let equivalences = vec![
vec![MirScalarExpr::Column(0), MirScalarExpr::Column(5)],
vec![MirScalarExpr::Column(1), MirScalarExpr::Column(2), MirScalarExpr::Column(4)],
];
let input_mapper = JoinInputMapper::new(&[input0, input1, input2]);
assert_eq!(
Some(MirScalarExpr::Column(4)),
input_mapper.find_bound_expr(&MirScalarExpr::Column(2), &[2], &equivalences)
);
assert_eq!(
None,
input_mapper.find_bound_expr(&MirScalarExpr::Column(0), &[1], &equivalences)
);
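The lookup in the example above can be mirrored without the mz_expr types. The sketch below restricts equivalence classes to bare global column numbers and passes input_relation explicitly; find_bound_column is a hypothetical name, and the logic is an assumption about the general idea rather than the crate's implementation.

```rust
/// Finds a column equivalent to `column` that belongs to one of `bound_inputs` (sketch).
pub fn find_bound_column(
    column: usize,
    bound_inputs: &[usize],
    equivalences: &[Vec<usize>],
    input_relation: &[usize],
) -> Option<usize> {
    // Locate the equivalence class that mentions `column`, if any.
    let class = equivalences.iter().find(|class| class.contains(&column))?;
    // Return the first member of that class owned by a bound input.
    class
        .iter()
        .copied()
        .find(|c| bound_inputs.contains(&input_relation[*c]))
}
```

For three two-column inputs, input_relation is [0, 0, 1, 1, 2, 2]. With classes [[0, 5], [1, 2, 4]], looking up column 2 against bound input 2 gives Some(4), and looking up column 0 against bound input 1 gives None, matching the assertions in the example.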
pub fn try_localize_to_input_with_bound_expr(&self, expr: &mut MirScalarExpr, index: usize, equivalences: &[Vec<MirScalarExpr>]) -> bool
Try to rewrite expr from the global context so that all the columns point to the index-th input by replacing subexpressions with their bound equivalents in the index-th input if necessary.
Returns whether the rewriting was successful.
If it returns true, then expr is in the context of the index-th input.
If it returns false, then some subexpressions might still have been rewritten. However, expr is still in the global context.
pub fn consequence_for_input(&self, expr: &MirScalarExpr, index: usize) -> Option<MirScalarExpr>
Try to find a consequence c of the given expression e for the given input.
If we return Some(c), that means
1. c uses only columns from the given input;
2. if c doesn’t hold on a row of the input, then e also wouldn’t hold;
3. if c holds on a row of the input, then e might or might not hold.
1. and 2. means that if we have a join with predicate e then we can use c for pre-filtering a join input before the join. However, 3. means that e shouldn’t be deleted from the join predicates, i.e., we can’t do a “traditional” predicate pushdown.
Note that “c is a consequence of e” is the same thing as 2., see https://en.wikipedia.org/wiki/Contraposition
Example: For
(t1.f2 = 3 AND t2.f2 = 4) OR (t1.f2 = 5 AND t2.f2 = 6)
we find
t1.f2 = 3 OR t1.f2 = 5
for t1, and
t2.f2 = 4 OR t2.f2 = 6
for t2.
Further examples are in TPC-H Q07, Q19, and chbench Q07, Q19.
Parameters:
- expr: The expression e from above. try_localize_to_input_with_bound_expr should be called on expr before us!
- index: The index of the join input whose columns we will use.
- equivalences: Join equivalences that we can use for try_map_to_input_with_bound_expr.
If successful, the returned expression is in the local context of the specified input.
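The OR-of-ANDs example above can be reproduced with a small sketch. Pred is a hypothetical predicate type in which each atom already names its input, so no column remapping is needed. The recursion below is one simple way to compute a consequence with properties 1 to 3 and is not claimed to be the crate's algorithm.

```rust
/// Hypothetical predicate type. `t1.f2 = 3` is Eq { input: 1, column: 2, value: 3 }.
#[derive(Debug, Clone, PartialEq)]
pub enum Pred {
    Eq { input: usize, column: usize, value: i64 },
    And(Vec<Pred>),
    Or(Vec<Pred>),
}

/// Wraps `parts` with `make`, avoiding a wrapper around zero or one element.
fn combine(mut parts: Vec<Pred>, make: fn(Vec<Pred>) -> Pred) -> Option<Pred> {
    match parts.len() {
        0 => None,
        1 => parts.pop(),
        _ => Some(make(parts)),
    }
}

/// Returns a predicate over `index` only that must hold whenever `e` holds.
pub fn consequence_for_input(e: &Pred, index: usize) -> Option<Pred> {
    match e {
        // An atom on the requested input is its own consequence.
        Pred::Eq { input, .. } => {
            if *input == index { Some(e.clone()) } else { None }
        }
        // A conjunction implies each conjunct, so keep whatever consequences exist.
        Pred::And(parts) => {
            let kept: Vec<Pred> = parts
                .iter()
                .filter_map(|p| consequence_for_input(p, index))
                .collect();
            combine(kept, Pred::And)
        }
        // A disjunction needs a consequence from every branch; otherwise give up.
        Pred::Or(parts) => {
            let kept: Option<Vec<Pred>> = parts
                .iter()
                .map(|p| consequence_for_input(p, index))
                .collect();
            combine(kept?, Pred::Or)
        }
    }
}
```

Applied to the example predicate, this sketch returns t1.f2 = 3 OR t1.f2 = 5 for input 1 and t2.f2 = 4 OR t2.f2 = 6 for input 2. The original predicate must still be kept in the join, as point 3 explains.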
Trait Implementations
Auto Trait Implementations
impl Freeze for JoinInputMapper
impl RefUnwindSafe for JoinInputMapper
impl Send for JoinInputMapper
impl Sync for JoinInputMapper
impl Unpin for JoinInputMapper
impl UnwindSafe for JoinInputMapper