pub(crate) fn rejected_nulls(
    expr: &BoxScalarExpr,
    set: &mut HashSet<ColumnReference>
)
Expand description

Returns all columns that must be non-NULL for the boolean expr to be NULL or FALSE.

An expression expr rejects nulls in a set of column references C if it evaluates to either FALSE or NULL whenever some c in C is null.

An expression expr propagates nulls in a set of column references C if it evaluates to NULL whenever some c in C is null.

Consequently, results returned by propagated_nulls must be included in rejected_nulls.

Unfortunately, boolean functions such as “and” and “or” are not propagating nulls in their inputs, but we still need to handle them here, as they are used quite frequently in predicates. The procedure for doing this is derived below.

Observe the truth values for the following terms:

For AND(A, B):

FNT
FFFF
NFNN
TFNT

For OR(A, B):

FNT
FFNT
NNNT
TTTT

For NOT(AND(A, B)):

FNT
FTTT
NTNN
TTNF

For NOT(OR(A, B)):

FNT
FTNF
NNNF
TFFF

Based on the above truth tables, we can establish the following statements are always true:

  1. If either A or B rejects nulls in C, then AND(A, B) rejects nulls in C.
  2. If both A and B reject nulls in C, then OR(A, B) rejects nulls in C.
  3. If both A and B propagate nulls in C, then NOT(AND(A, B)) rejects nulls in C.
  4. If either A or B propagates nulls in C, then NOT(OR(A, B)) rejects nulls in C.

Based on the above statements, the algorithm implemented by this function can be described by the following pseudo-code:

def rejected_nulls(expr: Expr, sign: bool = true) -> Set[Expr]:
    match expr:
        case NOT(ISNULL(c)):
            { c }
        case NOT(expr):
            rejected_nulls(expr, !sign)
        case AND(lhs, rhs):
            if sign > 0:
                rejected_nulls(lhs, sign) ∪ rejected_nulls(rhs, sign)
            else:
                propagated_nulls(lhs) ∩ propagated_nulls(rhs)
        case OR(lhs, rhs):
            if sign > 0:
                rejected_nulls(lhs, sign) ∩ rejected_nulls(rhs, sign)
            else:
                propagated_nulls(lhs) ∪ propagated_nulls(rhs)
        case expr:
            propagated_nulls(expr)