arrow_array::array

Struct GenericByteViewArray

pub struct GenericByteViewArray<T: ByteViewType + ?Sized> { /* private fields */ }

Variable-size Binary View Layout: an array of variable-length byte views.

Unlike crate::GenericByteArray, each element stores both an offset and a length, so take / filter operations can be implemented without copying the underlying data.

See StringViewArray for storing utf8 encoded string data and BinaryViewArray for storing bytes.

A GenericByteViewArray stores variable length byte strings. An array of N elements is stored as N fixed length “views” and a variable number of variable length “buffers”.

Each view is a u128 value whose layout depends on the length of the string stored at that location:

                        ┌──────┬────────────────────────┐
                        │length│      string value      │
   Strings (len <= 12)  │      │    (padded with 0)     │
                        └──────┴────────────────────────┘
                         0    31                      127

                        ┌───────┬───────┬───────┬───────┐
                        │length │prefix │  buf  │offset │
   Strings (len > 12)   │       │       │ index │       │
                        └───────┴───────┴───────┴───────┘
                         0    31       63      95    127

  • Strings with length <= 12 are stored directly in the view.

  • Strings with length > 12: The first four bytes are stored inline in the view and the entire string is stored in one of the buffers.
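The two layouts above can be sketched with plain integer packing. This is a standalone illustration, not the crate's own code: encode_view is a hypothetical helper name, and the sketch assumes the fields are laid out in little-endian byte order, so that the length occupies the lowest 32 bits as in the diagrams.

```rust
/// Hypothetical helper (an illustration, not an arrow_array function):
/// packs `data` into a 128-bit view following the two layouts above.
/// Assumes little-endian field order, so the length sits in bits 0..32.
fn encode_view(data: &[u8], buffer_index: u32, offset: u32) -> u128 {
    assert!(data.len() <= u32::MAX as usize, "value too long for a view");
    let len = data.len() as u32;

    let mut bytes = [0u8; 16];
    // Bits 0..32: the length of the value.
    bytes[0..4].copy_from_slice(&len.to_le_bytes());

    if data.len() <= 12 {
        // Short value: stored inline; the unused tail stays zero-padded.
        bytes[4..4 + data.len()].copy_from_slice(data);
    } else {
        // Long value: 4-byte prefix, then buffer index, then offset.
        bytes[4..8].copy_from_slice(&data[0..4]);
        bytes[8..12].copy_from_slice(&buffer_index.to_le_bytes());
        bytes[12..16].copy_from_slice(&offset.to_le_bytes());
    }
    u128::from_le_bytes(bytes)
}
```

For a short value the buffer_index and offset arguments are ignored, since the whole value fits in the view. For a long value the caller is expected to have already placed the bytes at that offset in the named buffer.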

Unlike GenericByteArray, there are no constraints on the offsets other than that they must point into a valid buffer. In particular, they can be out of order, non-contiguous, and overlapping.

For example, in the following diagram, the strings “FishWasInTownTodayYay” and “CrumpleFacedFish” are both longer than 12 bytes and are therefore stored in a separate buffer, while the string “LavaMonster” is stored inline in the view. Both long strings share the same “Fish” bytes in the buffer: the end of “CrumpleFacedFish” overlaps the start of “FishWasInTownTodayYay”.

                                                                           ┌───┐
                        ┌──────┬──────┬──────┬──────┐               offset │...│
"FishWasInTownTodayYay" │  21  │ Fish │  0   │ 115  │─ ─              103  │Mr.│
                        └──────┴──────┴──────┴──────┘   │      ┌ ─ ─ ─ ─ ▶ │Cru│
                        ┌──────┬──────┬──────┬──────┐                      │mpl│
"CrumpleFacedFish"      │  16  │ Crum │  0   │ 103  │─ ─│─ ─ ─ ┘           │eFa│
                        └──────┴──────┴──────┴──────┘                      │ced│
                        ┌──────┬────────────────────┐   └ ─ ─ ─ ─ ─ ─ ─ ─ ▶│Fis│
"LavaMonster"           │  11  │   LavaMonster\0    │                      │hWa│
                        └──────┴────────────────────┘               offset │sIn│
                                                                      115  │Tow│
                                                                           │nTo│
                                                                           │day│
                                 u128 "views"                              │Yay│
                                                                  buffer 0 │...│
                                                                           └───┘
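The diagram can be reproduced as a small standalone model to check that two views really can share bytes. The helpers below (long_view, resolve_view, diagram_buffer) are hypothetical names introduced for this sketch and are not part of arrow_array. The buffer content before offset 103 is filler chosen here, and little-endian field order is assumed.

```rust
/// Hypothetical helper: builds a view for a value longer than 12 bytes.
fn long_view(len: u32, prefix: &[u8; 4], buffer_index: u32, offset: u32) -> u128 {
    (len as u128)
        | ((u32::from_le_bytes(*prefix) as u128) << 32)
        | ((buffer_index as u128) << 64)
        | ((offset as u128) << 96)
}

/// Hypothetical helper: copies out the bytes that a view refers to.
fn resolve_view(view: u128, buffers: &[Vec<u8>]) -> Vec<u8> {
    let bytes = view.to_le_bytes();
    let len = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]) as usize;
    if len <= 12 {
        // Inline value: the data lives in the view itself.
        bytes[4..4 + len].to_vec()
    } else {
        // Out-of-line value: follow the buffer index and offset.
        let buffer_index =
            u32::from_le_bytes([bytes[8], bytes[9], bytes[10], bytes[11]]) as usize;
        let offset =
            u32::from_le_bytes([bytes[12], bytes[13], bytes[14], bytes[15]]) as usize;
        buffers[buffer_index][offset..offset + len].to_vec()
    }
}

/// Buffer 0 from the diagram: 103 filler bytes, then the shared string data.
fn diagram_buffer() -> Vec<u8> {
    let mut buffer = vec![b'.'; 103];
    buffer.extend_from_slice(b"CrumpleFacedFishWasInTownTodayYay");
    buffer
}
```

With these helpers, long_view(16, b"Crum", 0, 103) resolves to the bytes 103..119 of the buffer and long_view(21, b"Fish", 0, 115) resolves to the bytes 115..136, so the four bytes at 115..119 are read by both views.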

Implementations§

Source§

impl<T: ByteViewType + ?Sized> GenericByteViewArray<T>

Source

pub fn new( views: ScalarBuffer<u128>, buffers: Vec<Buffer>, nulls: Option<NullBuffer>, ) -> Self

Create a new GenericByteViewArray from the provided parts, panicking on failure

§Panics

Panics if GenericByteViewArray::try_new returns an error

Source

pub fn try_new( views: ScalarBuffer<u128>, buffers: Vec<Buffer>, nulls: Option<NullBuffer>, ) -> Result<Self, ArrowError>

Create a new GenericByteViewArray from the provided parts, returning an error on failure

§Errors
Source

pub unsafe fn new_unchecked( views: ScalarBuffer<u128>, buffers: Vec<Buffer>, nulls: Option<NullBuffer>, ) -> Self

Create a new GenericByteViewArray from the provided parts, without validation

§Safety

Safe if Self::try_new would not error

Source

pub fn new_null(len: usize) -> Self

Create a new GenericByteViewArray of length len where all values are null

Source

pub fn from_iter_values<Ptr, I>(iter: I) -> Self
where Ptr: AsRef<T::Native>, I: IntoIterator<Item = Ptr>,

Creates a GenericByteViewArray based on an iterator of values without nulls

Source

pub fn into_parts(self) -> (ScalarBuffer<u128>, Vec<Buffer>, Option<NullBuffer>)

Deconstruct this array into its constituent parts

Source

pub fn views(&self) -> &ScalarBuffer<u128>

Returns the views buffer

Source

pub fn data_buffers(&self) -> &[Buffer]

Returns the buffers storing string data

Source

pub fn value(&self, i: usize) -> &T::Native

Returns the element at index i

§Panics

Panics if index i is out of bounds.

Source

pub unsafe fn value_unchecked(&self, idx: usize) -> &T::Native

Returns the element at index idx, without bounds checking

§Safety

Caller is responsible for ensuring that the index is within the bounds of the array

Source

pub fn iter(&self) -> ArrayIter<&Self>

Constructs a new iterator over the elements of this array

Source

pub fn slice(&self, offset: usize, length: usize) -> Self

Returns a zero-copy slice of this array with the indicated offset and length.
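To illustrate what "zero-copy" means for this layout, here is a standalone model rather than the crate's implementation. ViewArraySketch and its fields are hypothetical names, and reference-counted storage via Arc is an assumption made for the sketch. A slice only adjusts a window over the shared views and keeps handles to the same data buffers.

```rust
use std::sync::Arc;

/// Standalone model of a view array; all names here are hypothetical.
struct ViewArraySketch {
    views: Arc<[u128]>,      // shared, fixed-width views
    start: usize,            // first view visible through this array
    len: usize,              // number of views visible through this array
    buffers: Vec<Arc<[u8]>>, // shared variable-length data buffers
}

impl ViewArraySketch {
    fn new(views: Vec<u128>, buffers: Vec<Vec<u8>>) -> Self {
        let len = views.len();
        Self {
            views: Arc::from(views),
            start: 0,
            len,
            buffers: buffers.into_iter().map(|b| Arc::<[u8]>::from(b)).collect(),
        }
    }

    /// Narrows the visible window; no view or string bytes are copied.
    /// Panicking on an out-of-range request is this sketch's own choice.
    fn slice(&self, offset: usize, length: usize) -> Self {
        let end = offset.checked_add(length).expect("offset + length overflows");
        assert!(end <= self.len, "slice is out of bounds");
        Self {
            views: Arc::clone(&self.views),
            start: self.start + offset,
            len: length,
            buffers: self.buffers.clone(), // clones the handles only
        }
    }

    /// The views visible through this (possibly sliced) array.
    fn views(&self) -> &[u128] {
        &self.views[self.start..self.start + self.len]
    }
}
```

In this model the cost of slice is independent of the length of the stored strings: only reference counts and two integers change.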

Trait Implementations§

Source§

impl<T: ByteViewType + ?Sized> Array for GenericByteViewArray<T>

Source§

fn as_any(&self) -> &dyn Any

Returns the array as Any so that it can be downcast to a specific implementation.
Source§

fn to_data(&self) -> ArrayData

Returns the underlying data of this array
Source§

fn into_data(self) -> ArrayData

Returns the underlying data of this array
Source§

fn data_type(&self) -> &DataType

Returns a reference to the DataType of this array.
Source§

fn slice(&self, offset: usize, length: usize) -> ArrayRef

Returns a zero-copy slice of this array with the indicated offset and length.
Source§

fn len(&self) -> usize

Returns the length (i.e., number of elements) of this array.
Source§

fn is_empty(&self) -> bool

Returns whether this array is empty.
Source§

fn offset(&self) -> usize

Returns the offset into the underlying data used by this array(-slice). Note that the underlying data can be shared by many arrays. This defaults to 0.
Source§

fn nulls(&self) -> Option<&NullBuffer>

Returns the null buffer of this array if any.
Source§

fn get_buffer_memory_size(&self) -> usize

Returns the total number of bytes of memory pointed to by this array. The buffers store bytes in the Arrow memory format, and include the data as well as the validity map. Note that this does not always correspond to the exact memory usage of an array, since multiple arrays can share the same buffers or slices thereof.
Source§

fn get_array_memory_size(&self) -> usize

Returns the total number of bytes of memory physically occupied by this array. This value will always be greater than the value returned by get_buffer_memory_size(), and it includes the overhead of the data structures that contain the pointers to the various buffers.
Source§

fn logical_nulls(&self) -> Option<NullBuffer>

Returns a potentially computed NullBuffer that represents the logical null values of this array, if any.
Source§

fn is_null(&self, index: usize) -> bool

Returns whether the element at index is null according to Array::nulls.
Source§

fn is_valid(&self, index: usize) -> bool

Returns whether the element at index is not null, the opposite of Self::is_null.
Source§

fn null_count(&self) -> usize

Returns the total number of physical null values in this array.
Source§

fn is_nullable(&self) -> bool

Returns false if the array is guaranteed to not contain any logical nulls.
Source§

impl<'a, T: ByteViewType + ?Sized> ArrayAccessor for &'a GenericByteViewArray<T>

Source§

type Item = &'a <T as ByteViewType>::Native

The Arrow type of the element being accessed.
Source§

fn value(&self, index: usize) -> Self::Item

Returns the element at the given index.
Source§

unsafe fn value_unchecked(&self, index: usize) -> Self::Item

Returns the element at the given index.
Source§

impl<T: ByteViewType + ?Sized> Clone for GenericByteViewArray<T>

Source§

fn clone(&self) -> Self

Returns a copy of the value.
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
Source§

impl<T: ByteViewType + ?Sized> Debug for GenericByteViewArray<T>

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.
Source§

impl<T: ByteViewType + ?Sized> From<ArrayData> for GenericByteViewArray<T>

Source§

fn from(value: ArrayData) -> Self

Converts to this type from the input type.
Source§

impl<T: ByteViewType + ?Sized> From<GenericByteViewArray<T>> for ArrayData

Source§

fn from(array: GenericByteViewArray<T>) -> Self

Converts to this type from the input type.
Source§

impl<Ptr, T: ByteViewType + ?Sized> FromIterator<Option<Ptr>> for GenericByteViewArray<T>
where Ptr: AsRef<T::Native>,

Source§

fn from_iter<I: IntoIterator<Item = Option<Ptr>>>(iter: I) -> Self

Creates a value from an iterator.
Source§

impl<'a, T: ByteViewType + ?Sized> IntoIterator for &'a GenericByteViewArray<T>

Source§

type Item = Option<&'a <T as ByteViewType>::Native>

The type of the elements being iterated over.
Source§

type IntoIter = ArrayIter<&'a GenericByteViewArray<T>>

Which kind of iterator are we turning this into?
Source§

fn into_iter(self) -> Self::IntoIter

Creates an iterator from a value.

Auto Trait Implementations§

§

impl<T> Freeze for GenericByteViewArray<T>
where T: ?Sized,

§

impl<T> RefUnwindSafe for GenericByteViewArray<T>
where T: RefUnwindSafe + ?Sized,

§

impl<T> Send for GenericByteViewArray<T>
where T: ?Sized,

§

impl<T> Sync for GenericByteViewArray<T>
where T: ?Sized,

§

impl<T> Unpin for GenericByteViewArray<T>
where T: Unpin + ?Sized,

§

impl<T> UnwindSafe for GenericByteViewArray<T>
where T: UnwindSafe + ?Sized,

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dst: *mut T)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst.
Source§

impl<T> Datum for T
where T: Array,

Source§

fn get(&self) -> (&dyn Array, bool)

Returns the value for this Datum and a boolean indicating if the value is scalar
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,