Attribute Macro multiversion::multiversion
#[multiversion]
Provides function multiversioning.
Functions are selected in order, calling the first matching target. The function tagged by the attribute is the generic implementation that does not require any specific architecture or features.
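The "first matching target wins" rule can be illustrated with a small hand-written sketch. This is not the code the macro generates; the Candidate struct and the select function are hypothetical names introduced only for illustration, and the runtime check is modelled as a plain function pointer.

```rust
/// Hypothetical stand-in for one `#[clone]` or `#[specialize]` entry:
/// a runtime support check paired with the implementation to call.
struct Candidate {
    supported: fn() -> bool,
    run: fn() -> &'static str,
}

/// Walks the candidates in declaration order and returns the first one
/// whose check passes. Falls back to the generic implementation when
/// no candidate matches.
fn select(candidates: &[Candidate], generic: fn() -> &'static str) -> fn() -> &'static str {
    candidates
        .iter()
        .find(|c| (c.supported)())
        .map(|c| c.run)
        .unwrap_or(generic)
}
```

Because selection stops at the first match, more specific targets should be listed before less specific ones, as the examples on this page do by listing avx before sse.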
§Helper attributes
- #[clone] - Clones the function for the specified target.
  - Arguments:
    - target: the target specification of the clone
- #[specialize] - Specializes the function for the specified target with another function.
  - Arguments:
    - target: the target specification of the specialization
    - fn: path to the function specializing the tagged function
    - unsafe (optional): indicates whether the specialization function is unsafe, but safe to call for this target. Functions tagged with the target attribute must be unsafe, so marking unsafe = true indicates that the safety contract is fulfilled and the function is safe to call on the specified target. If the function is unsafe for any other reason, remember to mark the tagged function unsafe as well.
- #[crate_path] - Specifies the location of the multiversion crate (useful for re-exporting).
  - Arguments:
    - path: the path to the multiversion crate
§Examples
§Cloning
The following compiles square three times, once for each target and once for the generic
target. Calling square selects the appropriate version at runtime.
use multiversion::multiversion;
#[multiversion]
#[clone(target = "[x86|x86_64]+avx")]
#[clone(target = "x86+sse")]
fn square(x: &mut [f32]) {
    for v in x {
        *v *= *v
    }
}
§Specialization
This example creates a function where_am_i that prints the detected CPU feature.
use multiversion::multiversion;
fn where_am_i_avx() {
    println!("avx");
}
fn where_am_i_sse() {
    println!("sse");
}
fn where_am_i_neon() {
    println!("neon");
}
#[multiversion]
#[specialize(target = "[x86|x86_64]+avx", fn = "where_am_i_avx")]
#[specialize(target = "x86+sse", fn = "where_am_i_sse")]
#[specialize(target = "[arm|aarch64]+neon", fn = "where_am_i_neon")]
fn where_am_i() {
    println!("generic");
}
§Making target_feature functions safe
This example is the same as the above example, but calls unsafe specialized functions. Note
that the where_am_i function is still safe, since we know we are only calling specialized
functions on supported CPUs.
use multiversion::{multiversion, target};
#[target("[x86|x86_64]+avx")]
unsafe fn where_am_i_avx() {
    println!("avx");
}
#[target("x86+sse")]
unsafe fn where_am_i_sse() {
    println!("sse");
}
#[target("[arm|aarch64]+neon")]
unsafe fn where_am_i_neon() {
    println!("neon");
}
#[multiversion]
#[specialize(target = "[x86|x86_64]+avx", fn = "where_am_i_avx", unsafe = true)]
#[specialize(target = "x86+sse", fn = "where_am_i_sse", unsafe = true)]
#[specialize(target = "[arm|aarch64]+neon", fn = "where_am_i_neon", unsafe = true)]
fn where_am_i() {
    println!("generic");
}
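For comparison, the safety argument behind unsafe = true can be written out by hand using only the standard library. The sketch below is an assumption about what the contract amounts to, not the macro's actual expansion: it uses the built-in target_feature attribute and the is_x86_feature_detected! macro instead of this crate's target attribute, covers only the avx case, and returns the label rather than printing it.

```rust
// Compiled with AVX enabled, so it must only run on CPUs that support AVX.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx")]
unsafe fn where_am_i_avx() -> &'static str {
    "avx"
}

// Generic fallback that requires no specific features.
fn where_am_i_generic() -> &'static str {
    "generic"
}

/// Safe wrapper: the unsafe function is only reached after the matching
/// runtime check has succeeded.
pub fn where_am_i() -> &'static str {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was just confirmed at runtime, which is the
            // only safety requirement of `where_am_i_avx`.
            return unsafe { where_am_i_avx() };
        }
    }
    where_am_i_generic()
}
```

The wrapper stays safe because the feature check discharges the only safety requirement of the callee. If the specialized function had other safety requirements, the wrapper would have to be unsafe too.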
§Static dispatching
The multiversion attribute allows functions called inside the function to be statically dispatched.
Additionally, functions created with this attribute can themselves be statically dispatched.
See static dispatching for more information.
§Conditional compilation
The multiversion attribute supports conditional compilation with the #[target_cfg] helper
attribute. See conditional compilation for more information.
§Function name mangling
The functions created by this macro are mangled as {ident}_{features}_version, where ident is
the name of the multiversioned function, and features is either default (for the default
version with no features enabled) or the list of features, sorted alphabetically. Dots (.)
in the feature names are removed.
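The naming rule above can be sketched as a small helper. The function mangled_name is hypothetical and not part of the crate. It also assumes that dots are removed before sorting, which the description does not specify.

```rust
/// Builds a name of the form `{ident}_{features}_version`.
/// An empty feature list yields the `default` version name.
fn mangled_name(ident: &str, features: &[&str]) -> String {
    if features.is_empty() {
        return format!("{ident}_default_version");
    }
    // Assumption: dots are stripped first, then the names are sorted.
    let mut cleaned: Vec<String> = features.iter().map(|f| f.replace('.', "")).collect();
    cleaned.sort();
    format!("{ident}_{}_version", cleaned.join("_"))
}
```

Under these assumptions, mangled_name("foo", &["sse4.1", "avx"]) returns foo_avx_sse41_version.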
The following creates two functions, foo_avx_sse41_version and foo_default_version.
#[multiversion::multiversion]
#[clone(target = "[x86|x86_64]+sse4.1+avx")]
fn foo() {}
#[multiversion::target("[x86|x86_64]+sse4.1+avx")]
unsafe fn call_foo_avx() {
    foo_avx_sse41_version();
}
fn call_foo_default() {
    foo_default_version();
}
§Implementation details
The function version dispatcher consists of a function selector and an atomic function pointer. Initially the function pointer will point to the function selector. On invocation, this selector will then choose an implementation, store a pointer to it in the atomic function pointer for later use and then pass on control to the chosen function. On subsequent calls, the chosen function will be called without invoking the function selector.
Some comments on the benefits of this implementation:
- The function selector is only invoked once. Subsequent calls are reduced to an atomic load and indirect function call (for non-generic, non-async functions). Generic and async functions cannot be stored in the atomic function pointer, which may result in additional branches.
- If called in multiple threads, there is no contention. It is possible for two threads to hit the same function before function selection has completed, which results in each thread invoking the function selector, but the atomic ensures that these are synchronized correctly.
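The dispatcher described above can be approximated by hand with the standard library. This is a minimal sketch under stated assumptions, not the macro's generated code: the names SQUARE, SquareFn, selector and choose are hypothetical, only an AVX version is considered, and Relaxed ordering is assumed to be sufficient because every value ever stored is a valid function pointer.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

type SquareFn = fn(&mut [f32]);

// Generic version with no feature requirements.
fn square_generic(x: &mut [f32]) {
    for v in x.iter_mut() {
        *v *= *v;
    }
}

// Same body, compiled with AVX enabled.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx")]
unsafe fn square_avx_impl(x: &mut [f32]) {
    for v in x.iter_mut() {
        *v *= *v;
    }
}

// Safe shim so the AVX version fits the plain `SquareFn` pointer type.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn square_avx(x: &mut [f32]) {
    // SAFETY: `choose` only returns this function after detecting AVX.
    unsafe { square_avx_impl(x) }
}

// The atomic function pointer. It initially points at the selector.
static SQUARE: AtomicPtr<()> = AtomicPtr::new(selector as SquareFn as *mut ());

// Picks the first supported implementation, falling back to the generic one.
fn choose() -> SquareFn {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx") {
            return square_avx;
        }
    }
    square_generic
}

// Function selector: chooses, caches the choice, then forwards the call.
fn selector(x: &mut [f32]) {
    let chosen = choose();
    SQUARE.store(chosen as *mut (), Ordering::Relaxed);
    chosen(x)
}

/// Public entry point: one atomic load plus one indirect call.
pub fn square(x: &mut [f32]) {
    let ptr = SQUARE.load(Ordering::Relaxed);
    // SAFETY: `SQUARE` only ever holds pointers created from `SquareFn` values.
    let f = unsafe { std::mem::transmute::<*mut (), SquareFn>(ptr) };
    f(x)
}
```

If two threads call square before the first store lands, both run the selector and both store the same pointer, so the race is harmless. This matches the behavior described in the second bullet.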