Attribute Macro multiversion::multiversion
#[multiversion]
Provides function multiversioning.
Targets are tried in the order in which the helper attributes are listed, and the first matching target is called. The function tagged by the attribute is the generic implementation, which does not require any specific architecture or features.
§Helper attributes
#[clone]
- Clones the function for the specified target.
- Arguments:
  - target: the target specification of the clone
#[specialize]
- Specializes the function for the specified target with another function.
- Arguments:
  - target: the target specification of the specialization
  - fn: path to the function specializing the tagged function
  - unsafe (optional): indicates whether the specialization function is unsafe, but safe to call for this target. Functions tagged with the target attribute must be unsafe, so marking unsafe = true indicates that the safety contract is fulfilled and the function is safe to call on the specified target. If the function is unsafe for any other reason, remember to mark the tagged function unsafe as well.
#[crate_path]
- Specifies the location of the multiversion crate (useful for re-exporting).
- Arguments:
  - path: the path to the multiversion crate
§Examples
§Cloning
The following compiles square three times: once for each of the two clone targets and once for the generic target. Calling square selects the appropriate version at runtime.
use multiversion::multiversion;

#[multiversion]
#[clone(target = "[x86|x86_64]+avx")]
#[clone(target = "x86+sse")]
fn square(x: &mut [f32]) {
    for v in x {
        *v *= *v
    }
}
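To make "selects the appropriate version at runtime" more concrete, here is a minimal hand-written sketch of first-match selection for the square example. It is not the macro's actual expansion: the names square_generic, square_avx and square_sse are hypothetical, the detection uses the standard library's is_x86_feature_detected! macro, and the feature check is repeated on every call rather than cached.

```rust
// Generic fallback: compiled without any extra target features.
fn square_generic(x: &mut [f32]) {
    for v in x.iter_mut() {
        *v *= *v;
    }
}

// Hypothetical clone for "[x86|x86_64]+avx".
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx")]
unsafe fn square_avx(x: &mut [f32]) {
    for v in x.iter_mut() {
        *v *= *v;
    }
}

// Hypothetical clone for "x86+sse" (32-bit x86 only, as in the example).
#[cfg(target_arch = "x86")]
#[target_feature(enable = "sse")]
unsafe fn square_sse(x: &mut [f32]) {
    for v in x.iter_mut() {
        *v *= *v;
    }
}

// Targets are checked in the order they were listed; the first match wins.
fn square(x: &mut [f32]) {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was just confirmed at runtime.
            unsafe { square_avx(x) };
            return;
        }
    }
    #[cfg(target_arch = "x86")]
    {
        if is_x86_feature_detected!("sse") {
            // SAFETY: SSE support was just confirmed at runtime.
            unsafe { square_sse(x) };
            return;
        }
    }
    // No listed target matched: fall back to the generic version.
    square_generic(x)
}
```

On architectures other than x86 and x86_64 both feature branches are compiled out, so square always runs the generic version there.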
§Specialization
This example creates a function where_am_i that prints the detected CPU feature.
use multiversion::multiversion;

fn where_am_i_avx() {
    println!("avx");
}

fn where_am_i_sse() {
    println!("sse");
}

fn where_am_i_neon() {
    println!("neon");
}

#[multiversion]
#[specialize(target = "[x86|x86_64]+avx", fn = "where_am_i_avx")]
#[specialize(target = "x86+sse", fn = "where_am_i_sse")]
#[specialize(target = "[arm|aarch64]+neon", fn = "where_am_i_neon")]
fn where_am_i() {
    println!("generic");
}
§Making target_feature functions safe
This example is the same as the above example, but calls unsafe specialized functions. Note that the where_am_i function is still safe, since we know we are only calling specialized functions on supported CPUs.
use multiversion::{multiversion, target};

#[target("[x86|x86_64]+avx")]
unsafe fn where_am_i_avx() {
    println!("avx");
}

#[target("x86+sse")]
unsafe fn where_am_i_sse() {
    println!("sse");
}

#[target("[arm|aarch64]+neon")]
unsafe fn where_am_i_neon() {
    println!("neon");
}

#[multiversion]
#[specialize(target = "[x86|x86_64]+avx", fn = "where_am_i_avx", unsafe = true)]
#[specialize(target = "x86+sse", fn = "where_am_i_sse", unsafe = true)]
#[specialize(target = "[arm|aarch64]+neon", fn = "where_am_i_neon", unsafe = true)]
fn where_am_i() {
    println!("generic");
}
§Static dispatching
The multiversion attribute allows functions called inside the function to be statically dispatched. Additionally, functions created with this attribute can themselves be statically dispatched. See static dispatching for more information.
§Conditional compilation
The multiversion attribute supports conditional compilation with the #[target_cfg] helper attribute. See conditional compilation for more information.
§Function name mangling
The functions created by this macro are mangled as {ident}_{features}_version, where ident is the name of the multiversioned function, and features is either default (for the default version with no features enabled) or the list of features, sorted alphabetically. Dots (.) in the feature names are removed.
The following creates two functions, foo_avx_sse41_version and foo_default_version.
#[multiversion::multiversion]
#[clone(target = "[x86|x86_64]+sse4.1+avx")]
fn foo() {}

#[multiversion::target("[x86|x86_64]+sse4.1+avx")]
unsafe fn call_foo_avx() {
    foo_avx_sse41_version();
}

fn call_foo_default() {
    foo_default_version();
}
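The naming rule can be sketched as a small standalone function. This is only an illustration of the rule as described above, not code from the crate: mangled_name is a hypothetical helper, the underscore separator between features is inferred from the foo_avx_sse41_version example, and sorting before stripping the dots is an assumption.

```rust
/// Hypothetical helper that mirrors the documented naming scheme
/// `{ident}_{features}_version`.
fn mangled_name(ident: &str, features: &[&str]) -> String {
    // The version with no features enabled is named "default".
    if features.is_empty() {
        return format!("{ident}_default_version");
    }

    // Sort the feature names alphabetically (assumed to happen before
    // the dots are stripped).
    let mut sorted: Vec<&str> = features.to_vec();
    sorted.sort();

    // Remove dots from each feature name, e.g. "sse4.1" becomes "sse41".
    let cleaned: Vec<String> = sorted.iter().map(|f| f.replace('.', "")).collect();

    // Join with underscores (inferred from the example in the text).
    format!("{ident}_{}_version", cleaned.join("_"))
}
```

With this sketch, mangled_name("foo", &["sse4.1", "avx"]) yields foo_avx_sse41_version and mangled_name("foo", &[]) yields foo_default_version, matching the two names in the example above.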
§Implementation details
The function version dispatcher consists of a function selector and an atomic function pointer. Initially the function pointer will point to the function selector. On invocation, this selector will then choose an implementation, store a pointer to it in the atomic function pointer for later use and then pass on control to the chosen function. On subsequent calls, the chosen function will be called without invoking the function selector.
Some comments on the benefits of this implementation:
- The function selector is only invoked once. Subsequent calls are reduced to an atomic load and indirect function call (for non-generic, non-async functions). Generic and async functions cannot be stored in the atomic function pointer, which may result in additional branches.
- If called in multiple threads, there is no contention. It is possible for two threads to hit the same function before function selection has completed, which results in each thread invoking the function selector, but the atomic ensures that these are synchronized correctly.
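The dispatcher described in this section can be approximated by hand as follows. This is a simplified sketch, not the code the macro generates: all names (SQUARE_IMPL, SELECTOR_RUNS, select_square, square_generic) are hypothetical, the selector always picks the generic version instead of probing CPU features, the use of Relaxed ordering is an assumption, and the SELECTOR_RUNS counter exists only to make the "selector runs once" behavior observable.

```rust
use std::mem;
use std::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};

// Signature shared by every version of the function.
type SquareFn = fn(&mut [f32]);

// Demonstration only: counts how often the selector runs.
static SELECTOR_RUNS: AtomicUsize = AtomicUsize::new(0);

// The atomic function pointer. It initially points at the selector.
static SQUARE_IMPL: AtomicPtr<()> = AtomicPtr::new(select_square as *mut ());

fn square_generic(x: &mut [f32]) {
    for v in x.iter_mut() {
        *v *= *v;
    }
}

// The function selector: chooses an implementation, stores it for later
// calls, then passes control to it.
fn select_square(x: &mut [f32]) {
    SELECTOR_RUNS.fetch_add(1, Ordering::Relaxed);
    // A real selector would check CPU features here and pick the first
    // matching target. This sketch always chooses the generic version.
    let chosen: SquareFn = square_generic;
    // If two threads race to this point, both store the same pointer,
    // so the duplicate store is harmless.
    SQUARE_IMPL.store(chosen as *mut (), Ordering::Relaxed);
    chosen(x)
}

// Public entry point: one atomic load plus one indirect call.
pub fn square(x: &mut [f32]) {
    let ptr = SQUARE_IMPL.load(Ordering::Relaxed);
    // SAFETY: SQUARE_IMPL only ever holds pointers to functions with the
    // SquareFn signature (the selector or a chosen implementation).
    let f = unsafe { mem::transmute::<*mut (), SquareFn>(ptr) };
    f(x)
}
```

After the first call to square, SQUARE_IMPL points directly at square_generic, so later calls skip select_square entirely. Calling square twice from a single thread therefore leaves SELECTOR_RUNS at 1.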