This documentation is still new and evolving. If you spot any mistakes, unclear explanations, or missing details, please open an issue.
Your feedback helps us improve!
SIMD helpers
This page lists the slice operations available in the exp/simd sub-package. These helpers use AVX (128-bit), AVX2 (256-bit), or AVX-512 (512-bit) SIMD instructions when built with Go 1.26+ on amd64 with GOEXPERIMENT=simd enabled.
SIMD helpers are experimental. The API may break in the future.
Performance
Benchmarks show that SIMD operators are slower than the non-SIMD fallback on small datasets (roughly 2x to 8x in this run):
BenchmarkSumInt8/small/Fallback-lo-4 203616572 5.875 ns/op
BenchmarkSumInt8/small/AVX-x16-4 100000000 12.04 ns/op
BenchmarkSumInt8/small/AVX2-x32-4 64041816 17.93 ns/op
BenchmarkSumInt8/small/AVX512-x64-4 26947528 44.75 ns/op
But they are roughly 15x to 41x faster on large datasets:
BenchmarkSumInt8/xlarge/Fallback-lo-4 247677 4860 ns/op
BenchmarkSumInt8/xlarge/AVX-x16-4 3851040 311.4 ns/op
BenchmarkSumInt8/xlarge/AVX2-x32-4 7100002 169.2 ns/op
BenchmarkSumInt8/xlarge/AVX512-x64-4 10107534 118.1 ns/op
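Given these numbers, one option is to route short slices to a plain loop and keep the SIMD helper for long ones. The sketch below is only an illustration and not part of the library: `sumWide` is a scalar stand-in for a call such as `simd.SumInt8x32` (so the example runs without GOEXPERIMENT=simd), and `smallCutoff` is a hypothetical threshold that would have to be measured on the target hardware.

```go
package main

import "fmt"

// smallCutoff is a hypothetical threshold; benchmark to find the real crossover.
const smallCutoff = 64

// sumScalar is the plain loop used for short inputs.
func sumScalar(xs []int8) int8 {
	var total int8
	for _, x := range xs {
		total += x
	}
	return total
}

// sumWide stands in for a SIMD helper such as simd.SumInt8x32.
// It is a scalar placeholder so that this sketch runs anywhere.
func sumWide(xs []int8) int8 {
	return sumScalar(xs)
}

// useWide reports whether an input of length n is long enough for the wide path.
func useWide(n int) bool {
	return n >= smallCutoff
}

// sumAuto picks the scalar loop for short slices and the wide path otherwise.
func sumAuto(xs []int8) int8 {
	if useWide(len(xs)) {
		return sumWide(xs)
	}
	return sumScalar(xs)
}

func main() {
	fmt.Println(sumAuto([]int8{1, 2, 3, 4, 5})) // 15, scalar path
	fmt.Println(sumAuto(make([]int8, 1024)))    // 0, wide path
}
```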
Contains
Checks if a target value is present in a collection using SIMD instructions. The suffix (x2, x4, x8, x16, x32, x64) indicates the number of lanes processed simultaneously.
Note: Choose the variant matching your CPU's capabilities. Higher lane counts provide better performance but require newer CPU support.
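To make the suffix concrete: the names in the prototypes are consistent with lanes = register width in bits / element size in bits, using the widths given at the top of this page (128, 256, 512). The helper `lanes` below is a hypothetical name used only for this illustration.

```go
package main

import "fmt"

// lanes returns how many elements of elemBits bits fit in a register
// of registerBits bits.
func lanes(registerBits, elemBits int) int {
	return registerBits / elemBits
}

func main() {
	widths := []struct {
		name string
		bits int
	}{
		{"AVX", 128},
		{"AVX2", 256},
		{"AVX-512", 512},
	}

	for _, w := range widths {
		fmt.Printf("%-8s 8-bit x%d, 16-bit x%d, 32-bit x%d, 64-bit x%d\n",
			w.name,
			lanes(w.bits, 8),
			lanes(w.bits, 16),
			lanes(w.bits, 32),
			lanes(w.bits, 64),
		)
	}
}
```

Read this way, `ContainsInt8x32` is a 256-bit (AVX2) variant, while `ContainsInt64x2` is a 128-bit (AVX) variant.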
// Using AVX2 variant (32 lanes at once) - Intel Haswell+ / AMD Excavator+
found := simd.ContainsInt8x32([]int8{1, 2, 3, 4, 5}, 3)
// true

// Using AVX variant (2 lanes at once) - works on all amd64
found := simd.ContainsInt64x2([]int64{1000000, 2000000, 3000000}, 2000000)
// true

// Using AVX-512 variant (64 lanes at once) - Intel Skylake-X+
found := simd.ContainsUint8x64([]uint8{10, 20, 30, 40, 50}, 30)
// true

// Float32 with AVX2 (8 lanes at once)
found := simd.ContainsFloat32x8([]float32{1.1, 2.2, 3.3, 4.4}, 3.3)
// true

// Empty collection returns false
found := simd.ContainsInt16x16([]int16{}, 5)
// false

Prototypes:
func ContainsInt8x16[T ~int8](collection []T, target T) bool
func ContainsInt8x32[T ~int8](collection []T, target T) bool
func ContainsInt8x64[T ~int8](collection []T, target T) bool
func ContainsInt16x8[T ~int16](collection []T, target T) bool
func ContainsInt16x16[T ~int16](collection []T, target T) bool
func ContainsInt16x32[T ~int16](collection []T, target T) bool
func ContainsInt32x4[T ~int32](collection []T, target T) bool
func ContainsInt32x8[T ~int32](collection []T, target T) bool
func ContainsInt32x16[T ~int32](collection []T, target T) bool
func ContainsInt64x2[T ~int64](collection []T, target T) bool
func ContainsInt64x4[T ~int64](collection []T, target T) bool
func ContainsInt64x8[T ~int64](collection []T, target T) bool
func ContainsUint8x16[T ~uint8](collection []T, target T) bool
func ContainsUint8x32[T ~uint8](collection []T, target T) bool
func ContainsUint8x64[T ~uint8](collection []T, target T) bool
func ContainsUint16x8[T ~uint16](collection []T, target T) bool
func ContainsUint16x16[T ~uint16](collection []T, target T) bool
func ContainsUint16x32[T ~uint16](collection []T, target T) bool
func ContainsUint32x4[T ~uint32](collection []T, target T) bool
func ContainsUint32x8[T ~uint32](collection []T, target T) bool
func ContainsUint32x16[T ~uint32](collection []T, target T) bool
func ContainsUint64x2[T ~uint64](collection []T, target T) bool
func ContainsUint64x4[T ~uint64](collection []T, target T) bool
func ContainsUint64x8[T ~uint64](collection []T, target T) bool
func ContainsFloat32x4[T ~float32](collection []T, target T) bool
func ContainsFloat32x8[T ~float32](collection []T, target T) bool
func ContainsFloat32x16[T ~float32](collection []T, target T) bool
func ContainsFloat64x2[T ~float64](collection []T, target T) bool
func ContainsFloat64x4[T ~float64](collection []T, target T) bool
func ContainsFloat64x8[T ~float64](collection []T, target T) bool

Sum
Sums the values in a collection using SIMD instructions. The suffix (x2, x4, x8, x16, x32, x64) indicates the number of lanes processed simultaneously.
Note: Choose the variant matching your CPU's capabilities. Higher lane counts provide better performance but require newer CPU support.
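Because the Sum prototypes return the element type `T`, narrow integer types can overflow. The sketch below assumes the helpers accumulate in `T` with Go's usual wraparound, as a plain loop does; that is an assumption about the library, not something this page states. `sumInt8` and `sumInt8Wide` are hypothetical scalar stand-ins used to show the effect and one way to avoid it.

```go
package main

import "fmt"

// sumInt8 mirrors the shape of the Sum prototypes: the result has the
// element type, so it wraps around on overflow like any int8 addition.
func sumInt8(xs []int8) int8 {
	var total int8
	for _, x := range xs {
		total += x
	}
	return total
}

// sumInt8Wide accumulates in int64 to avoid the wraparound.
func sumInt8Wide(xs []int8) int64 {
	var total int64
	for _, x := range xs {
		total += int64(x)
	}
	return total
}

func main() {
	xs := []int8{100, 100}
	fmt.Println(sumInt8(xs))     // -56: 200 does not fit in int8
	fmt.Println(sumInt8Wide(xs)) // 200
}
```

If totals may exceed the element range, converting the data to a wider type before summing is the safer choice.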
// Using AVX2 variant (32 lanes at once) - Intel Haswell+ / AMD Excavator+
sum := simd.SumInt8x32([]int8{1, 2, 3, 4, 5})
// 15

// Using AVX-512 variant (16 lanes at once) - Intel Skylake-X+
sum := simd.SumFloat32x16([]float32{1.1, 2.2, 3.3, 4.4})
// 11

// Using AVX variant (4 lanes at once) - works on all amd64
sum := simd.SumInt32x4([]int32{1000000, 2000000, 3000000})
// 6000000

// Empty collection returns 0
sum := simd.SumUint16x16([]uint16{})
// 0

Prototypes:
func SumInt8x16[T ~int8](collection []T) T
func SumInt8x32[T ~int8](collection []T) T
func SumInt8x64[T ~int8](collection []T) T
func SumInt16x8[T ~int16](collection []T) T
func SumInt16x16[T ~int16](collection []T) T
func SumInt16x32[T ~int16](collection []T) T
func SumInt32x4[T ~int32](collection []T) T
func SumInt32x8[T ~int32](collection []T) T
func SumInt32x16[T ~int32](collection []T) T
func SumInt64x2[T ~int64](collection []T) T
func SumInt64x4[T ~int64](collection []T) T
func SumInt64x8[T ~int64](collection []T) T
func SumUint8x16[T ~uint8](collection []T) T
func SumUint8x32[T ~uint8](collection []T) T
func SumUint8x64[T ~uint8](collection []T) T
func SumUint16x8[T ~uint16](collection []T) T
func SumUint16x16[T ~uint16](collection []T) T
func SumUint16x32[T ~uint16](collection []T) T
func SumUint32x4[T ~uint32](collection []T) T
func SumUint32x8[T ~uint32](collection []T) T
func SumUint32x16[T ~uint32](collection []T) T
func SumUint64x2[T ~uint64](collection []T) T
func SumUint64x4[T ~uint64](collection []T) T
func SumUint64x8[T ~uint64](collection []T) T
func SumFloat32x4[T ~float32](collection []T) T
func SumFloat32x8[T ~float32](collection []T) T
func SumFloat32x16[T ~float32](collection []T) T
func SumFloat64x2[T ~float64](collection []T) T
func SumFloat64x4[T ~float64](collection []T) T
func SumFloat64x8[T ~float64](collection []T) T

Mean
Calculates the arithmetic mean of a collection using SIMD instructions. The suffix (x2, x4, x8, x16, x32, x64) indicates the number of lanes processed simultaneously.
Note: Choose the variant matching your CPU's capabilities. Higher lane counts provide better performance but require newer CPU support.
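The Mean prototypes also return the element type `T`, so an integer mean cannot carry a fractional part. The sketch below assumes the result is the sum divided by the length in type `T`, which truncates; this is an assumption about the library's behavior. `meanInt` and `meanFloat` are hypothetical scalar stand-ins.

```go
package main

import "fmt"

// meanInt computes the mean in the element type, so the division truncates.
// Assumption: this mirrors what an integer Mean helper returning T does.
func meanInt(xs []int16) int16 {
	if len(xs) == 0 {
		return 0 // matches the documented empty-collection result
	}
	var total int16
	for _, x := range xs {
		total += x
	}
	return total / int16(len(xs))
}

// meanFloat converts each element first and keeps the fractional part.
func meanFloat(xs []int16) float64 {
	if len(xs) == 0 {
		return 0
	}
	var total float64
	for _, x := range xs {
		total += float64(x)
	}
	return total / float64(len(xs))
}

func main() {
	xs := []int16{1, 2}
	fmt.Println(meanInt(xs))   // 1
	fmt.Println(meanFloat(xs)) // 1.5
}
```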
// Using AVX2 variant (32 lanes at once) - Intel Haswell+ / AMD Excavator+
mean := simd.MeanInt8x32([]int8{1, 2, 3, 4, 5})
// 3

// Using AVX-512 variant (16 lanes at once) - Intel Skylake-X+
mean := simd.MeanFloat32x16([]float32{1.0, 2.0, 3.0, 4.0})
// 2.5

// Using AVX variant (8 lanes at once) - works on all amd64
mean := simd.MeanInt16x8([]int16{10, 20, 30, 40})
// 25

// Empty collection returns 0
mean := simd.MeanUint32x4([]uint32{})
// 0

Prototypes:
func MeanInt8x16[T ~int8](collection []T) T
func MeanInt8x32[T ~int8](collection []T) T
func MeanInt8x64[T ~int8](collection []T) T
func MeanInt16x8[T ~int16](collection []T) T
func MeanInt16x16[T ~int16](collection []T) T
func MeanInt16x32[T ~int16](collection []T) T
func MeanInt32x4[T ~int32](collection []T) T
func MeanInt32x8[T ~int32](collection []T) T
func MeanInt32x16[T ~int32](collection []T) T
func MeanInt64x2[T ~int64](collection []T) T
func MeanInt64x4[T ~int64](collection []T) T
func MeanInt64x8[T ~int64](collection []T) T
func MeanUint8x16[T ~uint8](collection []T) T
func MeanUint8x32[T ~uint8](collection []T) T
func MeanUint8x64[T ~uint8](collection []T) T
func MeanUint16x8[T ~uint16](collection []T) T
func MeanUint16x16[T ~uint16](collection []T) T
func MeanUint16x32[T ~uint16](collection []T) T
func MeanUint32x4[T ~uint32](collection []T) T
func MeanUint32x8[T ~uint32](collection []T) T
func MeanUint32x16[T ~uint32](collection []T) T
func MeanUint64x2[T ~uint64](collection []T) T
func MeanUint64x4[T ~uint64](collection []T) T
func MeanUint64x8[T ~uint64](collection []T) T
func MeanFloat32x4[T ~float32](collection []T) T
func MeanFloat32x8[T ~float32](collection []T) T
func MeanFloat32x16[T ~float32](collection []T) T
func MeanFloat64x2[T ~float64](collection []T) T
func MeanFloat64x4[T ~float64](collection []T) T
func MeanFloat64x8[T ~float64](collection []T) T

Min
Finds the minimum value in a collection using SIMD instructions. The suffix (x2, x4, x8, x16, x32, x64) indicates the number of lanes processed simultaneously.
Note: Choose the variant matching your CPU's capabilities. Higher lane counts provide better performance but require newer CPU support.
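As a rough mental model of what "lanes processed simultaneously" means for a reduction such as Min, the scalar sketch below keeps one running minimum per lane, reduces the lanes at the end, and then handles the leftover tail. It is a simulation in plain Go under the assumption of a 4-lane layout, not the library's implementation; `minLanes4` and `minScalar` are hypothetical names.

```go
package main

import "fmt"

// laneCount models an x4 variant such as MinInt32x4.
const laneCount = 4

// minScalar returns the smallest element of a non-empty slice.
func minScalar(xs []int32) int32 {
	best := xs[0]
	for _, x := range xs[1:] {
		if x < best {
			best = x
		}
	}
	return best
}

// minLanes4 keeps one running minimum per lane, reduces the lanes, and
// finally folds in the elements that did not fill a whole chunk.
func minLanes4(xs []int32) int32 {
	if len(xs) == 0 {
		return 0 // matches the documented empty-collection result
	}
	if len(xs) < laneCount {
		return minScalar(xs)
	}

	// Load the first chunk as the initial per-lane minima.
	var acc [laneCount]int32
	copy(acc[:], xs[:laneCount])

	i := laneCount
	for ; i+laneCount <= len(xs); i += laneCount {
		// On real hardware this inner loop is a single vector instruction.
		for lane := 0; lane < laneCount; lane++ {
			if v := xs[i+lane]; v < acc[lane] {
				acc[lane] = v
			}
		}
	}

	// Horizontal reduction across lanes.
	best := minScalar(acc[:])

	// Scalar tail for the remaining elements.
	if i < len(xs) {
		if tail := minScalar(xs[i:]); tail < best {
			best = tail
		}
	}
	return best
}

func main() {
	fmt.Println(minLanes4([]int32{100, 50, 200, 75}))         // 50
	fmt.Println(minLanes4([]int32{9, 8, 7, 6, 5, 4, 3, 2, 1})) // 1
}
```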
// Using AVX2 variant (32 lanes at once) - Intel Haswell+ / AMD Excavator+
min := simd.MinInt8x32([]int8{5, 2, 8, 1, 9})
// 1

// Using AVX-512 variant (16 lanes at once) - Intel Skylake-X+
min := simd.MinFloat32x16([]float32{3.5, 1.2, 4.8, 2.1})
// 1.2

// Using AVX variant (4 lanes at once) - works on all amd64
min := simd.MinInt32x4([]int32{100, 50, 200, 75})
// 50

// Empty collection returns 0
min := simd.MinUint16x8([]uint16{})
// 0

Prototypes:
func MinInt8x16[T ~int8](collection []T) T
func MinInt8x32[T ~int8](collection []T) T
func MinInt8x64[T ~int8](collection []T) T
func MinInt16x8[T ~int16](collection []T) T
func MinInt16x16[T ~int16](collection []T) T
func MinInt16x32[T ~int16](collection []T) T
func MinInt32x4[T ~int32](collection []T) T
func MinInt32x8[T ~int32](collection []T) T
func MinInt32x16[T ~int32](collection []T) T
func MinInt64x2[T ~int64](collection []T) T
func MinInt64x4[T ~int64](collection []T) T
func MinInt64x8[T ~int64](collection []T) T
func MinUint8x16[T ~uint8](collection []T) T
func MinUint8x32[T ~uint8](collection []T) T
func MinUint8x64[T ~uint8](collection []T) T
func MinUint16x8[T ~uint16](collection []T) T
func MinUint16x16[T ~uint16](collection []T) T
func MinUint16x32[T ~uint16](collection []T) T
func MinUint32x4[T ~uint32](collection []T) T
func MinUint32x8[T ~uint32](collection []T) T
func MinUint32x16[T ~uint32](collection []T) T
func MinUint64x2[T ~uint64](collection []T) T
func MinUint64x4[T ~uint64](collection []T) T
func MinUint64x8[T ~uint64](collection []T) T
func MinFloat32x4[T ~float32](collection []T) T
func MinFloat32x8[T ~float32](collection []T) T
func MinFloat32x16[T ~float32](collection []T) T
func MinFloat64x2[T ~float64](collection []T) T
func MinFloat64x4[T ~float64](collection []T) T
func MinFloat64x8[T ~float64](collection []T) T

SumBy
SumBy transforms a collection using an iteratee function and sums the result using SIMD instructions. The automatic dispatch functions (e.g., SumByInt8) select the best SIMD variant available on the current CPU. The specific variants (e.g., SumByInt8x32) use a fixed SIMD instruction set regardless of CPU capabilities.
Note: Use specific variants (e.g., SumByInt8x32) only if you know your target CPU supports that instruction set.

type Person struct {
	Name string
	Age  int8
}

people := []Person{
	{Name: "Alice", Age: 25},
	{Name: "Bob", Age: 30},
	{Name: "Charlie", Age: 35},
}

// Automatic dispatch - uses best available SIMD
sum := simd.SumByInt8(people, func(p Person) int8 {
	return p.Age
})
// 90

type Product struct {
	Name  string
	Price float32
	Stock int32
}

products := []Product{
	{Name: "Widget", Price: 10.50, Stock: 5},
	{Name: "Gadget", Price: 20.00, Stock: 3},
	{Name: "Tool", Price: 15.75, Stock: 2},
}

// Sum stock value using specific AVX2 variant
sum := simd.SumByFloat32x8(products, func(p Product) float32 {
	return p.Price * float32(p.Stock)
})
// 144

type Metric struct {
	Value uint16
}

metrics := []Metric{
	{Value: 100},
	{Value: 200},
	{Value: 300},
	{Value: 400},
}

// Using AVX variant - works on all amd64
sum := simd.SumByUint16x8(metrics, func(m Metric) uint16 {
	return m.Value
})
// 1000

// Empty collection returns 0
type Item struct {
	Count int64
}

sum := simd.SumByInt64([]Item{}, func(i Item) int64 {
	return i.Count
})
// 0

Prototypes:
func SumByInt8[T any, R ~int8](collection []T, iteratee func(item T) R) R
func SumByInt16[T any, R ~int16](collection []T, iteratee func(item T) R) R
func SumByInt32[T any, R ~int32](collection []T, iteratee func(item T) R) R
func SumByInt64[T any, R ~int64](collection []T, iteratee func(item T) R) R
func SumByUint8[T any, R ~uint8](collection []T, iteratee func(item T) R) R
func SumByUint16[T any, R ~uint16](collection []T, iteratee func(item T) R) R
func SumByUint32[T any, R ~uint32](collection []T, iteratee func(item T) R) R
func SumByUint64[T any, R ~uint64](collection []T, iteratee func(item T) R) R
func SumByFloat32[T any, R ~float32](collection []T, iteratee func(item T) R) R
func SumByFloat64[T any, R ~float64](collection []T, iteratee func(item T) R) R
func SumByInt8x16[T any, R ~int8](collection []T, iteratee func(item T) R) R
func SumByInt8x32[T any, R ~int8](collection []T, iteratee func(item T) R) R
func SumByInt8x64[T any, R ~int8](collection []T, iteratee func(item T) R) R
func SumByInt16x8[T any, R ~int16](collection []T, iteratee func(item T) R) R
func SumByInt16x16[T any, R ~int16](collection []T, iteratee func(item T) R) R
func SumByInt16x32[T any, R ~int16](collection []T, iteratee func(item T) R) R
func SumByInt32x4[T any, R ~int32](collection []T, iteratee func(item T) R) R
func SumByInt32x8[T any, R ~int32](collection []T, iteratee func(item T) R) R
func SumByInt32x16[T any, R ~int32](collection []T, iteratee func(item T) R) R
func SumByInt64x2[T any, R ~int64](collection []T, iteratee func(item T) R) R
func SumByInt64x4[T any, R ~int64](collection []T, iteratee func(item T) R) R
func SumByInt64x8[T any, R ~int64](collection []T, iteratee func(item T) R) R
func SumByUint8x16[T any, R ~uint8](collection []T, iteratee func(item T) R) R
func SumByUint8x32[T any, R ~uint8](collection []T, iteratee func(item T) R) R
func SumByUint8x64[T any, R ~uint8](collection []T, iteratee func(item T) R) R
func SumByUint16x8[T any, R ~uint16](collection []T, iteratee func(item T) R) R
func SumByUint16x16[T any, R ~uint16](collection []T, iteratee func(item T) R) R
func SumByUint16x32[T any, R ~uint16](collection []T, iteratee func(item T) R) R
func SumByUint32x4[T any, R ~uint32](collection []T, iteratee func(item T) R) R
func SumByUint32x8[T any, R ~uint32](collection []T, iteratee func(item T) R) R
func SumByUint32x16[T any, R ~uint32](collection []T, iteratee func(item T) R) R
func SumByUint64x2[T any, R ~uint64](collection []T, iteratee func(item T) R) R
func SumByUint64x4[T any, R ~uint64](collection []T, iteratee func(item T) R) R
func SumByUint64x8[T any, R ~uint64](collection []T, iteratee func(item T) R) R
func SumByFloat32x4[T any, R ~float32](collection []T, iteratee func(item T) R) R
func SumByFloat32x8[T any, R ~float32](collection []T, iteratee func(item T) R) R
func SumByFloat32x16[T any, R ~float32](collection []T, iteratee func(item T) R) R
func SumByFloat64x2[T any, R ~float64](collection []T, iteratee func(item T) R) R
func SumByFloat64x4[T any, R ~float64](collection []T, iteratee func(item T) R) R
func SumByFloat64x8[T any, R ~float64](collection []T, iteratee func(item T) R) R

Max
Finds the maximum value in a collection using SIMD instructions. The suffix (x2, x4, x8, x16, x32, x64) indicates the number of lanes processed simultaneously.
Note: Choose the variant matching your CPU's capabilities. Higher lane counts provide better performance but require newer CPU support.
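Since an empty collection yields 0, the result alone cannot distinguish "no elements" from "the maximum is 0". One way to handle this is to check the length before calling the helper. In the sketch below, `maxInt32` is a scalar stand-in for a call such as `simd.MaxInt32x4`, and `maxInt32OK` is a hypothetical wrapper, not a library function.

```go
package main

import "fmt"

// maxInt32 stands in for a helper such as simd.MaxInt32x4. Like the
// documented helpers, it returns 0 for an empty slice.
func maxInt32(xs []int32) int32 {
	if len(xs) == 0 {
		return 0
	}
	best := xs[0]
	for _, x := range xs[1:] {
		if x > best {
			best = x
		}
	}
	return best
}

// maxInt32OK is a hypothetical wrapper that also reports whether a
// maximum exists at all.
func maxInt32OK(xs []int32) (int32, bool) {
	if len(xs) == 0 {
		return 0, false
	}
	return maxInt32(xs), true
}

func main() {
	v, ok := maxInt32OK(nil)
	fmt.Println(v, ok) // 0 false

	v, ok = maxInt32OK([]int32{-3, 0, -7})
	fmt.Println(v, ok) // 0 true
}
```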
// Using AVX2 variant (32 lanes at once) - Intel Haswell+ / AMD Excavator+
max := simd.MaxInt8x32([]int8{5, 2, 8, 1, 9})
// 9

// Using AVX-512 variant (16 lanes at once) - Intel Skylake-X+
max := simd.MaxFloat32x16([]float32{3.5, 1.2, 4.8, 2.1})
// 4.8

// Using AVX variant (4 lanes at once) - works on all amd64
max := simd.MaxInt32x4([]int32{100, 50, 200, 75})
// 200

// Empty collection returns 0
max := simd.MaxUint16x8([]uint16{})
// 0

Prototypes:
func MaxInt8x16[T ~int8](collection []T) T
func MaxInt8x32[T ~int8](collection []T) T
func MaxInt8x64[T ~int8](collection []T) T
func MaxInt16x8[T ~int16](collection []T) T
func MaxInt16x16[T ~int16](collection []T) T
func MaxInt16x32[T ~int16](collection []T) T
func MaxInt32x4[T ~int32](collection []T) T
func MaxInt32x8[T ~int32](collection []T) T
func MaxInt32x16[T ~int32](collection []T) T
func MaxInt64x2[T ~int64](collection []T) T
func MaxInt64x4[T ~int64](collection []T) T
func MaxInt64x8[T ~int64](collection []T) T
func MaxUint8x16[T ~uint8](collection []T) T
func MaxUint8x32[T ~uint8](collection []T) T
func MaxUint8x64[T ~uint8](collection []T) T
func MaxUint16x8[T ~uint16](collection []T) T
func MaxUint16x16[T ~uint16](collection []T) T
func MaxUint16x32[T ~uint16](collection []T) T
func MaxUint32x4[T ~uint32](collection []T) T
func MaxUint32x8[T ~uint32](collection []T) T
func MaxUint32x16[T ~uint32](collection []T) T
func MaxUint64x2[T ~uint64](collection []T) T
func MaxUint64x4[T ~uint64](collection []T) T
func MaxUint64x8[T ~uint64](collection []T) T
func MaxFloat32x4[T ~float32](collection []T) T
func MaxFloat32x8[T ~float32](collection []T) T
func MaxFloat32x16[T ~float32](collection []T) T
func MaxFloat64x2[T ~float64](collection []T) T
func MaxFloat64x4[T ~float64](collection []T) T
func MaxFloat64x8[T ~float64](collection []T) T

MeanBy
MeanBy transforms a collection using an iteratee function and calculates the arithmetic mean of the result using SIMD instructions. The automatic dispatch functions (e.g., MeanByInt8) select the best SIMD variant available on the current CPU. The specific variants (e.g., MeanByInt8x32) use a fixed SIMD instruction set regardless of CPU capabilities.
Note: Use specific variants (e.g., MeanByInt8x32) only if you know your target CPU supports that instruction set.

type Person struct {
	Name string
	Age  int8
}

people := []Person{
	{Name: "Alice", Age: 20},
	{Name: "Bob", Age: 30},
	{Name: "Charlie", Age: 40},
}

// Automatic dispatch - uses best available SIMD
mean := simd.MeanByInt8(people, func(p Person) int8 {
	return p.Age
})
// 30

type Product struct {
	Name  string
	Price float32
}

products := []Product{
	{Name: "Widget", Price: 10.50},
	{Name: "Gadget", Price: 20.00},
	{Name: "Tool", Price: 15.75},
}

// Mean price using specific AVX2 variant
mean := simd.MeanByFloat32x8(products, func(p Product) float32 {
	return p.Price
})
// 15.4167

type Metric struct {
	Value uint16
}

metrics := []Metric{
	{Value: 100},
	{Value: 200},
	{Value: 300},
	{Value: 400},
}

// Using AVX variant - works on all amd64
mean := simd.MeanByUint16x8(metrics, func(m Metric) uint16 {
	return m.Value
})
// 250

// Empty collection returns 0
type Item struct {
	Count int64
}

mean := simd.MeanByInt64([]Item{}, func(i Item) int64 {
	return i.Count
})
// 0

Prototypes:
func MeanByInt8[T any, R ~int8](collection []T, iteratee func(item T) R) R
func MeanByInt16[T any, R ~int16](collection []T, iteratee func(item T) R) R
func MeanByInt32[T any, R ~int32](collection []T, iteratee func(item T) R) R
func MeanByInt64[T any, R ~int64](collection []T, iteratee func(item T) R) R
func MeanByUint8[T any, R ~uint8](collection []T, iteratee func(item T) R) R
func MeanByUint16[T any, R ~uint16](collection []T, iteratee func(item T) R) R
func MeanByUint32[T any, R ~uint32](collection []T, iteratee func(item T) R) R
func MeanByUint64[T any, R ~uint64](collection []T, iteratee func(item T) R) R
func MeanByFloat32[T any, R ~float32](collection []T, iteratee func(item T) R) R
func MeanByFloat64[T any, R ~float64](collection []T, iteratee func(item T) R) R
func MeanByInt8x16[T any, R ~int8](collection []T, iteratee func(item T) R) R
func MeanByInt8x32[T any, R ~int8](collection []T, iteratee func(item T) R) R
func MeanByInt8x64[T any, R ~int8](collection []T, iteratee func(item T) R) R
func MeanByInt16x8[T any, R ~int16](collection []T, iteratee func(item T) R) R
func MeanByInt16x16[T any, R ~int16](collection []T, iteratee func(item T) R) R
func MeanByInt16x32[T any, R ~int16](collection []T, iteratee func(item T) R) R
func MeanByInt32x4[T any, R ~int32](collection []T, iteratee func(item T) R) R
func MeanByInt32x8[T any, R ~int32](collection []T, iteratee func(item T) R) R
func MeanByInt32x16[T any, R ~int32](collection []T, iteratee func(item T) R) R
func MeanByInt64x2[T any, R ~int64](collection []T, iteratee func(item T) R) R
func MeanByInt64x4[T any, R ~int64](collection []T, iteratee func(item T) R) R
func MeanByInt64x8[T any, R ~int64](collection []T, iteratee func(item T) R) R
func MeanByUint8x16[T any, R ~uint8](collection []T, iteratee func(item T) R) R
func MeanByUint8x32[T any, R ~uint8](collection []T, iteratee func(item T) R) R
func MeanByUint8x64[T any, R ~uint8](collection []T, iteratee func(item T) R) R
func MeanByUint16x8[T any, R ~uint16](collection []T, iteratee func(item T) R) R
func MeanByUint16x16[T any, R ~uint16](collection []T, iteratee func(item T) R) R
func MeanByUint16x32[T any, R ~uint16](collection []T, iteratee func(item T) R) R
func MeanByUint32x4[T any, R ~uint32](collection []T, iteratee func(item T) R) R
func MeanByUint32x8[T any, R ~uint32](collection []T, iteratee func(item T) R) R
func MeanByUint32x16[T any, R ~uint32](collection []T, iteratee func(item T) R) R
func MeanByUint64x2[T any, R ~uint64](collection []T, iteratee func(item T) R) R
func MeanByUint64x4[T any, R ~uint64](collection []T, iteratee func(item T) R) R
func MeanByUint64x8[T any, R ~uint64](collection []T, iteratee func(item T) R) R
func MeanByFloat32x4[T any, R ~float32](collection []T, iteratee func(item T) R) R
func MeanByFloat32x8[T any, R ~float32](collection []T, iteratee func(item T) R) R
func MeanByFloat32x16[T any, R ~float32](collection []T, iteratee func(item T) R) R
func MeanByFloat64x2[T any, R ~float64](collection []T, iteratee func(item T) R) R
func MeanByFloat64x4[T any, R ~float64](collection []T, iteratee func(item T) R) R
func MeanByFloat64x8[T any, R ~float64](collection []T, iteratee func(item T) R) R

Clamp
Clamps each element in a collection between min and max values using SIMD instructions. The suffix (x2, x4, x8, x16, x32, x64) indicates the number of lanes processed simultaneously.
Note: Choose the variant matching your CPU's capabilities. Higher lane counts provide better performance but require newer CPU support.
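The Clamp prototypes take a second type parameter, `Slice ~[]T`, which lets a named slice type pass through unchanged instead of degrading to `[]T`. The scalar sketch below uses the same type parameters to show the effect. `Samples` and `clampInt16` are hypothetical names, and returning a freshly allocated slice is an assumption here; this page does not say whether the library clamps in place.

```go
package main

import "fmt"

// Samples is a hypothetical named slice type.
type Samples []int16

// clampInt16 has the same type parameters as the Clamp prototypes.
// Assumption: it returns a new slice and leaves the input untouched.
func clampInt16[T ~int16, Slice ~[]T](collection Slice, lo, hi T) Slice {
	out := make(Slice, len(collection))
	for i, v := range collection {
		switch {
		case v < lo:
			out[i] = lo
		case v > hi:
			out[i] = hi
		default:
			out[i] = v
		}
	}
	return out
}

func main() {
	in := Samples{100, 150, 200, 250}
	out := clampInt16(in, 120, 220) // out has type Samples, not []int16
	fmt.Printf("%T %v\n", out, out) // main.Samples [120 150 200 220]
}
```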
// Using AVX2 variant (32 lanes at once) - Intel Haswell+ / AMD Excavator+
result := simd.ClampInt8x32([]int8{1, 5, 10, 15, 20}, 5, 15)
// []int8{5, 5, 10, 15, 15}

// Using AVX-512 variant (16 lanes at once) - Intel Skylake-X+
result := simd.ClampFloat32x16([]float32{0.5, 1.5, 2.5, 3.5}, 1.0, 3.0)
// []float32{1.0, 1.5, 2.5, 3.0}

// Using AVX variant (8 lanes at once) - works on all amd64
result := simd.ClampInt16x8([]int16{100, 150, 200, 250}, 120, 220)
// []int16{120, 150, 200, 220}

// Empty collection returns empty collection
result := simd.ClampUint32x4([]uint32{}, 10, 100)
// []uint32{}

Prototypes:
func ClampInt8x16[T ~int8, Slice ~[]T](collection Slice, min, max T) Slice
func ClampInt8x32[T ~int8, Slice ~[]T](collection Slice, min, max T) Slice
func ClampInt8x64[T ~int8, Slice ~[]T](collection Slice, min, max T) Slice
func ClampInt16x8[T ~int16, Slice ~[]T](collection Slice, min, max T) Slice
func ClampInt16x16[T ~int16, Slice ~[]T](collection Slice, min, max T) Slice
func ClampInt16x32[T ~int16, Slice ~[]T](collection Slice, min, max T) Slice
func ClampInt32x4[T ~int32, Slice ~[]T](collection Slice, min, max T) Slice
func ClampInt32x8[T ~int32, Slice ~[]T](collection Slice, min, max T) Slice
func ClampInt32x16[T ~int32, Slice ~[]T](collection Slice, min, max T) Slice
func ClampInt64x2[T ~int64, Slice ~[]T](collection Slice, min, max T) Slice
func ClampInt64x4[T ~int64, Slice ~[]T](collection Slice, min, max T) Slice
func ClampInt64x8[T ~int64, Slice ~[]T](collection Slice, min, max T) Slice
func ClampUint8x16[T ~uint8, Slice ~[]T](collection Slice, min, max T) Slice
func ClampUint8x32[T ~uint8, Slice ~[]T](collection Slice, min, max T) Slice
func ClampUint8x64[T ~uint8, Slice ~[]T](collection Slice, min, max T) Slice
func ClampUint16x8[T ~uint16, Slice ~[]T](collection Slice, min, max T) Slice
func ClampUint16x16[T ~uint16, Slice ~[]T](collection Slice, min, max T) Slice
func ClampUint16x32[T ~uint16, Slice ~[]T](collection Slice, min, max T) Slice
func ClampUint32x4[T ~uint32, Slice ~[]T](collection Slice, min, max T) Slice
func ClampUint32x8[T ~uint32, Slice ~[]T](collection Slice, min, max T) Slice
func ClampUint32x16[T ~uint32, Slice ~[]T](collection Slice, min, max T) Slice
func ClampUint64x2[T ~uint64, Slice ~[]T](collection Slice, min, max T) Slice
func ClampUint64x4[T ~uint64, Slice ~[]T](collection Slice, min, max T) Slice
func ClampUint64x8[T ~uint64, Slice ~[]T](collection Slice, min, max T) Slice
func ClampFloat32x4[T ~float32, Slice ~[]T](collection Slice, min, max T) Slice
func ClampFloat32x8[T ~float32, Slice ~[]T](collection Slice, min, max T) Slice
func ClampFloat32x16[T ~float32, Slice ~[]T](collection Slice, min, max T) Slice
func ClampFloat64x2[T ~float64, Slice ~[]T](collection Slice, min, max T) Slice
func ClampFloat64x4[T ~float64, Slice ~[]T](collection Slice, min, max T) Slice
func ClampFloat64x8[T ~float64, Slice ~[]T](collection Slice, min, max T) Slice