High-performance tensor operations with SIMD acceleration for .NET. Provides Vector, Matrix, and Tensor types with hardware-accelerated operations (Sin, Cos, Exp, Log, etc.) that work across .NET Framework 4.7.1 and .NET 10.0. Supports any numeric type via generic INumericOperations interface. Optional GPU acceleration available via separate packages.
Internally, hot paths use `ArrayPool<T>` and `Span<T>` to minimize allocations.

```shell
# Core package (CPU SIMD acceleration)
dotnet add package AiDotNet.Tensors

# Optional: OpenBLAS for optimized CPU BLAS operations
dotnet add package AiDotNet.Native.OpenBLAS

# Optional: CLBlast for OpenCL GPU acceleration (AMD/Intel/NVIDIA)
dotnet add package AiDotNet.Native.CLBlast

# Optional: CUDA for NVIDIA GPU acceleration (requires an NVIDIA GPU)
dotnet add package AiDotNet.Native.CUDA
```
```csharp
using AiDotNet.Tensors.LinearAlgebra;

// Create vectors
var v1 = new Vector<double>(new[] { 1.0, 2.0, 3.0, 4.0 });
var v2 = new Vector<double>(new[] { 5.0, 6.0, 7.0, 8.0 });

// SIMD-accelerated operations
var sum = v1 + v2;
var dot = v1.Dot(v2);

// Create matrices
var m1 = new Matrix<double>(3, 3);
var m2 = Matrix<double>.Identity(3);

// Matrix operations
var product = m1 * m2;
var transpose = m1.Transpose();
```
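The element-wise math functions mentioned in the overview (Sin, Cos, Exp, Log) follow the same pattern. A sketch, assuming `Vector<T>` exposes them as instance methods with these names (the exact method names and an element indexer are assumptions, not confirmed API):

```csharp
using System;
using AiDotNet.Tensors.LinearAlgebra;

var angles = new Vector<double>(new[] { 0.0, Math.PI / 2, Math.PI });

// Hardware-accelerated element-wise transcendental functions.
// Method names here are assumed from the feature list above.
var sines = angles.Sin();   // element-wise sine
var exps  = angles.Exp();   // element-wise e^x
```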
The library automatically detects and uses the best available SIMD instructions:
| Instruction Set | Vector Width | Requires |
|---|---|---|
| AVX-512 | 512-bit (16 floats) | .NET 8+ |
| AVX2 | 256-bit (8 floats) | .NET 6+ |
| AVX | 256-bit (8 floats) | .NET 6+ |
| SSE4.2 | 128-bit (4 floats) | .NET 6+ |
| ARM NEON | 128-bit (4 floats) | .NET 6+ |
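Independently of the library, you can cross-check what the runtime itself reports using .NET's built-in intrinsics API (x86 flags shown here; `AdvSimd` covers ARM NEON; `Avx512F` requires .NET 8 or later):

```csharp
using System;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;

// These flags are what the JIT sees; the library's SIMD paths
// light up on the same hardware support.
Console.WriteLine($"SSE4.2:  {Sse42.IsSupported}");
Console.WriteLine($"AVX:     {Avx.IsSupported}");
Console.WriteLine($"AVX2:    {Avx2.IsSupported}");
Console.WriteLine($"AVX-512: {Avx512F.IsSupported}"); // .NET 8+
Console.WriteLine($"NEON:    {AdvSimd.IsSupported}");
```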
```csharp
using AiDotNet.Tensors.Engines;

var caps = PlatformDetector.Capabilities;

// SIMD capabilities
Console.WriteLine($"AVX2: {caps.HasAVX2}");
Console.WriteLine($"AVX-512: {caps.HasAVX512F}");

// GPU support
Console.WriteLine($"CUDA: {caps.HasCudaSupport}");
Console.WriteLine($"OpenCL: {caps.HasOpenCLSupport}");

// Native library availability
Console.WriteLine($"OpenBLAS: {caps.HasOpenBlas}");
Console.WriteLine($"CLBlast: {caps.HasClBlast}");

// Or get a full status summary
Console.WriteLine(NativeLibraryDetector.GetStatusSummary());
```

Provides optimized CPU BLAS operations using OpenBLAS:
```shell
dotnet add package AiDotNet.Native.OpenBLAS
```

Performance: roughly 2x faster matrix operations compared to managed code.
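A quick way to see the effect on your own machine is to time a large multiply with `Stopwatch`. This sketch assumes `Matrix<T>` multiplication is routed to OpenBLAS automatically once the package is installed:

```csharp
using System;
using System.Diagnostics;
using AiDotNet.Tensors.LinearAlgebra;

const int n = 1024;
var a = new Matrix<double>(n, n);
var b = new Matrix<double>(n, n);

var sw = Stopwatch.StartNew();
var c = a * b;   // routed to OpenBLAS when the native package is present
sw.Stop();
Console.WriteLine($"{n}x{n} multiply: {sw.ElapsedMilliseconds} ms");
```

Run it once with only the core package and once with `AiDotNet.Native.OpenBLAS` installed to compare.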
Provides GPU acceleration via OpenCL (works on AMD, Intel, and NVIDIA GPUs):
```shell
dotnet add package AiDotNet.Native.CLBlast
```

Performance: 10x+ faster for large matrix operations on GPU.
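Since CLBlast works across GPU vendors, a reasonable pattern is to consult `PlatformDetector.Capabilities` at startup and only take the GPU path when both OpenCL and CLBlast are present. A sketch using the capability flags from the detection example (the selection logic is illustrative, not library API):

```csharp
using System;
using AiDotNet.Tensors.Engines;

var caps = PlatformDetector.Capabilities;

// Prefer GPU BLAS when an OpenCL device and CLBlast are both available;
// otherwise fall back to CPU SIMD, which the core package always provides.
string backend = caps.HasOpenCLSupport && caps.HasClBlast
    ? "CLBlast (OpenCL GPU)"
    : "CPU SIMD";
Console.WriteLine($"Selected backend: {backend}");
```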
Provides GPU acceleration via NVIDIA CUDA (NVIDIA GPUs only):
```shell
dotnet add package AiDotNet.Native.CUDA
```

Performance: 30,000+ GFLOPS for matrix operations on modern NVIDIA GPUs.

Requirements: an NVIDIA GPU with a compatible CUDA driver installed.
Usage with helpful error messages:
```csharp
using AiDotNet.Tensors.Engines.DirectGpu.CUDA;

// Recommended: throws a beginner-friendly exception if CUDA is unavailable
using var cuda = CudaBackend.CreateOrThrow();

// Or check availability first
if (CudaBackend.IsCudaAvailable)
{
    using var backend = new CudaBackend();
    // Use CUDA acceleration
}
```

If CUDA is not available, you'll get detailed troubleshooting steps explaining exactly what's missing and how to fix it.
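A common deployment pattern is to try CUDA and fall back gracefully, building on the `IsCudaAvailable` check shown above (the fallback branch is illustrative; the core CPU path needs no backend object):

```csharp
using System;
using AiDotNet.Tensors.Engines.DirectGpu.CUDA;

// Prefer CUDA when present; otherwise stay on the CPU SIMD path.
string mode;
if (CudaBackend.IsCudaAvailable)
{
    using var cuda = new CudaBackend();
    mode = "CUDA";
    // ... dispatch work to the GPU backend here ...
}
else
{
    mode = "CPU SIMD";
}
Console.WriteLine($"Running on: {mode}");
```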
Apache 2.0 - See LICENSE for details.
Contributions are welcome! Please feel free to submit a Pull Request.