DotCompute.Backends.CPU
Production-ready CPU compute backend with SIMD vectorization for .NET 9+.
Status: ✅ Production Ready
The CPU backend provides high-performance compute acceleration through:
- SIMD Vectorization: AVX512/AVX2/NEON instruction sets
- Multi-threading: Work-stealing thread pool
- Memory Optimization: NUMA-aware allocation
- Native AOT: Full compatibility with Native AOT compilation
Key Features
SIMD Acceleration
- AVX512: Best performance on Intel Ice Lake+ and AMD Zen 4+
- AVX2: Wide compatibility on modern Intel/AMD processors
- NEON: ARM64 support for Apple Silicon and ARM servers
- Automatic Detection: Runtime detection of optimal instruction set
Performance
- 8-23x Speedup: Achieved on vectorizable operations
- Memory Bandwidth: 95%+ of theoretical peak utilization
- Thread Scaling: Near-linear scaling to CPU core count
- Low Overhead: Sub-microsecond kernel launch latency
Installation
dotnet add package DotCompute.Backends.CPU --version 0.5.3
Usage
Basic Setup
using DotCompute.Backends.CPU;
using Microsoft.Extensions.Logging;
var logger = LoggerFactory.Create(builder => builder.AddConsole())
.CreateLogger<CpuAccelerator>();
var accelerator = new CpuAccelerator(logger);
await accelerator.InitializeAsync();
Service Registration
services.AddSingleton<IAccelerator, CpuAccelerator>();
// OR
services.AddCpuBackend();
Kernel Execution
var kernelDef = new KernelDefinition
{
Name = "VectorAdd",
Source = "/* OpenCL C kernel source */",
EntryPoint = "vector_add"
};
var compiledKernel = await accelerator.CompileKernelAsync(kernelDef);
await compiledKernel.ExecuteAsync(parameters);
Architecture
SIMD Dispatcher
Automatically selects the best available SIMD instruction set:
- Detection: Runtime CPU capability detection
- Dispatch: Function pointer selection to optimized kernels
- Fallback: Scalar implementation for unsupported hardware
Thread Pool
- Work-Stealing: Efficient load balancing across cores
- Thread-Local Storage: Minimizes synchronization overhead
- Adaptive Sizing: Scales with workload and system load
Memory Management
- NUMA Awareness: Memory allocation respects CPU topology
- Cache Optimization: Data layout for optimal cache usage
- Memory Pooling: Reuse allocations to reduce overhead
Performance Benchmarks
Tested on Intel Core Ultra 7 165H with 16 threads:
| Operation | Elements | CPU Time | SIMD Time | Speedup |
|---|
| Vector Add | 1M floats | 4.33ms | 187μs | 23x |
| Matrix Mult | 512x512 | 2,340ms | 89ms | 26x |
| Dot Product | 1M floats | 2.1ms | 156μs | 13.4x |
System Requirements
Minimum
- .NET 9.0 or later
- x64 or ARM64 processor
- 2GB RAM
Recommended
- Modern CPU with AVX2+ (Intel Haswell+ / AMD Excavator+)
- 8+ CPU cores for optimal threading performance
- 16GB+ RAM for large datasets
Supported Platforms
- Windows: x64, ARM64
- Linux: x64, ARM64
- macOS: x64 (Intel), ARM64 (Apple Silicon)
Build Configuration
The CPU backend automatically configures itself based on the target platform:
<PropertyGroup Condition="'$(TargetArchitecture)' == 'x64'">
<DefineConstants>$(DefineConstants);ENABLE_AVX2;ENABLE_AVX512</DefineConstants>
</PropertyGroup>
<PropertyGroup Condition="'$(TargetArchitecture)' == 'arm64'">
<DefineConstants>$(DefineConstants);ENABLE_NEON</DefineConstants>
</PropertyGroup>
Troubleshooting
Performance Issues
- Check SIMD Support: Verify CPU supports AVX2/AVX512
- Memory Alignment: Ensure data is properly aligned for SIMD
- Thread Count: Match thread count to physical cores
- Memory Bandwidth: Monitor memory utilization during execution
Compatibility Issues
- Native AOT: Ensure all types are AOT-compatible
- Platform Support: Verify target platform support
- Dependencies: Check for missing runtime dependencies
Documentation & Resources
Comprehensive documentation is available for DotCompute:
Architecture Documentation
Developer Guides
Examples
API Documentation
Support
Contributing
The CPU backend welcomes contributions in:
- New SIMD instruction set support (e.g., AVX-512 variants)
- Platform-specific optimizations
- Kernel compilation improvements
- Performance benchmarks and analysis
See CONTRIBUTING.md for guidelines.