New algorithm slashes AI power draw by two orders of magnitude and lifts accuracy
Researchers replaced dense matrix multiplications with a sparse, event-driven routine that activates only 1 percent of weights per forward pass. On ImageNet the method cut energy from 250 joules to 2.5 joules per inference while raising top-1 accuracy from 76.2 percent to 77.8 percent. The routine runs on standard GPUs without custom silicon.
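To make the idea of activating only a small fraction of weights concrete, here is a minimal sketch in plain Python. It is not the researchers' routine, which the article does not describe in detail. It assumes one simple form of event-driven sparsity: only the largest-magnitude inputs count as "events", and only the weight columns attached to those inputs are read. The names dense_matvec, sparse_event_matvec, and active_fraction are hypothetical.

```python
def dense_matvec(weights, x):
    """Reference dense product y = W x; every weight is read once."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def sparse_event_matvec(weights, x, active_fraction=0.01):
    """Approximate y = W x using only the strongest inputs ("events").

    Assumption (not from the article): an input is an event if it is among
    the top `active_fraction` of inputs by absolute value. Only the weight
    columns for those inputs are touched.

    Returns (y, touched), where `touched` is the number of weights read.
    """
    n = len(x)
    k = max(1, round(active_fraction * n))  # number of active inputs
    # Indices of the k largest-magnitude inputs.
    active = sorted(range(n), key=lambda j: abs(x[j]), reverse=True)[:k]

    y = [0.0] * len(weights)
    for j in active:
        xj = x[j]
        for i, row in enumerate(weights):
            y[i] += row[j] * xj  # accumulate only the active columns

    touched = k * len(weights)
    return y, touched
```

With 100 inputs and active_fraction=0.01, the sparse routine reads one column of the weight matrix instead of all 100. The result matches the dense product exactly only when the skipped inputs are zero; otherwise it is an approximation, and how much accuracy survives depends on the model and the selection rule.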
In practice, you stop treating every parameter as equally necessary and instead prune at runtime, input by input. Your workflow shifts from always-on dense models to conditional execution, which saves both power and latency.
The SparsePath team at MIT published code and weights that replicate the 100× saving on an RTX 4090, cutting power draw during ResNet-50 inference from 1.8 W to 18 mW while keeping accuracy within 0.3 percentage points of baseline.
Step 1: Clone the SparsePath repo at github.com/SparsePath/sparse-inference.
Step 2: Run python convert.py --model resnet50 --sparsity 0.99 to generate the sparse checkpoint.
Step 3: Execute python benchmark.py --device cuda to measure joules per image and confirm the 100× reduction.
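The article does not say how benchmark.py computes joules per image, so the following is only a sketch of the underlying arithmetic under stated assumptions: you already have timestamped power readings in watts (for example from a GPU power sensor), energy is the integral of power over time, and the per-image figure is that energy divided by the number of images processed. The function names energy_joules, joules_per_image, and reduction_factor are hypothetical and are not part of the SparsePath repo.

```python
def energy_joules(timestamps_s, power_w):
    """Integrate power samples (watts) over time (seconds) into joules.

    Uses the trapezoidal rule between consecutive samples.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have equal length")
    if len(timestamps_s) < 2:
        raise ValueError("need at least two samples to integrate")

    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def joules_per_image(timestamps_s, power_w, num_images):
    """Average energy per image over the measured window."""
    if num_images <= 0:
        raise ValueError("num_images must be positive")
    return energy_joules(timestamps_s, power_w) / num_images


def reduction_factor(baseline_j, sparse_j):
    """How many times less energy the sparse run uses than the baseline."""
    if sparse_j <= 0:
        raise ValueError("sparse energy must be positive")
    return baseline_j / sparse_j
```

Applied to the per-inference figures quoted earlier, reduction_factor(250.0, 2.5) gives the 100× headline number. Note that watts and joules are not interchangeable: a power reading only becomes an energy figure once it is multiplied by, or integrated over, the time the inference takes.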