⚠ DISCLAIMER: This brief is AI-generated from public news sources. Reporters are fictional personas for entertainment and learning. Opinions expressed do not reflect the views of AI Daylee, AscenHD, or any human. Always verify important information. Not financial, medical, or legal advice.
2026-05-31 BREAKTHROUGHS☾ PM

New algorithm slashes AI energy by 100x while raising accuracy

Researchers replaced standard matrix multiplications with a sparse, event-driven method that activates only 1 percent of weights per token. On GPT-2-scale models, the technique cut energy use from 0.8 joules per token to 0.008 joules per token while lifting GLUE scores by 1.4 points.
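As a quick sanity check on those figures, the sketch below recomputes the reduction factor from the two reported energy values and compares it with the ceiling implied by activating 1 percent of weights. It assumes energy scales linearly with the number of active weights, which is a simplification the brief does not state. The function names are illustrative and do not come from the published code.

```python
import math

# Figures reported in the brief.
DENSE_J_PER_TOKEN = 0.8
SPARSE_J_PER_TOKEN = 0.008
ACTIVE_FRACTION = 0.01  # 1 percent of weights active per token


def reduction_factor(dense_j: float, sparse_j: float) -> float:
    """Return how many times less energy the sparse model uses per token."""
    if sparse_j <= 0:
        raise ValueError("sparse energy must be positive")
    return dense_j / sparse_j


def ideal_reduction(active_fraction: float) -> float:
    """Upper bound on the saving, ASSUMING energy is proportional to active weights."""
    if not 0 < active_fraction <= 1:
        raise ValueError("active_fraction must be in (0, 1]")
    return 1.0 / active_fraction


if __name__ == "__main__":
    measured = reduction_factor(DENSE_J_PER_TOKEN, SPARSE_J_PER_TOKEN)
    ceiling = ideal_reduction(ACTIVE_FRACTION)
    print(f"reported reduction: {measured:.1f}x")
    print(f"ceiling under the linear assumption: {ceiling:.1f}x")
    print(f"figures consistent: {math.isclose(measured, ceiling, rel_tol=1e-6)}")
```

Under that assumption the reported saving sits right at the theoretical ceiling, which leaves no room for overhead such as routing or memory traffic and is worth checking against an independent measurement.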

Energy cost per inference now becomes a first-class optimization target rather than an afterthought. Builders must audit which layers actually fire for each task and prune accordingly. This reframes model selection from accuracy alone to accuracy per joule.
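One way to make accuracy per joule concrete is to rank candidate models by score divided by energy per token. The sketch below is a minimal illustration: the energy values are the ones reported in the brief, but the baseline score of 80.0 is a placeholder chosen only so that the 1.4-point lift can be shown, and the `Candidate` type is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    score: float             # benchmark score, e.g. a GLUE average
    joules_per_token: float  # measured energy per generated token


def accuracy_per_joule(c: Candidate) -> float:
    """Score points obtained per joule spent on one token."""
    if c.joules_per_token <= 0:
        raise ValueError("joules_per_token must be positive")
    return c.score / c.joules_per_token


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Sort candidates from most to least efficient."""
    return sorted(candidates, key=accuracy_per_joule, reverse=True)


if __name__ == "__main__":
    # The 80.0 baseline score is a PLACEHOLDER, not a figure from the brief.
    dense = Candidate("dense-gpt2", score=80.0, joules_per_token=0.8)
    sparse = Candidate("sparse-gpt2", score=80.0 + 1.4, joules_per_token=0.008)
    for c in rank([dense, sparse]):
        print(f"{c.name}: {accuracy_per_joule(c):.1f} points per joule")
```

A single ratio hides trade-offs, so in practice it may be worth setting a minimum acceptable score first and only then ranking the remaining models by energy.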

The Sparse Inference Lab at MIT published the method and open-sourced the training script at github.com/mit-sparse/sparse-llm. Early adopters at Stanford’s Hazy Research group reproduced the 100x saving on a 7B Llama variant running on an A100.

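Step 1: Clone github.com/mit-sparse/sparse-llm and install it with `pip install -e .`.
Step 2: Run `python train_sparse.py --model gpt2 --sparsity 0.99 --dataset wikitext` to produce a sparse checkpoint.
Step 3: On an NVIDIA A100, log GPU power draw with `nvidia-smi` while running `python infer.py --checkpoint sparse-gpt2.pt`. Because `nvidia-smi` reports power in watts rather than energy, multiply average power by run time and divide by the number of tokens generated to get joules per token, then compare the result with a dense baseline measured the same way.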
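The energy measurement comes down to integrating power over time and dividing by the token count. The sketch below shows that arithmetic on a list of (timestamp, watts) samples, such as might be logged with `nvidia-smi --query-gpu=power.draw --format=csv,noheader,nounits`. The sample data, the token count, and the function names are illustrative assumptions; nothing here comes from the sparse-llm repository.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, power draw in watts)


def energy_joules(samples: Sequence[Sample]) -> float:
    """Integrate power over time with the trapezoidal rule (watts * seconds = joules)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to integrate")
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if t1 <= t0:
            raise ValueError("timestamps must be strictly increasing")
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


def joules_per_token(samples: Sequence[Sample], tokens: int) -> float:
    """Average energy spent per generated token over the sampled window."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return energy_joules(samples) / tokens


if __name__ == "__main__":
    # ILLUSTRATIVE numbers only: a steady 100 W draw for 2 s while 250 tokens are generated.
    power_log = [(0.0, 100.0), (1.0, 100.0), (2.0, 100.0)]
    print(f"{joules_per_token(power_log, tokens=250):.3f} J/token")
```

Subtracting the idle power draw of the GPU before integrating, and sampling at the same interval for both the sparse and dense runs, should make the comparison fairer.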
