Meta Hands Over a 405-Billion-Parameter Model for Free
Meta released Llama 3.1 405B under the Llama 3.1 community license, which allows download, fine-tuning, and commercial use without API fees. The model is reported to match or exceed GPT-4 Turbo on the MMLU, HumanEval, and GSM8K benchmarks, and it can run on a single node of eight H100 GPUs when quantized to FP8. Developers gain full access to the weights and can host the model locally or on any cloud provider.
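The eight-GPU figure can be sanity-checked with simple weight-memory arithmetic. The sketch below is a rough estimate only: it assumes 80 GB H100 cards and counts weights alone, ignoring the KV cache and activations. The constant and function names are illustrative.

```python
# Back-of-the-envelope weight memory for a dense model.
# Counts weights only; KV cache and activations need additional memory.
PARAMS = 405e9          # parameter count, from the model name
H100_MEMORY_GB = 80     # assumption: 80 GB H100 variant
GPUS_PER_NODE = 8       # node size mentioned in the article

# Storage size per parameter for each numeric format.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Return the memory needed for the weights, in GB (10^9 bytes)."""
    return params * bytes_per_param / 1e9


node_memory_gb = H100_MEMORY_GB * GPUS_PER_NODE

for precision, nbytes in BYTES_PER_PARAM.items():
    needed = weight_memory_gb(PARAMS, nbytes)
    fits = needed < node_memory_gb
    print(f"{precision}: {needed:.0f} GB of weights, fits on one node: {fits}")
```

Under these assumptions, 16-bit weights come to about 810 GB, more than the 640 GB available on one eight-GPU node, while 8-bit weights come to about 405 GB and leave headroom for inference state.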
Teams can move from renting inference by the token to hosting the weights themselves, which removes per-token API fees and the need to send data to a third-party provider. Planning shifts from usage-based budgets to infrastructure decisions: hardware or cloud capacity replaces the metered bill.
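One way to frame that planning shift is a break-even estimate: the token volume at which cumulative API spend would equal a fixed infrastructure outlay. The function below is a minimal sketch, and the dollar figures in the example are hypothetical placeholders, not real prices. It also ignores operating costs such as power and staffing.

```python
def break_even_tokens(infra_cost_usd: float, api_price_per_million_usd: float) -> float:
    """Return the token count at which API spend equals a fixed infrastructure cost.

    Both inputs are supplied by the caller; nothing here reflects real pricing.
    """
    if api_price_per_million_usd <= 0:
        raise ValueError("API price per million tokens must be positive")
    return infra_cost_usd / api_price_per_million_usd * 1_000_000


# Hypothetical example: a 300,000 USD hardware outlay versus an API that
# charges 10 USD per million tokens.
tokens = break_even_tokens(300_000, 10)
print(f"Break-even at {tokens:,.0f} tokens")
```

A team whose expected volume sits well below the break-even point may still prefer a metered API; above it, self-hosting starts to pay for itself under these simplified assumptions.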
Hugging Face hosts the weights at https://huggingface.co/meta-llama/Meta-Llama-3.1-405B and reports over 500,000 downloads in the first week. Community fine-tunes are already appearing on the platform.
Step 1: Visit https://ai.meta.com/blog/meta-llama-3-1/ and accept the Llama 3.1 community license.
Step 2: Use the provided Hugging Face command to download the 405B weights or the smaller 70B and 8B variants.
Step 3: Load the model in vLLM or Hugging Face Transformers and run inference locally to confirm output quality before deploying to production.
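The download and local-inference steps can be sketched in Python. This is a minimal illustration, not Meta's official procedure. It assumes the `huggingface_hub` and `vllm` packages are installed, that the license has been accepted, and that a Hugging Face token is already configured (for example through `huggingface-cli login`). The 405B repository id comes from the link above; the 70B and 8B ids are assumed to follow the same naming pattern. The helper names `repo_id`, `download_weights`, `smoke_test`, and `main` are hypothetical.

```python
"""Sketch: download Llama 3.1 weights and run one smoke-test prompt with vLLM."""

KNOWN_SIZES = ("405B", "70B", "8B")


def repo_id(size: str) -> str:
    """Build the Hugging Face repo id for a model size.

    The 405B id matches the article's link; the other two are assumed
    to use the same pattern.
    """
    if size not in KNOWN_SIZES:
        raise ValueError(f"unknown size {size!r}, expected one of {KNOWN_SIZES}")
    return f"meta-llama/Meta-Llama-3.1-{size}"


def download_weights(size: str, local_dir: str) -> str:
    """Download a full model snapshot and return the local path."""
    # Imported lazily so the module loads even without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id(size), local_dir=local_dir)


def smoke_test(model_path: str, prompt: str, tensor_parallel_size: int = 1) -> str:
    """Load the model with vLLM and return the completion for one prompt."""
    from vllm import LLM, SamplingParams

    # tensor_parallel_size shards the model across that many GPUs.
    llm = LLM(model=model_path, tensor_parallel_size=tensor_parallel_size)
    # Greedy decoding keeps the check repeatable between runs.
    params = SamplingParams(temperature=0.0, max_tokens=64)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text


def main() -> None:
    # The 8B variant is used here so the check fits on a single GPU.
    path = download_weights("8B", "./llama-3.1-8b")
    print(smoke_test(path, "The capital of France is"))
```

Calling `main()` downloads the 8B weights and prints one completion. For the 405B model, the same functions would be called with `"405B"` and a `tensor_parallel_size` that matches the number of GPUs on the node.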