Increasing model download speeds from Hugging Face

Alpaca · September 3, 2025, 8:05am

Hi, what optimizations would you recommend for increasing model download speeds from Hugging Face?

TeamVerda · September 3, 2025, 10:16am

Hi,

There are multiple optimizations, depending on the model. Some quick and easy changes:

Set HF_HOME to a path on your SFS (Shared Filesystem) or NVMe drive (read about cache management: https://huggingface.co/docs/datasets/en/cache)
Set HF_HUB_ENABLE_HF_TRANSFER=1

Use hf_transfer; there is more on the Hugging Face docs site: https://huggingface.co/docs/hub/en/models-downloading
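As a rough illustration, the settings above could be applied from Python as follows. This is only a sketch: the helper names configure_hf_env and download_model are made up for this example, it assumes the hf_transfer and huggingface_hub packages are installed (pip install hf_transfer huggingface_hub), and whether HF_HUB_ENABLE_HF_TRANSFER is honored may depend on your huggingface_hub version.

```python
import os
from pathlib import Path


def configure_hf_env(cache_dir: str) -> dict:
    """Point the Hugging Face cache at fast storage and opt in to hf_transfer.

    `cache_dir` is whatever SFS or NVMe path you have; it is created if missing.
    """
    cache_path = Path(cache_dir).expanduser()
    cache_path.mkdir(parents=True, exist_ok=True)
    os.environ["HF_HOME"] = str(cache_path)
    os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
    return {k: os.environ[k] for k in ("HF_HOME", "HF_HUB_ENABLE_HF_TRANSFER")}


def download_model(repo_id: str) -> str:
    """Download a full model repo and return the local snapshot path."""
    # Imported here on purpose: huggingface_hub reads these environment
    # variables at import time, so they must be set before the import.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)
```

For example, calling configure_hf_env("/mnt/nvme/hf-cache") and then download_model("your-org/your-model") would download into that cache; both the path and the repo id here are placeholders. Exporting the same two variables in the shell before starting Python has the same effect.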

hf_transfer usually helps if you have hundreds of 4-9 GB files. Sometimes on HF there are huge 45 GB files (for example Qwen Image, Wan 2.2), and these make downloads quite slow.

Most good LLMs are split into shards of about 9 GB, so they can make excellent use of hf_transfer.
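To see which case a given repo falls into before downloading, one option is to look at its file sizes. The sketch below is an assumption about how you might do that, not an official recipe: summarize_sizes and repo_file_sizes are hypothetical helper names, and the listing relies on HfApi.model_info with files_metadata=True from huggingface_hub.

```python
def summarize_sizes(sizes_bytes):
    """Summarize file sizes: file count, total size and largest file, in GB."""
    gb = 1000 ** 3
    sizes_gb = sorted((s / gb for s in sizes_bytes if s), reverse=True)
    return {
        "files": len(sizes_gb),
        "total_gb": round(sum(sizes_gb), 1),
        "largest_gb": round(sizes_gb[0], 1) if sizes_gb else 0.0,
    }


def repo_file_sizes(repo_id):
    """Return the size in bytes of every file in a model repo (needs network)."""
    # Lazy import so the pure helper above works without huggingface_hub.
    from huggingface_hub import HfApi

    info = HfApi().model_info(repo_id, files_metadata=True)
    return [f.size for f in info.siblings if f.size is not None]
```

Something like summarize_sizes(repo_file_sizes("your-org/your-model")) (placeholder repo id) then shows whether the repo is many mid-sized shards or a few very large files.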

For reference
