# PyTorch Model Zoo Stats

*Architecture Popularity and Download Trends*
Which PyTorch models are developers actually using? We queried the HuggingFace API for the top 50 most-downloaded PyTorch models and found that sentence embedding models dominate, with all-MiniLM-L6-v2 leading at 194M+ downloads. Text generation models occupy 11 of the top 50 spots, led by Qwen's eight entries and reflecting the ongoing LLM boom.
This data helps developers choose battle-tested models and understand which architectures the community trusts for production workloads.
## Top 50 PyTorch Models by Downloads
| # | Model ID | Downloads | Likes | Task | License |
|---|---|---|---|---|---|
| 1 | sentence-transformers/all-MiniLM-L6-v2 | 194,504,137 | 4,666 | sentence-similarity | Apache-2.0 |
| 2 | google-bert/bert-base-uncased | 66,872,132 | 2,617 | fill-mask | Apache-2.0 |
| 3 | google/electra-base-discriminator | 48,518,518 | 90 | discriminative | Apache-2.0 |
| 4 | Falconsai/nsfw_image_detection | 37,968,104 | 1,041 | image-classification | Apache-2.0 |
| 5 | sentence-transformers/all-mpnet-base-v2 | 29,616,352 | 1,275 | sentence-similarity | Apache-2.0 |
| 6 | openai/clip-vit-large-patch14 | 29,402,065 | 1,989 | zero-shot-image | Custom |
| 7 | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | 28,989,731 | 1,193 | sentence-similarity | Apache-2.0 |
| 8 | openai/clip-vit-base-patch32 | 20,738,321 | 904 | zero-shot-image | Custom |
| 9 | FacebookAI/roberta-large | 20,428,795 | 273 | fill-mask | MIT |
| 10 | FacebookAI/xlm-roberta-base | 19,185,606 | 812 | fill-mask | MIT |
| 11 | laion/clap-htsat-fused | 19,125,301 | 75 | audio-classification | Apache-2.0 |
| 12 | cross-encoder/ms-marco-MiniLM-L6-v2 | 17,230,680 | 213 | text-ranking | Apache-2.0 |
| 13 | BAAI/bge-small-en-v1.5 | 16,375,859 | 434 | feature-extraction | MIT |
| 14 | openai/clip-vit-large-patch14-336 | 15,708,322 | 297 | zero-shot-image | Custom |
| 15 | Bingsu/adetailer | 15,447,493 | 682 | detection | Apache-2.0 |
| 16 | FacebookAI/roberta-base | 15,384,150 | 583 | fill-mask | MIT |
| 17 | colbert-ir/colbertv2.0 | 15,383,631 | 320 | retrieval | MIT |
| 18 | Qwen/Qwen3-0.6B | 15,101,523 | 1,183 | text-generation | Apache-2.0 |
| 19 | BAAI/bge-m3 | 15,082,193 | 2,896 | sentence-similarity | MIT |
| 20 | timm/mobilenetv3_small_100.lamb_in1k | 14,678,825 | 60 | image-classification | Apache-2.0 |
| 21 | amazon/chronos-2 | 13,963,080 | 229 | time-series | Apache-2.0 |
| 22 | openai-community/gpt2 | 13,521,727 | 3,188 | text-generation | MIT |
| 23 | Qwen/Qwen2.5-7B-Instruct | 12,111,952 | 1,198 | text-generation | Apache-2.0 |
| 24 | pyannote/wespeaker-voxceleb-resnet34-LM | 11,945,522 | 116 | speaker-embed | CC-BY-4.0 |
| 25 | pyannote/segmentation-3.0 | 11,077,399 | 894 | voice-activity | MIT |
| 26 | pyannote/speaker-diarization-3.1 | 10,906,588 | 1,732 | speech-recognition | MIT |
| 27 | nomic-ai/nomic-embed-text-v1.5 | 10,616,366 | 794 | sentence-similarity | Apache-2.0 |
| 28 | omni-research/Tarsier2-Recap-7b | 10,424,091 | 33 | multimodal | Apache-2.0 |
| 29 | autogluon/chronos-bolt-small | 10,236,709 | 30 | time-series | Apache-2.0 |
| 30 | Qwen/Qwen2.5-1.5B-Instruct | 9,922,750 | 661 | text-generation | Apache-2.0 |
| 31 | hexgrad/Kokoro-82M | 9,728,735 | 5,942 | text-to-speech | Apache-2.0 |
| 32 | meta-llama/Llama-3.1-8B-Instruct | 9,196,892 | 5,677 | text-generation | Llama 3.1 |
| 33 | autogluon/chronos-2 | 9,162,047 | 12 | time-series | Apache-2.0 |
| 34 | Qwen/Qwen2.5-3B-Instruct | 8,679,038 | 437 | text-generation | Other |
| 35 | Qwen/Qwen3-8B | 8,366,071 | 1,036 | text-generation | Apache-2.0 |
| 36 | Qwen/Qwen3-4B | 8,255,392 | 593 | text-generation | Apache-2.0 |
| 37 | Qwen/Qwen3-1.7B | 8,194,168 | 445 | text-generation | Apache-2.0 |
| 38 | Marqo/nsfw-image-detection-384 | 7,885,546 | 45 | image-classification | Apache-2.0 |
| 39 | distilbert/distilbert-base-uncased | 7,745,707 | 860 | fill-mask | Apache-2.0 |
| 40 | BAAI/bge-large-en-v1.5 | 7,570,795 | 646 | feature-extraction | MIT |
| 41 | Kijai/WanVideo_comfy | 7,415,388 | 2,246 | video-generation | Custom |
| 42 | Qwen/Qwen3-4B-Instruct-2507 | 7,359,404 | 802 | text-generation | Apache-2.0 |
| 43 | facebook/contriever | 7,296,114 | 77 | retrieval | Custom |
| 44 | answerdotai/ModernBERT-base | 7,201,565 | 1,022 | fill-mask | Apache-2.0 |
| 45 | Comfy-Org/Wan_2.2_ComfyUI_Repackaged | 6,919,345 | 671 | image-generation | Custom |
| 46 | lpiccinelli/unidepth-v2-vitl14 | 6,886,202 | 12 | depth-estimation | Custom |
| 47 | dima806/fairface_age_image_detection | 6,858,091 | 69 | image-classification | Apache-2.0 |
| 48 | amazon/chronos-bolt-base | 6,774,805 | 85 | time-series | Apache-2.0 |
| 49 | FacebookAI/xlm-roberta-large | 6,667,591 | 503 | fill-mask | MIT |
| 50 | facebook/opt-125m | 6,544,170 | 240 | text-generation | Other |
## Methodology
Model data in this index is sourced directly from the HuggingFace API:
- **API endpoint** — `huggingface.co/api/models?sort=downloads&direction=-1&limit=50&library=pytorch`, queried on April 11, 2026
- **Downloads** — all-time download count as reported by HuggingFace's CDN tracking
- **Likes** — community likes/stars, comparable to GitHub stars
- **Task categories** — derived from each model's `pipeline_tag` metadata field
- **Licenses** — extracted from model tags; some models use custom or restrictive licenses
Download counts include programmatic API access and may not reflect unique users. Models used as dependencies by other packages will have inflated download numbers.
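The query described above can be reproduced with Python's standard library. A minimal sketch, using the endpoint and parameters from the methodology (the response field names `modelId`, `downloads`, and `pipeline_tag` are what the Hub API returned at the time of writing):

```python
import json
import urllib.parse
import urllib.request

API = "https://huggingface.co/api/models"

def build_url(library: str = "pytorch", limit: int = 50) -> str:
    # Same query parameters as used for this index.
    params = {"sort": "downloads", "direction": "-1",
              "limit": str(limit), "library": library}
    return API + "?" + urllib.parse.urlencode(params)

def top_models(limit: int = 50) -> list:
    # Each entry carries modelId, downloads, likes, and pipeline_tag.
    with urllib.request.urlopen(build_url(limit=limit)) as resp:
        return json.load(resp)
```

Calling `top_models(5)` should return the first five rows of the table above (network access required); note that live results will drift as download counts change.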
## Frequently Asked Questions
### What is the most downloaded PyTorch model on HuggingFace?
As of April 2026, the most downloaded PyTorch model is sentence-transformers/all-MiniLM-L6-v2 with over 194 million downloads. This sentence embedding model is popular for semantic search, clustering, and retrieval-augmented generation (RAG) pipelines. Google's BERT-base-uncased ranks second with 66M+ downloads.
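The retrieval pattern behind that popularity is simple: encode documents and queries into vectors, then rank by cosine similarity. A NumPy sketch of the ranking step (in practice the embeddings would come from something like `SentenceTransformer("all-MiniLM-L6-v2").encode(...)`; the function name here is illustrative):

```python
import numpy as np

def cosine_top_k(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int = 3):
    """Return indices and scores of the k documents most similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                      # cosine similarity per document
    order = np.argsort(-scores)[:k]     # best match first
    return order, scores[order]
```

This is the core of semantic search and the retrieval half of a RAG pipeline; production systems swap the brute-force matrix product for an approximate index once the corpus grows.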
### Which PyTorch model architecture is most popular in 2026?

Sentence embedding models dominate the top downloads, with sentence-similarity models holding four of the top 20 spots. Text generation models (Qwen3, Llama 3.1, GPT-2) are the second most popular category. CLIP models for zero-shot image classification hold strong in the top 10. The trend shows a shift toward specialized embedding and multimodal models.
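One way to quantify that category breakdown is to total downloads per task over rows shaped like the table above. A sketch (the sample rows below are a small excerpt, not the full dataset):

```python
from collections import Counter

def downloads_by_task(rows: list) -> Counter:
    """Sum downloads per task category; most_common() gives the ranking."""
    totals = Counter()
    for row in rows:
        totals[row["task"]] += row["downloads"]
    return totals

# Illustrative rows taken from the table above.
sample = [
    {"task": "sentence-similarity", "downloads": 194_504_137},
    {"task": "fill-mask", "downloads": 66_872_132},
    {"task": "sentence-similarity", "downloads": 29_616_352},
]
```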
### How do I choose between PyTorch models for my project?

Consider:

1. **Task match** — use the `pipeline_tag` to find models for your task
2. **Model size vs. hardware** — MiniLM (22M params) runs on CPU while Llama-3.1-8B needs a GPU
3. **License** — Apache 2.0 and MIT are commercially permissive
4. **Community validation** — higher likes and downloads indicate battle-tested models
5. **Recency** — newer architectures like ModernBERT often outperform older ones
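Those criteria can be applied mechanically. A sketch of a shortlist filter over rows shaped like the table above (the field names and the permissive-license set are assumptions for illustration):

```python
PERMISSIVE = {"apache-2.0", "mit"}  # commercially permissive licenses

def shortlist(models: list, task: str, commercial: bool = True) -> list:
    """Keep models matching the task (and a permissive license when commercial
    use is required), ranked by downloads as a proxy for community validation."""
    keep = []
    for m in models:
        if m["task"] != task:
            continue
        if commercial and m["license"].lower() not in PERMISSIVE:
            continue
        keep.append(m)
    return sorted(keep, key=lambda m: m["downloads"], reverse=True)
```

Size and recency checks would need extra metadata (parameter count, release date) not present in the table, so they are left out of this sketch.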
### What license do most popular PyTorch models use?

Apache 2.0 is the most common license among the top 50, used by 28 of the 50 models (56%) including Qwen, sentence-transformers, and BERT. MIT is second with 11 models, including RoBERTa, BGE, and pyannote. Some models like CLIP and Llama use custom licenses with varying commercial restrictions.
### Why do some models have high downloads but low likes?
High downloads with low likes typically indicates the model is used programmatically as a dependency rather than being directly chosen. For example, google/electra-base-discriminator has 48M downloads but only 90 likes because it is pulled automatically by downstream packages. Models with high likes relative to downloads (like Kokoro-82M with 5,942 likes) tend to have strong community engagement.
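That engagement gap can be made concrete as likes per million downloads. A quick sketch using the two rows cited above:

```python
def likes_per_million(downloads: int, likes: int) -> float:
    """Community engagement normalized by download volume."""
    return likes / (downloads / 1_000_000)

# Two contrasting rows from the table above:
electra = likes_per_million(48_518_518, 90)    # dependency-driven traffic
kokoro = likes_per_million(9_728_735, 5_942)   # community favorite
```

The ratio separates the two cases by more than two orders of magnitude, which is a handy smoke test for whether a model's download count reflects deliberate adoption.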
Free to use under CC BY 4.0 license. Cite this page when sharing.