# PyTorch Model Zoo Stats

*Architecture Popularity and Download Trends*
Which PyTorch models are developers actually using? We queried the HuggingFace API for the top 50 most-downloaded PyTorch models and found that sentence embedding models dominate, with all-MiniLM-L6-v2 leading at 194M+ downloads. Text generation models occupy 11 of the top 50 spots, led by Qwen's eight entries and reflecting the ongoing LLM boom.
This data helps developers choose battle-tested models and understand which architectures the community trusts for production workloads.
## Top 50 PyTorch Models by Downloads
| # | Model ID | Downloads | Likes | Task | License |
|---|---|---|---|---|---|
| 1 | sentence-transformers/all-MiniLM-L6-v2 | 194,504,137 | 4,666 | sentence-similarity | Apache-2.0 |
| 2 | google-bert/bert-base-uncased | 66,872,132 | 2,617 | fill-mask | Apache-2.0 |
| 3 | google/electra-base-discriminator | 48,518,518 | 90 | discriminative | Apache-2.0 |
| 4 | Falconsai/nsfw_image_detection | 37,968,104 | 1,041 | image-classification | Apache-2.0 |
| 5 | sentence-transformers/all-mpnet-base-v2 | 29,616,352 | 1,275 | sentence-similarity | Apache-2.0 |
| 6 | openai/clip-vit-large-patch14 | 29,402,065 | 1,989 | zero-shot-image | Custom |
| 7 | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | 28,989,731 | 1,193 | sentence-similarity | Apache-2.0 |
| 8 | openai/clip-vit-base-patch32 | 20,738,321 | 904 | zero-shot-image | Custom |
| 9 | FacebookAI/roberta-large | 20,428,795 | 273 | fill-mask | MIT |
| 10 | FacebookAI/xlm-roberta-base | 19,185,606 | 812 | fill-mask | MIT |
| 11 | laion/clap-htsat-fused | 19,125,301 | 75 | audio-classification | Apache-2.0 |
| 12 | cross-encoder/ms-marco-MiniLM-L6-v2 | 17,230,680 | 213 | text-ranking | Apache-2.0 |
| 13 | BAAI/bge-small-en-v1.5 | 16,375,859 | 434 | feature-extraction | MIT |
| 14 | openai/clip-vit-large-patch14-336 | 15,708,322 | 297 | zero-shot-image | Custom |
| 15 | Bingsu/adetailer | 15,447,493 | 682 | detection | Apache-2.0 |
| 16 | FacebookAI/roberta-base | 15,384,150 | 583 | fill-mask | MIT |
| 17 | colbert-ir/colbertv2.0 | 15,383,631 | 320 | retrieval | MIT |
| 18 | Qwen/Qwen3-0.6B | 15,101,523 | 1,183 | text-generation | Apache-2.0 |
| 19 | BAAI/bge-m3 | 15,082,193 | 2,896 | sentence-similarity | MIT |
| 20 | timm/mobilenetv3_small_100.lamb_in1k | 14,678,825 | 60 | image-classification | Apache-2.0 |
| 21 | amazon/chronos-2 | 13,963,080 | 229 | time-series | Apache-2.0 |
| 22 | openai-community/gpt2 | 13,521,727 | 3,188 | text-generation | MIT |
| 23 | Qwen/Qwen2.5-7B-Instruct | 12,111,952 | 1,198 | text-generation | Apache-2.0 |
| 24 | pyannote/wespeaker-voxceleb-resnet34-LM | 11,945,522 | 116 | speaker-embed | CC-BY-4.0 |
| 25 | pyannote/segmentation-3.0 | 11,077,399 | 894 | voice-activity | MIT |
| 26 | pyannote/speaker-diarization-3.1 | 10,906,588 | 1,732 | speech-recognition | MIT |
| 27 | nomic-ai/nomic-embed-text-v1.5 | 10,616,366 | 794 | sentence-similarity | Apache-2.0 |
| 28 | omni-research/Tarsier2-Recap-7b | 10,424,091 | 33 | multimodal | Apache-2.0 |
| 29 | autogluon/chronos-bolt-small | 10,236,709 | 30 | time-series | Apache-2.0 |
| 30 | Qwen/Qwen2.5-1.5B-Instruct | 9,922,750 | 661 | text-generation | Apache-2.0 |
| 31 | hexgrad/Kokoro-82M | 9,728,735 | 5,942 | text-to-speech | Apache-2.0 |
| 32 | meta-llama/Llama-3.1-8B-Instruct | 9,196,892 | 5,677 | text-generation | Llama 3.1 |
| 33 | autogluon/chronos-2 | 9,162,047 | 12 | time-series | Apache-2.0 |
| 34 | Qwen/Qwen2.5-3B-Instruct | 8,679,038 | 437 | text-generation | Other |
| 35 | Qwen/Qwen3-8B | 8,366,071 | 1,036 | text-generation | Apache-2.0 |
| 36 | Qwen/Qwen3-4B | 8,255,392 | 593 | text-generation | Apache-2.0 |
| 37 | Qwen/Qwen3-1.7B | 8,194,168 | 445 | text-generation | Apache-2.0 |
| 38 | Marqo/nsfw-image-detection-384 | 7,885,546 | 45 | image-classification | Apache-2.0 |
| 39 | distilbert/distilbert-base-uncased | 7,745,707 | 860 | fill-mask | Apache-2.0 |
| 40 | BAAI/bge-large-en-v1.5 | 7,570,795 | 646 | feature-extraction | MIT |
| 41 | Kijai/WanVideo_comfy | 7,415,388 | 2,246 | video-generation | Custom |
| 42 | Qwen/Qwen3-4B-Instruct-2507 | 7,359,404 | 802 | text-generation | Apache-2.0 |
| 43 | facebook/contriever | 7,296,114 | 77 | retrieval | Custom |
| 44 | answerdotai/ModernBERT-base | 7,201,565 | 1,022 | fill-mask | Apache-2.0 |
| 45 | Comfy-Org/Wan_2.2_ComfyUI_Repackaged | 6,919,345 | 671 | image-generation | Custom |
| 46 | lpiccinelli/unidepth-v2-vitl14 | 6,886,202 | 12 | depth-estimation | Custom |
| 47 | dima806/fairface_age_image_detection | 6,858,091 | 69 | image-classification | Apache-2.0 |
| 48 | amazon/chronos-bolt-base | 6,774,805 | 85 | time-series | Apache-2.0 |
| 49 | FacebookAI/xlm-roberta-large | 6,667,591 | 503 | fill-mask | MIT |
| 50 | facebook/opt-125m | 6,544,170 | 240 | text-generation | Other |
## Methodology
Model data in this index is sourced directly from the HuggingFace API:
- **API endpoint** — `huggingface.co/api/models?sort=downloads&direction=-1&limit=50&library=pytorch`, queried on April 11, 2026
- **Downloads** — all-time download count as reported by HuggingFace's CDN tracking
- **Likes** — community likes/stars, comparable to GitHub stars
- **Task categories** — derived from each model's `pipeline_tag` metadata field
- **Licenses** — extracted from model tags; some models use custom or restrictive licenses
Download counts include programmatic API access and may not reflect unique users. Models used as dependencies by other packages will have inflated download numbers.
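The query described above can be reproduced with Python's standard library. A minimal sketch, using the endpoint and parameters from the methodology (the response field names `modelId`, `downloads`, and `pipeline_tag` are what the Hub API returned at the time of writing):

```python
import json
import urllib.parse
import urllib.request

API = "https://huggingface.co/api/models"

def build_url(library: str = "pytorch", limit: int = 50) -> str:
    # Same query parameters as used for this index.
    params = {"sort": "downloads", "direction": "-1",
              "limit": str(limit), "library": library}
    return API + "?" + urllib.parse.urlencode(params)

def top_models(limit: int = 50) -> list:
    # Each entry carries modelId, downloads, likes, and pipeline_tag.
    with urllib.request.urlopen(build_url(limit=limit)) as resp:
        return json.load(resp)
```

Calling `top_models(5)` should return the first five rows of the table above (network access required); note that live results will drift as download counts change.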
## Frequently Asked Questions
### What is the most downloaded PyTorch model on HuggingFace?
As of April 2026, the most downloaded PyTorch model is sentence-transformers/all-MiniLM-L6-v2 with over 194 million downloads. This sentence embedding model is popular for semantic search, clustering, and retrieval-augmented generation (RAG) pipelines. Google's BERT-base-uncased ranks second with 66M+ downloads.
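The retrieval pattern behind that popularity is simple: encode documents and queries into vectors, then rank by cosine similarity. A NumPy sketch of the ranking step (in practice the embeddings would come from something like `SentenceTransformer("all-MiniLM-L6-v2").encode(...)`; the function name here is illustrative):

```python
import numpy as np

def cosine_top_k(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int = 3):
    """Return indices and scores of the k documents most similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                      # cosine similarity per document
    order = np.argsort(-scores)[:k]     # best match first
    return order, scores[order]
```

This is the core of semantic search and the retrieval half of a RAG pipeline; production systems swap the brute-force matrix product for an approximate index once the corpus grows.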
### Which PyTorch model architecture is most popular in 2026?

Sentence embedding models dominate the top downloads, with sentence-similarity models holding four of the top 20 spots. Text generation models (Qwen3, Llama 3.1, GPT-2) are the second most popular category. CLIP models for zero-shot image classification hold strong in the top 10. The trend shows a shift toward specialized embedding and multimodal models.
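One way to quantify that category breakdown is to total downloads per task over rows shaped like the table above. A sketch (the sample rows below are a small excerpt, not the full dataset):

```python
from collections import Counter

def downloads_by_task(rows: list) -> Counter:
    """Sum downloads per task category; most_common() gives the ranking."""
    totals = Counter()
    for row in rows:
        totals[row["task"]] += row["downloads"]
    return totals

# Illustrative rows taken from the table above.
sample = [
    {"task": "sentence-similarity", "downloads": 194_504_137},
    {"task": "fill-mask", "downloads": 66_872_132},
    {"task": "sentence-similarity", "downloads": 29_616_352},
]
```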
### How do I choose between PyTorch models for my project?

Consider:

1. **Task match** — use the `pipeline_tag` to find models for your task
2. **Model size vs. hardware** — MiniLM (22M params) runs on CPU while Llama-3.1-8B needs a GPU
3. **License** — Apache 2.0 and MIT are commercially permissive
4. **Community validation** — higher likes and downloads indicate battle-tested models
5. **Recency** — newer architectures like ModernBERT often outperform older ones
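Those criteria can be applied mechanically. A sketch of a shortlist filter over rows shaped like the table above (the field names and the permissive-license set are assumptions for illustration):

```python
PERMISSIVE = {"apache-2.0", "mit"}  # commercially permissive licenses

def shortlist(models: list, task: str, commercial: bool = True) -> list:
    """Keep models matching the task (and a permissive license when commercial
    use is required), ranked by downloads as a proxy for community validation."""
    keep = []
    for m in models:
        if m["task"] != task:
            continue
        if commercial and m["license"].lower() not in PERMISSIVE:
            continue
        keep.append(m)
    return sorted(keep, key=lambda m: m["downloads"], reverse=True)
```

Size and recency checks would need extra metadata (parameter count, release date) not present in the table, so they are left out of this sketch.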
### What license do most popular PyTorch models use?

Apache 2.0 is the most common license among the top 50, used by 28 of the 50 models (56%) including Qwen, sentence-transformers, and BERT. MIT is second with 11 models, including RoBERTa, BGE, and pyannote. Some models like CLIP and Llama use custom licenses with varying commercial restrictions.
### Why do some models have high downloads but low likes?
High downloads with low likes typically indicates the model is used programmatically as a dependency rather than being directly chosen. For example, google/electra-base-discriminator has 48M downloads but only 90 likes because it is pulled automatically by downstream packages. Models with high likes relative to downloads (like Kokoro-82M with 5,942 likes) tend to have strong community engagement.
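That engagement gap can be made concrete as likes per million downloads. A quick sketch using the two rows cited above:

```python
def likes_per_million(downloads: int, likes: int) -> float:
    """Community engagement normalized by download volume."""
    return likes / (downloads / 1_000_000)

# Two contrasting rows from the table above:
electra = likes_per_million(48_518_518, 90)    # dependency-driven traffic
kokoro = likes_per_million(9_728_735, 5_942)   # community favorite
```

The ratio separates the two cases by more than two orders of magnitude, which is a handy smoke test for whether a model's download count reflects deliberate adoption.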
Free to use under CC BY 4.0 license. Cite this page when sharing.