Loss Functions Guide
Compare PyTorch loss functions: CrossEntropyLoss, MSELoss, BCELoss, and more. Formulas, when to use, code examples, and common pitfalls for each loss function.
Built by Michael Lip
Frequently Asked Questions
Which loss function should I use?
Multi-class classification: CrossEntropyLoss (no Softmax needed — it's built in). Binary classification: BCEWithLogitsLoss. Regression: MSELoss (L2) or L1Loss (L1). Regression with outliers: HuberLoss (quadratic for small errors, linear for large ones) or L1Loss — both penalize outliers less than MSELoss.
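A minimal sketch of each recommendation above, with the input shapes each loss expects (the tensors here are random placeholders, not real data):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Multi-class: CrossEntropyLoss takes raw logits plus integer class indices.
logits = torch.randn(4, 3)                    # 4 samples, 3 classes, NO softmax
class_targets = torch.tensor([0, 2, 1, 0])    # class index per sample
ce = nn.CrossEntropyLoss()(logits, class_targets)

# Binary: BCEWithLogitsLoss takes one raw logit per sample and float 0/1 targets.
bin_logits = torch.randn(4)
bin_targets = torch.tensor([1.0, 0.0, 1.0, 0.0])
bce = nn.BCEWithLogitsLoss()(bin_logits, bin_targets)

# Regression: MSELoss (L2), L1Loss (L1), HuberLoss (robust to outliers).
preds = torch.randn(4)
targets = torch.randn(4)
mse = nn.MSELoss()(preds, targets)
mae = nn.L1Loss()(preds, targets)
huber = nn.HuberLoss()(preds, targets)
```

Each call returns a scalar tensor (the batch mean by default, controlled by the `reduction` argument).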
Why does CrossEntropyLoss include Softmax?
PyTorch's CrossEntropyLoss combines LogSoftmax + NLLLoss for numerical stability. Do NOT apply Softmax before CrossEntropyLoss — the loss would softmax your probabilities a second time, giving distorted loss values and weakened gradients. Your model's final layer should output raw logits.
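This sketch verifies the equivalence and shows what the double-softmax mistake looks like (random logits, for illustration only):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(2, 5)          # raw logits straight from the model
targets = torch.tensor([1, 3])

# CrossEntropyLoss == NLLLoss applied to the log-softmax of the logits.
ce = nn.CrossEntropyLoss()(logits, targets)
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), targets)
assert torch.allclose(ce, nll)

# The mistake: softmaxing first means the loss softmaxes AGAIN internally,
# producing a different (wrong) loss value.
wrong = nn.CrossEntropyLoss()(F.softmax(logits, dim=1), targets)
```

The training loop will still run with the double softmax — it just silently learns worse, which is why this bug is easy to miss.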
What is the difference between BCE and CrossEntropy?
BCELoss is for binary classification (one output per sample, 0 or 1). CrossEntropyLoss is for multi-class classification (one-of-N classes). For multi-label classification (multiple labels per sample), use BCEWithLogitsLoss.
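A minimal multi-label sketch, assuming 4 independent labels per sample (the tensors are placeholders): BCEWithLogitsLoss scores each label independently, and predictions come from thresholding each sigmoid probability on its own rather than taking an argmax across classes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Multi-label: each sample may have several of 4 labels active at once.
logits = torch.randn(3, 4)                     # one raw logit per label
targets = torch.tensor([[1., 0., 1., 0.],
                        [0., 1., 1., 1.],
                        [0., 0., 0., 1.]])     # float 0/1 per label

loss = nn.BCEWithLogitsLoss()(logits, targets)

# Prediction: threshold each label's sigmoid probability independently.
preds = (torch.sigmoid(logits) > 0.5).int()
```

Contrast with CrossEntropyLoss, where an argmax picks exactly one class; here any number of labels (including zero) can be active per sample.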
About This Tool
This tool is part of HeyTensor, a free suite of PyTorch and deep learning utilities. All calculations run entirely in your browser — no data is sent to any server. The source code is open on GitHub.
Contact
HeyTensor is built and maintained by Michael Lip. For questions or feedback, email [email protected].