# How to Fix "Gradient Is None" in PyTorch
Make sure `requires_grad=True` is set on your tensor and that you have called `.backward()` before accessing `.grad`. Tensors that were detached, or computed inside a `torch.no_grad()` block, do not receive gradients.
## Cause 1: requires_grad Not Set

```python
import torch

# BUG: tensors don't track gradients by default
x = torch.tensor([1.0, 2.0, 3.0])
y = x * 2
# y.sum().backward()  # RuntimeError: element 0 of tensors does not require grad
print(x.grad)  # None
```

```python
# FIX: set requires_grad=True
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2
y.sum().backward()
print(x.grad)  # tensor([2., 2., 2.]) ✓
```
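If the tensor already exists, you can also flip the flag in place with `requires_grad_()` instead of recreating it; a minimal sketch:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])
x.requires_grad_(True)  # in-place: start tracking operations on x

y = x * 2
y.sum().backward()
print(x.grad)  # tensor([2., 2., 2.])
```

This is handy when the tensor comes from code you don't control (e.g. a data loader) and you want gradients with respect to it.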
## Cause 2: Forgot to Call .backward()

```python
x = torch.randn(3, requires_grad=True)
y = x.sum()
# print(x.grad)  # None: backward() not called yet
y.backward()
print(x.grad)  # tensor([1., 1., 1.]) ✓
```
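Note that repeated `.backward()` calls accumulate into `.grad` rather than overwrite it, which is why training loops call `optimizer.zero_grad()` (or `x.grad.zero_()`) between steps; a quick demonstration:

```python
import torch

x = torch.randn(3, requires_grad=True)
x.sum().backward()
first = x.grad.clone()   # tensor([1., 1., 1.])
x.sum().backward()
second = x.grad.clone()  # tensor([2., 2., 2.]): the two passes accumulated

x.grad.zero_()  # reset the accumulator before the next backward pass
```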
## Cause 3: Tensor Was Detached

```python
x = torch.randn(3, requires_grad=True)
y = x.detach() * 2  # detach() breaks the computation graph
# y.sum().backward()  # RuntimeError: y does not require grad
```

```python
# FIX: don't detach if you need gradients
y = x * 2
y.sum().backward()
print(x.grad)  # tensor([2., 2., 2.]) ✓
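`detach()` is still the right tool when you deliberately want to block gradients through one branch, treating a value as a constant; a small sketch:

```python
import torch

x = torch.randn(3, requires_grad=True)
scale = x.mean().detach()  # treat the mean as a constant: no grad flows through it
y = x * scale              # gradients flow through x only
y.sum().backward()
print(x.grad)              # every element equals scale
```

Here `d(sum(x * c))/dx_i = c`, so `x.grad` is filled with the (detached) mean.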
## Cause 4: Inside torch.no_grad()

```python
x = torch.randn(3, requires_grad=True)
with torch.no_grad():
    y = x * 2  # no computation graph is built
# y.sum().backward()  # RuntimeError: element 0 of tensors does not require grad
```

```python
# FIX: don't wrap the training forward pass in torch.no_grad()
y = x * 2
y.sum().backward()  # Works ✓
```
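`torch.no_grad()` belongs in evaluation and inference code, where skipping graph construction saves memory, not in the training forward pass; a sketch of the usual split:

```python
import torch

model = torch.nn.Linear(3, 1)

# Training: gradients needed, so no no_grad() here
out = model(torch.randn(2, 3))
out.sum().backward()
print(model.weight.grad is not None)  # True

# Inference: no graph needed
with torch.no_grad():
    pred = model(torch.randn(2, 3))
print(pred.requires_grad)  # False
```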
## Cause 5: Non-Leaf Tensor

```python
x = torch.randn(3, requires_grad=True)
y = x * 2  # y is non-leaf (the result of a computation)
y.sum().backward()
print(y.grad)  # None, with a UserWarning: only leaf tensors populate .grad
```

```python
# FIX: call retain_grad() on the non-leaf tensor before backward()
x = torch.randn(3, requires_grad=True)
y = x * 2
y.retain_grad()  # tell autograd to keep this intermediate gradient
y.sum().backward()
print(y.grad)  # tensor([1., 1., 1.]) ✓
```
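When diagnosing which of these cases you've hit, it helps to print a tensor's autograd attributes. A small sketch (the `describe` helper name is purely illustrative):

```python
import torch

def describe(name, t):
    # Summarize the autograd state of tensor t
    print(f"{name}: requires_grad={t.requires_grad}, "
          f"is_leaf={t.is_leaf}, grad_fn={t.grad_fn}")

x = torch.randn(3, requires_grad=True)
y = x * 2
describe("x", x)  # x is a leaf with grad_fn=None
describe("y", y)  # y is non-leaf; its grad_fn is a MulBackward0 node
```

Rule of thumb: `.grad` is populated after `backward()` only for tensors with `requires_grad=True` and `is_leaf=True` (or non-leaves that called `retain_grad()`).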