How to Fix "one of the variables needed for gradient computation has been modified" in PyTorch

The RuntimeError "one of the variables needed for gradient computation has been modified by an inplace operation" is raised by PyTorch's autograd when a tensor it saved for the backward pass is changed in place before backward() runs. The scenarios below cover the most common causes and their fixes.

What Causes This Error

PyTorch autograd records operations to build a computation graph for backpropagation, saving references to the tensors it will need during the backward pass. When you modify a tensor in place (e.g., x += 1, x.add_(1), x[:] = value), the saved data changes underneath autograd. Autograd tracks this with a per-tensor version counter and raises this error during backward() rather than silently computing incorrect gradients.
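A minimal, self-contained repro makes the mechanism concrete. exp() is used here because its backward pass reuses its saved output, so modifying that output in place is guaranteed to trip the version check:

```python
import torch

x = torch.ones(3, requires_grad=True)
y = x.exp()     # autograd saves the output y: grad_x = grad_y * y
y.add_(1)       # in-place edit bumps y's version counter
try:
    y.sum().backward()
except RuntimeError as e:
    msg = str(e)
    print("inplace" in msg)  # the saved tensor was modified
```

Replacing y.add_(1) with y = y + 1 makes the backward pass succeed, because the addition produces a new tensor and the saved output is left untouched.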

Scenario 1: In-place Addition in Forward Pass

Using += instead of + modifies tensors in place.

The Error

class Model(nn.Module):
    def forward(self, x):
        out = self.layer1(x)
        out += self.residual(x)  # In-place! Modifies out
        out = self.layer2(out)
        return out

# RuntimeError: one of the variables needed for gradient computation
# has been modified by an inplace operation

The Fix

class Model(nn.Module):
    def forward(self, x):
        out = self.layer1(x)
        out = out + self.residual(x)  # Out-of-place: creates new tensor
        out = self.layer2(out)
        return out

# Or use torch.add explicitly:
# out = torch.add(out, self.residual(x))

Replacing out += ... with out = out + ... creates a new tensor instead of modifying the existing one. This preserves the original values that autograd saved for gradient computation.
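A runnable sketch of the fixed pattern as a small residual block (layer sizes and the identity skip connection are illustrative stand-ins for the article's layer1/residual/layer2):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim=8):
        super().__init__()
        self.layer1 = nn.Linear(dim, dim)
        self.layer2 = nn.Linear(dim, dim)

    def forward(self, x):
        out = self.layer1(x)
        out = out + x                      # out-of-place: new tensor, graph intact
        return self.layer2(out)

model = ResidualBlock()
model(torch.randn(4, 8)).sum().backward()  # completes without the in-place error
print(all(p.grad is not None for p in model.parameters()))
```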

Scenario 2: In-place Activation Functions

Using inplace=True on activations that feed into operations needing gradients.

The Error

model = nn.Sequential(
    nn.Linear(256, 128),
    nn.ReLU(inplace=True),  # In-place modification
    nn.Linear(128, 10)
)
# May cause "modified by an inplace operation" in some graph configurations

The Fix

model = nn.Sequential(
    nn.Linear(256, 128),
    nn.ReLU(inplace=False),  # Safe: creates new tensor
    nn.Linear(128, 10)
)

# inplace=True saves memory but risks gradient errors.
# Only use inplace=True when you're certain the tensor isn't
# needed by other branches of the computation graph.

While inplace=True saves memory, it can break gradient computation in networks with skip connections or shared parameters. Default to inplace=False unless profiling shows a clear memory benefit.
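A minimal illustration of when inplace=True actually breaks: sigmoid's backward pass needs its own output, and an in-place ReLU applied to that output overwrites it (the tensor shape here is arbitrary):

```python
import torch
import torch.nn.functional as F

a = torch.randn(3, requires_grad=True)

b = torch.sigmoid(a)                       # sigmoid's backward reuses its output b
try:
    F.relu(b, inplace=True).sum().backward()   # overwrites b -> version mismatch
except RuntimeError:
    print("inplace=True: RuntimeError")

b = torch.sigmoid(a)
F.relu(b, inplace=False).sum().backward()  # new tensor, b left intact
print("inplace=False: ok")
```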

Scenario 3: Modifying Weight Tensors During Forward

Directly modifying parameters or buffers during forward pass.

The Error

class Model(nn.Module):
    def forward(self, x):
        self.weight.data.zero_()  # In-place modification of parameter!
        self.weight.data.add_(compute_weight(x))
        return F.linear(x, self.weight)
# RuntimeError: one of the variables needed for gradient computation
# has been modified by an inplace operation

The Fix

class Model(nn.Module):
    def forward(self, x):
        # Create a new weight tensor instead of modifying in-place
        w = compute_weight(x)  # Compute fresh weights
        return F.linear(x, w)

# If you need conditional weights, clone first:
# w = self.weight.clone()
# w = w + delta  # out-of-place modification on clone

Never modify .data of parameters during the forward pass: writes through .data bypass autograd's tracking but still bump the tensor's version counter, invalidating any graph that saved the parameter. Use functional operations that create new tensors, or .clone() first, and reserve in-place parameter updates for code wrapped in torch.no_grad() (as optimizers do).
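A runnable sketch of the clone-based fix. The 0.5 scaling is just an illustrative stand-in for whatever per-forward weight transformation you need:

```python
import torch
import torch.nn.functional as F

weight = torch.randn(4, 4, requires_grad=True)
x = torch.randn(2, 4)

w = weight.clone()        # clone is tracked by autograd; original stays untouched
w = w * 0.5               # out-of-place modification of the clone
F.linear(x, w).sum().backward()
print(weight.grad.shape)  # gradients still flow back to the original parameter
```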

Quick Debugging Checklist

# Enable anomaly detection: the backward error then includes a
# traceback of the forward call that created the failing graph node.
# It slows execution significantly, so enable it only while debugging.
torch.autograd.set_detect_anomaly(True)

# Check tensor properties; _version increments on every in-place op,
# so a suspect tensor with _version > 0 has been modified in place
print(f"dtype: {tensor.dtype}, device: {tensor.device}, shape: {tensor.shape}")
print(f"requires_grad: {tensor.requires_grad}, _version: {tensor._version}")
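Anomaly detection is also available as a context manager, which scopes the overhead to the code you are debugging. With it enabled, the RuntimeError is accompanied by a traceback for the forward-pass call that created the failing node, so you can see which tensor was saved and later modified:

```python
import torch

x = torch.ones(3, requires_grad=True)
with torch.autograd.detect_anomaly():
    y = x.exp()   # anomaly mode records a forward traceback for this node
    y.add_(1)     # in-place modification of the saved output
    try:
        y.sum().backward()
    except RuntimeError as e:
        msg = str(e)
        print("inplace" in msg)
```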
