How to Fix "Sizes of tensors must match except in dimension" in PyTorch

The error "Sizes of tensors must match except in dimension X" is raised when you try to combine tensors whose shapes are incompatible, most commonly with torch.cat(). Below are the three most common scenarios that trigger it and how to fix each one.

What Causes This Error

torch.cat() requires every tensor in the list to have identical sizes in all dimensions except the concatenation dimension. For example, torch.cat([tensor_a, tensor_b], dim=0) requires both tensors to have the same shape in dimensions 1, 2, and so on. torch.stack() is stricter: every tensor must have exactly the same shape in every dimension.
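A minimal reproduction of the rule: with dim=0, the tensors may differ in dimension 0 but must agree everywhere else.

```python
import torch

a = torch.randn(2, 3)
b = torch.randn(4, 3)
ok = torch.cat([a, b], dim=0)   # works: dim 1 matches (3 == 3) -> shape [6, 3]

c = torch.randn(4, 5)
try:
    torch.cat([a, c], dim=0)    # fails: dim 1 differs (3 vs 5)
except RuntimeError as e:
    print(e)
```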

Scenario 1: Concatenating Feature Maps of Different Sizes

Skip connections or multi-scale features may produce tensors of different spatial sizes.

The Error

import torch

features_high = torch.randn(1, 64, 32, 32)  # 32x32 spatial size
features_low = torch.randn(1, 64, 16, 16)   # 16x16 spatial size
combined = torch.cat([features_high, features_low], dim=1)
# RuntimeError: Sizes of tensors must match except in dimension 1.
# Expected size 32 but got size 16 for tensor number 1 in the list

The Fix

import torch
import torch.nn.functional as F

features_high = torch.randn(1, 64, 32, 32)
features_low = torch.randn(1, 64, 16, 16)

# Option 1: Upsample the smaller tensor
features_low_up = F.interpolate(features_low, size=(32, 32), mode='bilinear', align_corners=False)
combined = torch.cat([features_high, features_low_up], dim=1)  # Works: [1, 128, 32, 32]

# Option 2: Downsample the larger tensor
features_high_down = F.adaptive_avg_pool2d(features_high, (16, 16))
combined = torch.cat([features_high_down, features_low], dim=1)  # Works: [1, 128, 16, 16]

In U-Net and FPN architectures, feature maps at different scales must be resized before concatenation. Use F.interpolate for upsampling or adaptive pooling for downsampling.
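The resize-then-concatenate pattern can be wrapped in a small helper so the decoder never sees mismatched spatial sizes. This is a minimal sketch; the function name `fuse_skip` is hypothetical, not a library API.

```python
import torch
import torch.nn.functional as F

def fuse_skip(decoder_feat, skip_feat):
    """Upsample decoder features to the skip connection's spatial size,
    then concatenate along the channel dimension. Hypothetical helper."""
    if decoder_feat.shape[-2:] != skip_feat.shape[-2:]:
        decoder_feat = F.interpolate(decoder_feat, size=skip_feat.shape[-2:],
                                     mode='bilinear', align_corners=False)
    return torch.cat([decoder_feat, skip_feat], dim=1)

fused = fuse_skip(torch.randn(1, 128, 16, 16), torch.randn(1, 64, 32, 32))
print(fused.shape)  # channels add up: 128 + 64 = 192
```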

Scenario 2: Batching Sequences of Different Lengths

NLP tasks often have variable-length sequences that cannot be directly concatenated.

The Error

import torch

seq1 = torch.randn(5, 768)   # 5 tokens
seq2 = torch.randn(8, 768)   # 8 tokens
batch = torch.stack([seq1, seq2])
# RuntimeError: stack expects each tensor to be equal size,
# but got [5, 768] at entry 0 and [8, 768] at entry 1

The Fix

# Option 1: Pad to maximum length
import torch
from torch.nn.utils.rnn import pad_sequence

seq1 = torch.randn(5, 768)
seq2 = torch.randn(8, 768)
batch = pad_sequence([seq1, seq2], batch_first=True)  # [2, 8, 768], padded with zeros

# Option 2: Truncate to minimum length
min_len = min(seq1.size(0), seq2.size(0))
batch = torch.stack([seq1[:min_len], seq2[:min_len]])  # [2, 5, 768]

pad_sequence pads shorter tensors with zeros to match the longest. Use attention masks to ignore padded positions during training.
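A boolean mask distinguishing real tokens from padding can be built from the original lengths; a minimal sketch of the masks-from-lengths step:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

seqs = [torch.randn(5, 768), torch.randn(8, 768)]
lengths = torch.tensor([s.size(0) for s in seqs])   # tensor([5, 8])
batch = pad_sequence(seqs, batch_first=True)        # [2, 8, 768]

# True for real tokens, False for padded positions: [2, 8]
mask = torch.arange(batch.size(1))[None, :] < lengths[:, None]
```

The mask can then be passed to attention layers (e.g. as `key_padding_mask`) so padded positions are ignored.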

Scenario 3: Residual Connection Shape Mismatch

Skip/residual connections require the input and output to have identical shapes.

The Error

import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(64, 128, 3, padding=1)  # Changes channels!

    def forward(self, x):  # x: [B, 64, H, W]
        return x + self.conv(x)  # Error! [B, 64, H, W] + [B, 128, H, W]
# RuntimeError: The size of tensor a (64) must match the size of
# tensor b (128) at non-singleton dimension 1

The Fix

class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(64, 128, 3, padding=1)
        self.shortcut = nn.Conv2d(64, 128, 1)  # 1x1 conv to match channels

    def forward(self, x):
        return self.shortcut(x) + self.conv(x)  # Both [B, 128, H, W]. Works!

ResNet uses 1x1 convolutions (projection shortcuts) to match channel dimensions when the residual path changes the number of channels.
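The projection shortcut only needs to exist when the channel count actually changes. A minimal sketch (the class name `ResBlock` is hypothetical) that falls back to an identity shortcut otherwise:

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Sketch: use a 1x1 projection shortcut only when channels change."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.shortcut = (nn.Identity() if in_ch == out_ch
                         else nn.Conv2d(in_ch, out_ch, 1))

    def forward(self, x):
        return self.shortcut(x) + self.conv(x)

out = ResBlock(64, 128)(torch.randn(1, 64, 8, 8))
print(out.shape)  # both paths produce [1, 128, 8, 8]
```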

Quick Debugging Checklist

# Print shape, dtype, and device of the tensors you are combining
print(f"shape: {tensor.shape}, dtype: {tensor.dtype}, device: {tensor.device}")

# For cat/stack over a list, inspect every element to find the odd one out
for i, t in enumerate(tensor_list):
    print(f"tensor {i}: {tuple(t.shape)}")
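The checklist above can be packaged as a throwaway wrapper that logs every operand before concatenating, so the offending tensor is visible the moment the error fires. The helper name `debug_cat` is hypothetical:

```python
import torch

def debug_cat(tensors, dim=0):
    """Print each operand's shape before concatenating. Hypothetical debug helper."""
    for i, t in enumerate(tensors):
        print(f"tensor {i}: shape={tuple(t.shape)}, dtype={t.dtype}, device={t.device}")
    return torch.cat(tensors, dim=dim)

out = debug_cat([torch.randn(2, 3), torch.randn(4, 3)], dim=0)
```

Swap `debug_cat` in for `torch.cat` at the failing call site, then remove it once the mismatched shape is found.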
