What Does Conv2d Output with 32×32 Input, Kernel 3, Padding 0?

A Conv2d layer with a 32×32 input, kernel_size=3, stride=1, and padding=0 outputs 30×30. The formula gives: floor((32 - 3 + 2×0) / 1) + 1 = 30.

Formula Breakdown

The Conv2d output size formula along each spatial dimension (with the default dilation of 1) is:

output_size = floor((input_size - kernel_size + 2 * padding) / stride) + 1

Plugging in the values for 32×32 input:

output = floor((32 - 3 + 2*0) / 1) + 1
output = floor((32 - 3 + 0) / 1) + 1
output = floor(29 / 1) + 1
output = floor(29) + 1
output = 30

So the spatial dimensions go from 32×32 to 30×30.
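
The arithmetic above can be wrapped in a small helper to check other settings. This is a minimal sketch, not part of PyTorch: the function name conv2d_output_size is made up for illustration, and it assumes dilation=1 and the same size, kernel, stride, and padding along height and width.

```python
def conv2d_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output size along one spatial dimension (hypothetical helper, dilation assumed to be 1)."""
    # Integer floor division plays the role of floor(...) in the formula.
    return (input_size - kernel_size + 2 * padding) // stride + 1


print(conv2d_output_size(32, kernel_size=3))             # 30
print(conv2d_output_size(32, kernel_size=3, padding=1))  # 32
print(conv2d_output_size(32, kernel_size=3, stride=2))   # 15
```

The second call shows that padding=1 keeps a 3×3 convolution at 32×32, and the third shows where the floor matters: 29 / 2 is rounded down to 14 before adding 1.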

PyTorch Code Example

import torch
import torch.nn as nn

# Define the Conv2d layer
conv = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1, padding=0)

# Create input tensor: (batch, channels, height, width)
x = torch.randn(1, 64, 32, 32)
output = conv(x)
print(output.shape)  # torch.Size([1, 128, 30, 30])

# Verify with formula
expected = (32 + 2 * 0 - 3) // 1 + 1
print(f"Expected output size: {expected}x{expected}")  # 30x30

Architecture Context

A 3×3 convolution with stride 1 and no padding reduces the spatial size by 2 in each dimension. This “valid” (unpadded) style of convolution is common in early architectures like LeNet-5, which applies it with 5×5 kernels.
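
To see how this shrinkage adds up over several layers, the sketch below tracks the spatial size through a stack of unpadded, stride-1 convolutions. The function name valid_conv_sizes is hypothetical, and the stack of three layers is an illustrative assumption rather than a specific architecture.

```python
def valid_conv_sizes(input_size, num_layers, kernel_size=3):
    """Spatial size after each unpadded, stride-1 convolution (hypothetical helper)."""
    sizes = [input_size]
    for _ in range(num_layers):
        # With stride=1 and padding=0 the formula reduces to input - kernel + 1.
        sizes.append(sizes[-1] - kernel_size + 1)
    return sizes


print(valid_conv_sizes(32, 3))                 # [32, 30, 28, 26]
print(valid_conv_sizes(32, 1, kernel_size=5))  # [32, 28]
```

Each 3×3 layer removes 2 pixels per dimension, so three of them take 32×32 down to 26×26; a single 5×5 layer removes 4.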

Parameter Count

A Conv2d(64, 128, 3) layer has:

parameters = in_channels * out_channels * kernel_size^2 + out_channels (bias)
parameters = 64 * 128 * 3 * 3 + 128
parameters = 73,856

This layer has 73,856 trainable parameters (73,728 weights + 128 bias terms).
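
The same count can be reproduced in a few lines. This is a minimal sketch with a hypothetical helper name, conv2d_param_count; it assumes a square kernel and groups=1 (the Conv2d default).

```python
def conv2d_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Trainable parameters of a Conv2d layer (hypothetical helper; square kernel, groups=1 assumed)."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    biases = out_channels if bias else 0  # one bias term per output channel
    return weights + biases


print(conv2d_param_count(64, 128, 3))              # 73856
print(conv2d_param_count(64, 128, 3, bias=False))  # 73728
```

For the conv layer defined in the PyTorch example, sum(p.numel() for p in conv.parameters()) should report the same total.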
