What Does Conv2d Output with 224×224 Input and Kernel 3?
Conv2d with 224×224 input, kernel_size=3, stride=1, padding=1 outputs 224×224. The formula is: output = (224 - 3 + 2×1) / 1 + 1 = 224. This preserves spatial dimensions.
Formula Breakdown
The Conv2d output size formula (per spatial dimension) is:
output_size = floor((input_size - kernel_size + 2 * padding) / stride) + 1
The floor only matters when the stride does not divide evenly; with stride=1 the division is exact.
Plugging in the values:
output = (224 - 3 + 2*1) / 1 + 1
output = (224 - 3 + 2) / 1 + 1
output = 223 / 1 + 1
output = 224
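The arithmetic above can be wrapped in a small helper so you can check other configurations quickly. This is an illustrative sketch (the function name `conv2d_output_size` is my own, not a PyTorch API); it uses floor division to match the general formula.

```python
def conv2d_output_size(input_size, kernel_size, stride=1, padding=0):
    # Floor division handles strides that do not divide evenly
    return (input_size - kernel_size + 2 * padding) // stride + 1

print(conv2d_output_size(224, 3, stride=1, padding=1))  # 224 (the case above)
print(conv2d_output_size(224, 3, stride=2, padding=1))  # 112 (strided conv halves the size)
print(conv2d_output_size(224, 7, stride=2, padding=3))  # 112 (ResNet-style stem conv)
```

The same formula applies independently to height and width, so a non-square input just means running it twice.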
PyTorch Code
import torch
import torch.nn as nn

# 3×3 "same" convolution: kernel_size=3 with padding=1 preserves height and width
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1)
x = torch.randn(1, 3, 224, 224)  # (batch, channels, height, width)
output = conv(x)
print(output.shape)  # torch.Size([1, 64, 224, 224])
Why This Matters
A 3×3 kernel with padding=1 is the standard "same" convolution that preserves spatial dimensions. It is the most common convolution configuration in modern architectures such as VGG, ResNet, and DenseNet: you can stack many convolutional layers without shrinking the feature maps, reducing spatial size only through explicit pooling or strided convolutions.
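To see this in a stack, here is a minimal shape walkthrough of a VGG-style stage, several 3×3 "same" convolutions followed by a 2×2 max pool, which is where the halving actually happens. This is pure arithmetic rather than PyTorch, and the helper name `out_size` is my own; note that max pooling obeys the same sliding-window formula.

```python
def out_size(n, kernel, stride=1, padding=0):
    # Same sliding-window size formula; it covers pooling layers too
    return (n - kernel + 2 * padding) // stride + 1

size = 224
for layer in range(3):                 # three stacked 3x3 "same" convs
    size = out_size(size, 3, stride=1, padding=1)
    print(f"after conv {layer + 1}: {size}")  # stays 224
size = out_size(size, 2, stride=2)     # 2x2 max pool, stride 2
print(f"after pool: {size}")           # 112
```

Swapping the pool for a stride-2 convolution (e.g. kernel 3, stride 2, padding 1) gives the same 224 → 112 reduction, which is the choice ResNet makes.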