What Is Stride in Conv2d?
Stride controls how far the kernel moves between positions. With stride=2 the kernel steps two pixels at a time, roughly halving the output size; stride=1 (the default) preserves the spatial size when paired with appropriate padding.
How Stride Affects Output Size
output_size = floor((input - kernel + 2*padding) / stride) + 1
# 224x224 input, kernel=3, padding=1
stride=1: floor((224 - 3 + 2) / 1) + 1 = 224 (same size)
stride=2: floor((224 - 3 + 2) / 2) + 1 = 112 (half size)
stride=4: floor((224 - 3 + 2) / 4) + 1 = 56 (quarter size)
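The formula can be wrapped in a small helper to check these numbers (the function name `conv_output_size` is just for illustration):

```python
import math

def conv_output_size(input_size, kernel, padding=0, stride=1):
    # floor((input - kernel + 2*padding) / stride) + 1
    return math.floor((input_size - kernel + 2 * padding) / stride) + 1

print(conv_output_size(224, kernel=3, padding=1, stride=1))  # 224
print(conv_output_size(224, kernel=3, padding=1, stride=2))  # 112
print(conv_output_size(224, kernel=3, padding=1, stride=4))  # 56
```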
PyTorch Examples
import torch
import torch.nn as nn
x = torch.randn(1, 3, 224, 224)
# stride=1: preserves spatial size
conv_s1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
print(conv_s1(x).shape) # [1, 64, 224, 224]
# stride=2: halves spatial size
conv_s2 = nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1)
print(conv_s2(x).shape) # [1, 64, 112, 112]
# stride=2 with kernel=7: ResNet conv1
conv_resnet = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3)
print(conv_resnet(x).shape) # [1, 64, 112, 112]
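Stride-2 layers compose multiplicatively: stacking three of them downsamples by 2×2×2 = 8. A minimal sketch (the channel counts here are arbitrary, not from any particular architecture):

```python
import torch
import torch.nn as nn

# Each stride-2 conv halves the spatial size, so three in a row
# give a total downsampling factor of 8.
backbone = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),    # 224 -> 112
    nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),   # 112 -> 56
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),  # 56 -> 28
)
x = torch.randn(1, 3, 224, 224)
print(backbone(x).shape)  # [1, 128, 28, 28]
```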
Stride vs Pooling for Downsampling
- MaxPool2d(2, 2) — takes the max value in each 2×2 window, no learnable parameters
- Conv2d(stride=2) — learnable downsampling, the network learns how to downsample
Modern architectures lean on strided convolutions for most downsampling (ResNet uses them between stages, alongside a single max pool after conv1; EfficientNet uses strided depthwise convolutions), because the network can learn what information to preserve during downsampling.
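The trade-off is easy to see directly: both operations can produce the same output shape, but only the convolution carries learnable weights. A quick comparison:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 56, 56)

pool = nn.MaxPool2d(kernel_size=2, stride=2)                   # fixed max, no weights
conv = nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1)   # learnable downsampling

print(pool(x).shape)  # [1, 64, 28, 28]
print(conv(x).shape)  # [1, 64, 28, 28]

print(sum(p.numel() for p in pool.parameters()))  # 0
print(sum(p.numel() for p in conv.parameters()))  # 64*64*3*3 + 64 = 36928
```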
Asymmetric Strides
# Different stride for height and width
conv = nn.Conv2d(3, 64, kernel_size=3, stride=(2, 1), padding=1)
x = torch.randn(1, 3, 224, 224)
print(conv(x).shape) # [1, 64, 112, 224] — height halved, width same
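Each spatial axis follows the output-size formula independently with its own stride, which is where the 112×224 shape comes from. A quick check (the helper name `out_size` is illustrative):

```python
import math

def out_size(n, k, p, s):
    # floor((input - kernel + 2*padding) / stride) + 1, applied per axis
    return math.floor((n - k + 2 * p) / s) + 1

# kernel=3, padding=1, stride=(2, 1) on a 224x224 input
print(out_size(224, 3, 1, 2))  # height: 112
print(out_size(224, 3, 1, 1))  # width: 224
```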