Receptive Field Calculator for Stacked CNN Layers

Add your Conv2d and MaxPool2d layers below to compute the effective receptive field, feature jump, and start offset at every stage of a PyTorch convolutional network. Output updates live as you edit.

#	Type	Kernel	Stride	Padding	Dilation	RF	Jump	Start

How the receptive field is computed

The receptive field (RF) of a unit in a convolutional feature map is the size of the region in the original input image that influences that unit's value. Stacking convolution and pooling layers compounds the RF: every stride multiplies the contribution of all later layers, which is why a 50-layer ResNet can "see" hundreds of pixels even though most of its kernels are 3×3 or smaller. This tool walks the network layer by layer and tracks three quantities introduced in Dang-Ha's RF analysis: the receptive field size r, the jump j (the distance, in input pixels, between two adjacent features in the current map, equal to the cumulative product of strides), and the start s (the center coordinate of the first feature, measured in input pixels).

For each layer with kernel k, stride S, padding P and dilation d, the effective kernel becomes k_eff = d·(k−1)+1, and the recurrence applied to the previous layer's values is:

j_out = j_in × S
r_out = r_in + (k_eff − 1) × j_in
s_out = s_in + ((k_eff − 1)/2 − P) × j_in
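The recurrence can be sketched in a few lines of plain Python. This is a minimal illustration, not the calculator's actual source: the names RFState and apply_layer are hypothetical, the layer stack is an invented example, and the code tracks one spatial axis only (square kernels with equal stride and padding on both axes give the same numbers for height and width).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RFState:
    """Receptive-field bookkeeping for one feature map, in input pixels."""
    r: float = 1.0  # receptive field size
    j: float = 1.0  # jump: distance between adjacent features
    s: float = 0.5  # start: center of the first feature


def apply_layer(state, kernel, stride=1, padding=0, dilation=1):
    """Advance (r, j, s) through one conv or pooling layer (hypothetical helper)."""
    k_eff = dilation * (kernel - 1) + 1  # effective kernel after dilation
    return RFState(
        r=state.r + (k_eff - 1) * state.j,  # growth scales with the incoming jump
        j=state.j * stride,
        s=state.s + ((k_eff - 1) / 2 - padding) * state.j,
    )


# Assumed example stack: 3x3 conv (pad 1), 2x2 max-pool (stride 2), 3x3 conv (pad 1).
layers = [
    dict(kernel=3, stride=1, padding=1),
    dict(kernel=2, stride=2, padding=0),
    dict(kernel=3, stride=1, padding=1),
]

state = RFState()  # j = 1, r = 1, s = 0.5 at the input
for layer in layers:
    state = apply_layer(state, **layer)
    print(state)
```

For this stack the final state is r = 8, j = 2, s = 1.0: the second conv adds 4 pixels rather than 2 because it sits behind a stride-2 pool.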

Initialization is j = 1, r = 1, s = 0.5 at the input, so s = 0.5 is the center of the first pixel. The key point that plain "output-size" calculators miss is that RF growth is driven by the incoming jump j_in, not by the layer's own stride. A 3×3 conv placed after two stride-2 pools (j_in = 4) adds (3−1)×4 = 8 pixels of RF, while the same conv at the input (j_in = 1) adds only 2. The start offset s tells you where the first feature is centered in input coordinates, and therefore whether the feature map is aligned with the input or shifted. A negative s means that center lies in the padding, outside the image, and an s that differs from the value your box-decoding code assumes often signals an off-by-one padding choice that will quietly misregister your detections.
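The two cases above can be checked numerically. The sketch below is illustrative only: rf_step is a hypothetical helper that restates the recurrence, and the pools are assumed to be 2×2 with stride 2 and no padding. The last line shows one assumed way to get a negative start, a 3×3 conv with padding 2.

```python
def rf_step(r, j, s, kernel, stride=1, padding=0, dilation=1):
    """One step of the (r, j, s) recurrence; hypothetical helper for illustration."""
    k_eff = dilation * (kernel - 1) + 1
    return (
        r + (k_eff - 1) * j,
        j * stride,
        s + ((k_eff - 1) / 2 - padding) * j,
    )


# Case 1: 3x3 conv applied directly to the input (incoming jump is 1).
r_in, j_in, s_in = 1, 1, 0.5
r_out, _, _ = rf_step(r_in, j_in, s_in, kernel=3, padding=1)
growth_at_input = r_out - r_in

# Case 2: the same conv after two 2x2, stride-2 pools (incoming jump is 4).
state = (1, 1, 0.5)
for _ in range(2):
    state = rf_step(*state, kernel=2, stride=2)
r_before, j_before, _ = state
r_after, _, _ = rf_step(*state, kernel=3, padding=1)
growth_after_pools = r_after - r_before

print(growth_at_input, j_before, growth_after_pools)

# Over-padding pushes the first feature's center outside the image.
_, _, s_overpadded = rf_step(1, 1, 0.5, kernel=3, padding=2)
print(s_overpadded)
```

The same layer contributes 2 pixels in the first case and 8 in the second, and the over-padded conv reports a start of -0.5.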

Dilation widens the effective kernel without adding parameters, and it leaves the stride untouched, so a dilated stack grows the RF quickly while keeping the jump fixed. This is the standard trick behind dense-prediction backbones. Use the per-layer columns to find exactly which layer first covers your object scale, then prune or dilate accordingly.
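As a rough illustration of the dilation effect, the sketch below compares three plain 3×3 convs with three 3×3 convs using dilations 1, 2 and 4, then looks up the first layer whose RF reaches a target size. The function names and the 10-pixel target are assumptions made for the example, and padding is left out because it affects only the start offset, not r or j.

```python
def receptive_fields(layers):
    """Return (RF size after each layer, final jump).

    layers is a list of (kernel, stride, dilation) tuples; hypothetical helper.
    """
    r, j = 1, 1
    sizes = []
    for kernel, stride, dilation in layers:
        k_eff = dilation * (kernel - 1) + 1
        r += (k_eff - 1) * j  # uses the incoming jump
        j *= stride
        sizes.append(r)
    return sizes, j


def first_layer_covering(sizes, target_px):
    """1-based index of the first layer whose RF is at least target_px, else None."""
    for index, size in enumerate(sizes, start=1):
        if size >= target_px:
            return index
    return None


plain = [(3, 1, 1)] * 3
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]

plain_rf, plain_jump = receptive_fields(plain)
dilated_rf, dilated_jump = receptive_fields(dilated)
print(plain_rf, plain_jump)
print(dilated_rf, dilated_jump)

# Assumed object scale of 10 px.
print(first_layer_covering(plain_rf, 10), first_layer_covering(dilated_rf, 10))
```

Both stacks have the same parameter count and a jump of 1, but the plain stack ends at an RF of 7 while the dilated one reaches 15, so only the dilated stack covers a 10-pixel object, at its third layer.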
