Calculate Tensor Strides and Storage Offsets
Enter a tensor shape and an index to see the contiguous strides PyTorch assigns and the exact flat position the element occupies in linear memory.
How PyTorch strides work
A PyTorch tensor stores its values in one flat, one-dimensional block of memory. The stride tells the engine how many elements to jump in that flat block to advance one step along each dimension. For a default contiguous, row-major (C-order) tensor, the stride of a dimension equals the product of the sizes of all dimensions to its right.
Formally, for a shape (d₀, d₁, …, dₙ₋₁), this calculator computes each stride as:
stride[k] = d[k+1] × d[k+2] × … × d[n-1], with the last dimension always having stride = 1.
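It builds these by scanning the shape right-to-left, keeping a running product that starts at 1: assign the running product as the current stride, then multiply it by the current dimension size before moving left. For shapes with no zero-sized dimensions, that single pass reproduces the strides of a freshly allocated tensor, for example torch.empty(shape).stride(). PyTorch treats a zero-sized dimension as size 1 when it computes strides, so results can differ for empty tensors.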
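The right-to-left pass described above can be sketched in a few lines of plain Python. This is a minimal illustration, not PyTorch's own implementation; the function name contiguous_strides is made up for this example, and the sketch assumes no dimension has size zero.

```python
def contiguous_strides(shape):
    """Row-major (C-order) strides, counted in elements, for a shape.

    Hypothetical helper for illustration; assumes no zero-sized dimensions.
    """
    strides = [0] * len(shape)
    running = 1  # product of the sizes to the right of the current dimension
    for k in range(len(shape) - 1, -1, -1):
        strides[k] = running   # assign the running product as this stride
        running *= shape[k]    # then fold in this dimension's size
    return tuple(strides)


print(contiguous_strides((2, 3, 4)))  # (12, 4, 1)
```

For shape (2, 3, 4), the last dimension gets stride 1, the middle one gets 4, and the first gets 3 × 4 = 12.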
The flat storage offset of a multi-dimensional index (i₀, i₁, …, iₙ₋₁), measured from the tensor's first element, is the dot product of the index with the stride vector:
offset = Σ i[k] × stride[k]
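As a worked example of the dot product, the sketch below computes the offset of index (1, 2, 3) in a tensor of shape (2, 3, 4). The helper name flat_offset is hypothetical, and the strides (12, 4, 1) are the contiguous strides for that shape, written out by hand.

```python
def flat_offset(index, strides):
    """Dot product of an index with a stride vector (hypothetical helper)."""
    if len(index) != len(strides):
        raise ValueError("index and strides must have the same length")
    return sum(i * s for i, s in zip(index, strides))


strides = (12, 4, 1)  # contiguous strides for shape (2, 3, 4)
print(flat_offset((1, 2, 3), strides))  # 1*12 + 2*4 + 3*1 = 23
```

Index (1, 2, 3) is the last element of the tensor, so its offset is 23, one less than the 24 elements the shape holds.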
This is why a view() or reshape() on a contiguous tensor is free: the data never moves, only the shape and stride metadata change. By contrast, a transpose() or permute() reorders the shape and strides without touching memory. The result is generally non-contiguous, because its strides no longer form a descending running product. A later contiguous() call, or a reshape() that cannot be expressed as a view, then has to copy the data, and view() raises an error when the requested shape is incompatible with the strides.
The total element count is ∏ d[k], and a contiguous tensor's storage spans exactly that many elements with no gaps, so the largest valid offset is total - 1.
Understanding this map between indices and offsets is essential when debugging custom CUDA kernels, autograd memory-layout warnings, or unexpected copies during training.
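The effect of a transpose on the metadata can be illustrated without PyTorch. The sketch below uses two hypothetical helpers, permute and is_contiguous, that operate on plain shape and stride tuples. The contiguity check is a simplified version that skips size-1 dimensions and does not handle zero-sized ones.

```python
def permute(shape, strides, order):
    """Reorder shape and strides together; no data is moved."""
    return (tuple(shape[a] for a in order),
            tuple(strides[a] for a in order))


def is_contiguous(shape, strides):
    """Simplified row-major check: strides must equal the running product."""
    expected = 1
    for size, stride in zip(reversed(shape), reversed(strides)):
        if size != 1 and stride != expected:
            return False
        expected *= size
    return True


shape, strides = (2, 3), (3, 1)            # contiguous 2 x 3 layout
t_shape, t_strides = permute(shape, strides, (1, 0))

print(t_shape, t_strides)                  # (3, 2) (1, 3)
print(is_contiguous(shape, strides))       # True
print(is_contiguous(t_shape, t_strides))   # False
```

Element [i][j] of the original and element [j][i] of the transposed layout resolve to the same offset, 3·i + j, which is why the swap needs no copy.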