LayerNorm Shape Calculator

Calculate PyTorch LayerNorm output shapes and parameter counts. LayerNorm preserves the input shape. Enter a normalized_shape to see the output dimensions and the trainable parameter count.

Built by Michael Lip

Frequently Asked Questions

Does LayerNorm change the output shape?

No. LayerNorm always outputs the same shape as the input. It normalizes across the dimensions specified by normalized_shape but does not change any dimensions.
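A quick sketch of this in PyTorch (the layer size and tensor shape here are illustrative):

```python
import torch
import torch.nn as nn

# LayerNorm's output shape always matches its input shape.
ln = nn.LayerNorm(64)
x = torch.randn(8, 16, 64)   # [batch, seq_len, d_model]
y = ln(x)
print(y.shape == x.shape)    # True: normalization never resizes the tensor
```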

What is the difference between LayerNorm and BatchNorm?

BatchNorm normalizes across the batch dimension (each feature independently across samples). LayerNorm normalizes across feature dimensions (each sample independently). LayerNorm works with any batch size and is the standard in Transformers. BatchNorm is standard in CNNs.
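The difference in normalization axes can be verified directly. In this sketch (sizes chosen for illustration), BatchNorm drives each feature's mean across the batch toward zero, while LayerNorm drives each sample's mean across its features toward zero:

```python
import torch
import torch.nn as nn

x = torch.randn(4, 10)  # [batch, features]

# BatchNorm1d: each feature is normalized across the 4 samples
bn = nn.BatchNorm1d(10)
y_bn = bn(x)
print(y_bn.mean(dim=0))  # per-feature means, all approximately 0

# LayerNorm: each sample is normalized across its 10 features
ln = nn.LayerNorm(10)
y_ln = ln(x)
print(y_ln.mean(dim=1))  # per-sample means, all approximately 0
```

Because LayerNorm's statistics come from a single sample, it behaves identically at batch size 1 and at inference time, which is one reason Transformers favor it.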

What should normalized_shape be?

normalized_shape should match the last N dimensions of your input tensor. For a Transformer with input [batch, seq_len, d_model], set normalized_shape=[d_model]. For normalizing over both seq and features, use [seq_len, d_model].
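Both choices can be sketched side by side; the batch, sequence, and model sizes below are arbitrary examples:

```python
import torch
import torch.nn as nn

x = torch.randn(32, 128, 512)  # [batch, seq_len, d_model]

# Typical Transformer usage: normalize over the last dimension only
ln_feat = nn.LayerNorm(512)          # normalized_shape=[512]
print(ln_feat(x).shape)              # unchanged: [32, 128, 512]

# Normalize over the last two dimensions (seq_len and d_model together)
ln_both = nn.LayerNorm([128, 512])   # normalized_shape=[seq_len, d_model]
print(ln_both(x).shape)              # also unchanged: [32, 128, 512]
```

Note that normalized_shape must match the trailing dimensions exactly; a mismatch raises a runtime error on the forward pass.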

How many parameters does LayerNorm have?

With elementwise_affine=True (the default), LayerNorm has 2 × product(normalized_shape) trainable parameters: one weight (gamma) and one bias (beta) per normalized element. For normalized_shape=[512], that is 1,024 parameters. With elementwise_affine=False, it has zero trainable parameters.
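The formula can be checked against PyTorch directly. The `layernorm_params` helper below is our own illustration, not a PyTorch function:

```python
import torch.nn as nn

def layernorm_params(normalized_shape, elementwise_affine=True):
    """Hypothetical helper: trainable params = 2 * prod(normalized_shape)."""
    if not elementwise_affine:
        return 0
    n = 1
    for d in normalized_shape:
        n *= d
    return 2 * n  # weight (gamma) + bias (beta)

ln = nn.LayerNorm(512)
actual = sum(p.numel() for p in ln.parameters())
print(actual, layernorm_params([512]))  # both 1024
```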

Is this tool free?

Yes. All HeyTensor tools are free, run in your browser, and require no signup.

About This Tool

This tool is part of HeyTensor, a free suite of PyTorch utilities. All calculations run in your browser. Source code is open on GitHub.

Contact

HeyTensor is built by Michael Lip. Email [email protected] for feedback.

Based on real data from our Most Common PyTorch Errors guide, which ranks shape mismatches by frequency.