GRU Output Shape Calculator

Calculate PyTorch GRU output and hidden state shapes. Enter the input size, hidden size, number of layers, and batch size to get the exact output dimensions instantly.

Built by Michael Lip

Frequently Asked Questions

What is the output shape of a PyTorch GRU?

A PyTorch GRU returns two tensors: output of shape (seq_len, batch, num_directions*hidden_size) and h_n of shape (num_layers*num_directions, batch, hidden_size). With batch_first=True, output becomes (batch, seq_len, num_directions*hidden_size).
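A minimal sketch of the shapes above, assuming PyTorch is installed; the sizes (10, 20, 2 layers, seq_len 5, batch 3) are arbitrary examples:

```python
import torch
import torch.nn as nn

# Hypothetical sizes chosen for illustration.
gru = nn.GRU(input_size=10, hidden_size=20, num_layers=2)
x = torch.randn(5, 3, 10)   # (seq_len=5, batch=3, input_size=10)
output, h_n = gru(x)

print(tuple(output.shape))  # (5, 3, 20) -> (seq_len, batch, hidden_size)
print(tuple(h_n.shape))     # (2, 3, 20) -> (num_layers, batch, hidden_size)

# With batch_first=True only the output layout changes; h_n keeps its shape.
gru_bf = nn.GRU(input_size=10, hidden_size=20, num_layers=2, batch_first=True)
out_bf, h_bf = gru_bf(torch.randn(3, 5, 10))  # (batch=3, seq_len=5, input_size=10)
print(tuple(out_bf.shape))  # (3, 5, 20) -> (batch, seq_len, hidden_size)
print(tuple(h_bf.shape))    # (2, 3, 20) -> unchanged by batch_first
```

Note that h_n is the final hidden state per layer (and per direction), while output contains the top layer's hidden state at every time step.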

What is the difference between GRU and LSTM?

GRU has 2 gates (reset and update) while LSTM has 3 gates (input, forget, output) plus a cell state. GRU merges the cell state and hidden state into one, making it simpler, with fewer parameters (a 3:4 ratio versus an LSTM of the same size). GRU often trains faster, while LSTM can sometimes capture longer-range dependencies.
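The parameter difference is easy to check directly, assuming PyTorch is installed; the sizes here are arbitrary examples:

```python
import torch.nn as nn

# Matching sizes for a fair comparison; values are illustrative.
gru = nn.GRU(input_size=10, hidden_size=20)
lstm = nn.LSTM(input_size=10, hidden_size=20)

gru_params = sum(p.numel() for p in gru.parameters())
lstm_params = sum(p.numel() for p in lstm.parameters())

# GRU stacks 3 weight matrices per layer, LSTM stacks 4, so the
# parameter counts sit in an exact 3:4 ratio for matching sizes.
print(gru_params, lstm_params)
```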

How does bidirectional affect GRU output shape?

With bidirectional=True, num_directions=2. The output last dimension doubles to 2*hidden_size. The h_n first dimension doubles to num_layers*2. For example, a 2-layer bidirectional GRU with hidden_size=256 produces output [..., 512] and h_n of shape [4, batch, 256].
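The 2-layer, hidden_size=256 example above can be verified in a few lines, assuming PyTorch is installed; input size, seq_len, and batch are arbitrary:

```python
import torch
import torch.nn as nn

gru = nn.GRU(input_size=10, hidden_size=256, num_layers=2, bidirectional=True)
x = torch.randn(5, 3, 10)   # (seq_len=5, batch=3, input_size=10)
output, h_n = gru(x)

print(tuple(output.shape))  # (5, 3, 512) -> last dim is 2*hidden_size
print(tuple(h_n.shape))     # (4, 3, 256) -> first dim is num_layers*2
```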

How do I initialize GRU hidden state?

The initial hidden state h_0 should have shape (num_layers*num_directions, batch_size, hidden_size). If not provided, PyTorch initializes it to zeros. For bidirectional GRUs, the first dimension is num_layers*2.
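A quick sketch of both options, assuming PyTorch is installed; sizes are arbitrary examples. Passing an explicit zero h_0 matches the default behavior exactly:

```python
import torch
import torch.nn as nn

gru = nn.GRU(input_size=10, hidden_size=20, num_layers=2)
x = torch.randn(5, 3, 10)   # (seq_len=5, batch=3, input_size=10)

# Explicit initial state: (num_layers*num_directions, batch, hidden_size).
h_0 = torch.zeros(2, 3, 20)
out_explicit, _ = gru(x, h_0)

# Omitting h_0 gives the same result, since PyTorch defaults to zeros.
out_default, _ = gru(x)
print(torch.allclose(out_explicit, out_default))  # True
```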

Is this tool free?

Yes. All HeyTensor tools are free, run in your browser, and require no signup.

About This Tool

This tool is part of HeyTensor, a free suite of PyTorch and deep learning utilities. All calculations run entirely in your browser — no data is sent to any server. The source code is open on GitHub.

Contact

HeyTensor is built by Michael Lip. Email [email protected] for feedback.

Based on real data from our PyTorch Error Database — 52 errors analyzed from Stack Overflow