NRTREncoder

class mmocr.models.textrecog.NRTREncoder(n_layers=6, n_head=8, d_k=64, d_v=64, d_model=512, d_inner=256, dropout=0.1, init_cfg=None)

Transformer encoder block with a self-attention mechanism.

Parameters
  • n_layers (int) – The number of sub-encoder-layers in the encoder. Defaults to 6.

  • n_head (int) – The number of heads in the multi-head attention layers. Defaults to 8.

  • d_k (int) – Total number of features in key. Defaults to 64.

  • d_v (int) – Total number of features in value. Defaults to 64.

  • d_model (int) – The number of expected features in the encoder inputs. Defaults to 512.

  • d_inner (int) – The dimension of the feedforward network model. Defaults to 256.

  • dropout (float) – Dropout rate for MHSA and FFN. Defaults to 0.1.

  • init_cfg (dict or list[dict], optional) – Initialization configs.

Return type

None
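
A minimal construction sketch, assuming an mmocr 1.x installation (the import path follows the class path above); every keyword argument is simply the documented default made explicit:

```python
from mmocr.models.textrecog import NRTREncoder

# Instantiate the encoder with the documented defaults spelled out.
encoder = NRTREncoder(
    n_layers=6,    # sub-encoder-layers stacked in the encoder
    n_head=8,      # attention heads per layer
    d_k=64,        # total number of features in each key
    d_v=64,        # total number of features in each value
    d_model=512,   # expected feature size of the encoder inputs
    d_inner=256,   # width of the feed-forward network
    dropout=0.1,   # dropout rate for MHSA and FFN
)
```

In a full MMOCR config, these arguments would typically appear as a `dict(type='NRTREncoder', ...)` entry under the recognizer's `encoder` key rather than being constructed directly.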

forward(feat, data_samples=None)
Parameters
  • feat (Tensor) – Backbone output of shape \((N, C, H, W)\).

  • data_samples (list[TextRecogDataSample]) – Batch of TextRecogDataSample, containing valid_ratio information. Defaults to None.

Returns

The encoder output tensor. Shape \((N, T, C)\).

Return type

Tensor
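
A hedged usage sketch for the forward pass, reusing the `encoder` constructed above. The feature-map shape is a made-up example whose channel count matches `d_model`, and the output sequence length `T` is assumed (not confirmed by this page) to be `H * W` after flattening:

```python
import torch

# Dummy backbone output: N=2 images, C=512 channels (must equal d_model),
# H=8, W=32. A real feature map would come from the recognizer's backbone.
feat = torch.randn(2, 512, 8, 32)

# data_samples is optional; without it, no valid_ratio masking is applied.
out = encoder(feat, data_samples=None)

print(out.shape)  # (N, T, C); presumably T = H * W = 256, i.e. (2, 256, 512)
```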
