ABILanguageDecoder

class mmocr.models.textrecog.ABILanguageDecoder(dictionary, d_model=512, n_head=8, d_inner=2048, n_layers=4, dropout=0.1, detach_tokens=True, use_self_attn=False, max_seq_len=40, module_loss=None, postprocessor=None, init_cfg=None, **kwargs)[source]

Transformer-based language model responsible for spell correction. Implementation of language model of

Parameters
  • dictionary (dict or Dictionary) – The config for Dictionary or the instance of Dictionary. The dictionary must have an end token.

  • d_model (int) – Hidden size \(E\) of model. Defaults to 512.

  • n_head (int) – Number of attention heads in multi-head attention. Defaults to 8.

  • d_inner (int) – Hidden size of the feedforward network. Defaults to 2048.

  • n_layers (int) – Number of stacked decoding layers. Defaults to 4.

  • dropout (float) – Dropout rate. Defaults to 0.1.

  • detach_tokens (bool) – Whether to block the gradient flow at input tokens. Defaults to True.

  • use_self_attn (bool) – If True, use self-attention in decoder layers; otherwise cross-attention is used. Defaults to False.

  • max_seq_len (int) – Maximum sequence length \(T\). The sequence is usually generated by the decoder. Defaults to 40.

  • module_loss (dict, optional) – Config to build loss. Defaults to None.

  • postprocessor (dict, optional) – Config to build postprocessor. Defaults to None.

  • init_cfg (dict or list[dict], optional) – Initialization configs. Defaults to None.

Return type

None
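
The constructor arguments above could be collected into a config dict along the following lines. This is a minimal sketch, not a configuration taken from the library: the `type` keys, the `dict_file` placeholder path, and the `with_end` option are assumptions used for illustration, while the numeric values mirror the defaults in the signature.

```python
# Hypothetical decoder config; numeric values mirror the documented defaults.
decoder_cfg = dict(
    type='ABILanguageDecoder',          # assumed registry name
    dictionary=dict(
        type='Dictionary',              # assumed registry name
        dict_file='path/to/dict.txt',   # placeholder path, not a real file
        with_end=True,                  # assumed flag; the docs require an end token
    ),
    d_model=512,         # hidden size E
    n_head=8,            # attention heads
    d_inner=2048,        # feedforward hidden size
    n_layers=4,          # decoding layers
    dropout=0.1,
    detach_tokens=True,  # block gradient flow at input tokens
    use_self_attn=False,
    max_seq_len=40,      # maximum sequence length T
)

# Standard multi-head attention splits d_model evenly across heads,
# so a quick sanity check on the two values is cheap insurance.
head_dim, remainder = divmod(decoder_cfg['d_model'], decoder_cfg['n_head'])
print(head_dim, remainder)
```

With the default values each head works on a 64-dimensional slice of the 512-dimensional hidden state.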

forward_test(feat=None, logits=None, data_samples=None)[source]
Parameters
  • feat (torch.Tensor, optional) – Not required. Feature map placeholder. Defaults to None.

  • logits (Tensor) – Raw language logits. Shape \((N, T, C)\). Defaults to None.

  • data_samples (list[TextRecogDataSample], optional) – Not required. DataSample placeholder. Defaults to None.

Returns

A dict with keys feature and logits.

  • feature (Tensor): Shape \((N, T, E)\). Raw textual features for vision language aligner.

  • logits (Tensor): Shape \((N, T, C)\). The raw logits for characters after spell correction.

Return type

Dict
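
The shape contract described above can be written down as a small helper. This is only an illustrative sketch in plain Python: `expected_output_shapes` is a hypothetical function, not part of MMOCR, and the batch size 2 and class count 37 are made-up values.

```python
def expected_output_shapes(logits_shape, d_model=512):
    """Return the shapes the documented output dict should have.

    logits_shape: (N, T, C) shape of the input logits.
    d_model: hidden size E of the decoder.
    """
    n, t, c = logits_shape
    return {
        'feature': (n, t, d_model),  # (N, T, E) textual features
        'logits': (n, t, c),         # (N, T, C) corrected character logits
    }


# Hypothetical input: batch of 2, sequence length 40, 37 character classes.
shapes = expected_output_shapes((2, 40, 37))
print(shapes)
```

Comparing these tuples against the `.shape` of the tensors actually returned is one way to catch a mismatch between `d_model` and the downstream aligner early.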

forward_train(feat=None, out_enc=None, data_samples=None)[source]
Parameters
  • feat (torch.Tensor, optional) – Not required. Feature map placeholder. Defaults to None.

  • out_enc (torch.Tensor) – Logits with shape \((N, T, C)\). Defaults to None.

  • data_samples (list[TextRecogDataSample], optional) – Not required. DataSample placeholder. Defaults to None.

Returns

A dict with keys feature and logits.

  • feature (Tensor): Shape \((N, T, E)\). Raw textual features for vision language aligner.

  • logits (Tensor): Shape \((N, T, C)\). The raw logits for characters after spell correction.

Return type

Dict
