ParallelSARDecoder

class mmocr.models.textrecog.ParallelSARDecoder(dictionary, module_loss=None, postprocessor=None, enc_bi_rnn=False, dec_bi_rnn=False, dec_rnn_dropout=0.0, dec_gru=False, d_model=512, d_enc=512, d_k=64, pred_dropout=0.0, max_seq_len=30, mask=True, pred_concat=False, init_cfg=None, **kwargs)[source]

Implementation of the Parallel Decoder module in SAR (https://arxiv.org/abs/1811.00751).

Parameters
  • dictionary (dict or Dictionary) – The config for Dictionary or the instance of Dictionary.

  • module_loss (dict, optional) – Config to build module_loss. Defaults to None.

  • postprocessor (dict, optional) – Config to build postprocessor. Defaults to None.

  • enc_bi_rnn (bool) – If True, use bidirectional RNN in encoder. Defaults to False.

  • dec_bi_rnn (bool) – If True, use bidirectional RNN in decoder. Defaults to False.

  • dec_rnn_dropout (float) – Dropout of RNN layer in decoder. Defaults to 0.0.

  • dec_gru (bool) – If True, use GRU, else LSTM in decoder. Defaults to False.

  • d_model (int) – Dim of channels from backbone \(D_i\). Defaults to 512.

  • d_enc (int) – Dim of encoder RNN layer \(D_m\). Defaults to 512.

  • d_k (int) – Dim of channels of attention module. Defaults to 64.

  • pred_dropout (float) – Dropout probability of prediction layer. Defaults to 0.0.

  • max_seq_len (int) – Maximum sequence length for decoding. Defaults to 30.

  • mask (bool) – If True, mask padding in feature map. Defaults to True.

  • pred_concat (bool) – If True, concat glimpse feature from attention with holistic feature and hidden state. Defaults to False.

  • init_cfg (dict or list[dict], optional) – Initialization configs. Defaults to None.

Return type

None
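
The constructor arguments above map naturally onto a config dictionary. The fragment below is only a sketch: the keys mirror the documented parameters and their defaults, while the `Dictionary` config (its `dict_file` key and the placeholder path) is an assumption for illustration and is not taken from this page.

```python
# Hypothetical decoder config; keys mirror the documented constructor arguments.
decoder_cfg = dict(
    type='ParallelSARDecoder',
    # Assumed Dictionary config; 'path/to/dict.txt' is a placeholder path.
    dictionary=dict(type='Dictionary', dict_file='path/to/dict.txt'),
    enc_bi_rnn=False,      # unidirectional RNN in the encoder
    dec_bi_rnn=False,      # unidirectional RNN in the decoder
    dec_rnn_dropout=0.0,   # dropout of the decoder RNN layer
    dec_gru=False,         # False selects LSTM, True selects GRU
    d_model=512,           # backbone channel dim D_i
    d_enc=512,             # encoder RNN dim D_m
    d_k=64,                # attention channel dim
    pred_dropout=0.0,      # dropout of the prediction layer
    max_seq_len=30,        # maximum decoding length
    mask=True,             # mask padding in the feature map
    pred_concat=False,     # do not concat glimpse with holistic feature and hidden state
)
```

In a typical setup such a dict would be placed under the recognizer's `decoder` key and built through the model registry; that wiring is assumed here rather than documented on this page.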

forward_test(feat, out_enc, data_samples=None)[source]
Parameters
  • feat (Tensor) – Tensor of shape \((N, D_i, H, W)\).

  • out_enc (Tensor) – Encoder output of shape \((N, D_m, H, W)\).

  • data_samples (list[TextRecogDataSample], optional) – Batch of TextRecogDataSample, containing valid_ratio information. Defaults to None.

Returns

Character probabilities of shape \((N, self.max_seq_len, C)\), where \(C\) is num_classes.

Return type

Tensor
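
To illustrate how `valid_ratio` and `mask=True` might interact, the sketch below computes which feature-map columns would count as padding. It assumes that `valid_ratio` is the unpadded fraction of the image width and that columns at or beyond `ceil(W * valid_ratio)` are masked; both the rule and the helper names `valid_width` and `padding_mask` are illustrative assumptions, not the decoder's confirmed internals.

```python
import math


def valid_width(feature_width, valid_ratio):
    """Number of feature-map columns assumed to hold real (unpadded) content."""
    return min(feature_width, math.ceil(feature_width * valid_ratio))


def padding_mask(feature_width, valid_ratio):
    """Per-column flags; True marks a padded column that attention should ignore."""
    w = valid_width(feature_width, valid_ratio)
    return [col >= w for col in range(feature_width)]


# One mask per sample in a batch whose feature maps are all 8 columns wide.
batch_masks = [padding_mask(8, r) for r in (1.0, 0.5)]
```

With a ratio of 1.0 no column is masked, while a ratio of 0.5 masks the right half of the feature map.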

forward_train(feat, out_enc, data_samples)[source]
Parameters
  • feat (Tensor) – Tensor of shape \((N, D_i, H, W)\).

  • out_enc (Tensor) – Encoder output of shape \((N, D_m, H, W)\).

  • data_samples (list[TextRecogDataSample]) – Batch of TextRecogDataSample, containing gt_text and valid_ratio information.

Returns

A raw logit tensor of shape \((N, T, C)\).

Return type

Tensor
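
The two methods differ in what they return: `forward_train` yields raw logits, whereas `forward_test` yields character probabilities. The dependency-free sketch below shows one plausible relationship, assuming the probabilities are a softmax over the class axis \(C\); it uses nested lists in place of tensors, and the helper names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_probabilities(batch_logits):
    """Apply softmax over the class axis of a nested (N, T, C) list."""
    return [[softmax(step) for step in seq] for seq in batch_logits]


def greedy_indices(batch_probs):
    """Pick the most probable class index at every time step."""
    return [[max(range(len(step)), key=step.__getitem__) for step in seq]
            for seq in batch_probs]


# Toy batch with N=1, T=2, C=3.
logits = [[[0.0, 1.0, 0.0], [3.0, 0.0, 0.0]]]
probs = to_probabilities(logits)
indices = greedy_indices(probs)
```

Mapping the resulting indices back to characters would be the job of the configured postprocessor and dictionary.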
