ASTERDecoder¶
- class mmocr.models.textrecog.ASTERDecoder(in_channels, emb_dims=512, attn_dims=512, hidden_size=512, dictionary=None, max_seq_len=25, module_loss=None, postprocessor=None, init_cfg={'layer': 'Conv2d', 'type': 'Xavier'})[source]¶
Implement the GRU-based attention decoder used in ASTER.
- Parameters
in_channels (int) – Number of input channels.
emb_dims (int) – Dims of char embedding. Defaults to 512.
attn_dims (int) – Dims of attention. Both hidden states and features will be projected to this dims. Defaults to 512.
hidden_size (int) – Dims of hidden state for GRU. Defaults to 512.
dictionary (dict or Dictionary) – The config for Dictionary or the instance of Dictionary. Defaults to None.
max_seq_len (int) – Maximum output sequence length \(T\). Defaults to 25.
module_loss (dict, optional) – Config to build loss. Defaults to None.
postprocessor (dict, optional) – Config to build postprocessor. Defaults to None.
init_cfg (dict or list[dict], optional) – Initialization configs. Defaults to dict(type='Xavier', layer='Conv2d').
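The parameters above are typically supplied through MMOCR's dict-based config system. A minimal config sketch is shown below; the dictionary file path and the exact Dictionary options are hypothetical and depend on your setup.

```python
# Hypothetical config sketch for building an ASTERDecoder.
# The dict_file path below is a placeholder, not a file shipped here.
decoder_cfg = dict(
    type='ASTERDecoder',
    in_channels=512,   # must match the encoder's output channels
    emb_dims=512,      # character embedding size
    attn_dims=512,     # shared projection size for hidden states and features
    hidden_size=512,   # GRU hidden state size
    max_seq_len=25,    # maximum decoded sequence length T
    dictionary=dict(
        type='Dictionary',
        dict_file='dicts/english_digits_symbols.txt',  # hypothetical path
        with_start=True,
        with_end=True,
        with_padding=True,
        with_unknown=True,
    ),
)
```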
- forward_test(feat=None, out_enc=None, data_samples=None)[source]¶
Decode the sequence at test time, where each step consumes the prediction from the previous step.
- Parameters
feat (Tensor) – Feature from backbone. Unused in this decoder.
out_enc (torch.Tensor, optional) – Encoder output. Defaults to None.
data_samples (list[TextRecogDataSample], optional) – Batch of TextRecogDataSample, containing gt_text information. Defaults to None. Unused in this decoder.
- Returns
The raw logit tensor. Shape \((N, T, C)\) where \(C\) is num_classes.
- Return type
Tensor
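At each decoding step the hidden state attends over the encoder output to form a context vector. The additive (Bahdanau-style) scoring \(e_t = v^\top \tanh(Wh + Uf_t)\) followed by a softmax can be sketched in plain Python as below; this is an illustrative sketch of the attention computation, not the library's implementation, and the matrices `W`, `U`, `v` are stand-ins for the decoder's learned projections.

```python
import math

def additive_attention(hidden, feats, W, U, v):
    """Sketch of additive attention weights (pure Python, no torch).

    hidden: current hidden state as a list of floats.
    feats:  encoder features, one vector per time step.
    W, U:   projection matrices as lists of rows; v: scoring vector.
    Returns softmax-normalized attention weights over the feature steps.
    """
    def matvec(M, x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]

    proj_h = matvec(W, hidden)
    scores = []
    for f in feats:
        proj_f = matvec(U, f)
        # score_t = v . tanh(W h + U f_t)
        scores.append(sum(vi * math.tanh(a + b)
                          for vi, a, b in zip(v, proj_h, proj_f)))
    # Numerically stable softmax over the scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]
```

The resulting weights sum to one and pick out which feature columns the current step reads from.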
- forward_train(feat=None, out_enc=None, data_samples=None)[source]¶
Decode the sequence at training time, feeding the ground-truth characters from gt_text to each step (teacher forcing).
- Parameters
feat (Tensor) – Feature from backbone. Unused in this decoder.
out_enc (torch.Tensor, optional) – Encoder output. Defaults to None.
data_samples (list[TextRecogDataSample], optional) – Batch of TextRecogDataSample, containing gt_text information. Defaults to None.
- Returns
The raw logit tensor. Shape \((N, T, C)\) where \(C\) is num_classes.
- Return type
Tensor
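Under teacher forcing, the input at step \(t\) is the ground-truth character from step \(t-1\) (with a start token at \(t=0\)), so all \(T\) output positions can be supervised regardless of the model's own predictions. A minimal sketch of that input bookkeeping, with a hypothetical start-token index:

```python
def teacher_forcing_inputs(gt_indices, max_seq_len, start_idx=0):
    """Sketch of the decoder-input sequence under teacher forcing.

    At step t the decoder consumes the ground-truth character index from
    step t-1; the (hypothetical) start token start_idx is fed at t=0.
    Real code would also pad with the dictionary's padding index.
    """
    inputs = [start_idx] + list(gt_indices)
    return inputs[:max_seq_len]  # keep at most T input positions
```

For example, ground truth `[3, 7, 2]` yields decoder inputs `[0, 3, 7, 2]`, shifted one step right relative to the targets.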