BaseDecoder

class mmocr.models.textrecog.BaseDecoder(dictionary, module_loss=None, postprocessor=None, max_seq_len=40, init_cfg=None)

Base decoder for text recognition. It builds the module loss and the postprocessor from their configs.

Parameters

dictionary (dict or Dictionary) – The config for Dictionary or the instance of Dictionary.
module_loss (dict, optional) – Config to build the module loss. Defaults to None.
postprocessor (dict, optional) – Config to build the postprocessor. Defaults to None.
max_seq_len (int) – Maximum sequence length. The sequence is usually generated by the decoder. Defaults to 40.
init_cfg (dict or list[dict], optional) – Initialization configs. Defaults to None.

Return type

None
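As a rough illustration of how the constructor arguments fit together, a decoder config might look like the sketch below. `MyDecoder` is a hypothetical subclass, the dictionary file path is a placeholder, and the `CEModuleLoss` and `AttentionPostprocessor` type names are assumptions about which registered components a project uses; none of them come from this page.

```python
# Hypothetical decoder config; every concrete value is a placeholder for illustration.
dictionary = dict(
    type='Dictionary',
    dict_file='path/to/dict.txt',  # placeholder path, not a real file
)

decoder = dict(
    type='MyDecoder',  # hypothetical subclass of BaseDecoder
    dictionary=dictionary,  # a config dict; a Dictionary instance is also accepted
    module_loss=dict(type='CEModuleLoss'),  # assumed loss type
    postprocessor=dict(type='AttentionPostprocessor'),  # assumed postprocessor type
    max_seq_len=40,  # documented default
    init_cfg=None,  # documented default
)
```

Leaving `module_loss` or `postprocessor` at None is allowed by the signature; presumably the corresponding `loss` or `predict` call then has nothing to delegate to.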

forward(feat=None, out_enc=None, data_samples=None)

Decoder forward.

Parameters

feat (Tensor, optional) – Features from the backbone. Defaults to None.
out_enc (Tensor, optional) – Features from the encoder. Defaults to None.
data_samples (list[TextRecogDataSample], optional) – A list of N datasamples, containing meta information and gold annotations for each of the images. Defaults to None.

Returns

Features from decoder forward.

Return type

Tensor
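
This page does not spell out how forward relates to forward_train and forward_test. A common pattern, assumed here rather than confirmed by the text above, is to dispatch on a training flag. The sketch uses a plain-Python stand-in class (`SketchDecoder`, hypothetical) instead of the real BaseDecoder so that it runs without MMOCR installed.

```python
class SketchDecoder:
    """Stand-in that mimics only the method names documented on this page."""

    def __init__(self, training=True):
        # Assumed mode flag, analogous to the `training` attribute of torch.nn.Module.
        self.training = training

    def forward_train(self, feat=None, out_enc=None, data_samples=None):
        # A real subclass would decode with access to the ground-truth text.
        return ('train', out_enc)

    def forward_test(self, feat=None, out_enc=None, data_samples=None):
        # A real subclass would decode without ground truth.
        return ('test', out_enc)

    def forward(self, feat=None, out_enc=None, data_samples=None):
        # Assumed dispatch: training mode -> forward_train, otherwise forward_test.
        if self.training:
            return self.forward_train(feat, out_enc, data_samples)
        return self.forward_test(feat, out_enc, data_samples)
```

Under this assumption a subclass only needs to implement forward_train and forward_test, and callers always go through forward.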

forward_test(feat=None, out_enc=None, data_samples=None)

Forward for testing.

Parameters

feat (torch.Tensor, optional) – The feature map from backbone of shape \((N, E, H, W)\). Defaults to None.
out_enc (torch.Tensor, optional) – Encoder output. Defaults to None.
data_samples (Sequence[TextRecogDataSample]) – Batch of TextRecogDataSample, containing gt_text information. Defaults to None.

Return type

torch.Tensor

forward_train(feat=None, out_enc=None, data_samples=None)

Forward for training.

Parameters

feat (torch.Tensor, optional) – The feature map from backbone of shape \((N, E, H, W)\). Defaults to None.
out_enc (torch.Tensor, optional) – Encoder output. Defaults to None.
data_samples (Sequence[TextRecogDataSample]) – Batch of TextRecogDataSample, containing gt_text information. Defaults to None.

Return type

torch.Tensor

loss(feat=None, out_enc=None, data_samples=None)

Calculate losses from a batch of inputs and data samples.

Parameters

feat (Tensor, optional) – Features from the backbone. Defaults to None.
out_enc (Tensor, optional) – Features from the encoder. Defaults to None.
data_samples (list[TextRecogDataSample], optional) – A list of N datasamples, containing meta information and gold annotations for each of the images. Defaults to None.

Returns

A dictionary of loss components.

Return type

dict[str, Tensor]
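
One plausible reading of loss is that it runs the decoder forward pass and hands the result to the module loss built from `module_loss`. The sketch below illustrates that composition with toy stand-ins; `sketch_loss`, `toy_forward`, `toy_module_loss`, and the `loss_ce` key are all hypothetical names, and the real method may prepare targets or differ in other details.

```python
def sketch_loss(decoder_forward, module_loss, feat=None, out_enc=None, data_samples=None):
    """Assumed composition: decode first, then score the output against the samples."""
    out_dec = decoder_forward(feat, out_enc, data_samples)
    # The loss module is expected to return a dict of named loss components.
    return module_loss(out_dec, data_samples)


# Toy stand-ins so the sketch runs without MMOCR or PyTorch.
def toy_forward(feat, out_enc, data_samples):
    # Pretend decoder output: one number per sample (here, the text length).
    return [len(sample) for sample in data_samples]


def toy_module_loss(out_dec, data_samples):
    # Pretend loss: sum of the decoder outputs under a single named component.
    return {'loss_ce': float(sum(out_dec))}


result = sketch_loss(toy_forward, toy_module_loss, data_samples=['ab', 'cde'])
```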

predict(feat=None, out_enc=None, data_samples=None)

Perform forward propagation of the decoder and postprocessor.

Parameters

feat (Tensor, optional) – Features from the backbone. Defaults to None.
out_enc (Tensor, optional) – Features from the encoder. Defaults to None.
data_samples (list[TextRecogDataSample]) – A list of N datasamples, containing meta information and gold annotations for each of the images. Defaults to None.

Returns

A list of N datasamples of prediction results. Results are stored in pred_text.

Return type

list[TextRecogDataSample]
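
The description of predict (decoder forward pass followed by the postprocessor, with results written to pred_text) can be sketched as below. `ToySample` is a hypothetical, much simplified stand-in for TextRecogDataSample in which pred_text is a plain string; the real field may be a richer object. The toy decoder output and character set are likewise made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToySample:
    """Hypothetical stand-in for TextRecogDataSample."""
    pred_text: str = ''


def sketch_predict(decoder_forward, postprocessor, feat=None, out_enc=None, data_samples=None):
    """Assumed composition: decode, then let the postprocessor fill in pred_text."""
    out_dec = decoder_forward(feat, out_enc, data_samples)
    return postprocessor(out_dec, data_samples)


def toy_forward(feat, out_enc, data_samples):
    # Pretend the decoder emits one list of character indexes per sample.
    return [[0, 1], [2]]


def toy_postprocessor(out_dec, data_samples, charset='abc'):
    # Map indexes back to characters and store the string on each sample.
    for indexes, sample in zip(out_dec, data_samples):
        sample.pred_text = ''.join(charset[i] for i in indexes)
    return data_samples


samples = sketch_predict(toy_forward, toy_postprocessor,
                         data_samples=[ToySample(), ToySample()])
```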