BaseDecoder
- class mmocr.models.textrecog.BaseDecoder(dictionary, module_loss=None, postprocessor=None, max_seq_len=40, init_cfg=None)
Base decoder for text recognition. It builds the loss module and the postprocessor.
- Parameters
dictionary (dict or Dictionary) – The config for Dictionary or the instance of Dictionary.
module_loss (dict, optional) – Config to build the loss module. Defaults to None.
postprocessor (dict, optional) – Config to build the postprocessor. Defaults to None.
max_seq_len (int) – Maximum sequence length. The sequence is usually generated by the decoder. Defaults to 40.
init_cfg (dict or list[dict], optional) – Initialization configs. Defaults to None.
- Return type
None
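A minimal sketch of how the constructor arguments above might appear in a config. The decoder type `MyDecoder` and the `dict_file` path are hypothetical placeholders, and the loss and postprocessor type names are assumptions about which registered modules are used; only the argument names and the `max_seq_len` default come from the signature above.

```python
# Hypothetical config for a decoder derived from BaseDecoder.
# Only the keyword names and the max_seq_len default are taken from the
# documented signature; every value marked below is an assumption.
decoder_cfg = dict(
    type='MyDecoder',  # hypothetical subclass name
    dictionary=dict(
        type='Dictionary',
        dict_file='path/to/dict.txt'),  # placeholder path
    module_loss=dict(type='CEModuleLoss'),  # assumed loss module name
    postprocessor=dict(type='AttentionPostprocessor'),  # assumed name
    max_seq_len=40,  # documented default
    init_cfg=None)
```

Leaving `module_loss` or `postprocessor` as None matches the documented defaults.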
- forward(feat=None, out_enc=None, data_samples=None)
Decoder forward.
- Parameters
feat (Tensor, optional) – Features from the backbone. Defaults to None.
out_enc (Tensor, optional) – Features from the encoder. Defaults to None.
data_samples (list[TextRecogDataSample], optional) – A list of N datasamples, containing meta information and gold annotations for each of the images. Defaults to None.
- Returns
Features from decoder forward.
- Return type
Tensor
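The relationship between forward, forward_train and forward_test can be sketched as a simple mode switch. This is a simplified stand-in, not the library's code: it assumes forward routes to forward_train in training mode and to forward_test otherwise, as the method names suggest. `ToyDecoder` and its string return values are hypothetical, and the example uses neither mmocr nor torch.

```python
class ToyDecoder:
    """Stand-in that mimics the assumed train/test dispatch; not mmocr code."""

    def __init__(self):
        # Mirrors the `training` flag that torch.nn.Module exposes.
        self.training = True

    def forward_train(self, feat=None, out_enc=None, data_samples=None):
        return "train-path"

    def forward_test(self, feat=None, out_enc=None, data_samples=None):
        return "test-path"

    def forward(self, feat=None, out_enc=None, data_samples=None):
        # Assumption: training mode selects forward_train,
        # any other mode selects forward_test.
        if self.training:
            return self.forward_train(feat, out_enc, data_samples)
        return self.forward_test(feat, out_enc, data_samples)


decoder = ToyDecoder()
train_out = decoder.forward()   # routed to forward_train
decoder.training = False
test_out = decoder.forward()    # routed to forward_test
```

Under this assumption, a subclass only needs to implement forward_train and forward_test.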
- forward_test(feat=None, out_enc=None, data_samples=None)
Forward for testing.
- Parameters
feat (torch.Tensor, optional) – The feature map from backbone of shape \((N, E, H, W)\). Defaults to None.
out_enc (torch.Tensor, optional) – Encoder output. Defaults to None.
data_samples (Sequence[TextRecogDataSample]) – Batch of TextRecogDataSample, containing gt_text information. Defaults to None.
- Return type
- forward_train(feat=None, out_enc=None, data_samples=None)
Forward for training.
- Parameters
feat (torch.Tensor, optional) – The feature map from backbone of shape \((N, E, H, W)\). Defaults to None.
out_enc (torch.Tensor, optional) – Encoder output. Defaults to None.
data_samples (Sequence[TextRecogDataSample]) – Batch of TextRecogDataSample, containing gt_text information. Defaults to None.
- Return type
- loss(feat=None, out_enc=None, data_samples=None)
Calculate losses from a batch of inputs and data samples.
- Parameters
feat (Tensor, optional) – Features from the backbone. Defaults to None.
out_enc (Tensor, optional) – Features from the encoder. Defaults to None.
data_samples (list[TextRecogDataSample], optional) – A list of N datasamples, containing meta information and gold annotations for each of the images. Defaults to None.
- Returns
A dictionary of loss components.
- Return type
dict
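A rough sketch of the flow that loss describes, assuming it runs the decoder forward pass and hands the output to the loss module built from `module_loss`. Everything here is a hypothetical stand-in: plain dicts replace TextRecogDataSample, and `toy_forward`, `toy_module_loss` and the `loss_toy` key are illustrative names only.

```python
def toy_forward(feat=None, out_enc=None, data_samples=None):
    """Hypothetical decoder forward: returns one fixed string per sample."""
    return ["hello" for _ in data_samples]


def toy_module_loss(out_dec, data_samples):
    """Hypothetical loss: fraction of characters that differ from gt_text."""
    total = sum(len(s["gt_text"]) for s in data_samples)
    wrong = sum(
        p != g
        for pred, s in zip(out_dec, data_samples)
        for p, g in zip(pred, s["gt_text"]))
    return {"loss_toy": wrong / total}


def loss(feat=None, out_enc=None, data_samples=None):
    # Assumed flow: run the decoder, then pass its output and the
    # data samples to the loss module, which returns a dict.
    out_dec = toy_forward(feat, out_enc, data_samples)
    return toy_module_loss(out_dec, data_samples)


# Plain dicts stand in for TextRecogDataSample in this sketch.
samples = [{"gt_text": "hello"}, {"gt_text": "help!"}]
losses = loss(data_samples=samples)
```

The returned dictionary can hold several named components; this sketch has only one.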
- predict(feat=None, out_enc=None, data_samples=None)
Perform forward propagation of the decoder and postprocessor.
- Parameters
feat (Tensor, optional) – Features from the backbone. Defaults to None.
out_enc (Tensor, optional) – Features from the encoder. Defaults to None.
data_samples (list[TextRecogDataSample]) – A list of N datasamples, containing meta information and gold annotations for each of the images. Defaults to None.
- Returns
A list of N datasamples of prediction results. Results are stored in pred_text.
- Return type
list[TextRecogDataSample]
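A rough sketch of the predict flow, assuming it runs the decoder forward pass and lets the postprocessor write the decoded text into each data sample. The helpers are hypothetical: plain dicts replace TextRecogDataSample, and `toy_forward` and `toy_postprocessor` are illustrative names that use neither mmocr nor torch.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def toy_forward(feat=None, out_enc=None, data_samples=None):
    """Hypothetical decoder output: one list of character indices per sample."""
    return [[7, 8], [2, 0, 19]][:len(data_samples)]


def toy_postprocessor(out_dec, data_samples):
    """Hypothetical postprocessor: map indices to text, store it in pred_text."""
    for indices, sample in zip(out_dec, data_samples):
        sample["pred_text"] = "".join(ALPHABET[i] for i in indices)
    return data_samples


def predict(feat=None, out_enc=None, data_samples=None):
    # Assumed flow: decoder forward first, then the postprocessor
    # converts raw outputs into text stored on each sample.
    out_dec = toy_forward(feat, out_enc, data_samples)
    return toy_postprocessor(out_dec, data_samples)


# Plain dicts stand in for TextRecogDataSample in this sketch.
results = predict(data_samples=[{}, {}])
texts = [r["pred_text"] for r in results]
```

The same N samples that go in come back out, each carrying its prediction.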