EncoderDecoderRecognizer

class mmocr.models.textrecog.EncoderDecoderRecognizer(preprocessor=None, backbone=None, encoder=None, decoder=None, data_preprocessor=None, init_cfg=None)[source]

Base class for encoder-decoder recognizers.

Parameters
  • preprocessor (dict, optional) – Config dict for preprocessor. Defaults to None.

  • backbone (dict, optional) – Backbone config. Defaults to None.

  • encoder (dict, optional) – Encoder config. If None, the output from backbone will be directly fed into decoder. Defaults to None.

  • decoder (dict, optional) – Decoder config. Defaults to None.

  • data_preprocessor (dict, optional) – Model preprocessing config for processing the input image data. Keys allowed are ``to_rgb`` (bool), ``pad_size_divisor`` (int), ``pad_value`` (int or float), ``mean`` (int or float) and ``std`` (int or float). Preprocessing order: 1. convert to RGB; 2. normalize; 3. pad. Defaults to None.

  • init_cfg (dict or list[dict], optional) – Initialization configs. Defaults to None.

Return type

None
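The constructor arguments above are config dicts, so a recognizer is usually described as a nested dictionary. The following is a minimal sketch, not a working MMOCR config: `MyBackbone` and `MyDecoder` are hypothetical placeholder names, the `type` key inside `data_preprocessor` and its value are assumptions, and the numeric values are illustrative only.

```python
# Hypothetical model config sketch. Only the top-level argument names and the
# data_preprocessor keys come from the parameter list above.
model = dict(
    type='EncoderDecoderRecognizer',
    preprocessor=None,
    backbone=dict(type='MyBackbone'),   # placeholder name, not a real module
    encoder=None,                       # backbone output goes straight to the decoder
    decoder=dict(type='MyDecoder'),     # placeholder name, not a real module
    data_preprocessor=dict(
        type='TextRecogDataPreprocessor',  # assumed registry name
        to_rgb=True,                       # step 1: convert to RGB
        mean=[127.5, 127.5, 127.5],        # step 2: normalize (per-channel lists
        std=[127.5, 127.5, 127.5],         #         are an assumption)
        pad_size_divisor=32,               # step 3: pad
        pad_value=0,
    ),
    init_cfg=None,
)
```

Setting `encoder=None` exercises the documented fallback in which the decoder consumes backbone features directly.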

extract_feat(inputs)[source]

Directly extract features from the backbone.

Parameters

inputs (torch.Tensor) –

Return type

torch.Tensor
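To show where the backbone features fit in, here is a simplified sketch of the data flow implied by the parameter descriptions: backbone first, then the encoder if one is configured, then the decoder. It is not MMOCR's implementation. `forward_sketch` is a hypothetical helper, plain Python callables stand in for the modules, lists stand in for tensors, and the real decoder may take additional arguments.

```python
def forward_sketch(inputs, backbone, decoder, encoder=None):
    """Illustrative data flow only; names and signatures are assumptions."""
    feat = backbone(inputs)        # the role played by extract_feat
    if encoder is not None:
        feat = encoder(feat)       # optional encoding stage
    return decoder(feat)           # with no encoder, the decoder sees backbone output


# Toy stand-ins for the three stages.
toy_backbone = lambda x: [v * 2 for v in x]
toy_encoder = lambda f: [v + 1 for v in f]
toy_decoder = sum

without_encoder = forward_sketch([1, 2, 3], toy_backbone, toy_decoder)
with_encoder = forward_sketch([1, 2, 3], toy_backbone, toy_decoder, toy_encoder)
```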

loss(inputs, data_samples, **kwargs)[source]

Calculate losses from a batch of inputs and data samples.

Parameters
  • inputs (torch.Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.

  • data_samples (list[TextRecogDataSample]) – A list of N datasamples, containing meta information and gold annotations for each of the images.

Returns

A dictionary of loss components.

Return type

dict[str, tensor]
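Because the return value is a dictionary of named components, a training loop typically has to reduce it to a single scalar. The sketch below assumes the common convention of summing every entry whose key contains `loss`. The key names are hypothetical, floats stand in for tensors, and `total_loss` is an illustrative helper, not part of the documented API.

```python
def total_loss(losses):
    """Sum the entries whose key contains 'loss' (assumed convention)."""
    return sum(value for key, value in losses.items() if 'loss' in key)


# Hypothetical output of loss(); real values would be tensors.
losses = {'loss_ce': 0.5, 'loss_aux': 0.25, 'acc': 0.9}
total = total_loss(losses)  # 'acc' is logged but not optimized
```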

predict(inputs, data_samples, **kwargs)[source]

Predict results from a batch of inputs and data samples with post-processing.

Parameters
  • inputs (torch.Tensor) – Image input tensor.

  • data_samples (list[TextRecogDataSample]) – A list of N datasamples, containing meta information and gold annotations for each of the images.

Returns

A list of N datasamples of prediction results. Results are stored in pred_text.

Return type

list[TextRecogDataSample]
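Reading the predictions back out might look like the sketch below. It only mirrors the statement that results are stored in pred_text. The `.item` attribute holding the recognized string is an assumption, `collect_texts` is a hypothetical helper, and `SimpleNamespace` objects stand in for `TextRecogDataSample` instances so the example runs without MMOCR installed.

```python
from types import SimpleNamespace


def collect_texts(data_samples):
    """Gather the predicted strings; the `.item` field name is assumed."""
    return [sample.pred_text.item for sample in data_samples]


# Stand-ins for the list returned by predict().
results = [
    SimpleNamespace(pred_text=SimpleNamespace(item='hello')),
    SimpleNamespace(pred_text=SimpleNamespace(item='world')),
]
texts = collect_texts(results)
```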
