BaseRecognizer
- class mmocr.models.textrecog.BaseRecognizer(data_preprocessor=None, init_cfg=None)
Base class for recognizers.
- Parameters
- abstract extract_feat(inputs)
Extract features from images.
- Parameters
inputs (torch.Tensor) –
- Return type
- forward(inputs, data_samples=None, mode='tensor', **kwargs)
The unified entry for a forward process in both training and test.
The method should accept three modes: “tensor”, “predict” and “loss”:
“tensor”: Forward the whole network and return a tensor or a tuple of tensors without any post-processing, same as a common nn.Module.
“predict”: Forward and return the predictions, fully processed into a list of TextRecogDataSample.
“loss”: Forward and return a dict of losses according to the given inputs and data samples.
Note that this method handles neither back propagation nor optimizer updating; both are done in train_step().
- Parameters
inputs (torch.Tensor) – The input tensor with shape (N, C, …) in general.
data_samples (list[TextRecogDataSample], optional) – The annotation data of every sample. Defaults to None.
mode (str) – The kind of value to return. Defaults to ‘tensor’.
- Returns
The return type depends on mode.
If mode="tensor", return a tensor or a tuple of tensors.
If mode="predict", return a list of TextRecogDataSample.
If mode="loss", return a dict of tensors.
- Return type
Union[Dict[str, torch.Tensor], List[mmocr.structures.textrecog_data_sample.TextRecogDataSample], Tuple[torch.Tensor], torch.Tensor]
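The three-mode dispatch described for forward() can be sketched roughly as follows. This is a dependency-free illustration, not MMOCR's implementation: ToyRecognizer, EchoRecognizer and the _forward helper are hypothetical names, plain Python lists stand in for torch.Tensor, and dicts stand in for TextRecogDataSample.

```python
from abc import ABC, abstractmethod


class ToyRecognizer(ABC):
    """Hypothetical stand-in for a recognizer base class (not MMOCR code)."""

    @abstractmethod
    def extract_feat(self, inputs): ...

    @abstractmethod
    def loss(self, inputs, data_samples, **kwargs): ...

    @abstractmethod
    def predict(self, inputs, data_samples, **kwargs): ...

    def _forward(self, inputs, data_samples=None, **kwargs):
        # "tensor" mode: raw network output, no post-processing.
        # (_forward is an assumed helper name for this sketch.)
        return self.extract_feat(inputs)

    def forward(self, inputs, data_samples=None, mode="tensor", **kwargs):
        # Route the call according to the requested mode.
        if mode == "loss":
            return self.loss(inputs, data_samples, **kwargs)
        if mode == "predict":
            return self.predict(inputs, data_samples, **kwargs)
        if mode == "tensor":
            return self._forward(inputs, data_samples, **kwargs)
        raise RuntimeError(f"Invalid mode {mode!r}")


class EchoRecognizer(ToyRecognizer):
    """Toy subclass: doubles each input value and calls that a feature."""

    def extract_feat(self, inputs):
        return [float(x) * 2 for x in inputs]

    def loss(self, inputs, data_samples, **kwargs):
        feats = self.extract_feat(inputs)
        # Toy loss: mean absolute difference between feature and target.
        diffs = [abs(f - s["target"]) for f, s in zip(feats, data_samples)]
        return {"loss_toy": sum(diffs) / len(diffs)}

    def predict(self, inputs, data_samples, **kwargs):
        feats = self.extract_feat(inputs)
        # Attach a prediction to every sample and return the samples.
        for f, s in zip(feats, data_samples):
            s["pred"] = f
        return data_samples


model = EchoRecognizer()
samples = [{"target": 2.0}, {"target": 5.0}]
raw = model.forward([1, 2])                           # a list ("tensor")
losses = model.forward([1, 2], samples, mode="loss")  # a dict of losses
preds = model.forward([1, 2], samples, mode="predict")  # a list of samples
```

In this sketch forward() only computes and returns values; nothing in it updates parameters, which mirrors the note that back propagation and optimizer updating belong to train_step().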
- abstract loss(inputs, data_samples, **kwargs)
Calculate losses from a batch of inputs and data samples.
- Parameters
inputs (torch.Tensor) –
data_samples (List[mmocr.structures.textrecog_data_sample.TextRecogDataSample]) –
- Return type
- abstract predict(inputs, data_samples, **kwargs)
Predict results from a batch of inputs and data samples with post-processing.
- Parameters
inputs (torch.Tensor) –
data_samples (List[mmocr.structures.textrecog_data_sample.TextRecogDataSample]) –
- Return type
List[mmocr.structures.textrecog_data_sample.TextRecogDataSample]
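Because extract_feat(), loss() and predict() are all marked abstract, a concrete recognizer is expected to implement every one of them. The sketch below illustrates that contract with Python's standard abc module only; ToyBase, Incomplete, Complete and can_instantiate are hypothetical names, and the assumption that the real base class enforces abstractness the same way is not confirmed by this page.

```python
from abc import ABC, abstractmethod


class ToyBase(ABC):
    """Hypothetical base with the same three abstract methods."""

    @abstractmethod
    def extract_feat(self, inputs): ...

    @abstractmethod
    def loss(self, inputs, data_samples, **kwargs): ...

    @abstractmethod
    def predict(self, inputs, data_samples, **kwargs): ...


class Incomplete(ToyBase):
    """Implements two of the three methods; predict() is left out."""

    def extract_feat(self, inputs):
        return inputs

    def loss(self, inputs, data_samples, **kwargs):
        return {"loss_toy": 0.0}


class Complete(Incomplete):
    """Adds the missing predict(), so the class becomes instantiable."""

    def predict(self, inputs, data_samples, **kwargs):
        return data_samples


def can_instantiate(cls):
    # abc raises TypeError when any abstract method is still unimplemented.
    try:
        cls()
    except TypeError:
        return False
    return True
```

Calling can_instantiate(Incomplete) reports failure while can_instantiate(Complete) succeeds, so a missing method surfaces at construction time rather than at the first forward() call.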