BaseTextDetHead

class mmocr.models.textdet.BaseTextDetHead(module_loss=None, postprocessor=None, init_cfg=None)

Base head for text detection. It builds the loss module and the postprocessor.

1. The init_weights method initializes the head’s model parameters. Once the detector has been constructed, init_weights is triggered when detector.init_weights() is called externally.

2. The loss method calculates the loss of the head in two steps: (1) the head performs forward propagation to obtain the feature maps; (2) the module_loss method is called on the feature maps to calculate the loss.

loss(): forward() -> module_loss()

3. The predict method predicts detection results in two steps: (1) the head performs forward propagation to obtain the feature maps; (2) the postprocessor is called on the feature maps to produce the post-processed detection results.

predict(): forward() -> postprocessor()

4. The loss_and_predict method returns the loss and the detection results at the same time. It calls the head’s forward, module_loss and postprocessor methods in that order.

loss_and_predict(): forward() -> module_loss() -> postprocessor()
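The three call chains above can be sketched with a toy class. This is not MMOCR code: `SketchHead`, its `calls` list, and the lambda stand-ins for the loss module and postprocessor are hypothetical names used only to make the ordering visible, and plain Python values stand in for tensors and data samples.

```python
class SketchHead:
    """Toy stand-in that mirrors the call order described above.

    Hypothetical illustration only; a real head builds module_loss and
    postprocessor from configs and operates on tensors.
    """

    def __init__(self, module_loss, postprocessor):
        self.module_loss = module_loss
        self.postprocessor = postprocessor
        self.calls = []  # records the order in which the steps run

    def forward(self, x, data_samples=None):
        self.calls.append("forward")
        return x  # a real head would return feature maps here

    def loss(self, x, data_samples):
        # loss(): forward() -> module_loss()
        outs = self.forward(x, data_samples)
        self.calls.append("module_loss")
        return self.module_loss(outs, data_samples)

    def predict(self, x, data_samples):
        # predict(): forward() -> postprocessor()
        outs = self.forward(x, data_samples)
        self.calls.append("postprocessor")
        return self.postprocessor(outs, data_samples)

    def loss_and_predict(self, x, data_samples):
        # loss_and_predict(): forward() -> module_loss() -> postprocessor()
        outs = self.forward(x, data_samples)
        self.calls.append("module_loss")
        losses = self.module_loss(outs, data_samples)
        self.calls.append("postprocessor")
        predictions = self.postprocessor(outs, data_samples)
        return losses, predictions


head = SketchHead(
    module_loss=lambda outs, samples: {"loss_dummy": 0.0},
    postprocessor=lambda outs, samples: samples,
)
losses, predictions = head.loss_and_predict(x=[1.0], data_samples=["sample"])
print(head.calls)
```

Note that in this sketch forward runs once in loss_and_predict and its output is shared by both later steps, which is the point of offering a combined method.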

Parameters

module_loss (dict, optional) – Config to build the loss module. Defaults to None.
postprocessor (dict, optional) – Config to build postprocessor. Defaults to None.
init_cfg (dict or list[dict], optional) – Initialization configs. Defaults to None.

Return type

None
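As an illustration of how the constructor arguments are typically supplied, the following is a sketch of a config fragment for a head derived from this base class. The `type` values and the extra `in_channels` and `text_repr_type` keys are assumptions modelled on a DBNet-style setup; they are not stated on this page, so check them against the components registered in your installation.

```python
# Hypothetical config fragment; every `type` name below is an assumption.
det_head = dict(
    type='DBHead',                          # a subclass of BaseTextDetHead (assumed)
    in_channels=256,                        # subclass-specific argument (assumed)
    module_loss=dict(type='DBModuleLoss'),  # config used to build the loss module
    postprocessor=dict(                     # config used to build the postprocessor
        type='DBPostprocessor',
        text_repr_type='quad',
    ),
)
```

Leaving `module_loss` or `postprocessor` as None matches the documented defaults.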

loss(x, data_samples)

Perform forward propagation and loss calculation of the detection head on the features of the upstream network.

Parameters

x (tuple[Tensor]) – Features from the upstream network, each a 4D tensor.
data_samples (List[DetDataSample]) – The data samples. They usually include information such as gt_instance, gt_panoptic_seg and gt_sem_seg.

Returns

A dictionary of loss components.

Return type

dict
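To show how a dictionary of loss components is usually consumed, here is a minimal sketch that reduces it to one scalar. It assumes the convention, common in OpenMMLab-style training loops, of summing only the entries whose key contains "loss"; that convention, the helper name `total_loss`, and the example keys are assumptions rather than part of this API, and floats stand in for tensors.

```python
def total_loss(losses):
    """Sum every entry whose key contains 'loss' (assumed convention)."""
    return sum(value for key, value in losses.items() if "loss" in key)


# Illustrative component names; a real head defines its own keys.
losses = {"loss_prob": 0.5, "loss_thr": 0.25, "acc": 0.9}
total = total_loss(losses)  # "acc" is logged but not optimised
print(total)
```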

loss_and_predict(x, data_samples)

Perform forward propagation of the head, then calculate loss and predictions from the features and data samples.

Parameters

x (tuple[Tensor]) – Features from FPN.
data_samples (list[DetDataSample]) – Each item contains the meta information of each image and corresponding annotations.

Returns

A tuple containing:

losses (dict[str, Tensor]): A dictionary of loss components.

predictions (list[InstanceData]): Detection results of each image after post-processing.

Return type

tuple

predict(x, data_samples)

Perform forward propagation of the detection head and predict detection results on the features of the upstream network.

Parameters

x (tuple[Tensor]) – Multi-level features from the upstream network, each a 4D tensor.
data_samples (List[DetDataSample]) – The data samples. They usually include information such as gt_instance, gt_panoptic_seg and gt_sem_seg.

Returns

Detection results of each image after post-processing.

Return type

SampleList