CTCModuleLoss¶
- class mmocr.models.textrecog.CTCModuleLoss(dictionary, letter_case='unchanged', flatten=True, reduction='mean', zero_infinity=False, **kwargs)[source]¶
Implementation of loss module for CTC-loss based text recognition.
- Parameters
dictionary (dict or Dictionary) – The config for Dictionary or the instance of Dictionary.
letter_case (str) – Controls the letter case of gt texts. One of three options:
- unchanged: Do not change gt texts.
- upper: Convert gt texts into uppercase characters.
- lower: Convert gt texts into lowercase characters.
Usually, it only works for English characters. Defaults to ‘unchanged’.
flatten (bool) – If True, use flattened targets; otherwise, use padded targets. Defaults to True.
reduction (str) – Specifies the reduction to apply to the output. It should be one of ‘none’, ‘mean’ or ‘sum’. Defaults to ‘mean’.
zero_infinity (bool) – Whether to zero infinite losses and the associated gradients. Infinite losses mainly occur when the inputs are too short to be aligned to the targets. Defaults to False.
- Return type
None
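The effect of letter_case on ground-truth texts can be illustrated with a small standalone helper. Note that adjust_letter_case is a hypothetical function written for this sketch, not part of mmocr's API; it only mirrors the three options described above:

```python
# Hypothetical helper mirroring the letter_case behaviour described above;
# it is NOT part of mmocr's API.
def adjust_letter_case(gt_text: str, letter_case: str = "unchanged") -> str:
    """Apply the letter_case policy to a single ground-truth text."""
    if letter_case == "upper":
        return gt_text.upper()
    if letter_case == "lower":
        return gt_text.lower()
    if letter_case == "unchanged":
        return gt_text
    raise ValueError(
        f"letter_case must be 'unchanged', 'upper' or 'lower', got {letter_case!r}"
    )
```

As the docstring above notes, upper and lower only have an effect on scripts with case distinctions, such as Latin letters.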
- forward(outputs, data_samples)[source]¶
- Parameters
outputs (Tensor) – A raw logit tensor of shape \((N, T, C)\).
data_samples (list[TextRecogDataSample]) – List of TextRecogDataSample which are processed by get_targets.
- Returns
The loss dict with key loss_ctc.
- Return type
dict
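Under the hood this kind of module builds on torch.nn.CTCLoss. The following standalone sketch (not the actual mmocr implementation) shows how a raw \((N, T, C)\) logit tensor and flattened targets would feed into that loss; the shapes, blank index and target values are assumptions made up for the example:

```python
import torch

# Standalone sketch of the CTC computation; mmocr's actual forward()
# additionally derives targets from data_samples via get_targets().
N, T, C = 2, 30, 37  # batch size, sequence length, num classes incl. blank (assumed)
blank = 0            # assumed blank index for this sketch

ctc_loss = torch.nn.CTCLoss(blank=blank, reduction="mean", zero_infinity=False)

outputs = torch.randn(N, T, C)                           # raw logits, shape (N, T, C)
log_probs = outputs.log_softmax(2).permute(1, 0, 2)      # CTCLoss expects (T, N, C)

# flatten=True style: all samples' target indexes concatenated into one 1-D tensor,
# with per-sample lengths carried separately.
targets = torch.tensor([5, 12, 7, 9, 3], dtype=torch.long)
target_lengths = torch.tensor([3, 2], dtype=torch.long)  # sums to len(targets)
input_lengths = torch.full((N,), T, dtype=torch.long)

loss_ctc = ctc_loss(log_probs, targets, input_lengths, target_lengths)
```

With flatten=False the targets would instead be a padded \((N, L)\) tensor, which torch.nn.CTCLoss also accepts alongside the same target_lengths.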
- get_targets(data_samples)[source]¶
Target generator.
- Parameters
data_samples (list[TextRecogDataSample]) – It usually includes gt_text information.
- Returns
Updated data_samples. The following key is added to each data_sample:
indexes (torch.LongTensor): The index corresponding to the item.
- Return type
list[TextRecogDataSample]
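A simplified sketch of what get_targets does, stripped of mmocr specifics: the real method operates on TextRecogDataSample objects and a Dictionary instance, whereas here plain dicts and a made-up char2idx mapping stand in for both:

```python
# Simplified stand-in for get_targets(); mmocr's version works on
# TextRecogDataSample objects and a Dictionary, not plain dicts.
def get_targets(gt_texts, char2idx):
    """Attach per-sample character indexes, mirroring the `indexes` key."""
    samples = []
    for text in gt_texts:
        # Characters missing from the mapping are skipped in this sketch.
        indexes = [char2idx[ch] for ch in text if ch in char2idx]
        samples.append({"gt_text": text, "indexes": indexes})
    return samples
```

With flatten=True, the per-sample indexes produced here would then be concatenated into a single 1-D target tensor before being handed to the CTC loss.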