CTCModuleLoss

class mmocr.models.textrecog.CTCModuleLoss(dictionary, letter_case='unchanged', flatten=True, reduction='mean', zero_infinity=False, **kwargs)

Implementation of loss module for CTC-loss based text recognition.

Parameters

dictionary (dict or Dictionary) – The config for Dictionary or the instance of Dictionary.
letter_case (str) – How to alter the letter case of gt texts. One of 'unchanged' (do not change gt texts), 'upper' (convert gt texts to uppercase), or 'lower' (convert gt texts to lowercase). Usually, it only has an effect on English characters. Defaults to 'unchanged'.
flatten (bool) – If True, use flattened targets; otherwise use padded targets. Defaults to True.
reduction (str) – The reduction to apply to the output. One of 'none', 'mean', or 'sum'. Defaults to 'mean'.
zero_infinity (bool) – Whether to zero infinite losses and the associated gradients. Infinite losses mainly occur when the inputs are too short to be aligned to the targets. Defaults to False.

Return type

None
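The difference between flattened and padded targets (the flatten parameter) can be illustrated with a small pure-Python sketch. It is not MMOCR's implementation: the helper names `flatten_targets` and `pad_targets` and the padding index 0 are assumptions made for illustration only.

```python
def flatten_targets(targets):
    """Concatenate per-sample index lists into one flat list plus lengths."""
    flat = [idx for sample in targets for idx in sample]
    lengths = [len(sample) for sample in targets]
    return flat, lengths


def pad_targets(targets, pad_idx):
    """Right-pad every index list to the length of the longest one."""
    max_len = max(len(sample) for sample in targets)
    padded = [sample + [pad_idx] * (max_len - len(sample)) for sample in targets]
    lengths = [len(sample) for sample in targets]
    return padded, lengths


# Two hypothetical samples with 3 and 2 target characters.
targets = [[3, 1, 4], [1, 5]]
flat, lengths = flatten_targets(targets)
padded, _ = pad_targets(targets, pad_idx=0)  # pad_idx=0 is an assumption

print(flat)     # [3, 1, 4, 1, 5]
print(lengths)  # [3, 2]
print(padded)   # [[3, 1, 4], [1, 5, 0]]
```

In both layouts the per-sample lengths are needed so that the loss can tell where each target ends.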

forward(outputs, data_samples)

Parameters

outputs (Tensor) – A raw logit tensor of shape \((N, T, C)\).
data_samples (list[TextRecogDataSample]) – List of TextRecogDataSample which have been processed by get_targets.

Returns

The loss dict with key loss_ctc.

Return type

dict
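To make the role of zero_infinity concrete, the following is a minimal pure-Python sketch of the standard CTC forward recursion for a single sample. It is not the code MMOCR runs; the function names `ctc_nll` and `apply_zero_infinity`, and the choice of index 0 for the blank class, are assumptions for illustration. It shows that the loss becomes infinite when the input has too few time steps to be aligned to the target.

```python
import math

NEG_INF = float("-inf")


def log_add(a, b):
    """Return log(exp(a) + exp(b)), handling -inf operands."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def ctc_nll(log_probs, target, blank=0):
    """Negative log-likelihood of `target` given T rows of C log-probabilities."""
    # Interleave blanks around the labels: [blank, y1, blank, y2, ..., blank].
    ext = [blank]
    for idx in target:
        ext += [idx, blank]
    size = len(ext)

    # At t = 0 a path may start with the leading blank or the first label.
    alpha = [NEG_INF] * size
    alpha[0] = log_probs[0][ext[0]]
    if size > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, len(log_probs)):
        new = [NEG_INF] * size
        for s in range(size):
            total = alpha[s]  # stay on the same symbol
            if s >= 1:
                total = log_add(total, alpha[s - 1])  # advance by one
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total = log_add(total, alpha[s - 2])
            if total != NEG_INF:
                new[s] = total + log_probs[t][ext[s]]
        alpha = new

    # A valid path ends on the last label or the trailing blank.
    log_likelihood = alpha[size - 1]
    if size > 1:
        log_likelihood = log_add(log_likelihood, alpha[size - 2])
    return -log_likelihood


def apply_zero_infinity(loss, zero_infinity):
    """Replace an infinite loss with 0.0 when zero_infinity is True."""
    return 0.0 if zero_infinity and math.isinf(loss) else loss


# Uniform log-probabilities over C = 3 classes for T = 2 time steps.
uniform = [[math.log(1 / 3)] * 3 for _ in range(2)]

loss_ok = ctc_nll(uniform, [1])            # 3 of 9 paths collapse to [1]
loss_short = ctc_nll(uniform[:1], [1, 2])  # T = 1 cannot emit two labels

print(loss_ok)                                  # log(3), about 1.0986
print(loss_short)                               # inf
print(apply_zero_infinity(loss_short, True))    # 0.0
```

With zero_infinity left at False, an infinite per-sample loss like `loss_short` would propagate into the reduced loss.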

get_targets(data_samples)

Target generator.

Parameters

data_samples (list[TextRecogDataSample]) – It usually includes gt_text information.

Returns

Updated data_samples. The following key is added to each data_sample:

indexes (torch.LongTensor): The index corresponding to the item.

Return type

list[TextRecogDataSample]
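As a rough illustration of target generation, the sketch below applies the letter_case option to plain strings and maps each character to an index. It works on ordinary Python strings rather than TextRecogDataSample objects; the function `build_targets` and the `char2idx` mapping are hypothetical stand-ins, not part of MMOCR.

```python
def build_targets(texts, char2idx, letter_case="unchanged"):
    """Map ground-truth strings to lists of character indexes."""
    targets = []
    for text in texts:
        if letter_case in ("upper", "lower"):
            # Calls str.upper() or str.lower() depending on the option.
            text = getattr(text, letter_case)()
        targets.append([char2idx[ch] for ch in text])
    return targets


# Hypothetical lowercase-only dictionary: 'a' -> 0, 'b' -> 1, ...
char2idx = {ch: i for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz")}

indexes = build_targets(["Cab", "ab"], char2idx, letter_case="lower")
print(indexes)  # [[2, 0, 1], [0, 1]]
```

In this sketch, passing letter_case="unchanged" with the same lowercase-only mapping would raise a KeyError on "C", which is why the letter case should match the characters the dictionary contains.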