BaseTextRecogModuleLoss¶
- class mmocr.models.textrecog.BaseTextRecogModuleLoss(dictionary, max_seq_len=40, letter_case='unchanged', pad_with='auto', **kwargs)[source]¶
Base recognition loss.
- Parameters
  - dictionary (dict or Dictionary) – The config for Dictionary or the instance of Dictionary.
  - max_seq_len (int) – Maximum sequence length. The sequence is usually generated from the decoder. Defaults to 40.
  - letter_case (str) – There are three options to alter the letter cases of gt texts. Usually, it only works for English characters. Defaults to ‘unchanged’. Options are:
    - ‘unchanged’: Do not change gt texts.
    - ‘upper’: Convert gt texts into uppercase characters.
    - ‘lower’: Convert gt texts into lowercase characters.
  - pad_with (str) – The padding strategy for gt_text.padded_indexes. Defaults to ‘auto’. Options are:
    - ‘auto’: Use dictionary.padding_idx to pad gt texts, or dictionary.end_idx if dictionary.padding_idx is None.
    - ‘padding’: Always use dictionary.padding_idx to pad gt texts.
    - ‘end’: Always use dictionary.end_idx to pad gt texts.
    - ‘none’: Do not pad gt texts.
- Return type
  None
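The sketch below is illustrative and not part of the reference: it shows one way the parameters above might be passed when building the loss from a Dictionary config. The dict_file path is a placeholder, and the argument values are examples only.

```python
# Illustrative sketch: constructing the loss from a Dictionary config.
from mmocr.models.textrecog import BaseTextRecogModuleLoss

dictionary = dict(
    type='Dictionary',
    dict_file='dicts/lower_english_digits.txt',  # placeholder path
    with_start=True,
    with_end=True,
    with_padding=True,
    with_unknown=True)

loss = BaseTextRecogModuleLoss(
    dictionary=dictionary,   # a dict config or a Dictionary instance
    max_seq_len=40,          # targets are padded up to this length
    letter_case='lower',     # convert gt texts to lowercase
    pad_with='auto')         # padding_idx if set, otherwise end_idx
```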
- get_targets(data_samples)[source]¶
Target generator.
- Parameters
  data_samples (list[TextRecogDataSample]) – It usually includes gt_text information.
- Returns
  Updated data_samples. Two keys will be added to each data_sample:
  - indexes (torch.LongTensor): Character indexes representing gt texts. All special tokens are excluded, except for UKN.
  - padded_indexes (torch.LongTensor): Character indexes representing gt texts with BOS and EOS if applicable, followed by several padding indexes until the length reaches max_seq_len. In particular, if pad_with='none', no padding will be applied.
- Return type
  list[TextRecogDataSample]
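A minimal usage sketch for get_targets, assuming the loss instance built in the constructor example above; the sample text ‘hello’ and the attribute access pattern are illustrative rather than taken from this page.

```python
# Illustrative sketch: generating targets for one ground-truth text.
# Assumes `loss` was built as in the constructor example above.
from mmengine.structures import LabelData
from mmocr.structures import TextRecogDataSample

data_sample = TextRecogDataSample()
data_sample.gt_text = LabelData(item='hello')  # illustrative gt text

(updated,) = loss.get_targets([data_sample])

# Raw character indexes, special tokens excluded (except UKN).
print(updated.gt_text.indexes)
# Indexes with BOS/EOS if applicable, padded up to max_seq_len.
print(updated.gt_text.padded_indexes)
```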