BaseTextRecogModuleLoss¶
- class mmocr.models.textrecog.BaseTextRecogModuleLoss(dictionary, max_seq_len=40, letter_case='unchanged', pad_with='auto', **kwargs)[source]¶
Base recognition loss.
- Parameters
  - dictionary (dict or Dictionary) – The config for Dictionary or the instance of Dictionary.
  - max_seq_len (int) – Maximum sequence length. The sequence is usually generated from the decoder. Defaults to 40.
  - letter_case (str) – There are three options to alter the letter cases of gt texts. Usually, it only works for English characters. Defaults to ‘unchanged’. Options are:
    - ‘unchanged’: Do not change gt texts.
    - ‘upper’: Convert gt texts into uppercase characters.
    - ‘lower’: Convert gt texts into lowercase characters.
  - pad_with (str) – The padding strategy for gt_text.padded_indexes. Defaults to ‘auto’. Options are:
    - ‘auto’: Use dictionary.padding_idx to pad gt texts, or dictionary.end_idx if dictionary.padding_idx is None.
    - ‘padding’: Always use dictionary.padding_idx to pad gt texts.
    - ‘end’: Always use dictionary.end_idx to pad gt texts.
    - ‘none’: Do not pad gt texts.
- Return type
  None
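The sketch below is illustrative and not part of the reference: it shows one way the parameters above might be passed when building the loss from a Dictionary config. The dict_file path is a placeholder, and the argument values are examples only.

```python
# Illustrative sketch: constructing the loss from a Dictionary config.
from mmocr.models.textrecog import BaseTextRecogModuleLoss

dictionary = dict(
    type='Dictionary',
    dict_file='dicts/lower_english_digits.txt',  # placeholder path
    with_start=True,
    with_end=True,
    with_padding=True,
    with_unknown=True)

loss = BaseTextRecogModuleLoss(
    dictionary=dictionary,   # a dict config or a Dictionary instance
    max_seq_len=40,          # targets are padded up to this length
    letter_case='lower',     # convert gt texts to lowercase
    pad_with='auto')         # padding_idx if set, otherwise end_idx
```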
- get_targets(data_samples)[source]¶
Target generator.
- Parameters
  data_samples (list[TextRecogDataSample]) – It usually includes gt_text information.
- Returns
  Updated data_samples. Two keys will be added to each data_sample:
  - indexes (torch.LongTensor): Character indexes representing gt texts. All special tokens are excluded, except for UKN.
  - padded_indexes (torch.LongTensor): Character indexes representing gt texts with BOS and EOS if applicable, followed by several padding indexes until the length reaches max_seq_len. In particular, if pad_with='none', no padding will be applied.
- Return type
  list[TextRecogDataSample]
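A minimal usage sketch for get_targets, assuming the loss instance built in the constructor example above; the sample text ‘hello’ and the attribute access pattern are illustrative rather than taken from this page.

```python
# Illustrative sketch: generating targets for one ground-truth text.
# Assumes `loss` was built as in the constructor example above.
from mmengine.structures import LabelData
from mmocr.structures import TextRecogDataSample

data_sample = TextRecogDataSample()
data_sample.gt_text = LabelData(item='hello')  # illustrative gt text

(updated,) = loss.get_targets([data_sample])

# Raw character indexes, special tokens excluded (except UKN).
print(updated.gt_text.indexes)
# Indexes with BOS/EOS if applicable, padded up to max_seq_len.
print(updated.gt_text.padded_indexes)
```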