mmocr.models¶

models.common¶

BackBones¶

UNet

UNet backbone.

Dictionary¶

Dictionary

The class generates a dictionary for recognition.

Losses¶

`MaskedBalancedBCEWithLogitsLoss`	This loss combines a Sigmoid layer and a masked balanced BCE loss in one single class.
`MaskedDiceLoss`	Masked dice loss.
`MaskedSmoothL1Loss`	Masked Smooth L1 loss.
`MaskedSquareDiceLoss`	Masked square dice loss.
`MaskedBCEWithLogitsLoss`	This loss combines a Sigmoid layer and a masked BCE loss in one single class.
`SmoothL1Loss`	Smooth L1 loss.
`CrossEntropyLoss`	Cross entropy loss.
`MaskedBalancedBCELoss`	Masked Balanced BCE loss.
`MaskedBCELoss`	Masked BCE loss.
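
As an illustration of what "masked" means for the losses listed above, the following is a minimal pure-Python sketch of a masked dice loss. It is not MMOCR's implementation: the function name `masked_dice_loss`, its flat-list inputs, and the `eps` default are assumptions made for this example. Positions where the mask is 0 are excluded from both the overlap and the normalization terms.

```python
def masked_dice_loss(pred, gt, mask, eps=1e-6):
    """Dice loss computed only over positions where mask is 1.

    Hypothetical helper for illustration. pred, gt and mask are flat lists
    of equal length; pred holds probabilities in [0, 1], gt and mask hold
    0/1 values.
    """
    if not (len(pred) == len(gt) == len(mask)):
        raise ValueError("pred, gt and mask must have the same length")
    # Zero out ignored positions before measuring overlap.
    p = [pi * mi for pi, mi in zip(pred, mask)]
    g = [gi * mi for gi, mi in zip(gt, mask)]
    intersection = sum(pi * gi for pi, gi in zip(p, g))
    # Dice coefficient: 2 * |P ∩ G| / (|P| + |G|); eps avoids division by zero.
    dice = 2.0 * intersection / (sum(p) + sum(g) + eps)
    return 1.0 - dice


pred = [0.9, 0.8, 0.1, 0.7]
gt = [1.0, 1.0, 0.0, 0.0]
mask = [1.0, 1.0, 1.0, 0.0]  # the last position is ignored
loss = masked_dice_loss(pred, gt, mask)
print(round(loss, 4))
```

Because the last position is masked out, changing `pred[3]` leaves the loss unchanged.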

Layers¶

`TFEncoderLayer`	Transformer Encoder Layer.
`TFDecoderLayer`	Transformer Decoder Layer.

Modules¶

`ScaledDotProductAttention`	Scaled Dot-Product Attention Module.
`MultiHeadAttention`	Multi-Head Attention module.
`PositionwiseFeedForward`	Two-layer feed-forward module.
`PositionalEncoding`	Fixed positional encoding with sine and cosine functions.
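
The two ideas named above, fixed sine/cosine positional encoding and scaled dot-product attention, can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the code of the MMOCR modules: the function names, the list-of-lists inputs, and the base of 10000 (the value used in the original Transformer formulation) are chosen for this example only.

```python
import math


def sinusoidal_encoding(n_position, d_model):
    """Return an n_position x d_model table of fixed sine/cosine encodings.

    Even columns use sine, odd columns use cosine; each sin/cos pair
    shares one frequency.
    """
    table = []
    for pos in range(n_position):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(q, k, v):
    """q: Lq x d, k: Lk x d, v: Lk x dv. Returns (output, weights)."""
    scale = math.sqrt(len(q[0]))
    output, weights = [], []
    for q_row in q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q_row, k_row)) / scale for k_row in k]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value rows.
        output.append(
            [sum(wj * v_row[c] for wj, v_row in zip(w, v)) for c in range(len(v[0]))]
        )
    return output, weights


pe = sinusoidal_encoding(4, 6)
out, attn = scaled_dot_product_attention(
    q=[[1.0, 0.0]],
    k=[[1.0, 0.0], [0.0, 1.0]],
    v=[[10.0], [20.0]],
)
```

In the usage above the single query matches the first key more closely, so `out[0][0]` lands closer to 10 than to 20.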

models.textdet¶

Detectors¶

`SingleStageTextDetector`	The class for implementing single stage text detector.
`DBNet`	The class for implementing DBNet text detector: Real-time Scene Text Detection with Differentiable Binarization.
`PANet`	The class for implementing PANet text detector.
`PSENet`	The class for implementing PSENet text detector: Shape Robust Text Detection with Progressive Scale Expansion Network.
`TextSnake`	The class for implementing TextSnake text detector: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes.
`FCENet`	The class for implementing FCENet text detector: FCENet (CVPR 2021): Fourier Contour Embedding for Arbitrary-shaped Text Detection.
`DRRG`	The class for implementing DRRG text detector.
`MMDetWrapper`	A wrapper of MMDet’s model.

Data Preprocessors¶

TextDetDataPreprocessor

Image pre-processor for detection tasks.

Necks¶

`FPEM_FFM`	This code is from https://github.com/WenmuZhou/PAN.pytorch.
`FPNF`	FPN-like fusion module in Shape Robust Text Detection with Progressive Scale Expansion Network.
`FPNC`	FPN-like fusion module in Real-time Scene Text Detection with Differentiable Binarization.
`FPN_UNet`	The class for implementing DRRG and TextSnake U-Net-like FPN.

Heads¶

`BaseTextDetHead`	Base head for text detection, build the loss and postprocessor.
`PSEHead`	The class for PSENet head.
`PANHead`	The class for PANet head.
`DBHead`	The class for DBNet head.
`FCEHead`	The class for implementing FCENet head.
`TextSnakeHead`	The class for TextSnake head: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes.
`DRRGHead`	The class for DRRG head: Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection.

Module Losses¶

`SegBasedModuleLoss`	Base class for the module loss of segmentation-based text detection algorithms with some handy utilities.
`PANModuleLoss`	The class for implementing PANet loss.
`PSEModuleLoss`	The class for implementing PSENet loss.
`DBModuleLoss`	The class for implementing DBNet loss.
`TextSnakeModuleLoss`	The class for implementing TextSnake loss.
`FCEModuleLoss`	The class for implementing FCENet loss.
`DRRGModuleLoss`	The class for implementing DRRG loss.

Postprocessors¶

`BaseTextDetPostProcessor`	Base postprocessor for text detection models.
`PSEPostprocessor`	Decoding predictions of PSENet to instances.
`PANPostprocessor`	Convert scores to quadrangles via post processing in PANet.
`DBPostprocessor`	Decoding predictions of DBNet to instances.
`DRRGPostprocessor`	Merge text components and construct boundaries of text instances.
`FCEPostprocessor`	Decoding predictions of FCENet to instances.
`TextSnakePostprocessor`	Decoding predictions of TextSnake to instances.

models.textrecog¶

Recognizers¶

`BaseRecognizer`	Base class for recognizer.
`EncoderDecoderRecognizer`	Base class for encode-decode recognizer.
`CRNN`	CTC-loss based recognizer.
`SARNet`	Implementation of SAR.
`NRTR`	Implementation of NRTR.
`RobustScanner`	Implementation of RobustScanner.
`SATRN`	Implementation of SATRN.
`ABINet`	Implementation of Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition.
`MASTER`	Implementation of MASTER.
`ASTER`	Implementation of ASTER: An Attentional Scene Text Recognizer with Flexible Rectification.

Data Preprocessors¶

TextRecogDataPreprocessor

Image pre-processor for recognition tasks.

Preprocessors¶

STN

Implement STN module in ASTER: An Attentional Scene Text Recognizer with Flexible Rectification (https://ieeexplore.ieee.org/abstract/document/8395027/)

BackBones¶

`ResNet31OCR`	Implement ResNet backbone for text recognition, modified from
`MiniVGG`	A mini VGG backbone for text recognition, modified from VGG-VeryDeep.
`NRTRModalityTransform`	Modality transform in NRTR.
`ShallowCNN`	Implement Shallow CNN block for SATRN.
`ResNetABI`	Implement ResNet backbone for text recognition, modified from ResNet.
`ResNet`	Parameter `in_channels`: number of channels of the input image tensor.
`MobileNetV2`	See mmdet.models.backbones.MobileNetV2 for details.

Encoders¶

`SAREncoder`	Implementation of encoder module in SAR.
`NRTREncoder`	Transformer Encoder block with self attention mechanism.
`BaseEncoder`	Base Encoder class for text recognition.
`ChannelReductionEncoder`	Change the channel number with a 1x1 convolutional layer.
`SATRNEncoder`	Implement encoder for SATRN.
`ABIEncoder`	Implement transformer encoder for text recognition, modified from <https://github.com/FangShancheng/ABINet>.
`ASTEREncoder`	Implement BiLSTM encoder module in ASTER: An Attentional Scene Text Recognizer with Flexible Rectification.

Decoders¶

`BaseDecoder`	Base decoder for text recognition, build the loss and postprocessor.
`ABILanguageDecoder`	Transformer-based language model responsible for spell correction. Implementation of language model of ABINet.
`ABIVisionDecoder`	Converts visual features into text characters.
`ABIFuser`	A special decoder responsible for mixing and aligning visual feature and linguistic feature.
`CRNNDecoder`	Decoder for CRNN.
`ParallelSARDecoder`	Implementation of the Parallel Decoder module in SAR.
`SequentialSARDecoder`	Implementation of the Sequential Decoder module in SAR.
`ParallelSARDecoderWithBS`	Parallel Decoder module with beam-search in SAR.
`NRTRDecoder`	Transformer Decoder block with self attention mechanism.
`SequenceAttentionDecoder`	Sequence attention decoder for RobustScanner.
`PositionAttentionDecoder`	Position attention decoder for RobustScanner.
`RobustScannerFuser`	Decoder for RobustScanner.
`MasterDecoder`	Decoder module in MASTER.
`ASTERDecoder`	Implement attention decoder.

Module Losses¶

`BaseTextRecogModuleLoss`	Base recognition loss.
`CEModuleLoss`	Implementation of loss module for encoder-decoder based text recognition method with CrossEntropy loss.
`CTCModuleLoss`	Implementation of loss module for CTC-loss based text recognition.
`ABIModuleLoss`	Implementation of ABINet multiloss that allows mixing different types of losses with weights.

Postprocessors¶

`BaseTextRecogPostprocessor`	Base text recognition postprocessor.
`AttentionPostprocessor`	PostProcessor for seq2seq.
`CTCPostProcessor`	PostProcessor for CTC.
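
For readers unfamiliar with CTC post-processing, the usual greedy decoding rule is: take the best class per frame, collapse consecutive repeats, then drop the blank symbol. The sketch below shows that rule in pure Python. It is an illustrative assumption rather than the behavior of `CTCPostProcessor` itself: the function name `ctc_greedy_decode`, the `charset` list, and the choice of index 0 for the blank are all hypothetical.

```python
def ctc_greedy_decode(frame_scores, charset, blank_idx=0):
    """Greedy CTC decoding.

    frame_scores: list of per-frame score lists (one score per class).
    charset: list mapping class index to character (hypothetical layout,
    with the blank symbol at blank_idx).
    """
    # Best class index for each frame.
    best = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    chars = []
    prev = None
    for idx in best:
        # Emit only when the class changes and is not the blank.
        if idx != prev and idx != blank_idx:
            chars.append(charset[idx])
        prev = idx
    return "".join(chars)


charset = ["<blank>", "a", "b"]
frames = [
    [0.1, 0.8, 0.1],    # a
    [0.2, 0.7, 0.1],    # a (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank separates the two a's
    [0.1, 0.6, 0.3],    # a
    [0.1, 0.2, 0.7],    # b
    [0.0, 0.3, 0.7],    # b (repeat, collapsed)
]
text = ctc_greedy_decode(frames, charset)
print(text)  # prints "aab"
```

The blank frame between the two runs of `a` is what allows a doubled character to survive the collapse step.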

Layers¶

`BidirectionalLSTM`
`Adaptive2DPositionalEncoding`	Implement Adaptive 2D positional encoder for SATRN.
`BasicBlock`
`Bottleneck`
`RobustScannerFusionLayer`
`DotProductAttentionLayer`
`PositionAwareLayer`
`SATRNEncoderLayer`	Implement encoder layer for SATRN.

models.kie¶

Extractors¶

SDMGR

The implementation of the paper: Spatial Dual-Modality Graph Reasoning for Key Information Extraction.

Heads¶

SDMGRHead

SDMGR Head.

Module Losses¶

SDMGRModuleLoss

The implementation of the key information extraction loss proposed in the paper: Spatial Dual-Modality Graph Reasoning for Key Information Extraction.

Postprocessors¶

SDMGRPostProcessor

Postprocessor for SDMGR.