mmocr.models¶
models.common¶
Dictionary¶
The class generates a dictionary for recognition. |
Losses¶
This loss combines a Sigmoid layer and a masked balanced BCE loss in a single class. |
|
Masked dice loss. |
|
Masked Smooth L1 loss. |
|
Masked square dice loss. |
|
This loss combines a Sigmoid layer and a masked BCE loss in a single class. |
|
Smooth L1 loss. |
|
Cross entropy loss. |
|
Masked Balanced BCE loss. |
|
Masked BCE loss. |
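To illustrate what "masked" means for the losses listed above, here is a minimal pure-Python sketch of a masked dice loss over flattened score lists. The function name `masked_dice_loss`, its arguments, and the `eps` smoothing term are assumptions made for this example; this is one common formulation, not the MMOCR implementation, which operates on tensors and may differ in detail.

```python
def masked_dice_loss(pred, gt, mask, eps=1e-6):
    """Return 1 - Dice coefficient, counting only positions where mask is 1.

    pred, gt and mask are equal-length sequences of floats in [0, 1].
    This is a hypothetical helper for illustration only.
    """
    if not (len(pred) == len(gt) == len(mask)):
        raise ValueError("pred, gt and mask must have the same length")

    # Zero out ignored positions so they contribute to neither term.
    p = [x * m for x, m in zip(pred, mask)]
    g = [y * m for y, m in zip(gt, mask)]

    intersection = sum(a * b for a, b in zip(p, g))
    total = sum(p) + sum(g)

    # eps keeps the ratio defined when both pred and gt are empty.
    dice = (2.0 * intersection + eps) / (total + eps)
    return 1.0 - dice


# The third position is a miss, but the mask excludes it from the loss.
loss_masked = masked_dice_loss([1.0, 0.0, 0.0], [1.0, 0.0, 1.0], [1, 1, 0])
loss_unmasked = masked_dice_loss([1.0, 0.0, 0.0], [1.0, 0.0, 1.0], [1, 1, 1])
```

With the mask applied, the missed third position is ignored and the loss is close to zero; without it, the same prediction is penalized.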
Layers¶
Transformer Encoder Layer. |
|
Transformer Decoder Layer. |
Modules¶
Scaled Dot-Product Attention Module. |
|
Multi-Head Attention module. |
|
Two-layer feed-forward module. |
|
Fixed positional encoding with sine and cosine functions. |
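As a rough illustration of a fixed positional encoding with sine and cosine functions, the sketch below builds the standard Transformer-style table in pure Python. The function name, its parameters, and the base of 10000 are assumptions taken from the usual formulation, not from the MMOCR module itself.

```python
import math


def sinusoidal_positional_encoding(n_positions, d_model, base=10000.0):
    """Build an n_positions x d_model table of fixed position encodings.

    Even columns hold sin(pos / base^(2i / d_model)); odd columns hold
    the cosine of the same angle, where i = column // 2.
    This is a hypothetical helper for illustration only.
    """
    table = []
    for pos in range(n_positions):
        row = []
        for j in range(d_model):
            # Columns 2i and 2i+1 share one frequency.
            angle = pos / base ** (2 * (j // 2) / d_model)
            row.append(math.sin(angle) if j % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


# Position 0 encodes to alternating sin(0) and cos(0) values.
encoding = sinusoidal_positional_encoding(n_positions=4, d_model=6)
```

Because the table depends only on the position and the column index, it can be computed once and added to the input embeddings without any learned parameters.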
models.textdet¶
Detectors¶
The class for implementing single stage text detector. |
|
The class for implementing DBNet text detector: Real-time Scene Text Detection with Differentiable Binarization. |
|
The class for implementing PANet text detector: |
|
The class for implementing PSENet text detector: Shape Robust Text Detection with Progressive Scale Expansion Network. |
|
The class for implementing TextSnake text detector: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes. |
|
The class for implementing FCENet text detector. FCENet (CVPR 2021): Fourier Contour Embedding for Arbitrary-shaped Text Detection. |
|
The class for implementing DRRG text detector. |
|
A wrapper of MMDet’s model. |
Data Preprocessors¶
Image pre-processor for detection tasks. |
Necks¶
This code is from https://github.com/WenmuZhou/PAN.pytorch. |
|
FPN-like fusion module in Shape Robust Text Detection with Progressive Scale Expansion Network. |
|
FPN-like fusion module in Real-time Scene Text Detection with Differentiable Binarization. |
|
The class for implementing DRRG and TextSnake U-Net-like FPN. |
Heads¶
Base head for text detection, which builds the loss and postprocessor. |
|
The class for PSENet head. |
|
The class for PANet head. |
|
The class for DBNet head. |
|
The class for implementing FCENet head. |
|
The class for TextSnake head: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes. |
|
The class for DRRG head: Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection. |
Module Losses¶
Base class for the module loss of segmentation-based text detection algorithms with some handy utilities. |
|
The class for implementing PANet loss. |
|
The class for implementing PSENet loss. |
|
The class for implementing DBNet loss. |
|
The class for implementing TextSnake loss. |
|
The class for implementing FCENet loss. |
|
The class for implementing DRRG loss. |
Postprocessors¶
Base postprocessor for text detection models. |
|
Decoding predictions of PSENet to instances. |
|
Convert scores to quadrangles via post processing in PANet. |
|
Decoding predictions of DBNet to instances. |
|
Merge text components and construct boundaries of text instances. |
|
Decoding predictions of FCENet to instances. |
|
Decoding predictions of TextSnake to instances. |
models.textrecog¶
Recognizers¶
Data Preprocessors¶
Preprocessors¶
BackBones¶
Encoders¶
Decoders¶
Module Losses¶
Postprocessors¶
Layers¶
models.kie¶
Extractors¶
The implementation of the paper: Spatial Dual-Modality Graph Reasoning for Key Information Extraction. |
Module Losses¶
The implementation of the key information extraction loss proposed in the paper: Spatial Dual-Modality Graph Reasoning for Key Information Extraction. |
Postprocessors¶
Postprocessor for SDMGR. |