Note
You are reading the documentation for MMOCR 0.x, which will be deprecated by the end of 2022. We recommend upgrading to MMOCR 1.0 for the new features and better performance brought by OpenMMLab 2.0. See the maintenance plan, changelog, code and documentation of MMOCR 1.0 for more details.
Key Information Extraction
Overview
The key information extraction dataset directory is organized as follows.
└── wildreceipt
    ├── class_list.txt
    ├── dict.txt
    ├── image_files
    ├── openset_train.txt
    ├── openset_test.txt
    ├── test.txt
    └── train.txt
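As a quick sanity check, the layout above can be verified with a few lines of Python. This is a minimal sketch: the helper name `missing_entries` is hypothetical, and the `data/wildreceipt` location is assumed from the conversion commands on this page.

```python
from pathlib import Path

# Entries taken from the directory tree above.
EXPECTED_ENTRIES = [
    "class_list.txt",
    "dict.txt",
    "image_files",
    "openset_train.txt",
    "openset_test.txt",
    "test.txt",
    "train.txt",
]


def missing_entries(root):
    """Return the expected entries that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    # Assumed location; the conversion commands on this page use data/wildreceipt.
    missing = missing_entries("data/wildreceipt")
    print("missing:", missing if missing else "nothing")
```

Note that `openset_train.txt` and `openset_test.txt` are produced by the OpenSet conversion step, so they are expected to be reported as missing until that step has been run.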
Preparation Steps
WildReceipt
Download and extract wildreceipt.tar.
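If you prefer to script the extraction, Python's standard `tarfile` module is enough. The sketch below makes several assumptions: the archive has already been downloaded to `data/wildreceipt.tar` (a hypothetical location), it unpacks into a top-level `wildreceipt` folder, and `extract_archive` is an illustrative helper rather than part of MMOCR.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a tar archive into dest_dir and return the names now present there."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path) as tar:
        # Only extract archives from a source you trust.
        tar.extractall(dest)
    return sorted(p.name for p in dest.iterdir())


if __name__ == "__main__":
    archive = Path("data/wildreceipt.tar")  # hypothetical download location
    if archive.is_file():
        print(extract_archive(archive, "data"))
    else:
        print(f"{archive} not found; download it first")
```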
WildReceiptOpenset
Step 0: Prepare WildReceipt as described above.
Step 1: Convert the annotation files to the OpenSet format:
# You may find more available arguments by running
# python tools/data/kie/closeset_to_openset.py -h
python tools/data/kie/closeset_to_openset.py data/wildreceipt/train.txt data/wildreceipt/openset_train.txt
python tools/data/kie/closeset_to_openset.py data/wildreceipt/test.txt data/wildreceipt/openset_test.txt
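The two conversion commands above can also be driven from Python, which may be convenient when the step is part of a larger preparation script. This is only a sketch: `build_commands` is a hypothetical helper, it passes nothing beyond the two positional arguments shown above, and it assumes the script is run from the MMOCR repository root.

```python
import subprocess
import sys
from pathlib import Path

# Converter script path, as used in the commands above.
SCRIPT = "tools/data/kie/closeset_to_openset.py"


def build_commands(data_root="data/wildreceipt", splits=("train", "test")):
    """Return one argv list per split, mirroring the commands above."""
    root = Path(data_root)
    return [
        [
            sys.executable,
            SCRIPT,
            (root / f"{split}.txt").as_posix(),
            (root / f"openset_{split}.txt").as_posix(),
        ]
        for split in splits
    ]


if __name__ == "__main__":
    if Path(SCRIPT).is_file():
        for cmd in build_commands():
            # check=True stops at the first failing conversion.
            subprocess.run(cmd, check=True)
    else:
        print(f"{SCRIPT} not found; run this from the MMOCR repository root")
```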
Note
You can learn more about the key differences between CloseSet and OpenSet annotations in our tutorial.