Use SavedModel instead of HDF5 format, fix dewarping #89
- move model loading into `setup` in constructor context
- allow directories as models (TF SavedModel format), too
- use correct pageId
- simplify and polish
use custom dataset class for in-memory PIL.Image passing instead of file-based repurposed `AlignedDataset` (since this is faster and more reliable: OCR-D does not guarantee us a `.filename` for derived images; also, this no longer creates temporary files in the input fileGrp)
after decoding, convert tensor to array with due respect for proper channel and dynamic range coding (instead of ad-hoc conversion); then resize while still in RGB and re-binarize (instead of ad-hoc binarization followed by resizing in binary)
- rebase on pix2pixHD#293 (CPU-only option, Torch>=1.0, less verbose, arg passing)
- pass args to pix2pixHD directly (instead of `sys.argv` hijacking)
- no unnecessary verbosity (and only through loggers)
- move model loading into startup context via `setup` fn
- rename params:
  * `imgresize` → `resize_mode`
  * `resizeHeight` → `resize_height`
  * `resizeWidth` → `resize_width`
- add proper documentation
- fix region-level results
(just BIN is not enough / not as good / not realistic)
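The commit note above about passing arguments directly, rather than hijacking the process-wide argument list, can be illustrated with plain `argparse`. This is only a minimal sketch: `build_parser`, `parse_options`, and the `--model_dir` / `--device` options are hypothetical stand-ins, not the actual pix2pixHD options or the code in this PR.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the option parser of a wrapped library;
    # the option names below are illustrative assumptions only.
    parser = argparse.ArgumentParser(prog="wrapped-tool")
    parser.add_argument("--model_dir", default="model")
    parser.add_argument("--device", default="gpu")
    return parser


def parse_options(args):
    """Parse an explicit argument list.

    Because the list is handed to parse_args() directly, sys.argv is
    neither read nor modified, so the caller's own command line stays intact.
    """
    return build_parser().parse_args(args)


opt = parse_options(["--model_dir", "/path/to/model", "--device", "cpu"])
print(opt.device)  # prints "cpu"
```

The point of this design is that a processor can configure the wrapped library per call without side effects on global state, which matters when several processors share one interpreter.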
Now also depends on NVIDIA/pix2pixHD#293, and contains various other fixes, mostly regarding dewarping. Fixes #34, #35, #40, #60, #61, #72, #73, #77, #87, #88, and probably #42 (see below – with
With better upsampling/re-binarization, the quality of the dewarper has also improved a little. It is obviously not a good idea to downsample in the first place (which is the case with the default
Here are some examples based on the
dewarped with default settings:
dewarped with default settings but on GPU:
dewarped with larger size (less resampling/interpolation):
dewarped with original/full image size:
dewarped on cropped but raw RGB (just to show that the models have not been trained on such data):
Like I said, we still need to upload the new models, and update the resource URLs. (This is the reason the CI still fails.)






On Python 3.8, you get errors trying to load the existing HDF5 models for the TensorFlow processors `tiseg` and `layout-analysis`. However, TensorFlow offers a more stable alternative: SavedModel directories. I have converted the existing models and adapted the code to make them runnable again.
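Since a SavedModel is a directory while an HDF5 model is a single file, a processor that accepts both can dispatch on the path type. The following is a rough sketch under that assumption; `guess_model_format` is a hypothetical helper and not the code in this PR, and the actual loading call is deliberately left out.

```python
from pathlib import Path


def guess_model_format(path):
    """Return 'savedmodel' or 'hdf5' for a model path, else raise ValueError.

    Hypothetical helper for illustration only.
    """
    p = Path(path)
    if p.is_dir():
        # A TensorFlow SavedModel directory contains a saved_model.pb
        # (or saved_model.pbtxt) file at its top level.
        if (p / "saved_model.pb").is_file() or (p / "saved_model.pbtxt").is_file():
            return "savedmodel"
        raise ValueError(f"directory {p} does not look like a SavedModel")
    if p.is_file() and p.suffix.lower() in {".h5", ".hdf5"}:
        return "hdf5"
    raise ValueError(f"cannot determine model format of {p}")
```

A caller could use the returned string to pick the matching loader, or to reject unsupported resources early with a clear error message.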
Now, how do we redistribute these? I have uploaded them as tarballs here and here. But really they should go to https://ocr-d-repo.scc.kit.edu/models/dfki as well.
As soon as we get OCR-D/core#800 done, we should then be able to update the resource list in ocrd-tool.json, right?
Another dependency is in the processors using `ocrolib.morph`, i.e. `nlbin` and `textline`: OCR-D/ocropy#2 – @kba, as soon as you have merged and published `ocrd-fork-ocropy==1.4.0a4`, this is ready to go.