cnocr 2.3.0.3.tar.gz → 2.3.1.tar.gz (PyPI version diff)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (63)

{cnocr-2.3.0.3 → cnocr-2.3.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: cnocr
-Version: 2.3.0.3
+Version: 2.3.1
 Summary: Python3 package for Chinese/English OCR, with small pretrained models
 Home-page: https://github.com/breezedeus/cnocr
 Author: breezedeus
@@ -69,6 +69,16 @@ License-File: LICENSE
 ---
 </div>
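+### [Update 2024.11.28]: Release V2.3.1
+Main changes:
+* Integrated the latest PPOCRv4 OCR models via RapidOCR, offering more model choices
+  * Added support for the PP-OCRv4 recognition models, including the standard and server versions
+* Changed how files are read to support Chinese paths on Windows
+* Bug fix: transform_func could not be serialized when using multiple processes
+* Bug fix: compatibility with albumentations=1.4.*
 ### [Update 2023.12.24]: Release V2.3
 Main changes: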
@@ -406,13 +416,13 @@ print(ocr_out)
 | ------------------------------------------------------------ | ------------ | --------- | ------------ | ------------ | ------------------------------ | -------------------- |
 | db_shufflenet_v2                                             | √            | X         | cnocr        | 18 M         | 简体中文、繁体中文、英文、数字 | √                    |
 | **db_shufflenet_v2_small**                                   | √            | X         | cnocr        | 12 M         | 简体中文、繁体中文、英文、数字 | √                    |
-| [db_shufflenet_v2_tiny](https://mp.weixin.qq.com/s/fHPNoGyo72EFApVhEgR6Nw) | √            | X         | cnocr        | 7.5 M        | 简体中文、繁体中文、英文、数字 | √                    |
 | db_mobilenet_v3                                              | √            | X         | cnocr        | 16 M         | 简体中文、繁体中文、英文、数字 | √                    |
 | db_mobilenet_v3_small                                        | √            | X         | cnocr        | 7.9 M        | 简体中文、繁体中文、英文、数字 | √                    |
 | db_resnet34                                                  | √            | X         | cnocr        | 86 M         | 简体中文、繁体中文、英文、数字 | √                    |
 | db_resnet18                                                  | √            | X         | cnocr        | 47 M         | 简体中文、繁体中文、英文、数字 | √                    |
+| ch_PP-OCRv4_det                                              | X            | √         | ppocr        | 4.5 M        | 简体中文、繁体中文、英文、数字 | √                    |
+| ch_PP-OCRv4_det_server                                       | X            | √         | ppocr        | 108 M        | 简体中文、繁体中文、英文、数字 | √                    |
 | ch_PP-OCRv3_det                                              | X            | √         | ppocr        | 2.3 M        | 简体中文、繁体中文、英文、数字 | √                    |
-| ch_PP-OCRv2_det                                              | X            | √         | ppocr        | 2.2 M        | 简体中文、繁体中文、英文、数字 | √                    |
 | **en_PP-OCRv3_det**                                          | X            | √         | ppocr        | 2.3 M        | **英文**、数字                 | √                    |
@@ -449,11 +459,18 @@ print(ocr_out)
 | **number-densenet_lite_136-fc** 🆕                            | √            | √         | cnocr        | 2.7 M        | **纯数字**（仅包含 `0~9` 十个数字） | X                    |
 | **number-densenet_lite_136-gru**  🆕 <br /> ([星球会员](https://t.zsxq.com/FEYZRJQ)专享) | √            | √         | cnocr        | 5.5 M        | **纯数字**（仅包含 `0~9` 十个数字） | X                    |
 | **number-densenet_lite_666-gru_large** 🆕 <br />（购买链接：[B站](https://gf.bilibili.com/item/detail/1104055055)、[Lemon Squeezy](https://ocr.lemonsqueezy.com/)） | √            | √         | cnocr        | 55 M         | **纯数字**（仅包含 `0~9` 十个数字） | X                    |
+| ch_PP-OCRv4                                                  | X            | √         | ppocr        | 10 M         | 简体中文、英文、数字                | √                    |
+| ch_PP-OCRv4_server                                           | X            | √         | ppocr        | 86 M         | 简体中文、英文、数字                | √                    |
 | ch_PP-OCRv3                                                  | X            | √         | ppocr        | 10 M         | 简体中文、英文、数字                | √                    |
 | ch_ppocr_mobile_v2.0                                         | X            | √         | ppocr        | 4.2 M        | 简体中文、英文、数字                | √                    |
+| en_PP-OCRv4                                                  | X            | √         | ppocr        | 8.6 M        | **英文**、数字                      | √                    |
 | en_PP-OCRv3                                                  | X            | √         | ppocr        | 8.5 M        | **英文**、数字                      | √                    |
 | en_number_mobile_v2.0                                        | X            | √         | ppocr        | 1.8 M        | **英文**、数字                      | √                    |
 | chinese_cht_PP-OCRv3                                         | X            | √         | ppocr        | 11 M         | **繁体中文**、英文、数字            | X                    |
+| japan_PP-OCRv3                                               | X            | √         | ppocr        | 9.6 M         | **日文**、英文、数字                | √                    |
+| korean_PP-OCRv3                                              | X            | √         | ppocr        | 9.4 M         | **韩文**、英文、数字                | √                    |
+| latin_PP-OCRv3                                               | X            | √         | ppocr        | 8.6 M         | **拉丁文**、英文、数字              | √                    |
+| arabic_PP-OCRv3                                              | X            | √         | ppocr        | 8.6 M         | **阿拉伯文**、英文、数字            | √                    |
@@ -484,4 +501,3 @@ print(ocr_out)
 官方代码库：[https://github.com/breezedeus/cnocr](https://github.com/breezedeus/cnocr)。

{cnocr-2.3.0.3 → cnocr-2.3.1}/README.md RENAMED

@@ -39,6 +39,16 @@
 ---
 </div>
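+### [Update 2024.11.28]: Release V2.3.1
+Main changes:
+* Integrated the latest PPOCRv4 OCR models via RapidOCR, offering more model choices
+  * Added support for the PP-OCRv4 recognition models, including the standard and server versions
+* Changed how files are read to support Chinese paths on Windows
+* Bug fix: transform_func could not be serialized when using multiple processes
+* Bug fix: compatibility with albumentations=1.4.*
 ### [Update 2023.12.24]: Release V2.3
 Main changes: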
@@ -376,13 +386,13 @@ print(ocr_out)
 | ------------------------------------------------------------ | ------------ | --------- | ------------ | ------------ | ------------------------------ | -------------------- |
 | db_shufflenet_v2                                             | √            | X         | cnocr        | 18 M         | 简体中文、繁体中文、英文、数字 | √                    |
 | **db_shufflenet_v2_small**                                   | √            | X         | cnocr        | 12 M         | 简体中文、繁体中文、英文、数字 | √                    |
-| [db_shufflenet_v2_tiny](https://mp.weixin.qq.com/s/fHPNoGyo72EFApVhEgR6Nw) | √            | X         | cnocr        | 7.5 M        | 简体中文、繁体中文、英文、数字 | √                    |
 | db_mobilenet_v3                                              | √            | X         | cnocr        | 16 M         | 简体中文、繁体中文、英文、数字 | √                    |
 | db_mobilenet_v3_small                                        | √            | X         | cnocr        | 7.9 M        | 简体中文、繁体中文、英文、数字 | √                    |
 | db_resnet34                                                  | √            | X         | cnocr        | 86 M         | 简体中文、繁体中文、英文、数字 | √                    |
 | db_resnet18                                                  | √            | X         | cnocr        | 47 M         | 简体中文、繁体中文、英文、数字 | √                    |
+| ch_PP-OCRv4_det                                              | X            | √         | ppocr        | 4.5 M        | 简体中文、繁体中文、英文、数字 | √                    |
+| ch_PP-OCRv4_det_server                                       | X            | √         | ppocr        | 108 M        | 简体中文、繁体中文、英文、数字 | √                    |
 | ch_PP-OCRv3_det                                              | X            | √         | ppocr        | 2.3 M        | 简体中文、繁体中文、英文、数字 | √                    |
-| ch_PP-OCRv2_det                                              | X            | √         | ppocr        | 2.2 M        | 简体中文、繁体中文、英文、数字 | √                    |
 | **en_PP-OCRv3_det**                                          | X            | √         | ppocr        | 2.3 M        | **英文**、数字                 | √                    |
@@ -419,11 +429,18 @@ print(ocr_out)
 | **number-densenet_lite_136-fc** 🆕                            | √            | √         | cnocr        | 2.7 M        | **纯数字**（仅包含 `0~9` 十个数字） | X                    |
 | **number-densenet_lite_136-gru**  🆕 <br /> ([星球会员](https://t.zsxq.com/FEYZRJQ)专享) | √            | √         | cnocr        | 5.5 M        | **纯数字**（仅包含 `0~9` 十个数字） | X                    |
 | **number-densenet_lite_666-gru_large** 🆕 <br />（购买链接：[B站](https://gf.bilibili.com/item/detail/1104055055)、[Lemon Squeezy](https://ocr.lemonsqueezy.com/)） | √            | √         | cnocr        | 55 M         | **纯数字**（仅包含 `0~9` 十个数字） | X                    |
+| ch_PP-OCRv4                                                  | X            | √         | ppocr        | 10 M         | 简体中文、英文、数字                | √                    |
+| ch_PP-OCRv4_server                                           | X            | √         | ppocr        | 86 M         | 简体中文、英文、数字                | √                    |
 | ch_PP-OCRv3                                                  | X            | √         | ppocr        | 10 M         | 简体中文、英文、数字                | √                    |
 | ch_ppocr_mobile_v2.0                                         | X            | √         | ppocr        | 4.2 M        | 简体中文、英文、数字                | √                    |
+| en_PP-OCRv4                                                  | X            | √         | ppocr        | 8.6 M        | **英文**、数字                      | √                    |
 | en_PP-OCRv3                                                  | X            | √         | ppocr        | 8.5 M        | **英文**、数字                      | √                    |
 | en_number_mobile_v2.0                                        | X            | √         | ppocr        | 1.8 M        | **英文**、数字                      | √                    |
 | chinese_cht_PP-OCRv3                                         | X            | √         | ppocr        | 11 M         | **繁体中文**、英文、数字            | X                    |
+| japan_PP-OCRv3                                               | X            | √         | ppocr        | 9.6 M         | **日文**、英文、数字                | √                    |
+| korean_PP-OCRv3                                              | X            | √         | ppocr        | 9.4 M         | **韩文**、英文、数字                | √                    |
+| latin_PP-OCRv3                                               | X            | √         | ppocr        | 8.6 M         | **拉丁文**、英文、数字              | √                    |
+| arabic_PP-OCRv3                                              | X            | √         | ppocr        | 8.6 M         | **阿拉伯文**、英文、数字            | √                    |
@@ -452,4 +469,3 @@ print(ocr_out)
 ---
 官方代码库：[https://github.com/breezedeus/cnocr](https://github.com/breezedeus/cnocr)。

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/__version__.py RENAMED

@@ -17,4 +17,4 @@
 # specific language governing permissions and limitations
 # under the License.
-__version__ = '2.3.0.3'
+__version__ = '2.3.1'

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/app.py RENAMED

@@ -129,7 +129,7 @@ def main():
     det_models.append(('naive_det', 'onnx'))
     det_models.sort()
     det_model_name = st.sidebar.selectbox(
-        '选择检测模型', det_models, index=det_models.index(('ch_PP-OCRv3_det', 'onnx'))
+        '选择检测模型', det_models, index=det_models.index(('ch_PP-OCRv4_det', 'onnx'))
     )
     all_models = list(REC_AVAILABLE_MODELS.all_models())

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/cli.py RENAMED

@@ -215,8 +215,8 @@ def visualize_example(example, fp_prefix):
     '-d',
     '--det-model-name',
     type=str,
-    default='ch_PP-OCRv3_det',
-    help='检测模型名称。默认值为 ch_PP-OCRv3_det',
+    default='ch_PP-OCRv4_det',
+    help='检测模型名称。默认值为 ch_PP-OCRv4_det',
 )
 @click.option(
     '--det-model-backend',

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/cn_ocr.py RENAMED

@@ -35,7 +35,7 @@ from .consts import AVAILABLE_MODELS as REC_AVAILABLE_MODELS
 from .utils import data_dir, read_img
 from .line_split import line_split
 from .recognizer import Recognizer
-from .ppocr import PPRecognizer, PP_SPACE
+from .ppocr import PPRecognizer, RapidRecognizer, PP_SPACE
 logger = logging.getLogger(__name__)
@@ -64,7 +64,7 @@ class CnOcr(object):
         self,
         rec_model_name: str = 'densenet_lite_136-gru',
         *,
-        det_model_name: str = 'ch_PP-OCRv3_det',
+        det_model_name: str = 'ch_PP-OCRv4_det',
         cand_alphabet: Optional[Union[Collection, str]] = None,
         context: str = 'cpu',  # ['cpu', 'gpu', 'cuda']
         rec_model_fp: Optional[str] = None,
@@ -83,7 +83,7 @@ class CnOcr(object):
         Args:
             rec_model_name (str): 识别模型名称。默认为 `densenet_lite_136-gru`
-            det_model_name (str): 检测模型名称。默认为 `ch_PP-OCRv3_det`
+            det_model_name (str): 检测模型名称。默认为 `ch_PP-OCRv4_det`
             cand_alphabet (Optional[Union[Collection, str]]): 待识别字符所在的候选集合。默认为 `None`，表示不限定识别字符范围
             context (str): 'cpu', or 'gpu'。表明预测时是使用CPU还是GPU。默认为 `cpu`。
                 此参数仅在 `model_backend=='pytorch'` 时有效。
@@ -143,7 +143,8 @@ class CnOcr(object):
         if self.rec_space == REC_AVAILABLE_MODELS.CNOCR_SPACE:
             rec_cls = Recognizer
         elif self.rec_space == PP_SPACE:
-            rec_cls = PPRecognizer
+            rec_name = REC_AVAILABLE_MODELS.get_value(rec_model_name, rec_model_backend, 'recognizer')
+            rec_cls = RapidRecognizer if rec_name == 'RapidRecognizer' else PPRecognizer
             if rec_vocab_fp is not None:
                 logger.warning('param `vocab_fp` is invalid for %s models' % PP_SPACE)
         else:
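
The last hunk above picks the recognizer class from per-model metadata instead of hard-coding `PPRecognizer`. A minimal sketch of that dispatch follows, assuming a plain dictionary in place of `REC_AVAILABLE_MODELS`. The stub classes, `MODEL_INFO`, and `select_recognizer_cls` are hypothetical stand-ins for illustration, not the package's own code.

```python
class PPRecognizer:
    """Stand-in for the existing ppocr recognizer class."""


class RapidRecognizer:
    """Stand-in for the recognizer added in 2.3.1."""


# Hypothetical metadata table; the two entries mirror the shape of
# MODEL_LABELS_FILE_DICT in cnocr/ppocr/consts.py.
MODEL_INFO = {
    ("ch_PP-OCRv3", "onnx"): {"url": "ch_PP-OCRv3_rec_infer-onnx.zip"},
    ("ch_PP-OCRv4", "onnx"): {
        "recognizer": "RapidRecognizer",
        "repo": "breezedeus/cnocr-ppocr-ch_PP-OCRv4",
    },
}


def select_recognizer_cls(model_name, backend="onnx"):
    """Return RapidRecognizer only when the metadata asks for it by name."""
    rec_name = MODEL_INFO.get((model_name, backend), {}).get("recognizer")
    return RapidRecognizer if rec_name == "RapidRecognizer" else PPRecognizer


print(select_recognizer_cls("ch_PP-OCRv4").__name__)
print(select_recognizer_cls("ch_PP-OCRv3").__name__)
```

Entries without a `recognizer` key fall back to `PPRecognizer`, so models registered before this release keep their previous behavior.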

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/consts.py RENAMED

@@ -335,6 +335,18 @@ class AvailableModels(object):
             )
             return CN_VOCAB_FP
+    def get_value(self, model_name, model_backend, key) -> Optional[Any]:
+        if (model_name, model_backend) in self.CNOCR_MODELS:
+            info = self.CNOCR_MODELS[(model_name, model_backend)]
+        elif (model_name, model_backend) in self.OUTER_MODELS:
+            info = self.OUTER_MODELS[(model_name, model_backend)]
+        else:
+            logger.warning(
+                'no url is found for model %s' % ((model_name, model_backend),)
+            )
+            return None
+        return info.get(key)
     def get_epoch(self, model_name, model_backend) -> Optional[int]:
         if (model_name, model_backend) in self.CNOCR_MODELS:
             return self.CNOCR_MODELS[(model_name, model_backend)]['epoch']
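
The new `get_value` method looks a `(model_name, model_backend)` key up in two registries in turn and returns one field of the matching entry. The sketch below shows that lookup order in isolation. `ModelRegistry` and the sample entries are hypothetical, and the `epoch` value is made up for illustration.

```python
import logging
from typing import Any, Dict, Optional, Tuple

logger = logging.getLogger(__name__)

ModelKey = Tuple[str, str]


class ModelRegistry:
    """Hypothetical stand-in for AvailableModels with two lookup tables."""

    def __init__(
        self,
        cnocr_models: Dict[ModelKey, Dict[str, Any]],
        outer_models: Dict[ModelKey, Dict[str, Any]],
    ) -> None:
        self.cnocr_models = cnocr_models
        self.outer_models = outer_models

    def get_value(self, model_name: str, model_backend: str, key: str) -> Optional[Any]:
        model_key = (model_name, model_backend)
        # The built-in registry is consulted first, then the externally registered one.
        for registry in (self.cnocr_models, self.outer_models):
            if model_key in registry:
                # A missing field yields None rather than raising KeyError.
                return registry[model_key].get(key)
        logger.warning("no entry is found for model %s", model_key)
        return None


registry = ModelRegistry(
    cnocr_models={("densenet_lite_136-gru", "onnx"): {"epoch": 1}},  # made-up value
    outer_models={
        ("ch_PP-OCRv4", "onnx"): {
            "recognizer": "RapidRecognizer",
            "repo": "breezedeus/cnocr-ppocr-ch_PP-OCRv4",
        }
    },
)
print(registry.get_value("ch_PP-OCRv4", "onnx", "repo"))
```

Both an unknown model and a known model without the requested field return `None`, so callers cannot tell the two cases apart without checking the log.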

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/data_utils/transforms.py RENAMED

@@ -138,14 +138,13 @@ class Bitmap(ImageOnlyTransform):
         return img
-class RandomStretchAug(alb.Resize):
+class RandomStretchAug(ImageOnlyTransform):
     """保持高度不变的情况下，对图像的宽度进行随机拉伸"""
     def __init__(
-        self, min_ratio=0.9, max_ratio=1.1, min_width=8, always_apply=False, p=1
-    ):
-        super(RandomStretchAug, self).__init__(
-            height=0, width=0, always_apply=always_apply, p=p
-        )
+            self, min_ratio=0.9, max_ratio=1.1, min_width=8, always_apply=False, p=1
+        ):
+        super().__init__(always_apply=always_apply, p=p)
         self.min_width = min_width
         self.min_ratio = min_ratio
         self.max_ratio = max_ratio
@@ -171,7 +170,7 @@ class CustomRandomCrop(ImageOnlyTransform):
             always_apply (bool): Whether to always apply the crop. Defaults to False.
             p (float): The probability of applying the crop. Defaults to 1.0.
         """
-        super(CustomRandomCrop, self).__init__(always_apply, p)
+        super().__init__(always_apply=always_apply, p=p)
         self.crop_size = crop_size
     def cal_params(self, img):
@@ -210,7 +209,7 @@ class TransparentOverlay(ImageOnlyTransform):
     def __init__(
         self, max_height_ratio, max_width_ratio, alpha, always_apply=False, p=1.0
     ):
-        super(TransparentOverlay, self).__init__(always_apply, p)
+        super().__init__(always_apply=always_apply, p=p)
         self.max_height_ratio = max_height_ratio
         self.max_width_ratio = max_width_ratio
         self.alpha = alpha
@@ -316,9 +315,9 @@ class TransformWrapper(object):
 _train_alb_transform = alb.Compose(
     [
-        CustomRandomCrop((8, 10), p=0.8),
+        CustomRandomCrop(crop_size=(8, 10), always_apply=False, p=0.8),
         alb.OneOf([Erosion((2, 3)), Dilation((2, 3))], p=0.1),
-        TransparentOverlay(1.0, 0.1, alpha=0.4, p=0.2),  # 半透明的矩形框覆盖
+        TransparentOverlay(1.0, 0.1, alpha=0.4, always_apply=False, p=0.2),  # 半透明的矩形框覆盖
         alb.Affine(shear={"x": (0, 3), "y": (-3, 0)}, cval=(255, 255, 255), p=0.03),
         alb.ShiftScaleRotate(
             shift_limit_x=(0, 0.04),
@@ -382,9 +381,9 @@ train_transform = TransformWrapper(_train_alb_transform)
 _ft_alb_transform = alb.Compose(
     [
-        CustomRandomCrop((4, 4), p=0.8),
+        CustomRandomCrop(crop_size=(4, 4), always_apply=False, p=0.8),
         alb.OneOf([Erosion((2, 3)), Dilation((2, 3))], p=0.1),
-        TransparentOverlay(1.0, 0.1, alpha=0.4, p=0.2),  # 半透明的矩形框覆盖
+        TransparentOverlay(1.0, 0.1, alpha=0.4, always_apply=False, p=0.2),  # 半透明的矩形框覆盖
         alb.RandomBrightnessContrast(0.1, 0.1, True, p=0.1),
         alb.ImageCompression(95, p=0.3),
         alb.GaussNoise(20, p=0.2),
@@ -413,7 +412,7 @@ ft_transform = TransformWrapper(_ft_alb_transform)
 _test_alb_transform = alb.Compose(
     [
-        CustomRandomCrop((6, 8), p=0.8),
+        CustomRandomCrop(crop_size=(6, 8), p=0.8),
         ToSingleChannelGray(always_apply=True),
         CustomNormalize(always_apply=True),
         # alb.Normalize(0.456045, 0.224567, always_apply=True),
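
Several hunks above replace `super().__init__(always_apply, p)` with keyword arguments. One plausible reason, given the changelog's albumentations 1.4 compatibility note, is that positional forwarding breaks if a base class reorders its constructor parameters between releases. The sketch below illustrates that failure mode with hypothetical stand-in classes. `BaseOld`, `BaseNew`, and `make_crop_classes` are not albumentations classes, and the reordering shown is an assumption.

```python
class BaseOld:
    """Hypothetical base class whose constructor takes (always_apply, p)."""

    def __init__(self, always_apply=False, p=0.5):
        self.always_apply = always_apply
        self.p = p


class BaseNew:
    """Hypothetical later release with the two parameters reordered."""

    def __init__(self, p=0.5, always_apply=None):
        self.p = p
        self.always_apply = always_apply


def make_crop_classes(base):
    """Build two subclasses of `base`: one forwards positionally, one by keyword."""

    class PositionalCrop(base):
        def __init__(self, crop_size, always_apply=False, p=1.0):
            super().__init__(always_apply, p)  # depends on the base's parameter order
            self.crop_size = crop_size

    class KeywordCrop(base):
        def __init__(self, crop_size, always_apply=False, p=1.0):
            super().__init__(always_apply=always_apply, p=p)  # order-independent
            self.crop_size = crop_size

    return PositionalCrop, KeywordCrop


OldPositional, OldKeyword = make_crop_classes(BaseOld)
NewPositional, NewKeyword = make_crop_classes(BaseNew)

broken = NewPositional(crop_size=(8, 10), p=0.8)
fixed = NewKeyword(crop_size=(8, 10), p=0.8)
print(broken.p, broken.always_apply)  # the two values end up swapped
print(fixed.p, fixed.always_apply)
```

With the reordered base, the positional subclass silently stores the probability in `always_apply` and the flag in `p`, while the keyword subclass is unaffected.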

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/dataset_utils.py RENAMED

@@ -22,6 +22,8 @@
 from datasets import Dataset, Image
 import numpy as np
 import torch
+import os
+from pathlib import Path
 from .consts import IMG_STANDARD_HEIGHT
 from .utils import read_tsv_file, pad_img_seq
@@ -41,6 +43,25 @@ def preprocess(img):
     return img.resize(target_w_h)
+def apply_transforms(img, transforms):
+    """Apply transforms to a single image."""
+    img = np.array(img)
+    if img.ndim == 2:
+        img = np.expand_dims(img, 0)
+    return transforms(torch.from_numpy(img))
+def create_transform_func(transforms):
+    """Create a transform function that can be pickled."""
+    def transform_func(examples):
+        outs = []
+        for img in examples['image']:
+            outs.append(apply_transforms(img, transforms))
+        examples['transformed_image'] = outs
+        return examples
+    return transform_func
 def gen_dataset(
     index_fp, img_folder=None, transforms=None, mode='train', num_workers=None
 ) -> Dataset:
@@ -80,18 +101,7 @@ def gen_dataset(
     dataset = dataset.map(map_func, batched=True, num_proc=num_workers)
     if transforms is not None:
-        def transform_func(examples):
-            outs = []
-            for img in examples['image']:
-                img = np.array(img)
-                if img.ndim == 2:
-                    img = np.expand_dims(img, 0)
-                outs.append(transforms(torch.from_numpy(img)))
-            examples['transformed_image'] = outs
-            return examples
-        dataset.set_transform(transform_func)
+        dataset.set_transform(create_transform_func(transforms))
     return dataset
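
The changelog describes this change as a fix for `transform_func` not being serializable across processes. As background, the standard-library `pickle` stores functions by qualified name, so it rejects nested functions, whereas a `functools.partial` over a module-level function is a common alternative. The sketch below contrasts the two. Every name in it is hypothetical, and it says nothing about which serializer the `datasets` library uses, since the factory in the diff still returns a nested function.

```python
import pickle
from functools import partial


def apply_to_batch(examples, transforms):
    """Module-level worker: apply `transforms` to every item under 'image'."""
    examples = dict(examples)
    examples["transformed_image"] = [transforms(img) for img in examples["image"]]
    return examples


def make_closure(transforms):
    """Factory returning a nested function, the same shape as in the diff."""

    def transform_func(examples):
        return apply_to_batch(examples, transforms)

    return transform_func


def make_partial(transforms):
    """Alternative factory: bind `transforms` without creating a nested function."""
    return partial(apply_to_batch, transforms=transforms)


def can_pickle(obj) -> bool:
    """Report whether the standard-library pickle accepts `obj`."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


batch = {"image": [-1, 2, -3]}
print(make_closure(abs)(batch)["transformed_image"])
print(make_partial(abs)(batch)["transformed_image"])
print("closure picklable:", can_pickle(make_closure(abs)))
```

Both factories produce the same transformed batch. Only the nested function is rejected by plain `pickle`, whereas the `partial` can be pickled when `apply_to_batch` is importable from its module, for example when the file is run as a script.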

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/gradio_app.py RENAMED

@@ -172,7 +172,7 @@ def main():
             with gr.Column(min_width=200, variant='panel', scale=1):
                 gr.Markdown('### 模型设置')
                 det_model_name = gr.Dropdown(
-                    label='选择检测模型', choices=det_models, value='ch_PP-OCRv3_det::onnx',
+                    label='选择检测模型', choices=det_models, value='ch_PP-OCRv4_det::onnx',
                 )
                 rec_model_name = gr.Dropdown(

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/ppocr/__init__.py RENAMED

@@ -3,5 +3,6 @@
 from ..consts import AVAILABLE_MODELS
 from .consts import MODEL_LABELS_FILE_DICT, PP_SPACE
 from .pp_recognizer import PPRecognizer
+from .rapid_recognizer import RapidRecognizer
 AVAILABLE_MODELS.register_models(MODEL_LABELS_FILE_DICT, space=PP_SPACE)

cnocr-2.3.1/cnocr/ppocr/consts.py ADDED

@@ -0,0 +1,78 @@
+# coding: utf-8
+# Copyright (C) 2022, [Breezedeus](https://github.com/breezedeus).
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from pathlib import Path
+VOCAB_DIR = Path(__file__).parent / "utils"
+MODEL_LABELS_FILE_DICT = {
+    ("ch_PP-OCRv3", "onnx"): {
+        "vocab_fp": VOCAB_DIR / "ppocr_keys_v1.txt",  # 简体中英文
+        "url": "ch_PP-OCRv3_rec_infer-onnx.zip",
+    },
+    ("ch_ppocr_mobile_v2.0", "onnx"): {
+        "vocab_fp": VOCAB_DIR / "ppocr_keys_v1.txt",
+        "url": "ch_ppocr_mobile_v2.0_rec_infer-onnx.zip",
+    },
+    ("en_number_mobile_v2.0", "onnx"): {
+        "vocab_fp": VOCAB_DIR / "en_dict.txt",
+        "url": "en_number_mobile_v2.0_rec_infer-onnx.zip",
+    },
+    ("chinese_cht_PP-OCRv3", "onnx"): {
+        "vocab_fp": VOCAB_DIR / "chinese_cht_dict.txt",  # 繁体中文
+        "url": "chinese_cht_PP-OCRv3_rec_infer-onnx.zip",
+    },
+    ("japan_PP-OCRv3", "onnx"): {
+        "recognizer": "RapidRecognizer",
+        "repo": "breezedeus/cnocr-ppocr-japan_PP-OCRv3",
+    },
+    ("korean_PP-OCRv3", "onnx"): {
+        "recognizer": "RapidRecognizer",
+        "repo": "breezedeus/cnocr-ppocr-korean_PP-OCRv3",
+    },
+    ("latin_PP-OCRv3", "onnx"): {
+        "recognizer": "RapidRecognizer",
+        "repo": "breezedeus/cnocr-ppocr-latin_PP-OCRv3",
+    },
+    ("arabic_PP-OCRv3", "onnx"): {
+        "recognizer": "RapidRecognizer",
+        "repo": "breezedeus/cnocr-ppocr-arabic_PP-OCRv3",
+    },
+    ("en_PP-OCRv3", "onnx"): {
+        "vocab_fp": VOCAB_DIR / "en_dict.txt",  # 英文
+        "url": "en_PP-OCRv3_rec_infer-onnx.zip",
+        "recognizer": "RapidRecognizer",
+        "repo": "breezedeus/cnocr-ppocr-en_PP-OCRv3",
+    },
+    ("en_PP-OCRv4", "onnx"): {
+        "recognizer": "RapidRecognizer",
+        "repo": "breezedeus/cnocr-ppocr-en_PP-OCRv4",
+    },
+    ("ch_PP-OCRv4", "onnx"): {
+        "recognizer": "RapidRecognizer",
+        "repo": "breezedeus/cnocr-ppocr-ch_PP-OCRv4",
+    },
+    ("ch_PP-OCRv4_server", "onnx"): {
+        "recognizer": "RapidRecognizer",
+        "repo": "breezedeus/cnocr-ppocr-ch_PP-OCRv4_server",
+    },
+}
+PP_SPACE = "ppocr"

cnocr-2.3.1/cnocr/ppocr/rapid_recognizer.py ADDED

@@ -0,0 +1,135 @@
+# coding: utf-8
+# Copyright (C) 2022-2024, [Breezedeus](https://github.com/breezedeus).
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.
+import os
+import logging
+from typing import Union, Optional, List, Tuple
+from pathlib import Path
+import numpy as np
+from rapidocr_onnxruntime.ch_ppocr_rec.text_recognize import TextRecognizer
+from cnstd.utils import prepare_model_files
+from ..utils import data_dir, read_img
+from ..recognizer import Recognizer
+from .consts import PP_SPACE
+from ..consts import MODEL_VERSION, AVAILABLE_MODELS
+logger = logging.getLogger(__name__)
+class RapidRecognizer(Recognizer):
+    def __init__(
+        self,
+        model_name: str = "ch_PP-OCRv3",
+        *,
+        model_fp: Optional[str] = None,
+        root: Union[str, Path] = data_dir(),
+        context: str = "cpu",  # ['cpu', 'gpu']
+        rec_image_shape: str = "3, 48, 320",
+        **kwargs
+    ):
+        """
+        基于 rapidocr_onnxruntime 的文本识别器。
+        Args:
+            model_name (str): 模型名称。默认为 `ch_PP-OCRv3`
+            model_fp (Optional[str]): 如果不使用系统自带的模型，可以通过此参数直接指定所使用的模型文件（'.onnx' 文件）
+            root (Union[str, Path]): 模型文件所在的根目录
+            context (str): 使用的设备。默认为 `cpu`，可选 `gpu`
+            rec_image_shape (str): 输入图片尺寸，无需更改使用默认值即可。默认值：`"3, 32, 320"`
+            **kwargs: 其他参数
+        """
+        self.rec_image_shape = [int(v) for v in rec_image_shape.split(",")]
+        self._model_name = model_name
+        self._model_backend = "onnx"
+        use_gpu = context.lower() not in ("cpu", "mps")
+        self._assert_and_prepare_model_files(model_fp, root)
+        config = {
+            "use_cuda": use_gpu,
+            "rec_img_shape": self.rec_image_shape,
+            "rec_batch_num": 6,
+            "model_path": self._model_fp,
+        }
+        self.recognizer = TextRecognizer(config)
+    def _assert_and_prepare_model_files(self, model_fp, root):
+        if model_fp is not None and not os.path.isfile(model_fp):
+            raise FileNotFoundError("can not find model file %s" % model_fp)
+        if model_fp is not None:
+            self._model_fp = model_fp
+            return
+        root = os.path.join(root, MODEL_VERSION)
+        self._model_dir = os.path.join(root, PP_SPACE, self._model_name)
+        model_fp = os.path.join(self._model_dir, "%s_rec_infer.onnx" % self._model_name)
+        if not os.path.isfile(model_fp):
+            logger.warning("can not find model file %s" % model_fp)
+            if (self._model_name, self._model_backend) not in AVAILABLE_MODELS:
+                raise NotImplementedError(
+                    "%s is not a downloadable model"
+                    % ((self._model_name, self._model_backend),)
+                )
+            remote_repo = AVAILABLE_MODELS.get_value(
+                self._model_name, self._model_backend, "repo"
+            )
+            model_fp = prepare_model_files(model_fp, remote_repo)
+        self._model_fp = model_fp
+        logger.info("use model: %s" % self._model_fp)
+    def recognize(
+        self, img_list: List[Union[str, Path, np.ndarray]], batch_size: int = 6
+    ) -> List[Tuple[str, float]]:
+        """
+        识别图片中的文字。
+        Args:
+            img_list: 支持以下格式的图片数据：
+                + 图片路径
+                + 已经从图片文件中读入的数据
+            batch_size: 待处理图片数据的批大小。
+        Returns:
+            列表，每个元素是对应图片的识别结果，由 (text, score) 组成，其中：
+                + text: 识别出的文本
+                + score: 识别结果的得分
+        """
+        if not isinstance(img_list, (list, tuple)):
+            img_list = [img_list]
+        self.recognizer.rec_batch_num = batch_size
+        img_data_list = []
+        for img in img_list:
+            if isinstance(img, (str, Path)):
+                img = read_img(img, gray=False)
+            if len(img.shape) == 3 and img.shape[2] == 3:
+                img = img[..., ::-1]  # RGB to BGR
+            img_data_list.append(img)
+        results, _ = self.recognizer(img_data_list)
+        return results
+    def recognize_one_line(
+        self, img: Union[str, Path, np.ndarray]
+    ) -> Tuple[str, float]:
+        """
+        识别图片中的一行文字。
+        Args:
+            img: 支持以下格式的图片数据：
+                + 图片路径
+                + 已经从图片文件中读入的数据
+        Returns:
+            (text, score)：
+                + text: 识别出的文本
+                + score: 识别结果的得分
+        """
+        results = self.recognize([img])
+        return results[0]
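
The constructor of `RapidRecognizer` above derives three things from its arguments: the default model path, the parsed image shape, and a GPU flag. The sketch below reproduces that logic as standalone functions so it can be checked without `rapidocr_onnxruntime` installed. The function names, the `/data` root, and the `"2.3"` version directory are hypothetical, since the value of `MODEL_VERSION` does not appear in this diff. `PurePosixPath` is used only to keep the output platform-independent, where the original uses `os.path.join`.

```python
from pathlib import PurePosixPath
from typing import Any, Dict, List

PP_SPACE = "ppocr"  # same value as in cnocr/ppocr/consts.py


def parse_image_shape(shape: str) -> List[int]:
    """Turn a string such as '3, 48, 320' into [3, 48, 320]."""
    return [int(v) for v in shape.split(",")]


def default_model_path(root: str, model_version: str, model_name: str) -> PurePosixPath:
    """Mirror the layout <root>/<version>/ppocr/<model>/<model>_rec_infer.onnx."""
    model_dir = PurePosixPath(root) / model_version / PP_SPACE / model_name
    return model_dir / f"{model_name}_rec_infer.onnx"


def build_config(
    model_path: str,
    context: str = "cpu",
    rec_image_shape: str = "3, 48, 320",
    batch_size: int = 6,
) -> Dict[str, Any]:
    """Assemble the dictionary handed to the recognizer in the diff."""
    return {
        # Any context other than 'cpu' or 'mps' is treated as a CUDA request.
        "use_cuda": context.lower() not in ("cpu", "mps"),
        "rec_img_shape": parse_image_shape(rec_image_shape),
        "rec_batch_num": batch_size,
        "model_path": model_path,
    }


path = default_model_path("/data", "2.3", "ch_PP-OCRv4")  # hypothetical root and version
config = build_config(str(path), context="cuda")
print(path)
print(config)
```

`int()` ignores surrounding whitespace, which is why the shape string can keep its spaces after the commas.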

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/utils.py RENAMED

@@ -23,12 +23,11 @@ import os
 from pathlib import Path
 import logging
 import platform
-import zipfile
 import requests
 from typing import Union, Any, Tuple, List, Optional, Dict
 from tqdm import tqdm
-from PIL import Image
+from PIL import Image, ImageOps
 import cv2
 import numpy as np
 import torch
@@ -272,16 +271,18 @@ def read_img(path: Union[str, Path], gray=True) -> np.ndarray:
         * when `gray==True`, return a gray image, with dim [height, width, 1], with values range from 0 to 255
         * when `gray==False`, return a color image, with dim [height, width, 3], with values range from 0 to 255
     """
+    try:
+        img = Image.open(path)
+        img = ImageOps.exif_transpose(img)  # 识别旋转后的图片（pillow不会自动识别）
+    except Exception as e:
+        raise FileNotFoundError(f'Error loading image: {path}')
     if gray:
-        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
-        if img is None:
-            raise FileNotFoundError(f'Error loading image: {path}')
-        return np.expand_dims(img, -1)
+        img = img.convert('L')
+        return np.expand_dims(np.array(img), -1)
     else:
-        img = cv2.imread(path)
-        if img is None:
-            raise FileNotFoundError(f'Error loading image: {path}')
-        return cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
+        img = img.convert('RGB')
+        return np.array(img)
 def save_img(img: Union[Tensor, np.ndarray], path):

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: cnocr
-Version: 2.3.0.3
+Version: 2.3.1
 Summary: Python3 package for Chinese/English OCR, with small pretrained models
 Home-page: https://github.com/breezedeus/cnocr
 Author: breezedeus
@@ -69,6 +69,16 @@ License-File: LICENSE
 ---
 </div>
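+### [Update 2024.11.28]: Release V2.3.1
+Main changes:
+* Integrated the latest PPOCRv4 OCR models via RapidOCR, offering more model choices
+  * Added support for the PP-OCRv4 recognition models, including the standard and server versions
+* Changed how files are read to support Chinese paths on Windows
+* Bug fix: transform_func could not be serialized when using multiple processes
+* Bug fix: compatibility with albumentations=1.4.*
 ### [Update 2023.12.24]: Release V2.3
 Main changes: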
@@ -406,13 +416,13 @@ print(ocr_out)
 | ------------------------------------------------------------ | ------------ | --------- | ------------ | ------------ | ------------------------------ | -------------------- |
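 | db_shufflenet_v2                                             | √            | X         | cnocr        | 18 M         | Simplified Chinese, Traditional Chinese, English, digits | √                    |
 | **db_shufflenet_v2_small**                                   | √            | X         | cnocr        | 12 M         | Simplified Chinese, Traditional Chinese, English, digits | √                    |
-| [db_shufflenet_v2_tiny](https://mp.weixin.qq.com/s/fHPNoGyo72EFApVhEgR6Nw) | √            | X         | cnocr        | 7.5 M        | Simplified Chinese, Traditional Chinese, English, digits | √                    |
 | db_mobilenet_v3                                              | √            | X         | cnocr        | 16 M         | Simplified Chinese, Traditional Chinese, English, digits | √                    |
 | db_mobilenet_v3_small                                        | √            | X         | cnocr        | 7.9 M        | Simplified Chinese, Traditional Chinese, English, digits | √                    |
 | db_resnet34                                                  | √            | X         | cnocr        | 86 M         | Simplified Chinese, Traditional Chinese, English, digits | √                    |
 | db_resnet18                                                  | √            | X         | cnocr        | 47 M         | Simplified Chinese, Traditional Chinese, English, digits | √                    |
+| ch_PP-OCRv4_det                                              | X            | √         | ppocr        | 4.5 M        | Simplified Chinese, Traditional Chinese, English, digits | √                    |
+| ch_PP-OCRv4_det_server                                       | X            | √         | ppocr        | 108 M        | Simplified Chinese, Traditional Chinese, English, digits | √                    |
 | ch_PP-OCRv3_det                                              | X            | √         | ppocr        | 2.3 M        | Simplified Chinese, Traditional Chinese, English, digits | √                    |
-| ch_PP-OCRv2_det                                              | X            | √         | ppocr        | 2.2 M        | Simplified Chinese, Traditional Chinese, English, digits | √                    |
 | **en_PP-OCRv3_det**                                          | X            | √         | ppocr        | 2.3 M        | **English**, digits                                      | √                    |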
@@ -449,11 +459,18 @@ print(ocr_out)
 | **number-densenet_lite_136-fc** 🆕                            | √            | √         | cnocr        | 2.7 M        | **Digits only** (just the ten digits `0~9`) | X                    |
 | **number-densenet_lite_136-gru**  🆕 <br /> (exclusive to [Knowledge Planet members](https://t.zsxq.com/FEYZRJQ)) | √            | √         | cnocr        | 5.5 M        | **Digits only** (just the ten digits `0~9`) | X                    |
 | **number-densenet_lite_666-gru_large** 🆕 <br /> (purchase links: [Bilibili](https://gf.bilibili.com/item/detail/1104055055), [Lemon Squeezy](https://ocr.lemonsqueezy.com/)) | √            | √         | cnocr        | 55 M         | **Digits only** (just the ten digits `0~9`) | X                    |
+| ch_PP-OCRv4                                                  | X            | √         | ppocr        | 10 M         | Simplified Chinese, English, digits         | √                    |
+| ch_PP-OCRv4_server                                           | X            | √         | ppocr        | 86 M         | Simplified Chinese, English, digits         | √                    |
 | ch_PP-OCRv3                                                  | X            | √         | ppocr        | 10 M         | Simplified Chinese, English, digits         | √                    |
 | ch_ppocr_mobile_v2.0                                         | X            | √         | ppocr        | 4.2 M        | Simplified Chinese, English, digits         | √                    |
+| en_PP-OCRv4                                                  | X            | √         | ppocr        | 8.6 M        | **English**, digits                         | √                    |
 | en_PP-OCRv3                                                  | X            | √         | ppocr        | 8.5 M        | **English**, digits                         | √                    |
 | en_number_mobile_v2.0                                        | X            | √         | ppocr        | 1.8 M        | **English**, digits                         | √                    |
 | chinese_cht_PP-OCRv3                                         | X            | √         | ppocr        | 11 M         | **Traditional Chinese**, English, digits    | X                    |
+| japan_PP-OCRv3                                               | X            | √         | ppocr        | 9.6 M        | **Japanese**, English, digits               | √                    |
+| korean_PP-OCRv3                                              | X            | √         | ppocr        | 9.4 M        | **Korean**, English, digits                 | √                    |
+| latin_PP-OCRv3                                               | X            | √         | ppocr        | 8.6 M        | **Latin**, English, digits                  | √                    |
+| arabic_PP-OCRv3                                              | X            | √         | ppocr        | 8.6 M        | **Arabic**, English, digits                 | √                    |
@@ -484,4 +501,3 @@ print(ocr_out)
 Official code repository: [https://github.com/breezedeus/cnocr](https://github.com/breezedeus/cnocr).
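
The V2.3.1 changelog in this file mentions that `transform_func` could not be serialized when multiple processes were used. The diff does not show the actual fix, so what follows is only a minimal sketch of the general Python behaviour that commonly produces this kind of error: a callable defined inside another function (a nested lambda, for instance) cannot be pickled, while a callable built from importable functions can. The names `make_local_transform`, `picklable_transform`, and `can_pickle` are hypothetical and are not part of cnocr.

```python
import functools
import operator
import pickle


def make_local_transform():
    # Hypothetical stand-in for a transform defined inside another function.
    # A nested lambda has no importable name, so pickle cannot serialize it.
    return lambda x: x * 0.5


# Hypothetical picklable alternative: functools.partial over an importable
# function is serialized as a reference to that function plus its arguments.
picklable_transform = functools.partial(operator.mul, 0.5)


def can_pickle(obj):
    """Return True if obj can be pickled, which is needed whenever an
    object has to be sent to another process via pickle."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print("nested lambda picklable:", can_pickle(make_local_transform()))
    print("partial picklable:", can_pickle(picklable_transform))
```

A module-level function or a small callable class would work equally well here; the point of the sketch is only that the object must be reachable through an importable name.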

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr.egg-info/SOURCES.txt RENAMED

@@ -43,6 +43,7 @@ cnocr/models/ocr_model.py
 cnocr/ppocr/__init__.py
 cnocr/ppocr/consts.py
 cnocr/ppocr/pp_recognizer.py
+cnocr/ppocr/rapid_recognizer.py
 cnocr/ppocr/utility.py
 cnocr/ppocr/postprocess/__init__.py
 cnocr/ppocr/postprocess/rec_postprocess.py

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr.egg-info/requires.txt RENAMED

@@ -8,7 +8,8 @@ wandb
 torchmetrics
 pillow>=5.3.0
 onnx
-cnstd>=1.2.3.4
+cnstd>=1.2.5.1
+rapidocr_onnxruntime<1.4
 [dev]
 albumentations

{cnocr-2.3.0.3 → cnocr-2.3.1}/setup.py RENAMED

@@ -47,7 +47,8 @@ required = [
     "torchmetrics",
     "pillow>=5.3.0",
     "onnx",
-    "cnstd>=1.2.3.4",
+    "cnstd>=1.2.5.1",
+    "rapidocr_onnxruntime<1.4",
 ]
 extras_require = {
     "ort-cpu": ["onnxruntime"],

cnocr-2.3.0.3/cnocr/ppocr/consts.py DELETED

@@ -1,48 +0,0 @@
-# coding: utf-8
-# Copyright (C) 2022, [Breezedeus](https://github.com/breezedeus).
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#   http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing,
-# software distributed under the License is distributed on an
-# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-# KIND, either express or implied.  See the License for the
-# specific language governing permissions and limitations
-# under the License.
-from pathlib import Path
-VOCAB_DIR = Path(__file__).parent / 'utils'
-MODEL_LABELS_FILE_DICT = {
-    ('ch_PP-OCRv3', 'onnx'): {
-        'vocab_fp': VOCAB_DIR / 'ppocr_keys_v1.txt',  # Simplified Chinese and English
-        'url': 'ch_PP-OCRv3_rec_infer-onnx.zip',
-    },
-    ('ch_ppocr_mobile_v2.0', 'onnx'): {
-        'vocab_fp': VOCAB_DIR / 'ppocr_keys_v1.txt',
-        'url': 'ch_ppocr_mobile_v2.0_rec_infer-onnx.zip',
-    },
-    ('en_PP-OCRv3', 'onnx'): {
-        'vocab_fp': VOCAB_DIR / 'en_dict.txt',  # English
-        'url': 'en_PP-OCRv3_rec_infer-onnx.zip',
-    },
-    ('en_number_mobile_v2.0', 'onnx'): {
-        'vocab_fp': VOCAB_DIR / 'en_dict.txt',
-        'url': 'en_number_mobile_v2.0_rec_infer-onnx.zip',
-    },
-    ('chinese_cht_PP-OCRv3', 'onnx'): {
-        'vocab_fp': VOCAB_DIR / 'chinese_cht_dict.txt',  # Traditional Chinese
-        'url': 'chinese_cht_PP-OCRv3_rec_infer-onnx.zip',
-    },
-}
-PP_SPACE = 'ppocr'

{cnocr-2.3.0.3 → cnocr-2.3.1}/LICENSE RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/__init__.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/classification/__init__.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/classification/dataset.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/classification/image_classifier.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/clf_cli.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/data_utils/__init__.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/data_utils/aug.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/data_utils/block_shuffle.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/data_utils/utils.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/dataset.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/gradio_app2.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/label_cn.txt RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/label_number.txt RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/line_split.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/lr_scheduler.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/models/__init__.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/models/ctc.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/models/densenet.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/models/mobilenet.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/models/ocr_model.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/ppocr/postprocess/__init__.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/ppocr/postprocess/rec_postprocess.py RENAMED

File without changes

{cnocr-2.3.0.3 → cnocr-2.3.1}/cnocr/ppocr/pp_recognizer.py RENAMED

@@ -73,6 +73,7 @@ class PPRecognizer(Recognizer):
         vocab_fp = AVAILABLE_MODELS.get_vocab_fp(self._model_name, self._model_backend)
         self._assert_and_prepare_model_files(model_fp, root)
+        logger.info('use model: %s' % self._model_fp)
         postprocess_params = {
             'name': 'CTCLabelDecode',
             'character_dict_path': vocab_fp,
@@ -114,7 +115,6 @@ class PPRecognizer(Recognizer):
             )  # download the .zip file and unzip
         self._model_fp = model_fp
-        logger.info('use model: %s' % self._model_fp)
     def resize_norm_img(self, img, max_wh_ratio):
         """