pdf2img2pdf 1.0.0__tar.gz
- pdf2img2pdf-1.0.0/LICENSE +0 -0
- pdf2img2pdf-1.0.0/PKG-INFO +119 -0
- pdf2img2pdf-1.0.0/README.md +93 -0
- pdf2img2pdf-1.0.0/pdf2img2pdf/__init__.py +9 -0
- pdf2img2pdf-1.0.0/pdf2img2pdf/pdf2img2pdf.py +107 -0
- pdf2img2pdf-1.0.0/pdf2img2pdf.egg-info/PKG-INFO +119 -0
- pdf2img2pdf-1.0.0/pdf2img2pdf.egg-info/SOURCES.txt +12 -0
- pdf2img2pdf-1.0.0/pdf2img2pdf.egg-info/dependency_links.txt +1 -0
- pdf2img2pdf-1.0.0/pdf2img2pdf.egg-info/entry_points.txt +2 -0
- pdf2img2pdf-1.0.0/pdf2img2pdf.egg-info/requires.txt +2 -0
- pdf2img2pdf-1.0.0/pdf2img2pdf.egg-info/top_level.txt +1 -0
- pdf2img2pdf-1.0.0/setup.cfg +4 -0
- pdf2img2pdf-1.0.0/setup.py +32 -0
- pdf2img2pdf-1.0.0/tests/test_pdf2img2pdf.py +64 -0
File without changes
@@ -0,0 +1,119 @@
+Metadata-Version: 2.4
+Name: pdf2img2pdf
+Version: 1.0.0
+Summary: A tool to convert PDF to images and back to PDF
+Home-page: https://github.com/yourusername/pdf2img2pdf
+Author: siwen.huang
+Author-email: 594047189@qq.com
+Classifier: Programming Language :: Python :: 3
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Requires-Python: >=3.6
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: pdf2image
+Requires-Dist: img2pdf
+Dynamic: author
+Dynamic: author-email
+Dynamic: classifier
+Dynamic: description
+Dynamic: description-content-type
+Dynamic: home-page
+Dynamic: license-file
+Dynamic: requires-dist
+Dynamic: requires-python
+Dynamic: summary
+
+# PDF to Image to PDF Converter (pdf2img2pdf)
+
+A lightweight Python tool that converts a PDF file to images and then merges those images back into a PDF file. It is useful when you need to process PDF content or repair a damaged PDF.
+
+### Features
+
+- Converts a PDF file to PNG images page by page.
+- Merges multiple images into a new PDF file.
+- Cross-platform (Windows / Linux / macOS).
+- Detects missing dependencies (such as Poppler) and prints clear installation instructions.
+- Runs as an executable packaged with PyInstaller.
+
+### Installing dependencies
+
+1. Python libraries
+Make sure the following Python libraries are installed:
+
+```bash
+pip install pdf2image img2pdf
+```
+
+2. Poppler (required)
+pdf2image depends on the pdftoppm command from the Poppler utilities.
+
+##### Windows users:
+
+- Download a prebuilt release: [Poppler for Windows](https://github.com/oschwartz10612/poppler-windows/releases/)
+- Extract it and add poppler-x.x.x\Library\bin to the system PATH environment variable.
+- Or specify the path explicitly in code.
+
+##### Linux users:
+
+Install with your package manager:
+
+```bash
+# Ubuntu/Debian
+sudo apt-get install poppler-utils
+# CentOS/RHEL
+sudo yum install poppler-utils
+```
+
+##### macOS users:
+
+Install with Homebrew:
+
+```bash
+brew install poppler
+```
+
+### Usage
+
+##### From the command line
+
+```bash
+python pdf2img2pdf.py <input.pdf> <output.pdf> [dpi]
+```
+
+##### As a module
+
+```python
+from pdf2img2pdf import convert
+# Convert a PDF file
+convert("example_input.pdf", "example_output.pdf")
+```
+
+### Notes
+
+- Poppler must be installed:
+
+If Poppler is not installed, the program detects this and prints installation instructions.
+Make sure the pdftoppm or pdfinfo command can be run on your system.
+
+- Permissions:
+
+Make sure the script has read access to the input file and write access to the output directory.
+
+- Temporary files:
+
+During conversion, temporary image files are created under tmp/imgs and removed automatically when it finishes.
+
+- PyInstaller support:
+
+The tool can be packaged as a standalone executable; make sure resource file paths are correct when packaging.
+
+### Directory structure
+
+```bash
+pdf2img2pdf/
+├── pdf2img2pdf.py # Main program file
+├── tmp/ # Temporary directory (created at runtime)
+│ └── imgs/ # Image files produced during conversion
+└── README.md # This document
+```
@@ -0,0 +1,93 @@
+# PDF to Image to PDF Converter (pdf2img2pdf)
+
+A lightweight Python tool that converts a PDF file to images and then merges those images back into a PDF file. It is useful when you need to process PDF content or repair a damaged PDF.
+
+### Features
+
+- Converts a PDF file to PNG images page by page.
+- Merges multiple images into a new PDF file.
+- Cross-platform (Windows / Linux / macOS).
+- Detects missing dependencies (such as Poppler) and prints clear installation instructions.
+- Runs as an executable packaged with PyInstaller.
+
+### Installing dependencies
+
+1. Python libraries
+Make sure the following Python libraries are installed:
+
+```bash
+pip install pdf2image img2pdf
+```
+
+2. Poppler (required)
+pdf2image depends on the pdftoppm command from the Poppler utilities.
+
+##### Windows users:
+
+- Download a prebuilt release: [Poppler for Windows](https://github.com/oschwartz10612/poppler-windows/releases/)
+- Extract it and add poppler-x.x.x\Library\bin to the system PATH environment variable.
+- Or specify the path explicitly in code.
+
+##### Linux users:
+
+Install with your package manager:
+
+```bash
+# Ubuntu/Debian
+sudo apt-get install poppler-utils
+# CentOS/RHEL
+sudo yum install poppler-utils
+```
+
+##### macOS users:
+
+Install with Homebrew:
+
+```bash
+brew install poppler
+```
+
+### Usage
+
+##### From the command line
+
+```bash
+python pdf2img2pdf.py <input.pdf> <output.pdf> [dpi]
+```
+
+##### As a module
+
+```python
+from pdf2img2pdf import convert
+# Convert a PDF file
+convert("example_input.pdf", "example_output.pdf")
+```
+
+### Notes
+
+- Poppler must be installed:
+
+If Poppler is not installed, the program detects this and prints installation instructions.
+Make sure the pdftoppm or pdfinfo command can be run on your system.
+
+- Permissions:
+
+Make sure the script has read access to the input file and write access to the output directory.
+
+- Temporary files:
+
+During conversion, temporary image files are created under tmp/imgs and removed automatically when it finishes.
+
+- PyInstaller support:
+
+The tool can be packaged as a standalone executable; make sure resource file paths are correct when packaging.
+
+### Directory structure
+
+```bash
+pdf2img2pdf/
+├── pdf2img2pdf.py # Main program file
+├── tmp/ # Temporary directory (created at runtime)
+│ └── imgs/ # Image files produced during conversion
+└── README.md # This document
+```
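The Windows instructions in the README mention pointing at Poppler explicitly from code, but no example accompanies them, and `convert()` as published takes no such argument. A minimal sketch, assuming you call pdf2image directly: its `convert_from_path` accepts a `poppler_path` keyword naming the directory that holds `pdftoppm`. The helper names `build_render_kwargs` and `render_pages` and the example install location are hypothetical.

```python
def build_render_kwargs(dpi=300, poppler_path=None):
    """Collect keyword arguments for pdf2image.convert_from_path.

    poppler_path is only included when given, so a Poppler that is
    already on PATH keeps working unchanged.
    """
    kwargs = {"dpi": int(dpi)}
    if poppler_path:
        kwargs["poppler_path"] = str(poppler_path)
    return kwargs


def render_pages(input_pdf_path, dpi=300, poppler_path=None):
    """Render a PDF to a list of PIL images (hypothetical helper)."""
    import pdf2image  # imported lazily, as the package itself does

    return pdf2image.convert_from_path(
        input_pdf_path, **build_render_kwargs(dpi, poppler_path)
    )


# Example call (hypothetical install location on Windows):
# pages = render_pages("example_input.pdf",
#                      poppler_path=r"C:\tools\poppler-x.x.x\Library\bin")
```

Keeping the keyword optional means the same call works on Linux and macOS, where the package manager normally puts Poppler on PATH.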
@@ -0,0 +1,107 @@
+#!/usr/bin/env python3
+
+def get_resource_path(filename):
+    """Get absolute path to resource, works for dev and for PyInstaller"""
+    import os, sys
+    try:
+        base_path = sys._MEIPASS
+    except AttributeError:
+        base_path = os.path.dirname(os.path.abspath(__file__))
+    return os.path.join(base_path, filename)
+
+def get_base_path():
+    """Get absolute path to executable, works for dev and for PyInstaller"""
+    import os, sys
+    if hasattr(sys, 'frozen'):
+        base_path = os.path.dirname(sys.executable)
+    else:
+        base_path = os.path.dirname(os.path.abspath('.'))  # parent of the current working directory
+    return base_path
+
+def pdf_to_img(input_pdf_path, dpi=300):
+    """Convert pdf to images"""
+    import os
+    import pdf2image
+    img_paths = []
+    try:
+        images = pdf2image.convert_from_path(input_pdf_path, dpi=dpi)
+        img_dir = os.path.join(get_base_path(), 'tmp', 'imgs')
+        os.makedirs(img_dir, exist_ok=True)
+        for i, image in enumerate(images):
+            img_path = os.path.join(img_dir, 'img_%03d.png' % i)
+            image.save(img_path, 'PNG')
+            img_paths.append(img_path)
+    except Exception as e:
+        print(e)
+    return img_paths
+
+def imgs_to_pdf(img_paths, output_pdf_path):
+    """Convert images to pdf"""
+    import os
+    import img2pdf
+    output_path = os.path.dirname(os.path.abspath(output_pdf_path))  # abspath avoids makedirs('') for bare file names
+    os.makedirs(output_path, exist_ok=True)
+
+    # A4 size: 21 cm × 29.7 cm (210 mm × 297 mm)
+    A4_WIDTH = img2pdf.mm_to_pt(210)
+    A4_HEIGHT = img2pdf.mm_to_pt(297)
+    page_size = (A4_WIDTH, A4_HEIGHT)
+
+    layout_fun = img2pdf.get_layout_fun(page_size, fit=img2pdf.FitMode.into)
+    with open(output_pdf_path, "wb") as f:
+        # layout_fun uses fit='into', which preserves the aspect ratio
+        f.write(img2pdf.convert(img_paths, layout_fun=layout_fun))
+    for img_path in img_paths:
+        os.remove(img_path)
+    os.removedirs(os.path.dirname(img_paths[0]))
+
+def convert(input_pdf_path, output_pdf_path, dpi=300):
+    img_paths = pdf_to_img(input_pdf_path, dpi)
+    imgs_to_pdf(img_paths, output_pdf_path)
+
+def check_poppler_installed():
+    """Check if poppler is installed on the system."""
+    import shutil
+    import platform
+
+    # Check for pdftoppm or pdfinfo (part of poppler-utils)
+    if shutil.which("pdftoppm") is None and shutil.which("pdfinfo") is None:
+        system = platform.system()
+        if system == "Windows":
+            raise RuntimeError(
+                "Poppler is not installed. Please install Poppler for Windows:\n"
+                "https://github.com/oschwartz10612/poppler-windows/releases/"
+            )
+        elif system == "Linux":
+            raise RuntimeError(
+                "Poppler is not installed. Please install Poppler using your package manager:\n"
+                "Ubuntu/Debian: sudo apt-get install poppler-utils\n"
+                "CentOS/RHEL: sudo yum install poppler-utils"
+            )
+        elif system == "Darwin":  # macOS
+            raise RuntimeError(
+                "Poppler is not installed. Please install Poppler using Homebrew:\n"
+                "brew install poppler"
+            )
+        else:
+            raise RuntimeError("Unsupported operating system. Please manually install Poppler.")
+
+def remove_pycache(root_dir='.'):
+    import os
+    for dirpath, dirnames, filenames in os.walk(root_dir):
+        print(dirpath, dirnames, filenames)
+        # for dirname in dirnames:
+        #     if dirname.endswith("__pycache__"):
+        #         os.rmdir(os.path.join(dirpath, dirname))
+
+if __name__ == "__main__":
+    import sys
+    if len(sys.argv) < 3:
+        print("Usage: pdf2img2pdf.py <input.pdf> <output.pdf> [dpi]")
+    else:
+        try:
+            # Check if poppler is installed before proceeding
+            check_poppler_installed()
+            convert(sys.argv[1], sys.argv[2], int(sys.argv[3]) if len(sys.argv) > 3 else 100)
+        except Exception as e:
+            print(e)
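`pdf_to_img` writes its intermediate PNGs to `tmp/imgs` under `get_base_path()` and relies on `imgs_to_pdf` to delete them, so a failure between the two steps can leave files behind. One alternative, sketched here under assumptions and not part of the package, is to keep the images in a `tempfile.TemporaryDirectory`, which is removed even when an exception is raised. `save_pages` and `with_temp_images` are hypothetical names; pages only need a Pillow-style `save(path, format)` method.

```python
import os
import tempfile


def save_pages(pages, directory):
    """Save each page as img_NNN.png in directory and return the paths in order."""
    paths = []
    for i, page in enumerate(pages):
        path = os.path.join(directory, "img_%03d.png" % i)
        page.save(path, "PNG")
        paths.append(path)
    return paths


def with_temp_images(pages, consume):
    """Save pages to a temporary directory, pass the paths to consume, then clean up.

    The directory and everything in it are deleted when the with-block
    exits, whether consume returns normally or raises.
    """
    with tempfile.TemporaryDirectory(prefix="pdf2img2pdf_") as tmp_dir:
        return consume(save_pages(pages, tmp_dir))
```

In the package's terms, `pages` would be the list returned by `pdf2image.convert_from_path`, and `consume` a function that hands the paths to `img2pdf.convert` and writes the resulting bytes to the output file.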
@@ -0,0 +1,119 @@
+Metadata-Version: 2.4
+Name: pdf2img2pdf
+Version: 1.0.0
+Summary: A tool to convert PDF to images and back to PDF
+Home-page: https://github.com/yourusername/pdf2img2pdf
+Author: siwen.huang
+Author-email: 594047189@qq.com
+Classifier: Programming Language :: Python :: 3
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Requires-Python: >=3.6
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: pdf2image
+Requires-Dist: img2pdf
+Dynamic: author
+Dynamic: author-email
+Dynamic: classifier
+Dynamic: description
+Dynamic: description-content-type
+Dynamic: home-page
+Dynamic: license-file
+Dynamic: requires-dist
+Dynamic: requires-python
+Dynamic: summary
+
+# PDF to Image to PDF Converter (pdf2img2pdf)
+
+A lightweight Python tool that converts a PDF file to images and then merges those images back into a PDF file. It is useful when you need to process PDF content or repair a damaged PDF.
+
+### Features
+
+- Converts a PDF file to PNG images page by page.
+- Merges multiple images into a new PDF file.
+- Cross-platform (Windows / Linux / macOS).
+- Detects missing dependencies (such as Poppler) and prints clear installation instructions.
+- Runs as an executable packaged with PyInstaller.
+
+### Installing dependencies
+
+1. Python libraries
+Make sure the following Python libraries are installed:
+
+```bash
+pip install pdf2image img2pdf
+```
+
+2. Poppler (required)
+pdf2image depends on the pdftoppm command from the Poppler utilities.
+
+##### Windows users:
+
+- Download a prebuilt release: [Poppler for Windows](https://github.com/oschwartz10612/poppler-windows/releases/)
+- Extract it and add poppler-x.x.x\Library\bin to the system PATH environment variable.
+- Or specify the path explicitly in code.
+
+##### Linux users:
+
+Install with your package manager:
+
+```bash
+# Ubuntu/Debian
+sudo apt-get install poppler-utils
+# CentOS/RHEL
+sudo yum install poppler-utils
+```
+
+##### macOS users:
+
+Install with Homebrew:
+
+```bash
+brew install poppler
+```
+
+### Usage
+
+##### From the command line
+
+```bash
+python pdf2img2pdf.py <input.pdf> <output.pdf> [dpi]
+```
+
+##### As a module
+
+```python
+from pdf2img2pdf import convert
+# Convert a PDF file
+convert("example_input.pdf", "example_output.pdf")
+```
+
+### Notes
+
+- Poppler must be installed:
+
+If Poppler is not installed, the program detects this and prints installation instructions.
+Make sure the pdftoppm or pdfinfo command can be run on your system.
+
+- Permissions:
+
+Make sure the script has read access to the input file and write access to the output directory.
+
+- Temporary files:
+
+During conversion, temporary image files are created under tmp/imgs and removed automatically when it finishes.
+
+- PyInstaller support:
+
+The tool can be packaged as a standalone executable; make sure resource file paths are correct when packaging.
+
+### Directory structure
+
+```bash
+pdf2img2pdf/
+├── pdf2img2pdf.py # Main program file
+├── tmp/ # Temporary directory (created at runtime)
+│ └── imgs/ # Image files produced during conversion
+└── README.md # This document
+```
@@ -0,0 +1,12 @@
+LICENSE
+README.md
+setup.py
+pdf2img2pdf/__init__.py
+pdf2img2pdf/pdf2img2pdf.py
+pdf2img2pdf.egg-info/PKG-INFO
+pdf2img2pdf.egg-info/SOURCES.txt
+pdf2img2pdf.egg-info/dependency_links.txt
+pdf2img2pdf.egg-info/entry_points.txt
+pdf2img2pdf.egg-info/requires.txt
+pdf2img2pdf.egg-info/top_level.txt
+tests/test_pdf2img2pdf.py
@@ -0,0 +1 @@
+
@@ -0,0 +1 @@
+pdf2img2pdf
@@ -0,0 +1,32 @@
+# setup.py
+from setuptools import setup, find_packages
+
+with open("README.md", "r", encoding="utf-8") as fh:
+    long_description = fh.read()
+
+setup(
+    name="pdf2img2pdf",
+    version="1.0.0",
+    author="siwen.huang",
+    author_email="594047189@qq.com",
+    description="A tool to convert PDF to images and back to PDF",
+    long_description=long_description,
+    long_description_content_type="text/markdown",
+    url="https://github.com/yourusername/pdf2img2pdf",
+    packages=find_packages(),
+    classifiers=[
+        "Programming Language :: Python :: 3",
+        "License :: OSI Approved :: MIT License",
+        "Operating System :: OS Independent",
+    ],
+    python_requires=">=3.6",
+    install_requires=[
+        "pdf2image",
+        "img2pdf",
+    ],
+    entry_points={
+        "console_scripts": [
+            "pdf2img2pdf=pdf2img2pdf.pdf2img2pdf:main",  # command-line entry point
+        ],
+    },
+)
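The `console_scripts` entry points at `pdf2img2pdf.pdf2img2pdf:main`, but the module in this release only has an `if __name__ == "__main__":` block and defines no `main`, so the installed command would likely fail to load. A hedged sketch of what such a function could look like follows. `parse_args` and `main` are assumptions, not code from the package, and the conversion and Poppler-check callables are passed in so the sketch stays self-contained.

```python
import argparse


def parse_args(argv=None):
    """Parse <input.pdf> <output.pdf> [dpi]; dpi is converted to int."""
    parser = argparse.ArgumentParser(
        prog="pdf2img2pdf",
        description="Convert a PDF to images and back to a PDF.",
    )
    parser.add_argument("input_pdf")
    parser.add_argument("output_pdf")
    # 100 mirrors the fallback used by the script's __main__ block
    parser.add_argument("dpi", nargs="?", type=int, default=100)
    return parser.parse_args(argv)


def main(argv=None, convert=None, check=None):
    """Hypothetical entry point; returns a process exit code."""
    args = parse_args(argv)
    try:
        if check is not None:
            check()  # e.g. check_poppler_installed
        if convert is not None:
            convert(args.input_pdf, args.output_pdf, args.dpi)
    except Exception as exc:
        print(exc)
        return 1
    return 0
```

Inside the real module, `convert` and `check` would presumably default to the module's own `convert` and `check_poppler_installed`, since a console script calls `main()` with no arguments.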
@@ -0,0 +1,64 @@
+import os
+import tempfile
+import unittest
+from unittest.mock import patch, MagicMock
+from pdf2img2pdf import convert, check_poppler_installed
+
+class TestPdf2Img2Pdf(unittest.TestCase):
+
+    def setUp(self):
+        """Runs before each test; prepares temporary files"""
+        self.input_pdf = tempfile.NamedTemporaryFile(suffix=".pdf", delete=False)
+        self.output_pdf = tempfile.NamedTemporaryFile(suffix=".pdf", delete=False)
+        self.input_pdf.close()
+        self.output_pdf.close()
+
+    def tearDown(self):
+        """Runs after each test; cleans up temporary files"""
+        if os.path.exists(self.input_pdf.name):
+            os.unlink(self.input_pdf.name)
+        if os.path.exists(self.output_pdf.name):
+            os.unlink(self.output_pdf.name)
+
+    @patch("pdf2image.convert_from_path")
+    @patch("img2pdf.convert")
+    def test_convert_success(self, mock_img2pdf, mock_pdf2image):
+        """Test that convert runs successfully"""
+        # Simulate pdf2image returning a list of images; save() creates an empty file so cleanup can remove it
+        mock_images = [MagicMock(**{"save.side_effect": lambda path, fmt: open(path, "wb").close()})]
+        mock_pdf2image.return_value = mock_images
+
+        # Simulate img2pdf returning binary data
+        mock_img2pdf.return_value = b"fake_pdf_data"
+
+        # Call the function
+        convert(self.input_pdf.name, self.output_pdf.name)
+
+        # Assert pdf2image was called with the input path and the default dpi
+        mock_pdf2image.assert_called_once_with(self.input_pdf.name, dpi=300)
+
+        # Assert img2pdf was called exactly once
+        mock_img2pdf.assert_called_once()
+
+        # Assert the output file was created
+        self.assertTrue(os.path.exists(self.output_pdf.name))
+
+    @patch("shutil.which")
+    def test_check_poppler_installed_success(self, mock_which):
+        """Test that check_poppler_installed does not raise when Poppler is installed"""
+        mock_which.return_value = "/usr/bin/pdftoppm"  # simulate pdftoppm being found
+        try:
+            check_poppler_installed()
+        except RuntimeError:
+            self.fail("check_poppler_installed() raised RuntimeError unexpectedly!")
+
+    @patch("shutil.which")
+    def test_check_poppler_installed_failure(self, mock_which):
+        """Test that check_poppler_installed raises when Poppler is not installed"""
+        mock_which.return_value = None  # simulate neither pdftoppm nor pdfinfo being found
+        with self.assertRaises(RuntimeError) as context:
+            check_poppler_installed()
+        self.assertIn("Poppler is not installed", str(context.exception))
+
+if __name__ == "__main__":
+    unittest.main()