How to extract text from images with Python on macOS

On macOS, you can use the same Python code to extract text from images with Tesseract-OCR. Follow these steps:

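Make sure Tesseract-OCR, the pytesseract library, and Pillow (which provides the PIL module the script imports) are installed. You can install Tesseract-OCR with Homebrew and the Python packages with pip. The tesseract-lang formula supplies the additional language data, including Simplified Chinese (chi_sim); Homebrew has no separate tesseract-ocr-chi-sim formula. Run the following commands in a terminal:

brew install tesseract
brew install tesseract-lang
pip install pytesseract pillow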
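Before writing the OCR script, it can help to confirm that the Simplified Chinese data is actually installed. The following is a minimal sketch that calls the `tesseract --list-langs` command through the standard library. The helper names `parse_langs` and `installed_langs` are made up for this illustration, and the parser assumes the first output line is a header followed by one language code per line.

```python
import shutil
import subprocess


def parse_langs(output):
    """Return language codes from `tesseract --list-langs` output.

    Assumes the first line is a header and each later non-empty line is one code.
    """
    lines = [line.strip() for line in output.splitlines()]
    return [line for line in lines[1:] if line]


def installed_langs(binary="tesseract"):
    """Run `<binary> --list-langs`; return the codes, or None if the binary is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run([path, "--list-langs"], capture_output=True, text=True)
    # Some Tesseract versions may print the list to stderr instead of stdout.
    return parse_langs(result.stdout or result.stderr)


if __name__ == "__main__":
    langs = installed_langs()
    if langs is None:
        print("tesseract not found on PATH")
    elif "chi_sim" in langs:
        print("chi_sim is installed")
    else:
        print("chi_sim is missing; installed:", langs)
```

If `chi_sim` is reported missing, rerunning `brew install tesseract-lang` is the first thing to try.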

Create a Python script, for example extract_text.py, and use the following code to extract the text from an image:

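import pytesseract
from PIL import Image

# Point pytesseract at the Tesseract binary. /opt/homebrew/bin is the Homebrew
# prefix on Apple Silicon; on Intel Macs it is usually /usr/local/bin/tesseract.
pytesseract.pytesseract.tesseract_cmd = r'/opt/homebrew/bin/tesseract'
lang = 'chi_sim'  # Simplified Chinese language data

# Load the image
img = Image.open('example.png')

# Run OCR on the image using the Simplified Chinese model
text = pytesseract.image_to_string(img, lang=lang)

# Print the recognized text
print(text)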

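In this example, example.png is the image file you want to extract text from. The call pytesseract.image_to_string(img, lang=lang) runs OCR on the image and returns the recognized text as a string; the lang argument selects the Simplified Chinese model.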
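The script hardcodes the Tesseract path, which only matches one Homebrew layout. As a rough sketch, the path could instead be looked up at run time: check PATH first, then fall back to the usual Homebrew locations. The helper `find_binary` and the `HOMEBREW_CANDIDATES` list are hypothetical names introduced here, not part of pytesseract.

```python
import os
import shutil

# Typical Homebrew install locations (assumption: Apple Silicon first, then Intel).
HOMEBREW_CANDIDATES = [
    "/opt/homebrew/bin/tesseract",
    "/usr/local/bin/tesseract",
]


def find_binary(name, candidates):
    """Return the path to an executable: PATH lookup first, then the candidate list."""
    found = shutil.which(name)
    if found:
        return found
    for path in candidates:
        # Accept only regular files that the current user may execute.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


if __name__ == "__main__":
    path = find_binary("tesseract", HOMEBREW_CANDIDATES)
    print(path if path else "tesseract not found")
```

The returned path can then be assigned to pytesseract.pytesseract.tesseract_cmd in place of the hardcoded string. If the function returns None, Tesseract is most likely not installed.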

Run the Python script in a terminal:

python extract_text.py

With that, you can extract text from images on macOS using Python and Tesseract-OCR.
