1
0
mirror of https://github.com/Stirling-Tools/Stirling-PDF.git synced 2024-10-01 08:50:11 +02:00

Merge branch 'main' into patch-4

This commit is contained in:
Anthony Stirling 2023-05-14 22:11:20 +01:00 committed by GitHub
commit 093dcba4ba
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 28 additions and 3 deletions

View File

@ -18,7 +18,7 @@ Depending on your requirements, you can choose the appropriate language pack for
### Installing Language Packs ### Installing Language Packs
1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need. 1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata` 2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata` (Debian) or `/usr/share/tesseract/tessdata` (Fedora)
# DO NOT REMOVE EXISTING ENG.TRAINEDDATA, ITS REQUIRED. # DO NOT REMOVE EXISTING ENG.TRAINEDDATA, ITS REQUIRED.
@ -48,4 +48,29 @@ Add the following to your existing docker run command
If you are not using Docker, you need to install the OCR components, including the ocrmypdf app. If you are not using Docker, you need to install the OCR components, including the ocrmypdf app.
You can see [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html) You can see [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html)
Debian based systems, install languages with this command:
```bash
sudo apt update &&\
# All languages
# sudo apt install -y 'tesseract-ocr-*'
# Find languages:
apt search tesseract-ocr-
# View installed languages:
dpkg-query -W tesseract-ocr- | sed 's/tesseract-ocr-//g'
```
Fedora:
```bash
# All languages
# sudo dnf install -y tesseract-langpack-*
# Find languages:
dnf search -C tesseract-langpack-
# View installed languages:
rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
```

View File

@ -92,8 +92,8 @@ Install the following software:
For Debian-based systems, you can use the following command: For Debian-based systems, you can use the following command:
```bash ```bash
sudo apt-get install -y libreoffice-core libreoffice-common libreoffice-writer libreoffice-calc libreoffice-impress python3-uno unoconv pngquant unpaper ocrmypdf sudo apt-get install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf
pip3 install opencv-python-headless pip3 install uno opencv-python-headless unoconv pngquant
``` ```
For Fedora: For Fedora: