diff --git a/HowToUseOCR.md b/HowToUseOCR.md
index b015f53d..e4ba9828 100644
--- a/HowToUseOCR.md
+++ b/HowToUseOCR.md
@@ -18,7 +18,7 @@ Depending on your requirements, you can choose the appropriate language pack for
 ### Installing Language Packs
 
 1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
-2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata`
+2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata` (Debian) or `/usr/share/tesseract/tessdata` (Fedora)
 
 # DO NOT REMOVE EXISTING ENG.TRAINEDDATA, ITS REQUIRED.
 
@@ -48,4 +48,29 @@ Add the following to your existing docker run command
 If you are not using Docker, you need to install the OCR components, including the ocrmypdf app.
 You can see [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html)
 
+Debian based systems, install languages with this command:
+```bash
+sudo apt update &&\
+# All languages
+# sudo apt install -y 'tesseract-ocr-*'
+
+# Find languages:
+apt search tesseract-ocr-
+
+# View installed languages:
+dpkg-query -W tesseract-ocr- | sed 's/tesseract-ocr-//g'
+```
+
+Fedora:
+
+```bash
+# All languages
+# sudo dnf install -y tesseract-langpack-*
+
+# Find languages:
+dnf search -C tesseract-langpack-
+
+# View installed languages:
+rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
+```
diff --git a/LocalRunGuide.md b/LocalRunGuide.md
index a1fd49e6..35c66512 100644
--- a/LocalRunGuide.md
+++ b/LocalRunGuide.md
@@ -43,14 +43,22 @@ sudo apt-get update
 sudo apt-get install -y git automake autoconf libtool libleptonica-dev pkg-config zlib1g-dev make g++ java-17-openjdk python3 python3-pip
 ```
 
+For Fedora-based systems use this command:
+
+```bash
+sudo dnf install -y git automake autoconf libtool leptonica-devel pkg-config zlib-devel make gcc-c++ java-17-openjdk python3 python3-pip
+```
+
 ### Step 2: Clone and Build jbig2enc (Only required for certain OCR functionality)
 
 ```bash
-git clone https:github.com/agl/jbig2enc
-cd jbig2enc
-./autogen.sh
-./configure
-make
+mkdir ~/.git
+cd ~/.git &&\
+git clone https://github.com/agl/jbig2enc.git &&\
+cd jbig2enc &&\
+./autogen.sh &&\
+./configure &&\
+make &&\
 sudo make install
 ```
 
@@ -84,15 +92,24 @@ Install the following software:
 For Debian-based systems, you can use the following command:
 
 ```bash
-sudo apt-get install -y libreoffice-core libreoffice-common libreoffice-writer libreoffice-calc libreoffice-impress python3-uno unoconv pngquant unpaper ocrmypdf
-pip3 install opencv-python-headless
+sudo apt-get install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf
+pip3 install uno opencv-python-headless unoconv pngquant
+```
+
+For Fedora:
+
+```bash
+sudo dnf install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf
+pip3 install uno opencv-python-headless unoconv pngquant
 ```
 
 ### Step 4: Clone and Build Stirling-PDF
 
 ```bash
-git clone https://github.com/Frooodle/Stirling-PDF.git
-cd Stirling-PDF
+cd ~/.git &&\
+git clone https://github.com/Frooodle/Stirling-PDF.git &&\
+cd Stirling-PDF &&\
+chmod +x ./gradlew &&\
 ./gradlew build
 ```
 
@@ -104,18 +121,53 @@ You can move this file to a desired location, for example, `/opt/Stirling-PDF/`.
 You must also move the Script folder within the Stirling-PDF repo that you have downloaded to this directory.
 This folder is required for the python scripts using OpenCV
+```bash
+sudo mkdir /opt/Stirling-PDF &&\
+sudo mv /build/libs/S-PDF-*.jar /opt/Stirling-PDF/ &&\
+sudo mv scripts /opt/Stirling-PDF/ &&\
+echo "Scripts installed."
+```
 
 ### Step 6: Other files
 #### OCR
-If you plan to use the OCR (Optical Character Recognition) functionality, you might need to install language packs for Tesseract if running none english scanning.
+If you plan to use the OCR (Optical Character Recognition) functionality, you might need to install language packs for Tesseract if running non-english scanning.
 
 ##### Installing Language Packs
+Easiest is to use the langpacks provided by your repositories. Skip the other steps
+
+Manual:
 1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
 2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata`
-Please view [OCRmyPDF install guide](https:ocrmypdf.readthedocs.io/en/latest/installation.html) for more info.
+3.
+Please view [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html) for more info.
 
 **IMPORTANT:** DO NOT REMOVE EXISTING `eng.traineddata`, IT'S REQUIRED.
+Debian based systems, install languages with this command:
+```bash
+sudo apt update &&\
+# All languages
+# sudo apt install -y 'tesseract-ocr-*'
+
+# Find languages:
+apt search tesseract-ocr-
+
+# View installed languages:
+dpkg-query -W tesseract-ocr- | sed 's/tesseract-ocr-//g'
+```
+
+Fedora:
+
+```bash
+# All languages
+# sudo dnf install -y tesseract-langpack-*
+
+# Find languages:
+dnf search -C tesseract-langpack-
+
+# View installed languages:
+rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
+```
 
 ### Step 7: Run Stirling-PDF
 
@@ -125,6 +177,31 @@ or
 java -jar build/libs/app.jar
 ```
 
+### Step 8: Adding a Desktop icon
+
+This will add a modified Appstarter to your Appmenu.
+```bash
+location=$(pwd)/gradlew
+image=$(pwd)/docs/stirling-transparent.svg
+
+cat > ~/.local/share/applications/Stirling-PDF.desktop <
+
+
+
+
+
+
diff --git a/src/main/resources/static/images/flags/se.svg b/src/main/resources/static/images/flags/se.svg
new file mode 100644
index 00000000..0e41780e
--- /dev/null
+++ b/src/main/resources/static/images/flags/se.svg
@@ -0,0 +1,4 @@
+
+
+
+
diff --git a/src/main/resources/templates/fragments/navbar.html b/src/main/resources/templates/fragments/navbar.html
index d4d0a3cf..835b594d 100644
--- a/src/main/resources/templates/fragments/navbar.html
+++ b/src/main/resources/templates/fragments/navbar.html
@@ -262,12 +262,18 @@ function compareVersions(version1, version2) {
 icon Español
+
+ icon Italiano
+
 icon 简体中文
 icon Català
+
+ icon Svenska
+
diff --git a/src/main/resources/templates/other/ocr-pdf.html b/src/main/resources/templates/other/ocr-pdf.html
index d9c400bb..0dfca950 100644
--- a/src/main/resources/templates/other/ocr-pdf.html
+++ b/src/main/resources/templates/other/ocr-pdf.html
@@ -80,5 +80,4 @@
-
\ No newline at end of file