If you plan to use the OCR (Optical Character Recognition) functionality on non-English documents, you may need to install additional language packs for Tesseract.
##### Installing Language Packs
The easiest option is to install the language packs provided by your distribution's repositories. If you do, skip the manual steps.
Manual:
1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tessdata`
3. See the [OCRmyPDF installation guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html) for more information.
**IMPORTANT:** DO NOT REMOVE EXISTING `eng.traineddata`, IT'S REQUIRED.
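The manual steps above can be sketched as a small shell helper. This is a minimal sketch, not part of Tesseract or Stirling-PDF: `install_langpack` is a hypothetical function name, `deu.traineddata` is only an example file, and the default destination is the `/usr/share/tessdata` path from step 2. It refuses to overwrite files that already exist, so `eng.traineddata` stays untouched.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copy a downloaded .traineddata file into the
# tessdata directory without overwriting anything already there.
install_langpack() {
  local src="$1"
  local dest_dir="${2:-/usr/share/tessdata}"   # default path from step 2
  local target
  target="$dest_dir/$(basename "$src")"

  # Never replace an existing pack (e.g. eng.traineddata).
  if [ -e "$target" ]; then
    echo "skipping: $target already exists"
    return 0
  fi
  cp "$src" "$target"
}

# Example usage (writing to /usr/share/tessdata usually requires root):
#   install_langpack ./deu.traineddata
#   tesseract --list-langs   # check that the new language is listed
```

Afterwards, `tesseract --list-langs` should show the newly added language alongside `eng`.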
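Debian-based systems: language packs are packaged as `tesseract-ocr-<lang>` (for example `tesseract-ocr-deu`) and can be installed with `apt`.

RPM-based systems (such as Fedora): language packs are packaged as `tesseract-langpack-<lang>`. View the installed languages with this command:
```bash
rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
```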
Nix:
```bash
nix-env -iA nixpkgs.tesseract
```
**Note:** Nix Package Manager pre-installs almost all the language packs when tesseract is installed.
### Step 7: Run Stirling-PDF
From the root directory of the project, run one of the following commands:
```bash
# Either run from source with Gradle:
./gradlew bootRun
# or run the built JAR directly:
java -jar /opt/Stirling-PDF/Stirling-PDF-*.jar
```
Because LibreOffice, soffice, and the conversion tools set `dbus_tmp_dir="/run/user/$(id -u)/libreoffice-dbus"`, their endpoints may fail with an error if you do not have write access to that directory.
To resolve this, before starting Stirling-PDF, set the environment variable to a directory you have write access to by using the following commands: