* cosmetic changes for compatibility with hadolint
* common configuration for most editors until better plugins come up
* Changes to PDF metadata, using PyMuPDF (faster and more compatible)
* small changes to other file ingestions to try to keep the fields consistent
* Lint, review, and review
* fixed unknown chars
* Use PyMuPDF for PDF loading for 200% speed increase
linting
---------
Co-authored-by: Francisco Bischoff <franzbischoff@gmail.com>
Co-authored-by: Francisco Bischoff <984592+franzbischoff@users.noreply.github.com>
* Update filetypes.py
Added mbox format
* Created new file
Added support for mbox files as used by many email services, including Google Takeout's Gmail archive.
* Update filetypes.py
* Update as_mbox.py
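
For reviewers unfamiliar with the format, here is a minimal sketch of what mbox ingestion can look like using only Python's standard `mailbox` module. It is not the contents of `as_mbox.py`: the function name `load_mbox` and the shape of the returned dicts (`text` plus a `metadata` mapping) are assumptions made for illustration.

```python
import mailbox


def load_mbox(path):
    """Hypothetical helper: return one dict per message in an mbox file.

    Each dict holds the concatenated text/plain parts and a few headers.
    The field names are illustrative, not the project's actual schema.
    """
    docs = []
    # create=False raises mailbox.NoSuchMailboxError instead of creating
    # an empty file when the path does not exist.
    box = mailbox.mbox(str(path), create=False)
    try:
        for msg in box:
            body_parts = []
            # walk() yields the message itself and every nested MIME part.
            for part in msg.walk():
                if part.get_content_type() != "text/plain":
                    continue
                payload = part.get_payload(decode=True)  # bytes, or None
                if payload is None:
                    continue
                charset = part.get_content_charset() or "utf-8"
                try:
                    body_parts.append(payload.decode(charset, errors="replace"))
                except LookupError:
                    # Unknown charset label: fall back to UTF-8.
                    body_parts.append(payload.decode("utf-8", errors="replace"))
            docs.append({
                "text": "\n".join(body_parts),
                "metadata": {
                    "subject": str(msg.get("Subject", "")),
                    "from": str(msg.get("From", "")),
                    "date": str(msg.get("Date", "")),
                    "source": str(path),
                },
            })
    finally:
        box.close()
    return docs
```

Only `text/plain` parts are collected here, so HTML-only messages would yield an empty `text`; handling those is left out of the sketch.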