OT: Cataloging Software for PDF (Linux or PHP)

Not image related, but since there are a few people here who use Linux, maybe somebody knows:

I want to organize my PDF scans. It’s mostly catalogs/fliers from fairs. They are scanned to PDF with an OCR overlay.

There are some groups, such as "2019 ABC Fair", and then there are keywords. The PDFs should be searchable (but if the overlay text is stored in a DB, that's fine too).

I know that Adobe has software for this. But, well, it's expensive.

The cataloging software I have looked into so far is usually specialized for one thing: books, records, etc. I also started to write some PHP code (putting the text overlay in a MariaDB database, searching, etc.), and that somehow works too. It does have trouble with different PDF versions, though (I need to convert them first).
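To make the store-and-search idea concrete, here is a rough sketch. It is not my actual PHP code: it uses Python with the built-in sqlite3 module as a stand-in for PHP and MariaDB, and it assumes the OCR text has already been pulled out of the PDF by some other tool. The table name `pdf_docs`, the column names, and the function names are all made up for illustration.

```python
import sqlite3


def open_catalog(db_path=":memory:"):
    """Open (or create) the catalog. SQLite stands in for MariaDB here."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS pdf_docs (
               id        INTEGER PRIMARY KEY,
               file_path TEXT NOT NULL UNIQUE,
               grp       TEXT NOT NULL,             -- e.g. '2019 ABC Fair'
               keywords  TEXT NOT NULL DEFAULT '',  -- comma-separated
               ocr_text  TEXT NOT NULL DEFAULT ''   -- text from the OCR overlay
           )"""
    )
    return conn


def add_document(conn, file_path, group, keywords, ocr_text):
    """Insert one scanned PDF, replacing any earlier entry for the same path."""
    conn.execute(
        "INSERT OR REPLACE INTO pdf_docs (file_path, grp, keywords, ocr_text) "
        "VALUES (?, ?, ?, ?)",
        (file_path, group, ",".join(keywords), ocr_text),
    )
    conn.commit()


def search(conn, term, group=None):
    """Return paths of PDFs whose OCR text or keywords contain the term."""
    # Escape LIKE wildcards so that '%' and '_' in the term match literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = f"%{escaped}%"
    sql = (
        "SELECT file_path FROM pdf_docs "
        "WHERE (ocr_text LIKE ? ESCAPE '\\' OR keywords LIKE ? ESCAPE '\\')"
    )
    params = [pattern, pattern]
    if group is not None:
        # Optionally restrict the search to one group (fair).
        sql += " AND grp = ?"
        params.append(group)
    sql += " ORDER BY file_path"
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = open_catalog()
    add_document(conn, "2019_abc/acme.pdf", "2019 ABC Fair",
                 ["pumps"], "Acme hydraulic pumps catalog")
    print(search(conn, "pumps", group="2019 ABC Fair"))
```

A plain substring search like this gets slow once there are many long documents. A full-text index would probably be the better choice at that point, but I have left it out to keep the sketch short.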

This should run on my NAS, which has a Linux-like OS and can run PHP too.

But just wondering if anybody else has a comment about it.

Nextcloud has full-text search; it says it can search inside PDFs and can even add an OCR component. It is written in PHP, and I find it to be good.

You could probably use Zotero for this purpose. It’s meant for organizing scientific papers, but should be reasonably suitable for your task as well.

Thanks, I will check that out. Nextcloud is already installed, will get the Nextant app and try that first.

Will do some reading about Zotero later.

PS: Nextant will not be supported from version 14 onward (the current version is 13.0.**), use “Full Text Search