How do you people manage/organize/index downloaded PDF & other type files?

gotnull · Oct 14, 2025

I keep my ebooks (PDF and EPUB) on my home server (FreeBSD 14.3) and access them over NFS.
It is one dir called "ebooks" which contains several other dirs, one dir per category.

Code:

ebooks/shell
ebooks/bsd
ebooks/web
...

Determining which book belongs to which category was probably the most boring part.
All books have been renamed according to a specific syntax: no white space, no punctuation marks except dash and dot, and everything in lowercase.
I did that with sysutils/p5-File-Rename. Be careful with it though, and make good use of the dry-run option, because things can quickly go quite wrong (regex power).
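
For illustration, here is a minimal Python sketch of that naming rule with a dry-run switch. It is not the p5-File-Rename invocation used above; the function names `normalize_name` and `rename_tree` are made up for this example, and turning white space and underscores into dashes is an assumption about the syntax.

```python
import re
from pathlib import Path


def normalize_name(name: str) -> str:
    """Lowercase a file name and keep only a-z, 0-9, dash and dot."""
    name = name.lower()
    name = re.sub(r"[\s_]+", "-", name)      # assumption: white space/underscores become dashes
    name = re.sub(r"[^a-z0-9.-]", "", name)  # drop all other punctuation (and non-ASCII letters)
    name = re.sub(r"-{2,}", "-", name)       # collapse runs of dashes
    name = re.sub(r"\.{2,}", ".", name)      # collapse runs of dots
    return name.strip("-.")


def rename_tree(root, dry_run=True):
    """Return the planned (old, new) pairs; rename only when dry_run is False."""
    planned, taken = [], set()
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        new_name = normalize_name(path.name)
        if not new_name or new_name == path.name:
            continue
        target = path.with_name(new_name)
        if target.exists() or target in taken:
            print(f"skip (name collision): {path}")
            continue
        taken.add(target)
        planned.append((path, target))
        if not dry_run:
            path.rename(target)
    return planned
```

Calling `rename_tree("ebooks")` first and reading the returned pairs before calling it again with `dry_run=False` mirrors the dry-run habit recommended above.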

To search for a book I made a shell script that fuzzy-searches the ebook dir (using /usr/bin/find and textproc/fzy) for the argument passed to it, and opens the result with graphics/zathura if it's a PDF or deskutils/foliate if it's an EPUB.
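
The idea can be sketched in Python roughly as follows. This is not the actual script: a plain subsequence match stands in for fzy's interactive scoring, and the command names `zathura` and `foliate` are assumed to be the binaries those ports install.

```python
from pathlib import Path

# Assumption: these are the executable names on the system.
OPENERS = {".pdf": "zathura", ".epub": "foliate"}


def is_subsequence(query: str, text: str) -> bool:
    """True if the characters of query appear in text in order, ignoring case."""
    remaining = iter(text.lower())
    return all(ch in remaining for ch in query.lower())


def search(root, query):
    """Return matching ebooks under root, shortest file name first."""
    hits = [
        p for p in Path(root).rglob("*")
        if p.suffix.lower() in OPENERS and is_subsequence(query, p.name)
    ]
    return sorted(hits, key=lambda p: (len(p.name), p.name))


def open_command(path):
    """Build the argument list that would open path in the matching viewer."""
    return [OPENERS[Path(path).suffix.lower()], str(path)]
```

Passing the result of `open_command(hits[0])` to `subprocess.run` would launch the viewer; that step is left out here so the sketch has no side effects.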

Regarding text files, they are mostly my wiki, my drafts, sometimes my code, my snippets, etc. Basically everything I write goes into my notes at some point.
They are in my home in one dir called "notes".
To manage them, I made a script that again relies on textproc/fzy; with it I write/edit/delete/search/list/show notes.
To have categories (kind of), every note name has a prefix, which helps me search later.

Code:

notes/wiki.mystuff1
notes/wiki.mystuff2
notes/wiki.mystuff3
notes/code.hello1
notes/code.hello2
notes/text.blabla1
...
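
As a rough illustration of how such prefixes can act as categories, the Python sketch below groups note names by the part before the first dot. The helper `group_by_prefix` is hypothetical and not part of the script described here.

```python
from collections import defaultdict


def group_by_prefix(names):
    """Group note names by the text before the first dot (the category)."""
    groups = defaultdict(list)
    for name in names:
        prefix, dot, _rest = name.partition(".")
        # Names without a dot have no category and land under the empty key.
        groups[prefix if dot else ""].append(name)
    return dict(groups)
```

Feeding it the file names from the notes dir gives one list per prefix, for example everything starting with `wiki.` under the key `wiki`.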

In the past I used deskutils/zim or vimwiki to manage my notes, but I needed something simpler so I wrote my own thing, and I have had no regrets since.

cracauer@ · Oct 14, 2025

I don't like directories as categories. Many PDFs belong in several categories and symlinks would make it messy.

If I play the filename game I rename the pdf to a pretty long name that is both descriptive and has tags and categories as words in it.

Then you can use find(1) on the tree.
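
A small Python sketch of the same idea, assuming the words in a file name are separated by anything that is not a letter or digit. The function name `find_by_tags` and the whole-word matching rule are choices made for this example, not something find(1) does by itself.

```python
import re
from pathlib import Path


def find_by_tags(root, *tags):
    """Return PDFs under root whose file name contains every tag as a whole word."""
    wanted = {tag.lower() for tag in tags}
    hits = []
    for path in Path(root).rglob("*.pdf"):
        # Split the name (without extension) into lowercase words.
        words = set(re.split(r"[^a-z0-9]+", path.stem.lower()))
        if wanted <= words:
            hits.append(path)
    return sorted(hits)
```

With a file named `2019-zfs-tuning-freebsd-storage.pdf`, a call like `find_by_tags("ebooks", "zfs", "storage")` would return it, so one file can sit in several categories without symlinks.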

ChubaDuba · Oct 14, 2025

Maybe deskutils/cherrytree?

meaw229a · Oct 15, 2025

I have also been using deskutils/cherrytree for years. It helps me find notes and documents quite fast.
It is one of my most important tools and it runs on anything: Windows, Mac, Linux, BSD.

hruodr · Oct 15, 2025

meaw229a said:
I have also been using deskutils/cherrytree for years. It helps me find notes and documents quite fast.
It is one of my most important tools and it runs on anything: Windows, Mac, Linux, BSD.

Just installed it, 'only' 18MiB, seems sophisticated, but I do not know what to do with it.

I personally prefer my own solution, as I wrote above: a small program that generates an sqlite3 db for search. Perhaps recoll can be used to generate it.
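
A minimal sketch of what such a program could look like, using Python's standard sqlite3 module. The table layout and the function names `build_index` and `search` are assumptions for illustration; this indexes file names only and does not use recoll.

```python
import sqlite3
from pathlib import Path


def build_index(root, db_path=":memory:"):
    """Walk root and store one row per file in an SQLite database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs ("
        "path TEXT PRIMARY KEY, name TEXT, size INTEGER, mtime REAL)"
    )
    with conn:  # commit all inserts in one transaction
        for p in Path(root).rglob("*"):
            if p.is_file():
                st = p.stat()
                conn.execute(
                    "INSERT OR REPLACE INTO docs VALUES (?, ?, ?, ?)",
                    (str(p), p.name, st.st_size, st.st_mtime),
                )
    return conn


def search(conn, term):
    """Return paths whose file name contains term (LIKE ignores ASCII case)."""
    rows = conn.execute(
        "SELECT path FROM docs WHERE name LIKE ? ORDER BY path",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]
```

Because the metadata lives in the database and not in the file names or file system attributes, the documents themselves can be tarred up or moved unchanged, and the index can be rebuilt on the other machine.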

I do not like to put metadata in file names, no more than a date at the beginning, a short, concise name and an extension pointing to the format. Some file systems support arbitrary metadata, but that is also not a solution if one wants to make tarballs or move the documents to another computer.

_martin · Oct 15, 2025

cracauer@ said:
Requires the Google ecosystem, but anyway:

notebooklm.google.com is a nice toy for pdfs. It is an LLM that you can throw a few documents into and it will answer from those documents. Questions, summaries etc. I find it to be quite good as long as the document doesn't rely too much on pictures.

Hm, that might be an interesting solution. What's the policy about illegal pdfs though? I imagine most of the pdfs (ebooks) people own are not legal; or so I assume given my library.

froggit9000 · Oct 15, 2025

gotnull said:
Regarding text files, they are mostly my wiki, my drafts, sometimes my code, my snippets, etc. Basically everything I write goes into my notes at some point.
They are in my home in one dir called "notes".
To manage them, I made a script that again relies on textproc/fzy; with it I write/edit/delete/search/list/show notes.
To have categories (kind of), every note name has a prefix, which helps me search later.

Code:

notes/wiki.mystuff1
notes/wiki.mystuff2
notes/wiki.mystuff3
notes/code.hello1
notes/code.hello2
notes/text.blabla1
...

In the past I used deskutils/zim or vimwiki to manage my notes, but I needed something simpler so I wrote my own thing, and I have had no regrets since.

For editing and managing extensive notes, have you ever seen or used Emacs with Org-mode and the Org-Roam + Org-Roam-UI extensions?
If not, there are some good videos on YouTube showing its ease of use, its power to link notes, and its ability to visualise how all your notes relate to each other. I have used it for a year or so and am still learning, but I have not found anything better so far. Highly recommended.

https://share.google/images/nqYvaRUZpkgBGX3Gp

View: https://www.youtube.com/watch?v=Ea_-TaEGa7Y

View: https://www.youtube.com/watch?v=AyhPmypHDEw

View: https://www.youtube.com/watch?v=zRT4vNh-kV8

View: https://www.youtube.com/watch?v=e-SjhYZjIO8

astyle · Oct 15, 2025

I once tried to organize animes that I download and want to re-watch later... and never got around to building any sort of digital library. I just have a few 2TB SSDs where I store them. And what I have - I have a redundant backup of that.

With e-books - I can download any e-book I want pretty much, but reading them is another matter. Just putting in the effort to organize them - yeah, but that's just for me.

If I want to take notes or do some journaling - textproc/obsidian is my go-to app, it offers a nice search function for local stuff, and can be adapted to organize files. Thing is, it does take a lot of time and effort to organize stuff to fit a system. I'm still learning Obsidian's features, it's a nice time sink.

Organizing stuff for convenient access is nice, I usually organize with the idea that it's easy to export and import the contents.

dusan-gvozdenovic · Oct 15, 2025

I am developing a program in my free time to do just this: https://gitlab.com/dusan-gvozdenovic/librarian

It allows direct modification of XMP metadata in documents (currently only PDFs and DjVus are kind of supported) so that things like title, author (creator) and creation date[1] can be updated and later searched against. However:

It's a bit simplistic atm and not feature complete (but there is a general idea of how it should work).
I do not have much free time lately so it only gets a commit every once in a while.

If there is anybody in this thread who would like to contribute in any way that would be great (code, feedback, etc.)!

1: More will come later