TL;DR - had to rebuild my PhotoPrism database and now my originals count is off by ~5,000. Can I do a full sync of my devices and have it only upload what is missing?
Hello gurus,
I’ve been running PhotoPrism for quite some time and I’m happy with it.
I ran into an unrelated issue with my database (MariaDB) and had to rebuild it. PhotoPrism uses this instance of MariaDB, so naturally its metadata was gone.
The original pictures (originals) were stored on a separate array, so at a minimum I still have all my pictures. I rebuilt the database and PhotoPrism (Docker container) and pointed it to the array for the originals. Once that was done, I logged in to the PhotoPrism UI and performed a complete rescan and index of my originals. When it finished, I noticed that my originals count was 27,000 but I should have 31,000 objects (according to a picture of the PhotoPrism UI I took the night before rebuilding the database). So I started digging a bit.
The array itself (where my originals are stored) is showing 27,000 objects.
The pictures I took the night before rebuilding the database and PhotoPrism containers said that the count of originals was ~31,000.
The two main devices backing up media to PhotoPrism are my phone and my wife’s phone. My phone shows ~4,500 and my wife’s shows ~26,500.
Since these two phones were fully backed up a few weeks before the rebuild, I should have ~31,000 objects in the originals.
My question is, can I redo a full backup sync of both phones (through PhotoSync) and have it only copy the objects that are not in the originals?
Since the database had to be rebuilt, I fear that if I do another full sync, it will just copy everything again and I’ll end up with ~60,000 objects rather than the ~31,000 I should have.
What can I do to see which objects are missing between my devices and PhotoPrism and how can I only copy those over to PhotoPrism?
I am far from a PhotoPrism expert, but I can safely say the indexing algorithm is weird and takes multiple runs to finish. Logically I would expect to run it once and have it do everything in one scan, but I’ve found it sometimes takes 3 to 5 full scans to properly catch up with major changes. It’s almost like it acknowledges big changes and records them but waits for multiple passes before committing them. It also does a really good job, when scanning, of looking for duplicate images and stacking/repointing them to the valid file. I would advise running the indexing another 2 or 3 times if you are confident the 31k files are actually on storage and just not showing up in the database.
That’s the thing: if I count the objects in the actual storage I get 27k, but based on the counts on the two devices that I back up to PhotoPrism I should have at least 31k between the two phones. So somehow I’ve lost ~5k. It might not have been a big deal to just do a full sync with PhotoSync again to copy over whatever was missing between the two phones and storage, BUT given that I had to rebuild PhotoPrism’s database, I’m not confident that the new database will have the same unique ID for each picture as before. If I kick off another full sync with PhotoSync, it may copy everything again because “the new database doesn’t have a record of that picture”.
I reindexed everything once the new database was built, but again I’m not sure whether the new unique IDs (or however PhotoPrism knows that it already has an object) will match and skip the upload, or whether it will just accept each one as a new object.
I think a full resync followed by a re-index will go fine. My setup is different in that I sync everything through Nextcloud and run a script that looks for changes and triggers an indexing scan in PhotoPrism. That being said, I’ve absolutely mutilated some PhotoPrism databases (migrating servers, different folder names with the same content), run full indexing, and never ended up with duplicates. It’s very good at stacking and cleaning up the same files in the DB so long as there aren’t actual duplicates in the original storage. But again, it might take 3 or more full scans to find and purge duplicates.
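If you want to rule out actual duplicates in the originals storage before re-indexing, a content-hash check is one way to do it. This is only a minimal sketch: `find_duplicates` and `file_digest` are names made up for illustration, the SHA-256 comparison is independent of whatever PhotoPrism does internally, and it only catches byte-identical files (a re-encoded or edited copy will not match).

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by content; return groups with more than one file.

    Paths are returned relative to root, using forward slashes.
    """
    root = Path(root)
    groups = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            groups[file_digest(path)].append(path.relative_to(root).as_posix())
    return sorted(sorted(paths) for paths in groups.values() if len(paths) > 1)
```

Calling `find_duplicates("/path/to/originals")` (the path is a placeholder) returns a list of groups, each group being the relative paths that share identical content. An empty list means no byte-identical duplicates were found.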
Well, that’s good news. For now, I’ve created a different path on my array, reconfigured PhotoPrism to look at this new path for the originals, and cleared out the database one more time. I’m in the process of fully reuploading/resyncing my devices (two phones). Once that’s done, I will write a script to see which objects are missing from the old path compared to the new one and vice versa, to figure out why I’m short ~5,000 objects. Once I have that list, I can reupload the missing objects and I’m back in business (hopefully).
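A minimal sketch of what such a comparison script could look like, assuming both paths are mounted on the same machine and that "same object" means byte-identical content. The names `index_tree` and `compare_trees` are hypothetical, and matching is done on SHA-256 of the file contents rather than on file names, so a photo that was renamed during upload still counts as present.

```python
import hashlib
from pathlib import Path


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_tree(root):
    """Map content hash -> list of relative paths (forward slashes) under root."""
    root = Path(root)
    index = {}
    for path in root.rglob("*"):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            index.setdefault(hash_file(path), []).append(rel)
    return index


def compare_trees(old_root, new_root):
    """Return (only_in_old, only_in_new) as sorted lists of relative paths."""
    old_index = index_tree(old_root)
    new_index = index_tree(new_root)
    only_in_old = sorted(
        rel
        for digest, paths in old_index.items()
        if digest not in new_index
        for rel in paths
    )
    only_in_new = sorted(
        rel
        for digest, paths in new_index.items()
        if digest not in old_index
        for rel in paths
    )
    return only_in_old, only_in_new
```

For example, `only_in_old, only_in_new = compare_trees("/path/to/old", "/path/to/new")` (both paths are placeholders) gives two lists: files whose content exists only under the old path, and files whose content exists only under the new one. Hashing ~30k photos reads every file once, so expect it to take a while on a slow array.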