Facial Recognition: SQL for separating one person into two (or more) #18974
Replies: 2 comments
-
As stated in the heading, if you have three people to separate, you can just add another set of sub-queries for the third person.
-
I'm also thinking this could be useful to add as a setup step in Immich for when you are about to upload a large library. It would start by asking for a cohort of photos of your most photographed people at different ages, make sure these are properly labelled, and then upload the rest. Every newly recognized face could then be checked against this shortlist first, and those that don't match would be handled as they are right now.
-
The facial recognition software lumped my two sons together, but I didn't want to reprocess everything as the threshold was already splitting up my daughters into ~20 people each. I have a large library spanning about 20 years that I uploaded in multiple phases.
So I wrote some SQL that helped me reassign the incorrect cases. The mix-up was almost always one way, so in my case I only had to check the older son's photos.
I used the names Bob and Charlie for B and C so that it's a little clearer for others wanting to copy the code. You have to select photos that will serve as a reference for both people. I used about 20 for each person, covering different ages, facial expressions, and lighting conditions. Fill these into bob_reference and charlie_reference. (Be sure these are correctly labelled to the right person!)
Then every photo in the photos_to_split list is compared against both sets of reference faces. One metric is the minimum distance to any reference face. The other, called reciprocal distance, is based on the best three matches (the inverse of the sum of the inverses, like adding resistors in parallel).
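To make the two metrics concrete, here is a minimal Python sketch. It is not the SQL from this post: the function names are made up for illustration, and it assumes you already have a list of non-negative distances from one face to each reference face of one person.

```python
def min_distance(distances):
    """Smallest distance from a face to any reference face of one person."""
    return min(distances)


def reciprocal_distance(distances, k=3):
    """Inverse of the sum of inverses of the k smallest distances.

    Same shape as the formula for resistors in parallel:
    1 / (1/d1 + 1/d2 + 1/d3). Several close matches pull the
    value lower than a single close match would.
    """
    best = sorted(distances)[:k]
    # Assumption: a distance of exactly 0 means an identical face,
    # so the combined distance is treated as 0 instead of dividing by zero.
    if any(d == 0 for d in best):
        return 0.0
    return 1.0 / sum(1.0 / d for d in best)


# Hypothetical distances from one face to four reference faces.
example = [0.9, 0.3, 0.6, 0.6]
closest = min_distance(example)
combined = reciprocal_distance(example)
```

With these made-up numbers the best three matches are 0.3, 0.6 and 0.6, so the reciprocal distance works out to 1 / (1/0.3 + 1/0.6 + 1/0.6), roughly 0.15, while the minimum distance is 0.3.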
You can tinker with the thresholds listed at the bottom of the SQL. I found that a difference below -1 indicated the face should be labelled as the other person, differences between -1 and 1 I checked manually and reassigned where needed, and anything above 1 I took as labelled correctly. I've left the actual update commented out for now so that you can test the queries with a select before pulling the trigger on updating the tables.
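The three-way split on the difference can be sketched in Python as follows. This is an illustration only: the function name and labels are hypothetical, and how the difference itself is computed and scaled is left to the SQL, so the function simply takes it as an input.

```python
def triage(difference, lower=-1.0, upper=1.0):
    """Sort a face into one of three buckets based on its difference score.

    Assumption: the score is positive when the current label looks right
    and negative when the other person is the better match, matching the
    thresholds of -1 and 1 described above.
    """
    if difference < lower:
        return "reassign"  # relabel as the other person
    if difference > upper:
        return "keep"      # current label taken as correct
    return "review"        # ambiguous, check by hand


# Hypothetical scores for three faces.
results = [triage(d) for d in (-1.8, 0.2, 2.4)]
```

Counting how many faces land in each bucket before running any update is a cheap way to see whether the thresholds suit your library.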
Using this, I avoided having to manually change 2700 incorrect categorizations and only had to handle about 160 by hand.
If there are some front-end and back-end devs who want to try to implement this as a service within the app, I'd be happy to help. But I code mostly in Python and SQL, so I'm not able to take this on by myself.