How to Improve Classifier Confidence Scoring in DIVE Across Multiple Video Sources?

Hello

I am using DIVE to annotate and analyze video datasets from multiple camera sources (e.g., drones, static surveillance, bodycams), and I am noticing inconsistent classifier confidence scores across different environments, even when detecting the same class of object.

Is there a recommended way to normalize or calibrate confidence values for better cross-video consistency?

I understand that DIVE and VIAME can be fine-tuned for specific datasets, but I would love to hear from others who have addressed similar issues with classifier thresholds and scoring pipelines. Has anyone implemented a post-processing step to align classifier outputs, or perhaps used ensemble detection techniques? I have already checked the Home - DIVE guide for anything related to this. A rough sketch of what I mean by a post-processing step is below.
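To make the question concrete, here is the kind of thing I have in mind (plain Python, not a DIVE API; the detection dict format is just a placeholder): rescale raw confidences per camera source so their distributions line up before thresholding.

```python
import numpy as np

def normalize_scores_per_source(detections):
    """Rescale raw confidences per camera source so their
    distributions are comparable across sources.

    `detections` is a list of dicts like
    {"source": "drone", "score": 0.87}, a hypothetical format,
    not a DIVE export schema.
    """
    # Group raw scores by source.
    by_source = {}
    for det in detections:
        by_source.setdefault(det["source"], []).append(det["score"])

    # Per-source mean/std for a simple z-score normalization.
    stats = {
        src: (np.mean(scores), np.std(scores) + 1e-8)
        for src, scores in by_source.items()
    }

    # Map z-scores back into (0, 1) with a sigmoid so they still
    # behave like confidences downstream.
    for det in detections:
        mean, std = stats[det["source"]]
        z = (det["score"] - mean) / std
        det["score_norm"] = 1.0 / (1.0 + np.exp(-z))
    return detections
```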

On a related note, while researching model performance metrics, I stumbled upon Perplexity AI and the underlying concept of perplexity, and it made me wonder whether a similar perplexity-style evaluation could be adapted for visual classifiers in DIVE.
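For context on what I mean: for a classifier that outputs per-class probabilities, a perplexity-style score would just be the exponential of the average negative log-likelihood of the true class. A minimal sketch, assuming ground-truth labels and softmax outputs are available (nothing here is DIVE-specific):

```python
import numpy as np

def classification_perplexity(probs, labels):
    """Perplexity-style score for a visual classifier.

    probs:  (N, C) array of softmax probabilities per detection.
    labels: (N,) array of ground-truth class indices.

    Returns exp(mean negative log-likelihood); 1.0 means a perfect,
    fully confident classifier, and higher values mean the model is
    more "surprised" by the true labels.
    """
    p_true = probs[np.arange(len(labels)), labels]
    nll = -np.log(np.clip(p_true, 1e-12, None))
    return float(np.exp(nll.mean()))
```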

Thank you!!

Sorry for the delayed reply.

In terms of improving confidence-score consistency across different sensors, I think we'd need to know more about your particular problem. A lot depends on what type of training data you're using: if your training dataset is significantly skewed towards a particular sensor, chances are the model will perform better on that sensor. Sometimes having separate models for different sensors is best, but it depends on how variable the appearances are across the different modalities. If you want to set up a meeting, please reach out to matt.dawkins@kitware.com or viame-web@kitware.com.
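If you do have some labeled validation data per sensor, one generic post-hoc option, just a sketch rather than anything built into DIVE, is to fit a small per-sensor calibrator such as Platt scaling, so that a score of 0.8 means roughly the same thing regardless of which camera produced it:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_platt_calibrators(val_scores, val_correct):
    """Fit one Platt-scaling calibrator per sensor.

    val_scores:  dict mapping sensor name -> (N,) raw confidences
                 on a labeled validation set.
    val_correct: dict mapping sensor name -> (N,) 0/1 array,
                 1 where the detection/classification was correct.
    """
    calibrators = {}
    for sensor, scores in val_scores.items():
        # Logistic regression on the raw score is exactly Platt scaling.
        lr = LogisticRegression()
        lr.fit(scores.reshape(-1, 1), val_correct[sensor])
        calibrators[sensor] = lr
    return calibrators

def calibrate(calibrators, sensor, raw_score):
    """Map a raw confidence to a calibrated probability of being correct."""
    lr = calibrators[sensor]
    return float(lr.predict_proba(np.array([[raw_score]]))[0, 1])
```

After calibration, a single confidence threshold can be applied across all sensors instead of tuning one per camera.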

We do use ensemble detection in several pipelines (e.g., the habcam and sea lion add-ons), which run multiple different detectors and then combine their scores into a single output.
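As a very rough illustration of that kind of score fusion (not the actual add-on code; the IoU threshold and simple averaging are placeholder choices), a two-detector ensemble can match boxes by IoU and average the confidences of matched pairs:

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-8)

def fuse_detections(dets_a, dets_b, iou_thresh=0.5):
    """Naive two-detector ensemble: average scores of boxes that
    overlap enough, and keep unmatched boxes with their own scores.

    Each detection is (box, score) with box = (x1, y1, x2, y2).
    """
    fused, used_b = [], set()
    for box_a, score_a in dets_a:
        # Find the best-overlapping, not-yet-used box from detector B.
        best_j, best_iou = None, iou_thresh
        for j, (box_b, _) in enumerate(dets_b):
            if j not in used_b and iou(box_a, box_b) >= best_iou:
                best_j, best_iou = j, iou(box_a, box_b)
        if best_j is not None:
            used_b.add(best_j)
            fused.append((box_a, (score_a + dets_b[best_j][1]) / 2.0))
        else:
            fused.append((box_a, score_a))
    # Detections only detector B found are passed through unchanged.
    fused += [d for j, d in enumerate(dets_b) if j not in used_b]
    return fused
```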