Hello! I have some questions regarding retraining a model on the web version of DIVE to improve its accuracy.
We are interested in training a detector that can identify multiple species in a single frame over an image sequence. So far I have:
Uploaded an image sequence containing training data into the web version of VIAME Dive.
Annotated each image by placing bounding boxes around individuals of interest and labeling them.
Ran a training using the configuration file: train_detector_svm_over_generic_detections
Uploaded a new image sequence and ran the trained detector created from the previous image sequence.
The results of the detector were then corrected by fixing inaccurate bounding boxes and adding boxes around species that were missed by the trained detector.
To improve the detector I created, do I then run another training on the image sequence that I modified using the same configuration file (train_detector_svm_over_generic_detections)? Does that improve the previous detector model or does it just make a new detector model?
Also, is there a way to measure the accuracy of the detector, to check whether it is improving after retraining the model?
The current method for improving models is simply to train over more vetted / annotated sequences. So when training the updated model the second time, you would train over both the original annotated sequence and the newly annotated (different) sequence together, in the same training run.
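(In the DIVE web interface that just means including both annotated datasets when launching the training job.) If it helps to picture what the combined training set amounts to, here is a minimal Python sketch, not a DIVE/VIAME API and with hypothetical file names, that merges two exported VIAME-style annotation CSVs into one set:

```python
# Minimal sketch (not a DIVE/VIAME command): the second training run should
# see BOTH annotated sequences, not just the new one. File names are
# hypothetical placeholders for exported VIAME-style annotation CSVs.
import csv

def load_annotations(path):
    """Read a VIAME-style CSV export, skipping comment/header lines."""
    rows = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row or row[0].startswith("#"):
                continue
            rows.append(row)
    return rows

combined = load_annotations("sequence_1_annotations.csv") + \
           load_annotations("sequence_2_annotations.csv")

with open("combined_training_annotations.csv", "w", newline="") as f:
    csv.writer(f).writerows(combined)

print(f"Combined training set: {len(combined)} annotation rows")
```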
Once you have a large number of annotations, train_detector_svm_over_generic_detections is not a great training config to be using compared to train_detector_default or some of the others. The SVM model will not refine the box generation routine all that much, unlike deep-learning based methods (e.g. the one contained in the default [CFRNN]), but it depends on how many annotations per target class you have, and on the problem.
Thank you, Matt. When you say “a large number of annotations” about how many annotations per species would you say we would need to train a CFRNN model?
That question can’t be answered the same way for everyone’s data and every problem; there’s no fixed number.
Sometimes the SVM will completely flop on a new problem and perform poorly no matter how many samples it has. This comes down to failings in one of two components: the generic object detector (which can be tested by running the ‘Generic’ detector on the dataset of interest and seeing if it puts decent boxes around the targets of interest), or the simpler model it uses for target classification (which might fail on fine-grained classification problems).
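As a rough way to quantify “decent boxes”, you could compare the Generic detector’s output against your hand-drawn boxes and count how many annotated targets are covered at, say, IoU ≥ 0.5. A minimal sketch with made-up box lists (this is not a built-in DIVE metric, just a quick sanity check):

```python
# Rough check for "does the Generic detector put decent boxes around my
# targets?": fraction of ground-truth boxes matched by at least one generic
# detection at IoU >= 0.5. Boxes are hypothetical (x1, y1, x2, y2) tuples.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def coverage(ground_truth, generic_detections, thresh=0.5):
    """Fraction of annotated boxes matched by some generic detection."""
    hit = sum(any(iou(gt, det) >= thresh for det in generic_detections)
              for gt in ground_truth)
    return hit / len(ground_truth) if ground_truth else 0.0

# Example with made-up boxes for one frame:
gt = [(100, 120, 180, 200), (300, 50, 360, 110)]
dets = [(98, 118, 182, 205), (500, 500, 560, 560)]
print(f"Generic detector covers {coverage(gt, dets):.0%} of annotated targets")
```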
If the SVM model isn’t terrible, a rough estimate is that somewhere in the range of hundreds of annotations per category of interest, the CFRNN would become the better option. But that is still not a definitive guide; it depends on the difficulty of the problem, how much data and what resolutions you’re training over, what the distractor categories are (things which look similar to the objects of interest, and how close they are), and other factors.
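If you want a quick feel for where you stand relative to that rough “hundreds per category” figure, you can tally annotations per species from an exported VIAME-style CSV. This sketch assumes the common VIAME CSV layout where the first species label sits in column 10, and the file name is hypothetical:

```python
# Quick tally of annotations per species from a VIAME-style CSV export.
# Assumes the common layout where the first species/confidence pair starts
# at column 10 (index 9); the file name is a hypothetical placeholder.
import csv
from collections import Counter

counts = Counter()
with open("combined_training_annotations.csv", newline="") as f:
    for row in csv.reader(f):
        if not row or row[0].startswith("#") or len(row) < 10:
            continue
        counts[row[9]] += 1  # first species label

for species, n in counts.most_common():
    print(f"{species}: {n} annotations")
```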