Conversation

@bw4sz bw4sz commented Dec 24, 2025

Description

This PR adds a script to organize and train a new bird detector. It uses data from the original Weinstein et al. 2022 paper and adds data from the Drones for Ducks project and other datasets from lila.science.

I added blank white images to test performance and can confirm the model no longer predicts boxes in blank images, with an empty-frame accuracy of 100%.
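
As a rough illustration (not the actual script in this PR), here is a minimal sketch of adding blank white frames to a DeepForest-style annotation CSV so empty-frame accuracy can be measured; the all-zero box convention for empty frames and the column names are assumptions.

```python
# Minimal sketch: generate blank white images and append them to a
# DeepForest-style annotation CSV as empty frames. The all-zero box
# convention and column names are assumptions, not this PR's actual code.
import os
import pandas as pd
from PIL import Image

def add_blank_frames(csv_file, root_dir, n_blank=50, size=(1500, 1500)):
    annotations = pd.read_csv(csv_file)
    blank_rows = []
    for i in range(n_blank):
        name = f"blank_{i}.png"
        Image.new("RGB", size, color=(255, 255, 255)).save(os.path.join(root_dir, name))
        # Represent an empty frame as a single row with a zero-area box
        blank_rows.append({"image_path": name, "xmin": 0, "ymin": 0,
                           "xmax": 0, "ymax": 0, "label": "Bird"})
    combined = pd.concat([annotations, pd.DataFrame(blank_rows)], ignore_index=True)
    combined.to_csv(csv_file, index=False)
    return combined
```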

Next steps

  • Update docs
  • Compare performance to old detector
  • Update weights on Hugging Face
  • Check tiling sensitivity and optionally add more zoom augmentations.
  • Reach out to the community for other images to test against. Add these to the docs.
  • Add a couple of non-bird images to test as well.
  • Quick comparison with Segment Anything 3, just a screenshot from the browser.
  • Check the precision/recall tradeoff at different score thresholds (see the sketch after this list).
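
For the score-threshold item above, a rough sketch of the kind of sweep I have in mind; the `score` and `true_positive` columns are hypothetical and would come from IoU matching of predictions against ground truth.

```python
# Sketch of a precision/recall sweep over score thresholds. Assumes a
# per-prediction table with hypothetical `score` and `true_positive`
# columns produced by IoU matching, plus the total ground-truth box count.
import numpy as np
import pandas as pd

def pr_by_threshold(matches, n_ground_truth, thresholds=np.arange(0.1, 0.95, 0.05)):
    rows = []
    for t in thresholds:
        kept = matches[matches["score"] >= t]
        tp = int(kept["true_positive"].sum())
        precision = tp / len(kept) if len(kept) else 1.0
        recall = tp / n_ground_truth if n_ground_truth else 0.0
        rows.append({"threshold": round(float(t), 2),
                     "precision": precision, "recall": recall})
    return pd.DataFrame(rows)
```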

Other issues

There is an issue that needs to be documented: model.evaluate() needs a size argument (see below), and more importantly it does not give the same results as the in-training validation loop. These may be related. Let's wait until #1238 is solved and then confirm. I saw performance drop completely.
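
For reference, a minimal sketch of the two evaluation paths being compared; the checkpoint and CSV paths are placeholders, and the exact evaluate() signature (including the size argument mentioned above) is an assumption.

```python
# Sketch of the two evaluation paths; paths and the `size` argument are
# assumptions, and the exact evaluate() signature may differ.
from deepforest import main

model = main.deepforest.load_from_checkpoint("bird_detector.ckpt")  # hypothetical checkpoint

# Path 1: the Lightning validation loop (matches metrics logged during training)
model.create_trainer()
validate_results = model.trainer.validate(model)

# Path 2: standalone evaluate(), which currently needs a size argument and
# does not reproduce the validation-loop numbers (possibly related to #1238)
evaluate_results = model.evaluate(
    csv_file="validation.csv",  # hypothetical
    root_dir="images/",         # hypothetical
    iou_threshold=0.4,
    size=1500,                  # assumed argument per the note above
)
```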

I am quite confused about the CPU memory usage (@jveitchmichaelis, did you see this in other model training?). It just doesn't jibe with my expectations and back-of-the-envelope calculations: with 6 workers, a prefetch factor of 2, a batch size of 20, and an average image size of 10 MB, that's 6 × 2 × 20 × 10 MB ≈ 2.4 GB. We are seeing HUGE memory usage, and it seems to come from within the model.train loop, not the dataloader. I am concerned about kornia.
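
The back-of-the-envelope estimate above, spelled out (just to show why ~2.4 GB was expected, not where the memory actually goes):

```python
# Expected dataloader buffer size from the numbers above.
num_workers = 6
prefetch_factor = 2
batch_size = 20
avg_image_mb = 10

expected_mb = num_workers * prefetch_factor * batch_size * avg_image_mb
print(f"Expected dataloader buffer: ~{expected_mb} MB (~{expected_mb / 1000:.1f} GB)")
# Expected dataloader buffer: ~2400 MB (~2.4 GB)
```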

[Screenshot: CPU memory usage during training, 2025-12-24 10:37 AM]

Related Issue(s)

I've opened a number of issues during this PR:

#1246 #1245 #1244

AI-Assisted Development

  • [x] I used AI tools (e.g., GitHub Copilot, ChatGPT, etc.) in developing this PR
  • [x] I understand all the code I'm submitting
  • [x] I have reviewed and validated all AI-generated code

AI tools used (if applicable):

@bw4sz bw4sz self-assigned this Dec 24, 2025

jveitchmichaelis commented Dec 24, 2025

Yeah there are possibly some memory leaks. I've been trying to hunt this down with the DINO branch. You can try aggressively clearing the cache + running gc.collect() at the end of each epoch. It's hard to tell on HPG because you're shown the entire system RAM and not only your own process. Best to debug that locally if you can.
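
For reference, a minimal sketch of that suggestion as a Lightning callback (the callback name and wiring are mine, not existing DeepForest code):

```python
# Sketch: aggressively clear the CUDA cache and run the garbage collector
# at the end of each training epoch. The callback is illustrative only.
import gc
import torch
import pytorch_lightning as pl

class MemoryCleanupCallback(pl.Callback):
    def on_train_epoch_end(self, trainer, pl_module):
        gc.collect()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()

# e.g. trainer = pl.Trainer(callbacks=[MemoryCleanupCallback()])
```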

Another one was that the losses should be detached when logging, but not when being returned. We also need to make sure we don't return things from hooks that shouldn't be called directly, and that metrics are all reset. Let me gather up my changes and open a PR.
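
A sketch of the logging pattern I mean, assuming a LightningModule wrapping a torchvision-style detector that returns a dict of losses (names are illustrative):

```python
# Log a detached copy of the loss so the logger does not hold onto the
# autograd graph, but return the attached loss so backprop still works.
import torch
import pytorch_lightning as pl

class DetectorModule(pl.LightningModule):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def training_step(self, batch, batch_idx):
        images, targets = batch  # hypothetical batch structure
        loss_dict = self.model(images, targets)
        total_loss = sum(loss_dict.values())

        # Detach only for logging; keep the graph on the returned value
        self.log("train_loss", total_loss.detach(), on_epoch=True, prog_bar=True)
        return total_loss

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=1e-3)
```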

[Plot: CPU memory trace over a long training run]

Here is my trace for a long run after I made an effort to stop this from happening. You need to ignore the "background" level from other resident processes on the cluster (i.e., you can constrain with mem in SLURM and then see how low you can take it; I normally allocate 64 GB/GPU for batch sizes of 16-32). You should see it tick up (caching?) and then flatten out.

I'm pretty sure it has nothing to do with validation, as the plots look the same, just without the small dip at the end of each training epoch.


bw4sz commented Dec 30, 2025

Comparison to the old detector shows much improved performance on a 90/10 split.


Box Precision:
  Checkpoint:  0.8492
  Pretrained:  0.7495
  Difference:  +0.0997 (+13.30%)

Box Recall:
  Checkpoint:  0.8662
  Pretrained:  0.4645
  Difference:  +0.4017 (+86.47%)

Empty Frame Accuracy:
  Checkpoint:  1.0000
  Pretrained:  0.0000
  Difference:  +1.0000

This is from trainer.validate(model); the results from main.evaluate() feel muddled by #1238. We need to fully understand that issue before merging this PR.
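
For context, a rough sketch of how such a comparison can be run; the checkpoint path and the Hugging Face model id are placeholders, and both models are assumed to have the same validation CSV set in their config.

```python
# Sketch of the checkpoint-vs-pretrained comparison via trainer.validate().
# Paths and the Hugging Face model id are placeholders, not confirmed values.
from deepforest import main

# Newly trained detector
checkpoint_model = main.deepforest.load_from_checkpoint("bird_detector.ckpt")  # hypothetical
checkpoint_model.create_trainer()
checkpoint_results = checkpoint_model.trainer.validate(checkpoint_model)

# Previous release weights
pretrained_model = main.deepforest()
pretrained_model.load_model("weecology/deepforest-bird")  # assumed model id
pretrained_model.create_trainer()
pretrained_results = pretrained_model.trainer.validate(pretrained_model)
```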

Still to do is a zero-shot comparison with a new dataset; I am asking the community for at least a couple of images.

@bw4sz bw4sz force-pushed the bird_training branch 2 times, most recently from 245cf82 to 66d57f4 on December 30, 2025 at 18:39