Thanks for the wonderful work!
When I run this network on my own dataset, I see some strange behavior:
1. A single kernel may sometimes correspond to multiple objects;
2. Some obvious objects are missed (they can be detected in the first few frames, and the later images differ only slightly).
Can you give me any guidance or suggestions on how to debug or solve these problems?