Dear NAVIDA team,
Thank you very much for your excellent work and for making the model weights available on Hugging Face.
I followed the tutorial and used the released open weights for evaluation. However, the results I obtained are noticeably different from those reported in the paper. I am currently running evaluation on 4 RTX A6000 GPUs.
I would like to ask whether there are any important details, settings, or evaluation procedures that I should pay special attention to in order to reproduce the reported results more accurately.
My current results are as follows:
R2R
Success rate: 1062/1839 (0.577)
Oracle success rate: 1234/1839 (0.671)
SPL: 944.572/1839 (0.514)
Distance to goal: 4.717
Path length: 11.098
RxR
Success rate: 2050/3669 (0.559)
Oracle success rate: 2341/3669 (0.638)
SPL: 1780.191/3669 (0.485)
Distance to goal: 5.428
Path length: 13.456
nDTW: 0.661
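In case it helps to cross-check the numbers, the ratios in parentheses appear to be the summed per-episode values divided by the episode count (1839 for R2R, 3669 for RxR). Below is a minimal sketch that recomputes them from the totals listed above. The dictionary keys and the `ratios` helper are names chosen for illustration only and are not part of the NAVIDA evaluation code.

```python
# Totals copied from the results listed above.
# Key names are illustrative, not taken from the evaluation scripts.
results = {
    "R2R": {"episodes": 1839, "success": 1062, "oracle_success": 1234, "spl_sum": 944.572},
    "RxR": {"episodes": 3669, "success": 2050, "oracle_success": 2341, "spl_sum": 1780.191},
}


def ratios(entry):
    """Divide each total by the episode count and round to 3 decimals."""
    n = entry["episodes"]
    return {key: round(value / n, 3) for key, value in entry.items() if key != "episodes"}


summary = {name: ratios(entry) for name, entry in results.items()}
for name, metrics in summary.items():
    print(name, metrics)
```

The recomputed values match the ratios reported above, so the gap to the paper does not seem to come from how the averages are aggregated.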
I would greatly appreciate any advice or clarification regarding evaluation setup, preprocessing, checkpoints, or other factors that might affect reproducibility.
Thank you very much for your time and support.
Best regards,
Anh Dao