-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Fix CliRunner error with logging message #3140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
I confirm that this approach does not regress on #2993 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to update the docstring for __del__ if del is the agreed upon solution: it's not about closing bytesIO instances in a deterministic order any more
|
You're right thanks, I'll change it. |
|
Running |
|
Interesting, so there is a regression after all ? to be complete, I used a slightly different command to check this earlier. Here it is: Also note that I multiplied the number of tests by 10, as compared to my reprod in #2993 I also re-ran this just now, and still didn't hit the race. Maybe it's platform specific now ? I'm running macOS 15 on a 10-cores M2 arch for what it's worth (but I originally discovered this race in GHA). |
|
I'm running inside a Ubuntu 24.04-based docker container (GDAL 3.10.3) on a 16-core Ryzen 9950X3D running Ubuntu 24.04. I don't know anything about the race conditions, but I still believe the root cause is multiple objects believing they have ownership over these buffers. |
| if not self.copy_to.closed: | ||
| self.copy_to.flush() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately this LBYL (Look Before You Leap) approach is infamously inappropriate to guard against races; in a concurrent execution model, self.copy_to.closed could change between the moment the guard clause (if ...) is evaluated, and the rest of the block is, and then you're back to square one, but the code is more complex and the race is potentially harder to reproduce (which isn't necessarily a good thing when you're try to debug it).
I agree. That seems like an accurate description of the problem. |
|
I am in favor of introducing I am testing the effects of these two in: #3151 |
|
FYI, these changes does not fix the issues uncovered from #3151. It doesn't mean that this PR is not fixing something else! 😅 |
Tests using CliRunner run in parallel (with
pytest-xdist) sometimes raise an error due to logging messages not being processed correctly because a buffer has been closed. This bug is a due to a regression after #2991.fixes #3110
Difficult to write a test for it...