
Set mp_spawn to True per default #1922

Open
fmaussion wants to merge 2 commits into OGGM:master from fmaussion:mptrue

Conversation

@fmaussion

Member

Closes or at least addresses #1921

@TimoRoth

Member

I'm not sure if this is a good idea.
Spawn mode is still not the default upstream on Linux, since it's quite troublesome at times. The new default there is forkserver.
And for us, using fork still drastically speeds things up, since in spawn mode each new process has to re-load everything from disk and can't re-use any existing resources.
So it's a severe performance penalty, and I also think spawn mode still has hidden broken edge cases in code that relies on pre-loaded global state.
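
To make the point about pre-loaded global state concrete, here is a minimal sketch that uses only the standard `multiprocessing` module. `CACHE`, `read_cache` and `run_in_child` are hypothetical stand-ins for illustration, not OGGM code, and the sketch assumes it is saved and run as a script:

```python
import multiprocessing as mp

# Hypothetical stand-in for expensive state that the parent loads once.
CACHE = {}


def read_cache(key):
    """Executed in the worker: look the key up in the worker's own CACHE."""
    return CACHE.get(key)


def run_in_child(method, key):
    """Run read_cache in a single worker created with the given start method."""
    ctx = mp.get_context(method)
    with ctx.Pool(1) as pool:
        return pool.apply(read_cache, (key,))


if __name__ == "__main__":
    # State filled in at runtime by the parent only.
    CACHE["dem"] = "loaded"

    if "fork" in mp.get_all_start_methods():
        # A forked worker starts as a copy of the parent's memory,
        # so it sees the entry added above.
        print("fork :", run_in_child("fork", "dem"))

    # A spawned worker starts a fresh interpreter and re-imports this module,
    # so the runtime-filled entry is missing there.
    print("spawn:", run_in_child("spawn", "dem"))
```

Anything the parent set up at runtime has to be rebuilt or passed explicitly to a spawned worker, which is where both the performance cost and the edge cases come from.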

Using fork is only deprecated when using it combined with threads.
And we're not using threads anywhere.

@fmaussion

Member Author

Thanks! That helps me understand the issue a bit more.

> Using fork is only deprecated when using it combined with threads.
> And we're not using threads anywhere.

This is where I'm fuzzy: I wonder whether the libraries we are using might use threads without us knowing, or whether we should make sure they don't.
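
One rough way to look into this is to count threads just before the worker pool is created. The sketch below is only an assumption about how such a check could look. The helper names are made up. `threading.enumerate()` only reports threads that Python knows about, so the second helper reads `/proc/self/task`, which exists on Linux only, to also count native threads started by compiled libraries:

```python
import os
import threading


def other_python_threads():
    """Names of live Python-level threads other than the calling thread."""
    current = threading.current_thread()
    return [t.name for t in threading.enumerate() if t is not current]


def count_os_threads():
    """Number of OS-level threads in this process, or None if unavailable.

    Relies on /proc/self/task (Linux only), which has one entry per thread,
    including native threads that the threading module cannot see.
    """
    try:
        return len(os.listdir("/proc/self/task"))
    except OSError:
        return None


if __name__ == "__main__":
    print("Python threads besides this one:", other_python_threads())
    print("OS-level threads in this process:", count_os_threads())
```

If the OS-level count is larger than one after importing the dependencies, some library has started threads on its own.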

@TimoRoth

Member

That's impossible to tell, but sadly quite possible.
I guess ideally we'd just be using threads instead of entirely new processes, but Python's global interpreter lock makes those not really workable, since it effectively serializes them almost completely.

We could try experimenting with a free-threaded Python build, which is a thing now and eliminates that lock.
But I think the various binary dependencies we have, like GDAL, do not support that mode at all.
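
For such an experiment, a small check like the following could tell whether the interpreter is a free-threaded build and whether the lock is actually off at runtime. This is a sketch: `free_threading_status` is a made-up helper, and `sys._is_gil_enabled()` only exists on Python 3.13 and later, so older versions are treated as having the lock enabled:

```python
import sys
import sysconfig


def free_threading_status():
    """Return (build_is_free_threaded, gil_enabled_now) as two booleans."""
    # Py_GIL_DISABLED is set to 1 for free-threaded builds; it is 0 or
    # missing otherwise.
    build_is_free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # sys._is_gil_enabled() was added in Python 3.13; assume the GIL is
    # enabled when the function is not available.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled_now = bool(check()) if check is not None else True

    return build_is_free_threaded, gil_enabled_now


if __name__ == "__main__":
    build, gil = free_threading_status()
    print("free-threaded build:", build)
    print("GIL currently enabled:", gil)
```

Running the check again after importing the binary dependencies would be informative, since a free-threaded interpreter can turn the lock back on when it loads an extension module that does not declare support for running without it.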
