Add dask.distributed and Coiled support to the CLI #63

@crusaderky

As of 1.4.0, recursive-diff already supports, and is tested to work with, a globally registered dask.distributed.Client, and uses it instead of the default threaded scheduler to read and compare the data.

This is particularly useful when the data is in AWS S3 (#61) and your client is not in AWS EC2.

Enhance the CLI to allow:

  • connecting to an already-running remote distributed.Cluster by URL
  • starting a Coiled cluster, or connecting to an already-running one by name. When starting a new cluster, it's important to expose the Coiled settings that keep the cluster running for some time after the client disconnects, so that a user can invoke the CLI several times in quick succession and reuse the already-warm cluster. This, along with the number and type of workers, region, and so on, is probably best served by pointing to a coiled.yaml file.
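
To make the two options above concrete, here is a minimal sketch of how the CLI arguments could look. Everything in it is an assumption for discussion, not the existing CLI: the flag names `--scheduler` and `--coiled-cluster`, the `build_parser` and `connect` helpers, and the simplified positional arguments are all hypothetical. It relies on the fact that creating a `distributed.Client` registers it as the global default, which recursive-diff then picks up.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with hypothetical cluster options (illustrative names only)."""
    parser = argparse.ArgumentParser(prog="recursive-diff-sketch")
    parser.add_argument("lhs", help="left-hand side path")
    parser.add_argument("rhs", help="right-hand side path")
    # The two ways of getting a cluster are mutually exclusive.
    group = parser.add_mutually_exclusive_group()
    group.add_argument(
        "--scheduler",
        metavar="URL",
        help="address of an already-running distributed scheduler",
    )
    group.add_argument(
        "--coiled-cluster",
        metavar="NAME",
        help="name of a Coiled cluster to start or attach to",
    )
    return parser


def connect(args: argparse.Namespace):
    """Create a Client for the requested cluster, or return None.

    Imports are deferred so that dask.distributed and coiled remain
    optional dependencies when no cluster option is passed.
    """
    if args.scheduler:
        from distributed import Client

        # Creating a Client registers it as the global default.
        return Client(args.scheduler)
    if args.coiled_cluster:
        import coiled
        from distributed import Client

        # Assumption: worker count, region, keep-alive and similar settings
        # come from the user's Coiled configuration, not from CLI flags.
        cluster = coiled.Cluster(name=args.coiled_cluster)
        return Client(cluster)
    return None  # fall back to the default threaded scheduler
```

With this layout, something like `recursive-diff-sketch a b --scheduler tcp://127.0.0.1:8786` would connect before the comparison starts, and omitting both flags leaves the current behaviour unchanged.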

Labels: enhancement (New feature or request)