Cloud Providers

Since Fugue is a framework for distributed compute, it is often paired with a service that manages Spark, Dask, or Ray clusters. This section covers using Fugue on top of cloud providers such as Databricks, Coiled, and Anyscale. The fugue-cloudprovider package lets users spin up ephemeral clusters for their workflows.
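To make the pairing concrete, here is a minimal sketch of how one piece of logic might be pointed at different backends. It assumes `fugue` and `pandas` are installed and that a cluster for the chosen engine is already reachable. The names `add_total` and `run_on` are hypothetical and exist only for illustration; the only Fugue call used is `transform` with its `engine` argument.

```python
from typing import Any, Dict, List, Optional


def add_total(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Plain Python logic with no dependency on Spark, Dask, or Ray."""
    return [{**row, "total": row["price"] * row["qty"]} for row in rows]


def run_on(engine: Optional[str] = None):
    """Run add_total through Fugue on the given engine (hypothetical helper).

    engine=None uses the local pandas-based engine. "spark", "dask", or "ray"
    need the matching Fugue extra installed and a reachable cluster.
    """
    # Imported lazily so add_total stays usable without Fugue installed.
    import pandas as pd
    from fugue import transform

    df = pd.DataFrame({"price": [2.0, 3.5], "qty": [3, 2]})
    # The schema string declares the output: all input columns plus "total".
    return transform(df, add_total, schema="*,total:double", engine=engine)
```

Calling `run_on()` executes locally, while `run_on("dask")` would send the same function to a Dask cluster. The business logic itself does not change between the two.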

Have questions? Chat with us on GitHub or Slack.

Spark

Databricks

Databricks is one of the most common providers of Spark clusters. Using the databricks-connect library, we can connect from a local machine to the SparkSession on a Databricks cluster, which makes it easy to run a workflow on an ephemeral Spark cluster.
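As a rough sketch of the local-to-cluster connection, the example below assumes the legacy databricks-connect workflow (versions before 13), where `databricks-connect configure` has already been run so that `SparkSession.builder.getOrCreate()` returns a session backed by the remote cluster. Newer releases of databricks-connect use a different entry point, so treat this as illustrative rather than definitive. `flag_large` and `run_on_databricks` are hypothetical names.

```python
from typing import Any, Dict, List


def flag_large(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Mark rows whose value exceeds 100 (plain Python, no Spark needed)."""
    return [{**row, "large": row["value"] > 100} for row in rows]


def run_on_databricks():
    """Hypothetical helper: run flag_large on a Databricks cluster.

    Assumes legacy databricks-connect is installed and configured, so the
    SparkSession below is served by the remote cluster.
    """
    # Lazy imports keep flag_large usable without Spark or Fugue installed.
    import pandas as pd
    from fugue import transform
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = pd.DataFrame({"value": [50, 150]})
    # Passing the SparkSession as the engine sends the work to the cluster.
    return transform(df, flag_large, schema="*,large:bool", engine=spark)
```

Because the engine is a SparkSession, the result should come back as a Spark DataFrame that stays on the cluster until it is collected.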

Dask

Coiled

Coiled is a managed service for hosting Dask clusters in the cloud. Using the coiled library, we can spin up an ephemeral Dask cluster or connect to an existing Dask cluster on Coiled.
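The sketch below shows one way the ephemeral pattern could look. It assumes `coiled`, `dask[distributed]`, and `fugue[dask]` are installed, that Coiled credentials are already configured, and that Fugue's Dask integration accepts a `distributed.Client` as the engine. `count_words` and `run_on_coiled` are hypothetical names.

```python
from typing import Any, Dict, List


def count_words(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Add a word count per row (plain Python, no Dask needed)."""
    return [{**row, "n_words": len(row["text"].split())} for row in rows]


def run_on_coiled(n_workers: int = 2):
    """Hypothetical helper: create a Coiled cluster, run, then tear down."""
    # Lazy imports keep count_words usable without Coiled or Dask installed.
    import coiled
    import pandas as pd
    from dask.distributed import Client
    from fugue import transform

    cluster = coiled.Cluster(n_workers=n_workers)
    client = Client(cluster)
    try:
        df = pd.DataFrame({"text": ["a b", "c d e"]})
        # as_local=True brings the result back before the cluster is closed.
        return transform(
            df, count_words, schema="*,n_words:long", engine=client, as_local=True
        )
    finally:
        # Closing both makes the cluster ephemeral: it lives only for this call.
        client.close()
        cluster.close()
```

The `try`/`finally` block is a design choice to avoid leaving a paid cluster running if the transformation fails.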

Ray

Anyscale

Anyscale is a managed cloud platform for Ray. Using the anyscale library, we can spin up an ephemeral Ray cluster or connect to an existing Ray cluster on Anyscale.
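A minimal sketch of the connection step follows. It assumes `ray`, `anyscale`, and `fugue[ray]` are installed, that Anyscale credentials are configured, and that the Anyscale Connect style address (`anyscale://<cluster name>`) is supported by the installed versions, which may not hold for newer releases. `to_celsius` and `run_on_anyscale` are hypothetical names.

```python
from typing import Any, Dict, List


def to_celsius(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert a Fahrenheit column to Celsius (plain Python, no Ray needed)."""
    return [{**row, "celsius": (row["fahrenheit"] - 32) * 5 / 9} for row in rows]


def run_on_anyscale(cluster_name: str):
    """Hypothetical helper: connect to an Anyscale cluster and run to_celsius."""
    # Lazy imports keep to_celsius usable without Ray or Fugue installed.
    import pandas as pd
    import ray
    from fugue import transform

    # Assumption: the anyscale package enables this address scheme.
    ray.init(address=f"anyscale://{cluster_name}")
    try:
        df = pd.DataFrame({"fahrenheit": [32.0, 212.0]})
        # as_local=True returns the result locally before disconnecting.
        return transform(
            df, to_celsius, schema="*,celsius:double", engine="ray", as_local=True
        )
    finally:
        ray.shutdown()
```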