Resources
This is a collection of blog posts and of talks given at various meetups and conferences. Questions are welcome in the Slack channel.
Case Studies
Blogs
2022
Fugue Core
Integrations
Scaling PyCaret with Spark (or Dask) through Fugue (Towards Data Science)
Fugue and DuckDB: Fast SQL Code in Python (By Khuyen Tran on Towards Data Science)
2021
Fugue Core
Fugue - Reducing Spark Developer Friction (James Le Blog)
Creating Pandas and Spark Compatible Functions with Fugue (Towards Data Science)
Data Validation
Using Pandera on Spark for Data Validation through Fugue (Towards Data Science)
FugueSQL
Interoperable Python and SQL in Jupyter Notebooks (Towards Data Science)
Data Analysis with FugueSQL on Coiled Dask Clusters (Coiled Blog)
Introducing FugueSQL — SQL for Pandas, Spark, and Dask DataFrames (By Khuyen Tran on Towards Data Science)
Conferences and Meetups
2022
2021
Data Validation
Fully Utilizing Spark for Data Validation (Spark AI Summit)
Large Scale Data Validation with Fugue (PyData Global)
FugueSQL
Dask SQL Query Engines (Dask Summit)
FugueSQL: Extending SQL Interface for End-to-End Data Pipelines (Dremio Subsurface)
FugueSQL - The Enhanced SQL Interface for Pandas, Spark, and Dask DataFrames (PyData Global)
Distributed Computing Workflows with Fugue-sql (Orlando Python Meetup)
Machine Learning
Superworkflow of Graph Neural Networks with K8S and Fugue (Spark AI Summit)
Scaling Machine Learning Workflows to Big Data with Fugue (KubeCon)
Distributed ML to Learn Causal Effect Using Fugue and Spark (AI Camp)
Tune
Intuitive and Scalable Hyperparameter Tuning with Apache Spark + Fugue (Spark AI Summit)
Fugue Tune (PyData Global)
Testing Spark
Simplifying Testing of Spark Applications (PyData Global)
Simplifying Testing of Spark Applications (DataOps DC Meetup)
2020
Unifying Spark and Non-Spark Ecosystems for Big Data Analytics (Spark AI Summit)