Using Python
Contents
Using Python#
FugueSQL integrates with Python by creating extensions and applying them in the %%fsql cells. This tutorial will show an example of applying a Fugue Transformer. Additionally, we’ll see the simpler ways that the Python interface and FugueSQL interact with each other.
from fugue_jupyter import setup
setup()
import pandas as pd
df = pd.DataFrame({"number": [0,1], "word": ["hello", "world"]})
Jinja Templating to Insert Variables#
Before going to functions, the simplest way FugueSQL integrates with Python is through Jinja templating. DataFrames defined previously are automatically accessible by the DAG. Variables on the other hand, need to be passed with Jinja templating.
x=0
%%fsql
SELECT * FROM df WHERE number={{x}} # see we can use variable x directly
PRINT
| number | word | |
|---|---|---|
| 0 | 0 | hello |
Using Transformers#
Fugue has different extensions that allow Python to interact with SQL. Transformer is most commonly used because it modifies that data in a dataframe. Below we create a Transformer in Python and apply it in the FugueSQL. More on Transformer syntax can be found here.
import re
from typing import Iterable, Dict, Any
# schema: *, vowel_count:int, consonant_count:int
def letter_count(df:Iterable[Dict[str,Any]]) -> Iterable[Dict[str,Any]]:
for row in df:
row['vowel_count'] = len(re.findall(r'[aeiou]', row['word'], flags=re.IGNORECASE))
space_count = len(re.findall(r'[ -]', row['word'], flags=re.IGNORECASE))
row['consonant_count'] = len(row['word']) - row['vowel_count'] - space_count
yield row
%%fsql
SELECT *
FROM df
WHERE number=1
TRANSFORM USING letter_count
PRINT
| number | word | vowel_count | consonant_count | |
|---|---|---|---|---|
| 0 | 1 | world | 1 | 4 |