Using Python
Contents
Using Python#
FugueSQL
integrates with Python by creating extensions and applying them in the %%fsql
cells. This tutorial will show an example of applying a Fugue Transformer. Additionally, we’ll see the simpler ways that the Python interface and FugueSQL
interact with each other.
from fugue_jupyter import setup
setup()
import pandas as pd
df = pd.DataFrame({"number": [0,1], "word": ["hello", "world"]})
Jinja Templating to Insert Variables#
Before going to functions, the simplest way FugueSQL
integrates with Python is through Jinja templating. DataFrames
defined previously are automatically accessible by the DAG
. Variables on the other hand, need to be passed with Jinja templating.
x=0
%%fsql
SELECT * FROM df WHERE number={{x}} # see we can use variable x directly
PRINT
number | word | |
---|---|---|
0 | 0 | hello |
Using Transformers#
Fugue
has different extensions that allow Python to interact with SQL. Transformer
is most commonly used because it modifies that data in a dataframe. Below we create a Transformer
in Python and apply it in the FugueSQL
. More on Transformer
syntax can be found here.
import re
from typing import Iterable, Dict, Any
# schema: *, vowel_count:int, consonant_count:int
def letter_count(df:Iterable[Dict[str,Any]]) -> Iterable[Dict[str,Any]]:
for row in df:
row['vowel_count'] = len(re.findall(r'[aeiou]', row['word'], flags=re.IGNORECASE))
space_count = len(re.findall(r'[ -]', row['word'], flags=re.IGNORECASE))
row['consonant_count'] = len(row['word']) - row['vowel_count'] - space_count
yield row
%%fsql
SELECT *
FROM df
WHERE number=1
TRANSFORM USING letter_count
PRINT
number | word | vowel_count | consonant_count | |
---|---|---|---|---|
0 | 1 | world | 1 | 4 |