quickskilling/marimo_polars_ggsql

title Modern Workflow
marimo-version 0.23.6
width full
import marimo as mo
import polars as pl
import ggsql

import io
import requests

Getting started with Marimo and ggsql

Notes on rendering in Marimo

As of 5/1925, ggsql provides an example for use with Python in Jupyter. It doesn't work for Marimo. You can see notes on two issues that discuss this point.

To get the chart to display in Marimo, set its size with chart.properties(width=300, height=300). The guidance from the Marimo issue, which said to use chart1.display(width=300, height=300), didn't work for me.

This workbook uses render_marimo() to render the ggsql visualizations. The full function is:

def render_marimo(df, viz, width = 600, height = 300, **kwargs):
    chart = ggsql.render_altair(df, viz, **kwargs)
    return chart.properties(width=width, height=height)

Create a simple render_marimo() function

This version adds a few settings that change the chart's theme.

# def render_marimo(df, viz, width = 600, height = 300, **kwargs):
#     chart = ggsql.render_altair(df, viz, **kwargs)
#     return chart.properties(width=width, height=height)

def render_marimo(df, viz, width=600, height=300, **kwargs):
    chart = ggsql.render_altair(df, viz, **kwargs)

    chart = (
        chart
        .configure_view(
            fill="white",        # plot area background
            stroke="transparent" # remove grey border if present
        )
        .configure(
            background="white"   # outer chart background
        )
        .properties(width=width, height=height)
    )

    return chart

Load Simple Polars DataFrame

df = pl.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [10, 20, 15, 30, 25],
    "category": ["A", "B", "A", "B", "A"]
})

Example that works

# Example that works
chart = ggsql.render_altair(df, "VISUALISE x, y DRAW point")
chart = chart.properties(width=300, height=300)

chart.display()

Example that doesn't work

# Example that doesn't work
chart1 = ggsql.render_altair(df, "VISUALISE x, y DRAW point")
chart1.display(width=300, height=300)

Using the render_marimo() function

The rest of this example will use render_marimo()

render_marimo(df, "VISUALISE x, y DRAW point")

Exploring ggsql

Line Chart

Line Charts in ggsql

names = pl.read_parquet("https://posit.byui.edu/names_year/names_year.parquet")
mo.ui.table(names, max_columns=60, max_height=250)

# Visualize Command
grammar = '''
VISUALISE year as x, UT as y, name as color
DRAW line
SCALE ORDINAL color
LABEL
    title => 'Plot of Name Distribution',
    x => 'Year',
    y => 'Count of Names',
    color => 'Baby Name'
'''

render_marimo(names.filter(pl.col('name').is_in(['John', 'Jay'])), grammar)

mo.ui.table(
    names
    .filter(pl.col('name').is_in(['John', 'Jay']))
    .with_columns(pl.date(pl.col('year'), 1, 1).alias('year')),
    max_columns=60,
    max_height=250,
)

# Note: the color label doesn't change the legend title, but it doesn't raise an error either
grammar_date = '''
VISUALISE year as x, UT as y, name as color
DRAW line
SCALE ORDINAL color
SCALE x VIA date
    RENAMING * => '{:time %Y}'
LABEL
    title => 'Plot of Name Distribution',
    x => 'Year',
    y => 'Count of Names',
    color => 'Baby Name'
'''

render_marimo(
    names\
        .filter(pl.col('name').is_in(['John', 'Jay']))\
        .with_columns(pl.date(pl.col('year'), 1,1).alias('year')),
    grammar_date)

# Note: the color label doesn't change the legend title, but it doesn't raise an error either
grammar_date_facet = '''
VISUALISE year as x, UT as y, name as color
DRAW line
FACET name SETTING free => 'x'
SCALE ORDINAL color
SCALE x VIA date
    RENAMING * => '{:time %Y}'
LABEL
    title => 'Plot of Name Distribution',
    x => 'Year',
    y => 'Count of Names',
    color => 'Baby Name'
'''
# simple function doesn't handle facet well
render_marimo(
    names\
        .filter(pl.col('name').is_in(['John', 'Jay']))\
        .with_columns(pl.date(pl.col('year'), 1,1).alias('year')),
    grammar_date_facet)
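
The three line-chart queries above repeat the same VISUALISE, DRAW and LABEL text. As a minimal sketch, the query string could be assembled from its parts in plain Python. The helper name build_grammar and its parameters are hypothetical (they are not part of ggsql), and the clause layout simply mirrors the queries already shown in this notebook. Label values are inserted as-is, so a value containing a single quote would need escaping.

```python
def build_grammar(mappings, geom, clauses=(), labels=None):
    """Assemble a ggsql query string from its parts (hypothetical helper)."""
    # VISUALISE clause: comma-separated "column AS aesthetic" pairs
    mapping_sql = ", ".join(f"{col} AS {aes}" for col, aes in mappings.items())
    lines = [f"VISUALISE {mapping_sql}", f"DRAW {geom}"]
    # Extra clauses (FACET, SCALE, ...) are passed through verbatim, in order
    lines.extend(clauses)
    if labels:
        # LABEL clause: one "key => 'value'" entry per line
        label_sql = ",\n".join(
            f"    {key} => '{value}'" for key, value in labels.items()
        )
        lines.append("LABEL\n" + label_sql)
    return "\n".join(lines)


# Labels shared by all three line charts in this section
name_labels = {
    "title": "Plot of Name Distribution",
    "x": "Year",
    "y": "Count of Names",
    "color": "Baby Name",
}

grammar = build_grammar(
    {"year": "x", "UT": "y", "name": "color"},
    "line",
    clauses=["SCALE ORDINAL color"],
    labels=name_labels,
)
print(grammar)
```

The resulting string can be passed to render_marimo() in place of the hand-written grammar. The faceted variant would only need a different clauses list, with the FACET line placed before the SCALE lines.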

Box Plots and Jitter Plots

(
    names
    .select('name', 'Total', 'year')
    .filter(pl.col('name').is_in(['John', 'Jay', 'Joel', 'David', 'Ethan', 'Donovan']))
)

grammar_bp = '''
VISUALISE name AS x, Total as y
    DRAW point
     SETTING position => 'jitter'
'''

render_marimo(
    names\
        .filter(pl.col('name').is_in(['John', 'Jay', 'Joel', 'David', 'Ethan', 'Donovan'])),
    grammar_bp)

To help you in your further learning, we provide an overview of where the different grammar components fit into the ggsql syntax. Use this as a reference when you explore the full documentation.

Grammar: Syntax

Data: Data can be specified in several places:
  - SELECT … FROM … (the SQL portion before VISUALISE gets injected as the global data)
  - VISUALIZE ... FROM ... (the global data source can be specified as part of the VISUALIZE clause)
  - MAPPING ... FROM ... (the layer data source can be specified as part of the mapping in the DRAW clause)

Mappings: Mappings also exist in multiple places in the syntax:
  - ... AS ... following VISUALIZE sets global mapping that layers inherit.
  - ... AS ... following MAPPING in the DRAW clause sets layer specific mapping, potentially overriding the inherited global mapping.
  - ... AS ... following REMAPPING in the DRAW clause defines how data created by the statistics gets mapped in the layer.

Statistics: Statistics are implicitly part of the layer created with DRAW. Each layer has its own statistics transformation hard-wired.

Scales: ggsql provides default scales as needed, but these can be overridden with the SCALE clause.

Geometries: Geometries are inherent to the layers created with DRAW and PLACE. Each layer has a specific geometry and some layers may share the same geometry (e.g. histogram and bar).

Facets: Facets are created with the FACET clause. The default facet creates a single view showing all the data.

Coordinates: Coordinate systems are defined using PROJECT but can also be derived from the mapping. If you map to x and y, ggsql knows you are using a Cartesian coordinate system, and if you map to angle and radius, it knows you want a polar coordinate system.

Theme: Currently not supported.

Commands to get it set up

These have already been done in this repo.

  1. Create the repo on GitHub and then clone it.
  2. Run uv init . in the base folder and then delete the main.py file.
  3. Add package = false under [tool.uv] in pyproject.toml, since this will not be a package.
  4. Run the following uv command in the terminal: uv add marimo[edit,recommended] polars ggsql

About

A short introduction to the next generation data science workflow in Python
