Pixel Stylizer Toolkit
Turn real-world photos and videos into stylised, flat-colour living wallpapers using colour clustering, Fourier filtering, edge enhancement, and optional super-resolution upscaling.
The core technique is K-Means colour clustering applied per-frame with temporal centroid smoothing. Each frame's pixels are grouped by colour (and optionally by spatial position), replaced with their cluster centroid, then blended back toward the original with edge enhancement. The result is a stable, flat-colour cartoon look that holds across video without flickering.
Several other transform approaches are included as experiments — see Available transforms below.
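
The core idea above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the project's KMeansClusteringTransform: the function names, the parameters k, smooth and blend, and the warm-start strategy are all hypothetical, and edge enhancement and spatial weighting are left out. It builds a full pixel-by-centroid distance matrix, so it only suits small frames.

```python
import numpy as np


def assign(pixels, centroids):
    """Index of the nearest centroid (squared Euclidean) for every pixel."""
    d2 = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans_palette(pixels, init_centroids, iters=5):
    """Run a few Lloyd iterations; return (labels, centroids)."""
    centroids = np.array(init_centroids, dtype=np.float64)  # private copy
    for _ in range(iters):
        labels = assign(pixels, centroids)
        for j in range(len(centroids)):
            members = pixels[labels == j]
            if len(members):  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return assign(pixels, centroids), centroids


def stylize_frame(frame, prev_centroids=None, k=4, smooth=0.8, blend=0.0, seed=0):
    """Flatten one HxWx3 float frame in [0, 1]; return (styled, centroids)."""
    h, w, _ = frame.shape
    pixels = frame.reshape(-1, 3).astype(np.float64)
    if prev_centroids is None:
        rng = np.random.default_rng(seed)
        init = pixels[rng.choice(len(pixels), size=k, replace=False)]
    else:
        init = prev_centroids  # warm start keeps cluster identities stable
    labels, centroids = kmeans_palette(pixels, init)
    if prev_centroids is not None:
        # Temporal smoothing: exponential moving average of the palette.
        centroids = smooth * prev_centroids + (1.0 - smooth) * centroids
    flat = centroids[labels].reshape(h, w, 3)  # paint pixels with their centroid
    styled = (1.0 - blend) * flat + blend * frame  # blend back toward the original
    return styled, centroids
```

Passing the previous frame's centroids back in warm-starts the clustering and averages the palette over time, which is one plausible way to get the flicker-free behaviour described above.
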

    pip install -r requirements.txt

Torch is a hard dependency: every transform has both NumPy and Torch
backends, and pixtyle auto-selects the GPU when one is available.
TensorFlow is only needed for the optional super-resolution step
(pip install -r sr/requirements.txt).

    python pixtyle.py input/images/DayWallpapers/IMG_6418.JPEG
    python pixtyle.py input/images/DayWallpapers --output output/styled/DayWallpapers
    python pixtyle.py input/videos/saul.mp4
    python pixtyle.py input/images/DayWallpapers --sr \
        --output output/styled/DayWallpapers_super
    python pixtyle.py input/images/DayWallpapers --transform enhanced_cluster

Edit pixtyle_config.py to set your defaults: transform choice, per-transform
parameters, SR model path, output folder, and video pipeline options. CLI flags
override any config value per-run.
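
As a rough illustration of how per-run flags can take precedence over config defaults, here is a small argparse sketch. It is not the actual pixtyle.py parser: the DEFAULTS dict and resolve_settings function are hypothetical stand-ins, and only the --transform, --output and --sr flags shown in the commands above are modelled.

```python
import argparse
from pathlib import Path

# Hypothetical stand-ins for values that would live in pixtyle_config.py.
DEFAULTS = {
    "transform": "kmeans_cluster",
    "sr": False,
    "output": Path("output/styled"),
}


def resolve_settings(argv=None):
    """Merge CLI flags over config defaults; flags win when given."""
    parser = argparse.ArgumentParser(description="config override sketch")
    parser.add_argument("input", type=Path)
    # default=None lets us tell "flag not given" apart from an explicit value.
    parser.add_argument("--transform", default=None)
    parser.add_argument("--output", type=Path, default=None)
    parser.add_argument("--sr", action="store_true", default=None)
    args = parser.parse_args(argv)

    settings = dict(DEFAULTS)
    for key in DEFAULTS:
        value = getattr(args, key)
        if value is not None:  # only override what the user actually passed
            settings[key] = value
    settings["input"] = args.input
    return settings
```

Using None as the parser default keeps the config file as the single source of defaults, so changing a value there never has to be mirrored in the parser.
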

    # pixtyle_config.py
    from pathlib import Path

    TRANSFORM = "kmeans_cluster"                  # active transform
    APPLY_SR = False                              # set True to enable SR by default
    INPUT_IMAGES_DIR = Path("input/images")       # default image source
    INPUT_VIDEOS_DIR = Path("input/videos")       # default video source
    OUTPUT_STYLED = Path("output/styled")         # pixtyle default
    OUTPUT_TEST = Path("output/test_stylizers")   # test_stylizers default
    OUTPUT_VIDEO_BATCH = Path("output/videos")    # video_processing batch default

Available transforms:

| Key | Description |
|---|---|
| kmeans_cluster | K-Means palette reduction with spatial weighting and per-frame temporal smoothing ← recommended |
| enhanced_cluster | Euclidean + angular blended distance clustering at reduced resolution, with GPU Sobel edge overlay |
| cartoon_combo | Multi-stage pipeline: Fourier feature extraction → K-Means → edge pop → colour finishing |
| fourier_animate | FFT bandstop filter with style presets (anime_soft, anime_bold, cartoon, watercolor, graphic_novel) |
| edge_pop | Standalone edge stylizer: sobel/canny operators run on GPU when available; scharr/prewitt/laplacian are cv2-only |
| Utilities | color_pop, blur, sharpen, invert, clip: small building blocks; not showcased in README grids |

Tune each transform by editing the corresponding *_PARAMS dict in pixtyle_config.py.
The summaries below describe the showcase presets in pixtyle_config.py. For Fourier,
see the STYLE_PRESETS keys in transforms/fourier_animate.py.

kmeans_cluster:

| | |
|---|---|
| What | Flat colour regions, stable palettes, smooth temporal looks on video. |
| How | Vectorized K-means over colour (optionally weighted with spatial cues), centroid smoothing across frames. |
| When to use | Natural photos/video where you want a clean cartoon posterisation without ornate line art first. |
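
One common way to weight colour with spatial cues is to append scaled pixel coordinates to each colour vector before clustering. The sketch below assumes that approach; the function name and the spatial_weight parameter are hypothetical and may not match what kmeans_cluster.py does.

```python
import numpy as np


def colour_spatial_features(frame, spatial_weight=0.25):
    """Turn an HxWx3 float frame into (H*W, 5) rows of [r, g, b, w*x, w*y]."""
    h, w, _ = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Normalise coordinates to [0, 1] so they are comparable to colour values.
    xy = np.stack([xs / max(w - 1, 1), ys / max(h - 1, 1)], axis=-1)
    feats = np.concatenate([frame, spatial_weight * xy], axis=-1)
    return feats.reshape(-1, 5)
```

With spatial_weight set to 0 this reduces to pure colour clustering; larger values favour compact regions over colour fidelity.
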

enhanced_cluster:

| | |
|---|---|
| What | Stronger segmentation readout than plain k-means, with visible edge reinforcement. |
| How | Blended Euclidean + angular distance on downscaled clustering, Sobel-derived edge tint, colour pop finishing. |
| When to use | Busy scenes where k-means looks too soft and you want crisper patches and outlining. |
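
A blended Euclidean + angular distance can be illustrated as a weighted sum of a normalised Euclidean term and the angle between colour vectors. The weighting parameter alpha, the normalisation constants, and the function name are assumptions made for this sketch, not values taken from enhanced_cluster.py.

```python
import numpy as np


def blended_distance(pixels, centroids, alpha=0.5, eps=1e-8):
    """(N, 3) pixels vs (K, 3) centroids -> (N, K) blended distances in [0, 1]."""
    # Euclidean term, scaled so that colours in [0, 1] give values in [0, 1].
    diff = pixels[:, None, :] - centroids[None, :, :]
    euclid = np.linalg.norm(diff, axis=2) / np.sqrt(3.0)

    # Angular term: angle between colour vectors, scaled by pi.
    pn = np.linalg.norm(pixels, axis=1, keepdims=True)   # (N, 1)
    cn = np.linalg.norm(centroids, axis=1)[None, :]      # (1, K)
    # Near-black vectors have no defined direction; eps avoids dividing by zero.
    cos = (pixels @ centroids.T) / np.maximum(pn * cn, eps)
    angular = np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi

    return alpha * euclid + (1.0 - alpha) * angular
```

The angular term ignores brightness, so a shadowed and a lit patch of the same hue can land in one cluster, while the Euclidean term still separates them when alpha is raised.
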

edge_pop:

| | |
|---|---|
| What | Outline / glow / ink-forward looks over the pixels you already have. |
| How | Gradient-based edge masks (Torch or OpenCV backends), thresholds, blending modes (ink, glow). |
| When to use | Subjects where line readability beats flat colour flattening; a good counterpart to clustering rows. |
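
To make the gradient-mask idea concrete, here is a NumPy-only sketch of a thresholded Sobel mask with an ink-style blend. It stands in for the Torch/OpenCV backends mentioned above; the function names, the threshold and strength parameters, and the exact blend formula are illustrative assumptions.

```python
import numpy as np


def sobel_magnitude(gray):
    """Sobel gradient magnitude of a 2-D array, using edge-replicated padding."""
    p = np.pad(gray, 1, mode="edge")
    left, right = p[:, :-2], p[:, 2:]   # columns to the left/right of each pixel
    gx = (right[:-2] + 2.0 * right[1:-1] + right[2:]) \
        - (left[:-2] + 2.0 * left[1:-1] + left[2:])
    top, bottom = p[:-2, :], p[2:, :]   # rows above/below each pixel
    gy = (bottom[:, :-2] + 2.0 * bottom[:, 1:-1] + bottom[:, 2:]) \
        - (top[:, :-2] + 2.0 * top[:, 1:-1] + top[:, 2:])
    return np.hypot(gx, gy)


def ink_overlay(frame, threshold=0.5, strength=1.0):
    """Darken an HxWx3 float frame along strong edges; return (result, mask)."""
    gray = frame.mean(axis=2)
    mag = sobel_magnitude(gray)
    peak = mag.max()
    if peak > 0:
        mag = mag / peak  # normalise so the threshold is relative to the strongest edge
    mask = (mag >= threshold).astype(np.float64) * strength
    # "Ink" blend: pull edge pixels toward black in proportion to the mask.
    return frame * (1.0 - mask[..., None]), mask
```

A glow-style variant would add a blurred copy of the mask to the frame instead of multiplying it away.
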

fourier_animate:

| | |
|---|---|
| What | Stylised frequency-domain look (anime, graphic, watercolor, etc.) driven by FFT band presets. |
| How | Band-stop / radial masks in Fourier space, colour finishing (posterisation, presets). |
| When to use | Imagery rich in gradients/detail where poster-style clustering is not the first goal. |
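
A radial band-stop filter in Fourier space can be sketched as follows. The function name and the r_low / r_high cut-offs (in cycles per pixel) are hypothetical; the real presets live in STYLE_PRESETS and may use a different radius convention or a soft-edged mask.

```python
import numpy as np


def radial_bandstop(gray, r_low=0.1, r_high=0.3):
    """Zero out spatial frequencies whose radius lies in [r_low, r_high]."""
    h, w = gray.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequency, cycles per pixel
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequency, cycles per pixel
    radius = np.sqrt(fx ** 2 + fy ** 2)
    keep = ~((radius >= r_low) & (radius <= r_high))  # False inside the stop band
    spectrum = np.fft.fft2(gray)
    # The mask is symmetric in frequency, so the inverse is real up to rounding.
    return np.real(np.fft.ifft2(spectrum * keep))
```

Removing a mid-frequency ring suppresses fine texture while keeping both broad shading (low frequencies) and crisp outlines (high frequencies). For colour frames the filter would be applied per channel.
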

cartoon_combo:

| | |
|---|---|
| What | A single heavier pipeline: clustered colour + accent edges + finishing in one preset. |
| How | Optional Fourier-derived detail cues, core k-means segmentation, EdgePop lines, contrast/colour tweaks. |
| When to use | Quick “everything on” stylistic punch when you prefer one knob over chaining CLIs manually. |

The utility transforms invert, blur, clip, sharpen, and color_pop are useful in tests or as chain steps inside other pipelines; they are intentionally omitted from the README image/video grids.
SR upscaling is available for images only using a TensorFlow SavedModel (EDSR or similar).
- Download or export an EDSR SavedModel and place it at models/EDSR_x3.pb (or update SR_MODEL_PATH in pixtyle_config.py).
- Pass --sr on the command line, or set APPLY_SR = True in config.
TensorFlow is not required for the main stylization pipeline — only for SR.
The sr/ directory contains an independent Keras-based SR implementation
(EDSR, SRGAN, WDSR) that requires a separate sisr conda environment
(Python 3.6, TF 2.3).
Masters live under assets/original/ (see assets/original/README.md).
Run python scripts/build_showcase.py to regenerate assets/images/ and assets/video/ from those files only —
no dependency on gitignored input/.
JPEG tiles are centre-cropped to 800×600. GIF tiles are centre-cropped to 240×135 and sampled at a 12 fps display rate from the full-duration trimmed masters, unless you override GIF_MAX_SECONDS / GIF_START_S in the script.
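
For reference, a centre crop on an image array amounts to slicing an equal margin off each side. This sketch covers the crop step only and assumes the image is already at least as large as the target; the function name is hypothetical, and scripts/build_showcase.py may resize before cropping.

```python
import numpy as np


def centre_crop(image, out_w, out_h):
    """Return the central out_h x out_w window of an HxW or HxWxC array."""
    h, w = image.shape[:2]
    if out_w > w or out_h > h:
        raise ValueError("crop is larger than the image")
    top = (h - out_h) // 2
    left = (w - out_w) // 2
    return image[top:top + out_h, left:left + out_w]
```

For example, centre_crop(frame, 800, 600) on a 1920×1080 frame keeps the middle 800×600 region.
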

Showcase grid (rows: Day, Day, Night; columns: Original, K-Means, Enhanced, Edge Pop, Fourier, Combo).

Showcase grid (rows: Arch, Deer, Jungle; columns: Original, K-Means, Enhanced, Edge Pop, Fourier, Combo).

    pixtyle.py              ← unified entry point (start here)
    pixtyle_config.py       ← user-editable defaults
    video_processing.py     ← optional CLI (uses pixtyle_config; overwrite-original batch)
    transforms/             ← all transform classes
      base.py               ← TransformBase ABC
      kmeans_cluster.py     ← KMeansClusteringTransform (NumPy + Torch)
      enhanced_cluster.py   ← EnhancedClusterTransform
      fourier_animate.py    ← FourierAnimateTransform (style presets)
      combined_cartoon.py   ← CombinedCartoonTransform (multi-stage)
      edge_pop.py           ← EdgePopTransform
      helpers.py            ← Fourier-domain helpers (radius grid, band-stop)
      simple_ops.py         ← blur, sharpen, invert, clip, color_pop
    utils/                  ← shared utilities
      cluster_utils.py      ← K-Means math (init, distance, recompute, population)
      color_utils.py        ← colour-pop, contrast-clip, posterize, YUV conversion
      edges_utils.py        ← Sobel, Canny, FFT edge extractors (NumPy + Torch)
      temporal_core.py      ← SceneChangeDetector, reset_if_scene_change
      tensor_utils.py       ← NumPy ↔ Torch conversion, device detection
      io_utils.py           ← frame resize, folder iterators, display helpers
      math_utils.py         ← pure distance functions
    processors/             ← VideoProcessor, ImageProcessor, transform registry
    assets/original/        ← trimmed masters → build_showcase → assets/images,video/
    scripts/                ← build_showcase, test_stylizers, yt-dlp, etc.
    sr/                     ← super-resolution model code (separate TF environment)
    input/images/           ← sample images (gitignored)
    input/videos/           ← input videos (gitignored)
    output/                 ← all generated outputs (gitignored)
    docs/                   ← project handbook (CONTEXT.md)

Add a new transform:
- Subclass TransformBase in transforms/ and implement process_numpy, process_torch, reset_state, describe.
- Register it in processors/registry.py.
- Add a params dict to pixtyle_config.py under TRANSFORM_PARAMS.
- Set TRANSFORM = "your_key" and run.

See transforms/enhanced_cluster.py as a minimal reference implementation.
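
The steps above might look roughly like this. Everything here is a hedged sketch: the TransformBase class below is a hypothetical stand-in for the ABC in transforms/base.py, the method signatures are assumed (only the four method names come from this README), and the dict-based registry is illustrative rather than the actual processors/registry.py API.

```python
from abc import ABC, abstractmethod

import numpy as np


class TransformBase(ABC):
    """Hypothetical stand-in for the ABC in transforms/base.py."""

    @abstractmethod
    def process_numpy(self, frame): ...

    @abstractmethod
    def process_torch(self, frame): ...

    @abstractmethod
    def reset_state(self): ...

    @abstractmethod
    def describe(self): ...


class PosterizeTransform(TransformBase):
    """Toy transform: quantise each channel of a [0, 1] frame to a few levels."""

    def __init__(self, levels=4):
        self.levels = levels
        self.frames_seen = 0  # example of per-video state

    def process_numpy(self, frame):
        self.frames_seen += 1
        q = self.levels - 1
        return np.round(frame * q) / q

    def process_torch(self, frame):
        import torch  # imported lazily so the NumPy path works without Torch

        self.frames_seen += 1
        q = self.levels - 1
        return torch.round(frame * q) / q

    def reset_state(self):
        self.frames_seen = 0  # called between videos or on a scene change

    def describe(self):
        return f"posterize(levels={self.levels})"


# Hypothetical registry and params entries, keyed the same way as TRANSFORM.
TRANSFORM_REGISTRY = {"posterize": PosterizeTransform}
TRANSFORM_PARAMS = {"posterize": {"levels": 4}}
```

With these in place, TRANSFORM_REGISTRY["posterize"](**TRANSFORM_PARAMS["posterize"]) builds the transform from its config entry, which mirrors the key-based lookup the steps describe.
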
- Increase k in KMEANS_PARAMS for richer palettes; decrease downscale_factor in ENHANCED_CLUSTER_PARAMS for sharper colour segmentation (at the cost of speed).
- Set blend_with_original / fade_to_original to 0 for a fully flat cartoon; increase toward 0.3 for a more photographic result.
- edge_mask_ema in CARTOON_COMBO_PARAMS smooths the edge overlay across frames; reduce it for snappier outlines, increase it for calmer animation.
- Batch video folder processing uses the same transforms as pixtyle (see pixtyle_config). video_processing.py is an optional script for overwrite-original batch runs, previews, and async writer tuning; it no longer duplicates parameter dicts, so edit pixtyle_config.py for presets.
- Image folders are processed in shape-grouped batches of VIDEO_BATCH_SIZE (default 16) frames so transforms see uniform-shape tensors and amortize per-call overhead. Mixed-resolution folders just run smaller sub-batches.
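
Shape-grouped batching, as described in the last tip, can be sketched like this. The generator name is hypothetical and the real processor may order or load frames differently; only the VIDEO_BATCH_SIZE name and its default of 16 come from the text above.

```python
from collections import defaultdict

import numpy as np

VIDEO_BATCH_SIZE = 16  # default quoted in the tip above


def shape_grouped_batches(frames, batch_size=VIDEO_BATCH_SIZE):
    """Yield (indices, stacked_batch) pairs where every batch has one shape."""
    groups = defaultdict(list)
    for index, frame in enumerate(frames):
        groups[frame.shape].append(index)  # bucket frame indices by array shape
    for indices in groups.values():
        for start in range(0, len(indices), batch_size):
            chunk = indices[start:start + batch_size]
            # Same-shape frames stack cleanly into one (B, H, W, C) array.
            yield chunk, np.stack([frames[i] for i in chunk])
```

Returning the original indices alongside each batch lets the caller write results back under the right filenames even though the processing order differs from the folder order.
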