indianeagle4599/Pixtyle

Pixtyle

Pixel Stylizer Toolkit

Turn real-world photos and videos into stylised, flat-colour living wallpapers using colour clustering, Fourier filtering, edge enhancement, and optional super-resolution upscaling.


How it works

The core technique is K-Means colour clustering applied per-frame with temporal centroid smoothing. Each frame's pixels are grouped by colour (and optionally by spatial position), replaced with their cluster centroid, then blended back toward the original with edge enhancement. The result is a stable, flat-colour cartoon look that holds across video without flickering.
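As a rough illustration of the idea above, here is a minimal NumPy sketch of per-frame K-Means with centroid carry-over between frames. It is not the repository's implementation: the function names, the exponential-moving-average smoothing rule, and the `smoothing` and `blend` parameters are assumptions made for this example, and edge enhancement is left out.

```python
import numpy as np


def _assign(pixels, centroids):
    """Index of the nearest centroid (squared Euclidean) for each pixel."""
    d2 = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans_frame(frame, init_centroids, iters=5):
    """Run a few Lloyd iterations on one RGB frame, starting from init_centroids."""
    pixels = frame.reshape(-1, 3).astype(np.float32)
    centroids = np.asarray(init_centroids, dtype=np.float32).copy()
    for _ in range(iters):
        labels = _assign(pixels, centroids)
        for j in range(len(centroids)):
            members = pixels[labels == j]
            if len(members):  # leave empty clusters where they are
                centroids[j] = members.mean(axis=0)
    labels = _assign(pixels, centroids)
    return centroids, labels.reshape(frame.shape[:2])


def stylize_frame(frame, prev_centroids=None, k=4, smoothing=0.8, blend=0.0, seed=0):
    """Return (stylized uint8 frame, centroids to carry into the next frame)."""
    if prev_centroids is None:
        # First frame: seed centroids from k randomly chosen pixels.
        rng = np.random.default_rng(seed)
        pixels = frame.reshape(-1, 3)
        init = pixels[rng.choice(len(pixels), size=k, replace=False)]
    else:
        init = prev_centroids
    centroids, labels = kmeans_frame(frame, init)
    if prev_centroids is not None:
        # Assumed smoothing rule: exponential moving average of the centroids.
        centroids = smoothing * prev_centroids + (1.0 - smoothing) * centroids
    flat = centroids[labels]  # replace every pixel with its cluster centroid
    out = (1.0 - blend) * flat + blend * frame.astype(np.float32)
    return np.rint(np.clip(out, 0, 255)).astype(np.uint8), centroids
```

Starting each frame from the previous frame's centroids keeps cluster `j` pointing at roughly the same colour over time, which is what makes averaging centroid `j` across frames meaningful. The distance matrix here is `N × k × 3`, so a real pipeline would downscale or chunk full-resolution frames.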

Several other transform approaches are included as experiments — see Available transforms below.


Quick start

pip install -r requirements.txt

Torch is a hard dependency — every transform has both NumPy and Torch backends, and pixtyle auto-selects the GPU when one is available. TensorFlow is only needed for the optional super-resolution step (pip install -r sr/requirements.txt).

Stylize a single image

python pixtyle.py input/images/DayWallpapers/IMG_6418.JPEG

Stylize a whole folder of images

python pixtyle.py input/images/DayWallpapers --output output/styled/DayWallpapers

Stylize a video

python pixtyle.py input/videos/saul.mp4

Enable super-resolution (images only)

python pixtyle.py input/images/DayWallpapers --sr \
    --output output/styled/DayWallpapers_super

Switch transform on the fly

python pixtyle.py input/images/DayWallpapers --transform enhanced_cluster

Configuration

Edit pixtyle_config.py to set your defaults — transform choice, per-transform parameters, SR model path, output folder, and video pipeline options. CLI flags override any config value per-run.

# pixtyle_config.py
from pathlib import Path
TRANSFORM        = "kmeans_cluster"      # active transform
APPLY_SR         = False                 # enable SR by default
INPUT_IMAGES_DIR = Path("input/images")  # default image source
INPUT_VIDEOS_DIR = Path("input/videos")  # default video source
OUTPUT_STYLED    = Path("output/styled")            # pixtyle default
OUTPUT_TEST      = Path("output/test_stylizers")    # test_stylizers default
OUTPUT_VIDEO_BATCH = Path("output/videos")          # video_processing batch default
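
The "config defaults, CLI overrides" behaviour can be sketched with argparse. This is an assumed pattern, not a copy of pixtyle.py: the `DEFAULTS` dict stands in for values read from pixtyle_config.py, and `resolve_settings` is a hypothetical helper. Only the `--transform`, `--output`, and `--sr` flags come from the commands shown above.

```python
import argparse
from pathlib import Path

# Stand-ins for values that would come from pixtyle_config.py.
DEFAULTS = {
    "transform": "kmeans_cluster",
    "sr": False,
    "output": Path("output/styled"),
}


def build_parser():
    parser = argparse.ArgumentParser(description="config-with-CLI-override sketch")
    parser.add_argument("input", type=Path)
    # default=None lets us tell "flag not given" apart from an explicit value.
    parser.add_argument("--transform", default=None)
    parser.add_argument("--output", type=Path, default=None)
    parser.add_argument("--sr", action="store_true", default=None)
    return parser


def resolve_settings(argv):
    """Start from the config defaults, then apply only the flags that were passed."""
    args = vars(build_parser().parse_args(argv))
    settings = dict(DEFAULTS)
    for key, value in args.items():
        if value is not None:
            settings[key] = value
    return settings
```

Calling `resolve_settings(["photo.jpg", "--transform", "enhanced_cluster"])` changes only the transform; every other setting keeps its config default.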

Available transforms

Key               Description
kmeans_cluster    K-Means palette reduction with spatial weighting and per-frame temporal smoothing ← recommended
enhanced_cluster  Euclidean + angular blended distance clustering at reduced resolution, with GPU Sobel edge overlay
cartoon_combo     Multi-stage pipeline: Fourier feature extraction → K-Means → edge pop → colour finishing
fourier_animate   FFT bandstop filter with style presets (anime_soft, anime_bold, cartoon, watercolor, graphic_novel)
edge_pop          Standalone edge stylizer; sobel/canny operators run on GPU when available, scharr/prewitt/laplacian are cv2-only
Utilities         color_pop, blur, sharpen, invert, clip: small building blocks, not showcased in README grids

Tune each transform by editing the corresponding *_PARAMS dict in pixtyle_config.py.


Stylization methods (what / how / when)

The summaries below describe the showcase presets defined in pixtyle_config.py. For the Fourier presets, see the STYLE_PRESETS keys in transforms/fourier_animate.py.

K-means cluster (kmeans_cluster)

What: Flat colour regions, stable palettes, temporally smooth results on video.
How: Vectorized K-means over colour (optionally weighted with spatial cues), with centroid smoothing across frames.
When to use: Natural photos/video where you want a clean cartoon posterisation without ornate line art.
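
One common way to weight clustering "with spatial cues" is to append scaled pixel coordinates to each colour vector before running K-means. The sketch below shows that construction only. The function name and the `spatial_weight` parameter are hypothetical, and the repository may combine colour and position differently.

```python
import numpy as np


def colour_position_features(frame, spatial_weight=0.5):
    """Build one (r, g, b, w*x, w*y) feature row per pixel, all scaled to [0, 1]."""
    h, w = frame.shape[:2]
    colours = frame.reshape(-1, 3).astype(np.float32) / 255.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Normalise coordinates so image size does not change the weighting.
    pos = np.stack(
        [xs.ravel() / max(w - 1, 1), ys.ravel() / max(h - 1, 1)], axis=1
    ).astype(np.float32)
    return np.concatenate([colours, spatial_weight * pos], axis=1)
```

With `spatial_weight=0` this reduces to plain colour clustering. Larger values favour compact, spatially contiguous regions over a strict colour match.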

Enhanced cluster (enhanced_cluster)

What: Stronger segmentation readout than plain k-means, with visible edge reinforcement.
How: Blended Euclidean + angular distance, clustered on a downscaled frame, with a Sobel-derived edge tint and colour pop finishing.
When to use: Busy scenes where k-means looks too soft and you want crisper patches and outlining.
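
To make "blended Euclidean + angular distance" concrete, here is a small sketch for colours scaled to [0, 1]. The blend weight `alpha` and both normalisations are assumptions for illustration. The actual formula lives in the repository's utilities and may differ.

```python
import numpy as np


def blended_distance(pixels, centroids, alpha=0.5, eps=1e-8):
    """Return an (N, k) matrix: alpha * Euclidean + (1 - alpha) * angular distance."""
    pixels = np.asarray(pixels, dtype=np.float64)
    centroids = np.asarray(centroids, dtype=np.float64)
    diff = pixels[:, None, :] - centroids[None, :, :]
    # Euclidean part, divided by sqrt(3) so it lies in [0, 1] for RGB in [0, 1].
    euclid = np.linalg.norm(diff, axis=2) / np.sqrt(3.0)
    p_norm = np.linalg.norm(pixels, axis=1, keepdims=True)
    c_norm = np.linalg.norm(centroids, axis=1, keepdims=True)
    cos = (pixels @ centroids.T) / (p_norm * c_norm.T + eps)
    # Angular part: non-negative colours span at most 90 degrees, so scale by pi/2.
    # A pure black pixel has no direction and ends up at the maximum angle here.
    angular = np.arccos(np.clip(cos, -1.0, 1.0)) / (np.pi / 2)
    return alpha * euclid + (1.0 - alpha) * angular
```

The angular term compares hue-like direction and ignores brightness. A dark grey pixel is angularly close to a light grey centroid even when a saturated centroid is nearer in Euclidean terms.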

Edge pop (edge_pop)

What: Outline / glow / ink-forward looks over the pixels you already have.
How: Gradient-based edge masks (Torch or OpenCV backends), thresholds, blending modes (ink, glow).
When to use: Subjects where line readability matters more than flat colour; a good counterpart to the clustering rows.
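
A gradient-based edge mask with an "ink" blend can be sketched in plain NumPy as follows. This is illustrative only: the repository uses Torch or OpenCV backends, and the function names, `threshold`, and `strength` parameters here are hypothetical.

```python
import numpy as np


def sobel_magnitude(gray):
    """Sobel gradient magnitude of a 2-D array, with edge-replicated borders."""
    g = np.pad(gray.astype(np.float32), 1, mode="edge")
    # Horizontal gradient: right neighbours minus left neighbours (1-2-1 weights).
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]) - (
        g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2]
    )
    # Vertical gradient: lower neighbours minus upper neighbours.
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]) - (
        g[:-2, :-2] + 2 * g[:-2, 1:-1] + g[:-2, 2:]
    )
    return np.hypot(gx, gy)


def ink_edges(frame, threshold=0.25, strength=1.0):
    """Darken pixels whose normalised edge magnitude reaches the threshold."""
    gray = frame.astype(np.float32).mean(axis=2) / 255.0
    mag = sobel_magnitude(gray)
    peak = mag.max()
    if peak > 0:
        mag = mag / peak  # normalise to [0, 1]; a flat frame stays all zeros
    mask = (mag >= threshold).astype(np.float32) * strength
    out = frame.astype(np.float32) * (1.0 - mask[..., None])
    return np.rint(out).astype(np.uint8), mask
```

A "glow" mode would add light along the mask instead of multiplying it away. That variant is not shown here.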

Fourier animate (fourier_animate)

What: Stylised frequency-domain look (anime, graphic, watercolor, etc.) driven by FFT band presets.
How: Band-stop / radial masks in Fourier space, colour finishing (posterisation, presets).
When to use: Imagery rich in gradients/detail where poster-style clustering is not the first goal.
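
The band-stop step can be illustrated on a single channel with NumPy's FFT. The cut-off values below are arbitrary examples in cycles per pixel, not the values behind any of the named presets, and `radius_grid` / `bandstop_filter` are hypothetical names.

```python
import numpy as np


def radius_grid(h, w):
    """Radial spatial frequency (cycles per pixel) for each FFT bin; 0 at DC."""
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    return np.hypot(fy, fx)


def bandstop_filter(channel, low=0.05, high=0.25):
    """Zero every frequency whose radius lies in [low, high]; keep the rest."""
    spectrum = np.fft.fft2(channel.astype(np.float64))
    r = radius_grid(*channel.shape)
    keep = ~((r >= low) & (r <= high))
    # The mask is symmetric in frequency, so the inverse transform is real
    # up to floating-point round-off.
    return np.fft.ifft2(spectrum * keep).real
```

Removing a mid-frequency band while keeping the lowest and highest frequencies suppresses texture but preserves broad gradients and fine edges. That is one plausible reading of the flat, "animated" look described above.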

Cartoon combo (cartoon_combo)

What: A single heavier pipeline: clustered colour + accent edges + finishing in one preset.
How: Optional Fourier-derived detail cues, core k-means segmentation, EdgePop lines, contrast/colour tweaks.
When to use: A quick “everything on” stylistic punch when you prefer one knob over chaining CLI runs manually.

Utilities

invert, blur, clip, sharpen, color_pop are useful in tests or as chain steps inside other pipelines — they are intentionally omitted from README image/video grids.


Super-resolution

SR upscaling is available for images only, and uses a TensorFlow SavedModel (EDSR or similar).

  1. Download or export an EDSR SavedModel and place it at models/EDSR_x3.pb (or update SR_MODEL_PATH in pixtyle_config.py).
  2. Pass --sr on the command line, or set APPLY_SR = True in config.

TensorFlow is not required for the main stylization pipeline — only for SR. The sr/ directory contains an independent Keras-based SR implementation (EDSR, SRGAN, WDSR) that requires a separate sisr conda environment (Python 3.6, TF 2.3).


Results

Masters live under assets/original/ (see assets/original/README.md). Run python scripts/build_showcase.py to regenerate assets/images/ and assets/video/ from those files only — no dependency on gitignored input/.

JPEG tiles are centre-cropped to 800×600. GIF tiles are centre-cropped to 240×135 and sampled at a 12 fps display rate from the full-duration trimmed masters, unless you override GIF_MAX_SECONDS / GIF_START_S in the script.
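
For reference, a centre crop is just array slicing. The helper below is a hypothetical sketch. Whether scripts/build_showcase.py resizes before cropping is not stated here, so this shows only the crop itself.

```python
import numpy as np


def centre_crop(image, width, height):
    """Return the central width x height window of an H x W (x C) array."""
    h, w = image.shape[:2]
    if width > w or height > h:
        raise ValueError("crop size exceeds image size")
    top = (h - height) // 2
    left = (w - width) // 2
    return image[top:top + height, left:left + width]
```

For an image at least 800 pixels wide and 600 pixels tall, `centre_crop(img, 800, 600)` yields a tile of the size quoted above.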

Images

[Image grid: columns Original, K-Means, Enhanced, Edge Pop, Fourier, Combo; rows Day, Day, Night]

Video

[Video grid: columns Original, K-Means, Enhanced, Edge Pop, Fourier, Combo; rows Arch, Deer, Jungle]

Repo layout

pixtyle.py                 ← unified entry point (start here)
pixtyle_config.py          ← user-editable defaults
video_processing.py        ← optional CLI (uses pixtyle_config; overwrite-original batch)

transforms/                ← all transform classes
  base.py                  ← TransformBase ABC
  kmeans_cluster.py        ← KMeansClusteringTransform (NumPy + Torch)
  enhanced_cluster.py      ← EnhancedClusterTransform
  fourier_animate.py       ← FourierAnimateTransform (style presets)
  combined_cartoon.py      ← CombinedCartoonTransform (multi-stage)
  edge_pop.py              ← EdgePopTransform
  helpers.py               ← Fourier-domain helpers (radius grid, band-stop)
  simple_ops.py            ← blur, sharpen, invert, clip, color_pop

utils/                     ← shared utilities
  cluster_utils.py         ← K-Means math (init, distance, recompute, population)
  color_utils.py           ← colour-pop, contrast-clip, posterize, YUV conversion
  edges_utils.py           ← Sobel, Canny, FFT edge extractors (NumPy + Torch)
  temporal_core.py         ← SceneChangeDetector, reset_if_scene_change
  tensor_utils.py          ← NumPy ↔ Torch conversion, device detection
  io_utils.py              ← frame resize, folder iterators, display helpers
  math_utils.py            ← pure distance functions

processors/                ← VideoProcessor, ImageProcessor, transform registry
assets/original/           ← trimmed masters → build_showcase → assets/images,video/
scripts/                   ← build_showcase, test_stylizers, yt-dlp, etc.
sr/                        ← super-resolution model code (separate TF environment)

input/images/              ← sample images (gitignored)
input/videos/              ← input videos (gitignored)
output/                    ← all generated outputs (gitignored)
docs/                      ← project handbook (CONTEXT.md)

Extending

Add a new transform:

  1. Subclass TransformBase in transforms/ — implement process_numpy, process_torch, reset_state, describe.
  2. Register it in processors/registry.py.
  3. Add a params dict to pixtyle_config.py under TRANSFORM_PARAMS.
  4. Set TRANSFORM = "your_key" and run.

See transforms/enhanced_cluster.py as a minimal reference implementation.
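
The four steps above might look roughly like the sketch below. So that the snippet runs on its own, `TransformBase` here is a stand-in ABC with assumed method signatures. The real class lives in transforms/base.py and may differ. `PosterizeTransform`, `TRANSFORM_REGISTRY`, and the `"posterize"` key are hypothetical names used only for illustration.

```python
from abc import ABC, abstractmethod

import numpy as np


class TransformBase(ABC):
    """Stand-in for transforms/base.py; the real signatures may differ."""

    @abstractmethod
    def process_numpy(self, frame): ...

    @abstractmethod
    def process_torch(self, frame): ...

    @abstractmethod
    def reset_state(self): ...

    @abstractmethod
    def describe(self): ...


class PosterizeTransform(TransformBase):
    """Hypothetical transform: quantise each channel to a few evenly spaced levels."""

    def __init__(self, levels=4):
        if 256 % levels:
            raise ValueError("levels must divide 256 so values stay within uint8")
        self.levels = levels
        self.frames_seen = 0

    def _quantise(self, frame):
        # Uses only // * +, which NumPy arrays and Torch tensors both support.
        step = 256 // self.levels
        return (frame // step) * step + step // 2

    def process_numpy(self, frame):
        self.frames_seen += 1
        return self._quantise(frame).astype(np.uint8)

    def process_torch(self, frame):
        self.frames_seen += 1
        return self._quantise(frame)

    def reset_state(self):
        self.frames_seen = 0  # drop any per-video state between clips

    def describe(self):
        return f"posterize(levels={self.levels})"


# Steps 2 and 3, assumed shapes: a key -> class registry and a key -> params dict.
TRANSFORM_REGISTRY = {"posterize": PosterizeTransform}
TRANSFORM_PARAMS = {"posterize": {"levels": 4}}
```

Check processors/registry.py and pixtyle_config.py for the real registration and parameter conventions before copying this shape.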


Tips

  • Increase k in KMEANS_PARAMS for richer palettes; decrease downscale_factor in ENHANCED_CLUSTER_PARAMS for sharper colour segmentation (at the cost of speed).
  • Set blend_with_original / fade_to_original to 0 for a fully flat cartoon; increase toward 0.3 for a more photographic result.
  • edge_mask_ema in CARTOON_COMBO_PARAMS smooths the edge overlay across frames — reduce it for snappier outlines, increase it for calmer animation.
  • Batch video folder processing uses the same transforms as pixtyle (see pixtyle_config). video_processing.py is an optional script for overwrite-original batch runs, previews, and async writer tuning — it no longer duplicates parameter dicts; edit pixtyle_config.py for presets.
  • Image folders are processed in shape-grouped batches of VIDEO_BATCH_SIZE (default 16) frames so transforms see uniform-shape tensors and amortize per-call overhead. Mixed-resolution folders just run smaller sub-batches.
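
The shape-grouped batching described in the last tip can be sketched as a small generator. `shape_grouped_batches` is a hypothetical name, and the default of 16 simply mirrors the VIDEO_BATCH_SIZE default quoted above. The repository's processors may order or group images differently.

```python
from collections import defaultdict

import numpy as np


def shape_grouped_batches(images, batch_size=16):
    """Yield (indices, stacked_batch) pairs where every image in a batch shares a shape."""
    groups = defaultdict(list)
    for index, image in enumerate(images):
        groups[image.shape].append(index)
    for indices in groups.values():
        for start in range(0, len(indices), batch_size):
            chunk = indices[start:start + batch_size]
            # np.stack needs identical shapes, which the grouping guarantees.
            yield chunk, np.stack([images[i] for i in chunk])
```

Returning the original indices with each batch lets the caller write results back under the right filenames, even though processing order no longer matches folder order.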
