Discrete Flow Matching

KAIST Visual AI Group

Overview

This project investigates the core differences between continuous and discrete flow matching for generative modeling. Developed during an internship at the KAIST Visual AI Group, the work builds a testbed for comparing the two approaches on discrete data, including MNIST and sketch datasets.

Key Research Question
Why use flow matching over autoregressive transformers for sketch generation?

Motivation

Autoregressive Transformers   | Flow Matching
Sequential token generation   | Parallel generation
Cannot revise past decisions  | Iterative refinement
Limited task flexibility      | Flexible conditioning

Approach

Dual Modality Architecture

  • CNN encoder for rendered sketch images (UDF representation)
  • Transformer encoder for stroke sequences

Key Components

  • VQ-VAE: Discrete codebook learning for structured representations
  • CFG: Classifier-Free Guidance adapted for discrete state-space models
  • MDM vs DFM: Ablation study comparing Masked Diffusion and Discrete Flow Matching
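
As a rough illustration of the CFG component, the sketch below applies the commonly used guidance rule to categorical logits for a single token position: the unconditional logits are pushed toward the conditional ones by a guidance scale, and the result is renormalized with a softmax. This is a minimal sketch under assumptions, not the project's implementation; the function names, the 4-entry codebook, and the logit values are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def guided_probs(cond_logits, uncond_logits, guidance_scale):
    """Blend conditional and unconditional logits for one token position.

    guidance_scale = 0 gives the unconditional distribution, 1 gives the
    conditional one, and values above 1 extrapolate past the conditional
    prediction. (Assumed guidance rule, chosen for illustration.)
    """
    mixed = [u + guidance_scale * (c - u)
             for c, u in zip(cond_logits, uncond_logits)]
    return softmax(mixed)

# Hypothetical logits over a 4-entry codebook for a single token.
cond = [2.0, 0.5, 0.1, -1.0]
uncond = [1.0, 1.0, 0.0, -1.0]
probs = guided_probs(cond, uncond, guidance_scale=2.0)
```

With a scale above 1, probability mass moves further toward the codebook entries that the condition favors over the unconditional prediction, so sample diversity is traded for adherence to the condition.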

Target Tasks

  • Stroke infilling
  • Layout generation with given strokes
  • Sketch generation with given layout
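
One way to read the three tasks above is as different masking patterns over the same token sequence: whatever is given stays visible, and whatever must be generated is replaced by a mask token. The sketch below illustrates that view only. The tokenization (each stroke carrying separate "layout" and "shape" token lists), the MASK id, and the task names are assumptions made for this example, not the project's actual data format.

```python
MASK = -1  # hypothetical id for the mask token

TASKS = ("stroke_infilling", "layout_from_strokes", "sketch_from_layout")

def mask_for_task(strokes, task, infill_index=None):
    """Return a copy of `strokes` with the tokens to be generated masked.

    Each stroke is a dict with "layout" and "shape" token lists
    (an assumed tokenization, used here for illustration only).
    """
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    masked = []
    for i, stroke in enumerate(strokes):
        layout = list(stroke["layout"])
        shape = list(stroke["shape"])
        if task == "stroke_infilling" and i == infill_index:
            # Hide one whole stroke; all other strokes stay visible.
            layout = [MASK] * len(layout)
            shape = [MASK] * len(shape)
        elif task == "layout_from_strokes":
            # Stroke shapes are given; their placement is generated.
            layout = [MASK] * len(layout)
        elif task == "sketch_from_layout":
            # Placement is given; stroke shapes are generated.
            shape = [MASK] * len(shape)
        masked.append({"layout": layout, "shape": shape})
    return masked

# Two hypothetical strokes with made-up token ids.
strokes = [
    {"layout": [3, 7], "shape": [12, 5, 9]},
    {"layout": [1, 4], "shape": [8, 8, 2]},
]
infill = mask_for_task(strokes, "stroke_infilling", infill_index=1)
```

Under this framing a single model trained to fill masked positions could serve all three tasks, with only the mask pattern changing at inference time.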

Datasets

Dataset          | Description
MNIST            | 32x32 grayscale handwritten digits
QuickDraw        | Simple stroke-based sketches
Creative Sketch  | Complex multi-stroke sketches with labels
