TeleFuser

A high-performance runtime for world model inference and multimodal generation, built for long-running pipelines, distributed execution, and production service interfaces.

PyTorch 2.6+ · CUDA 12.8+ · Triton kernels · FastAPI service · Ray distributed
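The version minimums listed above can be checked before installation. The following is a minimal sketch, not part of TeleFuser: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the only assumption about PyTorch is that `torch.__version__` and `torch.version.cuda` report the installed build.

```python
import re

# Minimums taken from the requirements listed above.
MIN_TORCH = (2, 6)
MIN_CUDA = (12, 8)


def parse_version(text):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(found, minimum):
    """True if the version string `found` is at least the `minimum` tuple."""
    parsed = parse_version(found)
    # Pad to equal length so (2, 6) compares equal to (2, 6, 0).
    width = max(len(parsed), len(minimum))
    padded_found = parsed + (0,) * (width - len(parsed))
    padded_min = tuple(minimum) + (0,) * (width - len(minimum))
    return padded_found >= padded_min


def check_environment():
    """Report whether the local PyTorch build meets the stated minimums."""
    try:
        import torch
    except ImportError:
        return {"torch": False, "cuda": False}
    cuda = torch.version.cuda  # None on CPU-only builds
    return {
        "torch": meets_minimum(torch.__version__, MIN_TORCH),
        "cuda": cuda is not None and meets_minimum(cuda, MIN_CUDA),
    }
```

Calling `check_environment()` returns a dictionary with one boolean per requirement. It only inspects the PyTorch build and does not verify the GPU driver.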

Runtime Capabilities

World Model Runtime

Continuous execution, stateful sessions, and bidirectional control loops.

Parallel Inference

Ulysses, Ring Attention, tensor parallelism, pipeline parallelism, and FSDP.

Optimized Operators

Compile-aware ops with eager CUDA Triton kernels and PyTorch native fallbacks.

Streaming Service

FastAPI batch serving and LiveKit-backed rooms for server-push and resilient interactive WebRTC.

Feature Cache

AdaTaylorCache and runtime cache controls for repeated generation workloads.

Extensible Pipelines

Reusable stages, model configs, schedulers, and pipeline orchestration.
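Of the parallel strategies listed under Parallel Inference, Ulysses is the one most easily shown in a few lines. The sketch below simulates its all-to-all reshard in a single process using plain lists: each rank starts with a slice of the sequence carrying all attention heads, and ends with the full sequence for a subset of heads. This is an illustration of the general technique, not TeleFuser's implementation; the function name `ulysses_all_to_all` and the toy data are hypothetical.

```python
def ulysses_all_to_all(seq_shards, world_size):
    """Simulate the Ulysses reshard: sequence-sharded -> head-sharded.

    seq_shards[r] is rank r's slice of the sequence. Each token is a list
    with one entry per attention head.
    """
    if len(seq_shards) != world_size:
        raise ValueError("expected one sequence shard per rank")
    num_heads = len(seq_shards[0][0])
    if num_heads % world_size != 0:
        raise ValueError("head count must be divisible by world size")
    heads_per_rank = num_heads // world_size

    head_shards = []
    for rank in range(world_size):
        lo = rank * heads_per_rank
        hi = lo + heads_per_rank
        # This rank gathers its head slice from every peer, in sequence order,
        # so it can run attention over the full sequence for those heads.
        full_sequence = [token[lo:hi] for shard in seq_shards for token in shard]
        head_shards.append(full_sequence)
    return head_shards


# Toy input: 2 ranks, 4 tokens, 4 heads. Values are labelled "t<token>h<head>".
world_size = 2
tokens = [[f"t{t}h{h}" for h in range(4)] for t in range(4)]
seq_shards = [tokens[0:2], tokens[2:4]]
head_shards = ulysses_all_to_all(seq_shards, world_size)
```

In a real distributed run the gather is a collective all-to-all across GPUs, and a second all-to-all restores the sequence sharding after attention.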

Supported Models

World Model and Real-Time

Model	Tasks	Description
LingBot-World v2	Bidirectional streaming	Camera-controlled interactive world model via LiveKit
LingBot-World-Fast	Bidirectional streaming	Legacy/causal-fast model via LiveKit reliable data messages

Video Generation

Model	Tasks	Description
WanVideo (Wan2.1 / Wan2.2)	T2V, I2V, FL2V	Video generation and editing
HunyuanVideo	T2V, I2V	Video generation
LTX Video	I2V + Audio	Video generation with audio
FlashVSR	VSR	Video super-resolution
LiveAct	S2V	Speech-to-video
LongCat-Video	T2V, I2V	Long video generation
LingBot-Video	T2I, T2V, TI2V, MoE refiner	Precision-first Dense and MoE video generation

Image Generation

Model	Tasks	Description
Qwen-Image	T2I, Edit	Image generation and editing
Z-Image	T2I	Image generation
Flux2 Klein	T2I	Image generation

Quick Start

# Install
pip install telefuser

# Batch serving
telefuser serve /path/to/pipeline.py --port 8000

# LiveKit-backed streaming (Python SDK included in the base install)
telefuser stream-serve examples/lingbot/lingbot_world_fast_image_to_video_h100.py \
  --livekit-url ws://127.0.0.1:7880 \
  --livekit-api-key devkey --livekit-api-secret secret \
  -p 8088
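When the streaming server is started from a Python launcher rather than a shell, the same command can be assembled as an argument list. This is a small sketch that uses only the flags shown in the Quick Start commands; the helper name `build_stream_serve_cmd` is hypothetical, and `devkey` / `secret` are the placeholder credentials from that example.

```python
import shlex


def build_stream_serve_cmd(pipeline, livekit_url, api_key, api_secret, port):
    """Assemble the `telefuser stream-serve` argument list from Quick Start."""
    return [
        "telefuser", "stream-serve", pipeline,
        "--livekit-url", livekit_url,
        "--livekit-api-key", api_key,
        "--livekit-api-secret", api_secret,
        "-p", str(port),
    ]


cmd = build_stream_serve_cmd(
    "examples/lingbot/lingbot_world_fast_image_to_video_h100.py",
    "ws://127.0.0.1:7880",
    "devkey",
    "secret",
    8088,
)

# Print a shell-safe rendering of the command for logging or copy-paste.
print(shlex.join(cmd))

# To launch the server from Python instead of printing:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if a path or secret contains spaces or special characters.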

Documentation Sections

Service Guide

Batch serving, task APIs, and SDK.

Stream Server

LiveKit sessions, media, data topics, and bidirectional control.

Stream Scheduler

Actor ownership, bounded dataflow, lifecycle, metrics, and GPU placement.

AIPerf Benchmark

Batch video and LingBot LiveKit workflows.

Configuration

Runtime, attention, quantization, and offload settings.

TF-Kernel

Install, build, verify, and use the optional CUDA extension.

Parallel Inference

Distributed processing strategies.

Adding New Model

Integrate new model architectures and stages.

Profiler

Performance analysis tools.
