Hash Configuration Management Guide¶

This document explains how to manage and maintain TeleFuser's model hash configurations, including use of the weight_viewer.py tool, configuration version control, and update workflows.

Why Hash-based Recognition?¶

TeleFuser uses a hash-based automatic recognition mechanism instead of relying on configuration files (like model_index.json). This design offers several significant advantages:

1. No Configuration File Required¶

Unlike other frameworks that depend on model_index.json or similar metadata files, TeleFuser can load models directly from weight files:

# TeleFuser - Works with any single weight file
import torch

from telefuser.core.module_manager import ModuleManager

mm = ModuleManager(torch_dtype=torch.bfloat16)
mm.load_model("downloads/model.safetensors")  # No config needed!

# Works with:
# - Single .safetensors files
# - Official release weights (non-Diffusers format)
# - Custom trained weights
# - Any directory structure

Comparison:

Scenario	TeleFuser	Frameworks requiring config
Single `.safetensors` file	✅ Works directly	❌ Requires config
No `model_index.json`	✅ Works	❌ Cannot load
Official weights (non-Diffusers)	✅ Works	⚠️ Needs conversion
Custom directory structure	✅ Works	❌ Must follow convention
Multi-file sharded weights	✅ Auto-merge	⚠️ Needs correct naming

2. Strong Model Validation¶

Hash matching provides robust model validation:

Validation Type	Hash-based	String-based (e.g., `_class_name`)
Wrong model file loaded	❌ Hash mismatch, rejected	⚠️ May load, runtime error
Model keys modified	❌ Hash mismatch, rejected	⚠️ Cannot detect
Model structure changed	❌ Rejected (with `with_shape=True`)	⚠️ Cannot detect
Unknown model	❌ Not in registry	⚠️ May use fallback

# The hash acts as a fingerprint of the weight file's keys and shapes
# Even the same model with a different quantization gets a different hash:
(
    None, "9269f8db9040a9d860eaca435be61814",  # FP16 version
    ["wan_video_dit"], [WanModel], "official",
),
(
    None, "4cf556355bc7e9b6545b38f4930f60b1",  # FP8 version (different hash!)
    ["wan_video_dit"], [WanModel], "official",
),
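To make the difference between the strict and non-strict hash concrete, here is a minimal sketch of how a key-based hash could be computed. It assumes an MD5 digest over sorted key names, optionally joined with tensor shapes. The helper `sketch_keys_hash` and its string layout are hypothetical, and TeleFuser's own `hash_state_dict_keys` may serialize keys differently, so digests produced here will not match the registry.

```python
import hashlib


def sketch_keys_hash(shapes_by_key, with_shape=True):
    """Hash the key names (and optionally shapes) of a state dict.

    `shapes_by_key` maps parameter names to shape tuples. It is a hypothetical
    stand-in for a real state dict so the example runs without torch.
    """
    parts = []
    for key in sorted(shapes_by_key):  # sorting makes the hash order-independent
        if with_shape:
            shape = "_".join(str(dim) for dim in shapes_by_key[key])
            parts.append(f"{key}:{shape}")
        else:
            parts.append(key)
    return hashlib.md5(",".join(parts).encode("utf-8")).hexdigest()


small = {"blocks.0.norm1.scale": (2048,), "blocks.0.norm1.bias": (2048,)}
large = {"blocks.0.norm1.scale": (4096,), "blocks.0.norm1.bias": (4096,)}

# Same key names, different shapes: only the shape-aware hash tells them apart.
same_keys = sketch_keys_hash(small, with_shape=False) == sketch_keys_hash(large, with_shape=False)
same_shapes = sketch_keys_hash(small) == sketch_keys_hash(large)
print(same_keys, same_shapes)  # True False
```

Under these assumptions, a variant that adds, removes, or resizes tensors changes the shape-aware hash, which is why it suits strict matching.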

3. Precise Model Variant Identification¶

Different weight variants of the same model class can be distinguished:

# Same WanModel class, different weights identified by hash
# Wan2.1 T2V 1.3B
(None, "9269f8db9040a9d860eaca435be61814", ["wan_video_dit"], [WanModel], "official"),
# Wan2.2 I2V A14B
(None, "5b013604280dd715f8457c6ed6d6a626", ["wan_video_dit"], [WanModel], "official"),
# Wan2.2 TI2V 5B
(None, "1f5ab7703c6fc803fdded85ff040c316", ["wan_video_dit"], [WanModel], "official"),

This prevents:

- Loading T2V weights into an I2V pipeline
- Confusing different model sizes
- Using the wrong quantization variant
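Variant identification can be pictured as a lookup over the registry tuples. The sketch below is only an illustration: the function `find_config` is hypothetical, the strict-then-non-strict order is an assumption rather than TeleFuser's confirmed behavior, and class objects are replaced by strings so it runs without TeleFuser installed.

```python
def find_config(configs, keys_hash, keys_hash_with_shape):
    """Return (entry, strict) for the first matching entry, or None."""
    # Strict pass: key names and shapes both match.
    for entry in configs:
        if entry[1] is not None and entry[1] == keys_hash_with_shape:
            return entry, True
    # Non-strict pass: only key names match.
    for entry in configs:
        if entry[0] is not None and entry[0] == keys_hash:
            return entry, False
    return None


# Class objects replaced by strings for the sake of a runnable example.
configs = [
    (None, "9269f8db9040a9d860eaca435be61814", ["wan_video_dit"], ["WanModel"], "official"),
    (None, "5b013604280dd715f8457c6ed6d6a626", ["wan_video_dit"], ["WanModel"], "official"),
]

match = find_config(configs, "unused", "5b013604280dd715f8457c6ed6d6a626")
print(match[0][2], match[1])                     # ['wan_video_dit'] True
print(find_config(configs, "unused", "0" * 32))  # None: not in the registry
```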

4. Security and Audit Benefits¶

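Integrity verification: A matching hash confirms that the file's key names and tensor shapes match a registered model. Because the hash is derived from state-dict keys (as the names keys_hash and hash_state_dict_keys indicate), it does not by itself detect edited tensor values
Version control: Track which registered weight variant is being used
Reproducibility: The same hash always resolves to the same registered model class
Supply chain security: Verify that weights match the expected hash from a trusted source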
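If byte-level integrity matters as well, a file digest can complement the registry hash. This assumes the registry hash covers only key names and shapes, as the `keys_hash` naming suggests, so it would not notice edited tensor values. The sketch below uses only the standard library. The helpers `file_sha256` and `matches_pinned_digest` are hypothetical and not part of TeleFuser.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large weights need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path, expected_hex):
    """Compare a file's digest with a value recorded from a trusted source."""
    return file_sha256(path) == expected_hex.lower()
```

A deployment could record the digest once, when the weights are downloaded from a trusted source, and call `matches_pinned_digest` before each load.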

5. Flexibility for Developers¶

# Developer can load models without knowing the exact model type
mm.load_model("/downloads/mystery_model.safetensors")
model = mm.fetch_module("wan_video_dit")  # Auto-detected by hash!

# Works in any environment:
# - Research: Download any registered .safetensors file and run
# - Production: Verify hash before loading
# - CI/CD: Ensure correct model is being tested

Design Trade-offs¶

Aspect	Hash-based (TeleFuser)	Config-based
Correctness	✅ Strong validation	⚠️ Weak validation
Ease of adding new models	⚠️ Need to register hash	✅ Auto-detect
Support for arbitrary models	⚠️ Must be in registry	✅ Fallback available
Best for	Production, critical tasks	Prototyping, research

Summary: TeleFuser's hash-based recognition prioritizes correctness and reliability over convenience, making it suitable for production environments where loading the wrong model could cause significant issues.

Configuration Location¶

All model hash configurations are stored in:

telefuser/core/model_config.py

Core Tool: Weight Viewer¶

TeleFuser provides the weight_viewer.py tool to assist with model analysis and management:

python tools/viewer/weight_viewer.py <model_path> [options]

Basic Usage¶

# View single file model
python tools/viewer/weight_viewer.py /path/to/model.safetensors

# View sharded models (using wildcards)
python tools/viewer/weight_viewer.py "/path/to/model-*.safetensors"

# Show summary only (includes hash)
python tools/viewer/weight_viewer.py /path/to/model.safetensors --quiet

# Export as JSON for further analysis
python tools/viewer/weight_viewer.py /path/to/model.safetensors --export model_info.json

Output Example¶

================================================================================
Model Weight Information Overview
================================================================================
Total parameters: 14.02B (14,022,154,432)
hash with shape: 4c3523c69fb7b24cf2db147a715b277f
Files loaded: 1
File list: ['/path/to/model.safetensors']

Data type distribution:
  torch.bfloat16: 14.02B (100.00%)

Detailed weight structure:
(Structurally identical modules have been merged, use --show-all to view full structure)
model
  transformer
    blocks x32
      norm1.scale                      | (2048,)              | torch.bfloat16  |     2.05K
      norm1.bias                       | (2048,)              | torch.bfloat16  |     2.05K
      ...
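When the viewer's text output is captured by a script, the hash can be pulled out with a small parser. This sketch assumes the `hash with shape:` line keeps the layout shown in the example above and that the hash is 32 lowercase hex characters. The function `extract_hash_with_shape` is hypothetical.

```python
import re

# Matches a whole line such as "hash with shape: 4c35...277f".
HASH_LINE = re.compile(r"^hash with shape:\s*([0-9a-f]{32})\s*$", re.MULTILINE)


def extract_hash_with_shape(viewer_output):
    """Return the 32-character hash from weight_viewer text output, or None."""
    found = HASH_LINE.search(viewer_output)
    return found.group(1) if found else None


sample = """Total parameters: 14.02B (14,022,154,432)
hash with shape: 4c3523c69fb7b24cf2db147a715b277f
Files loaded: 1
"""
print(extract_hash_with_shape(sample))  # 4c3523c69fb7b24cf2db147a715b277f
```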

Configuration Format¶

model_loader_configs = [
    # Format: (keys_hash, keys_hash_with_shape, model_names, model_classes, model_resource)
    (
        None,                                      # keys_hash (non-strict matching)
        "4c3523c69fb7b24cf2db147a715b277f",       # keys_hash_with_shape (strict matching)
        ["wan_video_decoder"],                     # model_names
        [TAEHV],                                   # model_classes
        "official",                                 # model_resource
    ),
    # ... more configurations
]
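The five positional fields are easy to mix up, so a small checker can help when reviewing entries. The sketch below is hypothetical and not part of TeleFuser. It assumes hashes are 32 lowercase hex characters, that `model_names` and `model_classes` pair up one-to-one, and that `"official"` and `"diffusers"` are the only resource values.

```python
import re
from typing import NamedTuple, Optional

HEX32 = re.compile(r"[0-9a-f]{32}")


class LoaderConfig(NamedTuple):
    """Named view of one model_loader_configs tuple."""
    keys_hash: Optional[str]
    keys_hash_with_shape: Optional[str]
    model_names: list
    model_classes: list
    model_resource: str


def check_entry(raw):
    """Return a list of problems found in one raw tuple (empty if none)."""
    if len(raw) != 5:
        return [f"expected 5 fields, got {len(raw)}"]
    entry = LoaderConfig(*raw)
    problems = []
    for field in ("keys_hash", "keys_hash_with_shape"):
        value = getattr(entry, field)
        if value is not None and not HEX32.fullmatch(value):
            problems.append(f"{field} is not a 32-character hex string")
    if entry.keys_hash is None and entry.keys_hash_with_shape is None:
        problems.append("at least one hash is required")
    if len(entry.model_names) != len(entry.model_classes):
        problems.append("model_names and model_classes differ in length")
    if entry.model_resource not in ("official", "diffusers"):
        problems.append(f"unknown model_resource {entry.model_resource!r}")
    return problems


# The class is written as a string here so the example runs without TeleFuser.
good = (None, "4c3523c69fb7b24cf2db147a715b277f", ["wan_video_decoder"], ["TAEHV"], "official")
print(check_entry(good))  # []
```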

Configuration Management Workflow¶

Adding a New Model¶

1. Obtain Model Files¶

# Confirm model files exist
ls /path/to/models/*.safetensors

2. Use Weight Viewer to Analyze Model¶

# Get model hash and structure information
python tools/viewer/weight_viewer.py "/path/to/models/model.safetensors" --quiet

The "hash with shape" value in the output is the keys_hash_with_shape entry needed for the configuration.

3. Analyze Model Structure in Detail (for implementing StateDictConverter)¶

# View complete structure for writing key mappings
python tools/viewer/weight_viewer.py "/path/to/models/model.safetensors" --max-depth 10 --export model_structure.json

Review the exported JSON file, analyze key naming patterns, and write the converter.

4. Add to Configuration¶

Edit telefuser/core/model_config.py to add model configuration:

from ..models.my_model import MyModel

model_loader_configs = [
    # ... existing configurations ...

    # MyModel - Standard version (from weight_viewer output)
    (
        None,  # Non-strict hash (optional)
        "4c3523c69fb7b24cf2db147a715b277f",  # Hash from weight_viewer
        ["my_model"],
        [MyModel],
        "official",  # or "diffusers"
    ),
]
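Before committing a new entry, it can help to check that its hash is not already registered. The example hash above is the same placeholder used in the Configuration Format section, so copying it verbatim would collide with that entry. The following sketch is a hypothetical helper, not part of TeleFuser, that reports strict hashes appearing more than once. Class objects are replaced by strings so it runs standalone.

```python
from collections import defaultdict


def find_duplicate_hashes(configs):
    """Map each strict hash that appears more than once to its model names."""
    seen = defaultdict(list)
    for entry in configs:
        strict_hash, names = entry[1], entry[2]
        if strict_hash is not None:
            seen[strict_hash].append(tuple(names))
    return {h: owners for h, owners in seen.items() if len(owners) > 1}


configs = [
    (None, "4c3523c69fb7b24cf2db147a715b277f", ["wan_video_decoder"], ["TAEHV"], "official"),
    (None, "9269f8db9040a9d860eaca435be61814", ["wan_video_dit"], ["WanModel"], "official"),
    (None, "4c3523c69fb7b24cf2db147a715b277f", ["my_model"], ["MyModel"], "official"),
]

for strict_hash, owners in find_duplicate_hashes(configs).items():
    print(strict_hash, owners)
```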

5. Verify Configuration¶

# Use weight_viewer to verify hash matches
python tools/viewer/weight_viewer.py "/path/to/models/model.safetensors" --quiet

# Then test loading
python -c "
from telefuser.core.module_manager import ModuleManager
mm = ModuleManager(device='cpu')
mm.load_model('/path/to/models/model.safetensors')
print('✓ Model loaded successfully!')
print('Available models:', mm.module_name)
"

Batch Processing Multiple Model Variants¶

When there are multiple variants (such as FP8 or pruned versions), use a script to analyze them in one pass:

#!/bin/bash
# scripts/batch_analyze_models.sh

MODEL_DIR="/path/to/models"

for model in "$MODEL_DIR"/*.safetensors; do
    echo "========================================"
    echo "Analyzing: $(basename "$model")"
    echo "========================================"
    python tools/viewer/weight_viewer.py "$model" --quiet
    echo ""
done
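The same loop can be written in Python when the viewer output needs to be collected rather than just printed. This is a sketch under the assumption that it is run from the repository root, so the relative tool path resolves. The function names are hypothetical, and only the `--quiet` flag shown above is used.

```python
import subprocess
import sys
from pathlib import Path

VIEWER = "tools/viewer/weight_viewer.py"


def viewer_command(model_path):
    """Build the argv list for one weight_viewer run in summary mode."""
    return [sys.executable, VIEWER, str(model_path), "--quiet"]


def analyze_directory(model_dir):
    """Run the viewer on each .safetensors file; return {file name: stdout}."""
    results = {}
    for model in sorted(Path(model_dir).glob("*.safetensors")):
        completed = subprocess.run(
            viewer_command(model), capture_output=True, text=True, check=False
        )
        results[model.name] = completed.stdout
    return results
```

Calling `analyze_directory("/path/to/models")` returns one summary per file, which can then be searched for the hash line.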

Comparing Different Model Versions¶

# Analyze two versions of models
python tools/viewer/weight_viewer.py "/path/to/model_v1.safetensors" --export v1.json
python tools/viewer/weight_viewer.py "/path/to/model_v2.safetensors" --export v2.json

# Use diff tool to compare structural differences
diff <(jq '.weights_structure' v1.json) <(jq '.weights_structure' v2.json)
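A line-based `diff` can be noisy for deeply nested JSON. As an alternative, the sketch below flattens two structures into dotted paths and reports what was added, removed, or changed. It assumes `weights_structure` is a nested JSON object. The leaf layout in the example is made up, and both function names are hypothetical.

```python
def flatten(node, prefix=""):
    """Flatten nested dicts into {"a.b.c": leaf} form."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_structures(old, new):
    """Compare two nested structures by their flattened paths."""
    old_flat, new_flat = flatten(old), flatten(new)
    shared = old_flat.keys() & new_flat.keys()
    return {
        "removed": sorted(old_flat.keys() - new_flat.keys()),
        "added": sorted(new_flat.keys() - old_flat.keys()),
        "changed": sorted(k for k in shared if old_flat[k] != new_flat[k]),
    }


# Made-up structures standing in for two exported files.
v1 = {"blocks": {"norm1": {"scale": [2048], "bias": [2048]}}}
v2 = {"blocks": {"norm1": {"scale": [4096]}, "norm2": {"scale": [4096]}}}
result = diff_structures(v1, v2)
print(result)
```

With real exports, the two inputs would come from `json.load(...)["weights_structure"]`, the same field the `jq` command reads.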

Weight Viewer Advanced Usage¶

Analyzing Sharded Models¶

# Automatically recognize and merge shard files
python tools/viewer/weight_viewer.py "/path/to/model-*.safetensors"

# Example: WanVideo 14B model (7 shards)
python tools/viewer/weight_viewer.py \
    "/models/Wan2.1-I2V-14B-720P/diffusion_pytorch_model-*.safetensors" \
    --quiet

Viewing Specific Level Structure¶

# View deeper structure (default depth is 5)
python tools/viewer/weight_viewer.py /path/to/model.safetensors --max-depth 8

# View complete structure (no depth limit)
python tools/viewer/weight_viewer.py /path/to/model.safetensors --show-all

Disabling Structure Merging¶

# Show full information for all repeated modules
python tools/viewer/weight_viewer.py /path/to/model.safetensors --no-merge

Auxiliary Scripts¶

Generate Configuration Template¶

Create script tools/generate_config_template.py:

Note: Before running this script, ensure you have installed the project in development mode:
pip install -e ".[dev]"

#!/usr/bin/env python3
"""
Generate a configuration template by hashing a model's state dict keys

Usage:
    python tools/generate_config_template.py <model_path> --name my_model --class MyModel
"""

import argparse
import glob
import sys

from telefuser.utils.model_weight import hash_state_dict_keys, load_state_dict


def generate_template(model_path, model_name, model_class, resource="official"):
    """Generate configuration template"""
    # Handle wildcards; sorting keeps the shard order deterministic
    files = sorted(glob.glob(model_path))
    if not files:
        print(f"Error: No files found matching {model_path}", file=sys.stderr)
        sys.exit(1)

    # Load all weights, merging shards into one state dict
    all_weights = {}
    for f in files:
        all_weights.update(load_state_dict(f))

    # Compute hash
    hash_with_shape = hash_state_dict_keys(all_weights, with_shape=True)
    hash_without_shape = hash_state_dict_keys(all_weights, with_shape=False)

    # Generate configuration
    config = f'''    # {model_name}
    (
        "{hash_without_shape}",  # keys_hash (non-strict matching)
        "{hash_with_shape}",    # keys_hash_with_shape
        ["{model_name}"],
        [{model_class}],
        "{resource}",
    ),'''

    print("\n" + "="*60)
    print("Generated Configuration Template")
    print("="*60)
    print(config)
    print("\n" + "="*60)
    print("Model Statistics:")
    print(f"  Total tensors: {len(all_weights)}")
    print(f"  Files: {len(files)}")
    print("="*60 + "\n")

    return config


def main():
    parser = argparse.ArgumentParser(description="Generate model config template")
    parser.add_argument("model_path", help="Model file path (supports wildcards)")
    parser.add_argument("--name", required=True, help="Model name (e.g., wan_video_dit)")
    parser.add_argument("--class", required=True, dest="model_class", help="Model class name (e.g., WanModel)")
    parser.add_argument("--resource", default="official", choices=["official", "diffusers"], help="Model source")

    args = parser.parse_args()
    generate_template(args.model_path, args.name, args.model_class, args.resource)


if __name__ == "__main__":
    main()

Usage:

python tools/generate_config_template.py \
    "/models/my_model.safetensors" \
    --name my_custom_dit \
    --class MyCustomDiT \
    --resource official

Verify Configuration Integrity¶