Commit Graph

22 Commits

Author SHA1 Message Date
ZGY cd79a647fa Merge pull request #263 from Oumnya/fix/mps-bf16-dtype
fix(mps): force float32 on Apple Silicon to avoid bf16 quality loss
2026-04-21 18:49:48 +08:00
JunghwanNA ec2acec8a1 Harden LoRA checkpoint loading against untrusted pickle payloads
LoRA is a first-class workflow in VoxCPM, and the project already prefers
safetensors plus weights-only fallback loading for base model artifacts. The
legacy LoRA .ckpt/.pth path was the remaining place that still deserialized
arbitrary pickle objects, so this switches it to weights_only=True and adds
focused regression coverage for both model loaders.

Constraint: Must preserve compatibility with tensor-only legacy LoRA checkpoints
Rejected: Remove .ckpt/.pth support entirely | too disruptive for existing users
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep LoRA artifact handling aligned with the existing safetensors-first, weights-only loading pattern
Tested: python3 -m pytest -q tests/test_lora_checkpoint_loading.py tests/test_model_utils.py -q
Not-tested: Full end-to-end LoRA hot-load with heavyweight model assets
2026-04-18 00:31:28 +09:00
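
The commit above relies on PyTorch's `torch.load(..., weights_only=True)`, which refuses to unpickle arbitrary objects. As a rough standard-library analogue of that idea (not the project's code, and not how PyTorch implements it), the sketch below uses a hypothetical `PlainDataUnpickler` that blocks every global lookup, so only plain literals and containers can be loaded. The names `PlainDataUnpickler`, `load_plain`, and `Exploit` are assumptions made for illustration.

```python
import io
import pickle


class PlainDataUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that refuses every global lookup."""

    def find_class(self, module, name):
        # Any payload that references a class or function is rejected here,
        # before the referenced callable can be invoked.
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def load_plain(payload: bytes):
    """Load a pickle that may only contain built-in literals and containers."""
    return PlainDataUnpickler(io.BytesIO(payload)).load()


class Exploit:
    """Stand-in for a malicious payload: unpickling it would call print()."""

    def __reduce__(self):
        return (print, ("arbitrary code ran",))


# A payload made of plain data loads normally.
safe = load_plain(pickle.dumps({"lora_A": [0.1, 0.2], "rank": 8}))

# A payload that references a callable is rejected instead of executed.
try:
    load_plain(pickle.dumps(Exploit()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

A real tensor checkpoint needs an allow-list rather than a blanket block, which is the part `weights_only=True` handles inside PyTorch.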
oumnya 38d61cdf03 fix(mps): force float32 on Apple Silicon to avoid bf16 quality loss
VoxCPM checkpoints default to bfloat16. Since commit e4e0496 added MPS
device routing, running with `device=mps` selects bf16 on Apple Silicon.
On Metal, bf16 introduces enough numerical drift in the diffusion AR loop
that the synthesized audio is glitched and trips the model's badcase
detector, which retries until the per-call retry budget is exhausted. In
effect, MPS support is unusable in the default config.

This patch adds a single helper, `pick_runtime_dtype(device, dtype)`,
that promotes any low-precision dtype to float32 when the resolved
device is `mps`. CUDA and CPU paths are untouched. An opt-out env var
`VOXCPM_MPS_DTYPE` lets users force a specific dtype on MPS once future
PyTorch / macOS releases improve bf16 stability.

Both VoxCPMModel and VoxCPM2Model adopt the helper in their __init__,
replacing what would otherwise be duplicated inline checks.

Verified locally on Apple M5 Max, PyTorch 2.11, macOS 15:
- VoxCPM2 (2B): clean output, RTF ~0.78 steady state
- VoxCPM 0.5B: clean output, RTF ~0.92
- No badcase retries fired in any test
- VOXCPM_MPS_DTYPE=bfloat16 round-trips and reproduces the original
  glitched output, confirming the override path.
2026-04-15 12:22:56 +08:00
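
The dtype selection described in this commit can be sketched roughly as follows. This is a minimal illustration, not the committed helper: it uses plain strings instead of `torch.dtype` values so it runs without PyTorch, the `LOW_PRECISION` set and the `env` parameter are assumptions added for the example, and the real helper's handling of `VOXCPM_MPS_DTYPE` may differ.

```python
import os

# Assumption: which dtypes count as "low precision" is a guess for this sketch.
LOW_PRECISION = {"bfloat16", "float16"}


def pick_runtime_dtype(device, dtype, env=None):
    """Return the dtype to run with, promoting low precision to float32 on MPS.

    `env` is a hypothetical parameter (defaulting to os.environ) that makes
    the override easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    if device != "mps":
        # CUDA and CPU paths are left untouched.
        return dtype
    override = env.get("VOXCPM_MPS_DTYPE")
    if override:
        # The user explicitly asked for a specific dtype on MPS.
        return override
    return "float32" if dtype in LOW_PRECISION else dtype


default_on_mps = pick_runtime_dtype("mps", "bfloat16", env={})
on_cuda = pick_runtime_dtype("cuda", "bfloat16", env={})
forced_on_mps = pick_runtime_dtype(
    "mps", "bfloat16", env={"VOXCPM_MPS_DTYPE": "bfloat16"}
)
```

Keeping the decision in one function means both model classes can call it from `__init__` instead of repeating the device check inline.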
刘鑫 1565e83efe fix: complete shared generator cleanup coverage
Move generator close handling into a shared utility and wire the core generation pipeline through it so partially-consumed prompt cache generators are cleaned up consistently across both model variants and the public VoxCPM wrapper.

Made-with: Cursor
2026-04-13 17:39:05 +08:00
刘鑫 61b36d4e56 refactor: centralize generator cleanup in model helpers
Factor repeated next-and-close patterns into a shared helper in both VoxCPM model variants so non-streaming inference cleans up generators consistently while keeping the issue reference close to the workaround.

Made-with: Cursor
2026-04-13 16:57:08 +08:00
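
A next-and-close helper of the kind this commit describes might look like the sketch below. The name `next_and_close` and the `chunks` generator are hypothetical; the point is only that taking one item and then closing the generator lets its `finally` block run immediately rather than whenever the object is garbage-collected.

```python
from typing import Iterator, TypeVar

T = TypeVar("T")


def next_and_close(gen: Iterator[T]) -> T:
    """Take the first item from `gen`, then close it so its cleanup runs."""
    try:
        return next(gen)
    finally:
        # Plain iterators have no close(); generators do.
        close = getattr(gen, "close", None)
        if close is not None:
            close()


events = []


def chunks():
    """Toy generator standing in for a partially consumed pipeline stage."""
    try:
        yield "first"
        yield "second"
    finally:
        events.append("cleaned up")


first = next_and_close(chunks())
```

After the call, `events` already contains the cleanup marker even though the generator was never exhausted.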
sharziki fb46aad9a5 fix: close file handles in from_local() config loading
Use context managers when reading config.json in VoxCPMModel.from_local()
and VoxCPM2Model.from_local() to prevent file descriptor leaks. Also add
explicit encoding="utf-8" to avoid locale-dependent decode errors.

Closes #235

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-11 00:01:14 -04:00
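
The pattern this commit applies is, in outline, the one below. The function name `load_config` is an assumption for illustration and is not taken from the repository; only the `config.json` filename and the `encoding="utf-8"` argument come from the commit message.

```python
import json
import os
import tempfile


def load_config(model_dir):
    """Read config.json from `model_dir`, closing the file deterministically."""
    config_path = os.path.join(model_dir, "config.json")
    # The context manager closes the handle even if json.load() raises, and
    # the explicit encoding avoids depending on the platform's locale.
    with open(config_path, encoding="utf-8") as f:
        return json.load(f)


# Round-trip a config containing a non-ASCII value through a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "config.json"), "w", encoding="utf-8") as f:
        json.dump({"name": "caf\u00e9", "sample_rate": 16000}, f, ensure_ascii=False)
    config = load_config(tmp)
```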
刘鑫 e4e049624c update finetuning pipeline and runtime device handling
Support optional ref_audio samples in finetuning and make runtime device selection explicit while keeping auto fallback behavior consistent. Also ignore the local app override file to avoid accidental commits.

Made-with: Cursor
2026-04-11 11:08:50 +08:00
刘鑫 75cfa3e9b8 fix: use uncompiled feat_encoder for prefill to prevent CUDA Graph dynamic shape accumulation (#209) 2026-04-09 16:00:17 +08:00
刘鑫 d9cf376e16 update voxcpm2 2026-03-31 11:50:37 +08:00
haosenwang1018 8df79de636 fix: use specific exceptions instead of bare except
- lora_ft_webui.py: except (JSONDecodeError, OSError) for config file
- voxcpm.py: except ImportError for triton availability check
2026-02-24 22:19:45 +00:00
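
The two narrowed handlers named in this commit follow the shape sketched here. The helper names `read_config` and `has_module` are hypothetical and the bodies are simplified; the exception types are the ones listed in the commit message.

```python
import json


def read_config(path, default=None):
    """Return parsed JSON from `path`, or `default` if it is missing or invalid."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except (json.JSONDecodeError, OSError):
        # Only unreadable or malformed files are tolerated; programming errors
        # and KeyboardInterrupt still propagate, unlike with a bare except.
        return default


def has_module(name):
    """Report whether an optional dependency can be imported."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


missing = read_config("/nonexistent/dir/config.json", default={})
json_available = has_module("json")
bogus_available = has_module("surely_not_a_real_module_xyz")
```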
vytskalt f2e203d5e2 print debug messages to stderr instead of stdout 2026-01-09 20:05:52 +02:00
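
Routing diagnostics to stderr keeps stdout clean for callers that pipe or parse the program's output. A minimal sketch, with a hypothetical `debug` helper that is not taken from the repository:

```python
import contextlib
import io
import sys


def debug(message):
    """Write a diagnostic line to stderr instead of stdout."""
    print(message, file=sys.stderr)


# Capture both streams to show that the two kinds of output stay separate.
out, err = io.StringIO(), io.StringIO()
with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
    print("synthesized 3 chunks")  # program output
    debug("loading model")         # diagnostic message
```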
刘鑫 ee5f2567ac FIX: When a prompt is present, concatenate two patches as the context for VAE decoding 2025-12-15 20:37:02 +08:00
刘鑫 b3a2d95fec FIX: When a prompt is present, concatenate two patches as the context for VAE decoding 2025-12-15 20:35:46 +08:00
Labmem-Zhouyx 461ad7e506 Update: VoxCPM1.5 and fine-tuning support 2025-12-05 21:00:01 +08:00
刘鑫 2eb4d39719 FIX: Add MPS support 2025-09-28 21:06:35 +08:00
AbrahamSanders 5c5da0dbe6 Add a streaming API for VoxCPM 2025-09-19 16:56:11 -04:00
刘鑫 dc6b6d1d1c FIX: capture compile error on Windows 2025-09-18 19:23:13 +08:00
周逸轩 1a46c5d1ad update README 2025-09-18 14:53:37 +08:00
周逸轩 5257ec3dc5 FIX: noise point 2025-09-18 14:50:01 +08:00
刘鑫 e5bcb735f0 Remove segment text logic 2025-09-18 12:02:37 +08:00
刘鑫 032c7fe403 capture torch compile error 2025-09-17 18:09:09 +08:00
zengguoyang 272b8ffbf6 init 2025-09-16 11:46:47 +08:00