VoxCPM

Author	SHA1	Message	Date
ZGY	cd79a647fa	Merge pull request #263 from Oumnya/fix/mps-bf16-dtype fix(mps): force float32 on Apple Silicon to avoid bf16 quality loss	2026-04-21 18:49:48 +08:00
JunghwanNA	ec2acec8a1	Harden LoRA checkpoint loading against untrusted pickle payloads LoRA is a first-class workflow in VoxCPM, and the project already prefers safetensors plus weights-only fallback loading for base model artifacts. The legacy LoRA .ckpt/.pth path was the remaining place that still deserialized arbitrary pickle objects, so this switches it to weights_only=True and adds focused regression coverage for both model loaders. Constraint: Must preserve compatibility with tensor-only legacy LoRA checkpoints Rejected: Remove .ckpt/.pth support entirely | too disruptive for existing users Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep LoRA artifact handling aligned with the existing safetensors-first, weights-only loading pattern Tested: python3 -m pytest -q tests/test_lora_checkpoint_loading.py tests/test_model_utils.py -q Not-tested: Full end-to-end LoRA hot-load with heavyweight model assets	2026-04-18 00:31:28 +09:00
oumnya	38d61cdf03	fix(mps): force float32 on Apple Silicon to avoid bf16 quality loss VoxCPM checkpoints default to bfloat16. Following commit `e4e0496` which added MPS device routing, running with `device=mps` selects bf16 on Apple Silicon. On Metal, bf16 introduces enough numerical drift in the diffusion AR loop that the synthesized audio is glitched and trips the model's badcase detector, which retries until the per-call retry budget is exhausted. Effectively MPS support is unusable in the default config. This patch adds a single helper, `pick_runtime_dtype(device, dtype)`, that promotes any low-precision dtype to float32 when the resolved device is `mps`. CUDA and CPU paths are untouched. An opt-out env var `VOXCPM_MPS_DTYPE` lets users force a specific dtype on MPS once future PyTorch / macOS releases improve bf16 stability. Both VoxCPMModel and VoxCPM2Model adopt the helper in their __init__, replacing what would otherwise be duplicated inline checks. Verified locally on Apple M5 Max, PyTorch 2.11, macOS 15: - VoxCPM2 (2B): clean output, RTF ~0.78 steady state - VoxCPM 0.5B: clean output, RTF ~0.92 - No badcase retries fired in any test - VOXCPM_MPS_DTYPE=bfloat16 round-trips and reproduces the original glitched output, confirming the override path.	2026-04-15 12:22:56 +08:00
刘鑫	1565e83efe	fix: complete shared generator cleanup coverage Move generator close handling into a shared utility and wire the core generation pipeline through it so partially-consumed prompt cache generators are cleaned up consistently across both model variants and the public VoxCPM wrapper. Made-with: Cursor	2026-04-13 17:39:05 +08:00
刘鑫	61b36d4e56	refactor: centralize generator cleanup in model helpers Factor repeated next-and-close patterns into a shared helper in both VoxCPM model variants so non-streaming inference cleans up generators consistently while keeping the issue reference close to the workaround. Made-with: Cursor	2026-04-13 16:57:08 +08:00
sharziki	fb46aad9a5	fix: close file handles in from_local() config loading Use context managers when reading config.json in VoxCPMModel.from_local() and VoxCPM2Model.from_local() to prevent file descriptor leaks. Also add explicit encoding="utf-8" to avoid locale-dependent decode errors. Closes #235 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 00:01:14 -04:00
刘鑫	e4e049624c	update finetuning pipeline and runtime device handling Support optional ref_audio samples in finetuning and make runtime device selection explicit while keeping auto fallback behavior consistent. Also ignore the local app override file to avoid accidental commits. Made-with: Cursor	2026-04-11 11:08:50 +08:00
刘鑫	75cfa3e9b8	fix: use uncompiled feat_encoder for prefill to prevent CUDA Graph dynamic shape accumulation (#209)	2026-04-09 16:00:17 +08:00
刘鑫	d9cf376e16	update voxcpm2	2026-03-31 11:50:37 +08:00
haosenwang1018	8df79de636	fix: use specific exceptions instead of bare except - lora_ft_webui.py: except (JSONDecodeError, OSError) for config file - voxcpm.py: except ImportError for triton availability check	2026-02-24 22:19:45 +00:00
vytskalt	f2e203d5e2	print debug messages to stderr instead of stdout	2026-01-09 20:05:52 +02:00
刘鑫	ee5f2567ac	FIX:When a prompt is present, concatenate two patches as the context for VAE decoding	2025-12-15 20:37:02 +08:00
刘鑫	b3a2d95fec	FIX:When a prompt is present, concatenate two patches as the context for VAE decoding	2025-12-15 20:35:46 +08:00
Labmem-Zhouyx	461ad7e506	Update: VoxCPM1.5 and fine-tuning support	2025-12-05 21:00:01 +08:00
刘鑫	2eb4d39719	FX: Add MPS support	2025-09-28 21:06:35 +08:00
AbrahamSanders	5c5da0dbe6	Add a streaming API for VoxCPM	2025-09-19 16:56:11 -04:00
刘鑫	dc6b6d1d1c	Fx: capture compile error on Windows	2025-09-18 19:23:13 +08:00
周逸轩	1a46c5d1ad	update README	2025-09-18 14:53:37 +08:00
周逸轩	5257ec3dc5	FX: noise point	2025-09-18 14:50:01 +08:00
刘鑫	e5bcb735f0	Remove segment text logic	2025-09-18 12:02:37 +08:00
刘鑫	032c7fe403	capture torch compile error	2025-09-17 18:09:09 +08:00
zengguoyang	272b8ffbf6	init	2025-09-16 11:46:47 +08:00
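
Commits 61b36d4e56 and 1565e83efe describe factoring a repeated next-and-close pattern into a shared helper so that partially consumed generators are cleaned up consistently. The repository's actual helper is not shown in this log, so the following is only a minimal sketch of that pattern; the name `next_and_close` is hypothetical.

```python
from typing import Generator, TypeVar

T = TypeVar("T")


def next_and_close(gen: Generator[T, None, None]) -> T:
    """Return the first item of a generator, then close it.

    Hypothetical helper: closing the generator runs any pending
    `finally` blocks inside it, even though it was only partly consumed.
    """
    try:
        return next(gen)
    finally:
        gen.close()


def demo_generator(log: list) -> Generator[int, None, None]:
    # Stand-in for a generation pipeline that holds resources while yielding.
    try:
        yield 1
        yield 2
    finally:
        log.append("cleaned up")
```

Calling `next_and_close(demo_generator(log))` returns the first item and leaves `"cleaned up"` in `log`, which is the behaviour non-streaming inference would rely on.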

22 Commits
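
Commit 38d61cdf03 describes a helper, `pick_runtime_dtype(device, dtype)`, that promotes low-precision dtypes to float32 on `mps` and honours the `VOXCPM_MPS_DTYPE` override. The commit body does not include the implementation, so the sketch below only illustrates the described behaviour. As an assumption made to keep it free of a PyTorch dependency, dtypes are passed as names (strings) rather than `torch.dtype` objects, and the set of "low-precision" dtypes is a guess.

```python
import os

# Assumption: which dtypes count as low precision is not stated in the commit.
LOW_PRECISION_DTYPES = {"bfloat16", "float16"}


def pick_runtime_dtype(device: str, dtype: str) -> str:
    """Return the dtype name to use at runtime for the resolved device."""
    if not device.startswith("mps"):
        # CUDA and CPU paths are left untouched.
        return dtype

    # Opt-out: let the user force a specific dtype on MPS.
    override = os.environ.get("VOXCPM_MPS_DTYPE")
    if override:
        return override

    # Promote low-precision dtypes to float32 on Apple Silicon.
    if dtype in LOW_PRECISION_DTYPES:
        return "float32"
    return dtype
```

With the environment variable unset, `pick_runtime_dtype("mps", "bfloat16")` yields `"float32"`, while the same call with `"cuda"` keeps `"bfloat16"`.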