VoxCPM

Author	SHA1	Message	Date
gluttony-10	d3cc88722c	feat: enhance control text processing in VoxCPMDemo Added regex to strip parentheses from control instructions in the text synthesis method to ensure compatibility with the expected prompt format. This change improves the robustness of the input handling.	2026-04-21 07:07:24 +00:00
xliucs	13605c5a0e	Merge pull request #266 from linyueqian/docs/add-vllm-omni-references docs: add vLLM-Omni serving references	2026-04-17 10:46:21 +08:00
Yueqian Lin	afa63e6195	docs: add vLLM-Omni serving references Document vLLM-Omni as a production serving option for VoxCPM2 alongside the existing Nano-vLLM reference. Mirrors the addition in README_zh.md, and adds an ecosystem table entry. Install snippet follows the upstream vLLM-Omni installation guide (from source, since vllm-omni is rapidly evolving). Signed-off-by: Yueqian Lin <linyueqian@outlook.com>	2026-04-16 21:19:27 -05:00
liuxin	eae0a29908	docs: add ComfyUI RH link Made-with: Cursor	2026-04-16 11:46:40 +08:00
Labmem-Zhouyx	35895982d7	Merge PR #212 : perf: stateful streaming VAE decode — eliminate redundant overlap - StreamingVAEDecoder caches CausalConv1d/CausalTransposeConv1d left-pad state between calls — one patch in, one patch out, no overlap - _inference yields single-patch latents in streaming mode - 2x faster streaming VAE decode, more accurate (max diff 0.0005 vs 0.0011)	2026-04-15 16:01:38 +08:00
Labmem-Zhouyx	f7f1b78c4d	fix: correct transpose conv context	2026-04-15 16:01:02 +08:00
刘鑫	1565e83efe	fix: complete shared generator cleanup coverage Move generator close handling into a shared utility and wire the core generation pipeline through it so partially-consumed prompt cache generators are cleaned up consistently across both model variants and the public VoxCPM wrapper. Made-with: Cursor	2026-04-13 17:39:05 +08:00
刘鑫	61b36d4e56	refactor: centralize generator cleanup in model helpers Factor repeated next-and-close patterns into a shared helper in both VoxCPM model variants so non-streaming inference cleans up generators consistently while keeping the issue reference close to the workaround. Made-with: Cursor	2026-04-13 16:57:08 +08:00
刘鑫	b1584aec7c	fix: stabilize CPU SDPA mask broadcasting Use an explicit broadcastable attention mask shape during MiniCPM incremental decoding so CPU runtimes avoid a PyTorch SDPA dimension error without changing attention semantics. Made-with: Cursor	2026-04-13 15:38:53 +08:00
xliucs	5510503182	Merge pull request #246 from sharziki/fix/unclosed-file-handles fix: close file handles in from_local() config loading	2026-04-11 13:10:04 +08:00
sharziki	fb46aad9a5	fix: close file handles in from_local() config loading Use context managers when reading config.json in VoxCPMModel.from_local() and VoxCPM2Model.from_local() to prevent file descriptor leaks. Also add explicit encoding="utf-8" to avoid locale-dependent decode errors. Closes #235 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 00:01:14 -04:00
刘鑫	e4e049624c	update finetuning pipeline and runtime device handling Support optional ref_audio samples in finetuning and make runtime device selection explicit while keeping auto fallback behavior consistent. Also ignore the local app override file to avoid accidental commits. Made-with: Cursor	2026-04-11 11:08:50 +08:00
xliucs	abf01b9bf3	Merge pull request #229 from kuishou68/fix/issue-228-validate-text-type-order fix: correct isinstance/strip order in _generate() to prevent AttributeError on non-string input	2026-04-10 10:30:15 +08:00
cocoon	4f4a5b9f6c	fix: correct type-check order in _generate() to prevent AttributeError on non-string input The previous guard `not text.strip() or not isinstance(text, str)` called .strip() before verifying that text is actually a string, causing an AttributeError (e.g. for int input) instead of the intended ValueError. Swap operand order so isinstance check short-circuits first. Closes #228	2026-04-09 16:13:40 +00:00
刘鑫	79c0cf68dd	chore: remove accidentally committed app_local.py Made-with: Cursor	2026-04-09 16:05:18 +08:00
刘鑫	75cfa3e9b8	fix: use uncompiled feat_encoder for prefill to prevent CUDA Graph dynamic shape accumulation (#209)	2026-04-09 16:00:17 +08:00
Labmem-Zhouyx	5611bd08a0	optimize app.py	2026-04-09 00:30:19 +08:00
Kevin Knoedler	66205135fc	perf: stateful streaming VAE decode — eliminate redundant overlap Streaming decode previously re-decoded 4 overlapping patches through the VAE each step, discarding 75% of the output. Replace with stateful decode that carries causal conv padding buffers between calls — one patch in, one patch out, no overlap. Changes: - Add StreamingVAEDecoder to audiovae/audio_vae_v2.py — caches CausalConv1d and CausalTransposeConv1d left-pad state between calls - AudioVAE.streaming_decode() context manager for clean lifecycle - _inference yields single-patch latents in streaming mode - _generate and _generate_with_prompt_cache use StreamingVAEDecoder Streaming VAE decode time (isolated): 289ms → 148ms (2x faster) Stateful vs full decode: cosine 1.0000, max diff 0.0005 (more accurate than previous overlap approach at max diff 0.001) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 09:09:22 -07:00
Labmem-Zhouyx	364eff6840	update readme: python version	2026-04-08 23:07:38 +08:00
Labmem-Zhouyx	6d10932b09	update readme	2026-04-08 18:48:58 +08:00
Labmem-Zhouyx	68af4fe502	fix: ft log and setting 2.0.2	2026-04-08 18:15:17 +08:00
Labmem-Zhouyx	ee3649c1b3	fix: streaming decode	2026-04-08 17:25:54 +08:00
Labmem-Zhouyx	82d77d445c	fix: decode chunksize for audiovae_v2	2026-04-08 16:31:36 +08:00
Labmem-Zhouyx	8f95d13073	update readme: 30-language asr result on internal benchmark	2026-04-08 15:36:56 +08:00
Labmem-Zhouyx	df38f0a167	update readme for modelscope download 2.0.1	2026-04-08 11:29:19 +08:00
Labmem-Zhouyx	9adfaf6996	update demo for zh	2026-04-08 00:15:16 +08:00
刘鑫	46cfce0c97	fix VoxCPM2 training sample_rate: 48000 -> 16000 (match AudioVAE encoder) Made-with: Cursor	2026-04-07 22:59:18 +08:00
Labmem-Zhouyx	da700f264e	update ZH readme	2026-04-07 18:04:56 +08:00
Labmem-Zhouyx	9da570d409	remove wechat link	2026-04-07 15:29:12 +08:00
Labmem-Zhouyx	9374524c47	update readme	2026-04-06 23:01:16 +08:00
Labmem-Zhouyx	ec6d30e996	update readme	2026-04-06 22:56:06 +08:00
Labmem-Zhouyx	a010d621ff	update readme Made-with: Cursor 2.0.0	2026-04-06 22:09:24 +08:00
Dennis Huang	3f005b0dbd	Enhance README formatting and community section for better visibility	2026-04-06 19:50:29 +08:00
Labmem-Zhouyx	039c6e9f92	update	2026-04-06 17:15:10 +08:00
Dennis Huang	5734ab36b6	Update README	2026-04-06 16:24:12 +08:00
Labmem-Zhouyx	746631c38d	update	2026-04-06 16:10:50 +08:00
Labmem-Zhouyx	07b8b5c01f	update readme	2026-04-06 15:53:58 +08:00
Labmem-Zhouyx	f738cc9946	update	2026-04-03 18:46:29 +08:00
Labmem-Zhouyx	0c2cf23617	Update app.py UI, adjust streaming_prefix_len, remove legacy docs - Refine app.py: Ultimate Cloning naming, NFE slider, i18n polish - Change streaming_prefix_len default from 3 to 4 for smoother decoding - Remove legacy docs/ directory (migrated to ReadTheDocs) Made-with: Cursor	2026-04-03 18:42:41 +08:00
Labmem-Zhouyx	b823d8107c	Merge branch 'dev_2.0' of https://github.com/OpenBMB/VoxCPM into dev_2.0	2026-04-03 17:44:46 +08:00
刘鑫	a87739426f	add voxcpm2 finetune conf	2026-04-03 14:23:15 +08:00
Labmem-Zhouyx	12c2b8ff98	update readme	2026-04-02 21:01:23 +08:00
刘鑫	30c300cfe8	adjust default cfg range	2026-04-02 18:14:35 +08:00
刘鑫	addee2c550	support voxcpm2 cli	2026-04-01 21:15:55 +08:00
Labmem-Zhouyx	42c428164c	feat: add no_rope support for residual LM and fix streaming continuation decoding - Add `residual_lm_no_rope` config option in VoxCPMConfig and propagate to MiniCPMModel - Add `no_rope` field to MiniCPM4Config; make RoPE embedding optional in MiniCPMModel and MiniCPMAttention - Add `streaming_prefix_len` parameter to generation interface - Fix non-streaming audio decode in continuation mode to trim leading prefix patches consistently - Refactor streaming prefix context preparation: distinguish continuation vs. zero-shot via feat_mask trailing bit instead of audio_mask sum Made-with: Cursor	2026-03-31 17:07:33 +08:00
刘鑫	d9cf376e16	update voxcpm2	2026-03-31 11:50:37 +08:00
刘鑫	23ed7ffeee	fix: fix some bugs in resuming multi-GPU training	2026-03-13 18:43:07 +08:00
xliucs	7823e14b82	Merge pull request #188 from haosenwang1018/fix/bare-excepts fix: use specific exceptions instead of bare except	2026-03-03 11:49:00 +08:00
haosenwang1018	8df79de636	fix: use specific exceptions instead of bare except - lora_ft_webui.py: except (JSONDecodeError, OSError) for config file - voxcpm.py: except ImportError for triton availability check	2026-02-24 22:19:45 +00:00
xliucs	acaadb19e9	Merge pull request #186 from symhsym/patch-1 Update train_voxcpm_finetune.py	2026-02-11 18:05:39 +08:00
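
Commit 4f4a5b9f6c above describes an operand-order bug: `not text.strip() or not isinstance(text, str)` calls `.strip()` before the type check, so a non-string input raises AttributeError instead of the intended ValueError. The following is a minimal sketch of that pattern, not the project's actual `_generate()` code; the function names `validate_text` and `validate_text_buggy` and the error message are hypothetical.

```python
def validate_text_buggy(text):
    """Illustrative only: old operand order, .strip() runs before the type check."""
    if not text.strip() or not isinstance(text, str):
        raise ValueError("input text must be a non-empty string")
    return text.strip()


def validate_text(text):
    """Illustrative only: isinstance() is evaluated first and short-circuits."""
    # `or` stops at the first truthy operand, so .strip() is never
    # reached when `text` is not a string.
    if not isinstance(text, str) or not text.strip():
        raise ValueError("input text must be a non-empty string")
    return text.strip()


def error_name(func, value):
    """Return the name of the exception raised by func(value), or None."""
    try:
        func(value)
    except Exception as exc:  # broad on purpose: we only report the type
        return type(exc).__name__
    return None


if __name__ == "__main__":
    print(error_name(validate_text_buggy, 42))  # AttributeError
    print(error_name(validate_text, 42))        # ValueError
    print(validate_text("  hello  "))           # hello
```

Because Python evaluates `or` left to right and stops at the first truthy operand, putting the cheap type check first is enough; no extra `try`/`except` is needed.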


127 Commits
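
Commits 66205135fc and 35895982d7 describe replacing overlapped re-decoding with a stateful decode that carries causal-convolution left-pad state between calls. The idea can be sketched with a toy one-dimensional causal FIR filter in plain Python. This is an assumption-laden illustration of the caching technique only: `StreamingCausalFilter` is a hypothetical name, and nothing here reproduces the real `StreamingVAEDecoder`, `CausalConv1d`, or `CausalTransposeConv1d` code.

```python
class StreamingCausalFilter:
    """Toy causal FIR filter that keeps its left-pad state between calls."""

    def __init__(self, kernel):
        self.kernel = list(kernel)
        # A causal filter with K taps looks at K-1 past samples.
        # Before any input arrives, that history is all zeros.
        self.state = [0.0] * (len(self.kernel) - 1)

    def process(self, chunk):
        """Filter one chunk, using cached history instead of re-reading old input."""
        k = len(self.kernel)
        padded = self.state + list(chunk)
        out = [
            sum(self.kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(chunk))
        ]
        # Cache the last K-1 inputs as left context for the next call.
        self.state = padded[len(padded) - (k - 1):]
        return out


def run_full(kernel, signal):
    """Filter the whole signal in one call."""
    return StreamingCausalFilter(kernel).process(signal)


def run_chunked(kernel, signal, chunk_size):
    """Filter the signal chunk by chunk, one chunk in and one chunk out."""
    filt = StreamingCausalFilter(kernel)
    out = []
    for start in range(0, len(signal), chunk_size):
        out.extend(filt.process(signal[start:start + chunk_size]))
    return out


if __name__ == "__main__":
    kernel = [0.5, 0.25, 0.25]
    signal = [float(n % 7) for n in range(20)]
    print(run_chunked(kernel, signal, 4) == run_full(kernel, signal))
```

In this toy setting the chunked output matches the single-call output because each output sample sees exactly the same inputs either way; the cached state stands in for the overlapping context that would otherwise be recomputed on every step.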