GPT-SoVITS TensorRT Inference Acceleration Notes

📅 2026/6/27 5:15:52
Contents
I. Environment setup
  1. Model download
  2. Export the pth models to ONNX
  3. Install TensorRT and convert the ONNX files to TRT

I. Environment setup

1. Model download

Keep the models under /root/autodl-tmp, where I/O reads are faster. The project itself lives under /root, so create symlinks from the project into that location, and create the ONNX export directory at the same time.

    cd /root/3rd/GPT-SoVITS_minimal_inference
    mkdir -p /root/autodl-tmp/GPT-SoVITS_minimal_inference/pretrained_models
    mkdir -p /root/autodl-tmp/GPT-SoVITS_minimal_inference/onnx_export
    rm -rf pretrained_models onnx_export
    ln -s /root/autodl-tmp/GPT-SoVITS_minimal_inference/pretrained_models pretrained_models
    ln -s /root/autodl-tmp/GPT-SoVITS_minimal_inference/onnx_export onnx_export

All models are then downloaded to, or placed in, /root/autodl-tmp/GPT-SoVITS_minimal_inference/pretrained_models.

    ## Start downloading the models
    cd /root/3rd/GPT-SoVITS_minimal_inference
    HF_ENDPOINT=https://hf-mirror.com hf download lj1995/GPT-SoVITS \
      --include "chinese-hubert-base/*" \
      --include "chinese-roberta-wwm-ext-large/*" \
      --include s1v3.ckpt \
      --include v2Pro/s2Gv2ProPlus.pth \
      --include sv/pretrained_eres2netv2w24s4ep4.ckpt \
      --local-dir pretrained_models

    # Check the result once the download finishes
    ls -lh pretrained_models
    ls -lh pretrained_models/v2Pro
    ls -lh pretrained_models/sv

If the download is slow, switch to ModelScope, a mirror hosted in mainland China:

    cd /root/3rd/GPT-SoVITS_minimal_inference
    python - <<'PY'
    from modelscope import snapshot_download

    snapshot_download(
        "dienstag/chinese-roberta-wwm-ext-large",
        local_dir="pretrained_models/chinese-roberta-wwm-ext-large",
    )
    snapshot_download(
        "innnky/chinese-hubert-base-tencent",
        local_dir="pretrained_models/chinese-hubert-base",
    )
    PY

2. Export the pth models to ONNX

The base models could not be exported to ONNX directly, so download the model files again into the directories listed in the comments below.

    cd /root/3rd/GPT-SoVITS_minimal_inference
    # pretrained_models/GPT_weights_v2ProPlus/*.ckpt
    # pretrained_models/SoVITS_weights_v2ProPlus/*.pth
    HF_HUB_DISABLE_XET=1 python - <<'PY'
    from huggingface_hub import hf_hub_download
    import os
    import shutil

    repo_id = "lj1995/GPT-SoVITS"
    # (file name in the repo, target path in the project)
    downloads = [
        ("s1v3.ckpt", "pretrained_models/GPT_weights_v2ProPlus/s1v3.ckpt"),
        ("v2Pro/s2Gv2ProPlus.pth", "pretrained_models/SoVITS_weights_v2ProPlus/s2Gv2ProPlus.pth"),
        ("sv/pretrained_eres2netv2w24s4ep4.ckpt", "pretrained_models/sv/pretrained_eres2netv2w24s4ep4.ckpt"),
    ]
    for filename, target in downloads:
        os.makedirs(os.path.dirname(target), exist_ok=True)
        src = hf_hub_download(repo_id=repo_id, filename=filename)
        shutil.copy2(src, target)
        print(f"saved: {target}")
    PY

Export the ONNX models:

    cd /root/3rd/GPT-SoVITS_minimal_inference
    pip install -r requirements.txt
    python export_onnx.py \
      --gpt_path pretrained_models/GPT_weights_v2ProPlus/s1v3.ckpt \
      --sovits_path pretrained_models/SoVITS_weights_v2ProPlus/s2Gv2ProPlus.pth \
      --cnhubert_base_path pretrained_models/chinese-hubert-base \
      --bert_path pretrained_models/chinese-roberta-wwm-ext-large \
      --sv_path pretrained_models/sv/pretrained_eres2netv2w24s4ep4.ckpt \
      --output_dir onnx_export/v2proplus_base \
      --max_len 1000

This failed with ImportError: libcudart.so.13, caused by onnxruntime-gpu 1.27.0. Reinstalling onnxruntime-gpu 1.22.0 resolved it:

    pip uninstall -y onnxruntime-gpu onnxruntime
    pip install --no-cache-dir onnxruntime-gpu==1.22.0

Rerunning the same export_onnx.py command from ~/3rd/GPT-SoVITS_minimal_inference then succeeds with the following output:

    Loading models...
    Exporting to onnx_export/v2proplus_base...
    Exporting SSL...
    Exporting BERT...
    Exporting VQEncoder...
    Exporting GPT Encoder...
    Exporting GPT Step...
    Exporting SoVITS...
    Exporting Spectrogram...
    min value is tensor(-3.8879) max value is tensor(4.3243)
    Exporting SV Embedding...
    Export complete!
    Config saved to onnx_export/v2proplus_base/config.json

Convert to FP16:

    python onnx_to_fp16.py \
      --input_dir onnx_export/v2proplus_base \
      --output_dir onnx_export/v2proplus_base_fp16

Tail of the output (earlier lines omitted):

    ...
    Saved: onnx_export/v2proplus_base_fp16/gpt_encoder.onnx
    Processing: gpt_step.onnx | Strategy: FP16 (Mixed)
    [FP16] Converting to FP16...
    [Attribute Fix] Fixed 49 attributes (Random/Cast mismatch).
    Simplifying...
    Saved: onnx_export/v2proplus_base_fp16/gpt_step.onnx
    Processing: sovits.onnx | Strategy: FP16 (Mixed)
    [FP16] Converting to FP16...
    /root/miniconda3/lib/python3.12/site-packages/onnxconverter_common/float16.py:63: UserWarning: the float32 number -10000.0 will be truncated to -10000.0
      warnings.warn(
    [Attribute Fix] Fixed 45 attributes (Random/Cast mismatch).
    Simplifying...
    Saved: onnx_export/v2proplus_base_fp16/sovits.onnx
    Processing: spectrogram.onnx | Strategy: FP32 (Keep)
    [FP16] Skipping FP16 conversion (Sensitivity/Low-Cost).
    Simplifying...
    Saved: onnx_export/v2proplus_base_fp16/spectrogram.onnx
    Processing: sv_embedding.onnx | Strategy: FP32 (Keep)
    [FP16] Skipping FP16 conversion (Sensitivity/Low-Cost).
    Simplifying...
    Saved: onnx_export/v2proplus_base_fp16/sv_embedding.onnx
    Optimization complete: onnx_export/v2proplus_base_fp16

Per the log, gpt_step.onnx and sovits.onnx are converted with the mixed FP16 strategy, while spectrogram.onnx and sv_embedding.onnx are kept in FP32.

3. Install TensorRT and convert the ONNX files to TRT

Download TensorRT 10.16:

    mkdir -p /root/3rd/trt
    cd /root/3rd/trt
    wget -O nv-tensorrt-local-repo-ubuntu2204-10.16.1-cuda-13.2_1.0-1_amd64.deb \
      https://developer.download.nvidia.com/compute/tensorrt/10.16.1/local_installers/nv-tensorrt-local-repo-ubuntu2204-10.16.1-cuda-13.2_1.0-1_amd64.deb

Install it once the download completes:

    dpkg -i nv-tensorrt-local-repo-ubuntu2204-10.16.1-cuda-13.2_1.0-1_amd64.deb
    cp /var/nv-tensorrt-local-repo-ubuntu2204-10.16.1-cuda-13.2/nv-tensorrt-local-*-keyring.gpg /usr/share/keyrings/
    apt-get update
    apt-get install -y tensorrt libnvinfer-dev libnvinfer-plugin-dev libnvonnxparsers-dev libnvinfer-bin

Verify the installation:

    which trtexec
    trtexec --version

With TensorRT installed, go back to the project and convert the ONNX models to TRT engine files:

    cd /root/3rd/GPT-SoVITS_minimal_inference
    python onnx2trt.py \
      --input_dir onnx_export/v2proplus_base_fp16 \
      --output_dir onnx_export/v2proplus_base_trt_fp16 \
      --precision fp16 \
      --shape_profile fitted
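
The export step depends on five inputs sitting at exact paths, and with symlinked directories and two download routes it is easy to end up with one of them missing or empty. Below is a minimal pre-flight sketch that checks them before running export_onnx.py. It assumes it is run from the project root; the names find_missing and REQUIRED are hypothetical helpers written for these notes and are not part of the GPT-SoVITS_minimal_inference project. The paths themselves are the ones passed to the export command.

```python
from pathlib import Path

# Inputs passed to export_onnx.py in these notes, relative to the project root.
REQUIRED = [
    "pretrained_models/GPT_weights_v2ProPlus/s1v3.ckpt",
    "pretrained_models/SoVITS_weights_v2ProPlus/s2Gv2ProPlus.pth",
    "pretrained_models/chinese-hubert-base",
    "pretrained_models/chinese-roberta-wwm-ext-large",
    "pretrained_models/sv/pretrained_eres2netv2w24s4ep4.ckpt",
]


def find_missing(project_root, required=REQUIRED):
    """Return the required paths that are absent or empty under project_root.

    Hypothetical helper: a directory counts as present only if it contains
    at least one entry, a file only if it is non-empty. Symlinks are
    followed, so a dangling link is reported as missing.
    """
    root = Path(project_root)
    missing = []
    for rel in required:
        path = root / rel
        if path.is_dir():
            ok = any(path.iterdir())
        else:
            ok = path.is_file() and path.stat().st_size > 0
        if not ok:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    missing = find_missing(".")
    if missing:
        print("missing or empty:")
        for rel in missing:
            print("  " + rel)
    else:
        print("all model inputs present")
```

This only checks that the files are there. It does not validate checkpoint contents, so a truncated download can still pass.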