MaxKB fails to load gpt2

Problem description

When using the advanced AI orchestration feature in MaxKB, starting an AI chat throws the exception "Can't load tokenizer for 'gpt2'":

Can't load tokenizer for 'gpt2'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a
local directory with the same name. Otherwise, make sure 'gpt2' is the correct path to a directory containing all relevant files for
a GPT2TokenizerFast tokenizer.

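Solution

This happens because the gpt2 tokenizer, which MaxKB uses to count tokens, has not been downloaded to the local machine.

It can be downloaded from Hugging Face: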

pip install -U huggingface_hub   # install the huggingface_hub library
huggingface-cli download --resume-download gpt2 --local-dir gpt2   # download gpt2 into ./gpt2

Once the download has finished, place the model files in /opt/maxkb/app/models/tokenizer/gpt2
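Before changing any code, it can help to confirm that the copy actually landed where MaxKB will look. The sketch below is not part of MaxKB; `missing_tokenizer_files` is a hypothetical helper, and the list of file names is an assumption based on the tokenizer files shipped in the public gpt2 repository. It may be stricter than the tokenizer really needs.

```python
from pathlib import Path

# Assumption: these are the tokenizer-related files shipped in the public
# gpt2 repository. The tokenizer may load with only a subset of them.
EXPECTED_FILES = ("vocab.json", "merges.txt", "tokenizer.json")


def missing_tokenizer_files(directory):
    """Return the expected tokenizer files that are absent from `directory`."""
    root = Path(directory)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_tokenizer_files("/opt/maxkb/app/models/tokenizer/gpt2")` should return an empty list if the copy is complete; any names it returns are files worth re-downloading.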

Then change the path that MaxKB uses to look up gpt2:

# apps\common\config\tokenizer_manage_config.py
class TokenizerManage:
    # Cached tokenizer instance, created on first use
    tokenizer = None

    @staticmethod
    def get_tokenizer():
        from transformers import GPT2TokenizerFast
        if TokenizerManage.tokenizer is None:
            # Load from the local directory instead of the 'gpt2' hub name
            TokenizerManage.tokenizer = GPT2TokenizerFast.from_pretrained(
                '/opt/maxkb/app/models/tokenizer/gpt2',
                cache_dir="/opt/maxkb/model/tokenizer",
                local_files_only=True,
                resume_download=False,
                force_download=False)
        return TokenizerManage.tokenizer
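If the directory is missing, the error from transformers is the same misleading "Can't load tokenizer" message as before. A minimal sketch of a wrapper that fails earlier with a clearer message is shown below. `load_local_tokenizer` and its `loader` parameter are hypothetical names introduced here for illustration, not MaxKB code; `loader` exists only so the check can be exercised without transformers installed.

```python
from pathlib import Path


def load_local_tokenizer(directory, loader=None):
    """Load a tokenizer from a local directory, failing early if it is absent."""
    root = Path(directory)
    if not root.is_dir():
        raise FileNotFoundError(f"tokenizer directory not found: {root}")
    if loader is None:
        # Imported lazily so the directory check works without transformers.
        from transformers import GPT2TokenizerFast
        loader = GPT2TokenizerFast.from_pretrained
    # local_files_only=True keeps the call from trying to reach huggingface.co.
    return loader(str(root), local_files_only=True)
```

With the files in place, `load_local_tokenizer("/opt/maxkb/app/models/tokenizer/gpt2")` should return a usable tokenizer; if the path is wrong, it raises `FileNotFoundError` naming the directory that was checked.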

References