Investigations - ChatRWKV
Note
I’d like to try https://huggingface.co/spaces/BlinkDL/Raven-RWKV-7B locally 🥺
- Download some models from https://huggingface.co/BlinkDL/rwkv-4-raven/tree/main
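For example, the 3B model used in the experiments below can be fetched directly, assuming the standard Hugging Face "resolve" URL pattern (a multi-gigabyte download):

```shell
# Fetch the 3B Raven checkpoint referenced later in these notes.
# URL assumes Hugging Face's usual /resolve/main/ download path.
wget 'https://huggingface.co/BlinkDL/rwkv-4-raven/resolve/main/RWKV-4-Raven-3B-v8-EngAndMore-20230408-ctx4095.pth'
```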
- Clone ChatRWKV
  - $ git clone 'https://github.com/BlinkDL/ChatRWKV.git'
Roundtrip trying to run v2/chat.py
- Some dependency requires Python <3.11
- $ cd v2
- $ python --version
  Python 3.10.10
- $ python chat.py
  <...ERRORS and FIXES...>
- $ pip install -r ../requirements.txt
- $ pip install numpy torch
Tips
- ModuleNotFoundError: No module named '_ctypes'
  - https://blog.goo.ne.jp/dak-ikd/e/ae172dc7afc8700cc1244ccc96e0a75d
  - Follow https://github.com/pyenv/pyenv/wiki#suggested-build-environment
    - sudo apt install....
  - and re-build the Python environment (I am using pyenv)
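A sketch of that fix, assuming Debian/Ubuntu and pyenv; libffi-dev is the package whose absence at interpreter-build time typically causes the _ctypes error (adjust the Python version to your setup):

```shell
# _ctypes is compiled against libffi when Python itself is built, so the
# headers must be installed *before* pyenv builds the interpreter.
sudo apt install libffi-dev               # plus the rest of pyenv's suggested build deps
pyenv install 3.10.11                     # re-build Python with libffi available
pyenv local 3.10.11
python -c 'import ctypes; print("ok")'    # should no longer raise ModuleNotFoundError
```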
- RuntimeError: Ninja is required to load C++ extensions
  - Occurs when os.environ["RWKV_CUDA_ON"] = '1'
  - Then: raise EnvironmentError('CUDA_HOME environment variable is not set. ')
    - Is /.pyenv/versions/3.10.11/lib/python3.10/site-packages/nvidia/cuda_runtime needed??
    - or install CUDA
  - (Not yet resolved.)
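Untested sketch of what I'd try next; the toolkit path is an assumption (/usr/local/cuda is the usual install location):

```shell
# The RWKV CUDA kernel is JIT-compiled via torch.utils.cpp_extension,
# which requires ninja and a CUDA_HOME pointing at a full CUDA toolkit.
pip install ninja
export CUDA_HOME=/usr/local/cuda   # adjust if the toolkit lives elsewhere
python chat.py
```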
Combinations
- args.strategy = 'cpu fp32i8'
  - runs well for 3B (RWKV-4-Raven-3B-v8-EngAndMore-20230408-ctx4095.pth) with 12+GB RAM
- args.strategy = 'cuda fp16i8 *10 -> cpu fp32i8'
  - runs well for 3B (RWKV-4-Raven-3B-v8-EngAndMore-20230408-ctx4095.pth) with 9+GB RAM and 1.9GB VRAM
- args.strategy = 'cuda fp16i8 *15 -> cpu fp32i8'
  - runs well for 3B (RWKV-4-Raven-3B-v8-EngAndMore-20230408-ctx4095.pth) with 9+GB RAM and 2.3GB VRAM
- args.strategy = 'cuda fp16i8 *16 -> cpu fp32i8'
  - Somehow unstable…
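To make the `*N` notation concrete: each stage in the strategy string claims a number of layers on a device/dtype, and the last stage takes whatever remains. A hypothetical helper (not ChatRWKV's actual parser) that reads the strings this way:

```python
def split_strategy(strategy: str, n_layers: int):
    """Illustrative only: map each of n_layers to a (device, dtype) pair
    following the 'dev dtype *N -> dev dtype' strategy notation."""
    plan = []
    stages = [s.strip() for s in strategy.split('->')]
    layer = 0
    for stage in stages:
        parts = stage.split()
        device, dtype = parts[0], parts[1]
        if len(parts) > 2 and parts[2].startswith('*'):
            count = int(parts[2][1:])          # explicit layer count, e.g. *10
        else:
            count = n_layers - layer           # final stage takes the rest
        for _ in range(min(count, n_layers - layer)):
            plan.append((device, dtype))
            layer += 1
    return plan

# 'cuda fp16i8 *10 -> cpu fp32i8' over 32 layers: first 10 layers run on the
# GPU as int8-quantized fp16, the remaining 22 on the CPU — which matches the
# lower VRAM use of *10 vs *15 observed above.
plan = split_strategy('cuda fp16i8 *10 -> cpu fp32i8', 32)
```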