
Note

I’d like to try https://huggingface.co/spaces/BlinkDL/Raven-RWKV-7B locally 🥺

  • Clone ChatRWKV

    • $ git clone 'https://github.com/BlinkDL/ChatRWKV.git'

  • Trial and error getting v2/chat.py to run

    • Some dependency requires Python < 3.11

$ cd v2
$ python --version
Python 3.10.10
$ python chat.py
<...ERRORS and FIXES...>

$ pip install -r ../requirements.txt
$ pip install numpy torch
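
The model and strategy are picked by editing variables near the top of v2/chat.py before running it. Roughly the block I touched, as a standalone sketch (args is defined by chat.py itself; args.MODEL_NAME and the RWKV_* environment flags are from memory and may be named differently in other versions):

import os
from types import SimpleNamespace

# chat.py defines its own args object; a namespace stands in for it so this sketch runs on its own
args = SimpleNamespace()

os.environ["RWKV_JIT_ON"] = "1"   # TorchScript JIT, helps CPU speed
os.environ["RWKV_CUDA_ON"] = "0"  # "1" only if the custom CUDA kernel compiles

# strategy string, see the Combinations list below
args.strategy = "cpu fp32i8"

# downloaded checkpoint, path usually given without the .pth extension
args.MODEL_NAME = "/path/to/RWKV-4-Raven-3B-v8-EngAndMore-20230408-ctx4095"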

Tips


Combinations

  • args.strategy = 'cpu fp32i8'
    • runs well for 3B (RWKV-4-Raven-3B-v8-EngAndMore-20230408-ctx4095.pth) with 12+GB RAM

  • args.strategy = 'cuda fp16i8 *10 -> cpu fp32i8'
    • runs well for 3B (RWKV-4-Raven-3B-v8-EngAndMore-20230408-ctx4095.pth) with 9+GB RAM and 1.9GB VRAM

  • args.strategy = 'cuda fp16i8 *15 -> cpu fp32i8'
    • runs well for 3B (RWKV-4-Raven-3B-v8-EngAndMore-20230408-ctx4095.pth) with 9+GB RAM and 2.3GB VRAM

  • args.strategy = 'cuda fp16i8 *16 -> cpu fp32i8'
    • Somehow unstable…
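
To sanity-check one of the strategies above outside chat.py, the standalone rwkv package (pip install rwkv) can load the same checkpoint directly. A minimal sketch, assuming the RWKV/PIPELINE interface the package shipped in spring 2023 and the 20B_tokenizer.json that comes with ChatRWKV; the paths are placeholders:

import os
os.environ["RWKV_JIT_ON"] = "1"   # set before importing rwkv.model
os.environ["RWKV_CUDA_ON"] = "0"

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# checkpoint path without the .pth extension; strategy string as in the list above
model = RWKV(
    model="/path/to/RWKV-4-Raven-3B-v8-EngAndMore-20230408-ctx4095",
    strategy="cuda fp16i8 *10 -> cpu fp32i8",
)

pipeline = PIPELINE(model, "20B_tokenizer.json")  # tokenizer json ships with ChatRWKV

out = pipeline.generate(
    "Alice was so tired when she got back home, so she went",
    token_count=50,
    args=PIPELINE_ARGS(temperature=1.0, top_p=0.85),
)
print(out)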

