2023-03-29 - To Windows user
The last post*1 showed impractical performance: predicting a single token took about a minute. This time the performance is practical enough. The program uses a smaller language model, about 4GB in file size, which fits in RAM without swapping, so a consumer PC with 8GB of RAM can run a ChatGPT-like function at a practical level.
Only the program and the model differ from last time; the work procedure is the same. This post walks through that procedure with "rupeshs/alpaca.cpp"*2.
The specification of the test machine is also the same, as follows.
|OS||Clear Linux 38590*3|
|CPU||Intel Core i5-8250U|
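To compare another Linux machine against the memory requirement described above, the following is a minimal sketch that reads the available RAM from /proc/meminfo. The function name "mem_available_mib" is made up for this example, and the script assumes a Linux system that exposes /proc/meminfo.

```shell
#!/bin/sh
# Hypothetical helper (not part of alpaca.cpp).
# Print MemAvailable in MiB from a meminfo-style file (default: /proc/meminfo).
# /proc/meminfo reports the value in kB, so it is divided by 1024.
mem_available_mib() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    echo "available RAM: $(mem_available_mib) MiB"
else
    echo "/proc/meminfo not found (not a Linux system?)"
fi
```

If the reported value is well above the size of the model file, the model should fit in RAM without swapping.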
Build and installation of chat
- download (clone) the source code
- make a directory if required
- cmake and make
To download "alpaca.cpp" under the home directory and build only "chat" there, the commands are
cd ~
git clone https://github.com/rupeshs/alpaca.cpp
cd alpaca.cpp
mkdir build
cd build
cmake ..
make chat
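Before running the build commands above, it may help to confirm that the three tools they rely on (git, cmake and make) are installed. The following is only a sketch; the function name "missing_tools" is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of alpaca.cpp).
# Print the name of every tool that is NOT found on PATH; print nothing if all exist.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools git cmake make)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names are joined on one line.
    echo "Install these first:" $missing
else
    echo "All build tools found"
fi
```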
Download Alpaca 7B model
The file size of this model is 4GB. The GitHub page introduces downloading it with
In any case, download it to the folder "~/alpaca.cpp/build/bin/". "chat" is also stored here.
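A large download can be interrupted, so checking that the file is in place and roughly the expected size may be useful before starting "chat". The sketch below assumes the file name "ggml-alpaca-7b-q4.bin" (the name used in the Colab commands at the end of this post). The function name "model_ready" and the 3500000000-byte lower bound are assumptions made for this example, not official values.

```shell
#!/bin/sh
# Hypothetical helper (not part of alpaca.cpp).
# Succeed only if the file exists and is at least min_bytes long.
model_ready() {
    file=$1
    min_bytes=$2
    [ -f "$file" ] || return 1
    # Strip the padding some wc implementations add around the number.
    size=$(wc -c < "$file" | tr -d ' ')
    [ "$size" -ge "$min_bytes" ]
}

model="$HOME/alpaca.cpp/build/bin/ggml-alpaca-7b-q4.bin"
# Assumed lower bound for a "4GB" file; adjust if the published size differs.
if model_ready "$model" 3500000000; then
    echo "model found: $model"
else
    echo "model missing or incomplete: $model"
fi
```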
Now run "chat". Its options are defined in "utils.cpp". The major ones are
so to speak, count of responded words
|-s||seed of random number|
|-t||threads assigned for processing|
To run with 8 threads (the Core i5-8250U has 4 cores and 8 hardware threads), the command is
cd ~/alpaca.cpp/build/bin
./chat -t 8
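On a machine with a different CPU, the thread count can be taken from the system instead of being typed by hand. This is a minimal sketch assuming the coreutils command "nproc" is available. The function name "pick_threads", the fallback of 4 threads and the seed 42 are arbitrary choices made for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of alpaca.cpp).
# Print the number of available processing units, or 4 if nproc is not installed.
pick_threads() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        echo 4   # assumed fallback, not a recommendation
    fi
}

threads=$(pick_threads)
# Print the command instead of running it, so it can be reviewed first.
# -s fixes the random seed and -t sets the thread count, as in the option table.
echo "./chat -s 42 -t $threads"
```

Copy the printed line into the terminal from "~/alpaca.cpp/build/bin" to start "chat" with those settings.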
htop shows that almost all CPU threads are busy, while RAM still has room to spare.
"chat" responds quickly enough to everything from trivia to programming questions, even on a PC with 8GB of RAM, though its precision and accuracy are low because the model is small.
!git clone https://github.com/rupeshs/alpaca.cpp
%cd /content/alpaca.cpp
%mkdir build
%cd build
!cmake ..
!make chat
!curl -o ggml-alpaca-7b-q4.bin -C - https://gateway.estuary.tech/gw/ipfs/QmQ1bf2BTnYxq73MFJWu1B7bQ2UD6qG7D7YDCxhTndVkPC
!./chat