From c26ad07c69e42ba81da99c9c83cf4e7d42cf425e Mon Sep 17 00:00:00 2001
From: Akemi Izuko
Date: Sun, 31 Dec 2023 18:43:19 -0700
Subject: [PATCH] Llamas: update urls

---
 src/content/llama/a-history-of-llamas.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/content/llama/a-history-of-llamas.md b/src/content/llama/a-history-of-llamas.md
index 380e1fd..ad0ba61 100644
--- a/src/content/llama/a-history-of-llamas.md
+++ b/src/content/llama/a-history-of-llamas.md
@@ -102,7 +102,7 @@ VRAM, which meant many home computers could now run 4-bit quantized 7B models!
 Previously, most enthusiasts would have to rent cloud GPUs to run their
 "local" llamas. Quantizing into GGUF is a very expensive process, so
 [TheBloke](https://huggingface.co/TheBloke) on Huggingface emerges the defacto
-source for pre-quantized llamas.
+source for [pre-quantized llamas](../quantization).
 
 Based on LLaMa, the open source
 [llama.cpp](https://github.com/ggerganov/llama.cpp) becomes the leader of local