From b9083aa263433ccd9e61cc702267bd3c66fae837 Mon Sep 17 00:00:00 2001
From: Akemi Izuko
Date: Mon, 1 Jan 2024 13:04:10 -0700
Subject: [PATCH] Llama: change casing from LLaMa to LLaMA

---
 src/content/llama/a-history-of-llamas.md | 14 +++++++-------
 src/content/llama/localllama_links.md    |  2 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/content/llama/a-history-of-llamas.md b/src/content/llama/a-history-of-llamas.md
index 9721cb7..8eaeff2 100644
--- a/src/content/llama/a-history-of-llamas.md
+++ b/src/content/llama/a-history-of-llamas.md
@@ -34,8 +34,8 @@ Here's a brief timeline:
 
  1. **March 2022**: InstructGPT paper is pre-print.
  2. **November 2022**: ChatGPT is released.
- 3. **March 2023**: LLaMa (open source) and GPT4 is released.
- 4. **July 2023**: LLaMa 2 is released, alongside GGUF quantization.
+ 3. **March 2023**: LLaMA (open source) and GPT4 is released.
+ 4. **July 2023**: LLaMA 2 is released, alongside GGUF quantization.
  5. **August 2023**: AWQ quantization paper.
  6. **September 2023**: Mistral 7B is released.
  7. **December 2023**: Mixtral 8x7B becomes the first MoE local llama.
@@ -92,8 +92,8 @@ Nothing open source was even remotely close to GPT4 at this point.
 #### Mid 2023
 
 We finally see the local llama movement really take off around August 2023. Meta
-released [LLaMa2](https://ai.meta.com/blog/llama-2/), which has decent
-performance even on its 7B version. One key contribution of LLaMa2 was the GGUF
+released [LLaMA2](https://ai.meta.com/blog/llama-2/), which has decent
+performance even on its 7B version. One key contribution of LLaMA2 was the GGUF
 quantization format. This format allows a model to be run on a mix of RAM and
 VRAM, which meant many home computers could now run 4-bit quantized 7B models!
 Previously, most enthusiasts would have to rent cloud GPUs to run their "local"
@@ -101,9 +101,9 @@ llamas. Quantizing into GGUF is a very expensive process, so
 [TheBloke](https://huggingface.co/TheBloke) on Huggingface emerges the defacto
 source for [pre-quantized llamas](../quantization).
 
-Based on LLaMa, the open source
+Based on LLaMA, the open source
 [llama.cpp](https://github.com/ggerganov/llama.cpp) becomes the leader of local
-llama inference backends. Its support extends far beyond only running LLaMa2,
+llama inference backends. Its support extends far beyond only running LLaMA2,
 it's the first major backend to support running GGUF quantizations!
 
 In addition, the [Activation-aware Weight
@@ -114,7 +114,7 @@ quantized models. This is especially true for very heavily quantized models like
 community at this point. AWQ lacks support anywhere at this time.
 
 In late September 2023, out of nowhere came a French startup with a 7B model
-that made leaps on top of LLaMa2 7B.
+that made leaps on top of LLaMA2 7B.
 [Mistral](https://mistral.ai/news/announcing-mistral-7b/) remains the best local
 llama until mid-December 2023. Huge work in improving model tuning, particularly
 character creation and code-assistant models, is done on top of Mistral 7B.
diff --git a/src/content/llama/localllama_links.md b/src/content/llama/localllama_links.md
index 6429284..3b6923a 100644
--- a/src/content/llama/localllama_links.md
+++ b/src/content/llama/localllama_links.md
@@ -13,7 +13,7 @@ heroImage: '/images/llama/llama-cool.avif'
 locally-hosted (typically open source) llama, in contrast to commercially
 hosted ones.
 
-# Local LLaMa Quickstart
+# Local LLaMA Quickstart
 
 I've recently become aware of the open source LLM (local llama) movement.
 Unlike traditional open source, the speed at which this field is moving is