How to Run Llama 3 on Your Local Computer: A Step-by-Step Guide
Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks. Further, in developing these models, Meta took great care to optimize helpfulness and safety.
Model developers: Meta
Input: models take text only.
Output: models generate text and code only.
Model Architecture Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
What’s new with Llama-3?
Llama 3 brings significant enhancements over Llama 2, including a new tokenizer that increases the vocabulary size to 128,256 tokens (up from 32K tokens). This expanded vocabulary enhances text encoding efficiency, promoting stronger multilingual capabilities.
Moreover, Llama 3 models underwent extensive training on a diverse dataset comprising over 15 trillion tokens, approximately eight times more data than its predecessor. Specifically, Llama 3 Instruct, tailored for dialogue applications, was fine-tuned on a dataset of over 10 million human-annotated samples using a combination of techniques such as supervised fine-tuning (SFT), rejection sampling, proximal policy optimization (PPO), and direct preference optimization (DPO).
The Manual Method: Setting Up Llama 3 with Code
To set up Llama 3 locally by hand, follow these steps:
Step 1: Sign-up and Access Requests
- Create an account on Hugging Face.
- Request access to the Llama models at the link below (approval can take a day):
Link: https://ai.meta.com/resources/models-and-libraries/llama-downloads/
- Because the Llama 3 repository is gated, log in to Hugging Face and generate an access token at:
Link: https://huggingface.co/settings/tokens
Step 2: Configuring Tokens Locally
pip install huggingface_hub
huggingface-cli login (paste your access token when prompted)
Step 3: Install the necessary libraries
pip install transformers
pip install huggingface_hub
pip install torch
pip install accelerate
Step 4: Create a run.py file (e.g. with touch run.py) containing:
import transformers
import torch

model_id = "meta-llama/Meta-Llama-3-8B"

# Build a text-generation pipeline; bfloat16 halves memory versus float32,
# and device_map="auto" places the model on a GPU if one is available.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

print(pipeline("Hey how are you doing today?"))
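Before running, it helps to check that your machine has enough memory for the weights. A back-of-the-envelope sketch (the ~8 billion parameter figure is an approximation, and real usage adds activations and cache on top):

```python
# Rough memory estimate for the Meta-Llama-3-8B weights alone (approximation).
params = 8e9            # ~8 billion parameters (approximate)
bytes_per_param = 2     # bfloat16 stores each parameter in 2 bytes
gib = params * bytes_per_param / 2**30
print(f"~{gib:.1f} GiB for weights alone")  # roughly 15 GiB
```

This is why the script requests torch.bfloat16: loading in float32 would roughly double the footprint.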
Step 5: Run the Python file
python run.py
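Note that meta-llama/Meta-Llama-3-8B is the base model, which simply continues whatever text you give it. For dialogue, Meta also publishes meta-llama/Meta-Llama-3-8B-Instruct, whose prompts follow a special chat format. The sketch below hand-builds that format for a single user turn (the header and special-token names follow Meta's published chat template; in practice, prefer tokenizer.apply_chat_template, which constructs this for you):

```python
# Minimal sketch of the Llama 3 Instruct prompt format for one user turn.
# (Illustration only -- normally tokenizer.apply_chat_template does this.)
def build_prompt(user_message: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(build_prompt("Hey how are you doing today?"))
```

Passing a prompt in this shape to the Instruct model makes it answer as an assistant rather than merely continue the text.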