Text Summarization with Llama 2: How to Use Llama 2 with LangChain
In this tutorial, I will guide you through using Llama 2 with LangChain for text summarization and named entity recognition in a Google Colab notebook.
What is Llama 2?
Meta, better known to most of us as Facebook, has released a commercial version of Llama 2, its open-source large language model (LLM) that uses artificial intelligence (AI) to generate text and code.
Llama 2 is the successor to the Llama 1 model released earlier this year. Llama 1, however, was “closely guarded” and only available on request.
Before we start! 🦸🏻‍♀️
If you like this topic and you want to support me:
Like and share my articles; that will really help me out 👏
Subscribe to get my latest articles.
Let’s start coding!
Set Up Google Colab: Go to Google Colab (colab.research.google.com) and create a new notebook.
Install Required Libraries: In the first code cell of your Colab notebook, install the necessary libraries using the following code:
!pip install -q transformers einops accelerate langchain bitsandbytes
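Before going further, it is worth confirming that the Colab runtime actually has a GPU attached (Runtime → Change runtime type → GPU), since accelerate and bitsandbytes expect one. This quick check is my addition, not part of the original steps:
import torch

# Should print True and the GPU name (e.g. a Tesla T4) on a GPU runtime
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))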
Log in with huggingface-cli
!huggingface-cli login
If you already have a Hugging Face account, you can obtain your access token by going to the settings section on the Hugging Face website. From there, click on the “Access Tokens” tab, and you’ll be able to generate or copy your personal access token.
If you don’t have a Hugging Face account yet, you can easily sign up for one on their website. Once you have an account, you can follow the same steps mentioned above to get your access token and use it to access private models and resources through the Hugging Face API or CLI.
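If you prefer not to paste the token at a terminal prompt, the huggingface_hub library also provides an interactive login widget that works nicely in Colab (an alternative to the CLI command above, not a required extra step):
from huggingface_hub import notebook_login

# Opens a widget where you paste the access token from your Hugging Face settings
notebook_login()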
Install SentencePiece
If you haven’t installed the SentencePiece library, you can install it in your Colab notebook using:
!pip install sentencepiece
We are going to set up a language-generation pipeline using Hugging Face’s transformers library and a specified model. The AutoTokenizer is used to fetch the tokenizer associated with the model.
The pipeline is then configured with parameters such as the text-generation task, model, tokenizer, max_length of the generated text, and a few more.
from langchain import HuggingFacePipeline
from transformers import AutoTokenizer
import transformers
import torch
model = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",                  # task
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,         # load weights in bfloat16 to reduce memory use
    trust_remote_code=True,
    device_map="auto",                  # place the model on available devices automatically
    max_length=1000,                    # cap on prompt + generated tokens
    do_sample=True,                     # sample rather than greedy decoding
    top_k=10,                           # sample from the 10 most likely next tokens
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id
)
We create an instance of the HuggingFacePipeline class, using the previously configured pipeline and setting the model’s ‘temperature’ parameter, which influences the randomness of predictions.
llm = HuggingFacePipeline(pipeline=pipeline, model_kwargs={'temperature': 0})
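At this point the LangChain wrapper can already be called directly. A quick sanity check like the one below (the prompt is my own example, not from the original article) confirms that the model loads and generates text before we build any chains:
# Direct call to the wrapped pipeline; output will vary since sampling is enabled
print(llm("Explain in one sentence what a large language model is."))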
The PromptTemplate class is used to create a template for a language model prompt. This is the instruction that the model will follow when generating text.
In this case, the template asks the model to summarize a text. The text to summarize is placed within triple backquotes (```).
The model is asked to present the summary in bullet points. The {text} inside the template will be replaced by the actual text you want to summarize.
from langchain import PromptTemplate, LLMChain
template = """
Write a concise summary of the following text delimited by
triple backquotes.
Return your response in bullet points which covers the key
points of the text.
```{text}```
BULLET POINT SUMMARY:
"""
prompt = PromptTemplate(template=template, input_variables=["text"])
llm_chain = LLMChain(prompt=prompt, llm=llm)
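To see exactly what the model receives, you can render the template with a sample value before running the chain (an optional inspection step I have added; the sample sentence is arbitrary):
# Fill the {text} placeholder to inspect the final prompt string
print(prompt.format(text="Llama 2 is an open large language model released by Meta."))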
text = """ As part of Meta’s commitment to open science, today we are publicly
releasing LLaMA (Large Language Model Meta AI), a state-of-the-art foundational
large language model designed to help researchers advance their work in this
subfield of AI. Smaller, more performant models such as LLaMA enable others
in the research community who don’t have access to large amounts of
infrastructure to study these models, further democratizing access in this
important, fast-changing field. Training smaller foundation models like LLaMA
is desirable in the large language model space because it requires far less
computing power and resources to test new approaches, validate others’ work,
and explore new use cases. Foundation models train on a large set of unlabeled
data, which makes them ideal for fine-tuning for a variety of tasks.
We are making LLaMA available at several sizes (7B, 13B, 33B, and 65B
parameters) and also sharing a LLaMA model card that details how we built
the model in keeping with our approach to Responsible AI practices.
Over the last year, large language models — natural language processing (NLP)
systems with billions of parameters — have shown new capabilities to generate
creative text, solve mathematical theorems, predict protein structures,
answer reading comprehension questions, and more. They are one of the clearest
cases of the substantial potential benefits AI can offer at scale to billions
of people.
Even with all the recent advancements in large language models, full research
access to them remains limited because of the resources that are required to
train and run such large models. This restricted access has limited researchers’
ability to understand how and why these large language models work,
hindering progress on efforts to improve their robustness and mitigate known
issues, such as bias, toxicity, and the potential for generating misinformation.
Smaller models trained on more tokens — which are pieces of words — are easier
to retrain and fine-tune for specific potential product use cases.
We trained LLaMA 65B and LLaMA 33B on 1.4 trillion tokens. Our smallest model,
LLaMA 7B, is trained on one trillion tokens.
Like other large language models, LLaMA works by taking a sequence of words
as an input and predicts a next word to recursively generate text.
To train our model, we chose text from the 20 languages with the most speakers,
focusing on those with Latin and Cyrillic alphabets.
There is still more research that needs to be done to address the risks of
bias, toxic comments, and hallucinations in large language models.
Like other models, LLaMA shares these challenges. As a foundation model, LLaMA
is designed to be versatile and can be applied to many different use cases,
versus a fine-tuned model that is designed for a specific task.
By sharing the code for LLaMA, other researchers can more easily test new
approaches to limiting or eliminating these problems in large language models.
We also provide in the paper a set of evaluations on benchmarks evaluating
model biases and toxicity to show the model’s limitations and to support
further research in this crucial area.
To maintain integrity and prevent misuse, we are releasing our model under
a noncommercial license focused on research use cases. Access to the model
will be granted on a case-by-case basis to academic researchers; those
affiliated with organizations in government, civil society, and academia;
and industry research laboratories around the world. People interested in
applying for access can find the link to the application in our research paper.
We believe that the entire AI community — academic researchers, civil society,
policymakers, and industry — must work together to develop clear guidelines
around responsible AI in general and responsible large language models
in particular. We look forward to seeing what the community can learn —
and eventually build — using LLaMA.
"""
Let's try it out
print(llm_chain.run(text))
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1270: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
warnings.warn(
• The article discusses the public release of LLaMA (Large Language Model Meta AI), a state-of-the-art foundational language model designed to advance research in the field.
• Training smaller models like LLaMA requires less computing power and resources,
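Notice that the summary above stops mid-sentence: max_length=1000 in the pipeline caps the prompt plus the generated tokens, and our input text is long. If that happens, one option (an adjustment of mine, not from the original article) is to rebuild the pipeline with max_new_tokens, so the cap applies only to newly generated tokens, and then re-create llm and llm_chain:
# Optional: rebuild the pipeline so the length cap applies only to generated tokens
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
    max_new_tokens=512,                 # assumed value; raise it if output is still cut off
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id
)
llm = HuggingFacePipeline(pipeline=pipeline, model_kwargs={'temperature': 0})
llm_chain = LLMChain(prompt=prompt, llm=llm)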
text1 = """
Tesla, Inc. (/ˈtɛslə/ TESS-lə or /ˈtɛzlə/ TEZ-lə[a]) is an American
multinational automotive and clean energy company headquartered in Austin,
Texas. Tesla designs and manufactures electric vehicles (cars and trucks),
stationary battery energy storage devices from home to grid-scale,
solar panels and solar roof tiles, and related products and services.
Tesla is one of the world's most valuable companies and, as of 2023,
was the world's most valuable automaker. In 2022, the company led the
battery electric vehicle market, with 18% share.
Its subsidiary Tesla Energy develops and is a major installer of photovoltaic
systems in the United States. Tesla Energy is one of the largest global
suppliers of battery energy storage systems, with 6.5 gigawatt-hours (GWh)
installed in 2022.
Tesla was incorporated in July 2003 by Martin Eberhard and Marc Tarpenning as
Tesla Motors. The company's name is a tribute to inventor and
electrical engineer Nikola Tesla. In February 2004, via a $6.5 million
investment, Elon Musk became the company's largest shareholder.
He became CEO in 2008. Tesla's announced mission is to help expedite
the move to sustainable transport and energy, obtained through electric
vehicles and solar power.
Tesla began production of its first car model, the Roadster sports car,
in 2008. This was followed by the Model S sedan in 2012, the Model X SUV in
2015, the Model 3 sedan in 2017, the Model Y crossover in 2020, and the
Tesla Semi truck in 2022. The company plans production of the Cybertruck
light-duty pickup truck in 2023.[8]
The Model 3 is the all-time bestselling plug-in electric car worldwide,
and in June 2021 became the first electric car to sell 1 million units
globally.[9]
Tesla's 2022 deliveries were around 1.31 million vehicles, a 40% increase
over the previous year,[10][11] and cumulative sales totaled 4 million cars
as of April 2023.[12] In October 2021,
Tesla's market capitalization temporarily reached $1 trillion, the sixth
company to do so in U.S. history.
Tesla has been the subject of lawsuits, government scrutiny, and
journalistic criticism, stemming from allegations of whistleblower
retaliation, worker rights violations, product defects,
and Musk's many controversial statements.
"""
print(llm_chain.run(text1))
- Tesla is an American multinational automotive and clean energy company
- Headquartered in Austin, Texas
- Designs and manufactures electric vehicles, battery energy storage devices,
solar panels, and solar roof tiles
- One of the world's most valuable companies
- Led the battery electric vehicle market in 2022
- Installed 6.5 gigawatt-hours of battery energy storage systems in 2022
- Founded in 2003 by Martin Eberhard and Marc Tarpenning
- Elon Musk became CEO in 2008
- Mission is to help expedite the move to sustainable transport and energy
- Produces the Model S sedan, Model X SUV, Model 3 sedan, Model Y crossover,
and the Tesla Semi truck
- Plans production of the Cybertruck light-duty pickup truck in 2023
- Cumulative sales totaled 4 million cars as of April 2023
- Market capitalization temporarily reached $1 trillion in 2
Next, we set up a task to detect named entities within a given text using the PromptTemplate and LLMChain classes from the langchain library.
The PromptTemplate lays out instructions for the language model to identify named entities and return results in JSON format, with details like the entity name, its type, and its location within the text.
The placeholder {text} in the template will be replaced with the actual text to analyze.
The LLMChain class then binds the prepared prompt template with the specified language model (LLM). The resulting system aims to receive a text, identify the named entities in it, and return a detailed JSON file outlining each identified entity.
template = """
Detect named entities in following text delimited by triple backquotes.
Return your response in json format with spans of named entities with
fields
"named entity","type","span".
Return all entities
```{text}```
json format file:
"""
prompt = PromptTemplate(template=template, input_variables=["text"])
llm_chain = LLMChain(prompt=prompt, llm=llm)
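The next cell runs the chain on a variable called text2, which the article never shows; judging from the output below (Apple Inc., Steve Jobs, Steve Wozniak, Ronald Wayne), it was a passage about Apple's founding. Here is a short stand-in passage you can use to reproduce a similar result (my placeholder, not the author's original text):
# Placeholder passage; the article's original text2 was not shown
text2 = """
Apple Inc. is an American multinational technology company headquartered in
Cupertino, California. It was founded as Apple Computer Company in April 1976
by Steve Jobs, Steve Wozniak, and Ronald Wayne to develop and sell the
Apple I personal computer.
"""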
print(llm_chain.run(text2))
{
"namedEntity": [
{
"namedEntity": "Apple Inc.",
"type": "company",
"span": [
{"start": 0, "end": 87}
]
},
{
"namedEntity": "Steve Jobs",
"type": "person",
"span": [
{"start": 109, "end": 141}
]
},
{
"namedEntity": "Steve Wozniak",
"type": "person",
"span": [
{"start": 168, "end": 183}
]
},
{
"namedEntity": "Ronald Wayne",
"type": "person",
"span": [
🚀 You are now well-positioned to start experimenting with the massive potential that the combination of Llama 2 and LangChain brings to the industry!
References:
https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
https://ai.meta.com/blog/large-language-model-llama-meta-ai/
https://python.langchain.com/docs/get_started/introduction.html