GPT-3 raised the bar for language models and changed the AI landscape with its ability to learn from only a few examples, which is why it is called a few-shot learner. However, it is not open source, and access to OpenAI's API is available only upon request. EleutherAI has therefore been working on a similar model, named GPT-Neo.
GPT-Neo is a transformer-based language model whose architecture is nearly the same as GPT-3's, and its results are roughly on par with the smaller GPT-3 models. GPT-Neo is trained on the Pile dataset. Like GPT-3, GPT-Neo is a few-shot learner. Its main advantage over GPT-3 is that it is an open-source model.
GPT-Neo is an autoregressive language model: it takes a prompt text and predicts the next tokens one at a time, each based on the text so far.
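To make the idea of autoregressive generation concrete, here is a toy sketch in plain Python. It is not GPT-Neo: the NEXT_TOKENS table and the generate function are hypothetical stand-ins, and the toy looks only at the last token, whereas a real model conditions on the whole sequence. The loop structure, appending one predicted token and feeding the result back in, is the part that carries over.

```python
import random

# Hypothetical toy "model": for each token, the tokens that may follow it.
NEXT_TOKENS = {
    "the": ["cat", "dog"],
    "cat": ["sat"],
    "dog": ["sat"],
    "sat": ["down"],
}

def generate(prompt_tokens, max_new_tokens, rng):
    """Extend the prompt one token at a time, feeding each output back in."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # The toy conditions only on the last token.
        candidates = NEXT_TOKENS.get(tokens[-1])
        if not candidates:  # nothing is known to follow: stop early
            break
        tokens.append(rng.choice(candidates))
    return tokens

print(" ".join(generate(["the"], 5, random.Random(0))))
```

Sampling with rng.choice plays the role that do_sample=True plays later in this post: the same prompt can lead to different continuations.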
GPT-Neo is available in two different model sizes:
GPT-Neo-1.3B: This model has 1.3B parameters and was trained on around 380B tokens for 362,000 steps.
GPT-Neo-2.7B: This model has 2.7B parameters and was trained on around 420B tokens for 400,000 steps. It is the larger of the two models.
In this blog, we will see how GPT-Neo can be used for different tasks. We tried four tasks: text generation, entity detection, intent classification, and ad generation.
We used Hugging Face's Transformers library, which provides a pipeline for using the pre-trained weights of the GPT-Neo model. Let's see how we prepared the prompt text for each task.
Install and set up GPT-Neo
- Install Transformers Library
!pip install git+https://github.com/huggingface/transformers.git
- Import pipeline from transformers to perform the task
from transformers import pipeline
- Allocate a pipeline for Text Generation
The code below allocates the pipeline for text generation. You need to provide the model name; to use the 2.7B model, pass EleutherAI/gpt-neo-2.7B instead of EleutherAI/gpt-neo-1.3B.
generator = pipeline('text-generation',
model='EleutherAI/gpt-neo-1.3B')
- For each of the tasks below, you just need to define prompt_text and generate the response as shown.
Text Generation
To generate text with GPT-Neo, we simply give it an input prompt, such as the beginning of a sentence or a title, which steers the model toward a particular topic. We then set max_length as a generation parameter to control the length of the output.
In the examples below, we give the beginning of a sentence as the input prompt. In both examples, we set max_length=200 and do_sample=True.
prompt_text = "Google was founded in 1998 by Larry Page and Sergey Brin while"
After defining prompt_text, run the code below to get the result.
text = generator(prompt_text , do_sample=True, max_length=200)
print(text[0]['generated_text'])
Output for the above prompt is:
Google was founded in 1998 by Larry Page and Sergey Brin while they were students at Stanford University. The company was originally called Google, Inc. and was incorporated in Delaware. In 2001, Google was acquired by the holding company Google Capital. Google is a search engine that provides information about the web. It is the most popular search tool in the world, with over 1.5 billion searches per month. Google is also the largest search company in terms of revenue. Its revenue in 2010 was $8.6 billion.
You can generate text on any topic; just provide a prompt_text of your choice. In the second example, we give a prompt_text about COVID-19, and GPT-Neo generates the output shown below.
prompt_text = "Globally, coronavirus (COVID19) pandemic has become the most significant crisis to challenge the health"
Output for the above prompt text:
Globally, coronavirus (COVID19) pandemic has become the most significant crisis to challenge the health and safety of individuals, societies, and nations, and has resulted in the closure of approximately 1.25 million businesses and other facilities in more than 110 countries globally. During December 29–31, 2020, the Chinese government extended the suspension of major events and implemented stringent restrictions on all-day working, school trips, and other activities [@bib1]. The first confirmed cases were reported by the Health Administration Bureau of Hubei Province in Wuhan on January 22, 2020, and the second confirmed case was reported by the Zhuhai Municipal Health Commission on January.
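One detail worth keeping in mind is that max_length counts the prompt tokens as well as the newly generated ones, so a long prompt leaves less room for output. The sketch below shows one way to pick max_length from the prompt length. The helper max_length_for and the whitespace_count stand-in are our own illustrative names, not part of Transformers, and the default of 2048 assumes GPT-Neo's context window.

```python
def max_length_for(prompt, count_tokens, new_tokens, context_limit=2048):
    """Return a max_length that leaves room for `new_tokens` after the prompt."""
    prompt_len = count_tokens(prompt)
    if prompt_len >= context_limit:
        raise ValueError("prompt alone fills the context window")
    # Never ask for more than the model's context window can hold.
    return min(prompt_len + new_tokens, context_limit)

def whitespace_count(text):
    """Stand-in counter for illustration; real counts come from the tokenizer."""
    return len(text.split())

prompt_text = "Google was founded in 1998 by Larry Page and Sergey Brin while"
print(max_length_for(prompt_text, whitespace_count, new_tokens=50))  # prints 62
```

With the pipeline from this post, a more accurate counter would presumably be something like lambda t: len(generator.tokenizer(t)["input_ids"]), since the model counts subword tokens rather than words.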
Entity Detection
For entity detection, we need to create an input prompt that contains some sample data. You can use the sample data below or prepare your own.
sample_data = '''
Entities: Person, Facility, Location, Organization, Work Of Art, Event, Date, Time, Nationality / Religious / Political group, Law Terms, Product, Percentage, Currency, Language, Quantity, Ordinal Number, Cardinal Number, Degree, Company, Food
Text: Google was incorporated as a privately held company on September 4, 1998 by Larry Page and Sergey Brin, while they were Ph.D. students at Stanford University in California. They own about 14 percent of its shares and control 56 percent of the stockholder voting power through supervoting stock.
Entities Detected:
Google : Company
September 4, 1998 : Date
Larry Page : Person
Sergey Brin : Person
Stanford University : Organization
California : Location
14 percent : Percentage
56 percent : Percentage
Ph.D. : Degree
Entities: Person, Facility, Location, Organization, Work Of Art, Event, Date, Time, Nationality / Religious / Political group, Law Terms, Product, Percentage, Currency, Language, Quantity, Ordinal Number, Cardinal Number, Degree, Company, Food
Text: The U.S. President Donald Trump came to visit Ahmedabad for the first time at Reliance University with our Prime Minister Narendra Modi in February 2020.
Entities Detected:
U.S. : Location
Donald Trump : Person
Ahmedabad : Location
Narendra Modi : Person
Reliance University : Organization
February 2020 : Date
Entities: Person, Facility, Location, Organization, Work Of Art, Event, Date, Time, Nationality / Religious / Political group, Law Terms, Product, Percentage, Currency, Language, Quantity, Ordinal Number, Cardinal Number, Degree, Company, Food
Text: SpaceX is an aerospace manufacturer and space transport services company headquartered in California. It was founded in 2002 by entrepreneur and investor Elon Musk with the goal of reducing space transportation costs and enabling the colonization of Mars. Elon Musk is an American Entrepreneur.
Entities Detected:
SpaceX : Company
California : Location
2002 : Date
Elon Musk : Person
Mars : Location
American : Nationality / Religious / Political group
'''
In the sample data, Entities: and Text: are the input to GPT-Neo, and Entities Detected: is the expected output. In Entities: we list 20 entity types; you can add or remove types as required. In Text: we give the sentence, and in Entities Detected: we give the entity names and labels that should be detected in the text.
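If you prepare your own sample data, it can be easier to build the prompt from a list of labelled examples than to type it by hand. The sketch below is one possible way to do that. The names ENTITY_TYPES, format_example, and build_prompt are hypothetical helpers, and the entity list is shortened for readability.

```python
# Shortened list for illustration; the post uses 20 entity types.
ENTITY_TYPES = ["Person", "Location", "Company", "Date"]

def format_example(text, entities=None, entity_types=ENTITY_TYPES):
    """Render one block; pass entities=None for the sentence to be labelled."""
    lines = [
        "Entities: " + ", ".join(entity_types),
        "Text: " + text,
    ]
    if entities is not None:
        lines.append("Entities Detected:")
        lines.extend(f"{name} : {label}" for name, label in entities)
    return "\n".join(lines)

def build_prompt(examples, new_text, entity_types=ENTITY_TYPES):
    """Join the labelled examples and the unlabelled text into one prompt."""
    blocks = [format_example(text, ents, entity_types) for text, ents in examples]
    blocks.append(format_example(new_text, None, entity_types))
    return "\n\n".join(blocks)

examples = [
    ("SpaceX is headquartered in California.",
     [("SpaceX", "Company"), ("California", "Location")]),
]
print(build_prompt(examples, "Amazon is based in Seattle."))
```

The last block deliberately stops after the Text: line, so the model is left to write the Entities Detected: part itself.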
Now, for testing, we simply append the list of entity types as Entities: and the input sentence as Text: to our sample_data. Below are some test examples.
In the first example, we use an input text about Amazon.
input_text = '''Entities: Person, Facility, Location, Organization, Work Of Art, Event, Date, Time, Nationality / Religious / Political group, Law Terms, Product, Percentage, Currency, Language, Quantity, Ordinal Number, Cardinal Number, Degree, Company, Food
Text: Amazon.com, Inc., known as Amazon, is an American online business and cloud computing company. It was founded on July 5, 1994 by Jeff Bezos. It is based in Seattle, Washington. It is the largest Internet-based store in the world by total sales and market capitalization. Amazon.com started as an online bookstore. When it got bigger, it started selling DVDs, Blu-rays, CDs, video downloads/streaming, MP3s, audiobooks, software, video games, electronics, apparel, furniture, food, toys, and jewelry. It also makes consumer electronics like Kindle e-readers, Fire tablets, Fire TV, and Echo. It is the world's largest provider of cloud computing services. Amazon also sells products like USB cables using the name AmazonBasics.'''
Let's append input_text to sample_data to create the final prompt_text:
prompt_text = sample_data + '\n\n' + input_text
After defining prompt_text, run the code below to get the result.
text = generator(prompt_text , do_sample=True, max_length=200)
print(text[0]['generated_text'])
The result is:
Entities detected:
Amazon : Company
Seattle : Location
Jeff Bezos : Person
AmazonBasics : Product
July 1994 : Date
The second example is entity detection on a SpaceX-related text. Provide input_text as shown below:
input_text = '''Entities: Person, Facility, Location, Organization, Work Of Art, Event, Date, Time, Nationality / Religious / Political group, Law Terms, Product, Percentage, Currency, Language, Quantity, Ordinal Number, Cardinal Number, Degree, Company, Food
Text: SpaceX is an aerospace manufacturer and space transport services company headquartered in California. It was founded in 2002 by entrepreneur and investor Elon Musk with the goal of reducing space transportation costs and enabling the colonization of Mars. Elon Musk is an American Entrepreneur.'''
prompt_text = sample_data + '\n\n' + input_text
The result is:
SpaceX : Company
California : Location
2002 : Date
Elon Musk : Person
Mars : Location
American : Nationality / Religious / Political group
From the results, we can clearly see that GPT-Neo has detected the entities in our input sentences.
GPT-Neo, as used here, doesn't have an option like GPT-3's stop sequence, so we have to adjust the length of the output ourselves. We can use the max_length parameter for that; note that max_length counts the prompt tokens as well as the generated ones.
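Without a stop sequence, the model may keep going after the entities and start inventing a new example. One workaround is to trim the text after generation. The sketch below assumes that generated_text starts with the prompt (the pipeline's default behaviour as far as we know); parse_entities is a hypothetical helper of ours, not a library function.

```python
def parse_entities(generated_text, prompt_text):
    """Return (name, label) pairs from the first block the model wrote."""
    # Drop the prompt so that only the model's continuation is parsed.
    continuation = generated_text[len(prompt_text):]
    pairs = []
    for line in continuation.splitlines():
        line = line.strip()
        if line.startswith(("Entities:", "Text:")):
            break  # the model has started a new example: stop here
        if " : " in line:
            name, label = line.rsplit(" : ", 1)
            pairs.append((name.strip(), label.strip()))
    return pairs

# Hand-written stand-in for text[0]['generated_text'], for illustration only.
demo_prompt = "Text: Amazon is based in Seattle."
demo_output = (
    demo_prompt
    + "\nEntities detected:\nAmazon : Company\nSeattle : Location\n\n"
    + "Entities: Person\nText: Something else\nEntities Detected:\nFoo : Person"
)
print(parse_entities(demo_output, demo_prompt))
```

In the real flow you would call parse_entities(text[0]['generated_text'], prompt_text).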
Intent Classification
Intent classification is the task of automatically analysing a text and assigning it to one of a set of intents.
In our previous blog post, we saw how to perform intent classification using GPT-3, and the results were promising. Now we will see how GPT-Neo handles intent classification.
We created the prompt by using Sentence: and Classification: as prefixes. Below is the prompt we will use; you can also provide your own data.
prompt_text = '''
Sentence: listen to westbam alumb allergic on Google music
Classification: PlayMusic
Sentence: Give me a list of movie times for films in the area
Classification: SearchScreeningEvent
Sentence: Show me the picture creatures of light and darkness
Classification: SearchCreativeWork
Sentence: I would like to go to the popular bistro in oh
Classification: BookRestaurant'''
After defining prompt_text, add the input sentence whose intent we want to classify:
input_text = "Sentence: What is the weather like in the city of frewen in the country of Venezuela"
prompt_text = prompt_text + "\n" + input_text
After adding input_text to prompt_text, run the code below to get the result.
text = generator(prompt_text , do_sample=True, max_length=150)
print(text[0]['generated_text'])
The result for our test sentence, What is the weather like in the city of frewen in the country of Venezuela, is:
Classification: getWeather
We set max_length to 150, so the generator keeps producing text until the sequence reaches 150 tokens, prompt included, even after it has written the classification. You can adjust max_length to get the desired output length.
The results of the 1.3B model are good for the intent classification task, but you can try the same with the GPT-Neo 2.7B model and see that the results are more impressive.
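Because generation continues past the label, it helps to pull out only the first Classification: line that follows the prompt. The following is a minimal sketch, assuming the generated text begins with the prompt; extract_intent is a hypothetical helper name.

```python
def extract_intent(generated_text, prompt_text):
    """Return the first label the model wrote after the prompt, or None."""
    continuation = generated_text[len(prompt_text):]
    for line in continuation.splitlines():
        line = line.strip()
        if line.startswith("Classification:"):
            return line[len("Classification:"):].strip()
    return None

# Hand-written stand-in for text[0]['generated_text'], for illustration only.
demo_prompt = "Sentence: What is the weather like in the city of frewen"
demo_output = (
    demo_prompt
    + "\nClassification: getWeather"
    + "\nSentence: play some jazz\nClassification: PlayMusic"
)
print(extract_intent(demo_output, demo_prompt))
```

Anything the model writes after the first label, such as further invented sentences, is simply ignored.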
Facebook Ads Generator
We can generate different kinds of ads using GPT-Neo. Here we will see how to generate Facebook ads.
Below is the sample data we will use as a prompt. In it, product, description, and search terms are the input, whereas Data, Link, and Other are the expected output of the ad generator.
sample_data = '''product: Dollar Shave Club
description: We sell everything you need in the bathroom – from razor blades to grooming products for a small fee.
search terms: Quality, Cheap, Subscription, Gender neutral.
Data: It’s 2021. Who says a lady’s razor has to be pink? Dollar Shave Club delivers amazing razors (to both genders) for just a few bucks a month.
Link: DOLLARSHAVE CLUB.COM
Other: Sign up and get top-shelf grooming routine.
product: Squarespace
description: A simple and easy to use website builder.
search terms: Cheap, Ease of use, Quality.
Data: Squarespace is an online company that enables people to create digital storefronts so that they can sell their products or services online. They allow entrepreneurs to purchase a domain and start selling right away.
Link: www.squarespace.com
Other: Get 15% off your first purchase, signup bonus.
product: Canva
description: Beautiful graphic design tools with unique ideas and features.
search terms: Canva, Canva graphic design, Graphic design.
Data: Make your photos, videos or ideas look great with Canva’s online design platform. They offer graphics, templates, and fonts for free.
Link: canva.com/ads
Other: Sign up and get 2 months of premium for free.'''
After defining sample_data, we need to test it on some input text, for which we provide the product, description, and search terms. Let's define a sample input text.
input_text = '''product: Jeans-Villa
description: Top classic jeans available specially for women’s wear in discount of 40%.
search terms: online store, jeans, women’s wear.'''
prompt_text = sample_data + "\n\n" + input_text
Now generate the ad text.
text = generator(prompt_text , do_sample=True, max_length=400)
print(text[0]['generated_text'])
Below is the output:
Data: Jeans-Villa is the best place to buy a traditional jeans for as low as $4.00, free shipping on all.
Link: jeansvilla.com
Other: Get 75% off on your first purchase, signup bonus.
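If you want to use the generated ad in code, you can split the model's continuation into its Data, Link, and Other fields. This is only a sketch under the assumption that the model follows the format of the sample data and that the generated text starts with the prompt; AD_FIELDS and parse_ad are hypothetical names.

```python
AD_FIELDS = ("Data", "Link", "Other")

def parse_ad(generated_text, prompt_text):
    """Collect the first Data/Link/Other values written after the prompt."""
    continuation = generated_text[len(prompt_text):]
    ad = {}
    for line in continuation.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if key == "product":
            break  # the model has moved on to a new, invented product
        if sep and key in AD_FIELDS and key not in ad:
            ad[key] = value.strip()
    return ad

# Hand-written stand-in for text[0]['generated_text'], for illustration only.
demo_prompt = "search terms: online store, jeans."
demo_output = (
    demo_prompt
    + "\nData: Jeans-Villa is the best place to buy jeans."
    + "\nLink: jeansvilla.com\nOther: Get 75% off.\n\n"
    + "product: Something else\nData: Another ad."
)
print(parse_ad(demo_output, demo_prompt))
```

A field that the model leaves out is simply missing from the returned dictionary, so check for the keys you need before using them.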
You can also try generating output with your own sample data.
Note: You may get output that differs from what is shown in this blog, because with do_sample=True the generator samples a different response each time for the same input. You can verify this by running the same code several times.
Let us know your feedback or questions in the comments section. You can get in touch with us at letstalk@pragnakalp.com for any project inquiry related to GPT-2, GPT-3, GPT-Neo, or GPT-J.