When OpenAI published a blog post about GPT-2, a transformer-based language model with 1.5 billion parameters that could generate remarkably human-like text, it created quite a buzz in the Natural Language Processing community. OpenAI was cautious, though, and did not open-source the 1.5-billion-parameter model right away. It first released the 117M-parameter model, then the 345M and 774M models, and finally, in November 2019, the full 1.5B-parameter model.

More information about the paper “Language Models are Unsupervised Multitask Learners” and instructions for downloading the GPT-2 models can be found on the GPT-2 GitHub page.

We started working with the GPT-2 117M model last year, after OpenAI made it available for download. We got it running and generated some text, but the quality was not very good. As OpenAI kept publishing larger models, we kept trying them and the results kept improving. Finally, when the 1.5B-parameter model was published, we tried it and the results were pretty impressive!

The 1.5B-parameter GPT-2 model generated text that followed the given input well. With the previous models, we had seen the output sometimes drift into text totally unrelated to the input, but the 1.5B model mostly maintained the context of the input.

GPT-2 Text Generator Demo

To make GPT-2 based text generation available to all enthusiasts for testing, we built a demo, which is now available at:

Text generation Using GPT-2 Demo

You can provide an input and select the length of the text you would like to generate. As the model is big and we have limited CPU/RAM resources, it may take anywhere from a few seconds to a few minutes to generate the text, so kindly be patient.
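For readers who want to try prompt-plus-length generation locally, here is a minimal sketch. It assumes the Hugging Face `transformers` library and its `gpt2-xl` checkpoint (the 1.5B model), which is not necessarily what powers the demo. The function names `load_generator` and `generate_text`, the length limits, and the sampling settings are made up for illustration.

```python
from typing import Callable

# Hypothetical limits for the "length" field; the demo's real limits are not stated.
MIN_TOKENS = 10
MAX_TOKENS = 500


def load_generator(model_name: str = "gpt2-xl") -> Callable:
    """Load a text-generation pipeline (assumes `pip install transformers torch`)."""
    from transformers import pipeline  # imported lazily: the model download is large

    return pipeline("text-generation", model=model_name)


def generate_text(generator: Callable, prompt: str, length: int) -> str:
    """Generate roughly `length` new tokens that continue `prompt`."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # Clamp the requested length so a single request cannot run for too long.
    new_tokens = max(MIN_TOKENS, min(length, MAX_TOKENS))
    outputs = generator(
        prompt,
        max_new_tokens=new_tokens,
        do_sample=True,          # sample instead of greedy decoding
        top_k=40,                # assumed setting: sample from the 40 most likely tokens
        num_return_sequences=1,
    )
    # The pipeline returns a list of dicts with a "generated_text" key.
    return outputs[0]["generated_text"]
```

Calling `generate_text(load_generator(), "Artificial intelligence is", 100)` downloads the model on first use and returns the prompt followed by its continuation. On CPU the 1.5B model is slow, so passing `"gpt2"` (the smallest checkpoint) to `load_generator` is a quicker way to experiment.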

Fine-tuning and improving GPT-2

As you can see in the demo, it generates text, but the quality is not that good and it sometimes fails to maintain the context of the input. To improve it, the model needed to be fine-tuned.

GPT-2 is already trained on a very large corpus: 40GB of text from 8 million web pages. But that text is general, not domain-specific.

So we took the approach of making it domain-specific by creating a dataset for a specific domain.

For example, we created a dataset for “Artificial Intelligence” and fine-tuned the GPT-2 model on it. The text generated by this fine-tuned model improved a lot: it wrote articles on the topic of “Artificial Intelligence” far better than the stock GPT-2 model.
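This post does not describe how the dataset was assembled, so the following is only a rough sketch of one possible approach. It assumes the raw documents are plain strings, keeps the ones that look on-topic using a hypothetical keyword list, and joins them with `<|endoftext|>`, the special token GPT-2 uses to mark document boundaries. The names `is_on_topic`, `build_corpus`, and `AI_KEYWORDS`, as well as the thresholds, are illustrative assumptions.

```python
from typing import Iterable, List

END_OF_TEXT = "<|endoftext|>"  # GPT-2's special token marking a document boundary

# Hypothetical keyword list; a real dataset would need more careful curation.
AI_KEYWORDS = ("artificial intelligence", "machine learning", "neural network")


def is_on_topic(text: str, keywords: Iterable[str] = AI_KEYWORDS, min_hits: int = 2) -> bool:
    """Return True if the text mentions the domain keywords at least `min_hits` times."""
    lowered = text.lower()
    hits = sum(lowered.count(keyword) for keyword in keywords)
    return hits >= min_hits


def build_corpus(documents: Iterable[str], min_chars: int = 200) -> str:
    """Keep on-topic documents of reasonable length and join them into one training text."""
    kept: List[str] = []
    for doc in documents:
        cleaned = doc.strip()
        # Skip very short snippets and anything that does not look on-topic.
        if len(cleaned) >= min_chars and is_on_topic(cleaned):
            kept.append(cleaned)
    return f"\n{END_OF_TEXT}\n".join(kept)
```

The resulting string can be written to a single text file, which is the input format many GPT-2 fine-tuning scripts expect. Keyword counting is a crude filter; it is used here only to keep the example short.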

Once we have a fine-tuned model, we can churn out articles very quickly by providing it with various inputs on the same topic. We could generate thousands of sample articles within a couple of days.
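Generating articles in bulk amounts to looping over a list of inputs. The sketch below is an assumption about how such a batch run could look, not the pipeline we actually used: `generate_fn` stands for any hypothetical function that turns a prompt into generated text, and the file naming scheme is invented for the example.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def generate_articles(
    prompts: Iterable[str],
    generate_fn: Callable[[str], str],
    out_dir: str,
    samples_per_prompt: int = 3,
) -> List[Path]:
    """Write `samples_per_prompt` generated articles per prompt into `out_dir`."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for prompt_index, prompt in enumerate(prompts):
        for sample_index in range(samples_per_prompt):
            # With sampling enabled, each call can return a different article.
            article = generate_fn(prompt)
            path = target / f"article_{prompt_index:04d}_{sample_index:02d}.txt"
            path.write_text(article, encoding="utf-8")
            written.append(path)
    return written
```

With a few hundred prompts and several samples per prompt, a loop like this reaches thousands of articles; the total time depends almost entirely on how fast the model generates each one.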

GPT-2 Generated Text Samples

Samples of the articles generated by the models fine-tuned on specific domains are published on our website:

MachineWrites.com – Fully AI based GPT-2 Generated Articles Samples

After seeing improvements in the articles on Artificial Intelligence, we went ahead and fine-tuned the GPT-2 model on other topics. Below is the list of topics, with links, for which we fine-tuned the GPT-2 model and generated sample articles.

Check out the sample articles on machinewrites.com and do let us know your feedback on the machine-generated articles in the comments section.

Further Roadmap

  • The capability of AI as a content writer has improved a lot, but it is still not up to the mark. Many times we get jumbled, completely meaningless text. We would like to make it better by fine-tuning the model further.
  • We are also working on making it multilingual. Our initial tests showed that the stock GPT-2 model is not able to generate proper text in languages other than English. We will fine-tune it on other languages and see how it works.
  • Another idea is to make GPT-2 write short stories. We will fine-tune GPT-2 on a dataset of short stories and see whether it generates text with a good story in it.

Let us know in the comments section what you would like to get done with the GPT-2 model. If we find it interesting, we may work on it. 🙂

Check out the other interesting AI/ML based products we are working on at Pragnakalp Techlabs, or contact us if you are looking for Chatbot Development, Natural Language Processing or Python/NodeJS programming services.
