January 2, 2023

Introduction

Prompt tuning is a technique that adapts frozen pre-trained language models to downstream tasks while minimizing per-task storage and memory usage during the training phase. This is useful for Large Language Models (LLMs) such as GPT-2, T5, GPT-J, GPT-Neo, GPT-NeoX, GPT-20B, GPT-3, etc., where the model is so large that fine-tuning becomes difficult or very expensive.

The pre-trained language model's parameters are frozen, and only the prompt embedding parameters for a particular task are updated during the training phase.

The description of the task is included in the actual input sentence in some way. This task description is called a prompt, as it prompts the model to perform a specific task.
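
To make this concrete, the sketch below is a minimal, generic PyTorch illustration, not the NeMo implementation; the class and parameter names are purely illustrative. It shows a backbone language model whose parameters are frozen while only a small set of virtual-prompt embeddings remains trainable and is prepended to the input embeddings.

import torch
import torch.nn as nn

class PromptTunedModel(nn.Module):
    """Illustrative only: assumes the frozen backbone accepts input embeddings directly."""

    def __init__(self, frozen_lm: nn.Module, num_virtual_tokens: int, hidden_size: int):
        super().__init__()
        self.frozen_lm = frozen_lm
        for p in self.frozen_lm.parameters():
            p.requires_grad = False  # backbone parameters stay frozen
        # The only trainable parameters: one embedding per virtual prompt token.
        self.prompt_embeddings = nn.Embedding(num_virtual_tokens, hidden_size)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        batch_size = input_embeds.size(0)
        prompt = self.prompt_embeddings.weight.unsqueeze(0).expand(batch_size, -1, -1)
        # Prepend the learned virtual tokens, then run the frozen model.
        return self.frozen_lm(torch.cat([prompt, input_embeds], dim=1))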

Benefits of Prompt Tuning

The size of large pre-trained language models is constantly growing. Fine-tuning such models requires more memory, and each fine-tuned model is just as large as the original, so it would be prohibitively expensive to store and serve a separate fine-tuned copy of the model for each downstream task. Prompt tuning, by contrast, requires less memory during training, and the prompt-tuned artifact for each task is much smaller than a full fine-tuned model.

Drawback of Prompt Tuning

  1. Prompt tuning takes more training time than fine-tuning.

  2. The prompts must be designed carefully so that the model can understand the task.

In this tutorial, we will learn how to run inference with a prompt-tuned model for sentiment analysis. You will need a prompt-tuned model for this; to perform prompt tuning, you can check our Colab file Prompt Tuning Large Language Model. Once you have the prompt-tuned model, follow the steps below to perform the inference.

Resource Requirements

Colab Pro:
25 GB RAM
2 x vCPU
T4 GPU
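
Before installing anything, you can optionally confirm that the runtime matches these requirements. The quick check below assumes PyTorch is already available in the Colab runtime (it is preinstalled there):

import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))  # expect a Tesla T4 on Colab Pro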

Before you run inference with the prompt-tuned model, you must ensure that all the required packages are installed on your system. Perform the following steps to install all the required packages:

Install Apex

git clone https://github.com/ericharper/apex.git
cd apex
git checkout nm_v1.11.0
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" --global-option="--fast_layer_norm" --global-option="--distributed_adam" --global-option="--deprecated_fused_adam" ./
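
After the build completes, a quick import check (an optional sanity step we add here, not part of the original instructions) confirms that Apex is visible in the active environment:

# If the build above succeeded, this should import without error.
import apex

print("Apex installed at:", apex.__file__)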

Install NeMo toolkit

apt-get update && apt-get install -y libsndfile1 ffmpeg
pip install Cython
pip install nemo_toolkit['all']
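
Similarly, you can confirm the NeMo installation with a short import check (again, an optional step added here for convenience):

import nemo
import nemo.collections.nlp as nemo_nlp  # the collection used for GPT prompt learning

print("NeMo version:", nemo.__version__)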

Download the GPT model’s .nemo file

Once you have installed the packages, download the model. In this tutorial, we are using the nemo_gpt1.3B_fp16.nemo model as the frozen base for prompt tuning; you can download nemo_gpt1.3B_fp16.nemo as shown below:

wget https://huggingface.co/nvidia/nemo-megatron-gpt-1.3B/resolve/main/nemo_gpt1.3B_fp16.nemo
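
The checkpoint is a few gigabytes, so it is worth confirming that the download completed before moving on (an optional check, assuming the file was saved to the current working directory):

import os

path = "nemo_gpt1.3B_fp16.nemo"
if os.path.exists(path):
    print(f"{path}: {os.path.getsize(path) / 1e9:.2f} GB")
else:
    print(f"{path} not found - re-run the wget command above")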

Clone the NeMo repo from GitHub

git clone https://github.com/NVIDIA/NeMo.git
cd NeMo/examples/nlp/language_modeling

Inference

To perform the inference, run the megatron_gpt_prompt_learning_eval.py file as shown below.

python megatron_gpt_prompt_learning_eval.py \
    +virtual_prompt_model_file=PATH_TO_NEMO_PROMPT_LEARNING_MODEL_FILE \
    gpt_model_file=PATH_TO_FROZEN_GPT_MODEL_FILE \
    inference.greedy=True \
    inference.add_BOS=False \
    trainer.devices=1 \
    trainer.num_nodes=1 \
    tensor_model_parallel_size=1 \
    pipeline_model_parallel_size=1 \
    pred_file_path=PATH_WHERE_PRED_TEXT_FILE_WILL_BE_SAVED \
    data_paths=[path/to/prompt_1.json, path/to/prompt_2.json]

+virtual_prompt_model_file = Path to your prompt-tuned .nemo file
gpt_model_file = Path to the frozen GPT model’s .nemo file
data_paths = List of json or jsonl files that contain the test inputs

Note: The json file should have keys that match the fields specified in the prompt template that was used during the training phase.

For example, if the prompt template during the training phase looks like this:

"<|VIRTUAL_PROMPT_0|> {sentence} sentiment: {label}"

then each entry in the test file should supply the matching fields, with the label omitted so that the model can predict it:

{"taskname": "sentiment", "sentence": "some text"}

Input

The sample json file looks like the one below.
prompt_1.json

{"taskname": "sentiment", "sentence": "The movie was not good."}
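
To prepare a test file of your own, each line should be a standalone JSON object whose keys match the prompt template fields, with the label omitted. A small helper like the one below (a hypothetical convenience script, reusing the prompt_1.json filename from the data_paths argument above) writes such a file:

import json

# Sentences to classify; each becomes one line in the test file.
sentences = [
    "The movie was not good.",
    "The acting was wonderful.",
]

with open("prompt_1.json", "w") as f:
    for sentence in sentences:
        # "label" is intentionally omitted; the model generates it after "sentiment:".
        f.write(json.dumps({"taskname": "sentiment", "sentence": sentence}) + "\n")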

Output

On running the inference, it will create a text file that contains the results. Below is the sample text file.
results.txt

The movie was not good. sentiment: negative
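
If you want to post-process the predictions, the prediction file can be parsed with a few lines of Python. This is a hypothetical helper; it simply assumes each output line mirrors the prompt template, with the generated label following "sentiment:":

# Split each predicted line into the input sentence and the generated label.
with open("results.txt") as f:
    for line in f:
        line = line.strip()
        if "sentiment:" in line:
            sentence, prediction = line.rsplit("sentiment:", 1)
            print(sentence.strip(), "->", prediction.strip())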

