Project Overview
This case study focuses on building a multilingual Retrieval-Augmented Generation (RAG) system using two advanced open-source language models: meta-llama/Meta-Llama-3.1-8B-Instruct and mistralai/Mistral-7B-Instruct-v0.3. The system aims to answer questions in both Hindi and English by retrieving relevant information and generating accurate responses in the appropriate language. A key challenge is ensuring the system’s performance in both languages, particularly when the original data is available only in English. To evaluate the system, we focus on two main aspects: the accuracy of the generated responses and the effectiveness of language detection.
Objective
The primary objective is to create a robust RAG system that can handle bilingual queries, retrieve pertinent information from a corpus embedded in English, and generate answers in the language of the query. This system aims to demonstrate the versatility of the “meta-llama/Meta-Llama-3.1-8B-Instruct” and “mistralai/Mistral-7B-Instruct-v0.3” models in a multilingual context.
Language Models:
1. Meta-LLaMA (Meta-Llama-3.1-8B-Instruct):
Meta-Llama-3.1-8B-Instruct is an 8-billion-parameter language model developed by Meta AI. It is based on the Transformer architecture and is an evolution of the Llama 3 series. The model is specifically tuned for instruction-following tasks and is trained on a vast corpus of internet text, including code and multilingual data. It is known for its efficient performance and strong capabilities across a range of NLP tasks, particularly code generation and multilingual processing. However, its relatively small size compared to larger language models may limit its performance on extremely complex tasks.
2. Mistral (Mistral-7B-Instruct-v0.3):
Mistral-7B-Instruct-v0.3 is a 7-billion-parameter language model created by Mistral AI. Built on the Transformer architecture with optimizations such as grouped-query attention, it is trained on a diverse dataset and tuned for instruction-following tasks. Despite its compact size, Mistral-7B delivers efficient performance and strong results across a wide range of NLP tasks, offering a good balance between model size and capability. Like the Llama model, its smaller size may limit it on very complex tasks, but its efficiency and strong instruction-following abilities make it a powerful tool for many applications.
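To make the setup concrete, the sketch below shows how either instruct model can be loaded and queried with the Hugging Face transformers library. This is a minimal illustration, assuming access to the gated model weights has been granted; it is not the project's exact loading code.

```python
# Minimal sketch: load an instruct model and generate a reply with transformers.
# Assumes access to the gated model repositories has been granted on Hugging Face.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.3"  # or "meta-llama/Meta-Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Chat-style prompt for an instruction-tuned model.
messages = [{"role": "user", "content": "What is the total operating income for 2022?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```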
Document Embedding:
Document embedding involves two key components: the embedding model and the vector database. Documents are converted into vector representations by the embedding model and stored in the vector database for later retrieval. The two components are described below, followed by a short code sketch.
1. Embedding Model - LaBSE (Language-Agnostic BERT Sentence Embedding):
LaBSE (Language-agnostic BERT Sentence Embedding) is a multilingual sentence-embedding model developed by Google. It is designed to create semantically meaningful vector representations of sentences across 109 languages, placing translations of the same sentence close together in a shared embedding space.
2. Vector Database - Chroma:
Chroma is an open-source embedding database designed for building AI applications with embeddings and natural language processing. It provides a simple, fast, and scalable solution for storing and querying high-dimensional vector data. Chroma is particularly well-suited for applications like semantic search, recommendation systems, and similarity matching.
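To make the embedding workflow concrete, here is a minimal sketch, assuming the sentence-transformers and chromadb Python packages; the chunk texts, collection name, and storage path are illustrative rather than the project's exact code.

```python
# Minimal sketch: embed document chunks with LaBSE and store them in Chroma,
# then query the collection with a Hindi question to show cross-lingual retrieval.
import chromadb
from sentence_transformers import SentenceTransformer

# LaBSE maps sentences from 109 languages into a shared embedding space.
embedder = SentenceTransformer("sentence-transformers/LaBSE")

# Illustrative chunks taken from the English 10-K filings.
chunks = [
    "Total operating income for 2022 was $119,437 million.",
    "The Company considers marketable securities to be restricted when "
    "withdrawal or general use is legally restricted.",
]

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection(name="financial_reports")
collection.add(
    ids=[f"chunk-{i}" for i in range(len(chunks))],
    documents=chunks,
    embeddings=embedder.encode(chunks).tolist(),
)

# A Hindi query retrieves the matching English chunk because LaBSE embeddings
# are language-agnostic.
query = "2022 के लिए कुल परिचालन आय क्या थी?"
hits = collection.query(
    query_embeddings=embedder.encode([query]).tolist(), n_results=1
)
print(hits["documents"][0])
```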
Language Detection and Instruction Prompting:
The system includes an instruction in the prompt that guides the model to detect the query’s language and respond accordingly. This instruction enables the automatic conversion of retrieved information into the language of the query, whether Hindi or English.
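The project's exact prompt is not reproduced here; the following is an illustrative sketch of this kind of instruction prompt, with wording that is an assumption rather than the original.

```python
# Sketch of an instruction prompt that asks the model to detect the query
# language and answer in that language, using the retrieved English context.
PROMPT_TEMPLATE = """You are a helpful financial assistant.
Use only the context below to answer the question.
Detect the language of the question (Hindi or English) and respond
entirely in that language, translating the context where needed.

Context:
{context}

Question:
{question}

Answer:"""


def build_prompt(context: str, question: str) -> str:
    """Fill the template with retrieved context and the user's question."""
    return PROMPT_TEMPLATE.format(context=context, question=question)
```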
Question Answering:
Questions were posed in both Hindi and English to assess the models' capabilities. The system was designed to automatically detect the language of each query and generate the response in that language.
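Putting the pieces together, a simplified end-to-end flow might look like the sketch below. It reuses the embedder, Chroma collection, tokenizer, model, and build_prompt helper from the earlier sketches, all of which are illustrative assumptions rather than the project's exact code.

```python
# Sketch of the end-to-end flow: retrieve English context with LaBSE + Chroma,
# then generate an answer in the query's language with an instruct model.
# Assumes `embedder`, `collection`, `tokenizer`, `model`, and `build_prompt`
# from the earlier sketches are in scope.
def answer_question(question: str, n_results: int = 3) -> str:
    # 1. Embed the query; LaBSE handles Hindi and English alike.
    query_emb = embedder.encode([question]).tolist()

    # 2. Retrieve the most similar English chunks from Chroma.
    hits = collection.query(query_embeddings=query_emb, n_results=n_results)
    context = "\n\n".join(hits["documents"][0])

    # 3. Build the bilingual instruction prompt and generate the answer.
    prompt = build_prompt(context, question)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )


print(answer_question("What is the total operating income for 2022?"))
```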
Documents Used:
The financial reports used in this study were sourced from publicly available files, so anyone interested in exploring or replicating the study can access them. The documents can be downloaded from the following links; a short sketch of downloading and chunking them appears after the list:
https://s2.q4cdn.com/470004039/files/doc_financials/2022/q4/_10-K-2022-(As-Filed).pdf
https://s2.q4cdn.com/470004039/files/doc_earnings/2023/q4/filing/_10-K-Q4-2023-As-Filed.pdf
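As a rough illustration, the reports could be downloaded and split into text chunks for embedding along these lines. The chunking strategy shown is a naive fixed-size split and an assumption, not necessarily what the project used.

```python
# Sketch: download the 10-K PDFs and split them into text chunks for embedding.
from io import BytesIO

import requests
from pypdf import PdfReader

urls = [
    "https://s2.q4cdn.com/470004039/files/doc_financials/2022/q4/_10-K-2022-(As-Filed).pdf",
    "https://s2.q4cdn.com/470004039/files/doc_earnings/2023/q4/filing/_10-K-Q4-2023-As-Filed.pdf",
]

chunks = []
for url in urls:
    pdf = PdfReader(BytesIO(requests.get(url).content))
    text = "\n".join(page.extract_text() or "" for page in pdf.pages)
    # Naive fixed-size chunking; the project may use a different strategy.
    chunks.extend(text[i:i + 1000] for i in range(0, len(text), 1000))

print(f"Prepared {len(chunks)} chunks from {len(urls)} reports.")
```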
Output:
Here are the results for the questions asked in English:
Questions | Actual Answer | Meta-Llama-3.1-8B-Instruct Response | Mistral-7B-Instruct-v0.3 Response |
---|---|---|---|
What is the total operating income for 2022? | The total operating income for 2022 is $119,437 million. | The total operating income for 2022 is $119,437 million. | The total operating income for 2022 is $119,437 (in millions). |
How does the company measure share-based compensation? | The Company generally measures share-based compensation based on the closing price of the Company’s common stock on the date of grant, and recognizes expense on a straight-line basis for its estimate of equity awards that will ultimately vest. | The company recognizes share-based compensation expense on a straight-line basis for its estimate of equity awards that will ultimately vest. | The company measures share-based compensation on a straight-line basis for its estimate of equity awards that will ultimately vest. |
Can you provide information about Restricted Marketable Securities? | The Company considers marketable securities to be restricted when withdrawal or general use is legally restricted. | The Company considers marketable securities to be restricted when withdrawal or general use is legally restricted. | The Company considers marketable securities to be restricted when withdrawal or general use is legally restricted. |
Can you provide information about the 2022 Employee Stock Plan? | The 2022 Employee Stock Plan provides for broad-based equity grants to employees, including executive officers, and permits the granting of restricted stock units (“RSUs”), stock grants, performance-based awards, stock options and stock appreciation rights. | The 2022 Employee Stock Plan provides for broad-based equity grants to employees, including executive officers, and permits the granting of restricted stock units, stock grants, performance-based awards, stock options, and stock appreciation rights. | The 2022 Employee Stock Plan is a shareholder-approved plan that provides for broad-based equity grants to employees, including executive officers. It permits the granting of Restricted Stock Units (RSUs), stock grants, performance-based awards, stock options, and stock appreciation rights. RSUs generally vest over four years. |
What is the total amount of income taxes paid in 2023, 2022, and 2021? | The total amount of income taxes paid in 2023, 2022, and 2021 is $16,741 million, $19,300 million, and $14,527 million, respectively. | The total amount of income taxes paid in 2023, 2022, and 2021 is $16,741 million, $19,300 million, and $14,527 million, respectively. | The total amount of income taxes paid in 2023 was $16,741 million, in 2022 was $19,300 million, and in 2021 was $14,527 million. |
Here are the results for the questions asked in Hindi:
Questions | Actual Answer | Meta-Llama-3.1-8B-Instruct Response | Mistral-7B-Instruct-v0.3 Response |
---|---|---|---|
2022 के लिए कुल परिचालन आय क्या थी? | 2022 के लिए कुल परिचालन आय $119,437 मिलियन है। | कुल परिचालन आय 2022 में $119,437 मिलियन थी। | 2022 के लिए कुल परिचालन आय $119,103 था। |
कंपनी शेयर-आधारित मुआवजे की माप कैसे करती है? | कंपनी आम तौर पर कंपनी के सामान्य स्टॉक के समापन मूल्य के आधार पर शेयर-आधारित मुआवजे को मापती है अनुदान की तारीख, और इक्विटी पुरस्कारों के अनुमान के लिए सीधी रेखा के आधार पर व्यय को पहचानती है जो अंततः निहित होंगे। | कंपनी शेयर-आधारित मुआवजे की माप को सtraight-line मетод से किया जाता है, जिसमें कंपनी अपने पूरे संभावित शेयरों के लिए मुआवजे की अनुमानित राशि को समान रूप से वितरित करती है। | कंपनी शेयर-आधारित मुआवजे को सामान्यता के अनुसार शुद्ध स्ट्रैट लाइन में किया जाता है और उसके साथ अगले से विकल्प से विशेष प्रकार के शेयर को स्थानांतरित किया जाता है जिसके अंतर्गत अब विकल्प हो सकता है कि उन शेयरों को अपनी घोषित कर दी गई समय से वे फिर से विकल्प हो सकते हैं या उन्हें अपनी घोषित कर दी गई समय से वे फिर से विकल्प नहीं हो सकते। |
सीमित विपणनीय प्रतिभूतियों के बारे में जानकारी दें। | जब निकासी या सामान्य उपयोग कानूनी रूप से प्रतिबंधित हो तो कंपनी विपणन योग्य प्रतिभूतियों को प्रतिबंधित मानती है। | कंपनी ने सीमित विपणनीय प्रतिभूतियों को ऐसा माना जाता है जिसके लिए निकास या सामान्य उपयोग कानूनी रूप से सीमित है। | सीमित विपणनीय प्रतिभूतियों के बारे में, कंपनी अपने विपणनीय सेकुरिटीज को सीमित होना समझती है जबकि उनका अनुपलब्ध उपयोग या सीमित करना हो सकता है। यह कंपनी कोन्सोलिडेटेड बैलेंस शेट्स में अपने कुरेंट या अक्रूअर्ट मार्केटेबल सेकुरिटीज के रूप में शामिल करती है, जिसके अधिकारियों के अधिकार के अनुसार प्राप्त हुए सेकुरिटीज का वर्ग होता है। |
2022 के कर्मचारी स्टॉक योजना के बारे में जानकारी दें। | 2022 कर्मचारी स्टॉक योजना कार्यकारी अधिकारियों सहित कर्मचारियों को व्यापक-आधारित इक्विटी अनुदान प्रदान करती है, और प्रतिबंधित स्टॉक देने की अनुमति देती है इकाइयाँ ("आरएसयू"), स्टॉक अनुदान, प्रदर्शन-आधारित पुरस्कार, स्टॉक विकल्प और स्टॉक प्रशंसा अधिकार। | 2022 Employee Stock Plan के तहत कर्मचारियों को ब्रॉड-बेस्ड इक्विटी ग्रैंट्स प्रदान किए जाते हैं, जिसमें कार्यकारी अधिकारियों को भी शामिल किया जाता है। इस योजना के तहत RSUs, स्टॉक ग्रैंट्स, प्रदर्शन-आधारित अवार्ड्स, स्टॉक ऑप्शन्स और स्टॉक ऐप्रेशिएशन राइट्स जैसे प्रकार के इक्विटी अवार्ड्स प्रदान किए जाते हैं। | The 2022 Employee Stock Plan is a shareholder-approved plan that provides for broad-based equity grants to employees, including executive officers. It permits the granting of restricted stock units (RSUs), stock grants, performance-based awards, stock options, and stock appreciation rights. RSUs granted under the plan generally vest over four years. |
2023, 2022 और 2021 में भुगतान किए गए आयकर की कुल राशि क्या है? | 2023, 2022 और 2021 में भुगतान किए गए आयकर की कुल राशि क्रमशः $16,741 मिलियन, $19,300 मिलियन और $14,527 मिलियन है। | 2023 में भुगतान किए गए आयकर की कुल राशि 16,741 करोड़, 2022 में 19,300 करोड़ और 2021 में 14,527 करोड़ थी। | 2023 में भुगतान किए गए आयकर की कुल राशि $16,741 मिलियन रूपये थी। |
Performance Evaluation
1. Language Detection:
The models effectively detected the language of the queries and provided accurate responses in the corresponding language.
2. Response Accuracy in English:
Meta-Llama-3.1-8B-Instruct answers English queries over the English PDFs more accurately than Mistral-7B-Instruct-v0.3.
3. Response Accuracy in Hindi:
Meta-Llama-3.1-8B-Instruct performs better in Hindi than Mistral-7B-Instruct-v0.3, which struggles considerably with Hindi.
4. Overall Performance:
Meta-Llama-3.1-8B-Instruct consistently outperforms Mistral-7B-Instruct-v0.3 in both languages.
Both models show better performance in English compared to Hindi, which is expected given that they are likely trained primarily on English data.
Challenges
Prompt Engineering:
The effectiveness of the responses was also influenced by the quality of the prompts. In some cases, the prompts used were not optimized, leading to less accurate or incomplete answers.
Partial Hindi Translation:
The Mistral model occasionally failed to fully translate words in Hindi, resulting in responses that included both Hindi and English words.
Conclusion
The integration of the meta-llama/Meta-Llama-3.1-8B-Instruct and mistralai/Mistral-7B-Instruct-v0.3 models in a multilingual RAG application has proven effective in handling complex, cross-lingual question-answering tasks. By embedding English documents and supporting multilingual queries, the project demonstrated the potential of these models in a real-world application.
However, the challenges with partial translations in Hindi responses indicate that improvement is needed, especially in handling specific languages. The effectiveness of prompts also played a crucial role in the quality of the generated responses, highlighting the need for careful prompt engineering. Overall, the project succeeded in providing accurate and contextually appropriate answers in both languages.
This case study underscores the effectiveness of combining state-of-the-art language models in creating a bilingual RAG system and highlights the importance of continuous refinement to address specific linguistic challenges.