LegalMind: A Fine-Tuned Gemma-2-Based Legal Assistant for Indian Judiciary with RAG and Embedding Integration

Harsh Dilip Pimpale; Aditi Raut; Yash Patil; Gaurav Parpol; Prajwal Yadav; Janhavi Sangoi

doi:10.5935/jetia.v11i55.1925

Harsh Dilip Pimpale St. John College of Engineering and Management, Palghar, Maharashtra; D J Sanghvi College of Engineering, Mumbai, Maharashtra https://orcid.org/0009-0005-9129-0836
Aditi Raut D J Sanghvi College of Engineering, Mumbai, Maharashtra https://orcid.org/0009-0007-9372-8090
Yash Patil St. John College of Engineering and Management https://orcid.org/0009-0000-8084-0699
Gaurav Parpol St. John College of Engineering and Management https://orcid.org/0009-0006-2861-3986
Prajwal Yadav St. John College of Engineering and Management https://orcid.org/0009-0001-2083-8607
Janhavi Sangoi St. John College of Engineering and Management https://orcid.org/0009-0001-0560-4458

Abstract

Legal research and case analysis remain labor-intensive tasks requiring domain expertise and significant time investment. LegalMind is an AI-driven legal assistant built to streamline and automate legal understanding through fine-tuned large language models (LLMs) and Retrieval-Augmented Generation (RAG) techniques. The system integrates Google’s text-embedding-004 for document-level embeddings and a fine-tuned Gemma 2 (9B) model for legal summarization and question answering. LegalMind is enhanced through a situation-based RAG pipeline that retrieves contextually relevant content from "The Bharatiya Nyaya Sanhita, 2023", ensuring precise legal responses grounded in statute. The model's performance is rigorously evaluated using standardized natural language generation metrics, including ROUGE-L, BLEU, METEOR, BERTScore, and F1-score. Evaluation is conducted against the pseudo-reference output of GPT-4o, enabling a reliable benchmark for quality comparison with other LLMs such as Gemini Pro 1.5, LLaMA 3, and Mistral. Empirical results highlight significant improvements post fine-tuning, with LegalMind outperforming several baseline and open-source models on key semantic and syntactic metrics. The system offers a scalable, cost-effective legal NLP pipeline suited for real-world use cases in law firms, research institutions, and legal consultancies, reinforcing the potential of domain-specific LLMs in transforming legal technology.
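As an illustration of the evaluation described above, the following is a minimal sketch of how a ROUGE-L score could be computed for a model answer against a pseudo-reference answer. It assumes the F1 form of ROUGE-L over lowercased, whitespace-split tokens; the abstract does not state the tokenization or scoring library the authors used, so the function names and the example strings here are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                # Extend the best subsequence ending before both tokens.
                curr.append(prev[j - 1] + 1)
            else:
                # Carry over the better of skipping a token on either side.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 between a candidate and a reference string.

    Assumption: lowercased whitespace tokenization and equal weighting
    of precision and recall; published setups may differ.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical strings: a pseudo-reference answer and a model answer.
    pseudo_reference = "the offence is punishable with imprisonment or fine"
    model_answer = "the offence is punishable with fine"
    print(round(rouge_l_f1(model_answer, pseudo_reference), 3))
```

Because the reference here is itself model-generated, a score of this kind measures agreement with the pseudo-reference rather than legal correctness, which is why it is typically read alongside semantic metrics such as BERTScore.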
