中芸汇科技

Model Fine-Tuning Project for a Legal Tech Platform

Project Background

A legal tech platform provides online legal consulting services for enterprises and individuals, handling more than 3,000 consultations per day on average. The platform previously answered legal questions with general-purpose large language models, but the legal domain is highly specialized and dense with terminology, and the general model reached only 71% accuracy in legal consulting scenarios, with a hallucination rate as high as 28%. It frequently gave plausible-sounding but incorrect advice, which severely undermined the platform's professionalism and user trust. The platform urgently needed a dedicated model with genuine legal understanding.

Core Pain Points

  • Low Legal Consultation Accuracy: The general model achieved only 71% accuracy, failing to meet the professional standards required for legal services.
  • Extremely High Hallucination Rate: 28% of responses contained fabricated legal provisions or erroneous citations, posing professional liability risks.
  • Poor Comprehension of Legal Terminology: The general model inadequately understood specialized legal terms and citations.
  • High Data Annotation Costs: High-quality annotated data in the legal domain was scarce and expensive to produce.

Solution

    Legal Domain LoRA Fine-Tuning

    ChatGLM-6B was fine-tuned for the legal domain using LoRA (Low-Rank Adaptation). A high-quality dataset of 2,000 annotated legal Q&A pairs was constructed, covering core legal areas such as contract disputes, labor disputes, intellectual property, and corporate law. After fine-tuning, accuracy rose from 71% to 95%, and the hallucination rate dropped from 28% to 4%.
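
To give a sense of why LoRA keeps fine-tuning cheap, the sketch below counts the extra parameters that low-rank adapters add. It is a back-of-the-envelope illustration only: the layer shapes (hidden size 4096, a fused 4096 -> 12288 attention projection, 28 layers) and the rank of 8 are assumptions, not settings reported for this project.

```python
# Back-of-the-envelope count of LoRA trainable parameters.
# All dimensions below are assumptions for illustration, not the project's settings.

def lora_trainable_params(d_in: int, d_out: int, rank: int, num_layers: int) -> int:
    """LoRA adds two matrices per adapted layer: A (rank x d_in) and B (d_out x rank)."""
    per_layer = rank * d_in + d_out * rank
    return per_layer * num_layers

# Assumed shapes: hidden size 4096, fused query/key/value projection 4096 -> 12288,
# 28 transformer layers, LoRA rank 8.
HIDDEN = 4096
QKV_OUT = 3 * HIDDEN
LAYERS = 28
RANK = 8

adapter_params = lora_trainable_params(HIDDEN, QKV_OUT, RANK, LAYERS)
# Frozen weights in the same projections (biases ignored).
adapted_weights = HIDDEN * QKV_OUT * LAYERS

print(f"LoRA adapter parameters: {adapter_params:,}")
print(f"Frozen weights in adapted projections: {adapted_weights:,}")
print(f"Adapter / adapted weights: {adapter_params / adapted_weights:.2%}")
```

Under these assumptions only the small A and B matrices are trained while the base weights stay frozen, which is what makes a 2,000-example dataset a workable size for adaptation.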

    Legal Knowledge Enhancement

    A legal knowledge base was built to support retrieval-augmented generation (RAG), incorporating authoritative sources such as laws and regulations, judicial interpretations, and leading cases. When generating a response, the model automatically retrieves relevant legal provisions and case precedents as supporting evidence, so that answers can be traced back to legal authority, which further strengthens credibility and professionalism.
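
The retrieve-then-generate flow described above can be sketched roughly as follows. This is a toy illustration, not the platform's implementation: the knowledge-base entries are placeholders rather than real provisions, the bag-of-words cosine similarity stands in for whatever retriever was actually used, and the names `KNOWLEDGE_BASE`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge-base entries; the texts are placeholders, not real provisions.
KNOWLEDGE_BASE = [
    {"id": "statute-001", "text": "placeholder provision on termination of a labor contract and severance pay"},
    {"id": "statute-002", "text": "placeholder provision on breach of contract and liquidated damages"},
    {"id": "case-001", "text": "placeholder case summary on trademark infringement and damages"},
]

def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list:
    """Return up to k entries most similar to the query, skipping zero-score entries."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc["text"]))), doc) for doc in KNOWLEDGE_BASE]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]

def build_prompt(question: str, k: int = 2) -> str:
    """Prepend the retrieved sources, tagged with their ids, so the answer can cite them."""
    sources = retrieve(question, k)
    context = "\n".join(f"[{d['id']}] {d['text']}" for d in sources)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"Sources:\n{context}\n"
        f"Question: {question}"
    )

print(build_prompt("Is severance pay owed after termination of a labor contract?"))
```

Carrying the source ids into the prompt is what lets a generated answer point back to a specific provision or case, which is the traceability the section describes.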

    Quality Assessment and Continuous Iteration

    A quality evaluation system for legal responses was established, automatically assessing model output along three dimensions: accuracy, completeness, and compliance. Issues identified during evaluation feed back into the training data, creating a data flywheel that keeps improving the model.
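
One way such an evaluation loop could be wired together is sketched below. The three dimensions come from the description above; everything else is assumed for illustration, including the threshold values, the 0 to 1 score scale, and the names `Evaluation` and `select_for_reannotation`. How the scores themselves are produced (for example by reviewers or a scoring model) is left out.

```python
from dataclasses import dataclass

# Hypothetical pass thresholds per dimension, chosen only for illustration.
THRESHOLDS = {"accuracy": 0.9, "completeness": 0.8, "compliance": 0.95}

@dataclass
class Evaluation:
    question: str
    answer: str
    scores: dict  # dimension name -> score in [0, 1]

    def failed_dimensions(self) -> list:
        """Dimensions whose score is missing or below the threshold."""
        return [dim for dim, minimum in THRESHOLDS.items()
                if self.scores.get(dim, 0.0) < minimum]

def select_for_reannotation(evaluations: list) -> list:
    """Collect responses that fail at least one dimension, with the reasons."""
    queue = []
    for ev in evaluations:
        failed = ev.failed_dimensions()
        if failed:
            queue.append({"question": ev.question, "failed": failed})
    return queue

batch = [
    Evaluation("Q1", "A1", {"accuracy": 0.97, "completeness": 0.90, "compliance": 0.99}),
    Evaluation("Q2", "A2", {"accuracy": 0.70, "completeness": 0.85, "compliance": 0.99}),
    Evaluation("Q3", "A3", {"accuracy": 0.95, "completeness": 0.60, "compliance": 0.90}),
]

for item in select_for_reannotation(batch):
    print(item["question"], "needs new training examples for:", ", ".join(item["failed"]))
```

Questions that land in the queue would then be annotated and added to the fine-tuning set, which is the flywheel step.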

    Results

    Metric                             Before  After  Improvement (relative)
    Legal consultation accuracy        71%     95%    34%
    Hallucination rate                 28%     4%     86%
    Legal provision citation accuracy  55%     92%    67%
    User satisfaction                  62%     91%    47%
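
The improvement figures are consistent with relative change measured against the starting value, not percentage-point differences. A short check, using only the before and after numbers from the table (the function name `relative_change` is just for illustration):

```python
def relative_change(before: float, after: float) -> float:
    """Relative change in percent, measured against the starting value."""
    return abs(after - before) / before * 100

# (before, after) pairs taken from the table, in percent.
METRICS = {
    "Legal consultation accuracy": (71, 95),
    "Hallucination rate": (28, 4),
    "Legal provision citation accuracy": (55, 92),
    "User satisfaction": (62, 91),
}

for name, (before, after) in METRICS.items():
    change = relative_change(before, after)
    print(f"{name}: {before}% -> {after}% ({change:.0f}% relative change)")
```

For example, accuracy rose by 24 percentage points, and 24 / 71 is roughly 34%. For the hallucination rate the change is a reduction: 24 / 28 is roughly 86%.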

    Tech Stack

    ChatGLM-6B, LoRA fine-tuning, PEFT, Legal Knowledge Base, RAG, Python, PyTorch, Hugging Face Transformers

    The fine-tuned model truly understands the law now. Lawyers have started trusting the AI's suggestions. This was a key step in our transition from a general model to a specialized one.