Chinese-Vicuna Model Documentation

Overview

Model Name: Chinese-Vicuna
Category: Language Model, Instruction-Following Model, Dataset
Variants: Chinese-Vicuna7B, Chinese-Vicuna-medical7B, Ours-7b-chatv1

Chinese-Vicuna is a language model fine-tuned to enhance instruction-following capabilities in Chinese, particularly in medical and legal domains. It is designed to be resource-efficient, making it accessible for various applications, including conversational AI and domain-specific question answering.

Problem Statement

Addressed Challenges

  • Gap in Chinese Instruction-Following: Existing models struggle with instruction-following tasks in Chinese.
  • Resource Intensity: Many Chinese LLMs require high-end hardware, limiting accessibility.
  • Lack of Specialization: Current models often lack vertical specialization, particularly in medical and legal contexts.

Key Contributions

  • Fine-Tuning Methodology: Utilizes Low-Rank Adaptation (LoRA) for efficient parameter tuning.
  • Domain-Specific Adaptation: Tailored for healthcare and legal applications, improving performance in these areas.
  • Quantization Techniques: Implements 4-bit quantization (QLoRA) to reduce memory requirements, enabling cost-effective fine-tuning and deployment.
  • Training Data: Incorporates a merged dataset from Belle and Guanaco, along with domain-specific datasets like cMedQA2 and Lawyer-LLaMA.
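
The merged general-instruction corpus can be assembled with a short script. The sketch below is a minimal illustration, assuming local JSON-Lines copies of the Belle and Guanaco data with instruction/input/output fields; the file names and field names are placeholders rather than the repository's actual layout.

```python
import json

def load_jsonl(path):
    """Read one JSON record per line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

def normalize(example):
    """Map a raw record onto a common instruction/input/output schema."""
    return {
        "instruction": example.get("instruction", ""),
        "input": example.get("input", ""),
        "output": example.get("output", ""),
    }

# Hypothetical local copies of the Belle and Guanaco instruction data.
belle = load_jsonl("data/belle.jsonl")
guanaco = load_jsonl("data/guanaco.jsonl")

merged = [normalize(x) for x in belle + guanaco]

with open("data/merged_instructions.json", "w", encoding="utf-8") as f:
    json.dump(merged, f, ensure_ascii=False, indent=2)
```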

Model Architecture and Techniques

Core Techniques

  • LoRA: Injects trainable low-rank adaptation matrices alongside the frozen base weights, significantly reducing memory usage during training while maintaining performance (see the sketch after this list).
  • QLoRA: Employs 4-bit quantization to lower VRAM requirements, enabling deployment on consumer-grade GPUs.
  • Model Quantization: Uses GPTQ for efficient inference, compatible with CPU-based systems.
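
A minimal sketch of how these pieces compose with the Hugging Face transformers and peft libraries: the frozen base model is loaded with 4-bit NF4 quantization (QLoRA-style) and rank-8 adapters are injected into the attention projections. The base checkpoint name and target modules are assumptions for illustration, not the repository's exact configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

BASE_MODEL = "decapoda-research/llama-7b-hf"  # assumed base checkpoint

# QLoRA-style 4-bit NF4 quantization keeps the frozen base weights small in VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA: small trainable rank-8 matrices beside the frozen attention projections.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed injection points
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```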

Specialized Models

  • Chinese Medical Model: Fine-tuned on medical Q&A data (e.g. cMedQA2) to improve responses to health-related inquiries; a loading sketch follows this list.
  • Legal Model: Trained specifically on legal datasets to improve performance in legal question-answering tasks.
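
Because the specialized models are LoRA adapters, they can be attached to the base model at load time. The sketch below assumes a hypothetical adapter repository id and base checkpoint; substitute the actual paths published by the project.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "decapoda-research/llama-7b-hf"  # assumed base checkpoint
ADAPTER = "Chinese-Vicuna/medical-lora-7b"    # hypothetical adapter id

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.float16, device_map="auto"
)
# Attach the domain-specific LoRA weights on top of the frozen base model.
model = PeftModel.from_pretrained(model, ADAPTER)

# "A patient has had a fever for three days; what should they watch out for?"
prompt = "病人连续发烧三天，需要注意什么？"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```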

Training and Evaluation

Training Process

  • Data Requirements: Requires general instruction data, dialogue data, and domain-specific texts.
  • Training Pipeline: Involves collecting Chinese-related corpora, applying quantization, and fine-tuning models for specific tasks.
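
Before fine-tuning, each instruction record is rendered into a prompt and tokenized for causal-LM training. The template and cutoff length below are illustrative assumptions (an Alpaca-style format), not necessarily the exact prompt the project uses.

```python
from transformers import AutoTokenizer

# Alpaca-style instruction template; the repository's actual prompt may differ.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def tokenize_example(example, tokenizer, cutoff_len=256):
    """Build input_ids/labels for one record (labels mirror inputs for causal LM)."""
    text = PROMPT_TEMPLATE.format(instruction=example["instruction"]) + example["output"]
    ids = tokenizer(text, truncation=True, max_length=cutoff_len)["input_ids"]
    return {"input_ids": ids, "labels": list(ids)}

tokenizer = AutoTokenizer.from_pretrained("decapoda-research/llama-7b-hf")  # assumed checkpoint
record = {"instruction": "用中文解释什么是低秩适配（LoRA）。", "output": "LoRA 是一种参数高效微调方法……"}
print(tokenize_example(record, tokenizer))
```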

Evaluation Metrics

  • Performance Benchmarks: Reports competitive results on medical and legal tasks, with coherent multi-turn dialogue and support for keeping legal answers current.
  • Comparative Performance: Reported to perform on par with ChatGPT on Chinese tasks and to deliver stronger results on medical question answering.

Limitations and Future Work

  • Alignment Improvements: Future iterations will integrate Reinforcement Learning from Human Feedback (RLHF) for better alignment.
  • Knowledge Retrieval: Plans to implement dynamic knowledge retrieval systems to address temporal data gaps.

Practical Considerations

Hyperparameters

  • Quantization: 8-bit for the 7B model, 4-bit for the 13B model.
  • Batch Size: 128
  • Learning Rate: 3 × 10^-4
  • Epochs: 3
  • LoRA Parameters: R=8, Alpha=16, Dropout=0.05
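
These hyperparameters map directly onto a peft LoraConfig and transformers TrainingArguments. The sketch below assumes the effective batch size of 128 is reached with a per-device batch of 8 and 16 gradient-accumulation steps on a single GPU (divide the accumulation by the device count in a multi-GPU run); the target modules and output directory are placeholders.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings from the list above: R=8, Alpha=16, Dropout=0.05.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed injection points
    task_type="CAUSAL_LM",
)

# Effective batch size 128 = 8 (per device) x 16 (gradient accumulation) on one GPU.
training_args = TrainingArguments(
    output_dir="chinese-vicuna-lora",     # placeholder output directory
    per_device_train_batch_size=8,
    gradient_accumulation_steps=16,
    learning_rate=3e-4,                   # 3 x 10^-4
    num_train_epochs=3,
    fp16=True,
    logging_steps=20,
    save_strategy="epoch",
)
```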

Resource Requirements

  • Hardware: Training requires four NVIDIA RTX 2080 Ti GPUs.

Conclusion

Chinese-Vicuna represents a significant advancement in Chinese language models, particularly for instruction-following tasks in specialized domains. Its efficient architecture and fine-tuning strategies make it a valuable resource for developers and researchers focusing on multilingual and domain-specific applications.

Sources

https://arxiv.org/abs/2504.12737v1