LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
This paper proposes LLaMA-Adapter, a lightweight adaption method that efficiently fine-tunes LLaMA into an instruction-following model.
Specifically, a set of learnable adaption prompts is prepended to the word tokens at the higher transformer layers.
A zero-initialized attention mechanism with zero gating then adaptively injects the new instructional cues into LLaMA while effectively preserving its pre-trained knowledge.
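The gist of the mechanism can be illustrated with a minimal PyTorch sketch: a learnable prompt contributes extra keys and values to attention, but its contribution is scaled by a gate initialized at zero, so the pre-trained behaviour is untouched at the start of training. This is a simplified single-head version without the causal mask; names such as `adaption_prompt` and `gate` are illustrative, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ZeroInitAttention(nn.Module):
    """Sketch of zero-gated attention with a learnable adaption prompt."""

    def __init__(self, dim, prompt_len):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim, bias=False)
        self.k_proj = nn.Linear(dim, dim, bias=False)
        self.v_proj = nn.Linear(dim, dim, bias=False)
        self.o_proj = nn.Linear(dim, dim, bias=False)
        # Learnable adaption prompt, used only at the higher layers.
        self.adaption_prompt = nn.Parameter(torch.randn(prompt_len, dim))
        # Gating factor initialized to zero: the prompt contributes nothing
        # at the beginning of training, preserving pre-trained knowledge.
        self.gate = nn.Parameter(torch.zeros(1))
        self.scale = dim ** -0.5

    def forward(self, x):                       # x: (batch, seq, dim)
        b = x.size(0)
        q = self.q_proj(x)
        # Keys/values from the word tokens and from the adaption prompt.
        k_x, v_x = self.k_proj(x), self.v_proj(x)
        p = self.adaption_prompt.unsqueeze(0).expand(b, -1, -1)
        k_p, v_p = self.k_proj(p), self.v_proj(p)

        # Separate softmaxes so the gated prompt scores cannot distort the
        # original token-to-token attention distribution (causal mask omitted).
        attn_x = F.softmax(q @ k_x.transpose(-2, -1) * self.scale, dim=-1)
        attn_p = self.gate * F.softmax(q @ k_p.transpose(-2, -1) * self.scale, dim=-1)

        out = attn_x @ v_x + attn_p @ v_p
        return self.o_proj(out)
```

During fine-tuning only the adaption prompts and gates are updated, while the frozen attention weights are shared with the pre-trained model; the gate gradually grows from zero as the instructional cues become useful.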
LLaMA-Adapter can generate high-quality responses comparable to Alpaca, which fully fine-tunes all 7B parameters of LLaMA.
It can also be simply extended to multi-modal instructions for learning an image-conditioned LLaMA model, which achieves superior reasoning performance on the ScienceQA and COCO Caption benchmarks.