LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention

This paper proposes LLaMA-Adapter, a lightweight adaptation method that efficiently fine-tunes LLaMA into an instruction-following model.

Specifically, the authors adopt a set of learnable adaption prompts and prepend them to the word tokens at the higher transformer layers.
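
As a concrete illustration, here is a minimal PyTorch sketch of prepending learnable adaption prompts to the word tokens at a set of higher layers; the module name, prompt length, and initialization scale are illustrative assumptions rather than the paper's exact code.

```python
import torch
import torch.nn as nn

class AdaptionPrompts(nn.Module):
    """Learnable prompts prepended to word tokens at the top-L transformer layers."""

    def __init__(self, n_adapted_layers: int, prompt_len: int, dim: int):
        super().__init__()
        # One learnable prompt per adapted (higher) layer; the small random init is an assumption.
        self.prompts = nn.Parameter(torch.randn(n_adapted_layers, prompt_len, dim) * 0.02)

    def prepend(self, layer_idx: int, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, dim) word-token hidden states entering this adapted layer
        batch = tokens.shape[0]
        prompt = self.prompts[layer_idx].unsqueeze(0).expand(batch, -1, -1)
        # Concatenate along the sequence dimension: [adaption prompt; word tokens]
        return torch.cat([prompt, tokens], dim=1)
```

Only these prompt parameters (plus the gating factors described next) are trained; the LLaMA weights stay frozen, which is what keeps the method lightweight.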

They then propose a zero-initialized attention mechanism with zero gating, which adaptively injects the new instructional cues into LLaMA while effectively preserving its pre-trained knowledge.
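
Below is a hedged sketch of this zero-initialized attention, assuming the attention scores on the prompt keys are softmaxed separately and scaled by a learnable gate that starts at zero; shapes and function names are illustrative, and the causal mask over word tokens is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def zero_init_attention(q, k_prompt, k_tokens, v_prompt, v_tokens, gate):
    # q:        (batch, heads, q_len, head_dim) queries from word tokens
    # k_prompt: (batch, heads, p_len, head_dim) keys from the adaption prompts
    # k_tokens: (batch, heads, t_len, head_dim) keys from the word tokens
    scale = q.shape[-1] ** -0.5
    s_prompt = (q @ k_prompt.transpose(-2, -1)) * scale  # scores on prompt keys
    s_tokens = (q @ k_tokens.transpose(-2, -1)) * scale  # scores on word-token keys

    # Softmax each part independently, then gate the prompt part.
    # gate is a learnable factor initialized to zero, so at the start of training
    # the output equals ordinary self-attention over the word tokens alone.
    a_prompt = gate * F.softmax(s_prompt, dim=-1)
    a_tokens = F.softmax(s_tokens, dim=-1)

    return a_prompt @ v_prompt + a_tokens @ v_tokens

if __name__ == "__main__":
    b, h, p, t, d = 1, 8, 10, 32, 64
    gate = nn.Parameter(torch.zeros(h, 1, 1))  # zero-initialized gating, one per head (assumption)
    out = zero_init_attention(
        torch.randn(b, h, t, d),
        torch.randn(b, h, p, d), torch.randn(b, h, t, d),
        torch.randn(b, h, p, d), torch.randn(b, h, t, d),
        gate,
    )
    print(out.shape)  # torch.Size([1, 8, 32, 64])
```

Because the gate starts at zero, the adapted layers initially behave exactly like the frozen pre-trained attention, and the instructional signal from the prompts is injected only as the gate is learned.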

LLaMA-Adapter can generate high-quality responses comparable to Alpaca, which fully fine-tunes all 7B parameters of the model.

It can also be simply extended to multi-modal instructions to learn an image-conditioned LLaMA model, which achieves superior reasoning performance on the ScienceQA and COCO Caption benchmarks.
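
For the multi-modal variant, one plausible sketch is to project a global image feature from a frozen visual encoder into the LLM's hidden dimension and add it to the adaption prompts, making the prompts image-conditioned; the projection layer and shapes below are assumptions for illustration, not necessarily the paper's exact design.

```python
import torch
import torch.nn as nn

class VisualAdaptionPrompts(nn.Module):
    """Adaption prompts conditioned on a global image feature (illustrative sketch)."""

    def __init__(self, prompt_len: int, dim: int, vis_dim: int):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)
        self.proj = nn.Linear(vis_dim, dim)  # maps the image feature to the LLM width

    def forward(self, image_feat: torch.Tensor) -> torch.Tensor:
        # image_feat: (batch, vis_dim) global feature from a frozen image encoder
        vis = self.proj(image_feat).unsqueeze(1)   # (batch, 1, dim)
        return self.prompts.unsqueeze(0) + vis     # (batch, prompt_len, dim)
```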

PowerPoint for this talk

Reference Paper