Quark: Controllable Text Generation with Reinforced [Un]learning


Large language models may generate content that is misaligned with user expectations, such as toxic language, repetitive text, and otherwise undesired responses.

This paper addresses this challenge with Quark, an algorithm for optimizing a reward function that quantifies an (un)wanted property, so the model can reinforce desired behavior and unlearn undesired behavior.
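At a high level, Quark's training loop scores model samples with the reward function and sorts them into quantile buckets, each tagged with a special reward token that is later used to condition generation toward the best quantile. A minimal sketch of the bucketing step, with illustrative names (the actual tokenization and training code differ):

```python
def quantile_tokens(rewards, k=5):
    """Assign each sample a reward-quantile token <Q0>..<Q{k-1}>.

    Samples are ranked by reward score, ascending, so the
    highest-reward samples receive the highest quantile token.
    (Illustrative sketch, not the paper's implementation.)
    """
    n = len(rewards)
    # Indices sorted by reward, lowest first
    order = sorted(range(n), key=lambda i: rewards[i])
    tokens = [None] * n
    for rank, i in enumerate(order):
        q = min(rank * k // n, k - 1)  # bucket index in 0..k-1
        tokens[i] = f"<Q{q}>"
    return tokens

# Five samples, five quantiles: one token per bucket
print(quantile_tokens([0.1, 0.9, 0.5, 0.7, 0.3], k=5))
# → ['<Q0>', '<Q4>', '<Q2>', '<Q3>', '<Q1>']
```

During fine-tuning, each sample is conditioned on its quantile token, and at inference time the model is prompted with the best-reward token; a KL penalty against the original model keeps generations fluent.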

Slides for this talk

Reference Paper
