Training Diffusion Models with Reinforcement Learning

By neub9

How Diffusion Models Can Be Trained with Reinforcement Learning for Improved Performance

Diffusion models have become the standard choice for generating complex, high-dimensional outputs, with applications ranging from AI art and synthetic image generation to drug design and continuous control. They are traditionally trained with maximum likelihood estimation to match the training data, but most applications actually care about a downstream objective rather than likelihood. In this post, we explore how reinforcement learning (RL) can be used to train diffusion models directly on such objectives; a concrete example of a downstream reward follows below.
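To make "downstream objective" concrete, here is a minimal sketch of one such reward: image compressibility, scored as the negative size of the image's JPEG encoding. The function name and scaling are our own illustration, not necessarily the exact implementation used in the work described here.

```python
import io

from PIL import Image


def compressibility_reward(image: Image.Image, quality: int = 95) -> float:
    """Score an image by how well it compresses: smaller JPEG = higher reward."""
    buffer = io.BytesIO()
    # Encode to JPEG in memory; the reward is the negative size in kilobytes.
    image.convert("RGB").save(buffer, format="JPEG", quality=quality)
    return -len(buffer.getvalue()) / 1024.0
```

A reward like this provides no simple likelihood-based training signal, which is exactly why an RL formulation is attractive.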

To achieve this, we finetune Stable Diffusion on several objectives, including image compressibility, human-perceived aesthetic quality, and prompt-image alignment, the last scored using feedback from a large vision-language model. Our algorithm, denoising diffusion policy optimization (DDPO), frames the denoising process as a multi-step Markov decision process (MDP): each denoising step is treated as an action, so the reward on the final image can be maximized with policy gradients. A minimal sketch of this update follows.
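As an illustration of the MDP framing, the sketch below implements the simplest REINFORCE-style variant of the idea: collect the per-step log-probabilities of a sampled denoising trajectory, score the final image with a reward, and weight each step's log-probability by a normalized advantage. The tensor shapes and helper name are assumptions for illustration; the stronger variant described in the original work additionally uses importance sampling with PPO-style clipping so that samples can be reused across gradient steps.

```python
import torch


def ddpo_reinforce_loss(log_probs: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
    """REINFORCE-style loss over denoising trajectories (illustrative sketch).

    log_probs: (batch, T) log p_theta(x_{t-1} | x_t, c) for each denoising
               step of each sampled trajectory, under the current parameters.
    rewards:   (batch,) reward r(x_0, c) assigned to each final image.
    """
    # Normalize rewards into advantages to reduce gradient variance.
    advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # Every step in a trajectory shares that trajectory's advantage;
    # minimizing this loss ascends the expected-reward policy gradient.
    return -(log_probs * advantages.unsqueeze(1)).mean()
```

In a training loop, one would sample images with the current model while recording each step's log-probability, evaluate the reward on the decoded images, and take gradient steps on this loss.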

We found that DDPO significantly outperforms existing algorithms, and that the finetuned Stable Diffusion model generalizes to prompts it was never finetuned on. However, we also observed overoptimization: past a point, the model exploits the reward function to achieve high scores in ways that are not actually useful.

Overall, this work demonstrates that training diffusion models with RL can substantially improve performance on downstream objectives, while highlighting reward overoptimization as an important problem for future work to address.
