CtrlGen: Controllable Generative Modeling in Language and Vision

CtrlGen is a NeurIPS 2021 workshop that took place virtually on Monday, December 13, 2021, here. It aimed to explore disentanglement, controllability, and manipulation in generative models for the vision and language modalities. We featured an exciting lineup of speakers, a live Q&A and panel session, interactive activities, and networking opportunities.

UPDATE (Feb. 2022): the recording of our workshop is now publicly available here! It starts at around 35 minutes into the main recording.

Contact: ctrlgenworkshop@gmail.com

Important Dates

~~Paper Submission Deadline: October 3, 2021~~
~~Paper Acceptance Notification: October 25, 2021~~
~~Paper Camera-Ready Deadline: November 5, 2021~~
~~Demo Submission Deadline: November 26, 2021~~
~~Demo Acceptance Notification: December 3, 2021~~
Workshop Date: December 13, 2021

Note that all of the above deadlines are at 11:59 PM AoE (Anywhere on Earth).

Workshop Description

Over the past few years, there has been increased interest in language and image generation within the ML, NLP, and Vision communities. As texts generated by models like GPT-3 start to sound more fluid and natural, and images and videos generated by GAN models appear more realistic, researchers have begun focusing on qualitative properties of the generated content, such as the ability to control its style and structure or to incorporate information from external sources and other texts or images into the output. Such aims are extremely important for making language and image generation useful for human-machine interaction and other real-world applications, including machine co-creativity, entertainment, ethical purposes, enhanced training for self-driving vehicles, and improving the ability of conversational agents and personal assistants to interact effectively.

Achieving these ambitious but important goals introduces challenges not only from NLP and Vision perspectives, but also ones that closely pertain to Machine Learning as a whole, which has concurrently witnessed a growing body of work in relevant domains such as interpretability, disentanglement, robustness, and representation learning. We believe progress towards the realization of human-like language and image generation may benefit greatly from insights in these and other ML areas.

In this workshop, we propose to bring together researchers from the NLP, Vision, and ML communities to discuss the current challenges in controllable generation and to explore potential directions for improving its quality, correctness, and diversity. As excitement about language and image generation has increased significantly in recent years thanks to the advent and improvement of large language models, transformer networks, and GANs, we feel this is an opportune time to hold a new workshop on this subject. We hope CtrlGen will foster discussion and interaction across the NLP, Vision, and ML communities, including areas that range from structured prediction to disentanglement, and spark fruitful cross-domain relations that may open the door to enhanced controllability in language and image generation.

Call for Papers and Demonstrations

Please see our call for papers and call for demonstrations pages for details.