Program on Wednesday, February 12

09:00am Registration and Coffee & Tea!
09:30am Opening Remarks
Eric Xing (MBZUAI & Carnegie Mellon University)
10:10am Statistical Methods for Assessing the Factual Accuracy of Large Language Models
Emmanuel Candès (Stanford University)
We present new statistical methods for obtaining validity guarantees on the output of large language models (LLMs). These methods enhance conformal prediction techniques to filter out unreliable claims and remove hallucinations while providing a finite-sample guarantee on the error rate of what is presented to the user. This error rate is adaptive in the sense that it depends on the prompt, preserving the utility of the output by not removing too many claims. We demonstrate performance on real-world examples. This is joint work with John Cherian and Isaac Gibbs.
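As a rough illustration of the filtering idea described in the abstract, the sketch below shows a basic split-conformal threshold for claim filtering (in the spirit of the simpler method this work enhances). All names and the scoring setup are our own assumptions, not details from the talk: each calibration response contributes the maximum confidence score assigned to any of its incorrect claims, and the finite-sample quantile of these scores gives a retention threshold.

```python
import numpy as np

def conformal_filter_threshold(calib_false_max_scores, alpha):
    """Split-conformal threshold for claim filtering (illustrative sketch).

    calib_false_max_scores: for each calibration response, the maximum
    confidence score assigned to any *incorrect* claim in that response
    (use -inf if every claim in the response was correct).
    Retaining only claims scoring strictly above the returned threshold
    keeps all retained claims correct with probability >= 1 - alpha.
    """
    s = np.sort(np.asarray(calib_false_max_scores, dtype=float))
    n = len(s)
    # finite-sample conformal quantile: ceil((n + 1)(1 - alpha))-th order statistic
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return float("inf")  # too little calibration data to certify anything
    return float(s[k - 1])

def filter_claims(claims, scores, threshold):
    """Keep only the claims whose confidence score exceeds the threshold."""
    return [c for c, sc in zip(claims, scores) if sc > threshold]
```

Note that this fixed threshold is non-adaptive; the methods presented in the talk go further by letting the error rate depend on the prompt.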
10:50am Coffee & Tea Break
11:00am The ChatGLM's Road to AGI
Jie Tang (Tsinghua University)
Large language models have substantially advanced the state of the art in various AI tasks, such as natural language understanding, text generation, image processing, and multimodal modeling. In this talk, we will first introduce the development of AI over the past decades, in particular from the perspective of China. We will also discuss the opportunities, challenges, and risks of AGI in the future. In the second part of the talk, we will use ChatGLM, an open-source alternative to ChatGPT, as an example to explain the understanding and insights we derived during the implementation of the model.
11:40am Exploiting Knowledge for Model-based Deep Music Generation
Gaël Richard (Télécom Paris)
We will describe and illustrate the concept of hybrid (or model-based) deep learning for music generation. This paradigm refers to models that associate data-driven and model-based approaches in a joint framework by integrating our prior knowledge about the data into more controllable deep models. In the music domain, prior knowledge can relate, for instance, to the production or propagation of sound (using an acoustic or physical model) or to how music is composed or structured (using a musicological model). In this presentation, we will first illustrate the concept and potential of such model-based deep learning approaches and then describe in more detail their application to unsupervised music separation with source production models, music timbre transfer with diffusion models, and symbolic music generation with transformers using structure-informed positional encoding.
12:20pm Auditing and Mitigating Biases in (compressed) Language Models
Julien Velcin (University of Lyon)
The size of language models plays a critical role in their ability to address complex NLP tasks. However, such large models can be hard to deploy on edge devices, which creates a need to compress LLMs. Recent studies have shown that compressing pretrained models can significantly influence the way they deal with various biases, such as biases related to fairness and model calibration. In this talk, I will provide an overview of recent research conducted at the ERIC Lab as part of the DIKé project. In particular, we will see how aggressive quantization can lead to calibration errors and alter the model's confidence in its predictions. Additionally, I will discuss ongoing work on the alignment of LLMs with moral values.
01:00pm Lunch
02:00pm Intricacies of Game-theoretical LLM Alignment
Michal Valko (INRIA & Stealth Startup)
Ensuring alignment of language models' outputs with human preferences is critical to guarantee a useful, safe, and pleasant user experience. Thus, human alignment has been extensively studied recently, and several methods such as Reinforcement Learning from Human Feedback (RLHF), Direct Preference Optimisation (DPO), and Sequence Likelihood Calibration (SLiC) have emerged. In this talk, our contribution is two-fold. First, we show the equivalence between two recent alignment methods, namely Identity Policy Optimisation (IPO) and Nash Mirror Descent (Nash-MD). Second, we introduce a generalisation of IPO, named IPO-MD, that leverages the regularised sampling approach proposed by Nash-MD. This equivalence may seem surprising at first sight, since IPO is an offline method whereas Nash-MD is an online method using a preference model. However, the equivalence can be proven when we consider the online version of IPO, that is, when both generations are sampled by the online policy and annotated by a trained preference model. Optimising the IPO loss with such a stream of data becomes equivalent to finding the Nash equilibrium of the preference model through self-play. Building on this equivalence, we introduce the IPO-MD algorithm, which generates data with a mixture policy (between the online and reference policy), similarly to the general Nash-MD algorithm. We compare online-IPO and IPO-MD to different online versions of existing losses on preference data, such as DPO and SLiC, on a summarisation task.
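For readers unfamiliar with the IPO loss mentioned above, it can be written as a simple squared regression objective; the notation below is our paraphrase of the published formulation (Azar et al.), not material from the talk itself:

$$\mathcal{L}_{\mathrm{IPO}}(\pi) \;=\; \mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\!\left[\left(\log\frac{\pi(y_w\mid x)\,\pi_{\mathrm{ref}}(y_l\mid x)}{\pi(y_l\mid x)\,\pi_{\mathrm{ref}}(y_w\mid x)} \;-\; \frac{1}{2\tau}\right)^{2}\right]$$

Here $\pi_{\mathrm{ref}}$ is the reference policy and $\tau$ the regularisation strength. In the online variant discussed in the talk, both generations $y_w, y_l$ are sampled from the current policy and ranked by a trained preference model.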
02:40pm Moshi: A Speech-text Foundation Model for Real-time Dialogue
Alexandre Défossez (Kyutai)
We will discuss Moshi, our recently released model. Moshi is capable of full-duplex dialogue, i.e., it can both speak and listen at any time, offering the most natural speech interaction to date. Moshi is also multimodal: in particular, it is able to leverage its inner text monologue to improve the quality of its generation. We will cover the design choices behind Moshi, in particular the efficient joint sequence modeling enabled by the RQ-Transformer, and the use of large-scale synthetic instruction data.
03:20pm Coffee & Tea Break
03:30pm Feature-Conditioned Graph Generation using Latent Diffusion Models
Giannis Nikolentzos (University of Peloponnese)
Graph generation has emerged as a crucial task in machine learning, with significant challenges in generating graphs that accurately reflect specific properties. In this talk, I will present Neural Graph Generator, our recently released model, which utilizes conditioned latent diffusion models for graph generation. The model employs a variational graph autoencoder for graph compression and a diffusion process in the latent vector space, guided by vectors summarizing graph statistics. Overall, this work represents a shift in graph generation methodologies, offering a more practical and efficient solution for generating diverse graphs with specific characteristics.
04:10pm Redefining AI Reasoning: From Self-Guided Exploration to Causal Loops, and Transformer-GNN Fusion
Martin Takáč (MBZUAI)
In this talk, we explore three intertwined directions that collectively redefine how AI systems reason about complex tasks. First, we introduce Self-Guided Exploration (SGE), a prompting strategy that enables Large Language Models (LLMs) to autonomously generate multiple “thought trajectories” for solving combinatorial problems. Through iterative decomposition and refinement, SGE delivers significant performance gains on NP-hard tasks—showcasing LLMs’ untapped potential in reasoning, logistics and resource management problems. Next, we delve into the Self-Referencing Causal Cycle (ReCall), a mechanism that sheds new light on LLMs’ ability to recall prior context from future tokens. Contrary to the common belief that unidirectional token generation fundamentally restricts memory, ReCall illustrates how “cycle tokens” create loops in the training data, enabling models to overcome the notorious “reversal curse.” Finally, we present a Transformer-GNN fusion architecture that addresses Transformers’ limitations in processing graph-structured data.
06:00pm Poster Session with Buffet at MBZUAI France Lab
To present a poster, please fill out the Google form for review.
Workshop participants are invited to join the poster session at MBZUAI France Lab.
Address: 42 Rue Notre Dame des Victoires, 75002 Paris