RIKEN AIP LLM×ML Workshop

Dates and Location

October 2–3, 2025.
Schedule

Day 1
October 2, 2025
Welcome and Opening Remarks
10:00 ~ 10:10
Session A 10:10 ~ 12:30
Session Chair: Zhen-Yu Zhang
10:10 ~ 10:45
Interpretability: What Do We Know and What Do We Want to Know?
Speaker: Benjamin Heinzerling
Natural Language Understanding Team
Abstract

The first half of the talk will give a high-level overview of the state of the art in LLM interpretability, covering representational analysis, sparse autoencoders, and circuits. The second half will address the question of what the goal of interpretability is (or should be).

10:45 ~ 11:20
Understanding Fact Recall in Language Models: Why Mixed Training Teaches Knowledge While Two-Stage Training Encourages Memorization
Speaker: Ying Zhang
Natural Language Understanding Team
Abstract

Fact recall, the ability of language models (LMs) to retrieve specific factual knowledge, remains a challenging task despite their impressive general capabilities. Common training strategies often struggle to promote robust recall behavior: two-stage training, which first trains a model on fact-storing examples (e.g., factual statements) and then on fact-recalling examples (question–answer pairs), tends to encourage rote memorization rather than generalizable fact retrieval. In contrast, mixed training, which jointly uses both types of examples, has been empirically shown to improve the ability to recall facts, but the underlying mechanisms are still poorly understood. In this work, we investigate how these training strategies shape model parameters during training and how these differences relate to the models' ability to recall facts. Our analysis reveals that mixed training encourages a larger and more centralized set of shared parameters that are strongly influenced by both fact-storing and fact-recalling examples. These findings suggest that the emergence of such shared parameters may play a key role in enabling LMs to generalize factual knowledge across task formulations.
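
As a toy illustration of the two regimes contrasted in the abstract (the example texts and helper names are stand-ins, not the talk's actual data construction):

```python
import random

# Toy stand-ins for fact-storing and fact-recalling examples; the actual
# data construction is described in the talk.
fact_statements = ["Tokyo is the capital of Japan."]
qa_pairs = [("What is the capital of Japan?", "Tokyo")]

def two_stage_schedule(statements, qa):
    """Two-stage training: stage 1 sees only fact-storing text,
    stage 2 sees only question-answer pairs."""
    stage1 = list(statements)
    stage2 = [f"Q: {q} A: {a}" for q, a in qa]
    return stage1, stage2  # trained one after the other

def mixed_schedule(statements, qa, seed=0):
    """Mixed training: both example types are shuffled together
    and trained jointly in a single stage."""
    pool = list(statements) + [f"Q: {q} A: {a}" for q, a in qa]
    random.Random(seed).shuffle(pool)
    return pool
```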

11:20 ~ 11:55
TopK Language Models
Speaker: Ryosuke Takahashi
Natural Language Understanding Team
Abstract

TBA.

11:55 ~ 12:30
Mechanistic Insights into Grokking from the Embedding Layer
Speaker: Hilal AlQuabeh
Natural Language Understanding Team
Abstract

Grokking, a delayed generalization in neural networks after perfect training performance, has been observed in Transformers and MLPs, but the components driving it remain underexplored. We show that embeddings are central to grokking: introducing them into MLPs induces delayed generalization in modular arithmetic tasks, whereas MLPs without embeddings can generalize immediately. Our analysis identifies two key mechanisms: (1) Embedding update dynamics, where rare tokens stagnate due to sparse gradient updates and weight decay, and (2) Bilinear coupling, where the interaction between embeddings and downstream weights introduces saddle points and increases sensitivity to initialization. To confirm these mechanisms, we investigate frequency-aware sampling, which balances token updates by minimizing gradient variance, and embedding-specific learning rates, derived from the asymmetric curvature of the bilinear loss landscape. We prove that an adaptive learning rate ratio, $\frac{\eta_E}{\eta_W} \propto \frac{\sigma_{\max}(E)}{\sigma_{\max}(W)} \cdot \frac{f_W}{f_E}$, mitigates bilinear coupling effects, accelerating convergence. Our methods not only improve grokking dynamics but also extend to broader challenges in Transformer optimization, where bilinear interactions hinder efficient training.
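
For intuition, a minimal NumPy sketch of the adaptive learning-rate ratio stated in the abstract; the proportionality constant (set to 1 here) and the frequency values are illustrative assumptions, not the talk's exact procedure:

```python
import numpy as np

def adaptive_lr_ratio(E, W, f_E, f_W):
    """Learning-rate ratio eta_E / eta_W from the abstract, taken with
    proportionality constant 1: (sigma_max(E) / sigma_max(W)) * (f_W / f_E),
    where sigma_max is the largest singular value and f_E, f_W are the
    update frequencies of the embedding and downstream weights."""
    sigma_E = np.linalg.svd(E, compute_uv=False)[0]
    sigma_W = np.linalg.svd(W, compute_uv=False)[0]
    return (sigma_E / sigma_W) * (f_W / f_E)

# Toy usage: a 97-token embedding table, as in modular arithmetic mod 97.
rng = np.random.default_rng(0)
E = rng.normal(size=(97, 128))   # embedding matrix
W = rng.normal(size=(128, 97))   # downstream weights
eta_W = 1e-3
eta_E = adaptive_lr_ratio(E, W, f_E=0.1, f_W=1.0) * eta_W
```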

Discussion over Lunch
12:30 ~ 13:30
Session B 13:30 ~ 15:15
Session Chair: Wei Huang
13:30 ~ 14:05
Select Before Use: On the Importance of Base Model Selection in Preference Alignment
Speaker: Muyang Li
Imperfect Information Learning Team
Abstract

The post-training stage of Large Language Models (LLMs) typically involves Supervised Fine-Tuning (SFT) followed by preference alignment to ensure that models generate safe, helpful, and instruction-aligned content. The SFT model critically serves as both the initialization and the reference policy for subsequent preference alignment. However, an essential yet often neglected question is which SFT checkpoint is optimal for this role. In this presentation, we demonstrate that the choice of SFT checkpoint significantly impacts final aligned performance. Furthermore, the conventional practice of selecting the SFT checkpoint with the minimum validation loss often fails to identify the optimal starting point for achieving maximal performance after preference alignment. We attribute this to a fundamental objective conflict between SFT's focus on verbatim fitting and preference alignment's goal of enhancing response discriminability. A naive solution would be to exhaustively evaluate all candidate SFT checkpoints through the full preference alignment process, which is computationally prohibitive. To this end, we propose Margin Score, a simple yet efficient metric that estimates the initial discriminability between the chosen and rejected responses in the preference data, thereby gauging the difficulty of adapting an SFT model for preference alignment. Empirical evidence suggests that using our selected model as the reference yields a 23.4% relative improvement in length-controlled win rate on the popular Zephyr recipe compared to existing techniques.
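
For intuition, a minimal sketch of one plausible form of such a margin: the SFT model's average log-probability gap between chosen and rejected responses. It assumes a Hugging Face-style causal LM and tokenizer; the exact definition of Margin Score is given in the talk.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def response_logprob(model, tokenizer, prompt, response):
    """Sum of log-probabilities the model assigns to `response` given `prompt`."""
    ids = tokenizer(prompt + response, return_tensors="pt").input_ids
    n_prompt = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    logits = model(ids).logits[:, :-1]  # position t predicts token t+1
    logps = F.log_softmax(logits, dim=-1)
    token_logps = logps.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return token_logps[0, n_prompt - 1:].sum().item()  # response tokens only

def margin_score(model, tokenizer, preference_pairs):
    """Average chosen-vs-rejected log-prob margin of the SFT model."""
    margins = [
        response_logprob(model, tokenizer, p, chosen)
        - response_logprob(model, tokenizer, p, rejected)
        for p, chosen, rejected in preference_pairs
    ]
    return sum(margins) / len(margins)
```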

14:05 ~ 14:40
Unsupervised Prompt Learning with Few-shot Examples for Answering Objective Questions
Speaker: Zhen-Yu Zhang
Imperfect Information Learning Team
Abstract

TBA.

14:40 ~ 15:15
Physics-Informed Large Language Models for Power Grid Optimization (Online)
Speaker: Salah Ghamizi
Imperfect Information Learning Team
Abstract

Efficiently solving Optimal Power Flow (OPF) problems in power systems is crucial for operational planning and grid management. There is a growing need for scalable algorithms capable of handling the increasing variability, constraints, and uncertainties in modern power networks while providing accurate and fast solutions. To address this, machine learning techniques, particularly Graph Neural Networks (GNNs), have emerged as promising approaches. This work introduces PowerGraph-LLM, the first framework explicitly designed for solving OPF problems using Large Language Models (LLMs). The proposed approach combines graph and tabular representations of power grids to effectively query LLMs, capturing the complex relationships and constraints in power systems. A new implementation of in-context learning and fine-tuning protocols for LLMs is introduced, tailored specifically to the OPF problem. PowerGraph-LLM demonstrates reliable performance using off-the-shelf LLMs. Our study reveals the impact of LLM architecture, size, and fine-tuning, and demonstrates our framework's ability to handle realistic grid components and constraints.
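
As a toy illustration only (the actual PowerGraph-LLM encoding and query protocols are defined in the work), the sketch below serializes a two-bus grid into the kind of combined graph/tabular prompt the abstract describes:

```python
# Toy two-bus grid; the real framework handles realistic components and constraints.
buses = [
    {"bus": 1, "type": "slack", "Pd_MW": 0.0},
    {"bus": 2, "type": "PQ", "Pd_MW": 90.0},
]
lines = [{"from": 1, "to": 2, "x_pu": 0.10, "limit_MW": 120.0}]

bus_table = "bus type Pd_MW\n" + "\n".join(
    f"{b['bus']} {b['type']} {b['Pd_MW']}" for b in buses
)
edge_list = "\n".join(
    f"{l['from']}->{l['to']} x={l['x_pu']} limit={l['limit_MW']}" for l in lines
)
prompt = (
    "Solve the optimal power flow for the following grid.\n"
    f"Buses:\n{bus_table}\nLines:\n{edge_list}\n"
    "Return the generator dispatch that minimizes cost subject to line limits."
)
```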

Discussion over Tea/Coffee
15:15 ~ 16:00
Session C 16:00 ~ 17:10
Session Chair: Gang Niu
16:00 ~ 16:35
Trained Mamba Emulates Online Gradient Descent in In-Context Linear Regression
Speaker: Wei Huang
Deep Learning Theory Team
Abstract

State-space models (SSMs), particularly Mamba, have emerged as an efficient alternative to Transformers, offering linear complexity for long-sequence modeling. Recent empirical work demonstrates that Mamba's in-context learning (ICL) capabilities are competitive with those of Transformers, a critical capacity for large foundation models. However, theoretical understanding of Mamba's ICL remains limited, restricting deeper insight into its underlying mechanisms. Even fundamental tasks such as linear regression ICL, widely studied as a standard theoretical benchmark for Transformers, have not been thoroughly analyzed in the context of Mamba. To address this gap, we study the training dynamics of Mamba on the linear regression ICL task. By developing novel techniques for the non-convex gradient-descent optimization induced by Mamba's structure, we establish an exponential convergence rate to the ICL solution and derive a loss bound comparable to that of Transformers. Importantly, our results reveal that Mamba can perform a variant of online gradient descent to learn the latent function in context. This mechanism differs from that of the Transformer, which is typically understood to achieve ICL through gradient descent emulation. The theoretical results are verified by experimental simulations.
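
The claimed mechanism is easy to state concretely; below is a minimal NumPy sketch of online gradient descent over the in-context (x_i, y_i) pairs of a linear-regression prompt, the procedure the trained Mamba is shown to emulate (dimensions and step size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 64
w_star = rng.normal(size=d)        # latent function to be learned in context
X = rng.normal(size=(n, d))        # in-context inputs
y = X @ w_star                     # in-context labels

w, eta = np.zeros(d), 0.1
for x_i, y_i in zip(X, y):         # one update per context example
    w += eta * (y_i - w @ x_i) * x_i   # online GD step on the squared loss

x_query = rng.normal(size=d)
prediction = w @ x_query           # prediction for the query input
```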

16:35 ~ 17:10
Primacy and Recency Effect in Mamba: A Mechanistic Perspective (Online)
Speaker: Muhammad Cendekia Airlangga
Natural Language Understanding Team
Abstract

TBA.

Self-paid Social Gathering at 釣宿酒場マヅメ 日本橋店
18:00 ~
Day 2
October 3, 2025
Session D 10:00 ~ 11:45
Session Chair: Benjamin Tobias Heinzerling
10:00 ~ 10:35
Fast Substructure Search on JSONL for Structured RAG
Speaker: Yasuo Tabei
Succinct Information Processing Team
Abstract

Substructure search in JSONL datasets is indispensable for realizing structured Retrieval-Augmented Generation (RAG) in foundation model applications. However, existing methods incur prohibitively high computational costs. In this work, we propose jXBW, a fast substructure search method that introduces a merged tree representation, a succinct data structure based on the extended Burrows–Wheeler Transform, and a three-step search algorithm. Experimental results demonstrate that jXBW achieves up to 4,700× speedup over tree-based methods and more than 6×10⁶× speedup over XML-based processing.
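
For context, a naive baseline makes the problem concrete: scan every JSONL record and test substructure containment recursively. This linear scan, sketched below under one plausible containment semantics (the exact semantics used by jXBW are defined in the work), is the cost jXBW's succinct index avoids.

```python
import json

def contains(record, query):
    """True if `query` is a substructure of `record`: every key of a query
    object, and every element of a query array, must match recursively."""
    if isinstance(query, dict):
        return isinstance(record, dict) and all(
            k in record and contains(record[k], v) for k, v in query.items()
        )
    if isinstance(query, list):
        return isinstance(record, list) and all(
            any(contains(r, q) for r in record) for q in query
        )
    return record == query

def naive_search(jsonl_path, query):
    """Parse and test every record; this full scan is the cost jXBW avoids."""
    with open(jsonl_path) as f:
        return [i for i, line in enumerate(f) if contains(json.loads(line), query)]
```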

10:35 ~ 11:10
Best-of-N Mixtures in the Infinite-N Limit
Speaker: Junpei Komiyama
Sequential Decision Making Team
Abstract

We study best-of-$N$ for large language models (LLMs), where the selection is based on majority voting. In particular, we analyze the limit $N \to \infty$, which we denote Bo$\infty$. While this approach achieves impressive performance in the limit, it requires an infinite test-time budget. To address this, we propose an adaptive generation scheme that selects $N$ based on answer agreement, thereby efficiently allocating inference-time computation. Beyond adaptivity, we extend the framework to weighted ensembles of multiple LLMs, showing that such mixtures can outperform any individual model. The optimal ensemble weighting is formulated as a mixed-integer linear program. Extensive experiments demonstrate the effectiveness of our approach: for example, an ensemble of LLMs each with Bo1 accuracy $\le 75\%$ achieves up to 93% accuracy on AIME2025.
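
A minimal sketch of agreement-based adaptive sampling in the spirit of the proposed scheme (the concrete stopping rule and thresholds here are illustrative stand-ins, not the talk's exact criterion):

```python
import random
from collections import Counter

def adaptive_best_of_n(generate, n_min=4, n_max=64, agree=0.9):
    """Keep sampling answers until the majority answer's vote share reaches
    `agree` (after at least n_min samples) or the budget n_max is spent."""
    votes = Counter()
    for n in range(1, n_max + 1):
        votes[generate()] += 1
        answer, count = votes.most_common(1)[0]
        if n >= n_min and count / n >= agree:
            break
    return answer

# Toy usage: a mock "model" that answers correctly 60% of the time.
mock_llm = lambda: "42" if random.random() < 0.6 else str(random.randint(0, 9))
print(adaptive_best_of_n(mock_llm))
```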

11:10 ~ 11:45
Seeing Is Believing, But How Much? A Comprehensive Analysis of Verbalized Calibration in Vision-Language Models
Speaker: Weihao Xuan
Geoinformatics Team
Abstract

Uncertainty quantification is essential for assessing the reliability and trustworthiness of modern AI systems. Among existing approaches, verbalized uncertainty, where models express their confidence through natural language, has emerged as a lightweight and interpretable solution in large language models (LLMs). However, its effectiveness in vision-language models (VLMs) remains insufficiently studied. In this work, we conduct a comprehensive evaluation of verbalized confidence in VLMs, spanning three model categories, four task domains, and three evaluation scenarios. Our results show that current VLMs often display notable miscalibration across diverse tasks and settings. Notably, visual reasoning models (i.e., thinking with images) consistently exhibit better calibration, suggesting that modality-specific reasoning is critical for reliable uncertainty estimation. To further address calibration challenges, we introduce Visual Confidence-Aware Prompting, a two-stage prompting strategy that improves confidence alignment in multimodal settings. Overall, our study highlights the inherent miscalibration in VLMs across modalities. More broadly, our findings underscore the fundamental importance of modality alignment and model faithfulness in advancing reliable multimodal systems.
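
Miscalibration of the kind described above is commonly quantified with the standard expected calibration error (ECE) over verbalized confidences; a minimal NumPy sketch, assuming confidences normalized to [0, 1] (the talk's exact evaluation protocol may differ):

```python
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    """Bin predictions by stated confidence and average the |accuracy -
    mean confidence| gap per bin, weighted by the bin's share of samples."""
    conf = np.asarray(conf, dtype=float)        # verbalized confidences in [0, 1]
    correct = np.asarray(correct, dtype=float)  # 1 if the answer was right
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf >= lo) & (conf <= hi) if lo == 0.0 else (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece

# Example: stated 90% confidence but only 50% accuracy -> large ECE.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 0, 1, 0]))
```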

Discussion over Lunch
11:45 ~ 13:15
Session E 13:15 ~ 16:20
Session Chair: Yuning Qiu
13:15 ~ 13:50
Variational Training for Improving and Understanding LLMs
Speaker: Thomas Möllenhoff
Adaptive Bayesian Intelligence Team
Abstract

Variational training with the Improved Variational Online Newton method (IVON) can consistently match or outperform Adam for neural networks such as GPT-2. IVON's computational cost is nearly identical to Adam's, yet it computes Bayesian weight uncertainty on top for free, which is useful in many downstream tasks. The talk gives a broad overview of several projects that have used IVON since its publication last year. In particular, I will discuss a theoretical analysis of IVON's promising performance and its applications to LLM fine-tuning, language generation, sensitivity analysis via influence functions, and possible use cases in mechanistic interpretability.
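
Very schematically, the variational-online-Newton idea looks as follows. This sketch uses a crude squared-gradient curvature estimate and illustrative hyperparameters; it is not the exact published IVON update.

```python
import numpy as np

def von_step(m, h, grad_fn, lr=0.1, beta=0.9, delta=1e-4, ess=1e3, rng=None):
    """One schematic variational-online-Newton step on a diagonal Gaussian
    posterior N(m, s^2): sample weights, take a gradient there, update a
    running curvature estimate h, and precondition the mean update with it.
    The posterior variance comes from h at no extra cost."""
    rng = rng or np.random.default_rng()
    s = 1.0 / np.sqrt(ess * (h + delta))   # posterior std from curvature
    w = m + s * rng.normal(size=m.shape)   # sample weights from the posterior
    g = grad_fn(w)                         # gradient at the sampled weights
    h = beta * h + (1 - beta) * g**2       # crude squared-gradient curvature
    m = m - lr * (g + delta * m) / (h + delta)
    return m, h
```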

13:50 ~ 14:25
Variational Model Merging
Speaker: Hugo Monzon
Adaptive Bayesian Intelligence Team
Abstract

TBA.

Discussion over Tea/Coffee
14:25 ~ 15:10
Session E (continued)
Session Chair: Yuning Qiu
15:10 ~ 15:45
SPIRIT: Patching Speech Language Models against Jailbreak Attacks (Online)
Speaker: Nurdaulet Mukhituly
Natural Language Understanding Team
Abstract

TBA.

15:45 ~ 16:20
Interpreting LLMs' Processing of Various Inputs: Repetition, Typo, and Spelling-out (Online)
Speaker: Tatsuya Hiraoka
Natural Language Understanding Team
Abstract

TBA.

Closing Remarks
16:20 ~ 16:30

Organizers

For any questions please contact Zhen-Yu Zhang (zhen-yu.zhang@riken.jp) or Benjamin T. Heinzerling (benjamin.heinzerling@riken.jp).

Acknowledgment

We gratefully acknowledge support from RIKEN AIP and all participating teams.