RIKEN AIP LLM×ML Workshop

Dates and Location

October 2–3, 2025.
Schedule

Day 1
October 2, 2025
Welcome and Opening Remarks
10:00 ~ 10:10
Session A 10:10 ~ 12:30
Session Chair: Zhen-Yu Zhang
10:10 ~ 10:45
Interpretability: What Do We Know and What Do We Want to Know?
Speaker: Benjamin Heinzerling
Natural Language Understanding Team
Abstract

The first half of the talk will give a high-level overview of the state of the art in LLM interpretability, covering representational analysis, sparse autoencoders, and circuits. The second half will address the question of what the goal of interpretability is (or should be).

10:45 ~ 11:20
Understanding Fact Recall in Language Models: Why Mixed Training Teaches Knowledge While Two-Stage Training Encourages Memorization
Speaker: Ying Zhang
Natural Language Understanding Team
Abstract

Fact recall, the ability of language models (LMs) to retrieve specific factual knowledge, remains a challenging task despite their impressive general capabilities. Common training strategies often struggle to promote robust recall behavior: two-stage training, which first trains a model on fact-storing examples (e.g., factual statements) and then on fact-recalling examples (question–answer pairs), tends to encourage rote memorization rather than generalizable fact retrieval. In contrast, mixed training, which jointly uses both types of examples, has been empirically shown to improve the ability to recall facts, but the underlying mechanisms are still poorly understood. In this work, we investigate how these training strategies shape model parameters during training and how these differences relate to the models' ability to recall facts. Our analysis reveals that mixed training encourages a larger and more centralized set of shared parameters that are strongly influenced by both fact-storing and fact-recalling examples. These findings suggest that the emergence of such shared parameters may play a key role in enabling LMs to generalize factual knowledge across task formulations.
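
As a toy illustration of the two regimes contrasted in the abstract (the example texts and helper names are stand-ins, not the talk's actual data construction):

```python
import random

# Toy stand-ins for fact-storing and fact-recalling examples; the actual
# data construction is described in the talk.
fact_statements = ["Tokyo is the capital of Japan."]
qa_pairs = [("What is the capital of Japan?", "Tokyo")]

def two_stage_schedule(statements, qa):
    """Two-stage training: stage 1 sees only fact-storing text,
    stage 2 sees only question-answer pairs."""
    stage1 = list(statements)
    stage2 = [f"Q: {q} A: {a}" for q, a in qa]
    return stage1, stage2  # trained one after the other

def mixed_schedule(statements, qa, seed=0):
    """Mixed training: both example types are shuffled together
    and trained jointly in a single stage."""
    pool = list(statements) + [f"Q: {q} A: {a}" for q, a in qa]
    random.Random(seed).shuffle(pool)
    return pool
```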

11:20 ~ 11:55
TopK Language Models
Speaker: Ryosuke Takahashi
Natural Language Understanding Team
Abstract

TBA.

11:55 ~ 12:30
Mechanistic Insights into Grokking from the Embedding Layer
Speaker: Hilal AlQuabeh
Natural Language Understanding Team
Abstract

Grokking, a delayed generalization in neural networks after perfect training performance, has been observed in Transformers and MLPs, but the components driving it remain underexplored. We show that embeddings are central to grokking: introducing them into MLPs induces delayed generalization in modular arithmetic tasks, whereas MLPs without embeddings can generalize immediately. Our analysis identifies two key mechanisms: (1) Embedding update dynamics, where rare tokens stagnate due to sparse gradient updates and weight decay, and (2) Bilinear coupling, where the interaction between embeddings and downstream weights introduces saddle points and increases sensitivity to initialization. To confirm these mechanisms, we investigate frequency-aware sampling, which balances token updates by minimizing gradient variance, and embedding-specific learning rates, derived from the asymmetric curvature of the bilinear loss landscape. We prove that an adaptive learning rate ratio, $\frac{\eta_E}{\eta_W} \propto \frac{\sigma_{\max}(E)}{\sigma_{\max}(W)} \cdot \frac{f_W}{f_E}$, mitigates bilinear coupling effects, accelerating convergence. Our methods not only improve grokking dynamics but also extend to broader challenges in Transformer optimization, where bilinear interactions hinder efficient training.
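
For intuition, a minimal NumPy sketch of the adaptive learning-rate ratio stated in the abstract; the proportionality constant (set to 1 here) and the frequency values are illustrative assumptions, not the talk's exact procedure:

```python
import numpy as np

def adaptive_lr_ratio(E, W, f_E, f_W):
    """Learning-rate ratio eta_E / eta_W from the abstract, taken with
    proportionality constant 1: (sigma_max(E) / sigma_max(W)) * (f_W / f_E),
    where sigma_max is the largest singular value and f_E, f_W are the
    update frequencies of the embedding and downstream weights."""
    sigma_E = np.linalg.svd(E, compute_uv=False)[0]
    sigma_W = np.linalg.svd(W, compute_uv=False)[0]
    return (sigma_E / sigma_W) * (f_W / f_E)

# Toy usage: a 97-token embedding table, as in modular arithmetic mod 97.
rng = np.random.default_rng(0)
E = rng.normal(size=(97, 128))   # embedding matrix
W = rng.normal(size=(128, 97))   # downstream weights
eta_W = 1e-3
eta_E = adaptive_lr_ratio(E, W, f_E=0.1, f_W=1.0) * eta_W
```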

Discussion over Lunch
12:30 ~ 13:30
Session B 13:30 ~ 15:15
Session Chair: Wei Huang
13:30 ~ 14:05
Select Before Use: On the Importance of Base Model Selection in Preference Alignment
Speaker: Muyang Li
Imperfect Information Learning Team
Abstract

The post-training stage of Large Language Models (LLMs) typically involves Supervised Fine-Tuning (SFT) followed by preference alignment to ensure that models generate safe, helpful, and instruction-aligned content. The SFT model critically serves as both the initialization and the reference policy for subsequent preference alignment. However, an essential yet often neglected question is which SFT checkpoint is optimal for this role. In this presentation, we demonstrate that the choice of SFT checkpoint significantly impacts final aligned performance. Furthermore, the conventional practice of selecting the SFT checkpoint with the minimum validation loss often fails to identify the optimal starting point for achieving maximal performance after preference alignment. We attribute this to a fundamental objective conflict between SFT's focus on verbatim fitting and preference alignment's goal of enhancing response discriminability. A naive solution would be to exhaustively evaluate all candidate SFT checkpoints through the full preference alignment process, which is computationally prohibitive. To this end, we propose Margin Score, a simple yet efficient metric that estimates the initial discriminability between the chosen and rejected responses in the preference data, thereby gauging the difficulty of adapting an SFT model for preference alignment. Empirical evidence suggests that using our selected model as the reference yields a 23.4% relative improvement in length-controlled win rate on the popular Zephyr recipe compared to existing techniques.
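
For intuition, a minimal sketch of one plausible form of such a margin: the SFT model's average log-probability gap between chosen and rejected responses. It assumes a Hugging Face-style causal LM and tokenizer; the exact definition of Margin Score is given in the talk.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def response_logprob(model, tokenizer, prompt, response):
    """Sum of log-probabilities the model assigns to `response` given `prompt`."""
    ids = tokenizer(prompt + response, return_tensors="pt").input_ids
    n_prompt = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    logits = model(ids).logits[:, :-1]  # position t predicts token t+1
    logps = F.log_softmax(logits, dim=-1)
    token_logps = logps.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return token_logps[0, n_prompt - 1:].sum().item()  # response tokens only

def margin_score(model, tokenizer, preference_pairs):
    """Average chosen-vs-rejected log-prob margin of the SFT model."""
    margins = [
        response_logprob(model, tokenizer, p, chosen)
        - response_logprob(model, tokenizer, p, rejected)
        for p, chosen, rejected in preference_pairs
    ]
    return sum(margins) / len(margins)
```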

14:05 ~ 14:40
Unsupervised Prompt Learning with Few-shot Examples for Answering Objective Questions
Speaker: Zhen-Yu Zhang
Imperfect Information Learning Team
Abstract

TBA.

14:40 ~ 15:15
Physics-Informed Large Language Models for Power Grid Optimization (Online)
Speaker: Salah Ghamizi
Imperfect Information Learning Team
Abstract

Efficiently solving Optimal Power Flow (OPF) problems in power systems is crucial for operational planning and grid management. There is a growing need for scalable algorithms capable of handling the increasing variability, constraints, and uncertainties in modern power networks while providing accurate and fast solutions. To address this, machine learning techniques, particularly Graph Neural Networks (GNNs), have emerged as promising approaches. This work introduces PowerGraph-LLM, the first framework explicitly designed for solving OPF problems using Large Language Models (LLMs). The proposed approach combines graph and tabular representations of power grids to effectively query LLMs, capturing the complex relationships and constraints in power systems. A new implementation of in-context learning and fine-tuning protocols for LLMs is introduced, tailored specifically to the OPF problem. PowerGraph-LLM demonstrates reliable performance using off-the-shelf LLMs. Our study reveals the impact of LLM architecture, size, and fine-tuning, and demonstrates our framework's ability to handle realistic grid components and constraints.
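
As a toy illustration only (the actual PowerGraph-LLM encoding and query protocols are defined in the work), the sketch below serializes a two-bus grid into the kind of combined graph/tabular prompt the abstract describes:

```python
# Toy two-bus grid; the real framework handles realistic components and constraints.
buses = [
    {"bus": 1, "type": "slack", "Pd_MW": 0.0},
    {"bus": 2, "type": "PQ", "Pd_MW": 90.0},
]
lines = [{"from": 1, "to": 2, "x_pu": 0.10, "limit_MW": 120.0}]

bus_table = "bus type Pd_MW\n" + "\n".join(
    f"{b['bus']} {b['type']} {b['Pd_MW']}" for b in buses
)
edge_list = "\n".join(
    f"{l['from']}->{l['to']} x={l['x_pu']} limit={l['limit_MW']}" for l in lines
)
prompt = (
    "Solve the optimal power flow for the following grid.\n"
    f"Buses:\n{bus_table}\nLines:\n{edge_list}\n"
    "Return the generator dispatch that minimizes cost subject to line limits."
)
```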

Discussion over Tea/Coffee
15:15 ~ 16:00
Session C 16:00 ~ 17:10
Session Chair: Gang Niu
16:00 ~ 16:35
Trained Mamba Emulates Online Gradient Descent in In-Context Linear Regression
Speaker: Wei Huang
Deep Learning Theory Team
Abstract

State-space models (SSMs), particularly Mamba, have emerged as an efficient alternative to Transformers, offering linear complexity for long-sequence modeling. Recent empirical work demonstrates that Mamba's in-context learning (ICL) capabilities are competitive with those of Transformers, a critical capacity for large foundation models. However, theoretical understanding of Mamba's ICL remains limited, restricting deeper insight into its underlying mechanisms. Even fundamental tasks such as linear regression ICL, widely studied as a standard theoretical benchmark for Transformers, have not been thoroughly analyzed in the context of Mamba. To address this gap, we study the training dynamics of Mamba on the linear regression ICL task. By developing novel techniques for the non-convex gradient-descent optimization induced by Mamba's structure, we establish an exponential convergence rate to the ICL solution and derive a loss bound comparable to that of Transformers. Importantly, our results reveal that Mamba can perform a variant of online gradient descent to learn the latent function in context. This mechanism differs from that of the Transformer, which is typically understood to achieve ICL through gradient descent emulation. The theoretical results are verified by experimental simulations.
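
The claimed mechanism is easy to state concretely; below is a minimal NumPy sketch of online gradient descent over the in-context (x_i, y_i) pairs of a linear-regression prompt, the procedure the trained Mamba is shown to emulate (dimensions and step size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 64
w_star = rng.normal(size=d)        # latent function to be learned in context
X = rng.normal(size=(n, d))        # in-context inputs
y = X @ w_star                     # in-context labels

w, eta = np.zeros(d), 0.1
for x_i, y_i in zip(X, y):         # one update per context example
    w += eta * (y_i - w @ x_i) * x_i   # online GD step on the squared loss

x_query = rng.normal(size=d)
prediction = w @ x_query           # prediction for the query input
```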

16:35 ~ 17:10
Primacy and Recency Effect in Mamba: A Mechanistic Perspective (Online)
Speaker: Muhammad Cendekia Airlangga
Natural Language Understanding Team
Abstract

TBA.

Self-paid Social Gathering at 釣宿酒場マヅメ 日本橋店
18:00 ~
Day 2
October 3, 2025
Session D 10:00 ~ 11:45
Session Chair: Benjamin Tobias Heinzerling
10:00 ~ 10:35
Fast Substructure Search on JSONL for Structured RAG
Speaker: Yasuo Tabei
Succinct Information Processing Team
Abstract

Substructure search in JSONL datasets is indispensable for realizing structured Retrieval-Augmented Generation (RAG) in foundation model applications. However, existing methods incur prohibitively high computational costs. In this work, we propose jXBW, a fast substructure search method that introduces a merged tree representation, a succinct data structure based on the extended Burrows–Wheeler Transform, and a three-step search algorithm. Experimental results demonstrate that jXBW achieves up to 4,700× speedup over tree-based methods and more than 6×10⁶× speedup over XML-based processing.
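
For context, a naive baseline makes the problem concrete: scan every JSONL record and test substructure containment recursively. This linear scan, sketched below under one plausible containment semantics (the exact semantics used by jXBW are defined in the work), is the cost jXBW's succinct index avoids.

```python
import json

def contains(record, query):
    """True if `query` is a substructure of `record`: every key of a query
    object, and every element of a query array, must match recursively."""
    if isinstance(query, dict):
        return isinstance(record, dict) and all(
            k in record and contains(record[k], v) for k, v in query.items()
        )
    if isinstance(query, list):
        return isinstance(record, list) and all(
            any(contains(r, q) for r in record) for q in query
        )
    return record == query

def naive_search(jsonl_path, query):
    """Parse and test every record; this full scan is the cost jXBW avoids."""
    with open(jsonl_path) as f:
        return [i for i, line in enumerate(f) if contains(json.loads(line), query)]
```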

10:35 ~ 11:10
Best-of-N Mixtures in the Infinite-N Limit
Speaker: Junpei Komiyama
Sequential Decision Making Team
Abstract

We study best-of-$N$ for large language models (LLMs), where the selection is based on majority voting. In particular, we analyze the limit $N \to \infty$, which we denote Bo$\infty$. While this approach achieves impressive performance in the limit, it requires an infinite test-time budget. To address this, we propose an adaptive generation scheme that selects $N$ based on answer agreement, thereby efficiently allocating inference-time computation. Beyond adaptivity, we extend the framework to weighted ensembles of multiple LLMs, showing that such mixtures can outperform any individual model. The optimal ensemble weighting is formulated as a mixed-integer linear program. Extensive experiments demonstrate the effectiveness of our approach: for example, an ensemble of LLMs each with Bo1 accuracy $\le 75\%$ achieves up to 93% accuracy on AIME2025.
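
A minimal sketch of agreement-based adaptive sampling in the spirit of the proposed scheme (the concrete stopping rule and thresholds here are illustrative stand-ins, not the talk's exact criterion):

```python
import random
from collections import Counter

def adaptive_best_of_n(generate, n_min=4, n_max=64, agree=0.9):
    """Keep sampling answers until the majority answer's vote share reaches
    `agree` (after at least n_min samples) or the budget n_max is spent."""
    votes = Counter()
    for n in range(1, n_max + 1):
        votes[generate()] += 1
        answer, count = votes.most_common(1)[0]
        if n >= n_min and count / n >= agree:
            break
    return answer

# Toy usage: a mock "model" that answers correctly 60% of the time.
mock_llm = lambda: "42" if random.random() < 0.6 else str(random.randint(0, 9))
print(adaptive_best_of_n(mock_llm))
```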

11:10 ~ 11:45
Seeing Is Believing, But How Much? A Comprehensive Analysis of Verbalized Calibration in Vision-Language Models
Speaker: Weihao Xuan
Geoinformatics Team
Abstract

Uncertainty quantification is essential for assessing the reliability and trustworthiness of modern AI systems. Among existing approaches, verbalized uncertainty, where models express their confidence through natural language, has emerged as a lightweight and interpretable solution in large language models (LLMs). However, its effectiveness in vision-language models (VLMs) remains insufficiently studied. In this work, we conduct a comprehensive evaluation of verbalized confidence in VLMs, spanning three model categories, four task domains, and three evaluation scenarios. Our results show that current VLMs often display notable miscalibration across diverse tasks and settings. Notably, visual reasoning models (i.e., thinking with images) consistently exhibit better calibration, suggesting that modality-specific reasoning is critical for reliable uncertainty estimation. To further address calibration challenges, we introduce Visual Confidence-Aware Prompting, a two-stage prompting strategy that improves confidence alignment in multimodal settings. Overall, our study highlights the inherent miscalibration in VLMs across modalities. More broadly, our findings underscore the fundamental importance of modality alignment and model faithfulness in advancing reliable multimodal systems.
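
Miscalibration of the kind described above is commonly quantified with the standard expected calibration error (ECE) over verbalized confidences; a minimal NumPy sketch, assuming confidences normalized to [0, 1] (the talk's exact evaluation protocol may differ):

```python
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    """Bin predictions by stated confidence and average the |accuracy -
    mean confidence| gap per bin, weighted by the bin's share of samples."""
    conf = np.asarray(conf, dtype=float)        # verbalized confidences in [0, 1]
    correct = np.asarray(correct, dtype=float)  # 1 if the answer was right
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf >= lo) & (conf <= hi) if lo == 0.0 else (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece

# Example: stated 90% confidence but only 50% accuracy -> large ECE.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 0, 1, 0]))
```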

Discussion over Lunch
11:45 ~ 13:15
Session E 13:15 ~ 16:20
Session Chair: Yuning Qiu
13:15 ~ 13:50
Variational Training for Improving and Understanding LLMs
Speaker: Thomas Möllenhoff
Adaptive Bayesian Intelligence Team
Abstract

Variational training with the Improved Variational Online Newton method (IVON) can consistently match or outperform Adam for neural networks such as GPT-2. IVON's computational cost is nearly identical to Adam's, yet it computes Bayesian weight uncertainty on top for free, which is useful in many downstream tasks. The talk gives a broad overview of several projects that have used IVON since its publication last year. In particular, I will discuss a theoretical analysis of IVON's promising performance and its applications to LLM fine-tuning, language generation, sensitivity analysis via influence functions, and possible use cases in mechanistic interpretability.
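
Very schematically, the variational-online-Newton idea looks as follows. This sketch uses a crude squared-gradient curvature estimate and illustrative hyperparameters; it is not the exact published IVON update.

```python
import numpy as np

def von_step(m, h, grad_fn, lr=0.1, beta=0.9, delta=1e-4, ess=1e3, rng=None):
    """One schematic variational-online-Newton step on a diagonal Gaussian
    posterior N(m, s^2): sample weights, take a gradient there, update a
    running curvature estimate h, and precondition the mean update with it.
    The posterior variance comes from h at no extra cost."""
    rng = rng or np.random.default_rng()
    s = 1.0 / np.sqrt(ess * (h + delta))   # posterior std from curvature
    w = m + s * rng.normal(size=m.shape)   # sample weights from the posterior
    g = grad_fn(w)                         # gradient at the sampled weights
    h = beta * h + (1 - beta) * g**2       # crude squared-gradient curvature
    m = m - lr * (g + delta * m) / (h + delta)
    return m, h
```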

13:50 ~ 14:25
Variational Model Merging
Speaker: Hugo Monzon
Adaptive Bayesian Intelligence Team
Abstract

TBA.

Discussion over Tea/Coffee
14:25 ~ 15:10
Session E (continued)
Session Chair: Yuning Qiu
15:10 ~ 15:45
SPIRIT: Patching Speech Language Models against Jailbreak Attacks (Online)
Speaker: Nurdaulet Mukhituly
Natural Language Understanding Team
Abstract

TBA.

15:45 ~ 16:20
Interpreting LLMs' Processing of Various Inputs: Repetition, Typo, and Spelling-out (Online)
Speaker: Tatsuya Hiraoka
Natural Language Understanding Team
Abstract

TBA.

Closing Remarks
16:20 ~ 16:30

Organizers

For any questions please contact Zhen-Yu Zhang (zhen-yu.zhang@riken.jp) or Benjamin T. Heinzerling (benjamin.heinzerling@riken.jp).

Acknowledgment

We gratefully acknowledge support from RIKEN AIP and all participating teams.