Research | AIML@K AI + Math Lab at Korea University

Mathematical Foundations of AI

Sat, 20 Jun 2026 00:00:00 +0000

Overview

Modern AI systems work remarkably well, yet why they work remains poorly understood from both algorithmic and scientific perspectives. This theme builds mathematical foundations for learning systems, treating optimization landscapes, training dynamics, and generalization as objects we can analyze rather than merely observe.

Our results span several fronts. On the optimization side, we have studied how training can bypass stationary points and explored fractional-order gradient methods. On model structure, we develop principled structured pruning, using bifurcation dynamics and projective geometry to decide which edges a network can lose without changing its functional output. On the representation learning side, we have analyzed feature learning as a form of covariance learning and studied the emergent linear separability of features in a network’s last layer. The throughline is turning empirical phenomena into theory that predicts the behavior of deep learning models, guides better model design, and deepens our understanding of why modern AI works.

Core Questions

What governs optimization dynamics, and when can training escape or bypass bad critical points?
How much of a network is truly necessary, and how do we prune with provable structure rather than heuristics?
How do useful features and clean geometric structure emerge in representations during training?

Representative Work

Bypassing Stationary Points in Training Deep Learning Models — IEEE TNNLS, 2024
Curse of Smoothness in Functional Neural Networks — IEEE Signal Processing Letters, 2025
Catalyst: Structured Pruning with Robust Bifurcation Dynamics — ICML 2025 HiLD Workshop
Feature Learning as a Virtual Covariance Learning — NeurIPS 2025 OPT Workshop
Emergent Linear Separability of Unseen Data Points in High-dimensional Last-Layer Feature Space — ICML 2025 HiLD Workshop

See all work on the Publications page.

Related

Logs tagged Math4AI
Events and reading groups on learning theory

People

Donghun Lee — Principal Investigator
Taehun Cha — feature learning, last-layer geometry
Jaeheun Jung — structured pruning, training dynamics
Bosung Jung — optimization, unlearning

See People for the full lab.

Applied AI Systems

Fri, 19 Jun 2026 00:00:00 +0000

Overview

This theme is where our research meets real problems with practical constraints. We build and study applied systems where data is messy, latency matters, and reliability is as important as accuracy, and we feed what we learn in deployment back into the lab’s core research.

The goal is twofold: deliver AI-powered systems that are genuinely useful, and stress-test research ideas against the friction of the real world.

Core Questions

How do research methods hold up under real-world data, scale, and constraints?
What makes an ML system dependable, maintainable, and genuinely useful?
How can AI support teaching, learning, and sequential decision-making under uncertainty?

Representative Work

Agent-based Instructional Support Chatbot — KCC 2025
Investigating the Limits of Graph Foundation Model in Real-World Travel Recommendation Systems — PAKDD 2025 GLFM Workshop
Online Learning with Regularized Knowledge Gradients — PAKDD 2022
Bias-Corrected Q-Learning With Multistate Extension — IEEE TAC, 2019

See all work on the Publications page.

Related

Logs tagged Application
Related logs and events

People

Donghun Lee — Principal Investigator
Dayeon Shin — instructional chatbots
Kyunghee Roh — ML systems and workloads
Yeajin Lee — applied recognition models

See People for the full lab. Earlier work on recommendation and education systems was led by alumni including Nayoung Lee and Yanggee Kim.

AI for Science and Engineering

Thu, 18 Jun 2026 00:00:00 +0000

Overview

We believe that mathematics-guided AI can be applied across a wide range of science and engineering fields.

We have built neural operators that combine Laplace and Fourier representations, yielding a generalizable and efficient operator learning algorithm suitable for many PDE problems. In seismology, we have shown that diffusion and other generative models can synthesize high-quality broadband ground motion and ambient seismic noise.
In materials science and engineering, we have created AI-driven pipelines for estimating and optimizing the mechanical properties of epoxy polymers. We have also applied transfer learning between data-rich and data-poor battery imaging settings and synthesized bar-link mechanism designs directly from specifications. In the biomedical field, we have developed multimodal models for knee osteoarthritis diagnosis and automated measurement of spinal parameters. Across all of these, the common thread is to abstract the governing nature of the underlying problem.

Core Questions

How can neural operators learn solution maps for whole families of PDEs, which arise throughout science and engineering, efficiently and with generalization guarantees?
How do we build generative models that respect physical structure and stay reliable on out-of-distribution cases?
How can effective surrogates be designed to replace or accelerate classical simulation in science and engineering?

Representative Work

Best of Both Worlds: Bridging Laplace and Fourier for Generalizable and Efficient Operator Learning — NeurIPS 2025 ML4PS Workshop
Broadband Ground Motion Synthesis by Diffusion Model with Minimal Condition — ICML 2025
CLIP-KOA: Enhancing Knee Osteoarthritis Diagnosis with Multi-Modal Learning — MICCAI 2025 LMID Workshop
Re-experiment Smart: Enhancing Data-driven Prediction of Mechanical Properties of Epoxy Polymers — preprint, 2025
PPSD GAN: PPSD-informed Generative Model for Ambient Seismic Noise Synthesizing — IEEE GRSL, 2024

See all work on the Publications page.

Related

Logs tagged AI4Science
Events and collaborations with science and engineering partners

People

Donghun Lee — Principal Investigator
Jeongun Ha — operator learning
Jaehyuk Lee — seismic and generative modeling
Jaeheun Jung — ground-motion synthesis, mechanism design
Hanyoung Kim — generative seismic modeling
Minseok Choi — Visiting Scholar (POSTECH), scientific machine learning

See People for the full lab.

Language, Reasoning, and Knowledge

Wed, 17 Jun 2026 00:00:00 +0000

Overview

Language models (LMs) have become general-purpose reasoning engines, yet they reason unevenly and sometimes state confident falsehoods. This theme studies the internal representations of LMs and how they affect the behavior of LM-based systems in tasks such as knowledge retrieval, reasoning, and self-assessment.

Hallucination is a prominent example: we have shown that pre-trained language models return distinguishable probability distributions on unfaithfully hallucinated text, and we have mapped the types of hallucination that arise in question answering along with the limits of the metrics used to detect them. We also probe how far models can extrapolate beyond their training distribution (for instance, into chemical domains), and we develop better representations and retrieval, from sentence-level topic models to aspect-based dense passage retrieval. Finally, we bring in reinforcement learning to control the behavior of LMs (akin to how agent harnesses work today) and apply it to high-stakes real-world text, such as extracting the consequences of central bank communications.
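As a rough illustration of what it means to compare the probability distributions a language model returns, the sketch below summarizes a sequence of next-token distributions by two common statistics: the mean log-probability of the chosen tokens and the mean Shannon entropy. This is an assumption made for illustration, not necessarily the statistics used in the lab's paper, and the two toy sequences (`seq_a`, `seq_b`) are made-up numbers rather than real model outputs.

```python
import math

def entropy(dist):
    """Shannon entropy (in nats) of a next-token distribution {token: prob}."""
    return -sum(q * math.log(q) for q in dist.values() if q > 0.0)

def summarize(steps):
    """Summarize a sequence of (distribution, chosen_token) pairs.

    Returns (mean log-probability of the chosen tokens, mean entropy).
    """
    log_probs = [math.log(dist[token]) for dist, token in steps]
    entropies = [entropy(dist) for dist, _ in steps]
    n = len(steps)
    return sum(log_probs) / n, sum(entropies) / n

# Hypothetical toy sequences; the probabilities are invented for illustration.
seq_a = [({"Paris": 0.9, "Lyon": 0.1}, "Paris"),
         ({".": 0.8, ",": 0.2}, ".")]
seq_b = [({"Paris": 0.4, "Lyon": 0.3, "Nice": 0.3}, "Nice"),
         ({".": 0.5, ",": 0.5}, ".")]

for name, seq in [("seq_a", seq_a), ("seq_b", seq_b)]:
    mean_lp, mean_h = summarize(seq)
    print(f"{name}: mean log-prob = {mean_lp:.3f}, mean entropy = {mean_h:.3f}")
```

In practice the distributions would come from a real model's output probabilities, and whether such summaries separate faithful from hallucinated text is an empirical question.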

Core Questions

What causes hallucination, and how can it be detected, measured, and reduced so that language models become trustworthy and explainable?
How well do language models extrapolate beyond their training distribution?
How should internal knowledge in LMs be represented and retrieved so that reasoning over text becomes more reliable?

Representative Work

Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts — EMNLP 2024 Findings
SentenceLDA: Discriminative and Robust Document Representation with Sentence Level Topic Model — EACL 2024
Evaluating Extrapolation Ability of Large Language Model in Chemical Domain — ACL 2024 Workshop
Consequence-Guided Information Extraction for Predicting Central Bank Communication’s Effect — Computational Economics, 2026

See all work on the Publications page.

Related

Logs tagged LanguageModel
Events and discussions on LLMs and reasoning

People

Donghun Lee — Principal Investigator
Taehun Cha — hallucination, extrapolation, representation
Suhyun Bae — hallucination in question answering
Hanyoung Kim — dense passage retrieval

See People for the full lab.

AI for Mathematics

Mon, 15 Jun 2026 00:00:00 +0000

Overview

Mathematics is the most precise of the sciences, built on logical deduction and proof. The tools by which mathematicians explore, conjecture, and verify are now changing: powerful computation and modern AI are turning parts of mathematical research into a collaboration between human and machine. For an AI research lab housed in a department of mathematics, this is home ground.

A recurring keyword is representation, i.e. how mathematical objects should be encoded so that a modern AI model can exhibit reasoning behavior, explainability, semantic faithfulness, and so on. We have shown that numbers can be richly represented to deep neural networks using their p-adic expansions, and that a principled Fourier basis tied to prime structure gives a clean theoretical foundation for understanding why language models struggle to learn modular arithmetic. Before the advent of ChatGPT, we also studied how AI handles mathematical language, using math word problems with non-numeric answers and customized tool-calling operations of the kind that modern AI services now rely on to answer computation-related questions.
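To make the two representations mentioned above concrete, here is a minimal sketch of the underlying textbook mathematics: the first few p-adic digits of an integer, and the Fourier characters on the integers modulo p, which turn modular addition into multiplication of unit complex numbers. The function names (`p_adic_digits`, `character`, `fourier_features`) and the choice of frequencies are assumptions made for illustration; this is not the embedding construction from the lab's papers.

```python
import cmath
import math

def p_adic_digits(n, p, length):
    """First `length` p-adic digits of the integer n, least significant first.

    Python's divmod floors, so negative integers work too:
    -1 has the 2-adic expansion ...1111.
    """
    digits = []
    for _ in range(length):
        n, d = divmod(n, p)
        digits.append(d)
    return digits

def character(n, p, k):
    """Fourier character exp(2*pi*i*k*n/p) on the integers modulo p."""
    return cmath.exp(2j * math.pi * k * (n % p) / p)

def fourier_features(n, p, freqs):
    """Real-valued features [cos, sin] of n mod p for each frequency k."""
    feats = []
    for k in freqs:
        z = character(n, p, k)
        feats.extend([z.real, z.imag])
    return feats

print(p_adic_digits(10, 3, 4))   # 10 = 1 + 0*3 + 1*9
print(p_adic_digits(-1, 2, 5))   # every 2-adic digit of -1 is 1

# Addition mod p becomes multiplication of characters (a rotation).
p, k, a, b = 7, 1, 3, 5
lhs = character(a + b, p, k)
rhs = character(a, p, k) * character(b, p, k)
print(abs(lhs - rhs) < 1e-9)

print(len(fourier_features(10, p, freqs=[1, 2, 3])))  # two features per frequency
```

The rotation identity is one reason a Fourier basis is a natural candidate for modular arithmetic: adding residues corresponds to adding angles.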

Core Questions

How should numbers, structures, and proofs be represented so that models can reason about them faithfully rather than by surface pattern?
Which mathematical tasks can AI genuinely assist, and where does it break down?
How can computation, conjecture, and proof reinforce one another?

Representative Work

Prime Fourier Embeddings: A Principled Basis for Modular Arithmetic — ICML 2026 AI for Math Workshop
Numbers Already Carry Their Own Embeddings — NeurIPS 2025 MathAI Workshop
Noun-MWP: Math Word Problems Meet Noun Answers — COLING 2022

See all work on the Publications page.

Related

Logs tagged AI4Math · conjecture
Events and talks on AI for mathematics

People

Donghun Lee — Principal Investigator
Suhyun Bae — number embeddings, modular arithmetic
Taehun Cha — mathematical language and reasoning

See People for the full lab.
