                Yibo Jiang

                Google Scholar / Twitter / Email
               
              
                I am a Quantitative Researcher at Jump Trading. 
               
              
                I earned my Ph.D. in Computer Science from the University of Chicago, advised by Prof. Victor Veitch. I also work very closely with Prof. Bryon Aragam. 
               
              
                Previously, I received an MS in Computer Science from Columbia University and graduated, as a Bronze Tablet recipient, from the University of Illinois Urbana-Champaign with double degrees in Electrical Engineering and Math.
               
                         
                Before starting my Ph.D., I was a research fellow at Harvard, where I worked with Prof. Cengiz Pehlevan in the Theoretical Neuroscience Group. I have also interned at ByteDance (AML) and, a couple of times, at Nvidia (AI Dev Tech).
           
         
        
          
            
              Research
              
                I am broadly interested in robust/trustworthy machine learning, representation learning, causality, and, more recently, the interpretability of large language models (LLMs). One specific research goal of mine is to study the emergent structures and dynamics within representations, as well as the complex networks that arise from training modern machine learning models. My aim is to develop theories and methods for interpreting these structures and ultimately using the resulting insights for alignment.

                I enjoy collaborating. If you have a cool problem, don't hesitate to reach out, and let's explore it together!
           
         
        Publications
        
          
          
          
          
            
            
              The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them)
              Zihao Wang, Yibo Jiang, Jiahao Yu, Heqing Huang
              International Conference on Machine Learning (ICML), 2025
              arXiv
           
          
          
          
          
          
          
            
            
              The Geometry of Categorical and Hierarchical Concepts in Large Language Models
              Kiho Park, Yo Joong Choe, Yibo Jiang, Victor Veitch
              International Conference on Learning Representations (ICLR), 2025 (Oral, 1.8%)
              ICML Mechanistic Interpretability Workshop, 2024 (Best Paper Award)
              arXiv / code
           
          
          
          
          
          
          
            
            
              Quantifying Generalization Complexity for Large Language Models
              Zhenting Qi, Hongyin Luo, Xuliang Huang, Zhuokai Zhao, Yibo Jiang, Xiangjun Fan, Himabindu Lakkaraju, James Glass
              International Conference on Learning Representations (ICLR), 2025
              arXiv / code
           
          
          
          
          
          
          
            
            
              Do LLMs Dream of Elephants (When Told Not to)? Latent Concept Association and Associative Memory in Transformers
              Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam
              Advances in Neural Information Processing Systems (NeurIPS), 2024
              arXiv
           
          
          
          
          
          
          
            
            
              On the Origins of Linear Representations in Large Language Models
              Yibo Jiang*, Goutham Rajendran*, Pradeep Ravikumar, Bryon Aragam, Victor Veitch
              International Conference on Machine Learning (ICML), 2024
              arXiv
           
          
          
          
          
          
          
            
            
              Beyond Reverse KL: Generalizing Direct Preference Optimization with Diverse Divergence Constraints
              Chaoqi Wang, Yibo Jiang, Chenghao Yang, Han Liu, Yuxin Chen
              International Conference on Learning Representations (ICLR), 2024 (Spotlight, 5%)
              arXiv / code
           
          
          
          
          
          
          
            
            
              Uncovering Meanings of Embeddings via Partial Orthogonality
              Yibo Jiang, Bryon Aragam, Victor Veitch
              Advances in Neural Information Processing Systems (NeurIPS), 2023
              arXiv / video
           
          
          
          
          
          
          
            
            
              Learning Nonparametric Latent Causal Graphs with Unknown Interventions
              Yibo Jiang, Bryon Aragam
              Advances in Neural Information Processing Systems (NeurIPS), 2023
              arXiv / video
           
          
          
          
          
          
          
            
            
              Invariant and Transportable Representations for Anti-Causal Domain Shifts
              Yibo Jiang, Victor Veitch
              Advances in Neural Information Processing Systems (NeurIPS), 2022
              arXiv / video / code
           
          
          
          
          
          
          
            
            
              Associative Memory in Iterated Overparameterized Sigmoid Autoencoders
              Yibo Jiang, Cengiz Pehlevan
              International Conference on Machine Learning (ICML), 2020
              arXiv / video / code
           
          
          
          
          
          
          
            
            
              Meta-Learning to Cluster
              Yibo Jiang, Nakul Verma
              arXiv preprint, 2019
              arXiv
           
          
          
          
          
          
          
            
            
              Model-Agnostic Meta-Learning using Runge-Kutta Methods
              Daniel Jiwoong Im, Yibo Jiang, Nakul Verma
              arXiv preprint, 2019
              arXiv
           
          
          
          
         
        
        
          
            
              Academic Services
              
                  Conference Reviewer: NeurIPS, ICML, ICLR, AISTATS, KDD, AAAI
               
           
         
        
          
            
               
              
                Profile picture taken by Qian Sheng. Last updated Jan. 2025.
               
           
         