Hello, my name is

Yi Chern Tan.

I am a researcher and software engineer with experience in machine learning and natural language processing. I am currently @Cohere. I was previously @Waymo, @Facebook and @Yale. I have published on interpretability, fairness, representation learning and semantic parsing at NeurIPS, ICLR, ACL and EMNLP.

Scholar LinkedIn GitHub Resume

I received my BS in Computer Science and Ethics, Politics and Economics from Yale University in 2020. At Yale, I was advised by Dragomir Radev (LILY Lab), Robert Frank (CLAY Lab) and Elisa Celis (Controlling Bias in AI Group). I also worked with John Lafferty to create a curriculum for, and teach, text data science.

My research interests lie at the intersection of language + X ∈ {learning, representation, bias, reasoning} and technology + Y ∈ {ethics, culture, government, security}. In particular, I work on natural language processing and machine learning with the aim of building scalable systems that have a robust understanding of human language and that are supported by both principled values and elegant engineering.

Apart from research, I also have experience in government and products. At Singapore's Smart Nation and Digital Government Office, I authored strategy papers and analyses on the societal harms of AI and the responsible deployment of human-centered AI systems. At Facebook, I developed transformer-based models for the detection of harmful comments on Instagram and secured the highest-tier full-time return offer.

After my graduation, I fulfilled mandatory military service in Singapore, carrying out operations as an "IO" (infantry officer). I am currently a Member of Technical Staff at Cohere, working on large language models.

  • Python
  • C
  • JavaScript/Node.js
  • PHP/Hack
  • HTML
  • CSS/Sass
  • SQL
  • R
  • Racket
  • PyTorch
  • TensorFlow
  • Keras
  • Spark
  • Hive
  • React
  • Flask
  • Express
  • ROS
  • Transformers
  • scikit-learn
  • nltk
  • spaCy
  • AllenNLP
  • PyText
  • NumPy
  • fast.ai
  • OpenCV
  • MuJoCo
  • OpenAI Gym
  • PostgreSQL
  • MongoDB
  • Git/GitHub
  • VSCode
  • Atom
  • Bash
  • Colab
  • Weights & Biases
  • TensorBoard
  • Prettier
  • ESLint
  • flake8
  • Emmet
Aug 2022 - present: Member of Technical Staff
Mar 2022 - Aug 2022: Machine Learning Engineer
May 2019 - Aug 2019: Software Engineering Intern
Aug 2018 - May 2019: Teaching Assistant (Data Structures, Algorithms, Text Data Science)
Dec 2018 - Jan 2019: Applied Research Scientist Intern
May 2018 - Aug 2018: Policy Research Intern (Planning and Prioritization Directorate)

* = equal contribution

GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing
Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, Bailin Wang, Yi Chern Tan, Xinyi Yang, Dragomir Radev, Richard Socher, Caiming Xiong
ICLR 2021 (Long Paper)
DART: Open-Domain Structured Data Record to Text Generation
Linyong Nan, Dragomir Radev, Rui Zhang, Amrit Rau, Abhinand Sivaprasad, Chiachun Hsieh, Xiangru Tang, Aadit Vyas, Neha Verma, Pranav Krishna, Yangxiaokang Liu, Nadia Irwanto, Jessica Pan, Faiaz Rahman, Ahmad Zaidi, Mutethia Mutuma, Yasin Tarabar, Ankit Gupta, Tao Yu, Yi Chern Tan, Xi Victoria Lin, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani
NAACL 2021 (Long Paper)
ESPRIT: Explaining Solutions to Physical Reasoning Tasks
Nazneen Fatema Rajani, Rui Zhang, Yi Chern Tan, Stephan Zheng, Jeremy Weiss, Aadit Vyas, Abhijit Gupta, Caiming Xiong, Richard Socher, Dragomir Radev
ACL 2020 (Long Paper)
Assessing Social and Intersectional Biases in Contextualized Word Representations
Yi Chern Tan, L. Elisa Celis
NeurIPS 2019 (Spotlight Paper)
CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases
Tao Yu, Rui Zhang, He Yang Er, Suyi Li, Eric Xue, Bo Pang, Xi Victoria Lin, Yi Chern Tan, Tianze Shi, Zihan Li, Youxuan Jiang, Michihiro Yasunaga, Sungrok Shim, Tao Chen, Alexander Fabbri, Zifan Li, Luyao Chen, Yuwen Zhang, Shreya Dixit, Vincent Zhang, Caiming Xiong, Richard Socher, Walter Lasecki, Dragomir Radev
EMNLP 2019 (Long Paper)
SParC: Cross-Domain Semantic Parsing in Context
Tao Yu, Rui Zhang, Michihiro Yasunaga, Yi Chern Tan, Xi Victoria Lin, Suyi Li, Heyang Er, Irene Li, Bo Pang, Tao Chen, Emily Ji, Shreya Dixit, David Proctor, Sungrok Shim, Jonathan Kraft, Vincent Zhang, Caiming Xiong, Richard Socher, Dragomir Radev
ACL 2019 (Long Paper)
Open Sesame: Getting Inside BERT's Linguistic Knowledge
*Yongjie Lin, *Yi Chern Tan, Robert Frank
ACL 2019 (BlackboxNLP Workshop)
YData: Seminar on Text Data Science
Undergraduate Learning Assistant (John Lafferty)
Data Structures and Programming Techniques
Undergraduate Learning Assistant (James Glenn)
Undergraduate Learning Assistant (James Glenn)
Professional Service
Program Committee / Reviewer
EMNLP 2021
NeurIPS 2021 (Outstanding Reviewer Award)
ICLR 2022
ICML 2022