Mingkai Deng
mingkaid [at] cs [dot] cmu [dot] edu

I am a first-year PhD student at the Language Technologies Institute at Carnegie Mellon University, working with Prof. Eric P. Xing. I also work very closely with Prof. Zhiting Hu. My current research interests include visual reasoning and natural language generation.

I did my MS at CMU's Machine Learning Department. Prior to that, I worked as a data scientist in industry after interning at Baidu Research, where I was supervised by Prof. Hui Xiong. I did my undergrad at Columbia University, where I double-majored in Mathematics-Statistics and Computer Science.

CV  /  LinkedIn  /  Twitter  /  GitHub

  • 2022-10 Our paper "RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning" accepted to EMNLP 2022
  • 2021-10 A post about our EMNLP 2021 paper was published on the ML@CMU Blog
  • 2021-09 Our paper "Compression, Transduction, and Creation: A Unified Framework for Evaluating Natural Language Generation" accepted to EMNLP 2021
  • 2020-08 Concluded my wonderful time at Weber Shandwick
  • 2019-06 Started internship with Prof. Hui Xiong at Baidu Research
  • 2018-11 Our project "Data-Driven Analysis and Prediction of Gentrification in New York City" won Best Insights Award at Columbia ASA DataFest
RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning
Mingkai Deng*, Jianyu Wang*, Cheng-Ping Hsieh*, Yihan Wang, Han Guo, Tianmin Shu, Meng Song, Eric P. Xing, Zhiting Hu
EMNLP 2022
paper / code

An efficient and flexible framework for using RL to optimize discrete text prompts that enable pre-trained LMs (e.g., BERT, GPT-2) to perform diverse NLP tasks. Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing methods.

Compression, Transduction, and Creation: A Unified Framework for Evaluating Natural Language Generation
Mingkai Deng*, Bowen Tan*, Zhengzhong Liu, Eric P. Xing, Zhiting Hu
EMNLP 2021
paper / slides / blog / code / open-source library

A general framework that unifies the evaluation of natural language generation (NLG) under a single operation, addressing the difficulty of evaluating diverse NLG tasks. The metrics it inspires improve over SOTA metrics across these tasks, and are available as a library on PyPI and GitHub.

Discovering the Area of Point-of-Interests: A Heterogeneous Data Fusion Perspective
Mingkai Deng*, Guanglei Du*, Xinjiang Lu, Jingbo Zhou, Jing Sun, Yiming Zhang
Preprint 2019

Combine structured point-of-interest (POI, e.g., parking lots and buildings) data with unstructured satellite imagery to automatically identify and draw the boundaries of urban areas of interest (e.g., residential areas and campuses).

ODE Transformer
Mingkai Deng, Biqing Qiu, Yanda Chen, Iddo Drori
Preprint 2019
paper / code

Formulate the Transformer model as an ordinary differential equation (ODE) and perform forward-backward operations using an ODE solver. The resulting continuous-depth model has fewer parameters, converges faster, and shows promising performance in empirical experiments.

Data-Driven Analysis and Prediction of Gentrification in New York City (NYC)
Mingkai Deng, Jerry Shi, Yvonne Zhou
ASA DataFest 2018   (Best Insights Award)
slides / code

A data-driven narrative of NYC gentrification patterns over time and their second-order impact on the city's inhabitants. Construct leading predictors of gentrification using resources from NYC OpenData.

The source code of this website is adapted from here