Chen Wu
GitHub | Citations | LinkedIn
Biography
I am a Principal Applied Scientist at Microsoft AI working on foundational and applied LLM technologies.
Previously, I was a Research Scientist Manager at Microsoft C+AI.
I received my Ph.D. in Computer Science from Tongji University.
My research interests include natural-language processing, information retrieval, multimodal ML, large-scale model pre-training, and applying deep learning to software-engineering tasks.
Professional Experience
| Role & Organization | Dates | Key Responsibilities |
| --- | --- | --- |
| Principal Applied Scientist & Tech Lead – Microsoft (AI) | 2022-09 – Present | • Lead LLM inference acceleration (long-sequence & reasoning models). • Research multimodal models to improve Feeds Quality & Ranking. • Apply large-scale LMs in AIOps for decision intelligence in the M365 China team. |
| Principal Research Scientist & Tech Lead – Tencent (YouTu Lab) | 2021-09 – 2022-09 | • Led large-scale vision-language model pre-training for Tencent Cloud. |
| Lead Research Scientist & Manager – Microsoft (Developer Division) | 2020-01 – 2021-09 | • Researched semantic code search, code translation, patch generation, and performance-bug detection. |
| Staff Engineer & Manager – Alibaba (DAMO Academy) | 2016-04 – 2020-01 | • Built models for personalized e-commerce search, machine-reading comprehension, and self-supervised pre-training. |
| Staff Engineer & Tech Lead – Baidu (Search Ads) | 2011-04 – 2016-04 | • Applied ML, knowledge-graph mining, NLP, and ad-click prediction to branding-ads performance. |
Selected Projects
| Project | Organization | Highlights / Links |
| --- | --- | --- |
| Question Answering via Machine Reading Comprehension | Alibaba DAMO Academy | Designed a hierarchical-attention network. SLQA+ Demo |
| Alibaba’s Collection of Encoder-Decoders (AliceMind) | Alibaba DAMO Academy | Suite of pre-trained encoder-decoder models and optimizations. Portal |
| Microsoft DeepDev | Microsoft Research | Latest DL innovations for software-engineering tasks. Preview |
Selected Achievements
- SOTA in code-to-code translation on CodeXGLUE (Microsoft) — 2021-08
- Rank #1 CodeSearchNet Leaderboard (GitHub) — 2020-06
- 1st place TREC ’19 Deep Learning Track (NIST) — 2019-11
- Rank #1 GLUE Benchmark — 2019-09
- Rank #1 MS MARCO 2.0 Leaderboard (Microsoft) — 2019-06
- Rank #1 DuReader 2.0 Leaderboard (Baidu) — 2018-10
- Rank #1 TriviaQA Leaderboard (Univ. of Washington) — 2018-02
- First system to surpass humans on SQuAD machine-reading comprehension — 2018-01
- Champion ACM CIKM Cup 2016 — 2016-10
- MIT Technology Review – “Alibaba has claimed a new record in AI language understanding” — Jul 9 2019
- Financial Times – “Alibaba and Microsoft AI beat humans in Stanford reading test” — Jan 15 2018
- MIT Technology Review – “AI Beats Humans at Reading Comprehension” — Jan 15 2018
- CNN – “Computers are getting better than humans at reading” — Jan 16 2018
- AP News – “AI can read! Tech firms race to smarten up thinking machines” — Jan 27 2018
- Daily Mail – “Alibaba’s AI outperforms humans in tough reading test” — Jan 15 2018
- Bloomberg – “Alibaba’s AI Outguns Humans in Reading Test” — Jan 14 2018
- The Washington Post, The Times, The Wall Street Journal, and others — Jan 2018
Selected Talks & Presentations
| Date | Event / Venue | Title |
| --- | --- | --- |
| 2019-11-03 | ACM CIKM | Incorporating Relation Knowledge into Commonsense Reading Comprehension with Multi-task Learning |
| 2019-08-16 | Boundless – Zhejiang Lab Int’l Youth Talents Forum | Large Scale Neural Network Language Models |
| 2018-09-20 | Alibaba Apsara Conf. – Intelligent NLP Tech Session | An Introduction to Machine Reading Comprehension for Question Answering |
| 2017-11-06 | ACM CIKM | Session-aware Information Embedding for E-commerce Product Recommendation |
| 2017-10-13 | Alibaba Apsara Conf. – NLP Tech Session | Machine Reading Comprehension Technology and Applications |
| 2016-10-24 | ACM CIKM | Ensemble Methods for Personalized E-commerce Search Challenge (CIKM Cup 2016) |
| 2016-10-20 | QCon Software Dev. Conf. | Learning to Rank in Personalized E-commerce Search |
| 2016-06-18 | CCF YOCSEF | Introduction to Query Auto-Completion |
Program & Review Committees (Partial List)
NeurIPS 2025, IJCAI 2025, ICLR 2025, COLING 2025, AAAI 2025, EMNLP 2024, NeurIPS 2024, IJCAI 2024, NAACL 2024, ICLR 2024, LREC-COLING 2024, AAAI 2024, EMNLP 2023, NeurIPS 2023, ACL 2023, AAAI 2023, COLING 2022, IJCAI 2022, IJCAI 2021, COLING 2020
Patents
| Title | Inventors (incl. Chen Wu) | Publication No. |
| --- | --- | --- |
| System and Method for Identifying Performance Bottlenecks | Spandan Garg, Roshanak Zilouchian Moghaddam, Paul Sean Harrington, Chen Wu, … | US 12,164,412 B2 |
| Distilling Transformers for Neural Cross-Domain Search | Colin B. Clement, Dawn Drain, Neelakantan Sundaresan, Chen Wu, … | US 2023/0042051 A1 |
| Automated Fine-tuning & Deployment of Pre-trained DL Models | Colin B. Clement, Shao Kun Deng, Dawn Drain, Chen Wu, … | US 2022/0398462 A1 |
| Semi-supervised Translation of Source-code Programs using Neural Transformers | Colin B. Clement, Dawn Drain, Chen Wu, … | US 12,045,592 B2 |
| Source-code Generation using Code Templates with Neural Transformers | Mikhail Breslav, Colin B. Clement, Dawn Drain, Chen Wu, … | US 2022/0244952 A1 |
| Performance Bug Detection and Code Recommendation | Spandan Garg, Paul S. Harrington, Chen Wu, … | US 12,135,628 B2 |
| Method and Device for Determining Target Page as well as Equipment | Chen Wu, Xiao Jiang | CN 104063394 B |
Publications (Selected)
| Year | Venue | Title — Authors |
| --- | --- | --- |
| 2022 | EMNLP (Industry Track) | Grafting Pre-trained Models for Multimodal Headline Generation — Lingfeng Qiao, Chen Wu, … |
| 2022 | ACM ESEC/FSE | DeepDev-PERF: A Deep Learning-based Approach for Improving Software Performance — Spandan Garg, Chen Wu, … |
| 2022 | 4th Person-in-Context Workshop | Exploiting Feature Diversity for Make-up Temporal Video Grounding — Xiujun Shu, Chen Wu, … |
| 2021 | ACM SOAP | PerfLens: A Data-driven Performance Bug Detection and Fix Platform — Spandan Garg, Chen Wu, … |
| 2021 | ACM MAPS | Generating Bug-fixes using Pre-trained Transformers — Dawn Drain, Chen Wu, … |
| 2020 | EMNLP | PALM: Pre-training an Autoencoding & Autoregressive LM for Context-conditioned Generation — Bin Bi, Chen Wu, … |
| 2020 | AAAI | Generating Well-formed Answers by Machine Reading with Stochastic Selector Networks — Bin Bi, Chen Wu, … |
| 2020 | ICLR | StructBERT: Incorporating Language Structures into Pre-Training for Deep LU — Wei Wang, Chen Wu, … |
| 2019 | EMNLP-IJCNLP | Incorporating External Knowledge into Machine Reading for Generative QA — Bin Bi, Chen Wu, … |
| 2019 | ACM CIKM | Incorporating Relation Knowledge into Commonsense Reading Comprehension with Multi-task Learning — Jiangnan Xia, Chen Wu, … |
| 2019 | AAAI | A Deep Cascade Model for Multi-Document Reading Comprehension — Ming Yan, Chen Wu, … |
| 2019 | ICSE-Companion | Optimizing Seed Inputs in Fuzzing with Machine Learning — Liang Cheng, Chen Wu, … |
| 2019 | TREC DL Track | IDST at TREC 2019: Deep-Cascade Ranking with Generation-based Doc Expansion… — Ming Yan, Chen Wu, … |
| 2018 | ACL | Multi-Granularity Hierarchical Attention Fusion Networks for RC & QA — Wei Wang, Chen Wu |
| 2017 | ACM CIKM | Session-aware Information Embedding for E-commerce Product Recommendation — Chen Wu, Ming Yan |