Chen Wu
GitHub | Citations | LinkedIn
Research Interests
My research interests include Natural Language Processing, Information Retrieval, Multimodal Machine Learning, and the application of Deep Learning to Software Engineering tasks.
Education
Experience
- Principal Data and Applied Scientist, Microsoft, Suzhou, China. 2022.10 – Present
- Principal Research Scientist and Tech Lead, Tencent, Shanghai, China. 2021.09 – 2022.10
- Worked on large-scale vision-language model pretraining in the YouTu Lab of Tencent Cloud.
- Lead Research Scientist and Manager, Microsoft, Shanghai, China. 2020.01 – 2021.09
- Conducted research on semantic code search, code translation, patch generation, and performance bug detection in the Data & AI Research Team of the Microsoft Developer Division.
- Staff Engineer and Manager, Alibaba, Hangzhou, China. 2016.04 – 2020.01
- Developed models for personalized e-commerce search, machine reading comprehension, and self-supervised pretraining in the Language Technology Lab of Alibaba DAMO Academy.
- Staff Engineer and Tech Lead, Baidu, Shanghai, China. 2011.04 – 2016.04
- Applied machine learning, knowledge graph mining, natural language processing, and ad click prediction to improve the performance of brand advertising in the Search Advertising Team.
Technical Skills
- Programming Languages
- Deep Learning Frameworks
- Others
Selected Projects
- Question Answering based on Machine Reading Comprehension (Alibaba DAMO Academy)
- Designed a hierarchical attention network for reading comprehension style question answering.
- Demo (SLQA+)
- AliceMind: Alibaba's Collection of Encoder-Decoders (Alibaba DAMO Academy)
- Provides pre-trained encoder-decoder models and related optimization techniques.
- Official Website
- Microsoft DeepDev (Microsoft Research)
- Shares the latest innovations and deep learning models aimed at software engineering tasks.
- Preview Website
Selected Achievements
- Achieved state-of-the-art results in code-to-code translation on the CodeXGLUE benchmark organized by Microsoft 2021.08
- Ranked 1st on the CodeSearchNet Leaderboard by GitHub 2020.06
- Won 1st place in the TREC 2019 Deep Learning Track sponsored by NIST 2019.11
- Ranked 1st on the GLUE Benchmark 2019.09
- Ranked 1st on the MS MARCO 2.0 Leaderboard hosted by Microsoft 2019.06
- Ranked 1st on the DuReader 2.0 Leaderboard hosted by Baidu 2018.10
- Ranked 1st on the TriviaQA question-answering leaderboard hosted by the University of Washington 2018.02
- Achieved the first machine result to surpass human performance in the SQuAD Machine Reading Comprehension Competition organized by Stanford University 2018.01
- Champion of the ACM CIKM Cup 2016 Competition 2016.10
Selected Media Coverage
- Alibaba has claimed a new record in AI language understanding. MIT Technology Review, Jul 9, 2019
- Alibaba and Microsoft AI beat humans in Stanford reading test. Financial Times, Jan 15, 2018
- AI Beats Humans at Reading Comprehension. MIT Technology Review, Jan 15, 2018
- Computers are getting better than humans at reading. CNN, Jan 16, 2018
- AI can read! Tech firms race to smarten up thinking machines. AP News, Jan 27, 2018
- Alibaba's AI outperforms humans in tough reading test. Daily Mail, Jan 15, 2018
- Alibaba's AI Outguns Humans in Reading Test. Bloomberg, Jan 14, 2018
- AI models beat humans at reading comprehension, but they've still got a ways to go. The Washington Post, Jan 16, 2018
- Writing's on the wall: Alibaba AI machines beat humans at reading test. The Times, Jan 16, 2018
- The Morning Download: Alibaba, Microsoft AI Bests Human Reading Comprehension. The Wall Street Journal, Jan 16, 2018
Selected Talks & Presentations
Program Committees
- EMNLP 2023 (The 2023 Conference on Empirical Methods in Natural Language Processing)
- NeurIPS 2023 (Thirty-seventh Conference on Neural Information Processing Systems)
- ACL 2023 (The 61st Annual Meeting of the Association for Computational Linguistics)
- AAAI 2023 (The Thirty-Seventh AAAI Conference on Artificial Intelligence)
- COLING 2022 (The 29th International Conference on Computational Linguistics)
- IJCAI 2022 (The 31st International Joint Conference on Artificial Intelligence)
- IJCAI 2021 (The 30th International Joint Conference on Artificial Intelligence)
- COLING 2020 (The 28th International Conference on Computational Linguistics)
Patent
Selected Publications
- Bin Bi, Chenliang Li, Chen Wu, Ming Yan, Wei Wang, Songfang Huang, Fei Huang, Luo Si. PALM: Pre-training an Autoencoding & Autoregressive Language Model for Context-conditioned Generation. EMNLP 2020
- Wei Wang, Bin Bi, Ming Yan, Chen Wu, Zuyi Bao, Jiangnan Xia, Liwei Peng, Luo Si. StructBERT: Incorporating Language Structures into Pre-Training for Deep Language Understanding. ICLR 2020
- Bin Bi, Chen Wu, Ming Yan, Wei Wang, Jiangnan Xia, Chenliang Li. Incorporating External Knowledge into Machine Reading for Generative Question Answering. EMNLP 2019
- Jiangnan Xia, Chen Wu, Ming Yan. Incorporating Relation Knowledge into Commonsense Reading Comprehension with Multi-task Learning. CIKM 2019
- Ming Yan, Jiangnan Xia, Chen Wu, Bin Bi, Zhongzhou Zhao, Ji Zhang, Luo Si, Rui Wang, Wei Wang, Haiqing Chen. A Deep Cascade Model for Multi-Document Reading Comprehension. AAAI 2019
- Wei Wang, Ming Yan, Chen Wu. Multi-Granularity Hierarchical Attention Fusion Networks for Reading Comprehension and Question Answering. ACL 2018
- Chen Wu, Ming Yan. Session-aware Information Embedding for E-commerce Product Recommendation. CIKM 2017