About me

Address: Room 1459, Science Building # 1, Peking University (北京大学理科1号楼1459房间)

Hi, welcome to my webpage! I am currently an Assistant Professor at the School of Computer Science, Peking University. Before that, I was a postdoctoral researcher working with Prof. Mayur Naik in the Department of Computer and Information Science, University of Pennsylvania, from August 2021 to January 2024. I obtained my PhD from the same department in August 2021, under the supervision of Prof. Susan Davidson, and my Bachelor's degree from the Department of Automation, Tsinghua University, in 2016. My research interests lie at the intersection of data science, data management, and machine learning (see the Research Overview page for more details).

I am actively seeking talented and self-motivated students to join my group. There are several openings for PhD candidates and research interns. Feel free to contact me via email!


News

  • March 2024: I officially joined the School of Computer Science at Peking University as an Assistant Professor!
  • December 2023: One paper was accepted by OOPSLA 2024.
  • October 2023: I was awarded the NSFC Excellent Young Scientists Fund Overseas Program (国家自然科学基金优秀青年科学基金海外项目)!
  • April 2023: One paper was accepted by ICML 2023.
  • December 2022: One paper was accepted by AAAI 2023.

Selected Publications

  • TorchQL: A Programming Framework for Integrity Constraints in Machine Learning (OOPSLA 2024)
  • Do Machine Learning Models Learn Statistical Rules Inferred from Data? (ICML 2023)
  • Learning to Select Pivotal Samples for Meta Re-weighting (AAAI 2023)
  • CHEF: A Cheap and Fast Pipeline for Iteratively Cleaning Label Uncertainties (VLDB 2021)
  • Dynamic Gaussian Mixture based Deep Generative Model For Robust Forecasting on Sparse Multivariate Time Series (AAAI 2021)
  • DeltaGrad: Rapid retraining of machine learning models (ICML 2020)
  • PrIU: A provenance-based approach for incrementally updating regression models (SIGMOD 2020)
  • ProvCite: Provenance-based Data Citation (VLDB 2019)
  • Data Citation: Giving Credit where Credit is Due (SIGMOD 2018)