Selected Publications

2024


  • DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation
    Yinjun Wu, Mayank Keoliya, Kan Chen, Neelay Velingker, Ziyang Li, Emily J Getzen, Qi Long, Mayur Naik, Ravi B Parikh, Eric Wong
    ICML 2024
  • Towards Compositionality in Concept Learning
    Adam Stein, Aaditya Naik, Yinjun Wu, Mayur Naik, Eric Wong
    ICML 2024
  • TorchQL: A Programming Framework for Integrity Constraints in Machine Learning
    Aaditya Naik, Adam Stein, Yinjun Wu, Mayur Naik, Eric Wong
    OOPSLA 2024 [Code][Paper]

2023


  • Rectifying Group Irregularities in Explanations for Distribution Shift
    Adam Stein, Yinjun Wu, Eric Wong, Mayur Naik
    NeurIPS 2023 (XAIA workshop) [Paper]
  • Do Machine Learning Models Learn Statistical Rules Inferred from Data?
    Aaditya Naik, Yinjun Wu, Mayur Naik, Eric Wong
    ICML 2023 [Code][Paper]
  • Learning to Select Pivotal Samples for Meta Re-weighting
    Yinjun Wu, Adam Stein, Jacob Gardner, Mayur Naik
    AAAI 2023 (oral) [Code][Paper][Slides]

2022


  • Provenance-based Model Maintenance: Implications for Privacy [Paper]
    Yinjun Wu, Val Tannen, Susan B. Davidson
    IEEE Data Eng. Bull. 2022

2021


  • CHEF: A Cheap and Fast Pipeline for Iteratively Cleaning Label Uncertainties
    Yinjun Wu, James Weimer, Susan B. Davidson [Technical report][Code]
    Proceedings of the VLDB Endowment 14, no. 11 (2021): 2410-2418.
  • Dynamic Gaussian Mixture based Deep Generative Model For Robust Forecasting on Sparse Multivariate Time Series
    Yinjun Wu, Jingchao Ni, Wei Cheng, Bo Zong, Dongjin Song, Zhengzhang Chen, Yanchi Liu, Xuchao Zhang, Haifeng Chen, Susan B. Davidson [Paper][Full version][Code]
    In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 1, pp. 651-659. 2021

2020


  • DeltaGrad: Rapid retraining of machine learning models
    Yinjun Wu, Edgar Dobriban, Susan B. Davidson [Paper][Slides][Code]
    In International Conference on Machine Learning (ICML), pp. 10355-10366. PMLR, 2020.
  • Lessons learned from the early performance evaluation of Intel Optane DC Persistent Memory in DBMS
    Yinjun Wu, Kwanghyun Park, Rathijit Sen, Brian Kroth, Jaeyoung Do [Paper][Technical report]
    In Proceedings of the 16th International Workshop on Data Management on New Hardware, pp. 1-3. 2020.
  • PrIU: A provenance-based approach for incrementally updating regression models
    Yinjun Wu, Val Tannen, Susan B. Davidson [Paper][Slides]
    In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pp. 447-462. 2020.

2019


  • ProvCite: Provenance-based Data Citation [Paper][Slides]
    Yinjun Wu, Abdussalam Alawini, Daniel Deutch, Tova Milo, Susan B. Davidson
    In Proceedings of the VLDB Endowment (2019), 12(7)

2018


  • Data Citation: Giving Credit where Credit is Due [Paper][Slides]
    Yinjun Wu, Abdussalam Alawini, Susan B. Davidson, Gianmaria Silvello
    In Proceedings of the 2018 International Conference on Management of Data (SIGMOD conference), pp. 99-114. ACM, 2018.
  • Data Citation: A New Provenance Challenge [Paper]
    Abdussalam Alawini, Susan B. Davidson, Gianmaria Silvello, Val Tannen, Yinjun Wu (authors sorted alphabetically)
    IEEE Data Eng. Bull. 41(1): 27-38 (2018)

2017


  • Automating Data Citation in CiteDB [Paper]
    Abdussalam Alawini, Susan Davidson, Wei Hu, Yinjun Wu (authors sorted alphabetically)
    Proceedings of the VLDB Endowment 10.12 (2017): 1881-1884.

See my Google Scholar page for the full paper list.