Feng Y., Kwiatkowski A., Zheng K., Kempe J., Duan Y. (2025), PILAF: Optimal Human Preference Sampling for Reward Modeling, Proceedings of Machine Learning Research (PMLR), pp. 16744–16776.
Gehring J., Zheng K., Copet J., Mella V., Carbonneaux Q., Cohen T., Synnaeve G. (2025), RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning, Proceedings of Machine Learning Research (PMLR), pp. 19034–19055.
Zheng K., Decugis J., Gehring J., Cohen T., Negrevergne B., Synnaeve G. (2025), What Makes Large Language Models Reason in (Multi-Turn) Code Generation?, OpenReview.
Tang Y., Zheng K., Synnaeve G., Munos R. (2025), Optimizing Language Models for Inference Time Objectives using Reinforcement Learning, Proceedings of Machine Learning Research (PMLR), pp. 59066–59085.
Yoran O., Zheng K., Gloeckle F., Gehring J., Synnaeve G., Cohen T. (2025), The KoLMogorov Test: Compression by Code Generation, OpenReview.
Verine A., Le Bronnec F., Zheng K., Allauzen A., Chevaleyre Y., Negrevergne B. (2025), Improving Diversity in Language Models: When Temperature Fails, Change the Loss, preprint, LAMSADE, Paris.