Publications
publications by categories in reversed chronological order.
2026
- ICMLExVerus: Verus Proof Repair via Counterexample ReasoningJun Yang, Yuechun Sun, Yi Wu, Rodrigo Caridad, Yongwei Yuan, Jianan Yao, Shan Lu, and Kexin PeiIn Forty-third International Conference on Machine Learning, 2026
Large Language Models (LLMs) have shown promising results in automating formal verification. However, existing approaches treat proof generation as a static, end-to-end prediction over source code, relying on limited verifier feedback and lacking access to concrete program behaviors. We present EXVERUS, a counterexample-guided framework that enables LLMs to reason about proofs using behavioral feedback via counterexamples. When a proof fails, EXVERUS automatically generates and validates counterexamples, and then guides the LLM to generalize them into inductive invariants to block these failures. Our evaluation shows that EXVERUS significantly improves proof accuracy, robustness, and token efficiency over the state-of-the-art prompting-based Verus proof generator.
@inproceedings{yang2026exverus, title = {ExVerus: Verus Proof Repair via Counterexample Reasoning}, author = {Yang, Jun and Sun, Yuechun and Wu, Yi and Caridad, Rodrigo and Yuan, Yongwei and Yao, Jianan and Lu, Shan and Pei, Kexin}, booktitle = {Forty-third International Conference on Machine Learning}, year = {2026}, url = {https://openreview.net/forum?id=FNDkXA0OUJ}, } - ACM CAISAI Realtor: Towards Grounded Persuasive Language Generation for Automated CopywritingJibang Wu, Chenghao Yang, Yi Wu, Simon Mahns, Chaoqi Wang, Hao Zhu, Fei Fang, and Haifeng XuIn Proceedings of the ACM Conference on AI and Agentic Systems, 2026
This paper develops an agentic framework that employs large language models (LLMs) for grounded persuasive language generation in automated copywriting, with real estate marketing as a focal application. Our method is designed to align the generated content with user preferences while highlighting useful factual attributes. This agent consists of three key modules: (1) Grounding Module, mimicking expert human behavior to predict marketable features; (2) Personalization Module, aligning content with user preferences; (3) Marketing Module, ensuring factual accuracy and the inclusion of localized features. We conduct systematic human-subject experiments in the domain of real estate marketing, with a focus group of potential house buyers. The results demonstrate that marketing descriptions generated by our approach are preferred over those written by human experts by a clear margin while maintaining the same level of factual accuracy. Our findings suggest a promising agentic approach to automate large-scale targeted copywriting while ensuring factuality of content generation.
@inproceedings{10.1145/3786335.3813172, author = {Wu, Jibang and Yang, Chenghao and Wu, Yi and Mahns, Simon and Wang, Chaoqi and Zhu, Hao and Fang, Fei and Xu, Haifeng}, title = {AI Realtor: Towards Grounded Persuasive Language Generation for Automated Copywriting}, year = {2026}, isbn = {9798400724152}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, url = {https://doi.org/10.1145/3786335.3813172}, booktitle = {Proceedings of the ACM Conference on AI and Agentic Systems}, pages = {419–444}, numpages = {26} }
2025
- Reason&Plan4LLM@ICLRLLMs Aren’t Good Strategists, Yet Can Accumulate Episodes for Improved PlanningYi Wu, and Zhimin HuIn Workshop on Reasoning and Planning for Large Language Models, 2025
Strategic reasoning in dynamic environments, such as games, requires a balance between long-term strategy and short-term adaptations. Although specially trained agents can achieve superhuman performance, they often lack explainability and are highly dependent on extensive data for training. In contrast, approaches that leverage large language models (LLMs) benefit from few-shot learning but struggle to maintain strategic consistency. Drawing inspiration from existing cognitive models of human decision-making, which utilize various forms of memory, we introduce EpicStar, an LLM-based agent with cognitively inspired episodic and working memory modules. Episodic memory enables agents to draw on past experiences to formulate coherent long-term strategies, while working memory modulates active observation and decision variables essential for adaptation. We evaluated EpicStar in the strategy game StarCraft II, where it competes effectively against built-in agents at Level 6 difficulty, surpassing its predecessor at Level 5 with a smaller token budget. Our approach not only enhances adaptability, but also ensures strategic consistency, demonstrating the pivotal role that cognitive memory can play in strategic reasoning.
@inproceedings{wu2025llms, title = {{LLM}s Aren't Good Strategists, Yet Can Accumulate Episodes for Improved Planning}, author = {Wu, Yi and Hu, Zhimin}, booktitle = {Workshop on Reasoning and Planning for Large Language Models}, year = {2025}, url = {https://openreview.net/forum?id=34NjywGCdx}, }
2023
- DistShift@NeurIPSEvolving Domain Adaptation of Pretrained Language Models for Text ClassificationYun-Shiuan Chuang, Yi Wu, Dhruv Gupta, Rheeya Uppaal, Ananya Kumar, Luhang Sun, Makesh Narsimhan Sreedhar, Sijia Yang, Timothy T. Rogers, and Junjie HuIn NeurIPS 2023 Workshop on Distribution Shifts: New Frontiers with Foundation Models, 2023
Adapting pre-trained language models (PLMs) for time-series text classification amidst evolving domain shifts (EDS) is critical for maintaining accuracy in applications like stance detection. This study benchmarks the effectiveness of evolving domain adaptation (EDA) strategies, notably self-training, domain-adversarial training, and domain-adaptive pretraining, with a focus on an incremental self-training method. Our analysis across various datasets reveals that this incremental method excels at adapting PLMs to EDS, outperforming traditional domain adaptation techniques. These findings highlight the importance of continually updating PLMs to ensure their effectiveness in real-world applications, paving the way for future research into PLM robustness against the natural temporal evolution of language.
@inproceedings{chuang2023evolving, title = {Evolving Domain Adaptation of Pretrained Language Models for Text Classification}, author = {Chuang, Yun-Shiuan and Wu, Yi and Gupta, Dhruv and Uppaal, Rheeya and Kumar, Ananya and Sun, Luhang and Sreedhar, Makesh Narsimhan and Yang, Sijia and Rogers, Timothy T. and Hu, Junjie}, booktitle = {NeurIPS 2023 Workshop on Distribution Shifts: New Frontiers with Foundation Models}, year = {2023}, archiveprefix = {arXiv}, primaryclass = {cs.CL}, url = {https://openreview.net/forum?id=HSBV4cCheG}, } - WMTKnowComp Submission for WMT23 Word-Level AutoCompletion TaskYi Wu, Haochen Shi, Weiqi Wang, and Yangqiu SongIn Proceedings of the Eighth Conference on Machine Translation, Dec 2023
@inproceedings{wu-etal-2023-knowcomp, title = {{K}now{C}omp Submission for {WMT}23 Word-Level {A}uto{C}ompletion Task}, author = {Wu, Yi and Shi, Haochen and Wang, Weiqi and Song, Yangqiu}, editor = {Koehn, Philipp and Haddow, Barry and Kocmi, Tom and Monz, Christof}, booktitle = {Proceedings of the Eighth Conference on Machine Translation}, month = dec, year = {2023}, address = {Singapore}, publisher = {Association for Computational Linguistics}, url = {https://aclanthology.org/2023.wmt-1.79}, doi = {10.18653/v1/2023.wmt-1.79}, pages = {882--889} }