Highlighted Research


A Survey on Large Language Model based Autonomous Agents
Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, Wayne Xin Zhao, Zhewei Wei, Ji-Rong Wen.
Autonomous agents have long been a prominent research topic in the academic community. Previous research in this field often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes and thus makes it hard for the agents to achieve human-like decisions. Recently, through the acquisition of vast amounts of web knowledge, large language models (LLMs) have demonstrated remarkable potential in achieving human-level intelligence. This has sparked an upsurge in studies investigating autonomous agents based on LLMs. To harness the full potential of LLMs, researchers have devised diverse agent architectures tailored to different applications. In this paper, we present a comprehensive survey of these studies, delivering a systematic review of the field of autonomous agents from a holistic perspective. More specifically, our focus lies in the construction of LLM-based agents, for which we propose a unified framework that encompasses a majority of the previous work. Additionally, we provide a summary of the various applications of LLM-based AI agents in the domains of social science, natural science, and engineering. Lastly, we discuss the commonly employed evaluation strategies for LLM-based AI agents. Based on the previous studies, we also present several challenges and future directions in this field.


When LLM based Agent Meets User Behavior Analysis: A Novel User Simulation Paradigm
Lei Wang, Jingsen Zhang, Hao Yang, Zhiyuan Chen, Jiakai Tang, Zeyu Zhang, Xu Chen, Yankai Lin, Ruihua Song, Wayne Xin Zhao, Jun Xu, Zhicheng Dou, Jun Wang, Ji-Rong Wen.
User behavior analysis is crucial in human-centered AI applications. In this field, the collection of sufficient and high-quality user behavior data has always been a fundamental yet challenging problem. An intuitive idea to address this problem is to simulate user behaviors automatically. However, due to the subjective and complex nature of human cognitive processes, reliably simulating user behavior is difficult. Recently, large language models (LLMs) have achieved remarkable success, showing great potential to achieve human-like intelligence. We argue that these models present significant opportunities for reliable user simulation, and have the potential to revolutionize traditional study paradigms in user behavior analysis. In this paper, we take recommender systems as an example to explore the potential of using LLMs for user simulation. Specifically, we regard each user as an LLM-based autonomous agent, and let different agents freely communicate, behave and evolve in a virtual simulator called RecAgent. For comprehensive simulation, we not only consider the behaviors within the recommender system (e.g., item browsing and clicking), but also account for external influential factors, such as friend chatting and social advertisement. Our simulator contains at most 1000 agents, and each agent is composed of a profiling module, a memory module and an action module, enabling it to behave consistently, reasonably and reliably. In addition, to operate our simulator more flexibly, we also design two global functions: real-human playing and system intervention. To evaluate the effectiveness of our simulator, we conduct extensive experiments from both agent and system perspectives.


REASONER: An Explainable Recommendation Dataset with Multi-aspect Real User Labeled Ground Truths: Towards more Measurable Explainable Recommendation
Xu Chen, Jingsen Zhang, Lei Wang, Quanyu Dai, Zhenhua Dong, Ruiming Tang, Rui Zhang, Li Chen, Wayne Xin Zhao, Ji-Rong Wen.
REASONER is an explainable recommendation dataset. It contains ground truths for multiple explanation purposes, for example, enhancing recommendation persuasiveness, informativeness and so on. In this dataset, the ground truth annotators are exactly the people who produce the user-item interactions, and they can select from multi-modal explanation candidates. This dataset can be widely used for explainable recommendation, unbiased recommendation and psychology-informed recommendation.
Conference on Neural Information Processing Systems (NeurIPS 2023 Datasets and Benchmarks Track)


RecBole: Towards a Unified, Comprehensive and Efficient Framework for Recommendation Algorithms.
Wayne Xin Zhao, Shanlei Mu, Yupeng Hou, Zihan Lin, Yushuo Chen, Xingyu Pan, Kaiyuan Li, Yujie Lu, Hui Wang, Changxin Tian, Yingqian Min, Zhichao Feng, Xinyan Fan, Xu Chen*, Pengfei Wang*, Wendi Ji, Yaliang Li, Xiaoling Wang and Ji-Rong Wen.
In RecBole, we implement 72 commonly used recommendation algorithms and provide formatted copies of 28 recommendation datasets. You are welcome to use our toolkit.
The Conference on Information and Knowledge Management (CIKM 2021, resource)

Measuring the "Why" in Recommender Systems: a Comprehensive Survey on the Evaluation of Explainable Recommendation.
Xu Chen, Yongfeng Zhang and Ji-Rong Wen.
In the field of explainable recommendation, how to evaluate the explanations has long been a fundamental yet not clearly discussed problem. In this survey, we aim to provide a systematic and comprehensive summary of existing evaluation strategies. The contents of this survey are drawn from more than 100 papers from top-tier conferences like IJCAI, AAAI, TheWebConf, SIGIR, KDD, RecSys, UMAP and IUI, and the complete comparisons are presented at https://shimo.im/sheets/VKrpYTcwVH6KXgdy/MODOC/.

Explainable Recommendation: A Survey and New Perspectives.
Yongfeng Zhang and Xu Chen.
In this survey, we (1) provide a chronological research timeline of explainable recommendation, (2) present a two-dimensional taxonomy to classify existing explainable recommendation research, and (3) summarize how explainable recommendation applies to different recommendation tasks.
Foundations and Trends in Information Retrieval (FTinIR 2020)