This repository contains the implementation of our Findings of ACL 2024 paper, "Unveiling the Truth and Facilitating Change: Towards Agent-based Large-scale Social Movement Simulation".
The framework simulates social media users in different ways:
- For LLM-empowered core users, the implementation is built on AgentVerse; many thanks to THUNLP for the open-source resource.
- For ordinary users supported by conventional ABMs, we use the mesa library to implement the agent-based models such as the Bounded Confidence Model.
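As an illustration of the kind of conventional ABM mentioned above, here is a minimal sketch of a pairwise bounded confidence update (the Deffuant-Weisbuch variant). It is written in plain Python rather than with mesa, and the function names, the opinion range [-1, 1], and the parameters `epsilon` (confidence threshold) and `mu` (convergence rate) are assumptions made for this example, not the repository's actual implementation.

```python
import random


def deffuant_step(opinions, epsilon=0.3, mu=0.5, rng=random):
    """Apply one pairwise bounded confidence update in place.

    Two distinct agents are drawn at random. They move toward each other
    only if their opinions differ by less than `epsilon`.
    """
    i, j = rng.sample(range(len(opinions)), 2)
    diff = opinions[j] - opinions[i]
    if abs(diff) < epsilon:
        opinions[i] += mu * diff
        opinions[j] -= mu * diff
    return opinions


def simulate(n_agents=50, n_steps=5000, epsilon=0.3, mu=0.5, seed=0):
    """Run the model from uniformly random initial opinions in [-1, 1]."""
    rng = random.Random(seed)
    opinions = [rng.uniform(-1.0, 1.0) for _ in range(n_agents)]
    for _ in range(n_steps):
        deffuant_step(opinions, epsilon, mu, rng)
    return opinions


if __name__ == "__main__":
    final = simulate()
    print(f"min={min(final):.3f} max={max(final):.3f}")
```

With `mu` at most 0.5 the two agents never overshoot each other, so opinions stay inside the initial range; smaller `epsilon` values tend to leave more separate opinion clusters.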
Social media has emerged as a cornerstone of social movements, wielding significant influence in driving societal change. Simulating the response of the public and forecasting the potential impact has become increasingly important. However, existing methods for simulating such phenomena encounter challenges concerning their efficacy and efficiency in capturing the behaviors of social movement participants. In this paper, we introduce a hybrid framework HiSim for social media user simulation, wherein users are categorized into two types. Core users are driven by Large Language Models, while numerous ordinary users are modeled by deductive agent-based models. We further construct a Twitter-like environment to replicate their response dynamics following trigger events. Subsequently, we develop a multi-faceted benchmark SoMoSiMu-Bench for evaluation and conduct comprehensive experiments across real-world datasets. Experimental results demonstrate the effectiveness and flexibility of our method.
To comply with Twitter's terms of service, we cannot publish the raw data. Instead, we only disclose the original tweet ids, from which you can filter out the users you want to study, to minimize the privacy risk.
- Metoo: from #metoo Digital Media Collection, we further keep the tweets during the events, where the ids can be downloaded at metoo_link.
- Roe: from #RoeOverturned, we further keep the tweets during the events, where the ids can be downloaded at roe_link.
- BLM: from blm_twitter_corpus, we further keep the tweets during the events, where the ids can be downloaded at blm_link.
For the user list used in our paper, we can only provide the user ids: id_link.
conda create -n HiSim python=3.9
conda activate HiSim
git clone https://github.com/xymou/HiSim.git
cd HiSim
pip install -e .
You need to export your OpenAI API key as follows:
# Export your OpenAI API key
export OPENAI_API_BASE="your_api_base_here"
export OPENAI_API_KEY="your_api_key_here"
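Before launching a simulation, it can help to confirm that both variables are visible to Python. The following is a small sketch of such a check; the helper name `missing_env_vars` is hypothetical and not part of this repository.

```python
import os

# Environment variables named in the setup instructions above.
REQUIRED_VARS = ("OPENAI_API_BASE", "OPENAI_API_KEY")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("OpenAI environment variables are set.")
```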
- agentverse
- agents
- simulation_agent
- twitter
- environments
- simulation_env
- twitter
- abm_model
- twitter_page
- info_box
- message
The micro-level simulation reproduces the behavior of individual users given a certain context, in a single-round pattern. In this scenario, we do not include multi-agent interaction but only observe the replication of individual behaviors. Here is an example:
agentverse-microtest --task simulation/roe_micro --ckpt path_to_save_the_intermediate_status
For micro-level simulation, you mainly need to prepare the agent list and the corresponding context list in the config.yaml. Because a user can appear in several (user, context) tuples, the agent list may contain repeated agents. The data fields in "context" include:
- tweet_page: the real tweet that user can see
- trigger_news: the offline event news at the corresponding time
- text: the ground-truth response of the user
- msg_type: the message type of the ground-truth response
Note that text and msg_type are not used in the simulation; they are provided for subsequent evaluation.
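To make the split between simulation inputs and evaluation labels concrete, here is a minimal sketch. The four field names come from the list above; the dict layout, the placeholder values, and the helper `split_context` are assumptions for illustration and may not match how the repository parses config.yaml.

```python
# Fields that hold ground truth and are kept only for evaluation.
EVAL_ONLY_FIELDS = ("text", "msg_type")


def split_context(context):
    """Separate what the agent sees from the held-out ground truth."""
    sim_input = {k: v for k, v in context.items() if k not in EVAL_ONLY_FIELDS}
    labels = {k: context[k] for k in EVAL_ONLY_FIELDS if k in context}
    return sim_input, labels


# Hypothetical context entry; every value is a placeholder.
example_context = {
    "tweet_page": "placeholder tweet the user can see",
    "trigger_news": "placeholder offline news at that time",
    "text": "placeholder ground-truth response",
    "msg_type": "placeholder message type",
}

if __name__ == "__main__":
    sim_input, labels = split_context(example_context)
    print("simulation input fields:", sorted(sim_input))
    print("evaluation label fields:", sorted(labels))
```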
The macro-level simulation runs for consecutive rounds, to help observe how collective opinions shift over time resulting from agent interactions. Here is an example:
agentverse-simulation --task simulation/roe_macro_hybrid --ckpt path_to_save_the_intermediate_status
You can create your own social media simulation by defining new scenarios in agentverse/tasks/simulation. There are some key points to define a macro-level simulation:
- Social Network Construction: you can define the agents' social network by providing their follower lists. Pass either a dict like {"userA":["followerA", "followerB"]} or a file path to the "follower_info" field of visibility, as in the example example_data/follower_dict.json.
- Offline News Feed: pre-defined offline news can be fed through "trigger_news" in environment. It is a dict whose keys are the turns at which news is fed and whose values are the news content.
- Conventional ABM-driven Users: you need to provide the type and parameters of the ABM models and the initial attitudes of all agents (including both core users and ordinary users) in the "abm_model" field.
- Target: the target/topic of the opinion modeling, such as "the Protection of Abortion", "Metoo Movement"
- Personal Experience: you can provide the real historical tweets of the users to model the personal memory, where the txt file can be specified in memory_path of personal_history of agent. The format of the authentic user tweets can be found in example_data/sample_user_tweets. If the historical tweets are not available, you can set the path to None.
A full example can be found in the config.yaml
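As a rough sketch of the first two inputs above, the snippet below builds a follower dict in the {"userA": ["followerA", "followerB"]} shape and looks up the news for a given turn. The helper names are hypothetical. The example also assumes that example_data/follower_dict.json uses this same mapping and that turns are integer keys; neither assumption is confirmed here.

```python
import json


def build_follower_dict(follow_edges):
    """Build {user: [followers]} from (follower, followee) pairs."""
    followers = {}
    for follower, followee in follow_edges:
        user_followers = followers.setdefault(followee, [])
        if follower not in user_followers:  # skip duplicate edges
            user_followers.append(follower)
    return followers


def to_json(follower_dict):
    """Serialize the follower dict, e.g. for saving to a JSON file."""
    return json.dumps(follower_dict, indent=2)


def news_for_turn(trigger_news, turn):
    """Return the news fed at `turn`, or None if no news is scheduled."""
    return trigger_news.get(turn)


if __name__ == "__main__":
    edges = [("followerA", "userA"), ("followerB", "userA")]
    print(to_json(build_follower_dict(edges)))

    # Hypothetical news schedule: turn index -> news content.
    trigger_news = {0: "placeholder news for the first turn"}
    print(news_for_turn(trigger_news, 0))
    print(news_for_turn(trigger_news, 1))
```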
Note:
If you want to run the simulation with LLM-based agents only (instead of the hybrid pattern), just set the abm_model config to None. An example can be found in the config.yaml.
Please consider citing this paper if you find this repository useful:
@article{mou2024unveiling,
title={Unveiling the Truth and Facilitating Change: Towards Agent-based Large-scale Social Movement Simulation},
author={Xinyi Mou and Zhongyu Wei and Xuanjing Huang},
year={2024},
journal={arXiv preprint arXiv:2402.16333},
}