AndroidGen: Building an Android Language Agent under Data Scarcity

Overview

Overview: AndroidGen framework is designed to complete tasks in Android. Our process comprises three stages: preliminary, task execution, and update. Preliminary (a): ExpSearch retrieve the top-1 similar tasks and trajectories from the database and input them into the agent. Task Execution (b): ReflectPlan assesses the progress and updates the plan. Then, the agent generates operations based on the environment, plan, and retrieval example. AutoCheck verifies these operations, executing them if successful or regenerating them if not. Update (c): StepCritic evaluates the trajectories in fine-grand and updates the database accordingly.

Features

ExpSearch: ExpSearch is a novel approach leveraging LLM’s in-context learning ability to optimize the agent iteratively by learning from its own trajectories.
ReflectPlan: We develop ReflectPlan that enables self-assessment of the progress of tasks during execution. This approach empowers the agent to enhance planning and reflecting capabilities
AutoCheck: We develop AutoCheck module to enhance agent robustness. Upon generating the operation, AutoCheck proactively verifies the response's validity. When detecting potential issues or non-compliant actions, the subsequent execution is terminated, and feedback is provided to the agent in the next round.
StepCritic: StepCritic can decompose tasks into various sub-goals, and evaluate the trajectory step-by-step. This approach enables a granular assessment of the trajectories, maximizing the data’s learning value.

Preparation

Prepare Code and Environments

Install the required dependencies:

pip install -r requirements.txt

Install AndroidWorld:

git clone https://github.com/google-research/android_world

You need to download Android Studio and configure it according to the guidance of AndroidWorld.

[Optional] Prepare Retriever Checkpoint

If you want to enable the ExpSearch function, you should download the retriever checkpoint and specify the path.

You need to download Contriever

Then change the "retriever_ckpt_path" in config.json to your path.

Prepare Agent LLM

You can use OpenAI or our trained models as your Android agent.

OpenAI Model

You need to export your OPENAI_TOKEN for usage.

export OPENAI_TOKEN = "YOUR KEY"

Trained GLM / Llama

Download our trained GLM here and Llama here, then deploy the checkpoint with vLLM

Usage

Export Environment Variables

export PYTHONPATH="Path to your androidworld:$PYTHONPATH"

Launch the Android Emulator from the command line

EMULATOR_NAME=AndroidWorldAVD
~/Library/Android/sdk/emulator/emulator -avd $EMULATOR_NAME -no-snapshot -grpc 8554

Configuration

llm: Settings related to the language model.
- model_name: Name of the model to use. Example: "gpt-4o-2024-08-06"
- model_type: Type of the model. Example: "text"
architecture: Settings related to the autonomous reasoning module.
- reflectplan: Configuration for the planning module.
  - model_name: Name of the model for planning. Example: "gpt-4o-2024-08-06"
  - model_type: Type of the model for planning. Example: "text"
- autocheck: Boolean flag for automatic checking. Example: true
- expsearch: Configuration for example retrieval.
  - retriever_ckpt_path: Path to the retriever checkpoint.
  - database_path: Path to the example database.
trace_dir: Directory for storing trace files. Example: "./episodes"

Run AndroidGen

To run AndroidGen, just type the following command in your terminal, and it will start working:

python run.py

After running, you will get the complete trajectory of the task in trace_dir

[Optional] Prepare database

After obtaining task trajectories from the previous runs, you can run judge.py to build your own database for ExpSearch.

python -m model.judge.judge \
    --data_dir=./episodes \
    --output_path=./database.json

Evaluation

To run the AndroidWorld evaluation, you can run the following command:

cd evaluate/androidworld
python eval.py \
  --suite_family=android_world \
  --agent_name=androidgen

For the evaluation of popular applications and AitW, we provide the datasets under /evaluate (the AitW test set is taken from DigiRL).

License

This repository is licensed under the Apache-2.0 License. All open-sourced data is for resarch purpose only.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
assets		assets
democase		democase
environment		environment
evaluate		evaluate
model		model
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
config.json		config.json
recorder.py		recorder.py
requirements.txt		requirements.txt
run.py		run.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

AndroidGen: Building an Android Language Agent under Data Scarcity

Table of Contents

Overview

Features

Preparation

Prepare Code and Environments

[Optional] Prepare Retriever Checkpoint

Prepare Agent LLM

OpenAI Model

Trained GLM / Llama

Usage

Export Environment Variables

Launch the Android Emulator from the command line

Configuration

Run AndroidGen

[Optional] Prepare database

Evaluation

License

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

THUDM/AndroidGen

Folders and files

Latest commit

History

Repository files navigation

AndroidGen: Building an Android Language Agent under Data Scarcity

Table of Contents

Overview

Features

Preparation

Prepare Code and Environments

[Optional] Prepare Retriever Checkpoint

Prepare Agent LLM

OpenAI Model

Trained GLM / Llama

Usage

Export Environment Variables

Launch the Android Emulator from the command line

Configuration

Run AndroidGen

[Optional] Prepare database

Evaluation

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages