> [!IMPORTANT]
> [2025-03-18] We released a technical preview of a new desktop app, Agent TARS: a multimodal AI agent that operates the browser by visually interpreting web pages and integrates seamlessly with command lines and file systems.
UI-TARS Desktop is a GUI agent application based on UI-TARS, a vision-language model, that allows you to control your computer using natural language.
📑 Paper
| 🤗 Hugging Face Models
| 🫨 Discord
| 🤖 ModelScope
🖥️ Desktop Application
| 👓 Midscene (use in browser)
| Instruction | Video |
| --- | --- |
| Get the current weather in SF using the web browser | new_mac_action_weather.mp4 |
| Send a tweet with the content "hello world" | new_send_twitter_windows.mp4 |
- [2025-02-20] - 📦 Introduced the UI-TARS SDK, a powerful cross-platform toolkit for building GUI automation agents.
- [2025-01-23] - 🚀 We updated the Cloud Deployment section in the Chinese guide (GUI模型部署教程, "GUI Model Deployment Tutorial") with new information on the ModelScope platform, which you can now use for deployment.
- 🤖 Natural language control powered by Vision-Language Model
- 🖥️ Screenshot and visual recognition support
- 🎯 Precise mouse and keyboard control
- 💻 Cross-platform support (Windows/macOS)
- 🔄 Real-time feedback and status display
- 🔐 Private and secure - fully local processing
See Quick Start.
See Deployment.
See CONTRIBUTING.md.
See @ui-tars/sdk.
UI-TARS Desktop is licensed under the Apache License 2.0.
If you find our paper and code useful in your research, please consider giving us a star ⭐ and a citation 📝:
@article{qin2025ui,
  title={UI-TARS: Pioneering Automated GUI Interaction with Native Agents},
  author={Qin, Yujia and Ye, Yining and Fang, Junjie and Wang, Haoming and Liang, Shihao and Tian, Shizuo and Zhang, Junda and Li, Jiahao and Li, Yunxin and Huang, Shijue and others},
  journal={arXiv preprint arXiv:2501.12326},
  year={2025}
}