Awesome Python

License: MIT

Hand-picked awesome Python libraries and frameworks, organised by category 🐍

Interactive version: www.awesomepython.org

Updated 05 May 2024

Categories

  • Newly Created Repositories - Awesome Python is regularly updated, and this category lists the most recently created GitHub repositories drawn from all the other categories here (10 repos)
  • Code Quality - Code quality tooling: linters, formatters, pre-commit hooks, unused code removal (17 repos)
  • Crypto and Blockchain - Cryptocurrency and blockchain libraries: trading bots, API integration, Ethereum virtual machine, solidity (13 repos)
  • Data - General data libraries: data processing, serialisation, formats, databases, SQL, connectors, web crawlers, data generation/augmentation/checks (100 repos)
  • Debugging - Debugging and tracing tools (9 repos)
  • Diffusion Text to Image - Text-to-image diffusion model libraries, tools and apps for generating images from natural language (36 repos)
  • Finance - Financial and quantitative libraries: investment research tools, market data, algorithmic trading, backtesting, financial derivatives (31 repos)
  • Game Development - Game development tools, engines and libraries (6 repos)
  • GIS - Geospatial libraries: raster and vector data formats, interactive mapping and visualisation, computing frameworks for processing images, projections (28 repos)
  • Graph - Graphs and network libraries: network analysis, graph machine learning, visualisation (6 repos)
  • GUI - Graphical user interface libraries and toolkits (8 repos)
  • Jupyter - Jupyter, JupyterLab and Notebook tools, libraries and plugins (24 repos)
  • LLMs and ChatGPT - Large language model and GPT libraries and frameworks: auto-gpt, agents, QnA, chain-of-thought workflows, API integrations. Also see the Natural Language Processing category for crossover (228 repos)
  • Math and Science - Mathematical, numerical and scientific libraries (22 repos)
  • Machine Learning - General - General and classical machine learning libraries. See below for other sections covering specialised ML areas (152 repos)
  • Machine Learning - Deep Learning - Machine learning libraries that cross over with deep learning in some way (71 repos)
  • Machine Learning - Interpretability - Machine learning interpretability libraries. Covers explainability, prediction explanations, dashboards, understanding knowledge development in training (16 repos)
  • Machine Learning - Ops - MLOps tools, frameworks and libraries: intersection of machine learning, data engineering and DevOps; deployment, health, diagnostics and governance of ML models (44 repos)
  • Machine Learning - Reinforcement - Machine learning libraries and toolkits that cross over with reinforcement learning in some way: agent reinforcement learning, agent environments, RLHF (22 repos)
  • Machine Learning - Time Series - Machine learning and classical timeseries libraries: forecasting, seasonality, anomaly detection, econometrics (18 repos)
  • Natural Language Processing - Natural language processing libraries and toolkits: text processing, topic modelling, tokenisers, chatbots. Also see the LLMs and ChatGPT category for crossover (82 repos)
  • Packaging - Python packaging, dependency management and bundling (28 repos)
  • Pandas - Pandas and dataframe libraries: data analysis, statistical reporting, pandas GUIs, pandas performance optimisations (24 repos)
  • Performance - Performance, parallelisation and low level libraries (28 repos)
  • Profiling - Memory and CPU/GPU profiling tools and libraries (11 repos)
  • Security - Security related libraries: vulnerability discovery, SQL injection, environment auditing (14 repos)
  • Simulation - Simulation libraries: robotics, economic, agent-based, traffic, physics, astronomy, chemistry, quantum simulation. Also see the Math and Science category for crossover (27 repos)
  • Study - Miscellaneous study resources: algorithms, general resources, system design, code repos for textbooks, best practices, tutorials (52 repos)
  • Template - Template tools and libraries: cookiecutter repos, generators, quick-starts (8 repos)
  • Terminal - Terminal and console tools and libraries: CLI tools, terminal based formatters, progress bars (14 repos)
  • Testing - Testing libraries: unit testing, load testing, acceptance testing, code coverage, browser automation, plugins (22 repos)
  • Typing - Typing libraries: static and run-time type checking, annotations (12 repos)
  • Utility - General utility libraries: miscellaneous tools, linters, code formatters, version management, package tools, documentation tools (189 repos)
  • Visualisation - Visualisation tools and libraries: application frameworks, 2D/3D plotting, dashboards, WebGL (33 repos)
  • Web - Web related frameworks and libraries: webapp servers, WSGI, ASGI, asyncio, HTTP, REST, user management (56 repos)

Newly Created Repositories

Awesome Python is regularly updated, and this category lists the most recently created GitHub repositories drawn from all the other categories here.

  1. xai-org/grok-1 ⭐ 48,132
    This repository contains JAX example code for loading and running the Grok-1 open-weights model.

  2. karpathy/llm.c ⭐ 17,622
    LLM training in simple, pure C/CUDA. There is no need for 245MB of PyTorch or 107MB of CPython.

  3. stitionai/devika ⭐ 16,933
    Devika is an advanced AI software engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective.

  4. jasonppy/VoiceCraft ⭐ 6,729
    Zero-Shot Speech Editing and Text-to-Speech in the Wild

  5. apple/corenet ⭐ 6,226
    CoreNet is a deep neural network toolkit that allows researchers and engineers to train standard and novel small and large-scale models for a variety of tasks, including foundation models (e.g., CLIP and LLM), object classification, object detection, and semantic segmentation.

  6. databricks/dbrx ⭐ 2,399
    Code examples and resources for DBRX, a large language model developed by Databricks
    🔗 www.databricks.com

  7. cohere-ai/cohere-toolkit ⭐ 1,928
    Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.

  8. luijait/DarkGPT ⭐ 1,725
    DarkGPT is an OSINT assistant based on GPT-4-200K (recommended use) designed to perform queries on leaked databases, thus providing an artificial intelligence assistant that can be useful in your traditional OSINT processes.

  9. google-deepmind/penzai ⭐ 1,413
    A JAX library for writing models as legible, functional pytree data structures, along with tools for visualizing, modifying, and analyzing them. Penzai focuses on making it easy to do stuff with models after they have been trained
    🔗 penzai.readthedocs.io

  10. pydantic/logfire ⭐ 652
    Uncomplicated Observability for Python and beyond! 🪵🔥
    🔗 docs.pydantic.dev/logfire

Code Quality

Code quality tooling: linters, formatters, pre-commit hooks, unused code removal.

  1. psf/black ⭐ 37,434
    The uncompromising Python code formatter
    🔗 black.readthedocs.io/en/stable

  2. astral-sh/ruff ⭐ 26,765
    An extremely fast Python linter and code formatter, written in Rust.
    🔗 docs.astral.sh/ruff

  3. google/yapf ⭐ 13,655
    A formatter for Python files

  4. pre-commit/pre-commit ⭐ 12,087
    A framework for managing and maintaining multi-language pre-commit hooks.
    🔗 pre-commit.com

  5. sqlfluff/sqlfluff ⭐ 7,232
    A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
    🔗 www.sqlfluff.com

  6. pycqa/isort ⭐ 6,321
    A Python utility / library to sort imports.
    🔗 pycqa.github.io/isort

  7. davidhalter/jedi ⭐ 5,673
    Awesome autocompletion, static analysis and refactoring library for python
    🔗 jedi.readthedocs.io

  8. pycqa/pylint ⭐ 5,129
    It's not just a linter that annoys you!
    🔗 pylint.readthedocs.io/en/latest

  9. asottile/pyupgrade ⭐ 3,331
    A tool (and pre-commit hook) to automatically upgrade syntax for newer versions of the language.

  10. pycqa/flake8 ⭐ 3,265
    flake8 is a python tool that glues together pycodestyle, pyflakes, mccabe, and third-party plugins to check the style and quality of some python code.
    🔗 flake8.pycqa.org

  11. jendrikseipp/vulture ⭐ 3,024
    Find dead Python code

  12. wemake-services/wemake-python-styleguide ⭐ 2,430
    The strictest and most opinionated python linter ever!
    🔗 wemake-python-styleguide.rtfd.io

  13. codespell-project/codespell ⭐ 1,747
    check code for common misspellings

  14. python-lsp/python-lsp-server ⭐ 1,675
    Fork of the python-language-server project, maintained by the Spyder IDE team and the community

  15. sourcery-ai/sourcery ⭐ 1,483
    Instant AI code reviews
    🔗 sourcery.ai

  16. akaihola/darker ⭐ 612
    Apply black reformatting to Python files only in regions changed since a given commit. For a practical usage example, see the blog post at https://dev.to/akaihola/improving-python-code-incrementally-3f7a
    🔗 pypi.org/project/darker

  17. tconbeer/sqlfmt ⭐ 345
    sqlfmt formats your dbt SQL files so you don't have to
    🔗 sqlfmt.com

Crypto and Blockchain

Cryptocurrency and blockchain libraries: trading bots, API integration, Ethereum virtual machine, solidity.

  1. ccxt/ccxt ⭐ 31,415
    A JavaScript / TypeScript / Python / C# / PHP cryptocurrency trading API with support for more than 100 bitcoin/altcoin exchanges
    🔗 docs.ccxt.com

  2. freqtrade/freqtrade ⭐ 25,609
    Free, open source crypto trading bot
    🔗 www.freqtrade.io

  3. crytic/slither ⭐ 5,021
    Static Analyzer for Solidity and Vyper
    🔗 blog.trailofbits.com/2018/10/19/slither-a-solidity-static-analysis-framework

  4. ethereum/web3.py ⭐ 4,809
    A python interface for interacting with the Ethereum blockchain and ecosystem.
    🔗 web3py.readthedocs.io

  5. ethereum/consensus-specs ⭐ 3,432
    Ethereum Proof-of-Stake Consensus Specifications

  6. cyberpunkmetalhead/Binance-volatility-trading-bot ⭐ 3,347
    This is a fully functioning Binance trading bot that measures the volatility of every coin on Binance and places trades with the highest gaining coins. If you like this project, consider donating through the Brave browser to allow me to continuously improve the script.

  7. ethereum/py-evm ⭐ 2,188
    A Python implementation of the Ethereum Virtual Machine
    🔗 py-evm.readthedocs.io/en/latest

  8. bmoscon/cryptofeed ⭐ 2,074
    Cryptocurrency Exchange Websocket Data Feed Handler

  9. binance/binance-public-data ⭐ 1,346
    Details on how to get Binance public data

  10. ofek/bit ⭐ 1,207
    Bitcoin made easy.
    🔗 ofek.dev/bit

  11. man-c/pycoingecko ⭐ 1,032
    Python wrapper for the CoinGecko API

  12. palkeo/panoramix ⭐ 756
    Ethereum decompiler

  13. dylanhogg/awesome-crypto ⭐ 66
    A list of awesome crypto and blockchain projects
    🔗 www.awesomecrypto.xyz

Data

General data libraries: data processing, serialisation, formats, databases, SQL, connectors, web crawlers, data generation/augmentation/checks.

  1. scrapy/scrapy ⭐ 50,957
    Scrapy, a fast high-level web crawling & scraping framework for Python.
    🔗 scrapy.org

  2. apache/spark ⭐ 38,410
    Apache Spark - A unified analytics engine for large-scale data processing
    🔗 spark.apache.org

  3. getredash/redash ⭐ 24,994
    Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
    🔗 redash.io

  4. jaidedai/EasyOCR ⭐ 22,034
    Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
    🔗 www.jaided.ai

  5. mindsdb/mindsdb ⭐ 21,337
    The platform for customizing AI from enterprise data
    🔗 mindsdb.com

  6. qdrant/qdrant ⭐ 17,990
    Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
    🔗 qdrant.tech

  7. joke2k/faker ⭐ 17,117
    Faker is a Python package that generates fake data for you.
    🔗 faker.readthedocs.io

  8. humansignal/label-studio ⭐ 16,561
    Label Studio is an open source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats.
    🔗 labelstud.io

  9. binux/pyspider ⭐ 16,336
    A Powerful Spider(Web Crawler) System in Python.
    🔗 docs.pyspider.org

  10. twintproject/twint ⭐ 15,556
    An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.

  11. airbytehq/airbyte ⭐ 14,099
    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
    🔗 airbyte.com

  12. apache/arrow ⭐ 13,555
    Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
    🔗 arrow.apache.org

  13. tiangolo/sqlmodel ⭐ 13,041
    SQL databases in Python, designed for simplicity, compatibility, and robustness.
    🔗 sqlmodel.tiangolo.com

  14. chroma-core/chroma ⭐ 12,380
    the AI-native open-source embedding database
    🔗 www.trychroma.com

  15. redis/redis-py ⭐ 12,272
    Redis Python client

  16. coleifer/peewee ⭐ 10,812
    a small, expressive orm -- supports postgresql, mysql, sqlite and cockroachdb
    🔗 docs.peewee-orm.com

  17. s0md3v/Photon ⭐ 10,517
    Incredibly fast crawler designed for OSINT.

  18. simonw/datasette ⭐ 8,955
    An open source multi-tool for exploring and publishing data
    🔗 datasette.io

  19. sqlalchemy/sqlalchemy ⭐ 8,829
    The Database Toolkit for Python
    🔗 www.sqlalchemy.org

  20. bigscience-workshop/petals ⭐ 8,692
    🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
    🔗 petals.dev

  21. avaiga/taipy ⭐ 8,673
    Turns Data and AI algorithms into production-ready web applications in no time.
    🔗 www.taipy.io

  22. yzhao062/pyod ⭐ 7,964
    A Comprehensive and Scalable Python Library for Outlier Detection (Anomaly Detection)
    🔗 pyod.readthedocs.io

  23. voxel51/fiftyone ⭐ 6,721
    The open-source tool for building high-quality datasets and computer vision models
    🔗 fiftyone.ai

  24. gristlabs/grist-core ⭐ 6,269
    Grist is the evolution of spreadsheets.
    🔗 www.getgrist.com

  25. alirezamika/autoscraper ⭐ 5,947
    A Smart, Automatic, Fast and Lightweight Web Scraper for Python

  26. kaggle/kaggle-api ⭐ 5,925
    Official Kaggle API

  27. tobymao/sqlglot ⭐ 5,566
    Python SQL Parser and Transpiler
    🔗 sqlglot.com

  28. vi3k6i5/flashtext ⭐ 5,539
    Extract Keywords from sentence or Replace keywords in sentences.

  29. madmaze/pytesseract ⭐ 5,529
    A Python wrapper for Google Tesseract

  30. airbnb/knowledge-repo ⭐ 5,433
    A next-generation curated knowledge sharing platform for data scientists and other technical professions.

  31. facebookresearch/AugLy ⭐ 4,900
    A data augmentations library for audio, image, text, and video.
    🔗 ai.facebook.com/blog/augly-a-new-data-augmentation-library-to-help-build-more-robust-ai-models

  32. jazzband/tablib ⭐ 4,531
    Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
    🔗 tablib.readthedocs.io

  33. superduperdb/superduperdb ⭐ 4,371
    🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalable model training and vector search.
    🔗 superduperdb.com

  34. lk-geimfari/mimesis ⭐ 4,307
    Mimesis is a powerful Python library that empowers developers to generate massive amounts of synthetic data efficiently.
    🔗 mimesis.name

  35. amundsen-io/amundsen ⭐ 4,277
    Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
    🔗 www.amundsen.io/amundsen

  36. ibis-project/ibis ⭐ 4,240
    Ibis is a Python library that provides a lightweight, universal interface for data wrangling. It helps Python users explore and transform data of any size, stored anywhere.
    🔗 ibis-project.org

  37. mongodb/mongo-python-driver ⭐ 4,053
    PyMongo - the Official MongoDB Python driver
    🔗 pymongo.readthedocs.io

  38. andialbrecht/sqlparse ⭐ 3,589
    A non-validating SQL parser module for Python

  39. jmcnamara/XlsxWriter ⭐ 3,495
    A Python module for creating Excel XLSX files.
    🔗 xlsxwriter.readthedocs.io

  40. run-llama/llama-hub ⭐ 3,404
    A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain
    🔗 llamahub.ai

  41. deepchecks/deepchecks ⭐ 3,373
    Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
    🔗 docs.deepchecks.com/stable

  42. praw-dev/praw ⭐ 3,321
    PRAW, an acronym for "Python Reddit API Wrapper", is a python package that allows for simple access to Reddit's API.
    🔗 praw.readthedocs.io

  43. rom1504/img2dataset ⭐ 3,265
    Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

  44. giskard-ai/giskard ⭐ 3,147
    🐢 Open-Source Evaluation & Testing framework for LLMs and ML models
    🔗 docs.giskard.ai

  45. pyeve/cerberus ⭐ 3,111
    Lightweight, extensible data validation library for Python
    🔗 python-cerberus.org

  46. datafold/data-diff ⭐ 2,847
    Compare tables within or across databases
    🔗 docs.datafold.com

  47. zoomeranalytics/xlwings ⭐ 2,840
    xlwings is a Python library that makes it easy to call Python from Excel and vice versa. It works with Excel on Windows and macOS as well as with Google Sheets and Excel on the web.
    🔗 www.xlwings.org

  48. pallets/itsdangerous ⭐ 2,828
    Safely pass trusted data to untrusted environments and back.
    🔗 itsdangerous.palletsprojects.com

  49. lancedb/lancedb ⭐ 2,824
    Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
    🔗 lancedb.github.io/lancedb

  50. goldsmith/Wikipedia ⭐ 2,820
    A Pythonic wrapper for the Wikipedia API
    🔗 wikipedia.readthedocs.org

  51. docarray/docarray ⭐ 2,762
    Represent, send, store and search multimodal data
    🔗 docs.docarray.org

  52. awslabs/amazon-redshift-utils ⭐ 2,713
    Amazon Redshift Utils contains utilities, scripts and views that are useful in a Redshift environment

  53. sqlalchemy/alembic ⭐ 2,472
    A database migrations tool for SQLAlchemy.

  54. kayak/pypika ⭐ 2,380
    PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.
    🔗 pypika.readthedocs.io/en/latest

  55. pynamodb/PynamoDB ⭐ 2,377
    A pythonic interface to Amazon's DynamoDB
    🔗 pynamodb.readthedocs.io

  56. emirozer/fake2db ⭐ 2,256
    Generate fake but valid data filled databases for test purposes using most popular patterns(AFAIK). Current support is sqlite, mysql, postgresql, mongodb, redis, couchdb.

  57. sdv-dev/SDV ⭐ 2,143
    Synthetic data generation for tabular data
    🔗 docs.sdv.dev/sdv

  58. uqfoundation/dill ⭐ 2,139
    serialize all of Python
    🔗 dill.rtfd.io

  59. accenture/AmpliGraph ⭐ 2,093
    Python library for Representation Learning on Knowledge Graphs https://docs.ampligraph.org

  60. graphistry/pygraphistry ⭐ 2,060
    PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer

  61. rapidai/RapidOCR ⭐ 2,002
    Awesome multi-language OCR toolkits based on ONNXRuntime, OpenVINO and PaddlePaddle.
    🔗 rapidai.github.io/rapidocrdocs/docs

  62. samuelcolvin/arq ⭐ 1,934
    Fast job queuing and RPC in python with asyncio and redis.
    🔗 arq-docs.helpmanual.io

  63. sfu-db/connector-x ⭐ 1,787
    Fastest library to load data from DB to DataFrames in Rust and Python
    🔗 sfu-db.github.io/connector-x/intro.html

  64. uber/petastorm ⭐ 1,752
    Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.

  65. dlt-hub/dlt ⭐ 1,737
    data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
    🔗 dlthub.com/docs

  66. pathwaycom/pathway ⭐ 1,733
    Pathway is a high-throughput, low-latency data processing framework that handles live data & streaming for you. Made with ❤️ for Python & ML/AI developers.
    🔗 pathway.com

  67. agronholm/sqlacodegen ⭐ 1,723
    Automatic model code generator for SQLAlchemy

  68. aio-libs/aiomysql ⭐ 1,703
    aiomysql is a library for accessing a MySQL database from asyncio
    🔗 aiomysql.rtfd.io

  69. milvus-io/bootcamp ⭐ 1,628
    Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.
    🔗 milvus.io

  70. simple-salesforce/simple-salesforce ⭐ 1,596
    A very simple Salesforce.com REST API client for Python

  71. aminalaee/sqladmin ⭐ 1,590
    SQLAlchemy Admin for FastAPI and Starlette
    🔗 aminalaee.dev/sqladmin

  72. collerek/ormar ⭐ 1,580
    python async orm with fastapi in mind and pydantic validation
    🔗 collerek.github.io/ormar

  73. simonw/sqlite-utils ⭐ 1,522
    Python CLI utility and library for manipulating SQLite databases
    🔗 sqlite-utils.datasette.io

  74. sdispater/orator ⭐ 1,425
    The Orator ORM provides a simple yet beautiful ActiveRecord implementation.
    🔗 orator-orm.com

  75. eleutherai/the-pile ⭐ 1,407
    The Pile is a large, diverse, open source language modelling data set that consists of many smaller datasets combined together.

  76. mchong6/JoJoGAN ⭐ 1,406
    Official PyTorch repo for JoJoGAN: One Shot Face Stylization

  77. aio-libs/aiopg ⭐ 1,376
    aiopg is a library for accessing a PostgreSQL database from asyncio
    🔗 aiopg.readthedocs.io

  78. zarr-developers/zarr-python ⭐ 1,340
    An implementation of chunked, compressed, N-dimensional arrays for Python.
    🔗 zarr.readthedocs.io

  79. huggingface/datatrove ⭐ 1,313
    Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

  80. ydataai/ydata-synthetic ⭐ 1,296
    Synthetic data generators for tabular and time-series data
    🔗 docs.synthetic.ydata.ai

  81. google/tensorstore ⭐ 1,280
    Library for reading and writing large multi-dimensional arrays.
    🔗 google.github.io/tensorstore

  82. scholarly-python-package/scholarly ⭐ 1,235
    Retrieve author and publication information from Google Scholar in a friendly, Pythonic way without having to worry about CAPTCHAs!
    🔗 scholarly.readthedocs.io

  83. pytorch/data ⭐ 1,070
    A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.

  84. eliasdabbas/advertools ⭐ 1,058
    advertools - online marketing productivity and analysis tools
    🔗 advertools.readthedocs.io

  85. uber/fiber ⭐ 1,040
    Distributed Computing for AI Made Simple
    🔗 uber.github.io/fiber

  86. brettkromkamp/contextualise ⭐ 1,036
    Contextualise is an effective tool particularly suited for organising information-heavy projects and activities consisting of unstructured and widely diverse data and information resources
    🔗 contextualise.dev

  87. aio-libs/aiocache ⭐ 1,027
    Asyncio cache manager for redis, memcached and memory
    🔗 aiocache.readthedocs.io

  88. intake/intake ⭐ 982
    Intake is a lightweight package for finding, investigating, loading and disseminating data.
    🔗 intake.readthedocs.io

  89. scikit-hep/awkward ⭐ 793
    Manipulate JSON-like data with NumPy-like idioms.
    🔗 awkward-array.org

  90. koaning/human-learn ⭐ 780
    Natural Intelligence is still a pretty good idea.
    🔗 koaning.github.io/human-learn

  91. duckdb/dbt-duckdb ⭐ 736
    dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)

  92. macbre/sql-metadata ⭐ 736
    Uses tokenized query returned by python-sqlparse and generates query metadata
    🔗 pypi.python.org/pypi/sql-metadata

  93. hyperqueryhq/whale ⭐ 724
    🐳 The stupidly simple CLI workspace for your data warehouse.
    🔗 rsyi.gitbook.io/whale

  94. goccy/bigquery-emulator ⭐ 712
    BigQuery emulator provides a way to launch a BigQuery server on your local machine for testing and development.

  95. googleapis/python-bigquery ⭐ 708
    Python Client for Google BigQuery

  96. mcfunley/pugsql ⭐ 663
    A HugSQL-inspired database library for Python
    🔗 pugsql.org

  97. dgarnitz/vectorflow ⭐ 637
    VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice.
    🔗 www.getvectorflow.com

  98. kagisearch/vectordb ⭐ 546
    A minimal Python package for storing and retrieving text using chunking, embeddings, and vector search.
    🔗 vectordb.com

  99. koaning/bulk ⭐ 449
    Bulk is a quick UI developer tool to apply some bulk labels.

  100. igorbenav/fastcrud ⭐ 413
    FastCRUD is a Python package for FastAPI, offering robust async CRUD operations and flexible endpoint creation utilities.

Debugging

Debugging and tracing tools.

  1. cool-rr/PySnooper ⭐ 16,265
    Never use print for debugging again

  2. gruns/icecream ⭐ 8,484
    🍦 Never use print() to debug again.

  3. shobrook/rebound ⭐ 4,075
    Command-line tool that instantly fetches Stack Overflow results when an exception is thrown

  4. inducer/pudb ⭐ 2,877
    Full-screen console debugger for Python
    🔗 documen.tician.de/pudb

  5. gotcha/ipdb ⭐ 1,812
    Integration of IPython pdb

  6. alexmojaki/heartrate ⭐ 1,728
    Simple real time visualisation of the execution of a Python program.

  7. alexmojaki/birdseye ⭐ 1,634
    Graphical Python debugger which lets you easily view the values of all evaluated expressions
    🔗 birdseye.readthedocs.io

  8. alexmojaki/snoop ⭐ 1,197
    A powerful set of Python debugging tools, based on PySnooper

  9. samuelcolvin/python-devtools ⭐ 947
    Dev tools for python
    🔗 python-devtools.helpmanual.io

Diffusion Text to Image

Text-to-image diffusion model libraries, tools and apps for generating images from natural language.

  1. automatic1111/stable-diffusion-webui ⭐ 130,176
    Stable Diffusion web UI

  2. compvis/stable-diffusion ⭐ 65,513
    A latent text-to-image diffusion model
    🔗 ommer-lab.com/research/latent-diffusion-models

  3. stability-ai/stablediffusion ⭐ 36,334
    High-Resolution Image Synthesis with Latent Diffusion Models

  4. comfyanonymous/ComfyUI ⭐ 33,811
    The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.

  5. lllyasviel/ControlNet ⭐ 27,951
    Let us control diffusion models!

  6. huggingface/diffusers ⭐ 22,640
    🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
    🔗 huggingface.co/docs/diffusers

  7. invoke-ai/InvokeAI ⭐ 21,347
    InvokeAI is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, supports terminal use through a CLI, and serves as the foundation for multi...
    🔗 invoke-ai.github.io/invokeai

  8. apple/ml-stable-diffusion ⭐ 16,128
    Stable Diffusion with Core ML on Apple Silicon

  9. borisdayma/dalle-mini ⭐ 14,642
    DALL·E Mini - Generate images from a text prompt
    🔗 www.craiyon.com

  10. divamgupta/diffusionbee-stable-diffusion-ui ⭐ 11,937
    Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.
    🔗 diffusionbee.com

  11. lucidrains/DALLE2-pytorch ⭐ 10,836
    Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch

  12. compvis/latent-diffusion ⭐ 10,626
    High-Resolution Image Synthesis with Latent Diffusion Models

  13. instantid/InstantID ⭐ 9,891
    InstantID : Zero-shot Identity-Preserving Generation in Seconds 🔥
    🔗 instantid.github.io

  14. facebookresearch/dinov2 ⭐ 7,899
    PyTorch code and models for the DINOv2 self-supervised learning method.

  15. ashawkey/stable-dreamfusion ⭐ 7,826
    Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.

  16. carson-katri/dream-textures ⭐ 7,607
    Stable Diffusion built-in to Blender

  17. xavierxiao/Dreambooth-Stable-Diffusion ⭐ 7,456
    Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion

  18. timothybrooks/instruct-pix2pix ⭐ 5,960
    PyTorch implementation of InstructPix2Pix, an instruction-based image editing model, based on the original CompVis/stable_diffusion repo.

  19. openai/consistency_models ⭐ 5,941
    Official repo for consistency models.

  20. idea-research/GroundingDINO ⭐ 5,051
    Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
    🔗 arxiv.org/abs/2303.05499

  21. salesforce/BLIP ⭐ 4,274
    PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

  22. nateraw/stable-diffusion-videos ⭐ 4,235
    Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts

  23. jina-ai/discoart ⭐ 3,839
    🪩 Create Disco Diffusion artworks in one line

  24. lkwq007/stablediffusion-infinity ⭐ 3,805
    Outpainting with Stable Diffusion on an infinite canvas

  25. openai/glide-text2im ⭐ 3,471
    GLIDE: a diffusion-based text-conditional image synthesis model

  26. mlc-ai/web-stable-diffusion ⭐ 3,439
    Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.
    🔗 mlc.ai/web-stable-diffusion

  27. openai/improved-diffusion ⭐ 2,819
    Release for Improved Denoising Diffusion Probabilistic Models

  28. saharmor/dalle-playground ⭐ 2,762
    A playground to generate images from any text prompt using Stable Diffusion (past: using DALL-E Mini)

  29. stability-ai/stability-sdk ⭐ 2,399
    SDK for interacting with stability.ai APIs (e.g. stable diffusion inference)
    🔗 platform.stability.ai

  30. divamgupta/stable-diffusion-tensorflow ⭐ 1,568
    Stable Diffusion in TensorFlow / Keras

  31. coyote-a/ultimate-upscale-for-automatic1111 ⭐ 1,499
    Ultimate SD Upscale extension for AUTOMATIC1111 Stable Diffusion web UI

  32. nvlabs/prismer ⭐ 1,288
    The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
    🔗 shikun.io/projects/prismer

  33. chenyangqiqi/FateZero ⭐ 1,044
    [ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
    🔗 fate-zero-edit.github.io

  34. thereforegames/unprompted ⭐ 746
    Templating language written for Stable Diffusion workflows. Available as an extension for the Automatic1111 WebUI.

  35. sharonzhou/long_stable_diffusion ⭐ 674
    Long-form text-to-images generation, using a pipeline of deep generative models (GPT-3 and Stable Diffusion)

  36. tanelp/tiny-diffusion ⭐ 535
    A minimal PyTorch implementation of probabilistic diffusion models for 2D datasets.

Finance

Financial and quantitative libraries: investment research tools, market data, algorithmic trading, backtesting, financial derivatives.

  1. openbb-finance/OpenBBTerminal ⭐ 26,122
    Investment Research for Everyone, Everywhere.
    🔗 openbb.co

  2. quantopian/zipline ⭐ 17,077
    Zipline, a Pythonic Algorithmic Trading Library
    🔗 www.zipline.io

  3. microsoft/qlib ⭐ 14,191
    Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas to implementing productions. Qlib supports diverse machine learning modeling paradigms, including supervised learning, ...
    🔗 qlib.readthedocs.io/en/latest

  4. mementum/backtrader ⭐ 13,079
    Python Backtesting library for trading strategies
    🔗 www.backtrader.com

  5. ranaroussi/yfinance ⭐ 11,888
    Download market data from Yahoo! Finance's API
    🔗 aroussi.com/post/python-yahoo-finance

  6. ai4finance-foundation/FinGPT ⭐ 11,531
    FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
    🔗 ai4finance.org

  7. ai4finance-foundation/FinRL ⭐ 9,124
    FinRL: Financial Reinforcement Learning. 🔥
    🔗 ai4finance.org

  8. ta-lib/ta-lib-python ⭐ 9,038
    Python wrapper for TA-Lib (http://ta-lib.org/).
    🔗 ta-lib.github.io/ta-lib-python

  9. quantconnect/Lean ⭐ 8,714
    Lean Algorithmic Trading Engine by QuantConnect (Python, C#)
    🔗 lean.io

  10. quantopian/pyfolio ⭐ 5,431
    Portfolio and risk analytics in Python
    🔗 quantopian.github.io/pyfolio

  11. kernc/backtesting.py ⭐ 4,847
    🔎 📈 🐍 💰 Backtest trading strategies in Python.
    🔗 kernc.github.io/backtesting.py

  12. twopirllc/pandas-ta ⭐ 4,765
    Technical Analysis Indicators - Pandas TA is an easy to use Python 3 Pandas Extension with 150+ Indicators
    🔗 twopirllc.github.io/pandas-ta

  13. ranaroussi/quantstats ⭐ 4,304
    Portfolio analytics for quants, written in Python

  14. gbeced/pyalgotrade ⭐ 4,299
    Python Algorithmic Trading Library
    🔗 gbeced.github.io/pyalgotrade

  15. google/tf-quant-finance ⭐ 4,292
    High-performance TensorFlow library for quantitative finance.

  16. borisbanushev/stockpredictionai ⭐ 3,944
    In this notebook I will create a complete process for predicting stock price movements. Follow along and we will achieve some pretty good results. For that purpose we will use a Generative Adversarial Network (GAN) with LSTM, a type of Recurrent Neural Network, as generator, and a Convolutional Neural Networ...

  17. polakowo/vectorbt ⭐ 3,751
    Find your trading edge, using the fastest engine for backtesting, algorithmic trading, and research.
    🔗 vectorbt.dev

  18. matplotlib/mplfinance ⭐ 3,361
    Financial Markets Data Visualization using Matplotlib
    🔗 pypi.org/project/mplfinance

  19. cuemacro/finmarketpy ⭐ 3,356
    Python library for backtesting trading strategies & analyzing financial markets (formerly pythalesians)
    🔗 www.cuemacro.com

  20. quantopian/alphalens ⭐ 3,094
    Performance analysis of predictive (alpha) stock factors
    🔗 quantopian.github.io/alphalens

  21. zvtvz/zvt ⭐ 2,990
    modular quant framework.
    🔗 zvt.readthedocs.io/en/latest

  22. goldmansachs/gs-quant ⭐ 2,480
    Python toolkit for quantitative finance
    🔗 developer.gs.com/discover/products/gs-quant

  23. robcarver17/pysystemtrade ⭐ 2,398
    Systematic Trading in python

  24. quantopian/research_public ⭐ 2,318
    Quantitative research and educational materials
    🔗 www.quantopian.com/lectures

  25. pmorissette/bt ⭐ 2,031
    bt - flexible backtesting for Python
    🔗 pmorissette.github.io/bt

  26. blankly-finance/blankly ⭐ 1,973
    🚀 💸 Easily build, backtest and deploy your algo in just a few lines of code. Trade stocks, cryptos, and forex across exchanges w/ one package.
    🔗 package.blankly.finance

  27. domokane/FinancePy ⭐ 1,913
    A Python Finance Library that focuses on the pricing and risk-management of Financial Derivatives, including fixed-income, equity, FX and credit derivatives.
    🔗 financepy.com

  28. pmorissette/ffn ⭐ 1,799
    ffn - a financial function library for Python
    🔗 pmorissette.github.io/ffn

  29. cuemacro/findatapy ⭐ 1,567
    Python library to download market data via Bloomberg, Eikon, Quandl, Yahoo etc.

  30. quantopian/empyrical ⭐ 1,227
    Common financial risk and performance metrics. Used by zipline and pyfolio.
    🔗 quantopian.github.io/empyrical

  31. idanya/algo-trader ⭐ 746
    Trading bot with support for realtime trading, backtesting, custom strategies and much more.

Game Development

Game development tools, engines and libraries.

  1. kitao/pyxel ⭐ 13,187
    A retro game engine for Python

  2. pygame/pygame ⭐ 6,979
    🐍🎮 pygame (the library) is a Free and Open Source python programming language library for making multimedia applications like games built on top of the excellent SDL library. C, Python, Native, OpenGL.
    🔗 www.pygame.org

  3. panda3d/panda3d ⭐ 4,270
    Powerful, mature open-source cross-platform game engine for Python and C++, developed by Disney and CMU
    🔗 www.panda3d.org

  4. pokepetter/ursina ⭐ 2,091
    A game engine powered by python and panda3d.
    🔗 pokepetter.github.io/ursina

  5. pyglet/pyglet ⭐ 1,756
    pyglet is a cross-platform windowing and multimedia library for Python, for developing games and other visually rich applications.
    🔗 pyglet.org

  6. pythonarcade/arcade ⭐ 1,612
    Easy to use Python library for creating 2D arcade games.
    🔗 arcade.academy

GIS

Geospatial libraries: raster and vector data formats, interactive mapping and visualisation, computing frameworks for processing images, projections.

  1. domlysz/BlenderGIS ⭐ 7,254
    Blender addons to make the bridge between Blender and geographic data

  2. python-visualization/folium ⭐ 6,690
    Python Data. Leaflet.js Maps.
    🔗 python-visualization.github.io/folium

  3. gboeing/osmnx ⭐ 4,673
    OSMnx is a Python package to easily download, model, analyze, and visualize street networks and other geospatial features from OpenStreetMap.
    🔗 osmnx.readthedocs.io

  4. osgeo/gdal ⭐ 4,503
    GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
    🔗 gdal.org

  5. geopandas/geopandas ⭐ 4,192
    Python tools for geographic data
    🔗 geopandas.org

  6. shapely/shapely ⭐ 3,679
    Manipulation and analysis of geometric objects
    🔗 shapely.readthedocs.io/en/stable

  7. holoviz/datashader ⭐ 3,208
    Quickly and accurately render even the largest data.
    🔗 datashader.org

  8. giswqs/geemap ⭐ 3,207
    A Python package for interactive geospatial analysis and visualization with Google Earth Engine.
    🔗 geemap.org

  9. opengeos/leafmap ⭐ 2,905
    A Python package for interactive mapping and geospatial analysis with minimal coding in a Jupyter environment
    🔗 leafmap.org

  10. opengeos/segment-geospatial ⭐ 2,664
    A Python package for segmenting geospatial data with the Segment Anything Model (SAM)
    🔗 samgeo.gishub.org

  11. google/earthengine-api ⭐ 2,541
    Python and JavaScript bindings for calling the Earth Engine API.

  12. microsoft/torchgeo ⭐ 2,233
    TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
    🔗 www.osgeo.org/projects/torchgeo

  13. rasterio/rasterio ⭐ 2,140
    Rasterio reads and writes geospatial raster datasets
    🔗 rasterio.readthedocs.io

  14. mcordts/cityscapesScripts ⭐ 2,097
    README and scripts for the Cityscapes Dataset

  15. azavea/raster-vision ⭐ 2,000
    An open source library and framework for deep learning on satellite and aerial imagery.
    🔗 docs.rastervision.io

  16. plant99/felicette ⭐ 1,815
    Satellite imagery for dummies.

  17. apache/sedona ⭐ 1,779
    A cluster computing framework for processing large-scale geospatial data
    🔗 sedona.apache.org

  18. gboeing/osmnx-examples ⭐ 1,460
    Gallery of OSMnx tutorials, usage examples, and feature demonstrations.
    🔗 osmnx.readthedocs.io

  19. jupyter-widgets/ipyleaflet ⭐ 1,452
    A Jupyter - Leaflet.js bridge
    🔗 ipyleaflet.readthedocs.io

  20. pysal/pysal ⭐ 1,278
    PySAL: Python Spatial Analysis Library Meta-Package
    🔗 pysal.org/pysal

  21. microsoft/GlobalMLBuildingFootprints ⭐ 1,275
    Worldwide building footprints derived from satellite imagery

  22. anitagraser/movingpandas ⭐ 1,145
    Movement trajectory classes and functions built on top of GeoPandas
    🔗 movingpandas.org

  23. residentmario/geoplot ⭐ 1,118
    High-level geospatial data visualization library for Python.
    🔗 residentmario.github.io/geoplot/index.html

  24. sentinel-hub/eo-learn ⭐ 1,076
    Earth observation processing framework for machine learning in Python
    🔗 eo-learn.readthedocs.io/en/latest

  25. makepath/xarray-spatial ⭐ 783
    Raster-based Spatial Analytics for Python
    🔗 xarray-spatial.readthedocs.io

  26. osgeo/grass ⭐ 769
    GRASS GIS - free and open-source geospatial processing engine
    🔗 grass.osgeo.org

  27. developmentseed/titiler ⭐ 693
    Build your own Raster dynamic map tile services
    🔗 developmentseed.org/titiler

  28. scikit-mobility/scikit-mobility ⭐ 692
    scikit-mobility: mobility analysis in Python
    🔗 scikit-mobility.github.io/scikit-mobility

Graph

Graphs and network libraries: network analysis, graph machine learning, visualisation.

  1. networkx/networkx ⭐ 14,203
    Network Analysis in Python
    🔗 networkx.org

  2. stellargraph/stellargraph ⭐ 2,895
    StellarGraph - Machine Learning on Graphs
    🔗 stellargraph.readthedocs.io

  3. westhealth/pyvis ⭐ 912
    Python package for creating and visualizing interactive network graphs.
    🔗 pyvis.readthedocs.io/en/latest

  4. rampasek/GraphGPS ⭐ 594
    Recipe for a General, Powerful, Scalable Graph Transformer

  5. microsoft/graspologic ⭐ 511
    graspologic is a package for graph statistical algorithms
    🔗 microsoft.github.io/graspologic/latest

  6. dylanhogg/llmgraph ⭐ 97
    Create knowledge graphs with LLMs

GUI

Graphical user interface libraries and toolkits.

  1. pysimplegui/PySimpleGUI ⭐ 13,143
    Python GUIs for Humans! PySimpleGUI is the top-rated Python application development environment. Launched in 2018 and actively developed, maintained, and supported in 2024. Transforms tkinter, Qt, WxPython, and Remi into a simple, intuitive, and fun experience for both hobbyists and expert users.
    🔗 www.pysimplegui.com

  2. hoffstadt/DearPyGui ⭐ 12,310
    Dear PyGui: A fast and powerful Graphical User Interface Toolkit for Python with minimal dependencies
    🔗 dearpygui.readthedocs.io/en/latest

  3. parthjadhav/Tkinter-Designer ⭐ 8,342
    An easy and fast way to create a Python GUI 🐍

  4. samuelcolvin/FastUI ⭐ 7,371
    FastUI is a new way to build web application user interfaces defined by declarative Python code.
    🔗 fastui-demo.onrender.com

  5. r0x0r/pywebview ⭐ 4,331
    Build GUI for your Python program with JavaScript, HTML, and CSS
    🔗 pywebview.flowrl.com

  6. beeware/toga ⭐ 4,105
    A Python native, OS native GUI toolkit.
    🔗 toga.readthedocs.io/en/latest

  7. dddomodossola/remi ⭐ 3,453
    Python REMote Interface library. Platform independent. In about 100 Kbytes, perfect for your diet.

  8. wxwidgets/Phoenix ⭐ 2,205
    wxPython's Project Phoenix. A new implementation of wxPython, better, stronger, faster than he was before.
    🔗 wxpython.org

Jupyter

Jupyter and JupyterLab and Notebook tools, libraries and plugins.

  1. jupyterlab/jupyterlab ⭐ 13,791
    JupyterLab computational environment.
    🔗 jupyterlab.readthedocs.io

  2. jupyter/notebook ⭐ 11,171
    Jupyter Interactive Notebook
    🔗 jupyter-notebook.readthedocs.io

  3. mwouts/jupytext ⭐ 6,425
    Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
    🔗 jupytext.readthedocs.io

  4. nteract/papermill ⭐ 5,634
    📚 Parameterize, execute, and analyze notebooks
    🔗 papermill.readthedocs.io/en/latest

  5. connorferster/handcalcs ⭐ 5,357
    Python library for converting Python calculations into rendered latex.

  6. voila-dashboards/voila ⭐ 5,217
    Voilà turns Jupyter notebooks into standalone web applications
    🔗 voila.readthedocs.io

  7. executablebooks/jupyter-book ⭐ 3,694
    Create beautiful, publication-quality books and documents from computational content.
    🔗 jupyterbook.org

  8. jupyterlite/jupyterlite ⭐ 3,661
    Wasm powered Jupyter running in the browser 💡
    🔗 jupyterlite.rtfd.io/en/stable/try/lab

  9. jupyterlab/jupyterlab-desktop ⭐ 3,372
    JupyterLab desktop application, based on Electron.

  10. jupyter-widgets/ipywidgets ⭐ 3,056
    Interactive Widgets for the Jupyter Notebook
    🔗 ipywidgets.readthedocs.io

  11. quantopian/qgrid ⭐ 3,029
    An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks

  12. jupyterlab/jupyter-ai ⭐ 2,865
    A generative AI extension for JupyterLab
    🔗 jupyter-ai.readthedocs.io

  13. jupyter/nbdime ⭐ 2,596
    Tools for diffing and merging of Jupyter notebooks.
    🔗 nbdime.readthedocs.io

  14. mito-ds/mito ⭐ 2,219
    The mitosheet package, trymito.io, and other public Mito code.
    🔗 trymito.io

  15. jupyter/nbviewer ⭐ 2,164
    nbconvert as a web service: Render Jupyter Notebooks as static web pages
    🔗 nbviewer.jupyter.org

  16. maartenbreddels/ipyvolume ⭐ 1,912
    3d plotting for Python in the Jupyter notebook based on IPython widgets using WebGL

  17. jupyter-lsp/jupyterlab-lsp ⭐ 1,733
    Coding assistance for JupyterLab (code navigation + hover suggestions + linters + autocompletion + rename) using Language Server Protocol
    🔗 jupyterlab-lsp.readthedocs.io

  18. jupyter/nbconvert ⭐ 1,665
    Jupyter Notebook Conversion
    🔗 nbconvert.readthedocs.io

  19. nbqa-dev/nbQA ⭐ 969
    Run ruff, isort, pyupgrade, mypy, pylint, flake8, and more on Jupyter Notebooks
    🔗 nbqa.readthedocs.io/en/latest/index.html

  20. vizzuhq/ipyvizzu ⭐ 923
    Build animated charts in Jupyter Notebook and similar environments with a simple Python syntax.
    🔗 ipyvizzu.vizzuhq.com

  21. koaning/drawdata ⭐ 706
    Draw datasets from within Jupyter.
    🔗 calmcode.io/labs/drawdata.html

  22. aws/graph-notebook ⭐ 685
    Library extending Jupyter notebooks to integrate with Apache TinkerPop, openCypher, and RDF SPARQL.
    🔗 github.com/aws/graph-notebook

  23. linealabs/lineapy ⭐ 657
    Move fast from data science prototype to pipeline. Capture, analyze, and transform messy notebooks into data pipelines with just two lines of code.
    🔗 lineapy.org

  24. xiaohk/stickyland ⭐ 500
    Break the linear presentation of Jupyter Notebooks with sticky cells!
    🔗 xiaohk.github.io/stickyland

LLMs and ChatGPT

Large language model and GPT libraries and frameworks: auto-gpt, agents, QnA, chain-of-thought workflows, API integrations. Also see the Natural Language Processing category for crossover.

  1. significant-gravitas/AutoGPT ⭐ 161,557
    AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
    🔗 agpt.co

  2. hwchase17/langchain ⭐ 83,903
    🦜🔗 Build context-aware reasoning applications
    🔗 python.langchain.com

  3. nomic-ai/gpt4all ⭐ 64,779
    gpt4all: run open-source LLMs anywhere
    🔗 gpt4all.io

  4. xtekky/gpt4free ⭐ 57,578
    The official gpt4free repository | various collection of powerful language models
    🔗 g4f.ai

  5. ggerganov/llama.cpp ⭐ 57,378
    LLM inference in C/C++

  6. facebookresearch/llama ⭐ 53,203
    Inference code for Llama models

  7. imartinez/private-gpt ⭐ 51,914
    Interact with your documents using the power of GPT, 100% privately, no data leaks
    🔗 docs.privategpt.dev

  8. gpt-engineer-org/gpt-engineer ⭐ 50,584
    Specify what you want it to build, the AI asks for clarification, and then builds it.

  9. killianlucas/open-interpreter ⭐ 48,667
    A natural language interface for computers
    🔗 openinterpreter.com

  10. xai-org/grok-1 ⭐ 48,132
    This repository contains JAX example code for loading and running the Grok-1 open-weights model.

  11. geekan/MetaGPT ⭐ 39,454
    🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
    🔗 deepwisdom.ai

  12. thudm/ChatGLM-6B ⭐ 39,345
    ChatGLM-6B: An Open Bilingual Dialogue Language Model | 开源双语对话语言模型

  13. hpcaitech/ColossalAI ⭐ 37,930
    Making large AI models cheaper, faster and more accessible
    🔗 www.colossalai.org

  14. laion-ai/Open-Assistant ⭐ 36,658
    OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
    🔗 open-assistant.io

  15. oobabooga/text-generation-webui ⭐ 36,516
    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  16. moymix/TaskMatrix ⭐ 34,529
    Connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting.

  17. lm-sys/FastChat ⭐ 34,260
    An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

  18. quivrhq/quivr ⭐ 32,665
    Your GenAI Second Brain 🧠 A personal productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) & apps using Langchain, GPT 3.5 / 4 turbo, Private, Anthropic, VertexAI, Ollama, LLMs, Groq that you can share with users ! Local & Private alternative to OpenAI GPTs & ChatGPT powered by retrieval-augmented...
    🔗 quivr.app

  19. karpathy/nanoGPT ⭐ 31,966
    The simplest, fastest repository for training/finetuning medium-sized GPTs.

  20. jerryjliu/llama_index ⭐ 31,269
    LlamaIndex is a data framework for your LLM applications
    🔗 docs.llamaindex.ai

  21. tatsu-lab/stanford_alpaca ⭐ 28,832
    Code and documentation to train Stanford's Alpaca models, and generate the data.
    🔗 crfm.stanford.edu/2023/03/13/alpaca.html

  22. pythagora-io/gpt-pilot ⭐ 28,218
    The first real AI developer

  23. microsoft/autogen ⭐ 25,356
    A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap
    🔗 microsoft.github.io/autogen

  24. vision-cair/MiniGPT-4 ⭐ 24,912
    Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
    🔗 minigpt-4.github.io

  25. microsoft/JARVIS ⭐ 23,067
    JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

  26. openai/gpt-2 ⭐ 21,202
    Code for the paper "Language Models are Unsupervised Multitask Learners"
    🔗 openai.com/blog/better-language-models

  27. openai/chatgpt-retrieval-plugin ⭐ 20,850
    The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.

  28. hiyouga/LLaMA-Factory ⭐ 20,793
    Unify Efficient Fine-Tuning of 100+ LLMs

  30. yoheinakajima/babyagi ⭐ 19,260
    GPT-4 powered task-driven autonomous agent

  31. karpathy/minGPT ⭐ 18,914
    A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

  32. vllm-project/vllm ⭐ 18,833
    A high-throughput and memory-efficient inference and serving engine for LLMs
    🔗 docs.vllm.ai

  33. microsoft/semantic-kernel ⭐ 18,280
    Integrate cutting-edge LLM technology quickly and easily into your apps
    🔗 aka.ms/semantic-kernel

  34. tloen/alpaca-lora ⭐ 18,209
    Instruct-tune LLaMA on consumer hardware

  35. rasahq/rasa ⭐ 17,993
    💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
    🔗 rasa.com/docs/rasa

  36. karpathy/llm.c ⭐ 17,622
    LLM training in simple, pure C/CUDA. There is no need for 245MB of PyTorch or 107MB of cPython

  37. logspace-ai/langflow ⭐ 17,528
    ⛓️ Langflow is a dynamic graph where each node is an executable unit. Its modular and interactive design fosters rapid experimentation and prototyping, pushing hard on the limits of creativity.
    🔗 www.langflow.org

  38. guidance-ai/guidance ⭐ 17,401
    A guidance language for controlling large language models.

  39. mlc-ai/mlc-llm ⭐ 17,027
    Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
    🔗 llm.mlc.ai/docs

  40. stitionai/devika ⭐ 16,933
    Devika is an advanced AI software engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective.

  41. haotian-liu/LLaVA ⭐ 16,367
    [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
    🔗 llava.hliu.cc

  42. karpathy/llama2.c ⭐ 16,015
    Inference Llama 2 in one file of pure C

  43. thudm/ChatGLM2-6B ⭐ 15,509
    ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

  44. facebookresearch/codellama ⭐ 15,074
    Inference code for CodeLlama models

  45. mayooear/gpt4-pdf-chatbot-langchain ⭐ 14,575
    GPT4 & LangChain Chatbot for large PDF docs
    🔗 www.youtube.com/watch?v=ih9pbgvvoo4

  46. transformeroptimus/SuperAGI ⭐ 14,507
    <⚡️> SuperAGI - A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.
    🔗 superagi.com

  47. fauxpilot/fauxpilot ⭐ 14,264
    FauxPilot - an open-source alternative to GitHub Copilot server

  48. openai/evals ⭐ 13,934
    Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

  49. huggingface/peft ⭐ 13,895
    🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
    🔗 huggingface.co/docs/peft

  50. deepset-ai/haystack ⭐ 13,714
    🔍 LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conv...
    🔗 haystack.deepset.ai

  51. idea-research/Grounded-Segment-Anything ⭐ 13,544
    Grounded-SAM: Marrying Grounding-DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
    🔗 arxiv.org/abs/2401.14159

  52. joaomdmoura/crewAI ⭐ 13,334
    Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
    🔗 crewai.com

  53. openlmlab/MOSS ⭐ 11,823
    An open-source tool-augmented conversational language model from Fudan University
    🔗 txsun1997.github.io/blogs/moss.html

  54. blinkdl/RWKV-LM ⭐ 11,659
    RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.

  55. smol-ai/developer ⭐ 11,652
    the first library to let you embed a developer agent in your own app!
    🔗 twitter.com/smolmodels

  56. paddlepaddle/PaddleNLP ⭐ 11,448
    👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
    🔗 paddlenlp.readthedocs.io

  57. dao-ailab/flash-attention ⭐ 10,898
    Fast and memory-efficient exact attention

  58. stanfordnlp/dspy ⭐ 10,829
    DSPy: The framework for programming—not prompting—foundation models
    🔗 dspy-docs.vercel.app

  59. databrickslabs/dolly ⭐ 10,785
    Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
    🔗 www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html

  60. h2oai/h2ogpt ⭐ 10,472
    Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://codellama.h2o.ai/
    🔗 h2o.ai

  61. shishirpatil/gorilla ⭐ 10,119
    Enables LLMs to use tools by invoking APIs. Given a query, Gorilla comes up with the semantically and syntactically correct API.
    🔗 gorilla.cs.berkeley.edu

  62. danielmiessler/fabric ⭐ 9,734
    fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
    🔗 danielmiessler.com/p/fabric-origin-story

  63. artidoro/qlora ⭐ 9,440
    QLoRA: Efficient Finetuning of Quantized LLMs
    🔗 arxiv.org/abs/2305.14314

  64. facebookresearch/llama-recipes ⭐ 9,365
    Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a number of candid inference solutions such as HF TGI, VLLM for local or cloud deployment. Demo apps to showcase Meta ...

  65. google-research/vision_transformer ⭐ 9,319
    Vision Transformer and MLP-Mixer Architectures

  66. blinkdl/ChatRWKV ⭐ 9,281
    ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.

  67. mlc-ai/web-llm ⭐ 9,137
    Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.
    🔗 mlc.ai/web-llm

  68. microsoft/LoRA ⭐ 9,113
    Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
    🔗 arxiv.org/abs/2106.09685

  69. assafelovic/gpt-researcher ⭐ 8,738
    GPT based autonomous agent that does online comprehensive research on any given topic
    🔗 gptr.dev

  70. mistralai/mistral-src ⭐ 8,706
    Reference implementation of Mistral AI 7B v0.1 model.
    🔗 mistral.ai

  71. nvidia/Megatron-LM ⭐ 8,654
    Ongoing research training transformer models at scale
    🔗 docs.nvidia.com/megatron-core/developer-guide/latest/user-guide/index.html#quick-start

  72. embedchain/embedchain ⭐ 8,528
    Personalizing LLM Responses
    🔗 docs.embedchain.ai

  73. berriai/litellm ⭐ 8,449
    Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)
    🔗 docs.litellm.ai/docs

  74. unslothai/unsloth ⭐ 8,306
    Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory
    🔗 unsloth.ai

  75. microsoft/promptflow ⭐ 8,201
    Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
    🔗 microsoft.github.io/promptflow

  76. lvwerra/trl ⭐ 8,163
    Train transformer language models with reinforcement learning.
    🔗 hf.co/docs/trl

  77. eleutherai/gpt-neo ⭐ 8,146
    An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
    🔗 www.eleuther.ai

  78. mshumer/gpt-prompt-engineer ⭐ 8,079
    Simply input a description of your task and some test cases, and the system will generate, test, and rank a multitude of prompts to find the ones that perform the best.

  79. optimalscale/LMFlow ⭐ 8,023
    An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
    🔗 optimalscale.github.io/lmflow

  80. karpathy/minbpe ⭐ 7,964
    Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

  81. apple/ml-ferret ⭐ 7,811
    Ferret: Refer and Ground Anything Anywhere at Any Granularity

  82. thudm/CodeGeeX ⭐ 7,785
    CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
    🔗 codegeex.cn

  83. thudm/GLM-130B ⭐ 7,616
    GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

  84. lianjiatech/BELLE ⭐ 7,559
    BELLE: Be Everyone's Large Language model Engine(开源中文对话大模型)

  85. openlm-research/open_llama ⭐ 7,205
    OpenLLaMA: An Open Reproduction of LLaMA

  86. plachtaa/VALL-E-X ⭐ 7,193
    An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io

  87. bigcode-project/starcoder ⭐ 7,115
    Home of StarCoder: fine-tuning & inference!

  88. sweepai/sweep ⭐ 7,079
    Sweep: open-source AI-powered Software Developer for small features and bug fixes.
    🔗 sweep.dev

  89. sjtu-ipads/PowerInfer ⭐ 6,973
    High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

  90. vanna-ai/vanna ⭐ 6,863
    🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.
    🔗 vanna.ai/docs

  91. jzhang38/TinyLlama ⭐ 6,837
    The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

  92. bhaskatripathi/pdfGPT ⭐ 6,710
    PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your PDF files into a chatbot!
    🔗 huggingface.co/spaces/bhaskartripathi/pdfgpt_turbo

  93. lightning-ai/litgpt ⭐ 6,679
    Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.
    🔗 lightning.ai

  94. eleutherai/gpt-neox ⭐ 6,592
    An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.

  95. abetlen/llama-cpp-python ⭐ 6,527
    Python bindings for llama.cpp
    🔗 llama-cpp-python.readthedocs.io

  96. zilliztech/GPTCache ⭐ 6,435
    Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
    🔗 gptcache.readthedocs.io

  97. vaibhavs10/insanely-fast-whisper ⭐ 6,417
    An opinionated CLI to transcribe Audio files w/ Whisper on-device! Powered by 🤗 Transformers, Optimum & flash-attn

  98. apple/corenet ⭐ 6,226
    CoreNet is a deep neural network toolkit that allows researchers and engineers to train standard and novel small and large-scale models for a variety of tasks, including foundation models (e.g., CLIP and LLM), object classification, object detection, and semantic segmentation.

  99. mit-han-lab/streaming-llm ⭐ 6,213
    [ICLR 2024] Efficient Streaming Language Models with Attention Sinks
    🔗 arxiv.org/abs/2309.17453

  100. langchain-ai/opengpts ⭐ 6,125
    An open source effort to create a similar experience to OpenAI's GPTs and Assistants API.

  101. nat/openplayground ⭐ 6,082
    An LLM playground you can run on your laptop

  102. run-llama/rags ⭐ 5,921
    Build ChatGPT over your data, all with natural language

  103. lightning-ai/lit-llama ⭐ 5,807
    Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

  104. skypilot-org/skypilot ⭐ 5,675
    SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
    🔗 skypilot.readthedocs.io

  105. chainlit/chainlit ⭐ 5,478
    Build Conversational AI in minutes ⚡️
    🔗 docs.chainlit.io

  106. dsdanielpark/Bard-API ⭐ 5,386
    The unofficial python package that returns response of Google Bard through cookie value.
    🔗 pypi.org/project/bardapi

  107. internlm/InternLM ⭐ 5,220
    Official release of InternLM2 7B and 20B base and chat models. 200K context support
    🔗 internlm.intern-ai.org.cn

  108. jxnl/instructor ⭐ 5,184
    Instructor is a Python library that makes it a breeze to work with structured outputs from large language models (LLMs). Built on top of Pydantic, it provides a simple, transparent, and user-friendly API to manage validation, retries, and streaming responses.
    🔗 python.useinstructor.com

  109. minedojo/Voyager ⭐ 5,168
    An Open-Ended Embodied Agent with Large Language Models
    🔗 voyager.minedojo.org

  110. eleutherai/lm-evaluation-harness ⭐ 5,109
    A framework for few-shot evaluation of language models.
    🔗 www.eleuther.ai

  111. pytorch-labs/gpt-fast ⭐ 5,102
    Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

  112. microsoft/promptbase ⭐ 5,065
    promptbase is an evolving collection of resources, best practices, and example scripts for eliciting the best performance from foundation models.

  113. phidatahq/phidata ⭐ 4,903
    Phidata is a toolkit for building AI Assistants using function calling.
    🔗 docs.phidata.com

  114. langchain-ai/chat-langchain ⭐ 4,786
    Locally hosted chatbot specifically focused on question answering over the LangChain documentation
    🔗 chat.langchain.com

  115. explodinggradients/ragas ⭐ 4,709
    Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
    🔗 docs.ragas.io

  116. openbmb/ToolBench ⭐ 4,423
    [ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
    🔗 openbmb.github.io/toolbench

  117. togethercomputer/RedPajama-Data ⭐ 4,357
    The RedPajama-Data repository contains code for preparing large datasets for training large language models.

  118. mnotgod96/AppAgent ⭐ 4,305
    AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
    🔗 appagent-official.github.io

  119. microsoft/BioGPT ⭐ 4,233
    Implementation of BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining

  120. kyegomez/tree-of-thoughts ⭐ 4,047
    Plug in and Play Implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models that Elevates Model Reasoning by at least 70%
    🔗 discord.gg/qutxnk2nmf

  121. allenai/OLMo ⭐ 3,993
    OLMo is a repository for training and using AI2's state-of-the-art open language models. It is built by scientists, for scientists.
    🔗 allenai.org/olmo

  122. instruction-tuning-with-gpt-4/GPT-4-LLM ⭐ 3,978
    Instruction Tuning with GPT-4
    🔗 instruction-tuning-with-gpt-4.github.io

  123. microsoft/LLMLingua ⭐ 3,855
    To speed up LLMs' inference and enhance LLMs' perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
    🔗 llmlingua.com

  124. ravenscroftj/turbopilot ⭐ 3,832
    Turbopilot is an open source large-language-model based code completion engine that runs locally on CPU

  125. mshumer/gpt-llm-trainer ⭐ 3,810
    Input a description of your task, and the system will generate a dataset, parse it, and fine-tune a LLaMA 2 model for you

  126. 1rgs/jsonformer ⭐ 3,801
    A Bulletproof Way to Generate Structured JSON from Language Models

  127. yizhongw/self-instruct ⭐ 3,785
    Aligning pretrained language models with instruction data generated by themselves.

  128. vikhyat/moondream ⭐ 3,742
    A tiny open-source computer-vision language model designed to run efficiently on edge devices
    🔗 moondream.ai

  129. whitead/paper-qa ⭐ 3,618
    LLM Chain for answering questions from documents with citations

  130. h2oai/h2o-llmstudio ⭐ 3,598
    H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://h2oai.github.io/h2o-llmstudio/
    🔗 gpt-gm.h2o.ai

  131. mmabrouk/llm-workflow-engine ⭐ 3,584
    Power CLI and Workflow manager for LLMs (core package)

  132. skyvern-ai/skyvern ⭐ 3,524
    Skyvern automates browser-based workflows using LLMs and computer vision. It provides a simple API endpoint to fully automate manual workflows, replacing brittle or unreliable automation solutions.
    🔗 www.skyvern.com

  133. luodian/Otter ⭐ 3,453
    🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
    🔗 otter-ntu.github.io

  134. cg123/mergekit ⭐ 3,443
    Tools for merging pretrained large language models.

  135. minimaxir/simpleaichat ⭐ 3,386
    Python package for easily interfacing with chat apps, with robust features and minimal code complexity.

  136. minimaxir/gpt-2-simple ⭐ 3,381
    Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts

  137. nvidia/NeMo-Guardrails ⭐ 3,377
    NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

  138. guardrails-ai/guardrails ⭐ 3,340
    Open-source Python package for specifying structure and type, validating and correcting the outputs of large language models (LLMs)
    🔗 www.guardrailsai.com/docs

  139. eth-sri/lmql ⭐ 3,337
    A language for constraint-guided and efficient LLM programming.
    🔗 lmql.ai

  140. deep-diver/LLM-As-Chatbot ⭐ 3,239
    LLM as a Chatbot Service

  141. microsoft/LMOps ⭐ 3,190
    General technology for enabling AI capabilities w/ LLMs and MLLMs
    🔗 aka.ms/generalai

  142. llmware-ai/llmware ⭐ 3,171
    Providing enterprise-grade LLM-based development framework, tools, and fine-tuned models.
    🔗 llmware-ai.github.io/llmware

  143. simonw/llm ⭐ 2,973
    Access large language models from the command-line
    🔗 llm.datasette.io

  144. baichuan-inc/Baichuan-13B ⭐ 2,959
    A 13B large language model developed by Baichuan Intelligent Technology
    🔗 huggingface.co/baichuan-inc/baichuan-13b-chat

  145. microsoft/torchscale ⭐ 2,926
    Foundation Architecture for (M)LLMs
    🔗 aka.ms/generalai

  146. iryna-kondr/scikit-llm ⭐ 2,923
    Seamlessly integrate LLMs into scikit-learn.
    🔗 beastbyte.ai

  147. freedomintelligence/LLMZoo ⭐ 2,872
    ⚡LLM Zoo is a project that provides data, models, and evaluation benchmarks for large language models.⚡

  148. next-gpt/NExT-GPT ⭐ 2,871
    Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
    🔗 next-gpt.github.io

  149. langchain-ai/langgraph ⭐ 2,843
    LangGraph is a library for building stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain.
    🔗 langchain-ai.github.io/langgraph

  150. juncongmoo/pyllama ⭐ 2,787
    LLaMA: Open and Efficient Foundation Language Models

  151. promptfoo/promptfoo ⭐ 2,763
    Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models with CI/CD integration.
    🔗 www.promptfoo.dev

  152. defog-ai/sqlcoder ⭐ 2,742
    SoTA LLM for converting natural language questions to SQL queries

  153. paperswithcode/galai ⭐ 2,647
    Model API for GALACTICA

  154. li-plus/chatglm.cpp ⭐ 2,601
    C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & more LLMs

  155. open-compass/opencompass ⭐ 2,572
    OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
    🔗 opencompass.org.cn

  156. alpha-vllm/LLaMA2-Accessory ⭐ 2,514
    An Open-source Toolkit for LLM Development
    🔗 llama2-accessory.readthedocs.io

  157. pathwaycom/llm-app ⭐ 2,504
    LLM App templates for RAG, knowledge mining, and stream analytics. Ready to run with Docker,⚡in sync with your data sources.
    🔗 pathway.com/developers/showcases/llm-app-pathway

  158. hegelai/prompttools ⭐ 2,439
    Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).
    🔗 prompttools.readthedocs.io

  159. databricks/dbrx ⭐ 2,399
    Code examples and resources for DBRX, a large language model developed by Databricks
    🔗 www.databricks.com

  160. sgl-project/sglang ⭐ 2,373
    SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.

  161. weaviate/Verba ⭐ 2,331
    Retrieval Augmented Generation (RAG) chatbot powered by Weaviate

  162. ofa-sys/OFA ⭐ 2,326
    Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

  163. civitai/sd_civitai_extension ⭐ 2,264
    All of the Civitai models inside Automatic 1111 Stable Diffusion Web UI

  164. young-geng/EasyLM ⭐ 2,242
    Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.

  165. bclavie/RAGatouille ⭐ 2,146
    Bridging the gap between state-of-the-art research and alchemical RAG pipeline practices.

  166. openai/finetune-transformer-lm ⭐ 2,086
    Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
    🔗 s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf

  167. huggingface/text-embeddings-inference ⭐ 2,022
    A blazing fast inference solution for text embeddings models
    🔗 huggingface.co/docs/text-embeddings-inference/quick_tour

  168. openai/image-gpt ⭐ 2,002
    Archived. Code and models from the paper "Generative Pretraining from Pixels"

  169. noahshinn/reflexion ⭐ 1,981
    [NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning

  170. cheshire-cat-ai/core ⭐ 1,969
    Production ready AI assistant framework
    🔗 cheshirecat.ai

  171. intel/neural-compressor ⭐ 1,968
    SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
    🔗 intel.github.io/neural-compressor

  172. tairov/llama2.mojo ⭐ 1,948
    Inference Llama 2 in one file of pure 🔥
    🔗 www.modular.com/blog/community-spotlight-how-i-built-llama2-by-aydyn-tairov

  173. cohere-ai/cohere-toolkit ⭐ 1,928
    Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.

  174. lucidrains/toolformer-pytorch ⭐ 1,891
    Implementation of Toolformer, Language Models That Can Use Tools, by MetaAI

  175. openai/gpt-2-output-dataset ⭐ 1,890
    Dataset of GPT-2 outputs for research in detection, biases, and more

  176. neulab/prompt2model ⭐ 1,883
    prompt2model - Generate Deployable Models from Natural Language Instructions

  177. spcl/graph-of-thoughts ⭐ 1,867
    Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"
    🔗 arxiv.org/pdf/2308.09687.pdf

  178. minimaxir/aitextgen ⭐ 1,828
    A robust Python tool for text-based AI training and generation using GPT-2.
    🔗 docs.aitextgen.io

  179. openai/gpt-discord-bot ⭐ 1,712
    Example Discord bot written in Python that uses the completions API to have conversations with the text-davinci-003 model, and the moderations API to filter the messages.

  180. ist-daslab/gptq ⭐ 1,706
    Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
    🔗 arxiv.org/abs/2210.17323

  181. truera/trulens ⭐ 1,629
    Evaluation and Tracking for LLM Experiments
    🔗 www.trulens.org

  182. epfllm/meditron ⭐ 1,624
    Meditron is a suite of open-source medical Large Language Models (LLMs).
    🔗 huggingface.co/epfl-llm

  183. microsoft/Megatron-DeepSpeed ⭐ 1,617
    Ongoing research training transformer language models at scale, including: BERT & GPT-2

  184. predibase/lorax ⭐ 1,539
    Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
    🔗 loraexchange.ai

  185. ray-project/llm-applications ⭐ 1,505
    A comprehensive guide to building RAG-based LLM applications for production.

  186. jina-ai/thinkgpt ⭐ 1,464
    Agent techniques to augment your LLM and push it beyond its limits

  187. cstankonrad/long_llama ⭐ 1,434
    LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.

  188. akariasai/self-rag ⭐ 1,433
    This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
    🔗 selfrag.github.io

  189. farizrahman4u/loopgpt ⭐ 1,393
    Re-implementation of Auto-GPT as a python package, written with modularity and extensibility in mind.

  190. explosion/spacy-transformers ⭐ 1,318
    🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
    🔗 spacy.io/usage/embeddings-transformers

  191. run-llama/llama-lab ⭐ 1,310
    Llama Lab is a repo dedicated to building cutting-edge projects using LlamaIndex

  192. bigscience-workshop/Megatron-DeepSpeed ⭐ 1,244
    Ongoing research training transformer language models at scale, including: BERT & GPT-2

  193. chatarena/chatarena ⭐ 1,224
    ChatArena (or Chat Arena) provides multi-agent language game environments for LLMs. The goal is to develop the communication and collaboration capabilities of AIs.
    🔗 www.chatarena.org

  194. srush/MiniChain ⭐ 1,170
    A tiny library for coding with large language models.
    🔗 srush-minichain.hf.space

  195. ray-project/ray-llm ⭐ 1,148
    RayLLM - LLMs on Ray
    🔗 aviary.anyscale.com

  196. ibm/Dromedary ⭐ 1,089
    Dromedary: towards helpful, ethical and reliable LLMs.

  197. meetkai/functionary ⭐ 1,059
    Chat language model that can use tools and interpret the results

  198. linksoul-ai/AutoAgents ⭐ 1,038
    [IJCAI 2024] Generate different roles for GPTs to form a collaborative entity for complex tasks.
    🔗 huggingface.co/spaces/linksoul/autoagents

  199. nomic-ai/pygpt4all ⭐ 1,024
    Officially supported Python bindings for llama.cpp + gpt4all
    🔗 nomic-ai.github.io/pygpt4all

  200. rlancemartin/auto-evaluator ⭐ 1,024
    Evaluation tool for LLM QA chains
    🔗 autoevaluator.langchain.com

  201. lupantech/chameleon-llm ⭐ 1,019
    Code for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models".
    🔗 chameleon-llm.github.io

  202. ctlllll/LLM-ToolMaker ⭐ 1,000
    Large Language Models as Tool Makers

  203. keirp/automatic_prompt_engineer ⭐ 994
    Large Language Models Are Human-Level Prompt Engineers

  204. microsoft/Llama-2-Onnx ⭐ 987
    A Microsoft optimized version of the Llama 2 model, available from Meta

  205. hao-ai-lab/LookaheadDecoding ⭐ 976
    Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

  206. explosion/spacy-llm ⭐ 948
    🦙 Integrating LLMs into structured NLP pipelines
    🔗 spacy.io/usage/large-language-models

  207. ajndkr/lanarky ⭐ 942
    The web framework for building LLM microservices
    🔗 lanarky.ajndkr.com

  208. pinecone-io/canopy ⭐ 884
    Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone
    🔗 www.pinecone.io

  209. cerebras/modelzoo ⭐ 850
    Examples of common deep learning models that can be trained on Cerebras hardware

  210. agenta-ai/agenta ⭐ 838
    The all-in-one LLM developer platform: prompt management, evaluation, human feedback, and deployment all in one place.
    🔗 www.agenta.ai

  211. muennighoff/sgpt ⭐ 809
    SGPT: GPT Sentence Embeddings for Semantic Search
    🔗 arxiv.org/abs/2202.08904

  212. huggingface/nanotron ⭐ 803
    Minimalistic large language model 3D-parallelism training

  213. oliveirabruno01/babyagi-asi ⭐ 747
    BabyAGI: an Autonomous and Self-Improving agent, or BASI

  214. opengenerativeai/GenossGPT ⭐ 738
    One API for all LLMs either Private or Public (Anthropic, Llama V2, GPT 3.5/4, Vertex, GPT4ALL, HuggingFace ...) 🌈🐂 Replace OpenAI GPT with any LLMs in your app with one line.
    🔗 genoss.ai

  215. salesforce/xgen ⭐ 713
    Salesforce open-source LLMs with 8k sequence length.

  216. datadreamer-dev/DataDreamer ⭐ 650
    DataDreamer is a powerful open-source Python library for prompting, synthetic data generation, and training workflows. It is designed to be simple, extremely efficient, and research-grade.
    🔗 datadreamer.dev

  217. topoteretes/cognee ⭐ 640
    Deterministic LLM Outputs for AI Applications and AI Agents
    🔗 www.cognee.ai

  218. langchain-ai/langsmith-cookbook ⭐ 609
    LangSmith is a platform for building production-grade LLM applications.
    🔗 langsmith-cookbook.vercel.app

  219. opengvlab/OmniQuant ⭐ 569
    [ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

  220. squeezeailab/SqueezeLLM ⭐ 569
    [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
    🔗 arxiv.org/abs/2306.07629

  221. lupantech/ScienceQA ⭐ 548
    Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".

  222. hazyresearch/ama_prompting ⭐ 530
    Ask Me Anything language model prompting

  223. zhudotexe/kani ⭐ 527
    kani (カニ) is a highly hackable microframework for chat-based language models with tool use/function calling. (NLP-OSS @ EMNLP 2023)
    🔗 kani.readthedocs.io

  224. vahe1994/SpQR ⭐ 512
    Quantization algorithm and the model evaluation code for SpQR method for LLM compression

  225. continuum-llms/chatgpt-memory ⭐ 510
    Allows scaling the ChatGPT API to multiple simultaneous sessions with infinite contextual and adaptive memory, powered by GPT and a Redis datastore.

  226. huggingface/lighteval ⭐ 347
    LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.

  227. judahpaul16/gpt-home ⭐ 118
    ChatGPT at home! Basically a better Google Nest Hub or Amazon Alexa home assistant. Built on the Raspberry Pi using the OpenAI API.

  228. stanford-oval/suql ⭐ 99
    SUQL: Conversational Search over Structured and Unstructured Data with LLMs
    🔗 arxiv.org/abs/2311.09818

Math and Science

Mathematical, numerical and scientific libraries.

  1. numpy/numpy ⭐ 26,415
    The fundamental package for scientific computing with Python.
    🔗 numpy.org

  2. taichi-dev/taichi ⭐ 24,787
    Productive, portable, and performant GPU programming in Python: Taichi Lang is an open-source, imperative, parallel programming language for high-performance numerical computation.
    🔗 taichi-lang.org

  3. scipy/scipy ⭐ 12,478
    SciPy library main repository
    🔗 scipy.org

  4. sympy/sympy ⭐ 12,406
    A computer algebra system written in pure Python
    🔗 sympy.org

  5. google/or-tools ⭐ 10,463
    Google Optimization Tools (a.k.a., OR-Tools) is an open-source, fast and portable software suite for solving combinatorial optimization problems.
    🔗 developers.google.com/optimization

  6. z3prover/z3 ⭐ 9,747
    Z3 is a theorem prover from Microsoft Research with a Python language binding.

  7. cupy/cupy ⭐ 7,789
    NumPy & SciPy for GPU
    🔗 cupy.dev

  8. google-deepmind/alphageometry ⭐ 3,697
    Solving Olympiad Geometry without Human Demonstrations

  9. mikedh/trimesh ⭐ 2,761
    Python library for loading and using triangular meshes.
    🔗 trimesh.org

  10. mckinsey/causalnex ⭐ 2,147
    A Python library that helps data scientists to infer causation rather than observing correlation.
    🔗 causalnex.readthedocs.io

  11. pyomo/pyomo ⭐ 1,845
    An object-oriented algebraic modeling language in Python for structured optimization problems.
    🔗 www.pyomo.org

  12. facebookresearch/theseus ⭐ 1,604
    A library for differentiable nonlinear optimization

  13. google-research/torchsde ⭐ 1,475
    Differentiable SDE solvers with GPU support and efficient sensitivity analysis.

  14. dynamicslab/pysindy ⭐ 1,298
    A package for the sparse identification of nonlinear dynamical systems from data
    🔗 pysindy.readthedocs.io/en/latest

  15. geomstats/geomstats ⭐ 1,152
    Computations and statistics on manifolds with geometric structures.
    🔗 geomstats.ai

  16. cma-es/pycma ⭐ 1,028
    pycma is a Python implementation of CMA-ES and a few related numerical optimization tools.

  17. sj001/AI-Feynman ⭐ 586
    Implementation of AI Feynman: a Physics-Inspired Method for Symbolic Regression

  18. willianfuks/tfcausalimpact ⭐ 576
    Python Causal Impact Implementation Based on Google's R Package. Built using TensorFlow Probability.

  19. brandondube/prysm ⭐ 234
    Prysm is an open-source library for physical and first-order modeling of optical systems and analysis of related data: numerical and physical optics, integrated modeling, phase retrieval, segmented systems, polynomials and fitting, sequential raytracing.
    🔗 prysm.readthedocs.io/en/stable

  20. lean-dojo/ReProver ⭐ 164
    Retrieval-Augmented Theorem Provers for Lean
    🔗 leandojo.org

  21. albahnsen/pycircular ⭐ 86
    pycircular is a Python module for circular data analysis

  22. gbillotey/Fractalshades ⭐ 26
    Arbitrary-precision fractal explorer - Python package

Machine Learning - General

General and classical machine learning libraries. See below for other sections covering specialised ML areas.

  1. scikit-learn/scikit-learn ⭐ 58,182
    scikit-learn: machine learning in Python
    🔗 scikit-learn.org

  2. openai/openai-cookbook ⭐ 56,024
    Examples and guides for using the OpenAI API
    🔗 cookbook.openai.com

  3. tencentarc/GFPGAN ⭐ 34,640
    GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

  4. google-research/google-research ⭐ 32,879
    This repository contains code released by Google Research
    🔗 research.google

  5. suno-ai/bark ⭐ 32,696
    🔊 Text-Prompted Generative Audio Model

  6. facebookresearch/faiss ⭐ 28,288
    A library for efficient similarity search and clustering of dense vectors.
    🔗 faiss.ai

  7. google/jax ⭐ 28,025
    Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
    🔗 jax.readthedocs.io

  8. open-mmlab/mmdetection ⭐ 27,855
    OpenMMLab Detection Toolbox and Benchmark
    🔗 mmdetection.readthedocs.io

  9. ageron/handson-ml2 ⭐ 26,944
    A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.

  10. lutzroeder/netron ⭐ 26,166
    Visualizer for neural network, deep learning and machine learning models
    🔗 netron.app

  11. dmlc/xgboost ⭐ 25,595
    Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
    🔗 xgboost.readthedocs.io/en/stable

  12. google/mediapipe ⭐ 25,520
    Cross-platform, customizable ML solutions for live and streaming media.
    🔗 mediapipe.dev

  13. harisiqbal88/PlotNeuralNet ⭐ 21,138
    LaTeX code for making neural network diagrams

  14. jina-ai/jina ⭐ 20,073
    ☁️ Build multimodal AI applications with cloud-native stack
    🔗 docs.jina.ai

  15. onnx/onnx ⭐ 16,881
    Open standard for machine learning interoperability
    🔗 onnx.ai

  16. microsoft/LightGBM ⭐ 16,069
    A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
    🔗 lightgbm.readthedocs.io/en/latest

  17. tensorflow/tensor2tensor ⭐ 14,913
    Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

  18. ddbourgin/numpy-ml ⭐ 14,561
    Machine learning, in numpy
    🔗 numpy-ml.readthedocs.io

  19. ml-explore/mlx ⭐ 14,337
    MLX is an array framework for machine learning on Apple silicon, brought to you by Apple machine learning research.
    🔗 ml-explore.github.io/mlx

  20. aleju/imgaug ⭐ 14,163
    Image augmentation for machine learning experiments.
    🔗 imgaug.readthedocs.io

  21. roboflow/supervision ⭐ 14,069
    We write your reusable computer vision tools. 💜
    🔗 supervision.roboflow.com

  22. microsoft/nni ⭐ 13,760
    An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
    🔗 nni.readthedocs.io

  23. microsoft/Swin-Transformer ⭐ 12,994
    This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
    🔗 arxiv.org/abs/2103.14030

  24. jindongwang/transferlearning ⭐ 12,879
    Transfer learning / domain adaptation / domain generalization / multi-task learning etc. Papers, codes, datasets, applications, tutorials.
    🔗 transferlearning.xyz

  25. deepmind/deepmind-research ⭐ 12,815
    This repository contains implementations and illustrative code to accompany DeepMind publications

  26. microsoft/onnxruntime ⭐ 12,767
    ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
    🔗 onnxruntime.ai

  27. spotify/annoy ⭐ 12,714
    Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

  28. neonbjb/tortoise-tts ⭐ 11,827
    A multi-voice TTS system trained with an emphasis on quality

  29. deepmind/alphafold ⭐ 11,697
    Implementation of the inference pipeline of AlphaFold v2

  30. facebookresearch/AnimatedDrawings ⭐ 10,199
    Code to accompany "A Method for Animating Children's Drawings of the Human Figure"

  31. twitter/the-algorithm-ml ⭐ 9,886
    Source code for Twitter's Recommendation Algorithm
    🔗 blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm

  32. ggerganov/ggml ⭐ 9,741
    Tensor library for machine learning

  33. optuna/optuna ⭐ 9,689
    A hyperparameter optimization framework
    🔗 optuna.org

  34. statsmodels/statsmodels ⭐ 9,566
    Statsmodels: statistical modeling and econometrics in Python
    🔗 www.statsmodels.org/devel

  35. epistasislab/tpot ⭐ 9,505
    A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
    🔗 epistasislab.github.io/tpot

  36. megvii-basedetection/YOLOX ⭐ 9,030
    YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/

  37. cleanlab/cleanlab ⭐ 8,673
    The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
    🔗 cleanlab.ai

  38. pycaret/pycaret ⭐ 8,433
    An open-source, low-code machine learning library in Python
    🔗 www.pycaret.org

  39. wandb/wandb ⭐ 8,231
    🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
    🔗 wandb.ai

  40. pymc-devs/pymc ⭐ 8,171
    Bayesian Modeling and Probabilistic Programming in Python
    🔗 docs.pymc.io

  41. uberi/speech_recognition ⭐ 8,051
    Speech recognition module for Python, supporting several engines and APIs, online and offline.
    🔗 pypi.python.org/pypi/speechrecognition

  42. catboost/catboost ⭐ 7,754
    A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
    🔗 catboost.ai

  43. facebookresearch/xformers ⭐ 7,624
    Hackable and optimized Transformers building blocks, supporting a composable construction.
    🔗 facebookresearch.github.io/xformers

  44. open-mmlab/mmsegmentation ⭐ 7,431
    OpenMMLab Semantic Segmentation Toolbox and Benchmark.
    🔗 mmsegmentation.readthedocs.io/en/main

  45. automl/auto-sklearn ⭐ 7,409
    Automated Machine Learning with scikit-learn
    🔗 automl.github.io/auto-sklearn

  46. awslabs/autogluon ⭐ 7,142
    Fast and Accurate ML in 3 Lines of Code
    🔗 auto.gluon.ai

  47. hyperopt/hyperopt ⭐ 7,089
    Distributed Asynchronous Hyperparameter Optimization in Python
    🔗 hyperopt.github.io/hyperopt

  48. featurelabs/featuretools ⭐ 7,033
    An open source python library for automated feature engineering
    🔗 www.featuretools.com

  49. huggingface/accelerate ⭐ 7,007
    🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
    🔗 huggingface.co/docs/accelerate

  50. lmcinnes/umap ⭐ 6,959
    Uniform Manifold Approximation and Projection

  51. hips/autograd ⭐ 6,800
    Efficiently computes derivatives of numpy code.

  52. py-why/dowhy ⭐ 6,752
    DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
    🔗 www.pywhy.org/dowhy

  53. scikit-learn-contrib/imbalanced-learn ⭐ 6,705
    A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
    🔗 imbalanced-learn.org

  54. open-mmlab/mmagic ⭐ 6,591
    OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image generation, image/video restoration/enhancement, etc.
    🔗 mmagic.readthedocs.io/en/latest

  55. probml/pyprobml ⭐ 6,263
    Python code for "Probabilistic Machine learning" book by Kevin Murphy

  56. nicolashug/Surprise ⭐ 6,193
    A Python scikit for building and analyzing recommender systems
    🔗 surpriselib.com

  57. google/automl ⭐ 6,157
    Google Brain AutoML

  58. cleverhans-lab/cleverhans ⭐ 6,082
    An adversarial example library for constructing attacks, building defenses, and benchmarking both

  59. kevinmusgrave/pytorch-metric-learning ⭐ 5,770
    The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
    🔗 kevinmusgrave.github.io/pytorch-metric-learning

  60. open-mmlab/mmcv ⭐ 5,612
    OpenMMLab Computer Vision Foundation
    🔗 mmcv.readthedocs.io/en/latest

  61. project-monai/MONAI ⭐ 5,358
    AI Toolkit for Healthcare Imaging
    🔗 monai.io

  62. mdbloice/Augmentor ⭐ 5,023
    Image augmentation library in Python for machine learning.
    🔗 augmentor.readthedocs.io/en/stable

  63. ml-explore/mlx-examples ⭐ 5,004
    Examples in the MLX framework

  64. online-ml/river ⭐ 4,778
    🌊 Online machine learning in Python
    🔗 riverml.xyz

  65. uber/causalml ⭐ 4,770
    Uplift modeling and causal inference with machine learning algorithms

  66. rasbt/mlxtend ⭐ 4,768
    A library of extension and helper modules for Python's data analysis and machine learning libraries.
    🔗 rasbt.github.io/mlxtend

  67. lucidrains/deep-daze ⭐ 4,385
    Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun

  68. google-deepmind/graphcast ⭐ 4,242
    GraphCast: Learning skillful medium-range global weather forecasting

  69. districtdatalabs/yellowbrick ⭐ 4,200
    Visual analysis and diagnostic tools to facilitate machine learning model selection.
    🔗 www.scikit-yb.org

  70. skvark/opencv-python ⭐ 4,179
    Automated CI toolchain to produce precompiled opencv-python, opencv-python-headless, opencv-contrib-python and opencv-contrib-python-headless packages.
    🔗 pypi.org/project/opencv-python

  71. marqo-ai/marqo ⭐ 4,141
    Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
    🔗 www.marqo.ai

  72. nv-tlabs/GET3D ⭐ 4,123
    Generative Model of High Quality 3D Textured Shapes Learned from Images

  73. sanchit-gandhi/whisper-jax ⭐ 4,093
    JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

  74. apple/coremltools ⭐ 4,073
    Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
    🔗 coremltools.readme.io

  75. nmslib/hnswlib ⭐ 4,024
    Header-only C++/python library for fast approximate nearest neighbors
    🔗 github.com/nmslib/hnswlib

  76. cmusphinx/pocketsphinx ⭐ 3,750
    A small speech recognizer

  77. microsoft/FLAML ⭐ 3,681
    A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
    🔗 microsoft.github.io/flaml

  78. ourownstory/neural_prophet ⭐ 3,645
    NeuralProphet: A simple forecasting package
    🔗 neuralprophet.com

  79. py-why/EconML ⭐ 3,556
    ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to brin...
    🔗 www.microsoft.com/en-us/research/project/alice

  80. thudm/CogVideo ⭐ 3,497
    Text-to-video generation. The repo for ICLR2023 paper "CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers"

  81. huggingface/notebooks ⭐ 3,298
    Notebooks using the Hugging Face libraries 🤗

  82. facebookresearch/vissl ⭐ 3,230
    VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.
    🔗 vissl.ai

  83. huggingface/autotrain-advanced ⭐ 3,214
    AutoTrain Advanced: faster and easier training and deployments of state-of-the-art machine learning models
    🔗 huggingface.co/autotrain

  84. yoheinakajima/instagraph ⭐ 3,190
    Converts text input or a URL into a knowledge graph and displays it

  85. rucaibox/RecBole ⭐ 3,181
    A unified, comprehensive and efficient recommendation library
    🔗 recbole.io

  86. pytorch/glow ⭐ 3,155
    Compiler for Neural Network hardware accelerators

  87. hrnet/HRNet-Semantic-Segmentation ⭐ 3,053
    The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet. https://arxiv.org/abs/1908.07919

  88. lucidrains/musiclm-pytorch ⭐ 3,018
    Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch

  89. zjunlp/DeepKE ⭐ 2,955
    [EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
    🔗 deepke.zjukg.cn

  90. mljar/mljar-supervised ⭐ 2,936
    Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
    🔗 mljar.com

  91. lightly-ai/lightly ⭐ 2,756
    A python library for self-supervised learning on images.
    🔗 docs.lightly.ai/self-supervised-learning

  92. teamhg-memex/eli5 ⭐ 2,729
    A library for debugging/inspecting machine learning classifiers and explaining their predictions
    🔗 eli5.readthedocs.io

  93. scikit-optimize/scikit-optimize ⭐ 2,726
    Sequential model-based optimization with a scipy.optimize interface
    🔗 scikit-optimize.github.io

  94. shankarpandala/lazypredict ⭐ 2,687
    Lazy Predict helps build a lot of basic models without much code and helps understand which models work better without any parameter tuning

  95. scikit-learn-contrib/hdbscan ⭐ 2,675
    A high performance implementation of HDBSCAN clustering.
    🔗 hdbscan.readthedocs.io/en/latest

  96. google-research/t5x ⭐ 2,502
    T5X is a modular, composable, research-friendly framework for high-performance, configurable, self-service training, evaluation, and inference of sequence models (starting with language) at many scales.

  97. apple/ml-ane-transformers ⭐ 2,470
    Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE)

  98. huggingface/safetensors ⭐ 2,450
    Implements a new simple format for storing tensors safely (as opposed to pickle) and that is still fast (zero-copy).
    🔗 huggingface.co/docs/safetensors

  99. scikit-learn-contrib/category_encoders ⭐ 2,370
    A library of sklearn compatible categorical variable encoders
    🔗 contrib.scikit-learn.org/category_encoders

  100. freedmand/semantra ⭐ 2,273
    Semantra is a multipurpose tool for semantically searching documents. Query by meaning rather than just by matching text.

  101. huggingface/optimum ⭐ 2,158
    🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
    🔗 huggingface.co/docs/optimum/main

  102. rom1504/clip-retrieval ⭐ 2,143
    Easily compute clip embeddings and build a clip retrieval system with them
    🔗 rom1504.github.io/clip-retrieval

  103. aws/sagemaker-python-sdk ⭐ 2,043
    A library for training and deploying machine learning models on Amazon SageMaker
    🔗 sagemaker.readthedocs.io

  104. huggingface/evaluate ⭐ 1,822
    🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
    🔗 huggingface.co/docs/evaluate

  105. rentruewang/koila ⭐ 1,816
    Prevent PyTorch's CUDA error: out of memory in just 1 line of code.
    🔗 rentruewang.github.io/koila

  106. contextlab/hypertools ⭐ 1,801
    A Python toolbox for gaining geometric insights into high-dimensional data
    🔗 hypertools.readthedocs.io/en/latest

  107. linkedin/greykite ⭐ 1,792
    A flexible, intuitive and fast forecasting library

  108. bmabey/pyLDAvis ⭐ 1,780
    Python library for interactive topic model visualization. Port of the R LDAvis package.

  109. scikit-learn-contrib/lightning ⭐ 1,709
    Large-scale linear classification, regression and ranking in Python
    🔗 contrib.scikit-learn.org/lightning

  110. huggingface/huggingface_hub ⭐ 1,691
    The official Python client for the Hugging Face Hub.
    🔗 huggingface.co/docs/huggingface_hub

  111. tensorflow/addons ⭐ 1,677
    Useful extra functionality for TensorFlow 2.x maintained by SIG-addons

  112. eric-mitchell/direct-preference-optimization ⭐ 1,649
    Reference implementation for DPO (Direct Preference Optimization)

  113. microsoft/i-Code ⭐ 1,636
    The ambition of the i-Code project is to build integrative and composable multimodal AI. The "i" stands for integrative multimodal learning.

  114. castorini/pyserini ⭐ 1,465
    Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
    🔗 pyserini.io

  115. jina-ai/finetuner ⭐ 1,428
    🎯 Task-oriented embedding tuning for BERT, CLIP, etc.
    🔗 finetuner.jina.ai

  116. kubeflow/katib ⭐ 1,424
    Automated Machine Learning on Kubernetes
    🔗 www.kubeflow.org/docs/components/katib

  117. visual-layer/fastdup ⭐ 1,409
    fastdup is a powerful free tool designed to rapidly extract valuable insights from your image & video datasets. It helps you improve the quality of your dataset images and labels and reduce your data operations costs at an unparalleled scale.

  118. laekov/fastmoe ⭐ 1,386
    A fast MoE impl for PyTorch
    🔗 fastmoe.ai

  119. scikit-learn-contrib/metric-learn ⭐ 1,376
    Metric learning algorithms in Python
    🔗 contrib.scikit-learn.org/metric-learn

  120. googlecloudplatform/vertex-ai-samples ⭐ 1,362
    Sample code and notebooks for Vertex AI, the end-to-end machine learning platform on Google Cloud
    🔗 cloud.google.com/vertex-ai

  121. csinva/imodels ⭐ 1,292
    Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
    🔗 csinva.io/imodels

  122. borealisai/advertorch ⭐ 1,273
    A Toolbox for Adversarial Robustness Research

  123. awslabs/dgl-ke ⭐ 1,236
    High performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings.
    🔗 dglke.dgl.ai/doc

  124. microsoft/Olive ⭐ 1,225
    Olive is an easy-to-use hardware-aware model optimization tool that composes industry-leading techniques across model compression, optimization, and compilation.
    🔗 microsoft.github.io/olive

  125. microsoft/Semi-supervised-learning ⭐ 1,191
    A Unified Semi-Supervised Learning Codebase (NeurIPS'22)
    🔗 usb.readthedocs.io

  126. patchy631/machine-learning ⭐ 1,186
    Machine Learning Tutorials Repository

  127. google/vizier ⭐ 1,173
    Python-based research interface for blackbox and hyperparameter optimization, based on the internal Google Vizier Service.
    🔗 oss-vizier.readthedocs.io

  128. spotify/voyager ⭐ 1,160
    🛰️ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.
    🔗 spotify.github.io/voyager

  129. koaning/scikit-lego ⭐ 1,155
    Extra blocks for scikit-learn pipelines.
    🔗 koaning.github.io/scikit-lego

  130. automl/TabPFN ⭐ 1,086
    Official implementation of the TabPFN paper (https://arxiv.org/abs/2207.01848) and the tabpfn package.
    🔗 priorlabs.ai

  131. google-research/deeplab2 ⭐ 989
    DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a unified and state-of-the-art TensorFlow codebase for dense pixel labeling tasks.

  132. lmcinnes/pynndescent ⭐ 841
    A Python nearest neighbor descent for approximate nearest neighbors

  133. hazyresearch/safari ⭐ 840
    Convolutions for Sequence Modeling

  134. davidmrau/mixture-of-experts ⭐ 834
    PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538

  135. qdrant/fastembed ⭐ 794
    Fast, Accurate, Lightweight Python library to make State of the Art Embedding
    🔗 qdrant.github.io/fastembed

  136. opentensor/bittensor ⭐ 781
    Internet-scale Neural Networks
    🔗 www.bittensor.com

  137. nvidia/cuda-python ⭐ 773
    CUDA Python Low-level Bindings
    🔗 nvidia.github.io/cuda-python

  138. oml-team/open-metric-learning ⭐ 762
    OML is a PyTorch-based framework to train and validate the models producing high-quality embeddings.
    🔗 open-metric-learning.readthedocs.io/en/latest/index.html

  139. criteo/autofaiss ⭐ 750
    Automatically create Faiss knn indices with the most optimal similarity search parameters.
    🔗 criteo.github.io/autofaiss

  140. facebookresearch/balance ⭐ 673
    The balance python package offers a simple workflow and methods for dealing with biased data samples when looking to infer from them to some target population of interest.
    🔗 import-balance.org

  141. awslabs/python-deequ ⭐ 649
    Python API for Deequ, a library built on Spark for defining "unit tests for data", which measure data quality in large datasets

  142. nicolas-hbt/pygraft ⭐ 640
    Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips
    🔗 pygraft.readthedocs.io/en/latest

  143. replicate/replicate-python ⭐ 638
    Python client for Replicate
    🔗 replicate.com

  144. hpcaitech/EnergonAI ⭐ 631
    Large-scale model inference.

  145. qdrant/quaterion ⭐ 625
    Blazing fast framework for fine-tuning similarity learning models
    🔗 quaterion.qdrant.tech

  146. huggingface/quanto ⭐ 582
    A pytorch Quantization Toolkit

  147. microsoft/Focal-Transformer ⭐ 542
    [NeurIPS 2021 Spotlight] Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"

  148. googleapis/python-aiplatform ⭐ 537
    A Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning.

  149. huggingface/exporters ⭐ 533
    Export Hugging Face models to Core ML and TensorFlow Lite

  150. nevronai/MetisFL ⭐ 531
    The first open Federated Learning framework implemented in C++ and Python.
    🔗 metisfl.org

  151. cvxgrp/pymde ⭐ 516
    Minimum-distortion embedding with PyTorch
    🔗 pymde.org

  152. dylanhogg/gptauthor ⭐ 44
    GPTAuthor is an AI tool for writing long form, multi-chapter stories given a story prompt.

Machine Learning - Deep Learning

Machine learning libraries that cross over with deep learning in some way.

  1. tensorflow/tensorflow ⭐ 182,581
    An Open Source Machine Learning Framework for Everyone
    🔗 tensorflow.org

  2. pytorch/pytorch ⭐ 78,096
    Tensors and Dynamic neural networks in Python with strong GPU acceleration
    🔗 pytorch.org

  3. keras-team/keras ⭐ 60,973
    Deep Learning for humans
    🔗 keras.io

  4. openai/whisper ⭐ 60,643
    Robust Speech Recognition via Large-Scale Weak Supervision

  5. deepfakes/faceswap ⭐ 49,279
    Deepfakes Software For All
    🔗 www.faceswap.dev

  6. iperov/DeepFaceLab ⭐ 45,526
    DeepFaceLab is the leading software for creating deepfakes.

  7. facebookresearch/segment-anything ⭐ 44,181
    The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

  8. microsoft/DeepSpeed ⭐ 32,815
    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
    🔗 www.deepspeed.ai

  9. rwightman/pytorch-image-models ⭐ 29,844
    PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
    🔗 huggingface.co/docs/timm

  10. facebookresearch/detectron2 ⭐ 28,762
    Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
    🔗 detectron2.readthedocs.io/en/latest

  11. lightning-ai/pytorch-lightning ⭐ 26,963
    The deep learning framework to pretrain, finetune and deploy AI models. PyTorch Lightning is just organized PyTorch - Lightning disentangles PyTorch code to decouple the science from the engineering.
    🔗 lightning.ai

  12. xinntao/Real-ESRGAN ⭐ 26,163
    Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.

  13. facebookresearch/Detectron ⭐ 26,146
    FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

  14. matterport/Mask_RCNN ⭐ 24,167
    Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

  15. openai/CLIP ⭐ 22,297
    CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

  16. paddlepaddle/Paddle ⭐ 21,625
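    PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle 『飞桨』: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning & machine learning)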
    🔗 www.paddlepaddle.org

  17. apache/mxnet ⭐ 20,713
    Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
    🔗 mxnet.apache.org

  18. pyg-team/pytorch_geometric ⭐ 20,157
    Graph Neural Network Library for PyTorch
    🔗 pyg.org

  19. lucidrains/vit-pytorch ⭐ 18,058
    Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

  20. sanster/IOPaint ⭐ 17,194
    Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
    🔗 www.iopaint.com

  21. rasbt/deeplearning-models ⭐ 16,316
    A collection of various deep learning architectures, models, and tips

  22. danielgatis/rembg ⭐ 14,588
    Rembg is a tool to remove image backgrounds

  23. albumentations-team/albumentations ⭐ 13,448
    Fast image augmentation library and an easy-to-use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
    🔗 albumentations.ai

  24. dmlc/dgl ⭐ 13,020
    Python package built to ease deep learning on graph, on top of existing DL frameworks.
    🔗 dgl.ai

  25. facebookresearch/detr ⭐ 12,863
    End-to-End Object Detection with Transformers

  26. nvidia/DeepLearningExamples ⭐ 12,643
    State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

  27. kornia/kornia ⭐ 9,419
    Geometric Computer Vision Library for Spatial AI
    🔗 kornia.readthedocs.io

  28. keras-team/autokeras ⭐ 9,067
    AutoML library for deep learning
    🔗 autokeras.com

  29. mlfoundations/open_clip ⭐ 8,491
    An open source implementation of OpenAI's CLIP (Contrastive Language-Image Pre-training)

  30. pyro-ppl/pyro ⭐ 8,368
    Deep universal probabilistic programming with Python and PyTorch
    🔗 pyro.ai

  31. facebookresearch/pytorch3d ⭐ 8,311
    PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
    🔗 pytorch3d.org

  32. nvidia/apex ⭐ 8,044
    A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch

  33. google/trax ⭐ 7,958
    Trax — Deep Learning with Clear Code and Speed

  34. arogozhnikov/einops ⭐ 7,942
    Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
    🔗 einops.rocks

  35. facebookresearch/ImageBind ⭐ 7,894
    ImageBind: One Embedding Space to Bind Them All

  36. lucidrains/imagen-pytorch ⭐ 7,792
    Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch

  37. xpixelgroup/BasicSR ⭐ 6,204
    Open Source Image and Video Restoration Toolbox for Super-resolution, Denoise, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. Also support StyleGAN2, DFDNet.
    🔗 basicsr.readthedocs.io/en/latest

  38. skorch-dev/skorch ⭐ 5,635
    A scikit-learn compatible neural network library that wraps PyTorch

  39. google/flax ⭐ 5,538
    Flax is a neural network library for JAX that is designed for flexibility.
    🔗 flax.readthedocs.io

  40. facebookresearch/mmf ⭐ 5,417
    A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
    🔗 mmf.sh

  41. mosaicml/composer ⭐ 5,002
    Supercharge Your Model Training
    🔗 docs.mosaicml.com

  42. pytorch/ignite ⭐ 4,458
    High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.
    🔗 pytorch-ignite.ai

  43. facebookincubator/AITemplate ⭐ 4,456
    AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

  44. deci-ai/super-gradients ⭐ 4,339
    Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS.
    🔗 www.supergradients.com

  45. nvidiagameworks/kaolin ⭐ 4,234
    A PyTorch Library for Accelerating 3D Deep Learning Research

  46. williamyang1991/VToonify ⭐ 3,468
    [SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer

  47. facebookresearch/PyTorch-BigGraph ⭐ 3,351
    Generate embeddings from large-scale graph-structured data.
    🔗 torchbiggraph.readthedocs.io

  48. cvg/LightGlue ⭐ 2,997
    LightGlue: Local Feature Matching at Light Speed (ICCV 2023)

  49. alpa-projects/alpa ⭐ 2,987
    Training and serving large-scale neural networks with auto parallelization.
    🔗 alpa.ai

  50. pytorch/botorch ⭐ 2,957
    Bayesian optimization in PyTorch
    🔗 botorch.org

  51. deepmind/dm-haiku ⭐ 2,806
    JAX-based neural network library
    🔗 dm-haiku.readthedocs.io

  52. explosion/thinc ⭐ 2,794
    🔮 A refreshing functional take on deep learning, compatible with your favorite libraries
    🔗 thinc.ai

  53. nerdyrodent/VQGAN-CLIP ⭐ 2,570
    Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

  54. danielegrattarola/spektral ⭐ 2,346
    Graph Neural Networks with Keras and Tensorflow 2.
    🔗 graphneural.network

  55. google-research/electra ⭐ 2,296
    ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

  56. neuralmagic/sparseml ⭐ 1,977
    Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

  57. fepegar/torchio ⭐ 1,957
    Medical imaging toolkit for deep learning
    🔗 www.torchio.org

  58. pytorch/torchrec ⭐ 1,731
    Pytorch domain library for recommendation systems

  59. tensorflow/mesh ⭐ 1,557
    Mesh TensorFlow: Model Parallelism Made Easier

  60. vt-vl-lab/FGVC ⭐ 1,546
    [ECCV 2020] Flow-edge Guided Video Completion

  61. tensorly/tensorly ⭐ 1,495
    TensorLy: Tensor Learning in Python.
    🔗 tensorly.org

  62. calculatedcontent/WeightWatcher ⭐ 1,393
    The WeightWatcher tool for predicting the accuracy of Deep Neural Networks

  63. hysts/pytorch_image_classification ⭐ 1,317
    PyTorch implementation of image classification models for CIFAR-10/CIFAR-100/MNIST/FashionMNIST/Kuzushiji-MNIST/ImageNet

  64. jeshraghian/snntorch ⭐ 1,087
    Deep and online learning with spiking neural networks in Python
    🔗 snntorch.readthedocs.io/en/latest

  65. xl0/lovely-tensors ⭐ 1,051
    Tensors, ready for human consumption
    🔗 xl0.github.io/lovely-tensors

  66. tensorflow/similarity ⭐ 996
    TensorFlow Similarity is a python package focused on making similarity learning quick and easy.

  67. deepmind/android_env ⭐ 954
    RL research on Android devices.

  68. keras-team/keras-cv ⭐ 949
    Industry-strength Computer Vision workflows with Keras

  69. deepmind/chex ⭐ 716
    Chex is a library of utilities for helping to write reliable JAX code
    🔗 chex.readthedocs.io

  70. kakaobrain/rq-vae-transformer ⭐ 690
    The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)

  71. whitead/dmol-book ⭐ 579
    Deep learning for molecules and materials book
    🔗 dmol.pub

Machine Learning - Interpretability

Machine learning interpretability libraries. Covers explainability, prediction explanations, dashboards, understanding knowledge development in training.

  1. slundberg/shap ⭐ 21,673
    A game theoretic approach to explain the output of any machine learning model.
    🔗 shap.readthedocs.io

  2. marcotcr/lime ⭐ 11,302
    Lime: Explaining the predictions of any machine learning classifier

  3. interpretml/interpret ⭐ 6,001
    Fit interpretable models. Explain blackbox machine learning.
    🔗 interpret.ml/docs

  4. tensorflow/lucid ⭐ 4,613
    A collection of infrastructure and tools for research in neural network interpretability.

  5. pytorch/captum ⭐ 4,581
    Model interpretability and understanding for PyTorch
    🔗 captum.ai

  6. pair-code/lit ⭐ 3,398
    The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.
    🔗 pair-code.github.io/lit

  7. maif/shapash ⭐ 2,646
    🔅 Shapash: User-friendly Explainability and Interpretability to Develop Reliable and Transparent Machine Learning Models
    🔗 maif.github.io/shapash

  8. arize-ai/phoenix ⭐ 2,638
    AI Observability & Evaluation
    🔗 docs.arize.com/phoenix

  9. seldonio/alibi ⭐ 2,290
    Algorithms for explaining machine learning models
    🔗 docs.seldon.io/projects/alibi/en/stable

  10. oegedijk/explainerdashboard ⭐ 2,227
    Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models.
    🔗 explainerdashboard.readthedocs.io

  11. eleutherai/pythia ⭐ 2,054
    Interpretability analysis and scaling laws to understand how knowledge develops and evolves during training in autoregressive transformers

  12. jalammar/ecco ⭐ 1,906
    Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, BERT, RoBERTA, T5, and T0).
    🔗 ecco.readthedocs.io

  13. google-deepmind/penzai ⭐ 1,413
    A JAX library for writing models as legible, functional pytree data structures, along with tools for visualizing, modifying, and analyzing them. Penzai focuses on making it easy to do stuff with models after they have been trained
    🔗 penzai.readthedocs.io

  14. cdpierse/transformers-interpret ⭐ 1,212
    Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.

  15. ethicalml/xai ⭐ 1,064
    XAI is a Machine Learning library that is designed with AI explainability at its core. XAI contains various tools that enable analysis and evaluation of data and models
    🔗 ethical.institute/principles.html#commitment-3

  16. selfexplainml/PiML-Toolbox ⭐ 871
    PiML (Python Interpretable Machine Learning) toolbox for model development & diagnostics
    🔗 selfexplainml.github.io/piml-toolbox

Machine Learning - Ops

MLOps tools, frameworks and libraries: intersection of machine learning, data engineering and DevOps; deployment, health, diagnostics and governance of ML models.

  1. apache/airflow ⭐ 34,571
    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
    🔗 airflow.apache.org

  2. ray-project/ray ⭐ 31,174
    Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
    🔗 ray.io

  3. spotify/luigi ⭐ 17,331
    Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.

  4. mlflow/mlflow ⭐ 17,321
    Open source platform for the machine learning lifecycle
    🔗 mlflow.org

  5. prefecthq/prefect ⭐ 14,696
    Prefect is a workflow orchestration tool empowering developers to build, observe, and react to data pipelines
    🔗 prefect.io

  6. horovod/horovod ⭐ 13,955
    Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
    🔗 horovod.ai

  7. iterative/dvc ⭐ 13,141
    🦉 ML Experiments and Data Management with Git
    🔗 dvc.org

  8. ludwig-ai/ludwig ⭐ 10,828
    Low-code framework for building custom LLMs, neural networks, and other AI models
    🔗 ludwig.ai

  9. dagster-io/dagster ⭐ 10,269
    An orchestration platform for the development, production, and observation of data assets.
    🔗 dagster.io

  10. great-expectations/great_expectations ⭐ 9,475
    Always know what to expect from your data.
    🔗 docs.greatexpectations.io

  11. kedro-org/kedro ⭐ 9,368
    Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
    🔗 kedro.org

  12. dbt-labs/dbt-core ⭐ 8,922
    dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
    🔗 getdbt.com

  13. bentoml/OpenLLM ⭐ 8,825
    Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint in the cloud.
    🔗 bentoml.com

  14. huggingface/text-generation-inference ⭐ 7,927
    A Rust, Python and gRPC server for text generation inference. Used in production at HuggingFace to power Hugging Chat, the Inference API and Inference Endpoint.
    🔗 hf.co/docs/text-generation-inference

  15. activeloopai/deeplake ⭐ 7,726
    Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
    🔗 activeloop.ai

  16. netflix/metaflow ⭐ 7,612
    🚀 Build and manage real-life ML, AI, and data science projects with ease!
    🔗 metaflow.org

  17. mage-ai/mage-ai ⭐ 7,067
    🧙 Build, run, and manage data pipelines for integrating and transforming data.
    🔗 www.mage.ai

  18. bentoml/BentoML ⭐ 6,561
    The most flexible way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Inference Graph/Pipelines, Compound AI systems, Multi-Modal, RAG as a Service, and more!
    🔗 bentoml.com

  19. kestra-io/kestra ⭐ 6,434
    Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
    🔗 kestra.io

  20. feast-dev/feast ⭐ 5,271
    Feature Store for Machine Learning
    🔗 feast.dev

  21. allegroai/clearml ⭐ 5,271
    ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
    🔗 clear.ml/docs

  22. aimhubio/aim ⭐ 4,804
    Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
    🔗 aimstack.io

  23. flyteorg/flyte ⭐ 4,782
    Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
    🔗 flyte.org

  24. evidentlyai/evidently ⭐ 4,665
    Evaluate and monitor ML models from validation to production. Join our Discord: https://discord.com/invite/xZjKRaNp8b

  25. adap/flower ⭐ 4,203
    Flower: A Friendly Federated Learning Framework
    🔗 flower.ai

  26. orchest/orchest ⭐ 4,022
    Build data pipelines, the easy way 🛠️
    🔗 orchest.readthedocs.io/en/stable

  27. zenml-io/zenml ⭐ 3,675
    ZenML 🙏: Build portable, production-ready MLOps pipelines. https://zenml.io.
    🔗 zenml.io

  28. langfuse/langfuse ⭐ 3,561
    🪢 Open source LLM engineering platform: Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
    🔗 langfuse.com/docs

  29. polyaxon/polyaxon ⭐ 3,485
    MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle
    🔗 polyaxon.com

  30. kubeflow/pipelines ⭐ 3,446
    Machine Learning Pipelines for Kubeflow
    🔗 www.kubeflow.org/docs/components/pipelines

  31. ploomber/ploomber ⭐ 3,380
    The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
    🔗 docs.ploomber.io

  32. towhee-io/towhee ⭐ 3,001
    Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
    🔗 towhee.io

  33. determined-ai/determined ⭐ 2,866
    Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
    🔗 determined.ai

  34. leptonai/leptonai ⭐ 2,452
    A Pythonic framework to simplify AI service building
    🔗 lepton.ai

  35. internlm/xtuner ⭐ 2,420
    An efficient, flexible and full-featured toolkit for fine-tuning large models (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

  36. internlm/lmdeploy ⭐ 2,399
    LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
    🔗 lmdeploy.readthedocs.io/en/latest

  37. meltano/meltano ⭐ 1,597
    Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
    🔗 meltano.com

  38. hi-primus/optimus ⭐ 1,446
    🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
    🔗 hi-optimus.com

  39. kubeflow/examples ⭐ 1,375
    A repository to host extended examples and tutorials

  40. dagworks-inc/hamilton ⭐ 1,343
    Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows, that encode lineage and metadata. Runs and scales everywhere python does.
    🔗 hamilton.dagworks.io/en/latest

  41. azure/PyRIT ⭐ 1,271
    The Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and ML engineers to red team foundation models and their applications.

  42. dstackai/dstack ⭐ 1,105
    An open-source container orchestration engine for running AI workloads in any cloud or data center. https://discord.gg/u8SmfwPpMd
    🔗 dstack.ai

  43. nccr-itmo/FEDOT ⭐ 605
    Automated modeling and machine learning framework FEDOT
    🔗 fedot.readthedocs.io

  44. dagworks-inc/burr ⭐ 438
    Build applications that make decisions (chatbots, agents, simulations, etc...). Monitor, persist, and execute on your own infrastructure.
    🔗 burr.dagworks.io

Machine Learning - Reinforcement

Machine learning libraries and toolkits that cross over with reinforcement learning in some way: agent reinforcement learning, agent environments, RLHF

  1. openai/gym ⭐ 33,908
    A toolkit for developing and comparing reinforcement learning algorithms.
    🔗 www.gymlibrary.dev

  2. unity-technologies/ml-agents ⭐ 16,374
    The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
    🔗 unity.com/products/machine-learning-agents

  3. openai/baselines ⭐ 15,355
    OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

  4. google/dopamine ⭐ 10,375
    Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
    🔗 github.com/google/dopamine

  5. deepmind/pysc2 ⭐ 7,916
    StarCraft II Learning Environment

  6. lucidrains/PaLM-rlhf-pytorch ⭐ 7,596
    Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

  7. thu-ml/tianshou ⭐ 7,431
    An elegant PyTorch deep reinforcement learning library.
    🔗 tianshou.org

  8. tensorlayer/TensorLayer ⭐ 7,297
    Deep Learning and Reinforcement Learning Library for Scientists and Engineers
    🔗 tensorlayerx.com

  9. farama-foundation/Gymnasium ⭐ 5,767
    An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
    🔗 gymnasium.farama.org

  10. keras-rl/keras-rl ⭐ 5,493
    Deep Reinforcement Learning for Keras.
    🔗 keras-rl.readthedocs.io

  11. deepmind/dm_control ⭐ 3,557
    Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.

  12. facebookresearch/ReAgent ⭐ 3,522
    A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)
    🔗 reagent.ai

  13. ai4finance-foundation/ElegantRL ⭐ 3,453
    Massively Parallel Deep Reinforcement Learning. 🔥
    🔗 ai4finance.org

  14. deepmind/acme ⭐ 3,380
    A library of reinforcement learning components and agents

  15. eureka-research/Eureka ⭐ 2,596
    Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models"
    🔗 eureka-research.github.io

  16. pettingzoo-team/PettingZoo ⭐ 2,385
    An API standard for multi-agent reinforcement learning environments, with popular reference environments and related utilities
    🔗 pettingzoo.farama.org

  17. kzl/decision-transformer ⭐ 2,160
    Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

  18. pytorch/rl ⭐ 1,883
    A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
    🔗 pytorch.org/rl

  19. anthropics/hh-rlhf ⭐ 1,442
    Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
    🔗 arxiv.org/abs/2204.05862

  20. humancompatibleai/imitation ⭐ 1,140
    Clean PyTorch implementations of imitation and reward learning algorithms
    🔗 imitation.readthedocs.io

  21. arise-initiative/robosuite ⭐ 1,087
    robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
    🔗 robosuite.ai

  22. denys88/rl_games ⭐ 730
    RL Games: High performance RL library

Natural Language Processing

Natural language processing libraries and toolkits: text processing, topic modelling, tokenisers, chatbots. Also see the LLMs and ChatGPT category for crossover.

  1. huggingface/transformers ⭐ 125,379
    🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
    🔗 huggingface.co/transformers

  2. pytorch/fairseq ⭐ 29,290
    Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

  3. explosion/spaCy ⭐ 28,782
    💫 Industrial-strength Natural Language Processing (NLP) in Python
    🔗 spacy.io

  4. myshell-ai/OpenVoice ⭐ 23,425
    Instant voice cloning by MyShell.
    🔗 research.myshell.ai/open-voice

  5. huggingface/datasets ⭐ 18,439
    🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
    🔗 huggingface.co/docs/datasets

  6. microsoft/unilm ⭐ 18,364
    Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
    🔗 aka.ms/generalai

  7. rare-technologies/gensim ⭐ 15,258
    Topic Modelling for Humans
    🔗 radimrehurek.com/gensim

  8. gunthercox/ChatterBot ⭐ 13,899
    ChatterBot is a machine learning, conversational dialog engine for creating chat bots
    🔗 chatterbot.readthedocs.io

  9. ukplab/sentence-transformers ⭐ 13,832
    Multilingual Sentence & Image Embeddings with BERT
    🔗 www.sbert.net

  10. flairnlp/flair ⭐ 13,580
    A very simple framework for state-of-the-art Natural Language Processing (NLP)
    🔗 flairnlp.github.io/flair

  11. nltk/nltk ⭐ 13,047
    NLTK Source
    🔗 www.nltk.org

  12. jina-ai/clip-as-service ⭐ 12,198
    🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
    🔗 clip-as-service.jina.ai

  13. allenai/allennlp ⭐ 11,695
    An open-source NLP research library, built on PyTorch.
    🔗 www.allennlp.org

  14. facebookresearch/ParlAI ⭐ 10,431
    A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
    🔗 parl.ai

  15. facebookresearch/seamless_communication ⭐ 10,231
    Foundational Models for State-of-the-Art Speech and Text Translation

  16. nvidia/NeMo ⭐ 10,122
    A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
    🔗 docs.nvidia.com/nemo-framework/user-guide/latest/overview.html

  17. openai/tiktoken ⭐ 9,878
    tiktoken is a fast BPE tokeniser for use with OpenAI's models.

  18. google/sentencepiece ⭐ 9,517
    Unsupervised text tokenizer for Neural Network-based text generation.

  19. m-bain/whisperX ⭐ 9,070
    WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

  20. doccano/doccano ⭐ 9,008
    Open source annotation tool for machine learning practitioners.

  21. togethercomputer/OpenChatKit ⭐ 8,998
    OpenChatKit provides a powerful, open-source base to create both specialized and general purpose chatbots

  22. sloria/TextBlob ⭐ 8,953
    Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
    🔗 textblob.readthedocs.io

  23. clips/pattern ⭐ 8,668
    Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
    🔗 github.com/clips/pattern/wiki

  24. vikparuchuri/marker ⭐ 8,222
    Marker converts PDF, EPUB, and MOBI to markdown. It's 10x faster than nougat, more accurate on most documents, and has low hallucination risk.

  25. facebookresearch/nougat ⭐ 8,079
    Implementation of Nougat Neural Optical Understanding for Academic Documents
    🔗 facebookresearch.github.io/nougat

  26. speechbrain/speechbrain ⭐ 7,903
    A PyTorch-based Speech Toolkit
    🔗 speechbrain.github.io

  27. espnet/espnet ⭐ 7,896
    End-to-End Speech Processing Toolkit
    🔗 espnet.github.io/espnet

  28. neuml/txtai ⭐ 7,021
    💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
    🔗 neuml.github.io/txtai

  29. deeppavlov/DeepPavlov ⭐ 6,548
    An open source library for deep learning end-to-end dialog systems and chatbots.
    🔗 deeppavlov.ai

  30. facebookresearch/metaseq ⭐ 6,388
    A codebase for working with Open Pre-trained Transformers, originally forked from fairseq.

  31. kingoflolz/mesh-transformer-jax ⭐ 6,222
    Model parallel transformers in JAX and Haiku

  32. vikparuchuri/surya ⭐ 6,188
    OCR, layout analysis, reading order, line detection in 90+ languages

  33. maartengr/BERTopic ⭐ 5,573
    Leveraging BERT and c-TF-IDF to create easily interpretable topics.
    🔗 maartengr.github.io/bertopic

  34. minimaxir/textgenrnn ⭐ 4,942
    Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.

  35. prefecthq/marvin ⭐ 4,775
    ✨ Build AI interfaces that spark joy
    🔗 askmarvin.ai

  36. salesforce/CodeGen ⭐ 4,769
    CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.

  37. aiwaves-cn/agents ⭐ 4,531
    An Open-source Framework for Autonomous Language Agents
    🔗 www.aiwaves-agents.com

  38. layout-parser/layout-parser ⭐ 4,473
    A Unified Toolkit for Deep Learning Based Document Image Analysis
    🔗 layout-parser.github.io

  39. facebookresearch/DrQA ⭐ 4,467
    Reading Wikipedia to Answer Open-Domain Questions

  40. makcedward/nlpaug ⭐ 4,305
    Data augmentation for NLP
    🔗 makcedward.github.io

  41. thilinarajapakse/simpletransformers ⭐ 3,988
    Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
    🔗 simpletransformers.ai

  42. life4/textdistance ⭐ 3,302
    📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.

  43. jsvine/markovify ⭐ 3,271
    A simple, extensible Markov chain generator.

  44. maartengr/KeyBERT ⭐ 3,224
    Minimal keyword extraction with BERT
    🔗 maartengr.github.io/keybert

  45. argilla-io/argilla ⭐ 3,122
    Argilla is a collaboration platform for AI engineers and domain experts that require high-quality outputs, full data ownership, and overall efficiency.
    🔗 docs.argilla.io

  46. bytedance/lightseq ⭐ 3,098
    LightSeq: A High Performance Library for Sequence Processing and Generation

  47. errbotio/errbot ⭐ 3,060
    Errbot is a chatbot, a daemon that connects to your favorite chat service and brings your tools and some fun into the conversation.
    🔗 errbot.io

  48. promptslab/Promptify ⭐ 3,033
    Prompt Engineering | Prompt Versioning | Use GPT or other prompt based models to get structured output. Join our discord for Prompt-Engineering, LLMs and other latest research
    🔗 discord.gg/m88xfymbk6

  49. huawei-noah/Pretrained-Language-Model ⭐ 2,961
    Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

  50. neuralmagic/deepsparse ⭐ 2,879
    Sparsity-aware deep learning inference runtime for CPUs
    🔗 neuralmagic.com/deepsparse

  51. jbesomi/texthero ⭐ 2,865
    Text preprocessing, representation and visualization from zero to hero.
    🔗 texthero.org

  52. ddangelov/Top2Vec ⭐ 2,844
    Top2Vec learns jointly embedded topic, document and word vectors.

  53. huggingface/neuralcoref ⭐ 2,806
    ✨Fast Coreference Resolution in spaCy with Neural Networks
    🔗 huggingface.co/coref

  54. salesforce/CodeT5 ⭐ 2,596
    Home of CodeT5: Open Code LLMs for Code Understanding and Generation
    🔗 arxiv.org/abs/2305.07922

  55. bigscience-workshop/promptsource ⭐ 2,510
    Toolkit for creating, sharing and using natural language prompts.

  56. huggingface/setfit ⭐ 1,992
    SetFit is an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers.
    🔗 hf.co/docs/setfit

  57. jamesturk/jellyfish ⭐ 1,990
    🪼 A Python library for doing approximate and phonetic matching of strings.
    🔗 jamesturk.github.io/jellyfish

  58. alibaba/EasyNLP ⭐ 1,953
    EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit

  59. thudm/P-tuning-v2 ⭐ 1,889
    An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks

  60. deepset-ai/FARM ⭐ 1,724
    🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
    🔗 farm.deepset.ai

  61. marella/ctransformers ⭐ 1,703
    Python bindings for the Transformer models implemented in C/C++ using GGML library.

  62. featureform/featureform ⭐ 1,695
    The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
    🔗 www.featureform.com

  63. franck-dernoncourt/NeuroNER ⭐ 1,679
    Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.
    🔗 neuroner.com

  64. plasticityai/magnitude ⭐ 1,611
    A fast, efficient universal vector embedding utility package.

  65. arxiv-vanity/arxiv-vanity ⭐ 1,597
    Renders papers from arXiv as responsive web pages so you don't have to squint at a PDF.
    🔗 www.arxiv-vanity.com

  66. google-research/language ⭐ 1,563
    Shared repository for open-sourced projects from the Google AI Language team.
    🔗 ai.google/research/teams/language

  67. explosion/spacy-models ⭐ 1,515
    💫 Models for the spaCy Natural Language Processing (NLP) library
    🔗 spacy.io

  68. chrismattmann/tika-python ⭐ 1,420
    Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.

  69. dmmiller612/bert-extractive-summarizer ⭐ 1,348
    Easy to use extractive text summarization with BERT

  70. gunthercox/chatterbot-corpus ⭐ 1,346
    A multilingual dialog corpus
    🔗 chatterbot-corpus.readthedocs.io

  71. jonasgeiping/cramming ⭐ 1,238
    Cramming the training of a (BERT-type) language model into limited compute.

  72. abertsch72/unlimiformer ⭐ 1,032
    Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"

  73. nomic-ai/nomic ⭐ 1,006
    Interact, analyze and structure massive text, image, embedding, audio and video datasets
    🔗 atlas.nomic.ai

  74. pemistahl/lingua-py ⭐ 915
    The most accurate natural language detection library for Python, suitable for short text and mixed-language text

  75. norskregnesentral/skweak ⭐ 910
    skweak: A software toolkit for weak supervision applied to NLP tasks

  76. intellabs/fastRAG ⭐ 909
    Efficient Retrieval Augmentation and Generation Framework

  77. openai/grade-school-math ⭐ 881
    GSM8K, a dataset of 8.5K high quality linguistically diverse grade school math word problems

  78. explosion/spacy-streamlit ⭐ 765
    👑 spaCy building blocks and visualizers for Streamlit apps
    🔗 share.streamlit.io/ines/spacy-streamlit-demo/master/app.py

  79. paddlepaddle/RocketQA ⭐ 744
    🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.

  80. explosion/spacy-stanza ⭐ 715
    💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy

  81. keras-team/keras-nlp ⭐ 701
    Modular Natural Language Processing workflows with Keras

  82. urchade/GLiNER ⭐ 624
    Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 24
    🔗 arxiv.org/abs/2311.08526

Packaging

Python packaging, dependency management and bundling.

  1. pyenv/pyenv ⭐ 36,797
    pyenv lets you easily switch between multiple versions of Python.

  2. python-poetry/poetry ⭐ 29,557
    Python packaging and dependency management made easy
    🔗 python-poetry.org

  3. pypa/pipenv ⭐ 24,611
    A virtualenv management tool that supports a multitude of systems and nicely bridges the gaps between pip, python and virtualenv.
    🔗 pipenv.pypa.io

  4. astral-sh/uv ⭐ 11,453
    An extremely fast Python package installer and resolver, written in Rust. Designed as a drop-in replacement for pip and pip-compile.
    🔗 astral.sh

  5. mitsuhiko/rye ⭐ 11,410
    a Hassle-Free Python Experience
    🔗 rye-up.com

  6. pyinstaller/pyinstaller ⭐ 11,315
    Freeze (package) Python programs into stand-alone executables
    🔗 www.pyinstaller.org

  7. pypa/pipx ⭐ 8,898
    Install and Run Python Applications in Isolated Environments
    🔗 pipx.pypa.io

  8. jazzband/pip-tools ⭐ 7,480
    A set of tools to keep your pinned Python dependencies fresh (pip-compile + pip-sync)
    🔗 pip-tools.rtfd.io

  9. pdm-project/pdm ⭐ 6,589
    A modern Python package and dependency manager supporting the latest PEP standards
    🔗 pdm-project.org

  10. mamba-org/mamba ⭐ 6,285
    The Fast Cross-Platform Package Manager: mamba is a reimplementation of the conda package manager in C++
    🔗 mamba.readthedocs.io

  11. conda/conda ⭐ 6,096
    A system-level, binary package and environment manager running on all major operating systems and platforms.
    🔗 docs.conda.io/projects/conda

  12. pypa/hatch ⭐ 5,356
    Modern, extensible Python project management
    🔗 hatch.pypa.io/latest

  13. conda-forge/miniforge ⭐ 5,342
    A conda-forge distribution.
    🔗 conda-forge.org/miniforge

  14. indygreg/PyOxidizer ⭐ 5,206
    A modern Python application packaging and distribution tool

  15. pypa/virtualenv ⭐ 4,714
    A tool to create isolated Python environments. Since Python 3.3, a subset of it has been integrated into the standard lib venv module.
    🔗 virtualenv.pypa.io

  16. spack/spack ⭐ 3,983
    A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
    🔗 spack.io

  17. pantsbuild/pex ⭐ 2,458
    A tool for generating .pex (Python EXecutable) files, lock files and venvs.
    🔗 docs.pex-tool.org

  18. beeware/briefcase ⭐ 2,331
    Tools to support converting a Python project into a standalone native application.
    🔗 briefcase.readthedocs.io

  19. pypa/flit ⭐ 2,098
    Simplified packaging of Python modules
    🔗 flit.pypa.io

  20. prefix-dev/pixi ⭐ 1,935
    pixi is a cross-platform, multi-language package manager and workflow tool built on the foundation of the conda ecosystem.
    🔗 pixi.sh

  21. linkedin/shiv ⭐ 1,691
    shiv is a command line utility for building fully self contained Python zipapps as outlined in PEP 441, but with all their dependencies included.

  22. marcelotduarte/cx_Freeze ⭐ 1,248
    cx_Freeze creates standalone executables from Python scripts, with the same performance, is cross-platform and should work on any platform that Python itself works on.
    🔗 marcelotduarte.github.io/cx_freeze

  23. ofek/pyapp ⭐ 1,054
    Runtime installer for Python applications
    🔗 ofek.dev/pyapp

  24. pypa/gh-action-pypi-publish ⭐ 839
    The blessed GitHub Action for publishing your 📦 distribution files to PyPI: https://github.com/marketplace/actions/pypi-publish
    🔗 packaging.python.org/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows

  25. py2exe/py2exe ⭐ 749
    Create standalone Windows programs from Python code
    🔗 www.py2exe.org

  26. prefix-dev/rip ⭐ 620
    RIP is a library that allows the resolving and installing of Python PyPI packages from Rust into a virtual environment. It's based on our experience with building Rattler and aims to provide the same experience but for PyPI instead of Conda.
    🔗 prefix.dev

  27. snok/install-poetry ⭐ 533
    GitHub Action for installing and configuring Poetry

  28. python-poetry/install.python-poetry.org ⭐ 176
    The official Poetry installation script
    🔗 install.python-poetry.org

Pandas

Pandas and dataframe libraries: data analysis, statistical reporting, pandas GUIs, pandas performance optimisations.

  1. pandas-dev/pandas ⭐ 42,008
    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
    🔗 pandas.pydata.org

  2. pola-rs/polars ⭐ 26,337
    Dataframes powered by a multithreaded, vectorized query engine, written in Rust
    🔗 docs.pola.rs

  3. duckdb/duckdb ⭐ 16,858
    DuckDB is an in-process SQL OLAP Database Management System
    🔗 www.duckdb.org

  4. ydataai/ydata-profiling ⭐ 12,067
    1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
    🔗 docs.profiling.ydata.ai

  5. gventuri/pandas-ai ⭐ 11,015
    Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
    🔗 pandas-ai.com

  6. kanaries/pygwalker ⭐ 9,844
    PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis
    🔗 kanaries.net/pygwalker

  7. rapidsai/cudf ⭐ 7,297
    cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data
    🔗 docs.rapids.ai/api/cudf/stable

  8. aws/aws-sdk-pandas ⭐ 3,804
    pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
    🔗 aws-sdk-pandas.readthedocs.io

  9. nalepae/pandarallel ⭐ 3,497
    A simple and efficient tool to parallelize Pandas operations on all available CPUs
    🔗 nalepae.github.io/pandarallel

  10. blaze/blaze ⭐ 3,180
    NumPy and Pandas interface to Big Data
    🔗 blaze.pydata.org

  11. adamerose/PandasGUI ⭐ 3,133
    A GUI for Pandas DataFrames

  12. unionai-oss/pandera ⭐ 3,011
    A light-weight, flexible, and expressive statistical data testing library
    🔗 www.union.ai/pandera

  13. pydata/pandas-datareader ⭐ 2,825
    Extract data from a wide range of Internet sources into a pandas DataFrame.
    🔗 pydata.github.io/pandas-datareader/stable/index.html

  14. scikit-learn-contrib/sklearn-pandas ⭐ 2,783
    Pandas integration with sklearn

  15. jmcarpenter2/swifter ⭐ 2,468
    A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner

  16. fugue-project/fugue ⭐ 1,881
    A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
    🔗 fugue-tutorials.readthedocs.io

  17. delta-io/delta-rs ⭐ 1,833
    A native Rust library for Delta Lake, with bindings into Python
    🔗 delta-io.github.io/delta-rs

  18. eventual-inc/Daft ⭐ 1,692
    Distributed DataFrame for Python designed for the cloud, powered by Rust
    🔗 getdaft.io

  19. pyjanitor-devs/pyjanitor ⭐ 1,286
    Clean APIs for data cleaning. Python implementation of R package Janitor
    🔗 pyjanitor-devs.github.io/pyjanitor

  20. machow/siuba ⭐ 1,101
    Python library for using dplyr like syntax with pandas and SQL
    🔗 siuba.org

  21. renumics/spotlight ⭐ 1,013
    Interactively explore unstructured datasets from your dataframe.
    🔗 renumics.com

  22. holoviz/hvplot ⭐ 942
    A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
    🔗 hvplot.holoviz.org

  23. tkrabel/bamboolib ⭐ 934
    bamboolib - a GUI for pandas DataFrames
    🔗 bamboolib.com

  24. mwouts/itables ⭐ 671
    This package changes how Pandas and Polars DataFrames are rendered in Jupyter Notebooks. With itables you can display your tables as interactive DataTables that you can sort, paginate, scroll or filter.
    🔗 mwouts.github.io/itables

Performance

Performance, parallelisation and low level libraries.

  1. celery/celery ⭐ 23,551
    Distributed Task Queue (development branch)
    🔗 docs.celeryq.dev

  2. google/flatbuffers ⭐ 22,062
    FlatBuffers: Memory Efficient Serialization Library
    🔗 flatbuffers.dev

  3. pybind/pybind11 ⭐ 14,816
    Seamless operability between C++11 and Python
    🔗 pybind11.readthedocs.io

  4. exaloop/codon ⭐ 13,851
    A high-performance, zero-overhead, extensible Python compiler using LLVM
    🔗 docs.exaloop.io/codon

  5. dask/dask ⭐ 12,021
    Parallel computing with task scheduling
    🔗 dask.org

  6. modin-project/modin ⭐ 9,486
    Modin: Scale your Pandas workflows by changing a single line of code
    🔗 modin.readthedocs.io

  7. numba/numba ⭐ 9,466
    NumPy aware dynamic Python compiler using LLVM
    🔗 numba.pydata.org

  8. nebuly-ai/nebuly ⭐ 8,363
    The user analytics platform for LLMs
    🔗 www.nebuly.com

  9. vaexio/vaex ⭐ 8,170
    Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
    🔗 vaex.io

  10. mher/flower ⭐ 6,180
    Real-time monitor and web admin for Celery distributed task queue
    🔗 flower.readthedocs.io

  11. python-trio/trio ⭐ 5,898
    Trio – a friendly Python library for async concurrency and I/O
    🔗 trio.readthedocs.io

  12. ultrajson/ultrajson ⭐ 4,250
    Ultra fast JSON decoder and encoder written in C with Python bindings
    🔗 pypi.org/project/ujson

  13. facebookincubator/cinder ⭐ 3,381
    Cinder is Meta's internal performance-oriented production version of CPython.
    🔗 trycinder.com

  14. tlkh/asitop ⭐ 2,918
    Perf monitoring CLI tool for Apple Silicon
    🔗 tlkh.github.io/asitop

  15. ipython/ipyparallel ⭐ 2,551
    IPython Parallel: Interactive Parallel Computing in Python
    🔗 ipyparallel.readthedocs.io

  16. h5py/h5py ⭐ 2,003
    HDF5 for Python -- The h5py package is a Pythonic interface to the HDF5 binary data format.
    🔗 www.h5py.org

  17. intel/intel-extension-for-transformers ⭐ 1,951
    ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

  18. airtai/faststream ⭐ 1,811
    FastStream is a powerful and easy-to-use Python framework for building asynchronous services interacting with event streams such as Apache Kafka, RabbitMQ, NATS and Redis.
    🔗 faststream.airt.ai/latest

  19. faster-cpython/ideas ⭐ 1,651
    Discussion and work tracker for Faster CPython project.

  20. agronholm/anyio ⭐ 1,618
    High level asynchronous concurrency and networking framework that works on top of either trio or asyncio

  21. dask/distributed ⭐ 1,543
    A distributed task scheduler for Dask
    🔗 distributed.dask.org

  22. tiangolo/asyncer ⭐ 1,443
    Asyncer, async and await, focused on developer experience.
    🔗 asyncer.tiangolo.com

  23. intel/intel-extension-for-pytorch ⭐ 1,351
    A Python package that extends the official PyTorch to easily improve performance on Intel platforms

  24. nschloe/perfplot ⭐ 1,301
    📈 Performance analysis for Python snippets

  25. intel/scikit-learn-intelex ⭐ 1,161
    Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application
    🔗 intel.github.io/scikit-learn-intelex

  26. markshannon/faster-cpython ⭐ 937
    How to make CPython faster.

  27. zerointensity/pointers.py ⭐ 884
    Bringing the hell of pointers to Python.
    🔗 pointers.zintensity.dev

  28. brandtbucher/specialist ⭐ 610
    Visualize CPython's specializing, adaptive interpreter. 🔥

Profiling

Memory and CPU/GPU profiling tools and libraries.

  1. bloomberg/memray ⭐ 12,568
    Memray is a memory profiler for Python
    🔗 bloomberg.github.io/memray

  2. benfred/py-spy ⭐ 11,869
    Sampling profiler for Python programs

  3. plasma-umass/scalene ⭐ 11,181
    Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals

  4. joerick/pyinstrument ⭐ 6,135
    🚴 Call stack profiler for Python. Shows you why your code is slow!
    🔗 pyinstrument.readthedocs.io

  5. gaogaotiantian/viztracer ⭐ 4,378
    VizTracer is a low-overhead logging/debugging/profiling tool that can trace and visualize your python code execution.
    🔗 viztracer.readthedocs.io

  6. pythonprofilers/memory_profiler ⭐ 4,219
    Monitor Memory usage of Python code
    🔗 pypi.python.org/pypi/memory_profiler

  7. reloadware/reloadium ⭐ 2,698
    Hot Reloading and Profiling for Python
    🔗 reloadium.io

  8. pyutils/line_profiler ⭐ 2,483
    Line-by-line profiling for Python

  9. jiffyclub/snakeviz ⭐ 2,235
    An in-browser Python profile viewer
    🔗 jiffyclub.github.io/snakeviz

  10. p403n1x87/austin ⭐ 1,362
    Python frame stack sampler for CPython
    🔗 pypi.org/project/austin-dist

  11. pythonspeed/filprofiler ⭐ 812
    A Python memory profiler for data processing and scientific computing applications
    🔗 pythonspeed.com/products/filmemoryprofiler

Security

Security related libraries: vulnerability discovery, SQL injection, environment auditing.

  1. swisskyrepo/PayloadsAllTheThings ⭐ 56,934
    A list of useful payloads and bypasses for Web Application Security and Pentest/CTF
    🔗 swisskyrepo.github.io/payloadsallthethings

  2. certbot/certbot ⭐ 30,864
    Certbot is EFF's tool to obtain certs from Let's Encrypt and (optionally) auto-enable HTTPS on your server. It can also act as a client for any other CA that uses the ACME protocol.

  3. sqlmapproject/sqlmap ⭐ 30,626
    Automatic SQL injection and database takeover tool
    🔗 sqlmap.org

  4. aquasecurity/trivy ⭐ 21,420
    Find vulnerabilities, misconfigurations, secrets, SBOM in containers, Kubernetes, code repositories, clouds and more
    🔗 aquasecurity.github.io/trivy

  5. bridgecrewio/checkov ⭐ 6,551
    Checkov is a static code analysis tool for infrastructure as code (IaC) and also a software composition analysis (SCA) tool for images and open source packages.
    🔗 www.checkov.io

  6. nccgroup/ScoutSuite ⭐ 6,188
    Multi-Cloud Security Auditing Tool

  7. pycqa/bandit ⭐ 6,008
    Bandit is a tool designed to find common security issues in Python code.
    🔗 bandit.readthedocs.io

  8. stamparm/maltrail ⭐ 5,762
    Malicious traffic detection system

  9. rhinosecuritylabs/pacu ⭐ 4,035
    The AWS exploitation framework, designed for testing the security of Amazon Web Services environments.
    🔗 rhinosecuritylabs.com/aws/pacu-open-source-aws-exploitation-framework

  10. dashingsoft/pyarmor ⭐ 2,917
    A tool used to obfuscate Python scripts, bind obfuscated scripts to a fixed machine, or expire obfuscated scripts.
    🔗 pyarmor.dashingsoft.com

  11. luijait/DarkGPT ⭐ 1,725
    DarkGPT is an OSINT assistant based on GPT-4-200K (recommended use) designed to perform queries on leaked databases, thus providing an artificial intelligence assistant that can be useful in your traditional OSINT processes.

  12. pyupio/safety ⭐ 1,631
    Safety checks Python dependencies for known security vulnerabilities and suggests the proper remediations for vulnerabilities detected.
    🔗 safetycli.com/product/safety-cli

  13. trailofbits/pip-audit ⭐ 919
    Audits Python environments, requirements files and dependency trees for known security vulnerabilities, and can automatically fix them
    🔗 pypi.org/project/pip-audit

  14. fadi002/de4py ⭐ 766
    Toolkit for Python reverse engineering
    🔗 de4py.000.pe

Simulation

Simulation libraries: robotics, economic, agent-based, traffic, physics, astronomy, chemistry, quantum simulation. Also see the Maths and Science category for crossover.

  1. atsushisakai/PythonRobotics ⭐ 21,769
    Python sample code for robotics algorithms.
    🔗 atsushisakai.github.io/pythonrobotics

  2. bulletphysics/bullet3 ⭐ 11,935
    Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.
    🔗 bulletphysics.org

  3. isl-org/Open3D ⭐ 10,515
    Open3D: A Modern Library for 3D Data Processing
    🔗 www.open3d.org

  4. qiskit/qiskit ⭐ 4,633
    Qiskit is an open-source SDK for working with quantum computers at the level of extended quantum circuits, operators, and primitives.
    🔗 www.ibm.com/quantum/qiskit

  5. astropy/astropy ⭐ 4,220
    Astronomy and astrophysics core library
    🔗 www.astropy.org

  6. quantumlib/Cirq ⭐ 4,143
    A Python framework for creating, editing, and invoking Noisy Intermediate Scale Quantum (NISQ) circuits.

  7. openai/mujoco-py ⭐ 2,739
    MuJoCo is a physics engine for detailed, efficient rigid body simulations with contacts. mujoco-py allows using MuJoCo from Python 3.

  8. rdkit/rdkit ⭐ 2,426
    The official sources for the RDKit library

  9. taichi-dev/difftaichi ⭐ 2,398
    10 differentiable physical simulators built with Taichi differentiable programming (DiffTaichi, ICLR 2020)

  10. projectmesa/mesa ⭐ 2,223
    Mesa is an open-source Python library for agent-based modeling, ideal for simulating complex systems and exploring emergent behaviors.
    🔗 mesa.readthedocs.io

  11. google/brax ⭐ 2,071
    Massively parallel rigid body physics simulation on accelerator hardware.

  12. quantecon/QuantEcon.py ⭐ 1,861
    A community based Python library for quantitative economics
    🔗 quantecon.org/quantecon-py

  13. facebookresearch/habitat-lab ⭐ 1,720
    A modular high-level library to train embodied AI agents across a variety of tasks and environments.
    🔗 aihabitat.org

  14. microsoft/PromptCraft-Robotics ⭐ 1,715
    Community for applying LLMs to robotics and a robot simulator with ChatGPT integration
    🔗 aka.ms/chatgpt-robotics

  15. nvidia/warp ⭐ 1,692
    A Python framework for high performance GPU simulation and graphics
    🔗 nvidia.github.io/warp

  16. nvidia-omniverse/IsaacGymEnvs ⭐ 1,630
    Example RL environments for the NVIDIA Isaac Gym high performance environments

  17. deepmodeling/deepmd-kit ⭐ 1,366
    A deep learning package for many-body potential energy representation and molecular dynamics
    🔗 docs.deepmodeling.com/projects/deepmd

  18. sail-sg/envpool ⭐ 1,019
    C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
    🔗 envpool.readthedocs.io

  19. a-r-j/graphein ⭐ 980
    Protein Graph Library
    🔗 graphein.ai

  20. hardmaru/estool ⭐ 919
    Evolution Strategies Tool

  21. viblo/pymunk ⭐ 880
    Pymunk is an easy-to-use Pythonic 2D physics library that can be used whenever you need 2D rigid body physics from Python
    🔗 www.pymunk.org

  22. bowang-lab/scGPT ⭐ 837
    scGPT: Towards Building a Foundation Model for Single-Cell Multi-omics Using Generative AI
    🔗 scgpt.readthedocs.io/en/latest

  23. facebookresearch/fairo ⭐ 829
    A modular embodied agent architecture and platform for building embodied agents

  24. nvidia-omniverse/orbit ⭐ 811
    Unified framework for robot learning built on NVIDIA Isaac Sim
    🔗 isaac-orbit.github.io/orbit

  25. google-deepmind/materials_discovery ⭐ 796
    Graph Networks for Materials Science (GNoME) is a project centered around scaling machine learning methods to tackle materials science.

  26. google/evojax ⭐ 787
    EvoJAX is a scalable, general purpose, hardware-accelerated neuroevolution toolkit built on the JAX library

  27. nvidia-omniverse/OmniIsaacGymEnvs ⭐ 690
    Reinforcement Learning Environments for Omniverse Isaac Gym

Study

Miscellaneous study resources: algorithms, general resources, system design, code repos for textbooks, best practices, tutorials.

  1. thealgorithms/Python ⭐ 179,824
    All Algorithms implemented in Python
    🔗 the-algorithms.com

  2. microsoft/generative-ai-for-beginners ⭐ 43,168
    18 Lessons, Get Started Building with Generative AI
    🔗 microsoft.github.io/generative-ai-for-beginners

  3. jakevdp/PythonDataScienceHandbook ⭐ 41,542
    Python Data Science Handbook: full text in Jupyter Notebooks
    🔗 jakevdp.github.io/pythondatasciencehandbook

  4. mlabonne/llm-course ⭐ 29,186
    Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
    🔗 mlabonne.github.io/blog

  5. realpython/python-guide ⭐ 27,711
    Python best practices guidebook, written for humans.
    🔗 docs.python-guide.org

  6. christoschristofidis/awesome-deep-learning ⭐ 22,874
    A curated list of awesome Deep Learning tutorials, projects and communities.

  7. d2l-ai/d2l-en ⭐ 21,744
    Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
    🔗 d2l.ai

  8. wesm/pydata-book ⭐ 21,342
    Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media

  9. microsoft/recommenders ⭐ 18,002
    Best Practices on Recommendation Systems
    🔗 recommenders-team.github.io/recommenders/intro.html

  10. fchollet/deep-learning-with-python-notebooks ⭐ 17,786
    Jupyter notebooks for the code samples of the book "Deep Learning with Python"

  11. hannibal046/Awesome-LLM ⭐ 14,462
    Awesome-LLM: a curated list of Large Language Model resources

  12. graykode/nlp-tutorial ⭐ 13,712
    Natural Language Processing Tutorial for Deep Learning Researchers
    🔗 www.reddit.com/r/machinelearning/comments/amfinl/project_nlptutoral_repository_who_is_studying

  13. shangtongzhang/reinforcement-learning-an-introduction ⭐ 13,195
    Python Implementation of Reinforcement Learning: An Introduction

  14. karpathy/nn-zero-to-hero ⭐ 10,423
    Neural Networks: Zero to Hero

  15. eugeneyan/open-llms ⭐ 10,202
    📋 A list of open LLMs available for commercial use.

  16. openai/spinningup ⭐ 9,660
    An educational resource to help anyone learn deep reinforcement learning.
    🔗 spinningup.openai.com

  17. mooler0410/LLMsPracticalGuide ⭐ 8,561
    A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers)
    🔗 arxiv.org/abs/2304.13712v2

  18. karpathy/micrograd ⭐ 8,330
    A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API

  19. mrdbourke/pytorch-deep-learning ⭐ 8,066
    Materials for the Learn PyTorch for Deep Learning: Zero to Mastery course.
    🔗 learnpytorch.io

  20. nielsrogge/Transformers-Tutorials ⭐ 7,616
    This repository contains demos I made with the Transformers library by HuggingFace.

  21. zhanymkanov/fastapi-best-practices ⭐ 7,038
    FastAPI Best Practices and Conventions we used at our startup

  22. firmai/industry-machine-learning ⭐ 7,015
    A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai)
    🔗 www.linkedin.com/company/firmai

  23. gkamradt/langchain-tutorials ⭐ 6,270
    Overview and tutorial of the LangChain Library

  24. udacity/deep-learning-v2-pytorch ⭐ 5,179
    Projects and exercises for the latest Deep Learning ND program https://www.udacity.com/course/deep-learning-nanodegree--nd101

  25. neetcode-gh/leetcode ⭐ 5,092
    Leetcode solutions for NeetCode.io

  26. srush/GPU-Puzzles ⭐ 5,052
    Teaching beginner GPU programming in a completely interactive fashion

  27. mrdbourke/tensorflow-deep-learning ⭐ 4,872
    All course materials for the Zero to Mastery Deep Learning with TensorFlow course.
    🔗 dbourke.link/ztmtfcourse

  28. udlbook/udlbook ⭐ 4,794
    Understanding Deep Learning - Simon J.D. Prince

  29. timofurrer/awesome-asyncio ⭐ 4,405
    A curated list of awesome Python asyncio frameworks, libraries, software and resources

  30. zotroneneis/machine_learning_basics ⭐ 4,207
    Plain Python implementations of basic machine learning algorithms

  31. roboflow/notebooks ⭐ 4,175
    Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models like Grounding DINO and SAM.
    🔗 roboflow.com/models

  32. huggingface/deep-rl-class ⭐ 3,618
    This repo contains the syllabus of the Hugging Face Deep Reinforcement Learning Course.

  33. alirezadir/Machine-Learning-Interviews ⭐ 3,341
    This repo is meant to serve as a guide for Machine Learning/AI technical interviews.

  34. cosmicpython/book ⭐ 3,264
    A Book about Pythonic Application Architecture Patterns for Managing Complexity. Cosmos is the Opposite of Chaos you see. O'R. wouldn't actually let us call it "Cosmic Python" tho.
    🔗 www.cosmicpython.com

  35. huggingface/diffusion-models-class ⭐ 3,228
    Materials for the Hugging Face Diffusion Models Course

  36. promptslab/Awesome-Prompt-Engineering ⭐ 3,221
    This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM etc.
    🔗 discord.gg/m88xfymbk6

  37. fluentpython/example-code-2e ⭐ 2,933
    Example code for Fluent Python, 2nd edition (O'Reilly 2022)
    🔗 amzn.to/3j48u2j

  38. rasbt/machine-learning-book ⭐ 2,874
    Code Repository for Machine Learning with PyTorch and Scikit-Learn
    🔗 sebastianraschka.com/books/#machine-learning-with-pytorch-and-scikit-learn

  39. amanchadha/coursera-deep-learning-specialization ⭐ 2,703
    Notes, programming assignments and quizzes from all courses within the Coursera Deep Learning specialization offered by deeplearning.ai: (i) Neural Networks and Deep Learning; (ii) Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization; (iii) Structuring Machine Learning Projects; (iv...

  40. mrdbourke/zero-to-mastery-ml ⭐ 2,592
    All course materials for the Zero to Mastery Machine Learning and Data Science course.
    🔗 dbourke.link/ztmmlcourse

  41. krzjoa/awesome-python-data-science ⭐ 2,335
    Probably the best curated list of data science software in Python.
    🔗 krzjoa.github.io/awesome-python-data-science

  42. cgpotts/cs224u ⭐ 2,060
    Code for CS224u: Natural Language Understanding

  43. cerlymarco/MEDIUM_NoteBook ⭐ 2,024
    Repository containing notebooks of my posts on Medium

  44. trananhkma/fucking-awesome-python ⭐ 1,968
    awesome-python with :octocat: ⭐ and 🍴

  45. gerdm/prml ⭐ 1,891
    Repository of notes, code and notebooks in Python for the book Pattern Recognition and Machine Learning by Christopher Bishop

  46. atcold/NYU-DLSP21 ⭐ 1,493
    NYU Deep Learning Spring 2021
    🔗 atcold.github.io/nyu-dlsp21

  47. chandlerbang/awesome-self-supervised-gnn ⭐ 1,475
    Papers about pretraining and self-supervised learning on Graph Neural Networks (GNN).

  48. huggingface/cookbook ⭐ 1,345
    Community-driven practical examples of building AI applications and solving various tasks with AI using open-source tools and models.
    🔗 huggingface.co/learn/cookbook

  49. patrickloeber/MLfromscratch ⭐ 1,166
    Machine Learning algorithm implementations from scratch.

  50. davidadsp/Generative_Deep_Learning_2nd_Edition ⭐ 868
    The official code repository for the second edition of the O'Reilly book Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play.
    🔗 www.oreilly.com/library/view/generative-deep-learning/9781098134174

  51. jackhidary/quantumcomputingbook ⭐ 763
    Companion site for the textbook Quantum Computing: An Applied Approach

  52. dylanhogg/awesome-python ⭐ 251
    🐍 Hand-picked awesome Python libraries and frameworks, organised by category
    🔗 www.awesomepython.org

Template

Template tools and libraries: cookiecutter repos, generators, quick-starts.

  1. tiangolo/full-stack-fastapi-template ⭐ 23,117
    Full stack, modern web application template. Using FastAPI, React, SQLModel, PostgreSQL, Docker, GitHub Actions, automatic HTTPS and more.

  2. cookiecutter/cookiecutter ⭐ 21,630
    A cross-platform command-line utility that creates projects from cookiecutters (project templates), e.g. Python package projects, C projects.
    🔗 pypi.org/project/cookiecutter

  3. drivendata/cookiecutter-data-science ⭐ 7,614
    A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
    🔗 drivendata.github.io/cookiecutter-data-science

  4. buuntu/fastapi-react ⭐ 2,073
    🚀 Cookiecutter Template for FastAPI + React Projects. Using PostgreSQL, SQLAlchemy, and Docker

  5. pyscaffold/pyscaffold ⭐ 2,013
    🛠 Python project template generator with batteries included
    🔗 pyscaffold.org

  6. cjolowicz/cookiecutter-hypermodern-python ⭐ 1,721
    Cookiecutter template for a Python package based on the Hypermodern Python article series.
    🔗 cookiecutter-hypermodern-python.readthedocs.io

  7. tezromach/python-package-template ⭐ 1,072
    🚀 Your next Python package needs a bleeding-edge project structure.

  8. martinheinz/python-project-blueprint ⭐ 942
    Blueprint/Boilerplate For Python Projects

Terminal

Terminal and console tools and libraries: CLI tools, terminal-based formatters, progress bars.

  1. willmcgugan/rich ⭐ 47,187
    Rich is a Python library for rich text and beautiful formatting in the terminal.
    🔗 rich.readthedocs.io/en/latest

  2. tqdm/tqdm ⭐ 27,484
    ⚡ A Fast, Extensible Progress Bar for Python and CLI
    🔗 tqdm.github.io

  3. google/python-fire ⭐ 26,340
    Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.

  4. willmcgugan/textual ⭐ 23,554
    The lean application framework for Python. Build sophisticated user interfaces with a simple Python API. Run your apps in the terminal and a web browser.
    🔗 textual.textualize.io

  5. pallets/click ⭐ 15,051
    Python composable command line interface toolkit
    🔗 click.palletsprojects.com

  6. tiangolo/typer ⭐ 14,400
    Typer, build great CLIs. Easy to code. Based on Python type hints.
    🔗 typer.tiangolo.com

  7. saulpw/visidata ⭐ 7,429
    A terminal spreadsheet multitool for discovering and arranging data
    🔗 visidata.org

  8. manrajgrover/halo ⭐ 2,853
    💫 Beautiful spinners for terminal, IPython and Jupyter

  9. urwid/urwid ⭐ 2,729
    Console user interface library for Python (official repo)
    🔗 urwid.org

  10. tconbeer/harlequin ⭐ 2,498
    The SQL IDE for Your Terminal.
    🔗 harlequin.sh

  11. textualize/trogon ⭐ 2,350
    Easily turn your Click CLI into a powerful terminal application

  12. tmbo/questionary ⭐ 1,420
    Python library to build pretty command line user prompts ✨ Easy to use multi-select lists, confirmations, free text prompts ...

  13. jazzband/prettytable ⭐ 1,249
    Display tabular data in a visually appealing ASCII table format
    🔗 pypi.org/project/prettytable

  14. 1j01/textual-paint ⭐ 914
    🎨 MS Paint in your terminal.
    🔗 pypi.org/project/textual-paint

Testing

Testing libraries: unit testing, load testing, acceptance testing, code coverage, browser automation, plugins.

  1. locustio/locust ⭐ 23,703
    Write scalable load tests in plain Python 🚗💨

  2. pytest-dev/pytest ⭐ 11,388
    The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
    🔗 pytest.org

  3. microsoft/playwright-python ⭐ 10,740
    Python version of the Playwright testing and automation library.
    🔗 playwright.dev/python

  4. robotframework/robotframework ⭐ 9,108
    Generic automation framework for acceptance testing and RPA
    🔗 robotframework.org

  5. getmoto/moto ⭐ 7,394
    A library that allows you to easily mock out tests based on AWS infrastructure.
    🔗 docs.getmoto.org/en/latest

  6. hypothesisworks/hypothesis ⭐ 7,290
    Hypothesis is a powerful, flexible, and easy to use library for property-based testing.
    🔗 hypothesis.works

  7. newsapps/beeswithmachineguns ⭐ 6,393
    A utility for arming (creating) many bees (micro EC2 instances) to attack (load test) targets (web applications).
    🔗 apps.chicagotribune.com

  8. seleniumbase/SeleniumBase ⭐ 4,272
    📊 Python's all-in-one framework for web crawling, scraping, testing, and reporting. Supports pytest. UC Mode provides stealth. Includes many tools.
    🔗 seleniumbase.io

  9. getsentry/responses ⭐ 4,045
    A utility for mocking out the Python Requests library.

  10. spulec/freezegun ⭐ 3,973
    Let your Python tests travel through time

  11. tox-dev/tox ⭐ 3,528
    Command line driven CI frontend and development task automation tool.
    🔗 tox.wiki

  12. behave/behave ⭐ 3,066
    BDD, Python style.
    🔗 behave.readthedocs.io/en/latest

  13. nedbat/coveragepy ⭐ 2,838
    The code coverage tool for Python
    🔗 coverage.readthedocs.io

  14. cobrateam/splinter ⭐ 2,686
    splinter - Python test framework for web applications
    🔗 splinter.readthedocs.org/en/stable/index.html

  15. kevin1024/vcrpy ⭐ 2,614
    Automatically mock your HTTP interactions to simplify and speed up testing

  16. pytest-dev/pytest-testinfra ⭐ 2,323
    With Testinfra you can write unit tests in Python to test the actual state of your servers configured by management tools like Salt, Ansible, Puppet, Chef and so on.
    🔗 testinfra.readthedocs.io

  17. confident-ai/deepeval ⭐ 1,804
    The LLM Evaluation Framework
    🔗 docs.confident-ai.com

  18. pytest-dev/pytest-mock ⭐ 1,762
    Thin wrapper around the mock package for easier use with pytest
    🔗 pytest-mock.readthedocs.io/en/latest

  19. pytest-dev/pytest-cov ⭐ 1,664
    Coverage plugin for pytest.

  20. pytest-dev/pytest-xdist ⭐ 1,360
    pytest plugin for distributed testing and loop-on-failures testing modes.
    🔗 pytest-xdist.readthedocs.io

  21. pytest-dev/pytest-asyncio ⭐ 1,327
    Asyncio support for pytest
    🔗 pytest-asyncio.readthedocs.io

  22. taverntesting/tavern ⭐ 992
    A command-line tool and Python library and Pytest plugin for automated testing of RESTful APIs, with a simple, concise and flexible YAML-based syntax
    🔗 taverntesting.github.io

Machine Learning - Time Series

Machine learning and classical time series libraries: forecasting, seasonality, anomaly detection, econometrics.

  1. facebook/prophet ⭐ 17,773
    Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
    🔗 facebook.github.io/prophet

  2. blue-yonder/tsfresh ⭐ 8,090
    Automatic extraction of relevant features from time series
    🔗 tsfresh.readthedocs.io

  3. sktime/sktime ⭐ 7,411
    A unified framework for machine learning with time series
    🔗 www.sktime.net

  4. unit8co/darts ⭐ 7,294
    A Python library for user-friendly forecasting and anomaly detection on time series.
    🔗 unit8co.github.io/darts

  5. facebookresearch/Kats ⭐ 4,763
    Kats is a kit to analyze time series data: a lightweight, easy-to-use, generalizable, and extendable framework for time series analysis, from understanding key statistics and characteristics, to detecting change points and anomalies, to forecasting future trends.

  6. awslabs/gluonts ⭐ 4,306
    Probabilistic time series modeling in Python
    🔗 ts.gluon.ai

  7. nixtla/statsforecast ⭐ 3,569
    Lightning ⚡️ fast forecasting with statistical and econometric models.
    🔗 nixtlaverse.nixtla.io/statsforecast

  8. salesforce/Merlion ⭐ 3,267
    Merlion: A Machine Learning Framework for Time Series Intelligence

  9. tdameritrade/stumpy ⭐ 2,995
    STUMPY is a powerful and scalable Python library for modern time series analysis
    🔗 stumpy.readthedocs.io/en/latest

  10. rjt1990/pyflux ⭐ 2,089
    Open source time series library for Python

  11. aistream-peelout/flow-forecast ⭐ 1,903
    Deep learning PyTorch library for time series forecasting, classification, and anomaly detection (originally for flood forecasting).
    🔗 flow-forecast.atlassian.net/wiki/spaces/ff/overview

  12. uber/orbit ⭐ 1,804
    A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.
    🔗 orbit-ml.readthedocs.io/en/stable

  13. amazon-science/chronos-forecasting ⭐ 1,665
    Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting

  14. alkaline-ml/pmdarima ⭐ 1,519
    A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
    🔗 www.alkaline-ml.com/pmdarima

  15. winedarksea/AutoTS ⭐ 1,012
    Automated Time Series Forecasting

  16. time-series-foundation-models/lag-llama ⭐ 969
    Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

  17. autoviml/Auto_TS ⭐ 674
    Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost Models on Time Series data sets with a Single Line of Code. Created by Ram Seshadri. Collaborators welcome.

  18. google/temporian ⭐ 625
    Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖
    🔗 temporian.readthedocs.io

Typing

Typing libraries: static and run-time type checking, annotations.

  1. python/mypy ⭐ 17,576
    Optional static typing for Python
    🔗 www.mypy-lang.org

  2. microsoft/pyright ⭐ 12,098
    Static Type Checker for Python

  3. facebook/pyre-check ⭐ 6,695
    Performant type-checking for Python.
    🔗 pyre-check.org

  4. python-attrs/attrs ⭐ 5,081
    Python Classes Without Boilerplate
    🔗 www.attrs.org

  5. google/pytype ⭐ 4,604
    A static type analyzer for Python code
    🔗 google.github.io/pytype

  6. instagram/MonkeyType ⭐ 4,540
    A Python library that generates static type annotations by collecting runtime types

  7. python/typeshed ⭐ 4,079
    Collection of library stubs for Python, with static types

  8. mtshiba/pylyzer ⭐ 1,993
    A fast static code analyzer & language server for Python
    🔗 mtshiba.github.io/pylyzer

  9. microsoft/pylance-release ⭐ 1,653
    Fast, feature-rich language support for Python. Documentation and issues for Pylance.

  10. agronholm/typeguard ⭐ 1,446
    Run-time type checker for Python

  11. patrick-kidger/torchtyping ⭐ 1,337
    Type annotations and dynamic checking for a tensor's shape, dtype, names, etc.

  12. robertcraigie/pyright-python ⭐ 140
    Python command line wrapper for pyright, a static type checker
    🔗 pypi.org/project/pyright

Utility

General utility libraries: miscellaneous tools, linters, code formatters, version management, package tools, documentation tools.

  1. yt-dlp/yt-dlp ⭐ 71,108
    A feature-rich command-line audio/video downloader
    🔗 discord.gg/h5mncfw63r

  2. home-assistant/core ⭐ 68,743
    🏡 Open source home automation that puts local control and privacy first.
    🔗 www.home-assistant.io

  3. python/cpython ⭐ 59,709
    The Python programming language
    🔗 www.python.org

  4. localstack/localstack ⭐ 52,211
    💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline
    🔗 localstack.cloud

  5. faif/python-patterns ⭐ 39,445
    A collection of design patterns/idioms in Python

  6. mingrammer/diagrams ⭐ 34,949
    🎨 Diagram as Code for prototyping cloud system architectures
    🔗 diagrams.mingrammer.com

  7. ggerganov/whisper.cpp ⭐ 31,393
    Port of OpenAI's Whisper model in C/C++

  8. keon/algorithms ⭐ 23,582
    Minimal examples of data structures and algorithms in Python

  9. norvig/pytudes ⭐ 22,405
    Python programs, usually short, of considerable difficulty, to perfect particular skills.

  10. modularml/mojo ⭐ 21,335
    The Mojo Programming Language
    🔗 docs.modular.com/mojo

  11. openai/openai-python ⭐ 19,966
    The official Python library for the OpenAI API
    🔗 pypi.org/project/openai

  12. facebookresearch/audiocraft ⭐ 19,684
    Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

  13. pydantic/pydantic ⭐ 18,768
    Data validation using Python type hints
    🔗 docs.pydantic.dev

  14. micropython/micropython ⭐ 18,389
    MicroPython - a lean and efficient Python implementation for microcontrollers and constrained systems
    🔗 micropython.org

  15. squidfunk/mkdocs-material ⭐ 18,337
    Documentation that simply works
    🔗 squidfunk.github.io/mkdocs-material

  16. mkdocs/mkdocs ⭐ 18,313
    Project documentation with Markdown.
    🔗 www.mkdocs.org

  17. delgan/loguru ⭐ 18,135
    Python logging made (stupidly) simple

  18. rustpython/RustPython ⭐ 17,641
    A Python Interpreter written in Rust
    🔗 rustpython.github.io

  19. kivy/kivy ⭐ 16,969
    Open source UI framework written in Python, running on Windows, Linux, macOS, Android and iOS
    🔗 kivy.org

  20. ipython/ipython ⭐ 16,139
    Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
    🔗 ipython.readthedocs.org

  21. alievk/avatarify-python ⭐ 16,101
    Avatars for Zoom, Skype and other video-conferencing apps.

  22. blakeblackshear/frigate ⭐ 14,841
    NVR with realtime local object detection for IP cameras
    🔗 frigate.video

  23. zulko/moviepy ⭐ 11,809
    Video editing with Python
    🔗 zulko.github.io/moviepy

  24. python-pillow/Pillow ⭐ 11,720
    The Python Imaging Library adds image processing capabilities to Python (Pillow is the friendly PIL fork)
    🔗 python-pillow.org

  25. dbader/schedule ⭐ 11,497
    Python job scheduling for humans.
    🔗 schedule.readthedocs.io

  26. pyodide/pyodide ⭐ 11,418
    Pyodide is a Python distribution for the browser and Node.js based on WebAssembly
    🔗 pyodide.org/en/stable

  27. openai/triton ⭐ 11,062
    Development repository for the Triton language and compiler
    🔗 triton-lang.org

  28. pyo3/pyo3 ⭐ 11,056
    Rust bindings for the Python interpreter
    🔗 pyo3.rs

  29. nuitka/Nuitka ⭐ 10,884
    Nuitka is a Python compiler written in Python. It's fully compatible with Python 2.6, 2.7, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 3.10, and 3.11. You feed it your Python app, it does a lot of clever things, and spits out an executable or extension module.
    🔗 nuitka.net

  30. ninja-build/ninja ⭐ 10,534
    Ninja is a small build system with a focus on speed.
    🔗 ninja-build.org

  31. caronc/apprise ⭐ 10,517
    Apprise - Push Notifications that work with just about every platform!
    🔗 hub.docker.com/r/caronc/apprise

  32. pytube/pytube ⭐ 10,305
    A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.
    🔗 pytube.io

  33. secdev/scapy ⭐ 10,074
    Scapy: the Python-based interactive packet manipulation program & library.
    🔗 scapy.net

  34. magicstack/uvloop ⭐ 10,024
    Ultra fast asyncio event loop.

  35. pallets/jinja ⭐ 9,954
    A very fast and expressive template engine.
    🔗 jinja.palletsprojects.com

  36. paul-gauthier/aider ⭐ 9,761
    Aider is a command line tool that lets you pair program with LLMs, to edit code stored in your local git repository
    🔗 aider.chat

  37. asweigart/pyautogui ⭐ 9,625
    A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.

  38. aws/serverless-application-model ⭐ 9,240
    The AWS Serverless Application Model (AWS SAM) transform is an AWS CloudFormation macro that transforms SAM templates into CloudFormation templates.
    🔗 aws.amazon.com/serverless/sam

  39. cython/cython ⭐ 8,938
    The most widely used Python to C compiler
    🔗 cython.org

  40. paramiko/paramiko ⭐ 8,835
    The leading native Python SSHv2 protocol library.
    🔗 paramiko.org

  41. boto/boto3 ⭐ 8,704
    AWS SDK for Python
    🔗 aws.amazon.com/sdk-for-python

  42. arrow-py/arrow ⭐ 8,559
    🏹 Better dates & times for Python
    🔗 arrow.readthedocs.io

  43. facebookresearch/hydra ⭐ 8,229
    Hydra is a framework for elegantly configuring complex applications
    🔗 hydra.cc

  44. xonsh/xonsh ⭐ 8,021
    🐚 Python-powered, cross-platform, Unix-gazing shell.
    🔗 xon.sh

  45. eternnoir/pyTelegramBotAPI ⭐ 7,710
    Python Telegram Bot API.

  46. kellyjonbrazil/jc ⭐ 7,576
    CLI tool and Python library that converts the output of popular command-line tools, file-types, and common strings to JSON, YAML, or Dictionaries. This allows piping of output to tools like jq and simplifying automation scripts.

  47. py-pdf/pypdf ⭐ 7,428
    A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
    🔗 pypdf.readthedocs.io/en/latest

  48. googleapis/google-api-python-client ⭐ 7,397
    🐍 The official Python client library for Google's discovery based APIs.
    🔗 googleapis.github.io/google-api-python-client/docs

  49. theskumar/python-dotenv ⭐ 7,129
    Reads key-value pairs from a .env file and can set them as environment variables. It helps in developing applications following the 12-factor principles.
    🔗 saurabh-kumar.com/python-dotenv

  50. google/latexify_py ⭐ 7,055
    A library to generate LaTeX expressions from Python code.

  51. googlecloudplatform/python-docs-samples ⭐ 6,990
    Code samples used on cloud.google.com

  52. marshmallow-code/marshmallow ⭐ 6,904
    A lightweight library for converting complex objects to and from simple Python datatypes.
    🔗 marshmallow.readthedocs.io

  53. hugapi/hug ⭐ 6,821
    Embrace the APIs of the future. Hug aims to make developing APIs as simple as possible, but no simpler.

  54. jasonppy/VoiceCraft ⭐ 6,729
    Zero-Shot Speech Editing and Text-to-Speech in the Wild

  55. pygithub/PyGithub ⭐ 6,692
    Typed interactions with the GitHub API v3
    🔗 pygithub.readthedocs.io

  56. openai/point-e ⭐ 6,307
    Point cloud diffusion for 3D model synthesis

  57. pyca/cryptography ⭐ 6,300
    cryptography is a package designed to expose cryptographic primitives and recipes to Python developers.
    🔗 cryptography.io

  58. gorakhargosh/watchdog ⭐ 6,276
    Python library and shell utilities to monitor filesystem events.
    🔗 packages.python.org/watchdog

  59. sdispater/pendulum ⭐ 6,067
    Python datetimes made easy
    🔗 pendulum.eustace.io

  60. sphinx-doc/sphinx ⭐ 6,046
    The Sphinx documentation generator
    🔗 www.sphinx-doc.org

  61. jd/tenacity ⭐ 6,009
    Retrying library for Python
    🔗 tenacity.readthedocs.io

  62. icloud-photos-downloader/icloud_photos_downloader ⭐ 5,950
    A command-line tool to download photos from iCloud

  63. scikit-image/scikit-image ⭐ 5,879
    Image processing in Python
    🔗 scikit-image.org

  64. bndr/pipreqs ⭐ 5,844
    pipreqs - Generate pip requirements.txt file based on imports of any project. Looking for maintainers to move this project forward.

  65. wireservice/csvkit ⭐ 5,821
    A suite of utilities for converting to and working with CSV, the king of tabular file formats.
    🔗 csvkit.readthedocs.io

  66. agronholm/apscheduler ⭐ 5,729
    Task scheduling library for Python

  67. ijl/orjson ⭐ 5,586
    Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy

  68. pdfminer/pdfminer.six ⭐ 5,465
    Community maintained fork of pdfminer - we fathom PDF
    🔗 pdfminersix.readthedocs.io

  69. timdettmers/bitsandbytes ⭐ 5,452
    Accessible large language models via k-bit quantization for PyTorch.
    🔗 huggingface.co/docs/bitsandbytes/main/en/index

  70. pytransitions/transitions ⭐ 5,376
    A lightweight, object-oriented finite state machine implementation in Python with many extensions

  71. buildbot/buildbot ⭐ 5,166
    Python-based continuous integration testing framework; your pull requests are more than welcome!
    🔗 www.buildbot.net

  72. rsalmei/alive-progress ⭐ 5,128
    A new kind of Progress Bar, with real-time throughput, ETA, and very cool animations!

  73. prompt-toolkit/ptpython ⭐ 5,047
    A better Python REPL

  74. pycqa/pycodestyle ⭐ 4,984
    Simple Python style checker in one Python file
    🔗 pycodestyle.pycqa.org

  75. spotify/pedalboard ⭐ 4,854
    🎛 🔊 A Python library for audio.
    🔗 spotify.github.io/pedalboard

  76. jorgebastida/awslogs ⭐ 4,755
    AWS CloudWatch logs for Humans™

  77. pywinauto/pywinauto ⭐ 4,638
    Windows GUI Automation with Python (based on text properties)
    🔗 pywinauto.github.io

  78. tebelorg/RPA-Python ⭐ 4,555
    Python package for doing RPA

  79. hhatto/autopep8 ⭐ 4,523
    A tool that automatically formats Python code to conform to the PEP 8 style guide.
    🔗 pypi.org/project/autopep8

  80. pytoolz/toolz ⭐ 4,521
    A functional standard library for Python.
    🔗 toolz.readthedocs.org

  81. pyinvoke/invoke ⭐ 4,254
    Pythonic task management & command execution.
    🔗 pyinvoke.org

  82. bogdanp/dramatiq ⭐ 4,079
    A fast and reliable background task processing library for Python 3.
    🔗 dramatiq.io

  83. evhub/coconut ⭐ 3,952
    Coconut (coconut-lang.org) is a variant of Python that adds on top of Python syntax new features for simple, elegant, Pythonic functional programming.
    🔗 coconut-lang.org

  84. adafruit/circuitpython ⭐ 3,907
    CircuitPython - a Python implementation for teaching coding with microcontrollers
    🔗 circuitpython.org

  85. miguelgrinberg/python-socketio ⭐ 3,781
    Python Socket.IO server and client

  86. rspeer/python-ftfy ⭐ 3,715
    Fixes mojibake and other glitches in Unicode text, after the fact.
    🔗 ftfy.readthedocs.org

  87. ashleve/lightning-hydra-template ⭐ 3,683
    PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡

  88. joblib/joblib ⭐ 3,669
    Computing with Python functions.
    🔗 joblib.readthedocs.org

  89. ets-labs/python-dependency-injector ⭐ 3,603
    Dependency injection framework for Python
    🔗 python-dependency-injector.ets-labs.org

  90. python-markdown/markdown ⭐ 3,590
    A Python implementation of John Gruber’s Markdown with Extension support.
    🔗 python-markdown.github.io

  91. zeromq/pyzmq ⭐ 3,550
    PyZMQ: Python bindings for zeromq
    🔗 zguide.zeromq.org/py:all

  92. pypi/warehouse ⭐ 3,470
    The Python Package Index
    🔗 pypi.org

  93. tartley/colorama ⭐ 3,434
    Simple cross-platform colored terminal text in Python

  94. more-itertools/more-itertools ⭐ 3,429
    More routines for operating on iterables, beyond itertools
    🔗 more-itertools.rtfd.io

  95. pydata/xarray ⭐ 3,415
    N-D labeled arrays and datasets in Python
    🔗 xarray.dev

  96. osohq/oso ⭐ 3,409
    Oso is a batteries-included framework for building authorization in your application.
    🔗 docs.osohq.com

  97. jorisschellekens/borb ⭐ 3,291
    borb is a library for reading, creating and manipulating PDF files in python.
    🔗 borbpdf.com

  98. pyo3/maturin ⭐ 3,273
    Build and publish crates with pyo3, cffi and uniffi bindings as well as rust binaries as python packages
    🔗 maturin.rs

  99. suor/funcy ⭐ 3,273
    A collection of fancy and practical functional tools

  100. pyinfra-dev/pyinfra ⭐ 3,234
    pyinfra automates infrastructure using Python. It’s fast and scales from one server to thousands. Great for ad-hoc command execution, service deployment, configuration management and more.
    🔗 pyinfra.com

  101. pyserial/pyserial ⭐ 3,106
    Python serial port access library

  102. spotify/basic-pitch ⭐ 2,937
    A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
    🔗 basicpitch.io

  103. tox-dev/pipdeptree ⭐ 2,689
    A command line utility to display dependency tree of the installed Python packages
    🔗 pypi.python.org/pypi/pipdeptree

  104. legrandin/pycryptodome ⭐ 2,672
    A self-contained cryptographic library for Python
    🔗 www.pycryptodome.org

  105. camelot-dev/camelot ⭐ 2,661
    A Python library to extract tabular data from PDFs
    🔗 camelot-py.readthedocs.io

  106. liiight/notifiers ⭐ 2,601
    The easy way to send notifications
    🔗 notifiers.readthedocs.io

  107. lxml/lxml ⭐ 2,574
    The lxml XML toolkit for Python
    🔗 lxml.de

  108. whylabs/whylogs ⭐ 2,553
    An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈
    🔗 whylogs.readthedocs.io

  109. pexpect/pexpect ⭐ 2,535
    A Python module for controlling interactive programs in a pseudo-terminal
    🔗 pexpect.readthedocs.io

  110. pyston/pyston ⭐ 2,488
    A faster and highly-compatible implementation of the Python programming language.
    🔗 www.pyston.org

  111. litl/backoff ⭐ 2,481
    Python library providing function decorators for configurable backoff and retry

  112. scrapinghub/dateparser ⭐ 2,462
    Python parser for human-readable dates

  113. dosisod/refurb ⭐ 2,448
    A tool for refurbishing and modernizing Python codebases

  114. yaml/pyyaml ⭐ 2,431
    Canonical source repository for PyYAML

  115. cdgriffith/Box ⭐ 2,357
    Python dictionaries with advanced dot notation access
    🔗 github.com/cdgriffith/box/wiki

  116. pypa/setuptools ⭐ 2,324
    Official project repository for the Setuptools build system
    🔗 pypi.org/project/setuptools

  117. nschloe/tikzplotlib ⭐ 2,312
    📊 Save matplotlib figures as TikZ/PGFplots for smooth integration into LaTeX.

  118. hgrecco/pint ⭐ 2,268
    Operate and manipulate physical quantities in Python
    🔗 pint.readthedocs.org

  119. dateutil/dateutil ⭐ 2,253
    Useful extensions to the standard Python datetime features

  120. grantjenks/python-diskcache ⭐ 2,169
    Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python.
    🔗 www.grantjenks.com/docs/diskcache

  121. pndurette/gTTS ⭐ 2,148
    Python library and CLI tool to interface with Google Translate's text-to-speech API
    🔗 gtts.readthedocs.org

  122. ianmiell/shutit ⭐ 2,147
    Automation framework for programmers
    🔗 ianmiell.github.io/shutit

  123. kiminewt/pyshark ⭐ 2,129
    Python wrapper for tshark, allowing python packet parsing using wireshark dissectors

  124. pyparsing/pyparsing ⭐ 2,100
    Python library for creating PEG parsers

  125. libaudioflux/audioFlux ⭐ 2,053
    A library for audio and music analysis, feature extraction.
    🔗 audioflux.top

  126. google/gin-config ⭐ 1,993
    Gin provides a lightweight configuration framework for Python

  127. grahamdumpleton/wrapt ⭐ 1,980
    A Python module for decorators, wrappers and monkey patching.

  128. astanin/python-tabulate ⭐ 1,976
    Pretty-print tabular data in Python, a library and a command-line utility. Repository migrated from bitbucket.org/astanin/python-tabulate.
    🔗 pypi.org/project/tabulate

  129. pyfilesystem/pyfilesystem2 ⭐ 1,950
    Python's Filesystem abstraction layer
    🔗 www.pyfilesystem.org

  130. nateshmbhat/pyttsx3 ⭐ 1,917
    Offline Text To Speech synthesis for python

  131. landscapeio/prospector ⭐ 1,907
    Inspects Python source files and provides information about type and location of classes, methods etc

  132. jcrist/msgspec ⭐ 1,877
    A fast serialization and validation library, with builtin support for JSON, MessagePack, YAML, and TOML
    🔗 jcristharif.com/msgspec

  133. julienpalard/Pipe ⭐ 1,856
    A Python library to use infix notation in Python

  134. python-rope/rope ⭐ 1,839
    A Python refactoring library

  135. numba/llvmlite ⭐ 1,838
    A lightweight LLVM python binding for writing JIT compilers
    🔗 llvmlite.pydata.org

  136. chaostoolkit/chaostoolkit ⭐ 1,833
    Chaos Engineering Toolkit & Orchestration for Developers
    🔗 chaostoolkit.org

  137. carpedm20/emoji ⭐ 1,822
    emoji terminal output for Python

  138. mitmproxy/pdoc ⭐ 1,816
    API Documentation for Python Projects
    🔗 pdoc.dev

  139. omry/omegaconf ⭐ 1,809
    Flexible Python configuration system. The last one you will ever need.

  140. joowani/binarytree ⭐ 1,804
    Python Library for Studying Binary Trees
    🔗 binarytree.readthedocs.io

  141. pydoit/doit ⭐ 1,783
    task management & automation tool
    🔗 pydoit.org

  142. pygments/pygments ⭐ 1,706
    Pygments is a generic syntax highlighter written in Python
    🔗 pygments.org

  143. rhettbull/osxphotos ⭐ 1,700
    Python app to work with pictures and associated metadata from Apple Photos on macOS. Also includes a package to provide programmatic access to the Photos library, pictures, and metadata.

  144. kalliope-project/kalliope ⭐ 1,697
    Kalliope is a framework that will help you to create your own personal assistant.
    🔗 kalliope-project.github.io

  145. konradhalas/dacite ⭐ 1,663
    Simple creation of data classes from dictionaries.

  146. home-assistant/supervisor ⭐ 1,655
    🏡 Home Assistant Supervisor
    🔗 home-assistant.io/hassio

  147. samuelcolvin/watchfiles ⭐ 1,607
    Simple, modern and fast file watching and code reload in python.
    🔗 watchfiles.helpmanual.io

  148. open-telemetry/opentelemetry-python ⭐ 1,604
    OpenTelemetry Python API and SDK
    🔗 opentelemetry.io

  149. rubik/radon ⭐ 1,598
    Various code metrics for Python code
    🔗 radon.readthedocs.org

  150. mkdocstrings/mkdocstrings ⭐ 1,578
    📘 Automatic documentation from sources, for MkDocs.
    🔗 mkdocstrings.github.io

  151. p0dalirius/Coercer ⭐ 1,565
    A python script to automatically coerce a Windows server to authenticate on an arbitrary machine through 12 methods.
    🔗 podalirius.net

  152. hbldh/bleak ⭐ 1,534
    A cross platform Bluetooth Low Energy Client for Python using asyncio

  153. facebookincubator/Bowler ⭐ 1,514
    Safe code refactoring for modern Python.
    🔗 pybowler.io

  154. nficano/python-lambda ⭐ 1,480
    A toolkit for developing and deploying serverless Python code in AWS Lambda.

  155. quodlibet/mutagen ⭐ 1,445
    Python module for handling audio metadata
    🔗 mutagen.readthedocs.io

  156. instagram/LibCST ⭐ 1,418
    A concrete syntax tree parser and serializer library for Python that preserves many aspects of Python's abstract syntax tree
    🔗 libcst.readthedocs.io

  157. fabiocaccamo/python-benedict ⭐ 1,411
    📘 dict subclass with keylist/keypath support, built-in I/O operations (base64, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml), s3 support and many utilities.

  158. aws-samples/aws-glue-samples ⭐ 1,392
    AWS Glue code samples

  159. lcompilers/lpython ⭐ 1,343
    Python compiler
    🔗 lpython.org

  160. pycqa/pyflakes ⭐ 1,341
    A simple program which checks Python source files for errors
    🔗 pypi.org/project/pyflakes

  161. lidatong/dataclasses-json ⭐ 1,306
    Easily serialize Data Classes to and from JSON

  162. ossf/criticality_score ⭐ 1,283
    Gives criticality score for an open source project

  163. brandon-rhodes/python-patterns ⭐ 1,267
    Source code behind the python-patterns.guide site by Brandon Rhodes

  164. aio-libs/yarl ⭐ 1,234
    Yet another URL library
    🔗 yarl.aio-libs.org

  165. oracle/graalpython ⭐ 1,112
    A Python 3 implementation built on GraalVM

  166. pdoc3/pdoc ⭐ 1,091
    🐍 ➡️ 📜 Auto-generate API documentation for Python projects
    🔗 pdoc3.github.io/pdoc

  167. c4urself/bump2version ⭐ 1,039
    Version-bump your software with a single command
    🔗 pypi.python.org/pypi/bump2version

  168. metachris/logzero ⭐ 1,029
    Robust and effective logging for Python 2 and 3.
    🔗 logzero.readthedocs.io

  169. pyo3/rust-numpy ⭐ 1,019
    PyO3-based Rust bindings of the NumPy C-API

  170. pyfpdf/fpdf2 ⭐ 940
    Simple PDF generation for Python
    🔗 py-pdf.github.io/fpdf2

  171. anthropics/anthropic-sdk-python ⭐ 911
    SDK providing access to Anthropic's safety-first language model APIs

  172. fsspec/filesystem_spec ⭐ 901
    A specification that python filesystems should adhere to.

  173. fastai/fastcore ⭐ 901
    Python supercharged for the fastai library
    🔗 fastcore.fast.ai

  174. milvus-io/pymilvus ⭐ 872
    Python SDK for Milvus.

  175. alex-sherman/unsync ⭐ 869
    Unsynchronize asyncio

  176. lastmile-ai/aiconfig ⭐ 844
    AIConfig saves prompts, models and model parameters as source control friendly configs. This allows you to iterate on prompts and model parameters separately from your application code.
    🔗 aiconfig.lastmileai.dev

  177. samuelcolvin/dirty-equals ⭐ 769
    Doing dirty (but extremely useful) things with equals.
    🔗 dirty-equals.helpmanual.io

  178. pypy/pypy ⭐ 736
    PyPy is a very fast and compliant implementation of the Python language.
    🔗 pypy.org

  179. barracuda-fsh/pyobd ⭐ 736
    Open-source OBD2 car diagnostics program

  180. pypa/build ⭐ 662
    A simple, correct Python build frontend
    🔗 build.pypa.io

  181. pydantic/logfire ⭐ 652
    Uncomplicated Observability for Python and beyond! 🪵🔥
    🔗 docs.pydantic.dev/logfire

  182. instagram/Fixit ⭐ 649
    Advanced Python linting framework with auto-fixes and hierarchical configuration that makes it easy to write custom in-repo lint rules.
    🔗 fixit.rtfd.io/en/latest

  183. gefyrahq/gefyra ⭐ 627
    Blazingly-fast 🚀, rock-solid, local application development ➡️ with Kubernetes.
    🔗 gefyra.dev

  184. open-telemetry/opentelemetry-python-contrib ⭐ 618
    OpenTelemetry instrumentation for Python modules
    🔗 opentelemetry.io

  185. qdrant/qdrant-client ⭐ 618
    Python client for Qdrant vector search engine
    🔗 qdrant.tech

  186. methexis-inc/terminal-copilot ⭐ 566
    A smart terminal assistant that helps you find the right command.

  187. fastai/ghapi ⭐ 513
    A delightful and complete interface to GitHub's amazing API
    🔗 ghapi.fast.ai

  188. steamship-core/steamship-langchain ⭐ 504
    steamship-langchain

  189. google/pyglove ⭐ 320
    Manipulating Python Programs

Visualisation

Visualisation tools and libraries. Application frameworks, 2D/3D plotting, dashboards, WebGL.

  1. apache/superset ⭐ 58,934
    Apache Superset is a Data Visualization and Data Exploration Platform
    🔗 superset.apache.org

  2. streamlit/streamlit ⭐ 31,818
    Streamlit — A faster way to build and share data apps.
    🔗 streamlit.io

  3. gradio-app/gradio ⭐ 29,043
    Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
    🔗 www.gradio.app

  4. plotly/dash ⭐ 20,528
    Data Apps & Dashboards for Python. No JavaScript Required.
    🔗 plotly.com/dash

  5. matplotlib/matplotlib ⭐ 19,309
    matplotlib: plotting with Python
    🔗 matplotlib.org/stable

  6. bokeh/bokeh ⭐ 18,848
    Interactive Data Visualization in the browser, from Python
    🔗 bokeh.org

  7. plotly/plotly.py ⭐ 15,310
    The interactive graphing library for Python ✨ This project now includes Plotly Express!
    🔗 plotly.com/python

  8. mwaskom/seaborn ⭐ 11,966
    Statistical data visualization in Python
    🔗 seaborn.pydata.org

  9. visgl/deck.gl ⭐ 11,715
    WebGL2 powered visualization framework
    🔗 deck.gl

  10. marceloprates/prettymaps ⭐ 10,841
    A small set of Python functions to draw pretty maps from OpenStreetMap data. Based on osmnx, matplotlib and shapely libraries.

  11. altair-viz/altair ⭐ 8,931
    Declarative statistical visualization library for Python
    🔗 altair-viz.github.io

  12. nvidia/TensorRT-LLM ⭐ 6,638
    TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT...
    🔗 nvidia.github.io/tensorrt-llm

  13. lux-org/lux ⭐ 4,919
    Automatically visualize your pandas dataframe via a single print! 📊 💡

  14. renpy/renpy ⭐ 4,567
    The Ren'Py Visual Novel Engine
    🔗 www.renpy.org

  15. man-group/dtale ⭐ 4,558
    Visualizer for pandas data structures
    🔗 alphatechadmin.pythonanywhere.com

  16. holoviz/panel ⭐ 4,246
    Panel: The powerful data exploration & web app framework for Python
    🔗 panel.holoviz.org

  17. has2k1/plotnine ⭐ 3,830
    A Grammar of Graphics for Python
    🔗 plotnine.org

  18. residentmario/missingno ⭐ 3,813
    missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities that allows you to get a quick visual summary of the completeness (or lack thereof) of your dataset.

  19. pyqtgraph/pyqtgraph ⭐ 3,683
    Fast data visualization and GUI tools for scientific / engineering applications
    🔗 www.pyqtgraph.org

  20. vispy/vispy ⭐ 3,224
    Main repository for Vispy
    🔗 vispy.org

  21. ml-tooling/opyrator ⭐ 3,018
    🪄 Turns your machine learning code into microservices with web API, interactive GUI, and more.
    🔗 opyrator-playground.mltooling.org

  22. netflix/flamescope ⭐ 2,989
    FlameScope is a visualization tool for exploring different time ranges as Flame Graphs.

  23. facebookresearch/hiplot ⭐ 2,698
    HiPlot makes understanding high dimensional data easy
    🔗 facebookresearch.github.io/hiplot

  24. holoviz/holoviews ⭐ 2,624
    With Holoviews, your data visualizes itself.
    🔗 holoviews.org

  25. kozea/pygal ⭐ 2,603
    pygal is a dynamic SVG charting library written in python.
    🔗 www.pygal.org

  26. pyvista/pyvista ⭐ 2,370
    3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK)
    🔗 docs.pyvista.org

  27. mckinsey/vizro ⭐ 2,359
    Vizro is a toolkit for creating modular data visualization applications.
    🔗 vizro.readthedocs.io/en/stable

  28. marcomusy/vedo ⭐ 1,932
    A python module for scientific analysis of 3D data based on VTK and Numpy
    🔗 vedo.embl.es

  29. datapane/datapane ⭐ 1,349
    Build and share data reports in 100% Python
    🔗 datapane.com

  30. facultyai/dash-bootstrap-components ⭐ 1,057
    Bootstrap components for Plotly Dash
    🔗 dash-bootstrap-components.opensource.faculty.ai

  31. nomic-ai/deepscatter ⭐ 978
    Zoomable, animated scatterplots in the browser that scale to over a billion points

  32. hazyresearch/meerkat ⭐ 812
    Creative interactive views of any dataset.

  33. holoviz/holoviz ⭐ 789
    High-level tools to simplify visualization in Python.
    🔗 holoviz.org

Web

Web related frameworks and libraries: webapp servers, WSGI, ASGI, asyncio, HTTP, REST, user management.

  1. django/django ⭐ 76,886
    The Web framework for perfectionists with deadlines.
    🔗 www.djangoproject.com

  2. tiangolo/fastapi ⭐ 71,145
    FastAPI framework, high performance, easy to learn, fast to code, ready for production
    🔗 fastapi.tiangolo.com

  3. pallets/flask ⭐ 66,425
    The Python micro framework for building web applications.
    🔗 flask.palletsprojects.com

  4. sherlock-project/sherlock ⭐ 51,412
    🔎 Hunt down social media accounts by username across social networks
    🔗 sherlock-project.github.io

  5. psf/requests ⭐ 51,378
    A simple, yet elegant, HTTP library.
    🔗 requests.readthedocs.io/en/latest

  6. tornadoweb/tornado ⭐ 21,524
    Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.
    🔗 www.tornadoweb.org

  7. huge-success/sanic ⭐ 17,735
    Accelerate your web app development | Build fast. Run fast.
    🔗 sanic.dev

  8. pyscript/pyscript ⭐ 17,451
    A framework that allows users to create rich Python applications in the browser using HTML's interface and the power of Pyodide, WASM, and modern web technologies.
    🔗 pyscript.net

  9. wagtail/wagtail ⭐ 17,239
    A Django content management system focused on flexibility and user experience
    🔗 wagtail.org

  10. reflex-dev/reflex ⭐ 16,778
    🕸️ Web apps in pure Python 🐍
    🔗 reflex.dev

  11. aio-libs/aiohttp ⭐ 14,581
    Asynchronous HTTP client/server framework for asyncio and Python
    🔗 docs.aiohttp.org

  12. encode/httpx ⭐ 12,363
    A next generation HTTP client for Python. 🦋
    🔗 www.python-httpx.org

  13. getpelican/pelican ⭐ 12,268
    Static site generator that supports Markdown and reST syntax. Powered by Python.
    🔗 getpelican.com

  14. aws/chalice ⭐ 10,311
    Python Serverless Microframework for AWS

  15. encode/starlette ⭐ 9,541
    The little ASGI framework that shines. 🌟
    🔗 www.starlette.io

  16. benoitc/gunicorn ⭐ 9,519
    gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
    🔗 www.gunicorn.org

  17. falconry/falcon ⭐ 9,388
    The no-magic web data plane API and microservices framework for Python developers, with a focus on reliability, correctness, and performance at scale.
    🔗 falcon.readthedocs.io/en/stable

  18. flet-dev/flet ⭐ 9,218
    Flet enables developers to easily build realtime web, mobile and desktop apps in Python. No frontend experience required.
    🔗 flet.dev

  19. bottlepy/bottle ⭐ 8,300
    bottle.py is a fast and simple micro-framework for python web-applications.
    🔗 bottlepy.org

  20. graphql-python/graphene ⭐ 7,977
    GraphQL framework for Python
    🔗 graphene-python.org

  21. encode/uvicorn ⭐ 7,876
    An ASGI web server, for Python. 🦄
    🔗 www.uvicorn.org

  22. reactive-python/reactpy ⭐ 7,663
    ReactPy is a library for building user interfaces in Python without Javascript
    🔗 reactpy.dev

  23. zauberzeug/nicegui ⭐ 7,446
    Create web-based user interfaces with Python. The nice way.
    🔗 nicegui.io

  24. pyeve/eve ⭐ 6,661
    REST API framework designed for human beings
    🔗 python-eve.org

  25. pallets/werkzeug ⭐ 6,547
    The comprehensive WSGI web application library.
    🔗 werkzeug.palletsprojects.com

  26. vitalik/django-ninja ⭐ 6,251
    💨 Fast, Async-ready, Openapi, type hints based framework for building APIs
    🔗 django-ninja.dev

  27. webpy/webpy ⭐ 5,870
    web.py is a web framework for python that is as simple as it is powerful.
    🔗 webpy.org

  28. stephenmcd/mezzanine ⭐ 4,716
    CMS framework for Django
    🔗 mezzanine.jupo.org

  29. nameko/nameko ⭐ 4,653
    A microservices framework for Python that lets service developers concentrate on application logic and encourages testability.
    🔗 www.nameko.io

  30. starlite-api/litestar ⭐ 4,469
    Production-ready, Light, Flexible and Extensible ASGI API framework | Effortlessly Build Performant APIs
    🔗 litestar.dev

  31. pywebio/PyWebIO ⭐ 4,335
    Write interactive web apps in a script-like way.
    🔗 pywebio.readthedocs.io

  32. fastapi-users/fastapi-users ⭐ 4,077
    Ready-to-use and customizable users management for FastAPI
    🔗 fastapi-users.github.io/fastapi-users

  33. pylons/pyramid ⭐ 3,901
    Pyramid - A Python web framework
    🔗 trypyramid.com

  34. h2oai/wave ⭐ 3,863
    H2O Wave is a software stack for building beautiful, low-latency, realtime, browser-based applications and dashboards entirely in Python/R without using HTML, Javascript, or CSS.
    🔗 wave.h2o.ai

  35. strawberry-graphql/strawberry ⭐ 3,773
    A GraphQL library for Python that leverages type annotations 🍓
    🔗 strawberry.rocks

  36. websocket-client/websocket-client ⭐ 3,464
    WebSocket client for Python
    🔗 github.com/websocket-client/websocket-client

  37. unbit/uwsgi ⭐ 3,412
    uWSGI application server container
    🔗 projects.unbit.it/uwsgi

  38. pallets/quart ⭐ 2,636
    An async Python micro framework for building web applications.
    🔗 quart.palletsprojects.com

  39. fastapi-admin/fastapi-admin ⭐ 2,561
    A fast admin dashboard based on FastAPI and TortoiseORM with tabler ui, inspired by Django admin
    🔗 fastapi-admin-docs.long2ice.io

  40. flipkart-incubator/Astra ⭐ 2,427
    Automated Security Testing For REST APIs

  41. masoniteframework/masonite ⭐ 2,150
    The Modern And Developer Centric Python Web Framework. Be sure to read the documentation and join the Discord channel for questions: https://discord.gg/TwKeFahmPZ
    🔗 docs.masoniteproject.com

  42. dot-agent/nextpy ⭐ 2,110
    🤖Self-Modifying Framework from the Future 🔮 World's First AMS
    🔗 dotagent.ai

  43. python-restx/flask-restx ⭐ 2,077
    Fork of Flask-RESTPlus: Fully featured framework for fast, easy and documented API development with Flask
    🔗 flask-restx.readthedocs.io/en/latest

  44. cherrypy/cherrypy ⭐ 1,786
    CherryPy is a pythonic, object-oriented HTTP framework. https://cherrypy.dev
    🔗 docs.cherrypy.dev

  45. dmontagu/fastapi-utils ⭐ 1,748
    Reusable utilities for FastAPI: a number of utilities to help reduce boilerplate and reuse common functionality across projects

  46. neoteroi/BlackSheep ⭐ 1,731
    Fast ASGI web framework for Python
    🔗 www.neoteroi.dev/blacksheep

  47. s3rius/FastAPI-template ⭐ 1,670
    Feature rich robust FastAPI template.

  48. jordaneremieff/mangum ⭐ 1,602
    AWS Lambda support for ASGI applications
    🔗 mangum.io

  49. wtforms/wtforms ⭐ 1,463
    A flexible forms validation and rendering library for Python.
    🔗 wtforms.readthedocs.io

  50. awtkns/fastapi-crudrouter ⭐ 1,310
    A dynamic FastAPI router that automatically creates CRUD routes for your models
    🔗 fastapi-crudrouter.awtkns.com

  51. magicstack/httptools ⭐ 1,166
    Fast HTTP parser

  52. long2ice/fastapi-cache ⭐ 1,137
    fastapi-cache is a tool to cache FastAPI responses and function results, with backend support for Redis and Memcached.
    🔗 github.com/long2ice/fastapi-cache

  53. whitphx/stlite ⭐ 993
    A port of Streamlit to WebAssembly, powered by Pyodide.
    🔗 edit.share.stlite.net

  54. rstudio/py-shiny ⭐ 977
    Shiny for Python
    🔗 shiny.posit.co/py

  55. koxudaxi/fastapi-code-generator ⭐ 907
    This code generator creates a FastAPI app from an OpenAPI file.

  56. aeternalis-ingenium/FastAPI-Backend-Template ⭐ 558
    A backend project template with FastAPI, PostgreSQL with asynchronous SQLAlchemy 2.0, Alembic for asynchronous database migration, and Docker.


Interactive version: www.awesomepython.org, Hugging Face Dataset: awesome-python

Please raise a new issue to suggest a Python repo that you would like to see added.

1,461 hand-picked awesome Python libraries and frameworks, updated 05 May 2024
