# v1.0.0: The first stable release
## General
This release contains the core functionality of the Petals platform described in our paper.
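For readers new to the project, here is a rough sketch of client-side usage with this release. It is illustrative only: the model name `bigscience/bloom-petals`, the public swarm default (#92), and the pip-installable package (#102) come from the changes listed below, but the exact import path and generation arguments should be checked against the v1.0.0 README.

```python
# Minimal sketch of client-side usage, assuming the v1.0.0 package layout
# (`pip install petals`) and the public swarm enabled by default (see #92).
from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM  # import path assumed from the v1.0.0 README

MODEL_NAME = "bigscience/bloom-petals"  # BLOOM-176B converted for Petals (see #46)

tokenizer = BloomTokenizerFast.from_pretrained(MODEL_NAME)
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME)

# Transformer blocks run on remote servers; embeddings and the LM head run locally (see #19).
inputs = tokenizer("A cat sat on", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0]))
```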
## What's Changed
- Rudimentary decentralization by @justheuristic in #9
- Update model by @dbaranchuk in #17
- Chained rpc_forward & rpc_backward by @dbaranchuk in #18
- Implement block selection on servers by @borzunov in #20
- LM head module by @dbaranchuk in #19
- Measure and cache network & compute throughput by @borzunov in #21
- Shallow prompt tuning with run example on SST-2 by @dbaranchuk in #22
- minimalistic automated tests by @justheuristic in #23
- Clean up readme by @justheuristic in #24
- [Test CI] add instructions to test the full model by @justheuristic in #25
- Fix default branch in CI by @justheuristic in #26
- Fix CI runs in master by @justheuristic in #27
- CI: use GIT_REF_NAME instead of GIT_HEAD_REF by @justheuristic in #28
- Add GenerationMixin class by @artek0chumak in #29
- Decouple make_sequence and move to RemoteSequenceManager by @justheuristic in #30
- fix is_subsequence by @dbaranchuk in #32
- Miscellaneous fixes to automatic tests by @justheuristic in #35
- Efficient forward & backward by @dbaranchuk in #36
- Pack of Inference Changes by @artek0chumak in #37
- Support various backend dtypes & async serialization by @dbaranchuk in #38
- Use "PETALS" as the readme title by @borzunov in #40
- integrate mixed-8bit model by @dbaranchuk in #39
- Rename 350m -> 560m by @dbaranchuk in #43
- make pytest outputs more verbose by @justheuristic in #44
- Distributed prompt tuning by @dbaranchuk in #42
- Reduce vocabulary size in test model, fix bug in routing when overlapped by @justheuristic in #45
- Convert actual model weights by @dbaranchuk in #46
- [quickfix 1/n] remove expensive assertions in inference code by @justheuristic in #48
- [Fix] make distributed seq cls to not create the full bloom model by @dbaranchuk in #49
- Fix recovering for sequential_backward by @dbaranchuk in #50
- Inference: require max sequence length instead of assuming 2048 by @justheuristic in #52
- Add shallow prefix-tuned inference by @artek0chumak in #55
- remove transformer block, implement as sequence size 1 by @GreenFatGuy in #54
- Update readme for the 1st public release by @borzunov in #57
- Use latest version of Petals scheme, shrink Petals logo by @borzunov in #59
- Update bullet points with feedback from Tim and other people by @borzunov in #61
- Update readme with arxiv link and more discussions by @borzunov in #62
- Warn that current instructions involve 6B model but we will replace them soon by @borzunov in #63
- Add deep prompt inference by @artek0chumak in #66
- Fix calling rpc_info multiple times by @justheuristic in #60
- Make attention cache wait until memory is freed by @justheuristic in #53
- Build cpuonly from bitsandbytes main by @justheuristic in #70
- Priority tasks by @GreenFatGuy in #47
- Update dependency versions by @justheuristic in #71
- fix protobuf version by @justheuristic in #74
- Add prompt tuning example on Personachat dataset by @artek0chumak in #69
- Quality of life changes: update readme, simplify run_server interface by @justheuristic in #75
- Use bitsandbytes==0.34.0, update readme by @justheuristic in #76
- Make small readability & style changes to the instructions by @borzunov in #77
- Rebalance swarm when necessary by @borzunov in #34
- Update hivemind to 1.1.2, mark `model` argument as required by @borzunov in #81
- Fix "Too many open files" during rebalancing by @borzunov in #83
- Add colab-related changes by @artek0chumak in #80
- Enable rebalancing by default by @borzunov in #84
- Implement exponential backoff for forward & backward by @borzunov in #85
- Add sst-2 ipynb example by @artek0chumak in #86
- Fix floating point issues in block_selection.py by @borzunov in #89
- Implement timeouts in forward/backward by @borzunov in #90
- Force reinstall of hivemind in example notebooks by @artek0chumak in #88
- Make inference, forward, and backward fully fault-tolerant by @borzunov in #91
- Use public swarm by default by @borzunov in #92
- Make ServerState announcements work better by @borzunov in #93
- Require hivemind with fixed compression and protobuf working on Colab by @borzunov in #94
- Try to fix protobuf versions once again by @borzunov in #95
- Add Beam Search decoding algorithm by @artek0chumak in #87
- Improve server's logging by @borzunov in #96
- Add various server timeouts, lower --max_batch_size and --inference_max_length defaults by @borzunov in #97
- Fix dtype- and device-related client issues by @borzunov in #98
- Make Petals a pip-installable package (attempt 2) by @borzunov in #102
- Fix dtypes in backend schemas by @borzunov in #99
- Fix ptune with `low_cpu_mem_usage=True` (as in Colab) by @borzunov in #103
- Add Dockerfile by @mryab in #82
- Remove unused imports, add missing arguments to docstrings by @mryab in #108
- Expose request_timeout to DistributedBloomConfig by @artek0chumak in #105
- Optimize RemoteSequenceManager by @justheuristic in #106
- Hotfix span selection by @justheuristic in #110
- Patch Linear8bit to enable CxB backward by @justheuristic in #111
- Fix Linear8bitlt state config, update tests by @justheuristic in #112
- Measure throughput for different configs, devices, and dtypes separately by @borzunov in #114
- Support --load_in_8bit on pre-Turing GPUs by @justheuristic in #113
- Fix tile size on ampere by @justheuristic in #116
- Make server use smart defaults by @borzunov in #115
- Suppress quantization warning and fix dtype defaults in compute benchmark by @borzunov in #117
- Choose --num_blocks for bigscience/bloom-petals automatically by @borzunov in #119
- Require hivemind==1.1.4 with p2pd v0.3.13 by @borzunov in #121
- Rework readme, move code example to the top, link draft of Colab by @borzunov in #118
- Remove "-r" when installing Petals in examples by @mryab in #122
- Update notebooks to use full BLOOM-176B by @artek0chumak in #104
- Call block.load_state_dict only once by @mryab in #124
- Add checks for forward() inputs on the client side by @justheuristic in #123
- Fix typos with codespell by @mryab in #126
- Set dht.num_workers = n_layer, update_period = 150, expiration = 300 by @borzunov in #125
- Avoid synchronous updates, ban peers based on request outcome by @justheuristic in #127
- Revert to hivemind==1.1.3 for stability by @borzunov in #129
- Clear trigger before engaging in update by @justheuristic in #130
- Fix inference and rpc_info() fault tolerance by @borzunov in #131
- Set default --step_timeout to 5 min by @borzunov in #133
- Don't ban servers in case of client-caused handler errors by @borzunov in #134
- Allow .generate() to reuse existing inference session by @borzunov in #132
- Fix waiting until free memory is available by @borzunov in #136
- Fix "could not unlink the shared memory file" during rebalancing by @borzunov in #135
- Add Docker commands, use permanent Discord links by @borzunov in #137
- Update texts in "Terms of use" and "Privacy and security" sections by @borzunov in #138
- Show route on client by @borzunov in #139
- Update Anaconda instructions by @borzunov in #140
- Use common folder for all caches, make it a volume in Dockerfile by @borzunov in #141
- Suppress asyncio error logs by default by @borzunov in #142
- Add link to privacy & security Wiki by @borzunov in #144
- Improve block size calculations by @borzunov in #149
- Fix OOMs during server rebalancing by @borzunov in #150
- Bump transformers to 4.25.1 by @justheuristic in #151
- Clean up disk space by @borzunov in #152
- Fix arguments in remove_old_models.py by @mryab in #153
- Add missing methods for SamplingAlgorithm, fix docstrings by @mryab in #107
- Reset MemoryCache during rebalancings by @borzunov in #154
- Check reachability automatically and give advice how to fix it by @borzunov in #155
- Fix logging: do not duplicate lines, enable colors in Colab by @borzunov in #156
- Update advanced notebooks by @artek0chumak in #148
- Downgrade CUDA in Docker image to 11.0.3 by @mryab in #145
- Switch to speedtest-cli by @justheuristic in #157
- Fix issues related to `petals` as a module by @borzunov in #159
- Alloc inference cache as one contiguous buffer by @borzunov in #160
- Fix typos in the example notebooks by @artek0chumak in #161
- Hot fix: Increase hivemind.P2P's startup_timeout for Colab, remove absent initial peer by @borzunov in #162
- Shield alloc & free from cancellation by @borzunov in #163
- Update wording in readme by @borzunov in #165
- Correct grammar in readme by @vadi2 in #166
- Add link to chat.petals.ml by @borzunov in #168
- Fix code example in readme by @borzunov in #169
- Fix instruction for developers by @justheuristic in #170
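Several items above change how servers are launched and tuned (the simplified `run_server` interface in #75, smart defaults in #115, automatic `--num_blocks` selection in #119, and 8-bit support in #39/#113). The sketch below shows one plausible way to start a server from a Python script; the module path and flag names are taken from the PR titles and package layout referenced above, and their exact semantics and defaults should be verified against the v1.0.0 CLI help.

```python
# Hypothetical sketch: launching a Petals server that hosts a slice of BLOOM blocks.
# The module path `petals.cli.run_server` assumes the pip-installable layout from #102;
# flags such as --num_blocks and --max_batch_size appear in the PR titles above.
import subprocess

subprocess.run(
    [
        "python", "-m", "petals.cli.run_server",
        "bigscience/bloom-petals",   # model converted for Petals (#46)
        "--num_blocks", "8",         # optional: chosen automatically since #119
        "--max_batch_size", "2048",  # defaults were lowered in #97; override if needed
    ],
    check=True,
)
```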
## New Contributors
- @dbaranchuk made their first contribution in #17
- @borzunov made their first contribution in #20
- @artek0chumak made their first contribution in #29
- @GreenFatGuy made their first contribution in #54
- @mryab made their first contribution in #82
- @vadi2 made their first contribution in #166
**Full Changelog**: https://github.com/bigscience-workshop/petals/commits/v1.0.0