Deployment on LLMs

Here are some resources on deploying LLMs.

Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems

paper link: here

citation:

@misc{miao2023efficient,
      title={Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems}, 
      author={Xupeng Miao and Gabriele Oliaro and Zhihao Zhang and Xinhao Cheng and Hongyi Jin and Tianqi Chen and Zhihao Jia},
      year={2023},
      eprint={2312.15234},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}