I see your guide for running locally; can you give me an example of running through an API? #1
Comments
Hi, I suggest you take a look at the … Alternatively, you can refer to my solution below. I deployed my own vLLM service on a server and tried it; it works well. This setup lets me experiment with different models, and it is faster. Notice:
↑ The second point is one I previously missed.

Fixed version:

    class IO_System:
        """Input/Output system"""

        def __init__(self, args, tokenizer, model) -> None:
            # ... former code here
            # added
            self.api_url = args.api_url
            self.model_name = args.api_model_name

        def generate(self, model_input, max_tokens: int, num_return: int, stop_tokens):
            import requests

            io_output_list = []
            if self.api == "vllm_api":
                # Treat a single prompt as a one-element batch.
                if isinstance(model_input, str):
                    model_input = [model_input]
                if isinstance(model_input, list):
                    for _input in model_input:
                        params = {
                            "model": self.model_name,
                            "stream": False,
                            "temperature": self.temperature,
                            "top_k": self.top_k,
                            "top_p": self.top_p,
                            "max_tokens": max_tokens,
                            "stop": stop_tokens,
                            "n": num_return,
                            "messages": [{"role": "user", "content": _input}],
                        }
                        try:
                            vllm_response = requests.post(self.api_url, json=params).json()
                            # Only the first choice is kept, even when n > 1.
                            output = vllm_response["choices"][0]["message"]["content"]
                            token_count = vllm_response["usage"]["completion_tokens"]
                        except Exception as e:
                            raise RuntimeError(f"API Requests Error: {e}") from e
                        io_output_list.append(output)
                        self.call_counter += 1
                        self.token_counter += token_count
                return io_output_list
            # ... former code here
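For anyone adapting the snippet above: here is a minimal sketch of the request and response handling pulled out into two pure functions, so it can be checked without a running server. It assumes the endpoint is vLLM's OpenAI-compatible chat completions route. The helper names `build_chat_payload` and `extract_completions` and the default sampling values are hypothetical, not part of the repository.

```python
from typing import Any, Optional


def build_chat_payload(prompt: str, model_name: str, max_tokens: int,
                       num_return: int, stop_tokens: Optional[list] = None,
                       temperature: float = 0.8, top_p: float = 0.95) -> dict:
    """Build the JSON body for one chat-completion request.

    The default temperature/top_p values are placeholders (assumption).
    """
    return {
        "model": model_name,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "n": num_return,          # number of completions requested
        "stop": stop_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "stream": False,
    }


def extract_completions(response: dict) -> tuple:
    """Return (list of all generated texts, completion-token count)."""
    texts = [choice["message"]["content"] for choice in response["choices"]]
    tokens = response.get("usage", {}).get("completion_tokens", 0)
    return texts, tokens
```

Unlike the snippet above, `extract_completions` returns every choice rather than only the first, which may matter when `num_return` is greater than 1.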
Hello. The algorithm you provided is quite novel, but I want to run it through an API. Can you give me an example? I tried to modify the code, but most of the time running it against the API gives me a Connect error.
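On the intermittent Connect error: one possible mitigation is to retry the call a few times before giving up. The sketch below is only an illustration under that assumption; `call_with_retries` is a hypothetical helper, and it catches `OSError` because both Python's built-in `ConnectionError` and the exceptions raised by `requests` derive from it.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(send: Callable[[], T], attempts: int = 3,
                      delay: float = 1.0) -> T:
    """Call `send`, retrying when it raises a connection-level error."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except OSError as err:  # covers ConnectionError and requests' exceptions
            last_error = err
            if attempt < attempts:
                time.sleep(delay * attempt)  # linear backoff between tries
    raise RuntimeError(
        f"request failed after {attempts} attempts: {last_error}"
    ) from last_error
```

It could wrap the request like `call_with_retries(lambda: requests.post(api_url, json=params, timeout=60).json())`, where `api_url` and `params` are whatever your code already builds. Retrying only helps with transient failures; a wrong host, port, or path in the URL will fail every time.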