[Suggestion] Adding deepseek and openrouter support #212

Open
chendakeng opened this issue Dec 10, 2024 · 4 comments
Labels: feature (a feature request or enhancement)
Milestone: 0.1.1

Comments

@chendakeng commented on Dec 10, 2024

This is a fascinating package! It's so TIDY, I like it a lot.

I am trying to use both deepseek (https://api-docs.deepseek.com/) and openrouter (https://openrouter.ai/docs/quick-start) with the chat_openai and extract_data functions.

Although they both expose OpenAI-compatible APIs, the response format seems to differ from what the elmer package expects. I tried revising the package directly to handle the responses properly, but my attempts failed (all errors remain mine, I am a poor coder...).

I hope the developer can consider adding these two platforms. DeepSeek is a very powerful open-source model with a good price/performance ratio, while OpenRouter lets us use a single API key to access a variety of models, which makes switching between models much more efficient.
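
For reference, the kind of call I have in mind is roughly the sketch below. The base URL and model name are taken from the DeepSeek docs and are assumptions on my part; I have not managed to get this (or the OpenRouter equivalent) to return a usable response through elmer.

library(elmer)

# Sketch only: point chat_openai() at an OpenAI-compatible endpoint.
# The base_url and model name below come from the DeepSeek docs and are
# my assumptions; the same pattern with https://openrouter.ai/api/v1 is
# what I tried for OpenRouter.
chat <- chat_openai(
  base_url = "https://api.deepseek.com",
  api_key = Sys.getenv("DEEPSEEK_API_KEY"),
  model = "deepseek-chat",
  system_prompt = "You are a helpful assistant."
)
chat$chat("What is the capital of France?")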

Cheers!
Devin

@hadley added the reprex (needs a minimal reproducible example) label on Jan 10, 2025
@hadley (Member) commented on Jan 10, 2025

@chendakeng could you please show me what you've tried and the errors you received? That would make it easier for me to look into it.

I've made DeepSeek a separate issue: #242

@chendakeng (Author)

Thanks for replying to this issue about OpenRouter.

I tried the code below.

Since the OpenRouter API uses an OpenAI-compatible format, I used the chat_openai() function, but it did not work.

The chat_openai example:

> chat <- chat_openai(
+   base_url = "https://openrouter.ai/api/v1",
+   api_key = "api-xxx",
+   model = "anthropic/claude-3.5-haiku:beta",
+   system_prompt = "You are a helpful assistant.",
+ )
> 
> 
> chat$chat("What is the capital of France?")
Error: lexical error: invalid char in json text.
                                       NA
                     (right here) ------^

Then I tried using httr2 to call the OpenRouter API directly.

The httr2 chat example:

> library(httr2)
> 
> make_request <- function(api_key) {
+   request("https://openrouter.ai/api/v1/chat/completions") |>
+     req_headers(
+       "Content-Type" = "application/json",
+       "Authorization" = paste("Bearer", api_key),
+       "HTTP-Referer" = "http://localhost:8000",  # OpenRouter requires this header
+       "X-Title" = "R Console"  # OpenRouter requires an application name
+     ) |>
+     req_body_json(list(
+       model = "anthropic/claude-3.5-haiku:beta",
+       messages = list(
+         list(
+           role = "system",
+           content = "You are a helpful assistant."
+         ),
+         list(
+           role = "user",
+           content = "What is the capital of France?"
+         )
+       )
+     )) |>
+     req_perform() |>
+     resp_body_json()
+ }
> 
> # Try the request
> response <- make_request("api-xxx")
> str(response)
List of 7
 $ id      : chr "gen-1736664320-cOY4lYfgFhKnPmMINH5u"
 $ provider: chr "Anthropic"
 $ model   : chr "anthropic/claude-3.5-haiku"
 $ object  : chr "chat.completion"
 $ created : int 1736664320
 $ choices :List of 1
  ..$ :List of 4
  .. ..$ logprobs     : NULL
  .. ..$ finish_reason: chr "end_turn"
  .. ..$ index        : int 0
  .. ..$ message      :List of 3
  .. .. ..$ role   : chr "assistant"
  .. .. ..$ content: chr "The capital of France is Paris."
  .. .. ..$ refusal: chr ""
 $ usage   :List of 3
  ..$ prompt_tokens    : int 20
  ..$ completion_tokens: int 15
  ..$ total_tokens     : int 35

I also tested the sentiment example using httr2 and the OpenRouter API, in case you need to see the JSON response format.

> make_request <- function(api_key) {
+   request("https://openrouter.ai/api/v1/chat/completions") |>
+     req_headers(
+       "Content-Type" = "application/json",
+       "Authorization" = paste("Bearer", api_key),
+       "HTTP-Referer" = "http://localhost:8000",
+       "X-Title" = "R Console"
+     ) |>
+     req_body_json(list(
+       model = "anthropic/claude-3.5-haiku:beta",
+       messages = list(
+         list(
+           role = "system",
+           content = "You are a helpful assistant. Please provide responses in JSON format."
+         ),
+         list(
+           role = "user", 
+           content = "Extract sentiment scores for this text with scores summing to 1.0: The product was okay, but the customer service was terrible."
+         )
+       ),
+       response_format = list(
+         type = "json_object"
+       )
+     )) |>
+     req_perform() |>
+     resp_body_json()
+ }
> 
> response <- make_request("api-xxx")
> str(response)
List of 7
 $ id      : chr "gen-1736664726-IuXxQ932kHONZNv6iFYj"
 $ provider: chr "Anthropic"
 $ model   : chr "anthropic/claude-3.5-haiku"
 $ object  : chr "chat.completion"
 $ created : int 1736664726
 $ choices :List of 1
  ..$ :List of 4
  .. ..$ logprobs     : NULL
  .. ..$ finish_reason: chr "end_turn"
  .. ..$ index        : int 0
  .. ..$ message      :List of 3
  .. .. ..$ role   : chr "assistant"
  .. .. ..$ content: chr "{\n    \"sentiment\": {\n        \"positive\": 0.3,\n        \"neutral\": 0.2,\n        \"negative\": 0.5\n    "| __truncated__
  .. .. ..$ refusal: chr ""
 $ usage   :List of 3
  ..$ prompt_tokens    : int 48
  ..$ completion_tokens: int 94
  ..$ total_tokens     : int 142

@hadley (Member) commented on Jan 12, 2025

I suspect the difference is in how streaming is handled. Does chat_openai() work for you with echo = FALSE?
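
i.e. something along these lines (adapting your call from above; untested on my side):

chat <- chat_openai(
  base_url = "https://openrouter.ai/api/v1",
  api_key = "api-xxx",
  model = "anthropic/claude-3.5-haiku:beta",
  system_prompt = "You are a helpful assistant.",
  echo = FALSE  # if my hunch is right, this avoids the streaming code path
)
chat$chat("What is the capital of France?")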

@chendakeng (Author) commented on Jan 13, 2025

Thanks for providing possible solutions!

Yes, if I specify echo = FALSE in chat_openai(), it works! And both chat and extract_data work perfectly for the OpenAI-series models on OpenRouter. That's nice!

But when I specified other models available on OpenRouter (the beauty of platforms like OpenRouter is that you can use one API key to test multiple models on the same task and compare their performance), it did not work as well. The results are inconsistent: some models work for chat but not for extract_data (like Claude and DeepSeek).

Here are some more cases:

The successful one, model = "openai/gpt-4o-mini":

> # chat
> chat <- chat_openai(
+   base_url = "https://openrouter.ai/api/v1",
+   api_key = "api-xxx",
+   model = "openai/gpt-4o-mini",
+   system_prompt = "You are a helpful assistant.",
+   echo = FALSE # the difference is in how streaming is handled. Use chat_openai() with echo = FALSE
+ )
> 
> 
> chat$chat("What is the capital of France?")
[1] "The capital of France is Paris."
> 
> 
> # extract_data
> text <- "
+   John works at Google in New York. He met with Sarah, the CEO of
+   Acme Inc., last week in San Francisco.
+ "
> 
> type_named_entity <- type_object(
+   name = type_string("The extracted entity name."),
+   type = type_enum("The entity type", c("person", "location", "organization")),
+   context = type_string("The context in which the entity appears in the text.")
+ )
> 
> type_named_entities <- type_object(
+   entities = type_array(items = type_named_entity)
+ )
> 
> str(chat$extract_data(text, type = type_named_entities))
List of 1
 $ entities:'data.frame':	6 obs. of  3 variables:
  ..$ name   : chr [1:6] "John" "Google" "Sarah" "Acme Inc." ...
  ..$ type   : chr [1:6] "person" "organization" "person" "organization" ...
  ..$ context: chr [1:6] "John works at Google" "John works at Google" "He met with Sarah, the CEO of Acme Inc." "He met with Sarah, the CEO of Acme Inc." ...

A half-successful one, model = "anthropic/claude-3.5-haiku":

> # chat
> chat <- chat_openai(
+   base_url = "https://openrouter.ai/api/v1",
+   api_key = "api-xxx",
+   model = "anthropic/claude-3.5-haiku",
+   system_prompt = "You are a helpful assistant.",
+   echo = FALSE # the difference is in how streaming is handled. Use chat_openai() with echo = FALSE
+ )
> 
> 
> chat$chat("What is the capital of France?")
[1] "The capital of France is Paris."
>
>
> # ner
> text <- "
+   John works at Google in New York. He met with Sarah, the CEO of
+   Acme Inc., last week in San Francisco.
+ "
> 
> type_named_entity <- type_object(
+   name = type_string("The extracted entity name."),
+   type = type_enum("The entity type", c("person", "location", "organization")),
+   context = type_string("The context in which the entity appears in the text.")
+ )
> 
> type_named_entities <- type_object(
+   entities = type_array(items = type_named_entity)
+ )
> 
> str(chat$extract_data(text, type = type_named_entities))
 chr "Here's a summary of the key details in the statement:\n\n1. John's workplace: Google\n2. John's location: New Y"| __truncated__
>
>
> # sentiment analysis
> text <- "
+   The product was okay, but the customer service was terrible. I probably
+   won't buy from them again.
+ "
> 
> type_sentiment <- type_object(
+   "Extract the sentiment scores of a given text. Sentiment scores should sum to 1.",
+   positive_score = type_number("Positive sentiment score, ranging from 0.0 to 1.0."),
+   negative_score = type_number("Negative sentiment score, ranging from 0.0 to 1.0."),
+   neutral_score = type_number("Neutral sentiment score, ranging from 0.0 to 1.0.")
+ )
> 
> 
> str(chat$extract_data(text, type = type_sentiment))
 chr "This sounds like a customer review expressing dissatisfaction with a company. The customer indicates that while"| __truncated__

The other half-successful one, model = "deepseek/deepseek-chat":

> # chat
> chat <- chat_openai(
+   base_url = "https://openrouter.ai/api/v1",
+   api_key = "api-xxx",
+   model = "deepseek/deepseek-chat",
+   system_prompt = "You are a helpful assistant.",
+   echo = FALSE # the difference is in how streaming is handled. Use chat_openai() with echo = FALSE
+ )
> 
> 
> chat$chat("What is the capital of France?")
[1] "The capital of France is **Paris**. Known for its rich history, iconic landmarks, and cultural influence, Paris is one of the most prominent cities in the world."
>
>
> # ner
> text <- "
+   John works at Google in New York. He met with Sarah, the CEO of
+   Acme Inc., last week in San Francisco.
+ "
> 
> type_named_entity <- type_object(
+   name = type_string("The extracted entity name."),
+   type = type_enum("The entity type", c("person", "location", "organization")),
+   context = type_string("The context in which the entity appears in the text.")
+ )
> 
> type_named_entities <- type_object(
+   entities = type_array(items = type_named_entity)
+ )
> 
> str(chat$extract_data(text, type = type_named_entities))
 chr "It sounds like John, who works at Google in New York, had a meeting with Sarah, the CEO of Acme Inc., last week"| __truncated__
> 
> # sentiment analysis
> text <- "
+   The product was okay, but the customer service was terrible. I probably
+   won't buy from them again.
+ "
> 
> type_sentiment <- type_object(
+   "Extract the sentiment scores of a given text. Sentiment scores should sum to 1.",
+   positive_score = type_number("Positive sentiment score, ranging from 0.0 to 1.0."),
+   negative_score = type_number("Negative sentiment score, ranging from 0.0 to 1.0."),
+   neutral_score = type_number("Neutral sentiment score, ranging from 0.0 to 1.0.")
+ )
> 
> 
> str(chat$extract_data(text, type = type_sentiment))
 chr "<structured data/>"

Thanks again for your time on this issue.

@hadley added the feature (a feature request or enhancement) label and removed the reprex (needs a minimal reproducible example) label on Jan 13, 2025
@hadley added this to the 0.1.1 milestone on Jan 13, 2025