Skip to content

[inference provider] Add wavespeed.ai as an inference provider #1424

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 24 commits into
base: main
Choose a base branch
from

Conversation

arabot777
Copy link

@arabot777 arabot777 commented May 5, 2025

What’s in this PR
WaveSpeedAI is a high-performance AI image and video generation service platform, offering industry-leading generation speeds. Now, want to be listed as an Inference Provider on the Hugging Face Hub

The JS Client Integration was completed based on the inference-providers help documentation and passed the test. I am submitting the pr now and look forward to further communication with you

Test

pnpm --filter @huggingface/inference test "test/InferenceClient.spec.ts" -t "^Wavespeed AI"

> @huggingface/[email protected] test /Users/shanliu/work/huggingface.js/packages/inference
> vitest run --config vitest.config.mts "test/InferenceClient.spec.ts"


 RUN  v0.34.6 /Users/shanliu/work/huggingface.js/packages/inference

 ✓ test/InferenceClient.spec.ts (104) 198160ms
   ✓ InferenceClient (104) 198160ms
     ✓ backward compatibility (1)
       ✓ works with old HfInference name
     ↓ HF Inference (49) [skipped]
       ↓ throws error if model does not exist [skipped]
       ↓ fillMask [skipped]
       ↓ works without model [skipped]
       ↓ summarization [skipped]
       ↓ questionAnswering [skipped]
       ↓ tableQuestionAnswering [skipped]
       ↓ documentQuestionAnswering [skipped]
       ↓ documentQuestionAnswering with non-array output [skipped]
       ↓ visualQuestionAnswering [skipped]
       ↓ textClassification [skipped]
       ↓ textGeneration - gpt2 [skipped]
       ↓ textGeneration - openai-community/gpt2 [skipped]
       ↓ textGenerationStream - meta-llama/Llama-3.2-3B [skipped]
       ↓ textGenerationStream - catch error [skipped]
       ↓ textGenerationStream - Abort [skipped]
       ↓ tokenClassification [skipped]
       ↓ translation [skipped]
       ↓ zeroShotClassification [skipped]
       ↓ sentenceSimilarity [skipped]
       ↓ FeatureExtraction [skipped]
       ↓ FeatureExtraction - auto-compatibility sentence similarity [skipped]
       ↓ FeatureExtraction - facebook/bart-base [skipped]
       ↓ FeatureExtraction - facebook/bart-base, list input [skipped]
       ↓ automaticSpeechRecognition [skipped]
       ↓ audioClassification [skipped]
       ↓ audioToAudio [skipped]
       ↓ textToSpeech [skipped]
       ↓ imageClassification [skipped]
       ↓ zeroShotImageClassification [skipped]
       ↓ objectDetection [skipped]
       ↓ imageSegmentation [skipped]
       ↓ imageToImage [skipped]
       ↓ imageToImage blob data [skipped]
       ↓ textToImage [skipped]
       ↓ textToImage with parameters [skipped]
       ↓ imageToText [skipped]
       ↓ request - openai-community/gpt2 [skipped]
       ↓ tabularRegression [skipped]
       ↓ tabularClassification [skipped]
       ↓ endpoint - makes request to specified endpoint [skipped]
       ↓ endpoint - makes request to specified endpoint - alternative syntax [skipped]
       ↓ chatCompletion modelId - OpenAI Specs [skipped]
       ↓ chatCompletionStream modelId - OpenAI Specs [skipped]
       ↓ chatCompletionStream modelId Fail - OpenAI Specs [skipped]
       ↓ chatCompletion - OpenAI Specs [skipped]
       ↓ chatCompletionStream - OpenAI Specs [skipped]
       ↓ custom mistral - OpenAI Specs [skipped]
       ↓ custom openai - OpenAI Specs [skipped]
       ↓ OpenAI client side routing - model should have provider as prefix [skipped]
     ↓ Fal AI (4) [skipped]
       ↓ textToImage - black-forest-labs/FLUX.1-schnell [skipped]
       ↓ textToImage - SD LoRAs [skipped]
       ↓ textToImage - Flux LoRAs [skipped]
       ↓ automaticSpeechRecognition - openai/whisper-large-v3 [skipped]
     ↓ Featherless (3) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ textGeneration [skipped]
     ↓ Replicate (10) [skipped]
       ↓ textToImage canonical - black-forest-labs/FLUX.1-schnell [skipped]
       ↓ textToImage canonical - black-forest-labs/FLUX.1-dev [skipped]
       ↓ textToImage canonical - stabilityai/stable-diffusion-3.5-large-turbo [skipped]
       ↓ textToImage versioned - ByteDance/SDXL-Lightning [skipped]
       ↓ textToImage versioned - ByteDance/Hyper-SD [skipped]
       ↓ textToImage versioned - playgroundai/playground-v2.5-1024px-aesthetic [skipped]
       ↓ textToImage versioned - stabilityai/stable-diffusion-xl-base-1.0 [skipped]
       ↓ textToSpeech versioned [skipped]
       ↓ textToSpeech OuteTTS -  usually Cold [skipped]
       ↓ textToSpeech Kokoro [skipped]
     ↓ SambaNova (3) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ featureExtraction [skipped]
     ↓ Together (4) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ textToImage [skipped]
       ↓ textGeneration [skipped]
     ↓ Nebius (3) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ textToImage [skipped]
     ↓ 3rd party providers (1) [skipped]
       ↓ chatCompletion - fails with unsupported model [skipped]
     ↓ Fireworks (2) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
     ↓ Hyperbolic (4) [skipped]
       ↓ chatCompletion - hyperbolic [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ textToImage [skipped]
       ↓ textGeneration [skipped]
     ↓ Novita (2) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
     ↓ Black Forest Labs (2) [skipped]
       ↓ textToImage [skipped]
       ↓ textToImage URL [skipped]
     ↓ Cohere (2) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
     ↓ Cerebras (2) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
     ↓ Nscale (3) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ textToImage [skipped]
     ↓ Groq (2) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
     ↓ OVHcloud (4) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ textGeneration [skipped]
       ↓ textGeneration stream [skipped]
     ✓ Wavespeed AI (5) 89033ms
       ✓ textToImage - wavespeed-ai/flux-schnell 89032ms
       ✓ textToImage - wavespeed-ai/flux-dev-lora 12369ms
       ✓ textToImage - wavespeed-ai/flux-dev-lora-ultra-fast 17936ms
       ✓ textToVideo - wavespeed-ai/wan-2.1/t2v-480p 79507ms
       ✓ imageToImage - wavespeed-ai/hidream-e1-full 74481ms

 Test Files  1 passed (1)
      Tests  5 passed | 103 skipped (108)
   Start at  14:33:17
   Duration  89.62s (transform 315ms, setup 14ms, collect 368ms, tests 89.03s, environment 0ms, prepare 74ms)

Copy link
Contributor

@SBrandeis SBrandeis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, thank you for your contribution
The code is of great quality overall - I left a few comments regarding our code style.
Please make sure the client can be used to query your API for all supported tasks, and that the payload are matching your API.
Thanks again!

Comment on lines 101 to 102
- [HF Inference API (serverless)](https://huggingface.co/models?inference=warm&sort=trending)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- [HF Inference API (serverless)](https://huggingface.co/models?inference=warm&sort=trending)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has deleted

hfModelId: "wavespeed-ai/wan-2.1/i2v-480p",
providerId: "wavespeed-ai/wan-2.1/i2v-480p",
status: "live",
task: "image-to-video",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this task is not supported in the client code - let's remove it for now

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has deleted

Comment on lines 1 to 13
import { InferenceOutputError } from "../lib/InferenceOutputError";
import { ImageToImageArgs } from "../tasks";
import type { BodyParams, HeaderParams, RequestArgs, UrlParams } from "../types";
import { delay } from "../utils/delay";
import { omit } from "../utils/omit";
import { base64FromBytes } from "../utils/base64FromBytes";
import {
TaskProviderHelper,
TextToImageTaskHelper,
TextToVideoTaskHelper,
ImageToImageTaskHelper,
} from "./providerHelper";

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use import type when the import is only used as a type

Suggested change
import { InferenceOutputError } from "../lib/InferenceOutputError";
import { ImageToImageArgs } from "../tasks";
import type { BodyParams, HeaderParams, RequestArgs, UrlParams } from "../types";
import { delay } from "../utils/delay";
import { omit } from "../utils/omit";
import { base64FromBytes } from "../utils/base64FromBytes";
import {
TaskProviderHelper,
TextToImageTaskHelper,
TextToVideoTaskHelper,
ImageToImageTaskHelper,
} from "./providerHelper";
import { InferenceOutputError } from "../lib/InferenceOutputError";
import type { ImageToImageArgs } from "../tasks";
import type { BodyParams, HeaderParams, RequestArgs, UrlParams } from "../types";
import { delay } from "../utils/delay";
import { omit } from "../utils/omit";
import { base64FromBytes } from "../utils/base64FromBytes";
import type {
TaskProviderHelper,
TextToImageTaskHelper,
TextToVideoTaskHelper,
ImageToImageTaskHelper,
} from "./providerHelper";

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modify as suggested

};
}

type WaveSpeedAIResponse<T = WaveSpeedAITaskResponse> = WaveSpeedAICommonResponse<T>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this type alias is needed, can we remove it?

Suggested change
type WaveSpeedAIResponse<T = WaveSpeedAITaskResponse> = WaveSpeedAICommonResponse<T>;

WaveSpeedAICommonResponse can be renamed to WaveSpeedAIResponse

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type is needed and will be used in two places. It's uncertain whether it will be used again in the future.
It follows the DRY (Don't Repeat Yourself) principle
It provides better type safety (through default generic parameters)
It makes the code more readable and maintainable

Comment on lines 124 to 133
case "completed": {
// Get the video data from the first output URL
if (!taskResult.outputs?.[0]) {
throw new InferenceOutputError("No video URL in completed response");
}
const videoResponse = await fetch(taskResult.outputs[0]);
if (!videoResponse.ok) {
throw new InferenceOutputError("Failed to fetch video data");
}
return await videoResponse.blob();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand, the payload can be something else than a video (eg an image)
Let's update the error message to reflect that

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes,
I revised it.

Comment on lines 170 to 192
if (!args.parameters) {
return {
...args,
model: args.model,
data: args.inputs,
};
} else {
return {
...args,
inputs: base64FromBytes(
new Uint8Array(args.inputs instanceof ArrayBuffer ? args.inputs : await (args.inputs as Blob).arrayBuffer())
),
};
}
}

override preparePayload(params: BodyParams): Record<string, unknown> {
return {
...omit(params.args, ["inputs", "parameters"]),
...(params.args.parameters as Record<string, unknown>),
image: params.args.inputs,
};
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think only one of the two ( preparePayload or preparePayloadAsync) should be responsible for building the payload, meaning, I'd rather move the rename of inputs to image in preparePayloadAsync an have preparePayload as dumb as possible

cc @hanouticelina - would love your opinion on that specific point

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only kept preparePayloadAsync func

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think only one of the two ( preparePayload or preparePayloadAsync) should be responsible for building the payload, meaning, I'd rather move the rename of inputs to image in preparePayloadAsync an have preparePayload as dumb as possible

yes agree!

Comment on lines 179 to 181
inputs: base64FromBytes(
new Uint8Array(args.inputs instanceof ArrayBuffer ? args.inputs : await (args.inputs as Blob).arrayBuffer())
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the wavespeed API support base64-encoded images as inputs?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

@hanouticelina hanouticelina added the inference-providers integration of a new or existing Inference Provider label May 20, 2025
@arabot777 arabot777 requested a review from SBrandeis May 20, 2025 13:44
Copy link
Contributor

@hanouticelina hanouticelina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you @arabot777 for the PR! I left some minor comments. I tested the 3 tasks supported by Wavespeed.ai and it works as expected with the changes I suggested.

constructor() {
super(WAVESPEEDAI_API_BASE_URL);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should override preparePayload to avoid sending unexpected fields to the API, since we rely on preparePayloadAsync to format the payload for this task

Suggested change
override preparePayload(params: BodyParams): Record<string, unknown> {
return params.args;
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, for the preparePayload, I directly reused the preparePayload of the base class WavespeedAITask because there is no ambiguity in all types of interface parameter fields provided by the wavespeed api.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this consideration point complies with the code specification? Or do I need to make some better modifications in accordance with the norms? thinks

async preparePayloadAsync(args: ImageToImageArgs): Promise<RequestArgs> {
return {
...args,
inputs: args.parameters?.prompt,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

based on https://wavespeed.ai/models/wavespeed-ai/hidream-e1-full, it seems that the prompt should be send in "prompt" not "inputs". let me know if i'm missing something here.

Suggested change
inputs: args.parameters?.prompt,
prompt: args.parameters?.prompt,

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because here I reused the preparePayload function of the general class WavespeedAITask. The prompt here is obtained from params.args.inputs.
For general processing, an "inputs" field was added to place the prompt when handling this logic

Copy link
Contributor

@SBrandeis SBrandeis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second round of code review, thank you! We're getting there

Note: make sure you run pnpm format and pnpm lint to conform our code style.

Comment on lines 16 to 24
/**
* Common response structure for all WaveSpeed AI API responses
*/
interface WaveSpeedAICommonResponse<T> {
code: number;
message: string;
data: T;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This abstraction is not necessary IMO, let's remove it (see my other comment)

Suggested change
/**
* Common response structure for all WaveSpeed AI API responses
*/
interface WaveSpeedAICommonResponse<T> {
code: number;
message: string;
data: T;
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been modified as suggested

Comment on lines 55 to 56
type WaveSpeedAIResponse<T = WaveSpeedAITaskResponse> = WaveSpeedAICommonResponse<T>;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following the previous comment - let's remove one level of abstraction

Suggested change
type WaveSpeedAIResponse<T = WaveSpeedAITaskResponse> = WaveSpeedAICommonResponse<T>;
interface WaveSpeedAIResponse {
code: number;
message: string;
data: WaveSpeedAITaskResponse
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been modified as suggested

Comment on lines 67 to 73
preparePayload(params: BodyParams): Record<string, unknown> {
const payload: Record<string, unknown> = {
...omit(params.args, ["inputs", "parameters"]),
...(params.args.parameters as Record<string, unknown>),
prompt: params.args.inputs,
};
// Add LoRA support if adapter is specified in the mapping
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to cast into Result<string, unknown> if the params have the proper type
ImageToImageArgs, TextToImageArgs, and TextToVideoArgs need to be improrted from "../tasks"

Suggested change
preparePayload(params: BodyParams): Record<string, unknown> {
const payload: Record<string, unknown> = {
...omit(params.args, ["inputs", "parameters"]),
...(params.args.parameters as Record<string, unknown>),
prompt: params.args.inputs,
};
// Add LoRA support if adapter is specified in the mapping
preparePayload(params: BodyParams<ImageToImageArgs | TextToImageArgs | TextToVideoArgs>): Record<string, unknown> {
const payload: Record<string, unknown> = {
...omit(params.args, ["inputs", "parameters"]),
...params.args.parameters,
prompt: params.args.inputs,
};
// Add LoRA support if adapter is specified in the mapping

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been modified as suggested

if (params.mapping?.adapter === "lora" && params.mapping.adapterWeightsPath) {
payload.loras = [
{
path: params.mapping.adapterWeightsPath,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, adapterWeightsPath is the path to the LoRA weights inside the associated HF repo
eg, for nerijs/pixel-art-xl, it will be

"pixel-art-xl.safetensors"

Let's make sure that is indeed what your API is expecting when running LoRAs

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I see that fal is the endpoint that has been concatenated with hf.
Can I directly set the adapterWeightsPath to a lora http address? Or any other address.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the test cases, I conducted the test in this way. The adapterWeightsPath was directly passed over as an input parameter of lora.

"wavespeed-ai/flux-dev-lora": {
	hfModelId: "wavespeed-ai/flux-dev-lora",
	providerId: "wavespeed-ai/flux-dev-lora",
	status: "live",
	task: "text-to-image",
	adapter: "lora",
	adapterWeightsPath:
		"https://d32s1zkpjdc4b1.cloudfront.net/predictions/599f3739f5354afc8a76a12042736bfd/1.safetensors",
},
"wavespeed-ai/flux-dev-lora-ultra-fast": {
	hfModelId: "wavespeed-ai/flux-dev-lora-ultra-fast",
	providerId: "wavespeed-ai/flux-dev-lora-ultra-fast",
	status: "live",
	task: "text-to-image",
	adapter: "lora",
	adapterWeightsPath: "linoyts/yarn_art_Flux_LoRA",
},

in wavespeedai task is :
image
image

However, I'm not sure whether the input parameters submitted by hf to lora must be the abbreviation of the file path of the hf model and then concatenated with the hf address in the code. If it is this kind of specification, I can complete it in the format of fal

Comment on lines +85 to +92
override prepareHeaders(params: HeaderParams, isBinary: boolean): Record<string, string> {
this.accessToken = params.accessToken;
const headers: Record<string, string> = { Authorization: `Bearer ${params.accessToken}` };
if (!isBinary) {
headers["Content-Type"] = "application/json";
}
return headers;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same behavior as the blanket implementation here:
https://github.com/arabot777/huggingface.js/blob/f706e02d6128f559bd5551072344ff6e31b9c4be/packages/inference/src/providers/providerHelper.ts#L114-L124

No need for an override IMO

Suggested change
override prepareHeaders(params: HeaderParams, isBinary: boolean): Record<string, string> {
this.accessToken = params.accessToken;
const headers: Record<string, string> = { Authorization: `Bearer ${params.accessToken}` };
if (!isBinary) {
headers["Content-Type"] = "application/json";
}
return headers;
}

Copy link
Author

@arabot777 arabot777 May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this part of the logic at the beginning. However, the getresponse method of imageToimage.ts did not pass in header information.
image

I have to rewrite prepareHeaders here and by assignment
this.accessToken = params.accessToken; To ensure that the complete ak information of the header can be passed on when calling getresponse
image

@SBrandeis SBrandeis self-assigned this May 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
inference-providers integration of a new or existing Inference Provider
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants