-
Notifications
You must be signed in to change notification settings - Fork 39
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
12 changed files
with
189 additions
and
1 deletion.
There are no files selected for viewing
105 changes: 105 additions & 0 deletions
105
i18n/en-us/docusaurus-plugin-content-blog/2024-05-20-OCR-Video-Streams.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,105 @@ | ||
--- | ||
slug: ocr-video-streams | ||
title: Oryx - Leveraging OpenAI for OCR and Object Recognition in Video Streams | ||
authors: [] | ||
tags: [ocr, ai, gpt, srs, oryx] | ||
custom_edit_url: null | ||
--- | ||
|
||
# Leveraging OpenAI for OCR and Object Recognition in Video Streams using Oryx | ||
|
||
## Introduction | ||
|
||
In today's digital world, videos are everywhere. From social media clips to live broadcasts, we consume a | ||
vast amount of video content daily. But have you ever wondered how we can make sense of all the information | ||
in these videos? This is where AI comes in. With the help of artificial intelligence, we can now recognize | ||
text, identify objects, and even describe scenes in video streams. | ||
|
||
<!--truncate--> | ||
|
||
One powerful tool that makes this process easy is Oryx. In this blog, we'll explore how Oryx can help you | ||
perform OCR (Optical Character Recognition) on video streams, allowing you to extract valuable information | ||
in real-time. | ||
|
||
## Step 1: Create Oryx by One Click | ||
|
||
Creating an Oryx is simple and can be done with just one click if you use Digital Ocean droplet. | ||
Please see [How to Setup a Video Streaming Service by 1-Click](./2022-04-09-Oryx-Tutorial.md) for detail. | ||
|
||
You can also use Docker to create an Oryx with a single command line: | ||
|
||
```bash | ||
docker run --restart always -d -it --name oryx -v $HOME/data:/data \ | ||
-p 80:2022 -p 443:2443 -p 1935:1935 -p 8000:8000/udp -p 10080:10080/udp \ | ||
ossrs/oryx:5 | ||
``` | ||
|
||
After creating the Oryx, you can access it through `http://your-server-ip/mgmt` via a browser. | ||
|
||
## Step 2: Publish a Live Stream to Oryx | ||
|
||
You can use OBS or FFmpeg to publish a live stream to Oryx. You can also set up HTTPS and publish via WebRTC. | ||
|
||
![](/img/blog-2024-05-20-01.png) | ||
|
||
Once the stream is published, you can preview it using an H5 player or VLC. | ||
Please see [How to Setup a Video Streaming Service by 1-Click](./2022-04-09-Oryx-Tutorial.md) for detail. | ||
|
||
## Step 3: Setup OpenAI Secret Key for OCR | ||
|
||
To use OCR, you must obtain a secret key from OpenAI. Please open the [API keys](https://platform.openai.com/api-keys) | ||
page in your browser and click the `Create new secret key` button. Once the key is created, copy it and set it in Oryx. | ||
Then, click the `Test OpenAI Service` button, as shown in the picture below. | ||
|
||
![](/img/blog-2024-05-20-02.png) | ||
|
||
If the test is successful, you can click the `Start OCR` button to start the OCR process. | ||
|
||
## Step 4: Setup AI Instructions for OCR | ||
|
||
Once you've configured your GPT AI assistant, you can update the bellow prompt at the setting webpage | ||
`Service Settings > AI Instructions > Instructions`. | ||
|
||
![](/img/blog-2024-05-20-03.png) | ||
|
||
To recognize text in video streams, you can use the following instructions: | ||
|
||
```text | ||
Recognize the text in the image. Output the identified text directly. | ||
``` | ||
|
||
## Step 5: View OCR Results by Callback | ||
|
||
Once the OCR process is complete, you can view the results by setting up a callback URL in Oryx. | ||
|
||
![](/img/blog-2024-05-20-04.png) | ||
|
||
You can also view the last OCR result in the dashboard. | ||
|
||
![](/img/blog-2024-05-20-05.png) | ||
|
||
## Cloud Service | ||
|
||
At SRS, our goal is to establish a non-profit, open-source community dedicated to creating an all-in-one, | ||
out-of-the-box, open-source video solution for live streaming and WebRTC online services. | ||
|
||
Additionally, we offer a [Cloud](../cloud) service for those who prefer to use cloud service instead of building from | ||
scratch. Our cloud service features global network acceleration, enhanced congestion control algorithms, | ||
client SDKs for all platforms, and some free quota. | ||
|
||
To learn more about our cloud service, click [here](../cloud). | ||
|
||
## Conclusion | ||
|
||
In conclusion, using AI to recognize text and objects in video streams is a game-changer. It helps us quickly | ||
and accurately extract valuable information from videos. Tools like Oryx make this process simple and efficient, | ||
allowing you to publish live streams and get real-time OCR results with ease. Whether you're looking to identify | ||
people, read text, or describe scenes, AI-powered OCR can transform how you interact with video content. By | ||
leveraging these technologies, you can unlock new possibilities and insights from the videos you encounter | ||
every day. | ||
|
||
## Contact | ||
|
||
Welcome for more discussion at [discord](https://discord.gg/bQUPDRqy79). | ||
|
||
![](https://ossrs.io/gif/v1/sls.gif?site=ossrs.io&path=/lts/blog-en/24-05-20-OCR-Video-Streams) |
2 changes: 1 addition & 1 deletion
2
i18n/zh-cn/docusaurus-plugin-content-blog/2024-02-21-Dubbing-Translating.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
83 changes: 83 additions & 0 deletions
83
i18n/zh-cn/docusaurus-plugin-content-blog/2024-05-20-OCR-Video-Streams.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
--- | ||
slug: ocr-video-streams | ||
title: Oryx - 基于AI的视频流的OCR和对象识别 | ||
authors: [] | ||
tags: [ocr, ai, gpt, srs, oryx] | ||
custom_edit_url: null | ||
--- | ||
|
||
# Leveraging OpenAI for OCR and Object Recognition in Video Streams using Oryx | ||
|
||
## Introduction | ||
|
||
在当今的数字世界中,视频无处不在。从社交媒体片段到直播,我们每天都在大量消费视频内容。但你是否想过我们如何理解这些视频中的所有信息? | ||
这就是人工智能的作用。有了人工智能的帮助,我们现在可以识别文字、识别物体,甚至描述视频流中的场景。 | ||
|
||
<!--truncate--> | ||
|
||
一个强大的工具使这个过程变得简单,那就是Oryx。在这篇博客中,我们将探讨Oryx如何帮助你在视频流上执行OCR(光学字符识别), | ||
让你能够实时提取有价值的信息。 | ||
|
||
## Step 1: Create Oryx by One Click | ||
|
||
创建 Oryx 很简单,只需点击一下,如果您使用 Digital Ocean droplet,就可以完成。 | ||
请参阅[如何通过 1-Click 设置视频流服务](./2022-04-09-Oryx-Tutorial.md)了解详细信息。 | ||
|
||
您还可以使用 Docker 通过单个命令行创建 Oryx: | ||
|
||
```bash | ||
docker run --restart always -d -it --name oryx -v $HOME/data:/data \ | ||
-p 80:2022 -p 443:2443 -p 1935:1935 -p 8000:8000/udp -p 10080:10080/udp \ | ||
registry.cn-hangzhou.aliyuncs.com/ossrs/oryx:5 | ||
``` | ||
|
||
创建 Oryx 后,您可以通过 `http://your-server-ip/mgmt` 访问它。 | ||
|
||
## Step 2: Publish a Live Stream to Oryx | ||
|
||
您可以使用 OBS 或 FFmpeg 将直播流发布到 Oryx。您还可以设置 HTTPS 并通过 WebRTC 发布。 | ||
|
||
![](/img/blog-2024-05-20-11.png) | ||
|
||
发布流后,您可以使用 H5 播放器或 VLC 预览它。 | ||
请参阅[如何通过 1-Click 设置视频流服务](./2022-04-09-Oryx-Tutorial.md)了解详细信息。 | ||
|
||
## Step 3: Setup OpenAI Secret Key for OCR | ||
|
||
要使用 Whisper ASR,您必须从 OpenAI 获取一个密钥。请在您的浏览器中打开 [API 密钥](https://platform.openai.com/api-keys) | ||
页面,然后点击 `创建新的密钥` 按钮。密钥创建后,复制它并在 Oryx 中设置。然后,如下图所示,点击 `测试OpenAI服务可用性` | ||
按钮。 | ||
|
||
![](/img/blog-2024-05-20-12.png) | ||
|
||
如果测试成功,你可以点击 `开始OCR` 按钮来启动OCR过程。 | ||
|
||
## Step 4: Setup AI Instructions for OCR | ||
|
||
配置好你的GPT AI助手后,你可以在设置网页上更新以下提示`服务设置 > AI模型配置 > 提示词`。 | ||
|
||
![](/img/blog-2024-05-20-13.png) | ||
|
||
要在视频流中识别文本,你可以使用以下指令: | ||
|
||
```text | ||
Recognize the text in the image. Output the identified text directly. | ||
``` | ||
|
||
## Step 5: View OCR Results by Callback | ||
|
||
一旦OCR过程完成,你可以通过在Oryx中设置回调URL来查看结果。 | ||
|
||
![](/img/blog-2024-05-20-14.png) | ||
|
||
你也可以在仪表板中查看最新的OCR结果。 | ||
|
||
![](/img/blog-2024-05-20-15.png) | ||
|
||
## Conclusion | ||
|
||
总之,使用AI识别视频流中的文本和物体是一个改变游戏规则的技术。它帮助我们快速准确地从视频中提取有价值的信息。 | ||
像Oryx这样的工具使这个过程变得简单高效,让你能够轻松发布直播并获得实时OCR结果。无论你是想识别人、读取文本还是描述场景, | ||
AI驱动的OCR都可以改变你与视频内容的互动方式。通过利用这些技术,你可以从每天接触到的视频中解锁新的可能性和见解。 | ||
|
||
![](https://ossrs.net/gif/v1/sls.gif?site=ossrs.net&path=/lts/blog-zh/24-05-20-OCR-Video-Streams) |
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.