Skip to content

Commit

Permalink
Add blog for OCR.
Browse files Browse the repository at this point in the history
  • Loading branch information
winlinvip committed May 20, 2024
1 parent a493212 commit 51c5cc8
Show file tree
Hide file tree
Showing 12 changed files with 189 additions and 1 deletion.
Original file line number Diff line number Diff line change
@@ -0,0 +1,105 @@
---
slug: ocr-video-streams
title: Oryx - Leveraging OpenAI for OCR and Object Recognition in Video Streams
authors: []
tags: [ocr, ai, gpt, srs, oryx]
custom_edit_url: null
---

# Leveraging OpenAI for OCR and Object Recognition in Video Streams using Oryx

## Introduction

In today's digital world, videos are everywhere. From social media clips to live broadcasts, we consume a
vast amount of video content daily. But have you ever wondered how we can make sense of all the information
in these videos? This is where AI comes in. With the help of artificial intelligence, we can now recognize
text, identify objects, and even describe scenes in video streams.

<!--truncate-->

One powerful tool that makes this process easy is Oryx. In this blog, we'll explore how Oryx can help you
perform OCR (Optical Character Recognition) on video streams, allowing you to extract valuable information
in real-time.

## Step 1: Create Oryx by One Click

Creating an Oryx is simple and can be done with just one click if you use Digital Ocean droplet.
Please see [How to Setup a Video Streaming Service by 1-Click](./2022-04-09-Oryx-Tutorial.md) for detail.

You can also use Docker to create an Oryx with a single command line:

```bash
docker run --restart always -d -it --name oryx -v $HOME/data:/data \
-p 80:2022 -p 443:2443 -p 1935:1935 -p 8000:8000/udp -p 10080:10080/udp \
ossrs/oryx:5
```

After creating the Oryx, you can access it through `http://your-server-ip/mgmt` via a browser.

## Step 2: Publish a Live Stream to Oryx

You can use OBS or FFmpeg to publish a live stream to Oryx. You can also set up HTTPS and publish via WebRTC.

![](/img/blog-2024-05-20-01.png)

Once the stream is published, you can preview it using an H5 player or VLC.
Please see [How to Setup a Video Streaming Service by 1-Click](./2022-04-09-Oryx-Tutorial.md) for detail.

## Step 3: Setup OpenAI Secret Key for OCR

To use OCR, you must obtain a secret key from OpenAI. Please open the [API keys](https://platform.openai.com/api-keys)
page in your browser and click the `Create new secret key` button. Once the key is created, copy it and set it in Oryx.
Then, click the `Test OpenAI Service` button, as shown in the picture below.

![](/img/blog-2024-05-20-02.png)

If the test is successful, you can click the `Start OCR` button to start the OCR process.

## Step 4: Setup AI Instructions for OCR

Once you've configured your GPT AI assistant, you can update the bellow prompt at the setting webpage
`Service Settings > AI Instructions > Instructions`.

![](/img/blog-2024-05-20-03.png)

To recognize text in video streams, you can use the following instructions:

```text
Recognize the text in the image. Output the identified text directly.
```

## Step 5: View OCR Results by Callback

Once the OCR process is complete, you can view the results by setting up a callback URL in Oryx.

![](/img/blog-2024-05-20-04.png)

You can also view the last OCR result in the dashboard.

![](/img/blog-2024-05-20-05.png)

## Cloud Service

At SRS, our goal is to establish a non-profit, open-source community dedicated to creating an all-in-one,
out-of-the-box, open-source video solution for live streaming and WebRTC online services.

Additionally, we offer a [Cloud](../cloud) service for those who prefer to use cloud service instead of building from
scratch. Our cloud service features global network acceleration, enhanced congestion control algorithms,
client SDKs for all platforms, and some free quota.

To learn more about our cloud service, click [here](../cloud).

## Conclusion

In conclusion, using AI to recognize text and objects in video streams is a game-changer. It helps us quickly
and accurately extract valuable information from videos. Tools like Oryx make this process simple and efficient,
allowing you to publish live streams and get real-time OCR results with ease. Whether you're looking to identify
people, read text, or describe scenes, AI-powered OCR can transform how you interact with video content. By
leveraging these technologies, you can unlock new possibilities and insights from the videos you encounter
every day.

## Contact

Welcome for more discussion at [discord](https://discord.gg/bQUPDRqy79).

![](https://ossrs.io/gif/v1/sls.gif?site=ossrs.io&path=/lts/blog-en/24-05-20-OCR-Video-Streams)
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
slug: dubbing-translating
title: Oryx - 视频多语言翻译和配音
title: Oryx - 基于AI的视频多语言翻译和配音
authors: []
tags: [dubbing, translating, ai, gpt, voice, srs, oryx, multilingual]
custom_edit_url: null
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
---
slug: ocr-video-streams
title: Oryx - 基于AI的视频流的OCR和对象识别
authors: []
tags: [ocr, ai, gpt, srs, oryx]
custom_edit_url: null
---

# Leveraging OpenAI for OCR and Object Recognition in Video Streams using Oryx

## Introduction

在当今的数字世界中,视频无处不在。从社交媒体片段到直播,我们每天都在大量消费视频内容。但你是否想过我们如何理解这些视频中的所有信息?
这就是人工智能的作用。有了人工智能的帮助,我们现在可以识别文字、识别物体,甚至描述视频流中的场景。

<!--truncate-->

一个强大的工具使这个过程变得简单,那就是Oryx。在这篇博客中,我们将探讨Oryx如何帮助你在视频流上执行OCR(光学字符识别),
让你能够实时提取有价值的信息。

## Step 1: Create Oryx by One Click

创建 Oryx 很简单,只需点击一下,如果您使用 Digital Ocean droplet,就可以完成。
请参阅[如何通过 1-Click 设置视频流服务](./2022-04-09-Oryx-Tutorial.md)了解详细信息。

您还可以使用 Docker 通过单个命令行创建 Oryx:

```bash
docker run --restart always -d -it --name oryx -v $HOME/data:/data \
-p 80:2022 -p 443:2443 -p 1935:1935 -p 8000:8000/udp -p 10080:10080/udp \
registry.cn-hangzhou.aliyuncs.com/ossrs/oryx:5
```

创建 Oryx 后,您可以通过 `http://your-server-ip/mgmt` 访问它。

## Step 2: Publish a Live Stream to Oryx

您可以使用 OBS 或 FFmpeg 将直播流发布到 Oryx。您还可以设置 HTTPS 并通过 WebRTC 发布。

![](/img/blog-2024-05-20-11.png)

发布流后,您可以使用 H5 播放器或 VLC 预览它。
请参阅[如何通过 1-Click 设置视频流服务](./2022-04-09-Oryx-Tutorial.md)了解详细信息。

## Step 3: Setup OpenAI Secret Key for OCR

要使用 Whisper ASR,您必须从 OpenAI 获取一个密钥。请在您的浏览器中打开 [API 密钥](https://platform.openai.com/api-keys)
页面,然后点击 `创建新的密钥` 按钮。密钥创建后,复制它并在 Oryx 中设置。然后,如下图所示,点击 `测试OpenAI服务可用性`
按钮。

![](/img/blog-2024-05-20-12.png)

如果测试成功,你可以点击 `开始OCR` 按钮来启动OCR过程。

## Step 4: Setup AI Instructions for OCR

配置好你的GPT AI助手后,你可以在设置网页上更新以下提示`服务设置 > AI模型配置 > 提示词`

![](/img/blog-2024-05-20-13.png)

要在视频流中识别文本,你可以使用以下指令:

```text
Recognize the text in the image. Output the identified text directly.
```

## Step 5: View OCR Results by Callback

一旦OCR过程完成,你可以通过在Oryx中设置回调URL来查看结果。

![](/img/blog-2024-05-20-14.png)

你也可以在仪表板中查看最新的OCR结果。

![](/img/blog-2024-05-20-15.png)

## Conclusion

总之,使用AI识别视频流中的文本和物体是一个改变游戏规则的技术。它帮助我们快速准确地从视频中提取有价值的信息。
像Oryx这样的工具使这个过程变得简单高效,让你能够轻松发布直播并获得实时OCR结果。无论你是想识别人、读取文本还是描述场景,
AI驱动的OCR都可以改变你与视频内容的互动方式。通过利用这些技术,你可以从每天接触到的视频中解锁新的可能性和见解。

![](https://ossrs.net/gif/v1/sls.gif?site=ossrs.net&path=/lts/blog-zh/24-05-20-OCR-Video-Streams)
Binary file added static/img/blog-2024-05-20-01.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added static/img/blog-2024-05-20-02.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added static/img/blog-2024-05-20-03.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added static/img/blog-2024-05-20-04.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added static/img/blog-2024-05-20-05.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added static/img/blog-2024-05-20-12.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added static/img/blog-2024-05-20-13.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added static/img/blog-2024-05-20-14.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added static/img/blog-2024-05-20-15.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.

0 comments on commit 51c5cc8

Please sign in to comment.