Enhancing story narrating experience by introducing audio in story books #1666

Open
Monu2114 opened this issue Oct 7, 2024 · 12 comments

@Monu2114

Monu2114 commented Oct 7, 2024

Add Read-Aloud Functionality for Storybooks

Description: It would be highly beneficial to introduce a "read-aloud" feature that allows children to listen to storybooks being narrated in a natural, human-like voice. This would enhance the user experience, especially for younger audiences who may prefer or require auditory learning. The goal is to provide a more immersive, humanized storytelling experience.

Proposed Implementation:

Text-to-Speech Integration:

We can leverage a Text-to-Speech (TTS) API to convert the story text into speech. These services offer natural-sounding voices that can mimic human narration.

Voice Customization:

Users can choose between different voice types (e.g., gender, accent) to cater to different preferences and languages.

User Interface:

Add a "Read Aloud" button on the storybook interface, which, when clicked, triggers the TTS engine to start narrating the story.
Include basic playback controls (play, pause, stop) for better user control.
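
The play, pause, and stop controls above can be sketched as a small state machine. This is only an illustration under assumptions: the names `NarrationState`, `nextNarrationState`, and `engineCallFor` are hypothetical, and the returned call names assume the engine offers `speak`, `resume`, `pause`, and `cancel` operations, as the browser's `speechSynthesis` object does.

```typescript
// Hypothetical states and button actions for the "Read Aloud" controls.
type NarrationState = "stopped" | "playing" | "paused";
type NarrationAction = "play" | "pause" | "stop";
type EngineCall = "speak" | "resume" | "pause" | "cancel" | null;

// Pure transition function: which state follows a button press.
// Presses that make no sense (e.g. "pause" while stopped) keep the state.
function nextNarrationState(current: NarrationState, action: NarrationAction): NarrationState {
  if (action === "play") {
    return "playing"; // starts from "stopped", resumes from "paused"
  }
  if (action === "pause") {
    return current === "playing" ? "paused" : current;
  }
  return "stopped"; // action === "stop"
}

// Which engine operation (if any) the press should trigger.
function engineCallFor(current: NarrationState, action: NarrationAction): EngineCall {
  if (action === "play") {
    if (current === "stopped") return "speak";
    if (current === "paused") return "resume";
    return null; // already playing, nothing to do
  }
  if (action === "pause") {
    return current === "playing" ? "pause" : null;
  }
  return current === "stopped" ? null : "cancel";
}
```

Keeping the transitions in pure functions means the button handlers only need to look up the engine call and store the new state, and the logic can be tested without a browser.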

Performance Considerations:

Cache the audio output for frequently accessed books to minimize API calls and improve performance.
Ensure the feature runs smoothly across all supported platforms.
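
As a rough sketch of the caching idea, assuming the TTS engine hands back audio as a value that can be stored (an assumption; no engine is named in this proposal), a small least-recently-used cache could look like this. `NarrationCache` and `narrationCacheKey` are hypothetical names.

```typescript
// Least-recently-used cache. A Map iterates in insertion order,
// so the first key is always the least recently used entry.
class NarrationCache<V> {
  private entries: Map<string, V> = new Map<string, V>();
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert so this key becomes the most recently used one.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // Evict the least recently used entry.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}

// Hypothetical key: the same book narrated with another voice is a different entry.
function narrationCacheKey(bookId: string, voiceName: string): string {
  return bookId + "::" + voiceName;
}
```

The capacity bound matters on low-storage devices: frequently opened books stay cached while rarely opened ones are evicted first.
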
Benefits:

Enhances accessibility for children with visual impairments or reading difficulties.
Provides a comforting, human-like storytelling experience.
Supports auditory learners and adds a layer of engagement to the story-reading process.

@llaske
Owner

llaske commented Oct 8, 2024

It could be a good idea. It's what the Speak activity does, so it could be interesting to have this feature in e-books.
BTW, the implementation should respect two major rules of Sugarizer:

  • It should work offline
  • It should use only Free/Libre Open Source libraries/services.
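
To illustrate the offline rule, here is a minimal sketch that keeps only voices reporting `localService === true`, the flag the browser's `SpeechSynthesisVoice` objects use for on-device synthesis. `VoiceInfo` and `pickOfflineVoice` are hypothetical names. The flag addresses only the offline rule; whether the underlying engine is free software depends on the platform and is not checked here.

```typescript
// Subset of the fields exposed by SpeechSynthesisVoice objects.
interface VoiceInfo {
  name: string;
  lang: string;          // BCP 47 tag such as "en-US"
  localService: boolean; // true when synthesis happens on the device
}

// Pick a voice that works offline for the requested language.
// Preference order: a named voice, then an exact language match,
// then any local voice sharing the primary language ("en" for "en-US").
function pickOfflineVoice(
  voices: VoiceInfo[],
  lang: string,
  preferredName?: string
): VoiceInfo | undefined {
  const wanted = lang.toLowerCase();
  const primary = wanted.split("-")[0];
  const candidates = voices.filter(
    (v) => v.localService && v.lang.toLowerCase().split("-")[0] === primary
  );
  if (preferredName !== undefined) {
    const named = candidates.find((v) => v.name === preferredName);
    if (named !== undefined) return named;
  }
  const exact = candidates.find((v) => v.lang.toLowerCase() === wanted);
  return exact !== undefined ? exact : candidates[0];
}
```

In a browser, the `voices` argument could come from `speechSynthesis.getVoices()`. A result of `undefined` means no offline voice exists for that language, in which case the Read Aloud button could be hidden or disabled.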

@Monu2114
Author

Monu2114 commented Oct 8, 2024 via email

@llaske
Owner

llaske commented Oct 9, 2024

@Monu2114 there is no need to ask permission to work on an issue, everyone can work on anything. See here.
There is no plan for GSoC 2025 today and there is plenty of time before the next GSoC.

@Devmoni

Devmoni commented Jan 31, 2025

@llaske
This issue has not been verified yet, even though the response seems positive.
Thanks.

@llaske
Owner

llaske commented Feb 8, 2025

@Devmoni it will not be a GSoC 2025 project but it's a good idea if it could be done with constraints I've mentioned above.

@Devmoni

Devmoni commented Feb 8, 2025

> @Devmoni it will not be a GSoC 2025 project but it's a good idea if it could be done with constraints I've mentioned above.

Thanks for the feedback!
I’ll look into this and see how it can be implemented while considering the mentioned constraints.

@AliHassan245

@llaske, Sir, I have been working on this issue for a while and have made significant progress.

I have addressed the requirement for offline functionality by ensuring that the solution relies only on Free/Libre Open-Source libraries and services.

For the Text-to-Speech (TTS) API integration, I have implemented a system that converts the entire story text from the iframe container into speech. Additionally, I have added a Read Aloud button to the toolbar such that clicking the button starts the speech narration, while double-clicking it stops the narration.
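
As a hedged illustration of the text-preparation step (not the code from my branch), the extracted story text could be split into sentence-sized chunks so that each chunk is narrated as its own unit. `splitIntoChunks` and the default `maxLength` of 200 are hypothetical choices.

```typescript
// Split story text into chunks of whole sentences, each at most
// maxLength characters where possible. A single sentence longer
// than maxLength is kept whole rather than cut mid-sentence.
function splitIntoChunks(storyText: string, maxLength: number = 200): string[] {
  // Collapse runs of whitespace, including newlines left by page markup.
  const normalized = storyText.replace(/\s+/g, " ").trim();
  if (normalized.length === 0) return [];

  // Sentences ending in . ! or ?, plus trailing text with no terminator.
  const sentences = normalized.match(/[^.!?]+[.!?]+|[^.!?]+$/g) || [];

  const chunks: string[] = [];
  let current = "";
  for (const raw of sentences) {
    const sentence = raw.trim();
    if (sentence.length === 0) continue;
    if (current.length > 0 && current.length + 1 + sentence.length > maxLength) {
      // Adding this sentence would exceed the limit: close the chunk.
      chunks.push(current);
      current = sentence;
    } else {
      current = current.length > 0 ? current + " " + sentence : sentence;
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

In a browser the input could come from the iframe's `contentDocument.body.textContent`, assuming the book page is same-origin with the activity. Chunking by sentence also gives pause and stop natural boundaries to act on.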

I would like to confirm if this approach is suitable for our use case. Also, could you please suggest an appropriate SVG icon for the Read Aloud button in the toolbar?

@Monu2114
Author

Monu2114 commented Mar 11, 2025 via email

@AdityaKrSingh26

AdityaKrSingh26 commented Mar 12, 2025

> @llaske, Sir, I have been working on this issue for a while and have made significant progress.
>
> I have addressed the requirement for offline functionality by ensuring that the solution relies only on Free/Libre Open-Source libraries and services.
>
> For the Text-to-Speech (TTS) API integration, I have implemented a system that converts the entire story text from the iframe container into speech. Additionally, I have added a Read Aloud button to the toolbar such that clicking the button starts the speech narration, while double-clicking it stops the narration.
>
> I would like to confirm if this approach is suitable for our use case. Also, could you please suggest an appropriate SVG icon for the Read Aloud button in the toolbar?

@AliHassan245 One thing you need to keep in mind while solving this issue is that Text-to-Speech needs a server side, so it will not be able to work offline.

@AliHassan245

@AdityaKrSingh26, I've integrated the browser’s native SpeechSynthesis API into the codebase for smooth offline functionality using a completely free/open-source approach.
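
For reference, a minimal sketch of how text chunks can be queued on the SpeechSynthesis API. This is an illustration under assumptions, not the code from my branch: `SynthLike`, `UtteranceLike`, and `narrate` are hypothetical names, and the interfaces mirror only the `speak`, `cancel`, `text`, `lang`, and `rate` members of the real API so the logic can be exercised without a browser.

```typescript
// Minimal shapes mirroring the parts of the Web Speech API used here.
interface UtteranceLike {
  text: string;
  lang: string;
  rate: number;
}
interface SynthLike {
  speak(utterance: UtteranceLike): void;
  cancel(): void;
}

// Queue each non-empty chunk as its own utterance and return how many
// were queued. The default language and rate are illustrative values.
function narrate(
  chunks: string[],
  synth: SynthLike,
  makeUtterance: (text: string) => UtteranceLike,
  lang: string = "en-US",
  rate: number = 0.9
): number {
  synth.cancel(); // drop anything still queued from an earlier narration
  let queued = 0;
  for (const chunk of chunks) {
    if (chunk.trim().length === 0) continue;
    const utterance = makeUtterance(chunk);
    utterance.lang = lang;
    utterance.rate = rate;
    synth.speak(utterance); // speak() appends to the engine's queue
    queued++;
  }
  return queued;
}

// In a browser (assumption: the Web Speech API is available):
// if ("speechSynthesis" in window) {
//   narrate(chunks, window.speechSynthesis, (t) => new SpeechSynthesisUtterance(t));
// }
```

Whether narration actually works offline still depends on the voices installed on the device, so the feature check alone does not guarantee it.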

@AdityaKrSingh26

> @AdityaKrSingh26, I've integrated the browser’s native SpeechSynthesis API into the codebase for smooth offline functionality using a completely free/open-source approach.

One thing to note is that Sugarizer already includes the Speak activity. However, we need something more realistic here, and the browser's SpeechSynthesis API does not provide a sufficiently natural voice for narrating children's stories.

@llaske
Owner

llaske commented Mar 12, 2025

> @llaske, Sir, I have been working on this issue for a while and have made significant progress.
>
> I have addressed the requirement for offline functionality by ensuring that the solution relies only on Free/Libre Open-Source libraries and services.
>
> For the Text-to-Speech (TTS) API integration, I have implemented a system that converts the entire story text from the iframe container into speech. Additionally, I have added a Read Aloud button to the toolbar such that clicking the button starts the speech narration, while double-clicking it stops the narration.
>
> I would like to confirm if this approach is suitable for our use case. Also, could you please suggest an appropriate SVG icon for the Read Aloud button in the toolbar?

Cool. I can't evaluate without a PR.
