Some of the prod code from kumori.ai that will help others, perhaps...
-
Image cleanup: This script provides image enhancement and super-resolution processing on a directory of images, offering features like sharpness adjustments, contrast and color enhancements, and facial detection based improvements. It includes a progress tracking feature that provides time estimates for completion.
-
Enhanced Kohya Image Captioning with OpenAI's Vision Model for Loras/A1111: A Python script designed to generate human-centric, lifelike captions, leveraging the depth of OpenAI's vision model for nuanced and respectful portrayals.
-
MP4 Maker: Image & Video Generator: Utilizes generative models like OpenAI's DALL·E and GPT for creating captivating videos from textual descriptions, enhancing content creation with high-quality images, tailored captions, and matching RFM audio tracks for AI-generated videos.
- Face Recognition: A collection of facial recognition and preprocessing scripts for accurately detecting a face in an image, preprocessing that image, adding or removing items (hat.png in this example) from detected humans, and then saving the processed face for further analysis.
-
Gmail Utilities: A Python script that facilitates the sending of emails through Gmail. It includes attachment support, access to Google Secret Manager for retrieving Gmail credentials, and a test case to demonstrate its usage.
-
LinkedIn Post App: A Flask-based web application that allows users to authenticate with LinkedIn, and post textual and image content to their LinkedIn timeline. This app uses OAuth protocol for LinkedIn integration, stores client id and secret via Google Secret Manager, and facilitates image uploads.
-
LinkedIn Utilities: A Helper script for the LinkedIn Post app, containing all the necessary utility functions for generating LinkedIn auth URLs, processing callbacks, fetching access tokens and user info, and handling LinkedIn posts including image uploads. Client id and secret are securely stored and retrieved from Google Secret store.
-
YouTube Video Frame Processing: Scripts for downloading a YouTube video, parsing it into frames, applying generative AI transformations, and then optimizing these frames. A detailed sequence performs downloading, frame extraction, and AI-based image styling to enhance and create new variations of the original video content.
-
YouTube Video Transcription and Summarization: Automated tools for transcribing YouTube videos and summarizing the content. Leveraging YouTube's transcript API and OpenAI's GPT models, these scripts provide a way to extract written content from videos, then summarize and highlight key points, facilitating content analysis and repurposing.
-
Ollama Model Chat Interface: Includes both a Python script and a Flask web app to chat with various AI models like Llama2, Gemma 2B, Gemma 7B, and Mistral. The interface allows for real-time AI conversations through seamless terminal prompts or a browser-based chat environment. It offers a versatile platform for integrations and development of AI-driven chat solutions.