Releases: presidio-oss/factif-ai

v1.3.0

18 Mar 09:56
e03294b

What's Changed

Features

  • Integrated OmniParser v2: Delivers enriched, annotated screenshots to the LLM, enabling more informed decision-making for task execution.
  • Automated Playwright Installation: Playwright binaries are now installed automatically after npm install, streamlining setup.
  • Factifai Logo Added: Improved visual identity with the addition of the Factifai logo.
  • OpenAI Support for Explore Mode: Explore mode now supports OpenAI models, expanding LLM options and capabilities.
  • Chat History and Persistence: Added chat history tracking with file storage persistence, allowing users to revisit previous conversations.
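
As an illustration of the file-backed chat history described above, here is a minimal sketch. It is not the project's actual implementation: the StoredChat shape, the loadChats and saveChat helpers, and the single-JSON-file layout are all assumptions made for this example.

```typescript
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { dirname } from "node:path";

// Hypothetical record shape; the real project may store different fields.
interface StoredChat {
  id: string;
  title: string;
  messages: { role: string; content: string }[];
}

// Read all saved chats, or return an empty list if the file does not exist yet.
function loadChats(filePath: string): StoredChat[] {
  if (!existsSync(filePath)) return [];
  return JSON.parse(readFileSync(filePath, "utf8")) as StoredChat[];
}

// Insert a chat, or replace the existing entry with the same id, then write the file back.
function saveChat(filePath: string, chat: StoredChat): void {
  const chats = loadChats(filePath).filter((c) => c.id !== chat.id);
  chats.push(chat);
  mkdirSync(dirname(filePath), { recursive: true });
  writeFileSync(filePath, JSON.stringify(chats, null, 2), "utf8");
}
```

Rewriting the whole file on every save keeps the sketch short; a store with many long conversations would more likely use one file per chat or a small database.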

Bug Fixes

  • LLM Context Isolation: Resolved context contamination between different operating modes, ensuring accurate and isolated responses.
  • Chat Context Management: Implemented context management to prevent exceeding LLM token limits on complex websites.
  • Explore Mode Graph Fix: Corrected a bug causing incorrect graph rendering in explore mode.
  • Seamless VNC Mode Switching: Resolved issues with VNC mode switching, ensuring a smoother user experience.
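
One common way to keep a conversation under an LLM token limit is to drop the oldest turns first. The sketch below shows that idea only and is not the project's actual context-management code. The ChatMessage type, the trimToBudget and estimateTokens helpers, and the four-characters-per-token estimate are assumptions; a real implementation would use the model's own tokenizer.

```typescript
// Hypothetical message shape; the project's actual types may differ.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Rough estimate of about 4 characters per token (a heuristic, not a real tokenizer).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep every system message, then keep the newest other messages that fit in the budget.
function trimToBudget(messages: ChatMessage[], maxTokens: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");

  let used = system.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  const kept: ChatMessage[] = [];

  // Walk from newest to oldest and stop at the first message that no longer fits.
  for (const msg of [...rest].reverse()) {
    const cost = estimateTokens(msg.content);
    if (used + cost > maxTokens) break;
    used += cost;
    kept.unshift(msg); // restore chronological order
  }
  return [...system, ...kept];
}
```

Stopping at the first message that does not fit keeps the retained history contiguous, so the model never sees a reply without the turn that preceded it.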

Enhancements

  • UX Improvements: General enhancements to usability and the overall user experience.

v1.2.0

11 Mar 08:02
ed32973

What's Changed

  • Explore Chat: Specialized chat interface for click-through exploration of a website's interconnected content #11
  • Graph View: Visual representation of web pages and their relationships for easier navigation and understanding of site structure #11
  • Page Node System: Interactive page nodes that display content and allow navigation between related pages #11
  • Recent Chats: Easy access to previous explore mode conversations #11

v1.1.0

03 Mar 08:13
4fb1b47

Release Notes

Enhancements

  • Added a browser-centric approach to Puppeteer mode. (#9)
  • General improvements and bug fixes. (#9)

v1.0.0

03 Mar 08:15
7359d94

Release Notes

Built-in support for leading vision-language models:

  • Claude: Anthropic's advanced vision and reasoning model
  • OpenAI: GPT-4o with visual understanding capabilities
  • Gemini: Google's multimodal AI for computer interaction
  • OmniParser: Screen parsing tool for pure vision-based GUI agents

AI-Powered Computer Control

  • Intelligent element detection and navigation
  • Automated verification and validation
  • Comprehensive test documentation with automated screenshot capture for each step
  • Integrated test case export with visual step-by-step documentation