The Research Assistant is a research tool powered by the Gemini Pro generative model. It gathers information from web pages, splits the content into smaller chunks, and embeds those chunks as vectors for fast retrieval. When a user asks a question, it retrieves the relevant context, builds a prompt, and returns a concise answer.
-
Clone the repository:
git clone https://github.com/Devparihar5/InfoGenie.git
-
Navigate to the project directory:
cd InfoGenie
-
Create a virtual environment (optional but recommended):
python -m venv venv
-
Activate the virtual environment:
- On Windows:
.\venv\Scripts\activate
- On macOS and Linux:
source venv/bin/activate
-
Install dependencies from requirements.txt:
pip install -r requirements.txt
To use the Research Assistant, follow these steps:
-
Run the main script:
streamlit run main.py
-
Enter your question when prompted.
-
The Research Assistant will process your query, retrieve relevant information from web pages, generate a prompt, and provide you with a concise answer.
-
Continue to ask questions or end the session when you're done.
Gemini Pro serves as the core of the Research Assistant, enabling it to understand and generate human-like text responses based on the provided context and user queries.
The URL Loader component fetches web pages specified by users, allowing the Research Assistant to extract information from diverse online sources.
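As a rough illustration of what a loading step involves, the sketch below fetches a page and reduces its HTML to plain text using only the Python standard library. It is not the project's loader: `TextExtractor`, `html_to_text`, and `fetch_text` are hypothetical names introduced here, and a real loader handles many cases (encodings, boilerplate removal, errors) that this one ignores.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self):
        return " ".join(self._parts)


def html_to_text(html):
    """Return the visible text of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def fetch_text(url, timeout=10):
    """Download a page and return its visible text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return html_to_text(response.read().decode(charset, errors="replace"))
```

Keeping `html_to_text` separate from `fetch_text` lets the parsing logic be exercised on a literal HTML string without any network access.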
The Recursive Text Splitter breaks the fetched web pages into smaller chunks so that each chunk can be embedded and retrieved on its own.
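The idea behind recursive splitting can be sketched in a few lines of plain Python: try the coarsest separator first (paragraph breaks), and fall back to finer ones (lines, words, single characters) only for pieces that are still too long. This is a simplified illustration, not the library's implementation; `recursive_split` and its parameters are hypothetical, and the sketch omits features such as overlap between neighboring chunks.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split text into chunks of at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []

    # No separator left (or the empty separator): cut at fixed positions.
    if not separators or separators[0] == "":
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        # Greedily pack pieces into the current chunk while they fit.
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: recurse with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    # Drop chunks that contain only whitespace.
    return [chunk for chunk in chunks if chunk.strip()]


if __name__ == "__main__":
    sample = "alpha beta\n\ngamma delta epsilon\n\nzeta"
    for chunk in recursive_split(sample, chunk_size=12):
        print(repr(chunk))
```

Splitting on paragraph and line boundaries before falling back to words tends to keep related sentences in the same chunk, which is the main reason to prefer this approach over fixed-width cuts.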
GoogleGenerativeAIEmbeddings transforms the processed text chunks into numerical vectors, enabling the Research Assistant to perform similarity comparisons and retrieve relevant information quickly.
FAISS (Facebook AI Similarity Search) indexes the embedded vectors so that the context most relevant to a query can be retrieved quickly.
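To make the retrieval idea concrete, here is a minimal brute-force sketch of similarity search over embedded chunks. It stands in for what a vector index does conceptually and is not how FAISS works internally; the three-dimensional vectors are made-up toy values, the choice of cosine similarity is an assumption, and `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed_chunks, k=2):
    """Return the k chunk texts whose vectors are most similar to the query.

    indexed_chunks is a list of (chunk_text, vector) pairs.
    """
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in indexed_chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


if __name__ == "__main__":
    # Toy vectors standing in for real embeddings.
    index = [
        ("FAISS indexes vectors", [0.9, 0.1, 0.0]),
        ("Streamlit renders the UI", [0.0, 0.2, 0.9]),
        ("Embeddings map text to vectors", [0.7, 0.6, 0.1]),
    ]
    print(top_k([1.0, 0.2, 0.0], index, k=2))
```

A linear scan like this compares the query against every stored vector, which is fine for a handful of chunks; a dedicated index exists to keep lookups fast as the number of vectors grows.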
The LangChain pipeline orchestrates the entire workflow of the Research Assistant, from loading and processing data to generating prompts and answering user queries.
-
Check User Query: The Research Assistant examines the user's query to understand the information being sought.
-
Load FAISS Index: The indexed vectors containing information from fetched web pages are loaded into memory for quick access.
-
Define Prompt Template: A template for generating prompts based on the user query and context is defined.
-
Retrieve Context: Relevant context and information related to the user query are retrieved from the indexed vectors.
-
Generate Prompt: The Research Assistant fills the prompt template with the user's query and the retrieved context to produce a concise prompt.
-
Provide Answer: Finally, Gemini Pro generates an answer from the prompt, and the Research Assistant returns it to the user.
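The steps above can be sketched as a small function that checks the query, retrieves context, fills a prompt template, and hands the prompt to a model. This is a minimal outline under stated assumptions rather than the project's code: the template wording, `build_prompt`, and `answer_query` are hypothetical, and the retriever and model are passed in as plain callables so the flow can be followed without any external service.

```python
# Hypothetical template; the wording used by the project may differ.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(question, context_chunks):
    """Fill the template with the retrieved chunks and the user's question."""
    context = "\n---\n".join(context_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)


def answer_query(question, retrieve, generate, k=3):
    """Run the query -> context -> prompt -> answer flow.

    retrieve(question, k) returns a list of chunk texts.
    generate(prompt) returns the model's answer as a string.
    """
    question = question.strip()
    if not question:
        raise ValueError("query must not be empty")
    chunks = retrieve(question, k)
    prompt = build_prompt(question, chunks)
    return generate(prompt)


if __name__ == "__main__":
    # Stand-ins for the vector index and the language model.
    def fake_retrieve(question, k):
        return ["FAISS stores the embedded chunks."][:k]

    def fake_generate(prompt):
        return "(model output for a %d-character prompt)" % len(prompt)

    print(answer_query("What stores the chunks?", fake_retrieve, fake_generate))
```

Passing the retriever and the model as arguments keeps the orchestration logic independent of any particular index or model client, which also makes it easy to test with stubs like the ones shown.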