A tool that bridges the gap between LLMs and the internet.
WebWeaver determines whether a query requires internet access. If it does, WebWeaver searches the web with SearxNG or DuckDuckGo, scrapes and parses the results, and stores the data in a vector database so the LLM can give better-informed responses.
Local LLMs are increasingly popular, but without real-time internet access they are limited to the knowledge they were trained on. WebWeaver addresses this by letting an LLM fetch up-to-date information, which improves the accuracy and usefulness of its answers to user queries.
- Query Analysis: The tool first evaluates whether internet access is required for the given query.
- Web Search: If needed, it runs a search using one of two backends:
  - SearxNG: A self-hosted, privacy-focused search engine.
  - DuckDuckGo API: A search engine with strong privacy policies.
- Web Scraping: Scrapes and parses the search results for relevant data.
- Data Storage: Stores scraped data in a vector database for future use.
- LLM Integration: The local LLM uses the stored data to generate more accurate and informed responses.
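The flow described in the list above can be sketched in a few lines of Python. This is only an illustration of the idea, not WebWeaver's actual code: every name here (`needs_internet`, `VectorStore`, `build_context`, `fake_search`) is hypothetical, the keyword heuristic stands in for whatever query analysis the tool really performs, and a bag-of-words counter stands in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical cue words suggesting a query needs fresh information.
TIME_SENSITIVE = {"today", "latest", "current", "news", "now", "price", "weather"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def needs_internet(query: str) -> bool:
    """Toy query analysis: flag queries that contain a time-sensitive cue word."""
    return any(tok in TIME_SENSITIVE for tok in tokenize(query))


def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words count vector."""
    return Counter(tokenize(text))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Minimal in-memory store that ranks documents by cosine similarity."""

    def __init__(self) -> None:
        self._items: List[Tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        self._items.append((embed(text), text))

    def top_k(self, query: str, k: int = 2) -> List[str]:
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_context(
    query: str,
    search: Callable[[str], List[str]],
    store: VectorStore,
    k: int = 2,
) -> List[str]:
    """Return passages to hand to the LLM, or an empty list if no search is needed."""
    if not needs_internet(query):
        return []
    for page_text in search(query):  # search + scrape, supplied by the caller
        store.add(page_text)
    return store.top_k(query, k)


def fake_search(query: str) -> List[str]:
    """Placeholder for a real search-and-scrape step; returns made-up page text."""
    return [
        "The latest release of ExampleLib is version 9.",
        "Bananas are rich in potassium.",
    ]


print(build_context("What is the latest release of ExampleLib?", fake_search, VectorStore(), k=1))
```

Passing the search step in as a function keeps the sketch independent of the backend, so a SearxNG or DuckDuckGo client could be swapped in without touching the rest.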
First, clone the WebWeaver repository and install the necessary dependencies:
git clone https://github.com/agkavin/WebWeaver.git
cd WebWeaver
pip install -r requirements.txt
- Clone the SearxNG Docker repository:
  git clone https://github.com/searxng/searxng-docker.git
  cd searxng-docker/
- Generate a secret key:
  openssl rand -hex 32
- Add the generated key to the secret_key placeholder in the searxng/settings.yml file.
- Run the following command to start the SearxNG server:
  docker compose up -d
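Once the container is running, the instance can be queried over HTTP. The sketch below is one way to do that from Python using only the standard library; it is not taken from WebWeaver itself. It assumes SearxNG is reachable at `http://localhost:8080` (adjust `SEARXNG_URL` to match your deployment) and that JSON output has been enabled under `search.formats` in searxng/settings.yml, since an instance that only allows HTML typically rejects `format=json` requests. The helper names are illustrative.

```python
import json
from typing import Dict, List
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: address of the local SearxNG instance; change it for your setup.
SEARXNG_URL = "http://localhost:8080"


def build_search_url(query: str, base_url: str = SEARXNG_URL) -> str:
    """Build a /search URL that asks SearxNG for JSON results."""
    params = urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def parse_results(payload: str, limit: int = 5) -> List[Dict[str, str]]:
    """Extract title, URL, and snippet from a SearxNG JSON response body."""
    data = json.loads(payload)
    return [
        {
            "title": item.get("title", ""),
            "url": item.get("url", ""),
            "snippet": item.get("content", ""),
        }
        for item in data.get("results", [])[:limit]
    ]


def search(query: str, base_url: str = SEARXNG_URL, timeout: float = 10.0) -> List[Dict[str, str]]:
    """Query the instance and return parsed results (requires a running server)."""
    with urlopen(build_search_url(query, base_url), timeout=timeout) as resp:
        return parse_results(resp.read().decode("utf-8"))
```

Calling `search("local llm")` against a running instance returns a list of dictionaries whose `url` values can then be passed to the scraping step.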