title: web search MCP-server
sdk: gradio
colorFrom: green
colorTo: green
short_description: MCP server for general and custom search on web
sdk_version: 5.34.0
tags:
- mcp-server-track
app_file: app.py
pinned: true
Search Tool
Overview
Search Tool is a modular Python framework for performing advanced web searches, scraping content from search results, and analyzing the retrieved information using AI-powered models. The project is designed for extensibility, allowing easy integration of new search engines, scrapers, and analyzers.
Demo video
Link: https://drive.google.com/file/d/11bHRCr0tdAkCEtwKOiuzzfAp7RgZk-si/view?usp=sharing
Features
- Custom Site Search: Search within a specified list of websites.
- Custom Domain Search: Restrict searches to specific domains (e.g., .edu, .gov).
- General Web Search: Perform open web searches.
- Content Scraping: Extracts main textual content from URLs using trafilatura.
- AI Analysis: Summarizes and analyzes scraped content using OpenAI models.
- Validation: Ensures URLs are valid before processing.
- Extensible Architecture: Easily add new searchers, scrapers, or analyzers.
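As an illustration of how the custom site and custom domain searches might narrow a query, here is a minimal sketch that appends `site:` filters to the search text. It assumes the underlying search engine honours the `site:` operator; the function name `build_restricted_query` is hypothetical and not taken from this project.

```python
def build_restricted_query(query: str, restrictions: list[str]) -> str:
    """Append site: filters so results come only from the given sites or domains."""
    # Ignore blank entries so a stray empty string does not produce "site:".
    cleaned = [r.strip() for r in restrictions if r.strip()]
    if not cleaned:
        return query  # no restriction: equivalent to a general web search
    site_filter = " OR ".join(f"site:{r}" for r in cleaned)
    return f"{query} ({site_filter})"


print(build_restricted_query("climate policy", [".edu", ".gov"]))
# climate policy (site:.edu OR site:.gov)
```

The same helper would cover the custom site case by passing full hostnames instead of domain suffixes.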
Project Structure
search_tool/
├── src/
│   ├── analyzer/        # AI-powered analyzers (e.g., OpenAI)
│   ├── core/
│   │   ├── factory/     # Factories for searcher, scraper, and analyzer
│   │   ├── interface/   # Abstract interfaces for extensibility
│   │   └── types.py     # Enums and constants
│   ├── mcp_servers/     # MCP server integration
│   ├── models/          # Pydantic models for data validation
│   ├── scraper/         # Web scrapers (e.g., Trafilatura)
│   ├── searcher/        # Search engine integrations
│   ├── tools/           # User-facing tool functions
│   └── utils/           # Utility functions (e.g., URL validation)
├── test.py              # Example/test script
├── requirements.txt     # Python dependencies
├── pyproject.toml       # Project metadata and dependencies
├── .env                 # Environment variables (e.g., API keys)
└── README.md            # Project documentation
Installation
Clone the repository:
    git clone https://github.com/ola172/web-search-mcp-server.git
    cd search_tool
Set up a virtual environment (recommended):
    python3 -m venv .venv
    source .venv/bin/activate
Install dependencies:
    pip install -r requirements.txt
Configure environment variables:
- Copy .env.example to .env
- Add your secrets:
Usage
Core Tools
Each tool validates input, performs the search, scrapes the results, and analyzes the content.
- General Web Search: search_on_web
- Custom Sites Search: search_custom_sites
- Custom Domains Search: search_custom_domain
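The validate, search, scrape, analyze flow shared by these tools can be sketched as below. This is not the project's implementation: `is_valid_url` and `run_search_pipeline` are hypothetical names, and the search, scrape, and analyze steps are passed in as plain callables so the sketch runs with stand-ins instead of real network or OpenAI calls.

```python
from typing import Callable, Optional
from urllib.parse import urlparse


def is_valid_url(url: str) -> bool:
    """Accept only absolute http(s) URLs that include a host."""
    try:
        parts = urlparse(url)
    except ValueError:  # e.g. a malformed IPv6 literal
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def run_search_pipeline(
    query: str,
    search: Callable[[str], list[str]],
    scrape: Callable[[str], Optional[str]],
    analyze: Callable[[str, list[str]], str],
) -> str:
    """Validate the query, search, scrape each valid URL, then analyze the texts."""
    if not query.strip():
        raise ValueError("query must not be empty")
    urls = [u for u in search(query) if is_valid_url(u)]
    texts = [t for t in (scrape(u) for u in urls) if t]  # drop failed scrapes
    return analyze(query, texts)


# Stand-in steps (assumptions for the demo, not real integrations).
def fake_search(query: str) -> list[str]:
    return ["https://example.com/a", "not-a-url"]


def fake_scrape(url: str) -> Optional[str]:
    return f"text from {url}"


def fake_analyze(query: str, texts: list[str]) -> str:
    return f"{len(texts)} document(s) for '{query}'"


result = run_search_pipeline("mcp", fake_search, fake_scrape, fake_analyze)
print(result)  # 1 document(s) for 'mcp'
```

Passing the steps in as callables mirrors the modular design described in the overview: swapping a searcher or scraper does not change the pipeline itself.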
MCP Server Integration
The project includes an MCP server (web_search_server.py) that exposes the search tools as MCP tools.
Extending the Framework
- Add a new searcher: Implement the SearchInterface and register it in SearcherFactory.
- Add a new scraper: Implement the ScraperInterface and register it in ScraperFactory.
- Add a new analyzer: Implement the AnalyzerInterface and register it in AnalyzerFactory.
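The implement-and-register pattern might look roughly like the following. Only the names `SearchInterface` and `SearcherFactory` come from this README; the method signatures (`search`, `register`, `create`) and the `StaticSearcher` class are assumptions made for illustration, so check the real interfaces under src/core/ before relying on them.

```python
from abc import ABC, abstractmethod


class SearchInterface(ABC):
    """Assumed shape of the searcher interface: one search method."""

    @abstractmethod
    def search(self, query: str, max_results: int = 5) -> list[str]:
        """Return a list of result URLs for the query."""


class SearcherFactory:
    """Assumed registry-style factory mapping names to searcher classes."""

    _registry: dict[str, type[SearchInterface]] = {}

    @classmethod
    def register(cls, name: str, searcher_cls: type[SearchInterface]) -> None:
        cls._registry[name] = searcher_cls

    @classmethod
    def create(cls, name: str) -> SearchInterface:
        searcher_cls = cls._registry.get(name)
        if searcher_cls is None:
            raise ValueError(f"unknown searcher: {name!r}")
        return searcher_cls()


class StaticSearcher(SearchInterface):
    """Toy searcher that returns canned URLs (hypothetical example)."""

    def search(self, query: str, max_results: int = 5) -> list[str]:
        urls = ["https://example.com/1", "https://example.com/2"]
        return urls[:max_results]


SearcherFactory.register("static", StaticSearcher)
searcher = SearcherFactory.create("static")
print(searcher.search("anything", max_results=1))  # ['https://example.com/1']
```

Scrapers and analyzers would follow the same two steps against their own interface and factory.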
Configuration
- API Keys: Store sensitive keys (e.g., OpenAI) in the .env file.
- Search Engine IDs: For Google Custom Search, configure API_KEY and SEARCH_ENGINE_ID in the relevant modules.
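One way to read these settings without hard-coding them is sketched below. It assumes `API_KEY` and `SEARCH_ENGINE_ID` are exposed as environment variables under exactly those names; `SearchConfig` and `load_search_config` are hypothetical helpers, not part of this project. With python-dotenv installed, calling `load_dotenv()` first would copy the values from .env into the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SearchConfig:
    api_key: str
    search_engine_id: str


def load_search_config(env: Optional[Mapping[str, str]] = None) -> SearchConfig:
    """Read the Google Custom Search settings, failing early if any is missing."""
    env = os.environ if env is None else env
    required = ("API_KEY", "SEARCH_ENGINE_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return SearchConfig(api_key=env["API_KEY"], search_engine_id=env["SEARCH_ENGINE_ID"])


# Demo with an explicit mapping so the sketch runs without a real .env file.
config = load_search_config({"API_KEY": "demo-key", "SEARCH_ENGINE_ID": "demo-cx"})
print(config.search_engine_id)  # demo-cx
```

Failing at load time keeps a missing key from surfacing later as an opaque API error.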
Dependencies
- openai
- trafilatura
- pydantic
- googlesearch-python
- python-dotenv
- google-api-python-client
See requirements.txt for the full list.
License
This project is for educational and research purposes. Please ensure compliance with the terms of service of any third-party APIs used.
Acknowledgements
- OpenAI
- Trafilatura
- Google Custom Search
For questions or contributions, please open an issue or pull request.