MCP Integration
================

MassGen supports the Model Context Protocol (MCP) for standardized tool integration. MCP enables agents to use external tools through a unified interface.

What is MCP?
------------

From the official documentation:

   MCP is an open protocol that standardizes how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect your devices to various peripherals and accessories, MCP provides a standardized way to connect AI models to different data sources and tools.

MassGen integrates MCP through YAML configuration, allowing agents to access tools like:

* Web search (Brave, Google)
* Weather services
* File operations
* Browser automation (Playwright)
* Discord, Twitter, Notion APIs
* And many more MCP servers

Quick Start
-----------

**Single MCP tool (weather):**

.. code-block:: bash

   massgen \
     --config @examples/tools/mcp/gpt5_nano_mcp_example.yaml \
     "What's the weather forecast for New York this week?"

**Multiple MCP tools:**

.. code-block:: bash

   massgen \
     --config @examples/tools/mcp/multimcp_gemini.yaml \
     "Find the best restaurants in Paris and save the recommendations to a file"

Backend Support
---------------

MCP integration is available for most MassGen backends. For the complete backend capabilities matrix including MCP support status, see :doc:`../backends`.

**Backends with MCP Support:**

* ✅ Claude API - Full MCP integration
* ✅ Claude Code - Native MCP + file tools
* ✅ Gemini API - Full MCP integration with planning mode
* ✅ Grok API - Full MCP integration
* ✅ OpenAI API - Full MCP integration
* ✅ Z AI - MCP integration available
* ❌ Azure OpenAI - Not yet supported

See :doc:`../backends` for detailed backend capabilities and feature comparison.

Configuration
-------------

Basic MCP Setup
~~~~~~~~~~~~~~~

Add MCP servers to your agent's backend configuration:

.. code-block:: yaml

   agents:
     - id: "agent_with_mcp"
       backend:
         type: "openai"              # Your backend choice
         model: "gpt-5-mini"         # Your model choice

         # Add MCP servers here
         mcp_servers:
           - name: "weather"         # Server name (you choose this)
             type: "stdio"           # Communication type
             command: "npx"          # Command to run
             args: ["-y", "@modelcontextprotocol/server-weather"]

That's it! The agent can now check weather.

MCP Transport Types
~~~~~~~~~~~~~~~~~~~

**stdio (Standard Input/Output)**

Most MCP servers use stdio transport:

.. code-block:: yaml

   mcp_servers:
     - name: "weather"
       type: "stdio"                # stdio transport
       command: "npx"               # Command to launch server
       args: ["-y", "@modelcontextprotocol/server-weather"]

**streamable-http (HTTP/SSE)**

Some MCP servers use HTTP with Server-Sent Events:

.. code-block:: yaml

   mcp_servers:
     - name: "custom_api"
       type: "streamable-http"      # HTTP transport
       url: "http://localhost:8080/mcp/sse"

Configuration Parameters
~~~~~~~~~~~~~~~~~~~~~~~~

.. list-table::
   :header-rows: 1
   :widths: 25 15 60

   * - Parameter
     - Required
     - Description
   * - ``name``
     - Yes
     - Unique name for the MCP server
   * - ``type``
     - Yes
     - Transport: ``"stdio"`` or ``"streamable-http"``
   * - ``command``
     - stdio only
     - Command to run the MCP server
   * - ``args``
     - stdio only
     - Arguments for the command
   * - ``url``
     - http only
     - Server endpoint URL
   * - ``env``
     - No
     - Environment variables to pass

Variable Substitution
~~~~~~~~~~~~~~~~~~~~~

MassGen supports variable substitution in MCP configurations:

**Built-in Variables:**

* ``${cwd}`` - Replaced with the agent's working directory (from ``backend.cwd``)
* Works anywhere in the backend config (``args``, ``env``, etc.)

**Environment Variables:**

* Use ``${VARIABLE_NAME}`` syntax (must be UPPERCASE)
* Resolved from your ``.env`` file or system environment
* Work in both ``args`` and ``env`` parameters

.. code-block:: yaml

   mcp_servers:
     - name: "playwright"
       type: "stdio"
       command: "npx"
       args:
         - "@playwright/mcp@latest"
         - "--output-dir=${cwd}"                # Built-in: agent's working directory
         - "--user-data-dir=${cwd}/profile"
       env:
         API_KEY: "${API_KEY}"                  # Environment variable from .env file

**Important:**

* ``${cwd}`` is lowercase and refers to the agent's working directory
* Environment variables must be UPPERCASE (e.g., ``${API_KEY}``, ``${BRAVE_API_KEY}``)
* Both systems work together but are resolved separately

Recommended MCP Servers (Registry)
-----------------------------------

MassGen includes a curated registry of recommended MCP servers that are automatically available when auto-discovery is enabled. These servers have been tested and provide essential capabilities for agent workflows.

**Registry Servers:**

* **Context7** - Up-to-date code documentation for libraries and frameworks
* **Brave Search** - Web search via Brave API (requires API key)

See :doc:`../../reference/mcp_server_registry` for complete documentation of all registry servers, including configuration examples, API key setup, and usage patterns.

Auto-Discovery
~~~~~~~~~~~~~~

Enable automatic inclusion of registry MCP servers:

.. code-block:: yaml

   agents:
     - id: "research_agent"
       backend:
         type: "gemini"
         model: "gemini-2.5-flash"
         auto_discover_custom_tools: true  # Automatically adds registry servers!

**Behavior:**

* **Context7**: Always included (no API key required)
* **Brave Search**: Only included if ``BRAVE_API_KEY`` is set in ``.env``

**Log Output Example:**

.. code-block:: text

   [gemini] Auto-discovery enabled: Added MCP servers from registry: context7
   [gemini] Registry servers not added (missing API keys): brave_search (needs BRAVE_API_KEY)

**Benefits:**

* No manual configuration needed for recommended servers
* Servers are only included if API keys are available
* Avoids duplicates if you manually configure a registry server
* Easy to get started with powerful tools

**Example Configurations:**

* ``massgen/configs/tools/mcp/auto_discovery_with_registry.yaml`` - Auto-discovery example
* ``massgen/configs/tools/mcp/context7_documentation_example.yaml`` - Context7 usage
* ``massgen/configs/tools/mcp/brave_search_example.yaml`` - Brave Search usage

Manual Configuration
~~~~~~~~~~~~~~~~~~~~

You can still manually configure any registry server without auto-discovery:

.. code-block:: yaml

   agents:
     - id: "my_agent"
       backend:
         type: "claude"
         model: "claude-sonnet-4"
         mcp_servers:
           - name: "context7"
             type: "stdio"
             command: "npx"
             args: ["-y", "@upstash/context7-mcp"]

This gives you full control over which servers to include and their configuration.

Common MCP Servers
------------------

Weather
~~~~~~~

.. code-block:: yaml

   mcp_servers:
     - name: "weather"
       type: "stdio"
       command: "npx"
       args: ["-y", "@modelcontextprotocol/server-weather"]

Web Search (Brave)
~~~~~~~~~~~~~~~~~~

Requires ``BRAVE_API_KEY`` in your ``.env`` file:

.. code-block:: yaml

   mcp_servers:
     - name: "search"
       type: "stdio"
       command: "npx"
       args: ["-y", "@modelcontextprotocol/server-brave-search"]
       env:
         BRAVE_API_KEY: "${BRAVE_API_KEY}"

Playwright (Browser Automation)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Enables browser automation with screenshot and PDF capabilities:

.. code-block:: yaml

   mcp_servers:
     playwright:
       type: "stdio"
       command: "npx"
       args:
         - "@playwright/mcp@latest"
         - "--browser=chrome"              # Browser choice (chrome, firefox, webkit)
         - "--caps=vision,pdf"             # Enable vision and PDF capabilities
         - "--output-dir=${cwd}"           # Save screenshots/PDFs to workspace
         - "--user-data-dir=${cwd}/playwright-profile"  # Persistent browser profile

**Advanced Options:**

* ``--browser`` - Browser to use: ``chrome``, ``firefox``, or ``webkit``
* ``--caps`` - Capabilities: ``vision`` (screenshots), ``pdf`` (PDF generation)
* ``--output-dir`` - Directory for saving screenshots and PDFs
* ``--user-data-dir`` - Persistent browser profile directory
* ``--save-trace`` - Save Playwright traces for debugging (uncomment to enable)

Discord
~~~~~~~

Requires Discord bot token. See `Discord MCP Setup Guide <https://github.com/Leezekun/MassGen/blob/main/massgen/configs/docs/DISCORD_MCP_SETUP.md>`_:

.. code-block:: yaml

   mcp_servers:
     - name: "discord"
       type: "stdio"
       command: "npx"
       args: ["-y", "@modelcontextprotocol/server-discord"]
       env:
         DISCORD_BOT_TOKEN: "${DISCORD_BOT_TOKEN}"

Twitter
~~~~~~~

Requires Twitter API credentials. See `Twitter MCP Setup Guide <https://github.com/Leezekun/MassGen/blob/main/massgen/configs/docs/TWITTER_MCP_ENESCINAR_SETUP.md>`_:

.. code-block:: yaml

   mcp_servers:
     - name: "twitter"
       type: "stdio"
       command: "npx"
       args: ["-y", "mcp-server-twitter-unofficial"]
       env:
         TWITTER_USERNAME: "${TWITTER_USERNAME}"
         TWITTER_PASSWORD: "${TWITTER_PASSWORD}"

Multiple MCP Servers
--------------------

Agents can use multiple MCP servers simultaneously:

.. code-block:: yaml

   agents:
     - id: "multi_tool_agent"
       backend:
         type: "gemini"
         model: "gemini-2.5-flash"
         mcp_servers:
           # Web search
           - name: "search"
             type: "stdio"
             command: "npx"
             args: ["-y", "@modelcontextprotocol/server-brave-search"]
             env:
               BRAVE_API_KEY: "${BRAVE_API_KEY}"

           # Weather data
           - name: "weather"
             type: "stdio"
             command: "npx"
             args: ["-y", "@modelcontextprotocol/server-weather"]

The agent can use all tools together. For example: "Search for weather apps and check the weather in Paris"

.. note::
   **File operations** are handled automatically via the ``cwd`` parameter in your backend configuration. You don't need to add a filesystem MCP server manually.

Tool Filtering
--------------

Control which MCP tools are available to agents.

Backend-Level Filtering
~~~~~~~~~~~~~~~~~~~~~~~

Exclude specific tools at the backend level:

.. code-block:: yaml

   backend:
     type: "openai"
     model: "gpt-4o-mini"
     exclude_tools:
       - mcp__discord__discord_send_webhook_message  # Exclude dangerous tools
     mcp_servers:
       - name: "discord"
         type: "stdio"
         command: "npx"
         args: ["-y", "@modelcontextprotocol/server-discord"]

MCP-Server-Specific Filtering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Override with allowed tools per MCP server:

.. code-block:: yaml

   backend:
     type: "openai"
     model: "gpt-4o-mini"
     mcp_servers:
       - name: "discord"
         type: "stdio"
         command: "npx"
         args: ["-y", "@modelcontextprotocol/server-discord"]
         allowed_tools:  # Whitelist specific tools
           - mcp__discord__discord_read_messages
           - mcp__discord__discord_send_message

Merged Exclusions
~~~~~~~~~~~~~~~~~

``exclude_tools`` from both backend and MCP server configs are combined:

.. code-block:: yaml

   backend:
     exclude_tools:
       - mcp__discord__send_webhook  # Backend-level exclusion
     mcp_servers:
       - name: "discord"
         exclude_tools:
           - mcp__discord__delete_channel  # MCP-level exclusion
         # Both tools are excluded

MCP Planning Mode
-----------------

**NEW in v0.1.2**: Intelligent LLM-based tool filtering automatically detects and blocks irreversible operations during coordination.

Planning mode prevents irreversible actions during multi-agent coordination by intelligently analyzing your question and blocking tools with side effects.

How It Works
~~~~~~~~~~~~

**Without planning mode:**

1. All agents execute MCP tools during coordination
2. Risk of duplicate or premature actions
3. Example: Multiple agents posting to Discord

**With planning mode (v0.1.2):**

1. **LLM Analysis**: Question is analyzed to detect irreversible operations
2. **Automatic Blocking**: Tools with side effects are automatically blocked during coordination
3. **Coordination**: Agents plan and discuss with read-only tools available
4. **Execution**: Winning agent executes the plan with full tool access

**Example Analysis Output:**

.. code-block:: text

   ╭─ Coordination Mode ────────────────────────────────────────╮
   │ 🧠 Planning Mode: ENABLED                                  │
   │                                                            │
   │ Agents will plan and coordinate without executing         │
   │ irreversible actions. The winning agent will implement    │
   │ the plan during final presentation.                       │
   │                                                            │
   │ 🚫 Blocked Tools:                                          │
   │   1. mcp__discord__discord_send_message                    │
   │                                                            │
   │ 📊 Analysis:                                               │
   │   Post a summary of recent AI discussions to Discord      │
   ╰────────────────────────────────────────────────────────────╯

The LLM identifies which tools have irreversible side effects (like sending messages) and blocks them during coordination, while keeping read-only tools (like reading messages) available.

Configuration
~~~~~~~~~~~~~

Enable planning mode in orchestrator config - the LLM analysis happens automatically:

.. code-block:: yaml

   orchestrator:
     coordination:
       enable_planning_mode: true
       planning_mode_instruction: |
         PLANNING MODE ACTIVE: You are currently in the coordination phase.
         During this phase:
         1. Describe your intended actions and reasoning
         2. Analyze other agents' proposals
         3. Use only 'vote' or 'new_answer' tools for coordination
         4. Read-only tools are available, but write operations are blocked
         5. Save execution for final presentation phase

When ``enable_planning_mode: true`` is set:

1. **Automatic Analysis**: An LLM analyzes your question before coordination starts
2. **Smart Blocking**: Only tools with irreversible side effects are blocked
3. **Read-Only Access**: Agents can still use read tools (e.g., ``discord_get_messages``)
4. **Visual Feedback**: A UI box shows what's blocked and why

No manual tool filtering needed - the system intelligently determines what to block based on your specific question.

Example Configuration
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: yaml

   agents:
     - id: "gemini_discord_agent"
       backend:
         type: "gemini"
         model: "gemini-2.5-flash"
         mcp_servers:
           - name: "discord"
             type: "stdio"
             command: "npx"
             args: ["-y", "mcp-discord"]
             env:
               DISCORD_TOKEN: "${DISCORD_TOKEN}"
             security:
               level: "high"

     - id: "openai_discord_agent"
       backend:
         type: "openai"
         model: "gpt-4o-mini"
         mcp_servers:
           - name: "discord"
             type: "stdio"
             command: "npx"
             args: ["-y", "mcp-discord"]
             env:
               DISCORD_TOKEN: "${DISCORD_TOKEN}"

   orchestrator:
     snapshot_storage: "snapshots"
     agent_temporary_workspace: "temp_workspaces"
     coordination:
       enable_planning_mode: true
       planning_mode_instruction: |
         PLANNING MODE ACTIVE: Coordination phase - plan only.
         Read-only operations are allowed (reading messages, files).
         DO NOT execute write operations - those are blocked.

Usage
~~~~~

.. code-block:: bash

   # Five agents with planning mode (no execution during coordination)
   massgen \
     --config @examples/tools/planning/five_agents_filesystem_mcp_planning_mode.yaml \
     "Create a comprehensive project structure with documentation"

**What happens:**

1. **Coordination phase** → Agents discuss and plan file structure
2. **Voting** → Agents vote for best plan
3. **Final presentation** → Winning agent **executes** the plan

Multi-Backend Support
~~~~~~~~~~~~~~~~~~~~~

Planning mode works across:

* Response API (Claude)
* Chat Completions (OpenAI, Grok, etc.)
* Gemini with session-based tool execution

Complete Example
----------------

Full configuration with multiple MCP servers and planning mode:

.. code-block:: yaml

   agents:
     - id: "research_agent"
       backend:
         type: "gemini"
         model: "gemini-2.5-flash"
         mcp_servers:
           # Web search
           - name: "search"
             type: "stdio"
             command: "npx"
             args: ["-y", "@modelcontextprotocol/server-brave-search"]
             env:
               BRAVE_API_KEY: "${BRAVE_API_KEY}"
             allowed_tools:
               - mcp__search__brave_web_search

           # Weather
           - name: "weather"
             type: "stdio"
             command: "npx"
             args: ["-y", "@modelcontextprotocol/server-weather"]

     - id: "analyst_agent"
       backend:
         type: "openai"
         model: "gpt-5-nano"
         # File operations handled via cwd parameter

   orchestrator:
     coordination:
       enable_planning_mode: true
       planning_mode_instruction: |
         PLANNING MODE: Describe your intended tool usage.
         Do not execute tools during coordination.

   ui:
     display_type: "rich_terminal"
     logging_enabled: true

Security Considerations
-----------------------

1. **Tool Filtering** - Use ``allowed_tools`` and ``exclude_tools`` to limit capabilities
2. **Planning Mode** - Enable for tasks with irreversible actions
3. **Environment Variables** - Store API keys in ``.env``, never in config files
4. **Path Restrictions** - Limit filesystem server to specific directories
5. **Review Permissions** - Check what each MCP server can do before enabling

Troubleshooting
---------------

**MCP server not found:**

Ensure the MCP server package is installed:

.. code-block:: bash

   npx -y @modelcontextprotocol/server-weather

**Tools not appearing:**

* Check backend MCP support (see table above)
* Verify ``mcp_servers`` configuration
* Check for tool filtering (``allowed_tools``, ``exclude_tools``)

**Environment variables not working:**

.. code-block:: bash

   # Set in .env file
   BRAVE_API_KEY=your_key_here

   # Reference in config
   env:
     BRAVE_API_KEY: "${BRAVE_API_KEY}"

**Planning mode not working:**

* Check ``enable_planning_mode: true`` in orchestrator config
* Look for the UI box showing analysis results at the start of coordination
* If the box says "Planning Mode: DISABLED", the LLM didn't detect irreversible operations
* Review logs to see what tools the LLM identified as blocked

**Planning mode blocking too many/few tools:**

* The LLM automatically analyzes your question to determine what to block
* If too restrictive: Rephrase your question to emphasize read-only operations
* If not restrictive enough: Make your question more explicit about write operations
* The analysis UI box shows exactly what was blocked and why

**Want to see the analysis:**

The UI box appears automatically before coordination starts when planning mode is enabled.

Next Steps
----------

* :doc:`../files/file_operations` - Filesystem MCP integration
* :doc:`../files/project_integration` - Using MCP with context paths
* :doc:`../sessions/multi_turn_mode` - MCP in interactive sessions
* :doc:`../../quickstart/running-massgen` - More examples
* `MCP Server Registry <https://github.com/modelcontextprotocol/servers>`_ - Browse available MCP servers