MassGen
Multi-Agent Threads
Scaling AI Through Multi-Agent Collaboration
🌐 massgen.ai | GitHub
🚫

The Single-Agent Limitation

  • Siloed Thinking: Single models miss diverse perspectives
  • Limited Context: No peer review or validation
  • Sequential Processing: Linear, not parallel exploration
  • Fixed Approach: Limited mid-task adaptation to new information
[Figure: Single Agent Limitation. Caption: From Isolation to Collaboration]
🤝

The Promise of Multi-Agent Collaboration

  • Study Group Dynamics: Like humans collaborating on complex problems
  • Cross-Model Synergy: Leverage unique strengths of Claude, Gemini, GPT, Grok
  • Parallel Processing: Multiple perspectives tackle same task simultaneously
  • Real-time Intelligence Sharing: Agents learn and adapt from each other
[Figure: Cognition and Reasoning Process. Caption: The Promise of Collaborative Reasoning]
📈

Proven Performance Gains - Grok Heavy Evidence

Grok-4 Standard: Single Agent Processing (1 agent)
38.6% Humanity's Last Exam (HLE) score
$30/month

Grok-4 Heavy: Multi-Agent Collaboration (agents A1, A2, A3)
44.4% Humanity's Last Exam (HLE) score
$300/month

+15% relative performance boost (38.6% → 44.4%, or +5.8 percentage points)
Multi-agent "study group" approach outperforms single agent
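
The "+15%" figure is a relative gain, not a difference in percentage points. A minimal sketch of the arithmetic, using only the two benchmark scores quoted above (the variable names are illustrative):

```python
# Scores as quoted on the slide above.
single_agent_score = 38.6   # Grok-4 Standard, percent
multi_agent_score = 44.4    # Grok-4 Heavy, percent

# Absolute gain is measured in percentage points.
absolute_gain = multi_agent_score - single_agent_score

# Relative gain is the absolute gain as a share of the single-agent baseline.
relative_gain = absolute_gain / single_agent_score * 100

print(f"absolute gain: {absolute_gain:.1f} percentage points")
print(f"relative gain: {relative_gain:.1f}%")
```

The absolute gain works out to about 5.8 points, which is roughly 15% of the 38.6% baseline.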

"The exploration of the art & science of multi-agent collaboration has just started."
🚀
MassGen Orchestrator
Task Distribution & Coordination
↓
🏗️ Agent 1: Anthropic/Claude
👨‍💻 Agent 2: Claude Code
🌟 Agent 3: Google/Gemini
🤖 Agent 4: OpenAI/GPT
⚡ Agent 5: xAI/Grok
↕ Real-time Collaboration ↕
↓
🔄 Shared Collaboration Hub
Real-time Notification & Consensus

Key Features & Capabilities

  • 🤝 Cross-Model Synergy: Diverse AI models working together
  • ⚡ Parallel Processing: Simultaneous problem-solving
  • 🔄 Iterative Refinement: Continuous improvement cycles
  • 👥 Real-time Sharing: Live collaboration between agents
  • 🎯 Consensus Building: Natural convergence
[Figure: Iterative Refinement Process. Caption: Iterative Refinement: The Reality of Reasoning]
⚙️

Tech Deep Dive: Async Streaming & Dynamic Scheduling

  • 🔄 Async Streaming: Real-time output streamed from 5+ agents
  • ⚡ Dynamic Scheduling: Smart start/stop coordination
  • 🔁 Graceful Restarts: Seamless task transitions
[Diagram: the Orchestrator receives content, tool_calls, and reasoning streams from Agents 1-4. 🔁 Restart trigger: when Agent 2 provides a new_answer, the other agents switch to "restarting".]
Key Innovation: Dynamic coordination without deadlocks
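
The streaming-and-restart pattern above can be sketched with Python's asyncio. This is a toy illustration under stated assumptions, not MassGen's implementation: the `agent` and `orchestrate` functions, the tuple-shaped chunks, and the rule for which agent proposes an answer are all hypothetical.

```python
import asyncio


async def agent(name, known_answers, queue):
    """Toy agent: streams a reasoning chunk, then ends with new_answer or vote."""
    await queue.put((name, "reasoning", f"sees {len(known_answers)} answer(s)"))
    await asyncio.sleep(0)  # yield to the event loop, standing in for model latency
    # Assumption for this demo: an agent proposes when no answer exists yet,
    # and agent2 additionally proposes one improvement of its own.
    if not known_answers or (name == "agent2" and name not in known_answers):
        await queue.put((name, "new_answer", f"draft from {name}"))
    else:
        await asyncio.sleep(0.01)  # voters are still working when agent2 answers
        latest = list(known_answers)[-1]
        await queue.put((name, "vote", latest))


async def orchestrate(names):
    answers, restarts = {}, 0
    while True:
        queue = asyncio.Queue()
        tasks = [asyncio.create_task(agent(n, dict(answers), queue)) for n in names]
        votes, restart = {}, False
        while len(votes) < len(names):
            name, kind, text = await queue.get()  # chunks arrive as they are produced
            if kind == "new_answer":
                answers[name] = text
                restart = True  # everyone must reconsider with the new answer
                break
            if kind == "vote":
                votes[name] = text
        for task in tasks:  # stop agents that are still running this round
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
        if not restart:
            return answers, votes, restarts
        restarts += 1


answers, votes, restarts = asyncio.run(orchestrate(["agent1", "agent2", "agent3"]))
print(answers, votes, restarts)
```

In this toy run, agent1 answers first, agent2 then supplies a second answer while the others are still deciding, and the round after that ends with every agent voting. Funnelling all chunks through one queue lets the orchestrator react to whichever agent produces output first instead of waiting on each agent in turn.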
🔧

Tech Deep Dive: Backend Abstraction Challenges

  • 🎭 Unified Interface: One protocol for 8+ backends
  • 🛠️ Tool Integration: Search, code, MCP support
  • ⚙️ Format Normalization: Common response protocol
  • 🔀 Provider Workarounds: Handle unique limitations
Backend Challenges:
  • Claude Code CLI: Context sharing across agents
  • Gemini API: Can't mix builtin + custom tools
  • GPT-5: API changes (reasoning, streaming, etc.)
  • Most Backends: Unable to autonomously collaborate
🎯 Our Solution:
Binary Decision Framework (explained on the next slide)
Result: Each backend ~200-300 lines with unique workarounds
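
One way to picture the unified interface and format normalization is an abstract base class whose subclasses translate provider-specific events into a shared chunk type. The sketch below is illustrative only: `StreamChunk`, `Backend`, `DeltaBackend`, `PartsBackend`, and both raw event shapes are hypothetical and do not describe any real provider API or MassGen's actual classes.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterator


@dataclass
class StreamChunk:
    type: str  # "content", "tool_calls", or "reasoning"
    text: str


class Backend(ABC):
    """Common interface: every backend yields normalized StreamChunk objects."""

    @abstractmethod
    def stream(self, prompt: str) -> Iterator[StreamChunk]:
        ...


class DeltaBackend(Backend):
    """Hypothetical provider whose raw events look like {"delta": {...}}."""

    def _raw_events(self, prompt):
        yield {"delta": {"thinking": f"plan a reply to {prompt!r}"}}
        yield {"delta": {"text": "Hello"}}

    def stream(self, prompt):
        for event in self._raw_events(prompt):
            delta = event["delta"]
            if "thinking" in delta:
                yield StreamChunk("reasoning", delta["thinking"])
            else:
                yield StreamChunk("content", delta["text"])


class PartsBackend(Backend):
    """Hypothetical provider whose raw events carry a list of typed parts."""

    KIND_MAP = {"thought": "reasoning", "answer": "content", "call": "tool_calls"}

    def _raw_events(self, prompt):
        yield {"parts": [{"kind": "call", "value": "web_search"},
                         {"kind": "answer", "value": "Hello"}]}

    def stream(self, prompt):
        for event in self._raw_events(prompt):
            for part in event["parts"]:
                yield StreamChunk(self.KIND_MAP[part["kind"]], part["value"])


def chunk_types(backend: Backend, prompt: str):
    """The caller only ever sees StreamChunk, whatever the provider emitted."""
    return [chunk.type for chunk in backend.stream(prompt)]


print(chunk_types(DeltaBackend(), "hi"))
print(chunk_types(PartsBackend(), "hi"))
```

Keeping each provider's quirks inside its own subclass means the orchestrator code never branches on the provider.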
🎯

Tech Deep Dive: Binary Decision Framework Solution

  • ⚖️ Binary Choice: Each agent must choose: vote OR new_answer
  • 💥 Vote Invalidation: Any new_answer invalidates ALL existing votes
  • 🔄 Reset & Restart: All agents restart with updated answer context
  • 🎭 Anonymous Voting: Agents see "agent1", "agent2", etc.
[Diagram: Round 1: Agents 1, 3, and 4 vote for agent4 (Agent 4: answer + vote); Agent 2 has not voted yet. ⚡ Agent 2 then provides a new_answer. Round 2: all agents restart with 2 answers (agents are marked restart_pending, holding either the new or the old answer), and each agent decides: vote OR new_answer. 🔑 Any new_answer resets ALL votes: the Round 1 votes are ❌ INVALID, and new decisions are needed based on the 2 available answers.]
Key Innovation: Dynamic equilibrium through vote invalidation
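
The vote-invalidation rule can be sketched as a small bookkeeping class. This is a minimal illustration, assuming a hypothetical `VoteBoard` class and a most-votes rule for picking the final answer. The slides do not specify how the winner is chosen or how ties are broken.

```python
from collections import Counter


class VoteBoard:
    """Tracks answers and votes; any new answer wipes every existing vote."""

    def __init__(self, agent_ids):
        self.agent_ids = list(agent_ids)
        self.answers = {}  # agent id -> latest answer text
        self.votes = {}    # voter id -> id of the agent whose answer it backs

    def new_answer(self, agent_id, text):
        self.answers[agent_id] = text
        self.votes.clear()  # any new_answer invalidates ALL existing votes

    def vote(self, voter_id, target_id):
        if target_id not in self.answers:
            raise ValueError(f"{target_id} has not provided an answer")
        self.votes[voter_id] = target_id

    def winner(self):
        """Return the most-voted agent id once every agent has voted, else None."""
        if set(self.votes) != set(self.agent_ids):
            return None
        # Assumption: ties go to whichever answer Counter lists first.
        return Counter(self.votes.values()).most_common(1)[0][0]


# Replay the scenario from the slide, using the anonymous ids agents see.
board = VoteBoard(["agent1", "agent2", "agent3", "agent4"])
board.new_answer("agent4", "first answer")
for voter in ("agent1", "agent3", "agent4"):
    board.vote(voter, "agent4")

board.new_answer("agent2", "improved answer")  # Round 1 votes are now invalid
votes_after_new_answer = dict(board.votes)

for voter in board.agent_ids:  # Round 2: every agent decides again
    board.vote(voter, "agent2")
print(votes_after_new_answer, board.winner())
```

Because a vote only counts while no newer answer exists, the board reports a winner only once every agent has voted on the same, final set of answers.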
🚀

MassGen Evolution: v0.0.1 → v0.0.9

🏗️

Foundation Era

v0.0.1 - v0.0.3
Core framework, basic streaming,
Claude, Gemini, GPT/o, Grok

🎯

Expansion Era (LATEST)

v0.0.4 - v0.0.9
Claude Code, GPT-5, MCP
GPT-OSS, Local models, 8+ providers

8 Major Releases • 150+ Commits • 14 Days Foundation→Expansion
🎬

Live Demonstrations

🏆 IMO 2025 Winner Research: Multi-agent fact-checking → unanimous consensus on Google DeepMind victory
Result: Accurate identification despite conflicting information
📝 Creative Writing: Robot-music story refined through collaborative voting → final unanimous approval
Result: Higher-quality narrative through collaborative editing
🌍 Stockholm Travel Guide: Agents combined local knowledge + real-time data → comprehensive October 2025 plan
Result: Detailed recommendations no single agent could provide
💰 Technical Analysis: Complex Grok-4 HLE pricing calculation through iterative refinement
Result: Accurate cost estimates through collaborative validation
📚 case.massgen.ai - Complete Case Studies
⚡

Get Started in 60 Seconds

# 1. Clone and setup
git clone https://github.com/Leezekun/MassGen
cd MassGen && pip install uv && uv venv

# 2. Configure API keys
cp .env.example .env # Add your API keys

# 3. Run single agent (quick test)
uv run python -m massgen.cli --model gemini-2.5-flash "When is your knowledge up to"

# 4. Run multi-agent collaboration
uv run python -m massgen.cli --config three_agents_default.yaml "Summarize latest news of github.com/Leezekun/MassGen"

✅ Supported Models & Providers

🏢 Major Providers:
Anthropic Claude & Claude Code • Google Gemini • OpenAI GPT • xAI Grok • ZAI GLM
🏠 Local & Extended:
Cerebras • Fireworks • Groq • LM Studio • OpenRouter • Together...

🛠️ Advanced Tools

Web Search, Code Execution, File Operations, MCP
🔮

Vision: The Path to Exponential Intelligence

  • Hurdles: Shared memory, context, interoperability
  • Roadmap: More models/agents, web UI
  • Vision: Recursive agents bootstrapping intelligence
[Diagram: The Path to Exponential Intelligence. Agents (Grok, Gemini, Claude, GPT, AG2) → Systems (Grok Heavy, DeepThink, Claude Code, ChatGPT, AG2) → Orchestrator (MassGen), scaling 1×, 10×, 100×. Challenges: Consensus, Shared Context.]
🚀

Join the Multi-Agent Revolution

Build Scalable, Collaborative AI Systems
🚀 v0.0.3→v0.0.9 Evolution | Claude Code + MCP Integration + GPT-5 + Local Models
Thank you DataHack Summit 2025!
Questions & Discussion