The Challenge: How to share context without interference?
Context Sharing in Action
Agent 1 Finished First
• Creates analysis.py and results.csv
• Saves to permanent workspace
• Snapshot captured
Agent 2 Restarted
• Sees agent1/analysis.py in temp workspace
• Reads & tests the analysis code
• Modifications in temp dir don't affect Agent 1
• Creates improved_analysis.py in own permanent workspace
• Snapshot captured
...
Final Presentation
• Winning agent has full context
• Can reference both agents' work
• Snapshots ensure correct version access
✓ Agent 2 can READ & execute Agent 1's work
✓ Temp modifications don't corrupt original
✓ Each agent maintains workspace integrity
✓ Final answer has complete context
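The isolation described above can be sketched with plain directory copies. This is a minimal illustration, not MassGen's actual implementation: the helper names `snapshot` and `share_into_temp`, the directory layout, and the file contents are all assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot(workspace: Path, snapshot_root: Path, label: str) -> Path:
    """Copy the workspace as it is right now into snapshot_root/label."""
    dest = snapshot_root / label
    shutil.copytree(workspace, dest)
    return dest


def share_into_temp(snapshot_dir: Path, temp_root: Path, owner: str) -> Path:
    """Give another agent a private copy of a snapshot under temp_root/owner."""
    dest = temp_root / owner
    shutil.copytree(snapshot_dir, dest)
    return dest


def demo() -> dict:
    with tempfile.TemporaryDirectory() as root_name:
        root = Path(root_name)
        # Hypothetical layout: one permanent workspace per agent,
        # one temp workspace for Agent 2, one shared snapshot store.
        agent1_ws = root / "agent1_workspace"
        agent2_ws = root / "agent2_workspace"
        agent2_temp = root / "agent2_temp"
        snapshots = root / "snapshots"
        for directory in (agent1_ws, agent2_ws, agent2_temp, snapshots):
            directory.mkdir()

        # Agent 1 finishes first and its workspace is snapshotted.
        (agent1_ws / "analysis.py").write_text("print('v1')\n")
        (agent1_ws / "results.csv").write_text("x,y\n1,2\n")
        agent1_snapshot = snapshot(agent1_ws, snapshots, "agent1")

        # Agent 2 restarts and sees agent1/analysis.py in its temp workspace.
        shared = share_into_temp(agent1_snapshot, agent2_temp, "agent1")
        # Edits land in the temp copy only.
        (shared / "analysis.py").write_text("print('edited by agent 2')\n")
        # Agent 2's own output goes to its permanent workspace.
        (agent2_ws / "improved_analysis.py").write_text("print('v2')\n")
        snapshot(agent2_ws, snapshots, "agent2")

        return {
            "agent1_original": (agent1_ws / "analysis.py").read_text(),
            "agent2_temp_copy": (shared / "analysis.py").read_text(),
            "snapshot_labels": sorted(p.name for p in snapshots.iterdir()),
        }


result = demo()
```

Because Agent 2 only ever touches a copy, Agent 1's `analysis.py` is unchanged after the run, and the final presenter can read either agent's work from the snapshot store.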
Benchmarking: Preliminary Results
Scientific evaluation across graduate-level reasoning, instruction-following, and narrative tasks
GPQA-Diamond (Graduate Physics/Chemistry)
• MassGen: 87.4% (highest)
• Gemini: 85.9%
• Grok-4: 85.4%
• GPT-5: 84.8%
• Claude: 68.2%
IFEval (Instruction Following)
• MassGen: 88.0% (highest)
• GPT-5: 87.4%
• Grok-4: 84.7%
• Gemini: 66.0%
• Claude: 63.6%
MuSR (Narrative Reasoning)
• Gemini: 69.6% (highest)
• GPT-5: 69.2%
• MassGen: 68.3%
• Grok-4: 67.6%
• Claude: 62.8%
Overall Champion
MassGen: 81.2%
Wins 2/3 benchmarks • Statistically significant
Key Results:
• Highest on 2/3 benchmarks
• Best overall average
• Consistent performance
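The 81.2% overall figure and the 2/3 win count can be checked directly from the per-benchmark scores on the slide. The sketch below assumes "overall" means the unweighted mean across the three benchmarks; the slide does not state the weighting, but that assumption reproduces the MassGen figure.

```python
# Scores copied from the benchmark slide (percent).
scores = {
    "MassGen": {"GPQA-Diamond": 87.4, "IFEval": 88.0, "MuSR": 68.3},
    "Gemini": {"GPQA-Diamond": 85.9, "IFEval": 66.0, "MuSR": 69.6},
    "Grok-4": {"GPQA-Diamond": 85.4, "IFEval": 84.7, "MuSR": 67.6},
    "GPT-5": {"GPQA-Diamond": 84.8, "IFEval": 87.4, "MuSR": 69.2},
    "Claude": {"GPQA-Diamond": 68.2, "IFEval": 63.6, "MuSR": 62.8},
}

# Assumption: "overall" is the plain mean over the three benchmarks.
averages = {
    model: sum(per_benchmark.values()) / len(per_benchmark)
    for model, per_benchmark in scores.items()
}

# Models ordered from best to worst overall average.
ranking = sorted(averages, key=averages.get, reverse=True)

# Count how many benchmarks each model tops.
benchmarks = ["GPQA-Diamond", "IFEval", "MuSR"]
wins = {model: 0 for model in scores}
for benchmark in benchmarks:
    best = max(scores, key=lambda model: scores[model][benchmark])
    wins[best] += 1

for model in ranking:
    print(f"{model}: {averages[model]:.1f}% average, {wins[model]} win(s)")
```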
Statistical:
• vs Claude: p = 1.4e-07 ⭐⭐⭐
• vs Gemini: p = 1.1e-28 ⭐⭐⭐
• Differences unlikely to be due to chance
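The slide does not say which test produced these p-values. As an illustration only, an exact McNemar test is one common way to compare two systems answering the same questions. The counts used below are hypothetical and are not MassGen data.

```python
from math import comb


def mcnemar_exact(only_a_correct: int, only_b_correct: int) -> float:
    """Two-sided exact McNemar p-value from the discordant pair counts.

    only_a_correct: questions system A answered correctly and B did not.
    only_b_correct: questions system B answered correctly and A did not.
    Under the null hypothesis each discordant pair is a fair coin flip.
    """
    n = only_a_correct + only_b_correct
    if n == 0:
        return 1.0
    k = min(only_a_correct, only_b_correct)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, for illustration only.
p_value = mcnemar_exact(only_a_correct=40, only_b_correct=10)
print(f"p = {p_value:.2e}")
```

Only the questions where the two systems disagree carry information here, which is why the test takes the two discordant counts rather than the overall accuracies.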
Graduate-level physics question from GPQA-Diamond benchmark
The Problem
A quasar shows a peak at 790 nm wavelength. Given Lambda-CDM cosmological parameters
(H₀ = 70 km/s/Mpc, Ωₘ = 0.3, ΩΛ = 0.7), what is the comoving distance?
Options: A) 8 Gpc B) 7 Gpc C) 6 Gpc D) 9 Gpc
Final Result
✓ Correct Answer: A (8 Gpc)
Orchestration succeeded where individual agents initially failed
Initial Answers
Agent 1: "I calculate ~6 Gpc → Answer C"
Agent 2: "I get ~8.95 Gpc → Answer D"
Agent 3: "~6.1 Gpc → Answer C"
Self-Correction Process
Agent 1 observes: "There is significant discrepancy in calculations:
Agent1 gets ~6.1 Gpc, Agent2 gets ~8.95 Gpc. Let me re-examine..."
Breakthrough Moment
Agent 1 revises: "Standard cosmological calculators yield 8000-8500 Mpc
for z=5.5. This equals 8.0-8.5 Gpc, closest to option A."
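The revised answer can be checked numerically. The sketch below makes two assumptions that the slide does not spell out: the 790 nm peak is the Lyman-alpha line (rest wavelength 121.567 nm), which gives the z ≈ 5.5 that Agent 1 quotes, and radiation is neglected. Since Ωₘ + ΩΛ = 1 the universe is flat, so the comoving distance is (c / H₀) times the integral of 1 / E(z) from 0 to z, with E(z) = sqrt(Ωₘ(1 + z)³ + ΩΛ). Function and constant names are chosen for this example.

```python
import math

SPEED_OF_LIGHT_KM_S = 299_792.458
LYMAN_ALPHA_REST_NM = 121.567  # assumption: the observed peak is Lyman-alpha


def hubble_ratio(z: float, omega_m: float, omega_lambda: float) -> float:
    """E(z) = H(z) / H0 for flat Lambda-CDM, radiation neglected."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_lambda)


def comoving_distance_mpc(
    z: float,
    h0: float = 70.0,
    omega_m: float = 0.3,
    omega_lambda: float = 0.7,
    steps: int = 1000,
) -> float:
    """Comoving distance in Mpc via composite Simpson integration of 1/E(z)."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    width = z / steps
    total = 1.0 / hubble_ratio(0.0, omega_m, omega_lambda)
    total += 1.0 / hubble_ratio(z, omega_m, omega_lambda)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight / hubble_ratio(i * width, omega_m, omega_lambda)
    integral = total * width / 3.0
    return SPEED_OF_LIGHT_KM_S / h0 * integral


# Redshift implied by the observed peak under the Lyman-alpha assumption.
redshift = 790.0 / LYMAN_ALPHA_REST_NM - 1.0
distance_gpc = comoving_distance_mpc(redshift) / 1000.0
print(f"z = {redshift:.2f}, comoving distance = {distance_gpc:.2f} Gpc")
```

With these parameters the result comes out at roughly 8 Gpc, in line with option A and with the calculator range Agent 1 cites.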