Saturday, March 21

Tag: AI

Built an autonomous system where 5 AI models argue about geopolitical crisis outcomes: Here's what I learned about model behavior
News Feed, Reddit

Built an autonomous system where 5 AI models argue about geopolitical crisis outcomes: Here’s what I learned about model behavior

I built a pipeline where 5 AI models (Claude, GPT-4o, Gemini, Grok, DeepSeek) independently assess the probability of 30+ crisis scenarios twice daily. None of them see the others' outputs. An orchestrator synthesizes their reasoning into final projections. Some observations after 15 days of continuous operation: The models frequently disagree, sometimes by 25+ points. Grok tends to run hot on scenarios with OSINT signals. The orchestrator has to resolve these tensions every cycle. The models anchored to their own previous outputs when shown current probabilities, so I made them blind. Named rules in prompts became shortcuts the models cited instead of actually reasoning. Google Search grounding prevented source hallucination but not content hallucination, the model fabricated a $138...
The AI Report