name: socialgrid-embodied-multi-agent
description: 'Research paper: SocialGrid - A Benchmark for Planning and Social Reasoning in Embodied Multi-Agent Systems. Among Us-inspired environment for social reasoning evaluation.'
metadata:
  openclaw:
    emoji: "🎮"
    tags: ["research", "arxiv", "multi-agent", "social-reasoning", "embodied", "benchmark", "planning"]
SocialGrid: A Benchmark for Planning and Social Reasoning in Embodied Multi-Agent Systems
arXiv ID: 2604.16168
Published: 2026-04-18
Authors: Hikaru Shindo, Hanzhao Lin, Lukas Helff
Categories: cs.AI, cs.LG, cs.MA
Utility Score: 0.99
Abstract
As Large Language Models (LLMs) transition from text processors to autonomous agents, evaluating their social reasoning in embodied multi-agent settings becomes critical. We introduce SocialGrid, an embodied multi-agent environment inspired by Among Us that requires agents to perform complex social reasoning.
Key Contributions
- Social Reasoning Benchmark: A benchmark for planning and social reasoning in embodied multi-agent systems
- Among Us Environment: Game-inspired setting requiring deception detection and trust evaluation
- Embodied Multi-Agent: Combines physical embodiment with social interaction
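To make the combination of embodiment and social interaction concrete, the following is a toy sketch only. It is not SocialGrid's API or rule set, which this note does not describe. Every name here (`Agent`, `visible_agents`, `tally_votes`) and every mechanic (Manhattan-distance visibility, plurality voting) is an illustrative assumption about how an Among Us-style grid game could be modelled.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Agent:
    # Hypothetical agent record; not taken from the paper.
    name: str
    pos: tuple[int, int]        # (row, col) position on the grid
    is_impostor: bool = False   # hidden role, never exposed to other agents


def manhattan(a, b):
    """Grid distance between two (row, col) positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def visible_agents(observer, agents, radius=2):
    """Names of the other agents within `radius` grid steps of the observer.

    Assumed visibility rule: embodiment limits what each agent can witness.
    """
    return sorted(
        other.name
        for other in agents
        if other.name != observer.name
        and manhattan(observer.pos, other.pos) <= radius
    )


def tally_votes(votes):
    """Plurality vote over accusations.

    `votes` maps voter name -> accused name. Returns the most-accused name,
    or None when the ballot is empty or the top two counts are tied.
    """
    top = Counter(votes.values()).most_common(2)
    if not top:
        return None
    if len(top) == 2 and top[0][1] == top[1][1]:
        return None
    return top[0][0]


agents = [
    Agent("red", (0, 0)),
    Agent("blue", (0, 1), is_impostor=True),
    Agent("green", (4, 4)),
]

# Red can only testify about agents it was physically close to.
seen_by_red = visible_agents(agents[0], agents)

# Each agent accuses one other agent; the impostor deflects onto green.
votes = {"red": "blue", "green": "blue", "blue": "green"}
ejected = tally_votes(votes)
```

In this sketch the social reasoning problem arises because `is_impostor` is hidden: agents must decide whom to trust from partial, position-dependent observations and from what others claim during the vote.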
Relevance to AI Agent Systems
- Multi-Agent Systems: Social dynamics in multi-agent environments
- LLM Agents: Evaluating LLMs as social agents
- Reasoning: Social and strategic reasoning capabilities
- Planning: Long-horizon planning in social contexts
Technical Keywords
multi-agent, llm agent, reasoning, planning, benchmark, evaluation, autonomous
URL
http://arxiv.org/abs/2604.16168
Tracked: 2026-04-20
Source: arXiv Paper Tracker