socialgrid-embodied-multi-agent


Research paper: SocialGrid - A Benchmark for Planning and Social Reasoning in Embodied Multi-Agent Systems. Among Us-inspired environment for social reasoning evaluation.

By hiyenwong. Updated 6/3/2026

name: socialgrid-embodied-multi-agent
description: 'Research paper: SocialGrid - A Benchmark for Planning and Social Reasoning in Embodied Multi-Agent Systems. Among Us-inspired environment for social reasoning evaluation.'
metadata:
  openclaw:
    emoji: "🎮"
    tags: ["research", "arxiv", "multi-agent", "social-reasoning", "embodied", "benchmark", "planning"]


SocialGrid: A Benchmark for Planning and Social Reasoning in Embodied Multi-Agent Systems

arXiv ID: 2604.16168
Published: 2026-04-18
Authors: Hikaru Shindo, Hanzhao Lin, Lukas Helff
Categories: cs.AI, cs.LG, cs.MA
Utility Score: 0.99

Abstract

As Large Language Models (LLMs) transition from text processors to autonomous agents, evaluating their social reasoning in embodied multi-agent settings becomes critical. We introduce SocialGrid, an embodied multi-agent environment inspired by Among Us that requires agents to perform complex social reasoning.

Key Contributions

  1. Social Reasoning Benchmark: First benchmark for social reasoning in embodied multi-agent systems
  2. Among Us Environment: Game-inspired setting requiring deception detection and trust evaluation
  3. Embodied Multi-Agent: Combines physical embodiment with social interaction

Relevance to AI Agent Systems

  • Multi-Agent Systems: Social dynamics in multi-agent environments
  • LLM Agents: Evaluating LLMs as social agents
  • Reasoning: Social and strategic reasoning capabilities
  • Planning: Long-horizon planning in social contexts

Technical Keywords

multi-agent, llm agent, reasoning, planning, benchmark, evaluation, autonomous

URL

http://arxiv.org/abs/2604.16168


Tracked: 2026-04-20
Source: arXiv Paper Tracker

Install via CLI
npx skills add https://github.com/hiyenwong/ai_collection --skill socialgrid-embodied-multi-agent
Repository Details
Stars: 1
Forks: 0
Branch: main
Path: SKILL.md