Social interaction significantly affects well-being, mental health, and cognition. Yet an estimated 1 in 6 people worldwide lack the social interaction they need, creating crises of social isolation and loneliness. As large language models (LLMs) become more deeply integrated into human lives, their role is shifting from passive tools to active socio-collaborative companions in affective and collaborative settings.
We propose MASCOT, a generalizable multi-agent framework for developing multi-perspective socio-collaborative companions. Unlike previous multi-agent systems optimized solely for task efficiency, MASCOT targets user-agent interaction quality, explicitly balancing individual agent persona consistency with global discourse dynamics. We introduce an efficient bi-level optimization strategy: (1) a Reinforcement-Learning-from-AI-Feedback (RLAIF) pipeline that fine-tunes individual agents for strict Persona Fidelity, and (2) a meta-agent policy guided by group-level rewards to ensure Interaction Synergy.
Extensive experiments show that MASCOT improves on baseline approaches by +14.1 in Persona Consistency and +10.6 in Social Contribution.
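To make the two levels of the optimization concrete, the following is a minimal illustrative sketch, not the MASCOT implementation. It assumes a keyword-overlap score as a stand-in for the AI-feedback persona reward, a participation-coverage score as a stand-in for the group-level reward, and a greedy speaker selector in place of a learned meta-agent policy. All names (`Turn`, `persona_reward`, `group_reward`, `select_next_speaker`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Turn:
    agent: str  # name of the speaking agent
    text: str   # what the agent said


def persona_reward(turn: Turn, persona_keywords: Dict[str, List[str]]) -> float:
    """Agent-level reward (assumed proxy): fraction of the agent's
    persona keywords that appear in its turn."""
    keywords = persona_keywords.get(turn.agent, [])
    if not keywords:
        return 0.0
    text = turn.text.lower()
    hits = sum(1 for k in keywords if k.lower() in text)
    return hits / len(keywords)


def group_reward(turns: List[Turn], agents: List[str]) -> float:
    """Group-level reward (assumed proxy): fraction of agents that
    have contributed at least one turn to the conversation."""
    if not agents:
        return 0.0
    speakers = {t.agent for t in turns}
    return len(speakers & set(agents)) / len(agents)


def select_next_speaker(turns: List[Turn], agents: List[str]) -> str:
    """Greedy stand-in for a meta-agent policy: pick the agent whose
    next (placeholder) turn would maximize the group-level reward.
    Ties are broken by the order of `agents`."""
    def gain(agent: str) -> float:
        return group_reward(turns + [Turn(agent, "")], agents)
    return max(agents, key=gain)


# Usage: two hypothetical personas and a one-turn history.
agents = ["coach", "friend"]
persona_keywords = {"coach": ["goal", "plan"], "friend": ["feel", "listen"]}
history = [Turn("coach", "Let's set a goal for this week.")]

agent_score = persona_reward(history[0], persona_keywords)
group_score = group_reward(history, agents)
next_speaker = select_next_speaker(history, agents)
```

In this toy setup the agent-level score would serve as the training signal for an individual agent, while the group-level score drives who speaks next. A learned policy trained on group-level rewards would replace the greedy rule shown here.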