Corporate reputation teams rely on media monitoring and qualitative research, both limited in speed and coverage when digital narratives form rapidly. This paper proposes SAPIENT (Sentinel-Augmented Population Intelligence for Emerging Narrative Tracking), a multi-agent system that links a sentinel layer over public text streams with a simulation layer that runs moderated, repeatable in silico focus-group sessions. The sentinel layer ingests social media, news, and forum text to produce a compact signal state (topics, sentiment, anomaly scores, risk labels), which conditions the simulation layer through an orchestrator. Persona agents and a moderator follow an Agentic Focus Group (AFG) protocol with repeated runs, variance reporting, and human review gates. We describe four sustainability communication scenarios: greenwashing backlash prediction, greenhushing risk assessment, campaign pre-testing, and crisis communication simulation. Nine experiments span 280 AFG runs across 20 conditions, three LLM backends (Claude Sonnet 4, GPT-4o, and Gemini 2.5 Flash), and a preregistered pilot human validation study with 54 participants. Signal conditioning improved simulation specificity. Cross-lingual sessions revealed a sentiment asymmetry between English and Turkish, with preserved persona rank ordering. Cross-model comparison showed consistent persona differentiation across all three backends (Pearson correlations for all pairs). Sentiment was robust to prompt paraphrasing (n.s.), though credibility was sensitive to prompt wording. All significant results from Experiments 1–8 survived Benjamini–Hochberg correction. A preregistered pilot with 54 human participants on Prolific replicated the predicted credibility ranking across framing variants but not the sentiment ranking, identifying a specific calibration target for future work.
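
For readers unfamiliar with the Benjamini–Hochberg correction mentioned above, the following is a minimal sketch of the standard step-up procedure. It is not the authors' analysis code; the function name `benjamini_hochberg`, the default `alpha` of 0.05, and the example p-values are illustrative assumptions and do not come from the paper.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return, for each p-value, whether it is rejected under BH at level alpha.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Illustrative p-values (made up for this example, not results from the study).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
example_rejected = benjamini_hochberg(example_p, alpha=0.05)
```

A result "survives" the correction when its entry in the returned list is `True`. Because the rule is step-up, a p-value above its own rank threshold is still rejected if a larger p-value meets its threshold.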