Last updated
try it out in the playground on
why?
agents will never feel truly alive if they are confined to text; real-time voice (and soon, video) with emotion, personality, and presence is where it's at. for this, you need low latency (500-800ms voice-to-voice) and interrupt handling with word-level accuracy; without both, it doesn't feel real-time, or human-like.

we found out the hard way that running the infra for this sucks. it's expensive too. so, with soulgraph presence, developers can build natural, responsive voice interactions with just a few api calls.
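to make "word-level accuracy" concrete, here's a minimal sketch of interrupt handling: when the user barges in, the conversation state should keep only the words the agent actually finished speaking, not the whole planned utterance. all names and timings below are illustrative, not soulgraph's actual api.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    start_ms: int  # when tts began speaking this word
    end_ms: int    # when tts finished speaking it

def truncate_at_interrupt(words: list[Word], interrupt_ms: int) -> list[str]:
    """keep only the words spoken before the user cut in, so the
    agent's memory matches what the user actually heard."""
    return [w.text for w in words if w.end_ms <= interrupt_ms]

# the agent was saying "sure i can help with that" when the user
# interrupted at 900ms into playback
spoken = [
    Word("sure", 0, 250),
    Word("i", 250, 400),
    Word("can", 400, 600),
    Word("help", 600, 850),
    Word("with", 850, 1050),
    Word("that", 1050, 1300),
]
print(truncate_at_interrupt(spoken, interrupt_ms=900))
# ['sure', 'i', 'can', 'help']
```

the point of word-level (rather than utterance-level) truncation is that the agent doesn't "remember" saying things the user never heard, which would otherwise derail the conversation after every interruption.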
how?
soulgraph presence provides the distributed infra and orchestrates the WebRTC transport, speech-to-text, text-to-speech, and agent interface. voice and video interactions are moderated by the personality defined in soulscript, creating a natural feedback loop: when users communicate through voice, they share richer emotional context, which leads to more accurate memories in soulgraph memory, which in turn enables richer personality evolution in the agent.
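the orchestration described above can be pictured as a loop over pipeline stages: transport in, speech-to-text, agent, text-to-speech, transport out. this is a toy sketch with placeholder stages; the real system runs them concurrently over streaming WebRTC audio, and every function name here is an assumption for illustration.

```python
from typing import Callable

# each stage transforms one "frame" of the conversation; strings
# stand in for audio buffers and transcripts in this sketch
Stage = Callable[[str], str]

def speech_to_text(audio: str) -> str:
    # placeholder: a real stt model transcribes streaming audio
    return audio.removeprefix("audio:")

def agent(transcript: str) -> str:
    # placeholder: the agent replies, moderated by its soulscript personality
    return f"you said: {transcript}"

def text_to_speech(reply: str) -> str:
    # placeholder: a real tts engine synthesizes speech
    return f"audio:{reply}"

def run_pipeline(stages: list[Stage], frame: str) -> str:
    # the orchestration layer's job, reduced to its essence:
    # pass each frame through every stage in order
    for stage in stages:
        frame = stage(frame)
    return frame

print(run_pipeline([speech_to_text, agent, text_to_speech], "audio:hello"))
# audio:you said: hello
```

in production, the hard part isn't the loop itself but running each stage as a streaming, interruptible process with tight latency budgets, which is the infra soulgraph presence takes off your hands.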
we leverage a number of great open-source libraries to make this happen. soulgraph's contribution is the orchestration layer and a purpose-built api, designed for agent frameworks, that makes it as easy as possible to get up and running.
human-like, real-time voice, video and avatars