Description
TEN is an open-source framework for building real-time, multimodal conversational AI agents that can see, hear, and speak with users. Its modular architecture connects large language models with speech recognition, text-to-speech, vision processing, and real-time communication. Developers can build agents with voice interaction, visual understanding, and animated avatars, and can swap AI components through plug-and-play extensions without changing code. TEN also offers a visual, graph-based configuration system, supports real-time AI services such as Gemini 2.0 Live and OpenAI Realtime, and is compatible with platforms such as Dify and Coze. It is aimed at organizations that need low-latency, multimodal conversational agents and want to combine the flexibility of open-source development with production-grade performance.
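To make the idea of swapping components through configuration more concrete, here is a minimal Python sketch of a graph-style pipeline. It is an illustration only and does not use TEN's actual API or configuration format: the `REGISTRY` table, the `run_graph` function, and the extension names (`stt_a`, `stt_b`, `llm_a`, `tts_a`) are all hypothetical placeholders.

```python
from typing import Callable, Dict

# Hypothetical registry of interchangeable components, keyed by name.
# These names are placeholders for illustration and do not come from TEN.
REGISTRY: Dict[str, Callable[[str], str]] = {
    "stt_a": lambda audio: f"text({audio})",        # stand-in speech recognizer A
    "stt_b": lambda audio: f"transcript({audio})",  # stand-in speech recognizer B
    "llm_a": lambda text: f"reply({text})",         # stand-in language model
    "tts_a": lambda text: f"audio({text})",         # stand-in text-to-speech
}


def run_graph(graph: dict, user_input: str) -> str:
    """Pass the input through each node in the order the config lists them."""
    data = user_input
    for node in graph["nodes"]:
        data = REGISTRY[node["extension"]](data)
    return data


# The pipeline is described as data, not as code.
graph = {
    "nodes": [
        {"name": "stt", "extension": "stt_a"},
        {"name": "llm", "extension": "llm_a"},
        {"name": "tts", "extension": "tts_a"},
    ]
}

print(run_graph(graph, "mic"))  # audio(reply(text(mic)))

# Swapping the speech recognizer is a configuration edit; run_graph is unchanged.
graph["nodes"][0]["extension"] = "stt_b"
print(run_graph(graph, "mic"))  # audio(reply(transcript(mic)))
```

The point of the sketch is that the pipeline logic never names a specific provider, so replacing one component only requires changing an entry in the configuration.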
GitHub Repository
Note: This is a GitHub repository, meaning that it is code that someone created and made available for others to use. It typically requires some technical knowledge to set up and run.




