Goodbye Chatbots, Hello Voice AI Agents
In this episode of Generation AI, hosts Ardis Kadiu and Dr. JC Bonilla examine the rapid evolution of voice AI agents and their impact on higher education. They spotlight Sesame, a new open-source voice model that crosses the "uncanny valley" with remarkably human-like speech. The hosts demonstrate these voice technologies live and discuss how they are already transforming student support services, enrollment management, and call centers. As voice becomes the primary interface for AI interaction, colleges must adapt to meet student expectations for 24/7 instant support while maintaining an authentic, empathetic experience.
The Voice AI Revolution Arrives (00:00:00)
- Introduction to the rapidly evolving voice AI landscape in spring 2025
- Shift from text-based prompting to voice as the primary AI interface
- Voice modes now standard in major AI platforms (ChatGPT, Gemini, Perplexity)
- Growing prevalence of AI voice agents in customer service contexts
Introducing Sesame: Breaking the Internet (00:02:00)
- Overview of Sesame, an ultra-realistic, open-source AI voice model
- How Sesame differs from previous voice technologies
- The significance of its open-source MIT license for wider adoption
- Crossing the "uncanny valley" with speech that is hard to tell apart from a human's
Live Demonstration with Maya (00:09:00)
- Interactive conversation with the Sesame voice model "Maya"
- Demonstrations of emotional range, empathy, and conversational realism
- Role-play examples showing versatility for higher education scenarios
- Showcasing natural pauses, breathing patterns, and emotional intelligence
Technical Approaches to Voice AI (00:14:00)
- Comparison of different voice technology architectures
- Traditional pipeline: Speech-to-Text → LLM processing → Text-to-Speech
- Native voice models: direct voice-to-voice processing
- How end-to-end approaches improve latency, accuracy, and emotional context
- Parallels to human language processing and bilingual thinking
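The latency contrast between the two architectures above can be sketched in a few lines of Python. This is a minimal illustration, not how any vendor builds its system: the stage functions are stubs, strings tagged "audio:" and "text:" stand in for real audio and text, and every latency figure is a made-up placeholder rather than a measurement. The point is only that a cascaded pipeline pays for each hand-off in sequence, while a native voice-to-voice model is a single hop.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    name: str
    latency_ms: int            # placeholder figure, NOT a measurement
    run: Callable[[str], str]  # stub standing in for a real model call


def run_stages(stages: List[Stage], payload: str) -> Tuple[str, int]:
    """Pass the payload through each stage in order and sum the latencies."""
    total_ms = 0
    for stage in stages:
        payload = stage.run(payload)
        total_ms += stage.latency_ms
    return payload, total_ms


# Stubs: tagged strings stand in for real audio and text (assumption).
def speech_to_text(audio: str) -> str:
    return "text:" + audio[len("audio:"):]


def llm_reply(text: str) -> str:
    return "text:reply to '" + text[len("text:"):] + "'"


def text_to_speech(text: str) -> str:
    return "audio:" + text[len("text:"):]


def voice_to_voice(audio: str) -> str:
    # A native model maps audio to audio with no text hand-off in between.
    return "audio:reply to '" + audio[len("audio:"):] + "'"


# Hypothetical per-stage latencies, chosen only to make the sum visible.
cascaded = [
    Stage("speech-to-text", 300, speech_to_text),
    Stage("llm", 500, llm_reply),
    Stage("text-to-speech", 200, text_to_speech),
]
native = [Stage("voice-to-voice", 400, voice_to_voice)]

if __name__ == "__main__":
    question = "audio:when is the financial aid deadline?"
    print(run_stages(cascaded, question))  # three hops, latencies add up
    print(run_stages(native, question))    # one hop
```

The stubs cannot show the other benefit listed above: once speech is flattened to text, tone and emotion are lost unless they are passed along separately, which a direct voice-to-voice model avoids.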
Voice AI Providers and Implementation (00:18:00)
- Overview of key players in the voice AI market (ElevenLabs, OpenAI, Meta)
- Recent announcements including Llama 4's voice capabilities
- Building voice agent systems that minimize latency
- Multi-channel implementation strategies beyond traditional phone calls
Higher Education Use Cases (00:25:00)
- Enrollment support with multilingual family communication
- 24/7 student services that match student productivity patterns
- Integration across multiple communication channels
- Voice as a natural query interface on campus
- Enhancing accessibility while maintaining personalization
Transforming Call Centers (00:30:00)
- How voice AI is replacing traditional IVR phone trees
- Benefits of AI-powered routing without human limitations
- Cost-saving potential with AI handling 30-50% of calls
- Eliminating wait times during peak periods like financial aid season
- Parallel processing capabilities versus sequential human staffing
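The staffing arithmetic behind these points can be worked through with a rough sketch. It assumes a deliberately simplified peak: all calls arrive at once, every call takes the same time, and human agents each take one call at a time. The function names and the example numbers (100 calls, 5 agents, 6 minutes per call) are hypothetical; only the 30-50% range comes from the episode notes.

```python
from typing import List


def human_queue_waits(num_calls: int, agents: int, minutes_per_call: int) -> List[int]:
    """Wait in minutes for each caller when agents take one call at a time.

    Caller i (0-indexed) is served in round i // agents, so the wait is
    that many full call durations.
    """
    return [(i // agents) * minutes_per_call for i in range(num_calls)]


def calls_left_for_staff(num_calls: int, ai_percent: int) -> int:
    """Calls still routed to humans after AI resolves ai_percent of them."""
    return num_calls - (num_calls * ai_percent) // 100


if __name__ == "__main__":
    calls, agents, minutes = 100, 5, 6  # hypothetical peak-hour numbers

    # Humans only: the last caller in the queue waits the longest.
    print("no AI, longest wait:", human_queue_waits(calls, agents, minutes)[-1])

    # AI answers its share immediately and in parallel; humans take the rest.
    for share in (30, 50):
        remaining = calls_left_for_staff(calls, share)
        longest = human_queue_waits(remaining, agents, minutes)[-1]
        print(f"AI handles {share}%: {remaining} calls left, longest wait {longest}")
```

In this toy model the AI-handled callers wait zero minutes because each conversation runs in parallel, and the remaining human queue shrinks in proportion to the share the AI resolves.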
Balancing Technology and Responsibility (00:38:00)
- Ethical considerations for voice AI implementation
- Importance of accurate information delivery
- How human staff can focus on more complex, high-value interactions
- Real-world implementations at Johnson Community College and other institutions
- Looking ahead to continued voice technology improvements
- - - -
Connect With Our Co-Hosts:
Ardis Kadiu
https://www.linkedin.com/in/ardis/
https://twitter.com/ardis
Dr. JC Bonilla
https://www.linkedin.com/in/jcbonilla/
https://twitter.com/jbonillx
About The Enrollify Podcast Network:
Generation AI is a part of the Enrollify Podcast Network. If you like this podcast, chances are you’ll like other Enrollify shows too!
Enrollify is made possible by Element451 — the next-generation AI student engagement platform helping institutions create meaningful and personalized interactions with students. Learn more at element451.com.
Attend the 2025 Engage Summit!
The Engage Summit is the premier conference for forward-thinking leaders and practitioners dedicated to exploring the transformative power of AI in education. Explore the strategies and tools to step into the next generation of student engagement, supercharged by AI. You'll leave ready to deliver the most personalized digital engagement experience every step of the way.
Register now to secure your spot in Charlotte, NC, on June 24-25, 2025! Early bird registration ends February 1st -- https://engage.element451.com/register