Introduction
In this detailed overview, we explore the latest feature update introduced for chatbots—Auto Transcribe Audio Files.
This enhancement allows seamless conversion of audio messages into text, facilitating more efficient communication and interaction within chatbot workflows.
The update is designed to improve user experience by automating the transcription process, enabling AI agents to process audio inputs more effectively, and supporting richer conversational capabilities.
Key Features and How It Works
1. Activation and Setup
Accessing Settings:
To enable auto transcription, open the Chatbot Settings and go to the Live Chat section.
Locate the Audio Transcription toggle.
Model Selection:
Currently, the system supports the OpenAI Whisper model (version 1).
Additional models may be integrated over time, expanding transcription options.
Workspace Connection:
Ensure your OpenAI account is connected within your workspace.
Without this connection, the transcription feature will not function.
Saving Configuration:
After enabling, save your settings to activate the feature.
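The settings above only switch the feature on; under the hood, a transcription request to Whisper v1 looks roughly like the following sketch, assuming the official OpenAI Python SDK (the `transcribe_audio` helper name is illustrative, not part of the platform):

```python
def transcribe_audio(client, audio_path):
    """Send an audio file to OpenAI Whisper v1 and return the transcript text.

    `client` is expected to expose the OpenAI Python SDK interface
    (client.audio.transcriptions.create); the helper name is illustrative.
    """
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model="whisper-1",   # the currently supported model
            file=audio_file,
        )
    return result.text
```

This mirrors the configuration choices: `whisper-1` is the model selected in settings, and the `client` is only usable once the workspace's OpenAI account is connected.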
2. Operational Workflow
Incoming Audio Files:
When an audio message is received, the system automatically processes it.
The audio file is saved alongside its URL in the message input.
Transcription Process:
The audio is transcribed into text using the selected model.
The transcribed text is then added to the message as a text reply.
Integration with AI Agents:
If an AI agent is involved, the transcribed text is automatically passed to the agent for processing.
The agent then generates a reply based on the transcribed content.
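The workflow above can be sketched end to end. Everything here is illustrative, not the platform's actual API: the message dict shape and the `transcriber` and `agent` callables stand in for the real pipeline.

```python
def handle_incoming_audio(message, transcriber, agent=None):
    """Process an incoming audio message.

    `message` is a dict with at least "type" and "url"; `transcriber`
    maps an audio URL to transcript text, and `agent` (optional) maps
    text to a reply. All names and shapes here are illustrative.
    """
    if message.get("type") != "audio":
        return message  # non-audio messages pass through unchanged

    # Transcribe the audio and attach the text to the same message.
    message["text"] = transcriber(message["url"])

    # If an AI agent is involved, hand the transcript to it for a reply.
    if agent is not None:
        message["reply"] = agent(message["text"])
    return message
```

The key point the sketch captures is ordering: transcription happens first, the text is attached to the original message, and only then does the agent see it.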
3. Practical Demonstration
Using the Flow Builder:
Create or select an AI agent within the flow builder.
Use the Preview mode to test the feature.
Sending an Audio File:
Upload an audio message (e.g., "How are you doing?").
Wait for processing, which takes a few seconds.
Receiving a Response:
The AI agent replies based on the transcribed message, e.g., "I'm doing well, thank you. How can I assist you today?"
Verification:
Check the bot user profile after interaction.
The last message will display:
Type: Audio
URL: Link to the audio file
Text: The transcribed message (e.g., "How are you doing?")
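The verification step above can be expressed as a small check against the stored record; the dict shape below is illustrative of the three fields shown in the bot user profile, not an exact API:

```python
def verify_last_message(message):
    """Confirm a processed audio message carries the three fields the
    bot user profile displays: type, audio URL, and transcript text.
    The dict shape is illustrative, not the platform's actual schema."""
    assert message["type"] == "audio", "last message should be an audio message"
    assert message["url"].startswith("http"), "audio URL should be present"
    assert message["text"], "transcription should not be empty"
    return True
```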
4. Additional Use Cases
Manual and Automated Responses:
Use transcribed text to craft replies, even if the original message was audio.
Flow Builder Flexibility:
Access transcribed audio data from anywhere within the flow builder.
Send different types of responses based on the transcribed content.
Text-to-Speech and Voice Conversations:
Convert AI-generated text back into speech for voice interactions.
Enable voice-based conversations for a more natural user experience.
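As a sketch of the flexibility described above, the snippet below branches on the transcribed content and can optionally convert the reply back to speech. All names are hypothetical: `route_reply`, the keyword routing, and the `tts` callable are illustrations, not platform features.

```python
def route_reply(transcript, reply_with_voice=False, tts=None):
    """Choose a response based on transcribed content, optionally
    converting it back to speech via a hypothetical `tts` callable."""
    lowered = transcript.lower()
    if "price" in lowered or "cost" in lowered:
        reply = "Let me pull up our pricing for you."
    elif "help" in lowered:
        reply = "Sure, what do you need help with?"
    else:
        reply = "Thanks for your message! How can I assist?"

    if reply_with_voice and tts is not None:
        # Voice conversation: send synthesized audio alongside the text.
        return {"type": "audio", "audio": tts(reply), "text": reply}
    return {"type": "text", "text": reply}
```

The same idea applies anywhere in the flow builder: because the transcript is ordinary text on the message, any condition or response node can read it.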
Outro: Final Thoughts and Recommendations
The Auto Transcribe Audio Files feature significantly enhances chatbot capabilities by bridging audio and text communication. It simplifies workflows, improves accessibility, and opens new avenues for engaging user interactions. To maximize its benefits:
Enable the feature in your live chat settings.
Test with AI agents to see real-time transcription and response generation.
Leverage the flexibility to incorporate transcribed data into various response types, including text and speech.
If you encounter any issues or have questions, submit a support ticket for assistance. This update is a step toward more intelligent, voice-enabled chatbot experiences, making conversations more natural and efficient.
Summary Table
| Aspect | Details |
|---|---|
| Feature Name | Auto Transcribe Audio Files |
| Supported Model | OpenAI Whisper v1 (future models may be added) |
| Activation Steps | Settings → Live Chat → Enable Audio Transcription → Save |
| Prerequisites | Connected OpenAI account in workspace |
| Workflow | Incoming audio → Transcribed text added to message → Passed to AI agent for response |
| Use Cases | - Automated transcription for AI processing<br>- Manual review<br>- Voice conversations |
| Additional Capabilities | Text-to-speech conversion for voice replies |
| Support | Submit tickets for issues or questions |
Closing Remarks
This update marks a significant advancement in chatbot technology, emphasizing multimodal communication—integrating voice and text seamlessly.
By enabling auto transcription, developers and users can create more dynamic, accessible, and engaging chatbot experiences. Embrace this feature to elevate your conversational workflows and explore new possibilities in voice-enabled AI interactions.