ai-voicebot/CHAT_INTEGRATION.md
James Ketrenos 9ce3d1b670 Implement comprehensive chat integration for voicebot system
Features added:
- WebSocket chat message handling in WebRTC signaling client
- Bot chat handler discovery and automatic setup
- Chat message sending/receiving capabilities
- Example chatbot with conversation features
- Enhanced whisper bot with chat commands
- Comprehensive error handling and logging
- Full integration with existing WebRTC infrastructure

Bots can now:
- Receive chat messages from lobby participants
- Send responses back through WebSocket
- Process commands and keywords
- Integrate seamlessly with voice/video functionality

Files modified:
- voicebot/webrtc_signaling.py: Added chat message handling
- voicebot/bot_orchestrator.py: Enhanced bot discovery for chat
- voicebot/bots/whisper.py: Added chat command processing
- voicebot/bots/chatbot.py: New conversational bot
- voicebot/bots/__init__.py: Added chatbot module
- CHAT_INTEGRATION.md: Comprehensive documentation
- README.md: Updated with chat functionality info
2025-09-03 16:28:32 -07:00

Chat Integration for AI Voicebot System

This document describes the chat functionality integrated into the AI voicebot system, which lets bots send and receive chat messages through the WebSocket signaling server.

Overview

The chat integration enables bots to:

  1. Receive chat messages from other participants in the lobby
  2. Send chat messages back to the lobby
  3. Process and respond to specific commands or keywords
  4. Work alongside the existing WebRTC signaling infrastructure

Architecture

Core Components

  1. WebRTC Signaling Client (webrtc_signaling.py)

    • Extended with chat message handling capabilities
    • Added on_chat_message_received callback for bots
    • Added send_chat_message() method for sending messages
  2. Bot Orchestrator (bot_orchestrator.py)

    • Enhanced bot discovery to detect chat handlers
    • Sets up chat message callbacks when bots join lobbies
    • Manages the connection between WebRTC client and bot chat handlers
  3. Chat Models (shared/models.py)

    • ChatMessageModel: Structure for chat messages
    • ChatMessagesListModel: For message lists
    • ChatMessagesSendModel: For sending messages
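
The model definitions themselves are not reproduced in this document. As a rough sketch of the shape a chat message might take, the stand-in below uses a plain dataclass rather than the real ChatMessageModel. Only sender_name and message are taken from the code in this document; the class name ChatMessage and the id and timestamp fields are assumptions added for illustration.

```python
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class ChatMessage:
    """Hypothetical stand-in for shared.models.ChatMessageModel."""

    sender_name: str  # used by the handlers shown in this document
    message: str      # used by the handlers shown in this document
    # The two fields below are assumptions, not confirmed by the source.
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: float = field(default_factory=time.time)


msg = ChatMessage(sender_name="Alice", message="hello")
print(f"{msg.sender_name}: {msg.message}")
```

Check shared/models.py for the actual field names before relying on anything beyond sender_name and message.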

Bot Interface

Bots can now implement an optional handle_chat_message function:

async def handle_chat_message(
    chat_message: ChatMessageModel, 
    send_message_func: Callable[[str], Awaitable[None]]
) -> Optional[str]:
    """
    Handle incoming chat messages and optionally return a response.
    
    Args:
        chat_message: The received chat message
        send_message_func: Function to send messages back to the lobby
    
    Returns:
        Optional response message to send back to the lobby
    """
    # Process the message and return a response
    return "Hello! I received your message."
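
The interface offers two ways to reply: returning a string, or awaiting send_message_func directly. The sketch below shows one way a handler might use both, assuming the orchestrator sends any returned string as described later in this document. FakeChatMessage, fake_send, and demo are hypothetical helpers that exist only so the example runs without a lobby.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional


@dataclass
class FakeChatMessage:
    """Stand-in for ChatMessageModel with only the two fields used here."""

    sender_name: str
    message: str


async def handle_chat_message(
    chat_message: FakeChatMessage,
    send_message_func: Callable[[str], Awaitable[None]],
) -> Optional[str]:
    text = chat_message.message.strip().lower()
    if text == "help":
        # Send an intermediate message directly, then return the final reply.
        await send_message_func("Available commands: help, ping")
        return f"Ask away, {chat_message.sender_name}."
    if text == "ping":
        return "pong"
    return None  # stay silent for anything else


async def demo() -> List[str]:
    sent: List[str] = []

    async def fake_send(message: str) -> None:
        sent.append(message)

    for text in ("help", "ping", "unrelated"):
        reply = await handle_chat_message(FakeChatMessage("Alice", text), fake_send)
        if reply:
            await fake_send(reply)  # mirrors what the orchestrator wrapper does
    return sent
```

Returning None keeps the bot silent, which is usually the right default for messages that are not addressed to it.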

Implementation Details

1. WebSocket Message Handling

The WebRTC signaling client now handles chat_message type messages:

elif msg_type == "chat_message":
    try:
        validated = ChatMessageModel.model_validate(data)
    except ValidationError as e:
        logger.error(f"Invalid chat_message payload: {e}", exc_info=True)
        return
    logger.info(f"Received chat message from {validated.sender_name}: {validated.message[:50]}...")
    # Call the callback if it's set
    if self.on_chat_message_received:
        try:
            await self.on_chat_message_received(validated)
        except Exception as e:
            logger.error(f"Error in chat message callback: {e}", exc_info=True)
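
The outgoing direction is not shown in this document. As a sketch only, the helper below builds a frame that send_chat_message() might put on the WebSocket. The send_chat_message type name comes from the Server Integration section; the type/data envelope and the message key are assumptions inferred from the msg_type and data variables above, and build_send_chat_frame is a hypothetical name.

```python
import json
from typing import Any, Dict


def build_send_chat_frame(message: str) -> str:
    """Serialize an outgoing chat message (envelope shape is assumed)."""
    if not message.strip():
        raise ValueError("refusing to send an empty chat message")
    frame: Dict[str, Any] = {
        "type": "send_chat_message",   # message type named in this document
        "data": {"message": message},  # assumed payload layout
    }
    return json.dumps(frame)


print(build_send_chat_frame("Hello from the bot"))
```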

2. Bot Discovery Enhancement

The bot orchestrator now detects chat handlers during discovery:

# Default to None so bots without a chat handler still register
chat_handler = None
if hasattr(mod, "handle_chat_message") and callable(getattr(mod, "handle_chat_message")):
    chat_handler = getattr(mod, "handle_chat_message")

bots[info.get("name", name)] = {
    "module": name,
    "info": info,
    "create_tracks": create_tracks,
    "chat_handler": chat_handler,
}

3. Chat Handler Setup

When a bot joins a lobby, the orchestrator sets up the chat handler:

if chat_handler:
    async def bot_chat_handler(chat_message: ChatMessageModel):
        """Wrapper to call the bot's chat handler and optionally send responses"""
        try:
            response = await chat_handler(chat_message, client.send_chat_message)
            if response and isinstance(response, str):
                await client.send_chat_message(response)
        except Exception as e:
            logger.error(f"Error in bot chat handler for {bot_name}: {e}", exc_info=True)
    
    client.on_chat_message_received = bot_chat_handler
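
The wrapper's error isolation can be exercised without a signaling server. The sketch below reuses the same try/except shape with a hypothetical FakeClient that records messages instead of sending them; wire_chat_handler, flaky_handler, and demo are illustrative names, not part of the codebase.

```python
import asyncio
import logging
from types import SimpleNamespace
from typing import List

logger = logging.getLogger("chat-demo")


class FakeClient:
    """Stand-in for the signaling client; records instead of sending."""

    def __init__(self) -> None:
        self.sent: List[str] = []
        self.on_chat_message_received = None

    async def send_chat_message(self, message: str) -> None:
        self.sent.append(message)


def wire_chat_handler(client: FakeClient, chat_handler, bot_name: str) -> None:
    async def bot_chat_handler(chat_message) -> None:
        try:
            response = await chat_handler(chat_message, client.send_chat_message)
            if response and isinstance(response, str):
                await client.send_chat_message(response)
        except Exception as exc:
            # A failing handler is logged and swallowed, never re-raised.
            logger.error("Error in bot chat handler for %s: %s", bot_name, exc)

    client.on_chat_message_received = bot_chat_handler


async def flaky_handler(chat_message, send_message_func):
    if chat_message.message == "boom":
        raise RuntimeError("handler bug")
    return f"echo: {chat_message.message}"


async def demo() -> List[str]:
    client = FakeClient()
    wire_chat_handler(client, flaky_handler, "flaky")
    for text in ("boom", "hello"):
        msg = SimpleNamespace(sender_name="Alice", message=text)
        await client.on_chat_message_received(msg)
    return client.sent
```

The first message raises inside the handler and is only logged; the second still gets its reply, which is the behavior the wrapper is meant to guarantee.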

Example Bots

1. Chatbot (bots/chatbot.py)

A simple conversational bot that responds to greetings and commands:

  • Responds to keywords like "hello", "how are you", "goodbye"
  • Provides time information when asked
  • Tells jokes on request
  • Responds when it is mentioned directly

Example interactions:

  • User: "hello" → Bot: "Hi there!"
  • User: "time" → Bot: "Let me check... it's currently 2025-09-03 23:45:12"
  • User: "joke" → Bot: "Why don't scientists trust atoms? Because they make up everything!"
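
The keyword matching behind these interactions could look roughly like the sketch below. It is not the code in bots/chatbot.py: keyword_reply and GREETINGS are hypothetical names, and picking a greeting at random is an assumption based on the two different greetings that appear in this document.

```python
import random
from datetime import datetime
from typing import Optional

GREETINGS = ["Hi there!", "Hey!"]  # both appear as replies in this document


def keyword_reply(text: str, rng: Optional[random.Random] = None) -> Optional[str]:
    """Return a canned reply for the first matching keyword, else None."""
    rng = rng or random.Random()
    lowered = text.lower()
    if "hello" in lowered:
        return rng.choice(GREETINGS)
    if "time" in lowered:
        now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        return f"Let me check... it's currently {now}"
    if "joke" in lowered:
        return "Why don't scientists trust atoms? Because they make up everything!"
    return None


print(keyword_reply("tell me a joke"))
```

Plain substring checks are easy to read but also match inside longer words ("sometimes" contains "time"), so a real bot may prefer word-boundary matching.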

2. Enhanced Whisper Bot (bots/whisper.py)

The existing speech recognition bot now also handles chat commands:

  • Responds to messages starting with "whisper:"
  • Provides help and status information
  • Echoes back commands for demonstration

Example interactions:

  • User: "whisper: hello" → Bot: "Hello UserName! I'm the Whisper speech recognition bot."
  • User: "whisper: help" → Bot: "I can process speech and respond to simple commands..."
  • User: "whisper: status" → Bot: "Whisper bot is running and ready to process audio and chat messages."
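
Prefix-based commands like these can be parsed with a few string operations. The sketch below is an illustration rather than the code in bots/whisper.py: whisper_reply is a hypothetical name, the help command is left out because its reply is truncated in this document, and the wording of the echo fallback is an assumption.

```python
from typing import Optional

PREFIX = "whisper:"


def whisper_reply(sender_name: str, text: str) -> Optional[str]:
    """Answer only messages that start with the "whisper:" prefix."""
    stripped = text.strip()
    if not stripped.lower().startswith(PREFIX):
        return None  # not addressed to this bot
    command = stripped[len(PREFIX):].strip()
    lowered = command.lower()
    if lowered == "hello":
        return f"Hello {sender_name}! I'm the Whisper speech recognition bot."
    if lowered == "status":
        return "Whisper bot is running and ready to process audio and chat messages."
    # Fallback wording is assumed; the document only says commands are echoed.
    return f"You said: {command}"


print(whisper_reply("UserName", "whisper: hello"))
```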

Server Integration

The server (server/main.py) already handles chat messages through WebSocket:

  1. Receiving messages: clients submit chat messages using the send_chat_message message type
  2. Broadcasting: the broadcast_chat_message method distributes each message to all lobby participants
  3. Storage: messages are stored in the lobby's chat_messages list
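
The store-then-broadcast flow above can be sketched as follows. This is not the code in server/main.py: the Lobby class, its participants mapping, and the frame layout are assumptions; only the broadcast_chat_message and chat_messages names come from this document.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Sender = Callable[[dict], Awaitable[None]]


class Lobby:
    """Hypothetical lobby that stores chat messages and fans them out."""

    def __init__(self) -> None:
        self.chat_messages: List[dict] = []        # storage named in this document
        self.participants: Dict[str, Sender] = {}  # name -> send coroutine (assumed)

    async def broadcast_chat_message(self, chat_message: dict) -> None:
        self.chat_messages.append(chat_message)
        frame = {"type": "chat_message", "data": chat_message}
        for name, send in list(self.participants.items()):
            try:
                await send(frame)
            except Exception as exc:
                # One broken connection should not block the others.
                print(f"failed to deliver to {name}: {exc}")


async def demo() -> List[str]:
    lobby = Lobby()
    delivered: List[str] = []

    def make_sender(name: str) -> Sender:
        async def send(frame: dict) -> None:
            delivered.append(f"{name} <- {frame['data']['message']}")

        return send

    lobby.participants["alice"] = make_sender("alice")
    lobby.participants["whisper-bot"] = make_sender("whisper-bot")
    await lobby.broadcast_chat_message({"sender_name": "alice", "message": "hello"})
    return delivered
```

Because bots join the lobby as ordinary participants, they receive the same chat_message frames as human users, which is what lets the signaling client route them to on_chat_message_received.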

Testing

The implementation has been tested with:

  1. Bot Discovery: All bots are correctly discovered with chat capabilities detected
  2. Message Processing: Both chatbot and whisper bot respond correctly to test messages
  3. Integration: The WebRTC signaling client properly routes messages to bot handlers

Test results:

Discovered 3 bots:
  Bot: chatbot
    Has chat handler: True
  Bot: synthetic_media  
    Has chat handler: False
  Bot: whisper
    Has chat handler: True

Chat functionality test:
- Chatbot response to "hello": "Hey!"
- Whisper response to "whisper: hello": "Hello TestUser! I'm the Whisper speech recognition bot."
✅ Chat functionality test completed!

Usage

For Bot Developers

To add chat capabilities to a bot:

  1. Import the required types:

from typing import Dict, Optional, Callable, Awaitable
from shared.models import ChatMessageModel

  2. Implement the chat handler:

async def handle_chat_message(
    chat_message: ChatMessageModel,
    send_message_func: Callable[[str], Awaitable[None]]
) -> Optional[str]:
    # Your chat logic here
    if "hello" in chat_message.message.lower():
        return f"Hello {chat_message.sender_name}!"
    return None

  3. The bot orchestrator will automatically detect and wire up the chat handler when the bot joins a lobby.

For System Integration

The chat system integrates with the existing voicebot infrastructure without disrupting it:

  1. No breaking changes to existing bots without chat handlers
  2. Automatic discovery of chat capabilities
  3. Error isolation - chat handler failures don't affect WebRTC functionality
  4. Logging provides visibility into chat message flow

Future Enhancements

Potential improvements for the chat system:

  1. Message History: Bots could access recent chat history
  2. Rich Responses: Support for formatted messages, images, etc.
  3. Private Messaging: Direct messages between participants
  4. Chat Commands: Standardized command parsing framework
  5. Persistence: Long-term storage of chat interactions
  6. Analytics: Message processing metrics and bot performance monitoring

Conclusion

The chat integration provides a foundation for interactive AI bots that engage with users through text while keeping their audio/video capabilities. Chat handler errors are isolated from WebRTC functionality, and bot discovery, message processing, and message routing have been tested as described in the Testing section.