# Chat Integration for AI Voicebot System
This document describes the chat functionality that has been integrated into the AI voicebot system, allowing bots to send and receive chat messages through the WebSocket signaling server.
## Overview
The chat integration enables bots to:
1. **Receive chat messages** from other participants in the lobby
2. **Send chat messages** back to the lobby
3. **Process and respond** to specific commands or keywords
4. **Integrate seamlessly** with the existing WebRTC signaling infrastructure
## Architecture
### Core Components
1. **WebRTC Signaling Client** (`webrtc_signaling.py`)
   - Extended with chat message handling capabilities
   - Added `on_chat_message_received` callback for bots
   - Added `send_chat_message()` method for sending messages
2. **Bot Orchestrator** (`bot_orchestrator.py`)
   - Enhanced bot discovery to detect chat handlers
   - Sets up chat message callbacks when bots join lobbies
   - Manages the connection between WebRTC client and bot chat handlers
3. **Chat Models** (`shared/models.py`)
   - `ChatMessageModel`: Structure for chat messages
   - `ChatMessagesListModel`: For message lists
   - `ChatMessagesSendModel`: For sending messages
### Bot Interface
Bots can now implement an optional `handle_chat_message` function:
```python
from typing import Awaitable, Callable, Optional

from shared.models import ChatMessageModel


async def handle_chat_message(
    chat_message: ChatMessageModel,
    send_message_func: Callable[[str], Awaitable[None]],
) -> Optional[str]:
    """
    Handle incoming chat messages and optionally return a response.

    Args:
        chat_message: The received chat message
        send_message_func: Function to send messages back to the lobby

    Returns:
        Optional response message to send back to the lobby
    """
    # Process the message and return a response
    return "Hello! I received your message."
```
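
The interface offers two ways to reply: calling `send_message_func` directly, or returning a string. The following is a minimal sketch of both paths, not the project's code. `FakeChatMessage` is a hypothetical stand-in for `ChatMessageModel` that assumes only the `sender_name` and `message` fields used elsewhere in this document, and `run_demo` is an illustrative helper that plays the role of the caller.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional


@dataclass
class FakeChatMessage:
    """Hypothetical stand-in for ChatMessageModel (only two fields assumed)."""
    sender_name: str
    message: str


async def handle_chat_message(
    chat_message: FakeChatMessage,
    send_message_func: Callable[[str], Awaitable[None]],
) -> Optional[str]:
    text = chat_message.message.strip().lower()
    if text == "report":
        # Interim message pushed immediately through the send function.
        await send_message_func("Working on it...")
        # Final reply is returned; the caller is expected to send it.
        return f"Report ready, {chat_message.sender_name}."
    # Returning None means "no reply".
    return None


async def run_demo(text: str) -> List[str]:
    """Drive the handler with a fake send function that records messages."""
    sent: List[str] = []

    async def fake_send(message: str) -> None:
        sent.append(message)

    reply = await handle_chat_message(FakeChatMessage("Alice", text), fake_send)
    if reply is not None:
        await fake_send(reply)
    return sent
```

Calling `asyncio.run(run_demo("report"))` records the interim message first and the returned reply second; any other input records nothing.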
## Implementation Details
### 1. WebSocket Message Handling
The WebRTC signaling client now handles `chat_message` type messages:
```python
elif msg_type == "chat_message":
    try:
        validated = ChatMessageModel.model_validate(data)
    except ValidationError as e:
        logger.error(f"Invalid chat_message payload: {e}", exc_info=True)
        return
    logger.info(f"Received chat message from {validated.sender_name}: {validated.message[:50]}...")
    # Call the callback if it's set
    if self.on_chat_message_received:
        try:
            await self.on_chat_message_received(validated)
        except Exception as e:
            logger.error(f"Error in chat message callback: {e}", exc_info=True)
```
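
The excerpt above depends on the surrounding client class. To show the same validate-then-dispatch flow in isolation, here is a self-contained sketch under stated assumptions: `SketchClient`, `FakeChatMessage`, and `parse_chat_message` are hypothetical names, and a hand-written check stands in for `ChatMessageModel.model_validate` so the example runs without extra dependencies.

```python
import asyncio
import logging
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, Optional

logger = logging.getLogger("chat_sketch")


@dataclass
class FakeChatMessage:
    """Hypothetical stand-in for ChatMessageModel."""
    sender_name: str
    message: str


def parse_chat_message(data: Dict[str, Any]) -> FakeChatMessage:
    """Minimal validation; raises ValueError on a malformed payload."""
    sender, message = data.get("sender_name"), data.get("message")
    if not isinstance(sender, str) or not isinstance(message, str):
        raise ValueError("sender_name and message must be strings")
    return FakeChatMessage(sender, message)


class SketchClient:
    def __init__(self) -> None:
        self.on_chat_message_received: Optional[
            Callable[[FakeChatMessage], Awaitable[None]]
        ] = None

    async def handle_message(self, msg_type: str, data: Dict[str, Any]) -> bool:
        """Return True if the payload was a valid chat message."""
        if msg_type != "chat_message":
            return False
        try:
            validated = parse_chat_message(data)
        except ValueError as e:
            logger.error("Invalid chat_message payload: %s", e)
            return False
        if self.on_chat_message_received:
            try:
                await self.on_chat_message_received(validated)
            except Exception as e:
                # A failing callback is logged and does not propagate.
                logger.error("Error in chat message callback: %s", e)
        return True
```

A malformed payload is rejected before the callback runs, and an exception raised inside the callback is logged rather than propagated to the message loop.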
### 2. Bot Discovery Enhancement
The bot orchestrator now detects chat handlers during discovery:
```python
# Default to None so bots without a chat handler are still registered
chat_handler = None
if hasattr(mod, "handle_chat_message") and callable(getattr(mod, "handle_chat_message")):
    chat_handler = getattr(mod, "handle_chat_message")

bots[info.get("name", name)] = {
    "module": name,
    "info": info,
    "create_tracks": create_tracks,
    "chat_handler": chat_handler,
}
```
### 3. Chat Handler Setup
When a bot joins a lobby, the orchestrator sets up the chat handler:
```python
if chat_handler:
    async def bot_chat_handler(chat_message: ChatMessageModel):
        """Wrapper to call the bot's chat handler and optionally send responses"""
        try:
            response = await chat_handler(chat_message, client.send_chat_message)
            if response and isinstance(response, str):
                await client.send_chat_message(response)
        except Exception as e:
            logger.error(f"Error in bot chat handler for {bot_name}: {e}", exc_info=True)

    client.on_chat_message_received = bot_chat_handler
```
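
To make the wrapper's behavior concrete, the sketch below wires it to a fake client that records outgoing messages. `FakeClient` and `attach_chat_handler` are hypothetical names introduced for this example, and plain strings stand in for `ChatMessageModel` instances.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable, List, Optional

logger = logging.getLogger("chat_wrapper_sketch")

SendFunc = Callable[[str], Awaitable[None]]
ChatHandler = Callable[[Any, SendFunc], Awaitable[Optional[str]]]


class FakeClient:
    """Hypothetical stand-in for the signaling client; records sent messages."""

    def __init__(self) -> None:
        self.sent: List[str] = []
        self.on_chat_message_received: Optional[Callable[[Any], Awaitable[None]]] = None

    async def send_chat_message(self, message: str) -> None:
        self.sent.append(message)


def attach_chat_handler(client: FakeClient, bot_name: str, chat_handler: ChatHandler) -> None:
    """Wrap the bot's handler so replies are sent and errors are contained."""

    async def bot_chat_handler(chat_message: Any) -> None:
        try:
            response = await chat_handler(chat_message, client.send_chat_message)
            # Only non-empty string responses are forwarded to the lobby.
            if response and isinstance(response, str):
                await client.send_chat_message(response)
        except Exception as e:
            logger.error("Error in bot chat handler for %s: %s", bot_name, e)

    client.on_chat_message_received = bot_chat_handler


async def echo_handler(chat_message: Any, send: SendFunc) -> Optional[str]:
    return f"echo: {chat_message}"


async def failing_handler(chat_message: Any, send: SendFunc) -> Optional[str]:
    raise RuntimeError("handler bug")
```

With `echo_handler` attached, awaiting `client.on_chat_message_received("hi")` appends `"echo: hi"` to `client.sent`. With `failing_handler` attached, the exception is logged and nothing is sent.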
## Example Bots
### 1. Chatbot (`bots/chatbot.py`)
A simple conversational bot that responds to greetings and commands:
- Responds to keywords like "hello", "how are you", "goodbye"
- Provides time information when asked
- Tells jokes on request
- Handles direct mentions intelligently

Example interactions:
- User: "hello" → Bot: "Hi there!"
- User: "time" → Bot: "Let me check... it's currently 2025-09-03 23:45:12"
- User: "joke" → Bot: "Why don't scientists trust atoms? Because they make up everything!"
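
As an illustration of keyword-based replies like those above, here is a minimal sketch. It is not the code in `bots/chatbot.py`: `keyword_reply` is a hypothetical function, the fixed greeting is a simplification, and the optional `now` parameter exists only so the time reply can be reproduced deterministically.

```python
from datetime import datetime
from typing import Optional


def keyword_reply(text: str, now: Optional[datetime] = None) -> Optional[str]:
    """Return a canned reply for the first matching keyword, else None."""
    lowered = text.lower()
    if "hello" in lowered:
        return "Hi there!"
    if "time" in lowered:
        current = now or datetime.now()
        return f"Let me check... it's currently {current:%Y-%m-%d %H:%M:%S}"
    if "joke" in lowered:
        return "Why don't scientists trust atoms? Because they make up everything!"
    return None
```

Substring matching keeps the sketch short but is loose: a message containing "sometimes" would also trigger the time reply, so a real bot would likely match whole words.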
### 2. Enhanced Whisper Bot (`bots/whisper.py`)
The existing speech recognition bot now also handles chat commands:
- Responds to messages starting with "whisper:"
- Provides help and status information
- Echoes back commands for demonstration

Example interactions:
- User: "whisper: hello" → Bot: "Hello UserName! I'm the Whisper speech recognition bot."
- User: "whisper: help" → Bot: "I can process speech and respond to simple commands..."
- User: "whisper: status" → Bot: "Whisper bot is running and ready to process audio and chat messages."
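
Prefix-based command handling of this kind can be sketched as follows. This is an assumption-laden illustration rather than the code in `bots/whisper.py`: `whisper_command_reply` is a hypothetical function, the echo format is invented for the example, and the help reply is omitted because its full text is not given in this document.

```python
from typing import Optional

COMMAND_PREFIX = "whisper:"


def whisper_command_reply(sender_name: str, text: str) -> Optional[str]:
    """Reply only to messages that start with the command prefix."""
    stripped = text.strip()
    if not stripped.lower().startswith(COMMAND_PREFIX):
        # Messages without the prefix are ignored.
        return None
    command = stripped[len(COMMAND_PREFIX):].strip().lower()
    if command == "hello":
        return f"Hello {sender_name}! I'm the Whisper speech recognition bot."
    if command == "status":
        return "Whisper bot is running and ready to process audio and chat messages."
    # Assumed echo format for any other command.
    return f"You said: {command}"
```

Requiring an explicit prefix keeps the bot from replying to ordinary lobby conversation, which matters when several bots share the same lobby.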
## Server Integration
The server (`server/main.py`) already handles chat messages over the WebSocket connection:
1. **Receiving messages**: Clients submit messages using the `send_chat_message` message type
2. **Broadcasting**: The `broadcast_chat_message` method distributes messages to all lobby participants
3. **Storage**: Messages are stored in the lobby's `chat_messages` list
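
The store-and-broadcast flow can be pictured with a small in-memory model. This sketch is not the server's implementation: `SketchLobby`, the `participants` mapping, and the `{"type": ..., "data": ...}` payload shape are assumptions made for illustration. Only the `broadcast_chat_message` and `chat_messages` names come from this document.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

# A participant is modeled as an async function that delivers one payload.
Send = Callable[[Dict[str, Any]], Awaitable[None]]


class SketchLobby:
    """Hypothetical lobby holding chat history and participant senders."""

    def __init__(self) -> None:
        self.chat_messages: List[Dict[str, str]] = []
        self.participants: Dict[str, Send] = {}

    async def broadcast_chat_message(self, sender_name: str, message: str) -> None:
        record = {"sender_name": sender_name, "message": message}
        # Store first so the history is complete even if a delivery fails.
        self.chat_messages.append(record)
        payload = {"type": "chat_message", "data": record}
        for send in list(self.participants.values()):
            await send(payload)


async def demo() -> Dict[str, List[str]]:
    """Broadcast one message to two recording participants."""
    lobby = SketchLobby()
    inboxes: Dict[str, List[str]] = {"alice": [], "bot": []}

    def make_sender(name: str) -> Send:
        async def send(payload: Dict[str, Any]) -> None:
            inboxes[name].append(payload["data"]["message"])
        return send

    for name in inboxes:
        lobby.participants[name] = make_sender(name)

    await lobby.broadcast_chat_message("alice", "hello")
    assert len(lobby.chat_messages) == 1
    return inboxes
```

In this model every participant, including the sender, receives the broadcast; whether the real server echoes messages back to the sender is not stated here.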
## Testing
The implementation has been tested with:
1. **Bot Discovery**: All bots are correctly discovered with chat capabilities detected
2. **Message Processing**: Both chatbot and whisper bot respond correctly to test messages
3. **Integration**: The WebRTC signaling client properly routes messages to bot handlers

Test results:
```
Discovered 3 bots:
Bot: chatbot
Has chat handler: True
Bot: synthetic_media
Has chat handler: False
Bot: whisper
Has chat handler: True
Chat functionality test:
- Chatbot response to "hello": "Hey!"
- Whisper response to "whisper: hello": "Hello TestUser! I'm the Whisper speech recognition bot."
✅ Chat functionality test completed!
```
## Usage
### For Bot Developers
To add chat capabilities to a bot:
1. Import the required types:

   ```python
   from typing import Dict, Optional, Callable, Awaitable
   from shared.models import ChatMessageModel
   ```

2. Implement the chat handler:

   ```python
   async def handle_chat_message(
       chat_message: ChatMessageModel,
       send_message_func: Callable[[str], Awaitable[None]]
   ) -> Optional[str]:
       # Your chat logic here
       if "hello" in chat_message.message.lower():
           return f"Hello {chat_message.sender_name}!"
       return None
   ```

3. The bot orchestrator will automatically detect and wire up the chat handler when the bot joins a lobby.
### For System Integration
The chat system integrates seamlessly with the existing voicebot infrastructure:
1. **No breaking changes** to existing bots without chat handlers
2. **Automatic discovery** of chat capabilities
3. **Error isolation** - chat handler failures don't affect WebRTC functionality
4. **Logging** provides visibility into chat message flow
## Future Enhancements
Potential improvements for the chat system:
1. **Message History**: Bots could access recent chat history
2. **Rich Responses**: Support for formatted messages, images, etc.
3. **Private Messaging**: Direct messages between participants
4. **Chat Commands**: Standardized command parsing framework
5. **Persistence**: Long-term storage of chat interactions
6. **Analytics**: Message processing metrics and bot performance monitoring
## Conclusion
The chat integration provides a foundation for interactive AI bots that can engage with users through text while maintaining their audio/video capabilities. The implementation isolates chat handler failures from WebRTC functionality and has been tested for bot discovery, message processing, and message routing.