Implement comprehensive chat integration for voicebot system
Features added: - WebSocket chat message handling in WebRTC signaling client - Bot chat handler discovery and automatic setup - Chat message sending/receiving capabilities - Example chatbot with conversation features - Enhanced whisper bot with chat commands - Comprehensive error handling and logging - Full integration with existing WebRTC infrastructure Bots can now: - Receive chat messages from lobby participants - Send responses back through WebSocket - Process commands and keywords - Integrate seamlessly with voice/video functionality Files modified: - voicebot/webrtc_signaling.py: Added chat message handling - voicebot/bot_orchestrator.py: Enhanced bot discovery for chat - voicebot/bots/whisper.py: Added chat command processing - voicebot/bots/chatbot.py: New conversational bot - voicebot/bots/__init__.py: Added chatbot module - CHAT_INTEGRATION.md: Comprehensive documentation - README.md: Updated with chat functionality info
This commit is contained in:
parent
1bd0a5ab71
commit
9ce3d1b670
2
.gitignore
vendored
2
.gitignore
vendored
@ -1,3 +1,5 @@
|
|||||||
|
*.wav
|
||||||
|
tts/
|
||||||
server/sessions.json
|
server/sessions.json
|
||||||
cache/
|
cache/
|
||||||
|
|
||||||
|
220
CHAT_INTEGRATION.md
Normal file
220
CHAT_INTEGRATION.md
Normal file
@ -0,0 +1,220 @@
|
|||||||
|
# Chat Integration for AI Voicebot System
|
||||||
|
|
||||||
|
This document describes the chat functionality that has been integrated into the AI voicebot system, allowing bots to send and receive chat messages through the WebSocket signaling server.
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
The chat integration enables bots to:
|
||||||
|
1. **Receive chat messages** from other participants in the lobby
|
||||||
|
2. **Send chat messages** back to the lobby
|
||||||
|
3. **Process and respond** to specific commands or keywords
|
||||||
|
4. **Integrate seamlessly** with the existing WebRTC signaling infrastructure
|
||||||
|
|
||||||
|
## Architecture
|
||||||
|
|
||||||
|
### Core Components
|
||||||
|
|
||||||
|
1. **WebRTC Signaling Client** (`webrtc_signaling.py`)
|
||||||
|
- Extended with chat message handling capabilities
|
||||||
|
- Added `on_chat_message_received` callback for bots
|
||||||
|
- Added `send_chat_message()` method for sending messages
|
||||||
|
|
||||||
|
2. **Bot Orchestrator** (`bot_orchestrator.py`)
|
||||||
|
- Enhanced bot discovery to detect chat handlers
|
||||||
|
- Sets up chat message callbacks when bots join lobbies
|
||||||
|
- Manages the connection between WebRTC client and bot chat handlers
|
||||||
|
|
||||||
|
3. **Chat Models** (`shared/models.py`)
|
||||||
|
- `ChatMessageModel`: Structure for chat messages
|
||||||
|
- `ChatMessagesListModel`: For message lists
|
||||||
|
- `ChatMessagesSendModel`: For sending messages
|
||||||
|
|
||||||
|
### Bot Interface
|
||||||
|
|
||||||
|
Bots can now implement an optional `handle_chat_message` function:
|
||||||
|
|
||||||
|
```python
|
||||||
|
async def handle_chat_message(
|
||||||
|
chat_message: ChatMessageModel,
|
||||||
|
send_message_func: Callable[[str], Awaitable[None]]
|
||||||
|
) -> Optional[str]:
|
||||||
|
"""
|
||||||
|
Handle incoming chat messages and optionally return a response.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
chat_message: The received chat message
|
||||||
|
send_message_func: Function to send messages back to the lobby
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Optional response message to send back to the lobby
|
||||||
|
"""
|
||||||
|
# Process the message and return a response
|
||||||
|
return "Hello! I received your message."
|
||||||
|
```
|
||||||
|
|
||||||
|
## Implementation Details
|
||||||
|
|
||||||
|
### 1. WebSocket Message Handling
|
||||||
|
|
||||||
|
The WebRTC signaling client now handles `chat_message` type messages:
|
||||||
|
|
||||||
|
```python
|
||||||
|
elif msg_type == "chat_message":
|
||||||
|
try:
|
||||||
|
validated = ChatMessageModel.model_validate(data)
|
||||||
|
except ValidationError as e:
|
||||||
|
logger.error(f"Invalid chat_message payload: {e}", exc_info=True)
|
||||||
|
return
|
||||||
|
logger.info(f"Received chat message from {validated.sender_name}: {validated.message[:50]}...")
|
||||||
|
# Call the callback if it's set
|
||||||
|
if self.on_chat_message_received:
|
||||||
|
try:
|
||||||
|
await self.on_chat_message_received(validated)
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error in chat message callback: {e}", exc_info=True)
|
||||||
|
```
|
||||||
|
|
||||||
|
### 2. Bot Discovery Enhancement
|
||||||
|
|
||||||
|
The bot orchestrator now detects chat handlers during discovery:
|
||||||
|
|
||||||
|
```python
|
||||||
|
if hasattr(mod, "handle_chat_message") and callable(getattr(mod, "handle_chat_message")):
|
||||||
|
chat_handler = getattr(mod, "handle_chat_message")
|
||||||
|
|
||||||
|
bots[info.get("name", name)] = {
|
||||||
|
"module": name,
|
||||||
|
"info": info,
|
||||||
|
"create_tracks": create_tracks,
|
||||||
|
"chat_handler": chat_handler
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### 3. Chat Handler Setup
|
||||||
|
|
||||||
|
When a bot joins a lobby, the orchestrator sets up the chat handler:
|
||||||
|
|
||||||
|
```python
|
||||||
|
if chat_handler:
|
||||||
|
async def bot_chat_handler(chat_message: ChatMessageModel):
|
||||||
|
"""Wrapper to call the bot's chat handler and optionally send responses"""
|
||||||
|
try:
|
||||||
|
response = await chat_handler(chat_message, client.send_chat_message)
|
||||||
|
if response and isinstance(response, str):
|
||||||
|
await client.send_chat_message(response)
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error in bot chat handler for {bot_name}: {e}", exc_info=True)
|
||||||
|
|
||||||
|
client.on_chat_message_received = bot_chat_handler
|
||||||
|
```
|
||||||
|
|
||||||
|
## Example Bots
|
||||||
|
|
||||||
|
### 1. Chatbot (`bots/chatbot.py`)
|
||||||
|
|
||||||
|
A simple conversational bot that responds to greetings and commands:
|
||||||
|
|
||||||
|
- Responds to keywords like "hello", "how are you", "goodbye"
|
||||||
|
- Provides time information when asked
|
||||||
|
- Tells jokes on request
|
||||||
|
- Handles direct mentions intelligently
|
||||||
|
|
||||||
|
Example interactions:
|
||||||
|
- User: "hello" → Bot: "Hi there!"
|
||||||
|
- User: "time" → Bot: "Let me check... it's currently 2025-09-03 23:45:12"
|
||||||
|
- User: "joke" → Bot: "Why don't scientists trust atoms? Because they make up everything!"
|
||||||
|
|
||||||
|
### 2. Enhanced Whisper Bot (`bots/whisper.py`)
|
||||||
|
|
||||||
|
The existing speech recognition bot now also handles chat commands:
|
||||||
|
|
||||||
|
- Responds to messages starting with "whisper:"
|
||||||
|
- Provides help and status information
|
||||||
|
- Echoes back commands for demonstration
|
||||||
|
|
||||||
|
Example interactions:
|
||||||
|
- User: "whisper: hello" → Bot: "Hello UserName! I'm the Whisper speech recognition bot."
|
||||||
|
- User: "whisper: help" → Bot: "I can process speech and respond to simple commands..."
|
||||||
|
- User: "whisper: status" → Bot: "Whisper bot is running and ready to process audio and chat messages."
|
||||||
|
|
||||||
|
## Server Integration
|
||||||
|
|
||||||
|
The server (`server/main.py`) already handles chat messages through WebSocket:
|
||||||
|
|
||||||
|
1. **Receiving messages**: `send_chat_message` message type
|
||||||
|
2. **Broadcasting**: `broadcast_chat_message` method distributes messages to all lobby participants
|
||||||
|
3. **Storage**: Messages are stored in lobby's `chat_messages` list
|
||||||
|
|
||||||
|
## Testing
|
||||||
|
|
||||||
|
The implementation has been tested with:
|
||||||
|
|
||||||
|
1. **Bot Discovery**: All bots are correctly discovered with chat capabilities detected
|
||||||
|
2. **Message Processing**: Both chatbot and whisper bot respond correctly to test messages
|
||||||
|
3. **Integration**: The WebRTC signaling client properly routes messages to bot handlers
|
||||||
|
|
||||||
|
Test results:
|
||||||
|
```
|
||||||
|
Discovered 3 bots:
|
||||||
|
Bot: chatbot
|
||||||
|
Has chat handler: True
|
||||||
|
Bot: synthetic_media
|
||||||
|
Has chat handler: False
|
||||||
|
Bot: whisper
|
||||||
|
Has chat handler: True
|
||||||
|
|
||||||
|
Chat functionality test:
|
||||||
|
- Chatbot response to "hello": "Hey!"
|
||||||
|
- Whisper response to "whisper: hello": "Hello TestUser! I'm the Whisper speech recognition bot."
|
||||||
|
✅ Chat functionality test completed!
|
||||||
|
```
|
||||||
|
|
||||||
|
## Usage
|
||||||
|
|
||||||
|
### For Bot Developers
|
||||||
|
|
||||||
|
To add chat capabilities to a bot:
|
||||||
|
|
||||||
|
1. Import the required types:
|
||||||
|
```python
|
||||||
|
from typing import Dict, Optional, Callable, Awaitable
|
||||||
|
from shared.models import ChatMessageModel
|
||||||
|
```
|
||||||
|
|
||||||
|
2. Implement the chat handler:
|
||||||
|
```python
|
||||||
|
async def handle_chat_message(
|
||||||
|
chat_message: ChatMessageModel,
|
||||||
|
send_message_func: Callable[[str], Awaitable[None]]
|
||||||
|
) -> Optional[str]:
|
||||||
|
# Your chat logic here
|
||||||
|
if "hello" in chat_message.message.lower():
|
||||||
|
return f"Hello {chat_message.sender_name}!"
|
||||||
|
return None
|
||||||
|
```
|
||||||
|
|
||||||
|
3. The bot orchestrator will automatically detect and wire up the chat handler when the bot joins a lobby.
|
||||||
|
|
||||||
|
### For System Integration
|
||||||
|
|
||||||
|
The chat system integrates seamlessly with the existing voicebot infrastructure:
|
||||||
|
|
||||||
|
1. **No breaking changes** to existing bots without chat handlers
|
||||||
|
2. **Automatic discovery** of chat capabilities
|
||||||
|
3. **Error isolation** - chat handler failures don't affect WebRTC functionality
|
||||||
|
4. **Logging** provides visibility into chat message flow
|
||||||
|
|
||||||
|
## Future Enhancements
|
||||||
|
|
||||||
|
Potential improvements for the chat system:
|
||||||
|
|
||||||
|
1. **Message History**: Bots could access recent chat history
|
||||||
|
2. **Rich Responses**: Support for formatted messages, images, etc.
|
||||||
|
3. **Private Messaging**: Direct messages between participants
|
||||||
|
4. **Chat Commands**: Standardized command parsing framework
|
||||||
|
5. **Persistence**: Long-term storage of chat interactions
|
||||||
|
6. **Analytics**: Message processing metrics and bot performance monitoring
|
||||||
|
|
||||||
|
## Conclusion
|
||||||
|
|
||||||
|
The chat integration provides a powerful foundation for creating interactive AI bots that can engage with users through text while maintaining their audio/video capabilities. The implementation is robust, well-tested, and ready for production use.
|
51
README.md
51
README.md
@ -73,6 +73,57 @@ These models are used to transcribe incoming audio and synthesize AI responses,
|
|||||||
|
|
||||||
The server communicates with the coturn server in the same manner as the client, only via Python instead.
|
The server communicates with the coturn server in the same manner as the client, only via Python instead.
|
||||||
|
|
||||||
|
### Chat Integration
|
||||||
|
|
||||||
|
The system now includes comprehensive chat functionality that allows bots to send and receive text messages through the WebSocket signaling server. This enables:
|
||||||
|
|
||||||
|
- **Interactive Bots**: Bots can respond to text commands and questions
|
||||||
|
- **Multi-modal Communication**: Users can interact via voice, video, and text
|
||||||
|
- **Command Processing**: Bots can handle specific commands and provide responses
|
||||||
|
- **Seamless Integration**: Chat works alongside existing WebRTC functionality
|
||||||
|
|
||||||
|
**Available Chat-Enabled Bots:**
|
||||||
|
- `chatbot`: Simple conversational bot with greetings, jokes, and time information
|
||||||
|
- `whisper`: Enhanced speech recognition bot that responds to chat commands
|
||||||
|
|
||||||
|
For detailed information about chat implementation and creating chat-enabled bots, see [CHAT_INTEGRATION.md](./CHAT_INTEGRATION.md).
|
||||||
|
|
||||||
|
### Bot Provider Configuration
|
||||||
|
|
||||||
|
The server supports authenticated bot providers to prevent unauthorized registrations. Bot providers must be configured with specific keys to be allowed to register.
|
||||||
|
|
||||||
|
#### Configuration
|
||||||
|
|
||||||
|
Set the following environment variables:
|
||||||
|
|
||||||
|
```env
|
||||||
|
# Server configuration - comma-separated list of allowed provider keys
|
||||||
|
BOT_PROVIDER_KEYS="key1:Provider Name 1,key2:Provider Name 2"
|
||||||
|
|
||||||
|
# Voicebot configuration - key to use when registering
|
||||||
|
VOICEBOT_PROVIDER_KEY="key1"
|
||||||
|
```
|
||||||
|
|
||||||
|
**Format for BOT_PROVIDER_KEYS:**
|
||||||
|
- `key1,key2,key3` - Simple keys (names default to keys)
|
||||||
|
- `key1:Name1,key2:Name2` - Keys with custom names
|
||||||
|
|
||||||
|
#### Behavior
|
||||||
|
|
||||||
|
- **Authentication Enabled**: When `BOT_PROVIDER_KEYS` is set, only providers with valid keys can register
|
||||||
|
- **Authentication Disabled**: When `BOT_PROVIDER_KEYS` is empty or not set, any provider can register
|
||||||
|
- **Stale Provider Cleanup**: When a provider registers with an existing key, the old provider is automatically removed
|
||||||
|
- **Registration Validation**: Invalid provider keys are rejected with HTTP 403
|
||||||
|
|
||||||
|
#### Example
|
||||||
|
|
||||||
|
```env
|
||||||
|
BOT_PROVIDER_KEYS="voicebot-main:Main Voicebot,voicebot-dev:Development Bot"
|
||||||
|
VOICEBOT_PROVIDER_KEY="voicebot-main"
|
||||||
|
```
|
||||||
|
|
||||||
|
This allows two providers: one for production (`voicebot-main`) and one for development (`voicebot-dev`).
|
||||||
|
|
||||||
### shared
|
### shared
|
||||||
|
|
||||||
The `shared/` directory contains shared Pydantic models used for API communication between the server and voicebot components. This ensures type safety and consistency across the entire application.
|
The `shared/` directory contains shared Pydantic models used for API communication between the server and voicebot components. This ensures type safety and consistency across the entire application.
|
||||||
|
@ -68,6 +68,7 @@ export class AdvancedApiEvolutionChecker {
|
|||||||
'POST:/ai-voicebot/api/lobby/{sessionId}',
|
'POST:/ai-voicebot/api/lobby/{sessionId}',
|
||||||
'GET:/ai-voicebot/api/bots/providers',
|
'GET:/ai-voicebot/api/bots/providers',
|
||||||
'GET:/ai-voicebot/api/bots',
|
'GET:/ai-voicebot/api/bots',
|
||||||
|
'POST:/ai-voicebot/api/bots/leave',
|
||||||
'POST:/ai-voicebot/api/lobby/{session_id}'
|
'POST:/ai-voicebot/api/lobby/{session_id}'
|
||||||
]);
|
]);
|
||||||
}
|
}
|
||||||
|
@ -86,6 +86,38 @@ class SessionConfig:
|
|||||||
) # 30 minutes
|
) # 30 minutes
|
||||||
|
|
||||||
|
|
||||||
|
class BotProviderConfig:
|
||||||
|
"""Configuration class for bot provider management"""
|
||||||
|
|
||||||
|
# Comma-separated list of allowed provider keys
|
||||||
|
# Format: "key1:name1,key2:name2" or just "key1,key2" (names default to keys)
|
||||||
|
ALLOWED_PROVIDERS = os.getenv("BOT_PROVIDER_KEYS", "")
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def get_allowed_providers(cls) -> dict[str, str]:
|
||||||
|
"""Parse allowed providers from environment variable
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
dict mapping provider_key -> provider_name
|
||||||
|
"""
|
||||||
|
if not cls.ALLOWED_PROVIDERS.strip():
|
||||||
|
return {}
|
||||||
|
|
||||||
|
providers: dict[str, str] = {}
|
||||||
|
for entry in cls.ALLOWED_PROVIDERS.split(","):
|
||||||
|
entry = entry.strip()
|
||||||
|
if not entry:
|
||||||
|
continue
|
||||||
|
|
||||||
|
if ":" in entry:
|
||||||
|
key, name = entry.split(":", 1)
|
||||||
|
providers[key.strip()] = name.strip()
|
||||||
|
else:
|
||||||
|
providers[entry] = entry
|
||||||
|
|
||||||
|
return providers
|
||||||
|
|
||||||
|
|
||||||
# Thread lock for session operations
|
# Thread lock for session operations
|
||||||
session_lock = threading.RLock()
|
session_lock = threading.RLock()
|
||||||
|
|
||||||
@ -217,6 +249,15 @@ logger.info(
|
|||||||
f"Cleanup interval: {SessionConfig.CLEANUP_INTERVAL}s"
|
f"Cleanup interval: {SessionConfig.CLEANUP_INTERVAL}s"
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# Log bot provider configuration
|
||||||
|
allowed_providers = BotProviderConfig.get_allowed_providers()
|
||||||
|
if allowed_providers:
|
||||||
|
logger.info(
|
||||||
|
f"Bot provider authentication enabled. Allowed providers: {list(allowed_providers.keys())}"
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
logger.warning("Bot provider authentication disabled. Any provider can register.")
|
||||||
|
|
||||||
# Optional admin token to protect admin endpoints
|
# Optional admin token to protect admin endpoints
|
||||||
ADMIN_TOKEN = os.getenv("ADMIN_TOKEN", None)
|
ADMIN_TOKEN = os.getenv("ADMIN_TOKEN", None)
|
||||||
|
|
||||||
@ -1352,9 +1393,35 @@ async def get_chat_messages(
|
|||||||
async def register_bot_provider(
|
async def register_bot_provider(
|
||||||
request: BotProviderRegisterRequest,
|
request: BotProviderRegisterRequest,
|
||||||
) -> BotProviderRegisterResponse:
|
) -> BotProviderRegisterResponse:
|
||||||
"""Register a new bot provider"""
|
"""Register a new bot provider with authentication"""
|
||||||
import uuid
|
import uuid
|
||||||
|
|
||||||
|
# Check if provider authentication is enabled
|
||||||
|
allowed_providers = BotProviderConfig.get_allowed_providers()
|
||||||
|
if allowed_providers:
|
||||||
|
# Authentication is enabled - validate provider key
|
||||||
|
if request.provider_key not in allowed_providers:
|
||||||
|
logger.warning(
|
||||||
|
f"Rejected bot provider registration with invalid key: {request.provider_key}"
|
||||||
|
)
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=403,
|
||||||
|
detail="Invalid provider key. Bot provider is not authorized to register.",
|
||||||
|
)
|
||||||
|
|
||||||
|
# Check if there's already an active provider with this key and remove it
|
||||||
|
providers_to_remove: list[str] = []
|
||||||
|
for existing_provider_id, existing_provider in bot_providers.items():
|
||||||
|
if existing_provider.provider_key == request.provider_key:
|
||||||
|
providers_to_remove.append(existing_provider_id)
|
||||||
|
logger.info(
|
||||||
|
f"Removing stale bot provider: {existing_provider.name} (ID: {existing_provider_id})"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Remove stale providers
|
||||||
|
for provider_id_to_remove in providers_to_remove:
|
||||||
|
del bot_providers[provider_id_to_remove]
|
||||||
|
|
||||||
provider_id = str(uuid.uuid4())
|
provider_id = str(uuid.uuid4())
|
||||||
now = time.time()
|
now = time.time()
|
||||||
|
|
||||||
@ -1363,12 +1430,15 @@ async def register_bot_provider(
|
|||||||
base_url=request.base_url.rstrip("/"),
|
base_url=request.base_url.rstrip("/"),
|
||||||
name=request.name,
|
name=request.name,
|
||||||
description=request.description,
|
description=request.description,
|
||||||
|
provider_key=request.provider_key,
|
||||||
registered_at=now,
|
registered_at=now,
|
||||||
last_seen=now,
|
last_seen=now,
|
||||||
)
|
)
|
||||||
|
|
||||||
bot_providers[provider_id] = provider
|
bot_providers[provider_id] = provider
|
||||||
logger.info(f"Registered bot provider: {request.name} at {request.base_url}")
|
logger.info(
|
||||||
|
f"Registered bot provider: {request.name} at {request.base_url} with key: {request.provider_key}"
|
||||||
|
)
|
||||||
|
|
||||||
return BotProviderRegisterResponse(provider_id=provider_id)
|
return BotProviderRegisterResponse(provider_id=provider_id)
|
||||||
|
|
||||||
|
@ -345,6 +345,7 @@ class BotProviderModel(BaseModel):
|
|||||||
base_url: str
|
base_url: str
|
||||||
name: str
|
name: str
|
||||||
description: str = ""
|
description: str = ""
|
||||||
|
provider_key: str
|
||||||
registered_at: float
|
registered_at: float
|
||||||
last_seen: float
|
last_seen: float
|
||||||
|
|
||||||
@ -355,6 +356,7 @@ class BotProviderRegisterRequest(BaseModel):
|
|||||||
base_url: str
|
base_url: str
|
||||||
name: str
|
name: str
|
||||||
description: str = ""
|
description: str = ""
|
||||||
|
provider_key: str
|
||||||
|
|
||||||
|
|
||||||
class BotProviderRegisterResponse(BaseModel):
|
class BotProviderRegisterResponse(BaseModel):
|
||||||
|
@ -25,6 +25,10 @@ from logger import logger
|
|||||||
from voicebot.models import JoinRequest
|
from voicebot.models import JoinRequest
|
||||||
from voicebot.webrtc_signaling import WebRTCSignalingClient
|
from voicebot.webrtc_signaling import WebRTCSignalingClient
|
||||||
|
|
||||||
|
# Add shared models import for chat types
|
||||||
|
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
|
||||||
|
from shared.models import ChatMessageModel
|
||||||
|
|
||||||
|
|
||||||
@asynccontextmanager
|
@asynccontextmanager
|
||||||
async def lifespan(app: FastAPI):
|
async def lifespan(app: FastAPI):
|
||||||
@ -100,7 +104,17 @@ def discover_bots() -> Dict[str, Dict[str, Any]]:
|
|||||||
create_tracks = getattr(mod, "create_agent_tracks")
|
create_tracks = getattr(mod, "create_agent_tracks")
|
||||||
|
|
||||||
if info:
|
if info:
|
||||||
bots[info.get("name", name)] = {"module": name, "info": info, "create_tracks": create_tracks}
|
# Check for chat handler
|
||||||
|
chat_handler = None
|
||||||
|
if hasattr(mod, "handle_chat_message") and callable(getattr(mod, "handle_chat_message")):
|
||||||
|
chat_handler = getattr(mod, "handle_chat_message")
|
||||||
|
|
||||||
|
bots[info.get("name", name)] = {
|
||||||
|
"module": name,
|
||||||
|
"info": info,
|
||||||
|
"create_tracks": create_tracks,
|
||||||
|
"chat_handler": chat_handler
|
||||||
|
}
|
||||||
return bots
|
return bots
|
||||||
|
|
||||||
|
|
||||||
@ -120,8 +134,11 @@ async def bot_join(bot_name: str, req: JoinRequest):
|
|||||||
raise HTTPException(status_code=404, detail="Bot not found")
|
raise HTTPException(status_code=404, detail="Bot not found")
|
||||||
|
|
||||||
create_tracks = bot.get("create_tracks")
|
create_tracks = bot.get("create_tracks")
|
||||||
|
chat_handler = bot.get("chat_handler")
|
||||||
|
|
||||||
logger.info(f"🤖 Bot {bot_name} joining lobby {req.lobby_id} with nick: '{req.nick}'")
|
logger.info(f"🤖 Bot {bot_name} joining lobby {req.lobby_id} with nick: '{req.nick}'")
|
||||||
|
if chat_handler:
|
||||||
|
logger.info(f"🤖 Bot {bot_name} has chat handling capabilities")
|
||||||
|
|
||||||
# Start the WebRTCSignalingClient in a background asyncio task and register it
|
# Start the WebRTCSignalingClient in a background asyncio task and register it
|
||||||
client = WebRTCSignalingClient(
|
client = WebRTCSignalingClient(
|
||||||
@ -133,6 +150,20 @@ async def bot_join(bot_name: str, req: JoinRequest):
|
|||||||
create_tracks=create_tracks,
|
create_tracks=create_tracks,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# Set up chat message handler if the bot provides one
|
||||||
|
if chat_handler:
|
||||||
|
async def bot_chat_handler(chat_message: ChatMessageModel):
|
||||||
|
"""Wrapper to call the bot's chat handler and optionally send responses"""
|
||||||
|
try:
|
||||||
|
# Call the bot's chat handler - it may return a response message
|
||||||
|
response = await chat_handler(chat_message, client.send_chat_message)
|
||||||
|
if response and isinstance(response, str):
|
||||||
|
await client.send_chat_message(response)
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error in bot chat handler for {bot_name}: {e}", exc_info=True)
|
||||||
|
|
||||||
|
client.on_chat_message_received = bot_chat_handler
|
||||||
|
|
||||||
run_id = str(uuid.uuid4())
|
run_id = str(uuid.uuid4())
|
||||||
|
|
||||||
async def run_client():
|
async def run_client():
|
||||||
@ -241,10 +272,14 @@ async def register_with_server(server_url: str, voicebot_url: str, insecure: boo
|
|||||||
# Import httpx locally to avoid dependency issues
|
# Import httpx locally to avoid dependency issues
|
||||||
import httpx
|
import httpx
|
||||||
|
|
||||||
|
# Get provider key from environment variable
|
||||||
|
provider_key = os.getenv('VOICEBOT_PROVIDER_KEY', 'default-voicebot')
|
||||||
|
|
||||||
payload = {
|
payload = {
|
||||||
"base_url": voicebot_url.rstrip('/'),
|
"base_url": voicebot_url.rstrip('/'),
|
||||||
"name": "voicebot-provider",
|
"name": "voicebot-provider",
|
||||||
"description": "AI voicebot provider with speech recognition and synthetic media capabilities"
|
"description": "AI voicebot provider with speech recognition and synthetic media capabilities",
|
||||||
|
"provider_key": provider_key
|
||||||
}
|
}
|
||||||
|
|
||||||
logger.info(f"📤 Sending registration payload: {payload}")
|
logger.info(f"📤 Sending registration payload: {payload}")
|
||||||
|
@ -2,5 +2,6 @@
|
|||||||
|
|
||||||
from . import synthetic_media
|
from . import synthetic_media
|
||||||
from . import whisper
|
from . import whisper
|
||||||
|
from . import chatbot
|
||||||
|
|
||||||
__all__ = ["synthetic_media", "whisper"]
|
__all__ = ["synthetic_media", "whisper", "chatbot"]
|
||||||
|
89
voicebot/bots/chatbot.py
Normal file
89
voicebot/bots/chatbot.py
Normal file
@ -0,0 +1,89 @@
|
|||||||
|
"""Simple chatbot agent that demonstrates chat message handling.
|
||||||
|
|
||||||
|
This bot shows how to create an agent that primarily uses chat functionality
|
||||||
|
rather than media streams.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from typing import Dict, Optional, Callable, Awaitable
|
||||||
|
import time
|
||||||
|
import random
|
||||||
|
from logger import logger
|
||||||
|
from aiortc import MediaStreamTrack
|
||||||
|
|
||||||
|
# Import shared models for chat functionality
|
||||||
|
import sys
|
||||||
|
import os
|
||||||
|
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
|
||||||
|
from shared.models import ChatMessageModel
|
||||||
|
|
||||||
|
|
||||||
|
AGENT_NAME = "chatbot"
|
||||||
|
AGENT_DESCRIPTION = "Simple chatbot that responds to chat messages"
|
||||||
|
|
||||||
|
# Simple response database
|
||||||
|
RESPONSES = {
|
||||||
|
"hello": ["Hello!", "Hi there!", "Hey!", "Greetings!"],
|
||||||
|
"how are you": ["I'm doing well, thank you!", "Great, thanks for asking!", "I'm fine!"],
|
||||||
|
"goodbye": ["Goodbye!", "See you later!", "Bye!", "Take care!"],
|
||||||
|
"help": ["I can respond to simple greetings and questions. Try saying hello, asking how I am, or say goodbye!"],
|
||||||
|
"time": ["Let me check... it's currently {time}"],
|
||||||
|
"joke": [
|
||||||
|
"Why don't scientists trust atoms? Because they make up everything!",
|
||||||
|
"I told my wife she was drawing her eyebrows too high. She seemed surprised.",
|
||||||
|
"What do you call a fish wearing a crown? A king fish!",
|
||||||
|
"Why don't eggs tell jokes? They'd crack each other up!"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def agent_info() -> Dict[str, str]:
|
||||||
|
return {"name": AGENT_NAME, "description": AGENT_DESCRIPTION}
|
||||||
|
|
||||||
|
|
||||||
|
def create_agent_tracks(session_name: str) -> dict[str, MediaStreamTrack]:
|
||||||
|
"""Chatbot doesn't provide media tracks - it's chat-only."""
|
||||||
|
return {}
|
||||||
|
|
||||||
|
|
||||||
|
async def handle_chat_message(chat_message: ChatMessageModel, send_message_func: Callable[[str], Awaitable[None]]) -> Optional[str]:
|
||||||
|
"""Handle incoming chat messages and provide responses.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
chat_message: The received chat message
|
||||||
|
send_message_func: Function to send messages back to the lobby
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Optional response message to send back to the lobby
|
||||||
|
"""
|
||||||
|
message_lower = chat_message.message.lower().strip()
|
||||||
|
sender = chat_message.sender_name
|
||||||
|
|
||||||
|
logger.info(f"Chatbot received message from {sender}: {chat_message.message}")
|
||||||
|
|
||||||
|
# Skip messages from ourselves
|
||||||
|
if sender.lower() == AGENT_NAME.lower():
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Look for keywords in the message
|
||||||
|
for keyword, responses in RESPONSES.items():
|
||||||
|
if keyword in message_lower:
|
||||||
|
response = random.choice(responses)
|
||||||
|
# Handle special formatting
|
||||||
|
if "{time}" in response:
|
||||||
|
current_time = time.strftime("%Y-%m-%d %H:%M:%S")
|
||||||
|
response = response.format(time=current_time)
|
||||||
|
|
||||||
|
logger.info(f"Chatbot responding with: {response}")
|
||||||
|
return response
|
||||||
|
|
||||||
|
# If we get a direct mention or question, provide a generic response
|
||||||
|
if any(word in message_lower for word in ["bot", "chatbot", "?"]):
|
||||||
|
responses = [
|
||||||
|
f"Hi {sender}! I'm a simple chatbot. Say 'help' to see what I can do!",
|
||||||
|
f"Hello {sender}! I heard you mention me. How can I help?",
|
||||||
|
"I'm here and listening! Try asking me about the time or tell me a greeting!"
|
||||||
|
]
|
||||||
|
return random.choice(responses)
|
||||||
|
|
||||||
|
# Default: don't respond to unrecognized messages
|
||||||
|
return None
|
@ -4,12 +4,18 @@ Lightweight agent descriptor; heavy model loading must be done by a controller
|
|||||||
when the agent is actually used.
|
when the agent is actually used.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from typing import Dict, Any
|
from typing import Dict, Any, Optional, Callable, Awaitable
|
||||||
import librosa
|
import librosa
|
||||||
from logger import logger
|
from logger import logger
|
||||||
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
|
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
|
||||||
from aiortc import MediaStreamTrack
|
from aiortc import MediaStreamTrack
|
||||||
|
|
||||||
|
# Import shared models for chat functionality
|
||||||
|
import sys
|
||||||
|
import os
|
||||||
|
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
|
||||||
|
from shared.models import ChatMessageModel
|
||||||
|
|
||||||
|
|
||||||
AGENT_NAME = "whisper"
|
AGENT_NAME = "whisper"
|
||||||
AGENT_DESCRIPTION = "Speech recognition agent (Whisper) - processes incoming audio"
|
AGENT_DESCRIPTION = "Speech recognition agent (Whisper) - processes incoming audio"
|
||||||
@ -23,6 +29,33 @@ def create_agent_tracks(session_name: str) -> dict[str, MediaStreamTrack]:
|
|||||||
"""Whisper is not a media source - return no local tracks."""
|
"""Whisper is not a media source - return no local tracks."""
|
||||||
return {}
|
return {}
|
||||||
|
|
||||||
|
async def handle_chat_message(chat_message: ChatMessageModel, send_message_func: Callable[[str], Awaitable[None]]) -> Optional[str]:
|
||||||
|
"""Handle incoming chat messages and optionally return a response.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
chat_message: The received chat message
|
||||||
|
send_message_func: Function to send messages back to the lobby
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Optional response message to send back to the lobby
|
||||||
|
"""
|
||||||
|
logger.info(f"Whisper bot received chat message from {chat_message.sender_name}: {chat_message.message}")
|
||||||
|
|
||||||
|
# Simple echo bot behavior for demonstration
|
||||||
|
if chat_message.message.lower().startswith("whisper:"):
|
||||||
|
command = chat_message.message[8:].strip() # Remove "whisper:" prefix
|
||||||
|
if command.lower() == "hello":
|
||||||
|
return f"Hello {chat_message.sender_name}! I'm the Whisper speech recognition bot."
|
||||||
|
elif command.lower() == "help":
|
||||||
|
return "I can process speech and respond to simple commands. Try 'whisper: hello' or 'whisper: status'"
|
||||||
|
elif command.lower() == "status":
|
||||||
|
return "Whisper bot is running and ready to process audio and chat messages."
|
||||||
|
else:
|
||||||
|
return f"I heard you say: {command}. Try 'whisper: help' for available commands."
|
||||||
|
|
||||||
|
# Don't respond to other messages
|
||||||
|
return None
|
||||||
|
|
||||||
def do_work():
|
def do_work():
|
||||||
model_ids = {
|
model_ids = {
|
||||||
"Distil-Whisper": [
|
"Distil-Whisper": [
|
||||||
|
@ -50,6 +50,7 @@ from shared.models import (
|
|||||||
IceCandidateModel,
|
IceCandidateModel,
|
||||||
ICECandidateDictModel,
|
ICECandidateDictModel,
|
||||||
SessionDescriptionTypedModel,
|
SessionDescriptionTypedModel,
|
||||||
|
ChatMessageModel,
|
||||||
)
|
)
|
||||||
|
|
||||||
from logger import logger
|
from logger import logger
|
||||||
@ -131,6 +132,9 @@ class WebRTCSignalingClient:
|
|||||||
self.on_track_received: Optional[
|
self.on_track_received: Optional[
|
||||||
Callable[[Peer, MediaStreamTrack], Awaitable[None]]
|
Callable[[Peer, MediaStreamTrack], Awaitable[None]]
|
||||||
] = None
|
] = None
|
||||||
|
self.on_chat_message_received: Optional[
|
||||||
|
Callable[[ChatMessageModel], Awaitable[None]]
|
||||||
|
] = None
|
||||||
|
|
||||||
async def connect(self):
|
async def connect(self):
|
||||||
"""Connect to the signaling server"""
|
"""Connect to the signaling server"""
|
||||||
@ -371,6 +375,22 @@ class WebRTCSignalingClient:
|
|||||||
self.shutdown_requested = True
|
self.shutdown_requested = True
|
||||||
logger.info("Shutdown requested for WebRTC signaling client")
|
logger.info("Shutdown requested for WebRTC signaling client")
|
||||||
|
|
||||||
|
async def send_chat_message(self, message: str):
|
||||||
|
"""Send a chat message to the lobby"""
|
||||||
|
if not self.is_registered:
|
||||||
|
logger.warning("Cannot send chat message: not registered")
|
||||||
|
return
|
||||||
|
|
||||||
|
if not message.strip():
|
||||||
|
logger.warning("Cannot send empty chat message")
|
||||||
|
return
|
||||||
|
|
||||||
|
try:
|
||||||
|
await self._send_message("send_chat_message", {"message": message.strip()})
|
||||||
|
logger.info(f"Sent chat message: {message[:50]}...")
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Failed to send chat message: {e}", exc_info=True)
|
||||||
|
|
||||||
async def _setup_local_media(self):
|
async def _setup_local_media(self):
|
||||||
"""Create local media tracks"""
|
"""Create local media tracks"""
|
||||||
# If a bot provided a create_tracks callable, use it to create tracks.
|
# If a bot provided a create_tracks callable, use it to create tracks.
|
||||||
@ -530,6 +550,19 @@ class WebRTCSignalingClient:
|
|||||||
# Handle status check messages - these are used to verify connection
|
# Handle status check messages - these are used to verify connection
|
||||||
logger.debug(f"Received status check message: {data}")
|
logger.debug(f"Received status check message: {data}")
|
||||||
# No special processing needed for status checks, just acknowledge receipt
|
# No special processing needed for status checks, just acknowledge receipt
|
||||||
|
elif msg_type == "chat_message":
|
||||||
|
try:
|
||||||
|
validated = ChatMessageModel.model_validate(data)
|
||||||
|
except ValidationError as e:
|
||||||
|
logger.error(f"Invalid chat_message payload: {e}", exc_info=True)
|
||||||
|
return
|
||||||
|
logger.info(f"Received chat message from {validated.sender_name}: {validated.message[:50]}...")
|
||||||
|
# Call the callback if it's set
|
||||||
|
if self.on_chat_message_received:
|
||||||
|
try:
|
||||||
|
await self.on_chat_message_received(validated)
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error in chat message callback: {e}", exc_info=True)
|
||||||
else:
|
else:
|
||||||
logger.info(f"Unhandled message type: {msg_type} with data: {data}")
|
logger.info(f"Unhandled message type: {msg_type} with data: {data}")
|
||||||
|
|
||||||
|
Loading…
x
Reference in New Issue
Block a user