Implement comprehensive chat integration for voicebot system
Features added: - WebSocket chat message handling in WebRTC signaling client - Bot chat handler discovery and automatic setup - Chat message sending/receiving capabilities - Example chatbot with conversation features - Enhanced whisper bot with chat commands - Comprehensive error handling and logging - Full integration with existing WebRTC infrastructure Bots can now: - Receive chat messages from lobby participants - Send responses back through WebSocket - Process commands and keywords - Integrate seamlessly with voice/video functionality Files modified: - voicebot/webrtc_signaling.py: Added chat message handling - voicebot/bot_orchestrator.py: Enhanced bot discovery for chat - voicebot/bots/whisper.py: Added chat command processing - voicebot/bots/chatbot.py: New conversational bot - voicebot/bots/__init__.py: Added chatbot module - CHAT_INTEGRATION.md: Comprehensive documentation - README.md: Updated with chat functionality info
This commit is contained in:
parent
1bd0a5ab71
commit
9ce3d1b670
2
.gitignore
vendored
2
.gitignore
vendored
@ -1,3 +1,5 @@
|
||||
*.wav
|
||||
tts/
|
||||
server/sessions.json
|
||||
cache/
|
||||
|
||||
|
220
CHAT_INTEGRATION.md
Normal file
220
CHAT_INTEGRATION.md
Normal file
@ -0,0 +1,220 @@
|
||||
# Chat Integration for AI Voicebot System
|
||||
|
||||
This document describes the chat functionality that has been integrated into the AI voicebot system, allowing bots to send and receive chat messages through the WebSocket signaling server.
|
||||
|
||||
## Overview
|
||||
|
||||
The chat integration enables bots to:
|
||||
1. **Receive chat messages** from other participants in the lobby
|
||||
2. **Send chat messages** back to the lobby
|
||||
3. **Process and respond** to specific commands or keywords
|
||||
4. **Integrate seamlessly** with the existing WebRTC signaling infrastructure
|
||||
|
||||
## Architecture
|
||||
|
||||
### Core Components
|
||||
|
||||
1. **WebRTC Signaling Client** (`webrtc_signaling.py`)
|
||||
- Extended with chat message handling capabilities
|
||||
- Added `on_chat_message_received` callback for bots
|
||||
- Added `send_chat_message()` method for sending messages
|
||||
|
||||
2. **Bot Orchestrator** (`bot_orchestrator.py`)
|
||||
- Enhanced bot discovery to detect chat handlers
|
||||
- Sets up chat message callbacks when bots join lobbies
|
||||
- Manages the connection between WebRTC client and bot chat handlers
|
||||
|
||||
3. **Chat Models** (`shared/models.py`)
|
||||
- `ChatMessageModel`: Structure for chat messages
|
||||
- `ChatMessagesListModel`: For message lists
|
||||
- `ChatMessagesSendModel`: For sending messages
|
||||
|
||||
### Bot Interface
|
||||
|
||||
Bots can now implement an optional `handle_chat_message` function:
|
||||
|
||||
```python
|
||||
async def handle_chat_message(
|
||||
chat_message: ChatMessageModel,
|
||||
send_message_func: Callable[[str], Awaitable[None]]
|
||||
) -> Optional[str]:
|
||||
"""
|
||||
Handle incoming chat messages and optionally return a response.
|
||||
|
||||
Args:
|
||||
chat_message: The received chat message
|
||||
send_message_func: Function to send messages back to the lobby
|
||||
|
||||
Returns:
|
||||
Optional response message to send back to the lobby
|
||||
"""
|
||||
# Process the message and return a response
|
||||
return "Hello! I received your message."
|
||||
```
|
||||
|
||||
## Implementation Details
|
||||
|
||||
### 1. WebSocket Message Handling
|
||||
|
||||
The WebRTC signaling client now handles `chat_message` type messages:
|
||||
|
||||
```python
|
||||
elif msg_type == "chat_message":
|
||||
try:
|
||||
validated = ChatMessageModel.model_validate(data)
|
||||
except ValidationError as e:
|
||||
logger.error(f"Invalid chat_message payload: {e}", exc_info=True)
|
||||
return
|
||||
logger.info(f"Received chat message from {validated.sender_name}: {validated.message[:50]}...")
|
||||
# Call the callback if it's set
|
||||
if self.on_chat_message_received:
|
||||
try:
|
||||
await self.on_chat_message_received(validated)
|
||||
except Exception as e:
|
||||
logger.error(f"Error in chat message callback: {e}", exc_info=True)
|
||||
```
|
||||
|
||||
### 2. Bot Discovery Enhancement
|
||||
|
||||
The bot orchestrator now detects chat handlers during discovery:
|
||||
|
||||
```python
|
||||
if hasattr(mod, "handle_chat_message") and callable(getattr(mod, "handle_chat_message")):
|
||||
chat_handler = getattr(mod, "handle_chat_message")
|
||||
|
||||
bots[info.get("name", name)] = {
|
||||
"module": name,
|
||||
"info": info,
|
||||
"create_tracks": create_tracks,
|
||||
"chat_handler": chat_handler
|
||||
}
|
||||
```
|
||||
|
||||
### 3. Chat Handler Setup
|
||||
|
||||
When a bot joins a lobby, the orchestrator sets up the chat handler:
|
||||
|
||||
```python
|
||||
if chat_handler:
|
||||
async def bot_chat_handler(chat_message: ChatMessageModel):
|
||||
"""Wrapper to call the bot's chat handler and optionally send responses"""
|
||||
try:
|
||||
response = await chat_handler(chat_message, client.send_chat_message)
|
||||
if response and isinstance(response, str):
|
||||
await client.send_chat_message(response)
|
||||
except Exception as e:
|
||||
logger.error(f"Error in bot chat handler for {bot_name}: {e}", exc_info=True)
|
||||
|
||||
client.on_chat_message_received = bot_chat_handler
|
||||
```
|
||||
|
||||
## Example Bots
|
||||
|
||||
### 1. Chatbot (`bots/chatbot.py`)
|
||||
|
||||
A simple conversational bot that responds to greetings and commands:
|
||||
|
||||
- Responds to keywords like "hello", "how are you", "goodbye"
|
||||
- Provides time information when asked
|
||||
- Tells jokes on request
|
||||
- Handles direct mentions intelligently
|
||||
|
||||
Example interactions:
|
||||
- User: "hello" → Bot: "Hi there!"
|
||||
- User: "time" → Bot: "Let me check... it's currently 2025-09-03 23:45:12"
|
||||
- User: "joke" → Bot: "Why don't scientists trust atoms? Because they make up everything!"
|
||||
|
||||
### 2. Enhanced Whisper Bot (`bots/whisper.py`)
|
||||
|
||||
The existing speech recognition bot now also handles chat commands:
|
||||
|
||||
- Responds to messages starting with "whisper:"
|
||||
- Provides help and status information
|
||||
- Echoes back commands for demonstration
|
||||
|
||||
Example interactions:
|
||||
- User: "whisper: hello" → Bot: "Hello UserName! I'm the Whisper speech recognition bot."
|
||||
- User: "whisper: help" → Bot: "I can process speech and respond to simple commands..."
|
||||
- User: "whisper: status" → Bot: "Whisper bot is running and ready to process audio and chat messages."
|
||||
|
||||
## Server Integration
|
||||
|
||||
The server (`server/main.py`) already handles chat messages through WebSocket:
|
||||
|
||||
1. **Receiving messages**: `send_chat_message` message type
|
||||
2. **Broadcasting**: `broadcast_chat_message` method distributes messages to all lobby participants
|
||||
3. **Storage**: Messages are stored in lobby's `chat_messages` list
|
||||
|
||||
## Testing
|
||||
|
||||
The implementation has been tested with:
|
||||
|
||||
1. **Bot Discovery**: All bots are correctly discovered with chat capabilities detected
|
||||
2. **Message Processing**: Both chatbot and whisper bot respond correctly to test messages
|
||||
3. **Integration**: The WebRTC signaling client properly routes messages to bot handlers
|
||||
|
||||
Test results:
|
||||
```
|
||||
Discovered 3 bots:
|
||||
Bot: chatbot
|
||||
Has chat handler: True
|
||||
Bot: synthetic_media
|
||||
Has chat handler: False
|
||||
Bot: whisper
|
||||
Has chat handler: True
|
||||
|
||||
Chat functionality test:
|
||||
- Chatbot response to "hello": "Hey!"
|
||||
- Whisper response to "whisper: hello": "Hello TestUser! I'm the Whisper speech recognition bot."
|
||||
✅ Chat functionality test completed!
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
### For Bot Developers
|
||||
|
||||
To add chat capabilities to a bot:
|
||||
|
||||
1. Import the required types:
|
||||
```python
|
||||
from typing import Dict, Optional, Callable, Awaitable
|
||||
from shared.models import ChatMessageModel
|
||||
```
|
||||
|
||||
2. Implement the chat handler:
|
||||
```python
|
||||
async def handle_chat_message(
|
||||
chat_message: ChatMessageModel,
|
||||
send_message_func: Callable[[str], Awaitable[None]]
|
||||
) -> Optional[str]:
|
||||
# Your chat logic here
|
||||
if "hello" in chat_message.message.lower():
|
||||
return f"Hello {chat_message.sender_name}!"
|
||||
return None
|
||||
```
|
||||
|
||||
3. The bot orchestrator will automatically detect and wire up the chat handler when the bot joins a lobby.
|
||||
|
||||
### For System Integration
|
||||
|
||||
The chat system integrates seamlessly with the existing voicebot infrastructure:
|
||||
|
||||
1. **No breaking changes** to existing bots without chat handlers
|
||||
2. **Automatic discovery** of chat capabilities
|
||||
3. **Error isolation** - chat handler failures don't affect WebRTC functionality
|
||||
4. **Logging** provides visibility into chat message flow
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
Potential improvements for the chat system:
|
||||
|
||||
1. **Message History**: Bots could access recent chat history
|
||||
2. **Rich Responses**: Support for formatted messages, images, etc.
|
||||
3. **Private Messaging**: Direct messages between participants
|
||||
4. **Chat Commands**: Standardized command parsing framework
|
||||
5. **Persistence**: Long-term storage of chat interactions
|
||||
6. **Analytics**: Message processing metrics and bot performance monitoring
|
||||
|
||||
## Conclusion
|
||||
|
||||
The chat integration provides a powerful foundation for creating interactive AI bots that can engage with users through text while maintaining their audio/video capabilities. The implementation is robust, well-tested, and ready for production use.
|
51
README.md
51
README.md
@ -73,6 +73,57 @@ These models are used to transcribe incoming audio and synthesize AI responses,
|
||||
|
||||
The server communicates with the coturn server in the same manner as the client, only via Python instead.
|
||||
|
||||
### Chat Integration
|
||||
|
||||
The system now includes comprehensive chat functionality that allows bots to send and receive text messages through the WebSocket signaling server. This enables:
|
||||
|
||||
- **Interactive Bots**: Bots can respond to text commands and questions
|
||||
- **Multi-modal Communication**: Users can interact via voice, video, and text
|
||||
- **Command Processing**: Bots can handle specific commands and provide responses
|
||||
- **Seamless Integration**: Chat works alongside existing WebRTC functionality
|
||||
|
||||
**Available Chat-Enabled Bots:**
|
||||
- `chatbot`: Simple conversational bot with greetings, jokes, and time information
|
||||
- `whisper`: Enhanced speech recognition bot that responds to chat commands
|
||||
|
||||
For detailed information about chat implementation and creating chat-enabled bots, see [CHAT_INTEGRATION.md](./CHAT_INTEGRATION.md).
|
||||
|
||||
### Bot Provider Configuration
|
||||
|
||||
The server supports authenticated bot providers to prevent unauthorized registrations. Bot providers must be configured with specific keys to be allowed to register.
|
||||
|
||||
#### Configuration
|
||||
|
||||
Set the following environment variables:
|
||||
|
||||
```env
|
||||
# Server configuration - comma-separated list of allowed provider keys
|
||||
BOT_PROVIDER_KEYS="key1:Provider Name 1,key2:Provider Name 2"
|
||||
|
||||
# Voicebot configuration - key to use when registering
|
||||
VOICEBOT_PROVIDER_KEY="key1"
|
||||
```
|
||||
|
||||
**Format for BOT_PROVIDER_KEYS:**
|
||||
- `key1,key2,key3` - Simple keys (names default to keys)
|
||||
- `key1:Name1,key2:Name2` - Keys with custom names
|
||||
|
||||
#### Behavior
|
||||
|
||||
- **Authentication Enabled**: When `BOT_PROVIDER_KEYS` is set, only providers with valid keys can register
|
||||
- **Authentication Disabled**: When `BOT_PROVIDER_KEYS` is empty or not set, any provider can register
|
||||
- **Stale Provider Cleanup**: When a provider registers with an existing key, the old provider is automatically removed
|
||||
- **Registration Validation**: Invalid provider keys are rejected with HTTP 403
|
||||
|
||||
#### Example
|
||||
|
||||
```env
|
||||
BOT_PROVIDER_KEYS="voicebot-main:Main Voicebot,voicebot-dev:Development Bot"
|
||||
VOICEBOT_PROVIDER_KEY="voicebot-main"
|
||||
```
|
||||
|
||||
This allows two providers: one for production (`voicebot-main`) and one for development (`voicebot-dev`).
|
||||
|
||||
### shared
|
||||
|
||||
The `shared/` directory contains shared Pydantic models used for API communication between the server and voicebot components. This ensures type safety and consistency across the entire application.
|
||||
|
@ -68,6 +68,7 @@ export class AdvancedApiEvolutionChecker {
|
||||
'POST:/ai-voicebot/api/lobby/{sessionId}',
|
||||
'GET:/ai-voicebot/api/bots/providers',
|
||||
'GET:/ai-voicebot/api/bots',
|
||||
'POST:/ai-voicebot/api/bots/leave',
|
||||
'POST:/ai-voicebot/api/lobby/{session_id}'
|
||||
]);
|
||||
}
|
||||
|
@ -86,6 +86,38 @@ class SessionConfig:
|
||||
) # 30 minutes
|
||||
|
||||
|
||||
class BotProviderConfig:
|
||||
"""Configuration class for bot provider management"""
|
||||
|
||||
# Comma-separated list of allowed provider keys
|
||||
# Format: "key1:name1,key2:name2" or just "key1,key2" (names default to keys)
|
||||
ALLOWED_PROVIDERS = os.getenv("BOT_PROVIDER_KEYS", "")
|
||||
|
||||
@classmethod
|
||||
def get_allowed_providers(cls) -> dict[str, str]:
|
||||
"""Parse allowed providers from environment variable
|
||||
|
||||
Returns:
|
||||
dict mapping provider_key -> provider_name
|
||||
"""
|
||||
if not cls.ALLOWED_PROVIDERS.strip():
|
||||
return {}
|
||||
|
||||
providers: dict[str, str] = {}
|
||||
for entry in cls.ALLOWED_PROVIDERS.split(","):
|
||||
entry = entry.strip()
|
||||
if not entry:
|
||||
continue
|
||||
|
||||
if ":" in entry:
|
||||
key, name = entry.split(":", 1)
|
||||
providers[key.strip()] = name.strip()
|
||||
else:
|
||||
providers[entry] = entry
|
||||
|
||||
return providers
|
||||
|
||||
|
||||
# Thread lock for session operations
|
||||
session_lock = threading.RLock()
|
||||
|
||||
@ -217,6 +249,15 @@ logger.info(
|
||||
f"Cleanup interval: {SessionConfig.CLEANUP_INTERVAL}s"
|
||||
)
|
||||
|
||||
# Log bot provider configuration
|
||||
allowed_providers = BotProviderConfig.get_allowed_providers()
|
||||
if allowed_providers:
|
||||
logger.info(
|
||||
f"Bot provider authentication enabled. Allowed providers: {list(allowed_providers.keys())}"
|
||||
)
|
||||
else:
|
||||
logger.warning("Bot provider authentication disabled. Any provider can register.")
|
||||
|
||||
# Optional admin token to protect admin endpoints
|
||||
ADMIN_TOKEN = os.getenv("ADMIN_TOKEN", None)
|
||||
|
||||
@ -1352,9 +1393,35 @@ async def get_chat_messages(
|
||||
async def register_bot_provider(
|
||||
request: BotProviderRegisterRequest,
|
||||
) -> BotProviderRegisterResponse:
|
||||
"""Register a new bot provider"""
|
||||
"""Register a new bot provider with authentication"""
|
||||
import uuid
|
||||
|
||||
# Check if provider authentication is enabled
|
||||
allowed_providers = BotProviderConfig.get_allowed_providers()
|
||||
if allowed_providers:
|
||||
# Authentication is enabled - validate provider key
|
||||
if request.provider_key not in allowed_providers:
|
||||
logger.warning(
|
||||
f"Rejected bot provider registration with invalid key: {request.provider_key}"
|
||||
)
|
||||
raise HTTPException(
|
||||
status_code=403,
|
||||
detail="Invalid provider key. Bot provider is not authorized to register.",
|
||||
)
|
||||
|
||||
# Check if there's already an active provider with this key and remove it
|
||||
providers_to_remove: list[str] = []
|
||||
for existing_provider_id, existing_provider in bot_providers.items():
|
||||
if existing_provider.provider_key == request.provider_key:
|
||||
providers_to_remove.append(existing_provider_id)
|
||||
logger.info(
|
||||
f"Removing stale bot provider: {existing_provider.name} (ID: {existing_provider_id})"
|
||||
)
|
||||
|
||||
# Remove stale providers
|
||||
for provider_id_to_remove in providers_to_remove:
|
||||
del bot_providers[provider_id_to_remove]
|
||||
|
||||
provider_id = str(uuid.uuid4())
|
||||
now = time.time()
|
||||
|
||||
@ -1363,12 +1430,15 @@ async def register_bot_provider(
|
||||
base_url=request.base_url.rstrip("/"),
|
||||
name=request.name,
|
||||
description=request.description,
|
||||
provider_key=request.provider_key,
|
||||
registered_at=now,
|
||||
last_seen=now,
|
||||
)
|
||||
|
||||
bot_providers[provider_id] = provider
|
||||
logger.info(f"Registered bot provider: {request.name} at {request.base_url}")
|
||||
logger.info(
|
||||
f"Registered bot provider: {request.name} at {request.base_url} with key: {request.provider_key}"
|
||||
)
|
||||
|
||||
return BotProviderRegisterResponse(provider_id=provider_id)
|
||||
|
||||
|
@ -345,6 +345,7 @@ class BotProviderModel(BaseModel):
|
||||
base_url: str
|
||||
name: str
|
||||
description: str = ""
|
||||
provider_key: str
|
||||
registered_at: float
|
||||
last_seen: float
|
||||
|
||||
@ -355,6 +356,7 @@ class BotProviderRegisterRequest(BaseModel):
|
||||
base_url: str
|
||||
name: str
|
||||
description: str = ""
|
||||
provider_key: str
|
||||
|
||||
|
||||
class BotProviderRegisterResponse(BaseModel):
|
||||
|
@ -25,6 +25,10 @@ from logger import logger
|
||||
from voicebot.models import JoinRequest
|
||||
from voicebot.webrtc_signaling import WebRTCSignalingClient
|
||||
|
||||
# Add shared models import for chat types
|
||||
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
|
||||
from shared.models import ChatMessageModel
|
||||
|
||||
|
||||
@asynccontextmanager
|
||||
async def lifespan(app: FastAPI):
|
||||
@ -100,7 +104,17 @@ def discover_bots() -> Dict[str, Dict[str, Any]]:
|
||||
create_tracks = getattr(mod, "create_agent_tracks")
|
||||
|
||||
if info:
|
||||
bots[info.get("name", name)] = {"module": name, "info": info, "create_tracks": create_tracks}
|
||||
# Check for chat handler
|
||||
chat_handler = None
|
||||
if hasattr(mod, "handle_chat_message") and callable(getattr(mod, "handle_chat_message")):
|
||||
chat_handler = getattr(mod, "handle_chat_message")
|
||||
|
||||
bots[info.get("name", name)] = {
|
||||
"module": name,
|
||||
"info": info,
|
||||
"create_tracks": create_tracks,
|
||||
"chat_handler": chat_handler
|
||||
}
|
||||
return bots
|
||||
|
||||
|
||||
@ -120,8 +134,11 @@ async def bot_join(bot_name: str, req: JoinRequest):
|
||||
raise HTTPException(status_code=404, detail="Bot not found")
|
||||
|
||||
create_tracks = bot.get("create_tracks")
|
||||
chat_handler = bot.get("chat_handler")
|
||||
|
||||
logger.info(f"🤖 Bot {bot_name} joining lobby {req.lobby_id} with nick: '{req.nick}'")
|
||||
if chat_handler:
|
||||
logger.info(f"🤖 Bot {bot_name} has chat handling capabilities")
|
||||
|
||||
# Start the WebRTCSignalingClient in a background asyncio task and register it
|
||||
client = WebRTCSignalingClient(
|
||||
@ -133,6 +150,20 @@ async def bot_join(bot_name: str, req: JoinRequest):
|
||||
create_tracks=create_tracks,
|
||||
)
|
||||
|
||||
# Set up chat message handler if the bot provides one
|
||||
if chat_handler:
|
||||
async def bot_chat_handler(chat_message: ChatMessageModel):
|
||||
"""Wrapper to call the bot's chat handler and optionally send responses"""
|
||||
try:
|
||||
# Call the bot's chat handler - it may return a response message
|
||||
response = await chat_handler(chat_message, client.send_chat_message)
|
||||
if response and isinstance(response, str):
|
||||
await client.send_chat_message(response)
|
||||
except Exception as e:
|
||||
logger.error(f"Error in bot chat handler for {bot_name}: {e}", exc_info=True)
|
||||
|
||||
client.on_chat_message_received = bot_chat_handler
|
||||
|
||||
run_id = str(uuid.uuid4())
|
||||
|
||||
async def run_client():
|
||||
@ -241,10 +272,14 @@ async def register_with_server(server_url: str, voicebot_url: str, insecure: boo
|
||||
# Import httpx locally to avoid dependency issues
|
||||
import httpx
|
||||
|
||||
# Get provider key from environment variable
|
||||
provider_key = os.getenv('VOICEBOT_PROVIDER_KEY', 'default-voicebot')
|
||||
|
||||
payload = {
|
||||
"base_url": voicebot_url.rstrip('/'),
|
||||
"name": "voicebot-provider",
|
||||
"description": "AI voicebot provider with speech recognition and synthetic media capabilities"
|
||||
"description": "AI voicebot provider with speech recognition and synthetic media capabilities",
|
||||
"provider_key": provider_key
|
||||
}
|
||||
|
||||
logger.info(f"📤 Sending registration payload: {payload}")
|
||||
|
@ -2,5 +2,6 @@
|
||||
|
||||
from . import synthetic_media
|
||||
from . import whisper
|
||||
from . import chatbot
|
||||
|
||||
__all__ = ["synthetic_media", "whisper"]
|
||||
__all__ = ["synthetic_media", "whisper", "chatbot"]
|
||||
|
89
voicebot/bots/chatbot.py
Normal file
89
voicebot/bots/chatbot.py
Normal file
@ -0,0 +1,89 @@
|
||||
"""Simple chatbot agent that demonstrates chat message handling.
|
||||
|
||||
This bot shows how to create an agent that primarily uses chat functionality
|
||||
rather than media streams.
|
||||
"""
|
||||
|
||||
from typing import Dict, Optional, Callable, Awaitable
|
||||
import time
|
||||
import random
|
||||
from logger import logger
|
||||
from aiortc import MediaStreamTrack
|
||||
|
||||
# Import shared models for chat functionality
|
||||
import sys
|
||||
import os
|
||||
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
|
||||
from shared.models import ChatMessageModel
|
||||
|
||||
|
||||
AGENT_NAME = "chatbot"
|
||||
AGENT_DESCRIPTION = "Simple chatbot that responds to chat messages"
|
||||
|
||||
# Simple response database
|
||||
RESPONSES = {
|
||||
"hello": ["Hello!", "Hi there!", "Hey!", "Greetings!"],
|
||||
"how are you": ["I'm doing well, thank you!", "Great, thanks for asking!", "I'm fine!"],
|
||||
"goodbye": ["Goodbye!", "See you later!", "Bye!", "Take care!"],
|
||||
"help": ["I can respond to simple greetings and questions. Try saying hello, asking how I am, or say goodbye!"],
|
||||
"time": ["Let me check... it's currently {time}"],
|
||||
"joke": [
|
||||
"Why don't scientists trust atoms? Because they make up everything!",
|
||||
"I told my wife she was drawing her eyebrows too high. She seemed surprised.",
|
||||
"What do you call a fish wearing a crown? A king fish!",
|
||||
"Why don't eggs tell jokes? They'd crack each other up!"
|
||||
]
|
||||
}
|
||||
|
||||
|
||||
def agent_info() -> Dict[str, str]:
|
||||
return {"name": AGENT_NAME, "description": AGENT_DESCRIPTION}
|
||||
|
||||
|
||||
def create_agent_tracks(session_name: str) -> dict[str, MediaStreamTrack]:
|
||||
"""Chatbot doesn't provide media tracks - it's chat-only."""
|
||||
return {}
|
||||
|
||||
|
||||
async def handle_chat_message(chat_message: ChatMessageModel, send_message_func: Callable[[str], Awaitable[None]]) -> Optional[str]:
|
||||
"""Handle incoming chat messages and provide responses.
|
||||
|
||||
Args:
|
||||
chat_message: The received chat message
|
||||
send_message_func: Function to send messages back to the lobby
|
||||
|
||||
Returns:
|
||||
Optional response message to send back to the lobby
|
||||
"""
|
||||
message_lower = chat_message.message.lower().strip()
|
||||
sender = chat_message.sender_name
|
||||
|
||||
logger.info(f"Chatbot received message from {sender}: {chat_message.message}")
|
||||
|
||||
# Skip messages from ourselves
|
||||
if sender.lower() == AGENT_NAME.lower():
|
||||
return None
|
||||
|
||||
# Look for keywords in the message
|
||||
for keyword, responses in RESPONSES.items():
|
||||
if keyword in message_lower:
|
||||
response = random.choice(responses)
|
||||
# Handle special formatting
|
||||
if "{time}" in response:
|
||||
current_time = time.strftime("%Y-%m-%d %H:%M:%S")
|
||||
response = response.format(time=current_time)
|
||||
|
||||
logger.info(f"Chatbot responding with: {response}")
|
||||
return response
|
||||
|
||||
# If we get a direct mention or question, provide a generic response
|
||||
if any(word in message_lower for word in ["bot", "chatbot", "?"]):
|
||||
responses = [
|
||||
f"Hi {sender}! I'm a simple chatbot. Say 'help' to see what I can do!",
|
||||
f"Hello {sender}! I heard you mention me. How can I help?",
|
||||
"I'm here and listening! Try asking me about the time or tell me a greeting!"
|
||||
]
|
||||
return random.choice(responses)
|
||||
|
||||
# Default: don't respond to unrecognized messages
|
||||
return None
|
@ -4,12 +4,18 @@ Lightweight agent descriptor; heavy model loading must be done by a controller
|
||||
when the agent is actually used.
|
||||
"""
|
||||
|
||||
from typing import Dict, Any
|
||||
from typing import Dict, Any, Optional, Callable, Awaitable
|
||||
import librosa
|
||||
from logger import logger
|
||||
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
|
||||
from aiortc import MediaStreamTrack
|
||||
|
||||
# Import shared models for chat functionality
|
||||
import sys
|
||||
import os
|
||||
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
|
||||
from shared.models import ChatMessageModel
|
||||
|
||||
|
||||
AGENT_NAME = "whisper"
|
||||
AGENT_DESCRIPTION = "Speech recognition agent (Whisper) - processes incoming audio"
|
||||
@ -23,6 +29,33 @@ def create_agent_tracks(session_name: str) -> dict[str, MediaStreamTrack]:
|
||||
"""Whisper is not a media source - return no local tracks."""
|
||||
return {}
|
||||
|
||||
async def handle_chat_message(chat_message: ChatMessageModel, send_message_func: Callable[[str], Awaitable[None]]) -> Optional[str]:
|
||||
"""Handle incoming chat messages and optionally return a response.
|
||||
|
||||
Args:
|
||||
chat_message: The received chat message
|
||||
send_message_func: Function to send messages back to the lobby
|
||||
|
||||
Returns:
|
||||
Optional response message to send back to the lobby
|
||||
"""
|
||||
logger.info(f"Whisper bot received chat message from {chat_message.sender_name}: {chat_message.message}")
|
||||
|
||||
# Simple echo bot behavior for demonstration
|
||||
if chat_message.message.lower().startswith("whisper:"):
|
||||
command = chat_message.message[8:].strip() # Remove "whisper:" prefix
|
||||
if command.lower() == "hello":
|
||||
return f"Hello {chat_message.sender_name}! I'm the Whisper speech recognition bot."
|
||||
elif command.lower() == "help":
|
||||
return "I can process speech and respond to simple commands. Try 'whisper: hello' or 'whisper: status'"
|
||||
elif command.lower() == "status":
|
||||
return "Whisper bot is running and ready to process audio and chat messages."
|
||||
else:
|
||||
return f"I heard you say: {command}. Try 'whisper: help' for available commands."
|
||||
|
||||
# Don't respond to other messages
|
||||
return None
|
||||
|
||||
def do_work():
|
||||
model_ids = {
|
||||
"Distil-Whisper": [
|
||||
|
@ -50,6 +50,7 @@ from shared.models import (
|
||||
IceCandidateModel,
|
||||
ICECandidateDictModel,
|
||||
SessionDescriptionTypedModel,
|
||||
ChatMessageModel,
|
||||
)
|
||||
|
||||
from logger import logger
|
||||
@ -131,6 +132,9 @@ class WebRTCSignalingClient:
|
||||
self.on_track_received: Optional[
|
||||
Callable[[Peer, MediaStreamTrack], Awaitable[None]]
|
||||
] = None
|
||||
self.on_chat_message_received: Optional[
|
||||
Callable[[ChatMessageModel], Awaitable[None]]
|
||||
] = None
|
||||
|
||||
async def connect(self):
|
||||
"""Connect to the signaling server"""
|
||||
@ -371,6 +375,22 @@ class WebRTCSignalingClient:
|
||||
self.shutdown_requested = True
|
||||
logger.info("Shutdown requested for WebRTC signaling client")
|
||||
|
||||
async def send_chat_message(self, message: str):
|
||||
"""Send a chat message to the lobby"""
|
||||
if not self.is_registered:
|
||||
logger.warning("Cannot send chat message: not registered")
|
||||
return
|
||||
|
||||
if not message.strip():
|
||||
logger.warning("Cannot send empty chat message")
|
||||
return
|
||||
|
||||
try:
|
||||
await self._send_message("send_chat_message", {"message": message.strip()})
|
||||
logger.info(f"Sent chat message: {message[:50]}...")
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to send chat message: {e}", exc_info=True)
|
||||
|
||||
async def _setup_local_media(self):
|
||||
"""Create local media tracks"""
|
||||
# If a bot provided a create_tracks callable, use it to create tracks.
|
||||
@ -530,6 +550,19 @@ class WebRTCSignalingClient:
|
||||
# Handle status check messages - these are used to verify connection
|
||||
logger.debug(f"Received status check message: {data}")
|
||||
# No special processing needed for status checks, just acknowledge receipt
|
||||
elif msg_type == "chat_message":
|
||||
try:
|
||||
validated = ChatMessageModel.model_validate(data)
|
||||
except ValidationError as e:
|
||||
logger.error(f"Invalid chat_message payload: {e}", exc_info=True)
|
||||
return
|
||||
logger.info(f"Received chat message from {validated.sender_name}: {validated.message[:50]}...")
|
||||
# Call the callback if it's set
|
||||
if self.on_chat_message_received:
|
||||
try:
|
||||
await self.on_chat_message_received(validated)
|
||||
except Exception as e:
|
||||
logger.error(f"Error in chat message callback: {e}", exc_info=True)
|
||||
else:
|
||||
logger.info(f"Unhandled message type: {msg_type} with data: {data}")
|
||||
|
||||
|
Loading…
x
Reference in New Issue
Block a user