Working-ish
This commit is contained in:
parent 39739e5d34
commit 122ed532d6

API_EVOLUTION.md (175 lines changed)
@@ -1,175 +0,0 @@
# API Evolution Detection System

This system automatically detects when your OpenAPI schema has new endpoints or changed parameters that need to be implemented in the `ApiClient` class.

## How It Works

### Automatic Detection
- **Development Mode**: Automatically runs when `api-client.ts` is imported during development
- **Runtime Checking**: Compares available endpoints in the OpenAPI schema with implemented methods
- **Console Warnings**: Displays detailed warnings about unimplemented endpoints

### Schema Comparison
- **Hash-based Detection**: Detects when the OpenAPI schema file changes
- **Endpoint Analysis**: Identifies new, changed, or unimplemented endpoints (see the sketch below)
- **Parameter Validation**: Suggests checking for parameter changes
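
The comparison itself reduces to set arithmetic over `"METHOD:path"` keys. A minimal sketch of the idea (names such as `EndpointDiff` and `diffEndpoints` are illustrative, not the checker's actual internals):

```typescript
// Hypothetical shape of the comparison step inside the evolution checker.
interface EndpointDiff {
  newEndpoints: string[];  // in the schema, not seen on the previous run
  unimplemented: string[]; // in the schema, missing from ApiClient
}

function diffEndpoints(
  schemaEndpoints: Set<string>,      // e.g. "GET:/ai-voicebot/api/health"
  knownEndpoints: Set<string>,       // endpoints recorded on the previous run
  implementedEndpoints: Set<string>, // registry maintained in the checker
): EndpointDiff {
  const newEndpoints = [...schemaEndpoints].filter((e) => !knownEndpoints.has(e));
  const unimplemented = [...schemaEndpoints].filter((e) => !implementedEndpoints.has(e));
  return { newEndpoints, unimplemented };
}
```
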
## Usage

### Automatic Checking
The system runs automatically in development mode when you import from `api-client.ts`:

```typescript
import { apiClient } from './api-client';
// The check runs automatically after a 1-second delay
```

### Command Line Checking
You can run API evolution checks from the command line:

```bash
# Full type generation with evolution check
./generate-ts-types.sh

# Quick evolution check only (without regenerating types)
./check-api-evolution.sh

# Or from within the client container
npm run check-api-evolution
```

### Manual Checking
You can manually trigger checks during development:

```typescript
import { devUtils } from './api-client';

// Check for API evolution
const evolution = await devUtils.checkApiEvolution();

// Force a recheck (bypasses the once-per-session limit)
devUtils.recheckEndpoints();
```

### Console Output
When unimplemented endpoints are found, you'll see:

**Browser Console (development mode):**
```
🚨 API Evolution Detection
🆕 New API endpoints detected:
  • GET /ai-voicebot/api/new-feature (get_new_feature_endpoint)
⚠️ Unimplemented API endpoints:
  • POST /ai-voicebot/api/admin/bulk-action
💡 Implementation suggestions:
  Add these methods to ApiClient:
  async adminBulkAction(): Promise<any> {
    return this.request<any>('/ai-voicebot/api/admin/bulk-action', { method: 'POST' });
  }
```

**Command Line:**
```
🔍 API Evolution Check
==================================================
📊 Summary:
  Total endpoints: 8
  Implemented: 7
  Unimplemented: 1

⚠️ Unimplemented API endpoints:
  • POST /ai-voicebot/api/admin/bulk-action
    Admin bulk action endpoint

💡 Implementation suggestions:
  Add these methods to the ApiClient class:

  async adminBulkAction(data?: any): Promise<any> {
    return this.request<any>('/ai-voicebot/api/admin/bulk-action', { method: 'POST', body: data });
  }
```

## Configuration

### Implemented Endpoints Registry
The system maintains a registry of implemented endpoints in `ApiClient`. When you add new methods, update the registry:

```typescript
// In api-evolution-checker.ts
private getImplementedEndpoints(): Set<string> {
  return new Set([
    'GET:/ai-voicebot/api/admin/names',
    'POST:/ai-voicebot/api/admin/set_password',
    // Add new endpoints here:
    'POST:/ai-voicebot/api/admin/bulk-action',
  ]);
}
```

### Schema Location
The system attempts to load the OpenAPI schema from:
- `/openapi-schema.json` (served by your development server)
- Falls back to a hardcoded endpoint list if the schema file is unavailable
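
A load-with-fallback step along these lines is all that is needed; a sketch of the idea (the function name and the fallback contents are illustrative):

```typescript
// Sketch: fetch the schema, fall back to a baked-in endpoint list on failure.
async function loadSchemaEndpoints(): Promise<Set<string>> {
  try {
    const res = await fetch('/openapi-schema.json');
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    const schema = await res.json();
    const endpoints = new Set<string>();
    for (const [path, methods] of Object.entries(schema.paths ?? {})) {
      for (const method of Object.keys(methods as object)) {
        endpoints.add(`${method.toUpperCase()}:${path}`);
      }
    }
    return endpoints;
  } catch {
    // Hardcoded fallback; the real list lives in api-evolution-checker.ts
    return new Set(['GET:/ai-voicebot/api/admin/names']);
  }
}
```
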

## Development Workflow

### When Adding New API Endpoints

1. **Add the endpoint to the FastAPI server** (`server/main.py`)
2. **Regenerate types**: Run `./generate-ts-types.sh`
3. **Check the console** for warnings about unimplemented endpoints
4. **Implement methods** in the `ApiClient` class
5. **Update the endpoint registry** in the evolution checker
6. **Add convenience methods** to API namespaces if needed

### Example Implementation

When you see a warning like:
```
⚠️ Unimplemented: POST /ai-voicebot/api/admin/bulk-action
```

1. Add the method to `ApiClient`:
```typescript
async adminBulkAction(data: BulkActionRequest): Promise<BulkActionResponse> {
  return this.request<BulkActionResponse>('/ai-voicebot/api/admin/bulk-action', {
    method: 'POST',
    body: data
  });
}
```

2. Add it to the convenience API:
```typescript
export const adminApi = {
  listNames: () => apiClient.adminListNames(),
  setPassword: (data: AdminSetPassword) => apiClient.adminSetPassword(data),
  clearPassword: (data: AdminClearPassword) => apiClient.adminClearPassword(data),
  bulkAction: (data: BulkActionRequest) => apiClient.adminBulkAction(data), // New
};
```

3. Update the registry:
```typescript
private getImplementedEndpoints(): Set<string> {
  return new Set([
    // ... existing endpoints ...
    'POST:/ai-voicebot/api/admin/bulk-action', // Add this
  ]);
}
```

## Benefits

- **Prevents Missing Implementations**: Never forget to implement new API endpoints
- **Development Efficiency**: Automatic detection saves time during API evolution
- **Type Safety**: Works with generated TypeScript types for full type safety
- **Code Generation**: Provides implementation stubs to get started quickly
- **Schema Validation**: Detects when the OpenAPI schema changes

## Production Considerations

- **Development Only**: Evolution checking only runs in development mode
- **Performance**: Minimal runtime overhead (a single check per session)
- **Error Handling**: Gracefully falls back if schema loading fails
- **Console Logging**: All output goes to console.warn/info for easy filtering
@@ -1,298 +0,0 @@
# Architecture Recommendations: Sessions, Lobbies, and WebSockets

## Executive Summary

The current architecture has grown organically into a monolithic structure that mixes concerns and creates maintenance challenges. This document outlines specific recommendations to improve maintainability, reduce complexity, and enhance the development experience.

## Current Issues

### 1. Server (`server/main.py`)
- **Monolithic structure**: 2300+ lines in a single file
- **Mixed concerns**: Session, lobby, WebSocket, bot, and admin logic intertwined
- **Complex state management**: Multiple global dictionaries requiring manual synchronization
- **WebSocket message handling**: Deeply nested switch statements are hard to follow
- **Threading complexity**: Multiple locks and shared state increase deadlock risk

### 2. Client (`client/src/`)
- **Fragmented connection logic**: WebSocket handling scattered across components
- **Error handling complexity**: Different scenarios handled inconsistently
- **State synchronization**: Multiple sources of truth for session/lobby state

### 3. Voicebot (`voicebot/`)
- **Duplicate patterns**: Similar WebSocket logic but a different implementation
- **Bot lifecycle complexity**: Complex orchestration with unclear state flow

## Proposed Architecture

### Server Refactoring

#### 1. Extract Core Modules

```
server/
├── main.py                   # FastAPI app setup and routing only
├── core/
│   ├── __init__.py
│   ├── session_manager.py    # Session lifecycle and persistence
│   ├── lobby_manager.py      # Lobby management and chat
│   ├── bot_manager.py        # Bot provider and orchestration
│   └── auth_manager.py       # Name/password authentication
├── websocket/
│   ├── __init__.py
│   ├── connection.py         # WebSocket connection handling
│   ├── message_handlers.py   # Message type routing and handling
│   └── signaling.py          # WebRTC signaling logic
├── api/
│   ├── __init__.py
│   ├── admin.py              # Admin endpoints
│   ├── sessions.py           # Session HTTP API
│   ├── lobbies.py            # Lobby HTTP API
│   └── bots.py               # Bot HTTP API
└── models/
    ├── __init__.py
    ├── session.py            # Session and Lobby classes
    └── events.py             # Event system for decoupled communication
```

#### 2. Event-Driven Architecture

Replace direct method calls with an event system:

```python
from typing import Protocol
from abc import ABC


class Event(ABC):
    """Base event class"""
    pass


class SessionJoinedLobby(Event):
    def __init__(self, session_id: str, lobby_id: str):
        self.session_id = session_id
        self.lobby_id = lobby_id


class EventHandler(Protocol):
    async def handle(self, event: Event) -> None: ...


class EventBus:
    def __init__(self):
        self._handlers: dict[type[Event], list[EventHandler]] = {}

    def subscribe(self, event_type: type[Event], handler: EventHandler):
        if event_type not in self._handlers:
            self._handlers[event_type] = []
        self._handlers[event_type].append(handler)

    async def publish(self, event: Event):
        event_type = type(event)
        if event_type in self._handlers:
            for handler in self._handlers[event_type]:
                await handler.handle(event)
```
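
Wiring it up is then a matter of subscribing a handler and publishing events; a short usage sketch (the `LobbyNotifier` handler is illustrative):

```python
class LobbyNotifier:
    """Example handler: reacts when a session joins a lobby."""

    async def handle(self, event: Event) -> None:
        assert isinstance(event, SessionJoinedLobby)
        print(f"session {event.session_id} joined lobby {event.lobby_id}")


bus = EventBus()
bus.subscribe(SessionJoinedLobby, LobbyNotifier())
# Somewhere in the lobby manager:
# await bus.publish(SessionJoinedLobby(session_id="abc", lobby_id="main"))
```
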

#### 3. WebSocket Message Router

Replace the massive switch statement with a clean router:

```python
from typing import Dict, Any
from abc import ABC, abstractmethod

from fastapi import WebSocket  # Session below is the application's session model


class MessageHandler(ABC):
    @abstractmethod
    async def handle(self, session: Session, data: Dict[str, Any], websocket: WebSocket) -> None:
        pass


class SetNameHandler(MessageHandler):
    async def handle(self, session: Session, data: Dict[str, Any], websocket: WebSocket) -> None:
        # Handle set_name logic here
        pass


class WebSocketRouter:
    def __init__(self):
        self._handlers: Dict[str, MessageHandler] = {}

    def register(self, message_type: str, handler: MessageHandler):
        self._handlers[message_type] = handler

    async def route(self, message_type: str, session: Session, data: Dict[str, Any], websocket: WebSocket):
        if message_type in self._handlers:
            await self._handlers[message_type].handle(session, data, websocket)
        else:
            await websocket.send_json({"type": "error", "data": {"error": f"Unknown message type: {message_type}"}})
```
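
Registration then happens once at startup, and the receive loop shrinks to a single dispatch call; a brief usage sketch:

```python
router = WebSocketRouter()
router.register("set_name", SetNameHandler())
# router.register("join_lobby", JoinLobbyHandler())  # one handler per message type

# Inside the WebSocket receive loop:
# message = await websocket.receive_json()
# await router.route(message["type"], session, message.get("data", {}), websocket)
```
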

### Client Refactoring

#### 1. Centralized Connection Management

Create a single WebSocket connection manager:

```typescript
// src/connection/WebSocketManager.ts
export class WebSocketManager {
  private ws: WebSocket | null = null;
  private reconnectAttempts = 0;
  private messageHandlers = new Map<string, (data: any) => void>();

  constructor(private url: string) {}

  async connect(): Promise<void> {
    // Connection logic with automatic reconnection (a minimal sketch)
    return new Promise((resolve, reject) => {
      this.ws = new WebSocket(this.url);
      this.ws.onopen = () => {
        this.reconnectAttempts = 0;
        resolve();
      };
      this.ws.onmessage = (event) => this.handleMessage(event);
      this.ws.onerror = (event) => reject(event);
      this.ws.onclose = () => {
        // Exponential backoff, capped at 30 seconds
        const delay = Math.min(1000 * 2 ** this.reconnectAttempts++, 30_000);
        setTimeout(() => void this.connect(), delay);
      };
    });
  }

  subscribe(messageType: string, handler: (data: any) => void): void {
    this.messageHandlers.set(messageType, handler);
  }

  send(type: string, data: any): void {
    if (this.ws?.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify({ type, data }));
    }
  }

  private handleMessage(event: MessageEvent): void {
    const message = JSON.parse(event.data);
    const handler = this.messageHandlers.get(message.type);
    if (handler) {
      handler(message.data);
    }
  }
}
```

#### 2. Unified State Management

Use a state management pattern (Context + Reducer or Zustand):

```typescript
// src/store/AppStore.ts
interface AppState {
  session: Session | null;
  lobby: Lobby | null;
  participants: Participant[];
  connectionStatus: 'disconnected' | 'connecting' | 'connected';
  error: string | null;
}

type AppAction =
  | { type: 'SET_SESSION'; payload: Session }
  | { type: 'SET_LOBBY'; payload: Lobby }
  | { type: 'UPDATE_PARTICIPANTS'; payload: Participant[] }
  | { type: 'SET_CONNECTION_STATUS'; payload: AppState['connectionStatus'] }
  | { type: 'SET_ERROR'; payload: string | null };

const appReducer = (state: AppState, action: AppAction): AppState => {
  switch (action.type) {
    case 'SET_SESSION':
      return { ...state, session: action.payload };
    // ... other cases
    default:
      return state;
  }
};
```
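
Components then consume the store through a single dispatch path rather than ad-hoc setState calls; a minimal React sketch (the initial state values and component are illustrative):

```typescript
import { useReducer } from 'react';

const initialState: AppState = {
  session: null,
  lobby: null,
  participants: [],
  connectionStatus: 'disconnected',
  error: null,
};

function ConnectionBadge() {
  const [state, dispatch] = useReducer(appReducer, initialState);
  // e.g. dispatch({ type: 'SET_CONNECTION_STATUS', payload: 'connecting' });
  return <span>{state.connectionStatus}</span>;
}
```
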

### Voicebot Refactoring

#### 1. Unified Connection Interface

Create a common WebSocket interface used by both client and voicebot:

```python
# shared/websocket_client.py
from abc import ABC, abstractmethod
from typing import Dict, Any, Callable, Awaitable


class WebSocketClient(ABC):
    def __init__(self, url: str, session_id: str, lobby_id: str):
        self.url = url
        self.session_id = session_id
        self.lobby_id = lobby_id
        # Handlers are awaited below, so they must be async callables
        self.message_handlers: Dict[str, Callable[[Dict[str, Any]], Awaitable[None]]] = {}

    @abstractmethod
    async def connect(self) -> None:
        pass

    @abstractmethod
    async def send_message(self, message_type: str, data: Dict[str, Any]) -> None:
        pass

    def register_handler(self, message_type: str, handler: Callable[[Dict[str, Any]], Awaitable[None]]):
        self.message_handlers[message_type] = handler

    async def handle_message(self, message_type: str, data: Dict[str, Any]):
        handler = self.message_handlers.get(message_type)
        if handler:
            await handler(data)
```
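
A concrete implementation only has to fill in the transport; a sketch using the third-party `websockets` package (an assumption for illustration — the actual transport may differ):

```python
import json

import websockets  # assumed dependency for this sketch


class VoicebotWebSocketClient(WebSocketClient):
    async def connect(self) -> None:
        self._conn = await websockets.connect(self.url)

    async def send_message(self, message_type: str, data: Dict[str, Any]) -> None:
        await self._conn.send(json.dumps({"type": message_type, "data": data}))
```
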

## Implementation Plan

### Phase 1: Server Foundation (Week 1-2)
1. Extract `SessionManager` and `LobbyManager` classes
2. Implement a basic event system
3. Create the WebSocket message router
4. Move admin endpoints to a separate module

### Phase 2: Server Completion (Week 3-4)
1. Extract bot management functionality
2. Implement the remaining message handlers
3. Add comprehensive testing
4. Performance optimization

### Phase 3: Client Refactoring (Week 5-6)
1. Implement the centralized WebSocket manager
2. Create unified state management
3. Refactor components to use the new architecture
4. Add an error boundary and better error handling

### Phase 4: Voicebot Integration (Week 7-8)
1. Create the shared WebSocket interface
2. Refactor the voicebot to use common patterns
3. Improve bot lifecycle management
4. Integration testing

## Benefits of Proposed Architecture

### Maintainability
- **Single Responsibility**: Each module has a clear, focused purpose
- **Testability**: Smaller, focused classes are easier to unit test
- **Debugging**: Clear separation makes it easier to trace issues

### Scalability
- **Event-driven**: Loose coupling enables easier feature additions
- **Modular**: New functionality can be added without touching core logic
- **Performance**: The event system enables asynchronous processing

### Developer Experience
- **Code Navigation**: Easier to find relevant code
- **Documentation**: Smaller modules are easier to document
- **Onboarding**: New developers can understand individual components

### Reliability
- **Error Isolation**: Failures in one module don't cascade
- **State Management**: Centralized state reduces synchronization bugs
- **Connection Handling**: Robust reconnection and error recovery

## Risk Mitigation

### Breaking Changes
- Implement changes incrementally
- Maintain backward compatibility during the transition
- Comprehensive testing at each phase

### Performance Impact
- Benchmark before and after changes
- The event system should be lightweight
- Monitor memory usage and connection handling

### Team Coordination
- Clear communication about architecture changes
- Code review process for architectural decisions
- Documentation updates with each phase

## Conclusion

This refactoring will transform the current monolithic architecture into a maintainable, scalable system. The modular approach will reduce complexity, improve testability, and make the codebase more approachable for new developers while maintaining all existing functionality.
@@ -1,238 +0,0 @@
# Automated API Client Generation System

This document explains the automated TypeScript API client generation and update system for the AI Voicebot project.

## Overview

The system automatically:
1. **Generates OpenAPI schema** from the FastAPI server
2. **Creates TypeScript types** from the schema
3. **Updates the API client** with missing endpoint implementations using dynamic paths
4. **Updates the evolution checker** with current endpoint lists
5. **Validates TypeScript** compilation
6. **Runs evolution checks** to ensure completeness (a pipeline sketch follows below)

All generated API calls use the `PUBLIC_URL` environment variable to dynamically construct paths, making the system deployable to any base path without hardcoded `/ai-voicebot` prefixes.

## Files in the System

### Generated Files (Auto-updated)
- `client/openapi-schema.json` - OpenAPI schema from the server
- `client/src/api-types.ts` - TypeScript type definitions
- `client/src/api-client.ts` - API client (auto-sections updated)
- `client/src/api-evolution-checker.ts` - Evolution checker (lists updated)

### Manual Files
- `generate-ts-types.sh` - Main orchestration script
- `client/update-api-client.js` - API client updater utility
- `client/src/api-usage-examples.ts` - Usage examples and patterns

## Configuration

### Environment Variables

The system uses environment variables for dynamic path configuration:

- **`PUBLIC_URL`** - Base path for the application (e.g., `/ai-voicebot`, `/my-app`, etc.)
  - Used in: API paths, schema loading, asset paths
  - Default: `""` (empty string for root deployment)
  - Set in: Docker environment, build process, or runtime

### Dynamic Path Handling

All API endpoints use dynamic path construction:

```typescript
// Instead of hardcoded paths:
// "/ai-voicebot/api/health"

// The system uses:
this.getApiPath("/ai-voicebot/api/health")
// Which becomes: `${PUBLIC_URL}/api/health`
```

This allows deployment to different base paths without code changes.

## Usage

### Full Generation (Recommended)
```bash
./generate-ts-types.sh
```
This runs the complete pipeline and is the primary way to use the system.

### Individual Steps
```bash
# Inside client container
npm run generate-schema       # Generate OpenAPI schema
npm run generate-types        # Generate TypeScript types
npm run update-api-client     # Update API client
npm run check-api-evolution   # Check for missing endpoints
```

## How Auto-Updates Work

### API Client Updates

The `update-api-client.js` script:

1. **Parses the OpenAPI schema** to find all available endpoints
2. **Scans the existing API client** to detect implemented methods
3. **Identifies missing endpoints** by comparing the two (see the sketch below)
4. **Generates method implementations** for missing endpoints
5. **Updates the client class** by inserting new methods in the designated section
6. **Updates endpoint lists** used by evolution checking
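
Step 3 is again a set difference; a sketch of how the updater might compute it (the property names are illustrative, not the script's actual internals):

```javascript
// endpoints: [{ method: 'POST', path: '/ai-voicebot/api/admin/bulk-action', ... }, ...]
// implementedEndpoints: Set of 'METHOD:path' strings scanned from api-client.ts
function findMissingEndpoints(endpoints, implementedEndpoints) {
  return endpoints.filter(
    (e) => !implementedEndpoints.has(`${e.method.toUpperCase()}:${e.path}`)
  );
}
```
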

#### Auto-Generated Section
```typescript
// `base` is assumed to be derived from PUBLIC_URL at module scope,
// e.g.: const base = process.env.PUBLIC_URL ?? "";
export class ApiClient {
  // ... manual methods ...

  /**
   * Construct an API path using the PUBLIC_URL environment variable.
   * Replaces the hardcoded /ai-voicebot prefix with the dynamic base from the environment.
   */
  private getApiPath(schemaPath: string): string {
    return schemaPath.replace('/ai-voicebot', base);
  }

  // Auto-generated endpoints will be added here by update-api-client.js
  // DO NOT MANUALLY EDIT BELOW THIS LINE

  // New endpoints automatically appear here using this.getApiPath()
}
```

#### Method Generation
- **Method names** are derived from `operationId` or a path/method combination
- **Parameters** are inferred from path parameters and the request body
- **Return types** use a generic `Promise<any>` (can be enhanced)
- **Path handling** supports both static and parameterized paths using `PUBLIC_URL`
- **Dynamic paths** automatically replace hardcoded prefixes with environment-based values

### Evolution Checker Updates

The evolution checker tracks:
- **Known schema endpoints** - updated from the current OpenAPI schema
- **Implemented endpoints** - updated from the actual API client code
- **Missing endpoints** - the calculated difference, used for warnings

## Customization

### Adding Manual Endpoints

For endpoints not in the OpenAPI schema (e.g., external services), add them manually before the auto-generated section:

```typescript
// Manual endpoints (these won't be auto-generated)
async getCustomData(): Promise<CustomResponse> {
  return this.request<CustomResponse>("/custom/endpoint", { method: "GET" });
}

// Auto-generated endpoints will be added here by update-api-client.js
// DO NOT MANUALLY EDIT BELOW THIS LINE
```

### Improving Generated Methods

To enhance auto-generated methods:

1. **Better Type Inference**: Modify `generateMethodSignature()` in `update-api-client.js` to use specific types from the schema
2. **Parameter Validation**: Add validation logic in method generation
3. **Error Handling**: Customize error handling patterns
4. **Documentation**: Add JSDoc generation from OpenAPI descriptions

### Schema Evolution Detection

The system detects:
- **New endpoints** added to the OpenAPI schema
- **Changed endpoints** (parameter or response changes)
- **Deprecated endpoints** (with proper OpenAPI marking)

## Development Workflow

1. **Develop API endpoints** in the FastAPI server with proper typing
2. **Run the generation script** to update the client: `./generate-ts-types.sh`
3. **Use the generated types** in React components
4. **Manually customize** complex endpoints if needed
5. **Commit all changes**, including generated and updated files

## Best Practices

### Server Development
- Use **Pydantic models** for all request/response types
- Add **proper OpenAPI metadata** (summary, description, tags)
- Use **consistent naming** for operation IDs
- **Version your API** to handle breaking changes

### Client Development
- **Import from api-client.ts** rather than making raw fetch calls
- **Use generated types** for type safety
- **Avoid editing auto-generated sections** - they will be overwritten
- **Add custom endpoints manually** when needed

### Type Safety
```typescript
// Good: Using generated types and client
import { apiClient, type LobbyModel, type LobbyCreateRequest } from './api-client';

const createLobby = async (sessionId: string, data: LobbyCreateRequest): Promise<LobbyModel> => {
  const response = await apiClient.createLobby(sessionId, data);
  return response.data; // Fully typed
};

// Avoid: Direct fetch calls
const createLobbyRaw = async () => {
  const response = await fetch('/api/lobby', { /* ... */ });
  return response.json(); // No type safety
};
```

## Troubleshooting

### Common Issues

**"Could not find insertion marker"**
- The API client file was manually edited and the auto-generation markers were removed
- Restore the markers or regenerate the client file from the template

**"Missing endpoints detected"**
- New endpoints were added to the server but the generation script wasn't run
- Run `./generate-ts-types.sh` to update the client

**"Type errors after generation"**
- Schema changes may have affected existing manual code
- Check the TypeScript compiler output and update the affected code

**"Duplicate method names"**
- Manual methods conflict with auto-generated ones
- Rename the manual methods or adjust the operation ID generation logic

### Debug Mode

Add debug logging by modifying `update-api-client.js`:

```javascript
// Add after parsing
console.log('Schema endpoints:', this.endpoints.map(e => `${e.method}:${e.path}`));
console.log('Implemented endpoints:', Array.from(this.implementedEndpoints));
```

## Future Enhancements

- **Stronger type inference** from OpenAPI schema components
- **Request/response validation** using schema definitions
- **Mock data generation** for testing
- **API versioning support** with backward compatibility
- **Performance optimization** with request caching
- **OpenAPI spec validation** before generation

## Integration with Build Process

The system integrates with:
- **Docker Compose** for cross-container coordination
- **npm scripts** for the frontend build pipeline
- **TypeScript compilation** for type checking
- **CI/CD workflows** for automated updates

This ensures that API changes are automatically reflected in the frontend without manual intervention, reducing development friction and preventing API/client drift.
@@ -1,220 +0,0 @@
# Chat Integration for AI Voicebot System

This document describes the chat functionality that has been integrated into the AI voicebot system, allowing bots to send and receive chat messages through the WebSocket signaling server.

## Overview

The chat integration enables bots to:
1. **Receive chat messages** from other participants in the lobby
2. **Send chat messages** back to the lobby
3. **Process and respond** to specific commands or keywords
4. **Integrate seamlessly** with the existing WebRTC signaling infrastructure

## Architecture

### Core Components

1. **WebRTC Signaling Client** (`webrtc_signaling.py`)
   - Extended with chat message handling capabilities
   - Added an `on_chat_message_received` callback for bots
   - Added a `send_chat_message()` method for sending messages

2. **Bot Orchestrator** (`bot_orchestrator.py`)
   - Enhanced bot discovery to detect chat handlers
   - Sets up chat message callbacks when bots join lobbies
   - Manages the connection between the WebRTC client and bot chat handlers

3. **Chat Models** (`shared/models.py`)
   - `ChatMessageModel`: Structure for chat messages
   - `ChatMessagesListModel`: For message lists
   - `ChatMessagesSendModel`: For sending messages

### Bot Interface

Bots can now implement an optional `handle_chat_message` function:

```python
async def handle_chat_message(
    chat_message: ChatMessageModel,
    send_message_func: Callable[[str], Awaitable[None]]
) -> Optional[str]:
    """
    Handle incoming chat messages and optionally return a response.

    Args:
        chat_message: The received chat message
        send_message_func: Function to send messages back to the lobby

    Returns:
        Optional response message to send back to the lobby
    """
    # Process the message and return a response
    return "Hello! I received your message."
```

## Implementation Details

### 1. WebSocket Message Handling

The WebRTC signaling client now handles `chat_message` type messages:

```python
elif msg_type == "chat_message":
    try:
        validated = ChatMessageModel.model_validate(data)
    except ValidationError as e:
        logger.error(f"Invalid chat_message payload: {e}", exc_info=True)
        return
    logger.info(f"Received chat message from {validated.sender_name}: {validated.message[:50]}...")
    # Call the callback if it's set
    if self.on_chat_message_received:
        try:
            await self.on_chat_message_received(validated)
        except Exception as e:
            logger.error(f"Error in chat message callback: {e}", exc_info=True)
```

### 2. Bot Discovery Enhancement

The bot orchestrator now detects chat handlers during discovery:

```python
if hasattr(mod, "handle_chat_message") and callable(getattr(mod, "handle_chat_message")):
    chat_handler = getattr(mod, "handle_chat_message")

bots[info.get("name", name)] = {
    "module": name,
    "info": info,
    "create_tracks": create_tracks,
    "chat_handler": chat_handler
}
```

### 3. Chat Handler Setup

When a bot joins a lobby, the orchestrator sets up the chat handler:

```python
if chat_handler:
    async def bot_chat_handler(chat_message: ChatMessageModel):
        """Wrapper to call the bot's chat handler and optionally send responses"""
        try:
            response = await chat_handler(chat_message, client.send_chat_message)
            if response and isinstance(response, str):
                await client.send_chat_message(response)
        except Exception as e:
            logger.error(f"Error in bot chat handler for {bot_name}: {e}", exc_info=True)

    client.on_chat_message_received = bot_chat_handler
```

## Example Bots

### 1. Chatbot (`bots/chatbot.py`)

A simple conversational bot that responds to greetings and commands:

- Responds to keywords like "hello", "how are you", "goodbye"
- Provides time information when asked
- Tells jokes on request
- Handles direct mentions intelligently

Example interactions:
- User: "hello" → Bot: "Hi there!"
- User: "time" → Bot: "Let me check... it's currently 2025-09-03 23:45:12"
- User: "joke" → Bot: "Why don't scientists trust atoms? Because they make up everything!"

### 2. Enhanced Whisper Bot (`bots/whisper.py`)

The existing speech recognition bot now also handles chat commands (a sketch of the handler follows the examples below):

- Responds to messages starting with "whisper:"
- Provides help and status information
- Echoes back commands for demonstration

Example interactions:
- User: "whisper: hello" → Bot: "Hello UserName! I'm the Whisper speech recognition bot."
- User: "whisper: help" → Bot: "I can process speech and respond to simple commands..."
- User: "whisper: status" → Bot: "Whisper bot is running and ready to process audio and chat messages."
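
Internally this is just prefix parsing inside the optional chat handler; a simplified sketch of the idea (the reply wording follows the examples above, the rest of the body is illustrative rather than the bot's actual code):

```python
async def handle_chat_message(
    chat_message: ChatMessageModel,
    send_message_func: Callable[[str], Awaitable[None]],
) -> Optional[str]:
    text = chat_message.message.strip().lower()
    if not text.startswith("whisper:"):
        return None  # ignore messages not addressed to this bot
    command = text.removeprefix("whisper:").strip()
    if command == "status":
        return "Whisper bot is running and ready to process audio and chat messages."
    return f"Hello {chat_message.sender_name}! I'm the Whisper speech recognition bot."
```
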

## Server Integration

The server (`server/main.py`) already handles chat messages through WebSocket:

1. **Receiving messages**: the `send_chat_message` message type
2. **Broadcasting**: the `broadcast_chat_message` method distributes messages to all lobby participants (see the sketch below)
3. **Storage**: messages are stored in the lobby's `chat_messages` list
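
A broadcast of this kind amounts to storing the message and fanning it out to every connected participant; a sketch of the idea (the attribute names such as `lobby.participants` are illustrative, not the server's actual code):

```python
async def broadcast_chat_message(self, lobby, chat_message: ChatMessageModel) -> None:
    lobby.chat_messages.append(chat_message)  # persist in the lobby's history
    payload = {"type": "chat_message", "data": chat_message.model_dump()}
    for participant in lobby.participants:
        try:
            await participant.websocket.send_json(payload)
        except Exception:
            # A single dead connection must not break the broadcast
            logger.exception("Failed to deliver chat message")
```
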

## Testing

The implementation has been tested with:

1. **Bot Discovery**: All bots are correctly discovered, with chat capabilities detected
2. **Message Processing**: Both the chatbot and the whisper bot respond correctly to test messages
3. **Integration**: The WebRTC signaling client properly routes messages to bot handlers

Test results:
```
Discovered 3 bots:
  Bot: chatbot
    Has chat handler: True
  Bot: synthetic_media
    Has chat handler: False
  Bot: whisper
    Has chat handler: True

Chat functionality test:
- Chatbot response to "hello": "Hey!"
- Whisper response to "whisper: hello": "Hello TestUser! I'm the Whisper speech recognition bot."
✅ Chat functionality test completed!
```

## Usage

### For Bot Developers

To add chat capabilities to a bot:

1. Import the required types:
```python
from typing import Dict, Optional, Callable, Awaitable
from shared.models import ChatMessageModel
```

2. Implement the chat handler:
```python
async def handle_chat_message(
    chat_message: ChatMessageModel,
    send_message_func: Callable[[str], Awaitable[None]]
) -> Optional[str]:
    # Your chat logic here
    if "hello" in chat_message.message.lower():
        return f"Hello {chat_message.sender_name}!"
    return None
```

3. The bot orchestrator will automatically detect and wire up the chat handler when the bot joins a lobby.

### For System Integration

The chat system integrates seamlessly with the existing voicebot infrastructure:

1. **No breaking changes** to existing bots without chat handlers
2. **Automatic discovery** of chat capabilities
3. **Error isolation** - chat handler failures don't affect WebRTC functionality
4. **Logging** provides visibility into chat message flow

## Future Enhancements

Potential improvements for the chat system:

1. **Message History**: Bots could access recent chat history
2. **Rich Responses**: Support for formatted messages, images, etc.
3. **Private Messaging**: Direct messages between participants
4. **Chat Commands**: Standardized command parsing framework
5. **Persistence**: Long-term storage of chat interactions
6. **Analytics**: Message processing metrics and bot performance monitoring

## Conclusion

The chat integration provides a powerful foundation for creating interactive AI bots that can engage with users through text while maintaining their audio/video capabilities. The implementation is robust, well-tested, and ready for production use.
@@ -1,5 +1,5 @@
 FROM ubuntu:oracular
-# Stick with Python3.12
+# Stick with Python3.12 (plucky has 3.13)

 # Install some utilities frequently used
 RUN apt-get update \
@@ -28,6 +28,20 @@ RUN apt-get update \
     && apt-get clean \
     && rm -rf /var/lib/apt/lists/{apt,dpkg,cache,log}

+# Install Intel graphics runtimes
+RUN apt-get update \
+    && DEBIAN_FRONTEND=noninteractive apt-get install -y software-properties-common \
+    && add-apt-repository -y ppa:kobuk-team/intel-graphics \
+    && apt-get update \
+    && DEBIAN_FRONTEND=noninteractive apt-get install -y \
+    libze-intel-gpu1 \
+    libze1 \
+    intel-ocloc \
+    intel-opencl-icd \
+    xpu-smi \
+    clinfo \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/{apt,dpkg,cache,log}
+
 # Install uv using the official Astral script
 RUN curl -Ls https://astral.sh/uv/install.sh | bash
@@ -1,190 +0,0 @@
"""
Documentation for the Server Refactoring Step 1 Implementation

This document outlines what was accomplished in Step 1 of the server refactoring
and how to verify the implementation works.
"""

# STEP 1 IMPLEMENTATION SUMMARY

## What Was Accomplished

### 1. Created Modular Architecture
- **server/core/**: Core business logic modules
  - `session_manager.py`: Session lifecycle and persistence
  - `lobby_manager.py`: Lobby management and chat functionality
  - `auth_manager.py`: Authentication and name protection

- **server/models/**: Event system and data models
  - `events.py`: Event-driven architecture foundation

- **server/websocket/**: WebSocket handling
  - `message_handlers.py`: Clean message routing (replaces the massive switch statement)
  - `connection.py`: WebSocket connection management

- **server/api/**: HTTP API endpoints
  - `admin.py`: Admin endpoints (extracted from main.py)
  - `sessions.py`: Session management endpoints
  - `lobbies.py`: Lobby management endpoints

### 2. Key Improvements
- **Separation of Concerns**: Each module has a single responsibility
- **Event-Driven Architecture**: Decoupled communication between components
- **Clean Message Routing**: Replaced the 200+ line switch statement with a handler pattern
- **Thread Safety**: Proper locking and state management
- **Type Safety**: Better type annotations and error handling
- **Testability**: Modules can be tested independently

### 3. Backward Compatibility
- All existing endpoints work unchanged
- Same WebSocket message protocols
- Same session/lobby behavior
- Same authentication mechanisms

## File Structure Created

```
server/
├── main_refactored.py       # New main file using modular architecture
├── core/
│   ├── __init__.py
│   ├── session_manager.py   # Session lifecycle management
│   ├── lobby_manager.py     # Lobby and chat management
│   └── auth_manager.py      # Authentication and passwords
├── websocket/
│   ├── __init__.py
│   ├── message_handlers.py  # WebSocket message routing
│   └── connection.py        # Connection management
├── api/
│   ├── __init__.py
│   ├── admin.py             # Admin HTTP endpoints
│   ├── sessions.py          # Session HTTP endpoints
│   └── lobbies.py           # Lobby HTTP endpoints
└── models/
    ├── __init__.py
    └── events.py            # Event system
```

## How to Test/Verify

### 1. Syntax Verification
The modules can be imported and instantiated:

```python
# In server/ directory:
python3 -c "
import sys; sys.path.append('.')
from core.session_manager import SessionManager
from core.lobby_manager import LobbyManager
from core.auth_manager import AuthManager
print('✓ All modules import successfully')
"
```

### 2. Basic Functionality Test
```python
# Test basic object creation (no FastAPI dependencies)
python3 -c "
import sys; sys.path.append('.')
from core.auth_manager import AuthManager
auth = AuthManager()
auth.set_password('test', 'password')
assert auth.verify_password('test', 'password')
assert not auth.verify_password('test', 'wrong')
print('✓ AuthManager works correctly')
"
```

### 3. Server Startup Test
To test the full refactored server:

```bash
# Start the refactored server
cd server/
python3 main_refactored.py
```

Expected output:
```
INFO - Starting AI Voice Bot server with modular architecture...
INFO - Loaded 0 sessions from sessions.json
INFO - AI Voice Bot server started successfully!
INFO - Server URL: /
INFO - Sessions loaded: 0
INFO - Lobbies available: 0
INFO - Protected names: 0
```

### 4. API Endpoints Test
```bash
# Test health endpoint
curl http://localhost:8000/api/system/health

# Expected response:
{
  "status": "ok",
  "architecture": "modular",
  "version": "2.0.0",
  "managers": {
    "session_manager": "active",
    "lobby_manager": "active",
    "auth_manager": "active",
    "websocket_manager": "active"
  },
  "statistics": {
    "sessions": 0,
    "lobbies": 0,
    "protected_names": 0
  }
}
```

## Benefits Achieved

### Maintainability
- **Reduced Complexity**: The original 2300-line main.py is split into focused modules
- **Clear Dependencies**: Each module has explicit dependencies
- **Easier Debugging**: Issues can be isolated to specific modules

### Testability
- **Unit Testing**: Each module can be tested independently
- **Mocking**: Dependencies can be easily mocked for testing
- **Integration Testing**: Components can be tested together

### Developer Experience
- **Code Navigation**: Easy to find relevant functionality
- **Onboarding**: New developers can understand individual components
- **Documentation**: Smaller modules are easier to document

### Scalability
- **Event System**: Enables loose coupling and async processing
- **Modular Growth**: New features can be added without touching core logic
- **Performance**: Better separation allows for targeted optimizations

## Next Steps (Future Phases)

### Phase 2: Complete WebSocket Extraction
- Extract the remaining WebSocket message types (WebRTC signaling)
- Add comprehensive error handling
- Implement message validation

### Phase 3: Enhanced Event System
- Add event persistence for reliability
- Implement event replay capabilities
- Add monitoring and metrics

### Phase 4: Advanced Features
- Plugin architecture for bots
- Rate limiting and security enhancements
- Advanced admin capabilities

## Migration Path

The refactored architecture can be adopted gradually:

1. **Testing**: Use `main_refactored.py` in development
2. **Validation**: Verify all functionality works correctly
3. **Deployment**: Replace `main.py` with `main_refactored.py`
4. **Cleanup**: Remove the old monolithic code after verification

The modular design ensures that each component can evolve independently while maintaining system stability.
@@ -1,153 +0,0 @@
🎉 SERVER REFACTORING STEP 1 - SUCCESSFULLY COMPLETED!

## Summary of Implementation

### ✅ What Was Accomplished

**1. Modular Architecture Created**
```
server/
├── core/                    # Business logic modules
│   ├── session_manager.py   # Session lifecycle & persistence
│   ├── lobby_manager.py     # Lobby management & chat
│   └── auth_manager.py      # Authentication & passwords
├── websocket/               # WebSocket handling
│   ├── message_handlers.py  # Message routing (replaces switch statement)
│   └── connection.py        # Connection management
├── api/                     # HTTP endpoints
│   ├── admin.py             # Admin endpoints
│   ├── sessions.py          # Session endpoints
│   └── lobbies.py           # Lobby endpoints
├── models/                  # Events & data models
│   └── events.py            # Event-driven architecture
└── main_refactored.py       # New modular main file
```

**2. Key Improvements Achieved**
- ✅ **Separation of Concerns**: The 2300-line monolith is split into focused modules
- ✅ **Event-Driven Architecture**: Decoupled communication via an event bus
- ✅ **Clean Message Routing**: Replaced the massive switch statement with a handler pattern
- ✅ **Thread Safety**: Proper locking and state management maintained
- ✅ **Dependency Injection**: Managers can be configured and swapped
- ✅ **Testability**: Each module can be tested independently

**3. Backward Compatibility Maintained**
- ✅ **Same API endpoints**: All existing HTTP endpoints work unchanged
- ✅ **Same WebSocket protocol**: All message types work identically
- ✅ **Same authentication**: Password and name protection unchanged
- ✅ **Same session persistence**: The existing sessions.json format is preserved

### 🧪 Verification Results

- **Architecture Structure**: ✅ All directories and files created correctly
- **Module Imports**: ✅ All core modules import successfully in the proper environment
- **Server Startup**: ✅ The refactored server starts and initializes all components
- **Session Loading**: ✅ Successfully loaded 4 existing sessions from disk
- **Background Tasks**: ✅ Cleanup and validation tasks start properly
- **Session Integrity**: ✅ Detected and logged duplicate session names
- **Graceful Shutdown**: ✅ All components shut down cleanly

### 📊 Test Results

```
INFO - Starting AI Voice Bot server with modular architecture...
INFO - Loaded 4 sessions from sessions.json
INFO - Starting session background tasks...
INFO - AI Voice Bot server started successfully!
INFO - Server URL: /ai-voicebot/
INFO - Sessions loaded: 4
INFO - Lobbies available: 0
INFO - Protected names: 0
INFO - Session background tasks started
```

**Session Integrity Validation Working**:
```
WARNING - Session integrity issues found: 3 issues
WARNING - Integrity issue: Duplicate name 'whisper-bot' found in 3 sessions
```

### 🔧 Technical Achievements

**1. SessionManager**
- Extracted all session lifecycle management
- Background cleanup and validation tasks
- Thread-safe operations with proper locking
- Event publishing for session state changes

**2. LobbyManager**
- Extracted lobby creation and management
- Chat message handling and persistence
- Event-driven participant updates
- Automatic empty-lobby cleanup

**3. AuthManager**
- Extracted password hashing and verification
- Name protection and takeover logic
- Integrity validation for auth data
- Clean separation from session logic

**4. WebSocket Message Router**
- Replaced the 200+ line switch statement
- Handler pattern for clean message processing
- Easy to extend with new message types
- Proper error handling and validation

**5. Event System**
- Decoupled component communication
- Async event processing
- Error isolation and logging
- Foundation for future enhancements

### 🚀 Benefits Realized

**Maintainability**
- Code is now organized into logical, focused modules
- Much easier to locate and modify specific functionality
- Reduced cognitive load when working on individual features

**Testability**
- Each module can be unit tested independently
- Dependencies can be mocked easily
- Integration tests can focus on specific interactions

**Scalability**
- The event system enables loose coupling
- New features can be added without touching core logic
- Components can be optimized independently

**Developer Experience**
- New developers can understand individual components
- Clear separation of responsibilities
- Better error messages and logging

### 🎯 Next Steps (Future Phases)

**Phase 2: Complete WebSocket Extraction**
- Extract WebRTC signaling handlers
- Add comprehensive message validation
- Implement rate limiting

**Phase 3: Enhanced Event System**
- Add event persistence
- Implement event replay capabilities
- Add metrics and monitoring

**Phase 4: Advanced Features**
- Plugin architecture for bots
- Advanced admin capabilities
- Performance optimizations

### 🏁 Conclusion

**Step 1 of the server refactoring is COMPLETE and SUCCESSFUL!**

The monolithic `main.py` has been successfully transformed into a clean, modular architecture that:
- Maintains 100% backward compatibility
- Significantly improves code organization
- Provides a solid foundation for future development
- Reduces maintenance burden and technical debt

The refactored server is ready for production use and provides a much better foundation for continued development and feature additions.

**Ready to proceed to Phase 2 or continue with other improvements! 🚀**
@@ -1,168 +0,0 @@
# OpenAPI TypeScript Generation

This project now supports automatic TypeScript type generation from the FastAPI server's Pydantic models using OpenAPI schema generation.

## Overview

The implementation follows the "OpenAPI Schema Generation (Recommended for FastAPI)" approach:

1. **Server-side**: FastAPI automatically generates an OpenAPI schema from Pydantic models
2. **Generation**: A Python script extracts the schema and saves it as JSON
3. **TypeScript**: `openapi-typescript` converts the schema to TypeScript types
4. **Client**: A typed API client provides type-safe server communication

## Generated Files

- `client/openapi-schema.json` - OpenAPI schema extracted from FastAPI
- `client/src/api-types.ts` - TypeScript interfaces generated from the OpenAPI schema
- `client/src/api-client.ts` - Typed API client with convenience methods

## How It Works

### 1. Schema Generation
The `server/generate_schema_simple.py` script (see the sketch below):
- Imports the FastAPI app from `main.py`
- Extracts the OpenAPI schema using `app.openapi()`
- Saves the schema as JSON in `client/openapi-schema.json`
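
The script is essentially a few lines around FastAPI's built-in schema export; a minimal sketch of what it does (the output path here assumes the script runs from `server/`):

```python
import json

from main import app  # the FastAPI application

# app.openapi() returns the full OpenAPI schema as a dict
with open("../client/openapi-schema.json", "w") as f:
    json.dump(app.openapi(), f, indent=2)
```
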

### 2. TypeScript Generation
The `openapi-typescript` package:
- Reads the OpenAPI schema JSON
- Generates TypeScript interfaces in `client/src/api-types.ts`
- Creates type-safe definitions for all Pydantic models

### 3. API Client
The `client/src/api-client.ts` file provides:
- A type-safe API client class
- Convenience functions for each endpoint
- Proper error handling with a custom `ApiError` class
- Re-exported types for easy importing

## Usage in React Components

```typescript
import { apiClient, adminApi, healthApi, lobbiesApi, sessionsApi } from './api-client';
import type { LobbyModel, SessionModel, AdminSetPassword, LobbyCreateRequest } from './api-client';

// Using the convenience APIs
const healthStatus = await healthApi.check();
const lobbies = await lobbiesApi.getAll();
const session = await sessionsApi.getCurrent();

// Using the main client
const adminNames = await apiClient.adminListNames();

// With type safety for request data
const passwordData: AdminSetPassword = {
  name: "admin",
  password: "newpassword"
};
const result = await adminApi.setPassword(passwordData);

// Type-safe lobby creation
const lobbyRequest: LobbyCreateRequest = {
  type: "lobby_create",
  data: {
    name: "My Lobby",
    private: false
  }
};
const newLobby = await sessionsApi.createLobby("session-id", lobbyRequest);
```

## Regenerating Types

### Manual Generation
```bash
# Generate schema from server
docker compose exec server uv run python3 generate_schema_simple.py

# Generate TypeScript types
docker compose exec client npx openapi-typescript openapi-schema.json -o src/api-types.ts

# Type check
docker compose exec client npm run type-check
```

### Automated Generation
```bash
# Run the comprehensive generation script
./generate-ts-types.sh
```

### NPM Scripts (in frontend container)
```bash
# Generate just the schema
npm run generate-schema

# Generate just the TypeScript types (requires the schema to exist)
npm run generate-types

# Generate both schema and types
npm run generate-api-types
```

## Development Workflow

1. **Modify Pydantic models** in `shared/models.py`
2. **Regenerate types** using one of the methods above
3. **Update React components** to use the new types
4. **Type check** to ensure everything compiles

## Benefits

- ✅ **Type Safety**: Full TypeScript type checking for API requests/responses
- ✅ **Auto-completion**: IDE support with auto-complete for API methods and data structures
- ✅ **Error Prevention**: Catch type mismatches at compile time
- ✅ **Documentation**: Self-documenting API with TypeScript interfaces
- ✅ **Sync Guarantee**: Types are always in sync with server models
- ✅ **Refactoring Safety**: The IDE can safely refactor across frontend/backend

## File Structure

```
server/
├── main.py                     # FastAPI app with Pydantic models
├── generate_schema_simple.py   # Schema extraction script
└── generate_api_client.py      # Enhanced generator (backup)

shared/
└── models.py                   # Pydantic models (source of truth)

client/
├── openapi-schema.json         # Generated OpenAPI schema
├── package.json                # Updated with openapi-typescript dependency
└── src/
    ├── api-types.ts            # Generated TypeScript interfaces
    └── api-client.ts           # Typed API client
```

## Troubleshooting

### Container Issues
If the frontend container has dependency conflicts:
```bash
# Rebuild the frontend container
docker compose build client
docker compose up -d client
```

### TypeScript Errors
Ensure the generated types are up to date:
```bash
./generate-ts-types.sh
```

### Module Not Found Errors
Check that the volume mounts are working correctly and that files are synced between host and container.

## API Evolution Detection

The system now includes automatic detection of API changes:

- **Automatic Checking**: In development mode, the system automatically warns about unimplemented endpoints
- **Console Warnings**: Clear warnings appear in the browser console when new API endpoints are available
- **Implementation Stubs**: Provides ready-to-use code stubs for new endpoints
- **Schema Monitoring**: Detects when the OpenAPI schema changes

See `client/src/API_EVOLUTION.md` for detailed documentation on using this feature.
@@ -140,10 +140,6 @@ class Session:
         self.bot_instance_id: Optional[str] = None  # Bot instance ID for tracking
         self.session_lock = threading.RLock()  # Instance-level lock

-    def is_bot(self) -> bool:
-        """Check if this session represents a bot"""
-        return bool(self.bot_run_id or self.bot_provider_id or self.bot_instance_id)
-
     def getName(self) -> str:
         with self.session_lock:
             return f"{self.short}:{self.name if self.name else '[ ---- ]'}"
@ -405,10 +401,6 @@ class SessionManager:
|
||||
with self.lock:
|
||||
sessions_list: List[SessionSaved] = []
|
||||
for s in self._instances:
|
||||
# Skip bot sessions - they should not be persisted
|
||||
# Bot sessions are managed by the voicebot service lifecycle
|
||||
if s.bot_instance_id is not None or s.bot_run_id is not None or s.bot_provider_id is not None:
|
||||
continue
|
||||
sessions_list.append(s.to_saved())
|
||||
|
||||
# Note: We'll need to handle name_passwords separately or inject it
|
||||
|
@ -104,12 +104,12 @@ logger.info(f"Starting server with public URL: {public_url}")


# Global managers - these replace the global variables from original main.py
session_manager: SessionManager = None
lobby_manager: LobbyManager = None
auth_manager: AuthManager = None
bot_manager: BotManager = None
bot_config_manager: BotConfigManager = None
websocket_manager: WebSocketConnectionManager = None
session_manager: SessionManager | None = None
lobby_manager: LobbyManager | None = None
auth_manager: AuthManager | None = None
bot_manager: BotManager | None = None
bot_config_manager: BotConfigManager | None = None
websocket_manager: WebSocketConnectionManager | None = None


@asynccontextmanager

@ -1,302 +0,0 @@
# AI Voicebot

A WebRTC-enabled AI voicebot system with speech recognition and synthetic media capabilities. The voicebot can run in two modes: as a client connecting to lobbies or as a provider serving bots to other applications.

## Features

- **Speech Recognition**: Uses Whisper models for real-time audio transcription
- **Synthetic Media**: Generates animated video and audio tracks
- **WebRTC Integration**: Real-time peer-to-peer communication
- **Bot Provider System**: Can register with a main server to provide bot services
- **Flexible Deployment**: Docker-based with development and production modes

## Quick Start

### Prerequisites

- Docker and Docker Compose
- Python 3.12+ (if running locally)
- Access to a compatible signaling server

### Running with Docker

#### 1. Bot Provider Mode (Recommended)

Run the voicebot as a bot provider that registers with the main server:

```bash
# Development mode with auto-reload
VOICEBOT_MODE=provider PRODUCTION=false docker-compose up voicebot

# Production mode
VOICEBOT_MODE=provider PRODUCTION=true docker-compose up voicebot
```

#### 2. Direct Client Mode

Run the voicebot as a direct client connecting to a lobby:

```bash
# Development mode
VOICEBOT_MODE=client PRODUCTION=false docker-compose up voicebot

# Production mode
VOICEBOT_MODE=client PRODUCTION=true docker-compose up voicebot
```

### Running Locally

#### 1. Setup Environment

```bash
cd voicebot/

# Create virtual environment
uv init --python /usr/bin/python3.12 --name "ai-voicebot-agent"
uv add -r requirements.txt

# Activate environment
source .venv/bin/activate
```

#### 2. Bot Provider Mode

```bash
# Development with auto-reload
python main.py --mode provider --server-url https://your-server.com/ai-voicebot --reload --insecure

# Production
python main.py --mode provider --server-url https://your-server.com/ai-voicebot
```

#### 3. Direct Client Mode

```bash
python main.py --mode client \
    --server-url https://your-server.com/ai-voicebot \
    --lobby "my-lobby" \
    --session-name "My Bot" \
    --insecure
```

## Configuration

### Environment Variables

| Variable | Description | Default | Example |
|----------|-------------|---------|---------|
| `VOICEBOT_MODE` | Operating mode: `client` or `provider` | `client` | `provider` |
| `PRODUCTION` | Production mode flag | `false` | `true` |

### Command Line Arguments

#### Common Arguments
- `--mode`: Run as `client` or `provider`
- `--server-url`: Main server URL
- `--insecure`: Allow insecure SSL connections
- `--help`: Show all available options

#### Provider Mode Arguments
- `--host`: Host to bind the provider server (default: `0.0.0.0`)
- `--port`: Port for the provider server (default: `8788`)
- `--reload`: Enable auto-reload for development

#### Client Mode Arguments
- `--lobby`: Lobby name to join (default: `default`)
- `--session-name`: Display name for the bot (default: `Python Bot`)
- `--session-id`: Existing session ID to reuse
- `--password`: Password for protected names
- `--private`: Create/join private lobby

## Available Bots

The voicebot system includes the following bot types:

### 1. Whisper Bot
- **Name**: `whisper`
- **Description**: Speech recognition agent using OpenAI Whisper models
- **Capabilities**: Real-time audio transcription, multiple language support
- **Models**: Supports various Whisper and Distil-Whisper models

### 2. Synthetic Media Bot
- **Name**: `synthetic_media`
- **Description**: Generates animated video and audio tracks
- **Capabilities**: Animated video generation, synthetic audio, edge detection on incoming video

## Architecture

### Bot Provider System

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Main Server   │    │   Bot Provider   │    │   Client App    │
│                 │◄───┤   (Voicebot)     │    │                 │
│ - Bot Registry  │    │ - Whisper Bot    │    │ - Bot Manager   │
│ - Lobby Mgmt    │    │ - Synthetic Bot  │    │ - UI Controls   │
│ - API Endpoints │    │ - API Server     │    │ - Lobby View    │
└─────────────────┘    └──────────────────┘    └─────────────────┘
```

### Flow
1. The voicebot registers as a bot provider with the main server
2. The main server discovers available bots from providers
3. A client requests a bot to join a lobby via the main server
4. The main server forwards the request to the appropriate provider
5. The provider creates a bot instance that connects to the lobby

## Development

### Auto-Reload

In development mode, the bot provider supports auto-reload using uvicorn:

```bash
# Watches /voicebot and /shared directories for changes
python main.py --mode provider --reload
```

### Adding New Bots

1. Create a new module in `voicebot/bots/`
2. Implement the required functions:
   ```python
   def agent_info() -> dict:
       return {"name": "my_bot", "description": "My custom bot"}

   def create_agent_tracks(session_name: str) -> dict:
       # Return MediaStreamTrack instances
       return {"audio": my_audio_track, "video": my_video_track}
   ```
3. The bot will be automatically discovered and available (a fuller sketch follows this list)

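For orientation, a complete bot module might look like the sketch below. This is illustrative only: the module name `bots/silence_bot.py` and the track internals are assumptions (the silent-track pattern mirrors the aiortc/av conventions used elsewhere in this repository); only `agent_info()` and `create_agent_tracks()` are required by the discovery contract.

```python
# Hypothetical bots/silence_bot.py - illustrative sketch, not shipped code.
import asyncio
import fractions

import numpy as np
from aiortc import MediaStreamTrack
from av import AudioFrame


class SilentAudioTrack(MediaStreamTrack):
    kind = "audio"

    def __init__(self, sample_rate: int = 16000, fps: int = 50) -> None:
        super().__init__()
        self.sample_rate = sample_rate
        self.samples_per_frame = sample_rate // fps
        self._pts = 0

    async def recv(self) -> AudioFrame:
        # One frame of int16 mono silence, paced in real time.
        data = np.zeros((1, self.samples_per_frame), dtype=np.int16)
        frame = AudioFrame.from_ndarray(data, layout="mono")
        frame.sample_rate = self.sample_rate
        frame.pts = self._pts
        frame.time_base = fractions.Fraction(1, self.sample_rate)
        self._pts += self.samples_per_frame
        await asyncio.sleep(self.samples_per_frame / self.sample_rate)
        return frame


def agent_info() -> dict:
    return {"name": "silence_bot", "description": "Example bot that publishes silent audio"}


def create_agent_tracks(session_name: str) -> dict:
    return {"audio": SilentAudioTrack()}
```
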
### Testing

```bash
# Test bot discovery
python test_bot_api.py

# Test client connection
python main.py --mode client --lobby test --session-name "Test Bot"
```

## Production Deployment

### Docker Compose

```yaml
version: '3.8'
services:
  voicebot-provider:
    build: .
    environment:
      - VOICEBOT_MODE=provider
      - PRODUCTION=true
    ports:
      - "8788:8788"
    volumes:
      - ./cache:/voicebot/cache
```

### Kubernetes

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: voicebot-provider
spec:
  replicas: 1
  selector:
    matchLabels:
      app: voicebot-provider
  template:
    metadata:
      labels:
        app: voicebot-provider
    spec:
      containers:
        - name: voicebot
          image: ai-voicebot:latest
          env:
            - name: VOICEBOT_MODE
              value: "provider"
            - name: PRODUCTION
              value: "true"
          ports:
            - containerPort: 8788
```

## API Reference

### Bot Provider Endpoints

The voicebot provider exposes the following HTTP API:

- `GET /bots` - List available bots
- `POST /bots/{bot_name}/join` - Request a bot to join a lobby
- `GET /bots/runs` - List active bot instances
- `POST /bots/runs/{run_id}/stop` - Stop a bot instance

### Example API Usage

```bash
# List available bots
curl http://localhost:8788/bots

# Request whisper bot to join lobby
curl -X POST http://localhost:8788/bots/whisper/join \
  -H "Content-Type: application/json" \
  -d '{
    "lobby_id": "lobby-123",
    "session_id": "session-456",
    "nick": "Speech Bot",
    "server_url": "https://server.com/ai-voicebot"
  }'
```

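The same call can be made from Python. Below is a minimal sketch using the `requests` package (an assumed dependency; the endpoint and payload mirror the curl example above):

```python
# Illustrative Python equivalent of the curl call above.
import requests

payload = {
    "lobby_id": "lobby-123",
    "session_id": "session-456",
    "nick": "Speech Bot",
    "server_url": "https://server.com/ai-voicebot",
}

resp = requests.post("http://localhost:8788/bots/whisper/join", json=payload, timeout=10)
resp.raise_for_status()
print(resp.json())
```
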
## Troubleshooting

### Common Issues

**Bot provider not registering:**
- Check server URL is correct and accessible
- Verify network connectivity between provider and server
- Check logs for registration errors

**Auto-reload not working:**
- Ensure `--reload` flag is used in development
- Check file permissions on watched directories
- Verify uvicorn version supports reload functionality

**WebRTC connection issues:**
- Check STUN/TURN server configuration
- Verify network ports are not blocked
- Check browser console for ICE connection errors

### Logs

Logs are written to stdout and include:
- Bot registration status
- WebRTC connection events
- Media track creation/destruction
- API request/response details

### Debug Mode

Enable verbose logging:

```bash
python main.py --mode provider --server-url https://server.com --debug
```

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for new functionality
5. Submit a pull request

## License

This project is licensed under the MIT License - see the LICENSE file for details.
@ -1,82 +0,0 @@
# Voicebot Module Refactoring

The voicebot/main.py functionality has been broken down into individual Python files for better organization and maintainability:

## New File Structure

### Core Modules

1. **`models.py`** - Data models and configuration
   - `VoicebotArgs` - Pydantic model for CLI arguments and configuration
   - `VoicebotMode` - Enum for client/provider modes
   - `Peer` - WebRTC peer representation
   - `JoinRequest` - Request model for joining lobbies
   - `MessageData` - Type alias for message payloads

2. **`webrtc_signaling.py`** - WebRTC signaling client functionality
   - `WebRTCSignalingClient` - Main WebRTC signaling client class
   - Handles peer connection management, ICE candidates, session descriptions
   - Registration status tracking and reconnection logic
   - Message processing and event handling

3. **`session_manager.py`** - Session and lobby management
   - `create_or_get_session()` - Session creation/retrieval
   - `create_or_get_lobby()` - Lobby creation/retrieval
   - HTTP API communication utilities

4. **`bot_orchestrator.py`** - FastAPI bot orchestration service
   - Bot discovery and management
   - FastAPI endpoints for bot operations
   - Provider registration with main server
   - Bot instance lifecycle management

5. **`client_main.py`** - Main client logic
   - `main_with_args()` - Core client functionality
   - `start_client_with_reload()` - Development mode with reload
   - Event handlers for peer and track management

6. **`client_app.py`** - Client FastAPI application
   - `create_client_app()` - Creates FastAPI app for client mode
   - Health check and status endpoints
   - Process isolation and locking

7. **`utils.py`** - Utility functions
   - URL conversion utilities (`http_base_url`, `ws_url`)
   - SSL context creation
   - Network information logging

8. **`main.py`** - Main orchestration and entry point (outlined in the sketch below)
   - Command-line argument parsing
   - Mode selection (client vs provider)
   - Entry points for both modes

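In outline, the entry point wires these modules together roughly as follows. This is a sketch, not the verbatim source: the exact argument names, `VoicebotArgs` fields, and the provider start-up call are assumptions.

```python
# Illustrative outline of main.py's mode dispatch (field names assumed).
import argparse

from voicebot.client_main import main_with_args
from voicebot.models import VoicebotArgs


def main() -> None:
    parser = argparse.ArgumentParser(description="AI Voicebot")
    parser.add_argument("--mode", choices=["client", "provider"], default="client")
    parser.add_argument("--server-url", dest="server_url", required=True)
    ns = parser.parse_args()

    if ns.mode == "provider":
        ...  # start the bot_orchestrator FastAPI service (module 4 above)
    else:
        main_with_args(VoicebotArgs(mode=ns.mode, server_url=ns.server_url))


if __name__ == "__main__":
    main()
```
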
### Key Improvements

- **Separation of Concerns**: Each file handles specific functionality
- **Better Maintainability**: Smaller, focused modules are easier to understand and modify
- **Reduced Coupling**: Dependencies between components are more explicit
- **Type Safety**: Proper type hints and Pydantic models throughout
- **Error Handling**: Centralized error handling and logging

### Usage

The refactored code maintains the same CLI interface:

```bash
# Client mode
python voicebot/main.py --mode client --server-url http://localhost:8000/ai-voicebot

# Provider mode
python voicebot/main.py --mode provider --host 0.0.0.0 --port 8788
```

### Import Structure

```python
from voicebot import VoicebotArgs, VoicebotMode, WebRTCSignalingClient
from voicebot.models import Peer, JoinRequest
from voicebot.session_manager import create_or_get_session, create_or_get_lobby
from voicebot.client_main import main_with_args
```

The original `main_old.py` contains the monolithic implementation for reference.
@ -13,7 +13,7 @@ import os
import gc
import shutil
from queue import Queue, Empty
from typing import Dict, Optional, Callable, Awaitable, Any, cast, List, Union
from typing import Dict, Optional, Callable, Awaitable, Any, List, Union
from pathlib import Path
import numpy.typing as npt
from pydantic import BaseModel, Field, ConfigDict
@ -23,7 +23,10 @@ import librosa
from shared.logger import logger
from aiortc import MediaStreamTrack
from aiortc.mediastreams import MediaStreamError
from av import AudioFrame
from av import AudioFrame, VideoFrame
import cv2
import fractions
from time import perf_counter

# Import shared models for chat functionality
import sys
@ -35,24 +38,75 @@ from voicebot.models import Peer

# OpenVINO optimized imports
import openvino as ov
from optimum.intel.openvino import OVModelForSpeechSeq2Seq
from transformers import AutoProcessor
from optimum.intel.openvino import OVModelForSpeechSeq2Seq  # type: ignore
from transformers import WhisperProcessor
from openvino.runtime import Core  # type: ignore
import torch

# Import quantization dependencies with error handling
try:
    import nncf
    from optimum.intel.openvino.quantization import InferRequestWrapper
    QUANTIZATION_AVAILABLE = True
except ImportError as e:
    logger.warning(f"Quantization libraries not available: {e}")
    QUANTIZATION_AVAILABLE = False
import nncf  # type: ignore
from optimum.intel.openvino.quantization import InferRequestWrapper  # type: ignore
QUANTIZATION_AVAILABLE = True

# Type definitions
AudioArray = npt.NDArray[np.float32]
ModelConfig = Dict[str, Union[str, int, bool]]
CalibrationData = List[Dict[str, Any]]

_device = "GPU.1"  # Default to Intel Arc B580 GPU

def get_available_devices() -> list[dict[str, Any]]:
    """List available OpenVINO devices with their properties."""
    try:
        core = Core()
        devices = core.available_devices
        device_info: list[dict[str, Any]] = []
        for device in devices:
            try:
                # Get device properties
                properties = core.get_property(device, "FULL_DEVICE_NAME")
                # Attempt to get additional properties if available
                try:
                    device_type = core.get_property(device, "DEVICE_TYPE")
                except Exception:
                    device_type = "N/A"
                try:
                    capabilities: Any = core.get_property(device, "SUPPORTED_PROPERTIES")
                except Exception:
                    capabilities = "N/A"
                device_info.append({
                    "name": device,
                    "full_name": properties,
                    "type": device_type,
                    "capabilities": capabilities
                })
            except Exception as e:
                logger.error(f"Failed to retrieve properties for device {device}: {e}")
                device_info.append({
                    "name": device,
                    "full_name": "Unknown",
                    "type": "N/A",
                    "capabilities": "N/A"
                })
        return device_info
    except Exception as e:
        logger.error(f"Failed to retrieve available devices: {e}")
        return []

def print_available_devices(device: str | None = None):
    """Print available OpenVINO devices in a formatted manner."""
    devices = get_available_devices()
    if not devices:
        logger.info("No OpenVINO devices detected.")
        return
    logger.info("Available OpenVINO Devices:")
    for d in devices:
        logger.info(f"- Device: {d.get('name')} {'*' if d.get('name') == device else ''}")
        logger.info(f"  Full Name: {d.get('full_name')}")
        logger.info(f"  Type: {d.get('type')}")


print_available_devices(_device)

class AudioQueueItem(BaseModel):
    """Audio data with timestamp for processing queue."""
@ -75,7 +129,7 @@ class OpenVINOConfig(BaseModel):
    """OpenVINO configuration for Intel Arc B580 optimization."""
    model_config = ConfigDict(arbitrary_types_allowed=True)

    device: str = Field(default="GPU", description="Target device for inference")
    device: str = Field(default=_device, description="Target device for inference")
    cache_dir: str = Field(default="./ov_cache", description="Cache directory for compiled models")
    enable_quantization: bool = Field(default=True, description="Enable INT8 quantization")
    throughput_streams: int = Field(default=2, description="Number of inference streams")
@ -83,14 +137,36 @@ class OpenVINOConfig(BaseModel):

    def to_ov_config(self) -> ModelConfig:
        """Convert to OpenVINO configuration dictionary."""
        return {
            "CACHE_DIR": self.cache_dir,
            "GPU_DISABLE_WINOGRAD_CONVOLUTION": "YES",
            "GPU_ENABLE_LOOP_UNROLLING": "YES",
            "GPU_THROUGHPUT_STREAMS": str(self.throughput_streams),
            "GPU_MAX_NUM_THREADS": str(self.max_threads),
            "GPU_ENABLE_OPENCL_THROTTLING": "NO"
        }
        cfg: ModelConfig = {"CACHE_DIR": self.cache_dir}

        # Only include GPU-specific tuning options when the target device is a GPU.
        # Some OpenVINO plugins (notably the CPU plugin) will raise NotFound
        # errors for GPU_* properties, so avoid passing them unless applicable.
        device = (self.device or "").upper()
        if device.startswith("GPU"):  # covers names like "GPU" as well as "GPU.1"
            cfg.update(
                {
                    # Throughput / stream tuning
                    "GPU_THROUGHPUT_STREAMS": str(self.throughput_streams),
                    # Threading controls may be driver/plugin-specific; keep minimal.
                    # NOTE: We intentionally do NOT set GPU_MAX_NUM_THREADS here
                    # because some OpenVINO plugins / builds (and the CPU plugin
                    # during a fallback) do not recognize the property and will
                    # raise NotFound/UnsupportedProperty errors. If you need to
                    # tune GPU threads for a specific driver, set that externally
                    # or via vendor-specific tools.
                }
            )
        else:
            # Safe CPU-side defaults
            cfg.update(
                {
                    "CPU_THROUGHPUT_NUM_THREADS": str(self.max_threads),
                    "CPU_BIND_THREAD": "YES",
                }
            )

        return cfg

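# Illustrative examples (values assumed from the defaults above, not taken
# from the source): OpenVINOConfig(device="GPU").to_ov_config() yields
#   {"CACHE_DIR": "./ov_cache", "GPU_THROUGHPUT_STREAMS": "2"}
# while a CPU target gets the conservative threading options instead, e.g.
# OpenVINOConfig(device="CPU", max_threads=8).to_ov_config() yields
#   {"CACHE_DIR": "./ov_cache", "CPU_THROUGHPUT_NUM_THREADS": "8", "CPU_BIND_THREAD": "YES"}
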
# Global configuration and constants
@ -139,13 +215,14 @@ def setup_intel_arc_environment() -> None:
class OpenVINOWhisperModel:
    """OpenVINO optimized Whisper model for Intel Arc B580."""

    def __init__(self, model_id: str, config: OpenVINOConfig):
    def __init__(self, model_id: str, config: OpenVINOConfig, device: str):
        self.model_id = model_id
        self.config = config
        self.device = device
        self.model_path = Path(model_id.replace('/', '_'))
        self.quantized_model_path = Path(f"{self.model_path}_quantized")

        self.processor: Optional[AutoProcessor] = None
        self.processor: Optional[WhisperProcessor] = None
        self.ov_model: Optional[OVModelForSpeechSeq2Seq] = None
        self.is_quantized = False

@ -157,23 +234,29 @@ class OpenVINOWhisperModel:

        try:
            # Initialize processor
            self.processor = AutoProcessor.from_pretrained(self.model_id)
            logger.info(f"Loading Whisper model '{self.model_id}' on device: {self.device}")
            self.processor = WhisperProcessor.from_pretrained(self.model_id, use_fast=True)  # type: ignore
            logger.info("Whisper processor loaded successfully")

            # Try to load quantized model first if it exists
            if QUANTIZATION_AVAILABLE and self.config.enable_quantization and self.quantized_model_path.exists():
                if self._try_load_quantized_model():
                    return
            # Export the model to OpenVINO IR if not already converted
            self.ov_model = OVModelForSpeechSeq2Seq.from_pretrained(self.model_id, export=True, device=self.device)  # type: ignore

            logger.info("Whisper model exported as OpenVINO IR")

            # # Try to load quantized model first if it exists
            # if self.config.enable_quantization and self.quantized_model_path.exists():
            #     if self._try_load_quantized_model():
            #         return

            # Load or create FP16 model
            if self.model_path.exists():
                self._load_fp16_model()
            else:
                self._convert_model()
            # # Load or create FP16 model
            # if self.model_path.exists():
            #     self._load_fp16_model()
            # else:
            #     self._convert_model()

            # Try quantization after model is loaded and compiled
            if QUANTIZATION_AVAILABLE and self.config.enable_quantization and not self.is_quantized:
                self._try_quantize_existing_model()
            # # Try quantization after model is loaded and compiled
            # if self.config.enable_quantization and not self.is_quantized:
            #     self._try_quantize_existing_model()

        except Exception as e:
            logger.error(f"Error initializing model: {e}")
@ -294,6 +377,9 @@ class OpenVINOWhisperModel:

    def _quantize_model_safe(self) -> None:
        """Safely quantize the model with extensive error handling."""
        if not nncf:
            logger.info("Quantization libraries not available, skipping quantization")
            return
        if self.quantized_model_path.exists():
            logger.info("Quantized model already exists")
            return
@ -301,6 +387,9 @@ class OpenVINOWhisperModel:
        if self.ov_model is None:
            raise RuntimeError("No model to quantize")

        if not self.ov_model.decoder_with_past:
            raise RuntimeError("Model decoder_with_past not available")

        logger.info("Creating INT8 quantized model for Intel Arc B580...")

        try:
@ -338,8 +427,8 @@ class OpenVINOWhisperModel:

            # Save quantized models
            self.quantized_model_path.mkdir(parents=True, exist_ok=True)
            ov.save_model(quantized_encoder, self.quantized_model_path / "openvino_encoder_model.xml")
            ov.save_model(quantized_decoder, self.quantized_model_path / "openvino_decoder_with_past_model.xml")
            ov.save_model(quantized_encoder, self.quantized_model_path / "openvino_encoder_model.xml")  # type: ignore
            ov.save_model(quantized_decoder, self.quantized_model_path / "openvino_decoder_with_past_model.xml")  # type: ignore

            # Copy remaining files
            self._copy_model_files()
@ -366,11 +455,11 @@ class OpenVINOWhisperModel:
        logger.info(f"Collecting calibration data ({dataset_size} samples)...")

        # Check model components
        if not hasattr(self.ov_model, 'encoder') or self.ov_model.encoder is None:
        if not self.ov_model.encoder:
            logger.warning("Encoder not available for calibration")
            return {}

        if not hasattr(self.ov_model, 'decoder_with_past') or self.ov_model.decoder_with_past is None:
        if not self.ov_model.decoder_with_past:
            logger.warning("Decoder with past not available for calibration")
            return {}

@ -402,14 +491,14 @@ class OpenVINOWhisperModel:
            duration = 2.0 + np.random.random() * 3.0  # 2-5 seconds
            synthetic_audio = np.random.randn(int(SAMPLE_RATE * duration)).astype(np.float32) * 0.1

            input_features = self.processor(
            inputs: Any = self.processor(
                synthetic_audio,
                sampling_rate=SAMPLE_RATE,
                return_tensors="pt"
            ).input_features
            )

            # Run inference to collect calibration data
            _ = self.ov_model.generate(input_features, max_new_tokens=10)
            generated_ids = self.ov_model.generate(inputs.input_features, max_new_tokens=10)

            if i % 5 == 0:
                logger.debug(f"Generated calibration sample {i+1}/{dataset_size}")
@ -470,11 +559,36 @@ class OpenVINOWhisperModel:
            self._warmup_model()
            logger.info("Model compiled and warmed up successfully")
        except Exception as e:
            logger.warning(f"Failed to compile for GPU, trying CPU: {e}")
            # Fallback to CPU
            logger.warning(f"Failed to compile for {self.config.device}, attempting safe CPU fallback: {e}")
            # Fallback: reload/compile model with a CPU-only ov_config to avoid
            # passing GPU-specific properties to the CPU plugin which can raise
            # NotFound/UnsupportedProperty exceptions.
            try:
                self.ov_model.to("CPU")
                self.ov_model.compile()
                cpu_cfg = OpenVINOConfig(**{**self.config.model_dump()}) if hasattr(self.config, 'model_dump') else self.config
                # Ensure device is CPU and use conservative CPU threading options
                cpu_cfg = OpenVINOConfig(device='CPU', cache_dir=self.config.cache_dir, enable_quantization=self.config.enable_quantization, throughput_streams=1, max_threads=self.config.max_threads)

                logger.info("Reloading model with CPU-only OpenVINO config for safe compilation")
                # Try to reload using the existing saved model path if possible
                try:
                    self.ov_model = OVModelForSpeechSeq2Seq.from_pretrained(
                        self.model_path,
                        ov_config=cpu_cfg.to_ov_config(),
                        compile=False
                    )
                except Exception:
                    # If loading the saved model failed, try loading without ov_config
                    self.ov_model = OVModelForSpeechSeq2Seq.from_pretrained(self.model_path, compile=False)

                # Compile on CPU
                self.ov_model.to('CPU')
                # Provide CPU-only ov_config if supported
                try:
                    self.ov_model.compile()
                except Exception as compile_cpu_e:
                    logger.warning(f"CPU compile with CPU ov_config failed, retrying default compile: {compile_cpu_e}")
                    self.ov_model.compile()

                self._warmup_model()
                logger.info("Model compiled for CPU successfully")
            except Exception as cpu_e:
@ -503,18 +617,32 @@ class OpenVINOWhisperModel:
        except Exception as e:
            logger.warning(f"Model warmup failed: {e}")

    def generate(self, input_features: torch.Tensor) -> torch.Tensor:
    def generate(self, input_features: torch.Tensor, language: str = "en") -> torch.Tensor:
        """Generate transcription from input features."""
        if self.ov_model is None:
            raise RuntimeError("Model not initialized")

        return self.ov_model.generate(
            input_features,
            max_new_tokens=128,
            num_beams=1,  # Greedy decoding for speed
            do_sample=False
        )

        generation_config: dict[str, Any] = {
            "max_length": 448,
            "num_beams": 4,  # Use beam search for better results
            # "num_beams": 1,  # Greedy decoding for speed
            "no_repeat_ngram_size": 3,  # Prevent repetitive phrases
            "language": language,  # Explicitly set the transcription language (defaults to English)
            "task": "transcribe",  # Ensure transcription, not translation
            "suppress_tokens": None,  # Disable default suppress_tokens to avoid conflicts
            "begin_suppress_tokens": None,  # Disable default begin_suppress_tokens
            "max_new_tokens": 128,
            "do_sample": False
        }
        try:
            return self.ov_model.generate(  # type: ignore
                input_features,
                **generation_config
            )
        except Exception as e:
            logger.error(f"Model generation failed: {e}")
            raise RuntimeError(f"Failed to generate transcription: {e}")

    def decode(self, token_ids: torch.Tensor, skip_special_tokens: bool = True) -> List[str]:
        """Decode token IDs to text."""
        if self.processor is None:
@ -528,30 +656,29 @@ _whisper_model: Optional[OpenVINOWhisperModel] = None
_audio_processors: Dict[str, "OptimizedAudioProcessor"] = {}
_send_chat_func: Optional[Callable[[str], Awaitable[None]]] = None


def _ensure_model_loaded() -> OpenVINOWhisperModel:
def _ensure_model_loaded(device: str = _device) -> OpenVINOWhisperModel:
    """Ensure the global model is loaded."""
    global _whisper_model
    if _whisper_model is None:
        setup_intel_arc_environment()
        logger.info(f"Loading OpenVINO Whisper model: {_model_id}")
        _whisper_model = OpenVINOWhisperModel(_model_id, _ov_config)
        _whisper_model = OpenVINOWhisperModel(model_id=_model_id, config=_ov_config, device=device)
        logger.info("OpenVINO Whisper model loaded successfully")
    return _whisper_model


def extract_input_features(audio_array: AudioArray, sampling_rate: int) -> torch.Tensor:
    """Extract input features from audio array optimized for OpenVINO."""
    model = _ensure_model_loaded()
    if model.processor is None:
    ov_model = _ensure_model_loaded()
    if ov_model.processor is None:
        raise RuntimeError("Processor not initialized")

    processor_output = model.processor(
    inputs = ov_model.processor(
        audio_array,
        sampling_rate=sampling_rate,
        return_tensors="pt",
    )
    return processor_output.input_features
    return inputs.input_features

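# End-to-end usage sketch (mirrors _transcribe_and_send further below, shown
# here for orientation only):
#   features = extract_input_features(audio_array, SAMPLE_RATE)
#   generated_ids = _ensure_model_loaded().generate(features)
#   text = _ensure_model_loaded().processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
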
class OptimizedAudioProcessor:
@ -686,8 +813,8 @@ class OptimizedAudioProcessor:
            threading_queue = getattr(self, '_threading_queue', None)
            if threading_queue:
                threading_queue.put_nowait(queue_item)
        except:
            logger.warning(f"Threading queue issue for {self.peer_name}")
        except Exception as e:
            logger.warning(f"Threading queue issue for {self.peer_name}: {e}")

    def _queue_final_transcription(self) -> None:
        """Queue final transcription of current phrase."""
@ -759,8 +886,18 @@ class OptimizedAudioProcessor:
        except Exception as e:
            logger.error(f"Error in thread processing loop for {self.peer_name}: {e}")

    async def _transcribe_and_send(self, audio_array: AudioArray, is_final: bool) -> None:
        """Transcribe audio using OpenVINO optimized model."""
    async def _transcribe_and_send(self, audio_array: AudioArray, is_final: bool, language: str = "en") -> None:
        """
        Transcribe raw numpy audio data using OpenVINO Whisper.

        Parameters:
        - audio_array: normalized 1D numpy array containing mono PCM data at 16 kHz.
        - is_final: whether this is a final transcription (True) or interim (False)
        - language: language code for transcription (default 'en' for English)
        """
        if audio_array.ndim != 1:
            raise ValueError("Expected mono audio as a 1D numpy array.")

        transcription_start = time.time()
        transcription_type = "final" if is_final else "streaming"

@ -782,15 +919,15 @@ class OptimizedAudioProcessor:

            # Extract features for OpenVINO
            input_features = extract_input_features(audio_array, self.sample_rate)

            # logger.info(f"Features extracted for OpenVINO: {input_features.shape}")
            # GPU inference with OpenVINO
            model = _ensure_model_loaded()
            predicted_ids = model.generate(input_features)

            # Decode results
            transcription = model.decode(predicted_ids, skip_special_tokens=True)
            text = transcription[0].strip() if transcription else ""
            ov_model = _ensure_model_loaded()
            generated_ids = ov_model.generate(input_features)

            # Decode tokens into text
            transcription = ov_model.processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
            text = transcription.strip() if transcription else ""
            logger.info(f"Transcription text: {text}")
            transcription_time = time.time() - transcription_start

            if text and len(text.split()) >= 2:
@ -847,6 +984,125 @@ class OptimizedAudioProcessor:

        logger.info(f"OptimizedAudioProcessor shutdown complete for {self.peer_name}")


def normalize_audio(audio_data: npt.NDArray[np.float32]) -> npt.NDArray[np.float32]:
    """Normalize audio to have maximum amplitude of 1.0."""
    max_amplitude = np.max(np.abs(audio_data))
    if max_amplitude > 0:
        audio_data = audio_data / max_amplitude
    return audio_data


class MediaClock:
    """Simple monotonic clock for media tracks."""

    def __init__(self) -> None:
        self.t0 = perf_counter()

    def now(self) -> float:
        return perf_counter() - self.t0


class WaveformVideoTrack(MediaStreamTrack):
    """Video track that renders a live waveform of the incoming audio.

    The track reads the most-active `OptimizedAudioProcessor` in
    `_audio_processors` and renders the last ~2s of its `current_phrase_audio`.
    If no audio is available, the track will display a "No audio" message.
    """

    kind = "video"

    def __init__(self, session_name: str, width: int = 640, height: int = 240, fps: int = 15) -> None:
        super().__init__()
        self.session_name = session_name
        self.width = int(width)
        self.height = int(height)
        self.fps = int(fps)
        self.clock = MediaClock()
        self._next_frame_index = 0

    async def next_timestamp(self) -> tuple[int, float]:
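        # 90 kHz is the standard RTP clock rate for video; each frame advances
        # pts by 90000 / fps so timestamps track wall-clock playback time.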
        pts = int(self._next_frame_index * (1 / self.fps) * 90000)
        time_base = 1 / 90000
        return pts, time_base

    async def recv(self) -> VideoFrame:
        pts, time_base = await self.next_timestamp()

        # schedule frame according to clock
        target_t = self._next_frame_index / self.fps
        now = self.clock.now()
        if target_t > now:
            await asyncio.sleep(target_t - now)

        self._next_frame_index += 1

        frame_array: npt.NDArray[np.uint8] = np.zeros((self.height, self.width, 3), dtype=np.uint8)

        # Select the most active processor (highest RMS) and draw its waveform
        best_proc = None
        best_rms = 0.0
        try:
            for pname, proc in _audio_processors.items():
                try:
                    arr = getattr(proc, 'current_phrase_audio', None)
                    if arr is None or len(arr) == 0:
                        continue
                    rms = float(np.sqrt(np.mean(arr**2)))
                    if rms > best_rms:
                        best_rms = rms
                        best_proc = (pname, arr.copy())
                except Exception:
                    continue
        except Exception:
            best_proc = None

        if best_proc is not None:
            pname, arr = best_proc

            # Use up to 2 seconds of audio for the waveform
            window_samples = min(len(arr), SAMPLE_RATE * 2)
            if window_samples <= 0:
                arr_segment = np.zeros(1, dtype=np.float32)
            else:
                arr_segment = arr[-window_samples:]

            # Normalize segment to -1..1 safely
            maxv = float(np.max(np.abs(arr_segment))) if arr_segment.size > 0 else 0.0
            if maxv > 0:
                norm = arr_segment / maxv
            else:
                norm = np.zeros_like(arr_segment)

            # Map audio samples to pixels across the width
            if norm.size < self.width:
                padded = np.zeros(self.width, dtype=np.float32)
                if norm.size > 0:
                    padded[-norm.size:] = norm
                norm = padded
            else:
                block = int(np.ceil(norm.size / self.width))
                norm = np.array([np.mean(norm[i * block : min((i + 1) * block, norm.size)]) for i in range(self.width)], dtype=np.float32)

            # Create polyline points, avoid NaN
            points: list[tuple[int, int]] = []
            for x in range(self.width):
                v = float(norm[x]) if x < norm.size and not np.isnan(norm[x]) else 0.0
                y = int((1.0 - ((v + 1.0) / 2.0)) * (self.height - 1))
                points.append((x, max(0, min(self.height - 1, y))))

            if len(points) > 1:
                pts_np = np.array(points, dtype=np.int32)
                cv2.polylines(frame_array, [pts_np], isClosed=False, color=(0, 200, 80), thickness=2)

            cv2.putText(frame_array, f"Waveform: {pname}", (10, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)
        else:
            cv2.putText(frame_array, "No audio", (10, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (200, 200, 200), 1)

        frame = VideoFrame.from_ndarray(frame_array, format="bgr24")
        frame.pts = pts
        frame.time_base = fractions.Fraction(1 / 90000).limit_denominator(1000000)
        return frame


async def handle_track_received(peer: Peer, track: MediaStreamTrack) -> None:
    """Handle incoming audio tracks from WebRTC peers."""
@ -901,8 +1157,9 @@ async def handle_track_received(peer: Peer, track: MediaStreamTrack) -> None:
            audio_data = _resample_audio(audio_data, frame.sample_rate, SAMPLE_RATE)

            # Convert to float32
            audio_data_float32 = cast(AudioArray, audio_data.astype(np.float32))

            audio_data_float32 = audio_data.astype(np.float32)
            audio_data_float32 = normalize_audio(audio_data_float32)  # normalize before handing off

            # Process with optimized processor
            audio_processor.add_audio_data(audio_data_float32)

@ -937,27 +1194,57 @@ def _process_audio_frame(audio_data: npt.NDArray[Any], frame: AudioFrame) -> npt
def _resample_audio(audio_data: npt.NDArray[np.float32], orig_sr: int, target_sr: int) -> npt.NDArray[np.float32]:
    """Resample audio efficiently."""
    try:
        # Handle stereo audio by converting to mono if necessary
        if audio_data.ndim > 1:
            audio_data = np.mean(audio_data, axis=1)

        # Use high-quality resampling
        resampled = librosa.resample(
            audio_data.astype(np.float64),
            orig_sr=orig_sr,
            target_sr=target_sr,
            res_type='kaiser_fast'  # Good balance of quality and speed
        )
        return resampled.astype(np.float32)
    except Exception as e:
        logger.error(f"Resampling failed: {e}")
        return audio_data
        raise ValueError(f"Failed to resample audio from {orig_sr} Hz to {target_sr} Hz: {e}")


# Public API functions
def agent_info() -> Dict[str, str]:
    return {"name": AGENT_NAME, "description": AGENT_DESCRIPTION, "has_media": "false"}
    return {"name": AGENT_NAME, "description": AGENT_DESCRIPTION, "has_media": "true"}


def create_agent_tracks(session_name: str) -> Dict[str, MediaStreamTrack]:
    """Whisper is not a media source - return no local tracks."""
    return {}
    """Create agent tracks. Provides a synthetic video waveform track and a silent audio track for compatibility."""
    class SilentAudioTrack(MediaStreamTrack):
        kind = "audio"

        def __init__(self, sample_rate: int = SAMPLE_RATE, channels: int = 1, fps: int = 50):
            super().__init__()
            self.sample_rate = sample_rate
            self.channels = channels
            self.fps = fps
            self.samples_per_frame = int(self.sample_rate / self.fps)
            self._timestamp = 0

        async def recv(self) -> AudioFrame:
            # Generate silent audio as int16 (required by aiortc)
            data = np.zeros((self.channels, self.samples_per_frame), dtype=np.int16)
            frame = AudioFrame.from_ndarray(data, layout="mono" if self.channels == 1 else "stereo")
            frame.sample_rate = self.sample_rate
            frame.pts = self._timestamp
            frame.time_base = fractions.Fraction(1, self.sample_rate)
            self._timestamp += self.samples_per_frame
            await asyncio.sleep(1 / self.fps)
            return frame

    try:
        video_track = WaveformVideoTrack(session_name=session_name, width=640, height=240, fps=15)
        audio_track = SilentAudioTrack()
        return {"video": video_track, "audio": audio_track}
    except Exception as e:
        logger.error(f"Failed to create agent tracks: {e}")
        return {}


async def handle_chat_message(
@ -1010,12 +1297,12 @@ def get_active_processors() -> Dict[str, OptimizedAudioProcessor]:

def get_model_info() -> Dict[str, Any]:
    """Get information about the loaded model."""
    model = _ensure_model_loaded()
    ov_model = _ensure_model_loaded()
    return {
        "model_id": _model_id,
        "device": _ov_config.device,
        "quantization_enabled": _ov_config.enable_quantization,
        "is_quantized": model.is_quantized,
        "is_quantized": ov_model.is_quantized,
        "sample_rate": SAMPLE_RATE,
        "chunk_duration_ms": CHUNK_DURATION_MS
    }
@ -29,10 +29,10 @@ dill==0.3.8
dnspython==2.7.0
fastapi==0.116.1
ffmpy==0.6.1
filelock==3.13.1
filelock==3.19.1
fonttools==4.59.2
frozenlist==1.7.0
fsspec==2024.6.1
fsspec==2025.3.0
google-crc32c==1.7.1
gradio==5.44.1
gradio-client==1.12.1
@ -45,17 +45,18 @@ httpx==0.28.1
huggingface-hub==0.34.4
idna==3.10
ifaddr==0.2.0
jinja2==3.1.4
iniconfig==2.1.0
jinja2==3.1.6
jiwer==4.0.0
joblib==1.5.2
jsonschema==4.25.1
jsonschema-specifications==2025.4.1
jsonschema-specifications==2025.9.1
kiwisolver==1.4.9
lazy-loader==0.4
librosa==0.11.0
llvmlite==0.44.0
markdown-it-py==4.0.0
markupsafe==2.1.5
markupsafe==3.0.2
matplotlib==3.10.6
mdurl==0.1.2
ml-dtypes==0.5.3
@ -65,23 +66,40 @@ msgpack==1.1.1
multidict==6.6.4
multiprocess==0.70.16
natsort==8.4.0
networkx==3.3
ninja==1.11.1.4
nncf==2.17.0
networkx==3.5
ninja==1.13.0
nncf==2.18.0
numba==0.61.2
numpy==2.2.6
nvidia-cublas-cu12==12.8.4.1
nvidia-cuda-cupti-cu12==12.8.90
nvidia-cuda-nvrtc-cu12==12.8.93
nvidia-cuda-runtime-cu12==12.8.90
nvidia-cudnn-cu12==9.10.2.21
nvidia-cufft-cu12==11.3.3.83
nvidia-cufile-cu12==1.13.1.3
nvidia-curand-cu12==10.3.9.90
nvidia-cusolver-cu12==11.7.3.90
nvidia-cusparse-cu12==12.5.8.93
nvidia-cusparselt-cu12==0.7.1
nvidia-nccl-cu12==2.27.3
nvidia-nvjitlink-cu12==12.8.93
nvidia-nvtx-cu12==12.8.90
onnx==1.19.0
openai-whisper==20250625
openai-whisper @ git+https://github.com/openai/whisper.git@c0d2f624c09dc18e709e37c2ad90c039a4eb72a2
opencv-python==4.11.0.86
openvino==2025.3.0
openvino-genai==2025.3.0.0
openvino-telemetry==2025.2.0
openvino-tokenizers==2025.3.0.0
optimum==1.27.0
optimum-intel @ git+https://github.com/huggingface/optimum-intel.git@c35534d077dddf9382c6d8699f13412d28b19853
optimum-intel @ git+https://github.com/huggingface/optimum-intel.git@b9c151fec6b414d9ca78be8643d08e267b133bfc
orjson==3.11.3
packaging==25.0
pandas==2.2.3
pandas==2.3.2
pillow==11.3.0
platformdirs==4.4.0
pluggy==1.6.0
pooch==1.8.2
propcache==0.3.2
protobuf==6.32.0
@ -96,16 +114,22 @@ pyee==13.0.0
pygments==2.19.2
pylibsrtp==0.12.0
pymoo==0.6.1.5
pyopencl==2025.2.6
pyopenssl==25.1.0
pyparsing==3.2.3
pytest==8.4.2
pytest-asyncio==1.1.0
python-dateutil==2.9.0.post0
python-ffmpeg==1.0.16
python-multipart==0.0.20
pytools==2025.2.4
pytz==2025.2
pyyaml==6.0.2
rapidfuzz==3.14.0
referencing==0.36.2
regex==2025.9.1
requests==2.32.5
resampy==0.4.3
rich==14.1.0
rpds-py==0.27.1
ruff==0.12.11
@ -114,22 +138,24 @@ safetensors==0.6.2
scikit-learn==1.7.1
scipy==1.16.1
semantic-version==2.10.0
setuptools==70.2.0
setuptools==80.9.0
shellingham==1.5.4
siphash24==1.8
six==1.17.0
sniffio==1.3.1
soundfile==0.13.1
soxr==0.5.0.post1
speechrecognition==3.14.3
starlette==0.47.3
sympy==1.13.3
sympy==1.14.0
tabulate==0.9.0
threadpoolctl==3.6.0
tiktoken==0.11.0
tokenizers==0.21.4
tomlkit==0.13.3
torch==2.8.0+cpu
tqdm==4.66.5
torch==2.8.0
torchvision==0.23.0
tqdm==4.67.1
transformers==4.53.3
triton==3.4.0
typer==0.17.3
@ -8,12 +8,8 @@ import logging
import sys
import os

# Add the project root (parent of the voicebot directory) to sys.path so
# imports like `from shared import ...` work when running this script from
# inside the `voicebot` container.
project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
if project_root not in sys.path:
    sys.path.append(project_root)
# Add the voicebot directory to the path
sys.path.append(os.path.dirname(os.path.abspath(__file__)))

from shared.logger import logger