ai-voicebot/ARCHITECTURE_RECOMMENDATIONS.md

Architecture Recommendations: Sessions, Lobbies, and WebSockets

Executive Summary

The current architecture has grown organically into a monolithic structure that mixes concerns and creates maintenance challenges. This document outlines specific recommendations to improve maintainability, reduce complexity, and enhance the development experience.

Current Issues

1. Server (server/main.py)

  • Monolithic structure: 2300+ lines in a single file
  • Mixed concerns: Session, lobby, WebSocket, bot, and admin logic intertwined
  • Complex state management: Multiple global dictionaries requiring manual synchronization
  • WebSocket message handling: Deeply nested, switch-style branching on message type is hard to follow
  • Threading complexity: Multiple locks and shared state increase deadlock risk

2. Client (client/src/)

  • Fragmented connection logic: WebSocket handling scattered across components
  • Error handling complexity: Different scenarios handled inconsistently
  • State synchronization: Multiple sources of truth for session/lobby state

3. Voicebot (voicebot/)

  • Duplicate patterns: WebSocket logic similar to the client's, but implemented separately and differently
  • Bot lifecycle complexity: Complex orchestration with unclear state flow

Proposed Architecture

Server Refactoring

1. Extract Core Modules

server/
├── main.py                 # FastAPI app setup and routing only
├── core/
│   ├── __init__.py
│   ├── session_manager.py  # Session lifecycle and persistence
│   ├── lobby_manager.py    # Lobby management and chat
│   ├── bot_manager.py      # Bot provider and orchestration
│   └── auth_manager.py     # Name/password authentication
├── websocket/
│   ├── __init__.py
│   ├── connection.py       # WebSocket connection handling
│   ├── message_handlers.py # Message type routing and handling
│   └── signaling.py        # WebRTC signaling logic
├── api/
│   ├── __init__.py
│   ├── admin.py           # Admin endpoints
│   ├── sessions.py        # Session HTTP API
│   ├── lobbies.py         # Lobby HTTP API
│   └── bots.py            # Bot HTTP API
└── models/
    ├── __init__.py
    ├── session.py         # Session and Lobby classes
    └── events.py          # Event system for decoupled communication
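
As a rough illustration of what core/session_manager.py might own, the sketch below keeps the session dictionary and its lock inside one class, so callers never touch shared state directly. The Session fields, the method names, and the use of a single threading.Lock are assumptions made for illustration; they are not taken from the existing server code.

```python
import threading
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Session:
    # Hypothetical fields; the real class would live in models/session.py.
    id: str
    name: Optional[str] = None
    lobby_ids: set[str] = field(default_factory=set)


class SessionManager:
    """Owns all session state behind a single lock (illustrative sketch)."""

    def __init__(self) -> None:
        self._sessions: dict[str, Session] = {}
        self._lock = threading.Lock()

    def create(self) -> Session:
        session = Session(id=uuid.uuid4().hex)
        with self._lock:
            self._sessions[session.id] = session
        return session

    def get(self, session_id: str) -> Optional[Session]:
        with self._lock:
            return self._sessions.get(session_id)

    def set_name(self, session_id: str, name: str) -> bool:
        """Assign a name unless the session is unknown or the name is taken."""
        with self._lock:
            session = self._sessions.get(session_id)
            if session is None:
                return False
            taken = any(
                other.name == name and other.id != session_id
                for other in self._sessions.values()
            )
            if taken:
                return False
            session.name = name
            return True

    def remove(self, session_id: str) -> None:
        with self._lock:
            self._sessions.pop(session_id, None)
```

Because the uniqueness check and the assignment happen under the same lock, two sessions cannot claim the same name concurrently, which is the kind of manual synchronization the global dictionaries currently require at every call site.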

2. Event-Driven Architecture

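Replace direct method calls between components with an event system, so that the code publishing an event does not need to know which components react to it: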

import logging
from abc import ABC
from typing import Protocol

logger = logging.getLogger(__name__)

class Event(ABC):
    """Base class for all events."""

class SessionJoinedLobby(Event):
    """Published when a session joins a lobby."""
    def __init__(self, session_id: str, lobby_id: str):
        self.session_id = session_id
        self.lobby_id = lobby_id

class EventHandler(Protocol):
    async def handle(self, event: Event) -> None: ...

class EventBus:
    def __init__(self) -> None:
        # Maps each event type to the handlers subscribed to it.
        self._handlers: dict[type[Event], list[EventHandler]] = {}

    def subscribe(self, event_type: type[Event], handler: EventHandler) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    async def publish(self, event: Event) -> None:
        # Handlers run sequentially in subscription order. A failing handler
        # is logged and does not stop the remaining handlers from running.
        for handler in self._handlers.get(type(event), []):
            try:
                await handler.handle(event)
            except Exception:
                logger.exception("Handler failed for %s", type(event).__name__)
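
To show how the bus decouples a publisher from its consumers, here is a small usage sketch. It repeats a compact copy of the bus so that it runs on its own; LobbyRoster and JoinAnnouncer are hypothetical handlers invented for this example, not existing classes.

```python
import asyncio


class Event:
    """Minimal stand-in for the Event base class."""


class SessionJoinedLobby(Event):
    def __init__(self, session_id: str, lobby_id: str):
        self.session_id = session_id
        self.lobby_id = lobby_id


class EventBus:
    """Compact copy of the bus so this example is self-contained."""

    def __init__(self) -> None:
        self._handlers: dict[type, list] = {}

    def subscribe(self, event_type: type, handler) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    async def publish(self, event: Event) -> None:
        for handler in self._handlers.get(type(event), []):
            await handler.handle(event)


class LobbyRoster:
    """Hypothetical handler: tracks which sessions are in each lobby."""

    def __init__(self) -> None:
        self.members: dict[str, set[str]] = {}

    async def handle(self, event: SessionJoinedLobby) -> None:
        self.members.setdefault(event.lobby_id, set()).add(event.session_id)


class JoinAnnouncer:
    """Hypothetical handler: records a chat line instead of sending it."""

    def __init__(self) -> None:
        self.lines: list[str] = []

    async def handle(self, event: SessionJoinedLobby) -> None:
        self.lines.append(f"{event.session_id} joined {event.lobby_id}")


async def main() -> tuple[LobbyRoster, JoinAnnouncer]:
    bus = EventBus()
    roster, announcer = LobbyRoster(), JoinAnnouncer()
    bus.subscribe(SessionJoinedLobby, roster)
    bus.subscribe(SessionJoinedLobby, announcer)
    # The publisher only knows about the event, not about either handler.
    await bus.publish(SessionJoinedLobby("s1", "lobby-a"))
    return roster, announcer


roster, announcer = asyncio.run(main())
print(roster.members)    # {'lobby-a': {'s1'}}
print(announcer.lines)   # ['s1 joined lobby-a']
```

Adding a third reaction to a lobby join would mean writing one more handler and one more subscribe call, with no change to the code that publishes the event.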

3. WebSocket Message Router

Replace the large conditional block that dispatches on message type with a router that maps each message type to its own handler class:

from abc import ABC, abstractmethod
from typing import Any, Dict

from fastapi import WebSocket

# Assumes the Session class from the proposed models/session.py module.
from models.session import Session

class MessageHandler(ABC):
    """Handles a single WebSocket message type."""
    @abstractmethod
    async def handle(self, session: Session, data: Dict[str, Any], websocket: WebSocket) -> None:
        pass

class SetNameHandler(MessageHandler):
    async def handle(self, session: Session, data: Dict[str, Any], websocket: WebSocket) -> None:
        # Handle set_name logic here
        pass

class WebSocketRouter:
    def __init__(self) -> None:
        # Maps a message type string to the handler responsible for it.
        self._handlers: Dict[str, MessageHandler] = {}

    def register(self, message_type: str, handler: MessageHandler) -> None:
        self._handlers[message_type] = handler

    async def route(self, message_type: str, session: Session, data: Dict[str, Any], websocket: WebSocket) -> None:
        handler = self._handlers.get(message_type)
        if handler is None:
            # Unknown types are reported back to the sender instead of being dropped.
            await websocket.send_json({"type": "error", "data": {"error": f"Unknown message type: {message_type}"}})
            return
        await handler.handle(session, data, websocket)
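
One practical payoff of the router is that each handler can be exercised without a running server. The sketch below is self-contained and uses stand-ins throughout: the Session dataclass, the FakeWebSocket test double, the validation inside SetNameHandler, and the "name_set" reply type are all assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Session:
    """Stand-in for the server's Session model."""
    id: str
    name: str = ""


class FakeWebSocket:
    """Test double that records outgoing JSON instead of sending it."""

    def __init__(self) -> None:
        self.sent: List[Dict[str, Any]] = []

    async def send_json(self, payload: Dict[str, Any]) -> None:
        self.sent.append(payload)


class SetNameHandler:
    """Hypothetical set_name logic: validate, update the session, confirm."""

    async def handle(self, session: Session, data: Dict[str, Any], websocket: FakeWebSocket) -> None:
        name = str(data.get("name", "")).strip()
        if not name:
            await websocket.send_json({"type": "error", "data": {"error": "Name is required"}})
            return
        session.name = name
        await websocket.send_json({"type": "name_set", "data": {"name": name}})


class WebSocketRouter:
    """Compact copy of the router so this example runs on its own."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Any] = {}

    def register(self, message_type: str, handler: Any) -> None:
        self._handlers[message_type] = handler

    async def route(self, message_type: str, session: Session, data: Dict[str, Any], websocket: FakeWebSocket) -> None:
        handler = self._handlers.get(message_type)
        if handler is None:
            await websocket.send_json(
                {"type": "error", "data": {"error": f"Unknown message type: {message_type}"}}
            )
            return
        await handler.handle(session, data, websocket)


async def main() -> tuple[Session, FakeWebSocket]:
    router = WebSocketRouter()
    router.register("set_name", SetNameHandler())
    session, ws = Session(id="s1"), FakeWebSocket()
    await router.route("set_name", session, {"name": " Alice "}, ws)
    await router.route("dance", session, {}, ws)  # unregistered type
    return session, ws
```

Running main() leaves the session named "Alice" and two recorded replies on the fake socket: the confirmation, then the unknown-type error.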

Client Refactoring

1. Centralized Connection Management

Create a single WebSocket connection manager:

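// src/connection/WebSocketManager.ts
type MessageHandler = (data: any) => void;

export class WebSocketManager {
  private ws: WebSocket | null = null;
  private reconnectAttempts = 0;
  private messageHandlers = new Map<string, MessageHandler>();

  constructor(private url: string) {}

  // Resolves once the socket is open; rejects if the connection attempt fails.
  connect(): Promise<void> {
    return new Promise((resolve, reject) => {
      const ws = new WebSocket(this.url);
      ws.onopen = () => {
        this.reconnectAttempts = 0;
        resolve();
      };
      ws.onerror = () => reject(new Error(`WebSocket connection to ${this.url} failed`));
      ws.onmessage = (event) => this.handleMessage(event);
      // Automatic reconnection would hook into ws.onclose, using
      // reconnectAttempts to space out retries.
      this.ws = ws;
    });
  }

  // Registers the handler for a message type; a later call for the same type replaces it.
  subscribe(messageType: string, handler: MessageHandler): void {
    this.messageHandlers.set(messageType, handler);
  }

  // Sends only while the socket is open; otherwise the message is dropped.
  send(type: string, data: any): void {
    if (this.ws?.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify({ type, data }));
    }
  }

  private handleMessage(event: MessageEvent): void {
    let message: { type: string; data: any };
    try {
      message = JSON.parse(event.data);
    } catch {
      return; // Ignore frames that are not valid JSON.
    }
    this.messageHandlers.get(message.type)?.(message.data);
  }
}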

2. Unified State Management

Use a single state store (Context + Reducer or Zustand) as the one source of truth for session, lobby, and connection state:

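// src/store/AppStore.ts
// Session, Lobby, and Participant are assumed to come from the client's existing type definitions.
interface AppState {
  session: Session | null;
  lobby: Lobby | null;
  participants: Participant[];
  connectionStatus: 'disconnected' | 'connecting' | 'connected';
  error: string | null;
}

type AppAction = 
  | { type: 'SET_SESSION'; payload: Session }
  | { type: 'SET_LOBBY'; payload: Lobby }
  | { type: 'UPDATE_PARTICIPANTS'; payload: Participant[] }
  | { type: 'SET_CONNECTION_STATUS'; payload: AppState['connectionStatus'] }
  | { type: 'SET_ERROR'; payload: string | null };

// Each case below is a plain field update; real transitions may need more logic.
const appReducer = (state: AppState, action: AppAction): AppState => {
  switch (action.type) {
    case 'SET_SESSION':
      return { ...state, session: action.payload };
    case 'SET_LOBBY':
      return { ...state, lobby: action.payload };
    case 'UPDATE_PARTICIPANTS':
      return { ...state, participants: action.payload };
    case 'SET_CONNECTION_STATUS':
      return { ...state, connectionStatus: action.payload };
    case 'SET_ERROR':
      return { ...state, error: action.payload };
    default:
      return state;
  }
};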

Voicebot Refactoring

1. Unified Connection Interface

Create a common WebSocket interface for the voicebot that follows the same register-and-dispatch pattern as the client's connection manager:

# shared/websocket_client.py
from abc import ABC, abstractmethod
from typing import Any, Awaitable, Callable, Dict

# Handlers must be coroutine functions, because handle_message awaits them.
MessageCallback = Callable[[Dict[str, Any]], Awaitable[None]]

class WebSocketClient(ABC):
    def __init__(self, url: str, session_id: str, lobby_id: str):
        self.url = url
        self.session_id = session_id
        self.lobby_id = lobby_id
        self.message_handlers: Dict[str, MessageCallback] = {}

    @abstractmethod
    async def connect(self) -> None:
        pass

    @abstractmethod
    async def send_message(self, message_type: str, data: Dict[str, Any]) -> None:
        pass

    def register_handler(self, message_type: str, handler: MessageCallback) -> None:
        self.message_handlers[message_type] = handler

    async def handle_message(self, message_type: str, data: Dict[str, Any]) -> None:
        # Messages without a registered handler are ignored.
        handler = self.message_handlers.get(message_type)
        if handler:
            await handler(data)
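
A concrete subclass only has to supply the transport. As a sketch of that split, the example below implements the two abstract methods with an in-memory test double, which may also be useful for the integration tests planned in Phase 4. InMemoryClient, the "chat_message" and "chat_ack" message types, and the example URL are hypothetical; a compact copy of the base class is repeated so the block runs on its own.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Awaitable, Callable, Dict, List, Tuple

Handler = Callable[[Dict[str, Any]], Awaitable[None]]


class WebSocketClient(ABC):
    """Compact copy of the shared interface."""

    def __init__(self, url: str, session_id: str, lobby_id: str):
        self.url = url
        self.session_id = session_id
        self.lobby_id = lobby_id
        self.message_handlers: Dict[str, Handler] = {}

    @abstractmethod
    async def connect(self) -> None: ...

    @abstractmethod
    async def send_message(self, message_type: str, data: Dict[str, Any]) -> None: ...

    def register_handler(self, message_type: str, handler: Handler) -> None:
        self.message_handlers[message_type] = handler

    async def handle_message(self, message_type: str, data: Dict[str, Any]) -> None:
        handler = self.message_handlers.get(message_type)
        if handler:
            await handler(data)


class InMemoryClient(WebSocketClient):
    """Hypothetical test double: no network, only recorded traffic."""

    def __init__(self, url: str, session_id: str, lobby_id: str):
        super().__init__(url, session_id, lobby_id)
        self.connected = False
        self.outbox: List[Tuple[str, Dict[str, Any]]] = []

    async def connect(self) -> None:
        self.connected = True

    async def send_message(self, message_type: str, data: Dict[str, Any]) -> None:
        if not self.connected:
            raise ConnectionError("connect() must be called before send_message()")
        self.outbox.append((message_type, data))


async def main() -> InMemoryClient:
    client = InMemoryClient("ws://example.invalid/ws", "s1", "lobby-a")
    received: List[str] = []

    async def on_chat(data: Dict[str, Any]) -> None:
        received.append(data["text"])
        await client.send_message("chat_ack", {"count": len(received)})

    client.register_handler("chat_message", on_chat)
    await client.connect()
    await client.handle_message("chat_message", {"text": "hello"})
    await client.handle_message("unknown", {})  # no handler registered: ignored
    return client
```

A production subclass would replace the outbox with a real socket, while bot logic written against register_handler and send_message would stay unchanged.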

Implementation Plan

Phase 1: Server Foundation (Weeks 1-2)

  1. Extract SessionManager and LobbyManager classes
  2. Implement basic event system
  3. Create WebSocket message router
  4. Move admin endpoints to separate module

Phase 2: Server Completion (Weeks 3-4)

  1. Extract bot management functionality
  2. Implement remaining message handlers
  3. Add comprehensive testing
  4. Performance optimization

Phase 3: Client Refactoring (Weeks 5-6)

  1. Implement centralized WebSocket manager
  2. Create unified state management
  3. Refactor components to use new architecture
  4. Add error boundary and better error handling

Phase 4: Voicebot Integration (Weeks 7-8)

  1. Create shared WebSocket interface
  2. Refactor voicebot to use common patterns
  3. Improve bot lifecycle management
  4. Integration testing

Benefits of Proposed Architecture

Maintainability

  • Single Responsibility: Each module has a clear, focused purpose
  • Testability: Smaller, focused classes are easier to unit test
  • Debugging: Clear separation makes it easier to trace issues

Scalability

  • Event-driven: Loose coupling enables easier feature additions
  • Modular: New functionality can be added without touching core logic
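  • Performance: Event handlers are coroutines, so dispatch can later be made concurrent or moved to background tasks without changing the code that publishes events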

Developer Experience

  • Code Navigation: Easier to find relevant code
  • Documentation: Smaller modules are easier to document
  • Onboarding: New developers can understand individual components

Reliability

  • Error Isolation: Failures in one module don't cascade
  • State Management: Centralized state reduces synchronization bugs
  • Connection Handling: Robust reconnection and error recovery
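
One common way to get robust reconnection is capped exponential backoff with optional jitter. The sketch below is an assumption about how a connection manager could space out its retries, not a description of the current code; backoff_delays, connect_with_retry, and their default values are hypothetical.

```python
import asyncio
import random
from typing import Awaitable, Callable, Iterable, Iterator, Optional


def backoff_delays(
    base: float = 0.5,
    cap: float = 30.0,
    max_attempts: int = 8,
    rng: Optional[random.Random] = None,
) -> Iterator[float]:
    """Yield one delay per attempt: base * 2**attempt, limited to cap.

    When rng is given, each delay is drawn uniformly from [0, ceiling]
    so that many clients do not reconnect at the same instant.
    """
    for attempt in range(max_attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng.uniform(0, ceiling) if rng else ceiling


async def connect_with_retry(
    connect: Callable[[], Awaitable[None]],
    delays: Iterable[float],
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> int:
    """Call connect() until it succeeds; return the number of attempts used."""
    attempts = 0
    for delay in delays:
        attempts += 1
        try:
            await connect()
            return attempts
        except OSError:
            # ConnectionError and its subclasses are OSError subclasses.
            await sleep(delay)
    raise ConnectionError(f"gave up after {attempts} attempts")


print(list(backoff_delays(base=0.5, cap=4.0, max_attempts=5)))
# [0.5, 1.0, 2.0, 4.0, 4.0]
```

Passing the sleep function as a parameter keeps the retry loop testable without real waiting.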

Risk Mitigation

Breaking Changes

  • Implement changes incrementally
  • Maintain backward compatibility during transition
  • Comprehensive testing at each phase
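
One possible way to keep behavior stable while handlers are migrated one at a time is to let the new router fall back to the existing dispatch code for any message type that has not been ported yet. The sketch below assumes the current logic can be wrapped in a single coroutine; MigratingRouter and the legacy callable are hypothetical names.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

NewHandler = Callable[[Dict[str, Any]], Awaitable[None]]
LegacyHandler = Callable[[str, Dict[str, Any]], Awaitable[None]]


class MigratingRouter:
    """Routes to a ported handler when one exists, else to the legacy code."""

    def __init__(self, legacy: LegacyHandler) -> None:
        self._legacy = legacy
        self._handlers: Dict[str, NewHandler] = {}

    def register(self, message_type: str, handler: NewHandler) -> None:
        self._handlers[message_type] = handler

    async def route(self, message_type: str, data: Dict[str, Any]) -> str:
        """Dispatch the message and report which path handled it."""
        handler = self._handlers.get(message_type)
        if handler is not None:
            await handler(data)
            return "new"
        await self._legacy(message_type, data)
        return "legacy"


async def main() -> List[str]:
    log: List[str] = []

    async def legacy_dispatch(message_type: str, data: Dict[str, Any]) -> None:
        log.append(f"legacy:{message_type}")

    async def set_name(data: Dict[str, Any]) -> None:
        log.append(f"new:set_name:{data['name']}")

    router = MigratingRouter(legacy_dispatch)
    router.register("set_name", set_name)  # ported so far: set_name only
    await router.route("set_name", {"name": "Alice"})
    await router.route("join", {"lobby": "a"})  # not ported yet
    return log
```

The returned path label ("new" or "legacy") can be logged or counted to track how much traffic still depends on the old code before it is removed.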

Performance Impact

  • Benchmark before and after changes
  • Event system should be lightweight
  • Monitor memory usage and connection handling

Team Coordination

  • Clear communication about architecture changes
  • Code review process for architectural decisions
  • Documentation updates with each phase

Conclusion

This refactoring will transform the current monolithic architecture into a maintainable, scalable system. The modular approach will reduce complexity, improve testability, and make the codebase more approachable for new developers while maintaining all existing functionality.