# Architecture Recommendations: Sessions, Lobbies, and WebSockets

## Executive Summary

The current architecture has grown organically into a monolithic structure that mixes concerns and creates maintenance challenges. This document outlines specific recommendations to improve maintainability, reduce complexity, and enhance the development experience.

## Current Issues

### 1. Server (`server/main.py`)

- **Monolithic structure**: 2300+ lines in a single file
- **Mixed concerns**: Session, lobby, WebSocket, bot, and admin logic intertwined
- **Complex state management**: Multiple global dictionaries requiring manual synchronization
- **WebSocket message handling**: Deeply nested switch statements are hard to follow
- **Threading complexity**: Multiple locks and shared state increase deadlock risk

### 2. Client (`client/src/`)

- **Fragmented connection logic**: WebSocket handling scattered across components
- **Error handling complexity**: Different scenarios handled inconsistently
- **State synchronization**: Multiple sources of truth for session/lobby state

### 3. Voicebot (`voicebot/`)

- **Duplicate patterns**: Similar WebSocket logic, implemented differently
- **Bot lifecycle complexity**: Complex orchestration with unclear state flow

## Proposed Architecture

### Server Refactoring

#### 1. Extract Core Modules

```
server/
├── main.py                  # FastAPI app setup and routing only
├── core/
│   ├── __init__.py
│   ├── session_manager.py   # Session lifecycle and persistence
│   ├── lobby_manager.py     # Lobby management and chat
│   ├── bot_manager.py       # Bot provider and orchestration
│   └── auth_manager.py      # Name/password authentication
├── websocket/
│   ├── __init__.py
│   ├── connection.py        # WebSocket connection handling
│   ├── message_handlers.py  # Message type routing and handling
│   └── signaling.py         # WebRTC signaling logic
├── api/
│   ├── __init__.py
│   ├── admin.py             # Admin endpoints
│   ├── sessions.py          # Session HTTP API
│   ├── lobbies.py           # Lobby HTTP API
│   └── bots.py              # Bot HTTP API
└── models/
    ├── __init__.py
    ├── session.py           # Session and Lobby classes
    └── events.py            # Event system for decoupled communication
```

#### 2. Event-Driven Architecture

Replace direct method calls with an event system:

```python
from abc import ABC
from typing import Protocol


class Event(ABC):
    """Base event class"""
    pass


class SessionJoinedLobby(Event):
    def __init__(self, session_id: str, lobby_id: str):
        self.session_id = session_id
        self.lobby_id = lobby_id


class EventHandler(Protocol):
    async def handle(self, event: Event) -> None: ...


class EventBus:
    def __init__(self):
        # Maps each event class to the handlers subscribed to it
        self._handlers: dict[type[Event], list[EventHandler]] = {}

    def subscribe(self, event_type: type[Event], handler: EventHandler):
        if event_type not in self._handlers:
            self._handlers[event_type] = []
        self._handlers[event_type].append(handler)

    async def publish(self, event: Event):
        # Handlers run sequentially, in subscription order
        event_type = type(event)
        if event_type in self._handlers:
            for handler in self._handlers[event_type]:
                await handler.handle(event)
```

#### 3. WebSocket Message Router

Replace the massive switch statement with a clean router:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict

from fastapi import WebSocket

# Session is assumed to live in the proposed models/session.py;
# adjust the import path to the actual package layout.
from models.session import Session


class MessageHandler(ABC):
    @abstractmethod
    async def handle(self, session: Session, data: Dict[str, Any], websocket: WebSocket) -> None:
        pass


class SetNameHandler(MessageHandler):
    async def handle(self, session: Session, data: Dict[str, Any], websocket: WebSocket) -> None:
        # Handle set_name logic here
        pass


class WebSocketRouter:
    def __init__(self):
        self._handlers: Dict[str, MessageHandler] = {}

    def register(self, message_type: str, handler: MessageHandler):
        self._handlers[message_type] = handler

    async def route(self, message_type: str, session: Session, data: Dict[str, Any], websocket: WebSocket):
        if message_type in self._handlers:
            await self._handlers[message_type].handle(session, data, websocket)
        else:
            # Report unknown message types back to the sender
            await websocket.send_json({"type": "error", "data": {"error": f"Unknown message type: {message_type}"}})
```

### Client Refactoring

#### 1. Centralized Connection Management

Create a single WebSocket connection manager:

```typescript
// src/connection/WebSocketManager.ts
export class WebSocketManager {
  private ws: WebSocket | null = null;
  private reconnectAttempts = 0;
  private messageHandlers = new Map<string, (data: any) => void>();

  constructor(private url: string) {}

  async connect(): Promise<void> {
    // Connection logic with automatic reconnection
  }

  subscribe(messageType: string, handler: (data: any) => void): void {
    this.messageHandlers.set(messageType, handler);
  }

  send(type: string, data: any): void {
    if (this.ws?.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify({ type, data }));
    }
  }

  private handleMessage(event: MessageEvent): void {
    const message = JSON.parse(event.data);
    const handler = this.messageHandlers.get(message.type);
    if (handler) {
      handler(message.data);
    }
  }
}
```

#### 2. Unified State Management

Use a state management pattern (Context + Reducer or Zustand):

```typescript
// src/store/AppStore.ts
interface AppState {
  session: Session | null;
  lobby: Lobby | null;
  participants: Participant[];
  connectionStatus: 'disconnected' | 'connecting' | 'connected';
  error: string | null;
}

type AppAction =
  | { type: 'SET_SESSION'; payload: Session }
  | { type: 'SET_LOBBY'; payload: Lobby }
  | { type: 'UPDATE_PARTICIPANTS'; payload: Participant[] }
  | { type: 'SET_CONNECTION_STATUS'; payload: AppState['connectionStatus'] }
  | { type: 'SET_ERROR'; payload: string | null };

const appReducer = (state: AppState, action: AppAction): AppState => {
  switch (action.type) {
    case 'SET_SESSION':
      return { ...state, session: action.payload };
    // ... other cases
    default:
      return state;
  }
};
```

### Voicebot Refactoring

#### 1. Unified Connection Interface

Create a common WebSocket interface used by both client and voicebot:

```python
# shared/websocket_client.py
from abc import ABC, abstractmethod
from typing import Any, Awaitable, Callable, Dict

# Handlers are awaited in handle_message, so they must be async callables
MessageCallback = Callable[[Dict[str, Any]], Awaitable[None]]


class WebSocketClient(ABC):
    def __init__(self, url: str, session_id: str, lobby_id: str):
        self.url = url
        self.session_id = session_id
        self.lobby_id = lobby_id
        self.message_handlers: Dict[str, MessageCallback] = {}

    @abstractmethod
    async def connect(self) -> None:
        pass

    @abstractmethod
    async def send_message(self, message_type: str, data: Dict[str, Any]) -> None:
        pass

    def register_handler(self, message_type: str, handler: MessageCallback):
        self.message_handlers[message_type] = handler

    async def handle_message(self, message_type: str, data: Dict[str, Any]):
        handler = self.message_handlers.get(message_type)
        if handler:
            await handler(data)
```

## Implementation Plan

### Phase 1: Server Foundation (Weeks 1-2)

1. Extract `SessionManager` and `LobbyManager` classes
2. Implement basic event system
3. Create WebSocket message router
4. Move admin endpoints to separate module

### Phase 2: Server Completion (Weeks 3-4)

1. Extract bot management functionality
2. Implement remaining message handlers
3. Add comprehensive testing
4. Performance optimization

### Phase 3: Client Refactoring (Weeks 5-6)

1. Implement centralized WebSocket manager
2. Create unified state management
3. Refactor components to use new architecture
4. Add error boundary and better error handling

### Phase 4: Voicebot Integration (Weeks 7-8)

1. Create shared WebSocket interface
2. Refactor voicebot to use common patterns
3. Improve bot lifecycle management
4. Integration testing

## Benefits of Proposed Architecture

### Maintainability

- **Single Responsibility**: Each module has a clear, focused purpose
- **Testability**: Smaller, focused classes are easier to unit test
- **Debugging**: Clear separation makes it easier to trace issues

### Scalability

- **Event-driven**: Loose coupling enables easier feature additions
- **Modular**: New functionality can be added without touching core logic
- **Performance**: Event system enables asynchronous processing

### Developer Experience

- **Code Navigation**: Easier to find relevant code
- **Documentation**: Smaller modules are easier to document
- **Onboarding**: New developers can understand individual components

### Reliability

- **Error Isolation**: Failures in one module don't cascade
- **State Management**: Centralized state reduces synchronization bugs
- **Connection Handling**: Robust reconnection and error recovery

## Risk Mitigation

### Breaking Changes

- Implement changes incrementally
- Maintain backward compatibility during transition
- Comprehensive testing at each phase

### Performance Impact

- Benchmark before and after changes
- Event system should be lightweight
- Monitor memory usage and connection handling

### Team Coordination

- Clear communication about architecture changes
- Code review process for architectural decisions
- Documentation updates with each phase

## Conclusion

This refactoring will transform the current monolithic architecture into a maintainable, scalable system. The modular approach will reduce complexity, improve testability, and make the codebase more approachable for new developers while maintaining all existing functionality.
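
## Appendix: Isolating Event Handler Failures

The Error Isolation and Performance benefits listed earlier depend on how the event bus dispatches. The `EventBus.publish` shown in the Event-Driven Architecture section awaits handlers one at a time and lets the first exception propagate, so one failing subscriber would prevent later subscribers from running. The following is a minimal sketch of one possible alternative, not the project's implementation: the class name `IsolatingEventBus`, the `Handler` alias, and the logger name are all hypothetical, and handlers are plain async callables for brevity (a bound `handler.handle` method can be subscribed the same way).

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger("event_bus")  # hypothetical logger name

# Hypothetical handler type: any async callable that accepts the event object.
Handler = Callable[[object], Awaitable[None]]


class IsolatingEventBus:
    """Sketch of an event bus where one failing handler cannot block the others."""

    def __init__(self) -> None:
        self._handlers: dict[type, list[Handler]] = {}

    def subscribe(self, event_type: type, handler: Handler) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    async def publish(self, event: object) -> list[BaseException]:
        """Run all handlers concurrently and return any exceptions they raised."""
        handlers = self._handlers.get(type(event), [])
        # return_exceptions=True collects failures as results instead of
        # raising the first one, so every handler gets a chance to finish.
        results = await asyncio.gather(
            *(handler(event) for handler in handlers),
            return_exceptions=True,
        )
        errors = [r for r in results if isinstance(r, BaseException)]
        for error in errors:
            logger.error("Handler failed for %s: %r", type(event).__name__, error)
        return errors
```

Returning the collected exceptions lets the caller decide whether a failure should be surfaced to the WebSocket client or only logged. Running handlers concurrently also means they must not rely on subscription order, which is a trade-off against the sequential version.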