# Architecture Recommendations: Sessions, Lobbies, and WebSockets
## Executive Summary
The current architecture has grown organically into a monolithic structure that mixes concerns and creates maintenance challenges. This document outlines specific recommendations to improve maintainability, reduce complexity, and enhance the development experience.
## Current Issues
### 1. Server (`server/main.py`)
- **Monolithic structure**: 2300+ lines in a single file
- **Mixed concerns**: Session, lobby, WebSocket, bot, and admin logic intertwined
- **Complex state management**: Multiple global dictionaries requiring manual synchronization
- **WebSocket message handling**: Deeply nested conditional dispatch on message type is hard to follow
- **Threading complexity**: Multiple locks and shared state increase deadlock risk
### 2. Client (`client/src/`)
- **Fragmented connection logic**: WebSocket handling scattered across components
- **Error handling complexity**: Different scenarios handled inconsistently
- **State synchronization**: Multiple sources of truth for session/lobby state
### 3. Voicebot (`voicebot/`)
- **Duplicate patterns**: Similar WebSocket logic but different implementation
- **Bot lifecycle complexity**: Complex orchestration with unclear state flow
## Proposed Architecture
### Server Refactoring
#### 1. Extract Core Modules
```
server/
├── main.py                  # FastAPI app setup and routing only
├── core/
│   ├── __init__.py
│   ├── session_manager.py   # Session lifecycle and persistence
│   ├── lobby_manager.py     # Lobby management and chat
│   ├── bot_manager.py       # Bot provider and orchestration
│   └── auth_manager.py      # Name/password authentication
├── websocket/
│   ├── __init__.py
│   ├── connection.py        # WebSocket connection handling
│   ├── message_handlers.py  # Message type routing and handling
│   └── signaling.py         # WebRTC signaling logic
├── api/
│   ├── __init__.py
│   ├── admin.py             # Admin endpoints
│   ├── sessions.py          # Session HTTP API
│   ├── lobbies.py           # Lobby HTTP API
│   └── bots.py              # Bot HTTP API
└── models/
    ├── __init__.py
    ├── session.py           # Session and Lobby classes
    └── events.py            # Event system for decoupled communication
```
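As a rough illustration of what `core/session_manager.py` might own, the sketch below keeps session state inside one class guarded by a single `asyncio.Lock`, rather than in module-level dictionaries. The `Session` fields and the `create`/`get`/`remove` method names are assumptions made for this example, not the existing API.

```python
import asyncio
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Session:
    """Hypothetical minimal session record; the real class has more fields."""
    id: str
    name: Optional[str] = None
    lobby_id: Optional[str] = None


class SessionManager:
    """Owns all session state so callers never touch a global dict."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Session] = {}
        self._lock = asyncio.Lock()  # the only lock guarding _sessions

    async def create(self) -> Session:
        async with self._lock:
            session = Session(id=secrets.token_hex(8))
            self._sessions[session.id] = session
            return session

    async def get(self, session_id: str) -> Optional[Session]:
        async with self._lock:
            return self._sessions.get(session_id)

    async def remove(self, session_id: str) -> bool:
        """Return True if a session was removed, False if the id was unknown."""
        async with self._lock:
            return self._sessions.pop(session_id, None) is not None
```

Because every read and write goes through the manager, there is one place to add persistence or change the locking strategy later.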
#### 2. Event-Driven Architecture
Replace direct method calls with an event system:
```python
from typing import Protocol
from abc import ABC, abstractmethod


class Event(ABC):
    """Base event class"""
    pass


class SessionJoinedLobby(Event):
    def __init__(self, session_id: str, lobby_id: str):
        self.session_id = session_id
        self.lobby_id = lobby_id


class EventHandler(Protocol):
    async def handle(self, event: Event) -> None: ...


class EventBus:
    def __init__(self):
        self._handlers: dict[type[Event], list[EventHandler]] = {}

    def subscribe(self, event_type: type[Event], handler: EventHandler):
        if event_type not in self._handlers:
            self._handlers[event_type] = []
        self._handlers[event_type].append(handler)

    async def publish(self, event: Event):
        event_type = type(event)
        if event_type in self._handlers:
            for handler in self._handlers[event_type]:
                await handler.handle(event)
```
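To show how a publisher and a subscriber stay decoupled, here is a minimal usage sketch. It repeats a condensed copy of the bus so the block runs on its own; `JoinAnnouncer` and the recorded chat line are hypothetical, invented only for this example.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List, Protocol, Type


class Event:
    """Base event class."""


@dataclass
class SessionJoinedLobby(Event):
    session_id: str
    lobby_id: str


class EventHandler(Protocol):
    async def handle(self, event: Event) -> None: ...


class EventBus:
    def __init__(self) -> None:
        self._handlers: Dict[Type[Event], List[EventHandler]] = {}

    def subscribe(self, event_type: Type[Event], handler: EventHandler) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    async def publish(self, event: Event) -> None:
        # Handlers run sequentially, in subscription order.
        for handler in self._handlers.get(type(event), []):
            await handler.handle(event)


class JoinAnnouncer:
    """Hypothetical subscriber that records a chat line for each join."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    async def handle(self, event: Event) -> None:
        if isinstance(event, SessionJoinedLobby):
            self.lines.append(f"{event.session_id} joined {event.lobby_id}")


async def demo() -> List[str]:
    bus = EventBus()
    announcer = JoinAnnouncer()
    bus.subscribe(SessionJoinedLobby, announcer)
    # The publisher only knows the event type, not who is listening.
    await bus.publish(SessionJoinedLobby(session_id="s1", lobby_id="lobby-a"))
    return announcer.lines


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The session code that publishes `SessionJoinedLobby` never imports the lobby or chat modules, which is the decoupling this section is after.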
#### 3. WebSocket Message Router
Replace the large message-type dispatch block with a router that maps each message type to its own handler class:
```python
from abc import ABC, abstractmethod
from typing import Any, Dict

from fastapi import WebSocket

# Proposed module path from the layout above; adjust to the real package.
from models.session import Session


class MessageHandler(ABC):
    @abstractmethod
    async def handle(self, session: Session, data: Dict[str, Any], websocket: WebSocket) -> None:
        pass


class SetNameHandler(MessageHandler):
    async def handle(self, session: Session, data: Dict[str, Any], websocket: WebSocket) -> None:
        # Handle set_name logic here
        pass


class WebSocketRouter:
    def __init__(self):
        self._handlers: Dict[str, MessageHandler] = {}

    def register(self, message_type: str, handler: MessageHandler):
        self._handlers[message_type] = handler

    async def route(self, message_type: str, session: Session, data: Dict[str, Any], websocket: WebSocket):
        if message_type in self._handlers:
            await self._handlers[message_type].handle(session, data, websocket)
        else:
            # Unknown types get an explicit error instead of being silently dropped
            await websocket.send_json({"type": "error", "data": {"error": f"Unknown message type: {message_type}"}})
```
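One practical payoff of the router is that handlers can be exercised without a live connection. The sketch below is a self-contained variant that swaps FastAPI's `WebSocket` for a small test double. `FakeWebSocket`, the stand-in `Session`, the `name` payload key, and the `name_set` reply type are all assumptions made for illustration.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class FakeWebSocket:
    """Test double exposing only send_json, the one method the router uses."""

    def __init__(self) -> None:
        self.sent: List[Dict[str, Any]] = []

    async def send_json(self, payload: Dict[str, Any]) -> None:
        self.sent.append(payload)


class Session:
    """Hypothetical stand-in for the real Session model."""

    def __init__(self, session_id: str) -> None:
        self.id = session_id
        self.name: Optional[str] = None


class MessageHandler(ABC):
    @abstractmethod
    async def handle(self, session: Session, data: Dict[str, Any], websocket: FakeWebSocket) -> None: ...


class SetNameHandler(MessageHandler):
    async def handle(self, session: Session, data: Dict[str, Any], websocket: FakeWebSocket) -> None:
        name = str(data.get("name", "")).strip()
        if not name:
            await websocket.send_json({"type": "error", "data": {"error": "Name is required"}})
            return
        session.name = name
        # "name_set" is a made-up reply type for this sketch.
        await websocket.send_json({"type": "name_set", "data": {"name": name}})


class WebSocketRouter:
    def __init__(self) -> None:
        self._handlers: Dict[str, MessageHandler] = {}

    def register(self, message_type: str, handler: MessageHandler) -> None:
        self._handlers[message_type] = handler

    async def route(self, message_type: str, session: Session, data: Dict[str, Any], websocket: FakeWebSocket) -> None:
        handler = self._handlers.get(message_type)
        if handler is None:
            await websocket.send_json({"type": "error", "data": {"error": f"Unknown message type: {message_type}"}})
            return
        await handler.handle(session, data, websocket)


async def demo() -> List[Dict[str, Any]]:
    router = WebSocketRouter()
    router.register("set_name", SetNameHandler())
    ws, session = FakeWebSocket(), Session("s1")
    await router.route("set_name", session, {"name": " Ada "}, ws)
    await router.route("bogus", session, {}, ws)
    return ws.sent
```

Calling `asyncio.run(demo())` returns the two replies the fake socket captured: one for the accepted name and one for the unknown message type.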
### Client Refactoring
#### 1. Centralized Connection Management
Create a single WebSocket connection manager:
```typescript
// src/connection/WebSocketManager.ts
export class WebSocketManager {
  private ws: WebSocket | null = null;
  private reconnectAttempts = 0;
  private messageHandlers = new Map<string, (data: any) => void>();

  constructor(private url: string) {}

  async connect(): Promise<void> {
    // Connection logic with automatic reconnection
  }

  subscribe(messageType: string, handler: (data: any) => void): void {
    this.messageHandlers.set(messageType, handler);
  }

  send(type: string, data: any): void {
    if (this.ws?.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify({ type, data }));
    }
  }

  private handleMessage(event: MessageEvent): void {
    const message = JSON.parse(event.data);
    const handler = this.messageHandlers.get(message.type);
    if (handler) {
      handler(message.data);
    }
  }
}
```
#### 2. Unified State Management
Use a state management pattern (Context + Reducer or Zustand):
```typescript
// src/store/AppStore.ts
interface AppState {
  session: Session | null;
  lobby: Lobby | null;
  participants: Participant[];
  connectionStatus: 'disconnected' | 'connecting' | 'connected';
  error: string | null;
}

type AppAction =
  | { type: 'SET_SESSION'; payload: Session }
  | { type: 'SET_LOBBY'; payload: Lobby }
  | { type: 'UPDATE_PARTICIPANTS'; payload: Participant[] }
  | { type: 'SET_CONNECTION_STATUS'; payload: AppState['connectionStatus'] }
  | { type: 'SET_ERROR'; payload: string | null };

const appReducer = (state: AppState, action: AppAction): AppState => {
  switch (action.type) {
    case 'SET_SESSION':
      return { ...state, session: action.payload };
    // ... other cases
    default:
      return state;
  }
};
```
### Voicebot Refactoring
#### 1. Unified Connection Interface
Create a common WebSocket client interface for Python consumers such as the voicebot, using the same message-type dispatch pattern as the browser client:
```python
# shared/websocket_client.py
from abc import ABC, abstractmethod
from typing import Any, Awaitable, Callable, Dict

# Handlers must be coroutine functions because handle_message awaits them.
MessageCallback = Callable[[Dict[str, Any]], Awaitable[None]]


class WebSocketClient(ABC):
    def __init__(self, url: str, session_id: str, lobby_id: str):
        self.url = url
        self.session_id = session_id
        self.lobby_id = lobby_id
        self.message_handlers: Dict[str, MessageCallback] = {}

    @abstractmethod
    async def connect(self) -> None:
        pass

    @abstractmethod
    async def send_message(self, message_type: str, data: Dict[str, Any]) -> None:
        pass

    def register_handler(self, message_type: str, handler: MessageCallback):
        self.message_handlers[message_type] = handler

    async def handle_message(self, message_type: str, data: Dict[str, Any]):
        handler = self.message_handlers.get(message_type)
        if handler:
            await handler(data)
```
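A concrete subclass helps show how the interface would be used. The sketch below is a transport-free implementation intended for unit tests. It carries a condensed copy of the base class (constructor arguments omitted) so it runs standalone, and it assumes handlers are coroutine functions. `InMemoryClient` and the `join` and `chat_message` message types are hypothetical names.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Awaitable, Callable, Dict, List, Tuple

Handler = Callable[[Dict[str, Any]], Awaitable[None]]


class WebSocketClient(ABC):
    """Condensed copy of the shared interface; url/session fields omitted."""

    def __init__(self) -> None:
        self.message_handlers: Dict[str, Handler] = {}

    @abstractmethod
    async def connect(self) -> None: ...

    @abstractmethod
    async def send_message(self, message_type: str, data: Dict[str, Any]) -> None: ...

    def register_handler(self, message_type: str, handler: Handler) -> None:
        self.message_handlers[message_type] = handler

    async def handle_message(self, message_type: str, data: Dict[str, Any]) -> None:
        handler = self.message_handlers.get(message_type)
        if handler:
            await handler(data)


class InMemoryClient(WebSocketClient):
    """Hypothetical subclass that records outgoing messages instead of sending."""

    def __init__(self) -> None:
        super().__init__()
        self.connected = False
        self.outbox: List[Tuple[str, Dict[str, Any]]] = []

    async def connect(self) -> None:
        self.connected = True

    async def send_message(self, message_type: str, data: Dict[str, Any]) -> None:
        if not self.connected:
            raise ConnectionError("connect() must be called first")
        self.outbox.append((message_type, data))


async def demo() -> Tuple[List[Tuple[str, Dict[str, Any]]], List[str]]:
    client = InMemoryClient()
    seen: List[str] = []

    async def on_chat(data: Dict[str, Any]) -> None:
        seen.append(data["text"])

    client.register_handler("chat_message", on_chat)
    await client.connect()
    await client.send_message("join", {"lobby_id": "lobby-a"})
    await client.handle_message("chat_message", {"text": "hello"})
    await client.handle_message("unknown", {})  # no handler registered: ignored
    return client.outbox, seen
```

A real voicebot client would implement the same two abstract methods on top of its WebSocket library, leaving handler registration and dispatch untouched.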
## Implementation Plan
### Phase 1: Server Foundation (Weeks 1-2)
1. Extract `SessionManager` and `LobbyManager` classes
2. Implement basic event system
3. Create WebSocket message router
4. Move admin endpoints to separate module
### Phase 2: Server Completion (Weeks 3-4)
1. Extract bot management functionality
2. Implement remaining message handlers
3. Add comprehensive testing
4. Performance optimization
### Phase 3: Client Refactoring (Weeks 5-6)
1. Implement centralized WebSocket manager
2. Create unified state management
3. Refactor components to use new architecture
4. Add error boundary and better error handling
### Phase 4: Voicebot Integration (Weeks 7-8)
1. Create shared WebSocket interface
2. Refactor voicebot to use common patterns
3. Improve bot lifecycle management
4. Integration testing
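One possible way to make the bot lifecycle explicit during Phase 4 is a small state machine with a table of allowed transitions. The state names and the transition table below are assumptions for illustration; the real lifecycle may need different or additional states.

```python
from enum import Enum
from typing import Dict, List, Set


class BotState(Enum):
    # Hypothetical states; adjust to the real voicebot lifecycle.
    CREATED = "created"
    CONNECTING = "connecting"
    IN_LOBBY = "in_lobby"
    STOPPED = "stopped"


# Each state maps to the set of states it may move to next.
ALLOWED: Dict[BotState, Set[BotState]] = {
    BotState.CREATED: {BotState.CONNECTING, BotState.STOPPED},
    BotState.CONNECTING: {BotState.IN_LOBBY, BotState.STOPPED},
    BotState.IN_LOBBY: {BotState.CONNECTING, BotState.STOPPED},  # CONNECTING = reconnect
    BotState.STOPPED: set(),  # terminal
}


class BotLifecycle:
    """Tracks the current state and rejects transitions not listed in ALLOWED."""

    def __init__(self) -> None:
        self.state = BotState.CREATED
        self.history: List[BotState] = [BotState.CREATED]

    def transition(self, new_state: BotState) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"Illegal transition: {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
        self.history.append(new_state)
```

Keeping the transition table in one place means an unexpected state change fails loudly at the call site, and `history` gives a trace to inspect when debugging orchestration problems.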
## Benefits of Proposed Architecture
### Maintainability
- **Single Responsibility**: Each module has a clear, focused purpose
- **Testability**: Smaller, focused classes are easier to unit test
- **Debugging**: Clear separation makes it easier to trace issues
### Scalability
- **Event-driven**: Loose coupling enables easier feature additions
- **Modular**: New functionality can be added without touching core logic
- **Performance**: Async event handlers avoid blocking the event loop, and independent handlers can later be run concurrently
### Developer Experience
- **Code Navigation**: Easier to find relevant code
- **Documentation**: Smaller modules are easier to document
- **Onboarding**: New developers can understand individual components
### Reliability
- **Error Isolation**: Module boundaries contain failures, provided the event bus catches handler exceptions so one failing handler does not abort the rest
- **State Management**: Centralized state reduces synchronization bugs
- **Connection Handling**: Robust reconnection and error recovery
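Error isolation depends on how events are dispatched: a publish loop that simply awaits each handler stops at the first exception. The sketch below shows one way to isolate failures, using `asyncio.gather(..., return_exceptions=True)` to run handlers concurrently and collect errors. The function name `publish_isolated` and the logger name are assumptions for this example.

```python
import asyncio
import logging
from typing import Awaitable, Callable, List, Tuple

logger = logging.getLogger("events")  # hypothetical logger name
Handler = Callable[[object], Awaitable[None]]


async def publish_isolated(handlers: List[Handler], event: object) -> List[BaseException]:
    """Run every handler concurrently; collect failures instead of raising."""
    results = await asyncio.gather(
        *(handler(event) for handler in handlers),
        return_exceptions=True,  # exceptions come back as results
    )
    errors = [r for r in results if isinstance(r, BaseException)]
    for err in errors:
        logger.error("Event handler failed for %r: %r", event, err)
    return errors


async def demo() -> Tuple[List[object], List[str]]:
    calls: List[object] = []

    async def ok(event: object) -> None:
        calls.append(event)

    async def broken(event: object) -> None:
        raise RuntimeError("boom")

    # The broken handler is listed first, yet the healthy one still runs.
    errors = await publish_isolated([broken, ok], "session_joined")
    return calls, [str(e) for e in errors]
```

The trade-off is that handlers no longer run in a guaranteed order, so this approach only suits handlers that are independent of each other.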
## Risk Mitigation
### Breaking Changes
- Implement changes incrementally
- Maintain backward compatibility during transition
- Comprehensive testing at each phase
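Incremental migration could be sketched as a router that prefers a new handler when one is registered and otherwise falls back to the existing code path. In the example below, `legacy_dispatch` is a stand-in for the current monolithic handler, and `MigratingRouter` is a hypothetical name. Neither exists in the codebase.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

NewHandler = Callable[[Dict[str, Any]], Awaitable[str]]
LegacyHandler = Callable[[str, Dict[str, Any]], Awaitable[str]]


async def legacy_dispatch(message_type: str, data: Dict[str, Any]) -> str:
    """Stand-in for the existing monolithic handler in main.py."""
    return f"legacy:{message_type}"


class MigratingRouter:
    """Uses a new handler when one is registered, else the legacy path."""

    def __init__(self, fallback: LegacyHandler) -> None:
        self._handlers: Dict[str, NewHandler] = {}
        self._fallback = fallback

    def register(self, message_type: str, handler: NewHandler) -> None:
        self._handlers[message_type] = handler

    async def route(self, message_type: str, data: Dict[str, Any]) -> str:
        handler = self._handlers.get(message_type)
        if handler is not None:
            return await handler(data)
        # Not migrated yet: behaviour is unchanged for this message type.
        return await self._fallback(message_type, data)


async def demo() -> List[str]:
    async def new_set_name(data: Dict[str, Any]) -> str:
        return "new:set_name"

    router = MigratingRouter(fallback=legacy_dispatch)
    router.register("set_name", new_set_name)  # only this type is migrated
    return [
        await router.route("set_name", {"name": "Ada"}),
        await router.route("join", {"lobby_id": "lobby-a"}),
    ]
```

Message types can then be moved over one at a time, and the fallback can be deleted once every type has a registered handler.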
### Performance Impact
- Benchmark before and after changes
- Event system should be lightweight
- Monitor memory usage and connection handling
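To put a number on "lightweight", a micro-benchmark along these lines could be run before and after the refactor. It times sequential awaited dispatch to no-op handlers, so it measures dispatch overhead only and ignores real handler work. The function names are hypothetical.

```python
import asyncio
import time


async def noop_handler(event: object) -> None:
    """Does nothing, so only dispatch overhead is measured."""
    return None


async def time_dispatch(n_events: int, n_handlers: int) -> float:
    """Return the average seconds per published event."""
    handlers = [noop_handler] * n_handlers
    start = time.perf_counter()
    for i in range(n_events):
        for handler in handlers:
            await handler(i)
    elapsed = time.perf_counter() - start
    return elapsed / n_events


if __name__ == "__main__":
    per_event = asyncio.run(time_dispatch(n_events=10_000, n_handlers=5))
    print(f"{per_event * 1e6:.2f} microseconds per event")
```

Results vary by machine and Python version, so the useful comparison is between runs on the same host rather than against a fixed threshold.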
### Team Coordination
- Clear communication about architecture changes
- Code review process for architectural decisions
- Documentation updates with each phase
## Conclusion
This refactoring will transform the current monolithic architecture into a maintainable, scalable system. The modular approach will reduce complexity, improve testability, and make the codebase more approachable for new developers while maintaining all existing functionality.