High-performance WebSocket-based voice-processing backend for ESP32/M5Stack devices, with real-time audio streaming, transcription, and speech synthesis