Revolutionizing Live Performances With Gesture-Controlled Audio-Visuals
We present a compact, energy-efficient gesture-control system that empowers performers to intuitively manipulate audio and visual effects through eight discrete hand gestures. By deploying a lightweight machine-learning model on a Raspberry Pi 4, the system captures live hand landmarks with a camera module, processes them in real time (≤30 ms per inference), and transmits simple OSC toggles and thumb-tip coordinates to a Max 9 patch for instantaneous parameter control. This headless, portable solution eliminates bulky DJ/VJ hardware, reduces control-gear power consumption by over 85%, and halves per-event labor and equipment costs.
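To make the pipeline concrete, the sketch below shows what one iteration of the live-inference loop could look like: grab a camera frame, extract hand landmarks, classify the gesture, and forward an OSC toggle plus the thumb-tip position to the Max 9 patch. The specific libraries (OpenCV, MediaPipe Hands, python-osc), the OSC addresses, the UDP port, and the `classify` stub are illustrative assumptions and not taken from the project itself.

```python
# Minimal sketch of the live-inference loop (assumed libraries and OSC scheme,
# not the project's actual code): capture hand landmarks on the Pi, classify
# one of eight gestures, and send an OSC toggle plus thumb-tip coordinates.
import cv2
import mediapipe as mp
from pythonosc.udp_client import SimpleUDPClient

osc = SimpleUDPClient("127.0.0.1", 8000)   # hypothetical port; Max 9 side uses [udpreceive 8000]
hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
cap = cv2.VideoCapture(0)                  # Pi camera module exposed as a V4L2 device

def classify(landmarks):
    """Placeholder for the trained lightweight classifier (8 gesture classes)."""
    return 0  # hypothetical: index of the recognized gesture

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark
        gesture = classify(lm)
        thumb_tip = lm[4]                  # MediaPipe landmark 4 is the thumb tip
        osc.send_message("/gesture/toggle", gesture)              # assumed address
        osc.send_message("/thumb/xy", [thumb_tip.x, thumb_tip.y]) # assumed address

cap.release()
```

On the Max side, the corresponding patch would simply route the `/gesture/toggle` and `/thumb/xy` messages to effect parameters, which is what keeps the per-inference latency within the stated ≤30 ms budget.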
Extensive testing confirms ≥90% recognition accuracy under varied lighting conditions and steady 20 FPS performance on the Pi, meeting stringent latency and compute constraints. The modular design, separated into data collection, model training, live inference, and OSC integration, ensures easy maintenance and future extensibility. Our work lays a robust foundation for immersive, hands-free interaction in clubs, festivals, and home studios, and paves the way for expanded gesture vocabularies, personalized adaptation, and multi-user collaboration.
Project Details
- Student(s): Karam Azar, Samir Fawaz, Andreo Kobersy
- Advisor(s): Dr. Samer Saab
- Year: 2024-2025