Voice Slowly Catching Up on Multimodal AI Features