Researchers give commands to smart speakers by pointing a laser at the microphone

Researchers have succeeded in sending commands to devices such as smart speakers by pointing laser light at the microphone, which the voice assistant then interprets as sound and therefore as an instruction. This is possible because of the way MEMS microphones respond to laser light.

Security researcher Takeshi Sugawara of the University of Electro-Communications in Tokyo and Professor Kevin Fu of the University of Michigan show that they can use laser light to talk to any computer or device that accepts voice commands. This includes, for example, Google Home, Amazon Echo and Facebook Portal speakers. In theory, the ‘light commands’ can be aimed at these devices from a great distance, for example to open garage doors, make online purchases, flip switches or perform other unwanted actions via the voice assistant. This works with Siri, Alexa and Google Assistant.

This ‘hack’ also works when the laser light passes through a window. As long as the laser beam is aimed at the microphones of, for example, smart speakers, tablets or smartphones, the microphones react to the light under certain circumstances as if it were sound: the microphone converts the laser light into an electrical signal. This appears to occur when the intensity of the light is varied at a specific frequency. By encoding an electrical signal in the intensity of the laser beam, the microphones can be fooled. In theory, this works with all devices with MEMS microphones, the researchers say. The microphones respond to light aimed directly at them.
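As a rough illustration of what it means to vary the light intensity according to a signal, the Python sketch below maps an audio waveform onto a normalized light intensity by adding it to a constant bias (amplitude modulation). The function names and the `bias` and `depth` parameters are hypothetical and chosen only for illustration; they are not taken from the study, and the values are in arbitrary normalized units rather than real laser settings.

```python
import math


def tone(freq_hz, sample_rate, n_samples):
    """Generate a sine tone with samples in the range [-1, 1]."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


def modulate_intensity(audio, bias=0.5, depth=0.4):
    """Map audio samples in [-1, 1] to a light intensity in [0, 1].

    bias  -- constant intensity when the audio is silent (assumed value)
    depth -- how strongly the audio swings the intensity (assumed value)
    """
    if bias - depth < 0 or bias + depth > 1:
        raise ValueError("bias +/- depth must stay within [0, 1]")
    # Clip each sample to [-1, 1] so the intensity never leaves its range.
    return [bias + depth * max(-1.0, min(1.0, s)) for s in audio]


# One full period of a 1 kHz tone sampled at 48 kHz (48 samples).
audio = tone(1000, 48000, 48)
intensity = modulate_intensity(audio)
```

The intensity swings around the bias in step with the audio waveform, which is the kind of variation the microphone ends up registering as if it were sound.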

The researchers tested their finding on sixteen different types of devices, using a 60 mW laser among others. Almost all smart speakers tested responded to the laser beams from a distance of more than 50 meters. With smartphones, the signals were only registered at shorter distances: ten meters for the iPhone XR and five meters for the Samsung Galaxy S9 and Google Pixel 2. With a 5 mW laser, the Google Home and the first-generation Echo Plus were found to respond even at a distance of more than 110 meters. Because windows are not an obstacle, it is possible to have the voice assistants perform actions by aiming the laser from another building.

No expensive, special equipment is needed to do this and sometimes it is not even necessary to aim the laser beam very precisely. The laser light may be visible, but the researchers state that it is also possible to use invisible infrared light. They have already tried this over a short distance and it turned out to work. Normally, voice assistants give an audible response after receiving a command, but according to the scientists, an attacker can in principle first use this method to give a command that lowers the volume or, for example, activates the whisper mode of the Echo speaker.

The scientists have shown that it works, but for now they have no physical explanation for how this is possible. Paul Horowitz, a Harvard professor emeritus, tells Wired there could be two mechanisms behind it. First, the laser light can raise the temperature, causing the air to expand and create a pressure difference, just as sound does. Second, the components of the devices can play a role if they are not completely opaque. The laser light could then pass through the microphone and fall directly onto the electronic chip that converts vibrations into an electrical signal. According to Horowitz, this is the same photovoltaic effect that occurs in diodes in solar cells and at the end of fiber optic cables, where light is converted into electricity.

The study is published under the title Light Commands: Laser-Based Audio Injection Attacks on Voice-Controllable Systems.