James Bond had cool devices, but not as cool as these everyday listening devices: potato chip bags, plants, tin foil — anything that vibrates to sound waves of human speech.
Two years ago, US President Barack Obama was caught on an open microphone talking to then Russian President Dmitry Medvedev, reassuring him that certain political decisions between the two countries would be easier to make after Obama was re-elected in 2012. But what if the audio portion of the recording had not been available? Could the conversation have been reconstructed from the visual portion alone?
Reconstructing Audio Using Visual Information
Research currently underway at Adobe, Microsoft and MIT may make billions of hours of television and videos fair game for reconstructing conversations going on while cameras were rolling.
Simply put, cameras recording at even a standard 60 frames per second can pick up the minute vibrations of objects in a video, such as the surface of a glass of water or the leaves of a nearby plant. Computer algorithms then extract an audio signal from those vibrations, allowing intelligible speech to be reconstructed.
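To make the idea concrete, here is a toy sketch of turning frame-to-frame motion into a one-dimensional signal. It is not the researchers' algorithm: it swaps in a much simpler technique (tracking the brightness centroid of a simulated one-dimensional "frame"), and every name and number in it (FPS, TONE_HZ, render_frame, recover_signal and so on) is a hypothetical chosen for illustration.

```python
import math

# All values below are illustrative assumptions, not figures from the research.
FPS = 2000        # assumed frame rate of a hypothetical high-speed camera
TONE_HZ = 200     # assumed tone that makes the object vibrate
N_FRAMES = 400    # number of simulated frames
WIDTH = 32        # pixels in each simulated 1-D frame


def render_frame(shift):
    """Render a smooth bright blob centred at WIDTH/2 + shift (in pixels)."""
    centre = WIDTH / 2 + shift
    return [math.exp(-((x - centre) ** 2) / (2 * 3.0 ** 2)) for x in range(WIDTH)]


def centroid(frame):
    """Brightness-weighted mean position, which resolves sub-pixel motion."""
    return sum(x * v for x, v in enumerate(frame)) / sum(frame)


def recover_signal(frames):
    """Track the blob position per frame and remove the constant offset."""
    positions = [centroid(f) for f in frames]
    mean = sum(positions) / len(positions)
    return [p - mean for p in positions]


def dominant_frequency(signal, fps):
    """Return the frequency of the strongest bin of a naive DFT."""
    n = len(signal)
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(signal))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(signal))
        mag = re * re + im * im
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * fps / n


# Simulate an object moving by at most a hundredth of a pixel.
true_shift = [0.01 * math.sin(2 * math.pi * TONE_HZ * i / FPS) for i in range(N_FRAMES)]
frames = [render_frame(s) for s in true_shift]

recovered = recover_signal(frames)
peak_hz = dominant_frequency(recovered, FPS)
print(f"strongest recovered tone: {peak_hz:g} Hz")
```

Even though the simulated object never moves more than a hundredth of a pixel, averaging over many pixels recovers the tone that drove it. Real footage adds noise, lighting changes and camera shake, which this sketch ignores.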
Everyday Objects are Now “Visual Microphones”
Sound hitting any object causes imperceptible vibrations. The main caveat is that the video's frame rate must be higher than the frequency of the audio signal; strictly, a signal must be sampled at more than twice its highest frequency to be captured faithfully. High-speed cameras, at 2,000 to 6,000 frames per second, are better at picking up these vibrations, but today’s smartphones will suffice for rougher results. Commercial high-speed cameras, which can take 100,000 frames per second, would be particularly adept at helping translate images into people’s voices.
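The frame-rate caveat can be put into numbers with the standard sampling rule (the Nyquist criterion): a camera that samples uniformly can only represent tones below half its frame rate. The sketch below applies that rule to the frame rates mentioned above. The function names are hypothetical, and the 300 Hz figure is just an assumed example of a speech component.

```python
def max_recoverable_hz(frames_per_second):
    """Highest tone a uniformly sampled video can represent (half the frame rate)."""
    return frames_per_second / 2


def can_capture(frames_per_second, audio_hz):
    """True if a tone at audio_hz lies below the limit for this frame rate."""
    return audio_hz < max_recoverable_hz(frames_per_second)


# Frame rates taken from the article; 300 Hz is an assumed example tone.
for fps in (60, 2000, 6000, 100000):
    limit = max_recoverable_hz(fps)
    verdict = "yes" if can_capture(fps, 300) else "no"
    print(f"{fps:>6} fps: tones below {limit:g} Hz, captures 300 Hz? {verdict}")
```

By this rule alone, a 60 frames per second video would only capture tones below 30 Hz, which is why recovering speech from ordinary footage is harder and rougher than from high-speed footage.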
Researchers are testing a variety of materials to see how they respond to sound waves, so that recovering audio from video becomes progressively easier.