Snapberry - Help the Blind Hear the World
This is the first release of Snapberry, an individual homework project for the IoT course at CMU.
The second release will be group work exploring more possibilities.
Introduction
Snapberry is a Raspberry Pi project that creates an assistive IoT device for the blind to hear the world. It is built with a camera, a Raspberry Pi, and a speaker. The scene captured by the camera is translated into a text description by the Microsoft Vision API and then spoken aloud through the speaker.
Links
Components
Spectacles as Camera
Spectacles by Snap Inc. have a distinctive design with cameras on both sides. This naturally fits Snapberry's need for a wearable camera. Isn't it cool to extend Spectacles to cover new users and a new market?
Raspberry Pi
In this project, the Raspberry Pi serves as the server that processes the captured photos. The server is built with Node.js and Express.
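As a minimal sketch of how such a server might be bootstrapped (the file layout, folder names, and port here are assumptions, not the project's actual configuration):
var express = require('express');
var app = express();

// Serve the front-end assets and mount the photo route shown in the Backend section.
app.use(express.static('public'));       // assumed front-end folder
app.use('/', require('./routes/photo')); // hypothetical path to the photo router

app.listen(3000, function () {
    console.log('Snapberry server listening on port 3000');
});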
Wireless Earbuds as Speaker
The text description of the scene is generated by the Microsoft Vision API and then read aloud through the speaker. Wireless earbuds such as AirPods, Bragi, or BeatsX can be used so that a blind user can hear the scene description.
Core Code and Explanation
Front-end
$(".GetBtn").click(() => {
// get time stamp
var timeStamp = Math.floor(Date.now() / 1000);
// get photo from raspberry camera => scene recognition => speech
getPhoto(timeStamp)
.then(res => res.blob())
.then((data) => {
console.log(data);
return postForDesc(data);
})
.then(res => {
console.log(res);
$(".res").html(res.description.captions["0"].text);
responsiveVoice.speak(res.description.captions["0"].text, "US English Female");
})
})
For the front-end part, the process can be divided into three steps:
- Call the getPhoto() API to ask the Raspberry Pi camera to take a photo.
- Send the returned photo to the Microsoft Vision API for image recognition and get a description of the scene.
- Call the responsiveVoice API to read the description aloud for the blind user.
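The helpers getPhoto() and postForDesc() are not shown above. Below is a minimal sketch of what they might look like, assuming the backend /photo route from the next section and a Microsoft Computer Vision subscription; the region and the key are placeholders, not the project's actual values:
// Ask the backend to take a photo; the time stamp becomes part of the filename.
function getPhoto(timeStamp) {
    return fetch('/photo?timeStamp=' + timeStamp);
}

// Send the photo blob to the Microsoft Vision API and parse the JSON response.
// The region ("westus") and the subscription key are placeholders.
function postForDesc(blob) {
    var endpoint = 'https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze?visualFeatures=Description';
    return fetch(endpoint, {
        method: 'POST',
        headers: {
            'Ocp-Apim-Subscription-Key': '<your-subscription-key>',
            'Content-Type': 'application/octet-stream'
        },
        body: blob
    }).then(res => res.json());
}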
Backend
var express = require('express');
var router = express.Router();
var RaspiCam = require('raspicam');
var fs = require('fs');

router.get('/photo', function (req, res) {
    var camera = new RaspiCam({
        mode: "photo",
        output: "./photo/image" + req.query.timeStamp + ".png",
        encoding: "png",
        timeout: 0 // take the picture immediately
    });
    camera.on("start", function (err, timestamp) {
        console.log("photo started at " + timestamp);
    });
    camera.on("read", function (err, timestamp, filename) {
        console.log("photo image captured with filename: " + filename);
        // leave 1s for the photo to be saved to disk
        setTimeout(() => {
            camera.stop();
        }, 1000);
    });
    camera.on("exit", function (timestamp) {
        console.log("photo child process has exited at " + timestamp);
        // read the file and send it back to the front-end
        var img = fs.readFileSync('./photo/image' + req.query.timeStamp + '.png');
        res.writeHead(200, { 'Content-Type': 'image/png' }); // PNG, not GIF
        res.end(img, 'binary');
    });
    camera.start();
});

module.exports = router;
The core of the backend is controlling the camera to take and save a photo, and exposing an API that returns the photo to the front-end.
A package named raspicam handles the communication between the Raspberry Pi and the camera. Notice that a time stamp is sent from the front-end to record when the photo is taken. Embedding it in the filename gives the photos semantic names, which helps with later categorization.
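For instance, a hypothetical helper (not part of the project) could map a saved filename back to a human-readable date for categorization:
// Hypothetical helper: recover the capture date from a filename such as
// "./photo/image1512345678.png" (Unix seconds embedded by the backend above).
function photoDate(filename) {
    var match = filename.match(/image(\d+)\.png$/);
    return match ? new Date(Number(match[1]) * 1000) : null;
}

console.log(photoDate('./photo/image1512345678.png')); // => Date for the embedded time stamp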
Results
Framework
- Backend: Node.js, Express, raspicam
- Frontend: jQuery, responsiveVoice
- Other: Microsoft Vision API