If you’ve ever tried to communicate with someone who only spoke a foreign language, you know that it can be extremely difficult — even with the help of modern translation websites. This project will show you how to turn a $35 mini-computer into a feature-rich language translator that not only supports voice recognition and native speaker playback, but is also capable of translating between thousands of language pairs. The unbelievable part is that it can all be done on the cheap by leveraging inexpensive hardware, free translation APIs, and some open source software.

Even if you’re not interested in building this exact translation tool, parts of this project may still interest you, such as Google’s speech recognition API, Microsoft’s translation API, and text-to-speech playback. All of the source code for this project is publicly available on GitHub, and you’re welcome to use and modify it as you wish. The Universal Translator makes a perfect weekend project because it incorporates a wide range of technology and tools to create something immediately useful. Oh, and it’s a blast to play with.

Pick up all the parts at your local RadioShack, follow the instructions below, and you’ll have your own babel fish in no time.

Project Steps

Set up Raspberry Pi and Download Software

If you don’t already have your Raspberry Pi up and running, follow the instructions provided by the Raspberry Pi Foundation: http://www.raspberrypi.org/help/noobs-setup/

Make sure your Raspberry Pi is connected to the internet. If you don’t have an Ethernet connection handy, you can use a USB Wi-Fi dongle to connect to your network.

From the command line, run the following two commands to update the software on your Raspberry Pi:

sudo apt-get update

sudo apt-get upgrade

This process may take a while.

Install the software required for this project with these commands:

sudo apt-get install python-pip mplayer flac python2.7-dev libcurl4-gnutls-dev

sudo pip install requests pycurl
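If you want to confirm that the installation worked before moving on, a short check like the one below can help. It is only a sketch: the helper names (find_on_path, missing_commands, missing_modules) are made up for illustration, and the lists of commands and modules are simply taken from the two install commands above. Run it with the same Python interpreter that pip installed the modules for.

```python
import os


def find_on_path(command):
    """Return the full path of an executable found on PATH, or None."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_commands(commands):
    """Return the commands that could not be found on PATH."""
    return [name for name in commands if find_on_path(name) is None]


def missing_modules(modules):
    """Return the Python modules that fail to import."""
    missing = []
    for name in modules:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


# Names taken from the apt-get and pip commands in this step.
commands = missing_commands(["mplayer", "flac"])
modules = missing_modules(["requests", "pycurl"])
print("Missing commands: %s" % (", ".join(commands) or "none"))
print("Missing modules: %s" % (", ".join(modules) or "none"))
```

If either line reports something other than "none", re-run the matching install command before continuing.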

Note: This project was tested with the latest version of Raspbian at the time of publishing. Be aware that some details may change as updated software is released. To check your version against the version in our test build, execute uname -a from the command line:

pi@raspberrypi ~ $ uname -a

Linux raspberrypi 3.12.22+ #691 PREEMPT Wed Jun 18 18:29:58 BST 2014 armv6l GNU/Linux
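Comparing two uname lines by eye is easy to get wrong, so here is a minimal sketch that pulls out the two fields that matter most: the kernel release and the machine architecture. The function names are hypothetical, and the parsing assumes the field order shown in the output above, with the machine name as the second-to-last field.

```python
# The reference line is the test build output quoted in this article.
REFERENCE = ("Linux raspberrypi 3.12.22+ #691 PREEMPT "
             "Wed Jun 18 18:29:58 BST 2014 armv6l GNU/Linux")


def parse_uname(line):
    """Split one line of `uname -a` output into the fields worth comparing."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError("unexpected uname output: %r" % line)
    return {
        "system": fields[0],
        "hostname": fields[1],
        "kernel_release": fields[2],
        "machine": fields[-2],  # assumption: machine is second to last
    }


def same_build(line_a, line_b):
    """True when both lines report the same kernel release and machine."""
    a, b = parse_uname(line_a), parse_uname(line_b)
    return ((a["kernel_release"], a["machine"]) ==
            (b["kernel_release"], b["machine"]))


info = parse_uname(REFERENCE)
print("%s on %s" % (info["kernel_release"], info["machine"]))
```

Paste your own uname -a output into same_build alongside REFERENCE to see whether your kernel matches the test build.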

Configure and Test Headset

Plug in the USB headset (use a powered USB hub, if necessary).

Run the following commands, which will list your sound devices:

cat /proc/asound/cards

cat /proc/asound/modules

You should see that the Logitech Headset is listed as card 1. The second command should show that the driver for card 0 (the default output) is snd_bcm2835, which is the Raspberry Pi’s analog audio output. The driver for card 1 (our Logitech Headset) is snd_usb_audio. If you don’t see the headset listed, try rebooting:
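If you would like to check the card numbering from a script, the sketch below parses text in the layout that /proc/asound/cards typically uses. The sample text is illustrative rather than copied from the test build, so the exact names on your Pi will differ, and the function names are hypothetical.

```python
import re

# Illustrative sample only; your card names and USB path will differ.
SAMPLE_CARDS = """\
 0 [ALSA           ]: bcm2835 - bcm2835 ALSA
                      bcm2835 ALSA
 1 [Headset        ]: USB-Audio - Logitech USB Headset
                      Logitech Logitech USB Headset at usb-bcm2708_usb-1.2, full speed
"""

# Matches the first line of each card entry: index, [id], driver name, card name.
CARD_LINE = re.compile(r"^\s*(\d+)\s+\[([^\]]*)\]:\s*(\S+)\s+-\s+(.*)$")


def parse_cards(text):
    """Return one dict per sound card found in the given text."""
    cards = []
    for line in text.splitlines():
        match = CARD_LINE.match(line)
        if match:
            index, card_id, driver_name, name = match.groups()
            cards.append({
                "index": int(index),
                "id": card_id.strip(),
                "driver_name": driver_name,  # short name as printed, e.g. USB-Audio
                "name": name.strip(),
            })
    return cards


def find_card(cards, keyword):
    """Return the index of the first card whose name contains keyword."""
    keyword = keyword.lower()
    for card in cards:
        if keyword in card["name"].lower():
            return card["index"]
    return None


print(find_card(parse_cards(SAMPLE_CARDS), "logitech"))  # prints 1
```

On the Pi itself, you could pass the contents of /proc/asound/cards to parse_cards instead of the sample text.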

sudo reboot

In order to set the USB headset as the default for both audio input and output, you’ll need to update the ALSA configuration file. Open it in the text editor nano:

sudo nano /etc/modprobe.d/alsa-base.conf

Change the line that says:

options snd-usb-audio index=-2

to:

options snd-usb-audio index=0

Save and close the file by pressing Ctrl-X, then typing y and pressing Enter to confirm. Reboot the Raspberry Pi using the following command:

sudo reboot
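If you would rather script the configuration change described above than make it by hand in nano, it can be expressed as a small text transformation. This is a sketch under assumptions: set_usb_audio_index is a hypothetical helper, and apart from the snd-usb-audio line quoted in this article, the example config lines are invented for illustration.

```python
import re

# Matches the whole "options snd-usb-audio index=N" line, whatever N is.
OPTION_LINE = re.compile(
    r"^(\s*options\s+snd-usb-audio\s+index=)-?\d+([ \t]*)$", re.MULTILINE)


def set_usb_audio_index(config_text, index=0):
    """Return config_text with the snd-usb-audio index option set to index."""
    def replace(match):
        return "%s%d%s" % (match.group(1), index, match.group(2))

    new_text, count = OPTION_LINE.subn(replace, config_text)
    if count == 0:
        raise ValueError("no 'options snd-usb-audio index=...' line found")
    return new_text


# Hypothetical file contents; only the snd-usb-audio line comes from the article.
before = "# example comment\noptions snd-usb-audio index=-2\n"
print(set_usb_audio_index(before))
```

To apply it for real, you would read /etc/modprobe.d/alsa-base.conf, pass its text through the function, and write the result back with root privileges. Keeping a backup copy of the file first is a sensible precaution.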

After the reboot, the sound system is reloaded. Run the same two commands again:

cat /proc/asound/cards

cat /proc/asound/modules

You should now see the USB headset listed as card 0, which makes it the default input/output device.

Test it out by recording a 5 second clip from the microphone:

arecord -d 5 -r 48000 make.wav

Note: If you see an error that says “overrun!!!”, you can safely ignore it.

Play it back through the headphone speakers:

aplay make.wav

Don’t worry if the audio is a bit scratchy; you will be recording at a higher quality in the next step.
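To see what the test recording actually contains, you can read the WAV header with Python’s standard wave module. The sketch below uses a hypothetical helper, describe_wav. When no sample format is passed with -f, arecord falls back to 8-bit samples, which is probably a large part of why this first clip sounds rough.

```python
import contextlib
import wave


def describe_wav(path):
    """Return the basic format details stored in a WAV file's header."""
    # contextlib.closing keeps this working on older Python versions too.
    with contextlib.closing(wave.open(path, "rb")) as clip:
        frames = clip.getnframes()
        rate = clip.getframerate()
        return {
            "channels": clip.getnchannels(),
            "bits_per_sample": clip.getsampwidth() * 8,
            "sample_rate": rate,
            "seconds": frames / float(rate),
        }
```

Calling describe_wav("make.wav") after the recording step should report a sample rate of 48000 and a length close to 5 seconds, along with the bit depth that was actually used.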

To adjust the levels, use the built-in utility alsamixer, which handles both audio input and output levels:

sudo alsamixer

Download and Extract Source Code

There are a few options for speech recognition with Raspberry Pi, but I thought the best solution for this tutorial was to use Google’s Speech to Text service. This service lets you upload the recorded audio file and get back the text, which you will translate in a later step.
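To give a feel for the "convert it to text" half of that exchange, here is a rough sketch of pulling the recognized sentence out of the service’s reply. Treat the reply layout as an assumption: it follows the JSON shape this service has commonly been reported to return (one JSON object per line, with a list of alternatives), and the project’s own scripts may parse it differently. The function name and the sample reply are made up for illustration.

```python
import json

# Assumed reply shape, invented for illustration; not captured from a real call.
SAMPLE_REPLY = (
    '{"result":[]}\n'
    '{"result":[{"alternative":[{"transcript":"hello world","confidence":0.97},'
    '{"transcript":"hello word"}],"final":true}],"result_index":0}\n'
)


def best_transcript(reply_text):
    """Return the first transcript found in the reply, or None."""
    for line in reply_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            payload = json.loads(line)
        except ValueError:
            continue  # skip anything that is not valid JSON
        if not isinstance(payload, dict):
            continue
        for result in payload.get("result", []):
            alternatives = result.get("alternative", [])
            if alternatives:
                return alternatives[0].get("transcript")
    return None


print(best_transcript(SAMPLE_REPLY))  # prints hello world
```

Returning None when nothing was recognized lets the calling script decide whether to retry or give up.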

The project source code will handle this process for you. To download and extract the files, run the following commands:

wget https://github.com/dconroy/PiTranslate/archive/master.zip

unzip master.zip

cd PiTranslate-master
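If unzip is not available, the extraction step can be done with Python’s standard zipfile module instead. This is a minimal sketch with a hypothetical helper name; it assumes master.zip has already been downloaded with wget as shown above.

```python
import zipfile


def extract_archive(zip_path, destination="."):
    """Extract a zip file and return the names of its top-level entries."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(destination)
        top_level = sorted({name.split("/")[0] for name in archive.namelist()})
    return top_level
```

For this project, extract_archive("master.zip") should return a single folder name, PiTranslate-master, matching the cd command above. Python’s zipfile does not restore Unix permission bits, so the scripts still need to be made executable afterward.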

Note: There is a combination of shell and Python scripts in the bundle.

Make the speech to text script executable:

sudo chmod +x stt.sh

Set up Google Speech Recognition API

In order to use Google’s speech recognition API, you must register for it. Follow these steps while logged into your Google account.

Go to https://cloud.google.com/console and click “Create Project.” Give it a name like “My Universal Translator” and click “Create.” (Google will generate a random Project ID. You can leave this field alone.)

While Google’s cloud servers create your new project, open a new browser tab to join the Chromium-dev Google Group, which will give you access to the Speech API: https://groups.google.com/a/chromium.org/forum/?fromgroups#!forum/chromium-dev

Return to your Google Project tab. If the process of creating your project is complete, you’ll see the Project Dashboard. Click “APIS & AUTH” on the left side and then click “APIs.” Scroll down and turn on the Speech API.

Still in the Project Dashboard, click “Credentials” on the left side, under the “APIS & AUTH” header. Click the button “Create new Key” and select “Browser Key.” Leave the allowed referrers text field blank and click “Create.”

The newly generated API key is now displayed in the dashboard.

On your Raspberry Pi, open the file text-to-translate.py and find the line that says:

key = 'xxx'

Replace xxx with the key from your dashboard. To make this easier, you may want to launch the Raspberry Pi’s desktop environment (with startx), log into your Google account with the web browser Midori, and copy and paste the key into the file. You could also change the file via SFTP from your computer.

Note: Although it is free, Google’s speech API allows only 50 requests per day.

Set up Microsoft Translation API

Google has a translation API as well, but for this project, you’ll use Microsoft’s service because it’s free. Log into Microsoft’s Azure Marketplace with your Microsoft account at: https://datamarket.azure.com/developer/applications/

Click the button that says “REGISTER” to create a new application.

Fill in a Client ID, Application Name, and Redirect URI (any URL will do).

Click “CREATE.”

Much like you did with the Google Key in the previous step, save the Client ID and Client Secret between the single quotes on the appropriate lines in PiTranslate.py:

args = {
    'client_id': '',  # your client ID here
    'client_secret': '',  # your Azure secret here
    'scope': 'http://api.microsofttranslator.com',
    'grant_type': 'client_credentials'
}
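To show where those four values end up, here is a sketch of how the two requests could be assembled: one to trade the Client ID and Client Secret for an access token, and one to ask for a translation. Only the args fields come from the project code. The two endpoint URLs are assumptions based on how the Azure Marketplace translator flow worked around the time of writing, and they may have changed or stopped responding since. The function names are hypothetical.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

# Assumed endpoints; verify them against PiTranslate.py before relying on them.
TOKEN_URL = "https://datamarket.accesscontrol.windows.net/v2/OAuth2-13"
TRANSLATE_URL = "http://api.microsofttranslator.com/V2/Http.svc/Translate"


def build_token_request(client_id, client_secret):
    """Return (url, form_body) for the access token request."""
    args = {
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "http://api.microsofttranslator.com",
        "grant_type": "client_credentials",
    }
    return TOKEN_URL, urlencode(args)


def build_translate_request(access_token, text, source="en", target="es"):
    """Return (url, headers) for a translation request."""
    query = urlencode({"text": text, "from": source, "to": target})
    headers = {"Authorization": "Bearer " + access_token}
    return TRANSLATE_URL + "?" + query, headers


# Placeholder credentials, for illustration only.
url, body = build_token_request("my-client-id", "my-client-secret")
print(url)
```

Nothing is sent over the network here. With the requests library installed earlier, the token request would be posted with requests.post(url, data=body), and the access token would be read from the JSON reply before building the translation request.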

Use the Universal Translator

Now that the API keys have been created and entered into the provided code, you can try out the Universal Translator. With the headset on, execute:

./stt.sh

Speak into the headset and press Ctrl-C when done.

You’ll hear the translation in the headset and you’ll also see feedback on the command line.

By default, it translates from English to Spanish. You can easily change your origin and destination languages in the last line of text-to-translate.py and the script will do the rest! There are literally thousands of language pairs supported. Enjoy!