Tag Archives: Hack

Hacking Google’s Text To Speech “API”

When I was at my previous job, one task I had was localizing a large set of phrases into multiple languages, as both text and audio files. I did this by using the awesome Google Translate API.

The Google Translate website has features for translating text and playing audio of it in the translated language. There’s no official API for getting audio, though. Luckily, I’ve never let a lack of an official API stop me before.

I had read a few old blog posts about how Google’s undocumented TTS API could be used, albeit with a 100 character limit. Going over 100 characters would result in a truncated audio file, and some of the text I needed to convert to audio was longer than that. It turns out that with a little help from Chrome’s web inspector, I could replicate the functionality of the Google Translate site.

The first thing to check out is the url scheme of the audio files, which looks like this:

Breaking down the parameters, “ie” is the text’s encoding, “q” is the text to convert to audio, “tl” is the text language, “total” is the total number of chunks (more on that later), “idx” is which chunk we’re on, “textlen” is the length of the text in that chunk and “prev” is not really important.
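To make the parameter list concrete, here is a minimal sketch of how such a query string could be assembled in Python. The parameter names come from the breakdown above; `BASE_URL` is a made-up placeholder (not the real endpoint), `build_tts_url` is a hypothetical helper name, and I have left out “prev” since it does not seem to matter.

```python
from urllib.parse import urlencode

# Placeholder only: this is NOT the real endpoint, just a stand-in for the sketch.
BASE_URL = "https://example.invalid/translate_tts"


def build_tts_url(chunk, language, index, total):
    """Build the query string for one chunk of text (hypothetical helper)."""
    params = {
        "ie": "UTF-8",           # encoding of the text
        "q": chunk,              # the text to convert to audio
        "tl": language,          # language of the text
        "total": total,          # total number of chunks
        "idx": index,            # which chunk this is
        "textlen": len(chunk),   # length of the text in this chunk
    }
    # urlencode takes care of escaping spaces and punctuation in the text.
    return BASE_URL + "?" + urlencode(params)


print(build_tts_url("civil war", "en", 0, 1))
```

Using `urlencode` rather than string concatenation means the text in “q” gets escaped properly, which matters once the chunks contain commas, quotes or non-ASCII characters.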

The Google Translate site itself gets around its own character limit by breaking big blocks of text into “chunks”. It seems to try and break along punctuation, but for super long sentences it will also break in the middle of a sentence, which ends up sounding pretty weird. Using the Gettysburg Address as an example, Google makes a request for the chunk “civil war”.
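The chunking behavior described above can be approximated in a few lines of Python. This is only a sketch of the idea, not Google’s actual algorithm and not necessarily what my script does: it assumes a 100 character limit, prefers to cut after the last punctuation mark that fits, falls back to the last space, and only cuts mid-word as a last resort. The name `split_into_chunks` is made up for this example.

```python
import re

MAX_CHARS = 100  # the character limit mentioned above


def split_into_chunks(text, limit=MAX_CHARS):
    """Split text into pieces of at most `limit` characters (illustrative)."""
    chunks = []
    remaining = " ".join(text.split())  # collapse newlines and repeated spaces
    while len(remaining) > limit:
        window = remaining[:limit]
        # Prefer to cut right after the last punctuation mark in the window.
        marks = list(re.finditer(r"[.,;:!?]", window))
        if marks:
            cut = marks[-1].end()
        else:
            # No punctuation: cut at the last space, or mid-word if there is none.
            space = window.rfind(" ")
            cut = space if space > 0 else limit
        chunks.append(remaining[:cut].strip())
        remaining = remaining[cut:].strip()
    if remaining:
        chunks.append(remaining)
    return chunks


for piece in split_into_chunks("word " * 60):
    print(len(piece), piece[:20])
```

The fallback to a space is what produces the odd-sounding mid-sentence breaks: a very long sentence with no punctuation in the first 100 characters still has to be cut somewhere.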

Gettysburg Address

In order to download audio for longer blocks of text, I wrote a Python script that broke the text into chunks and made a separate request to Google for each one. The script appended the audio from every response to a single output file, and somehow, it worked! Just to be safe, I also set my script up to use Google’s Flash player as the referer (sic) and set the user agent to a version of Firefox.
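For the curious, the two tricks in that paragraph look roughly like this with Python’s standard library. Treat it as a sketch under assumptions: the URL, referer and user agent strings are placeholders rather than the values my script sends, and `make_request` and `combine_parts` are hypothetical names. Nothing here touches the network; the request object is only constructed.

```python
import urllib.request


def make_request(url, referer, user_agent):
    """Build a request that carries custom Referer and User-Agent headers."""
    return urllib.request.Request(
        url,
        headers={"Referer": referer, "User-Agent": user_agent},
    )


def combine_parts(audio_parts):
    """Join the raw bytes of each downloaded chunk, in order, into one blob."""
    return b"".join(audio_parts)


# Placeholder values for illustration only.
request = make_request(
    "https://example.invalid/translate_tts?q=civil+war",
    referer="https://example.invalid/player",
    user_agent="Mozilla/5.0 (placeholder Firefox string)",
)
print(request.header_items())

# Pretend these byte strings came back from three separate requests.
combined = combine_parts([b"part-one", b"part-two", b"part-three"])
print(len(combined))
```

To actually download, each request would be passed to `urllib.request.urlopen` and the response body read into the list of parts before writing the combined bytes to a file opened in `"wb"` mode.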

At the time, I didn’t want to release the code as it was being used for some uber top secret stuff. But since I’m not working on that project anymore, I refactored the original code into a command line Python script. Along the way I had to learn how to use Python’s argparse, which is a pretty neat way of parsing command line arguments.
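If you have not used argparse before, here is a small sketch of the kind of command line interface it makes easy. The flag names below are invented for this example and are not necessarily the ones the project uses, so check the project itself for its real options.

```python
import argparse


def build_parser():
    """Create a parser with hypothetical options for a text-to-speech script."""
    parser = argparse.ArgumentParser(
        description="Convert text to an audio file (illustrative sketch)."
    )
    parser.add_argument("-l", "--language", default="en",
                        help="language code of the text")
    parser.add_argument("-o", "--output", default="out.mp3",
                        help="path of the audio file to write")
    # Exactly one input source must be given: a file or a literal string.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-f", "--file", help="read the text from this file")
    source.add_argument("-s", "--string", help="use this text directly")
    return parser


# Passing a list lets us try the parser without touching sys.argv.
args = build_parser().parse_args(["-s", "Four score and seven years ago", "-l", "en"])
print(args.string, args.language, args.output)
```

The nice part is that argparse generates the `--help` text and the error messages for missing or conflicting arguments on its own.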

The project is available on GitHub right now, so go grab it and try it out. If you’re curious what the output sounds like, here’s a recording of female Abraham Lincoln reciting the Gettysburg Address (yes, she mispronounces some words). One fun thing to try out is using clashing input and output languages. Here’s Female Japanese Abraham Lincoln reciting the same speech (she just seems to be spelling words, slacker).

If you enjoyed this hack, let me know and I could post some other ones I’ve been working on. And if you find a way to improve the code (probably not difficult at all), go ahead and submit a pull request on GitHub. And if you’re from Google, please don’t shut down my Gmail and AdSense accounts.

Photo Hack Day NYC 2011 And My Hack: AllPaper

This past weekend I participated in my first NYC hackathon. New York hackathons are mythical beasts. They’ve been written about quite extensively, but you really need to just show up to one to know what it’s all about.

This particular hackathon (Photo Hack Day) was organized by Aviary, with sponsorships by Bing, Pepsi, PBR and others. There was plenty of food so that hackers didn’t have an excuse to leave, ever. I still ended up going home to get some sleep on Saturday night before the demo on Sunday.

I really think I picked a good first hackathon to attend, as the prizes were pretty insane. First place got $5000 and the cash prizes for second and third place were quite good as well. On top of those, there were prizes for best use of certain APIs. Face.com’s API stood out as a popular one; 500px had quite a few users as well.

I ended up working on a project I had wanted to build for a while: a custom collage creator for your iPhone wallpaper. I called it “AllPaper.” It seemed pretty catchy and I don’t think there’s a product with that name yet. I integrated Instagram into it first, then threw in support for 500px, PicPlz and Facebook before calling it a hack. I also integrated the Sincerely Ship Library, which lets you turn your collage into a real postcard. I’m really looking forward to trying that out!

Overall I’m pretty happy with how the hack turned out. I can probably turn it into a real app fairly easily with some tweaking here and there. As this was my first hackathon, I noticed a few interesting things about the top hacks. There was some grumbling about how many of them seemed “hardcoded.” I wonder if presentation really counts more than actual hack-worthiness these days. You could spend your time actually coding something, or you could create the facade of something that looks even better, but is not really that worthy of being called a “hack.” It seems that the presentation wins over the true hack-worthiness, as one of the cooler ones, a jailbroken iPhone that took MMS messages and uploaded them to a web service, won nothing.

Coming away empty-handed was a little damaging to the ego, especially after winning something in my last hackathon, but my ego’s not too bruised. I hate to sound like I’m complaining! Almost all of the presentations were well done, and the winners certainly did deserve to win. I’m hoping to participate in something similar again soon, as the atmosphere was really enjoyable.

Fun Trolling Facebook Polls (For Science (Actually Lulz)!)

I saw a Facebook Poll late last night that a friend had voted on. The question was something like “Which pair of shoes should I get?” The poll had twitpic.com links as answers, so the idea was that people look at the pics and let the guy decide which pair of shoes was better.

Apparently in Facebook Polls, you click on the answer to vote. And there’s no unvote (you can vote for another choice, but you can’t abstain after clicking). So people ended up clicking on the twitpic link thinking they’d see the image, and ended up accidentally voting on the poll. I fell for this, too. There were something like a couple thousand answers on that poll. I believe it’s been removed now.

I figured I could do better with a more salient question, so I made one up myself. “Which pair of glasses look better on me?” I made the question have two twitpic links, which you can view here and here if you actually copy and paste them in. I figured people are naturally judgers, and something like helping someone choose glasses to wear is an easy task (plus you theoretically get to see pictures of faces, which people just love, consciously or subconsciously).

I started the poll late last night, which probably didn’t help, but a few friends took the bait. I hope they forgive me, as I did this for science (okay, for the lulz)! When I woke up this morning, there were already 51 votes, from people I know, friends of friends, and even people two degrees out of my social network! I think it would be really interesting to see how this poll spreads through Facebook (assuming they don’t shut it down first).

I guess now that this post is published, any scientific value is gone (since you could be reading from anywhere and vote for my poll non-virally). The main point is that when you design systems very rigidly (in Facebook’s case, not letting people abstain from a poll, which believe it or not is a valid bit of information), interesting consequences pop up.

I’ll keep checking the status of the poll and see if it actually blows up, whimpers and dies or gets taken down quickly.

Analysis Edit:
I think another reason that this poll is so effective is that it makes it seem that the person who voted is the originator of the poll. Check out the newsfeed formatting:

The voter’s name is prominently displayed (though I blurred it) and the person who asked the question is nowhere to be seen.

Edit #1: The time is now about 12:40PM and the total number of voters has doubled to 99!

Edit #2: It’s about 1:10PM and the number has doubled again to 201!

Edit #3: The time is around 1:24PM and there are 304 answers.

Edit #4: Alright, it’s 1:35PM and there are 406 votes.
Edit #5: Wow. It’s 1:41PM and there are 502 votes.
Edit #6: It’s 1:48PM and there are 621 votes.
Edit #7: I’m just going to simplify my updates now…
1:53PM – 716 votes
1:58PM – 811 votes
2:02PM – 904 votes
2:07PM – 1031 votes
2:16PM – 1282 votes
2:22PM – 1442 votes
2:27PM – 1619 votes
2:38PM – 2013 votes
2:46PM – 2393 votes
2:50pm – 2604 votes
2:54pm – 2811 votes
2:58pm – 3038 votes
3:04pm – 3408 votes
3:11pm – 3861 votes
3:14pm – 4142 votes
3:23pm – 4761 votes
3:39pm – 6169 votes
3:47pm – 6806 votes
3:51pm – 7198 votes
3:56pm – 7693 votes
4:00pm – 8010 votes
4:06pm – 8624 votes
4:10pm – 9038 votes
4:19pm – 10,013 votes!
4:28pm – 11,007 votes
4:37pm – 12,009 votes
4:46pm – 13,046 votes
4:53pm – 14,009 votes
5:04pm – 15,216 votes
5:09pm – 15,886 votes (dinnertime)
5:45pm – 19,764 votes
5:55pm – 20,722 votes
6:06pm – 21,829 votes
6:30pm – 24,104 votes
6:40pm – 25,013 votes
6:51pm – 26,001 votes
7:02pm – 27,014 votes
7:14pm – 28,013 votes
7:26pm – 29,001 votes
7:42pm – 30,373 votes
7:53pm – 31,124 votes
(mini break)
9:41pm – 38,332 votes
10:14pm – 40,175 votes
10:34pm – 41,360 votes
10:50pm – 42,232 votes
11:38pm – 44,690 votes
12:12am – 46,761 votes
12:51am – 47,677 votes
1:48am – 49,358 votes
Day 2
10:10am – 53,601 votes
10:31am – 53,812 votes
12:48pm – 55,418 votes
1:07pm – 55,598
1:36pm – 55,923
2:32pm – 56,470
4:41pm – 57,559
10:36pm – 59,078
1:51am – 59,426
EDIT: Facebook finally deleted the poll, with something like 60,000 votes last time I checked.