Reverse Engineering Quibi: Protocol Buffers and HLS Streaming
I’ve had a lot of free time in the past few weeks so I decided to spend some of it working on side projects. I really enjoy reverse engineering apps, so I decided to take a look at Quibi.
Quibi (short for “Quick Bites”) is a video streaming service built around short shows. The idea is that you can sit on a bus ride and watch an episode or two, depending on how long your commute is. One of the constraints of the platform is that you can only watch these videos on a phone or tablet with the app installed.
Since everyone is stuck indoors for a while, this constraint is kinda stupid. Most people would probably rather watch their videos on their big TV than huddle around a phone, which is what Emily and I had to do to watch that stupid viral show about the terrible wife with the golden arm.
Anyway, I had an idea to write a tvOS app that would work with Quibi, so you could watch your terrible shows on your TV. Here’s what I learned trying to reverse engineer the Quibi app.
Setting up Charles
The first rule of reverse engineering club is that you should probably install Charles proxy on your computer, and point your phone to it. This lets you inspect network requests and figure out the API for an app. Depending on how stringent the app’s security settings are, you can either learn a lot, or very little from Charles.
The way Charles works is that you basically route all of your traffic from your iPhone to your computer. You need to install a certificate on your phone and trust it, so that your phone thinks that it’s going through a secure connection when it’s really getting owned. But since you’re self-owning, it’s generally okay. I’m not going to do a full tutorial on how to do this because you can just read the docs.
You can also buy a version of Charles that runs directly on your phone. I did this, but I find it easier to work on my computer, so I used the free desktop version that stops working every 30 minutes. I should probably buy the full version.
Anyway, once you get the proxy working on your phone and add the proper domains to the allowed list, you should start seeing your network requests and responses pop up in Charles after firing up the app and doing stuff.
It turns out (luckily for me) that Quibi doesn’t use certificate pinning (at least at the time I’m writing this article). Cert pinning is a way to prevent snooping on network requests by embedding the certificate to be trusted (or a hash of it, or its public key) in the app binary itself. With pinning in place, adding the Charles certificate to your phone won’t work, because the app won’t accept it. The process of pinning is kind of a pain in the ass, which is why a lot of companies don’t do it.
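To make the idea concrete, here’s a minimal sketch of what a pinning check boils down to: hash the certificate the server presents and compare it against a value baked into the app. The byte strings below are placeholders standing in for real DER-encoded certificates, and the function names are my own. Real apps do this inside their TLS stack, so treat this as an illustration, not as how any particular app implements it.

```python
import base64
import hashlib


def fingerprint(cert_der: bytes) -> str:
    """Return the base64-encoded SHA-256 digest of a DER-encoded certificate."""
    return base64.b64encode(hashlib.sha256(cert_der).digest()).decode("ascii")


def is_trusted(presented_cert_der: bytes, pinned_fingerprints: set) -> bool:
    """Accept the connection only if the presented certificate matches a pin."""
    return fingerprint(presented_cert_der) in pinned_fingerprints


# Placeholder bytes standing in for real certificates (hypothetical).
real_server_cert = b"real server certificate bytes"
charles_proxy_cert = b"charles proxy certificate bytes"

# In a pinned app, this set would be embedded in the binary at build time.
PINNED = {fingerprint(real_server_cert)}

print(is_trusted(real_server_cert, PINNED))    # True
print(is_trusted(charles_proxy_cert, PINNED))  # False
```

The second check is the one that would stop Charles: the proxy presents its own certificate, the hash doesn’t match the pin, and the app refuses to talk.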
Because Quibi doesn’t use cert pinning, that means I can observe all of the requests and responses that the app sends and receives, which makes it a lot easier to reverse engineer!
I won’t go too much into the way that Quibi does authentication, as it appears to be a pretty basic implementation of OAuth 2.0, using Auth0 as a service provider. It doesn’t appear to use a client secret, as I’m able to just replay the request with my username, password, client ID and other parameters and get a valid access token back. One interesting thing to note here is how short the token expiration is: just 2700 seconds, or 45 minutes.
Once authentication happens, the access token is sent in subsequent requests as a bearer token in the Authorization header.
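Here’s a rough sketch of what that flow looks like from the client side. The tenant URL is made up, the parameter list is the standard OAuth 2.0 password grant and not a copy of Quibi’s actual request (which had other parameters I’m not reproducing), and the sample response is fabricated for illustration. The request is only built here, never sent.

```python
import urllib.parse
import urllib.request

# Hypothetical Auth0 tenant; the real domain is not shown in this post.
TOKEN_URL = "https://example.auth0.com/oauth/token"


def build_token_request(username, password, client_id):
    """Build (but do not send) an OAuth 2.0 password-grant token request."""
    form = urllib.parse.urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
        "client_id": client_id,  # note: no client_secret
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def bearer_headers(token_response):
    """Turn a token response into the header used on later API calls."""
    return {"Authorization": "Bearer " + token_response["access_token"]}


# Fabricated response with the same shape as a typical token response.
sample_response = {"access_token": "abc123", "token_type": "Bearer", "expires_in": 2700}

request = build_token_request("me@example.com", "hunter2", "my-client-id")
headers = bearer_headers(sample_response)
minutes = sample_response["expires_in"] / 60

print(request.get_method(), request.full_url)
print(headers)
print(minutes)  # 45.0
```

With a 45 minute lifetime, any real client would also need to re-authenticate or refresh the token fairly often.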
I didn’t bother to set up a new app with authentication because I wanted to get to the meat of the app. Instead, I started taking a look at how the app’s requests and responses were structured, which brings us to…
Protocol Buffers
I’ll be honest, I was initially pretty annoyed when I read the line,
in Charles. That’s because protocol buffers are pretty annoying to work with unless you actually have access to the .proto files that define the schema for the response.
Luckily, I have experience working with protobufs (as the cool kids call them) because we started implementing them at Lyft last year. As a primer, protocol buffers basically define requests and responses, and their types, in a way that can be shared between servers, clients, etc. They do so in a way that reduces the amount of redundancy in the data. So instead of looking at a JSON file that labels each key/value pair for each item in an array, you just see the values tagged with small field numbers, and the key names (and their exact types) live in the schema, outside of the message itself. That’s probably a gross simplification of protobufs, but that’s basically how I understand their functionality in layperson’s terms.
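If you want to see what “just the values” means, here’s a minimal sketch of a decoder for the protobuf wire format. It only handles the top level of a message and the four common wire types, and the sample bytes are the textbook examples from the protobuf encoding docs, not anything from Quibi. Notice that all you get back is field numbers and raw values; without the .proto file there are no names.

```python
def read_varint(data: bytes, pos: int):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear means this was the last byte
            return result, pos
        shift += 7


def decode_fields(data: bytes):
    """Return a list of (field_number, wire_type, value) tuples."""
    fields = []
    pos = 0
    while pos < len(data):
        tag, pos = read_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:    # varint (ints, bools, enums)
            value, pos = read_varint(data, pos)
        elif wire_type == 1:  # fixed 64-bit, left as raw bytes
            value, pos = data[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (strings, bytes, nested messages)
            length, pos = read_varint(data, pos)
            value, pos = data[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit, left as raw bytes
            value, pos = data[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields


# Field 1 = varint 150, field 2 = the string "testing".
message = bytes.fromhex("089601" "120774657374696e67")
print(decode_fields(message))
# [(1, 0, 150), (2, 2, b'testing')]
```

A length-delimited value could be a string, raw bytes, or a nested message, and the wire format doesn’t say which. That ambiguity is why a tool that guesses for you is so handy.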
So getting back to Quibi, I was seeing calls to endpoints like
with a bunch of wacky characters, URLs, and names and descriptions of TV shows. Looking at the raw text gave me an idea of what was being returned, but there was also a lot of data that I couldn’t see because it was encoded as a protobuf response.
I found a tool called protobuf-inspector, written by Alba Mendez, which lets you visualize the values in a protobuf response. Once I threw the response into this tool, it started to make a lot more sense.
Here, I could see that the home cards response contained structured information for each of the cards in the app. The hierarchy seemed to be a card with series info, an MP4 preview link, and then info about that particular episode. There were also some values that didn’t really seem useful, like “<varint> = 1”, which could’ve meant anything, like a bool value or an episode number.
The tool has a way to set up object definitions, so that when you run it again, you see the key names and types that you defined. This is helpful if you’re trying to guess what something is and you want to compare it against a few different objects. Here’s what the response looked like once I tried defining some of the keys:
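The general idea of labeling fields can be sketched in a few lines. To be clear, this is not protobuf-inspector’s actual config format (check its README for that), and the field numbers, names, and values below are invented, not Quibi’s real schema. It just shows how a guessed mapping from field number to name turns anonymous values into something readable, while leaving the fields you haven’t figured out clearly marked.

```python
# Guessed names keyed by field number (hypothetical, not Quibi's real schema).
GUESSED_NAMES = {1: "title", 2: "preview_url"}


def label_fields(fields, names):
    """Group decoded (field_number, value) pairs under guessed names.

    Values are collected in lists because a field number can repeat.
    Unknown field numbers keep a placeholder key so they stay visible.
    """
    labeled = {}
    for field_number, value in fields:
        key = names.get(field_number, f"<field {field_number}>")
        labeled.setdefault(key, []).append(value)
    return labeled


# Invented decoded output standing in for one card.
decoded = [(1, "Some Show"), (2, "https://example.com/preview.mp4"), (7, 1)]
print(label_fields(decoded, GUESSED_NAMES))
```

Running the same mapping over several cards is a quick way to test a guess: if field 7 is always 1, it’s probably a flag; if it counts up, it’s probably an episode number.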
This makes the response look a bit more logical. I really guessed some of these, so if someone from Quibi actually reads this maybe you can confirm. I probably spent more time on this than I needed, because really all I want is to show a list of shows and maybe even start watching a show.
To waste even more time, I ended up defining this response in a .proto file, compiling it into Swift with swift-protobuf, and getting a Swift app to parse the response into real Swift structs! Here is the proto definition:
The generated proto code in Swift is pretty big, so I’ll leave it as an exercise for the reader to run it through swift-protobuf.
Here’s what it looks like running in the debugger in my app:
So with what I have written about so far, I could actually write a functioning Quibi client that supports logging in, requesting a list of TV show cards, and displaying them with a “live” preview, just like the real app. That doesn’t matter much unless you can actually watch the show, though.
HLS and Video Streaming
I apologize for the anti-climactic finale of this blog post, but this is where I got stuck.
Before I tried reverse engineering this app, I knew pretty much nothing about video streaming. Looking at the chain of requests that the app makes, it appears that the app hits an endpoint called “GetPlaybackInfo” and sends a payload of series ID, season number, and episode number, along with a mystery UUID that I haven’t seen anywhere else in the app’s requests or responses. It then receives a link to a “license” URL, a few links to .m3u8 resources, and some cookies for accessing those .m3u8 resources.
Then the app makes a request to the license URL with some form-encoded data and receives some other encoded data back. Finally, the app makes a request to one of the .m3u8 files and starts streaming the video.
I did some research and it looks like an .m3u8 URL points to a playlist that tells the client how to fetch and display the video. It can list several video streams of varying quality, and it even supports subtitle tracks.
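Here’s a small sketch of what reading one of these playlists looks like. The sample playlist is made up (the paths, bitrates, and codecs are not from Quibi), but the tags are standard HLS: each #EXT-X-STREAM-INF line describes one quality level and is followed by that stream’s URI, and #EXT-X-MEDIA lines with TYPE=SUBTITLES describe subtitle tracks.

```python
import re

# Invented master playlist for illustration only.
SAMPLE_MASTER = """#EXTM3U
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",LANGUAGE="en",URI="subs/en.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360,CODECS="avc1.4d401e,mp4a.40.2",SUBTITLES="subs"
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720,CODECS="avc1.4d401f,mp4a.40.2",SUBTITLES="subs"
high/index.m3u8
"""

# KEY=value pairs; quoted values may contain commas.
ATTRIBUTE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_attributes(text):
    """Parse an HLS attribute list into a dict, stripping surrounding quotes."""
    return {key: value.strip('"') for key, value in ATTRIBUTE.findall(text)}


def parse_master(playlist):
    """Return (variants, subtitles) found in a master playlist."""
    variants, subtitles = [], []
    lines = [line.strip() for line in playlist.splitlines() if line.strip()]
    for index, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF:"):
            attrs = parse_attributes(line.split(":", 1)[1])
            attrs["URI"] = lines[index + 1]  # the variant URI is on the next line
            variants.append(attrs)
        elif line.startswith("#EXT-X-MEDIA:"):
            attrs = parse_attributes(line.split(":", 1)[1])
            if attrs.get("TYPE") == "SUBTITLES":
                subtitles.append(attrs)
    return variants, subtitles


variants, subtitles = parse_master(SAMPLE_MASTER)
best = max(variants, key=lambda v: int(v["BANDWIDTH"]))
print(best["URI"])  # high/index.m3u8
print([s["LANGUAGE"] for s in subtitles])  # ['en']
```

A real player doesn’t just pick the highest bandwidth; it switches between variants as network conditions change. Picking the max here just shows that the quality levels are all listed up front.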
I tried just replaying the call to the .m3u8 file with the same authentication cookie and it unfortunately didn’t work. I think that the license URL provides the app with a way to decode the video, and without knowing what to send or how to decode the response, I think I’m essentially stuck.
I sort of didn’t expect to be able to finish this app anyway, so I’m pretty happy with how far I got. I also figure that if I try to go any further with this, Quibi will probably try to sue me or something, so it probably isn’t worth it. In any case, I did learn a lot from this project, and hopefully you have too, from reading this post. If you have any ideas on how I could proceed or if you enjoyed this post, feel free to let me know!
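One way to check the “license means encryption” hunch, if you can get hold of a playlist, is to look for key tags. In HLS, an #EXT-X-KEY (or #EXT-X-SESSION-KEY) tag with a METHOD other than NONE means the segments are encrypted. The sample below is invented, and I don’t know what Quibi’s playlists actually contain, so this is a sketch of the check and not a finding.

```python
import re

# Invented media playlist for illustration; the key URI is hypothetical.
SAMPLE_MEDIA = """#EXTM3U
#EXT-X-VERSION:5
#EXT-X-TARGETDURATION:6
#EXT-X-KEY:METHOD=SAMPLE-AES,URI="skd://hypothetical-key-id"
#EXTINF:6.0,
segment0.ts
#EXT-X-ENDLIST
"""

# KEY=value pairs; quoted values may contain commas.
ATTRIBUTE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def find_keys(playlist):
    """Return the attributes of every key tag that declares real encryption."""
    keys = []
    for line in playlist.splitlines():
        line = line.strip()
        if line.startswith(("#EXT-X-KEY:", "#EXT-X-SESSION-KEY:")):
            pairs = ATTRIBUTE.findall(line.split(":", 1)[1])
            attrs = {key: value.strip('"') for key, value in pairs}
            if attrs.get("METHOD", "NONE") != "NONE":
                keys.append(attrs)
    return keys


keys = find_keys(SAMPLE_MEDIA)
print(keys[0]["METHOD"])  # SAMPLE-AES
print(find_keys("#EXTM3U\n#EXTINF:6.0,\nsegment0.ts\n"))  # []
```

If a key tag shows up, the playlist alone isn’t enough to play anything: the client still has to obtain the key somehow, which would fit with the app calling a license URL before it starts streaming.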