Contents

Overview

As a team of 4 hackathon newbs, Hack & Roll was an exciting opportunity for me and my team to put our development skills to the test. As one of our first full stack projects ever, it was not surprising that we didn’t perform spectacularly. However, I’ve picked up many invaluable lessons in web, API and flask development, while having an absolute blast! I was in charge of most of the backend work, building the Flask app and all the Python scripts that it relied on to work (Our search algorithm, Youtube playlist maker and Youtube tag scraper).

Project Overview

Inspired by the Wikipedia Game, our project entitled All Roads lead to Rick aimed to provide a bot that could traverse from any youtube video in the world to the one true meme of our generation through clicking through a chain of suggested videos. Relying heavily on the Youtube API, we made use of as much Youtube video metadata as possible to ensure our bot made the most informed choice of suggested videos. Our website would then display the list of videos it traversed, while giving users an option to create a Youtube playlist out of them. For fun, we also used our bot to compile fun facts about the list of videos like their summated duration and views.

We built the app using Flask and hosted it on Google Cloud Project. Frontend, we worked with the classic Bootstrap, as my team was fairly new to frontend development. After the competition, we cleaned up our project and were satisfied with our fairly well-performing bot.

Website Home Website Results

Have a look at the Repository here!

Challenges & Solutions

Challenge: Is it even possible?

Although exciting, we realized that our project was fairly ambitious, with very little prior literature that was relevant to what we were trying to achieve. We discovered a Stanford paper on an AI that was optimised to play the Wikipedia Game. However, we soon realized that the Youtube videos were far too dissimilar from Wikipedia articles for any application.

To make matters more difficult, the Youtube algorithm is extraordinarily opaque, with little authoritative literature on how it works or what parameters it uses. I had to work on my search algorithm blind, using as many parameters that I could from the Youtube Data API.

Solution: Mine that Metadata

Based on our intuition and testing, we settled on the following metadata information to use with our bot:

Tags

For our bot to make informed choices, it first needed as much context as possible on each video. My first thought went to video tags, where the salient points and topics of a video could be found. Our initial plan was to use the Youtube Data API scrape the tags of the Rick Astley video, its top 30 related videos and the top 30 videos related to them (so 1 + 30 + [30 x 30] = 931 videos!). We found that the tags scraping was generating positive results, but that the tags had a lot of “noise”, resulting in our bot being misled to unrelated videos. We decided to explore the other metadata information to focus the scope of our bot.

Visited Channels

We found that our bot also had the tendency to get stuck in binge sessions of the same channels and videos. The Youtube algorithm further exacerbated this issue, suggesting more content that our bot kept getting hooked on. To circumvent this issue, we decided to build a set of “visited channels”, preventing our bot from revisiting channels that it had been to.

Categories

Through personal experience, we found that getting to music videos seemed to be the fastest way to traverse to the Rick Astley video. This triggered our next bot filter, where we pointed our bot to prioritise picking videos with the category of “music”. This greatly improved the effiency of our bot, but we found that it still got stuck in generic playlist compilations.

Title Keywords

Our final decision-making filter was to build a list of title keywords to get our bot to prioritise authentic music videos. The Json file for this was built manually, using patterns in music video titles like “(Official Music Video)” and the use of dashes to partition the artist and song name. This allowed us to break out of these loops fairly reliably.

Challenge: More quota, please?

Lastly, I struggled greatly with the Youtube Data API quota that is imposed. After calculating our quota consumption for the bot, we realised that we could only run it a few times on a single project quota. We had to come up with creative ways to circumvent the issue.

Solution: Use Your Own Key!

Initially, our reaction to the issue was to minimise our reliance on the API and try to find another way to scrape Youtube’s data. Turns out, it was much harder that we thought to scrape Youtube. Taking a closer look at Google’s API keys, we realized that the keys were fairly easy to create and did not contain any sensitive information. This brought us to the idea of getting users to use their own API keys. Each user would tap on their own quota when running the bot, allowing us to scale the web service to a much wider audience.

Learning Points

I learnt a lot from this experience, especially in the area of error management, HTTP request handling and API integration. Technical learning value aside, it was also one of my first times coding in a team, and there was a lot for us to grow in that area as well.

One of our biggest issues this hackathon was time management. There was a lot of inefficiency in the way we shared, discussed and tested our code. We also grossly underestimated the time it would take to complete our various parts. Most crucially, we were GitHub noobs, where we simply did not have the proper project discipline of keeping our files updated and making well-documented code edits. This will have to change for future hackathons.

Future Improvements

There is a lot of room for improvement for our website. It could be a lot prettier, and our bot could be a lot more accurate. There are still other parameters like caption language that we simply did not have the time to properly test and explore. Further, the target video of the bot could definitely be expanded to beyond just Rick Astley, leaning towards a Youtube version of the “Wikipedia Game”.