The intro

YackWords is a fun little app that lets you search for a location or city and gives you back a compilation of all the words used in reviews of businesses within a 5 km radius of that location.
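To make the "within 5 km" idea concrete, here is a minimal sketch in plain Go. It is a stand-in for illustration only: the app itself relied on MongoDB's geospatial queries, and the function names `haversineKm` and `withinRadius`, as well as the sample coordinates, are assumptions made for this example.

```go
package main

import (
	"fmt"
	"math"
)

// Mean Earth radius in kilometres, used by the haversine formula.
const earthRadiusKm = 6371.0

// haversineKm returns the great-circle distance in kilometres between
// two points given in decimal degrees. (Hypothetical helper.)
func haversineKm(lat1, lng1, lat2, lng2 float64) float64 {
	toRad := func(deg float64) float64 { return deg * math.Pi / 180 }
	dLat := toRad(lat2 - lat1)
	dLng := toRad(lng2 - lng1)
	a := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(toRad(lat1))*math.Cos(toRad(lat2))*math.Sin(dLng/2)*math.Sin(dLng/2)
	return 2 * earthRadiusKm * math.Asin(math.Sqrt(a))
}

// withinRadius reports whether the business lies within radiusKm of the
// searched location. (Hypothetical helper.)
func withinRadius(searchLat, searchLng, bizLat, bizLng, radiusKm float64) bool {
	return haversineKm(searchLat, searchLng, bizLat, bizLng) <= radiusKm
}

func main() {
	// Made-up coordinates: the second point is 0.02 degrees of latitude away.
	fmt.Println(withinRadius(45.50, -73.50, 45.52, -73.50, 5))
}
```

A datastore with a geospatial index avoids computing this distance for every business, which is presumably why the team leaned on MongoDB for it.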

Those words are sorted in descending order of occurrence count. Under each word you are shown the top 5 businesses that had the most occurrences of that word in their reviews.
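The ranking described above can be sketched roughly as follows. This is not the project's actual code: the `Entry` type, the `rankByCount` and `topBusinesses` functions, and the sample data are all hypothetical, and the alphabetical tie-break is an assumption added only to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry pairs a key (a word or a business name) with an occurrence count.
type Entry struct {
	Key   string
	Count int
}

// rankByCount returns the entries sorted by descending count.
// Ties are broken alphabetically (an assumption, for stable output).
func rankByCount(counts map[string]int) []Entry {
	ranked := make([]Entry, 0, len(counts))
	for k, c := range counts {
		ranked = append(ranked, Entry{Key: k, Count: c})
	}
	sort.Slice(ranked, func(i, j int) bool {
		if ranked[i].Count != ranked[j].Count {
			return ranked[i].Count > ranked[j].Count
		}
		return ranked[i].Key < ranked[j].Key
	})
	return ranked
}

// topBusinesses returns up to n business names with the most occurrences
// of a single word, given that word's per-business counts.
func topBusinesses(perBusiness map[string]int, n int) []string {
	ranked := rankByCount(perBusiness)
	if len(ranked) > n {
		ranked = ranked[:n]
	}
	names := make([]string, len(ranked))
	for i, e := range ranked {
		names[i] = e.Key
	}
	return names
}

func main() {
	// Made-up counts for illustration.
	words := map[string]int{"pizza": 12, "coffee": 30, "friendly": 12}
	for _, e := range rankByCount(words) {
		fmt.Println(e.Key, e.Count)
	}
	fmt.Println(topBusinesses(map[string]int{"Cafe A": 9, "Cafe B": 14}, 5))
}
```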

The story

We all came together on Friday with a vague idea of building a suggestion engine for Yelp based on a user's previous reviews. After slowly settling on the specifics, we went to bed confident that we had a solid idea and enough data, and excited to start working on it using all kinds of tools (new for some of us): PHP with Laravel, MongoDB, and Golang for the importer and API.

Saturday is when our dreamy bubble of hope and excitement burst. The more we dug into and thought about what we could base suggestions on, and what we could really count as positive input for scoring businesses (all without venturing into NLP), the more we were reduced to using only the categories of the businesses that the reviews are associated with.

At 2pm, we decide it's not worth pursuing an idea that is essentially Yelp search with only one criterion, which the user can't change. Spirits down, we all dabble here and there and try to think of a new idea. We find a few useless mini ideas but continue searching.

Meanwhile, during Saturday afternoon, Juan and Frederic are exploring the dataset, imported into MongoDB by this point, with all kinds of esoteric MapReduce, Group and Geospatial queries. It is incredible how quickly you can iterate and extract useful information from a dataset once it's in a good datastore. We explore features of the query language and make a few graphs in Google Spreadsheets. The Geospatial queries require a special index on a location key, and that key can't be two separate lat/lng fields but must be a single key (e.g. {loc: {lat: 0, lng: 0}}). We edit the schema, delete all businesses from the DB, and rerun the import: 35 seconds is all it takes to import 61 000 businesses. Joyful!
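The schema change can be pictured with a small Go sketch of how an importer might reshape each record. This is an illustration under assumptions, not the project's importer: the type names, the `toGeo` function and the input field names `latitude`/`longitude` are hypothetical. The field order inside `loc` simply mirrors the example in the text; it is worth checking MongoDB's documentation for the coordinate order its geospatial indexes expect.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FlatBusiness is a hypothetical "before" shape with two separate
// coordinate fields, which a geospatial index cannot use directly.
type FlatBusiness struct {
	Name string  `json:"name"`
	Lat  float64 `json:"latitude"`
	Lng  float64 `json:"longitude"`
}

// Location groups both coordinates under one key, mirroring the
// {loc: {lat: 0, lng: 0}} shape mentioned in the text.
type Location struct {
	Lat float64 `json:"lat"`
	Lng float64 `json:"lng"`
}

// GeoBusiness is the hypothetical "after" shape stored in the database.
type GeoBusiness struct {
	Name string   `json:"name"`
	Loc  Location `json:"loc"`
}

// toGeo moves the two flat coordinate fields into a single loc key.
func toGeo(b FlatBusiness) GeoBusiness {
	return GeoBusiness{Name: b.Name, Loc: Location{Lat: b.Lat, Lng: b.Lng}}
}

func main() {
	// Made-up business for illustration.
	doc := toGeo(FlatBusiness{Name: "Cafe A", Lat: 45.5, Lng: -73.5})
	out, err := json.Marshal(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```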

Towards the end of the day we venture into splitting the review text into words and gather counts of the words that occur most often in reviews. It's fun and cool, but not so useful.
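Word splitting and counting of this kind might look something like the sketch below. The tokenization rule (lowercase, split on anything that is not a letter or digit) is an assumption chosen for simplicity, so a word like "don't" ends up as "don" and "t". The `countWords` function and the sample reviews are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// countWords lowercases each review, splits it on every rune that is not
// a letter or digit, and tallies how often each word appears overall.
func countWords(reviews []string) map[string]int {
	counts := make(map[string]int)
	for _, text := range reviews {
		words := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
			return !unicode.IsLetter(r) && !unicode.IsDigit(r)
		})
		for _, w := range words {
			counts[w]++
		}
	}
	return counts
}

func main() {
	// Made-up reviews for illustration.
	counts := countWords([]string{
		"Great pizza, great service!",
		"The pizza was cold.",
	})
	fmt.Println(counts["great"], counts["pizza"], counts["cold"])
}
```

Raw counts like these tend to be dominated by common words such as "the", which may be part of why the result felt fun but not very useful on its own.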

The more we think about it, the more we just want to make it useful! So Frederic starts a new project for the frontend that will display those words in a nice and pretty way; later, location is mixed in to get only local businesses. Meanwhile, Juan is still exploring the data using Robomongo.

From this point, time flies. We get a good little night of sleep to be able to focus and be productive.

Sunday: a few features were added, but mostly we polished and deployed the app.

End of story. Go visit YackWords while it lasts.

P.S.: The food was AMAZING and never in short supply. A great success for a first community Yackaton; thanks everyone for participating!

The code

GitHub for YackWords

GitHub for importer and api
