Data is Beautiful

Despite all the recent woes around internet accessibility and net neutrality in the USA, the internet remains a modern marvel. Sure, there is always the risk of your data being sold (if it hasn't been already), but in exchange you gain access to the majority of humankind's knowledge: all of our joy, sorrow, science and math.

The internet is an endless supply of data and learning. It isn't all about making pretty graphs like you might find on /r/dataisbeautiful, but amazing stuff can be done regardless. With this much data freely accessible, it becomes possible to look for the driving forces behind certain actions. As can be seen below, Bitcoin prices correlate very nicely with Google searches relating to it.

From /r/dataisbeautiful and /u/DeanLa
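A relationship like the one in that chart can be put into a single number with the Pearson correlation coefficient. The snippet below is only a minimal sketch: the function name `pearson` and both lists are my own placeholders, and the numbers are made up for illustration rather than taken from real Bitcoin or Google Trends data.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up placeholder values, NOT real data:
# a weekly search-interest score and a price in arbitrary units.
search_interest = [10, 12, 15, 30, 55, 80, 100]
price = [1.0, 1.1, 1.4, 2.9, 5.2, 8.1, 9.8]

r = pearson(search_interest, price)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 means the two series rise and fall together. It says nothing about which one drives the other, so a high correlation is a starting point for digging, not a conclusion.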

As my second-to-last semester in university wraps up, I realize the true power of data. In a course I took on pattern recognition and machine learning, we got to see first hand what can happen with a valuable dataset, a computer, and a group of determined students. Students were paired off and told to propose and complete a project of their choosing. My partner and I decided to compare traditional classifiers against neural networks, for both accuracy and speed, when applied to object detection and tracking. This project is discussed in more detail here. The results surprised us both: we would be able to run either system on a video stream at around 10 frames per second.
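For anyone curious what comparing two systems on accuracy and speed can look like in practice, here is a rough sketch. It is not the code from our project: `benchmark`, `threshold_classifier`, and the tiny hand-written "frames" are all hypothetical stand-ins, and a real comparison would pass in the actual classifier or network along with real labelled video frames.

```python
import time


def benchmark(predict, frames, labels):
    """Run predict on every frame; return (accuracy, frames_per_second)."""
    if len(frames) != len(labels) or not frames:
        raise ValueError("need the same non-zero number of frames and labels")
    correct = 0
    start = time.perf_counter()
    for frame, label in zip(frames, labels):
        if predict(frame) == label:
            correct += 1
    elapsed = time.perf_counter() - start
    accuracy = correct / len(frames)
    # Guard against a timer reading of zero on very fast runs.
    fps = len(frames) / elapsed if elapsed > 0 else float("inf")
    return accuracy, fps


def threshold_classifier(frame):
    """Toy stand-in for a real model: label a frame by its mean brightness."""
    return "object" if sum(frame) / len(frame) > 0.5 else "background"


# Hypothetical "frames": short lists of pixel intensities between 0 and 1.
frames = [[0.9, 0.8, 0.7], [0.1, 0.2, 0.1], [0.6, 0.7, 0.9], [0.2, 0.4, 0.3]]
labels = ["object", "background", "object", "background"]

accuracy, fps = benchmark(threshold_classifier, frames, labels)
print(f"accuracy = {accuracy:.2f}, speed = {fps:.0f} frames per second")
```

Passing the model in as a plain function means the same loop can time a traditional classifier and a neural network on identical frames, which keeps the comparison fair.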

Seeing the projects that came out of the course was absolutely amazing, as was the number of interesting data sets that students were able to either generate (we made our own) or find online. Many were pulled from Kaggle, a site that hosts a massive number of data sets, while others were taken from machine learning competitions so that students could see if they could outdo the winners.

One group was looking at flight delays based on which airport you were flying to or from, another was looking at diagnosing certain types of cancer, and some looked at handwriting for a variety of applications. All around, there were a lot of really cool projects and a lot of potential for future development.

Go spend a few hours scrolling through Kaggle's existing analysis or /r/dataisbeautiful. You certainly won't regret it and you might even learn something!

