$ git commit -am "My research experience"

Your first reaction to the title was probably just like mine when I first dove into my summer research internship, EUREKA.

This summer, I am working on Machine Learning and its uses in Computer Vision. More specifically, we are trying to create an image recognition model that identifies the animals in a given picture. This involves heavy use of the Linux command line (aka bash) and Python; it also requires a time-consuming system setup so that everything runs smoothly.

At first, I was lost. I was trying to set up the system so I could use the Python library OpenCV. What do all these words and symbols mean?! I would scream internally while trying to read the endless stream of error lines. Where in the world is that bashrc file? Why do I need to install so many dependencies? Why are my training labels not being saved?! All these questions led to never-ending browsing of StackOverflow posts, looking for guidance in the dark world of Linux. However, as the days went by, I started to see a glimmer of light in that dark place.

The best analogy I can think of to describe this is the process of solving a jigsaw puzzle. At first there are so many pieces everywhere that it’s hard to pick a starting point. Many people (myself included) like to start at a strategic point, like a corner piece, because building everything around it is much easier. My “corner piece” was my CS classes – CS8, CS16 and CS24 – so I was somewhat familiar with Linux and basic bash commands. Soon enough, I started making connections. “So that’s how you set PYTHONPATH!” I had many “oooh” moments where things started to make sense: “No wonder the changes to the bashrc weren’t taking effect, I wasn’t sourcing the file!”
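For anyone stuck in the same place: a quick way to check whether a PYTHONPATH change actually took effect (after running `source ~/.bashrc`) is from inside Python itself. This is just a sanity-check sketch, not part of my project code:

```python
import os
import sys

# The interpreter reads PYTHONPATH at startup and folds its entries into
# sys.path. If you edited ~/.bashrc but never sourced it, the variable
# won't be set in your current shell and nothing will show up here.
pythonpath = os.environ.get("PYTHONPATH", "")
print("PYTHONPATH =", pythonpath or "(not set)")

# Each PYTHONPATH entry should appear somewhere in sys.path.
for entry in pythonpath.split(os.pathsep):
    if entry:
        print(entry, "->", "found" if entry in sys.path else "missing")
```

If your directory shows up as missing, the usual culprit is exactly the one that bit me: the shell you launched Python from never sourced the updated bashrc.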

One after the other, and with the help of my awesome mentor, I slowly installed all of OpenCV’s dependencies, learning something new every minute.

“Phew! That was a nasty obstacle,” I thought. I was ready to start coding in Python. I considered myself a good Python programmer, and I thought I was a bash ninja after setting up OpenCV. Turns out I was wrong on both counts.

My initial task was to implement a program that recognized the digits in a picture; the purpose of the script was to read temperature values. As I Googled and browsed other people’s code, I realized how much Python I didn’t know. “That doesn’t even look like Python code!” I thought while skimming through different snippets. Finally, I came across a well-documented StackOverflow post that I could understand. This was my “corner piece” for OCR and Machine Learning.
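My actual script leans on OpenCV, but the core idea from that post was plain template matching: compare the unknown digit against a reference image of each digit and pick the closest match. Here is a toy sketch of that idea on tiny 3×5 binary grids (the grids and digits here are made up for illustration; a real pipeline does the same comparison on thresholded camera pixels):

```python
# Toy template matching: each "image" is a 3x5 grid of 0/1 pixels,
# flattened into a 15-element tuple.
TEMPLATES = {
    "1": (0,1,0,
          1,1,0,
          0,1,0,
          0,1,0,
          1,1,1),
    "7": (1,1,1,
          0,0,1,
          0,1,0,
          0,1,0,
          0,1,0),
}

def match_digit(pixels):
    """Return the template label with the fewest mismatched pixels."""
    def mismatches(label):
        return sum(p != t for p, t in zip(pixels, TEMPLATES[label]))
    return min(TEMPLATES, key=mismatches)

# A slightly noisy "1" (one pixel flipped) should still match "1",
# which is what makes this approach tolerant of imperfect images.
noisy_one = (0,1,0,
             1,1,0,
             0,1,0,
             0,1,1,   # <- flipped pixel
             1,1,1)
print(match_digit(noisy_one))  # -> 1
```

Scaled up to real pixel data (and with OpenCV doing the heavy lifting), this is enough to read the temperature stamp off an image.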

Fast-forward past all the bugs and problems I faced trying to make my code work.

Ta-da! My program worked. Not only did it have 100% accuracy across all the different cameras, but it also read through 400 images in 20 seconds. It was 5 times faster and 30% more accurate than the code my PI had been using. With that, I realized that I love programming. I love troubleshooting, looking at problems from different perspectives and “thinking outside the box”.

The next task was to create a program that recognized the animals in a given picture. It sounds complicated because it is – it’s something that top tech companies like Google and Microsoft are still trying to nail down. Luckily, there exists an amazing tool called Caffe that does exactly what I was looking for. It’s a convolutional neural network framework that takes in “training data” in order to learn to classify things. It even comes with a collection of pretrained models called the Model Zoo – and can you guess what one of those models does? It recognizes animals in pictures!
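Under the hood, the very last step of a classifier like this is surprisingly simple: the network produces a score (roughly, a confidence) for every label it knows, and the prediction is just the label with the highest score. A toy sketch of that final step, with made-up labels and scores standing in for a real network’s output:

```python
# Made-up class scores, standing in for a network's output layer.
scores = {"deer": 0.12, "gazelle": 0.61, "bobcat": 0.07, "bird": 0.20}

def predict(scores, top_k=3):
    """Return the top_k (label, score) pairs, best first."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]

print(predict(scores))
# -> [('gazelle', 0.61), ('bird', 0.2), ('deer', 0.12)]
```

Keeping the top few guesses instead of only the single best one turns out to matter later, when the best guess is an animal that can’t actually be in the picture.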

Much like OpenCV, Caffe needed a time-consuming setup. Installing it was very similar to installing OpenCV, but it felt easier because I had learned where to look for answers and how to troubleshoot.

“It’s alive!” I told myself when it recognized an image of a deer as a gazelle. It worked; however, it’s nowhere near perfect. It does well on nice photos, but not all of our pictures are good shots: the cameras are motion-triggered, so the images are often blurred, or pixelated because of the lack of light, to name a few issues. All these distortions make the program misidentify animals – for example, it recognized a faraway bobcat as an armadillo and a bird as an impala (no idea how that happened). Currently I am narrowing down the set of labels the model can choose from. For example, we don’t have gazelles in California, so if an animal is identified as a gazelle, it’s probably a deer.
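The narrowing-down itself can be as simple as a lookup table that remaps labels we know can’t occur at our site to their most likely local counterpart. A sketch of the idea – the mapping below is my own illustrative guess, not the project’s final list:

```python
# Labels the model can emit but that don't occur at our field site,
# mapped to the local species they most likely actually are.
# (Illustrative guesses only.)
LOCAL_SUBSTITUTES = {
    "gazelle": "deer",       # no gazelles in California
    "armadillo": "bobcat",   # the faraway-bobcat mixup from above
}

def localize_label(label):
    """Remap an implausible label to a plausible local one."""
    return LOCAL_SUBSTITUTES.get(label, label)

print(localize_label("gazelle"))  # -> deer
print(localize_label("bobcat"))   # -> bobcat (already plausible, unchanged)
```

It’s a blunt instrument compared to retraining the model on local species, but it recovers a lot of accuracy for almost no work.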

I also started working on moving my scripts to remote servers so I can leave my program running for as long as it takes, for however many pictures there are – we currently have 900,000. However, the remote servers run CentOS, a different Linux distribution, so I will spend the next few days on StackOverflow figuring out how to install everything on a different system.
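With that many images, the job also has to survive interruptions, so the plan is to keep a simple log of which files are already done and skip them on restart. A minimal sketch of that idea – the `classify` function here is a hypothetical stand-in for the real model call:

```python
import os

def process_directory(image_dir, done_log, classify):
    """Run classify() on every image, skipping files already logged as done."""
    # Load the set of filenames finished on previous runs, if any.
    done = set()
    if os.path.exists(done_log):
        with open(done_log) as f:
            done = {line.strip() for line in f}

    with open(done_log, "a") as log:
        for name in sorted(os.listdir(image_dir)):
            if name in done:
                continue  # already classified on an earlier run
            classify(os.path.join(image_dir, name))
            log.write(name + "\n")  # record progress so a crash loses little
```

Launched inside tmux (or with nohup), a script like this can chew through the backlog for days and pick up where it left off after any hiccup.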

Let’s see where the command line takes me.