Who am I?

Greetings! I'm Zach, and I love digging into deep questions with data analysis. I started my career with a Ph.D. in Physics, applying my love of statistics and coding to the "deep problems of the universe." I've run experiments that studied how nuclei split apart to release energy (nuclear fission), what makes up a nucleus, and numerous other physics topics. All of this work shared two common themes: how do we collect enormous amounts of data, and what do we do with it once we have it? Solving those problems was the key to unlocking the puzzles of the universe. For me, though, what always mattered most was problem solving through data techniques. So, yes, I've loved working with huge collaborations to study what happened in the first nanoseconds after the Big Bang... but I'm just as excited to use the same analysis techniques to predict what soup someone will eat based on their lifestyle!

I'm fluent in C++ and Python, comfortable in UNIX environments, and have a working knowledge of R, Java, Bash, LaTeX, C#, and JavaScript. I'm also a musician, a rock climber, and a deep lover of green chile.

Featured Projects:

Compendium of Talks

Want to hear me talk about data science? That's a thing I do sometimes.

NLP Pipeline Manager

Natural Language Processing (NLP) has a lot of very frustrating parts. In this post, I introduce a library I wrote and explain how I hope it makes NLP suck less.

Why NumPy Is So Great

Using NumPy vectorization, I demonstrate just how awesome NumPy can be. With a few simple changes, my code runs two orders of magnitude faster.
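The kind of change that post describes can be sketched like this (a minimal illustration, not the post's actual code — the function names and example data here are my own): replace an element-by-element Python loop with a single NumPy operation, so the arithmetic runs in compiled code instead of the interpreter.

```python
import numpy as np

def loop_sum_of_squares(values):
    # Pure-Python loop: one interpreter round-trip per element.
    total = 0.0
    for v in values:
        total += v * v
    return total

def vectorized_sum_of_squares(values):
    # The same arithmetic expressed as one NumPy call,
    # executed in compiled code over the whole array at once.
    arr = np.asarray(values, dtype=float)
    return float(np.dot(arr, arr))

data = list(range(1_000))
assert loop_sum_of_squares(data) == vectorized_sum_of_squares(data)
```

Timing the two versions with `timeit` on a large input is how you'd see the speedup the post talks about; the exact factor depends on the workload and array size.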

Nicer Machine Learning with Spark: RFormula

Picking up from the last tutorial, we use Spark's RFormula to make our code much easier to read.

Machine Learning with Spark

A tutorial that uses Spark's machine learning methods to predict airline delays.

Setting up a Spark Cluster on AWS

A tutorial for building a cluster of machines on AWS, installing Spark, and running your first Spark project.