What We’re Reading - 12/27/2019



How tracking pixels work - Julia Evans

Julia Evans walks us through how Facebook and Old Navy use tracking pixels for retargeting.

I often hear about this slightly creepy internet experience: you’re looking at a product online, and a day later see an ad for the same boots or whatever that you were looking at. This is called “retargeting”, but how does it actually work exactly in practice?

In this post we’ll experiment a bit and see exactly how Facebook can know what products you’ve looked at online! I’m using Facebook as an example in this blog post just because it’s easy to find websites with Facebook tracking pixels on them but of course almost every internet advertising company does this kind of tracking.
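To make the mechanism a bit more concrete, here is a minimal sketch of what a tracking pixel request boils down to: a tiny image URL whose query string carries details about the page you viewed. The endpoint `https://tracker.example/tr`, the parameter names (`id`, `ev`, `dl`), and both helper functions are illustrative assumptions, not taken from the post or from any real tracker's API.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_pixel_url(endpoint, pixel_id, event, page_url):
    """Build the URL a page would embed as a 1x1 image (hypothetical format)."""
    params = {"id": pixel_id, "ev": event, "dl": page_url}
    return f"{endpoint}?{urlencode(params)}"


def read_pixel_request(url):
    """What the tracker's server could recover from the request URL alone."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


# Hypothetical shop page and tracker endpoint, for illustration only.
pixel_url = build_pixel_url(
    "https://tracker.example/tr",
    "12345",
    "ViewContent",
    "https://shop.example/boots?color=brown",
)
print(pixel_url)
print(read_pixel_request(pixel_url))
```

When the browser fetches that image, it also sends any cookies it holds for the tracker's domain, which is how the page view can be tied back to a specific person.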

Want to make good business decisions? Learn causality | Stitch Fix Technology – Multithreaded

Causal inference is on my (and I’m sure many others’) list of things to learn in 2020. This post illustrates it nicely, using the Google Tax as an example.

Causal inference provides a set of powerful tools for understanding the extent to which causal relationships can be learned from the data we have. Standard machinery will often produce poor causal effect estimates, which modern methods from effect estimation, such as TMLE, will consistently outperform. By taking these considerations seriously at Stitch Fix, we foster a culture of better, clearer decision-making.
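As a rough illustration of why a naive comparison can mislead, here is a small simulation with a single confounder. It uses plain stratification, not TMLE, and everything in it (the data-generating process, the true effect of 1.0, the function names) is an assumption made up for this sketch rather than anything from the Stitch Fix post.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Generate (z, t, y) rows where confounder z drives both treatment and outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                   # confounder
        t = rng.random() < (0.8 if z else 0.2)   # z makes treatment more likely
        y = 1.0 * t + 3.0 * z + rng.gauss(0, 1)  # true treatment effect is 1.0
        rows.append((z, t, y))
    return rows


def diff_in_means(rows):
    """Naive estimate: mean outcome of treated minus mean outcome of untreated."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)


def stratified_estimate(rows):
    """Compare within each level of z, then average weighted by stratum size."""
    total = 0.0
    for level in (False, True):
        stratum = [row for row in rows if row[0] == level]
        total += diff_in_means(stratum) * len(stratum) / len(rows)
    return total


rows = simulate()
naive = diff_in_means(rows)
adjusted = stratified_estimate(rows)
print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

In this setup the naive difference is about 2.8 in expectation, because treated units are much more likely to have z set, while the stratified estimate should land close to the true effect of 1.0.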

Corrie Bartelheimer: A Bayesian Workflow with PyMC and ArviZ | PyData Berlin 2019 - YouTube

Nice overview of a Bayesian workflow from PyData Berlin 2019.

When are teams being more aggressive on fourth down? | NFL Football Operations

Since I started using NFL data for data modeling projects (e.g. here and here), I’ve been reading a lot more sports statistics blog posts.

Google Cloud Platform (GCP) Security Best Practices – Assured

GCP is getting more traction, not just around BigQuery, and I’ll be keeping this post close by as a cheat sheet.

Coding habits for data scientists | ThoughtWorks

Helpful post with bad and good examples of how to structure your (Python) code in data science projects.

Merging vs. Rebasing | Atlassian Git Tutorial

Speaking of good code habits, this post takes some of the scary out of git merge and git rebase.

Interesting GitHub repo of the week:

GitHub - huginn/huginn: Create agents that monitor and act on your behalf. Your agents are standing by!

Huginn is a system for building agents that perform automated tasks for you online. They can read the web, watch for events, and take actions on your behalf. Huginn’s Agents create and consume events, propagating them along a directed graph. Think of it as a hackable version of IFTTT or Zapier on your own server. You always know who has your data. You do.
