Exploring Google BigQuery with the R tidyverse
In this post we’ll explore options in R for querying Google BigQuery using dplyr and dbplyr.
Here we share new things we’ve learned that we hope you’ll find useful. Lately, we’ve been learning more about Snowflake (our new favorite data warehouse platform), optimization in Julia, and probabilistic programming in PyMC3.
What are you excited about? Let us know!
In this post we’ll take another look at logistic regression, and in particular multi-level (or hierarchical) logistic regression with RStan and brms.
How to implement the Write-Audit-Publish (WAP) pattern using dbt on BigQuery
Updated Post: How to back up a Snowflake database to S3 or GCS, contributed by Taylor Murphy
In this post, we’ll model field goal attempts in NFL football using Bayesian methods.
What We’re Reading - 01/09/2020: articles and posts we enjoyed
The Blog in Review - 2019
What We’re Reading - 12/27/2019: articles and posts we enjoyed
What We’re Reading - Week of 12/20/2019: articles and posts we enjoyed
Introducing the nfl-dbt repo: dbt analytical models for NFL play-by-play data
What We’re Reading - Week of 12/16/2019: articles and posts we enjoyed
In this post, we’ll model a key NFL football stat, Fourth Down Attempts, using Bayesian Modeling and PyMC3.
What We’re Reading - Week of 12/09/2019: articles and posts we enjoyed
What We’re Reading - Week of 12/02/2019: articles and posts we enjoyed
What We’re Reading - Week of 11/25/2019: articles and posts we enjoyed
How to back up a Snowflake database to S3 or GCS.
We examine how to bulk-load the contents of a pandas DataFrame to a Snowflake table using the copy command.
We take a closer look at setting up a data pipeline for file-based data sources using Snowflake’s powerful Snowpipes feature.
We look at how we can use Powersets, Combinations and Permutations to calculate the number of ways we can split products in an order across multiple warehous...
How to parse nested dictionaries in Snowflake table columns using SQL
We explore a few applications of the Dirichlet Multinomial distribution using PyMC3.
Using SQL and dbt, we create a date dimension table using a retail/merchandising calendar known as a 4-5-4 calendar.
We try to solve a plant production example problem with linear programming in Julia, using two different formulations.
How to randomly select rows within a subset or group of your data using SQL
An opinionated introduction to ‘advanced’ SQL for data warehouse platforms, such as Redshift, Snowflake and Teradata, Part 2 of N
An opinionated introduction to query strategies for data warehouse platforms, such as Redshift, Snowflake and Teradata
How to calculate approximations of confidence intervals for proportions & probabilities in Tableau
How to calculate approximations of confidence intervals for proportions & probabilities in SQL
Working with clients on migrating their Tableau dashboards from Redshift to Snowflake, I ended up writing this little helper in Python that does much of the ...
We’ll be at JuliaCon 2018 from August 7th to 11th in London to learn more about this exciting language and community. Let us know if you’ll be there - we’d l...
We solve a textbook optimization example involving planning multiple warehouse locations using Julia
A Julia port from Python of the Wedding Table Assignment example from PuLP.