Published in Towards Data Science · May 23 · A better way to analyze feature release impact · Or — why naive “before-after” comparisons can drive bad product decisions · A/B tests are the gold standard for estimating causal effects in product analytics. But in many cases they aren’t feasible. One of the most common such cases is the feature release. In this post I’ll discuss the common practice of measuring feature release impact using simple “before-after” comparisons and the biases… · Data Science · 5 min read
Published in Towards Data Science · Oct 31, 2022 · Better Churn Prediction — Using Survival Analysis · Answering the “when” question · In a previous post I made the case that survival analysis is essential for better churn prediction. My main argument was that churn is not a question of “who” but rather of “when”. In the “when” question we ask: when will a subscriber churn? Put differently, how long does a… · Churn · 4 min read
Jul 13, 2022 · DALL-E: the end of human creativity? · Usually I post about how I instruct machines to translate my ideas into scientific discovery. This time around I’d like to discuss something a bit different: how machines can help turn ideas into art. About a month ago I was given early research access to the DALL-E neural net by… · AI · 2 min read
Published in Towards Data Science · Jun 8, 2022 · Better Churn Prediction · To churn or not churn — That is not the real question! · One of the main topics I’ve been working on through the years is churn. Churn reduction is a top priority for many companies, and correctly identifying its root causes can greatly improve their bottom line. Considering how well known and appreciated the churn problem is, I’m often perplexed by how… · Churn · 5 min read
Published in Geek Culture · May 23, 2021 · Sometimes more data can hurt · Don’t believe it? Neither did I! · So here’s a mind blower: in some cases having more samples can actually reduce model performance. Don’t believe it? Neither did I! Read on to see how I demonstrate that phenomenon using a simulation study. Some context: In a recent post on my personal blog I’ve… · Machine Learning · 4 min read
Published in Towards Data Science · Apr 29, 2021 · Don’t Be Fooled by the Hype Python’s Got · R still R still is the tool you want · If you think there’s a typo in the subtitle, think JLO instead (: We all know Python’s popularity among DS practitioners has soared over the past few years, signalling both aspiring DS on the one hand and organisations on the other to favour Python over R in a snowballing dynamic. … · Python · 6 min read
Published in Towards Data Science · Apr 13, 2021 · data.table speed with dplyr syntax: Yes we can! · Why choose between speed and readability when you can have both? · R has many great tools for data wrangling. Two of those are the dplyr and data.table packages. While dplyr has very flexible and intuitive syntax, data.table can be orders of magnitude faster in some scenarios. One of those scenarios is when performing operations over a very large number of groups… · Dplyr · 4 min read
Apr 8, 2021 · Data Science 101: Is Python better than R? · Enoch Kan · Have you tried using the fread function from the data.table package in R? I think multiple benchmarks show it to be way faster than pandas in reading CSV files. · 1 min read
Apr 20, 2020 · dowhy library exploration — Just be-cause · Originally published at https://iyarlin.github.io on April 20, 2020. It is not often that I find myself thinking “man, I wish we had in R that cool python library!” (maybe because I don’t do much NLP/DL). … · Causality · 7 min read
Jan 21, 2020 · Automatic DAG learning — part 2 — Just be-cause · Originally published at https://iyarlin.github.io on January 21, 2020. We’ve seen in a previous post that one of the main differences between classic ML and Causal Inference is the additional step of using the correct adjustment set for the predictor features. In order to find the correct adjustment set we need… · Causality · 5 min read