15 – Probability – Bayes Rule

In the previous post on Probability we looked at terms like Joint Probability, Multiplication Rule & Conditional Probability. Let us build on this understanding and learn Bayes Rule in this post. Formula & Description: The formula for Bayes Rule is P(Ei | A) = P(Ei ∩ A) / P(A), which expands to P(Ei | A) = P(Ei) P(A | Ei) / [P(E1) P(A | E1) + … + P(En) P(A | En)]. Events: E1 through En represent n Events. “A” represents one of the […]

13 – Introduction to Probability

Probability is another integral subject in analytics. The significance of the variables or features used in a model, the significance of the relationship between variables & the significance of a statistical model are all expressed in terms of “p” values. A deep understanding of probability is key to setting up a hypothesis, building models and interpreting […]

“prediction” function in R – Number of cross-validation runs must be equal for predictions and labels

A blog reader reached out to me with an error he faced with the prediction function while trying to plot an ROC curve. This post captures the error and the fix. Error message: Error in prediction(Traindata$predict.score, Traindata$Subscribe) : Number of cross-validation runs must be equal for predictions and labels. Reason: R code segment that gives error […]

12 – Types of Sampling

In the previous post on Sampling and Estimation we were introduced to some important sampling terms and concepts and to the types of estimation that use a sampling approach. In this post we will delve into the types of sampling and the pros and cons of each approach. Probability Sampling: Samples selected through probability sampling techniques listed below […]

11 – Sampling and Estimation

We would ideally like to base our estimations on the entire data that is of interest for a given business problem. That would result in higher accuracy. But often we deal with situations where it is not practically feasible to collect and process the entire data. We may not be able to afford the cost […]