“confusionMatrix” function in R – The data contain levels not found in the data

The “confusionMatrix” function of the “caret” package threw the error below while validating prediction results in R. Error message: Error in confusionMatrix.default(loan$Defaulter, loan$Prediction) : The data contain levels not found in the data. Reason: This error occurs because the two columns passed to the confusion matrix function have different levels. R line of code that gives […]
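
As a rough illustration of the level mismatch described above, here is a minimal sketch in base R. The `loan` data frame below is made up for the example, and the mapping of "1"/"0" to "Yes"/"No" is an assumption, not something taken from the original post. It only shows one way to put both columns on the same set of factor levels before building a confusion matrix.

```r
# Hypothetical stand-in for the post's `loan` data frame (values are invented).
loan <- data.frame(
  Defaulter  = factor(c("Yes", "No", "No", "Yes", "No")),
  Prediction = factor(c("1", "0", "0", "0", "1"))
)

# The two columns carry different level sets; this is the kind of mismatch
# that the error message complains about.
print(levels(loan$Defaulter))
print(levels(loan$Prediction))

# Recode the predictions onto the labels used by the actual values
# (assumed mapping: "1" means "Yes", "0" means "No") and reuse the same levels.
shared_levels <- levels(loan$Defaulter)
loan$Prediction <- factor(
  ifelse(loan$Prediction == "1", "Yes", "No"),
  levels = shared_levels
)

# Both columns now share identical levels, so a cross-tabulation lines up.
print(identical(levels(loan$Defaulter), levels(loan$Prediction)))
print(table(Predicted = loan$Prediction, Actual = loan$Defaulter))
```

Once the levels match, passing the two columns to caret's `confusionMatrix` should no longer raise this particular error, although the right recoding depends on how the real data is labelled.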

10 – Descriptive Statistics – Numeric Variable

In the previous post we saw the different distributions and charts available to summarize categorical variables. Similar distributions and charts are available for numeric variables, which we will see in this post. Frequency Distribution: Similar to categorical variables, Frequency Distributions can be created for quantitative variables too. In the case of categorical variables […]

9 – Descriptive Statistics – Categorical Variable

Before getting into any statistical modelling and more detailed analytics, it is important for us to understand the data and its distribution at a more basic level. Below are some distributions and plots that will help us to understand the categorical variables in our data set. These are called Descriptive Statistics. Frequency Distribution: Assume a […]

8 – Measures of Variability

In the previous posts we looked at measures of central tendency and expanded them to other measures of location. Measures of central tendency give us an idea of the middle point around which the data is spread. But they don’t tell us how much variability there is in the data. Though percentiles give us […]