Data Visualization in Analytics

Modern Data Visualization products are intuitive and easy to use. They are fairly simple to learn for non-technical business users and novice data analysts. They can connect to a variety of data sources in static or live mode. They can leverage statistical programming languages like R and Python to add functionality beyond what the visualization products themselves offer. Dashboards can include dynamic drill-downs that the target audience explores using their own filter criteria, without needing to know the nuances of the product. Visualization has come a long way from the static Excel and PowerPoint charts of the past.

What is typically done in a Data Analytics project?

  • Analyze the data attributes, patterns, relationships and missing elements; impute missing values and transform variables
  • Build, validate, test, score and implement the statistical model
  • Present model results
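
As a rough illustration of the first step in the list above, here is a minimal Python sketch that measures how much of a column is missing, imputes the gaps with the median and applies a log transform. The `spend` values and the helper names are made up for this example, and median imputation with log(1 + x) is just one common choice, not a recommendation for every dataset.

```python
import math
from statistics import median

# Hypothetical sample: monthly spend per customer; None marks a missing value.
spend = [120.0, None, 95.5, 4300.0, None, 110.0, 87.25]

def missing_share(values):
    """Return the fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)

def impute_median(values):
    """Replace missing entries with the median of the observed ones."""
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def log_transform(values):
    """Apply log(1 + x) to compress a long right tail; assumes x >= 0."""
    return [math.log1p(v) for v in values]

share = missing_share(spend)          # how much of the column is missing
imputed = impute_median(spend)        # gaps filled with the observed median
transformed = log_transform(imputed)  # variable transformed for modelling
```

The median is used here instead of the mean because a single extreme value (4300.0 in the sample) would pull a mean-based fill well away from the typical customer.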

The analytics consultant has to work with the business to define the objective, requirement, problem statement or hypothesis. There is complexity involved in finding the right data sources, collecting relevant data elements, potentially dealing with big data scenarios, moving data to the cloud or another suitable platform for analytics, and so on. I am leaving those pieces out for the purpose of this article. Let us assume all that is done and dusted. Then, in a nutshell, you analyze your data, build models and present results.

Where does Data Visualization fit in?

There is a lot that needs to be done in the data analysis and preparation phase. Good models cannot be built without a deep understanding of the data. Data Visualization plays a vital role in this phase. With the level of sophistication that visualization products offer these days, the analyst can plot a variety of charts, slice and dice the data to see it from different perspectives, and figure out what insights can be derived. Many interesting insights may be uncovered in this phase itself. Business objectives can also be added or refined as we progressively understand the data better. Then comes the crucial point, where we decide which statistical models make sense for the given data and business objectives.
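
To make "slice and dice" a little more concrete, the sketch below groups the same records by two different dimensions and summarizes each group, which is roughly what a visualization product does behind a bar chart or a filter. The `sales` records and the `slice_by` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (region, channel, revenue)
sales = [
    ("North", "Online", 120.0),
    ("North", "Retail", 80.0),
    ("South", "Online", 200.0),
    ("South", "Retail", 50.0),
    ("North", "Online", 100.0),
]

def slice_by(records, key_index, value_index=2):
    """Group records by one field and summarize the value field per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[value_index])
    return {
        key: {"count": len(vals), "total": sum(vals), "mean": mean(vals)}
        for key, vals in groups.items()
    }

by_region = slice_by(sales, 0)   # one perspective: revenue per region
by_channel = slice_by(sales, 1)  # another perspective: revenue per channel
```

Looking at the same numbers by region and then by channel is the kind of change in perspective that often surfaces a pattern worth modelling, or a new business question.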

At this point, the work moves over to the Data Scientist, who will work their magic and come out with some results. Let me leave the magic portion as a black box. The only thing I would like to note here is: build complicated statistical models only if they make sense for the scenario and are really required. If multiple approaches can yield the same results, remember that the simplest one is the best. If all the insights for a particular project can be generated using Data Visualization alone, so be it. Make it a 100% Data Visualization project. Don’t build additional models just because you have hired Data Scientists with PhDs.

Assuming a statistical model was required and we did build it, the final portion is to interpret and present the results. The model will output a range of statistics such as the p-value, the chi-square statistic and the adjusted R-squared. Many of these may not mean much to the end client. It is an art to present the results in an intuitive way and to provide business recommendations and advice. Data Visualization can come in handy again here.
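
One small way to bridge that gap is to translate the raw statistics into a sentence a client can read. The sketch below is a hypothetical helper, not part of any library: it computes the adjusted R-squared from the standard formula and wraps it, together with a p-value, in plain language. The wording and the 5% threshold are assumptions for this example.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

def plain_summary(outcome, r_squared, n_obs, n_predictors, p_value, alpha=0.05):
    """Turn model statistics into a short client-facing sentence."""
    adj = adjusted_r_squared(r_squared, n_obs, n_predictors)
    if p_value < alpha:
        verdict = f"statistically significant at the {alpha:.0%} level"
    else:
        verdict = f"not statistically significant at the {alpha:.0%} level"
    return (
        f"The model explains about {adj:.1%} of the variation in {outcome}, "
        f"after adjusting for the number of predictors. "
        f"The overall relationship is {verdict}."
    )

# Hypothetical model output: R^2 = 0.64, 101 observations, 4 predictors.
summary = plain_summary("monthly sales", 0.64, 101, 4, p_value=0.003)
print(summary)
```

A sentence like this works best next to a chart, for example predicted versus actual values, so the client sees the fit instead of reading about it.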

In summary, Visualization can be used extensively in data analysis and to present results. Data Visualization is one of the key skills in the scheme of an Analytics project.
