Big Data Toolbox


Continuing our theme of tools for analytics teams, what tools should analysts have in their toolbox? It’s a broad question and one with diverging views. So, I am delighted to welcome back a guest blogger who doesn’t shy away from controversy.

Martin Squires is a very experienced Analytics leader, whom I’ve previously interviewed in our audio series. He has also posted before on why he disagrees with the use of Business Partners (a view I countered here).

I hope this post might be the start of a series of perspectives. But, for now, over to Martin to share his wisdom on what should be in your toolbox…

Getting caught up in comparison
With the plethora of new kit that seems to come out every year (or even every month), it’s easy to get lost in debates about R vs Python, Azure vs AWS and so on. Being asked about my favourite tools reminded me that I no longer have the pleasure of being “hands-on” for a living.

But what I do have to do is make sure the teams I lead have the right tools, competency frameworks and development plans. So I thought I’d take a step or two back and look at what I’d still want to learn if I were setting out on the development path today, and what I’d recommend to someone thinking of starting that journey.

The ideal toolbox I think needs to enable an analyst to do three main things:

Get their data and prep it for analysis
Explore and analyse the data and unearth the insights
Present the findings in a way that drives action

Stuff your toolbox with these evergreen tools
So, what’s the stuff I’d fill the toolbox with to achieve that?

SQL
Yes, I know it’s not as fashionable, as new or as sexy as some other languages. But if you just look at job postings on LinkedIn, the old workhorse isn’t being put out to grass anytime soon. Most companies still seem to have big relational databases, and SQL is still the best way to learn how they work and to get data out of them.

Plus, once you’ve learned SQL, moving on to learn other languages like R or Python is a lot easier. SQL…the gateway drug for coders! Check out the “Top 20 technology skills in 2019 Data Science job listings” chart in this blog post (spoiler: SQL is 3rd).
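To make “getting data out” concrete, here is a minimal sketch of the kind of query an analyst runs every day: join two tables, then aggregate. It uses Python’s built-in sqlite3 module purely so it can run anywhere. The customers and orders tables, their columns and the values in them are all made up for illustration.

```python
import sqlite3

# Hypothetical in-memory database; every table, column and value is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'North'), (2, 'South'), (3, 'North');
    INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 30.0), (12, 2, 15.0), (13, 3, 5.0);
""")

# Join orders to customers, then summarise spend by region.
query = """
    SELECT c.region, COUNT(*) AS n_orders, SUM(o.amount) AS total_spend
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.region
    ORDER BY total_spend DESC
"""
rows = conn.execute(query).fetchall()
for region, n_orders, total_spend in rows:
    print(region, n_orders, total_spend)

conn.close()
```

The same SELECT, JOIN and GROUP BY pattern carries over to most relational databases, which is a large part of why the skill transfers so well.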

Exploratory data analysis (EDA)
I’m not sure if it’s a tool as such, but John Tukey’s book Exploratory Data Analysis should still be compulsory reading. For all the “data science is the sexiest job in the 21st century” stuff, just having a great grounding in how to explore a data set is still “Analysis 101”.


How do analysts visualise and explore their data to understand the patterns in it? Even if you are going on to build complex models with the data, understanding the materials you are building with is a vital first step.

It is a very underrated skill. As far as software goes, I rather like tools like Alteryx, which make the task a lot, lot simpler and more efficient than it used to be. But here is a link to this vital EDA guide.
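As a small taste of what exploring a data set can look like in practice, here is a rough sketch of a five-number summary and a simple outlier check of the kind associated with Tukey’s box plots. The spend values are invented, and the quartile method (the “inclusive” option in Python’s statistics module) is my assumption; other tools compute quartiles slightly differently.

```python
import statistics

def five_number_summary(values):
    """Return min, lower quartile, median, upper quartile and max."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}

def outliers(values, k=1.5):
    """Flag values more than k interquartile ranges beyond the quartiles."""
    s = five_number_summary(values)
    iqr = s["q3"] - s["q1"]
    low, high = s["q1"] - k * iqr, s["q3"] + k * iqr
    return [v for v in values if v < low or v > high]

# Made-up weekly spend figures, including one suspiciously large value.
spend = [2, 4, 4, 5, 7, 9, 10, 12, 40]
summary = five_number_summary(spend)
print(summary)
print(outliers(spend))
```

A summary like this is no substitute for actually plotting the data, but it is a quick first look at the materials you are about to build with.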

Data visualisation
This one also depends on exactly what you need to do. For some, it is simply about how to present data to end-users once you’ve got the analysis in place. For others, it’s a tool that helps with the exploratory data analysis, helping analysts to drill down into the data and carry out a train-of-thought analysis.

The two are very different, differences which have driven more than one BI project onto the rocks. Here, I’m sticking to the “present the data” option. Again, I’m almost tempted to include books as well as tools per se and I’d certainly recommend reading Alberto Cairo, Andy Kirk and Stephen Few before touching software. Check out Paul’s recommended reading list too.

In terms of kit, all the main players in this space do a good job and it really just comes down to personal taste. Tableau and Qlik are both very good tools, and both have free versions with lots of good online training material, so either would tick the box as a great place to start.

Geographical Information Systems (GIS)
A big hobby horse of mine. Geography is key to understanding customer behaviour in lots of areas. For example, in healthcare, how much of a customer’s choice of pharmacy is driven by proximity to a GP? (Answer: a hell of a lot.)

So, I’m glad to see that fellow guest blogger Tony Boobier has also championed the priority of thinking about location & maps in his posts.

For me, learning a mapping tool is essential. There are deeper and more powerful tools than QGIS out there (most of the real experts I know are fans of ArcGIS), but QGIS is free to use and has good training resources, which makes it a good place to start.
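To give a feel for the kind of question a mapping tool answers, here is a minimal sketch of the pharmacy-and-GP proximity idea from earlier in this section. It uses the haversine formula for great-circle distance. The coordinates and pharmacy names are invented, and a real GIS would handle projections, travel times and much more.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical GP surgery and pharmacies (latitude, longitude).
gp = (51.5000, -0.1200)
pharmacies = {
    "Pharmacy A": (51.5010, -0.1210),
    "Pharmacy B": (51.5200, -0.1000),
}

# Find the pharmacy closest to the GP as the crow flies.
nearest = min(pharmacies, key=lambda name: haversine_km(*gp, *pharmacies[name]))
print(nearest)
```

Straight-line distance is only a first approximation of proximity, which is exactly where a proper mapping tool earns its place in the toolbox.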

Which do your analysts have in their toolbox?
Thanks to Martin for that list of recommendations. Very useful & clearly grounded in practical experience. Do you agree? Are there any others you would recommend as essentials for a good analyst’s toolbox?


In part two of this series, Martin will complement this selection of tools with the softer skills he knows analysts also need. The people skills they need to have in their toolbox to achieve real impact with their work.
