There are many places where a bit of analytics can help any business. You may want to automate some routine tasks. You may need to analyse your sales data to decide which products are performing. You may even be attempting to create a new product that could help change the world.
With all of the tools at your disposal to accomplish these tasks, how can you decide which one is right for your needs? Whether you want somebody else to help or you want to try your own hand at analysis, the options can be overwhelming. Thankfully, we have experience implementing solutions in a number of these tools to help our clients.
We’ve put together this helpful guide, covering the most common tools to help you on your journey. Each tool has strengths and weaknesses, and no tool is right for every situation. It’s important to pick the best tool, as well as hire the right people or outside help with the right skill sets for your team.
Excel
The bread and butter (and the bane of our existence at times) of most analysts. It is best used for small scale data analysis. The best part of Excel is its widespread use. Most business professionals have a familiarity with using it as a tool, even if some of the more complex formulas and features are more than they need.
You can do almost any basic mathematical function you can think of in Excel, and present your data in a wide variety of intuitive and visually engaging ways. Data visualisation couldn’t be easier, you just have to highlight your data and navigate through a few menus. It integrates well with other Microsoft products, and with some more advanced VBA coding, you have the ability to automate everything from data analysis, to creating PowerPoint / PDF reports.
Excel has a few downsides that we must mention. It’s biggest challenge is that it doesn’t scale particularly well as the size of your data or the complexity of your formulas increases. On some occasions, our clients used large, complex Excel spreadsheets which could take up to hours to refresh, and these workbooks would crash almost daily. The other big challenge is Excel’s version compatibility. Newer versions of Excel have far more advanced features which are very useful, including data tables, PowerQuery, and new formulas. Unfortunately, these aren’t always backwards compatible, and you may be forced to redo your work in an “old Excel” friendly manner when sharing with colleagues or clients. Finally, complex operations and more advanced data manipulation is limited in Excel – one of the most advanced things you can do is a linear regression.
Google Sheets
Google’s done exceptionally well mimicking Excel, and has continued to improve on it every year with Google Sheets. (please change the filter options to match Excel, it drives us mad!).
It’s very similar to Excel – but with less functionality (for now) and a bit more technical frustration. The ability to share Sheets easily with colleagues with Google drive, and the automated version control / back-ups are a huge plus, and is great for team collaboration.
The cherry on top with Google Sheets, is the ability to connect & integrate it with the wider Google suite. You have the ability to connect Google Sheets to interact with Google Slides, Gmail, Google Forms, Google Cloud Database… etc. and be able to do all sorts of automation with a few lines of code in Python, R or using Google Scripts. This is becoming a real fan favourite within our team.
SQL
No talk of data science in the business would be complete without talking about SQL. SQL is a structured query programming language used to manage databases and manipulate the data within. It is possibly the easiest programming language to learn. SQL is 30 years old, and as such the code is mature, optimized, and efficient. It is usually the best choice for pure data manipulation, and the main choice for dealing with databases. Data manipulation that takes an hour in Python takes minutes in SQL.
It integrates with Excel, Python, R, and most other tools, so you can extract data directly from your database and analyse in an environment you are comfortable in. You just need to bring in somebody who know how to set these connections up, and you’ll be on your way to streamlined analysis in no time!
It is worth noting that SQL is usually a data manipulation tool in a larger analytical toolkit. SQL doesn’t have the functionality to present data in a business savvy way on its own. You will generally complete your analysis in Excel, R or Tableau etc. with the data sourced from SQL.
Python
Python is an open source coding language, with many customisable tools for analysis. Any analytical task you want to do, from visualisation to database manipulation, can be done from Python. As an open source tool, Python is free. It can easily work with data from most sources. Python is also a particularly good choice for automation of mundane business tasks.
We’ve automated legal contracts and complex data processing with packages used to interact with Excel, Word, Google, and corporate databases. We've also helped clients scan through Outlook email boxes to perform text mining. Python also has very strong support for complex machine learning if you want to go that route. From our experience Python is the tool of choice for much of the research and advancements in AI today. Finally, you can extend Python to your business needs as far as you want to. Its Django package is the programming language of choice for many of today’s most popular websites, and with the right help you could easily use it to create a number of web apps.
Python has a few downsides. It is the tool with the steepest learning curve. While it is designed to be as easy to use as possible, it is still a coding language and you need some degree of computer science skills to utilise it effectively. It may be easier to hire outside help for your team, at least to set up the foundational groundwork.
R / R Studio
R is open source software designed especially to work with data and analyse. R is free, though some of its more advanced packages aren’t. You can do almost anything analytical with a great degree of customisation within R. It has less functionality than Python, but most things you can do in R are easier to do. R has a number of useful software packages.
The Shiny package is particularly useful to build aesthetically pleasing web applications with minimum effort and no required knowledge of html or CSS. It also has much more support for specific statistical functions in its packages than Python. Of course, the powerful ggplot for data visualisation is a plus.
R is easier to learn than Python, but its syntax can be frustrating for those familiar with traditional coding languages.
Alteryx
A relatively new and trending player in the Analytics world, we're definitely seeing more and more clients using this platform. Alteryx is a powerful analytics platform which spans across end-to-end data analytics processes (from ETL to data cleaning and manipulation, as well as basic visualisation).
This work flow tool allows you to merge and manipulate data in a drag-and-drop interface. Super easy to pick up, which means you don't need to be from a very technical background to use it. The biggest advantage of Alteryx is its ability to integrate with a wide range of other data sources or platform - extract data directly from Salesforce and SQL databases within the same workflow and blend them!
Data exploration (i.e. when you don't know what the data is good for yet) is slightly slower to do in Alteryx, and you can very quickly end up with a messy, over complicated workflow. From our experience, good practice needs to be set up and communicated at the beginning of your Alteryx journey - trust us, it will save you a lot of time and grief later on.
Tableau
Tableau is a powerful data visualisation tool which has taken the analytics world by storm. It helps simplify raw data into intuitive dashboards through drag-and-drop functionality - which means that you don’t need to be a super technical guru to use it!
A large majority of our clients use Tableau to create automated internal self-serve dashboards. For smaller teams (e.g. the finance department), we’ve seen our clients use it for simple data manipulation and blending as well. Tableau is very easy to pick up and also integrates well with SQL, Excel and a number of big data platforms. One of the biggest benefits of Tableau is the ability to deploy dashboards and schedule data refreshes on Tableau servers which makes data collaboration a whole lot easier. Diving into the data and performing exploratory analysis is a much easier task using Tableau.
From our experience, there is a cap to how much data Tableau can handle, and how complex your analysis can be as it will severely impact the performance of your dashboard.
Power BI
An up-and-coming dashboard software from Microsoft is Power BI. Think of it as a visually and technically beefed up version of Excel. If you are familiar with Excel, you should find adjusting to the functions and formulas here quite easy. PowerBI’s biggest strength is that it lets you adjust and slice your data with a few clicks. It can easily import data from Excel and database sources. If you want a cheap dashboard software with a lot of out of the box functionality, this is a good choice.
The downside to PowerBI is that it isn’t useful for much outside of creating analytics dashboards. Similar to Tableau, it is usually best as a tool in a larger toolkit. Simple tasks such as creating distribution charts or metrics (e.g. median) is not as straight forward to do in PowerBI making it a lot less flexible and intuitive for data exploration. This is likely more useful for creating dashboards when you already have a pretty good idea of the design.
Which one’s for you?
To summarise, there are a lot of options out there (many more which we haven’t called out). From our experience,
For small datasets – Excel and GoogleSheets is usually your best choice
For large volume of data – SQL and a combination of Python or R are good tools of choice
For automation – Alteryx is the easiest to use, Python for more complex tasks
For advanced analysis – R is better for statistics, Python is better for advanced machine learning (deep learning in particular)
For pretty and useful visualisation – PowerBI for more simpler dashboards and Tableau for more customisation
We hope this has helped guide the selection process for tools that you can use to complete your analytics task. If not, we are happy to help. We’re in the business of helping after all!
About the Author
Wil Grebner is data analytics consultant at EdgeRed Analytics with a Masters in Econometrics from the University of Sydney. He grew up in River Falls, Wisconsin, USA - a great small town nobody reading this has probably ever heard of. In his spare time he enjoys not drinking bubble tea.
About EdgeRed
EdgeRed is an Australian boutique consultancy specialising in data and analytics. We draw value and insights through data science and artificial intelligence to help companies make faster and smarter decisions.
Subscribe to our newsletter to receive the our latest data analysis and reports directly to your inbox.
Comments