Top AI-Powered Analytics Tools: Assisting or Replacing Data Analyst Jobs?

Figure: A human data analyst facing a robot across a chessboard, the metaphorical face-off between AI-powered analytics tools and the human data analyst. Generated by DALL-E.

Where do we stand between AI as a complex toolkit for analysis specialists and AI as an assistant that eases or replaces parts of the analyst's work? To answer this, we will walk through the chronological developments in AI that have affected data analysis, as well as recent tools on the market, to establish the current state of the art in the automation of data analysis.

Data analysis is the process of extracting meaningful insights and patterns from raw data. It is a critical component in a wide range of fields, as it enables informed decision-making, problem-solving, and the identification of opportunities for improvement. The field has long benefitted from the new capabilities provided by AI tools. More recently, however, AI has become increasingly capable of assisting in the end-to-end process of data analysis and of performing the required analysis steps by itself. On the one hand, this increases the productivity of data analysts who employ such tools. On the other hand, less technical users become more independent of specialised data analysts and can benefit from enhanced decision-making support and newfound data insights.

What is data analysis and why is it important?

Broadly speaking, data analysis is the process of inspecting, cleansing, transforming, modelling, and visualising data to derive meaningful insights and inform decision-making. A deep dive into the main branches of data analysis or its common procedures is out of scope for this article, but please feel free to subscribe to follow upcoming news and articles in this domain.
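To make the steps of inspecting, cleansing, transforming, modelling, and visualising more concrete, here is a minimal sketch in plain Python. The dataset, the variable names, and the choice of a straight-line trend as the "model" are all assumptions made purely for illustration; a real analysis would typically rely on dedicated libraries and far more data.

```python
from statistics import mean

# Hypothetical raw records: (month, revenue). None marks a missing value.
raw = [(1, 100.0), (2, 110.0), (3, None), (4, 130.0), (5, 140.0)]

# Inspect: how many records are there, and how many are incomplete?
n_records = len(raw)
n_missing = sum(1 for _, revenue in raw if revenue is None)

# Cleanse: drop records with a missing revenue value.
clean = [(month, revenue) for month, revenue in raw if revenue is not None]

# Transform: split the records into two aligned series.
months = [month for month, _ in clean]
revenues = [revenue for _, revenue in clean]

# Model: fit a straight line by ordinary least squares.
mean_x, mean_y = mean(months), mean(revenues)
slope = sum((x - mean_x) * (y - mean_y) for x, y in clean) / sum(
    (x - mean_x) ** 2 for x in months
)
intercept = mean_y - slope * mean_x

# Visualise: a crude text bar chart, one '#' per 10 units of revenue.
for month, revenue in clean:
    print(f"month {month}: {'#' * int(revenue // 10)} {revenue:.0f}")

print(f"{n_missing} of {n_records} records dropped; trend: {slope:.1f} per month")
```

Even in this toy form, the steps feed into each other: the missing value found during inspection determines what the cleansing step has to do, and the cleaned series is what makes the trend estimate meaningful.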

In today's digital age, where vast amounts of data are generated, the role of data analysis has become indispensable across industries. From businesses seeking to optimise operations and understand consumer behaviour to scientists exploring complex phenomena and policymakers formulating evidence-based strategies, data analysis provides a foundation in virtually every aspect of modern society.

With the rise of automation and artificial intelligence (AI), the relevance of data analysis has surged yet again. AI-driven systems rely heavily on data analysis and the corresponding processing for tasks such as model training, fine-tuning, and validation. Conversely, AI technologies also have an impact on the field of data analysis itself. Initially, this expanded the analyst's toolbox with advanced tools such as machine learning or clustering methods, allowing analysts to extract deeper insights from increasingly complex datasets. More recently, advancements in AI have enabled these systems to play a more hands-on role in the end-to-end process of data analysis, applying reasoning to run various analytical procedures autonomously.

Regardless of how or by whom the data analysis is performed and the insights are extracted, effective data analytics ultimately offers the following potential benefits, according to Oxagile:

  • Increased profit

  • Cost reductions

  • Higher operating margins

  • More effective strategic decisions

  • Improved control of operational processes

  • A better understanding of customers

  • Increased workforce productivity

  • Enhanced security

The rise of AI in data analysis

The evolution of data analysis has been greatly influenced by the steady advancements in artificial intelligence (AI) over the past few decades. Starting in the 1990s and 2000s, the integration of machine learning algorithms into data analysis tools paved the way for more sophisticated techniques. During the Big Data era of the 2000s and 2010s, the introduction of technologies like Apache Hadoop, built around the MapReduce programming model, enabled the processing of large-scale, unstructured data. Concurrently, the rise of the Python data science ecosystem, including libraries like NumPy, SciPy, pandas, and scikit-learn, empowered users to build increasingly sophisticated custom data analysis pipelines.

The true AI revolution, however, took place in the 2010s and 2020s. Breakthroughs in deep learning, with the development of powerful neural network architectures like convolutional neural networks (CNNs) and recurrent neural networks (RNNs), transformed fields like computer vision and natural language processing (NLP), opening up new frontiers for data analysis, particularly for unstructured data such as images and text. The emergence of open-source deep learning frameworks, such as TensorFlow and PyTorch, alongside established libraries like scikit-learn, further democratised access to advanced machine learning capabilities, accelerating innovation in data analysis.

More recently, the advent of Automated Machine Learning (AutoML) tools, popularised by offerings from companies like Google and Amazon, has made it easier for non-expert users to leverage powerful AI-driven data analysis techniques without the need for extensive technical expertise. Alongside these advancements, there has been a growing emphasis on Explainable AI (XAI) and responsible data analysis practices, which aim to ensure transparency, fairness, and ethical AI-driven decision-making.

While many of these advancements in AI have been more closely related to enhanced data modelling and predictive capabilities, they have nevertheless significantly improved the core domain of data analysis. Techniques like deep learning have enabled the automated extraction of complex features and patterns from structured and unstructured data (such as audio-visual and textual data), allowing analysts to uncover hidden insights that would have been difficult to detect through traditional statistical methods alone. Furthermore, AI-powered anomaly detection algorithms can identify outliers in large datasets with greater accuracy, supporting the data exploration and sense-making process.
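To illustrate what flagging outliers means in practice, the sketch below uses a deliberately simple statistical rule, the modified z-score based on the median absolute deviation (MAD). This is a classical baseline and not one of the AI-powered algorithms mentioned above; the function name, the sample readings, and the threshold of 3.5 (a commonly used convention) are assumptions chosen for illustration.

```python
from statistics import median

def find_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    centre = median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median([abs(v - centre) for v in values])
    if mad == 0:
        # All (or most) values are identical; the score is undefined.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [v for v in values if 0.6745 * abs(v - centre) / mad > threshold]

# Hypothetical sensor readings with one obviously suspicious value.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 25.0, 10.2]
print(find_outliers(readings))
```

Learned anomaly detectors go further than this rule by modelling relationships across many columns at once, but the goal is the same: to point the analyst at the records that deserve a closer look.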

AI-based automation of data analysis and reports

The previous section highlighted how the developments in AI have expanded the data analyst's toolkit. In this section, we will go further and look into more independent assistance AI can provide in data analysis and report writing, as well as currently existing tools and their capabilities.

When searching the web for automated data analysis and reports, one is likely to come across ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) services. Some of the most common examples of these tools include Apache NiFi, Talend, and Microsoft Azure Data Factory. ETL/ELT services essentially move data from source systems to a destination such as a data warehouse or data lake, transforming it into a usable format either before loading (ETL) or after loading (ELT). These tools greatly simplify such tasks, replacing effort that would otherwise go into designing and developing the software architecture and backend, data storage, deployment, and other DevOps procedures.
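The extract, transform, and load steps can be sketched in a few lines using only the Python standard library. This is a toy illustration and not how any of the named services work internally; the CSV content, the table name "orders", and the function names are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source export; in practice this would be a file or an API response.
SOURCE_CSV = """order_id,country,amount
1, de ,19.99
2,FR,5.00
3,de,
"""

def extract(text):
    """Extract: read the raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows without an amount, normalise country codes, cast types."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["country"].strip().upper(), float(amount))
        )
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for a real data warehouse
load(transform(extract(SOURCE_CSV)), conn)

totals = conn.execute(
    "SELECT country, SUM(amount) FROM orders GROUP BY country ORDER BY country"
).fetchall()
print(totals)
```

Note that the pipeline ends once the data is clean and queryable. Deciding which questions to ask of the "orders" table, and what the answers mean, is left entirely to the analyst.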

However, it's crucial to note that while ETL/ELT tools excel at automating and deploying data processing pipelines, they fall short when it comes to the iterative process of data analysis. They therefore do not necessarily assist with the core of data analysis: extracting meaningful insights and patterns from data. This is where the distinction lies between automated data processing and true autonomous data analysis.

With the rise of large language models, as used in ChatGPT and GPT agents, AI models can now summarise extensive analysis results and perform a degree of data querying using natural language. Let's have a look at the existing tools and the level of autonomous data analysis and report generation they provide.

Best AI-driven tools for data analysis automation

APEX by Numvio

https://numvio.com/

APEX is an AI-powered platform by Numvio that aims to empower businesses with data analytics and tailored AI modelling without the complexities of coding, extensive setups, and data prompting. The platform requires minimal setup and querying, producing a comprehensive data analysis from nothing more than an uploaded dataset file or spreadsheet. It then generates a data analysis report, which not only visualises and describes key features of the dataset but also highlights the potential business value of the data, with specific suggestions for further AI model applications to benefit the business. A remarkable feature of APEX is that the user can provide custom requests or simply select one of the suggested AI use cases, which prompts APEX to automatically create a data pipeline, train corresponding AI models, and generate a dashboard for the tailored use case. The data pipeline, AI model processing, and dashboard are then ready for deployment via the web UI or APIs that facilitate streamlined decision-making. The benefit of this platform is that it makes the full range of data analysis (from EDA to AI modelling) accessible to users with diverse backgrounds, with minimal time and effort.

Julius

https://julius.ai/

Julius offers a text-based (chat) interface to analyse and visualise data. It can create graphs, run statistical tests, and answer questions about the uploaded data. It is also capable of performing common data analysis tasks, such as exploratory data analysis (EDA). It appears that this system needs to be precisely instructed on each step of the calculation and processing, making the tool better suited to professionals or experienced users in the field of data analysis. Some capabilities for generating forecasting models are also provided, albeit of a rather experimental or academic nature, as the service appears to lack deployment features and the ability to process data and run AI models in the form of data pipelines.

DataGPT

https://datagpt.com/

Similar to Julius, DataGPT offers a user-friendly data querying service where users can ask data-related questions to receive AI-generated answers accompanied by relevant graphs. With its AI onboarding assistant, DataGPT provides a simple setup by suggesting suitable metrics and dimensions from the data source. However, it primarily relies on user queries, seems to lack report generation, and currently lacks AI modelling capabilities, making it more suited for straightforward data interactions rather than in-depth analytics or reporting.

Tableau AI

https://www.tableau.com/products/tableau-ai

Tableau AI features the "Einstein Copilot", which is a text-based AI chatbot. Based on user queries, it can provide information about the data, show charts, and offer code examples to help set up a data analytics procedure. The module "Tableau Pulse" also adds similar functionality to previously created charts and dashboards. A benefit of this system is the access to the large Tableau ecosystem (e.g., support for a wide range of databases). The downside, however, is that it requires a considerable amount of manual setup, despite a level of AI assistance. While the AI assistant is also capable of suggesting data queries, this approach assumes that the user already knows the analysed data and its potential well enough to write queries. User reviews of Tableau praise the visual appeal and configurability of the charts while noting the high pricing and the effort involved due to a steep learning curve for more advanced features.

Microsoft Fabric

https://www.microsoft.com/en-us/microsoft-fabric

Microsoft Fabric provides access to new and existing components from the Microsoft services Power BI (for data visualisation and reporting), Azure Synapse (to query datasets and develop code for data analysis), and Azure Data Factory (for data preparation by constructing ETL/ELT pipelines). Fabric offers "Copilot", another chat-based AI assistant. You might be familiar with it in the form of GitHub Copilot or Copilot in the Microsoft Office suite (Excel, Word, etc.). In Fabric, it can provide code completion, generate code templates, create charts in response to questions about the data, and summarise previously configured reports. As with Tableau AI, prior knowledge of the analysed data as well as technical coding skills are therefore required, in particular for more advanced capabilities such as AI model training. Another factor to consider is the potentially prohibitive pricing, since Copilot is only available on paid (non-trial) capacities of SKU F64, Premium P1, or higher.

Polymer

https://www.polymersearch.com/

Polymer uses AI to automatically analyse data and generate tailored dashboards from various sources or pre-made templates. Another feature is the ability to embed Polymer's functionality, catering to app developers looking to integrate data analytics into their software. (Luzmo deserves a mention as another service that specialises in this embedding feature.) While Polymer's chatbot interface allows for querying of the created charts, users may find that the initial analytics lack detailed explanations and the comprehensive insights that would be expected from an EDA. Polymer shines in creating visually appealing dashboards with added textual insights, but it does not offer AI model generation or data processing pipeline deployment, so supplementary tools may be needed for more advanced tasks.

Akkio

https://www.akkio.com/

Akkio is an AI data platform that features "Chat Explore" to query data and generate graphs using GPT-4. The platform allows users to copy these visuals into Akkio reports. While Akkio supports the extra functionality of model training and deployment, its reports offer rather basic insights with brief chart descriptions, often lacking summary metrics and conclusions about the dataset as a whole. Additionally, like most of the previously listed services, the platform demands specific data queries as well as a manual data preprocessing setup (supported by an AI assistant), catering more to technically experienced users.

Is AI replacing data analyst jobs?

The emergence of AI-driven tools in data analysis automation suggests that truly automated data analysis is becoming a reality. Platforms like APEX by Numvio, Tableau AI, and Microsoft Fabric exemplify this trend by offering advanced capabilities to automate data processing, report generation, and even model training in some cases, significantly reducing the manual effort traditionally associated with these tasks. Most of the existing tools, however, sit on a spectrum: at one end, they require high effort and skill in return for extensive functionality, and at the other end, they offer ease of use but limited analysis depth or features. In other words, these tools often lack the interpretive depth and strategic insight that human analysts bring to the table. Moreover, these platforms typically require specific data queries, manual data preprocessing, and a level of technical expertise, making them more suitable for technically inclined users or professionals in the field of data analysis.

At this stage, human data analysts and AI-based data analysis automation complement each other. Expert data analysts are required to operate or query AI-assisted systems, fine-tune preprocessing steps, and add the iterative element that is essential in data analysis, such as adjusting a preprocessing step once the analysis yields a certain insight or providing feedback on the AI-generated results. The collaboration between AI and human analysts can lead to deeper, more actionable insights obtained more efficiently, driving business growth.

In summary, while AI-driven tools are revolutionising the data analysis landscape, they are not replacing data analyst jobs but rather augmenting the capabilities of data analysts and enhancing the overall data analysis process. However, technology is moving fast and it will be exciting to see the future leaps that it brings.