Data Horizon: Innovation Recap & Top Trends for 2024

Data Horizon: Innovation Recap & Top Trends for 2024

Written by

Ajit Monteiro, CTO & Co-Founder

Published

December 15, 2023

Data & AI Strategy

Data Technology Innovation Recap

In 2023, “Artificial Intelligence (AI)” transitioned from a mere buzzword to a tangible reality. Snowflake’s focus on Apps and AI led to strategic acquisitions and significant product updates, while Matillion introduced its cloud-native Data Productivity Cloud and Power BI introduced new features like CoPilot and DAX Query View. Explore how these advancements are transforming data analytics and intelligence.

Snowflake Updates

This year, Snowflake made several acquisitions to improve their AI, data sharing, migration, and python offerings.

Myst AI

Myst AI for its time series forecasting abilities; crucial for optimizing business operations like supply chain management and financial planning

SnowConvert

SnowConvert to bolster Snowflake’s migration toolkit, significantly easing the transition of legacy databases to Snowflake’s cloud data platform

LeapYear

LeapYear to incorporate differential privacy into Snowflake’s services, enabling secure data collaboration with mathematically proven privacy protection

Ponder

Ponder to enrich Snowflake’s data capabilities with Ponder’s expertise in bridging data science libraries and scalable data operations, fostering enhanced analytics and machine learning functionalities

Neeva

Neeva to integrate advanced, privacy-focused search capabilities into Snowflake’s data cloud, enhancing user experience and competitive edge in data analytics

Samooha

Samooha to accelerate our vision for removing the technical, collaboration, and financial barriers to unlocking value with data clean rooms

Improved Python Support

(Generally Available): Snowflake has notably enhanced its Python support, allowing data scientists and engineers to integrate Python tools and libraries with Snowflake’s platform seamlessly. This enhancement facilitates efficient data analysis and machine learning directly within Snowflake, leveraging Python’s widespread popularity in the data science community to maximize the platform’s data analytics capabilities.

Why it matters: Allows you to use Python inside of Snowflake, including in user-defined functions, and workbooks.

Snowflake Native Apps Framework

(Available on AWS, Private Preview on Azure): Snowflake has expanded its ecosystem by introducing native applications, a move that significantly enriches the functionality and flexibility of its cloud data platform. These native apps, developed either by Snowflake or third-party developers, are designed to operate seamlessly within Snowflake’s environment, offering users a range of tools and services directly accessible within the platform.

Why it matters: Allows for developing and publishing data applications inside of Snowflake.

Streaming and Dynamic Tables

(Open Preview): Dynamic Tables allow users to define transformation logic with a simple, declarative SELECT statements, and Snowflake will automatically keep the table up-to-date, with cost-efficient incremental updates, a custom-defined refresh schedule.

Why it matters: Dynamic tables removes the need for some customers to maintain their own incremental update ETL framework.

Support for Git Integration
(Open Preview): You can now maintain your entire database schema in git and deploy automatically via the Snowpark CLI. Snowflake’s new CREATE OR ALTER statement will handle DDL modifications automatically without dropping objects entirely. Check out our blog post to learn more.

Why it matters: Allows development teams to work simultaneously on a codebase while tracking and managing changes efficiently.

Notifications for Better Observability

(Generally Available): The new SYSTEM$SEND_EMAIL() system stored procedure can be used to send- email notifications. You can call this stored procedure to send an email notification from a task, your own stored procedure, or an interactive session.

Why it matters: Allows for data-driven email notifications for use-cases like KPI thresholds, data observability, access monitoring, etc.

External Network Access

(Open Preview): You can create secure access to specific network locations external to Snowflake, then use that access from within the handler code for user-defined functions (UDFs) and stored procedures. You can enable this access through an external access integration.

Why it matters: Allow or block specific network access to Snowflake, as well as call external APIs from Snowflake.

Snowpark Container Services

(Expected Release in Summer 2024): Enables developers to deploy, manage, and scale generative AI and full-stack apps, including infrastructure options like GPUs, securely within Snowflake. This service broadens access to third-party services like LLMs, Notebooks, MLOps tools, and more.

Why it matters: Allows you to host code from any language inside of Snowflake, and make sure your data that is used by the code doesn’t leave the Snowflake cloud.

Large Language Models and Document AI

(Expected Release in Summer 2024): A new LLM developed by Snowflake, built from the acquisition of Applica’s generative AI technology, helps customers extract insights from unstructured data without needing machine learning expertise

Why it matters: Easy access to document-based machine learning.

Matillion Updates

(Expected Release in Summer 2024): A new LLM developed by Snowflake, built from the acquisition of Applica’s generative AI technology, helps customers extract insights from unstructured data without needing machine learning expertise

Data Productivity Cloud

Matillion released its Data Productivity Cloud product, which is a cloud-native version of its software.

Why it matters: In addition to enhanced functionality, moving to the SaaS version of Matillion will remove the need to host and manage their own instance, allowing more time to focus on business value.

LLM-enabled Data Pipelines

In Q1 2024, Matillion expects to launch an AI Prompt component, allowing users to augment their data pipelines with GenAI.

Why it matters: Companies with an existing Matillion implementation may find this feature to be an easy route to proof-of-concept, before investing heavily in a custom LLM integration.

PipelineOS

Announced in July 2023, PipelineOS is the key innovation in Matillion’s new Data Productivity Cloud. The stateless, microservice architecture allows for both horizontal and vertical scaling.

Why it matters: Understanding the concept of PipelineOS will better equip customers to transition to the SaaS platform.

Microsoft Updates

Power BI continues to redefine the landscape of data visualization with its new features, including CoPilot and the DAX Query View. Additionally, Microsoft released new branding for all of their data-related tools: Microsoft Fabric.

CoPilot in Power BI

Using Copilot, you can simply describe the visuals and insights you’re looking for, and Copilot will do the rest. Users can create and tailor reports in seconds, generate and edit DAX calculations, create narrative summaries, and ask questions about their data, all in conversational language.

Why it matters: Allows you to create dashboards using natural language, minimizing the need for a analytics engineer, especially for ad-hoc use cases.

DAX Query View

A new feature that provides powerful querying capabilities, enhancing data analysis and insights derivation.

Why it matters: Gives you access to writing DAX queries inside of PowerBI without any external tools.

Microsoft Fabric

Announced in May 2023, Microsoft Fabic is the new branding for all data-related tools in the Microsoft ecosystem (e.g. Data Factory, PowerBI, Synapse, etc.).

Why it matters: Currently, Snowflake and Databricks are considered to be the industry leading data cloud platforms, and Fabric is Microsoft’s response to those tools.

Data Trends and Predictions for 2024

In the ever-evolving landscape of business intelligence, several key trends are reshaping how organizations perceive, manage, and leverage their data assets. From the imperative of AI-ready data to the paradigm shift of treating data as a product, these trends encapsulate transformative approaches driving the future of business analytics. As businesses head into 2024, understanding and adapting to these trends is paramount for sustained growth and innovation. Let’s delve into each trend, uncovering their significance and impact for modern data organizations.

Trend 1: AI-ready data

If your data isn’t ready for AI, then your organization isn’t ready for AI. Before jumping on the AI bandwagon, companies need to take a step back and assess their data landscape. An AI algorithm’s effectiveness is tied to the quality of data it processes.

While 79% of executives plan to boost investments in AI/ML, a mere 24% consider themselves truly data-driven (Wavestone). As Forrester aptly notes, while AI may be ready for the spotlight, it’s the quality of data and analytics that truly determine if it shines. Enhancing data quality isn’t just a peripheral task; it’s a strategic move that can boost AI/ML model accuracy by a significant 20% (Forrester). So what exactly does it mean for data to be AI-ready? According to Gartner, for data to meet the AI-ready criteria, it must be secure, enriched, fair, accurate, and governed.

To help your organization navigate the complexities of preparing your data for AI, OneSix has built some free tools and resources:

Check out our comprehensive Roadmap to AI-ready Data to get familiar with the core steps
Take our 5-minute AI Readiness Assessment to evaluate your organization’s data maturity; this will be the basis for determining how to accelerate your journey to AI readiness

Trend 2: Unified business visibility

In the pursuit of unified business visibility, data analytics teams encounter a significant hurdle in finding data within the organization, with a notable 24% citing this challenge as their foremost struggle in executing data strategies during 2023 (Forrester). To address this, Generative AI will increase the trend toward centralized data that creates a single source of truth for large language models (LLMs) and everything else (Snowflake). This shift aligns with the evolving need for a consolidated and authoritative data repository to streamline organizational data.
The explosion of unstructured and semi-structured data further underscores the necessity for organizations to embrace all-in-one unified data platforms. Forrester highlights that these platforms are essential for effective cost management, support for diverse multi-structured data analytics, and enabling broader use cases and workloads.
Embedded analytics is going to play a pivotal role in building a data-driven culture. By integrating analytics capabilities and data visualizations directly into user workflows, applications, or portals, embedded analytics streamlines access to insights and provides users with a highly interactive and user-friendly data engagement experience. OneSix can help you get there, leveraging BI platforms like Power BI, Sigma, Tableau, Sisense, and Pyramid Analytics. Get in touch with us to learn more.

Trend 3: Data as a product

The evolving market trend emphasizes a transformative shift in how data is perceived— treating data as a product. This involves ensuring datasets are discoverable, addressable, self-describing, interoperable, trustworthy, and secure. The consumers of these datasets can then be other departments and organizations through data sharing technologies. This shift presents an opportunity for organizations to enrich their internal repositories by augmenting them with externally published datasets.

These external sources, such as economic indicators, population statistics, or weather data, offer a chance to enhance the depth and breadth of insights, empowering businesses with a more comprehensive and diverse informational landscape to drive informed decision-making and innovation.

Trend 4: Vector databases

Vector databases are rapidly emerging as a highly efficient storage solution tailored for the demands of AI, big data applications, and the long-term memory requirements of Large Language Model use cases. They excel at similarity-based searches, proving invaluable for critical functions like image recognition, recommendation systems, and intricate data comparisons.

What sets vector databases apart is their remarkable scalability and rapid data processing capabilities, making them ideal for real-time operations and large-scale applications. Their adaptability and speed render them not just as a viable option but as a crucial asset in the evolving landscape of data storage and utilization.

Gear up for a transformative year ahead

As we head into 2024, the world of data and AI technology is poised for significant transformation. The trends witnessed in 2023—AI-ready data, unified business visibility, cloud-first approaches, data-as-a-product paradigm, and real-time analytics—carry profound implications for businesses across all industries. Understanding and jumping into these trends is a must for any business aiming to stay ahead in this fast-moving world of data and AI.