Unlocking the power of real-time analytics: 5 key considerations

In a recent research survey from TechTarget’s Enterprise Strategy Group, we found that only 4% of organizations have the ability to get real-time insights from their data. For another 28%, it takes days, and 45% report that it takes weeks or longer for them to gain insight from their data.1 A competitive and productivity gap is growing rapidly between companies that use data effectively, delivering actionable data to automated business processes and consumer actions, and those not yet able to glean such insights from their data. While the pioneers of real-time analytics tended to be large organizations with substantial resources, cloud-native platforms now give companies of any size cost-effective analytic capabilities to close the gap and accelerate innovation and time to value.

Analytics is a top priority

As shown in Figure 1, when respondents to a recent Enterprise Strategy Group research survey were asked about the importance of analytics projects and initiatives compared to all business priorities, 40% of them stated that analytics was their most important priority. Another 35% ranked it as among their top five priorities.2 The importance of adopting analytics practices in organizations cannot be overstated.

Figure 1. 40% of Organizations Rank Analytics as Their Most Important Business Priority

Data-driven companies that adopt measurable analytic practices typically see these benefits and more:

  1. Improved decision-making. Data analytics helps organizations make informed decisions based on data-driven insights rather than intuition or guesswork.
  2. Increased operational efficiency. By having an up-to-date view of key metrics and consistently monitoring them, organizations can streamline processes and optimize resources.
  3. Customer insights and personalized experiences. Data analytics can provide valuable insights into customer behavior and preferences, enabling organizations to tailor their offerings and improve customer satisfaction.
  4. Competitive advantage. Organizations can gain a competitive advantage by leveraging data to gain a deeper understanding of their operations and markets and drive actions that boost revenue or reduce costs.
  5. Fraud detection. Data analytics can help organizations detect fraudulent activities by identifying unusual patterns or transactions in large amounts of data.
  6. Risk management. By analyzing data, organizations can identify and assess potential risks, enabling them to make informed decisions and minimize potential losses.

Real-time analytics versus batch

Batch analytics represents the more traditional form of analytics, where data is centralized, processed, and analyzed over larger time intervals. Batch analytics typically operates on data that is hours or at least tens of minutes old. In contrast, real-time analytics is optimized for low-latency analytics, ensuring that data is available for querying in seconds so that data and events can be acted upon as they are generated.

One use case for batch analytics is business intelligence reporting. Business intelligence uses historical data to report on business trends and answer strategic questions. In these scenarios, the goal is to use data to craft a strategy, not to take immediate action. Real-time data would not generally affect the result of the trend analysis, making this better suited for batch analytics.

Batch analytics use cases like business intelligence, reporting, and data science have less stringent latency requirements and, therefore, can tolerate Extract, Transform, Load (ETL) pipelines to homogenize and enrich data for analytics. In contrast, real-time use cases have low latency requirements and attempt to reduce or remove the need for ETL processes.

Many analytics systems, like data warehouses, are designed for batch analytics. Batch analytics systems process the data in batches, and data is collected and loaded into the system over time. Rather than having an “always on” design for data processing, they can restrict data processing to specific time intervals to reduce costs. Batching also helps with data compression, reducing the overall storage footprint and making periodic analytics on large-scale data economical.

In contrast, systems designed for real-time analytics have native support for semi-structured data and other modern data formats to avoid ETL processes and achieve low data latency. They are also optimized for compute efficiency to reduce the resources required to constantly process incoming data and execute high-volume queries.
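The difference in data latency between the two approaches can be sketched with a little arithmetic. The example below is a minimal illustration, not a model of any particular product: it assumes batch loads fire on a fixed schedule and complete instantly (an optimistic simplification), and it assumes a streaming system makes each event queryable after a fixed ingest delay. The function names and the one-hour and two-second figures are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta


def batch_queryable_at(event_time, batch_interval, schedule_start):
    """Return when an event becomes queryable under scheduled batch loads.

    Assumes loads fire at schedule_start, schedule_start + interval, ...
    and that each load finishes instantly. An event that arrives exactly
    on a boundary waits for the following load.
    """
    elapsed = event_time - schedule_start
    # timedelta // timedelta yields the number of whole intervals elapsed.
    intervals_elapsed = elapsed // batch_interval
    return schedule_start + (intervals_elapsed + 1) * batch_interval


def streaming_queryable_at(event_time, ingest_delay):
    """Return when an event becomes queryable under continuous ingestion."""
    return event_time + ingest_delay


schedule_start = datetime(2023, 1, 1, 0, 0, 0)
event = datetime(2023, 1, 1, 0, 10, 0)  # generated 10 minutes after a load

batch_latency = batch_queryable_at(event, timedelta(hours=1), schedule_start) - event
stream_latency = streaming_queryable_at(event, timedelta(seconds=2)) - event

print(batch_latency)   # 0:50:00
print(stream_latency)  # 0:00:02
```

With an hourly load, the event in this example cannot be queried for 50 minutes, while the streaming path exposes it after the assumed two-second delay.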

Real-time analytics leads to higher performance

Real-time analytics is in increasing demand because of the benefits it gives organizations seeking to deliver the most relevant, responsive user experiences. Real-time personalization and offers tailor experiences to users’ needs and give organizations an opportunity to increase engagement and revenue. Quick, snappy consumer experiences on real-time dashboards increase user adoption. Embedded real-time analytics offers users a better experience, where they don’t have to wait seconds to minutes for data or queries to load. They can interact easily with the data, making for a seamless user experience. Subsecond, low-latency queries also shorten time to value: users can ask several questions about the data in quick succession and reach faster, better-informed decisions, making them more productive and increasing the number of decisions they can make daily.

For operational analytics use cases, analysts can adjust their operations immediately in response to real-time views of their business, eliminating delays and inefficiencies. This shorter feedback loop also allows organizations to iterate faster when making decisions on their operations.

Time-sensitive interventions address use cases that are inherently pressing, such as catching security vulnerabilities or optimizing shipping and delivery routes. If users had to wait minutes for the data to be processed and available for querying, they would lose the window of time to make an impact. Real-time analytics makes the data available while that window is still open.

Why doesn’t everyone do real-time analytics today?

Real-time analytics offers some significant benefits over batch analytics, so why haven’t most organizations moved to a real-time paradigm? There are a number of perceived challenges around cost and complexity in getting to real-time analytics. They include:

  • Ingest and query performance. Real-time analytics requires fast ingestion of large volumes of data, often data streams, and the ability to run sub-second queries on that data. Architecting a system that can meet these requirements is inherently complex.
  • High cost. Retrofitting existing batch analytics solutions to meet data latency and query latency requirements is costly and inefficient. Many systems also have compute tightly coupled to storage, resulting in expensive overprovisioning of resources.
  • Operational complexity. Self-managing the infrastructure and software required for real-time analytics can consume significant time and effort.

5 considerations for a real-time analytics solution

Given these challenges, organizations seeking to implement real-time analytics should evaluate potential solutions on the following dimensions:

  1. Ingestion data latency. The system needs to handle high-velocity data streams. New data should typically be available for querying in 1-2 seconds.
  2. Query latency. The system needs to support complex analytics, with queries returning in milliseconds, even on terabytes of data.
  3. Compute efficiency. Look for databases that minimize the compute resources required to sustain high rates of fast ingestion and queries. For example, using a mutable database helps handle updates efficiently, while indexing helps avoid inefficient data scanning.
  4. Efficient scaling. Evaluate systems based on their ability to scale to meet bursty ingest and query workloads. Databases that can scale compute and storage separately will be more cost-effective.
  5. Operational simplicity. Minimize operational and data engineering efforts by using a fully managed cloud database to significantly reduce operational burden.

It is worth noting that cloud-native solutions offer the greatest efficiency and simplicity in meeting the latter two considerations.
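The first two considerations are measurable, so they can be turned into a simple pass/fail check during a proof of concept. The sketch below is a minimal example under stated assumptions: the 2-second ingestion limit comes from the "1-2 seconds" guidance above, the 1,000 ms query limit is an assumed reading of "sub-second", and the `Measurement` type, its field names, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Measurement:
    """Latencies observed for one candidate system (hypothetical fields)."""
    ingest_latency_s: float   # seconds from event creation to queryable
    query_latency_ms: float   # observed query latency in milliseconds


# Assumed thresholds; adjust them to your own requirements.
MAX_INGEST_LATENCY_S = 2.0
MAX_QUERY_LATENCY_MS = 1000.0


def failed_checks(m: Measurement) -> List[str]:
    """Return the names of the latency considerations the candidate misses."""
    failures = []
    if m.ingest_latency_s > MAX_INGEST_LATENCY_S:
        failures.append("ingestion data latency")
    if m.query_latency_ms > MAX_QUERY_LATENCY_MS:
        failures.append("query latency")
    return failures


candidate = Measurement(ingest_latency_s=1.4, query_latency_ms=2300.0)
print(failed_checks(candidate))  # ['query latency']
```

Compute efficiency, scaling, and operational simplicity are harder to reduce to a single number and are better judged from cost and effort observed over the course of the evaluation.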

Importance of real-time cloud analytics

Cloud capabilities are driving transformation in established companies and enabling new innovative startups. Leveraging the cloud is at the core of any digital transformation strategy. The desire to achieve competitive advantage and meet changing business requirements and customer demands drives the adoption of cloud capabilities and business models, processes, and solutions. As shown in Figure 2, 88% of recent Enterprise Strategy Group research survey respondents view the cloud as a big part of ongoing data analytics strategies, if not the basis.3 Organizations are adopting cloud-native applications to ease the burden of building and managing end-to-end data analytics strategies, ranging from data ingestion to delivering decision-making-ready data to end users.

Figure 2. Use of Cloud Analytics on the Rise

Rockset, as a cloud-native database, helps to deliver cloud capabilities for any data-driven organization across industries where real-time, data-driven decisions or actions are essential. Achieving cloud scale is an integral part of the equation. Some of Rockset’s cloud capabilities include:

  • Scalability. Cloud analytics provides the ability to scale computing resources up or down as needed without the limitations of on-premises hardware.
  • Cost-effectiveness. Cloud analytics eliminates the need for significant upfront investment in hardware and IT infrastructure as well as ongoing maintenance costs.
  • Accessibility. Cloud analytics allows organizations to access their data and analytics tools from anywhere with an internet connection, increasing collaboration and decision-making speed.
  • Data integration. Cloud analytics enables data integration from multiple sources, providing a comprehensive business view.
  • Improved security. Cloud analytics provides better security because cloud providers invest heavily in security measures to protect customer data, often exceeding the security capabilities of individual organizations.
  • Disaster recovery. Cloud analytics provides automatic disaster recovery, ensuring the availability of business-critical data and analytics tools.

Rockset is a real-time analytics database that enables companies to build cloud-scale, data-driven applications. Rockset connects to change data capture or event streams, ingests and indexes changes in real time, and provides subsecond data APIs for search, aggregations, and joins. Data is stored in RocksDB and organized in a Converged Index for faster, compute-efficient data retrieval, avoiding the full table scans that warehouses rely on. Compute-storage separation also allows for efficient scaling in the cloud with fewer compute resources.
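To see why an index saves compute compared with a full scan, consider the toy example below. It is not Rockset's Converged Index, whose internals this article does not describe; it is a generic in-memory lookup table, with hypothetical data and function names, that shows how a filter can be answered without examining every row.

```python
from collections import defaultdict

# Hypothetical rows; in practice these would be ingested documents.
rows = [
    {"id": 0, "city": "Oslo"},
    {"id": 1, "city": "Lima"},
    {"id": 2, "city": "Oslo"},
    {"id": 3, "city": "Cairo"},
]


def scan(rows, field, value):
    """Full scan: examine every row and count how many were touched."""
    examined = 0
    matches = []
    for row in rows:
        examined += 1
        if row[field] == value:
            matches.append(row["id"])
    return matches, examined


def build_index(rows, field):
    """Map each distinct value of `field` to the ids of rows containing it."""
    index = defaultdict(list)
    for row in rows:
        index[row[field]].append(row["id"])
    return index


city_index = build_index(rows, "city")

print(scan(rows, "city", "Oslo"))    # ([0, 2], 4): all four rows examined
print(city_index.get("Oslo", []))    # [0, 2]: one dictionary lookup
```

The index costs extra work at ingest time and extra storage, which is the trade a real-time system makes in exchange for cheaper queries.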

Rockset has a simple process to empower its customers, as described in Figure 4. It starts with creating an account on Rockset, a fully managed cloud service. The second step is to connect data sources, and Rockset will continuously ingest and index data from streaming sources, databases, and data lakes. All data is indexed so that results are ultimately processed faster. The third step is to save the SQL statement as a Query Lambda, and a REST endpoint is created for the data API. The final step is to hit the REST endpoint from the application’s code, and the results of analytic queries can be returned in milliseconds.
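The final step, calling the REST endpoint from application code, might look like the sketch below. Treat every specific here as an assumption to be checked against Rockset's API documentation: the URL path layout, the `ApiKey` authorization scheme, and the shape of the `parameters` payload are written from memory of the public REST API, and the host, workspace, Query Lambda name, tag, key, and parameter are hypothetical placeholders. The example only builds the request so that it can run without network access.

```python
import json
import urllib.request


def build_query_lambda_request(api_server, workspace, query_lambda, tag, api_key, parameters):
    """Build (but do not send) a POST request that executes a saved query.

    The path layout and header format are assumptions; verify them
    against the vendor's API reference before use.
    """
    url = f"{api_server}/v1/orgs/self/ws/{workspace}/lambdas/{query_lambda}/tags/{tag}"
    body = json.dumps({"parameters": parameters}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_query_lambda_request(
    api_server="https://api.example.invalid",  # placeholder host
    workspace="commons",                       # hypothetical workspace
    query_lambda="recent_orders",              # hypothetical Query Lambda
    tag="latest",                              # hypothetical tag
    api_key="demo-key",                        # placeholder credential
    parameters=[{"name": "region", "type": "string", "value": "EU"}],
)

print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen(request)` and decode the JSON response. Keeping the SQL on the server behind a named endpoint means the application ships only parameter values, not query text.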

Figure 4. Rockset Speed and Simplicity

Rockset with Intel

In order to reach a new level of performance, Rockset has adopted 3rd Gen Intel® Xeon® Scalable processors with built-in AI accelerators through a strategic collaboration with Intel and the Intel Disruptor Program.4 The 3rd Gen Intel Xeon Scalable processor has enabled Rockset to push the limits of its real-time analytics database, providing customers up to 84% more throughput for data applications, according to Rockset.

Rockset’s real-time analytics database is built for subsecond analytics on streaming data. Hundreds of modern data applications, including personalization engines, logistics tracking, game monetization, anomaly detection, and IoT applications, are powered by Rockset. The Rockset Converged Index contributes to its performance by accelerating multiple types of queries regardless of the shape of the data.

Use cases of organizations that benefit from Rockset

Organizations are using Rockset to build the real-time analytic applications they need to innovate, increase profitability, and streamline their business. Uses of Rockset include real-time insurance pricing at Allianz Direct, developer-facing dashboards at Meta, real-time personalization at Whatnot, and powering a logistics SaaS platform at Command Alkon. ESG interviewed staff at the following Rockset customers to understand the value they are realizing.

Windward (windward.ai)

Windward is the leading predictive intelligence company that fuses AI and big data for the global maritime industry through its 360° risk management solution. The Windward platform is a single interface for maritime risk management and domain awareness needs with comprehensive data and advanced technology to solve the toughest maritime challenges. By using maritime domain expertise, augmented by AI and machine learning, Windward empowers its partners to predict what’s ahead, minimizing uncertainty and helping to build resilient organizations.

To understand how Windward uses Rockset, we spoke with Benny Keinan, Vice President of Research & Development. We learned that Windward uses 15-plus unique AI models, 2.2M daily vessel activities, 2,000 weekly risk indicators, 18-plus data sources, and analysis of over 1,400 container ports and over 5,500 container vessels to provide actionable maritime insights. Windward partners use its single platform for actionable predictive intelligence of maritime domain awareness, supply chain management, container tracking, law enforcement, defense, and many more use cases, including risk and compliance management.

Over its 12 years in business, Windward has seen an explosion of data and maintains seven years of historical data live in systems and more than 10 years of archived data. Keinan shared the challenges they were facing before implementing Rockset. Data was analyzed in a batch process, where they would repeatedly analyze the same data sets looking for different insights for different use cases. As the scale of the data grew, the analysis took longer each time. Schedulers were used for this process, and in some cases, a query would be performed hourly on two years of data but not completed within the hour. It was time for a change, both to keep up with growing data volume and existing use cases and to support additional use cases that were impossible at the time.

Windward started a proof of concept (POC) process to test different databases. It found that many were good at a single use case but did not support the necessary multiple use cases, such as working with streaming data, historical data sets, and new data sets that required reanalyzing data regularly instead of just once. Geo queries were also a required feature, which eliminated some of the competitors. After a four-month POC, Rockset was chosen due to its flexibility in managing all Windward use cases.

The first significant change was to switch from a batch process to a streaming process for incoming live data, and Rockset reduced the time for data analysis in this use case. The second significant change focused on historical, current, and future data sets. Windward uses NoSQL MongoDB, which could be replicated on top of Rockset without needing to change the schema. Windward also uses relational databases, which could be replicated on top of Rockset. In both cases, Windward did not need to redefine data models to use Rockset, which was not the case with most other solutions they looked at in the POC.

The result is a streamlined process to meet all of their required use cases. Latency decreased from minutes to sub-seconds for some critical queries, making some impossible queries now possible, which allowed them to expand to larger geographical areas and include deeper history in terms of years of data. This unblocking of use cases had an immediate effect, enabling them to innovate faster and close new sales opportunities in the first months of deployment, setting up the company to innovate faster on new opportunities as they are uncovered. Rockset manages tens of millions of “blips” (e.g., raw data such as a ship), entities (e.g., who owns the ship), and insights (e.g., where it is going).

Xometry

Xometry provides a marketplace for on-demand manufacturing driven by AI. Xometry uses its proprietary technology to enable buyers to source manufactured parts and assemblies through an efficient process that allows fast pricing, availability, and shipping from a network of global manufacturing services companies, ranging from startups to Fortune 100 companies.

To understand how Xometry uses Rockset, we spoke with Graham Taylor, Xometry software integration engineer. We learned that for years the on-demand manufacturing industry suffered from a lack of consistent pricing, driven mainly by existing manufacturing sourcing and procurement processes that were complex, uncertain, costly, and time-consuming.

Xometry has developed the means to generate an instant and accurate price for buyers, allowing sellers to source curated manufacturing opportunities that match their specific processes and capacity. AI is used to accurately and quickly price part designs and lead times and to match them to the appropriate sellers. This allows Xometry to combine part features with data gathered from the financial transactions conducted in the marketplace to construct and continually improve prices across a wide range of designs, materials, and sizes.

Xometry had a latency issue affecting the performance and delivery of decision-making content to users. Xometry turned to Rockset to solve their challenges. The initial decision to go with Rockset was based on side-by-side testing with competitors; Rockset had the fastest streaming capabilities and was chosen. Rockset has been implemented as a real-time analytics tool for data streaming from MongoDB, data normalization, and data transformation.

Taylor explained that Xometry has a complex nested MongoDB data structure. Xometry is focused on utilizing Rockset’s inherent streaming power, using it almost like a data warehouse to perform SQL transformations that normalize complex data structures. They then use the Lambda functions that Rockset exposes to query and send data to other places, including Salesforce, where a catalog of available products is maintained. Xometry has many MongoDB collections and nested structures, which are normalized using auto-mapping and transformed by Rockset. Another core Rockset feature is timestamping, which allows Xometry to know when data structures have changed. For a team already knowledgeable in SQL, the Rockset query engine feels very familiar.
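Normalizing a nested document means turning embedded objects and arrays into flat rows that are easy to join and aggregate. Xometry does this with SQL transformations inside Rockset; the sketch below shows the same idea in plain Python purely for illustration. The document shape, field names, and helper functions are hypothetical and are not drawn from Xometry's data.

```python
def flatten(doc, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in doc.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def unnest(doc, array_field):
    """Emit one flat row per element of doc[array_field].

    Each row repeats the parent's other fields, much as an unnest-style
    SQL operation would. Array elements are assumed to be dicts.
    """
    parent = {k: v for k, v in doc.items() if k != array_field}
    rows = []
    for item in doc.get(array_field, []):
        row = flatten(parent)
        row.update(flatten(item, array_field))
        rows.append(row)
    return rows


# Hypothetical nested order document.
order = {
    "_id": "o-1",
    "customer": {"name": "Acme", "region": "EU"},
    "parts": [
        {"sku": "p-10", "qty": 2},
        {"sku": "p-11", "qty": 5},
    ],
}

order_rows = unnest(order, "parts")
for row in order_rows:
    print(row)
```

Each printed row carries the order id and customer fields alongside one part, giving downstream consumers a tabular shape.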

Xometry found that Rockset’s native streaming cut data pipeline latency from minutes to seconds. They also moved from prototype to production in a matter of weeks and genuinely valued the hands-on relationship with the Rockset team, which is invested in their success.

Conclusion

We live in a data-driven world, and the use of data to drive decisions and actions will only grow in importance over the coming years. Organizations must embrace this reality and look for solutions to accelerate their data analytics initiatives. At Enterprise Strategy Group, our research shows a growing number of organizations seeing the use of data as strategic, with 44% stating that “data helps to support our business” and another 21% going even further to say that “data is our business” and emphasizing the importance across the organization today and for the future.5 This includes e-commerce, applications, IoT, 5G, geospatial, and more. The sources and volumes of data keep growing for organizations, and we also see this growth in the number of internal and external data users. This growth on both sides of the data workflow underscores the need for solutions that can keep pace.

The two use cases from Xometry and Windward demonstrate the power of Rockset in helping to transform how these companies manage data. In both cases, the companies knew their goals but struggled to achieve them. Rockset was able to unblock the issues, streamline processes, and dramatically increase performance, leading to new innovations and faster access to curated data for real-time analytics. Both organizations saw very positive outcomes from working with Rockset.

Rockset has emerged as a leader in real-time data analytics databases designed to help any size organization accelerate their use of data at cloud scale. Rockset has cost-effectively simplified the process of collecting and ingesting data, processing and storing the data, and delivering real-time analytics to empower businesses and consumers. If you are serious about using data to drive your business, innovate faster, and increase productivity, we highly recommend contacting Rockset for a demonstration and discussion.

Sign up for a demo.

1 Source: Enterprise Strategy Group Research Report, Cloud Analytics Trends, March 2022.

2 Source: Enterprise Strategy Group Complete Survey Results, Cloud Analytics Survey, November 2021.  

3 Source: Enterprise Strategy Group Research Report, Cloud Analytics Trends, March 2022.  

4 Source: Rockset Press Release, Rockset Achieves 84% Faster Performance For Real-Time Analytics With Intel Xeon Scalable Processors, November 2022.  

5 Source: Enterprise Strategy Group Research Report, The Evolution of Intelligent Data Management, April 2022.

Copyright © 2023 IDG Communications, Inc.
