Real Time Data and Analytics

Benjamin Wootton

The vast majority of Business Intelligence and Analytics solutions in place today operate on out-of-date, backwards-looking, historical data.

When you are operating on strategic timeframes and asking long-term questions, e.g. about last quarter's or last week's sales data, this is absolutely fine. The benefits of analysing up-to-the-minute data are minimal.

However, there are many scenarios where businesses are looking to move towards much more real-time solutions, e.g.:

  • Using data for operational purposes, for instance guiding employees' "next best action" continually throughout the day;
  • Real-time monitoring of critical KPIs and indicators in order to identify problems and opportunities early;
  • Optimising or facilitating the customer experience in real time;
  • Security, compliance, regulatory or safety controls where the business is at risk due to slow data processing.

In all of these situations, the value of data decays over time. The earlier we can get it into the hands of our employees, algorithms and customers, the better - even down to millisecond granularity.

The Challenges Associated With Real-Time Data

Moving from traditional Business Intelligence towards more real-time processing is, however, a challenging technical problem to solve.

Much of the data and analytics world is based on centralised data warehouses, with infrequent batch ETL jobs loading data on an hourly or even daily basis. Once the data is loaded, the main consumption model is through reports and dashboards which may be infrequently accessed by senior leaders in the business.

Businesses need to modernise from this situation towards "streaming" events in real time, where they are brought into centralised data lakes and stream processors and can be analysed and processed proactively and automatically.
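
To make this concrete, the sketch below publishes a single business event onto a Kafka topic using the kafka-python library. It is a minimal illustration rather than a production pattern, and the broker address, topic name and event schema are all assumptions made for the example.

import json
from datetime import datetime, timezone

from kafka import KafkaProducer  # pip install kafka-python

# Connect to an assumed local broker and serialise events as JSON.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# An "order placed" event, emitted the moment it happens rather than
# waiting for a nightly batch job to pick it up.
event = {
    "event_type": "order_placed",
    "order_id": "ORD-1001",
    "amount": 99.95,
    "occurred_at": datetime.now(timezone.utc).isoformat(),
}

producer.send("orders", value=event)
producer.flush()  # block until the broker has acknowledged the event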

Why Now?

Companies across industries are investing in becoming more data-driven, using their data more intelligently and in real time to improve their customer experience and business efficiency. Well-trodden case studies include technology companies such as Google, Amazon, Uber and Netflix, who use their vast quantities of data to offer amazing digital experiences. Companies that don't make this leap will, over time, likely fall behind in terms of customer experience and operational efficiency.

As this is a complex technology modernisation journey, we believe it is worth starting early. Look to implement one or two streaming use cases in the business, create a microservice to drive some real-time change, and give your employees a taste of real-time data in the user experience. The benefits will become apparent as soon as the first feature or two hits production.

As your business operates day to day, a number of events are taking place. Examples include orders, dispatches, customer enquiries or complaints, data from connected devices and, of course, events more specific to your industry.

To build a truly compelling customer experience and an efficient business, we need to be able to monitor, analyse and react to these events in real time, and make automatic changes in response to them.

Unfortunately, many businesses do not have this real time data processing capability. Instead, their data is trapped in siloed systems and perhaps integrated into a data warehouse with batch “extract, transform and load” where it is then analysed by humans days or weeks after the events have happened.

Real Time is about moving beyond this, using modern data and analytics to understand what is happening “right now” across your business, and responding to situations immediately, intelligently and automatically in order to improve business performance.

The Benefits Of Real Time Data & Analytics

By processing data in real time, businesses can dramatically improve their performance and bottom line.

In addition to using data for strategic long-term decision making, you'll be able to leverage it for operational analytics, identifying the immediate steps you can take right now to grow your business and improve efficiency.

This of course improves the customer experience, which becomes more proactive and more personalised through data insights.

Likewise, the employee experience can be improved by arming your people with an “up to the minute” view of what is taking place right now, advising them on their “next best action”.

Using streaming real time data, we can also detect anomalies and situations of interest as they happen, and ideally intervene quickly or automatically before they ever impact a KPI.
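
As a flavour of what this looks like, the sketch below flags anomalous readings in a live stream using a rolling mean and standard deviation. It is deliberately simplistic, and the window size and three-sigma threshold are assumptions chosen purely for illustration.

from collections import deque
from statistics import mean, stdev

class RollingAnomalyDetector:
    """Flags values more than `threshold` standard deviations from the
    rolling mean of the most recent `window` observations."""

    def __init__(self, window: int = 100, threshold: float = 3.0):
        self.values = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value: float) -> bool:
        """Record a new reading and return True if it looks anomalous."""
        is_anomaly = False
        if len(self.values) >= 10:  # wait for a minimal baseline first
            mu, sigma = mean(self.values), stdev(self.values)
            if sigma > 0 and abs(value - mu) > self.threshold * sigma:
                is_anomaly = True
        self.values.append(value)
        return is_anomaly

detector = RollingAnomalyDetector()
readings = [10.2, 10.1, 9.9, 10.0, 10.3, 10.1, 9.8, 10.0, 10.2, 9.9, 55.0]
for reading in readings:
    if detector.observe(reading):
        print(f"Anomaly detected: {reading}")  # intervene before the KPI moves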

This all ultimately feeds through to increased market share and revenue, and decreased costs by operating a more efficient business.

The Challenges Associated With Real Time Analytics

Real Time Data processing is a very valuable capability. However, technically it is a hard problem to solve.

First, it is a data-intensive task, potentially requiring thousands of events to be processed in parallel and with low latency. Load can also vary suddenly, such as a spike in user activity or a burst of machine data which needs to be ingested without delay.

Many scenarios also need to be very accurate, in some instances providing exactly-once processing such that we never lose or double-process a message. To achieve this, every part of the technology stack needs to be resilient to failure.

The analytics we need to perform over real-time data can be complex in nature. For instance, we might need to aggregate data and ask questions across data streams and time windows, spanning both historical and real-time data. We also need to deal with edge cases such as errors, anomalies, and duplicate or late-arriving data.
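
To illustrate the kind of logic involved, here is a small, framework-free sketch of a tumbling-window aggregation that deduplicates events and tolerates late arrivals. A real deployment would normally lean on a stream processor for this, and the window length and lateness allowance are assumptions.

from collections import defaultdict

WINDOW_SECONDS = 300    # 5-minute tumbling windows (assumed)
ALLOWED_LATENESS = 60   # accept events up to 60 seconds late (assumed)

windows = defaultdict(float)  # window start time -> running total
seen_ids = set()              # seen event ids, for deduplication
watermark = 0                 # highest event time observed so far

def process(event_id: str, event_time: int, amount: float) -> None:
    """Add one event to its tumbling window, skipping duplicates and
    events that arrive after the lateness allowance has expired."""
    global watermark
    if event_id in seen_ids:
        return  # duplicate delivery: ignore to avoid double-processing
    seen_ids.add(event_id)
    window_start = event_time - (event_time % WINDOW_SECONDS)
    if watermark > window_start + WINDOW_SECONDS + ALLOWED_LATENESS:
        return  # too late: in practice, route to a side output for review
    windows[window_start] += amount
    watermark = max(watermark, event_time)

process("e1", 100, 10.0)
process("e1", 100, 10.0)  # duplicate, ignored
process("e2", 320, 5.0)   # falls into the second window
process("e3", 290, 2.5)   # late event for the first window, still accepted
print(dict(windows))      # {0: 12.5, 300: 5.0}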

Individually, all of these problems are solvable, but deploying real-time event stream processing with good performance and reliability, whilst supporting the type of complex analytics we need, can be a large undertaking.

From Dashboards to Automated Responses

The immediate opportunity is to expose real time analytics to your users through dashboards and reports, so that they can see operationally what is taking place in the business.

This turns business intelligence from a backwards-looking activity into something which can guide your employees' “next best actions” in order to improve business outcomes.

However, the real value starts to come when we automate responses to situations before they even impact a KPI. For instance, having identified that a certain warehouse is shipping orders late, we might wish to re-route orders to another destination until the original warehouse has caught up.
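
As a sketch of what this automation might look like, the snippet below checks a warehouse's late-shipment rate and calls an internal order-routing endpoint when it falls behind. The endpoint, payload shape and 20% threshold are hypothetical, chosen purely to illustrate the closed loop.

import requests  # pip install requests

LATE_RATE_THRESHOLD = 0.20  # assumed tolerance for late shipments
ROUTING_API = "https://internal.example.com/routing/overrides"  # hypothetical endpoint

def check_warehouse(warehouse_id: str, shipped_late: int, shipped_total: int) -> None:
    """Re-route new orders away from a warehouse whose late-shipment
    rate has crossed the threshold."""
    if shipped_total == 0:
        return
    if shipped_late / shipped_total > LATE_RATE_THRESHOLD:
        # Divert new orders to a fallback site until the backlog clears.
        response = requests.post(
            ROUTING_API,
            json={"divert_from": warehouse_id, "divert_to": "fallback-site"},
            timeout=5,
        )
        response.raise_for_status()

# In practice this would be driven by a live stream of shipping events:
check_warehouse("warehouse-east", shipped_late=27, shipped_total=100)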

This is the definition of an intelligent business, where we are observing the state of the world, processing data intelligently, and automating our responses. Streaming Analytics are the basis for doing this.

Dashboards

Most analytics and business intelligence programmes target reports and dashboards as a means of eventually communicating the insights to users.

Users need to log in to access those reports, which are updated infrequently, sometimes with hours or days of delay. They then need to interpret the data and action the recommendations, perhaps communicating with other people in their organisation to action the findings manually.

This end-to-end process, full of human dependencies, introduces a long delay between identifying the insight and having it actioned in the real world.

This setup is simply not good enough for today's fast-paced business world. In almost all cases, the earlier we can use the insight the better, and the gold standard should be real time:

  • If we notice that an order has just passed its SLA for being shipped, we should instantly trigger an alert rather than wait for the next batch;
  • If a customer cancels their order, an event should immediately be sent to the warehouse to prevent it from being shipped and then returned;
  • If a potentially fraudulent transaction is logged, we should identify this before accepting the payment;
  • If a hotel experiences a spike in demand, prices should instantly be adjusted in order to maximise revenue.

As the examples above highlight, ideally we would like to identify some situation and then action the response automatically, using an API call or some other integration directly into the business process or user experience.

To get there, Data and Analytics needs to evolve as a field. Putting insights onto a dashboard or report is simply not fast enough, because it relies on a human in the chain to review and action them. Instead, we should be moving towards real-time intelligence to decide on the next best action, and then automating our responses in real time.

This idea was further discussed in our post on closed loop analytics.

In many situations, the earlier we respond to incoming data the better. This might be a genuinely real-time situation such as a self-driving car, a trading system or a fraud check, or a more vanilla business scenario such as a product going out of stock, which we want to inform our users about as soon as possible.

The value of data is said to decay over time. The sooner we can respond as a business, the sooner we can use the data to improve the customer experience, operate more efficiently or capture revenues. If too much time passes after capturing the data, these opportunities fall away exponentially.

For this reason, many companies are looking to process their data much faster, if not in real time, as part of their digital transformation ambitions.

This can, however, be technically challenging with traditional approaches to data engineering and business intelligence, which are based around periodic delivery of batch data and relatively simple slice-and-dice analysis once it's received.

The first thing companies need to do is refresh and re-engineer their data platforms to deliver data faster. This could involve something simple like more frequent extract, transform and load from source systems, or something more complex such as moving to a streaming architecture. This data would then commonly be ingested into storage such as a data warehouse or data lake, and made visible through reports and dashboards earlier than it has been historically.
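
As a simple sketch of the first option, the snippet below performs an incremental extract and load using a high-watermark timestamp, so that each run copies only new rows and could be scheduled every few minutes rather than nightly. SQLite is used to keep the example self-contained, and the table names and schema are assumptions.

import sqlite3

source = sqlite3.connect("source.db")        # the operational system (assumed schema)
warehouse = sqlite3.connect("warehouse.db")  # the analytics store

warehouse.execute(
    "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, amount REAL, created_at TEXT)"
)
warehouse.execute("CREATE TABLE IF NOT EXISTS etl_state (last_loaded_at TEXT)")

row = warehouse.execute("SELECT last_loaded_at FROM etl_state").fetchone()
watermark = row[0] if row else "1970-01-01T00:00:00"

# Extract only rows created since the last successful run.
new_rows = source.execute(
    "SELECT id, amount, created_at FROM orders WHERE created_at > ? ORDER BY created_at",
    (watermark,),
).fetchall()

if new_rows:
    warehouse.executemany(
        "INSERT OR REPLACE INTO orders (id, amount, created_at) VALUES (?, ?, ?)",
        new_rows,
    )
    # Advance the watermark to the newest row we just loaded.
    warehouse.execute("DELETE FROM etl_state")
    warehouse.execute("INSERT INTO etl_state VALUES (?)", (new_rows[-1][2],))
    warehouse.commit()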

For many companies and business scenarios, slightly faster delivery of data into the hands of business users might be enough. If you have a few tens of thousands of rows in a relational database, putting a dashboard over the top and getting a relatively real-time view of the business is feasible. It's effectively a minimum maturity level for real-time analytics, though.
