Business Intelligence

In this article, I decided to take a look at Business Intelligence.  So much is written about the topic that it feels a little like saying I’m going to write about religion.  There are so many frameworks, concepts and belief systems that you have to ask yourself: which one do I follow?  There are vendor-based articles, white papers, academic writings and journals, all attempting to pitch or sell their version of the truth.  I’m not sure at this point whether I’m agnostic or have chosen a path to follow.

But let’s start out by looking at what Business Intelligence is:

“Business Intelligence is a term used by hardware and software vendors and information technology consultants to describe the infrastructure for warehousing, integrating, reporting, and analysing data that come from the business environment, including big data.” (Laudon & Laudon, 2014, p 492).

“So, stripped to its essentials, business intelligence and analytics are about integrating all the information streams produced by a firm into a single, coherent enterprise-wide set of data, and then, using modelling, statistical analysis tools (like normal distributions, correlation and regression analysis, Chi square analysis, forecasting, and cluster analysis), and data mining tools (pattern discovery and machine learning), to make sense out of all these data so managers can make better decisions and better plans, or at least know quickly when their firms are failing to meet planned targets.” (Laudon & Laudon, 2014, p 492).

So what does this mean?  We want to take some data, turn it into information, create some knowledge and make informed decisions and choices.
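To make the data-to-decision idea a little more concrete, here is a minimal sketch using two of the statistical tools named in the quote above, correlation and regression.  The spend and sales figures are invented purely for illustration, and the function names are my own; a real BI tool would do far more than this.

```python
from statistics import mean

# Hypothetical monthly figures (in thousands): marketing spend and sales.
spend = [10, 12, 15, 17, 20, 22]
sales = [110, 118, 135, 142, 160, 171]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    slope = cov / var_x
    return slope, my - slope * mx

# Data -> information: how strongly do spend and sales move together?
r = pearson_r(spend, sales)
slope, intercept = fit_line(spend, sales)

# Information -> decision support: a naive forecast for a planned spend of 25.
forecast = slope * 25 + intercept
print(f"r = {r:.3f}, sales ~ {slope:.2f} * spend + {intercept:.2f}, forecast at 25: {forecast:.1f}")
```

A strong correlation here only says the two series move together; whether spending more would actually cause more sales is a separate question for the business to judge.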

data to insight image

A more complex version of this can be seen in the framework below.  I find visualising information a much stronger way of understanding a concept.

http://www.slideshare.net/arunvanlvanoor/business-intelligence-14961814

business-intelligence-framework

My earlier blogs looked at the power of visualisation using Google Fusion Tables, and at R for statistical analysis and visualisation, so I thought it might be useful to continue with this theme.  But before I go any further, I can’t talk about visualisations without mentioning my three favourite visualisation websites: http://www.informationisbeautiful.net/ , http://flowingdata.com/ and http://fivethirtyeight.com/.

BI Tools for Analytics and Visualisation

BI tools can bring so much to the business (remember what we are trying to do: take some data, do something interesting and make decisions).  The problem nowadays is the volume, variety and velocity of the data.  Enhanced reporting, speed and analytics can:

  • Drive sales through better forecasting and sales team performance
  • Improve customer satisfaction through enhanced call centre capabilities
  • Optimise manufacturing processes, driving operational efficiency
  • Improve marketing campaign effectiveness and competitive advantage
  • Strengthen financial analytics

The list is endless in terms of delivering business benefits.

The Gartner Magic Quadrant highlights a number of available tools and technologies for data analysis and visualisation.  The report discusses a potentially growing gap between traditional vendor products such as SAS, Oracle and IBM and the growth of products such as Tableau and Qlik.  Gartner reports that businesses are choosing products like Tableau and Qlik for their ease of use over other products which are potentially more fit for purpose.  So in this context, let’s take a look at a couple.

Gartner Magic quadrant for BI & Analytics Platforms

 

Tableau

According to the Wall Street Journal, Tableau was born out of the simple idea that databases should generate pictures instead of a bunch of numbers.  It is business intelligence software that helps people see and understand their data.  Tableau comes in a number of different flavours depending on the size of your organisation, including Tableau Server, Tableau Online and Tableau Public, to mention just a few.  Industry experts agree that Tableau is head and shoulders above the competition among data visualisation tools that are easy to implement and use.

tableau dashboard

Birst

Birst’s website tagline of “The Best of Both Worlds – Enterprise BI with Blazing Fast Data Discovery” encompasses what Business Intelligence is all about.  Birst claim to offer “the only enterprise business intelligence platform that connects together the entire organization through a network of interwoven virtualized BI instances on-top a shared common analytical fabric. Birst enterprise BI delivers the speed, self-service, and agility front-line business workers demand, and the scale, security, and control to meet rigorous corporate data standards. Birst delivers all of this and much more with low TCO via public or private cloud configurations.” https://www.birst.com/why-birst/.

Forrester, a leading research and advisory firm, seems largely to agree with Birst’s claim (see the Forrester report).

 

Forrester Wave – Cloud BI platforms forrester wave image

It’s not my intention in this article to provide a list of pros and cons of different software and hardware solutions; it is merely to highlight a couple of options.  Choosing the right solution for your Business Intelligence needs really depends on the type of business you have, its size, nature and geographical dispersal, and the available budget.  Ultimately it will come down to your business needs.  Do you want significant predictive analytics?  Do you have large volumes of unstructured data?  Maybe you don’t really know.  Think about the data that you have currently, what you are likely to have in the future and what you would like to do with it.  There are endless resources available to help you consider what approach to take and how you can achieve Return on Investment.  I think these links to calculating ROI are interesting.

 

References


Data Quality – Garbage in Garbage out

While reviewing the content of this course, “Data Management and Analytics”, and considering my next report topic, it occurred to me that there is a very strong central theme throughout the course: “Data”.  OK, so this is stating the blindingly obvious, but it does underpin nearly everything in the business world.  But it is not just data though, is it?  Data is simply a series of characters, a mixture of letters and digits, until we put some context to it.  Ultimately, it’s what we do with data, and how and where we do it, that gives it any form of realistic meaning.  The buzzwords of the decade, including Big Data, Data Analytics and Business Intelligence, are all reliant on data.  All of these trends would be useless without data and, more importantly, without meaningful data.

The quality of the data we use determines and underpins the success, or lack thereof, of our daily decisions.  It is for this reason that I believe data quality should be front and centre of the buzzwords for the decade.

Data Quality

There are well-recognised papers by industry experts that advocate four core dimensions of Data Quality.  Nancy Couture, in her paper “Implementing an Enterprise Data Quality Strategy” (2013), suggests “fitness for use” as a broad definition when considering a data quality assessment programme.  The paper suggests that, rather than trying to focus on every dimension, you start with the basics of completeness and timeliness and then move on to validity and consistency.

Components

Below is a simple illustration of the dimensions of data quality.

data-quality-dimensions

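As illustrated above, there are six core dimensions to data quality: completeness, timeliness, validity and consistency, as mentioned earlier, plus integrity and accuracy.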

Completeness can be described as the expected comprehensiveness of the data.  Data can be complete even if optional data is missing.  For example, customer contact information should hold name, address and phone number as mandatory fields, but could treat the customer’s middle initial as optional.  Remember though that data can be complete but not accurate.
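A completeness check along these lines can be sketched in a few lines of Python.  The field names and sample records below are hypothetical; the only point is that mandatory fields are checked and optional ones are not.

```python
# Hypothetical field lists for a customer contact record.
MANDATORY = ("name", "address", "phone")
OPTIONAL = ("middle_initial",)  # listed for documentation only; never checked

def missing_mandatory(record):
    """Return the mandatory fields that are absent or blank in a record."""
    return [f for f in MANDATORY if not str(record.get(f) or "").strip()]

def is_complete(record):
    """A record is complete when no mandatory field is missing."""
    return not missing_mandatory(record)

customers = [
    {"name": "A. Murphy", "address": "1 Main St", "phone": "555-0100"},
    {"name": "B. Byrne", "address": "", "phone": "555-0101", "middle_initial": "J"},
]

for c in customers:
    status = "complete" if is_complete(c) else f"missing {missing_mandatory(c)}"
    print(c["name"], "->", status)
```

Note that the first record passes even though it has no middle initial, while the second fails on a blank address.  Neither result says anything about whether the values are accurate.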

planet field cartoon

Timeliness: “Delayed data is data denied”.  Timeliness is really about having the right information at the right time.  User expectation drives timeliness.  For example, income tax returns are due on a certain date; filing late returns incurs a penalty.  In the good old days, we went to a travel agent to book a holiday.  Nowadays, the user expectation is to be able to see real-time availability and price.  We suffer real frustration in decision making when we occasionally come across a system where real-time information is not available.  According to Jim Harris of Information-Management, due to the increasing demand for real-time data-driven decisions, timeliness is the most important dimension of data quality.
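Since expectation drives timeliness, one way to think about it is as a maximum acceptable age per kind of data.  The sketch below assumes two invented data kinds and thresholds; real expectations would come from the users themselves.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical expectations: how old each kind of data may be before it is stale.
MAX_AGE = {
    "flight_availability": timedelta(minutes=1),
    "monthly_sales_report": timedelta(days=31),
}

def is_timely(kind, last_updated, now=None):
    """True if the data is still within the age users expect for its kind."""
    now = now or datetime.now(timezone.utc)
    return now - last_updated <= MAX_AGE[kind]

# A fixed "now" keeps the example repeatable.
now = datetime(2016, 1, 1, 12, 0, tzinfo=timezone.utc)
ten_minutes_ago = now - timedelta(minutes=10)

print(is_timely("flight_availability", ten_minutes_ago, now))   # False
print(is_timely("monthly_sales_report", ten_minutes_ago, now))  # True
```

The same ten-minute-old data is stale for a booking site and perfectly fresh for a monthly report.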

Consistency of data refers to data across the organisation being in sync.  Identical information should be available across all processes and departments in an organisation.  This can be difficult to achieve where there are multiple processing systems taking information from potentially different sources.  A Master Data Management (MDM) strategy seeks to address inconsistency.  In database parlance, consistency problems may arise during database recovery situations.  In this case it is essential to understand the back-up methodologies and how the primary data is created and accessed.
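A very simple consistency check compares the same records as held by two systems.  The two extracts below (a CRM and a billing system) are made up for illustration, as are the function names.

```python
# Hypothetical extracts of the same customers from two systems, keyed by customer id.
crm = {
    101: {"email": "a@example.com", "city": "Dublin"},
    102: {"email": "b@example.com", "city": "Cork"},
}
billing = {
    101: {"email": "a@example.com", "city": "Dublin"},
    102: {"email": "b@example.org", "city": "Cork"},
    103: {"email": "c@example.com", "city": "Galway"},
}

def find_inconsistencies(left, right):
    """Return (id, field, left_value, right_value) where shared records disagree."""
    issues = []
    for key in sorted(left.keys() & right.keys()):
        for field in sorted(left[key].keys() & right[key].keys()):
            if left[key][field] != right[key][field]:
                issues.append((key, field, left[key][field], right[key][field]))
    return issues

def only_in_one(left, right):
    """Return ids that exist in one system but not the other."""
    return sorted(left.keys() ^ right.keys())

print(find_inconsistencies(crm, billing))
print(only_in_one(crm, billing))
```

The check only reports disagreement; deciding which system holds the correct value is exactly the kind of question an MDM strategy has to answer.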

Validity – Is the data itself valid?  Validation rules are required to ensure that data is captured in a particular manner, that the detail is valid, and that the same fields are used consistently to capture the same information.  Nancy Couture describes validity as “correctness” of the actual data content.  This is the concept that most data consumers think about when they envision data quality.
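Validation rules can be expressed as one small check per field.  The rules below are deliberately simplistic and entirely my own assumptions (real phone and email validation is far more involved); they only illustrate the idea of a rule per field.

```python
import re

# Hypothetical, deliberately simple rules: one predicate per field.
RULES = {
    "phone": lambda v: re.fullmatch(r"\+?[0-9][0-9 \-]{6,14}", v) is not None,
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "country": lambda v: v in {"IE", "GB", "US"},  # two-letter codes only
}

def invalid_fields(record):
    """Return the fields present in the record whose values break their rule."""
    return [f for f, rule in RULES.items() if f in record and not rule(record[f])]

good = {"phone": "+353 1 555 0100", "email": "x@example.com", "country": "IE"}
bad = {"phone": "call me", "email": "not-an-email", "country": "Ireland"}

print(invalid_fields(good))
print(invalid_fields(bad))
```

Note that “Ireland” is rejected here not because it is wrong, but because the rule insists the country field always holds a code, which is the “same field, same information” point above.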

Integrity refers to data that has a complete or whole structure, i.e. overall completeness, accuracy and consistency.  The business rules define how pieces of data relate to each other in order to define the integrity of the data.  Data integrity is usually built into database design with the use of entity and referential integrity rules.
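To show what entity and referential integrity rules look like in practice, here is a small sketch using Python’s built-in sqlite3 module and an in-memory database.  The customer and purchase tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Entity integrity: every customer has a unique primary key.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# Referential integrity: every purchase must point at an existing customer.
conn.execute("""
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        amount REAL NOT NULL
    )
""")
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'A. Murphy')")
conn.execute("INSERT INTO purchase (customer_id, amount) VALUES (1, 25.0)")

def try_insert(sql):
    """Run an insert and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

# Duplicate primary key breaks entity integrity.
print(try_insert("INSERT INTO customer (id, name) VALUES (1, 'Duplicate')"))
# A purchase for a customer that does not exist breaks referential integrity.
print(try_insert("INSERT INTO purchase (customer_id, amount) VALUES (99, 10.0)"))
```

The point is that the database itself refuses the bad rows, so the rule does not depend on every application remembering to check.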

Accuracy.  Data values stored for an object are the correct values.  It may seem an obvious component of data quality, but the data that is captured needs to be correct, i.e. accurate.  There are two aspects.  One is that the information is recorded correctly, without typos or data entry errors.  The second is that data needs to be represented in a consistent and unambiguous form.  For example, a date of birth recorded as 12/10/1972 could be US style (10 December 1972) or European style (12 October 1972).  So when is the birthday?  Good database design should resolve issues of this nature.
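The date problem is easy to demonstrate.  In the sketch below, the same string is parsed under both conventions and gives two different birthdays; storing the value in ISO 8601 form (year-month-day) is one common way to remove the ambiguity, though I am assuming here that we know which convention the source used.

```python
from datetime import datetime

raw = "12/10/1972"

# The same text, read under two conventions, gives two different dates.
as_us = datetime.strptime(raw, "%m/%d/%Y").date()        # month first: 10 December 1972
as_european = datetime.strptime(raw, "%d/%m/%Y").date()  # day first: 12 October 1972

print(as_us.isoformat())        # 1972-12-10
print(as_european.isoformat())  # 1972-10-12

# Assuming the source system was European, store the unambiguous ISO form.
stored = as_european.isoformat()
```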

cartoon - metadata

Business Benefits

Data Quality, as a subset of Data Management, is aligned with Master Data Management (MDM) and Data Governance.  They all focus on data as an asset to the business.  Modern businesses seek a Return on Investment (ROI) from their Data Management strategies.

Data Analytics: With quality data, we can undertake sound analysis of the business and improve the quality of decision making which in turn improves business performance.  The business can investigate potentially new areas of revenue not previously considered.

Timeliness of good data and analytics affords new opportunities to reach the market with new offerings ahead of the competition.  Further competitive edge can be achieved with rapid decision turnaround, rapid reaction to market conditions.  Predictive analytics can lead to a proactive position in the marketplace.

Customer satisfaction ratings can be improved through more accurate interactions with the business.

Customer trust in the information and how it is stored is likely to be important in the future.

“Gartner predicts that 30 percent of businesses will have begun directly or indirectly monetizing information assets via bartering or selling them outright by 2016”.

Compliance: Knowing your organisational data, i.e. who, what, where, how, why and when, goes a long way towards achieving compliance.  This applies whether it is compliance with Data Protection requirements, financial regulations, Sarbanes-Oxley (SOX) or PCI (Payment Card Industry) security standards, or seeking to achieve ISO 8000, the International Standard for Data Quality.

This is by no means an exhaustive list of the business benefits of good Data Quality.  What about the cost to business of poor data quality?  It depends on the business.

Customers: Poor data, leading to poor marketing, sales, support or service experience will cost your business customers and revenue.

Shareholders: Data accuracy, auditability and transparency are crucial to stakeholders’ trust.  Loss of trust will mean downgrading of shares and weak stock market performance.

Employee Productivity and Retention: Endless hours spent scrubbing data for report input reduce employee performance and lead to poor morale and ultimately staff churn.

The list of impacts on the business of poor quality data is endless.

Perspective

Taking a step back, it is a matter of perspective.  Some aspects of Data Quality are critical to the business, others less so.  It is a matter of prioritisation and of understanding the impact, risk and/or advantage to the business of pursuing Data Quality.  But therein lies the Catch-22: if your data quality is not good enough, how can you make balanced, informed decisions?

References


Is Big Data Incompatible with Data Protection?

I think in order to answer this question, we first need to look at what Big Data is.  There is no one definition, but I think this is a pretty good one: “the term big data is used to describe datasets with volumes so huge that they are beyond the ability of typical Database Management systems to capture, store and analyse”.

Gartner analyst Doug Laney devised the 3Vs model for Big Data.  Gartner’s definition is “Big data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation.”

3 V's of Big Data

Many others have joined the mix, such as IBM with its fourth V, Veracity, and the Wired article by Neil Biehn advocating the missing Vs of Big Data, Viability and Value.

We look at what big data can do in terms of advancement in science and medicine and in predictive analytics, and we are amazed at the cleverness of it all.  The ability to predict our intelligence based on whether we liked a curly fries image on Facebook is both amazing and disturbing at the same time.  See the TED Talk by Jennifer Golbeck, The Curly Fry Conundrum.

I am fascinated by the opportunities that Big Data analytics can bring, but I am more than a little concerned about what can and will happen in the future if our data is used for less than advantageous ends.  Take, for example, a documentary I watched recently: BBC Horizon – The Age of Big Data.

The programme addressed how Big Data was used for crime prediction in Los Angeles.  The analysis was so detailed that it was possible to predict where and when, and possibly by whom, the next crime would be committed.  Does that mean I could be pre-imprisoned just in case?  OK, this is an extreme example, but the movie “Minority Report” comes to mind.

So, who is watching Big Brother?

Thankfully in Europe we have strong Data Protection regulations, which are due to get stronger with the introduction of the GDPR (General Data Protection Regulation) in April 2016.  See my recent blog on the Data Protection Road Map.

An extensive document published by the UK’s ICO (Information Commissioner’s Office), Big Data and Data Protection, sought to discuss and address the implications and compatibility of Big Data and Data Protection.  If one looks at the core principles of data protection in the context of big data and big data analytics, there are some key concerns to be addressed.  The ICO has captured a summary of practical aspects to consider when using personal data for Big Data analytics:

DP Summary for Big Data

2 Important Points to Remember

  • Big Data is characterised by volume, variety, velocity of “all” data.
  • Data Protection is interested because it involves the processing of personal data.

So does this alleviate concerns?

Potentially, yes.  There are many methods and tools available to organisations that not only protect our personal data but also remove the individually identifiable element.  Anonymisation is one approach.

Applied correctly, anonymisation means data is no longer personal data.  Anonymisation seeks to strip out any identifying information such that the individual can no longer be identified by the data alone or in combination with other data.  Anonymisation is not just about sanitising the data; it is also a means of mitigating the risk of inadvertent disclosure or loss of personal data.  Organisations will need to demonstrate that anonymisation was carried out in a robust manner.  From a business perspective, this should be balanced with adopting solutions that are proportionate to the risk.

The ICO has published an extensive Anonymisation Code of Practice, which it claims is the first of its kind from any European Data Protection authority.  It provides excellent guidance and also suggests some anonymisation techniques, which include data masking, pseudonymisation, aggregation, derived data items and banding.  A further useful resource is UKAN, the UK Anonymisation Network.
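To give a feel for what some of these techniques do, here is a rough sketch of masking, pseudonymisation, banding and aggregation on a few invented records.  It is my own illustration rather than anything taken from the Code of Practice, and the secret key, field names and band width are all assumptions.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key; in practice it must be kept secret and stored apart from the data.
SECRET_KEY = b"replace-with-a-real-secret"

def mask_phone(phone):
    """Data masking: hide all but the last two characters."""
    return "*" * (len(phone) - 2) + phone[-2:]

def pseudonymise(identifier):
    """Pseudonymisation: replace an identifier with a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

def age_band(age, width=10):
    """Banding: replace an exact age with a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

people = [
    {"name": "A. Murphy", "phone": "5550100", "age": 34},
    {"name": "B. Byrne", "phone": "5550101", "age": 37},
    {"name": "C. Walsh", "phone": "5550102", "age": 52},
]

released = [
    {"id": pseudonymise(p["name"]), "phone": mask_phone(p["phone"]), "age_band": age_band(p["age"])}
    for p in people
]

# Aggregation: publish only counts per band rather than individual rows.
band_counts = Counter(row["age_band"] for row in released)
print(released[0])
print(dict(band_counts))
```

One caution: anyone holding the key can link a pseudonym back to the original identifier, so pseudonymised data on its own should not be assumed to be anonymous.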

Is Big Data Compatible with Data Protection or not?

Ultimately I believe it’s not actually about compatibility.  Big Data and Data Protection are not mutually exclusive.  They must and do co-exist.  The challenge for organisations is, and will increasingly be, that of building trust with individuals and operating ethically.

Data Protection principles should not be seen as a barrier to Big Data progress.  Applying core principles such as fairness, transparency and consent as a framework for trust and ethics will encourage innovative ways of informing and engaging with the public in the future.

anonymisation image

References & Bibliography
