Last evening I was privileged to speak at the London Business Analytics Meetup. The talk was titled "An Introduction to Predictive Analytics". I promised the audience that due to my lack of slides I'd summarise my comments into a reference blog with some interesting links and resources to refer to.
A high-level introduction to predictive analytics
The measurement of rugby players is a good example of predictive analytics at work.
A general framework for describing the analytical maturity of a given organisation, which is composed of three parts:
These companies are mature when it comes to gathering, aggregating, modelling and using data within their organisations. They have had data science teams for several years, if not since inception, and are applying forecasting and prediction techniques to well-formed problems. They have typically invested heavily in systems, which are built internally, purchased (Cloudera Stack), open-source (Hadoop), or a combination of these.
These organisations are earlier in the data analytics life cycle. The management or leadership is aware of the potential for value to exist within the various data sources across the organisation. They may have already invested in some tools and people to start understanding the data and to begin extracting its potential value.
This is a catch-all for the rest of the world. It isn't meant in a negative or disparaging way - it's just a mental model. Some companies are simply too small to require prediction. In some industries prediction isn't possible due to particular aspects of the problem space. Or the problem is of a form that can't be computed in anything less than super-polynomial time as a function of the size of the inputs. If you are interested in understanding more about algorithmically hard problems, the following is a good book. It doesn't cover data science or prediction, though! It focuses primarily on algorithmic problems and complexity: Algorithmics: The Spirit of Computing
Organisations at this stage are often looking for exploratory tools, initial prototypes to interrogate the data and to identify potential clusters of customer behaviour.
Spend more time defining the problem you are attempting to solve than you think you need.
In the end these are expert systems. Humans need to use them and understand the output effectively. This is where design, UX and data visualisation are vital. The world doesn't need another dashboard... Data viz is a huge topic, and unless you expect your account managers (or whoever) to interpret K-Means clustering in an IPython notebook, you will need to translate the output of your analysis into a consumable format.
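To make the "consumable format" point concrete, here is a minimal sketch of clustering some customers and then turning the result into sentences a non-specialist could read. Everything in it is an assumption for illustration: the customer data is made up, the two features (orders per year, average basket value) and the function names `kmeans` and `summarise` are hypothetical, and the clustering is a bare-bones K-Means written with the standard library rather than anything you would run in production.

```python
import random
from statistics import mean


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=50, seed=0):
    """Bare-bones K-Means (Lloyd's algorithm). Returns (centroids, clusters)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        updated = [
            tuple(mean(dim) for dim in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:  # nothing moved, so we have converged
            break
        centroids = updated
    return centroids, clusters


def summarise(centroids, clusters):
    """Translate cluster centres into plain sentences for a business reader."""
    return [
        f"{len(members)} customers: about {orders:.0f} orders a year, "
        f"average basket about GBP {basket:.0f}"
        for (orders, basket), members in zip(centroids, clusters)
    ]


# Made-up data: (orders per year, average basket value in GBP).
# The features are left unscaled for brevity; real data would usually
# need scaling so one feature does not dominate the distance.
customers = [
    (24, 15.0), (30, 12.5), (26, 18.0), (28, 14.0),
    (3, 180.0), (2, 210.0), (4, 160.0), (3, 195.0),
]

centroids, clusters = kmeans(customers, k=2)
report = summarise(centroids, clusters)
for line in report:
    print(line)
```

The clustering itself is the easy part. The `summarise` step is the one that matters for the people who will actually act on the output, and in practice it would be a chart or a short written segment description rather than printed strings.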
Literature: (various books I've found useful or have been recommended to me)