An Introduction to
Advanced Analytics and Data Mining
Transforming Data
Dr Barry Leventhal
April 2014
© Copyright BarryAnalytics Limited All Rights Reserved
Agenda
• What are Advanced Analytics and Data Mining?
• The toolkit of data mining techniques
• Some issues to keep in mind
• Which technique should you use?
What is Data Mining?
A process of discovering and interpreting
patterns in (often large) data sets
in order to solve business problems
Converts Data into Information
Data → Pattern → Information → Action
What is Advanced Analytics?
“Any solution that supports the identification of meaningful
patterns and correlations among variables in complex, structured
and unstructured, historical, and potential future data sets for the
purposes of predicting future events and assessing the
attractiveness of various courses of action. Advanced Analytics
typically incorporate such functionality as data mining, descriptive
modelling, econometrics, forecasting, operations research
optimisation, predictive modelling, simulations, statistics and text
analytics.” (Source: Forrester Research)
• Data Visualisation
• Web Analytics
• Text Analysis
• Data Mining
• Contact Optimisation
• Simulation
• Social Network Analysis
How can Advanced Analytics help?
• By helping companies to increase revenues or reduce costs, and so improve profit
Tom Davenport:
“Companies have long used business
intelligence for specific applications, but these
initiatives were too narrow to affect corporate
performance. Now, leading firms are basing their
competitive strategies on the sophisticated
analysis of business data.”
Where can Advanced Analytics add Value?
• Store location
• Product management
• Customer management
• Transportation/Fleet management
• Resource planning
• Web site management
• Anywhere else where I have large numbers to manage
The Toolkit of Data Mining Techniques
Traditional
Statistics
• Regression Models
• Survival Analysis
• Factor Analysis
• Cluster Analysis
• CHAID
Machine Learning
• Rule Induction
• Neural Networks
• Genetic Algorithms
Two main types of Analytical Model
Type 1: Models driven by a Target Variable
e.g. Which customers to cross sell?
- Implies building a Predictive Model
- ‘Directed’ Data Mining Techniques
Type 2: Models with no Target Variable
e.g. What are our most important customer segments?
- Implies a Descriptive Model
- ‘Undirected’ Data Mining Techniques
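As a minimal sketch of the distinction above, the following Python contrasts a directed model (fitted against a known target) with an undirected one (no target). The data, the helper names, and the choice of a one-variable threshold rule and a two-segment 1-D k-means are illustrative assumptions, not techniques taken from the slides.

```python
# Illustrative only: toy data and hypothetical helper names.

def fit_threshold_rule(values, targets):
    """Directed: pick the cut-off on one variable that best predicts a known 0/1 target."""
    best_cut, best_correct = None, -1
    for cut in sorted(set(values)):
        # Rule under test: predict 1 when value >= cut
        correct = sum((v >= cut) == bool(t) for v, t in zip(values, targets))
        if correct > best_correct:
            best_cut, best_correct = cut, correct
    return best_cut


def two_means(values, iterations=10):
    """Undirected: split one variable into two segments with no target (1-D k-means, k=2).
    Assumes the data contain at least two distinct values."""
    low, high = min(values), max(values)  # initial segment centres
    for _ in range(iterations):
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(low_group) / len(low_group)
        high = sum(high_group) / len(high_group)
    return low, high


# Annual spend per customer (made-up numbers)
spend = [120, 150, 180, 200, 640, 700, 720, 810]
# Known outcome for Type 1: did the customer take up a cross-sell offer?
took_offer = [0, 0, 0, 1, 1, 1, 1, 1]

cut = fit_threshold_rule(spend, took_offer)  # Type 1: predictive model
centres = two_means(spend)                   # Type 2: descriptive model
print("Predict cross-sell when spend >=", cut)
print("Segment centres:", centres)
```

The first function cannot run without `took_offer`; the second never sees it. That is the practical difference between the two model types.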
The Data Mining Process
Cross Industry Standard Process for Data Mining (CRISP-DM)
[Diagram of the process, with Data at the centre]
Reference: Step-by-step data mining guide, CRISP-DM 1.0
Some issues to keep in mind:
Issue 1: Use an appropriate technique
• Some years ago, the DMA Targeting & Statistics
Group held a seminar to explain and compare four
analytical techniques:
– Cluster Analysis
– Decision Tree
– Neural Network (supervised)
– Regression Model
• The four techniques were applied to a sample of
lifestyle data in order to predict private healthcare
cover
[Chart: Comparison of private healthcare targeting via four analytical techniques. Vertical axis 100 to 400; horizontal axis 10% to 50%. Series: Cluster Analysis, Decision Tree, Regression Model, Neural Net.]
Source: CMT/ DMA Targeting & Statistics Interest Group
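A chart of this kind is commonly read as a gains or lift comparison: customers are ranked by model score, and the response rate in the top 10%, 20%, ... of the file is indexed against the overall rate (100 = no better than random selection). That reading of the axes is an assumption, since only the tick values survive here. A minimal sketch of the calculation, using made-up scores and outcomes and a hypothetical function name `lift_index`:

```python
def lift_index(scores, outcomes, depth):
    """Index of the response rate in the top `depth` share of the file
    against the overall response rate (100 = same as random selection)."""
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * depth))
    top_rate = sum(outcome for _, outcome in ranked[:n_top]) / n_top
    overall_rate = sum(outcomes) / len(outcomes)
    return 100 * top_rate / overall_rate


# Made-up example: 10 customers, model scores and whether each holds cover (1/0)
scores = [0.91, 0.85, 0.77, 0.64, 0.58, 0.45, 0.39, 0.30, 0.22, 0.10]
has_cover = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

for depth in (0.1, 0.2, 0.3, 0.4, 0.5):
    print(f"top {depth:.0%}: index {lift_index(scores, has_cover, depth):.0f}")
```

Running the same function on the scores from each technique, against the same outcomes, would give one line per technique on a chart like the one described.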
Issue 2: Modelling and Deploying are separate
stages in the data mining process
Modelling: Historical Data + Known Outcomes → Model
Deploying: Recent Data + Model → Predictions
• “Modelling” is ‘one-off’ – until model requires rebuild
• “Deploying” takes place repeatedly
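The separation between the two stages can be sketched in code: the model is built once from historical data with known outcomes, stored, and then applied repeatedly to recent data. This is a minimal illustration with a deliberately simple model (a response rate per customer segment); the segment names, function names and JSON storage are assumptions for the example only.

```python
import json


def build_model(historical_rows):
    """Modelling (one-off): learn the response rate per segment
    from historical data with known outcomes."""
    totals, responders = {}, {}
    for segment, responded in historical_rows:
        totals[segment] = totals.get(segment, 0) + 1
        responders[segment] = responders.get(segment, 0) + responded
    return {segment: responders[segment] / totals[segment] for segment in totals}


def deploy_model(model, recent_segments, default=0.0):
    """Deploying (repeated): score recent data, where outcomes are not yet known."""
    return [model.get(segment, default) for segment in recent_segments]


# Modelling stage: historical data + known outcomes -> model
history = [("young", 1), ("young", 0), ("family", 1), ("family", 1), ("retired", 0)]
stored_model = json.dumps(build_model(history))  # persist until a rebuild is needed

# Deploying stage: recent data + model -> predictions (run as often as required)
model = json.loads(stored_model)
this_week = deploy_model(model, ["family", "retired", "young"])
next_week = deploy_model(model, ["young", "student"])  # unseen segment gets the default
print(this_week, next_week)
```

Note that `build_model` is called once while `deploy_model` is called for every new batch; the unseen "student" segment is the kind of drift that eventually prompts a rebuild.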
Issue 3: Do not forget your data!
Your data is the key to gaining value from analytics
and modelling – essentials to consider:
• Data quality
• Data predictivity
• Data integration
• Data governance
The Importance of Data Integration
Integration: Customer Transactions + Customer Attributes
• Business value increased by integrating complementary datasets
• New insights may be created by data integration, e.g.
Integration: Market Research + Online Behaviour
Many applications of data integration...
• Predictive models to target behaviours identified by research
• Integration of web, email and traditional offline channels
• Tracking across channels, e.g. Attribution of media effects
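As a small illustration of integrating complementary datasets, the sketch below joins customer transactions to customer attributes on a shared customer ID and derives a per-region figure that neither dataset holds on its own. The field names, IDs and values are invented for the example.

```python
from collections import defaultdict

# Two complementary datasets keyed on a (hypothetical) customer ID
transactions = [
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 1, "amount": 60.0},
    {"customer_id": 2, "amount": 25.0},
    {"customer_id": 3, "amount": 80.0},
]
attributes = {
    1: {"region": "North"},
    2: {"region": "South"},
    3: {"region": "North"},
}

# Integrate: attach each transaction to its customer's attributes,
# then total spend by region
spend_by_region = defaultdict(float)
for row in transactions:
    customer = attributes.get(row["customer_id"])
    if customer is None:
        continue  # no matching attributes; in practice this needs investigating
    spend_by_region[customer["region"]] += row["amount"]

print(dict(spend_by_region))
```

The join only works if both sources share a reliable key, which is where data quality and data governance come back in.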
Which analytical technique should you use?
The choice generally depends on…
• business problem
• whether problem is predictive or descriptive
• underlying data environment
• variables to be predicted or described
• ability to implement solution
• whether key statistical assumptions hold
• Obtain help from a Statistician or Data Mining Consultant
• All about the problem, not the technique
• Combination of approaches works best
Thank you!
Barry Leventhal
+44 (0)7803 231870
Barry@barryanalytics.com
