Data Mining Task Primitives Issues in Data Mining

Topics Covered
Data Mining Task Primitives
Integration of Data Mining systems
Major issues in Data Mining

Data Mining Task Primitives
A data mining task can be specified in the form of a data
mining query, which is input to the data mining system.
A data mining query is defined in terms of data mining
task primitives.
These primitives allow the user to interactively
communicate with the data mining system during the
mining process to discover interesting patterns.

List of Data Mining Task Primitives
Set of task relevant data to be mined.
Kind of knowledge to be mined.
Background knowledge to be used in discovery process.
Interestingness measures and thresholds for pattern
evaluation.
Representation for visualizing the discovered patterns.
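
As a rough sketch, the five primitives can be pictured as the fields of a single query object. The class name MiningQuery and every field name and value below are hypothetical, chosen only for illustration; they do not come from any real data mining system or query language.

```python
from dataclasses import dataclass, field


@dataclass
class MiningQuery:
    """Hypothetical container holding the five task primitives."""
    relevant_data: dict                                        # which data and attributes to mine
    knowledge_kind: str                                        # e.g. "association", "classification"
    background_knowledge: dict = field(default_factory=dict)   # e.g. concept hierarchies
    interestingness: dict = field(default_factory=dict)        # measure name -> threshold
    presentation: list = field(default_factory=list)           # requested output forms


# Illustrative values only (assumed, not taken from the slides).
query = MiningQuery(
    relevant_data={
        "source": "sales",
        "where": {"country": "Canada"},
        "attributes": ["age", "income", "item"],
    },
    knowledge_kind="association",
    background_knowledge={"age": ["all", "age group", "exact age"]},
    interestingness={"support": 0.05, "confidence": 0.7},
    presentation=["rules", "table"],
)
```

Each field corresponds to one item in the list above, so a query is complete once all five have been considered.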

Set of task relevant data to be mined:
This specifies the portions of the database or the set of data
in which the user is interested.
This includes the following:
Database attributes
Data warehouse dimensions of interest
For example, suppose that you are a manager of All
Electronics in charge of sales in the United States and
Canada, and you would like to study the buying trends of
customers in Canada. Rather than mining the entire
database, you can specify that only the data relevant to
this task be retrieved, such as purchases made by
customers in Canada. The attributes involved are
referred to as relevant attributes.
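
One way to picture this selection is as a row filter followed by a projection onto the relevant attributes. The sketch below uses made-up records; the column names and the helper select_relevant are assumptions for illustration, not part of any particular system.

```python
# Made-up sales records; all column names and values are illustrative assumptions.
sales = [
    {"customer": "C1", "country": "Canada", "age": 34, "item": "laptop", "store_id": 7},
    {"customer": "C2", "country": "United States", "age": 51, "item": "camera", "store_id": 3},
    {"customer": "C3", "country": "Canada", "age": 27, "item": "printer", "store_id": 7},
]


def select_relevant(records, condition, attributes):
    """Keep rows matching every key/value pair in `condition`,
    then keep only the listed `attributes` of each row."""
    return [
        {a: r[a] for a in attributes}
        for r in records
        if all(r[k] == v for k, v in condition.items())
    ]


# Only Canadian customers, and only the attributes of interest.
relevant = select_relevant(sales, {"country": "Canada"}, ["customer", "age", "item"])
```

The mining step would then run on relevant rather than on the full sales data.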

Kind of knowledge to be mined
This specifies the data mining functions to be performed,
such as
Characterization and Discrimination
Association
Classification
Clustering
Prediction
Outlier analysis
For instance, if studying the buying habits of customers in
Canada, you may choose to mine associations between
customer profiles and the items that these customers like
to buy.

Background knowledge to be used in discovery process
Users can specify background knowledge, or knowledge
about the domain to be mined. This knowledge is useful
for guiding the knowledge discovery process, and for
evaluating the patterns found.
There are several kinds of background knowledge.
Concept hierarchies are a popular form of background
knowledge, which allow data to be mined at multiple
levels of abstraction.
User beliefs about relationships in the data are
another form of background knowledge.

An example of a concept hierarchy for the attribute
(or dimension) age is shown in the following figure.
The root node represents the most general abstraction
level, denoted as all.
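
A concept hierarchy for age can be sketched as a function that maps a raw value to its ancestor at a chosen level. The figure is not reproduced in this text, so the group names and age boundaries below are assumptions for illustration only; the function name generalize_age is also hypothetical.

```python
def generalize_age(age, level):
    """Return the value of `age` at the given abstraction level.

    level 0 -> "all" (the root, most general)
    level 1 -> an age group
    level 2 -> the raw age (most specific)
    Group boundaries are illustrative assumptions.
    """
    if level == 0:
        return "all"
    if level == 1:
        if age < 40:
            return "youth"
        if age < 60:
            return "middle_aged"
        return "senior"
    return age


ages = [23, 45, 67, 38]
# Mining at level 1 treats customers of similar age as one group.
groups = [generalize_age(a, 1) for a in ages]
```

Moving up a level merges values into broader groups, which is what allows patterns to be found at multiple levels of abstraction.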

Interestingness measures and thresholds for pattern evaluation
Interestingness measures are used to separate
interesting patterns from uninteresting ones.
They may be used to guide the mining
process, or after discovery, to evaluate the discovered
patterns. Different kinds of knowledge may have different
interestingness measures.
For example, interestingness measures for association rules
include support and confidence.
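
Support and confidence can be computed directly from a set of transactions. The sketch below uses the standard definitions: the support of an itemset is the fraction of transactions that contain it, and the confidence of a rule A => B is support(A and B together) divided by support(A). The transactions and threshold values are made up for illustration.

```python
# Made-up transactions; each is the set of items bought together.
transactions = [
    {"laptop", "mouse"},
    {"laptop", "mouse", "bag"},
    {"laptop"},
    {"mouse"},
    {"camera"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent) / support(antecedent).
    Assumes the antecedent occurs in at least one transaction."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)


# Rule: laptop => mouse
s = support({"laptop", "mouse"}, transactions)       # 2 of 5 transactions
c = confidence({"laptop"}, {"mouse"}, transactions)  # 2 of the 3 laptop transactions

# Thresholds chosen arbitrarily for this sketch.
min_support, min_confidence = 0.3, 0.6
is_interesting = s >= min_support and c >= min_confidence
```

A rule that falls below either threshold would be discarded as uninteresting.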

Representation for visualizing the discovered patterns
This refers to the form in which discovered patterns
are to be displayed.
Users can choose from different forms for
knowledge presentation, such as rules, tables,
reports, charts, graphs, decision trees, and cubes.
