Study on the Customer targeting using Association Rule Mining

massive quantities of data. Data mining techniques can be implemented rapidly on existing software and hardware platforms to improve the value of existing information resources, and can be integrated with new products and systems. Examples of profitable applications illustrate its relevance to today’s business environment, together with a basic description of how data warehouse architectures can evolve to deliver the value of data mining to end users. Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been widely studied in data mining research. Most previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there are prolific patterns and/or long patterns. In this study, we propose a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develop an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth. Efficiency of mining is achieved with three techniques: (1) a large database is compressed into a highly condensed, much smaller data structure, which avoids costly, repeated database scans; (2) our FP-tree-based mining adopts a pattern fragment growth method to avoid the costly generation of a large number of candidate sets; and (3) a partitioning-based, divide-and-conquer method is used to decompose the mining task into a set of smaller tasks for mining confined patterns in conditional databases, which dramatically reduces the search space. Our performance study shows that the FP-growth method is efficient and scalable for mining both long and short frequent patterns, and is about an order of magnitude faster than the Apriori algorithm and also faster than some recently reported new frequent pattern mining methods.
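
As a rough illustration of the compression idea in technique (1), the sketch below builds an FP-tree in two passes over a toy dataset. It covers tree construction only, not the FP-growth mining step, and the names `FPNode`, `build_fp_tree`, `count_nodes` and the sample baskets are assumptions made for this example rather than anything taken from the paper.

```python
from collections import Counter


class FPNode:
    """One node of the prefix tree: an item, its count, and its children."""

    def __init__(self, item=None, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree with two passes over the transactions."""
    # Pass 1: count each item once per transaction, keep the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = FPNode()
    # Pass 2: insert each transaction with its frequent items sorted by
    # descending count (ties broken alphabetically), so that transactions
    # sharing a prefix also share tree nodes.
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


def count_nodes(node):
    """Number of item nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(c) for c in node.children.values())


# Hypothetical toy data: four baskets, minimum support count of 2.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["bread", "eggs"],
    ["milk", "eggs", "tea"],
]
root, frequent = build_fp_tree(transactions, min_support=2)
print(sorted(frequent), count_nodes(root))
```

In this toy case the nine frequent-item occurrences in the baskets are stored in six tree nodes, because baskets that start with the same items share a path.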

Association Rule Mining for Both Frequent and Infrequent Items Using Particle Swarm Optimization Algorithm

The particle swarm optimization algorithm has become an important heuristic technique in recent years, and this study uses it to mine association rules effectively. If this technique takes user-defined threshold values into account, interesting association rules can be generated more efficiently. Therefore, this study proposes a novel approach that uses the particle swarm optimization algorithm to mine association rules from databases. Our implementation of the search strategy uses a bitmap representation of nodes in a lexicographic tree, and from the superset-subset relationships of the nodes it classifies frequent itemsets along with infrequent itemsets. In addition, this approach avoids the extra calculation overhead of generating frequent pattern trees and of handling the large memory that stores the support values of candidate itemsets.

Mutual information and sensitivity analysis for feature selection in customer targeting: a comparative study

To further understand the effects of using each predictive model built on each set of features, four confusion matrices are computed for the MI and DSA methods. Table 9 shows the results obtained with a typical cut-off probability of 0.5, i.e., the outcome is considered a success if the model predicts it with a probability of 50% or more, and with a cut-off lowered to just 10%. The lower cut-off accounts for the fact that this particular bank intends to increase efficiency with a special emphasis on avoiding losing successful contacts, since lost deposit subscriptions directly imply missed business opportunities for retaining important financial assets in a crisis period (thus, the cost of losing a successful contact is much higher than the gain of avoiding a needless unsuccessful contact) [6]. Table 10 shows performance metrics for each of the approaches at the two cut-off points, as well as for the standard LR model with all the features. Generally, while there is a trade-off between metrics when comparing the three methods (including using all features), the results corroborate the findings from Figure 4, with LR-DSA achieving a performance just slightly below that of the LR model using all features.
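
The effect of lowering the cut-off can be sketched with a small example. The labels and predicted probabilities below are made up for illustration and are not the bank data from the paper; `confusion_matrix` is a hypothetical helper written for this sketch.

```python
def confusion_matrix(y_true, y_prob, cutoff):
    """Return (tp, fp, fn, tn), predicting success when prob >= cutoff."""
    tp = fp = fn = tn = 0
    for actual, prob in zip(y_true, y_prob):
        predicted = prob >= cutoff
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Made-up outcomes (1 = successful contact) and model probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.80, 0.30, 0.12, 0.60, 0.20, 0.08, 0.05, 0.02]

for cutoff in (0.5, 0.1):
    tp, fp, fn, tn = confusion_matrix(y_true, y_prob, cutoff)
    print(f"cut-off {cutoff}: TP={tp} FP={fp} FN={fn} TN={tn}")
```

On this toy data the 0.5 cut-off misses two of the three successful contacts, while the 0.1 cut-off recovers all three at the price of one extra needless contact, which is the trade-off the excerpt describes.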

WEB-BASED DATA MINING TOOLS : PERFORMING FEEDBACK ANALYSIS AND ASSOCIATION RULE MINING

This paper aims to explain web-enabled tools for educational data mining. The proposed web-based tools, developed using the ASP.NET framework and PHP, can help universities or institutions provide students with elective courses as well as improve academic activities based on feedback collected from students. In the ASP.NET tool, association rule mining with the Apriori algorithm is used, whereas in the PHP-based Feedback Analytical Tool, feedback on faculty and institutional infrastructure is collected from students, and based on that feedback the tool shows the performance of the faculty and the institution. Using that data, it helps management to improve in-house training skills and to gain knowledge about educational trends that faculty should follow to improve the effectiveness of the course and their teaching skills.

RETAIL IN THE EMERGING MARKETS: A STUDY BASED ON ASSOCIATION RULES

Since product hierarchy information is not available for the mature market dataset, we will have to find common ground to compare the results. Again, the focus of our work is studying emerging markets; therefore we will present our results for this dataset using the support and subcategory restrictions as planned. To be able to compare the results, we need to use the same methodology to find and measure negative correlation between products for both datasets. We can keep the minimum support restriction, but we need to find an alternative way of grouping similar products. Since we only have transaction information for the mature market dataset, we will have to use transaction-related variables to group products. Products with a relevant substitution effect, in other words an effect worth monitoring for commercial purposes, are expected to have similar transaction behavior regarding frequency and associations with other products. For instance, if we accept that Pepsi is a potential substitute product for Coca-Cola, and Coca-Cola is found to be associated with chips, we should also expect Pepsi to be associated with chips. Additionally, if Pepsi doesn’t have a frequency similar to Coca-Cola, then this is a substitution effect with no commercial potential and is therefore not worth monitoring. We will then use a segmentation algorithm to group items with similar transaction characteristics into clusters, which ultimately means that customers purchase them in a similar way. Similar purchasing behavior of two negatively correlated products is a strong indication that they are perceived as similar by the customer.

A Recent Review on XML data mining and FFP

Data mining is also referred to as Knowledge Discovery in Databases. It deals with issues such as representation schemes for the concept or pattern to be discovered and the design of appropriate functions and algorithms to find patterns. However, data on the web and in bioinformatics databases often lack such a regular structure and are called semi-structured. This survey paper gives a brief survey of XML data mining using association rules and fast frequent pattern mining in various fields, the modifications made to the association rules according to the applications in which they were used, and their effective results. Association rules have thus proved themselves to be the most effective technique for frequent pattern matching for over a decade. XML has become very popular for representing semi-structured data and is a standard for data exchange over the web. Mining XML data from the web is becoming increasingly important. However, the structure of the XML data can be more complex and irregular than that. Association rule mining plays a key role in the process of mining data for frequent pattern matching. Fast Frequent Pattern growth mines the complete set of frequent patterns by pattern fragment growth. Fast Frequent Pattern tree-based mining adopts a pattern fragment growth method to avoid the costly generation of a large number of candidate sets, and a partition-based, divide-and-conquer method is used. This paper presents a review of XML data mining using Fast Frequent Pattern mining in various domains.

Mining Recurrent Pattern Identification on Large Database

A number of algorithms are available for data mining. In this paper we take up the Apriori algorithm, Compacting Data Set (CDS), the Frequent Pattern algorithm using Dynamic Function, the multilevel association rule mining algorithm based on a Boolean matrix, and the Frequent Pattern Growth algorithm for study and comparison. All of these algorithms were examined with respect to their basic principle and suitability.

An Intelligent Association Rule Mining Model for Multidimensional Data Representation and Modeling

Traditional association rule mining algorithms recognize frequent events in the form of itemsets; a widely used example of association rule mining is Market Basket Analysis. Agrawal et al. (1993) were among the first to address the problem of pattern classification, using a breast cancer dataset [14] from the database. The work on association rules was then extended from patterns [1,2,11]: the authors explored data cube-based [2] rule mining algorithms on multidimensional databases, where each tuple/transaction consisted of multidimensional data features. In the area of multidimensional data sets [11], the authors discussed a multidimensional data model in which the multidimensional data was viewed as a value in the multidimensional space. Based on this model, efficient data mining has been performed using data cubes [2], with aggregates of dimensions computed in [9,10]. Rule mining is another well-studied data mining problem, and over the years many techniques have been designed to construct decision trees for mining the patterns in the data [8]. However, it is necessary to perform classification in addition to association rule mining for effective decision making. Therefore, this paper focuses on the integration of ARM with fuzzy rule mining for better decisions.

A New Approach to Find Predictor of Software Fault Using Association Rule Mining

From the above table we found that all of the top 20 rules revolve around only 8 of the 17 metrics: UWCS, INST, LMC, NOM, AVCC, LCOM2, CBO and FOUT. The other 9 metrics are removed from the analysis. Our objective is to find all the metrics that act as predictors using association mining, but in our problem we are finding frequent software metrics in every class. From observation we found that if one metric appears in the antecedent part of a relation and other metrics appear in the consequent part, there is no need to use both sets of metrics in the relation when developing a fault prediction model, because they can share the same type of information for predicting faults. After focusing more closely on the generated top 20 rules, we found that:

A pragmatic approach on association rule mining and its effective utilization in large databases

This paper deals with the effective utilization of association rule mining algorithms in large databases, especially those used by business organizations, where the number of transactions and items plays a crucial role in decision making. Frequent item-set generation and the creation of strong association rules from the frequent item-set patterns are the two basic steps in association rule mining. We take a suitable illustration of market basket data to generate frequent item-set patterns and association rules from these patterns with the Apriori algorithm. We then take the same illustration for FP-Growth association rule mining: an FP-Growth tree is constructed for frequent item-set generation, and strong association rules are created from it. Experiments were performed to study the performance of the Apriori and FP-Tree algorithms. The transactions mimic the customer purchase behaviour seen in food outlet environments. Using a synthetic data generation process, the observations are plotted in graphs of execution time against minimum support count. The graphs show that as the minimum support values decrease, the execution times of the algorithms increase exponentially, because a decrease in the minimum support threshold makes the number of item-sets in the output increase exponentially. The graphs establish that FP-Growth performs better than the Apriori algorithm for all problem sizes, with a factor of 2 for high minimum support values, down to very low support levels.
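
The first of the two steps, frequent item-set generation, can be sketched as a level-wise Apriori search. This is a minimal illustration on hypothetical market basket data, not the implementation evaluated in the paper; the function name `apriori` and the sample baskets are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {frozenset: support count} for all frequent item-sets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that occurs in the data.
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while current:
        # Count each size-k candidate with one pass over the data.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        # Join step: merge frequent k-item-sets into (k+1)-candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all of its subsets of
        # size k-1 were frequent at the previous level.
        current = {c for c in candidates
                   if all(frozenset(s) in level
                          for s in combinations(c, k - 1))}
    return frequent


# Hypothetical market basket data, minimum support count of 2.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs", "tea"},
]
frequent = apriori(baskets, min_count=2)
for itemset, count in sorted(frequent.items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Lowering `min_count` lets more candidates survive each level, which is the mechanism behind the growth in output size and execution time that the excerpt reports.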

Telecom customer segmentation and precise package design by using data mining

customers feel as satisfied as possible, and the entire customer. Group segmentation by group characteristics is a prerequisite for meeting diverse, heterogeneous needs. In terms of customer value, different customers bring different value to the enterprise; according to the amount of value that a customer can bring to the enterprise, the customer group is divided into high-value customers, low-value customers, potential-value customers, etc., so customer segmentation plays an extremely important role in business management (Junhai Ma, Tiantong Xu, Wandong Lou, 2018). In terms of the enterprise's ability to deal with resources, the resources of enterprises are limited. How to allocate resources to customers reasonably and maximize the benefit of those resources is a problem that enterprises need to consider seriously, so the statistics, analysis and subdivision of customer groups become particularly important: resource allocation relies on these research results, and it determines the operational efficiency of the company. Reasonable and effective resource allocation, with targeted marketing activities based on the characteristics of each type of customer group, can maximize the value of each type of customer group, deepen potential profit points, provide a basis for company decision-making, reduce operating costs, and improve management efficiency. Customer segmentation makes clear that consumers themselves are diverse and that a single strategy cannot serve all consumers (Kochetov Vadim, 2018). Customer segmentation can quickly improve the management level of the organization, find the corresponding customer market, and then adopt different marketing strategies for customers in different market segments.

An association rule mining-based framework for understanding lifestyle risk behaviors.

Using accumulation technique methods, one cannot detect specific behavior combinations. In order to study the association structure of 7 binary health risk behaviors, we would need to analyze a contingency table with 2^7 possible levels. Consequently, there will be a number of empty cells; an exhaustive analysis of the table is challenging. With correlation analysis, regression approaches, and odds ratios, behavioral associations are generally studied from the perspective of a single behavior with preconceived ideas about the order of importance of behaviors. This can lead researchers to overemphasize the role of the primary selected behavior [32]. The present study avoided this problem by utilizing ARM, a technique that assumes no hierarchy of lifestyle risk behaviors and creates simple association rules between three or more behaviors.

Frame work for association rule mining with updated fp-growth and modified cofi approaches

dimensionality of the problem. Let D be the set of transactions, where each transaction T is a set of items such that T ⊆ I. A unique identifier TID is given to each transaction. A transaction T is said to contain X, a set of items in I, if X ⊆ T. An association rule is an implication of the form “X => Y”, where X ⊂ I, Y ⊂ I, and X ∩ Y = ∅. An itemset X is said to be frequent if its support s is greater than or equal to a given minimum support threshold α. Discovering association rules, however, is nothing more than one application of frequent itemset mining, like inductive databases, query expansion, document clustering, etc. This approach is used to mine frequent patterns from databases such as the Retail transaction database, the Chess database and the Mushroom database using the association mining algorithms FP-Growth and COFI*, and to generate association rules among the frequent patterns. Association mining mines a transaction database to extract the frequent patterns present in it. By understanding these patterns, customers’ purchasing behavior can be analyzed, and that information can be used to improve sales by placing items that are purchased together side by side.

Association Technique based on Classification for Classifying Microcalcification and Mass in Mammogram

Many research works have used data mining techniques to analyze mammograms. Studies that use a data mining approach for classification can be found in [11, 12]. Most of them classify a mammogram as benign or malignant, and the candidate regions are captured from the whole original image. Luiza et al. [11] proposed a classification method based on association rule mining. The original image was initially split into four parts for better localization of the region of interest, and the extracted features were discretized over an interval before organizing the transactional data set. Aswini et al. [4] proposed an image mining technique using mammograms to classify and detect cancerous tissue. The mammogram image is classified into normal, benign and malignant classes, and to explore the feasibility of data mining approach

A Threshold Free Implication Rule Mining

In terms of execution time, each run takes less than 3 seconds on a notebook computer with a 1 GHz Pentium Centrino processor and 1.5 GB of main memory, running Windows XP Home Edition. The zoo dataset contains 101 transactions and 43 item sets. The search space on a target is 2^(2(n-1)) - (2^(n-1) - 1), where 2^(2(n-1)) is the total number of both positive and negative rules, and (2^(n-1) - 1) is the total number of positive rules using a single consequence item set as a target. In this case, the zoo dataset contains 2E+25 combinations of item sets. We use an optimistic assumption to grasp the size of the search space: we assume that only one computation cycle (1 / 1 GHz) is needed to form and to validate a combination of item sets in a single transaction. Based on this optimistic assumption, it follows that a search without pruning would require at least 6E+10 years to complete. In comparison, our search time is feasible. From these two experiments, we conclude that association rule pairs are useful for discovering knowledge (both frequent and infrequent) from a dataset.
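
The two figures quoted above can be checked with a few lines of arithmetic. The sketch assumes n = 43, a 1 GHz clock, and that the one-cycle cost is paid once per combination for each of the 101 transactions; that last point is a reading of the excerpt, not something it states explicitly.

```python
n = 43                # item sets in the zoo dataset
transactions = 101    # transactions in the zoo dataset
clock_hz = 1e9        # 1 GHz: one combination validated per cycle (assumed)

total_rules = 2 ** (2 * (n - 1))      # positive and negative rules
positive_rules = 2 ** (n - 1) - 1     # positive rules for a single target
search_space = total_rules - positive_rules

# Assumption: every combination is checked against every transaction.
seconds = search_space * transactions / clock_hz
years = seconds / (365 * 24 * 3600)

print(f"{search_space:.1e} combinations, about {years:.1e} years")
```

Under these assumptions the search space comes out near 2E+25 combinations and the unpruned search near 6E+10 years, in line with the numbers in the excerpt.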

Multi-objective Numeric Association Rules Mining via Ant Colony Optimization for Continuous Domains without Specifying Minimum Support and Minimum Confidence

In market basket data, where each transaction is a list of items purchased by a customer, the knowledge that association rules give us is something like: “70% of customers who buy A also buy B”. Association rules are applied to discover customer buying patterns for cross-marketing and attached mailing applications, catalog design, product placement, customer segmentation, etc., based on those buying patterns [12]. Given a set of transactions, the problem of mining association rules is to find all association rules that have support and confidence greater than the user-specified minimum support and minimum confidence, respectively. Association rules can be Boolean or numeric. Numeric association rules can have numeric attributes such as age and height; they can also have categorical attributes such as gender and brand. Numeric attributes need to be discretized in order to transform the problem into a Boolean one before mining association rules. An example of a numeric association rule in an employee database is like this [13]:
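
To make the “70% of customers who buy A also buy B” statement concrete, the sketch below computes the support and confidence of the rule A => B on made-up baskets. The helpers `support`, `confidence` and `is_strong`, the thresholds, and the data are assumptions for illustration, not taken from the paper.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, over the antecedent's."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never applies in this data
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


def is_strong(transactions, antecedent, consequent, min_sup, min_conf):
    """True if the rule exceeds both user-specified thresholds."""
    both = set(antecedent) | set(consequent)
    return (support(transactions, both) > min_sup
            and confidence(transactions, antecedent, consequent) > min_conf)


# Made-up data: 10 baskets contain A (7 of them also contain B),
# and 10 baskets contain only C.
baskets = [{"A", "B"}] * 7 + [{"A"}] * 3 + [{"C"}] * 10

print(support(baskets, {"A", "B"}))       # share of all baskets with A and B
print(confidence(baskets, {"A"}, {"B"}))  # share of A-baskets that also have B
print(is_strong(baskets, {"A"}, {"B"}, min_sup=0.3, min_conf=0.6))
```

Here the rule A => B has a confidence of 0.7, because 7 of the 10 baskets with A also contain B, while its support is only 0.35, because those 7 baskets are measured against all 20.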

Decision support methods for finding phenotype--disorder associations in the bone dysplasia domain.

We have presented an approach that combines association rule mining with the Dempster-Shafer theory (DST) to compute probabilistic associations between sets of clinical features and disorders. These can then serve as support for medical decision making (e.g., diagnosis). Experimental results show that the proposed approach is able to provide meaningful outcomes even on small datasets with sparse distributions. Moreover, the results show that the approach can outperform other machine learning techniques and behaves slightly better than an initial diagnosis by a clinician. To test the accuracy of the approach, we have performed several experiments comparing human-mediated initial and final diagnoses, as well as outputs produced by other machine learning algorithms, in which we have treated our approach as a traditional classifier. The results show that we can achieve a top-1 accuracy of 47.43% (i.e., the accuracy calculated only via the candidate with the highest probability) by using disorder descriptions for 20 to more than 100 cases. This represents an increase in accuracy of around 7% when compared to the initial human-made diagnosis, and around 4% when compared to the next best machine learning approach.

Database Reverse Engineering based on Association Rule Mining

In parallel with attempts to apply learning techniques to existing large databases, researchers in the area of database reverse engineering have proposed some means of extracting a conceptual schema. Lee and Yoo [14] proposed a method to derive a conceptual model from object-oriented databases. The derivation process is based on forms, including business forms and forms for database interaction in the user interface. The final products of their method are the object model and the scenario diagram describing a sequence of operations. The work of Perez et al. [15] emphasized relational object-oriented conceptual schema extraction. Their reverse engineering technique is based on a formal method of term rewriting. They use terms to represent relational and object-oriented schemas. Term rewriting rules are then generated to represent the correspondences between relational and object-oriented elements. The output of the system is the source code to migrate the legacy database to the new system. Recent work in database reverse engineering has not concentrated on the broad objective of system migration. Researchers instead focus their study on the particular issue of semantic understanding. Lammari et al. [13] proposed a reverse engineering method to discover inter-relational constraints and inheritances embedded in a relational database. Chen et al. [5] also based their study on the entity-relationship model. They proposed applying association rule mining to discover new concepts leading to a proper design of the relational database schema. They employed the concept of fuzziness to deal with uncertainty inherited

Efficient Frequent Pattern Tree Construction

Association rules are usually required to satisfy a user-specified minimum support and a user-specified minimum confidence [1][2][3]. Association rules can be extracted using two well-known algorithms, the Apriori algorithm and the FP-Growth algorithm [22][23][24]. The FP-Growth algorithm depends entirely on the FP-tree [4]. Previously, each FP-tree node was labeled only with its support count, which makes traversal to extract association rules more time-consuming. In this paper we concentrate on the node labeling scheme of the FP-tree in the FP-Growth algorithm. We propose a new two-level node labeling scheme for the frequent pattern growth tree. Using the new labeling approach, the frequent item support count can be extracted in less time than with the traditional naming scheme of the FP-tree. This paper presents the major advantages that the newly proposed approach brings to the FP-Growth algorithm for association rule mining.

Comparative Study of Apriori Algorithm Performance on Different Datasets

Data mining is known as a rich tool for gathering information, and the Apriori algorithm is the most widely used approach for association rule mining. To harness this power of mining, the performance of the Apriori algorithm on various data sets has been studied. The Apriori algorithm was implemented using Java as the platform, and the analysis is based on factors such as the relationship between the number of iterations and the number of instances across different kinds of data sets. The conclusion is supported with graphs at the end of the paper.