Localized associations in market basket analysis
Association rule mining approaches have been proposed in the data mining literature to analyze market basket data. The outputs of such approaches are rules that identify pairs of associated products, that is, products that tend to co-occur in a basket. As established in the marketing research literature, several factors can influence the co-occurrence of products. First, marketing mix activities such as pricing and promotions in one product category may influence a consumer’s decision to purchase products in other categories. Second, due to consumer heterogeneity, the association among a set of products may vary across customer segments. Third, products may be purchased together purely by coincidence, because the consumer wants to spread the cost of a shopping trip across several purchases. Finally, aggregate associations (or correlations) may differ from localized associations. In the extreme case, we may see Simpson’s paradox, in which a pair of products is positively correlated in every consumer segment but negatively correlated in the aggregate data, or vice versa. This can happen when market baskets from multiple sources are pooled into one aggregate database. Thus product associations may be distorted in two ways: a change of magnitude or a change of direction. As a result, association rules discovered by existing association rule mining algorithms may be spurious. We develop an exploratory rule mining algorithm that uses transaction attributes, such as consumer demographics or marketing mix variables, to identify segments of the data (subsets of the baskets) that exhibit strong associations between pairs of products not seen in the aggregate data set. Results are presented using an IRI market basket data set that contains transactions covering 22 categories over 2 years for 500 panelists.
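
To make the change-of-direction case concrete, the following minimal sketch computes the lift of a product pair (A, B) within two consumer segments and in the pooled data. The basket counts, the segment names, and the choice of lift as the association measure are illustrative assumptions, not figures or methods taken from the IRI data set or from the algorithm developed here.

```python
from collections import namedtuple

# Hypothetical 2x2 basket counts for a product pair (A, B):
# baskets with both, with A only, with B only, and with neither.
Counts = namedtuple("Counts", "both a_only b_only neither")


def lift(c):
    """Lift = P(A and B) / (P(A) * P(B)); above 1 is positive association."""
    n = c.both + c.a_only + c.b_only + c.neither
    p_a = (c.both + c.a_only) / n
    p_b = (c.both + c.b_only) / n
    p_ab = c.both / n
    return p_ab / (p_a * p_b)


# Made-up segments: segment_1 buys A often and B rarely,
# segment_2 buys B often and A rarely.
segments = {
    "segment_1": Counts(both=18, a_only=62, b_only=2, neither=18),
    "segment_2": Counts(both=18, a_only=2, b_only=62, neither=18),
}

# Pooling the segments adds the counts cell by cell.
pooled = Counts(*(sum(cell) for cell in zip(*segments.values())))

for name, counts in segments.items():
    print(name, round(lift(counts), 3))
print("pooled", round(lift(pooled), 3))
```

In this constructed example the pair has lift above 1 in each segment but below 1 in the pooled baskets, because the two segments differ sharply in their base purchase rates for A and B. A rule mined only from the pooled data would therefore report the wrong direction of association for both segments.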