NEW TECHNIQUES FOR DISCRIMINATION WITH NOMINAL LEVEL VARIABLES
The basic discriminant analysis problem is as follows. Assume there are two distinct populations. A random sample is drawn from each population, and various characteristics are measured for each member of both samples. The aim is to use observed differences in these characteristics between the two groups to predict the population membership of new observations. Practical examples include identifying eligible versus ineligible welfare recipients or correct versus incorrect tax returns. The classical statistical methods developed for this problem tend to make unduly restrictive assumptions about the nature and distribution of the population characteristics, in particular assumptions of multivariate normality, linearity, additivity, and the order of interactions. The dissertation shows that these assumptions are often inappropriate and unnecessary, especially when the characteristics are measured at the nominal level. A new class of procedures based on a general sequential algorithm is defined. These procedures are shown to be the discrete equivalent of sequential regression procedures. The new procedures are compared with Fisher's linear (and quadratic) discriminant analysis procedures on a set of actual problems drawn from the literature and are found to perform as well as, or better than, Fisher's approach in every case. In addition, a detailed example is carried out using data from the New Hampshire Medicaid program. In this case, a post-test sample was used to demonstrate the validity of the results, and the new methods were found to work equally well on that sample. Furthermore, the results were superior to those achieved by other methods applied to similar data.
Thus, it is concluded that this new class of sequential algorithms represents a viable alternative to the usual methods of discriminant analysis, particularly if the researcher is concerned about the validity of the assumptions underlying those methods.
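As an illustration only, the general idea of a sequential procedure for nominal characteristics can be sketched in a few lines of Python. The abstract does not specify the dissertation's algorithm, so everything below is an assumption: the sketch uses a simple forward-selection rule (add the nominal variable that most reduces training misclassifications, stop when none helps), classifies each cell of the cross-classification by its majority group, and runs on invented toy data. The names `fit_cells`, `errors`, `forward_select`, `rows`, and `labels` are hypothetical.

```python
from collections import Counter, defaultdict


def fit_cells(rows, labels, variables):
    """Map each observed cell of the chosen nominal variables to its majority group."""
    counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        cell = tuple(row[v] for v in variables)
        counts[cell][label] += 1
    return {cell: c.most_common(1)[0][0] for cell, c in counts.items()}


def predict(model, variables, default, row):
    """Predict the group of one observation; unseen cells fall back to `default`."""
    return model.get(tuple(row[v] for v in variables), default)


def errors(rows, labels, variables):
    """Count training misclassifications when using only `variables`."""
    default = Counter(labels).most_common(1)[0][0]
    model = fit_cells(rows, labels, variables)
    return sum(
        predict(model, variables, default, row) != label
        for row, label in zip(rows, labels)
    )


def forward_select(rows, labels, candidates):
    """Greedy forward selection (an assumed rule, not the dissertation's)."""
    chosen = []
    best = errors(rows, labels, chosen)  # baseline: always predict the majority group
    while True:
        trials = [
            (errors(rows, labels, chosen + [v]), v)
            for v in candidates
            if v not in chosen
        ]
        if not trials:
            break
        err, var = min(trials)
        if err >= best:  # stop when no remaining variable reduces the error
            break
        chosen.append(var)
        best = err
    return chosen, best


# Invented toy data: two nominal characteristics and a two-group label.
rows = [
    {"employment": "none", "region": "north"},
    {"employment": "none", "region": "south"},
    {"employment": "full", "region": "north"},
    {"employment": "full", "region": "south"},
    {"employment": "part", "region": "north"},
    {"employment": "part", "region": "south"},
]
labels = ["eligible", "eligible", "ineligible", "ineligible", "eligible", "ineligible"]

selected, training_errors = forward_select(rows, labels, ["employment", "region"])
print(selected, training_errors)
```

On this toy data the sketch selects `employment` first and then `region`. The rule relies only on cell counts, so it makes no assumption of normality or linearity. Training error is used here purely for brevity; a held-out sample, such as the post-test sample mentioned above, would be the appropriate check in practice.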