A comparison of clustering procedures and similarity measures in creating clusters using warping functions
Functional data are observations that can be represented as continuous curves, such as stock prices, temperature changes over time, or the growth of the human body. This study focuses on clustering functional data whose main source of variation lies along the time axis. It does so by clustering the curves' warping functions, which are by-products of curve registration procedures and summarize the timing of each curve's events relative to the other curves in the sample. To tune this approach, two ways of defining the similarity of warping functions are compared (one based on their B-spline coefficients and one based on an approximation of the squared L2 distance between curves), along with a variety of agglomerative hierarchical and partitioning clustering procedures. The results are then compared with an alternative approach in which the same similarity measures and clustering procedures are applied to the original smoothed curves.
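As a rough illustration of the two similarity measures named above, the following minimal sketch computes a pairwise dissimilarity matrix between warping functions. It rests on several assumptions that are not taken from the study: the squared L2 distance is approximated with the trapezoidal rule on a shared time grid, the coefficient-based measure is taken to be the squared Euclidean distance between coefficient vectors, and the three example warping functions and all function names are hypothetical.

```python
from typing import Callable, List, Sequence


def squared_l2_distance(h1: Sequence[float], h2: Sequence[float],
                        grid: Sequence[float]) -> float:
    """Trapezoidal approximation of the integral of (h1(t) - h2(t))^2 over the grid."""
    if not (len(h1) == len(h2) == len(grid)):
        raise ValueError("both curves must be evaluated on the same grid")
    total = 0.0
    for k in range(len(grid) - 1):
        left = (h1[k] - h2[k]) ** 2
        right = (h1[k + 1] - h2[k + 1]) ** 2
        total += 0.5 * (left + right) * (grid[k + 1] - grid[k])
    return total


def squared_coefficient_distance(c1: Sequence[float], c2: Sequence[float]) -> float:
    """Squared Euclidean distance between two basis-coefficient vectors (assumed measure)."""
    if len(c1) != len(c2):
        raise ValueError("coefficient vectors must share the same basis")
    return sum((a - b) ** 2 for a, b in zip(c1, c2))


def distance_matrix(items: Sequence[Sequence[float]],
                    dist: Callable[[Sequence[float], Sequence[float]], float]
                    ) -> List[List[float]]:
    """Symmetric pairwise dissimilarity matrix, the usual input to hierarchical clustering."""
    n = len(items)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = dist(items[i], items[j])
    return matrix


# Hypothetical warping functions on [0, 1]: identity, a "late" warp, and an "early" warp.
grid = [k / 100 for k in range(101)]
warps = [
    [t for t in grid],
    [t ** 2 for t in grid],
    [t ** 0.5 for t in grid],
]

D = distance_matrix(warps, lambda a, b: squared_l2_distance(a, b, grid))
```

In this toy example, the late and early warps are each closer to the identity than to one another, which is the kind of structure a clustering procedure applied to `D` would pick up. The same `distance_matrix` helper accepts `squared_coefficient_distance` when coefficient vectors are supplied instead of evaluated curves.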