Estimating mechanical properties of cloth from videos using dense motion trajectories: Human psychophysics and machine learning
Humans can visually estimate the mechanical properties of deformable objects (e.g., cloth stiffness). While much of the recent work on material perception has focused on static image cues (e.g., textures and shape), little is known about whether humans can integrate information over time to make such judgments. Here we investigated the effect of spatiotemporal information across multiple frames (multiframe motion) on estimating the bending stiffness of cloth. Using high-fidelity cloth animations, we first examined how perceived bending stiffness changed as a function of the physical bending stiffness defined in the simulation model. Using maximum-likelihood difference-scaling methods, we found that perceived stiffness and physical bending stiffness were highly correlated. In a second experiment, scrambling the frame sequences diminished this correlation, which suggests that multiframe motion plays an important role. To provide further evidence for this finding, we extracted dense motion trajectories from the videos across 15 consecutive frames and used the trajectory descriptors to train a machine-learning model with the measured perceptual scales. The model can predict human perceptual scales in new videos with varied winds, optical properties of cloth, and scene setups. When the correct multiframe motion information was removed (by training the model on either scrambled videos or two-frame optical flow), the predictions significantly worsened. Our findings demonstrate that multiframe motion information is important for both humans and machines to estimate the mechanical properties of cloth. In addition, we show that dense motion trajectories are effective features for building a successful automatic cloth estimation system.
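As a rough illustration of why frame order matters for trajectory features, the sketch below builds a descriptor from one tracked point followed over 15 frames and compares it with the descriptor of the same point after the frames are scrambled. The descriptor form (the sequence of frame-to-frame displacements, normalized by the total path length) is an assumption borrowed from common dense-trajectory formulations and is not necessarily the exact feature used in this work. The function name `trajectory_descriptor` and the synthetic sinusoidal track are hypothetical.

```python
import numpy as np


def trajectory_descriptor(points):
    """Normalized displacement descriptor for one tracked point.

    points: array of shape (n_frames, 2) holding (x, y) positions.
    Returns a flat vector of length 2 * (n_frames - 1).
    Assumption: this normalization is illustrative only.
    """
    points = np.asarray(points, dtype=float)
    disp = np.diff(points, axis=0)               # frame-to-frame displacements
    total = np.linalg.norm(disp, axis=1).sum()   # total path length
    if total == 0:
        return np.zeros(disp.size)               # static point: no motion
    return (disp / total).ravel()


# Synthetic track over 15 frames (hypothetical data, not from the study).
rng = np.random.default_rng(0)
t = np.arange(15)
track = np.stack([t, np.sin(0.4 * t)], axis=1)

# Descriptor from the correctly ordered frames.
desc = trajectory_descriptor(track)

# Descriptor after scrambling the frame order, as in the second experiment.
desc_scrambled = trajectory_descriptor(track[rng.permutation(len(track))])

# A two-frame feature keeps only a single displacement.
two_frame = track[1] - track[0]
```

The ordered and scrambled descriptors contain the same set of positions but encode different motion, whereas the two-frame displacement discards the temporal evolution altogether.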