Categorical Exploratory Data Analysis: From Multiclass Classification and Response Manifold Analytics perspectives of baseball pitching dynamics
From two coupled Multiclass Classification (MCC) and Response Manifold Analytics (RMA) perspectives, we develop Categorical Exploratory Data Analysis (CEDA) on PITCHf/x database for the information content of Major League Baseball's (MLB) pitching dynamics. MCC and RMA information contents are represented by one collection of multi-scales pattern categories from mixing geometries and one collection of global-to-local geometric localities from response-covariate manifolds, respectively. These collectives shed light on the pitching dynamics and maps out uncertainty of popular machine learning approaches. On MCC setting, an indirect-distance-measure based label embedding tree leads to discover asymmetry of mixing geometries among labels' point-clouds. A selected chain of complementary covariate feature groups collectively brings out multi-order mixing geometric pattern categories. Such categories then reveal the true nature of MCC predictive inferences. On RMA setting, multiple response features couple with multiple major covariate features to demonstrate physical principles bearing manifolds with a lattice of natural localities. With minor features' heterogeneous effects being locally identified, such localities jointly weave their focal characteristics into system understanding and provide a platform for RMA predictive inferences. Our CEDA works for universal data types, adopts non-linear associations and facilitates efficient feature-selections and inferences.
READ FULL TEXT