Dependently Typing R Vectors, Arrays, and Matrices
The R programming language is widely used in large-scale data analyses. It contains especially rich built-in support for dealing with vectors, arrays, and matrices. These operations feature prominently in the applications that form R's raison d'être, making their behavior worth understanding. Furthermore, ostensibly for programmer convenience, their behavior in R is a notable extension over the corresponding operations in mathematics, thereby offering some challenges for specification and static verification. We report on progress towards statically typing this aspect of the R language. The interesting aspects of typing, in this case, warn programmers about violating bounds, so the types must necessarily be dependent. We explain the ways in which R extends standard mathematical behavior. We then show how R's behavior can be specified in LiquidHaskell, a dependently-typed extension to Haskell. In the general case, actually verifying library and client code is currently beyond LiquidHaskell's reach; therefore, this work provides challenges and opportunities both for typing R and for progress in dependently-typed programming languages.
READ FULL TEXT