Inference of Shape Expression Schemas from Typed RDF Graphs
We consider the problem of constructing a Shape Expression Schema (ShEx) that describes the structure of a given input RDF graph. We employ the framework of grammatical inference, where the objective is to find an inference algorithm that is both sound, i.e., always produces a schema that validates the input RDF graph, and complete, i.e., able to produce any schema within a given class of schemas, provided that a sufficiently informative input graph is presented. We study the case where the input graph is typed, i.e., every node is given together with its types. We limit our attention to ShEx0, a practical fragment of Shape Expression Schemas that has an equivalent graphical representation in the form of shape graphs. We investigate the problem of constructing a canonical representative of a given shape graph. Finally, we present a sound and complete inference algorithm for shape graphs, thus showing that ShEx0 is learnable from typed graphs.
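To make the setting concrete, the following is a minimal illustrative sketch of inferring a shape graph from a typed graph. It is not the algorithm of the paper. It assumes, for simplicity, that every node carries exactly one type (the paper allows several), that literal values are typed like any other node, and that shape-graph edges are annotated with one of the occurrence constraints `1`, `?`, `+`, `*`. The names `infer_shape_graph` and `occurrence`, and the example data, are hypothetical.

```python
from collections import Counter, defaultdict


def occurrence(lo, hi):
    """Map observed min/max edge counts to one of the constraints 1, ?, +, *."""
    if lo >= 1:
        return "1" if hi == 1 else "+"
    return "?" if hi <= 1 else "*"


def infer_shape_graph(edges, node_type):
    """Infer a shape graph from a typed graph.

    edges: iterable of (source, predicate, target) triples.
    node_type: dict mapping every node to its single type (an assumption
    of this sketch, not of the paper).
    Returns {type: {(predicate, target_type): constraint}}.
    """
    # Per node, count outgoing edges grouped by (predicate, target type).
    per_node = defaultdict(Counter)
    for src, pred, dst in edges:
        per_node[src][(pred, node_type[dst])] += 1

    # Group nodes by their type.
    nodes_by_type = defaultdict(list)
    for node, t in node_type.items():
        nodes_by_type[t].append(node)

    shape = {}
    for t, nodes in nodes_by_type.items():
        # Every (predicate, target type) pair seen on some node of type t.
        keys = set()
        for n in nodes:
            keys.update(per_node[n])
        shape[t] = {}
        for key in keys:
            # A node without such an edge contributes a count of 0.
            counts = [per_node[n][key] for n in nodes]
            shape[t][key] = occurrence(min(counts), max(counts))
    return shape


# Hypothetical example data: bug reports and their reporters.
EDGES = [
    ("bug1", "reportedBy", "user1"),
    ("bug1", "related", "bug2"),
    ("bug1", "related", "bug3"),
    ("bug2", "reportedBy", "user1"),
    ("bug3", "reportedBy", "user2"),
    ("user1", "name", "lit1"),
    ("user1", "email", "lit2"),
    ("user2", "name", "lit3"),
]
NODE_TYPE = {
    "bug1": "Bug", "bug2": "Bug", "bug3": "Bug",
    "user1": "User", "user2": "User",
    "lit1": "Literal", "lit2": "Literal", "lit3": "Literal",
}

SHAPE = infer_shape_graph(EDGES, NODE_TYPE)
```

On this data, `SHAPE["Bug"]` maps `("reportedBy", "User")` to `1` and `("related", "Bug")` to `*`, while `SHAPE["User"]` maps `("name", "Literal")` to `1` and `("email", "Literal")` to `?`. Under the single-type assumption, every node's edge counts fall within the inferred constraints by construction, which is the intuition behind soundness. Nothing in this sketch addresses completeness or canonical representatives.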