Discovering Power Laws in Entity Length
This paper reports that entity length follows a family of scale-free power-law distributions. The concept of entity here is broad: it covers named entities, entity mentions, time expressions, and domain-specific entities, all of which are well studied in natural language processing and related areas. These power-law distributions have well-defined means and finite variances, and they possess the scale-free property. We explain the power laws in entity length by the principle of least effort in communication and the preferential mechanism.
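As an illustration of what such a claim involves, the sketch below estimates a power-law exponent from a list of entity strings and checks the standard moment conditions: a power law p(x) ∝ x^(-alpha) has a finite mean only when alpha > 2 and a finite variance only when alpha > 3. This is not the paper's procedure. Measuring length in whitespace-separated tokens, the function names, and the use of the common continuous-approximation maximum-likelihood estimator for discrete data are all assumptions made for this example. That estimator is known to be rough when `x_min` is small, so the result should be read as indicative only.

```python
import math


def entity_lengths(entities):
    """Return the length of each entity in whitespace-separated tokens.

    Token-based length is an assumption of this sketch.
    """
    return [len(entity.split()) for entity in entities]


def fit_power_law_exponent(lengths, x_min=1):
    """Estimate alpha for p(x) ~ x^(-alpha) over lengths >= x_min.

    Uses the approximate MLE for discrete data,
    alpha ~= 1 + n / sum(ln(x_i / (x_min - 0.5))),
    which is only a rough estimate when x_min is small.
    """
    tail = [x for x in lengths if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / (x_min - 0.5)) for x in tail)
    return 1.0 + len(tail) / log_sum


def has_finite_mean(alpha):
    """A power law has a well-defined mean only if alpha > 2."""
    return alpha > 2


def has_finite_variance(alpha):
    """A power law has a finite variance only if alpha > 3."""
    return alpha > 3


if __name__ == "__main__":
    # Hypothetical sample; real use would take entities from an annotated corpus.
    sample = ["New York", "Barack Obama", "Paris",
              "United Nations Security Council"]
    lengths = entity_lengths(sample)
    alpha = fit_power_law_exponent(lengths)
    print(lengths, alpha, has_finite_mean(alpha), has_finite_variance(alpha))
```

On a real corpus, an estimated exponent above 3 would be consistent with the abstract's statement that the fitted distributions have both a well-defined mean and a finite variance.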