Statistical patterns of word frequency suggesting the probabilistic nature of human languages

by Shuiyuan Yu, et al.

Traditional linguistic theories have largely regarded language as a formal system composed of rigid rules. However, their failures in processing real language, the recent successes of statistical natural language processing, and the findings of many psychological experiments suggest that language may be more a probabilistic system than a formal one, and thus cannot be faithfully modeled with the either/or rules of formal linguistic theory. The present study, based on authentic language data, confirmed that important linguistic issues, such as linguistic universals, diachronic drift, and language variation, can be translated into probability and frequency patterns in parole. These findings suggest that human languages may well be probabilistic systems by nature, and that statistical patterns may well be inherent properties of human languages.


Zipf's law in 50 languages: its structural pattern, linguistic interpretation, and cognitive motivation

Zipf's law has been found in many human-related fields, including langua...

Source codes in human communication

Although information theoretic characterizations of human communication ...

Is there an aesthetic component of language?

Speakers of all human languages make use of grammatical devices to expre...

Probabilistic Typology: Deep Generative Models of Vowel Inventories

Linguistic typology studies the range of structures present in human lan...

Grammatical cues are largely, but not completely, redundant with word meanings in natural language

The combinatorial power of language has historically been argued to be e...

ImmunoLingo: Linguistics-based formalization of the antibody language

Apparent parallels between natural language and biological sequence have...

