W. B. Deng

Quantitative Entropy Study of Language Complexity

Jan 15, 2017

R. R. Xie, W. B. Deng, D. J. Wang, L. P. Csernai

Abstract: We study the entropy of Chinese and English texts, based on characters in the case of Chinese texts and on words for both languages. Significant differences are found between the languages and between the personal styles of different debating partners. The entropy analysis points in the direction of lower entropy, that is, of higher complexity. Such a text analysis could be applied to individuals of different styles, to a single individual at different ages, and to different groups of the population.
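As a rough illustration of the kind of measurement the abstract describes, the sketch below computes the Shannon entropy (in bits) of a character or word frequency distribution. The abstract does not state the exact entropy definition or tokenization used in the paper, so the function names, the choice of base-2 logarithm, and the whitespace-based word split are all assumptions made for this example. In particular, whitespace splitting does not segment Chinese text into words; a real analysis would need a proper segmenter.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy in bits of the empirical distribution of `tokens`.

    Assumption: plain plug-in estimate H = -sum(p * log2(p)); the paper's
    exact estimator is not given in the abstract.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def char_entropy(text):
    """Entropy over characters, ignoring whitespace (hypothetical helper)."""
    return shannon_entropy(ch for ch in text if not ch.isspace())


def word_entropy(text):
    """Entropy over whitespace-separated, lowercased words (hypothetical helper).

    Only meaningful for languages written with spaces between words.
    """
    return shannon_entropy(text.lower().split())


if __name__ == "__main__":
    sample = "the cat saw the dog and the dog saw the cat"
    print("word entropy (bits):", word_entropy(sample))
    print("char entropy (bits):", char_entropy(sample))
```

Comparing two speakers or two languages would then amount to calling these helpers on each text, keeping in mind that the plug-in estimate depends on text length, so texts of similar size should be compared.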

Rank-frequency relation for Chinese characters

Jan 26, 2014

W. B. Deng, A. E. Allahverdyan, B. Li, Q. A. Wang

Abstract: We show that Zipf's law for Chinese characters holds perfectly for sufficiently short texts (a few thousand different characters). The scenario of its validity is similar to that of Zipf's law for words in short English texts. For long Chinese texts (or for mixtures of short Chinese texts), rank-frequency relations for Chinese characters display a two-layer, hierarchic structure that combines a Zipfian power-law regime for frequent characters (first layer) with an exponential-like regime for less frequent characters (second layer). For these two layers we provide different (though related) theoretical descriptions that include the range of low-frequency characters (hapax legomena). The comparative analysis of rank-frequency relations for Chinese characters versus English words illustrates the extent to which characters play the same role for Chinese writers as words do for those writing in alphabetic systems.

* Eur. Phys. J. B (2014) 87: 47
* To appear in European Physical Journal B (EPJ B), 2014 (22 pages, 7 figures)
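The rank-frequency relation discussed in the abstract above can be explored with a short sketch like the following. It ranks tokens by relative frequency, estimates a Zipf exponent by ordinary least squares on the log-log values, and counts hapax legomena. The function names and the fitting method are assumptions made for illustration; the abstract does not say how the authors fit the power-law regime, and a log-log least-squares fit is only a crude estimator.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return [(rank, relative_frequency), ...] with rank 1 = most frequent."""
    counts = Counter(tokens)
    total = sum(counts.values())
    freqs = sorted((c / total for c in counts.values()), reverse=True)
    return list(enumerate(freqs, start=1))


def zipf_exponent(rank_freq, max_rank=None):
    """Estimate gamma in f(r) ~ r**(-gamma) by least squares on log-log data.

    `max_rank` (hypothetical parameter) restricts the fit to the most
    frequent tokens, e.g. to stay inside a presumed power-law regime.
    """
    pts = [
        (math.log(r), math.log(f))
        for r, f in rank_freq
        if max_rank is None or r <= max_rank
    ]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two ranks to fit a slope")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return -sxy / sxx  # slope is negative for a decaying law; report gamma > 0


def hapax_count(tokens):
    """Number of distinct tokens that occur exactly once."""
    return sum(1 for c in Counter(tokens).values() if c == 1)


if __name__ == "__main__":
    text = "the cat saw the dog and the dog saw the cat run"
    tokens = text.split()  # for Chinese characters, list(text) could be used
    rf = rank_frequency(tokens)
    print("rank-frequency:", rf)
    print("fitted exponent:", zipf_exponent(rf))
    print("hapax legomena:", hapax_count(tokens))
```

For a long text, fitting with a small `max_rank` and then inspecting how the remaining ranks deviate from the fitted line is one simple way to look for a second, non-power-law regime of the kind the abstract describes.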
