Upcoming Events and Registration

Register

  • Rank Dynamics of Language with Carlos Gershenson (May 18)
  • Receive other information from NECSI (optional)
NECSI protects your information: We do not sell, trade, or otherwise transfer to outside parties your personally identifiable information.

Wednesday, May 18
2:00 to 3:00 PM
210 Broadway

Rank Dynamics of Language with Carlos Gershenson

Studies of rank distributions have been popular for decades, especially since the work of Zipf. For example, if we rank the words of a given language by frequency of use (the most used word in English is 'the', rank 1; the second most common word is 'of', rank 2), the distribution can be roughly approximated by a power law. The same applies to cities (the most populated city in a country ranks first), earthquakes, metabolism, the Internet, and dozens of other phenomena. We recently proposed "rank diversity" to measure how ranks change in time [1], using the Google Books Ngram dataset. Studying six languages between 1800 and 2009, we found that the rank diversity curves of languages are universal and can be fitted with a sigmoid on a log-normal scale. We are studying several other datasets (sports, economies, social systems, urban systems, earthquakes, artificial life). Rank diversity seems to be universal, independent of the shape of the rank distribution. I will present our work in progress towards a general description of the features of rank change in time, along with simple models which reproduce it.
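
To make the idea of rank diversity concrete, here is a minimal Python sketch. It assumes that the rank diversity at rank k is the number of distinct items observed at that rank across all time slices, divided by the number of time slices, which is our reading of the measure proposed in [1]. The word counts are invented toy data, not values from the Google Books Ngram dataset, and the names rank_items and rank_diversity are hypothetical helpers introduced only for this illustration.

```python
from collections import Counter


def rank_items(counts):
    """Return items ordered by descending frequency (rank 1 first)."""
    # Ties are broken alphabetically so the ranking is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ordered]


def rank_diversity(rankings):
    """For each rank, count distinct items seen there, divided by the number of time slices.

    Assumed definition: d(k) = (distinct items at rank k) / (number of slices).
    """
    n_slices = len(rankings)
    depth = min(len(r) for r in rankings)  # only ranks present in every slice
    return [len({r[k] for r in rankings}) / n_slices for k in range(depth)]


# Invented toy counts for four time slices (illustration only).
yearly_counts = [
    Counter({"the": 100, "of": 60, "and": 40, "to": 30}),
    Counter({"the": 110, "of": 55, "to": 45, "and": 35}),
    Counter({"the": 120, "of": 70, "and": 50, "to": 20}),
    Counter({"the": 90, "and": 65, "of": 60, "to": 25}),
]

rankings = [rank_items(c) for c in yearly_counts]
diversity = rank_diversity(rankings)

for k, d in enumerate(diversity, start=1):
    print(f"rank {k}: diversity {d:.2f}")
```

In this toy example 'the' holds rank 1 in every slice, so the first rank has the lowest possible diversity, while lower ranks change hands more often. A real analysis would use many more ranks and time slices.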

[1] Cocho G, Flores J, Gershenson C, Pineda C, Sanchez S (2015) Rank Diversity of Languages: Generic Behavior in Computational Linguistics. PLoS ONE 10(4): e0121898. http://dx.doi.org/10.1371/journal.pone.0121898

Phone: 617-547-4100 | Fax: 617-661-7711 | Email: office at necsi.edu

210 Broadway Suite 101 Cambridge, MA USA