Random-text models have been proposed as an explanation for the power-law relationship between word frequency and rank, the so-called Zipf's law. They are generally regarded as null hypotheses rather than models in the strict sense. In this context, recent theories of language emergence and evolution assume this law as a priori information, with no need for explanation. Here, random texts and real texts are compared through (a) the so-called lexical spectrum and (b) the distribution of words having the same length. It is shown that real texts fill the lexical spectrum much more efficiently than random texts, regardless of word length, suggesting that Zipf's law in real texts is highly meaningful.
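To make the null model concrete, here is a minimal sketch (not the paper's code) of a random "monkey-typing" text: characters are typed uniformly at random with a fixed space probability, and the lexical spectrum, i.e. the number of distinct words occurring exactly f times, is then counted. The alphabet size and space probability below are illustrative assumptions.

```python
# Minimal sketch of a random ("monkey-typing") text; the 10-letter alphabet
# and space probability of 0.2 are illustrative assumptions.
import random
from collections import Counter

random.seed(0)
ALPHABET = "abcdefghij"  # assumed alphabet
P_SPACE = 0.2            # assumed probability of typing a space

def random_text(n_chars):
    """Type n_chars characters at random and return the resulting words."""
    chars = (" " if random.random() < P_SPACE else random.choice(ALPHABET)
             for _ in range(n_chars))
    return "".join(chars).split()

words = random_text(200_000)
freqs = Counter(words)

# Lexical spectrum: how many distinct words occur exactly f times.
spectrum = Counter(freqs.values())
for f in sorted(spectrum)[:10]:
    print(f"frequency {f}: {spectrum[f]} distinct words")
```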
Here, we test neutral models against the evolution of English word frequency and vocabulary at the corpus scale, as recorded in annual word frequencies from three centuries of English-language books. Against these data, we test both the static and dynamic predictions of two neutral models, including the relation between corpus size and vocabulary size, frequency distributions, and turnover within those frequency distributions. Although a commonly used neutral model fails to replicate all of these emergent properties at once, we find that a modified two-stage neutral model does replicate both the static and dynamic properties of the corpus data. This two-stage model represents a relatively small corpus of English books, analogous to a ‘canon’, sampled by an exponentially increasing corpus of books from the wider population of authors. More broadly, this model, a smaller neutral model nested within a larger one, could represent situations where mass attention is focused on a small subset of the cultural variants.
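For readers unfamiliar with the baseline, the following sketch shows the standard single-stage neutral model of cultural copying; the paper's two-stage variant, which samples a second, smaller "canon" corpus from this population, is omitted here. The population size, innovation rate and generation count are illustrative assumptions, not the authors' settings.

```python
# Minimal sketch of the standard (single-stage) neutral copying model;
# N, MU and GENERATIONS are illustrative assumptions, not the paper's values.
import random
from collections import Counter

random.seed(1)
N, MU, GENERATIONS = 1000, 0.01, 200

population = list(range(N))  # start from N unique variants
next_label = N               # label for the next innovation

for _ in range(GENERATIONS):
    new_population = []
    for _ in range(N):
        if random.random() < MU:
            new_population.append(next_label)  # innovation
            next_label += 1
        else:
            new_population.append(random.choice(population))  # copying
    population = new_population

freqs = Counter(population)
print("vocabulary size:", len(freqs))
print("top 5 variants:", freqs.most_common(5))
```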
A cellular automaton model called the Firm Dynamics Model (FDM) is introduced to simulate the dynamics of firms within an economy. The model includes the growth of firms, their mergers, and their exits. The main objective is to compare the size-frequency distributions produced by the model with the empirical firm size distributions of several countries (USA, UK, Spain and Sweden). The empirical size distributions were assembled from business censuses and additional information on each country's largest companies, with size measured by the number of employees. For the four datasets analyzed here, the firm size distribution is compatible with a Pareto-type power law with an exponent close to two (for the probability density). The model, for its part, delivers two different size-frequency distributions depending on the type of merger that firms can undergo: the friendly-merger version gives rise to subcritical distributions with an exponential tail, whereas the aggressive-merger version produces power-law distributions. The simulation was run on underlying lattices in one, two and three dimensions in order to compare the simulated power-law exponent with the empirical one. The best agreement was obtained with the two-dimensional aggressive-merger version, for which the power-law exponent is 2.19±0.01, compared with an empirical exponent of 2.1±0.1 (averaged over the four datasets). Further simulations on a Bethe lattice confirm that the two-dimensional model provides the best fit to the empirical exponent.
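As an illustration of the exponent comparison (not the FDM itself), the sketch below draws synthetic Pareto-distributed "firm sizes" with density exponent 2.1 and recovers that exponent with the standard continuous maximum-likelihood estimator alpha_hat = 1 + n / sum(ln(x_i / x_min)); all parameters are assumed for illustration.

```python
# Minimal sketch of a power-law exponent check, not the FDM: draw synthetic
# Pareto "firm sizes" with density exponent ALPHA_TRUE (assumed) and recover
# it with the continuous maximum-likelihood (Hill-type) estimator.
import math
import random

random.seed(2)
ALPHA_TRUE, X_MIN, N = 2.1, 1.0, 50_000  # all illustrative assumptions

# Inverse-transform sampling from P(X > x) = (x / x_min)^-(alpha - 1).
sizes = [X_MIN * (1 - random.random()) ** (-1 / (ALPHA_TRUE - 1))
         for _ in range(N)]

tail = [x for x in sizes if x >= X_MIN]
alpha_hat = 1 + len(tail) / sum(math.log(x / X_MIN) for x in tail)
print(f"estimated density exponent: {alpha_hat:.3f} (true: {ALPHA_TRUE})")
```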
The principle of least effort (PLE) is believed to be a universal rule for living systems. Applying it to derive the power-law probability distributions of living systems has long been challenging. Recently, a measure of efficiency was proposed as a tool for deriving Zipf's and Pareto's laws directly from the PLE. This work investigates that efficiency measure further from a mathematical point of view. The aim is to gain further insight into its properties and its usefulness as a metric of performance. We address some key mathematical properties of the efficiency, such as its sign, uniqueness and robustness. We also examine the relationship between this measure and other properties of the system of interest, such as inequality and uncertainty, by introducing a new method for calculating nonnegative continuous entropy.
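The paper's efficiency measure is not reproduced here, but the two system properties the abstract relates it to can be illustrated with a short sketch: the Shannon entropy (uncertainty, in nats) and the Gini coefficient (inequality) of a Zipf distribution p_k ∝ k^(-s). The support size and exponents below are chosen purely for illustration.

```python
# Minimal sketch of the uncertainty/inequality pairing mentioned in the
# abstract (not the paper's efficiency measure): Shannon entropy and Gini
# coefficient of a Zipf distribution p_k ∝ k^(-s); n and s are illustrative.
import math

def zipf_pmf(n, s):
    weights = [k ** -s for k in range(1, n + 1)]
    z = sum(weights)
    return [w / z for w in weights]

def shannon_entropy(p):
    """Entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def gini(p):
    """Gini coefficient of the probability masses (0 = equal, 1 = unequal)."""
    q = sorted(p)
    n = len(q)
    cum = sum((i + 1) * v for i, v in enumerate(q))
    return 2 * cum / (n * sum(q)) - (n + 1) / n

for s in (0.5, 1.0, 1.5):
    p = zipf_pmf(1000, s)
    print(f"s = {s}: entropy = {shannon_entropy(p):.3f} nats, "
          f"Gini = {gini(p):.3f}")
```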