SciELO - Scientific Electronic Library Online

 


Revista Colombiana de Estadística

Print version ISSN 0120-1751

Abstract

GUERRERO, FABIO G. On the Entropy of Written Spanish. Rev. Colomb. Estad. [online]. 2012, vol.35, n.3, pp.425-442. ISSN 0120-1751.

This paper discusses the entropy of the Spanish language using a practical method for calculating the entropy of a text by direct computer processing. As an example of application, thirty samples of Spanish text, totaling 22.8 million characters, are analyzed. Symbol lengths from n = 1 to 500 were considered for both words and characters. Both direct computer processing and the law of large numbers were employed to calculate the probability distribution of the symbols. An empirical relation for entropy involving the length of the text (in characters) and the number of different words in the text is presented. Statistical properties of the Spanish language when viewed as produced by a stochastic source (such as origin shift invariance, ergodicity, and the asymptotic equipartition property) are also analyzed.
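
To make the idea of calculating entropy "by direct computer processing" concrete, the following is a minimal sketch in Python, not the author's implementation. It estimates the Shannon entropy H = -sum(p_i * log2(p_i)) from the relative frequencies of symbols, where a symbol is either a block of n characters or a block of n words. The function names (shannon_entropy, char_blocks, word_blocks), the use of non-overlapping blocks, and whitespace tokenization are all assumptions made for illustration; the paper's own choices may differ.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Entropy in bits per symbol, estimated from relative frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    # Each symbol's probability is approximated by its relative frequency.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def char_blocks(text, n=1):
    """Split text into non-overlapping blocks of n characters (assumed scheme)."""
    return [text[i:i + n] for i in range(0, len(text) - n + 1, n)]


def word_blocks(text, n=1):
    """Split text into non-overlapping blocks of n whitespace-separated words."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(0, len(words) - n + 1, n)]


if __name__ == "__main__":
    sample = "el perro y el gato"  # toy input, not data from the paper
    print(shannon_entropy(char_blocks(sample, 1)))
    print(shannon_entropy(word_blocks(sample, 1)))
```

On a real corpus, the same functions would be called with the full text and with n ranging over the symbol lengths of interest; for large n most blocks occur only once, so the frequency estimate becomes unreliable unless the text is very long.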

Keywords: Law of large numbers; Shannon entropy; Stochastic process; Zipf's law.
