Entropy Indicators for Investigating Early Language Processes

Lyon, C., Nehaniv, C.L. and Dickerson, B. (2005) Entropy Indicators for Investigating Early Language Processes. In: UNSPECIFIED.

We examine evidence for the hypothesis that language could have passed through a stage when words were combined in structured linear segments, and that these linear segments could later have become the building blocks for a full hierarchical grammar. Experiments were carried out on the British National Corpus, consisting of about 100 million words of text from different domains and transcribed speech. This work extends and supports the results of our previous work, which was based on a smaller corpus. Measuring the entropy of the texts, we find that entropy declines as words are taken in groups of 2, 3 and 4, indicating that it is easier to decode words taken in short sequences than individually. Entropy declines further when punctuation is represented, showing that appropriate segmentation captures some of the language structure. Further support for the hypothesis that local sequential processing underlies the production and perception of speech comes from neurobiological evidence. The observation that homophones are apparently ubiquitous and used without confusion also suggests that language processing may be largely based on local context.
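
The entropy measurement described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain maximum-likelihood estimates from raw n-gram counts, and the function name `conditional_entropy` and the toy token list are hypothetical. For groups of n words it estimates the average uncertainty, in bits, about a word given the n-1 words before it; n = 1 gives the ordinary single-word entropy.

```python
import math
from collections import Counter


def conditional_entropy(tokens, n):
    """Estimate H(w_n | w_1 .. w_{n-1}) in bits from raw n-gram counts.

    Illustrative maximum-likelihood estimate only (an assumption of this
    sketch, not the method of the paper). For n == 1 the context is empty,
    so the result is the plain unigram entropy.
    """
    if n < 1 or len(tokens) < n:
        raise ValueError("need n >= 1 and at least n tokens")

    # Count every run of n consecutive tokens.
    ngrams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    # Count each (n-1)-token context over the same windows.
    contexts = Counter()
    for gram, count in ngrams.items():
        contexts[gram[:-1]] += count

    total = sum(ngrams.values())
    # -sum over n-grams of p(gram) * log2 p(last word | context)
    return -sum(
        (count / total) * math.log2(count / contexts[gram[:-1]])
        for gram, count in ngrams.items()
    )


if __name__ == "__main__":
    # Toy input; punctuation is kept as its own token so that segment
    # boundaries are represented in the sequence.
    toy = "the cat sat on the mat . the dog sat on the rug .".split()
    for n in (1, 2, 3, 4):
        print(n, round(conditional_entropy(toy, n), 3))
```

On a corpus as small as the toy example the estimates are dominated by sparse counts, so they only show the mechanics. Reproducing the reported decline for groups of 2, 3 and 4 words would require a large corpus and an estimator that copes with unseen sequences.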

