Computer-based corpora: problems of collection and interpretation of Kenyan texts in English
Nyamasyo, Eunice A.
Computer-based corpora as sources of language material for description are a relatively new concept in linguistics in Kenya, if not in Africa generally. The collection of relevant and/or appropriate text samples, whether spoken, written, or otherwise, is therefore fraught with a number of difficulties. The linguist is faced with a range of problems in, firstly, processing any collected material and, secondly, making correct interpretations of the said data. The computer, a recent innovative tool in language-based research, requires the researcher to have both data-inputting and data-processing skills. Text samples as the basis of data are obtained from various sources, some of which require special permission to access. Once acquired, text samples vary in origin and characteristics, hence raising issues of interpretation. Notwithstanding these difficulties, the Kenyan sample is an essential component of the International Corpus of English (ICE) as a source of data for the description of the present-day English language.