WORDS such as “yeah”, “so”, “oh” and “like” are among the 100 most used in Britain, researchers say.
Top of the list are “the”, “be”, “and”, “a” and “of”.
But the 5,000-strong list even includes “Victorian”, “Sydney” and “Belgium”, which appear around 13 times per one million words.
They are part of a dictionary collated by Lancaster University’s professor of corpus linguistics Vaclav Brezina and lecturer Dr Dana Gablasova.
They analysed a 100 million-word dataset spanning a wide range of spoken and written English, including informal speech, fiction, newspapers, academic writing and e-language.
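To put the "per one million words" figure in context, here is a minimal Python sketch of how such a relative frequency can be worked out. It is an illustration only, not the researchers' method: the function names `per_million` and `word_frequencies` are hypothetical, and the simple regular-expression tokeniser is an assumption that is far cruder than a real corpus tool.

```python
import re
from collections import Counter


def per_million(count, corpus_size):
    """Relative frequency: occurrences per one million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count * 1_000_000 / corpus_size


def word_frequencies(text):
    """Map each word in `text` to its frequency per million words.

    Assumption: a word is any run of letters or apostrophes,
    lower-cased. Real corpus software tokenises far more carefully.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {word: per_million(n, total) for word, n in counts.items()}


# A word seen about 13 times per million words would turn up roughly
# this many times in a 100 million-word dataset:
expected_hits = 13 * 100_000_000 / 1_000_000
print(expected_hits)
```

On that basis, a word such as "Belgium" at around 13 per million would occur roughly 1,300 times across the whole dataset.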
The book, A Frequency Dictionary of British English, helps distinguish between the different uses of words, which is essential for both language learners and researchers interested in how words occur in real contexts.
Professor Brezina said: “Words can tell us fascinating stories about how we live, what we find important and how we think about the world.
“This dictionary provides detailed insights into the use of English words across a number of contexts, a social geography of language, if you like.”
The dictionary is based on extensive research into current British English using the British National Corpus 2014 – a 100 million-word representative corpus developed at Lancaster University.
Researchers analysed all the data with a new tool, #LancsBox X, which was developed and custom-made by a team at Lancaster University.
Professor Brezina added: “Dictionaries were a lifetime’s work but now technology, including this cutting-edge tool, will bring together language research and computation to make it so much easier.
“This is the future of lexicography – automated and semi-automated processes which will radically alter the way we produce dictionaries, grammar books, and teaching materials.
“Also, not only can we accurately describe the uses and meanings of words, using #LancsBox X we can also easily visualise word associations and the connections between words, allowing us to directly witness the intricate structure of language.
“As the saying suggests, a picture can sometimes be worth a thousand words.”