About Dante

The Corpus

The 1.7-billion-word LEXMCI corpus of English was created by the Lexicography MasterClass in 2008 as a source of evidence for the lexicographers compiling the Dante database.

Its components include:

Corpus concordance extract

Texts in the LEXMCI corpus are annotated with information about their genre, mode (written or spoken), medium (book, website, etc.), and language variety (to distinguish American, British and Hiberno-English). This extensive annotation allowed the lexicographers creating the database to cover language variation across all of these dimensions. All the full-sentence examples in the database are drawn from this corpus.
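To illustrate how text-level annotation of this kind can be used, here is a minimal Python sketch of selecting texts by their metadata. It is not the LEXMCI format or any tool used by the Dante team: the `TextMetadata` class, the `filter_texts` helper, the sample records, and the particular genre and medium values are all hypothetical, chosen only to mirror the four annotation fields described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextMetadata:
    """Hypothetical record mirroring the four annotation fields described above."""
    text_id: str
    genre: str
    mode: str     # "written" or "spoken"
    medium: str   # e.g. "book", "website"
    variety: str  # "American", "British" or "Hiberno-English"


def filter_texts(texts, **criteria):
    """Return the texts whose metadata match every given field=value criterion."""
    return [
        text
        for text in texts
        if all(getattr(text, field) == value for field, value in criteria.items())
    ]


# Invented sample records, for illustration only (not real corpus data).
SAMPLE_TEXTS = [
    TextMetadata("t1", "fiction", "written", "book", "British"),
    TextMetadata("t2", "news", "written", "website", "American"),
    TextMetadata("t3", "conversation", "spoken", "transcript", "Hiberno-English"),
]

if __name__ == "__main__":
    # Select only the Hiberno-English texts and print their identifiers.
    hiberno = filter_texts(SAMPLE_TEXTS, variety="Hiberno-English")
    print([text.text_id for text in hiberno])
```

Restricting a search to one variety, mode or medium in this way is what lets a lexicographer check whether a usage is specific to, say, spoken Hiberno-English or is found across the corpus as a whole.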