Mark Davies: Publications

Books and monographs
1 2017 A Frequency Dictionary of Spanish: Core Vocabulary for Learners. Second edition: revised and expanded. Routledge. (Co-authored with Kathy Hayward Davies)
2 2010 A Frequency Dictionary of American English: Word Sketches, Collocates, and Thematic Lists. Routledge. (Co-authored with Dee Gardner.)
3 2009 Corpus linguistic applications: current studies, new directions. Rodopi. (Co-editor, with Stefan Gries and Stefanie Wulff.)
4 2007 A Frequency Dictionary of Portuguese: Core Vocabulary for Learners. Routledge. (Co-authored with Ana Maria Raposo Preto-Bay)
5 2005 A Frequency Dictionary of Spanish: Core Vocabulary for Learners. Routledge.
6 2004 El uso del Corpus del Español y otros corpus para investigar la variación actual y los cambios históricos. Tokyo: Univ. Sophia.
Journal articles and chapters
7 2021 "The Coronavirus Corpus: design, construction, and use." International Journal of Corpus Linguistics. 26(4): 583-98.
8 2021 "Constitución de corpus crecientes del español". In Giovanni Parodi, Pascual Cantos, Chad Howe. The Routledge Handbook of Spanish Corpus Linguistics. (With Giovanni Parodi)
9 2020 "The TV and Movies corpora: design, construction, and use." International Journal of Corpus Linguistics. 26(1): 10-37.
10 2019 "The advantages and challenges of ‘big data’: Insights from the 14 billion word iWeb corpus". Linguistic Research 36(1), 1-34. (With Jong-Bok Kim)
11 2019 "The best of both worlds: Multi-billion word ‘dynamic’ corpora". In Piotr Bański, et al. Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-7) 2019. Mannheim: Leibniz-Institut für Deutsche Sprache.
12 2019 "If olive oil is made of olives, then what’s baby oil made of? The shifting semantics of Noun+Noun sequences in American English." In J. Egbert & P. Baker (Eds.), Using corpus methods to triangulate linguistic analysis, New York: Routledge. 163-84. (With Jesse Egbert)
13 2019 "Historical shifts with the into-causative construction in American English." Linguistics 57: 29-58. (With Jong-Bok Kim)
14 2018 "Sorting them all out: Exploring the separable phrasal verbs of English." System 76: 197-209. (With Dee Gardner)
15 2018 "Using (and useful) corpora for the study of the history of English". In Teaching the History of the English Language, eds. Chris Palmer and Colette Moore. MLA Options for Teaching Series.
16 2018 "Corpus-based studies of lexical and semantic variation: The importance of both corpus size and corpus design." In From data to evidence in English language research (Digital Linguistics), eds. Suhr, Carla, Terttu Nevalainen and Irma Taavitsainen. Leiden: Brill. 34-55.
17 2018 "Uso del Corpus del Español y los corpus relacionados para la lexicografía histórica española." In Historia del léxico español y Humanidades digitales. Eds. Alejandro Fajardo, et al. Berlin: Peter Lang. 49-76.
18 2017 "Using Large Online Corpora to Examine Lexical, Semantic, and Cultural Variation in Different Dialects and Time Periods". In Corpus-Based Sociolinguistics, ed. Eric Friginal et al. London: Routledge. 19-82.
19 2016 "The Effect of Representativeness and Size in Historical Corpora: An Empirical Study of Changes in Lexical Frequency." In Studies in the History of the English Language VII: Generalizing vs. particularizing methodologies in historical linguistic analysis, eds. Don Chapman, Colette Moore, and Miranda Wilcox. Berlin: De Gruyter / Mouton. 131-50. (With Don Chapman)
20 2016 "The Into Causative Construction in English: A Construction-based Perspective." English Language and Linguistics 20 (1): 55-83. (With Jong-Bok Kim)
21 2016 "A response to ‘To what extent is the Academic Vocabulary List relevant to university student writing?’". English for Specific Purposes 42: 62-68. (With Dee Gardner)
22 2015 "Corpora: An Introduction". In Cambridge Handbook of English Corpus Linguistics, eds. Douglas Biber and Randi Reppen. Cambridge: Cambridge University Press. 11-31.
23 2015 "A Corpus Linguistic Approach to Vocabulary Learning for University Students." In ESL Readers and Writers in Higher Education: Understanding Challenges, Providing Support, eds. Norm Evans, Neil Anderson, and William Eggington. London: Routledge. 180-197. (With Dee Gardner)
24 2015 "Introducing the 1.9 Billion Word Global Web-Based English Corpus (GloWbE)." 21st Century Text. (Peer-reviewed, online journal).
25 2015 "Exploring the Composition of the Web: A Corpus-based Taxonomy of Web Registers". Corpora 10 (1): 11-45. (With Douglas Biber and Jesse Egbert)
26 2015 "Expanding Horizons in the Study of World Englishes with the 1.9 Billion Word Global Web-Based English Corpus (GloWbE)." English World-Wide 36: 1-28. (With Robert Fuchs)
27 2015 "The importance of robust corpora in providing more realistic descriptions of variation in English grammar". In Linguistic Vanguard (peer-reviewed online journal from Mouton de Gruyter)
28 2015 "Developing a Bottom-up, User-based Method of Web Register Classification". Journal of the Association for Information Science and Technology. (With Douglas Biber and Jesse Egbert)
29 2014 "Making Google Books n-grams useful for a wide range of research on language change". International Journal of Corpus Linguistics 19 (3): 401-16.
30 2014 "Powerful (yet simple) comparisons of a wide range of phenomena in British and American English". ICAME Journal 38: 35-56.
31 2014 "Creating and Using the Corpus do Português and the Frequency Dictionary of Portuguese". In Working with Portuguese Corpora, eds. Tony Berber Sardinha and Telma Ferreira. Continuum Publishers. 89-110.
32 2014 "Examining syntactic variation in English: the importance of corpus design and corpus size". English Language and Linguistics 19 (3): 1-35.
33 2013 "Google Scholar vs. COCA: two very different approaches to examining academic English". Journal of English for Academic Purposes 12: 155-165.
34 2013 "A New Academic Vocabulary List." In Applied Linguistics 35: 1-24. (With Dee Gardner)
35 2013 "Establishing Corpora from Existing Data Sources". In Data Collection in Sociolinguistics: Methods and Applications, ed. Christine Mallinson, et al. London: Routledge. 210-12.
36 2012 "Expanding Horizons in Historical Linguistics with the 400 million word Corpus of Historical American English". Corpora 7: 121-57.
37 2012 "Examining Recent Changes in English: Some Methodological Issues". In The Oxford Handbook of the History of English, eds. Terttu Nevalainen and Elizabeth Closs Traugott. Oxford: Oxford Univ. Press. 263-87.
38 2012 "Recent shifts with three nonfinite verbal complements in English: Data from the 100 million word TIME Corpus (1920s-2000s)". In Current Change in the English Verb Phrase, ed. Bas Aarts, et al. Cambridge: Cambridge Univ. Press. 46-67.
39 2012 "The 400 Million Word Corpus of Historical American English (1810-2009)". In English Historical Linguistics 2010, ed. Irén Hegedus, et al. Philadelphia: John Benjamins. 217-50.
40 2012 "Looking at Recent Changes in English with the Corpus of Contemporary American English (COCA)". 21st Century Text. (Peer-reviewed, online journal).
41 2012 "Comparisons between Google Books and Google Books Corpus." Computer-assisted Foreign Language Education. 145: 15-18. (With Xingfu Wang).
42 2011 "Synchronic and Diachronic Uses of Corpora". In Perspectives on Corpus Linguistics: Connections & Controversies, eds. Vander Viana, Sonia Zyngier and Geoff Barnbrook. Philadelphia: John Benjamins. 63-80.
43 2011 "Creating and Using the Frequency Dictionary of Contemporary American English: Word Sketches, Collocates, and Thematic Lists". In Corpus-based studies in language use, language learning, and language documentation, ed. John Newman, et al. Amsterdam: Rodopi. 283-97.
44 2011 "The Corpus of Contemporary American English as the First Reliable Monitor Corpus of English". Literary and Linguistic Computing 25: 447-65.
45 2010 "More than a peephole: Using large and diverse online corpora". International Journal of Corpus Linguistics 15: 405-11.
46 2010 "Semantically-based, learner-oriented queries with the 400+ million word Corpus of Contemporary American English". Łódź Studies in Language, ed. Stanislaw Goźdź-Roszkowski. Frankfurt: Peter Lang.
47 2010 "Creating Useful Historical Corpora: A Comparison of CORDE, the Corpus del Español, and the Corpus do Português". In Diacronía de las lenguas iberorromances: nuevas perspectivas desde la lingüística de corpus, ed. Andrés Enrique-Arias. Frankfurt/Madrid: Vervuert/Iberoamericana. 137-66.
48 2010 "What students need (and want): semantically-oriented queries in large online corpora". SYNAPS (Bergen) 24: 27-40.
49 2009 "The 385+ Million Word Corpus of Contemporary American English (1990-2008+): Design, Architecture, and Linguistic Insights". International Journal of Corpus Linguistics. 14: 159-90.
50 2009 "Relational databases as a robust architecture for the analysis of word frequency". In What's in a Wordlist?: Investigating Word Frequency and Keyword Extraction, ed. Dawn Archer. London: Ashgate. 53-68.
51 2008 "Spanish and Portuguese Corpus Linguistics". Studies in Hispanic and Lusophone Linguistics. 1: 149-86.
52 2008 "The corpus-based Frequency Dictionary of Portuguese: A new tool for learners and teachers." In Proceedings of TALC 8: Teaching and Language Corpora, ed. Ana Frankenberg-Garcia, et al. Lisbon. (Co-authored with Ana Maria Raposo Preto-Bay)
53 2008 "The Corpus of Contemporary American English--a Useful Tool for English Teaching and Research". Computer-Assisted Foreign Language Education in China. 5: 24-31 (Co-authored with Wang Xingfu and Liu Guohui).
54 2007 "Pointing Out Frequent Phrasal Verbs: A Corpus-Based Analysis". TESOL Quarterly 41: 339-59. (Co-authored with Dee Gardner)
55 2007 "Semantically-based queries with a joint BNC/WordNet database". In Corpus Linguistics Twenty-five Years On, ed. Roberta Facchinetti. Amsterdam: Rodopi. 149-167.
56 2006 "Towards the first comprehensive survey of register variation in Spanish". In Corpus Linguistics Beyond the Word: Corpus Research from Phrase to Discourse, ed. Eileen Fitzpatrick. Rodopi. 73-86.
57 2006 "Vocabulary Coverage in Spanish Textbooks: How Representative is It?" In Selected Proceedings from the Conference on the Acquisition of Spanish and Portuguese as First and Second Languages, ed. Jacqueline Toribio. Cascadilla. 132-43. (Co-authored with Timothy L. Face).
58 2006 "Spoken and written register variation in Spanish: A Multi-dimensional Analysis." Corpora 1: 1-37. (Co-authored with Doug Biber, James Jones, and Nicole Tracy-Ventura).
59 2005 "The advantage of using relational databases for large corpora: speed, advanced queries, and unlimited annotation". International Journal of Corpus Linguistics 10: 301-28.
60 2005 "On diachronic shifts with Spanish se: preliminary evidence from large electronic corpora." In Romance Corpus Linguistics II: Corpora and Diachronic Linguistics, ed. Claus Pusch, et al. Gunter Narr. 431-42.
61 2005 "Vocabulary Range and Text Coverage: Insights from the Forthcoming Routledge Frequency Dictionary of Spanish". In Selected Proceedings from the 7th Hispanic Linguistics Symposium, ed. David Eddington. 106-15.
62 2005 "Advanced research on syntactic and semantic change with the Corpus del Español". In Romance Corpus Linguistics II: Corpora and Diachronic Linguistics, ed. Claus Pusch, et al. Gunter Narr. 203-14. Reprinted in: Corpus Linguistics. Critical Concepts in Linguistics (6 vols.). Ed. Teubert, Wolfgang & Ramesh Krishnamurthy. London: Routledge. 337-48 (Volume 5).
63 2004 "Student use of large, annotated corpora to analyze syntactic variation". In Corpora and Language Learners, ed. Guy Aston, et al. Philadelphia: John Benjamins. 259-69.
64 2004 "Student use of large corpora to investigate language change". In Applied Corpus Linguistics: A Multidimensional Perspective, ed. Thomas Upton, et al. Amsterdam: Rodopi. 207-22.
65 2003 "Diachronic Shifts and Register Variation with the "Lexical Subject of Infinitive" Construction. (Para yo hacerlo)". In Linguistic Theory and Language Development in Hispanic Languages, ed. Silvina Montrul and Francisco Ordóñez. Somerville, MA: Cascadilla Press. 13-29.
66 2003 "Annotation without lexicons: an alternative to the standard bootstrapping approach". In Proceedings from Corpus Linguistics 2003, ed. Paul Rayson, et al. 174-83.
67 2002 "Un corpus anotado de 100.000.000 palabras del español histórico y moderno". SEPLN 2002 (Sociedad Española para el Procesamiento del Lenguaje Natural). 21-27.
68 2002 "'Esto es ligero de fazer': Object to Subject Raising in Medieval and Early Modern Spanish". In Structure, Meaning, and Acquisition of Spanish, ed. James F. Lee, et al. Somerville, MA: Cascadilla Press. 19-31.
69 2001 "Creating and using multi-million word corpora from web-based newspapers". In Corpus Linguistics in North America, eds. Rita C. Simpson and John M. Swales. Ann Arbor: U Michigan P. 58-75.
70 2000 "Using multi-million word corpora of historical and dialectal Spanish texts to teach advanced courses in Spanish linguistics". In Rethinking Language Pedagogy from a Corpus Perspective, eds. Lou Burnard and Tony McEnery. Frankfurt am Main; New York: P. Lang. 173-85.
71 2000 "Syntactic Diffusion in Spanish and Portuguese Infinitival Complements". In New Approaches to Old Problems: Issues in Romance Historical Linguistics, eds. Steven Dworkin and Dieter Wanner. Amsterdam; Philadelphia: John Benjamins. 109-27.
72 1999 "The Historical Development of Subject Raising in Portuguese: A Corpus-Based Approach". Neuphilologische Mitteilungen 100: 95-110.
73 1999 "A Computer Corpus-Based Study of Subject Raising in Modern Portuguese". Lingvisticae Investigationes 21: 379-400.
74 1998 "The Evolution of Spanish Clitic Climbing: A Corpus-Based Approach." Studia Neophilologica 69: 251-63.
75 1997 "A Corpus-Based Approach to Diachronic Clitic Climbing in Portuguese." Hispanic Journal 17: 93-111.
76 1997 "Using Large Computer-Based Corpora as a Philological Tool: An Analysis of Four Medieval Spanish Bibles." Dactylus 16: 70-92.
77 1997 "The History of Subject Raising in Spanish". Bulletin of Hispanic Studies (Liverpool) 74: 399-411.
78 1997 "A Corpus-Based Analysis of Subject Raising in Modern Spanish." Hispanic Linguistics 9: 33-63.
79 1996 "The Diachronic Interplay of Finite and Nonfinite Verbal Complements in Spanish and Portuguese." Bulletin of Hispanic Studies (Glasgow) 73: 137-58.
80 1996 "The Diachronic Evolution of the Causative Construction in Portuguese." Journal of Hispanic Philology 17: 261-92.
81 1995 "The Evolution of Causative Constructions in Spanish and Portuguese." In Current Research in Romance Linguistics, ed. John Amastae, et al. Philadelphia: John Benjamins, 1995. 105-122.
82 1995 "The Evolution of the Spanish Causative Construction." Hispanic Review 63: 57-77.
83 1995 "Analyzing Syntactic Variation with Computer-Based Corpora: The Case of Modern Spanish Clitic Climbing". Hispania 78: 370-380.
84 1994 "Parameters, Passives, and Parsing: Explaining Diachronic Shifts in Spanish and Portuguese". In Variation and Linguistic Theory, ed. K. Beals, et al. Chicago: CLS. Vol 2. 46-60.
85 1992 "A Tentative Bibliography of Historical Spanish Syntax." Hispanic Linguistics 5: 279-351.
Reviews
86 2009 Review of Using Spanish Corpora (Giovanni Parodi). Modern Language Journal. 93: 467-68.
87 2009 Review of The International Corpus of English – British Component (ICE-GB), the Diachronic Corpus of Present-day Spoken English (DCPSE), and ICECUP 3.1. Language. 85: 443-45.
88 2004 Review of Léxico Hispanoamericano (Peter Boyd-Bowman, et al). La Corónica: A Journal of Medieval Spanish Literature and Language 33: 259-64.
89 2004 Review of Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching (Sylvaine Granger, et al). Modern Language Journal. 88: 469-70.
90 2001 Review of Construcciones causativas en el español medieval (Milagros Alfonso Vega). Revista Canadiense de Estudios Hispánicos 25: 329-30.
91 1995 "Omnipage and WordCruncher: Tools for Creating and Searching Digitized Text Corpora." La Corónica 23: 111-115.