Automatically identifying typical vocabulary in Swedish textbooks in the natural sciences
Ribeck, Judy
Karlstad University, Faculty of Arts and Social Sciences (starting 2013), Department of Language, Literature and Intercultural Studies (from 2013). Göteborgs universitet. ORCID iD: 0000-0003-0597-1546
2016 (English). In: ECER 2016, Leading Education: The Distinct Contributions of Educational Research and Researchers, 2016. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

According to functional linguistic theory, the acquisition of knowledge and the acquisition of language are parallel phenomena (see for example Halliday 1985). At the same time, different subjects and disciplines have developed their own ways of describing the world and construing meaning, and their own linguistic means of exploiting and presenting this knowledge. As concluded by Ribeck and Borin (in press) within the field of Swedish textbook studies, subjects cannot be generalized with respect to lexical features. Sadly, both Nordic and international studies have indicated that subject-specific linguistic distinctions are rarely made explicit to students (Chandler 1995, Dysthe et al. 2006). A direct consequence is that students often perceive form level transitions as "discoursal shocks" (Ask 2005). To prevent such language-related traumas and instead offer well-informed and adequate linguistic support to students, schools need to be more aware of the specific linguistic demands at different form levels. Altogether, there seems to be an undisputed call for subject-specific language studies.

My poster presents a study that forms part of my newly published dissertation in natural language processing within the field of subject-specific language. The overall purpose of my thesis is to account for subject-specific grammatical features at the educational levels of Swedish secondary and upper secondary school. This undertaking includes the identification of typical linguistic patterns related to vocabulary, phraseology and syntax. I am also interested in describing the linguistic progression in the textbooks of different subjects, from secondary school textbooks, through upper secondary school textbooks, up to academic (university-level) texts.

In my poster presentation, I intend to focus on the lexical inventory part of this task, which includes automatic identification of core vocabulary with subject-specific usages. I will show examples of index lists of vocabulary extracted from a corpus of 5.2 million words originating from Swedish secondary and upper secondary textbooks in the natural sciences, social sciences and mathematics. The linguistic investigation focuses on features in the natural sciences, i.e. biology, physics and chemistry; textbooks from other disciplines are mainly included for comparative purposes. Besides being compared internally with the textbook registers of other disciplines, the textbooks in natural science are also compared to reference corpora comprising narrative and academic texts.

Method

My work describes a quantitative procedure for characterizing the register of Swedish textbooks in the natural sciences, i.e. in the subject fields of biology, chemistry and physics. The method is empirical and corpus-based: it relies on automatic linguistic annotations produced by language technology tools to calculate what I call index lists, that is, rank-ordered lists of linguistic features that are characteristic of a specific text corpus as compared to reference texts. The idea of such lists originates from the project to develop an academic word list for Swedish (Ribeck et al. 2014). I produce index lists of typical vocabulary, extracted from a 5.2 million word textbook corpus. As well as being frequent and well dispersed, the linguistic variables selected for the index lists are characteristic of the text type in question, as is evident when they are compared to a reference corpus comprising textbooks in the social sciences and mathematics, as well as narrative and academic texts.
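
As a rough illustration of how an index list of this kind could be computed, the following Python sketch keeps words that are frequent and well dispersed in a target corpus and ranks them by how much more common they are there than in a reference corpus. The abstract does not state which measures the study uses, so everything specific here is an assumption made for illustration: the function name `index_list`, the thresholds `min_freq` and `min_doc_share`, the smoothed log-ratio score, and the toy sentences.

```python
import math
from collections import Counter


def index_list(target_docs, reference_docs, min_freq=2, min_doc_share=0.5):
    """Rank words typical of target_docs relative to reference_docs.

    Each corpus is a list of documents; each document is a list of tokens.
    Returns (word, score) pairs, highest score first.
    All thresholds and the scoring formula are illustrative assumptions.
    """
    target_counts = Counter(tok for doc in target_docs for tok in doc)
    reference_counts = Counter(tok for doc in reference_docs for tok in doc)
    target_size = sum(target_counts.values())
    reference_size = sum(reference_counts.values())

    # Dispersion: in how many target documents does each word occur?
    doc_counts = Counter(tok for doc in target_docs for tok in set(doc))

    ranked = []
    for word, freq in target_counts.items():
        if freq < min_freq:
            continue  # not frequent enough
        if doc_counts[word] / len(target_docs) < min_doc_share:
            continue  # not well dispersed across documents
        # Add-one smoothing keeps the score finite for words that are
        # absent from the reference corpus.
        target_rate = (freq + 1) / (target_size + 1)
        reference_rate = (reference_counts[word] + 1) / (reference_size + 1)
        ranked.append((word, math.log2(target_rate / reference_rate)))

    # Highest score first; ties are broken alphabetically.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


# Toy data standing in for tokenized textbook and reference texts.
science = [
    "en atom har protoner".split(),
    "en atom har elektroner".split(),
]
other = [
    "en kung har en dotter".split(),
    "en stat har en budget".split(),
]

words = [word for word, _ in index_list(science, other)]
# words == ['atom', 'har', 'en']
```

In this toy run the content word "atom" ranks first because it never occurs in the reference texts, while the function word "en" ends up last with a negative score. A real index list would be built on lemmatized and part-of-speech-tagged text rather than raw tokens.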

Expected Outcomes

The results show that textbooks in natural science contain a lot of content-specific, technical vocabulary. This characteristic distinguishes natural scientific language not only from everyday language but also from social scientific language, which on the lexical level has more in common with narrative texts. In the transition between secondary and upper secondary school, there is an evident increase in the linguistic demands on readers. In the upper secondary textbooks the words are longer and the vocabulary is more varied. Notably, the linguistic development between the form levels is more marked in the natural science textbooks than in those for social sciences and mathematics. Nevertheless, the textbook language overall shows a relatively low lexical complexity in comparison to academic language. To this day, the Swedish curriculum lacks explicit descriptions of the expected linguistic progression in relation to different subjects and form levels. With my results I make an empirical contribution to the theoretical foundation of how the language of natural science is construed in the Swedish educational context. My wish is that my study will give educational researchers, language instructors and other educationists who actively work to support students' development of subject-specific linguistic competence a basis for further discussion on how effective language teaching should be designed. I also hope that presenting this study to educational researchers from other countries will inspire them to conduct similar linguistic investigations of the texts that students in their countries need to master during their years of schooling.

References

Ask, Sofia 2005. Tillgång till framgång. Lärare och studenter om stadieövergången till högre utbildning. Växjö: University of Växjö, Department of arts.

Chandler, Daniel 1995. The act of writing. A media theory approach. Aberystwyth: University of Wales.

Dysthe, Olga, Siri Breistein, Jens Kjeldsen and Liv Ingeborg Lied 2006. Studentperspektiv på rettleiing. Dysthe, Olga and Akylina Samara (eds.), Forskningsveiledning på master- og doktorgradsnivå, 207–227. Oslo: Abstrakt Forlag.

Halliday, Michael Alexander Kirkwood 1985. An introduction to functional grammar. London: Edward Arnold Ltd.

Ribeck, Judy and Lars Borin 2014. Lexical bundles in Swedish secondary school textbooks. Human language technology challenges for computer science and linguistics. Lecture notes in computer science, 238–249. Berlin: Springer International Publishing.

Ribeck, Judy, Håkan Jansson and Emma Sköldberg 2014. Från aspekt till övergripande – en ordlista över svensk akademisk vokabulär. Vatvedt Fjeld, Rut and Marit Hovdenak (eds.), Nordiska studier i lexikografi 12. Rapport från konferensen om lexikografi i Norden, August 2013. Oslo, 370–384.

Place, publisher, year, edition, pages
2016.
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:kau:diva-71793
OAI: oai:DiVA.org:kau-71793
DiVA, id: diva2:1303358
Conference
ECER 2016, Leading Education: The Distinct Contributions of Educational Research and Researchers, 22-26 August, 2016, Dublin
Available from: 2019-04-09 Created: 2019-04-09 Last updated: 2019-04-17 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

https://www.eera-ecer.de/ecer-programmes/conference/21/contribution/37609/

Authority records BETA

Ribeck, Judy
