Basic natural language processing techniques for the analysis of mentions of women in Spanish graded readers

Iban Mañas Navarrete, María-Valle Sell

Abstract


Analyses of gender representation in textbooks of Spanish as a Foreign Language (SFL) have been limited by their reliance on manual counts. The present study takes a different approach: it investigates gender representation in a corpus of 47 graded readers for students of L2 Spanish by applying basic Natural Language Processing techniques based on the work of Lucy, Demszky, Bromley, and Jurafsky (2020). It examines differences in the space devoted to each gender and how those differences evolve over time, and it explores how far the gender of the main characters explains the distribution of mentions of men and women. The results show that women are underrepresented, even in books whose main characters include both women and men. In addition to these results, this work provides a set of text-analysis tools that the community can apply to other texts or adapt to analyze other social aspects of language use and to characterize SFL teaching materials in greater depth.
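As a rough illustration of what counting mentions by gender can look like in practice, the following minimal sketch tallies gendered words in a Spanish text. It is not the authors' toolset: the word lists, the function names `count_gendered_mentions` and `share_of_women`, and the sample sentence are all hypothetical, and a lexicon-only count of this kind ignores proper names and the generic masculine plural.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny word lists (assumption, not from the study).
# A real analysis would need much fuller lexicons and named-entity handling.
FEMININE = {"ella", "ellas", "mujer", "mujeres", "niña", "niñas",
            "señora", "madre", "hermana"}
MASCULINE = {"él", "ellos", "hombre", "hombres", "niño", "niños",
             "señor", "padre", "hermano"}

# \w matches Unicode letters in Python 3, so accented words stay whole.
TOKEN_RE = re.compile(r"\w+")


def count_gendered_mentions(text):
    """Count tokens that appear in the feminine or masculine word list."""
    counts = Counter()
    for token in TOKEN_RE.findall(text.lower()):
        if token in FEMININE:
            counts["women"] += 1
        elif token in MASCULINE:
            counts["men"] += 1
    return counts


def share_of_women(counts):
    """Return the proportion of gendered mentions that refer to women."""
    total = counts["women"] + counts["men"]
    return counts["women"] / total if total else None


# Invented sample sentence, used only to exercise the functions.
sample = "La mujer habló con el señor. Él y su hermano escucharon a la niña."
counts = count_gendered_mentions(sample)
print(counts["women"], counts["men"], share_of_women(counts))
```

Keeping the accent on "él" distinguishes the pronoun from the article "el", which is why the tokens are matched exactly rather than with accents stripped.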

Keywords


gender studies; textbooks; graded readers; NLP; Spanish as Foreign Language; text as social practice



DOI: https://doi.org/10.22201/enallt.01852647p.2024.78.1075



Copyright (c) 2024 Estudios de Lingüística Aplicada

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.