Section outline

  • Course Overview

    In this course you'll learn how to approach the study of language and social interactions in digital environments so you can analyze textual data sets from social media sites, digital archives, and digital surveys and interviews.

    Learning Outcomes

    By the end of this course you will understand:

    1. The foundations of Natural Language Processing (NLP)
    2. How text mining tools have been used successfully by social scientists
    3. Basic text processing techniques
    4. How to approach narrative analysis, thematic analysis, and metaphor analysis
    5. Key computer science methods for text mining, such as text classification and opinion mining

    • Course Instructors

      • Gabe Ignatow

        Gabe Ignatow is a Professor in the Department of Sociology at the University of North Texas. His research interests are in sociological theory, digital research methods, cognitive social science, and philosophy of social science. He currently serves on the editorial boards of Sociological Forum and the Journal for the Theory of Social Behaviour. In addition to two recent books on text mining co-authored with Rada Mihalcea, Gabe has co-authored a forthcoming volume on digital social research methods and co-edited the Oxford Handbook of Cognitive Sociology. He is currently working on a book project on sociological theory in the digital age while serving as his department's graduate program director.

      • Rada Mihalcea

        Rada Mihalcea is a Professor in the Computer Science and Engineering department at the University of Michigan. Her research interests are in computational linguistics, with a focus on lexical semantics, multilingual natural language processing, and computational social science. She serves or has served on the editorial boards of the journals Computational Linguistics, Language Resources and Evaluation, Natural Language Engineering, Research on Language and Computation, IEEE Transactions on Affective Computing, and Transactions of the Association for Computational Linguistics.

      • Pre-Course Self Assessment

        Before you dive into this course, spend a few moments reflecting on your familiarity with the topic and your current confidence in your skills.

        You will then revisit the same questions in the Post-Course Self Assessment and reflect on how the course has helped you build your confidence and grow your skills.

        • Module One: Foundations

          In this first module, we cover the foundational concepts involved in text mining. Specifically, we define text mining and discuss how it relates to text analysis. We then discuss how to acquire textual data that you can use for your own research project. Finally, we help you to identify appropriate ethical guidelines for your research project and to consider several philosophical issues that will help you to define your project and what you can expect to learn from it.

        • Module Two: Research Design and Basic Tools

          This module covers two main topics. The first, research design, will be vitally important to you if you are working on your own research project. If you are interested in learning about text mining but are not currently working on a project of your own, you may wish to spend less time on research design. The second topic, basic tools, is important for all text mining students.

        • Module Three: Text Mining Fundamentals

          In this module, we cover the fundamental principles and procedures for text mining as developed in computational linguistics. This is a technical module covering some of the main concepts of Natural Language Processing (NLP). If you are interested in learning NLP, you will benefit a great deal from the lessons in this module. If you are focused on your own social science research project, you should learn the basics of NLP, but you can be more selective about which lessons in this module you study in depth.

        • Module Four: Methods from the Humanities and Social Sciences

          In this module, we provide a smorgasbord of tools and techniques for text mining and text analysis. These tools and techniques have been developed by scholars in the humanities and social sciences, and they have all proven their value in numerous published empirical research projects (books, journal articles, reports, etc.). We recommend you learn the fundamentals of each approach while steadily narrowing your focus to the tools and techniques that are best suited to your own project.

        • Module Five: Computer Science Methods

          In this final module, we take a deep dive into the major text mining procedures developed by computer scientists and other researchers working in Natural Language Processing (NLP). These include automated procedures for classifying texts, measuring sentiments and opinions expressed in texts, extracting information from texts, and analyzing the topics discussed within texts. This module contains the most technical lessons of the course, so be sure to take your time to gain as much as possible from each lesson.
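
          To give a flavor of what "measuring sentiments and opinions expressed in texts" can look like in practice, here is a minimal Python sketch of lexicon-based sentiment scoring. It is an illustration only and is not taken from the course materials: the word lists POSITIVE and NEGATIVE and the helper names tokenize, sentiment_score, and label are hypothetical, and real projects would typically rely on a published sentiment lexicon or a trained classifier.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons, for illustration only. Real studies would use
# a published sentiment lexicon or a trained classifier instead.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Return the count of positive tokens minus the count of negative tokens."""
    counts = Counter(tokenize(text))
    positive_hits = sum(counts[word] for word in POSITIVE)
    negative_hits = sum(counts[word] for word in NEGATIVE)
    return positive_hits - negative_hits


def label(text):
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for post in [
        "I love this great course",
        "The audio was terrible and the slides were bad",
        "The course has five modules",
    ]:
        print(label(post), "|", post)
```

          A word-counting approach like this ignores negation, sarcasm, and context, which is one reason the more sophisticated procedures named in this module exist.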

        • Post-Course Self Assessment

          Now that you've completed the course, spend a few moments reflecting on your current familiarity with the topics and your confidence in your skills.

          Has the course helped you develop new skills and grow your confidence?

          You'll need to complete the Post-Course Self Assessment in order to download your certificate. If you didn't do the Pre-Course Self Assessment before starting the course, please go to the top of the page and reflect on how familiar you were with the topic, and how confident you were in your skills, before you started the course.

          • Completion: Certificate

            Completing the course and Post-Course Self Assessment will unlock a course certificate, which you can download here.

            • Give Feedback About This Course

              Did you enjoy the course? Please take two minutes to share your feedback. We use learner feedback to guide future course updates and to improve the learning experience.

            • Accessibility, Diversity, Equity and Inclusion