Writing Style Differences between Jane Austen and Charles Dickens

Motivated by an interest in sentiment analysis in NLP, we carried out this project, our first in sentiment analysis, as the first assignment in our studies.


PART 1: Exploring Emotional Characteristics

This project explores the emotional characteristics of texts written by two celebrated British authors, Jane Austen and Charles Dickens, using text mining and sentiment analysis. We obtained paragraph-level datasets from Kaggle, filtered them, and reduced their size, then analyzed:

  • Polarity (positive/negative sentiment)
  • Subjectivity
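
As a rough illustration of what these two scores measure, the sketch below averages per-word polarity and subjectivity values over a paragraph. The lexicon, its values, and the function name `score_paragraph` are hypothetical and chosen only for this example; the project may well have used an existing sentiment library instead.

```python
import re

# Hypothetical toy lexicon: word -> (polarity in [-1, 1], subjectivity in [0, 1]).
# The values are illustrative only and do not come from the project or a real lexicon.
TOY_LEXICON = {
    "happy": (0.8, 1.0),
    "agreeable": (0.5, 0.7),
    "miserable": (-1.0, 1.0),
    "poor": (-0.4, 0.6),
}


def score_paragraph(text):
    """Return (polarity, subjectivity) averaged over the lexicon words found in text."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    if not hits:
        # No opinion words found: treat the paragraph as neutral and objective.
        return 0.0, 0.0
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity


# Invented sentences, used here only to exercise the function.
print(score_paragraph("She was happy and found him agreeable."))
print(score_paragraph("The poor child was miserable."))
```

Polarity near +1 or -1 marks strongly positive or negative wording, while subjectivity near 1 marks opinionated rather than factual wording.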

The results were visualized using:

  • Radar charts
  • Box plots
  • Word clouds

Goal

The main goal of this study is to compare the emotional tendencies in the two authors' writing styles and to test two common literary views:

  • Jane Austen: Often characterized as humorous and generally positive, with relatively balanced emotional changes.
  • Charles Dickens: Known for greater emotional fluctuation, focusing on themes such as social injustice, poverty, and the complexity of human nature.

Findings

The findings of this project provide a valuable foundation for:

  • Developing automated writing style simulations.
  • Performing literary text analysis.

Additionally, the results help us:

  • Better understand the authors’ backgrounds and the historical context they wrote in.
  • Reflect on how readers today might respond to their works.

PART 2: Advancing Research with NLP Techniques

Part 2 builds on Part 1 by further examining the stylistic and emotional characteristics found in the works of Jane Austen and Charles Dickens.

PART 1 Findings

We discovered:

  • Austen’s writing tends to be more positively inclined overall.
  • Dickens’s works display greater emotional fluctuation and a higher proportion of neutral sentiment.
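
One possible way to quantify these comparisons is to summarize each author's paragraph-level polarity scores by mean, spread, and share of near-neutral paragraphs. The sketch below assumes standard deviation as the measure of fluctuation and a neutral band of ±0.05; both are assumptions for illustration, and the score lists are made up rather than taken from the project's results.

```python
from statistics import mean, pstdev


def summarize(polarities, neutral_band=0.05):
    """Summarize paragraph polarity scores for one author.

    neutral_band is an assumed threshold: scores within +/- this value
    are counted as neutral.
    """
    neutral = sum(1 for p in polarities if abs(p) <= neutral_band)
    return {
        "mean": mean(polarities),            # overall positive/negative lean
        "spread": pstdev(polarities),        # larger value = more fluctuation
        "neutral_share": neutral / len(polarities),
    }


# Made-up scores for illustration only; not results from the project.
author_a = [0.30, 0.25, 0.20, 0.35, 0.10]
author_b = [0.60, -0.50, 0.00, 0.02, -0.30]

print(summarize(author_a))
print(summarize(author_b))
```

With these invented inputs, `author_a` shows a higher mean and a smaller spread, while `author_b` shows a larger spread and a higher neutral share, which is the pattern the bullets above describe.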

PART 2 Objectives

We use the same pre-processed dataset to conduct a more in-depth analysis:

  • Syntactic structure analysis
  • Part-of-speech distribution (adjectives, adverbs, and verbs)
  • Sentiment analysis enriched by:
    • Compound sentiment indices
    • Subjectivity indices
    • TF-IDF (Term Frequency-Inverse Document Frequency)
    • Named Entity Recognition (NER)
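
To make the TF-IDF item concrete, here is a minimal sketch using only the Python standard library. It assumes the plain textbook weighting, tf = count / document length and idf = ln(N / df); library implementations typically add smoothing and normalization, so their numbers would differ. The function names and the two sample sentences are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * ln(number of documents / document frequency).
    """
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Invented sample sentences, not drawn from the dataset.
docs = ["the ball was a delight", "the workhouse was a misery"]
for doc_weights in tf_idf(docs):
    print(sorted(doc_weights.items(), key=lambda kv: -kv[1])[:2])
```

Terms shared by every document (here "the", "was", "a") receive a weight of zero, so the highest-weighted terms are the ones that distinguish one text from the other.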

Goal

By integrating both traditional and advanced NLP methods, we aim to:

  • Present a more comprehensive picture of each author’s stylistic features and emotional tendencies.
  • Investigate how differences in vocabulary, syntax, and characters influence the overall emotional style of the texts.

Significance

Through systematic corpus analysis, this project contributes to:

  • A better understanding of the emotional dynamics in classic literary works.
  • Insights for:
    • Developing automated writing style simulations.
    • Enhancing text sentiment analysis.
    • Supporting literary research.