AI-THON

Mental Health Of INDIA
During COVID-19

TEAM : THE ELITE

Associate Partners:

  • Chennai Mathematical Institute

    Academia Partner

  • SPJIMR

    Entrepreneurship Ecosystem Partner

  • Nasscom Community

    Community Partner

Media Partners:

  • PRMoment

    Media Partner

  • So You Wanna Be
    In TV?, London

    Community Media Partner

  • Bloggers Alliance

    Blogging Partner

Hiring Partners:

  • L&T Infotech

  • Lifevitae Singapore

  • EKO Informatics

Team : The Elite

  

Priyank Jha
PGP in Data Science @ Aegis
https://spotle.ai/Priyankjha1

  

Devleena Banerjee
Business Analytics @ IIM Indore
https://spotle.ai/DevleenaBanerjee

  

Vidhya Subramaniam
Business Analytics @ IIM Indore
https://spotle.ai/VidhyaSubramaniam

  

Chiranjeevi Karthik
Student @ Vardhaman College
https://spotle.ai/Karthikchiranjeevi

Table of Contents

1. Introduction

      Problem | Objective

2. Our Approach

        Methodology | Solution

3. The Outcome

        Results | Conclusion

4. Productionization

        Limitations | Prototype

Problem Statement


  • Can we analyze a person's mental health based on their Twitter usage?
  • If yes, what are the factors that determine this?
  • To what extent has COVID-19 affected people's mental health?





Objective


  • To come up with an effective methodology to analyze mental health based on tweets.
  • To understand what determines the emotion conveyed in a tweet.
  • To gather insights on how COVID-19 affected mental health based on tweets.




Methodology

Observations : Labelling tweets



Labelling tweets using hashtags & emojis: Failed
  • Only a small percentage of tweets contain emojis or hashtags that correspond to an emotion.
  • Emotions extracted from emojis and hashtags correlate only weakly with the polarity of the tweets.
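
A minimal sketch of how the low coverage of emotion-bearing hashtags and emojis could be measured. The mapping tables, the sample tweets, and the names marker_emotions and coverage are hypothetical and only illustrate the check; they are not the team's actual tables or data.

```python
import re

# Hypothetical marker-to-emotion tables, for illustration only.
HASHTAG_EMOTIONS = {"#scared": "fear", "#lonely": "sadness", "#grateful": "joy"}
EMOJI_EMOTIONS = {"😨": "fear", "😢": "sadness", "😊": "joy"}

HASHTAG_PATTERN = re.compile(r"#\w+")


def marker_emotions(tweet):
    """Return the set of emotions signalled by hashtags or emojis in a tweet."""
    emotions = set()
    for tag in HASHTAG_PATTERN.findall(tweet.lower()):
        if tag in HASHTAG_EMOTIONS:
            emotions.add(HASHTAG_EMOTIONS[tag])
    for char in tweet:
        if char in EMOJI_EMOTIONS:
            emotions.add(EMOJI_EMOTIONS[char])
    return emotions


def coverage(tweets):
    """Fraction of tweets carrying at least one emotion-bearing marker."""
    if not tweets:
        return 0.0
    labelled = sum(1 for tweet in tweets if marker_emotions(tweet))
    return labelled / len(tweets)


# Made-up sample tweets; only two of the five carry a usable marker.
sample_tweets = [
    "Lockdown extended again #scared",
    "Working from home today",
    "Cases are rising in my city",
    "Miss my family 😢",
    "Stay safe everyone #COVID19",
]
print(coverage(sample_tweets))
```

When this fraction is small, most tweets remain unlabelled, which is the failure mode described above.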

Labelling tweets using lexicons: Worked
  • For every tweet, we could determine whether or not a particular emotion is present.
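
Lexicon-based labelling can be sketched roughly as follows. The toy lexicon, its entries, and the function names are assumptions made for illustration; a real emotion lexicon has many more words and its own file format.

```python
import re

# Toy word-to-emotions lexicon; the entries are hypothetical.
EMOTION_LEXICON = {
    "afraid": {"fear"},
    "panic": {"fear"},
    "death": {"fear", "sadness"},
    "lonely": {"sadness"},
    "hope": {"joy"},
}

TOKEN_PATTERN = re.compile(r"[a-z']+")


def label_tweet(tweet):
    """Return every emotion whose lexicon words appear in the tweet."""
    emotions = set()
    for token in TOKEN_PATTERN.findall(tweet.lower()):
        emotions |= EMOTION_LEXICON.get(token, set())
    return emotions


def label_dataset(tweets):
    """Label all tweets in a single pass over the dataset."""
    return [label_tweet(tweet) for tweet in tweets]


print(label_dataset(["So afraid and lonely in lockdown", "Working from home"]))
```

A tweet can receive several emotions at once, and a tweet with no lexicon words receives an empty set.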


Our Strategy


No external dataset was used
  • The unsupervised approach using emotion lexicons is relatively fast.
  • A single scan of the dataset is enough to label each tweet with an emotion.
Multiple emotions in a single tweet
  • Thanks to the emotion lexicons, we could label each tweet with multiple emotions.




Solution

Modelling


  • We built six binary classification models, one for each emotion.
  • To build these models, logistic regression was applied to the vector embeddings extracted using the lexicon database.
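
The one-model-per-emotion setup can be sketched as below, under several assumptions: bag-of-words counts stand in for the vector embeddings, only two emotions are shown instead of six, the training tweets and labels are made up, and logistic regression is written out in plain Python so the block is self-contained. In practice a library implementation, such as scikit-learn's LogisticRegression, would normally be used.

```python
import math


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_binary_model(X, y, epochs=500, lr=0.5):
    """Fit logistic regression weights and bias with batch gradient descent."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, target in zip(X, y):
            pred = sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)
            error = pred - target
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


def predict_proba(model, row):
    """Probability that the model's emotion is present."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)


def vectorize(tweet, vocabulary):
    """Bag-of-words counts over the training vocabulary (an assumed feature choice)."""
    tokens = tweet.lower().split()
    return [float(tokens.count(word)) for word in vocabulary]


# Made-up training tweets with lexicon-style labels, one label list per emotion.
train_tweets = [
    "so afraid of the virus",
    "feeling lonely in lockdown",
    "afraid and lonely tonight",
    "working from home today",
]
train_labels = {
    "fear": [1, 0, 1, 0],
    "sadness": [0, 1, 1, 0],
}

vocabulary = sorted({word for tweet in train_tweets for word in tweet.split()})
X = [vectorize(tweet, vocabulary) for tweet in train_tweets]

# One independent binary classifier per emotion.
models = {emotion: train_binary_model(X, y) for emotion, y in train_labels.items()}


def predict_emotions(tweet, threshold=0.5):
    """Return every emotion whose classifier scores above the threshold."""
    row = vectorize(tweet, vocabulary)
    return {e for e, model in models.items() if predict_proba(model, row) > threshold}


print(predict_emotions("afraid and lonely tonight"))
```

Because each emotion has its own classifier, a single tweet can be assigned any combination of emotions.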





Results


Effectiveness of Models


Conclusion

  • Yes, we can assess a person's mental health from their Twitter usage.
  • The overall emotions in a tweet are determined by the emotions of its individual words, not by its hashtags or emojis.
  • COVID-19 has definitely taken a toll on people's mental health, as fear and sadness seem to dominate their emotional state.

Limitations


  • The predictions of our model are accurate only for tweets that use vocabulary similar to that of our training set.
  • If none of the words in a tweet are part of our training set vocabulary, then it is implicitly labelled as neutral.
  • To overcome the above limitations, we can train on a larger dataset.
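
The implicit neutral label can be made explicit with a small guard, sketched below. The vocabulary, the stand-in classifier, and the function names are hypothetical; the real models would take the place of toy_classifier.

```python
# Hypothetical training vocabulary, for illustration only.
TRAINING_VOCABULARY = {"afraid", "lonely", "lockdown", "virus", "hope"}


def known_tokens(tweet, vocabulary=TRAINING_VOCABULARY):
    """Return the tokens of a tweet that appear in the training vocabulary."""
    return [token for token in tweet.lower().split() if token in vocabulary]


def label_or_neutral(tweet, classify, vocabulary=TRAINING_VOCABULARY):
    """Label a tweet as neutral when it shares no words with the training set."""
    if not known_tokens(tweet, vocabulary):
        return {"neutral"}
    return classify(tweet)


def toy_classifier(tweet):
    """Stand-in for the trained emotion models."""
    return {"fear"} if "afraid" in tweet.lower().split() else set()


print(label_or_neutral("Bahut darr lag raha hai", toy_classifier))
print(label_or_neutral("I am afraid of the virus", toy_classifier))
```

Counting how often the neutral branch is taken on new data gives a rough indication of how much a larger training set would help.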






Prototype




Complete Analysis




Time of the day the tweets were posted

Emojis wordcloud

Hashtags wordcloud

Most used Hashtags

Number of tweets belonging to each emotion

Polarity

Correlation between emotions extracted from emojis and polarity of the tweet

References


1. https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html
2. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop_duplicates.html
3. https://www.geeksforgeeks.org/handling-oserror-exception-in-python/
4. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.tseries.offsets.DateOffset.html
5. https://stackoverflow.com/questions/43146528/how-to-extract-all-the-emojis-from-text
6. https://emojis.wiki/
7. https://stackoverflow.com/questions/43145199/create-wordcloud-from-dictionary-values
8. https://textblob.readthedocs.io/en/dev/quickstart.html#sentiment-analysis
9. http://sentiment.nrc.ca/lexicons-for-research/
10. https://seaborn.pydata.org/generated/seaborn.pairplot.html
11. https://stackoverflow.com/questions/9897345/pickle-alternatives




Thank You