Researchers develop AI model that detects mental disorders using Reddit posts

Dartmouth College researchers have developed an artificial intelligence (AI) model that can be used to predict mental disorders using data from conversations on Reddit, according to a university paper.
Researchers Xiaobo Guo, Yaojia Sun, and Soroush Vosoughi presented a paper titled “Emotion-Based Modeling of Mental Disorders in Social Networks” at the 20th International Conference on Web Intelligence and Intelligent Agent Technology.

According to the paper, most existing AI models rely on psycholinguistic analysis of user-generated text. Despite performing well, these content-based representation models suffer from domain and topic bias.

Vosoughi illustrated the problem to a Dartmouth science writer: if a model learns to correlate the word “COVID” with “sadness” or “anxiety,” it will automatically assume that a scientist researching and publishing on COVID is suffering from depression and anxiety.

The new model suppresses these topic-specific biases by relying entirely on emotional states, without learning anything about the topics discussed in the posts.

To train the model, the researchers collected two datasets covering 2011 to 2019: the first consisted of users with one of the three emotional disorders of interest (major depression, anxiety, and bipolar disorders), and the second consisted of users without known mental disorders, who acted as a control group.

The first dataset was collected based on self-reported mental disorders, meaning the researchers looked for users who had made posts or comments that said something along the lines of “I’ve been diagnosed with bipolar/depression/anxiety.” Only posts made before the self-report were considered, because previous work had shown that users who realize they have a disorder change their online behavior, which introduces bias.
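The filtering step described above can be pictured with a short sketch. It is only an illustration, not the researchers' code: the function name `posts_before_self_report`, the dictionary layout of a post, and the sample timeline are all assumptions made for this example.

```python
from datetime import datetime


def posts_before_self_report(posts, self_report_time):
    """Keep only posts created strictly before the self-report (hypothetical helper)."""
    return [post for post in posts if post["created"] < self_report_time]


# Hypothetical timeline for a single user; the texts and dates are invented.
user_posts = [
    {"created": datetime(2016, 3, 1), "text": "rough week at work"},
    {"created": datetime(2017, 6, 15), "text": "I've been diagnosed with anxiety"},
    {"created": datetime(2018, 1, 9), "text": "started therapy last month"},
]
self_report_time = datetime(2017, 6, 15)

# The self-report itself and everything after it are dropped.
kept = posts_before_self_report(user_posts, self_report_time)
```

Using a strict comparison excludes the self-report post itself, which matches the article's wording that only posts made before self-reporting were used.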

The researchers then made sure that the data across all four classes (one for each of the three disorders of interest, plus the control group) had similar temporal distributions, meaning that the posts in each class were spread similarly over time. The datasets were also balanced, with 1,997 users in each class.

After this, the researchers divided the data into training (70%), validation (15%), and test (15%) sets. After training the model and then testing it, the researchers found that their emotion-based representation model was more accurate in predicting disorders than a method based on TF-IDF (term frequency-inverse document frequency). TF-IDF measures the importance of a keyword based on how often it appears in a post and how rare it is across all posts.
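To make the TF-IDF baseline more concrete, here is a minimal sketch of the textbook formula, in which a term's weight is its frequency within a post multiplied by the logarithm of the number of posts divided by the number of posts containing the term. The function name `tf_idf` and the three sample posts are assumptions for illustration; the article does not say which TF-IDF variant or implementation the paper used.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of posts in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Invented sample posts; "covid" appears in every one of them.
posts = [
    "covid makes me anxious",
    "new covid study published",
    "covid vaccine trial results",
]
scores = tf_idf(posts)
```

In this toy corpus "covid" gets a weight of zero because it occurs in every post, while "anxious" gets a positive weight because it occurs in only one. A content-based model built on such keyword weights still learns which words go with which label, which is the topic bias the emotion-based approach is meant to avoid.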
