How Machine Learning and Social Media Are Expanding Access to Mental Health

Cite as: 2 GEO. L. TECH. REV. 137 (2017)


After years of anticipation, the era of telemedicine has finally arrived.5 It has been nearly a decade since doctors, insurance companies, and other health care professionals predicted a telemedicine revolution.6 Telemedicine literally means “healing from a distance” and involves the direct treatment and prevention of diseases and injuries by healthcare professionals at a distance.7 The global market for telemedicine is anticipated to reach $113.1 billion by 2025.8

Until the mid-twentieth century, it was common for doctors in the United States to make house calls to treat their patients. But due to a decrease in the number of general practitioners and the inefficiencies and cost of travel, the house call all but disappeared.9 In 1980, only 0.6% of patient visits occurred in the home.10 But thanks to recent advancements in cell phone technology, faster Internet and broadband speeds, and sophisticated computer algorithms, house calls are on the rise.11 The “twenty-first century house call” enables doctors to connect to their patients from the comfort of their office, home, or hotel room. As health care costs in general continue to increase, patients also benefit from reduced costs associated with telemedicine. In addition, telemedicine expands access to health care for people in rural areas who typically travel many hours to visit the doctor, and for low-wage workers who cannot afford to take a day off from work to receive treatment.

While telemedicine is more closely linked to diagnosis and treatment by physicians, telehealth involves a broader range of healthcare-related services. Telehealth is defined by the Health Resources and Services Administration as “the use of electronic information and telecommunication to support long-distance health care, health-related education, and public health administration.”12 Telemental health, or mental health services provided from a distance, is one of the fastest growing sectors within the telehealth space. The National Business Group on Health reports that 56% of employers plan to offer telemental health services to their employees in 2018, which is double the number in 2017.13 This surge in mental health coverage could not come at a better time. Approximately one in five adults in the United States experiences a mental illness in a given year.14 Suicide is the tenth leading cause of death in the United States, and suicide deaths have increased twenty-four percent over the past fifteen years.15

One of the most exciting developments in this area is the combination of machine learning algorithms that diagnose mental illness with user-data gathered from social media websites. This article will briefly explain the machine learning process and how these systems are capable of analyzing social media activity to diagnose mental illness.


For years, psychologists have relied on pattern recognition models to determine whether a patient is at risk of developing a mental illness or committing suicide.16 Unfortunately, those models have not been very effective: they were only slightly more accurate than flipping a coin.17 The difficulty lies in the inherent nuance of language and the subtlety of human behavior. For instance, one person may casually use the word “suicide” in everyday discussion but have no intention of killing themselves, while another may never utter the word “suicide” even though their oral statements and body language indicate that they are at risk.18 Human beings are complicated: each individual presents a distinct set of factors that must be considered when assessing their mental health.

Accurate suicide prediction requires an analysis of hundreds of factors, including race, gender, age, socio-economic status, physical and mental medical history, and other information which may be deemed relevant.19 Machine learning algorithms are well suited to this job because they are capable of processing massive amounts of data and distilling those data down into usable formulas to meet the desired purpose.20 Machine learning algorithms have a number of uses, including extracting insight, discovering anomalies, recognizing patterns, and making predictions.21 Of these uses, pattern recognition and prediction are particularly important in diagnosing mental illness or predicting the risk of suicide. What differentiates machine learning from traditional pattern recognition and prediction methods is that a machine learning algorithm improves each time it is used, because it uses artificial intelligence to automatically learn and improve based on its own experience.22 Not only are patterns and predictions more accurate, they are also formulated at a much faster pace than with typical models, with many algorithms processing data in near real-time. In fact, a recent study which used machine learning algorithms to predict the risk of patient suicide found that the algorithm’s prediction accuracy was sixty to eighty percent and that it could predict potential suicide attempts as far as 720 days out.23

To understand how machine learning algorithms are capable of achieving such accuracy, we must first discuss their structure. Machine learning algorithms are typically composed of three parts: the parameters, the model, and the learner. First, the parameters, which are the factors used by the model to form its decision, must be assembled.24 In the context of suicide prevention, relevant factors include the age, gender, race, and medical history of the patient. Second, those factors are input into the model. The model is the original formula that the algorithm uses to make a prediction.25 For instance, the formula may be based on a trend which previous researchers identified in patients who committed suicide, such as a depression diagnosis or previous suicide attempt. The model uses that formula to classify the parameters and make a prediction. Lastly, the learner takes the prediction and compares it to the actual outcome (i.e., whether the patient attempts suicide in the future), then feeds the outcome back into the system so the model can adjust its prediction algorithm and become more accurate.26 The end result is a prediction model that can consider more factors than traditional methods, and is more accurate because it self-adjusts each time the model is run.
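The three-part structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in any cited study: the feature names, the toy records, the logistic formula, and the learning rate are all hypothetical, chosen to show how a learner feeds outcomes back into a model.

```python
import math

# Parameters: hypothetical yes/no factors, for illustration only.
FEATURES = ["prior_attempt", "depression_diagnosis", "age_over_45"]

def predict(weights, bias, x):
    """The model: a logistic formula that maps factor values to a score in (0, 1)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def learn(weights, bias, x, outcome, rate=0.5):
    """The learner: compare the prediction to the actual outcome and adjust the model."""
    error = predict(weights, bias, x) - outcome
    new_weights = [w - rate * error * xi for w, xi in zip(weights, x)]
    return new_weights, bias - rate * error

# Made-up records: (factor values in FEATURES order, observed outcome).
records = [
    ([1, 1, 0], 1),
    ([0, 0, 1], 0),
    ([1, 0, 1], 1),
    ([0, 1, 0], 0),
]

weights, bias = [0.0, 0.0, 0.0], 0.0
before = predict(weights, bias, [1, 1, 0])  # untrained model: no better than a coin flip

# Each pass feeds the observed outcomes back into the model.
for _ in range(200):
    for x, outcome in records:
        weights, bias = learn(weights, bias, x, outcome)

after = predict(weights, bias, [1, 1, 0])   # score for the same record after feedback
```

Comparing `before` and `after` shows the self-adjustment the text describes: the score for a record whose outcome was observed moves toward that outcome as feedback accumulates.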

Computer learning systems become more accurate over time because of advances in computational power. Couple that computational power with a wealth of real-time user data and you get the future of mental health care. This is exactly what happens when large amounts of social media user data are processed by machine learning applications. This technology has the potential to diagnose more people suffering from mental illness and to prevent suicide.27


A recent study used a decision tree algorithm to diagnose Instagram users with depression.28 Decision trees take information and categorize it based on “training data.” The training data create a rule so that new data can be filtered through the rule and predictions can be made. The algorithm improves as predictions are compared with actual outcomes, which are fed back into the system, much like the example above. The Instagram study used a 1200-tree “random forests” algorithm, a popular classification algorithm that uses multiple decision trees, to diagnose depression based on Instagram photos. A random forests classifier processes data separately in each decision tree within the “forest,” and the outputs are weighted or averaged to create a more reliable prediction.29
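To make the idea of averaging many trees concrete, here is a simplified sketch in Python. It is not the study’s implementation: the photo values are made up, each “tree” asks only a single question, and the forest has 101 trees rather than 1200. It does keep the two ingredients that define a random forest, namely training each tree on a random resample of the data and limiting each tree to a random subset of the features.

```python
import random

# Made-up photo features on a 0-1 scale: [hue, saturation, brightness].
# Label 1 = depressed, 0 = healthy. These rows are illustrative, not study data.
ROWS = [
    ([0.60, 0.20, 0.25], 1), ([0.65, 0.30, 0.30], 1),
    ([0.55, 0.25, 0.35], 1), ([0.70, 0.15, 0.20], 1),
    ([0.20, 0.70, 0.75], 0), ([0.30, 0.80, 0.70], 0),
    ([0.25, 0.65, 0.80], 0), ([0.35, 0.75, 0.65], 0),
]

def train_stump(rows, features):
    """Fit a one-question tree: the feature/threshold rule with the fewest errors."""
    best = None
    for f in features:
        for x, _ in rows:
            t = x[f]
            # Errors if we predict 1 whenever the feature is at or below t.
            errors = sum((1 if xi[f] <= t else 0) != y for xi, y in rows)
            # label_below=0 is the flipped rule, so its errors are the complement.
            for label_below, err in ((1, errors), (0, len(rows) - errors)):
                if best is None or err < best[0]:
                    best = (err, f, t, label_below)
    return best[1:]

def stump_predict(stump, x):
    f, t, label_below = stump
    return label_below if x[f] <= t else 1 - label_below

def train_forest(rows, n_trees, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]      # bootstrap resample
        features = rng.sample(range(n_features), 2)    # random feature subset
        forest.append(train_stump(sample, features))
    return forest

def forest_predict(forest, x):
    """Average the trees' votes: the fraction of trees that predict 'depressed'."""
    return sum(stump_predict(s, x) for s in forest) / len(forest)

forest = train_forest(ROWS, n_trees=101)
dark_gray_photo = forest_predict(forest, [0.62, 0.22, 0.28])
vivid_photo = forest_predict(forest, [0.28, 0.72, 0.74])
```

An individual tree trained on an unlucky resample can misjudge a photo, but the averaged vote is far more stable, which is the reason for using a forest rather than a single tree.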


Figure 1: A study of Instagram filters found that picture hue, saturation, and brightness were relevant factors to consider to diagnose depression. Instagram filters are photograph editing tools which alter the appearance of a photograph. The Y-axis corresponds to the number of Instagram posters that used a specific filter, and the X-axis shows how filter usage differs between depressed and healthy users.30


The Instagram study analyzed a number of factors, including picture hue, saturation, brightness, filters used, presence of a human face, number of faces, and number of likes received. The decision tree results for each factor were averaged to form a prediction of whether the Instagram user was suffering from depression. The algorithm was quite successful,31 and found that depressed users were more likely to post bluer, grayer, and darker pictures, while healthy individuals posted more vibrantly colored photos. See Figure 1. Interestingly enough, depressed users were more likely to include human faces in their pictures, but typically included fewer faces per picture than healthy users.32


Facebook first introduced its deep learning neural network called DeepText to the public back in June 2016.33 The technology relies on the use of “word embeddings,” a mathematical concept that “preserves the semantic relationship among words.”34 In the past, words were converted into integer format so that the computer could distinguish between them. For instance, the word “sister” might be assigned an integer ID such as 3478, while “sis” is assigned a separate, wholly unrelated integer ID like 993258.35 The problem with this approach is that the relationship between the two words is lost. But with word embedding technology, the words “sister” and “sis” are represented as points in close proximity to one another, so that their relationship can be captured.36
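The contrast between integer IDs and word embeddings can be illustrated with a short Python sketch. The integer IDs are the ones used in the example above; the three-number vectors and the unrelated word “invoice” are hypothetical, invented only for this illustration. Real embeddings are learned from large amounts of text, typically have hundreds of dimensions, and are not drawn from DeepText here.

```python
import math

# Integer IDs from the example in the text: the numbers carry no meaning,
# so nothing about them says that the two words are related.
word_ids = {"sister": 3478, "sis": 993258}

# Hypothetical 3-dimensional embeddings, made up for illustration.
embeddings = {
    "sister":  [0.90, 0.80, 0.10],
    "sis":     [0.85, 0.75, 0.20],
    "invoice": [0.10, 0.20, 0.95],
}

def cosine_similarity(a, b):
    """Return a value near 1.0 when two vectors point in nearly the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Related words sit close together; unrelated words do not.
close = cosine_similarity(embeddings["sister"], embeddings["sis"])
far = cosine_similarity(embeddings["sister"], embeddings["invoice"])
```

Cosine similarity is one common way to measure “close proximity” between embeddings; with these made-up vectors, `close` is much larger than `far`, whereas the gap between the two integer IDs says nothing about meaning.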

Facebook primarily uses DeepText to gather information on its users to sell to advertisers. But recently, it has begun using it to prevent suicide by tracking user posts and offering high-risk users a lifeline.37 Facebook released its first pattern recognition algorithm to identify users who may be at risk of self-harm or suicide in March 2017, two months after a fourteen-year-old girl named Naika Venant tragically took her own life on Facebook Live.38 The pattern recognition algorithm uses a neural network to flag posts that contain words or phrases relating to suicide or self-harm, as well as comments added by concerned friends.39 Neural networks are composed of a vast network of nodes, akin to the human brain; they are therefore able to share information across nodes and solve more nuanced problems.40

DeepText monitors user posts and feeds them through a classification algorithm to determine whether each user is at risk of committing suicide.41 The greater the risk, the more prominent the “report post” button becomes to the right of the post itself. If the signals reach a higher level of urgency, the system automatically alerts Facebook’s Community Operation Team, which evaluates the post and determines whether to take action.42 The team may message the user directly or trigger a pop-up screen of suicide prevention resources on the user’s Facebook timeline.
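The escalation steps described above amount to mapping a risk score onto progressively stronger responses. The Python sketch below shows that logic only; it is not Facebook’s code. The function name, the action labels, and both threshold values are hypothetical, and the score is assumed to come from a separate classifier that is not shown.

```python
def route_post(risk_score, prominent_at=0.4, escalate_at=0.8):
    """Map a classifier's risk score (0 to 1) to one of the responses described above.

    The thresholds are hypothetical placeholders, not values used by Facebook.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score >= escalate_at:
        # Highest urgency: send the post to human reviewers.
        return "alert_review_team"
    if risk_score >= prominent_at:
        # Elevated risk: make the reporting option more prominent.
        return "highlight_report_button"
    return "no_action"

low = route_post(0.10)
elevated = route_post(0.55)
urgent = route_post(0.92)
```

Keeping human reviewers at the final step, as the text describes, means the algorithm decides only which posts are reviewed, not what action is ultimately taken.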

For now, the algorithm only supports text inputs, but Facebook is developing an algorithm which reviews picture and video posts as well, currently achieving a one in three success rate.43 Some experts have recommended including additional inputs into the algorithm, such as the frequency with which users log in, the number of posts, and the time at which they post.44


Although seventy-eight percent of Americans currently use some form of social media, less than a third of the world population does.45 To reach people who do not use social media, companies are developing cell phone applications that collect data on their users throughout the day. The data are analyzed by machine learning algorithms to determine the user’s mental state. Cogito Inc. developed an application called Companion, which uses a person’s phone to monitor changes in social behavior and movement patterns.46 The application uses the phone’s voice recorder, GPS, and usage history to sense changes in mood, activity levels, quality of sleep, and social connectedness. The application updates the user with visual tracking tools once a determination is made.47 See Figure 2.

Figure 2: Cogito COMPANION is a cell phone application that diagnoses a person’s mood based on existing sensors within that person’s cell phone. Doctors are able to use this application to monitor their patients’ moods between appointments, and to identify patterns or areas of concern that need to be discussed or treated. The application is currently being tested and may be available to the general public in 2018.48



As the telehealth industry continues to grow, new technology development will give rise to a number of nuanced legal issues. The health-law field is still wrestling with the privacy, liability, agency, and intellectual property issues that arise when applications create new medical technology to treat patients using the patient’s own data. It is quite possible that future patients will be misdiagnosed by a phone application or social media study, and that harm could befall the patient as a result. Who is responsible in that situation? In addition, there are major privacy concerns for social media users who are being monitored without their knowledge. The legal field will have to adapt old legal concepts, or possibly create new legal frameworks, to address these new developments.

Advances in machine learning technology and social media data collection have led to more accurate mental health predictions, but the technology is still in its infancy. One of the major benefits of machine learning is that algorithms improve the more they are used and the larger their data sets become. This means that mental health diagnosis will likely improve in the coming years: welcome news, considering the great need for better mental health services.

Joseph Simpson

GLTR Staff Member, Georgetown Law, J.D. expected 2018; University of Maryland, B.S. 2008. ©2018, Joseph Simpson