Social Media Algorithms: Why You See What You See
INTRODUCTION
Searching the word “algorithm” on Google leads to images of spider-web charts, three-dimensional graphs, and rows of mathematical equations. What could social media and math have to do with each other? Looking at the motives behind social media companies tells us why algorithms are an inseparable part of social media platforms.
SOCIAL MEDIA AS BUSINESS
Facebook, Twitter, Yelp, and Instagram are some of the biggest social media platforms. At their core, social media platforms are businesses. They often advertise themselves as “free” to consumers, but that does not mean the platforms earn no profit. In fact, social media companies profit from the free and open use of their platforms.
Specifically, social media companies derive profit from having users stay “engaged” on their platform. “Engaging,” in the context of social media, means interacting with content on the platform, including viewing, liking, commenting, sharing, and saving posts.1 The longer a user stays engaged, the more exposure advertisements receive. An individual instance of exposure is called an “impression.”2 Advertisers want as many potential customers as possible to see their ads and therefore seek out platforms that promise a high number of impressions on their ads. In other words, social media platforms have an incentive to keep users engaged so they can profit—in some instances by the billions—from hosting ads.3
MAXIMIZING ENGAGEMENT
To keep users engaged for as long and as frequently as possible, social media platforms want to make their news feeds interesting and relatable to users. It thus becomes crucial to predict what individual users, or groups of users, may find interesting. User actions on the platform generate data indicating each user’s preferences. A 2009 statistic reported by McKinsey & Co. states that “large companies—with 1,000 employees or more—already had 200 terabytes [or 200,000 gigabytes] of data stored about all facets of their consumers’ lives.”4 If traditional businesses in 2009 possessed terabytes of data, imagine how much data social media companies possess in 2017. Social media companies have accumulated millions of posts, clicks, photos, and check-ins from all over the world. They have potentially useful data at their fingertips, but parsing this data to interpret consumer preferences is difficult because such data is inherently “unstructured,” i.e., unorganized.5
Because hiring humans to analyze enormous amounts of unstructured data may be costly and inefficient, social media platforms enlist the help of algorithms to do the heavy lifting. “Algorithm” is just a fancy way to describe a set of steps to reach a goal. Algorithms are all around us. For example, we follow an algorithm of our own when we go through a step-by-step process to shop for a new laptop. First, we collect information on the kinds of laptops on the market and their specifications. Next, when analyzing the gathered information, we may prioritize certain factors over others, such as better performance over more disk space within the same price range. We may also make predictions in the process, such as concluding that a higher-performance laptop is worth the extra cost in the long run. Finally, taking all these factors into account, we come out of the process with a decision to buy the laptop we feel meets our needs. We practice algorithmic thinking in our daily lives.
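To make the analogy concrete, the short sketch below expresses the laptop search as a program. The laptops, prices, and weights are invented purely for illustration; the point is only that gathering options, weighing factors, and deciding is a sequence of steps a computer can follow.

```python
# A minimal sketch of the laptop-shopping "algorithm" described above.
# All laptops, prices, and weights here are invented for illustration.

laptops = [
    {"name": "Laptop A", "price": 900, "performance": 7, "disk_gb": 512},
    {"name": "Laptop B", "price": 950, "performance": 9, "disk_gb": 256},
    {"name": "Laptop C", "price": 700, "performance": 5, "disk_gb": 1024},
]

BUDGET = 1000             # step 1: limit the search to our price range
PERFORMANCE_WEIGHT = 3.0  # step 2: prioritize performance over disk space
DISK_WEIGHT = 1.0

def score(laptop):
    """Step 3: combine the factors we care about into one number."""
    return (PERFORMANCE_WEIGHT * laptop["performance"]
            + DISK_WEIGHT * laptop["disk_gb"] / 256)

# Step 4: decide on the option that best meets our needs.
affordable = [l for l in laptops if l["price"] <= BUDGET]
best = max(affordable, key=score)
print(f"Decision: buy {best['name']}")
```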
An algorithm created by the software engineers of a social media platform works to maximize user engagement. Remember that maximizing user engagement will lead to a higher number of impressions on ads, which in turn will lead to profit from hosting these ads.
THE SEARCH FOR THE PERFECT FEED
First, the algorithm analyzes the accumulated unstructured data. Then the algorithm predicts the content a user may find interesting. Finally, the algorithm populates the user’s feed with that interesting content.
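A deliberately simplified sketch of that three-step loop appears below. Every real platform’s pipeline is proprietary and far more complex; the function bodies here are stand-ins meant only to make the steps concrete.

```python
# A simplified sketch of the analyze -> predict -> populate loop.
# Real pipelines are proprietary and vastly more complex.

def analyze(raw_events):
    """Step 1: distill unstructured activity logs into per-user topic counts."""
    signals = {}
    for e in raw_events:  # e.g., {"user": "u1", "topic": "sports", "action": "like"}
        topics = signals.setdefault(e["user"], {})
        topics[e["topic"]] = topics.get(e["topic"], 0) + 1
    return signals

def predict(user_topics, candidate_posts):
    """Step 2: estimate the user's interest in each candidate post."""
    return {p["id"]: user_topics.get(p["topic"], 0) for p in candidate_posts}

def populate_feed(candidate_posts, scores):
    """Step 3: place the highest-scoring content at the top of the feed."""
    return sorted(candidate_posts, key=lambda p: scores[p["id"]], reverse=True)
```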
During the analysis, the accumulated data can be organized into different categories that each reveal clues about what a user likes to see. Engagement itself is a simple indication of a user’s interest in a particular piece of content. The more frequent the engagement, the stronger the association the algorithm will make between the user and that content.6 Other activities may indicate interest as well, such as which profiles and pages the user searched; to whom the user sends direct messages; and whom the user may know in person offline.7 The algorithm treats positive reactions from the user, such as liking and sharing, as indicators of the content’s general appeal and will display that content to a wider audience.
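One way to picture this “stronger association” is to imagine different engagement actions contributing different amounts to a user-topic score. The weights below are invented; platforms do not publish the ones they actually use.

```python
# Hypothetical action weights: stronger engagement implies stronger interest.
# Real platforms keep their actual weights secret.
ACTION_WEIGHTS = {"view": 1, "like": 3, "comment": 5, "share": 8, "save": 8}

def affinity(user_actions):
    """Sum weighted engagements per topic; frequent, strong actions
    build a stronger user-topic association."""
    scores = {}
    for action, topic in user_actions:  # e.g., ("like", "cooking")
        scores[topic] = scores.get(topic, 0) + ACTION_WEIGHTS.get(action, 0)
    return scores

print(affinity([("view", "news"), ("like", "cooking"), ("share", "cooking")]))
# {'news': 1, 'cooking': 11} -> cooking content ranks higher for this user
```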
A big challenge for social media platforms is figuring out the interests of users who do not necessarily engage on the platform, also known as “passive” users. Approximately 90% of social media users are passive.8 For example, in this age of political correctness, users may be interested in a serious or controversial current event but may not wish to publicly indicate their opinion on the subject.9 To decipher the interests of a passive user, the algorithm turns to a factor present in both action and inaction: time. Even if a user never engages with a post, the algorithm records how long the user keeps the post on the screen rather than simply scrolling past it.10 The algorithm treats the amount of time spent reading the post as an indication of the user’s interest in its content.
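The sketch below illustrates how dwell time might be recorded as a signal of passive interest. The three-second threshold and the timings are hypothetical, chosen only to show the idea of separating lingering from scrolling past.

```python
# A sketch of the dwell-time signal: even with no likes or comments,
# time on screen hints at interest. The 3-second threshold is invented.

def dwell_interest(scroll_events):
    """scroll_events: (post_id, seconds_on_screen) pairs recorded
    as the user moves through the feed."""
    interests = {}
    for post_id, seconds in scroll_events:
        if seconds >= 3:                 # ignore posts merely scrolled past
            interests[post_id] = interests.get(post_id, 0) + seconds
    return interests

# A passive user lingers on post "p2" without ever engaging:
print(dwell_interest([("p1", 0.5), ("p2", 24.0), ("p3", 1.2)]))
# {'p2': 24.0} -> the topic of p2 is inferred to interest this user
```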
After the algorithm creates a pool of potentially interesting content for a user, it assigns each piece of content a rank based on its appeal to that user. The algorithm then populates the user’s feed so that higher-ranked content is prioritized and appears near the top.11 Pushing the most interesting content to the top of the feed is expected to increase the chance the user will engage with it.12 Within the pool, the algorithm may boost the rank of certain content, such as actions by close friends, by applying a multiplier.13 The algorithm may also apply an absolute rule to certain content, such as prioritizing a friend’s major life event above all else.14 The algorithm can also “normalize” the assigned ranks so that different types of content (text, photos, video) appear in a particular order on the feed to maximize the chance of engagement.15 In the end, a feed curated by an algorithm is the culmination of juggling multiple considerations and predicting the arrangement of content best suited to attracting user consumption. The recipe an algorithm follows for each user varies by company.16
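The toy sketch below combines these adjustments: a hypothetical close-friend multiplier, an absolute rule for major life events, and a crude interleaving of content types standing in for “normalization.” None of the numbers or rules comes from any actual platform; interleaving is just one plausible reading of what the cited sources describe.

```python
from itertools import zip_longest

CLOSE_FRIEND_MULTIPLIER = 2.0  # hypothetical boost, invented for illustration

def adjusted_rank(post, base_score):
    """Apply the boosts and absolute rules described above."""
    if post.get("major_life_event"):
        return float("inf")            # absolute rule: always at the top
    if post.get("from_close_friend"):
        return base_score * CLOSE_FRIEND_MULTIPLIER
    return base_score

def build_feed(scored_posts):
    """scored_posts: (post, base_score) pairs. Sort by adjusted rank, then
    interleave text/photo/video as a crude stand-in for 'normalization'."""
    ranked = sorted(scored_posts, key=lambda ps: adjusted_rank(*ps), reverse=True)
    by_type = {}
    for post, _ in ranked:
        by_type.setdefault(post["type"], []).append(post)
    feed = []
    for batch in zip_longest(*by_type.values()):
        feed.extend(p for p in batch if p is not None)
    return feed
```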
HUMANS STILL MATTER
Although social media platforms have harnessed the incredible power of algorithms, humans are not completely out of the picture. Because meaningful content cannot always be defined in terms of click rates and freshness, social media platforms benefit from humans providing intuitive guidance on how users think and behave. For example, Facebook hires people to provide feedback on the curated content.17 Facebook also directly incorporates human input by letting users choose whose content they want to see first.18 Humans can also recognize and catch moral lapses to which algorithms may be blind. For example, Facebook announced in May 2017 that the company will hire 3,000 people to flag abusive content, such as broadcasts of murder and rape.19
Some see human intervention as a source of contention because humans may introduce bias into the curation process. The clamor surrounding Facebook’s Trending section in 2016 exemplifies the divisiveness of this debate. The Trending section displays what is popular on the platform at the time.20 During the 2016 presidential election, users pointed out that the contents of the section seemed to lean disproportionately left on the political spectrum. People became suspicious of the fact that Facebook, through a third party, hired fifteen to eighteen people to manually write short descriptions for the Trending contents,21 though Facebook denied the allegation that systemic bias suppressed conservative topics from appearing in the Trending section.22 Facebook has since redesigned the Trending section so that an excerpt from each trending news article automatically appears as its description.23 The company had previously relied on its list of more than 1,000 news publishers to discover “important” news for the section, but the controversy led the company to abandon the policy.24 Facebook emphasizes that the platform prioritizes “friends and family” and that the platform is “not in the business of picking which issues the world should read about.”25
However, as long as humans are on the receiving end of algorithmic products, “manual” might not always be a step in the wrong direction. Algorithms are limited in predicting human behavior.26 Algorithms cannot account for everything a human may think and feel. For example, in what has been described as “algorithmic cruelty,” Facebook’s year-in-review feature repeatedly displayed photos of a user’s six-year-old daughter who had passed away that year, decorated with dancers and balloons.27 In the eyes of the user, the post seemed to be “celebrating a death.”28 The algorithm that created the post was blind to this inadvertent consequence.
SOCIAL MEDIA ACCOUNTABILITY
Meanwhile, people are becoming increasingly dependent on social media, and the algorithms inside it, to stay up-to-date. Bots (software programmed to behave like human users on social media) and trolls (human users posting provocative content to promote an agenda) continue to cloud the integrity of the information people see online.29 In response, social media platforms have been introducing changes to promote reliability and transparency in their content and operations.30
In the wake of the November 2016 presidential election, the Association for Computing Machinery—“the world’s largest society of computer scientists, software engineers, software architects and developers”—revised the society’s code of ethics for the first time in twenty-five years.31 The new code introduced “principles for algorithmic transparency and accountability to prevent potential flaws and biases in their design, implementation and use that can cause harm.”32 People also speak of establishing an “AI watchdog” that can audit algorithms to investigate discriminatory AI decisions, which create real-life consequences such as unjust airport detentions and revocation of licenses.33
So long as algorithms are written by humans bound by contractual obligations to companies, flaws and biases will remain part of the conversation.34 After all, the very act of weighing factors requires value judgments. Accepting that fact, perhaps it is not too soon to start thinking about what role the social media industry, and the algorithms that power it, should play in civic life.35
Sang Ah Kim
GLTR Staff Member; Georgetown Law, J.D. expected 2018; University of Georgia, B.A. 2014.