Feel free to read the Introduction (Part 1) or the section on Collaboration Initiatives (Part 3), but each individual part also stands on its own.
Designing Transparent Software
Ranking algorithms, whether in a search engine or a news feed, are designed to serve the information needs and wants of their target users. Contrary to the popular belief that they are entirely controlled by large technology companies, these information platforms are shaped by many diverse actors in a complex social environment. From SEO hacks to the malicious injection of false data, from outside data firms swaying political elections to Reddit users hunting down a criminal suspect (who turned out to be innocent), users and ranking algorithms evolve together to serve the desires of many actors. While some public intellectuals hold that these algorithms are autocratic, controlled only by big tech, others make a convincing case that they are plutocratic, controlled by both tech corporations and users, and that they operate in an agonistic environment, one of ongoing competition between client and service.
Engineers can choose to design their ranking systems to grant users more agency in this competitive environment, for example by letting users personalize their recommendations and news feeds. They can also design their systems to be robust against malicious users who “game” them. Both goals are desirable, and the following suggestions present concrete design options for increasing “transparency” from an engineering perspective.
Ranking algorithms should be made safe and secure against both the malicious collection of algorithm outputs and the injection of false data into algorithm inputs. Engineers should consider “worst-case” or “harm” scenarios when designing ranking algorithms. It is entirely possible, for example, for third parties to “game” the system by injecting massive numbers of fake posts in the service of corporate or political advertising.
Examples in industry: In the Cambridge Analytica scandal, which broke in 2018, Facebook failed to protect its users’ data from collection by a malicious third party for mass political and social engineering. This was a failure of secure API design.
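To make the “worst-case” framing concrete, here is a minimal sketch of one common defense: a per-account sliding-window rate limiter that blunts the bulk injection of fake posts. Every name and threshold below (RateLimiter, max_posts, window_seconds) is invented for illustration; this is not any platform’s actual defense, and real systems layer many such checks (CAPTCHAs, anomaly detection, API authentication) on top of one another.

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Sliding-window limiter: reject accounts that post too fast."""

    def __init__(self, max_posts: int = 5, window_seconds: float = 60.0):
        self.max_posts = max_posts
        self.window = window_seconds
        self.history = defaultdict(deque)  # account_id -> recent post times

    def allow(self, account_id: str, now: float | None = None) -> bool:
        now = time.monotonic() if now is None else now
        posts = self.history[account_id]
        # Drop timestamps that have fallen out of the window.
        while posts and now - posts[0] > self.window:
            posts.popleft()
        if len(posts) >= self.max_posts:
            return False  # over the limit: likely automated injection
        posts.append(now)
        return True

limiter = RateLimiter(max_posts=5, window_seconds=60.0)
# Posts 0-4 are allowed; posts 5-7 arrive too fast and are rejected.
print([limiter.allow("suspicious-account", now=float(t)) for t in range(8)])
```

The design choice worth noting is that the limiter degrades gracefully: ordinary posting behavior rarely hits the threshold, while bulk injection is throttled automatically, without manual review.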
Users should be presented with reasons for why they see the information in front of them. If the ranking algorithm is driven by a neural network, engineers should strive to increase the model’s explainability.
Examples in industry: Facebook in 2019 launched the “Why am I seeing this post?” feature, which lets users see the factors that fed into the ranking algorithm’s decision, including the user’s past actions and affiliations on the platform. (Whether these explanations faithfully reflect the algorithm’s internal workings, however, is unknown to the user.)
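As a toy illustration of how explanations can fall out of the model itself, suppose the ranker scored posts with a simple linear model; the per-feature contributions could then be surfaced directly as “why am I seeing this?” reasons. The feature names and weights below are invented, and a production ranker, neural or otherwise, would need heavier post-hoc attribution machinery to produce comparable reasons.

```python
WEIGHTS = {
    "followed_author": 2.0,
    "liked_similar_posts": 1.5,
    "recency": 0.8,
    "paid_promotion": 0.5,
}

def score_with_reasons(features: dict) -> tuple:
    """Score a post and return the top contributing features as reasons."""
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    score = sum(contributions.values())
    # The largest contributions double as human-readable explanations.
    top = sorted(contributions, key=contributions.get, reverse=True)[:2]
    return score, [f"Shown because of: {name}" for name in top]

score, reasons = score_with_reasons(
    {"followed_author": 1.0, "liked_similar_posts": 1.0, "recency": 0.3, "paid_promotion": 0.0}
)
print(score)    # 3.74
print(reasons)  # ['Shown because of: followed_author', 'Shown because of: liked_similar_posts']
```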
Users should have a robust set of options for controlling and personalizing the ranking algorithm’s output. Put another way, users should be able to give the ranking algorithm feedback by identifying and labeling its mistakes. They should be able to hide posts and specify which kinds of posts they want to see.
Examples in industry: YouTube in 2019 launched the “Don’t recommend” feature, allowing users to provide feedback to the recommendation algorithm. In 2020, Facebook and Instagram allowed users to turn off political ads.
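Here is a minimal sketch of what treating such feedback as a first-class ranking signal might look like: each “don’t recommend” action accumulates a per-user penalty that downranks similar items. The structure (FeedbackAwareRanker, topic-level penalties) is hypothetical and much cruder than YouTube’s actual mechanism.

```python
from collections import defaultdict

class FeedbackAwareRanker:
    """Toy ranker where user-labeled mistakes directly shape future scores."""

    def __init__(self):
        # topic -> penalty accumulated from this user's "don't recommend" clicks
        self.penalties = defaultdict(float)

    def dont_recommend(self, topic: str, strength: float = 1.0) -> None:
        self.penalties[topic] += strength

    def score(self, base_score: float, topic: str) -> float:
        # Feedback subtracts from whatever the underlying model proposed.
        return base_score - self.penalties[topic]

ranker = FeedbackAwareRanker()
ranker.dont_recommend("clickbait")
print(ranker.score(base_score=1.0, topic="clickbait"))  # 0.0: downranked
print(ranker.score(base_score=1.0, topic="cooking"))    # 1.0: unaffected
```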
Users should be able to alter privacy settings and tune how much data collection they “opt in” to. There should be one centralized place where users can see which parts of their data influence the search engine or news feed, and by default they should be opted out of the collection of sensitive and personally identifiable information.
Examples in industry: In 2021, Apple’s App Tracking Transparency update began requiring every app on the App Store to ask the user’s permission before tracking their activity for personalized advertising; tracking stays off unless the user explicitly allows it.
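As a sketch of “opt out by default” in code: a single, centralized consent record whose sensitive fields all default to off, plus a filter that strips non-consented signals before they ever reach the ranker. The field names below are illustrative assumptions, not any real platform’s schema.

```python
from dataclasses import dataclass, asdict

@dataclass
class PrivacySettings:
    # Every sensitive signal defaults to opted OUT.
    share_location: bool = False
    share_browsing_history: bool = False
    share_contacts: bool = False

def filter_signals(raw_signals: dict, settings: PrivacySettings) -> dict:
    """Keep only the signals the user has explicitly opted into."""
    consent = asdict(settings)
    return {k: v for k, v in raw_signals.items() if consent.get(f"share_{k}", False)}

settings = PrivacySettings(share_location=True)  # user opted into location only
raw = {"location": "NYC", "browsing_history": ["..."], "contacts": ["..."]}
print(filter_signals(raw, settings))  # {'location': 'NYC'}
```

Because consent lives in one place, the settings page and the data pipeline read from the same source of truth, which is exactly the “one centralized location” the suggestion above calls for.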
Users should be able to make edits, and see others’ edits, on documents output by the ranking algorithm. They should be able to vote on or flag documents as inappropriate for moderators to remove, and they should be able to see the discussion that took place before a document was edited or removed.
Examples in industry: All edits to Wikipedia articles, both malicious and constructive, along with the discussions around them, can be viewed in the page history. At the other extreme, some platforms, such as Twitter, rely on automated systems that remove hate speech, violent images, or posts that conflict with the platform’s values. A balance of the two approaches is probably ideal.
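To sketch what that balance might look like, here is a toy document type in the spirit of Wikipedia’s page history: every edit, flag, and removal is an append-only event that any user can read back, even when the removal itself is triggered automatically by a flag threshold. All names and thresholds are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class AuditedDocument:
    """Document whose edits, flags, and removals all leave a visible trail."""
    text: str
    flags: int = 0
    removed: bool = False
    history: list = field(default_factory=list)

    def edit(self, new_text: str, editor: str) -> None:
        self.history.append(f"{editor} edited: {self.text!r} -> {new_text!r}")
        self.text = new_text

    def flag(self, reporter: str, reason: str, threshold: int = 3) -> None:
        self.flags += 1
        self.history.append(f"{reporter} flagged as {reason}")
        if self.flags >= threshold:
            self.removed = True  # automatic removal, but logged, not silent
            self.history.append("removed after reaching flag threshold")

doc = AuditedDocument("original claim")
doc.edit("corrected claim", editor="alice")
for reporter in ("bob", "carol", "dave"):
    doc.flag(reporter, reason="inappropriate")
print(doc.removed)             # True
print(*doc.history, sep="\n")  # the full, user-visible audit trail
```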
Read more
Introduction (Part 1)
Product Design (Part 2)
To be continued…