The turn of the century was met with widespread acceptance of the internet – an enabler for people around the globe to discover and connect with others like them. The vision of organizations like Google, 'to organize a seemingly infinite amount of information on the web', is no longer a distant dream. As the internet grows exponentially around us (and with us), so does the amount of information being produced.
Though there may be no way to measure these volumes precisely, it is safe to assume that over 500 million images are uploaded to the internet every day. As mind-numbing as that number is, it pales in comparison to the volume of plain text being generated – nearly 5 trillion words a day. Add the data produced by IoT devices and the storage all of this requires, and the internet is larger still.
These overwhelming statistics can only be expected to rise with time, as the internet reaches more people every day. The number of internet users (~4.5 billion people) will continue to grow, and with it the reach of social media, which so far extends to only 85% of internet users. This brings us to the age-old question: "How safe is the internet?" In this article I examine the impact of user-generated content, the often understated importance of monitoring, scanning and moderating this content, and the various means of performing these crucial activities.
Unlike traditional media, user-generated content largely evades quality control of any kind. Concerns about quality, however, are outweighed by concerns about the damage harmful and unwanted content can cause. In the absence of a formal review and editing mechanism, many users post threatening, insensitive and extremely graphic content – and get away with it. The enterprises most affected by such content are social networks, gaming companies, online communities and dedicated kids' platforms. As the amount of user-generated and user-consumed content rises, so do concerns about safety. Regulatory bodies and policy organizations, too, have taken cognizance of the risks associated with the unbridled use of such platforms. At the same time, as debates around free speech and artistic liberty gain global momentum, organizations are often left to fend for themselves in this treacherous environment.
Collectively, organizations today employ tens of thousands of content moderators to sift through billions of violent, sexually explicit, and offensive online images, videos and comments. Content safety is no longer a choice but the norm, with millions of dollars in fines imposed on enterprises for hosting, collecting and storing illegal and harmful information. As parents around the world become more aware of the dangers online content poses to children and teens, safety is the top priority of platforms and users alike. Research has shown that social platforms perceived to be unsafe are likely to lose over a third of their users. With reputations and revenues at stake, organizations can no longer afford gaps in content safety. However, many leading organizations have realized that manual content moderation alone may not be a sustainable solution, for two reasons:
This raises the question: "What, then, can an organization do to protect its users and its reputation?"
The answer: Artificial Intelligence.
Early adopters of AI-enabled solutions in the content tagging and moderation space have been quick to realize the advantages of speed and scale these solutions bring. AI has matured significantly over the last five years, making it possible to deliver commercially viable outcomes across industries. While some content is more complex to analyze than the rest, the real challenge lies in monitoring content in near real time. As the rules enforced by moderation systems evolve, so do the violators of those systems. A truly effective solution, therefore, is one that uses analytics to adapt constantly to these changes, in line with community guidelines.
User-generated content can be moderated both before and after upload, and the rules and definitions can vary by region depending on the platform's priorities – so any content moderation solution must be customizable. In the pre-moderation stage, AI can tag and classify content for manual review, increasing accuracy and reducing the time a human reviewer needs. It can also limit the impact such content has on individual moderators by exposing them only to the less harmful parts of a piece of content and blurring out the rest.
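As a rough illustration of the pre-moderation triage described above – not the actual implementation of any particular product – the sketch below routes each piece of content to auto-approve, auto-reject, or human review based on classifier scores. The `classify` function here is a hypothetical keyword-based stand-in; a real system would call a trained model.

```python
def classify(text):
    """Hypothetical stand-in for an ML classifier.

    Returns a confidence score between 0 and 1 per policy category.
    A production system would invoke a trained model instead.
    """
    flagged_terms = {"threat": ["kill", "attack"], "profanity": ["damn"]}
    scores = {}
    for category, terms in flagged_terms.items():
        hits = sum(term in text.lower() for term in terms)
        scores[category] = min(1.0, hits * 0.6)
    return scores


def triage(text, approve_below=0.2, reject_above=0.8):
    """Route content: auto-approve, auto-reject, or queue for human review.

    Only the middle band of uncertain scores reaches a human moderator,
    which is how AI tagging reduces reviewer workload and exposure.
    """
    scores = classify(text)
    worst = max(scores.values())
    if worst < approve_below:
        return "approve"
    if worst > reject_above:
        return "reject"
    return "human_review"
```

The thresholds `approve_below` and `reject_above` are illustrative tuning knobs: tightening them sends more content to human review, trading moderator workload for safety.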
Systems with natural language understanding and sentiment analysis capabilities can interpret linguistic nuances across multiple languages. Such systems evolve over time and must be updated constantly to remain relevant and reliable. The content they deal with is fast-changing and challenging, and demands an understanding akin to a human being's. It is important to note that even the most advanced systems work by focusing on carefully isolated parts of human-generated content; a fully autonomous AI solution is not yet around the corner. In any content moderation system, therefore, the definition of rules is just as critical as the AI itself.
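To make the interplay between AI scores and rule definitions concrete, here is a minimal sketch of regionally configurable policy rules, as mentioned earlier. The region names, categories, and threshold values are all illustrative assumptions, not real policy values.

```python
# Per-region policy limits: a score above the limit counts as a violation.
# A kids' platform, for example, would enforce much stricter limits.
REGION_RULES = {
    "default": {"violence": 0.7, "profanity": 0.9},
    "kids_platform": {"violence": 0.2, "profanity": 0.1},
}


def violations(scores, region="default"):
    """Return the categories whose model score exceeds the region's limit.

    `scores` is a dict of per-category confidence scores (0 to 1),
    as produced by an upstream AI classifier.
    """
    rules = REGION_RULES.get(region, REGION_RULES["default"])
    return [cat for cat, limit in rules.items()
            if scores.get(cat, 0.0) > limit]
```

Keeping the rules in data rather than code is one way a platform can update its definitions as community guidelines change, without retraining or redeploying the underlying model.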
However, not all is perfect with automated content moderation. While a well-built system significantly outperforms a manual process, a poorly built one may lack contextual awareness and therefore be ineffective. What matters most is working with the right partner – one who not only understands the need for reliable technology, but also has significant experience building it. Combining over two decades of technology solutions experience with our track record of delivering cutting-edge AI-enabled automation for leading global clients, we present ValueLabs Sift, ValueLabs' AI-powered content moderation tool.
To know more about our content moderation capabilities, click here.