Scunthorpe problem

Source: Wikipedia, the free encyclopedia.

An example of the Scunthorpe problem on Wikipedia, caused by a regular expression matching "cunt" in a username

The Scunthorpe problem is the unintentional blocking of online content by a spam filter or search engine because the text contains a string (or substring) of letters that appears to have an obscene or otherwise unacceptable meaning. Names, abbreviations, and technical terms are most often cited as being affected by the issue.

The problem arises because computers can easily identify strings of text within a document, but interpreting words of this kind requires considerable ability to interpret a wide range of contexts. As a result, broad blocking rules can produce false positives affecting many innocent phrases.
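The failure mode can be illustrated with a minimal sketch of a naive substring filter. The blocklist, the function name `naive_filter`, and the sample inputs are assumptions chosen for illustration (the terms are taken from examples in this article); this is not the code of any real filter.

```python
# Hypothetical blocklist; the terms come from examples cited in this article.
BLOCKLIST = {"cunt", "penis"}


def naive_filter(text: str) -> bool:
    """Return True if any blocked term occurs anywhere in the text.

    The check is a plain case-insensitive substring search, so it has
    no notion of word boundaries or context.
    """
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)


print(naive_filter("Scunthorpe United"))  # True (false positive)
print(naive_filter("Penistone"))          # True (false positive)
print(naive_filter("Lincoln"))            # False
```

Because the search ignores where a match falls inside a word, innocent place names are rejected exactly as an obscene message would be.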

Etymology and origin

The problem was named after an incident in 1996 in which AOL's profanity filter prevented residents of the town of Scunthorpe, North Lincolnshire, England, from creating accounts with AOL, because the town's name contains the substring "cunt".[1] In the early 2000s, Google's opt-in SafeSearch filters made the same error, with local services and businesses that included Scunthorpe in their names or URLs among those mistakenly excluded from appearing in search results.[2]

Workarounds

The Scunthorpe problem is difficult to solve completely because it is hard to create a filter capable of understanding words in context.[3][4]

One solution involves creating a whitelist of known false positives. Any word appearing on the whitelist can be ignored by the filter, even though it contains text that would otherwise not be allowed.[5]
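A whitelist check of this kind might look like the following sketch. The blocklist, the whitelist entries, the word-splitting rule, and the name `is_blocked` are all hypothetical; a production filter would differ in many details.

```python
import re

# Hypothetical blocked substrings (terms taken from examples in this article).
BLOCKLIST = {"cunt", "penis", "clit"}

# Hypothetical whitelist of known false positives, stored in lowercase.
WHITELIST = {"scunthorpe", "penistone", "clitheroe"}

# Matches runs of letters (including non-ASCII letters), excluding digits and "_".
WORD_RE = re.compile(r"[^\W\d_]+")


def is_blocked(text: str) -> bool:
    """Return True if any non-whitelisted word contains a blocked substring."""
    for word in WORD_RE.findall(text.lower()):
        if word in WHITELIST:
            continue  # known false positive: skip the substring check
        if any(term in word for term in BLOCKLIST):
            return True
    return False


print(is_blocked("Welcome to Scunthorpe"))       # False
print(is_blocked("Penistone, South Yorkshire"))  # False
print(is_blocked("Scunthorpes"))                 # True
```

The last call shows a limitation of this particular sketch: the whitelist matches whole words only, so any variant spelling that is not listed is still rejected, and the list has to be maintained by hand as new false positives are reported.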

Other examples

Mistaken decisions by obscenity filters include:

Refused web domain names and account registrations

Blocked web searches

Blocked emails

Blocked for words with multiple meanings

  • In October 2004, e-mails advertising the pantomime Dick Whittington sent to schools in the UK were blocked by school computers because of the use of the name Dick, sometimes used as slang for penis.[22]
  • In May 2006, a man in Manchester in the UK found that e-mails he wrote to his local council to complain about a planning application had been blocked as they contained the word erection when referring to a structure.[23]
  • Blocked e-mails and web searches relating to The Beaver, a magazine based in Winnipeg, caused the publisher to change its name to Canada's History in 2010, after 89 years of publication.[24][25] Publisher Deborah Morrison commented: "Back in 1920, The Beaver was a perfectly appropriate name. And while its other meaning [vulva] is nothing new, its ambiguity began to pose a whole new challenge with the advance of the Internet. The name became an impediment to our growth".[26]
  • In June 2010, Twitter blocked a user from Luxembourg 29 minutes after he had opened his account and posted his first tweet. The tweet read: "Finally! A pair of great tits (Parus major) has moved into my birdhouse!" Although the tweet included the Latin name to make clear that it was about birds, all attempts to unblock the account were in vain.[27]
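  • In 2011, a councillor in the Black Country was caught up in a filtering incident over the word faggots (a traditional regional dish, but also a pejorative term for gay men).[28]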
  • Residents of Penistone in South Yorkshire have had e-mails blocked because the town's name includes the substring penis.[29]
  • Residents of Clitheroe (Lancashire, England) have been repeatedly inconvenienced because their town's name includes the substring clit, which is short for "clitoris".[30]
  • Résumés that mention graduating with Latin honors such as cum laude, magna cum laude, and summa cum laude have been blocked by spam filters because they include the word cum, which is Latin for "with" in this usage but is sometimes used in English as slang for semen or ejaculation.[31]

News articles

Other

See also

  • Censorship by Google
  • Cupertino effect – Software bug in a spell checker
  • False positive – Types of error in data reporting
  • Predictive text – Input technology for mobile phone keypads
  • Rebracketing – Process in historical linguistics
  • Spam detection – Methods to prevent email spam
  • Wordfilter – Script used to censor words or phrases on the internet

References

  1. ^ Feather, Clive (25 April 1996). Peter G. Neumann (ed.). "AOL censors British town's name!". The Risks Digest. 18 (7).
  2. ^ a b McCullagh, Declan (23 April 2004). "Google's chastity belt too tight". CNET. Archived from the original on 16 June 2011.
  3. ^ Oberhaus, Daniel (29 August 2018). "Life on the Internet Is Hard When Your Last Name is 'Butts'". Vice. Retrieved 31 July 2022.
  4. ^ Gellis, Cathy (31 August 2018). "The Scunthorpe Problem, And Why AI Is Not A Silver Bullet For Moderating Platform Content At Scale". Techdirt. Retrieved 31 July 2022.
  5. .
  6. ^ Festa, Paul (27 April 1998). "Food domain found "obscene"". News.com. Archived from the original on 10 May 2020.
  7. ^ "Foire aux questions". radio-canada.ca. Archived from the original on 21 October 2012. Retrieved 24 February 2011.
  8. ^ Barker, Garry (26 February 2004). "How Mr C0ckburn fought spam". The Sydney Morning Herald. Archived from the original on 3 September 2009.
  9. ^ Cockburn, Craig (9 March 2010). "BBC fail – my correct name is not permitted". blog.siliconglen.com. Archived from the original on 30 September 2020.
  10. ^ "Is Yahoo Banning Allah?". Kallahar's Place. Archived from the original on 14 January 2016. Retrieved 24 February 2011.
  11. ^ Rubin, Daniel. "When your name gets turned against you". The Philadelphia Inquirer. Archived from the original on 5 August 2008. Retrieved 3 August 2008.
  12. ^ "E-Rate And Filtering: A Review Of The Children's Internet Protection Act". Congressional Hearings. General. Energy and Commerce, Subcommittee on Telecommunications and the Internet. 4 April 2001.
  13. ^ "F-Word Town's Name Gets Censored By Internet Filter". Archived from the original on 1 December 2008. Retrieved 27 July 2011.
  14. ^ Chin, Josh (6 July 2011). "Following Jiang Death Rumors, China's Rivers Go Missing". The Wall Street Journal. Archived from the original on 13 August 2011.
  15. ^ Molloy, Mark (27 February 2018). "Wine lovers cannot buy Burgundy tipple on Google as internet giant cracks down on 'gun' searches". The Telegraph. Archived from the original on 2 March 2018. Retrieved 27 February 2018.
  16. ^ "Yahoo admits mangling e-mail". BBC News. 19 July 2002. Archived from the original on 26 January 2021. Retrieved 21 June 2013.
  17. ^ "Hard news". Need To Know 2002-07-12. 12 July 2002. Retrieved 21 June 2013.
  18. ^ Knight, Will (15 July 2002). "Email security filter spawns new words". New Scientist. Archived from the original on 24 September 2020. Retrieved 21 June 2013.
  19. ^ "E-mail vetting blocks MPs' sex debate". BBC News. 4 February 2003. Archived from the original on 4 February 2021.
  20. ^ "Software blocks MPs' Welsh e-mail". BBC News. 5 February 2003. Archived from the original on 4 February 2021.
  21. ^ Kwintner, Adrian (5 October 2004). "Name of museum is confused with porn". News Shopper.
  22. ^ Jones, Sam (13 October 2004). "Panto email falls foul of filth filter". The Guardian. Archived from the original on 4 February 2021.
  23. ^ "E-mail filter blocks 'erection'". 30 May 2006. Archived from the original on 4 February 2021.
  24. ^ "The Beaver mag renamed to end porn mix-up". The Sydney Morning Herald. Agence France-Presse. 13 January 2010. Archived from the original on 9 November 2020. Retrieved 24 February 2021.
  25. ^ Austen, Ian (24 January 2010). "Web Filters Cause Name Change for a Magazine". The New York Times. Archived from the original on 9 November 2020. Retrieved 24 February 2021.
  26. ^ Sheerin, Jude (29 March 2010). "How spam filters dictated Canadian magazine's fate". BBC News. Archived from the original on 16 January 2021.
  27. ^ "Luxemburger Twitter-Neubenutzer nach 29 Minuten blockiert" [Luxembourg new Twitter user blocked after 29 minutes]. Tageblatt (in German). 22 June 2010. Retrieved 12 June 2010.
  28. ^ "Black Country Councillor Caught up in Faggots Farce". Birmingham Mail. 24 February 2011.
  29. ^ Chatfield, Tom (17 April 2013). "The 10 best words the internet has given English". The Guardian.
  30. .
  31. ^ Maher, Kris. "Don't Let Spam Filters Snatch Your Resume". Career Journal. Archived from the original on 23 October 2006. Retrieved 11 February 2008.
  32. ^ Frauenfelder, Mark (30 June 2008). "Homophobic news site changes athlete Tyson Gay to Tyson Homosexual". Boing Boing. Archived from the original on 4 February 2021.
  33. ^ Arthur, Charles (30 June 2008). "Computer autocorrects surname 'gay' to.. no, you guess". The Guardian. Archived from the original on 13 November 2020.
  34. from the original on 25 October 2020. Retrieved 24 February 2021.
  35. ^ Moore, Matthew (2 September 2008). "The Clbuttic Mistake: When obscenity filters go wrong". The Telegraph. Archived from the original on 23 February 2020.
  36. ^ "Microsoft Confirms "Gaywood" Is An Offensive Surname, Mr. Gaywood Responds". May 2008. Archived from the original on 9 November 2012.
  37. ^ Keating, Lauren (17 February 2016). "These Are The Words Nintendo Censors From Appearing On The 3DS". Tech Times. Retrieved 14 November 2023.
  38. ^ Mozur, Paul; Tejada, Carlos (13 February 2013). "China's 'Wall' Hits Business". The Wall Street Journal. Archived from the original on 10 September 2013. Retrieved 25 May 2013.
  39. ^ "Faggots and peas fall foul of Facebook censors". Express & Star. November 2013. Archived from the original on 10 May 2020.
  40. ^ Gibbs, Samuel (21 January 2014). "UK porn filter blocks game update that contained 'sex'". The Guardian. London. Archived from the original on 11 November 2020.
  41. ^ Ferguson, Amber (22 May 2018). "Proud mom orders 'Summa Cum Laude' cake online. Publix censors it: Summa … Laude". The Washington Post. Archived from the original on 22 May 2018. Retrieved 22 May 2018.
  42. The Huffington Post. Archived from the original on 5 September 2018.
  43. ^ Hern, Alex (27 May 2020). "Anti-porn filters stop Dominic Cummings trending on Twitter". The Guardian. Archived from the original on 20 February 2021.
  44. ^ Ferreira, Becky (15 October 2020). "A Profanity Filter Banned the Word 'bone' at a Paleontology Conference". Motherboard. Archived from the original on 23 February 2021.
  45. ^ Morris, Steven (27 January 2021). "Facebook apologises for flagging Plymouth Hoe as offensive term". The Guardian. Archived from the original on 29 January 2021.
  46. ^ Kempf, Cédric (12 April 2021). "Insolite : Bitche est censuré par Facebook". Radio Mélodie (in French).
  47. ^ Darmanin, Jules (13 April 2021). "Facebook takes down official page for French town of Bitche". POLITICO. Retrieved 3 July 2021.