Bad language: block posting or censor words?

We have had a couple of proposals to configure the forum to automatically process a list of blocked / censored profane words, such as you might find in:

CMU Offensive / Profane words list
https://www.cs.cmu.edu/~biglou/resources/bad-words.txt

Note that we have very few moderation issues in this area. What minimal content we receive that is extremely profane, pornographic, or hate speech gets automatically filtered by our Akismet spam plugin. It is also fairly uncommon for users to be abusive or insulting. What is more common is for people to express frustration or to denigrate their own skills or code, for example:

Simple bad language examples

I am crap at coding
Why is this shit so hard?
Why the fuck doesn’t it work?

We have a few options:

  1. continue to moderate on a case by case basis. Casual bad language that is not abuse or hate speech will be largely acceptable unless someone specifically objects that they are offended by a specific case; then it would be moderated.

  2. set up a censor list. In Discourse this can be configured with Admin > Logs > Watched Words. This lets us configure a replacement on the fly with placeholders or euphemisms, which would appear something like this:

    • Why the f**k doesn’t it work?
    • Why the **** doesn’t it work?
    • Why the [heck] doesn’t it work?

    In the forum software, censor lists only operate outside code blocks, so they won’t affect properly marked code. For example, Chrisir’s sketch containing the word “fuck” as part of a Beckett quotation dataset would be unchanged (see the thread “How to sort words alphabetically from a String?”).

  3. set up a block list. A person submitting a post containing one of the blocked words will get a request to revise their post before submitting.
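
As a rough illustration of how options 2 and 3 differ, here is a minimal Python sketch. It is not how Discourse implements Watched Words; the word list, the replacements, and the function names (`censor`, `blocked_words`) are made up for this example, and it assumes code is marked with backticks.

```python
import re

# Hypothetical watched-word list: word -> replacement (not the forum's real configuration).
WATCHED = {"fuck": "[heck]", "shit": "[stuff]"}

# Whole-word, case-insensitive pattern built from the list.
WORD_RE = re.compile(r"\b(" + "|".join(map(re.escape, WATCHED)) + r")\b", re.IGNORECASE)

# Code spans are left untouched: triple-backtick fences or single-backtick inline code.
FENCE = "`" * 3
CODE_RE = re.compile("(" + FENCE + ".*?" + FENCE + r"|`[^`\n]*`)", re.DOTALL)


def censor(post: str) -> str:
    """Option 2: replace watched words in prose, leaving code spans as written."""
    parts = CODE_RE.split(post)
    # With one capturing group, even indices are prose and odd indices are code.
    for i in range(0, len(parts), 2):
        parts[i] = WORD_RE.sub(lambda m: WATCHED[m.group(1).lower()], parts[i])
    return "".join(parts)


def blocked_words(post: str) -> list:
    """Option 3: list the watched words found in prose; a non-empty result
    would mean asking the author to revise before submitting."""
    prose = " ".join(CODE_RE.split(post)[0::2])
    return sorted({m.group(1).lower() for m in WORD_RE.finditer(prose)})
```

With this sketch, `censor('Why the fuck doesn\'t it work? `String s = "fuck";`')` rewrites only the prose and leaves the inline code alone, while `blocked_words` on the same post returns `["fuck"]`.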

If we did set up a list, I would suggest that it be very minimal (~5-50 words), as most slurs (such as racial slurs) are already filtered as spam, and we wouldn’t want to automatically censor words that are commonly used in other registers (for example, Dick is a common name, and hell is a religious concept that you could make art or a game about).

Welcoming comments and feedback.

Well, that’s just my personal opinion, but I have never had any problem with abusive language (be it thrown about or directed at me), so for me it can just go on like this (not that I ever actually saw any offensive post anyway).

Also, as you said, most refer to themselves when cursing, so it shouldn’t affect others at all… and the community should be mainly 14+ in age (and we all know that at that age, you already know all those bad words anyway, and maybe even worse now that the internet gives fast access to everything).

All in all, the way it is now, I never saw anything even slightly worth censoring.

And keep in mind that many people post in different languages (or use them in their code, though that wouldn’t be affected anyway, as you said), and these can contain completely normal words that just happen to be spelled like bad words in English. “Dick”, for example, just means “fat” in German, so if someone used the word in German it would be censored…

Thanks for sharing these thoughts, @Lexyth.

Right: if we did anything at all, we would need to be very minimal. There are many, many problems with automatic filtering or flagging, including usage within and across languages. We also wouldn’t be looking at partial words, as this leads to the Scunthorpe problem. And the vast majority of insult words are also common words in other contexts.
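
To make the partial-word issue concrete, here is a small Python sketch comparing substring matching with whole-word matching. The one-word list and the function names are hypothetical, chosen only to show the effect on an ordinary programming question.

```python
import re

# Hypothetical one-word list, used only to demonstrate the matching difference.
BLOCKED = ["ass"]


def substring_hits(text: str) -> list:
    """Naive matching: flags the word anywhere, even inside longer words."""
    lowered = text.lower()
    return [w for w in BLOCKED if w in lowered]


def whole_word_hits(text: str) -> list:
    """Whole-word matching: the word must stand alone between word boundaries."""
    return [
        w for w in BLOCKED
        if re.search(r"\b" + re.escape(w) + r"\b", text, re.IGNORECASE)
    ]


question = "Which class should I pass to the constructor?"
print(substring_hits(question))   # flags "ass" inside "class" and "pass"
print(whole_word_hits(question))  # flags nothing
```

Whole-word matching avoids this kind of false positive, but it does nothing for words that are ordinary in another language or context, which is the other reason to keep any list very short.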

In general, I’m happier just using the automatic spam filter: it is trained and evolved, and we don’t have to curate it. Abuse is already against our community guidelines, but an additional question is whether the community also wants a norm (in the FAQ and/or automatically enforced rules) against the most common forms of basic bad language, like “oh sh*t, this is f*cking hard!” Currently, if somebody says “Your code looks like crap” that violates our current terms, but if somebody says “MY code looks like crap” then right now we don’t have any particular rules against it that I’m aware of.

Well, in my opinion, talking about one’s own code as “crap” isn’t really offensive :sweat_smile:

This forum is the Foundation’s public portal:
so the Foundation needs to protect itself and its good name
and also protect Users
( even from themselves, like when children publish photos, address, tel. number ).

  • -a- fast acting automatic measures might be required
    ( we do not want to be part of a video sharing campaign of terrorists )
  • -b- additional automatic flagging ( not censor, just inform admin of potential problem )
  • -c- but best is to do what the whole forum is about:

teaching

so while a user types (inputs) anything ( from user name to CODE )
the forum editor system could help like an enhanced

spell checker

( spelling, grammar )
just also for usually filtered words ( bad language )
updated by current political / sometimes called religious motivated / events

where to get an updated list?
I never checked, but surely the PROs have the same discussion somewhere online
? do forum admins have their own forum about best practice ?

anyhow the idea is:
that when the system helps/warns while typing
but the user, despite that warning, uses those words / bad language,
the admin also gets it as a flag,
and can be sure that the posting was wanted by that user and can act more decisively.
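
a rough Python sketch of that warn-then-flag idea ( only an illustration: the word list, the `Draft` class and the function names are invented here, nothing of it is an existing forum feature ):

```python
import re
from dataclasses import dataclass

# Hypothetical watched-word list, for illustration only.
WATCHED = {"crap"}
WORD_RE = re.compile(
    r"\b(" + "|".join(sorted(map(re.escape, WATCHED))) + r")\b", re.IGNORECASE
)


@dataclass
class Draft:
    text: str
    warned: bool = False  # set once the editor has shown a warning


def find_watched(text: str) -> list:
    """Return the watched words that appear as whole words in the text."""
    return sorted({m.group(1).lower() for m in WORD_RE.finditer(text)})


def check_while_typing(draft: Draft) -> list:
    """Editor-side check: warn the author, but never block or rewrite."""
    hits = find_watched(draft.text)
    if hits:
        draft.warned = True
    return hits


def submit(draft: Draft) -> dict:
    """Post as written; if watched words remain, attach a flag for the admins."""
    hits = find_watched(draft.text)
    return {
        "posted": True,
        "flag_for_admin": bool(hits),
        "posted_despite_warning": bool(hits) and draft.warned,
        "words": hits,
    }
```

here the post always goes through; the admin only gets a flag, plus the information that the user saw a warning first.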


that system i talk about, is already working to some degree

  • like you already get warnings when using a link that was posted already…
    so it just might need some digging and updating…

as FREE SPEECH is a high value, the hurdle to censorship
must also be high / for the system AND the admins

