Add scalar for labels #2320

enn-nafnlaus · 2024-03-14T15:38:21Z

Is your feature request related to a problem? Please describe.

The addition of custom labels is a milestone in social media design! However, at present it is impossible to apply a label to varying degrees: a label is binary, either "yes" (labeled) or "no" (unlabeled).

To give an example: let's say someone was creating a labeling service for "Far-Right". Do they put the label on:

Someone who once said something in support of someone who, entirely unrelated, supports a right-wing politician?
Someone who makes the occasional statement directly in support of a right-wing politician?
Someone who proudly and consistently supports a right-wing politician?
Someone who's a member in an allegedly militant far-right organization?
Someone who literally is wearing a Nazi uniform in their profile?

With respect to most things, reality doesn't like binaries; there are degrees of applicability, including for the default !porn, !sexual, !graphic-media, and !nudity labels. Indeed, one may argue that !porn, !sexual, and !nudity are just different degrees of the same scalar topic.

There's also often a flip side to the coin, in that labels may apply in reverse. If someone is tagging content as anti-transgender, there's also trans-supportive content. If someone is tagging content as anti-artist, then there's also pro-artist content that could be tagged. Etc.

Describe the solution you'd like

Add a new, optional field to the label:

/** How applicable is the label to this content, on a scale of -1 to 1?  */
degree: float;   // For backwards compatibility, defaults to 1.0.

Optional: give more capabilities to labelValueDefinitions, allowing for multiple definitions for different degrees:

"labelValueDefinitions": [
  {
    "identifier": "spider",
    "degreeMin": 0.4,
    "degreeMax": 0.7,
    "severity": "alert",
    "blurs": "media",
    "defaultSetting": "warn",
    "locales": [
      {"lang": "en", "name": "Spider Warning", "description": "Spider!!!"}
    ]
  },
  {
    "identifier": "spider",
    "degreeMin": 0.7,
    "degreeMax": 1.0,
    "severity": "filter",
    "blurs": "media",
    "defaultSetting": "hide",
    "locales": [
      {"lang": "en", "name": "Spider Warning", "description": "Spider!!!"}
    ]
  }
]

... with degreeMin and degreeMax both being optional parameters, with defaults of 1e-12 (or another very low number) and 1.0, respectively.
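A minimal sketch of how a client might pick the matching definition for a given degree. It assumes the proposed degreeMin/degreeMax fields and their defaults, and it treats each range as half-open (min inclusive, max exclusive, except for a range that ends at 1.0) so that adjacent ranges such as 0.4-0.7 and 0.7-1.0 don't both match at 0.7. The type and function names are hypothetical, not part of any existing API.

```typescript
// Hypothetical shape: only the fields needed for range matching.
interface RangedLabelDefinition {
  identifier: string;
  degreeMin?: number; // proposed default: a very small positive number
  degreeMax?: number; // proposed default: 1.0
  defaultSetting: "ignore" | "warn" | "hide";
}

const DEFAULT_DEGREE_MIN = 1e-12;
const DEFAULT_DEGREE_MAX = 1.0;

function findDefinition(
  definitions: RangedLabelDefinition[],
  identifier: string,
  degree: number = 1.0, // a label without a degree behaves as it does today
): RangedLabelDefinition | undefined {
  return definitions.find((def) => {
    if (def.identifier !== identifier) return false;
    const min = def.degreeMin ?? DEFAULT_DEGREE_MIN;
    const max = def.degreeMax ?? DEFAULT_DEGREE_MAX;
    // Assumption: the topmost range includes its upper bound; others do not.
    const inclusiveTop = max >= DEFAULT_DEGREE_MAX;
    return degree >= min && (inclusiveTop ? degree <= max : degree < max);
  });
}

// Sample data mirroring the two "spider" definitions above.
const spiderDefinitions: RangedLabelDefinition[] = [
  { identifier: "spider", degreeMin: 0.4, degreeMax: 0.7, defaultSetting: "warn" },
  { identifier: "spider", degreeMin: 0.7, degreeMax: 1.0, defaultSetting: "hide" },
];

console.log(findDefinition(spiderDefinitions, "spider", 0.5)?.defaultSetting);
```

A degree below every range (0.2 here) matches no definition, so the client would treat the content as unlabeled for that identifier.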

Moderation prefs might be like:

labelers: [
  {
    did: 'did:plc:1234...',
    labels: {
      porn: 'hide',                  // Backwards compatible, no range preferences specified, default from degree >0 to 1
      spider: [[0.4,0.7,'warn'], [0.7,1.0,'hide']],
      // ...
    }
  }
]

Alternatively, one could add a new field after labels, "rangeLabels", which takes ranges, and leave the current "labels" unchanged, if there were a desire or need to keep the existing definition unchanged / consistent.
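To illustrate the first variant, here is a rough sketch of resolving a user's preference for a label at a given degree. It assumes a preference is either today's plain string or a list of [min, max, setting] ranges, as in the example above. All names are hypothetical, and the boundary handling (upper bound exclusive unless the range ends at 1.0) is an assumption.

```typescript
type LabelSetting = "ignore" | "warn" | "hide";
// Hypothetical range form: [min, max, setting].
type RangePreference = [number, number, LabelSetting];
type LabelPreference = LabelSetting | RangePreference[];

function resolveSetting(
  pref: LabelPreference,
  degree: number = 1.0, // a label without a degree is treated as fully applicable
): LabelSetting | undefined {
  if (typeof pref === "string") {
    // Legacy form: applies to any degree above 0, up to 1.
    return degree > 0 ? pref : undefined;
  }
  for (const [min, max, setting] of pref) {
    const inclusiveTop = max >= 1.0;
    if (degree >= min && (inclusiveTop ? degree <= max : degree < max)) {
      return setting;
    }
  }
  return undefined; // no range matched; the caller decides the fallback
}

const pornPref: LabelPreference = "hide";
const spiderPref: LabelPreference = [
  [0.4, 0.7, "warn"],
  [0.7, 1.0, "hide"],
];

console.log(resolveSetting(pornPref), resolveSetting(spiderPref, 0.5));
```

Accepting both forms in one field keeps existing preferences valid without migration. The separate "rangeLabels" field would trade that for a schema whose existing field never changes type.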

Describe alternatives you've considered

There is one obvious alternative, which is just "make more and more labels for every degree". For example, Nudity vs. Sexual vs. Porn. There are problems with this, however (beyond the obvious issue of label-spam).

The classic case brought up in terms of moderation is "We don't want Nazis here!". The problem is that everyone's definition of what makes someone a "Nazi" is different. For some people, "Nazi" means "literal national socialist, wants to commit genocide of non-white people, etc". For others, it's "anyone who ever votes for or says anything nice about a politician that's right-of-centre". For some people, "Nazi" means "someone goose-stepping and giving Heil Hitler salutes". For others, it's some rando who happens to drive a used Tesla. So what's a tagger to do?

When you design a system, you inherently create an incentive to use it in the way it's designed. When you create a binary-tagging system, you encourage binary usage. Most labelers aren't going to go through the nuance of creating separate "NaziLite", "NaziModerate", "NaziIntense" etc tags, confusing their users and making subscription more complex. They're just going to binary-sort the world into Nazi or non-Nazi. And they're going to either heavily over-tag or heavily under-tag, depending on the actions of the labeler and the views of the person using the labels.

The next issue is that binaries don't compound well. Let's say a moderator gets an alert about a user's posts: some might be kind of iffy about some topic, but it's not clear. They check to see if the user is already flagged for that topic. In a scalar situation, they might see the user tagged to a degree of "0.2" on that topic, or "0.8", or whatnot, and this may influence their take on whether the iffy posts are probably bad or just innocent. But without scalars, the user is either "not bad at all" or "awful".

Binaries also are ill suited to automated processing. Let's say that in the future we don't want a given label to stick around indefinitely - maybe a person liked or did something once, but then never talked about it again, or outright changed their view on it - should they be flagged forever? Should one action lead to permanent heavy moderation to begin with? But if they're repeat offenders, the system should become increasingly confident that they're a problem. We might end up with a system where, say, we have:

labelCount: float // increment by 1 every time the user is labeled; weaken via exponential decay with a half-life of, say, 1 year
degree: float     // New moderation events apply degree = ((oldDegree * oldLabelCount) + newDegree) / newLabelCount
// Drop the label altogether if, say, abs(labelCount * degree) < 0.3 and labelCount < 0.5, or whatnot.

... and the user might choose whether to alert, hide, etc based on not just degree, but labelCount (i.e. how much of a repeat offender they are on a topic).
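A rough sketch of what such bookkeeping could look like. It assumes degree is kept as a running average weighted by labelCount, that labelCount decays exponentially with a one-year half-life, and that the drop thresholds are the illustrative values above. The LabelState type and function names are hypothetical.

```typescript
interface LabelState {
  labelCount: number; // decayed count of labeling events
  degree: number;     // count-weighted average degree, in [-1, 1]
  updatedAt: number;  // time of the last update, in ms since epoch
}

const HALF_LIFE_MS = 365 * 24 * 60 * 60 * 1000; // one year

// Exponential decay: the count halves for every half-life elapsed.
function decayedCount(state: LabelState, now: number): number {
  const elapsed = Math.max(0, now - state.updatedAt);
  return state.labelCount * Math.pow(0.5, elapsed / HALF_LIFE_MS);
}

// Fold a new moderation event into the running state.
function applyEvent(
  state: LabelState | undefined,
  newDegree: number,
  now: number,
): LabelState {
  const oldCount = state ? decayedCount(state, now) : 0;
  const oldDegree = state ? state.degree : 0;
  const newCount = oldCount + 1;
  return {
    labelCount: newCount,
    // Weighted average: old evidence counts for less once it has decayed.
    degree: (oldDegree * oldCount + newDegree) / newCount,
    updatedAt: now,
  };
}

// Drop the label once the remaining evidence is both weak and stale.
function shouldDrop(state: LabelState, now: number): boolean {
  const count = decayedCount(state, now);
  return Math.abs(count * state.degree) < 0.3 && count < 0.5;
}

const first = applyEvent(undefined, 0.8, 0);
const second = applyEvent(first, 0.2, HALF_LIFE_MS);
console.log(second.labelCount, second.degree, shouldDrop(first, 2 * HALF_LIFE_MS));
```

Because the old count is decayed before averaging, a single old event fades on its own, while repeated events keep labelCount high and the label in place.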

Well, what you end up with is floating point math. But floating point math is ill-suited to use with binaries. Now, I'm not proposing to jump to such a solution immediately, but in terms of futureproofing, scalars work much better than binaries.

Lastly: it's quite simply harmless to have a scalar as an option. You can literally put in the scalar as an option, and change nothing else about the protocol or software, and still provide benefits, in that people who use it now are futureproofed for later when such a scalar might actually be utilized.

Hence, it's my take that at least providing the scalar as an option is quite desirable. It's not like four bytes for a floating point is a massive addition to a label, so it seems to be only positives.

Additional context

None.
