
Propagating Credence

I’m busy designing a governance tool called Option. If you’re reading this around the time this article was published, you can play with the prototype here:

The core data structure behind Option is illustrated in this image:
image.png

Next to each yellow circle (what I call a pip) is the text of a claim, and under each yellow pip is a silver pip, followed by another yellow pip. That second yellow pip is called the “child claim,” and its text is displayed next to it.

A claim:
image.png

Its child claim:
image.png

If you hover over one of these silver pips (or click it), you’ll find that it too is secretly a claim, and you can see the underlying claim’s text.

image.png

In this case, I’ve hovered over the pip sandwiched between the claim “We should not slow down AI research.” and the claim “We should slow down AI research.” What it reveals is a claim that says ““We should not slow down AI research.” invalidates “We should slow down AI research.””

The nested quotes can get a bit difficult to parse as they start to add up, but you understand the premise: everything is a claim, and a special type of claim can be used to link two claims together. We’re focusing on linking invalidating claims to other claims.
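To make that data structure concrete, here is a minimal sketch in Python. The class and field names (`Claim`, `source`, `target`, `children`, `is_link`) are my own invention for illustration, not Option’s actual schema: every node is a claim, and a linking claim is just a claim that additionally references the two claims it connects.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Claim:
    text: str
    credence: float = 0.0
    # Only set on linking claims: `source` invalidates `target`.
    source: Optional["Claim"] = None
    target: Optional["Claim"] = None
    # Child claims, each of which tries to invalidate this claim.
    children: List["Claim"] = field(default_factory=list)

    def is_link(self) -> bool:
        return self.source is not None and self.target is not None

# The example from above: one claim invalidating another, joined by
# a linking claim that is itself a claim (so it too can be invalidated).
slow = Claim("We should slow down AI research.")
fast = Claim("We should not slow down AI research.")
link = Claim(f'"{fast.text}" invalidates "{slow.text}"', source=fast, target=slow)
slow.children.append(fast)
```

Note that because the linking claim is an ordinary `Claim`, it can carry its own children — which is exactly what the rest of this article leans on.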

To return to our prior example, “We should slow down AI research.” can be invalidated by, “We should not slow down AI research.”

For now, I’ll resist the urge to argue for why this is sensible from a psychological as well as a computational perspective. Just know that this is the game: each child claim tries to invalidate the parent claim.

I will admit that from a user perspective this is quite difficult to get used to. It’s easy to accidentally start thinking of these claims as headings under which you want to put supporting arguments. In fact, I did just that about 20 minutes ago in preparation for writing this article, and added a bunch of claims arguing against slowing down AI research under the claim “We should not slow down AI research.” But as a reminder (for you as much as for me) claims are not headers, they’re claims, and the way to contribute in Option is to be critical of that claim.


After using Option for even a short period of time, you’ll begin to run into some common issues. First, you might find that someone has made a truly ridiculous claim worthy of ridicule. For the most part you already know how to handle this — you add invalidating claims underneath it — but there’s one case where it seems especially wrong.

What happens when their claim is totally irrelevant to the parent claim?

For example, here we consider the claim that the Wintermute hack should be reversed (i.e. the funds should be returned to the hacked users). All of the claims under the parent claim are decent; you might not agree with their content, but you are at least likely to agree they belong there. All, that is, save (at least) one. Can you spot it?
image.png

I hope you’ll agree that the claim, “You can’t mute winter, it can’t speak.” is not relevant to the parent claim. Even if it’s true, it has no bearing whatsoever on the Wintermute hack. “Wintermute” is just a name for the hack, it doesn’t have anything to do with the season winter. How can we handle this?

You’re an astute reader, so you already know where this is going. I’ve already told you that linking claims can be clicked on, that everything in Option is a claim, and that the action you can take under a claim is making a new claim about how it can be invalidated. If we put all that together we get this:

image.png

Here we see that we’ve selected the claim, ““You can't mute winter, it can't speak.” invalidates “The Wintermute hack should be reversed.”” and we’ve offered the exact invalidating claim from above, that is, that “"Wintermute" is just a name for the hack, it has nothing to do with winter the season.”

Now you understand how to deal with irrelevant claims that you don’t believe belong somewhere, but you might start to get the feeling that it doesn’t really matter. After all, even though we’ve put our invalidating claim under the linking claim, the silly “You can't mute winter, it can't speak.” claim still shows up. How do we make it matter that we added this, or, for that matter, any claim?

This is a great question, and it’s really, really not easy to answer. Rather than take you through all the things that won’t work (and believe me, that negative space is enormous), I’ll just give you an outline of my approach.

There’s probably a future article here exploring that space, for example, pinning down what exactly is so bad about voting, but that’s beyond my interest at the moment.


What I’d like to introduce you to is the quantity called “credence”, an all-important score in Option, and I’ll drag you down with me into the nitty-gritty of the interesting design choices that need to be made in order to calculate that score.

Put simply, credence is a number that shows up next to a claim in Option. It indicates how much people believe that claim. A credence score is in the domain of the reals, which is just to say it can be any number you can think of between negative infinity and positive infinity.

For most of this exploration, we’re going to be focusing on the impact of this scoring mechanism on the expressiveness, the incentives, and the user experience of the tool. In particular, we’re going to look at a design choice that’s proving troublesome for me.

The design choice in front of us is actually extremely simple to describe and is summed up in the very image we started with:
image.png

One thing you’ll doubtless notice is that the relationship between these two claims:
We should not slow down AI research.
There are no interesting claims in Option.
is distinctly different in kind from the rest of the relationships between claims.

To illustrate this, consider how, if “We should slow down AI research.” is considered true, it completely invalidates its parent claim: “We should not slow down AI research.” If “We should slow down AI research.” had a high score, we would expect “We should not slow down AI research.” to have its score degraded in comparison.

If you think this is just a special fluke because these two claims happen to be the direct inverse of one another, consider the relationship here:
We should slow down AI research.
We should build a godlike super-AI of perfect goodness to prevent AI doing harm.
Here, the relationship is weaker, but we can still understand that the author is perhaps arguing we should accelerate our AI research, not slow it down, because AI is the very thing that will save us from AI.

In both these cases, the validity of the claim — which here I’m calling credence — can be thought to directly chip away at the credence of the parent claim. In other words, one way to encode this relationship could be with this algorithm:
def propagate(child_claim):
    return -child_claim.credence

Here we’re imagining a quite simple function called propagate which returns the negation of the child claim’s credence (remember, each claim is invalidating its parent claim). It’s “propagating” the credence of one claim up to the parent claim. This is one way you might use this function:
claim.credence += propagate(child_claim)

For now, I’m not at all going to talk about where credence comes from in the first place, I’m just going to will it into existence and we’ll start putting it to use.
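To see the arithmetic in action, here is a runnable version of that sketch, with some willed-into-existence credence numbers (the values are arbitrary, purely for illustration):

```python
class Claim:
    def __init__(self, text, credence=0.0):
        self.text = text
        self.credence = credence

def propagate(child_claim):
    # Each child claim invalidates its parent, so its credence
    # counts *against* the parent's credence.
    return -child_claim.credence

parent = Claim("We should not slow down AI research.", credence=5.0)
child = Claim("We should slow down AI research.", credence=3.0)

# The child's credence of 3.0 chips the parent down from 5.0 to 2.0.
parent.credence += propagate(child)
```

A child with negative credence (one the community actively disbelieves) would, by the same arithmetic, push the parent’s score up — a design consequence worth noticing.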

Now let’s compare those prior example claims to this claim pair:
There are no interesting claims on Option.
We should not slow down AI research.

The relationship between these two is much more interesting than the previous pairings we considered.

In previous pairings, the credence of the underclaim could be thought of as directly invalidating the parent claim. Information about that credence could be used to do work in the propagation algorithm to reduce the credence of the parent claim.

Here, it’s not quite as clear that’s the case, though we still understand the relationship. What the author seems to be saying is that “We should not slow down AI research.” is itself an interesting claim. In other words, the author would probably agree with our reasoning if we clicked on the linking claim ““We should not slow down AI research.” invalidates “There are no interesting claims in Option.”” and added this claim under it: ““We should not slow down AI research.” is not an interesting claim.”

In fact, that’s exactly what you’ll see in Option if you click on that linking claim:
image.png

Just to clarify: the original author of the linking claim ““We should not slow down AI research.” invalidates “There are no interesting claims in Option.”” probably wouldn’t agree with the claim “”We should not slow down AI research.” is not an interesting claim.”, after all, they linked it there. But they would agree with the relevance of that claim. In other words, they would endorse this claim:
“““We should not slow down AI research.” is not an interesting claim.” invalidates ““We should not slow down AI research.” invalidates “There are no interesting claims in Option.”””

I told you the quotes were going to get worse. To do the untangling for you, this long ass claim is saying that:
“We should not slow down AI research.” is not an interesting claim.
and all told the above line is a claim that invalidates the linking claim below:
“We should not slow down AI research.” invalidates “There are no interesting claims in Option.”

Which it does: this is exactly equivalent to making the claim, ““We should not slow down AI research.” is an interesting claim.” The positive way to write this long list of negatives is:
If the claim “We should not slow down AI research.” was not an interesting claim, then “We should not slow down AI research.” would do nothing to invalidate “There are no interesting claims in Option.”

This is an important distinction because it pulls apart two important concepts:
A claim itself
The quoted claim

In other words, it suggests a different way to design the tool: you could design it such that the propagation algorithm didn’t allow any strange behavior like this. Instead, it could always be the case that only the child claim’s credence contributed to the parent claim’s credence.

The advantage would be a much simplified model, which didn’t require taking into account the distinction between quoting a claim and the claim itself. Instead, it could be left to the user to directly quote the claim. For example, in the case of the relationship between these two claims:
There are no interesting claims on Option.
We should not slow down AI research.

In a tool without an ability to handle the quote of a claim as an invalidation of a parent claim, all the user would have to do is quote it themselves. In other words, instead of writing “We should not slow down AI research.” as an invalidating claim, they would write “The claim, “We should not slow down AI research.” is an interesting claim.”

One might argue that this lacks some elegance, because now all those invalidating claims under, “There are no interesting claims on Option.” would be quite repetitive and would have to be manually placed there. You couldn’t use the tool quite as easily to navigate. But that’s hardly an argument, right? And further, it could be easily avoided by things like an internal language for the tool which might allow templating like, “”{parent_claim}” is not an interesting claim.”

Coupled with a search feature like, “show me all the instances of the claim “”{parent_claim}” is not an interesting claim.”“ you would easily be able to enumerate all the claims that invalidate the parent claim, “There are no interesting claims on Option.”

However, for some reason, I don’t think this is right. I can barely express why.

I have only three hints to offer as to why this is a mistake, and as to why it’s worth figuring out that horrible convoluted indirection thing that’s implied by allowing quotes of claims to be invalidating themselves. Here are the three:
the claim “We should build a godlike super-AI of perfect goodness to prevent AI doing harm.” and the way in which invalidation defines relevance domains
the inelegance of search
default logic

I’ll start with the best argument I can muster and then I’ll trail off into incoherence.

When we examined the relationship between these claims:
We should slow down AI research.
We should build a godlike super-AI of perfect goodness to prevent AI doing harm.

We discovered that we had to do a bit of interpretation on behalf of the author of #2. In particular, we had to infer that the domain in which this would apply to the parent claim is only one where building the godlike super-AI is in conflict with slowing down AI research. Of course, we could imagine a world where everyone agrees simultaneously that we should go for this godlike super-AI and that getting there requires us to slow down research. And so, the invalidating claim for the link between these two claims:
We should slow down AI research.
We should build a godlike super-AI of perfect goodness to prevent AI doing harm.
Could be a claim like:
“Building a godlike super-AI would require slowing down AI research.”
“Building a godlike super-AI would be compatible with slowing down AI research.”
“Slowing down AI research wouldn’t preclude building a godlike super-AI.”

There are three attempts at invalidating the link between 1 & 2, each with slightly different meanings (the good news is that we don’t have to worry about duplication or overlap in Option, for reasons I’d like to talk about someday but not today).

Importantly, this invalidating claim performs a necessary function: it defines the domain over which the claim, “We should build a godlike super-AI of perfect goodness to prevent AI doing harm.” has impact: wherever those link-invalidating claims are thought to be true, the propagation to the parent should be zero.

On one hand, you could argue that the link invalidating claim is once again just modulating the child claim’s effect on the parent claim. That’s a fair argument.

However, to me, it also seems like maybe we once again have an instance where “the way in which” the child claim invalidates the parent claim matters. Much like the way in which, “We should not slow down AI research.” invalidated, “There are no interesting claims on Option.”

However, I will admit that the clear difference is that in this case we’re just modulating the credence of the child claim on the parent claim by way of the linking claim. Whereas in the “There are no interesting claims on Option.” case we’re directly trying to propagate from the linking claim as if it were a claim itself.
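One way to encode that modulation in code — and to be clear, this is my speculation about a possible mechanism, not Option’s actual algorithm — is to treat the linking claim’s credence as a gate on the child’s contribution. When link-invalidating claims drive the link’s credence to zero, the child stops affecting the parent entirely:

```python
class Claim:
    def __init__(self, text, credence=0.0):
        self.text = text
        self.credence = credence

def propagate_via_link(child_claim, link_claim):
    # Clamp the link's credence into [0, 1] and use it as a gate.
    # A fully believed link passes the child's (negated) credence
    # through untouched; a fully invalidated link passes nothing.
    gate = min(1.0, max(0.0, link_claim.credence))
    return -child_claim.credence * gate

child = Claim("We should build a godlike super-AI of perfect goodness "
              "to prevent AI doing harm.", credence=4.0)
live_link = Claim("link", credence=1.0)   # link-invalidating claims lost
dead_link = Claim("link", credence=0.0)   # link-invalidating claims won

full = propagate_via_link(child, live_link)   # the child counts in full
none = propagate_via_link(child, dead_link)   # the child counts for nothing
```

The choice of a [0, 1] clamp is itself an assumption; one could equally let a strongly disbelieved link flip the sign, which would be a very different design.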

This exploration has so far left me feeling more like allowing quotes of claims to be invalidating themselves might not be a good strategy, and we’d be fine with the template language and search solution. This is why we write. So, let’s move on to our second concern: the inelegance of search.

The concern here is that we’d like to introduce as few elements as possible into the system. However, here we’re considering adding a template language and search. The user story would be this:
User creates a claim like, “{claim} is an interesting claim”
User uses that template claim on existing claims, essentially annotating them (this is somewhat like giving them a type)
User then makes the claim, “The set of claims matching template, “{claim} is an interesting claim” invalidates “There are no interesting claims on Option.””
The system would then have to render all the claims in the interface
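A hedged sketch of what that template mechanism might look like (the `{claim}` placeholder syntax and the helper name are hypothetical, not anything Option implements): compile the template into a pattern, then use it to find the claims that instantiate it.

```python
import re

def template_to_matcher(template):
    # Turn a template like '"{claim}" is an interesting claim.' into
    # a regex that captures whatever stands in for {claim}.
    pattern = re.escape(template).replace(re.escape("{claim}"), "(.+)")
    return re.compile(pattern)

claims = [
    '"We should not slow down AI research." is an interesting claim.',
    "You can't mute winter, it can't speak.",
    '"We should slow down AI research." is an interesting claim.',
]

matcher = template_to_matcher('"{claim}" is an interesting claim.')
matches = [m.group(1) for c in claims
           if (m := matcher.fullmatch(c)) is not None]
# `matches` now holds just the claims annotated as interesting.
```

The search step in the user story would then be this `fullmatch` scan over the claim store, which already hints at the computational-cost worry raised below.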

This is all fine, I guess, but it worries me for two types of reasons (while we’re at it):
mechanically
aesthetically

Mechanically, I’m concerned I won’t know how to propagate credence from all those “{claim} is an interesting claim” subclaims (and whatever fills each {claim}) up to the parent claim “There are no interesting claims on Option.” In other words, if in one universe “{claim} is an interesting claim” is only used on what are considered to be very interesting claims, and in another it’s only used on claims that are quite boring as judged by the community, shouldn’t the credence propagate differently? Clearly in the second case the argument is much weaker, but how much weaker?
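One speculative answer (entirely my guess at a mechanism, not a worked-out design): weight the aggregate by how credible the instantiated template claims themselves are, e.g. scale the propagation by their mean credence, floored at zero so a pile of disbelieved annotations contributes nothing.

```python
def propagate_template(instance_credences, parent_sign=-1.0):
    # `instance_credences` holds the credence of each instantiated
    # "{claim} is an interesting claim" subclaim. If the community
    # believes those instances, the aggregate argument against the
    # parent is strong; if it disbelieves them, it vanishes.
    if not instance_credences:
        return 0.0
    mean_credence = sum(instance_credences) / len(instance_credences)
    return parent_sign * max(0.0, mean_credence)

strong = propagate_template([4.0, 5.0, 3.0])   # credible instances
weak = propagate_template([0.1, -2.0, 0.0])    # mostly disbelieved
```

Whether the mean is the right aggregator (versus, say, the max, or a sum with diminishing returns) is exactly the open question; this sketch only pins down the shape of the problem.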

Mechanically, how would the template claim work? Would it be a single claim that’s just applied in multiple places, where its child claim is the mentioned {claim}, or does this require its own special treatment? For example, what if there are multiple {claim}s mentioned in the template?

Aesthetically, how do you render those claims in the interface? It seems like running the search will be computationally costly, and like you’d want to surface the strongest claims first. But you’d also like to avoid repeatedly showing the “{claim} is an interesting claim” text. This, I think, is a surmountable hurdle.

Finally, default logic. I want it. I think allowing quotes of claims to be invalidating makes this a default logic, which is great for legal programming, whereas in the latter case it’s more like a type-theoretic structure. Is that even the case? I promised I’d collapse into incoherence by the end, maybe this is that moment...