Birds of a Feather Mistrust Together

Alignment, group dynamics, drifting intentions, and the Promise Theory of Trust

Mark Burgess
18 min read · Aug 20, 2023

Whether we’re talking about the animal kingdom, human collaborations, teamwork, or just choosy people, it’s a common narrative: birds of a feather (i.e. members in a group) flock together because they “sort of trust their own”. The recent work I’ve been doing about trust shows that–in fact–the opposite might well be true. Rather than trust, it seems that the dynamics of mistrust may play the key role in determining how groups come together to collaborate. Flocks flock because they mistrust something in their environment, not because they trust one another. This sounds “natural” for animals in the wild, but when it comes to humans, we somehow believe we are better than that–we are guided by morals and kindness. Indeed, we are experts at crafting moral-tinted narratives for all our human behaviour. But when we trust something or someone, we don’t really stop to hug each other happily ever after; we even drift apart as time goes by because we don’t have a good reason to go back and check on the status. While the full details of this picture remain to be written down rigorously, I want to sketch out some of these latest findings in this more popular form.

Earlier this year, I embarked upon a project to study trust with fresh eyes. It has revealed some highly suggestive, and perhaps unexpected, traits about human collaboration. In particular (and most lately), it has involved a study about how users edit the pages of Wikipedia. It’s a great case study, because there’s a lot of data to study. Some important lessons and tantalising phenomena are surely at work in those recorded interactions.

The tribulations of editing Wikipedia articles involve multiple vested interests and non-aligned intentions. They offer a rare insight into a “reservoir” of dynamically similar but semantically different intentional processes, duplicated manifold at scale. This is the realm of Promise Theory, and Wikipedia offers a simple experimental system to test some of the consequences of the theory.

Perhaps unexpectedly, given the moral history of human study, the spectre of mistrust looms far larger in these episodes than do any niceties about moral goodness. Thanks to the wonders of a technological age, most Wikipedia users don’t even know one another, let alone care enough to trust one another, and yet they can trust or mistrust the content. That puts the role of different promises into sharp relief. The data might come from a single source, but the source is large and varied, and if we can understand the dynamics as a physicist might, then we are a step closer to understanding the processes that make trust a coarse predictor of behaviour: the dream of social physics.

Trust in promises

Let me try to explain some of what I believe the study shows. We should start with some preamble, to set the context.

Human groups may form for a couple of reasons. In Game Theory, one speaks of so-called cooperative and non-cooperative groups, but this doesn’t tell us what force of attraction underlies the grouping. In a set of lecture notes, I proposed that we might understand this problem through a deeper understanding of the role of trust. Agents may accrete around a seed, coming together independently to contribute to a common cause (like fighting a common enemy or working on a common product), or they may form clusters by pairwise percolation of promise relationships. In the figure below, the left-hand picture typifies an organisation, and the right a more complex society.

Two ways groups come together: by aligning around a common seed like moths to a flame, or by mutual need to depend on one another's promises.

It can be argued that humans often care more about outcomes than about other humans–and that outcomes are the subjects of promises. We might think that we wouldn’t accept a promise from someone we didn’t trust, and yet this happens every day, because some promises are so important that we need them to be kept by someone, and there might not be many options. Do you trust the government? Do you trust your boss? Do you trust your team? Everyone’s answers to these questions will occasionally cross into darker territory, but that doesn’t lead to instant cutting of all ties. Needs overshadow trust.

Trust comes in two forms: trustworthiness assessments and invested trust, which I’ve argued behave like potential and kinetic forms. Kinetic or invested trust relates to the ancillary process of monitoring a promised process. The more important a promise is, the less we take its outcome as given, and the closer we attend to the promise-keeping activity. In Promise Theory, any promise relationship is the result of some agent (person, thing, group, etc.) offering a kind of outcome (+) to another agent, who must then promise to accept (−) and use it in order for a dipolar relationship to ensue. It costs both parties to enter into this bond. The result is a dependency.
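As a minimal sketch of this (+)/(−) pairing (all names here are mine, invented for illustration), the two halves of a promise can be modelled as records that only form a binding when they match:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Promise:
    giver: str      # agent making the promise
    scope: str      # agent the promise is made to
    body: str       # the promised outcome
    polarity: int   # +1 = offer to give, -1 = promise to accept/use

def binds(offer: Promise, acceptance: Promise) -> bool:
    """A dipolar promise bond forms only when one agent offers (+)
    an outcome and the other promises to accept (-) the same outcome."""
    return (offer.polarity == +1
            and acceptance.polarity == -1
            and offer.body == acceptance.body
            and offer.scope == acceptance.giver
            and acceptance.scope == offer.giver)

# A dependency arises only when both halves of the dipole are present.
give = Promise("editor_A", "editor_B", "cite sources", +1)
use  = Promise("editor_B", "editor_A", "cite sources", -1)
assert binds(give, use)   # dipole complete: a dependency exists
```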

So what is it that keeps groups of agents working together? This is sometimes hard to see impartially, but as we’ll see, the evidence from studying Wikipedia editing indicates that, without dependency, agents not only drift apart quite easily but may actually drive one another apart by contention. If they don’t need each other directly, but do need the outcome that all of them are trying to deliver, they may fight over the outcome and stoke division.

Most promised dependencies come and go like software, but some offers (+) are innate hardware promises, such as the labels of kin and tribe. Even these can be disregarded unless there is something to keep the group together. Even without a strong reason for staying together, a group’s togetherness need only go unopposed for it to appear more or less intact, as a tribe or a family. But even kin and tribe will eventually drift apart unless they have some attractive promises to stay close to one another. The cost of putting up with non-aligned intentions (with contention) can overwhelm the benefits of staying together.

Group and society

Discussions of Group Dynamics typically revolve around soft topics like psychology and moral philosophy, as do discussions of most human conditions. Trust is no exception. Nonetheless, I already showed that we can define a natural version of trust without any such interpretations. Now, by studying the volumes of data about Wikipedia collaborative editing, sampling the huge database using Monte Carlo methods, we indeed arrive at a somewhat different conclusion. Groups cohere out of mutual suspicion, not out of a herd’s sense of belonging.
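To make the sampling idea concrete, here is a plausible sketch (not the study’s actual pipeline) of Monte Carlo sampling of edit histories using the public MediaWiki API; the helper names are my own:

```python
import requests

API = "https://en.wikipedia.org/w/api.php"

def random_pages(n=10):
    """Draw n random article titles: a Monte Carlo sample of topic space."""
    r = requests.get(API, params={
        "action": "query", "list": "random",
        "rnnamespace": 0, "rnlimit": n, "format": "json",
    })
    return [p["title"] for p in r.json()["query"]["random"]]

def revision_history(title, limit=500):
    """Fetch (timestamp, user) pairs from a page's edit history."""
    r = requests.get(API, params={
        "action": "query", "prop": "revisions", "titles": title,
        "rvprop": "timestamp|user", "rvlimit": limit, "format": "json",
    })
    page = next(iter(r.json()["query"]["pages"].values()))
    return [(rev["timestamp"], rev.get("user", "?"))
            for rev in page.get("revisions", [])]

for title in random_pages(3):
    print(title, len(revision_history(title)), "revisions sampled")
```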

It doesn’t take a Hari Seldon to see that there is a remarkable impartiality (call it a universality) to human behaviour on a large scale, and yet our self-narrative is often a bit presumptuously personalised. We seek, perhaps, to present ourselves in a better light than we imagine. Discussions of free will and agency (in the usual individual sense) typically abound, but these certainly cannot matter too much at scale, as all such effects would be averaged away. If no universal features of our behaviour were discovered, then it could be true that we’re indeed all as unique and individual as we like to believe. But if there is so much as one general law of behaviour about mass populations, then we’re victims of statistical mechanics–yup, we’re all just Bosos shouting Fermi la bouche! (Sorry, that’s a physics joke.)

Experimental tests of Promise Theory

It was unclear to me for a long time how to demonstrate experimentally the consequences of Promise Theory. Direct tests are certainly hard to imagine, because many interactions are quite ephemeral and unreproducible, but like many interaction theories, the proof of the pudding tends to come from indirect statistical consequences at a large scale, where detailed semantics become largely irrelevant to the assessment of the measurements. Nevertheless, we can trace the origins of large scale effects to certain promises.

What does seem to be the case, on an impartial level, is that agents form groups as a result of contentious bonds formed from repeated exchange of information. We might call it a currency (the term utility is overloaded with simplistic connotations from Game Theory and Economics). In chemistry, it might be electrons. Nevertheless, we can infer that such a measure of effective economic currency must exist, entirely from the relative growth rates for different sizes of group during independent editing processes. The data show that group size distributions aren’t random at all, in a statistical sense. They are shaped by effective potentials or “forces”, like the proverbial invisible hand. Those forces are related to the two kinds of trust.

Wikipedia gives us a special opportunity to study this, because it’s effectively a giant reservoir of overlapping parallel processes, driven by largely anonymous strangers–not by kin, family, and friends. It also doesn’t seem to be run (only) by people who have deep training in certain subjects, or who have sworn an oath to honour a quest for impartial truth. Yet Wikipedia keeps its promises quite well, and has mechanisms to filter out contention over content. When we use Wikipedia we have essentially agreed to keep certain promises, and these promises are the seeds around which groups form.

Impositions versus promises

There’s a general principle in Promise Theory which says that impositional acts (i.e. “out of the blue” changes made without prior invitation or alignment) tend to reduce the assessment of trustworthiness in the imposer. Agents in a group who come to impose changes and correct one another will thus begin to escalate mistrust within the group. This is what keeps people coming back.

The evolutionary reason for mistrust may very well be a pragmatic hardwiring of processes to bring agents more into alignment. If we interact, and it doesn’t cost us too much, we may tend to align ourselves more closely–reach a kind of consensus; but once things get too contentious, the costs of interacting outweigh the benefits and agent groups break apart.

In statistical physics, this kind of phase transition tends to be governed by a parameter like temperature or pressure. Here we might imagine the effect relates to the degree of excitability of the group (analogous to a temperature) and thus the agitation of the group determines an average cohesive size through a detailed balance condition. That’s something to look for in future work.

We do see transitional features like this in the Wikipedia data. There is a sense that most editing episodes actually fizzle out from unwanted growth, rather than reach a natural state of completion. We don’t have a detailed picture yet, but we can infer some details. Agents initially come together either because they need one another (to rely on one another’s promises) or because they mistrust an outcome they care about (a collective promise in which all are involved). In neither case do they come together because they trust one another.

There are various ways N agents can interact (figure 1). A single seed can act as a hub attracting satellite agents who contribute to a common process (such as when people come to edit a single page on Wikipedia, when they form a team to build a product, or when they align against a common enemy). Alternatively, they might form a web of mutual interest, such as when a number of agents offer each other different services to delegate individual expertise and exploit the benefits of scale.

The formation of a group for a common purpose, either on an individual level or on a group level, is instigated by a seed of some type. But which purpose? It might not be obvious to everyone what this is. This specific promise around a common purpose equates somehow to the need to correct or maintain that thing (a shared goal or simply a brand identity). If people trust in a thing, they leave it alone. People come together (and more importantly remain together) due to a mistrust of the status quo: they want to know more, to check the details, or even make changes. But they have limits. At some level of agitation, the group may cost more than it’s worth.

The results

Enough preamble, let’s look at what the specific case of Wikipedia data tells us about group collaboration. The basic editing process is represented in the figure below.

Parallel threads, one for each topic, formed from punctuated episodes, are what lead to Wikipedia topic pages.

For each topic page on Wikipedia, content is generated via a sequence of “editing episodes”, or threads, in which one or more users impose their changes onto the page. They are neither invited nor commissioned–they come from “everywhere and nowhere” and have their own ideas and agendas.

Importantly, they don’t come to interact socially; they just bump into one another as they are drawn to keep a similar promise in an individual way. Each topic has to start with a single user creating the initial page, but once one user has made a contribution, the outcome will be discovered by others, who will come, assess, and criticise the changes, and want to impose changes of their own.

Like it or not, users end up flocking into a group or swarm. Sometimes those changes add to what was there, and sometimes the changes undo what was previously done and perhaps replace it with something else. Contention occurs when intentions are not aligned. The groups cohere as long as the individuals keep checking and rechecking what part of their topic content has been changed. The topic page acts as a totem or flame to gather around.

Each topic is composed of an independent set of episodes, and each episode behaves as an independent burst of editing events. Each burst is thus made by a number of users N per episode. Sometimes it’s the same people, usually it’s different people. Also there’s a large number of bots that contribute to the editing.
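To give a flavour of how “episode” and N might be operationalised (the seven-day quiet-gap threshold below is my assumption, not the study’s), one can split each page’s edit timeline wherever a long gap appears:

```python
from datetime import datetime, timedelta

def episodes(history, gap=timedelta(days=7)):
    """Split a page's (timestamp, user) edits into bursts ("episodes")
    separated by quiet gaps longer than `gap`."""
    bursts, current, last = [], [], None
    for ts, user in sorted(history):
        t = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        if last is not None and t - last > gap:
            bursts.append(current)
            current = []
        current.append(user)
        last = t
    if current:
        bursts.append(current)
    return [set(b) for b in bursts]

# Hypothetical history; in practice this comes from the sampled
# revision data. N per episode is the group size studied below.
history = [("2023-08-01T10:00:00Z", "A"), ("2023-08-01T11:00:00Z", "B"),
           ("2023-08-20T09:00:00Z", "C")]
print([len(users) for users in episodes(history)])  # -> [2, 1]
```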

Statistics

Using the change histories for each page, we can track which users made changes, which undid the changes of others, and so on. When users change one another’s content, this leads to “contention events”, which can be counted. This count correlates well with the mistrust measured from a sampling-rate calculation, based on a fair estimation of the kinetic work needed to summarise the change history (so-called kinetic mistrust).
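A toy version of such counting (the revert-detection criterion here is my stand-in; in practice one might, for instance, match revision SHA1 hashes from the MediaWiki history):

```python
def contention_events(edits):
    """Count edits that undo or overwrite a *different* user's work.
    `edits` is an ordered list of (user, undoes_previous) pairs, where
    undoes_previous flags that the edit reverted or replaced earlier
    content (however that is detected)."""
    count, prev_user = 0, None
    for user, undoes_previous in edits:
        if undoes_previous and prev_user is not None and user != prev_user:
            count += 1
        prev_user = user
    return count

edits = [("A", False), ("B", True), ("A", True), ("A", False)]
print(contention_events(edits))  # -> 2: B undid A, then A undid B
```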

The advantage of these simple criteria is that they lead to countable results. We can apply accepted methods of analysis to such quantitative analyses much more easily than we can analyse qualitative measures.

For example, one thing we see is that the number of editing episodes doesn’t die out very quickly. The contention over topics doesn’t settle into some agreed answer, as we might expect if humans were moral creatures. The length of an article is a kind of proxy for elapsed time. As articles grow, the number of contentious edits falls off only very slowly. The graph below shows a very weak exponential decay of work done by agents with article length.

Log of work input versus article length.
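A sketch of how such a weak decay might be quantified; the data below are synthetic stand-ins, since the real measurements come from the sampled change histories:

```python
import numpy as np
from scipy.optimize import curve_fit

def decay(L, w0, k):
    """Work input falling off weakly with article length L."""
    return w0 * np.exp(-k * L)

# Synthetic stand-in data for (length, work) pairs.
L = np.linspace(1e3, 1e5, 50)
rng = np.random.default_rng(0)
work = decay(L, 120.0, 1e-5) * rng.lognormal(0.0, 0.1, L.size)

(w0, k), _ = curve_fit(decay, L, work, p0=(100.0, 1e-5))
print(f"decay constant k = {k:.2e}")  # small k => very weak decay
```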

Early on in the work, I read somewhere that certain topics are more contentious than others: Michael Jackson and George W. Bush were supposed to be the most edited pages at some time. However, this turns out to be a false claim. There remains a high level of contention in all topics, from politics to mathematics. This suggests that the contention has nothing to do with the subjects, but has to do with the groups of people and bot-proxies making changes.

Users come together, make some changes, and then they go away for a while and come back to start over again. It doesn’t end. It doesn’t really matter what they are adding to the articles, what’s more important is whether their intentions are aligned or not. We measure this implicitly through the contention and turn it into numbers.

Another interesting fact is that the size of the groups of users (N) working on each episodic burst is quite small, and rarely grows very large. In fact, most episodes consist of about 8 users on average; hence the narrow, focused strip in the graph below. The contention between the users does vary a lot, both in amount and in duration, but the size of the groups doesn’t.

Level of contentious edits as a function of average episode size

What about time? The durations of editing episodes are interesting because they are the first sign that there are different kinds of topics or at least groups with different behaviours. Some topics have short bursts and others have bursts that are orders of magnitude longer, as seen in the striped bands of the graph below. These are not noise artefacts, but stable features that don’t depend strongly on the random samples taken. As yet, I have no explanation for these.

The time duration to complete an editing episode as a function of group size. This is the first evidence of a difference between certain types of page. The bands are not statistical artifacts.

Perhaps the most telling graph of all is the last one: one with a very low-noise clear pattern. This is the frequency spectrum for group sizes. It almost looks like the famous black body Planck spectrum, but it’s a bit different. It counts how many episodes were edited by groups of a certain size. The crosses in the figure below show the actual data, and the line through them shows a prediction based on a highly idealised version of the Promise Theory of Trust.

This is most likely the result of a kind of detailed balance condition, just as in other systems that have predictable spectra. We can try to figure that out in detail. Such balance conditions combine processes and special conditions into an equilibrium condition. So, we wouldn’t expect the result to apply for all cases–but it may well apply for all similar cases.
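One way to make that idea concrete (my formulation, not a result from the paper): if λ(N) is the rate at which a group of size N attracts a new member, and μ(N+1) the rate at which a group of size N+1 loses one, a stationary distribution f(N) of group sizes satisfies

```latex
% Detailed balance between attachment and detachment of users:
% lambda(N): rate at which a group of size N gains a member,
% mu(N+1):   rate at which a group of size N+1 loses one.
\lambda(N)\, f(N) = \mu(N+1)\, f(N+1)
\quad\Longrightarrow\quad
f(N+1) = f(N)\,\frac{\lambda(N)}{\mu(N+1)}
```

The spectrum then rises wherever attachment outpaces detachment, and falls where contention makes detachment dominate.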

If we think of this curve in terms of the costs and benefits for an agent who enters into a group formed by editing altercations, we can describe the rising and falling of the group lifecycle as follows.

  • When the graph is rising (the group is accreting users) it suggests that there is a benefit or motivation for new users to join the group, like an attractive force.
  • When it’s falling, it suggests the opposite.

Each shift to the right corresponds to the marginal value of adding one more user to the group. What is astonishingly clear from this graph is that groups only attract new users while they are very small. As soon as they exceed about 4 users, the tendency is for the group to break up. This conclusion is independent of any particular topic or any particular user, so it is broadly true of all users on Wikipedia.
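That flip from attraction to repulsion can be read off the spectrum by checking where the ratio f(N+1)/f(N) drops below one. A sketch, using the idealised square-root-growth, exponential-decay form discussed below (the parameter value is illustrative, chosen so the peak sits near N = 4):

```python
import math

def spectrum(N, N0=6.0):
    """Idealised group-size spectrum: square-root growth damped by
    exponential decay. N0 is illustrative, chosen so the peak sits
    near N = 4, matching the observed critical scale."""
    return math.sqrt(max(N - 1, 0)) * math.exp(-N / N0)

for N in range(1, 9):
    f0, f1 = spectrum(N), spectrum(N + 1)
    ratio = f1 / f0 if f0 > 0 else float("inf")
    verdict = "attracts" if ratio > 1 else "sheds"
    print(f"N={N}: f(N+1)/f(N) = {ratio:.2f} -> group {verdict} users")
```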

Explanations

How can we understand this curve? Well, with some caveats (and an abundance of caution: after all, the data are noisy and no idealised picture can take account of every detail), the shape comes from two competing processes: kinetic mistrust (promise) and the cost of contention (imposition).

For small numbers, users are attracted to jump into the group fray, with a kinetic velocity (i.e. a level of mistrust), as described in an earlier article. This kinetic change velocity varies like the square root of the untrustworthiness potential, as assessed by the user for each of the (N-1) others in a burst of activity. Beyond a threshold of about 4 users, the cost of attaching a new user seems to turn tail and bring about an exponential collapse in the likelihood of adding new users.
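In the energy analogy (my reading of that earlier article, so treat the details as an assumption), the change velocity follows from converting an untrustworthiness potential into kinetic mistrust:

```latex
% Energy analogy (an assumption on my part): an untrustworthiness
% potential U is converted into kinetic mistrust with "velocity" v:
\tfrac{1}{2}\,v^2 = U \quad\Longrightarrow\quad v = \sqrt{2U}
% With (N-1) others each assessed at potential u, the change velocity
% grows like the square root described in the text:
v(N) \propto \sqrt{(N-1)\,u}
```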

The fact that this somewhat speculative idea predicts the shape of a spectrum of hundreds of thousands of episodic interactions is rather beautiful (if I do say so myself). It shows that it is mistrust that is responsible for group formation, not trust. For a shared interest, without cooperation, mistrust is what keeps people looking to check and change again.

The more agents involved in a group, the less likely it is that they will share the same alignment of purpose, and the more quickly they are likely to break up. We might wonder then, what keeps larger groups together over time. Could it be a promise to a leader as a seed, or delivery of a product, or a completed article looking the way we want it?

If the users were fully aligned and working together, their contention would ideally be zero. This is very rare. Indeed, the Wikipedia data show that in most cases, contention peaks at the same value of N as the kinetic mistrust, at around N=8, and these facts alone define the spectrum above. For group sizes over 8 individuals, we must assume that the cost of agents fighting leads to a very different negative assessment: agents give up actively mistrusting and either walk away to save themselves, or simply decide to trust someone else to do the work after all.

The expression that fits the curve has square-root growth and exponential decay, slightly complicated by the dimensional scales.
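The fitted expression itself appeared as an image in the original article; a plausible reconstruction consistent with that description (with N₀ a dimensional scale) would be:

```latex
% Reconstruction (not the original image): square-root growth
% damped by exponential decay, with N_0 a dimensional scale.
f(N) \propto \sqrt{N-1}\; e^{-N/N_0}
```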

Relating this to specific promises (see the research notes), one can write the expression in a form that shows how growth is fostered by attention to detail, while dissipation results from the imposition of contention.

We thus have an idealised explanation for non-cooperative group dynamics that includes contributions from trust (or mistrust) and trustworthiness, describing a tendency for agents to be attracted by mistrust–not by trust!

If we trust, we walk past and keep going. We only engage to form new promises because there’s a suspicion of some defect (the role of the “common enemy”). When we trust, we are not repelled; rather, the attraction fades away and we are neutral–the herd drifts off to new pastures. Absence of mistrust leads to a possible random drift away from a focused promise relationship, not a rude shove.

Forget about agency and free will

One of the big distractions around the moral philosophy of trust and behavioural science is the question of morals and whether what humans do can be predicted or not. This often involves ideological positions about free will and, in particular, the notion that we are driven by concern for righteous goals. There is no evidence of such an effect here–quite the opposite.

Statistics have no opinion of good and evil; basically, every system that is large enough can exhibit patterns of regular behaviour, regardless of whether their microscopic contributions are the result of rational or irrational behaviour. Promise Theory only shows how human intentions are equivalent to trajectories in a system of virtual dynamics. More than that, we don’t need to say.

What this means is basically this: any moral notions about working together because of similar feathers, or trusting in family or tribe, play almost no role in the studies of what happens when strangers come together. Unless there is a strong enough seed (a totem or leader) to keep members coming back to check on something, the Promise Theory suggests that these concerns are irrelevant. Ultimately, people will come back if they are either curious or mistrustful or because they have robotically turned a behaviour into a promise that doesn’t cost them too much to continue with.

The frequency spectrum of editing group sizes shows something about the tendency to collaborate. It’s based on hundreds of thousands of user episodes and remains invariant under random re-sampling. The results seem to show that trusting has a neutral effect on keeping groups together, whereas mistrusting acts as a directional force, attracting agents to go to battle for their own intents and purposes. These might later be turned into habitual promises, with robotic connotations, but the initial dynamics have to be based on mistrust. So, in the end, the story goes like this:

  1. The start of a work process by some individual is a random event that acts as a seed for others to accrete to.
  2. Others will tend to make a rough assessment of trustworthiness about the group or topic before joining.
  3. If the group or topic is sufficiently trustworthy to engage with at all, engagement will initially favour the less trustworthy of the topics and groups that pass the threshold. After all, if everything is hunky dory, why bother to sign up and change anything?
  4. Eventually, rising group size leads to excessive contention, or misalignment, and once the size exceeds the critical scale, a kind of “phase transition” to a regime of group dissipation begins. The explanation for the critical scale N=8 is thus far unknown.

The two-part story of trust, i.e. the difference in semantics between kinetic and potential trust, is central to understanding the dynamics of these processes. Trustworthiness (potential trust) is used to select which agents to engage with or avoid. Once an agent has been assessed as sufficiently trustworthy (based on diverse criteria) to engage with, attention shifts to the residual mistrust (kinetic work) that drives engagement. The more kinetically trusting an agent is of another, the less it pays attention to what the other does. Without residual mistrust, agents would say “yeah, whatever” and drift apart.
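As a final sketch, this two-stage rule can be written out directly (the threshold and all names are mine, purely illustrative):

```python
import math

def engagement(trustworthiness, mistrust, threshold=0.5):
    """Two-stage trust dynamics: potential trust (trustworthiness) gates
    *whether* to engage; residual kinetic mistrust drives *how hard*.
    Returns 0 if the agent is avoided, else effort ~ sqrt(mistrust)."""
    if trustworthiness < threshold:
        return 0.0                   # not trustworthy enough: avoid
    return math.sqrt(2 * mistrust)   # engage; mistrust does the work

print(engagement(0.9, 0.0))  # trusted, unmonitored: "yeah, whatever" -> 0.0
print(engagement(0.9, 0.5))  # trusted enough, still suspicious -> checking
```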

Summary

I had no idea what I would learn from this work–yet it gave more than I expected. Sometimes what you didn’t know you already knew unfolds magnificently in the presence of a new hint. The result feels intuitively correct to me. You could say that my own interest in this topic is motivated by a mistrust of what is commonly written about trust. And that basically sums it up.

Can we apply this to other scenarios, like working in the office? What about societal opinion polls, and mobilizations of effort? Almost certainly we can. We just have to identify the promises that bring people together. Sometimes it’s work. Sometimes it’s the selection of a leader. Sometimes it’s just the desire to identify with a brand or a sports team. The vaguer the promise, the less attention it will tend to receive from all quarters, and the more resistant it might be to drifting apart. Marketers will try to displace people’s trust in one thing by attracting them to something else they can’t quite believe in. There’s much one could say on this matter, but let’s leave it there for now.

When we get basic ideas and definitions wrong, the world seems mysterious. When we clear up these inconsistencies, the world comes into focus. For now, I hope that this work makes you question the way you view trust. The more you mistrust it, the more you’ll study it and stay with it–it might change your mind. But, if you trust your preformed idea, if you really don’t care, then you’ll casually drift away from this essay none the wiser!

Read the full results


Mark Burgess

@markburgess_osl on Twitter and Instagram. Science, research, technology advisor and author - see http://markburgess.org and https://chitek-i.org