Universal Data Analytics as Semantic Spacetime

20 min read · Sep 10, 2021

Part 7. Coincidence and the Four Horsemen of Semantics

After taming the tools in foregoing episodes, we now have a robust set of techniques for comprehending and organizing data, i.e. for turning input streams into analytical structures. These dynamical “knowledge representations” describe information flows or spacetime processes on all levels. Let’s now dive deeper, with some examples, and return to some subtle questions of modelling. To tell a story about causation — whether to uncover the internal workings of nature, or to invent new services for ourselves — we need to map out the scenarios in space and in time. It doesn’t matter whether the space is physical, simulated, or completely imaginary — the principles for representing the characteristics of locations and times along a system path are central to making this map.

In this installment, we examine the four previously mentioned core relationships of Semantic Spacetime so that we can identify them quickly for use in model scenarios. We’ll see how to recognize them in the context of key-values, documents, and graphs. We start, of course, by returning to the central topic of semantic spacetime: processes and how they interact.

Eventful coincidence (x-x’)!

Coincidence is what happens when several things or processes meet at the same location: their timelines or trajectories join or cross — they are “incidentally” together. A related term co-activation is also used in biology for coincident proteins that activate processes in a kind of “logical AND” handshake, with proposal AND confirmation both required to switch on a process. It’s the same idea used in forensic investigations: if A and B are observed together, then there is some kind of connection between them — perhaps to be discovered or elaborated upon later. The role of coincidence is not always easy to discern, but in spacetime it’s simply a matter of expressing how events are composed from their coincident parts.

Co-activation is an important “valve” mechanism for regulating processes, in everything from control systems to immunology. In linguistics, coincident words convey compound semantics by forming phrases and sentences. It leads to a hierarchy of meaning from the bottom up — a kind of semantic chemistry. The particular combination of components present in the same place at the same time is the basis for encoding and elaborating specific meaning by composition of generic parts. But, we need to understand what “the same place” means for each process: when are processes independent, and when should they be seen as parts of a larger whole? The answer depends basically on characteristic scales for the processes. This is a subtle and difficult topic that I’ve addressed in my book Smart Spacetime. Combinatorics is a huge topic that spans subjects from chemistry to category theory.

As a taster of things to come, take a look at figures 1 and 2. These are process graphs translated directly into semantic spacetime by the Linux version of the traceroute program. It might come as a surprise to some that an Internet traversal isn’t just a linear chain — of course each individual observation does follow a unique path, but over coarse time, the map of possible paths splits into a multi-path integral view — quite analogous to a quantum multi-slit experiment. Along the path, the intensity of traffic at each point may be a sum over several coincident paths.

Figure 1: An Internet traceroute graphed end-to-end, illustrating the complexity of process paths due to parallelism. Pink nodes are indistinguishable, like the slits in a QM multi-slit experiment, and each arrow defines a proper time clock, so that the map forms a kind of wavefunction for Internet service.

The pink nodes in the figure represent transverse coarse grains: indistinguishable multi-slit alternative path directions (spacelike hypersections). The nodes marked “wormhole” are unidentifiable locations (a kind of dark matter!) where we know the path went, but which could not be observed. These are longitudinal coarse grains.

Figure 2: Adding a second destination and merging the graphs, to build up a map of the whole network, adds a fork in the path, but part of the path remains common, as seen from the same starting node.

The code to generate these figures is provided in traceroute.go.
Video on SST: https://www.arangodb.com/dev-days-2021/day-4/

The Semantic Spacetime answer to understanding coincidence is pragmatic: if two agents are bonded by some observed promises to one another, then they form an effective coarse grain or superagent, which can promise new features on a larger composite scale. Each grain is an event. Events have different semantics from spatial locations. They are spacetime process steps, directly tied to other nodes that represent invariant characteristics.

In Euclidean space, which lacks clear boundaries, scale is a subtle issue, but in graphs it’s straightforward. Atoms come together to form molecules. If nodes are joined by the right kind of edge or link, then they act as a supernode. Supernodes are interaction regions (like ballistic scattering regions in physics), and their composition represents a larger scale. Figure 3 shows two independent processes meeting at a location, which forms an event, and then continuing on independently.

Figure 3: Temporary coincidence is like scattering of processes, or ships passing in the night. As a causal diagram (Feynman diagram) it looks like this.

In the previous post, I showed how to easily register such events by using coincidence functions. These form event hubs that bind invariants into process steps. Such event hubs form a timeline of context specific events (e.g. PersonLocation(person,location) — see the passport examples from part 6). Some processes come together and remain together, forming a new invariant (see figure 4).

Figure 4: Sometimes two separate processes join and stay joined, forming a new “molecule” with different semantics. This is an interaction that transforms spacetime properties.

The scale of events and observers plays a role here, because some phenomena can only be seen by an agent of a certain size: too big and it won’t be able to resolve small details, too small and it won’t see large features. A hydrogen atom can always interact with much larger DNA, but it can’t unlock DNA’s functional code, because the two parts promise functionality on different scales. Unlocking the code would require something on the same scale as the DNA code, say a protein. This scaling principle obviously applies to processes, but it’s also reflected in the passive data derived from them, when we’re “imaging” processes.

From a source of data, every exterior downstream process, sampling the data stream, is entitled to its own opinion about what is separate and what is whole, but there are usually some scale markers created by processes themselves. When the configuration of parts changes (as in figure 2), this becomes clear. This, indeed, is what leads to the great diversity of genes, particles, data, and other invariant characteristics. It’s the root of the great spectacle we see all around us.

Property or coincidence? Document or graph?

We have a choice about what happens in spacetime: we choose a scale for what encompasses a location (fine or coarse grained), and thus we have a choice about whether to consider properties as being interior to a location, or exterior between locations — as intrinsic properties or as coincidences of interaction. We can perceive each version of a story if we can observe it over such a characteristic scale, in the hope that this allows us to understand it.

Understanding is that state of satisfaction we achieve when we believe we can tell a story about events, in our own words and in sufficient detail, such that it lifts us emotionally. Understanding is a curious but necessary mixture of causal process and emotional assessment, which many scientists fail to recognize.

Let’s return to the smart room example from part 4, and think about representations by node or by interaction. Suppose you enter a room in your smart campus workplace, tripping a number of sensors. A flurry of database-related interactions quickly takes place inside the room, and outside it to data stores and shared services. We can represent these interactions as pseudo-code.

Biometric sensors send your data to a service that returns your probable userID. The system registers your presence in the room and increments a counter of people using the room. We could do simple counting statistics with key-value document pairs:

Location[userID] = roomID
Room_Occupancy[roomID]++
Total_Room_Usage[roomID]++

Location (in space) is a simple key-value property, established by a room coordinate. Room coordinates are usually names or numbers, not Euclidean or geospatial coordinates, but that depends on context: are we thinking of the functional architecture of the building, or the geospatial map used for town planning? As an isolated event, we have nothing else to compare it to, before or after, in this user story — so for now it just EXPRESSes properties of a location, and a CONTAINment of things within a room. The room security system is notified and looks up your access privilege documents, collected by service. The documents contain a number of registered devices. This could be a more complex document model:

AccessRights[userID,roomID] -> {userID, roomID, Accessrights: rwxa}

The security system checks the network bindings to look for those devices. If they are found at the location, they are added to the room occupancy “expressed” state. If the wearers EXPRESSing biometrics (face recognition or fingerprints) leave the room (e.g. are detected nearby but outside the room’s functional boundary) then we might decrease the counter.

Room_Occupancy[roomID]--
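
As a minimal sketch, the counting pseudo-code above could be written in Go with plain in-memory maps. The names Campus, Enter, and Leave are hypothetical, introduced only for illustration, and a real system would of course keep these key-values in a database rather than in process memory.

```go
package main

import "fmt"

// Campus holds the three key-value maps from the pseudo-code above.
// The type and method names are illustrative assumptions.
type Campus struct {
	Location       map[string]string // userID -> roomID
	RoomOccupancy  map[string]int    // roomID -> people currently present
	TotalRoomUsage map[string]int    // roomID -> cumulative number of visits
}

// NewCampus returns a Campus with empty maps.
func NewCampus() *Campus {
	return &Campus{
		Location:       make(map[string]string),
		RoomOccupancy:  make(map[string]int),
		TotalRoomUsage: make(map[string]int),
	}
}

// Enter registers a user in a room, leaving any previous room first.
func (c *Campus) Enter(userID, roomID string) {
	c.Leave(userID)
	c.Location[userID] = roomID
	c.RoomOccupancy[roomID]++
	c.TotalRoomUsage[roomID]++
}

// Leave decrements the occupancy of the user's current room, if any.
func (c *Campus) Leave(userID string) {
	roomID, ok := c.Location[userID]
	if !ok {
		return // the user was not registered anywhere
	}
	delete(c.Location, userID)
	if c.RoomOccupancy[roomID] > 0 {
		c.RoomOccupancy[roomID]--
	}
}

func main() {
	c := NewCampus()
	c.Enter("mark", "room42")
	c.Enter("alice", "room42")
	c.Leave("mark")
	// Current occupancy, then total usage, for room42.
	fmt.Println(c.RoomOccupancy["room42"], c.TotalRoomUsage["room42"])
}
```

Calling Leave from inside Enter is a design choice of this sketch: it keeps the counters consistent when a user moves directly from one room to another without an explicit departure being observed.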

This logic of what it means to place a person or a thing at a certain location is uncertain, so we need to verify this somehow by observation. Data models basically have to allow for this uncertainty. If the devices registered to userID leave the room, we might decrement the device counters, or warn users about devices left in the room, when the userID seems to have already departed the room. Meanwhile, the smart devices scan the room for those services promised. A room is a hub, and there are spokes pointing to service relationships that could be described as a graph.

List services and devices registered to HUB(roomID)

Suppose next that, every five minutes, the room updates its machine-learning array with the number of people and devices in the room. The association of specific individuals is also a graph, so we might represent it as CONTAINment of persons and devices by a room hub. Aggregation functions, on the other hand, look for statistical functions of data (see figure 1 in the previous installment, part 6), e.g. using key-values bonded by coincidence functions on the left hand side, which incorporate event specific data on the right hand side:

Mean_Room_Occupancy[roomID,"Wed:Hr09:Min15_20"] = NewMean(roomID,"Wed:Hr09:Min15_20")
Variance_Room_Occupancy[roomID,"Wed:Hr09:Min15_20"] = NewVariance(roomID,"Wed:Hr09:Min15_20")
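
The article does not say how NewMean and NewVariance are computed. One possible sketch, assuming Welford's online algorithm as a stand-in, keeps a running mean and variance per (room, time slot) key. The types Key, Stats, and Histogram are hypothetical names chosen for this example.

```go
package main

import "fmt"

// Key plays the role of a coincidence function: it binds the invariants
// (room and weekly time slot) into a single lookup key.
type Key struct {
	RoomID string
	Slot   string // e.g. "Wed:Hr09:Min15_20"
}

// Stats holds running statistics for one key.
type Stats struct {
	N    int     // number of samples seen
	Mean float64 // running mean
	M2   float64 // running sum of squared deviations from the mean
}

// Update folds a new sample into the running mean and variance
// using Welford's online algorithm.
func (s *Stats) Update(x float64) {
	s.N++
	delta := x - s.Mean
	s.Mean += delta / float64(s.N)
	s.M2 += delta * (x - s.Mean)
}

// Variance returns the population variance of the samples seen so far.
func (s *Stats) Variance() float64 {
	if s.N == 0 {
		return 0
	}
	return s.M2 / float64(s.N)
}

// Histogram maps each (room, slot) key to its running statistics.
type Histogram map[Key]*Stats

// Observe records one occupancy sample under the given key.
func (h Histogram) Observe(k Key, occupancy int) {
	s, ok := h[k]
	if !ok {
		s = &Stats{}
		h[k] = s
	}
	s.Update(float64(occupancy))
}

func main() {
	h := Histogram{}
	k := Key{RoomID: "room42", Slot: "Wed:Hr09:Min15_20"}
	for _, n := range []int{2, 4, 6} {
		h.Observe(k, n)
	}
	fmt.Printf("mean=%.2f variance=%.2f\n", h[k].Mean, h[k].Variance())
}
```

Because the slot name repeats every week, samples from successive Wednesdays land on the same key, which is what makes the map behave like a histogram of typical occupancy.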

The key value functions form an automatic histogram of approximate room occupancy:

Room_Occupancy[roomID]

The smart room’s resource hub registers the location of the device’s connections to services:

FOR EACH service IN ROOM(roomID),
   GRAPH: device  --- is contained in --->  roomID
   GRAPH: device  --- subscribes to --->  service
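
To make the hub and spoke idea concrete, here is a small in-memory sketch of the two GRAPH statements above as labelled edges. It is not the SST library or ArangoDB; Graph, Edge, and Sources are hypothetical names used only for this illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Edge is a labelled, directed link between two named nodes.
type Edge struct {
	From, Label, To string
}

// Graph is a minimal edge list; a real graph database would index these.
type Graph struct {
	Edges []Edge
}

// Add appends one labelled edge to the graph.
func (g *Graph) Add(from, label, to string) {
	g.Edges = append(g.Edges, Edge{From: from, Label: label, To: to})
}

// Sources returns, in sorted order, the nodes that point at hub
// through edges carrying the given label.
func (g *Graph) Sources(hub, label string) []string {
	var out []string
	for _, e := range g.Edges {
		if e.To == hub && e.Label == label {
			out = append(out, e.From)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	g := &Graph{}
	g.Add("phone1", "is contained in", "room42")
	g.Add("laptop7", "is contained in", "room42")
	g.Add("phone1", "subscribes to", "wifi")
	g.Add("laptop7", "subscribes to", "printing")

	// Devices clustered around the room hub, then around a service hub.
	fmt.Println(g.Sources("room42", "is contained in"))
	fmt.Println(g.Sources("wifi", "subscribes to"))
}
```

The same device appears as a spoke of two different hubs: the room in physical space and the service in cyberspace.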

The roomID acts as a location in physical space. The service acts as a kind of location or anchor point in service space or cyberspace. When the number of subscribers becomes large, additional cloud resources might be called up and central database replicas could come online.

In this fragmented association format, everything is node-centric, and we can’t easily see the big picture. To go beyond this individual ballistic scale, we turn to graphs.

Graphs connect local key-value stores

As a graph database, the same virtual representation of this scenario looks like a cluster of attributes. It will always look something like a hub and spoke diagram when some property unites a number of related things. See figure 5 below.

Figure 5: Data (users) may be clustered around different physical or virtual locations in a hub pattern. Some hubs are fixed, others may be mobile (on planes or trains). In cyberspace, location is not the simple coordinate concept we learn about in Euclidean space, it’s a graph relation. When extended to cover a sequence of different locations, the pattern could form a sequence of hubs to describe a journey (see figure 2).

Graphical representations allow us to see the whole, or one aspect of a scenario at a time — depending on the scale. Sparseness in between hub clusters (a Small Worlds structure) helps us to develop the concept of meaning itself, by allowing us to discriminate patterns as regions over graph space. This is helpful for clarity, but when we have more representations at hand, we can choose to represent exactly how we please.

The significance of a representation relies on how we use one data space to represent another space — placing related or similar things close together or unrelated things far apart. A sequential order represents a succession of distinguishable but related things, while the attributes that connect to each step describe its properties. There is a universal graphical interpretation here, based on a relationship to space (location or memory) and time (sequence or order) — and this leads to the four basic interpretations.

Side note: a lot is made of mathematical techniques such as convolution and pooling (coarse-graining) in machine learning. These are “just” ways to compute specific formalizations of spacetime processes. There’s no contradiction between using them and identifying semantics as we do below. But we should always understand what we’re really doing when applying some black box mathematics.

Riding the four horses of semantics

I’ve alluded repeatedly, throughout this series, to the four types of connection in a Semantic Spacetime interpretation of the world. Now, having used these in practice, we can describe them more systematically.

EXPRESS: context attributes

Although the smart room data can be expressed as key value pairs, or as document data, a graphical interpretation can also be used to show how properties are connected to locations in a more intuitive light (e.g. figure 1). If the properties aren’t shared or used by others, then EXPRESSed relations are the perfect use-case for document database format. There’s usually no need to complicate the graph by exposing irrelevant noise, but sometimes it may be expedient to do so. As a relational graph, a key value store has the topology of a hub, represented by the name of the map, with keys expressing one or more properties, surrounded by a cluster of values. Although all the keys are different, they all EXPRESS some kind of property of the same meta-type:

Users clustered around a room ID.
Users clustered around a common service (local WiFi)
Rooms visited by a single user during a day.
Devices or services accessed by a particular user.
Devices or services available within a given room.
etc.

An associative map structure is always hub-like, and the relationship shown by this particular graph (which EXPRESSES properties) is not transitive — it doesn’t go anywhere else except connect to the hub.

FOLLOWS: change and motion

A narrative is more than a list of qualities expressed by a fixed location or agent. It’s a sequence of events, each of which can express attributes. When one item or event FOLLOWS another, the two play a role of a causal transition, which is involved in ordering a narrative.

Relativity tells us that we have to decide what we consider to be in motion when a scene changes. There are two ways we can build sequential narratives. Of course, nothing is ever completely fixed forever. Everything we assume to be fixed is really part of a changing process too, but on a much longer timescale. Whether we view something as fixed (static) or in motion (dynamic) is really about the relative speed or the timescales of different processes (see figure 6).

Figure 6: time interpreted as a succession of observations by a fixed sensor. Recall the passport example in part 6 of the series.

The alternative view, in figure 7, is to imagine surrounding events as being fixed, and to consider the observer to be in motion.

Figure 7: Time and location moving together as a sequence of observations by an observer that interprets itself to be moving. This is the relativistic complement of the scenario in figure 2.

In either case, events arrive in an order which is invariant for that process. This is the meaning of causality, or requisite order. Causality is the key to describing dependency networks, e.g. where something follows a chain of delivery or is composed of smaller parts that it depends on. It’s also important in delivery of packages, financial transactions, in real supply chains or data networks of all varieties. The problem of routing is also largely about discovering chains with FOLLOWS semantics.
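
One way to see what a “requisite order” means for a dependency network is to compute an ordering that respects every FOLLOWS link. The sketch below uses Kahn's topological sort, which is a standard technique chosen here as an illustration rather than anything prescribed by the Semantic Spacetime model. The function name CausalOrder and the supply chain events are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// CausalOrder returns one ordering of events that is consistent with
// every FOLLOWS link, or false if the links contain a loop.
// follows[x] lists the events that x directly FOLLOWS (depends on).
func CausalOrder(follows map[string][]string) ([]string, bool) {
	pending := make(map[string]int)   // unmet dependencies per event
	next := make(map[string][]string) // event -> events that follow it
	for after, befores := range follows {
		pending[after] += len(befores)
		for _, before := range befores {
			pending[before] += 0 // make sure the node is registered
			next[before] = append(next[before], after)
		}
	}

	// Start from events that follow nothing, in sorted order so that
	// the result is deterministic.
	var ready []string
	for event, count := range pending {
		if count == 0 {
			ready = append(ready, event)
		}
	}
	sort.Strings(ready)

	var order []string
	for len(ready) > 0 {
		event := ready[0]
		ready = ready[1:]
		order = append(order, event)
		successors := next[event]
		sort.Strings(successors)
		for _, s := range successors {
			pending[s]--
			if pending[s] == 0 {
				ready = append(ready, s)
			}
		}
	}
	// If some events were never released, the links form a cycle.
	return order, len(order) == len(pending)
}

func main() {
	follows := map[string][]string{
		"deliver": {"ship"},
		"ship":    {"pack"},
		"pack":    {"order"},
		"invoice": {"order"},
	}
	order, ok := CausalOrder(follows)
	fmt.Println(order, ok)
}
```

A loop in the FOLLOWS links means that no causal order exists, which the function reports by returning false.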

CONTAINS: inside or outside?

A concept that we take for granted in Euclidean space, but which can’t be directly represented by a graph, is CONTAINment, i.e. what it means for one node to be inside another. However, if we allow clusters of nodes to behave effectively like a single “supernode” then that is possible, albeit with ambiguous edges. This includes the concept of semantic generalization too: being part of a related cluster is being a member of a weak or strong generalization. Think, for a moment, of a bank as a central location that unifies customer accounts. The customers and their money are not really inside the bank, but are associated with it, yet we see this as belonging. So a CONTAINS relation concerns both the scaling of things into larger things (see figure 8) and a sense of carrying or ownership in different circumstances.

Figure 8: By imagining a boundary around clusters connected by different relationships, such as CONTAINS or EXPRESS, we can imagine a hierarchy of hubs within hubs, something like atoms and nuclei. This is how we represent scaled regions in a graph.

CONTAINment is how we scale data types in computing too. Primitive types that EXPRESS int, float and string properties can be combined into struct types, which in turn EXPRESS a new collective name. These in turn can be CONTAINed within new struct types and so on in a hierarchy. In a numbered sequence array, one struct type then FOLLOWS another, which we can see because one item EXPRESSes an index number which is higher or lower than another in the array.
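
The paragraph above maps directly onto Go's type system, as in this small sketch. The Reading and RoomReport types and the Predecessor helper are hypothetical, invented only to label which part plays which role.

```go
package main

import "fmt"

// Reading EXPRESSes primitive properties: a string and a float64.
type Reading struct {
	Sensor string
	Value  float64
}

// RoomReport EXPRESSes a room name and CONTAINs an ordered array of Readings.
type RoomReport struct {
	RoomID   string
	Readings []Reading
}

// Predecessor returns the reading that element i FOLLOWS, using nothing
// but the index that each element expresses. It reports false for the
// first element and for an index outside the array.
func Predecessor(r RoomReport, i int) (Reading, bool) {
	if i <= 0 || i >= len(r.Readings) {
		return Reading{}, false
	}
	return r.Readings[i-1], true
}

func main() {
	report := RoomReport{
		RoomID: "room42",
		Readings: []Reading{
			{Sensor: "door", Value: 1},
			{Sensor: "co2", Value: 410},
		},
	}
	if prev, ok := Predecessor(report, 1); ok {
		fmt.Println(report.Readings[1].Sensor, "FOLLOWS", prev.Sensor)
	}
}
```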

Mixed representations

The four types give us a kind of graph coordinate basis by which to describe generic semantics. When processes interact non-trivially, we end up with nodes linked by several different kinds of relationship. According to the semantic spacetime model, we can always classify the relationships as one of the four types, but that won’t be sufficient.

For instance, saying that Mark owns a device is semantically different from saying that Mark is carrying a device, yet both are naturally expressed by a CONTAINS relationship as a spacetime construct, because the device plus Mark form a kind of molecule that interacts as a single unit (see figure 9). The device EXPRESSes attributes of the hub, like manufacturer type, function, etc. Mark expresses attributes, like “male”.

Figure 9: two different ST styles (contains, expresses) used to model narrative relationships. We can sometimes choose to encapsulate expression as containment, e.g. in struct data types, tables, or documents formats.

Hubs bind spatial clusters together (whether in real space, cyberspace, or word concept space), but processes are things that also change in time. So we need spacetime to model process, not just location. This is where FOLLOWS comes in. Following a user journey through a network of services, we can track a user experience as a spacetime graph.

A sequence of related data, e.g. from sensory inputs, generates a narrative. Events follow other events. Events express attributes and properties of the moment. We can handle these distinctions using the code patterns we developed in foregoing posts. For instance, we can ask for the graph neighbours of a hub that are inside it, or that act as a container for others, or extract a complete subgraph of relations about containment in all its interpretations:

pairs := S.GetNeighboursOf(g,start,S.GR_CONTAINS,"+")
adjacency := S.GetAdjacencyMatrixByKey(g,"CONNECTED",false)

Or directly in ArangoDB query language, from the SST association model

FOR link IN Contains FILTER link.semantics == "INSIDE" RETURN link
FOR link IN Near FILTER link.semantics == "CONNECTED" RETURN link

NEAR semantics

It’s tempting to think of closeness or proximity, the quality of being “near” something, as having to do with physical distance (which locally means hop or edge count in a graph), but as we’ve already seen, the semantics of distance depend on the semantics of the space you happen to be thinking of at a given moment. There are many possible interpretations. Agent properties can be close in shape, in location, be connected or tethered, be close in value, in time, etc. There are many formal definitions of distance that have been designed for data and for graphs, but they’re all basically ad hoc ways of embedding a graph in some Euclidean metric space, like a scatter plot. It’s understandable that we look for this kind of mathematical relation to automate similarity, but that’s also not how we decide similarity in practice. That kind of distance can change, so there’s no sense in encoding it.

The NEARness relation, in semantic spacetime, is used as a shorthand for combinations of other properties based on invariants and additional reasoning. NEARness qualifies as a relation in its own right because we have to define the invariant criteria for two nodes to be close. You could use a version of Pythagoras’ theorem if you like, or you can choose any discriminator suiting your purpose. Effective distance can’t be derived from other properties. That said, such a definition could be related to other conditions:

If nodes are (approximately) similar in their expressed properties, e.g. word spellings (color and colour, or inbound and in-bound). This interpretation is context dependent.
If they are directly connected by a small number of FOLLOWS links, we might consider nodes to be close (depending on the specific follow interpretation).
If they can be CONTAINED within a certain type (depending on the specific contain interpretation).
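
As an illustration of the first criterion, a spelling-based discriminator might be sketched with Levenshtein edit distance. This is one arbitrary choice of measure among many, and the threshold passed to Near is an assumption that the modeller has to justify for each context.

```go
package main

import "fmt"

// EditDistance returns the Levenshtein distance between two strings,
// counted in Unicode code points.
func EditDistance(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // cost of building b[:j] from the empty string
	}
	for i := 1; i <= len(ra); i++ {
		cur := make([]int, len(rb)+1)
		cur[0] = i // cost of deleting the first i runes of a
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(rb)]
}

// min3 returns the smallest of three integers.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// Near is a context-dependent discriminator: two spellings count as
// NEAR when their edit distance does not exceed the chosen threshold.
func Near(a, b string, threshold int) bool {
	return EditDistance(a, b) <= threshold
}

func main() {
	fmt.Println(EditDistance("color", "colour"), Near("color", "colour", 1))
	fmt.Println(EditDistance("inbound", "in-bound"), Near("inbound", "outbound", 1))
}
```

Note that the measure alone decides nothing: “inbound” and “outbound” are only three edits apart, so a looser threshold would call them NEAR even though their meanings are opposite. The choice of criterion is a choice of semantics.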

The criteria to express NEARness imply a choice of semantics. This definition, encoding, and preservation of meaning indeed is the whole point. Trying to generalize nearness leads to huge dimensional spaces in neural network learning. If possible we should always limit problems by dynamic and semantic scales to avoid this problem. In practice, NEARness is most often a symmetrical (or undirected) link, expressing a kind of correlation. Perhaps surprisingly, however, it’s possible for A to be near B but not for B to be near A. In a system of one-way streets, or one-way mirrors, or unrequited friendship (I’m your friend, but you’re not my friend), A can be next to B without B being next to A. It’s perhaps an abstract notion on a human scale, but in terms of processes it’s highly important, e.g. in the biology of semi-permeable membranes, in quantum tunnelling, in electric circuit flows (e.g. diodes), and more.
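
The one-way street case can be sketched by measuring nearness as hop count over directed links. In the hypothetical street layout below, a breadth-first search finds that A is one hop from B, while B is two hops from A, and a pure diode-like link makes the reverse direction unreachable.

```go
package main

import "fmt"

// Hops returns the smallest number of directed links from src to dst,
// or -1 if dst cannot be reached from src.
func Hops(adj map[string][]string, src, dst string) int {
	dist := map[string]int{src: 0}
	queue := []string{src}
	for len(queue) > 0 {
		node := queue[0]
		queue = queue[1:]
		if node == dst {
			return dist[node]
		}
		for _, neighbour := range adj[node] {
			if _, seen := dist[neighbour]; !seen {
				dist[neighbour] = dist[node] + 1
				queue = append(queue, neighbour)
			}
		}
	}
	return -1
}

func main() {
	// One-way streets arranged in a loop: A -> B -> C -> A.
	streets := map[string][]string{"A": {"B"}, "B": {"C"}, "C": {"A"}}
	fmt.Println(Hops(streets, "A", "B"), Hops(streets, "B", "A"))

	// A diode-like link: current flows from anode to cathode only.
	diode := map[string][]string{"anode": {"cathode"}}
	fmt.Println(Hops(diode, "anode", "cathode"), Hops(diode, "cathode", "anode"))
}
```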

Precise choosing and naming of semantics

The purpose of a name or any other kind of relationship is to explain something to a reader. For all the sometimes strained mathematical justification presented in literature as if it were unquestionable truth, we basically engineer things to produce the answer we want. The reader of your story will thank you for being clear without excessive nitpicking.

Choosing a name for a relationship is a pedagogical exercise. What we can learn by traversing relationships in a graph is a separate issue. Reasoning is commonly associated with such traversal logic, i.e. transitivity. In fact, we often think of generalized reasoning as the ability to tell stories based on traversing a graph of relationships.

Transitive relationships: if A is related to B and B is related to C then A is related to C, e.g. “is the same as”. Equivalence relations are transitive. If A is greater than B and B is greater than C, then A is greater than C. So ordering relations such as “greater than” are also transitive. That’s basically because A, B, and C are all the same type of object or node.
Intransitive relationships: Suppose we start to model different kinds of things as nodes in the same graph. A belongs to B and B belongs to C. Does A belong to C? What could this mean? Try substituting some different things or persons for A,B, and C. Can we interpret the first “belongs to” in the same way as the second?
The book belongs to Mark.
The concept belongs to the book.
The concept belongs to Mark?

It seems clear that words are not enough to explain semantics unambiguously. There’s a way around this by appealing to the origin of concepts in the first place. The algebra of these links is described by Promise Theory. The four types do allow us to make a few rules for traversal however. I’ll return to that in the next installment.

Internet/BGP example

BGP is my go-to example for networking, because it covers so many concepts from scaling to causality, and because it’s the most important central core of the Internet, which few know about and fewer still understand. It’s everything from edge computing to centralized scalability, and today it’s being re-used as a redundant spacetime switching fabric in datacentres too. We can sketch out a simple model of BGP issues in terms of the semantic horsemen.

The smallest addressable entity on the Internet is a Network Interface Card (NIC), which EXPRESSES an IPv4 and/or an IPv6 address, but in terms of routing, an IP prefix is the smallest fragment. The prefix CONTAINS the assigned IP addresses in that space, which are interior to the prefix. It also EXPRESSES these to the outside world so that they can be reached.
End to end paths can be traced using the “ping” or ICMP ECHO protocol with a tool such as traceroute (see example above). In a trace route, each journey is a sequence of events called hops: each hop FOLLOWS the last, for a single journey. The next time we try it, the journey from end to end could be partly or completely different. Thus there is a difference between the map of all possible paths (the BGP “wavefunction”) and the actual path taken in a trial observation.
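
The difference between a single observed journey and the map of all possible paths can be sketched by merging several traces into one set of FOLLOWS links. This is not the author's traceroute.go, only a minimal illustration; the PathMap type and the hop names are hypothetical.

```go
package main

import "fmt"

// Link is an observed transition from one hop to the next.
type Link struct {
	From, To string
}

// PathMap accumulates every FOLLOWS link seen over many traces and
// counts how often each one was observed.
type PathMap map[Link]int

// AddTrace merges a single observed journey, given as an ordered list of hops.
func (m PathMap) AddTrace(hops []string) {
	for i := 1; i < len(hops); i++ {
		m[Link{From: hops[i-1], To: hops[i]}]++
	}
}

// Branches returns the number of distinct next hops seen after node.
func (m PathMap) Branches(node string) int {
	count := 0
	for link := range m {
		if link.From == node {
			count++
		}
	}
	return count
}

func main() {
	m := PathMap{}
	// Two hypothetical trials to the same destination take different routes.
	m.AddTrace([]string{"home", "gw", "r1", "dest"})
	m.AddTrace([]string{"home", "gw", "r2", "dest"})

	// Distinct links in the map, observations of home->gw, forks after gw.
	fmt.Println(len(m), m[Link{From: "home", To: "gw"}], m.Branches("gw"))
}
```

Each trace on its own is a linear chain, but the merged map has a fork after the gateway, and the observation counts give a rough intensity for each link.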

Figure 10: The approximate hierarchy of addressing in Internet management, with IP addresses at the bottom and assignment of prefixes by “AS” policy regions at the top.

One or more IP addresses map to a DNS/BIND domain, e.g. example.com. The DNS domain is a coarse-grained overlay of virtual nodes which CONTAIN disjoint IP addresses and prefixes.
One or more prefixes maps to a routing domain, which attaches to an IP organization. These are typically hosted by large “Telco” providers. The old model of class A,B,C networks is replaced by CIDR prefixing.
An autonomous organization may be associated with a BGP domain. A BGP policy domain is called an Autonomous System (AS). On the interior of a BGP domain, routing is performed by some protocol (OSPF, IS-IS, RIP, etc). Between AS domains, routing is performed by border gateway routers using the eBGP protocol. eBGP routers are NEAR each other, by definition.
eBGP has no hierarchy, it is a “peer to peer” network. Each peer shares information about its routes and neighbours with all its neighbours. If there are several gateway routers in a non-singleton domain, BGP information is equilibrated by the separate iBGP protocol.

Figures 10, 11, and 12 sketch a rough schematic of this.

Figure 11: ASes form bubbles inside which local routing takes place. Between ASes, routes are exchanged voluntarily and cooperatively by “Border Gateways”, which represent directions using eBGP. BGP learns and maintains a local node view of which prefixes can be reached in which direction, by sharing data with peers.

Figure 12: An end to end path is routed by a number of protocols, passing through a number of autonomous regions, all of which have to learn about each other’s presence by “machine learning” of prefixes. This enables a ranking or prioritization of possible pathways. Algorithms on all levels determine final routes for forwarding end to end packets.

There are numerous technical challenges to mapping the Internet, which I won’t go into here. As a multi-model issue, we see the need to register “things” as documents, e.g. routers, ASes, organizations, etc, as well as their relationships as graph edges. As a spacetime issue, we have to worry about measuring times (geo-temporal timestamps, proper time ordering, etc) and locations (geo-spatial coordinates, IP location, etc) which are uncalibrated and non-deterministic. For now, we can simply note that the semantic spacetime model copes well with the multiple types and causal processes involved, and that the tools I’ve chosen here support all of these options in a straightforward and scalable way.

Summary

If you ever go to art galleries, there are probably pieces that “resonate with” you — pieces you like and feel you understand, and also pieces that mean nothing to you. One of the reasons we don’t always understand unusual art is that we look at it in isolation, rather than from the context or the frame of mind that inspired the work by an artist. Art is just a data model in some medium.
In other words, art tries to show us an interpretation of a scene by encouraging certain associations. Cognition relies on these contextual semantics. To use another analogy, we wouldn’t expect to understand the meaning of plastic moulded part 134468 of an IKEA bookshelf package without having ordered the package and unwrapped it. Without the context of IKEA’s flat pack, and the assumed process of furniture building, part 134468 is just a piece of plastic, quite meaningless. Yet this is what naive data collection so often does with data. We try to copy the “state” of a system without copying the context in which the state is meaningful, like dropping a random part into a museum for people to look at. No matter how well designed, it would seem meaningless without a story to explain it.

Data representations are an art form: a basic form of cognitive expression — it’s how we define meaning from context. Context, in turn, is the anchor for semantics or meaning in a recurrent process that hopefully converges. As Picasso said:

“Art is the lie that enables us to realize the truth”

In data terms, context is a sum of sensory inputs which connect our state of awareness to recent history. In a graph with many different kinds of relation, we need to know if it makes sense to follow a particular kind of relation in order to infer something in a chain of reasoning. It turns out that — no matter what words we might use to describe or explain something — all relationships basically boil down to one of four types. These express ideas that organize dimensionally reduced concepts. Today it’s common to apply machine learning to try to find graph clustering properties, or ontology discovery. Whatever process we use to determine relationships, knowing the spacetime directions enables simpler reasoning. The reasoning behind the Semantic Spacetime model is subtle, and I described it in my book Smart Spacetime, but the summary is simple:

FOLLOWS (direction) explains the causal order of a process, i.e. dependency.
CONTAINS (hierarchical order) explains generalizations and group memberships.
EXPRESSES (intrinsic attribute) describes a node and its interior properties. The node acts as a hub for its attributes.
NEAR (comparison without order) expresses similarity or approximation to find related items during search. It can also express mutual connectivity.
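
To round off, the summary can be sketched as a tiny data model in which every link carries both a generic spacetime type and a specific, named meaning. The Go names below (STType, Link, OfType) are hypothetical and are not taken from the SST library; the example links are drawn from the scenarios discussed earlier in this post.

```go
package main

import "fmt"

// STType enumerates the four spacetime relation types.
type STType int

const (
	Follows STType = iota
	Contains
	Expresses
	Near
)

// String returns the upper-case name used in the text.
func (t STType) String() string {
	names := []string{"FOLLOWS", "CONTAINS", "EXPRESSES", "NEAR"}
	if t < 0 || int(t) >= len(names) {
		return "UNKNOWN"
	}
	return names[t]
}

// Link pairs a generic spacetime type with a specific, named meaning.
type Link struct {
	From, To  string
	Type      STType
	Semantics string
}

// OfType selects the links that share one spacetime type, and so can
// be reasoned about in the same way whatever words describe them.
func OfType(links []Link, t STType) []Link {
	var out []Link
	for _, l := range links {
		if l.Type == t {
			out = append(out, l)
		}
	}
	return out
}

func main() {
	links := []Link{
		{From: "Mark", To: "device", Type: Contains, Semantics: "owns"},
		{From: "Mark", To: "device", Type: Contains, Semantics: "is carrying"},
		{From: "device", To: "manufacturer", Type: Expresses, Semantics: "has attribute"},
		{From: "hop2", To: "hop1", Type: Follows, Semantics: "comes after"},
		{From: "routerA", To: "routerB", Type: Near, Semantics: "peers with"},
	}
	for _, l := range OfType(links, Contains) {
		fmt.Printf("%s -(%s: %s)-> %s\n", l.From, l.Type, l.Semantics, l.To)
	}
}
```

Owning and carrying remain distinct in the Semantics field, yet both are selected together by their shared CONTAINS type, which is the sense in which the four types simplify reasoning without replacing precise naming.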

In the next installment, we delve deeper into the causal algebra of implications for Semantic Spacetime and its four distinct types, to apply these ideas to even more cases of causal multi-path prediction, from edge computing to transport logistics.
