Universal Data Analytics as Semantic Spacetime

Mark Burgess
Sep 15, 2021

Part 8: The Cone of Semantic Future: Causal Prediction, Entanglement, and Mixed States in Graph Spacetime

Having developed some of the basic code patterns, and introduced the four spacetime relations, let’s explain how these four semantic interpretations can be used in practice — to understand and predict dynamical processes across multiple scales. I’ll use a couple of more realistic examples that show how these real-world processes form graphs: i) with some Natural Language Processing based on fragmentation of a stream of input, analogous to bioinformatics and ii) multi-path flow processes with alternative routes, such as we find in routing, supply chains, and quantum experiments. We’ll see how the coding of a data representation in applications can now be extremely simple, and how the four spacetime relations allow us to see and explore beyond constrained typological relationships, to reveal emergent discoveries about the data.

Graphs are active not passive structures

In part 7, I showed how to capture and generate spacetime process graphs, merging parallel timelines where necessary. At the data capture stage, we may be entirely unaware of the existence of a graph representation — only later, during analysis, would the graph be revealed as a network. Identifying a working representation early is an advantage, but we mustn't over-constrain it and kill off the benefits of a network in the process. Graphs naturally want to remain active processes, not become passive or dead archival data structures. In a sense, the construction and the interaction by search are continuations of the same larger cognitive process. The semantic spacetime co-activation method generates this for us without prejudice.

In this post, we'll look at how to create and search active graphs and explain the meaning of their large scale structures as they grow. It turns out that searching graphs is the more interesting problem, because it exposes an apparent conflict between the desire for single-threaded storytelling about data and the multiplicity of causal chains in the larger spacetime that might ultimately contribute to an outcome. This is the "pipeline" problem. Unlike typological approaches (e.g. OWL, RDF, etc.) that aim for uniqueness and rigidity of reasoning, a spacetime approach works with general superpositions of concerns that emerge unpredictably from complexity at scale.

There are two main themes to address in this post:

  • Causal path relationships as processes (timelike trajectories), which may include loops. These are naturally represented by graphs.
  • Locally expressed attributes and similarities (snapshots in space), which describe intrinsic nature (scalar promises). These may be represented either as graphs or as document internals.

Superposition, multi-slit experiments, path analysis — these sound exotic but are surprisingly common phenomena. For some, they've come to be uniquely associated with subjects like Quantum Mechanics; for others, with the Internet. In the last post I showed how such structurally complex graphs can emerge from "simple-minded" traceroute measurements of the Internet. One might have assumed such graphs to be purely linear, yet parallel processes — juxtaposed by projecting along a common observer timeline — reveal a kind of "many worlds" view of a network packet trajectory. The existence of parallel process clocks, integrated over a complete map, will always lead to a non-local picture!

Non-local Internet: The major Internet routing algorithms, such as BGP, which bind together locally autonomous regions of the Internet, promise each other information about non-local "link state" by a slow process of diffusion (see part 7). The alternatives are then "integrated" to form a kind of probabilistic map, with weights governed by underlying dynamics. The map behaves much like a wavefunction in quantum mechanics in predicting general features but not explicit deterministic outcomes. The Principle of Separation of Scales suggests that this formation of a guide map (as an independent process over longer timescales, and in advance of the trajectories it predicts) is a naturally separable process from the actual probabilistic enactment of each trajectory. This is what happens in transportation, in physical, chemical, and biological gradient fields, and it seems likely that it's at the root of the Schrödinger wavefunction non-locality in quantum mechanics too.

The four characteristics (again)

It sounds like something out of Sun Tzu’s Art of War, but then we could indeed claim that data representation and mapping are about strategic planning. The implication of the four spacetime semantics is simple, and is depicted iconically in figures 1 and 3 below.

Figure 1: Transitive relations can be used for extended reasoning. From a focal point in a search, they reach forwards or backwards in process time forming past and future cones. Many roads lead to Rome, and Rome has roads to many places — making a conical structure, like an hourglass. The non-transitive relations have only a point-to-point effect so they look like clouds surrounding an anchorpoint.

Of these relations, we recall that:

FOLLOWS relations need prerequisites — i.e. from and to nodes, implying causation, which can trace paths in a consistent direction. If A follows B and B follows C then we have a story from A to C, though not necessarily with the same semantics as the prior links. The class of relations that falls under FOLLOWS tells us about causal order, e.g. is caused by, depends on, follows from, may be derived from, etc.

When we trace processes by traversing FOLLOWS links, we are not allowed to suddenly change to another type of link, as that is meaningless. The rules for paths are very simple in semantic spacetime — far simpler than in logical approaches like RDF or OWL.

FOLLOWS relations may also entangle nodes (making nodes co-dependent), e.g. linking in opposite directions with respect to the proper time direction, to form deadlocks or equilibria (see figure 2). In a proper time basis, these look like diagrams with acausal behaviour (loop diagrams). They oppose or describe countervailing processes, which appear to move backwards in proper time; this is because they form bound interior states of a new strongly connected composite agent, and we now have to view proper time on the composite scale — i.e. as the time which is exterior to the bonded superagent.

Figure 2: Countervailing causally ordered processes form entanglements of agents, in which time goes in no particular direction on a large scale. The arrows are FOLLOWS links, i.e. directed, because they represent conditional transitions. For example, a graph of trade with promises "I'll pay if I get the goods" and "I'll send the goods if you pay". Some symmetry breaking is required to progress in exterior time. The agents are in "deadlock" or dynamical equilibrium until the causal symmetry is broken.

Sometimes physicists claim that entanglement (co-dependence) and mixed states are purely quantum mechanical features. This is not correct. They are characteristics of multi-scale maps, and we need to appreciate their significance in order to reason about graphs at scale. Several parallel interpretations could be in play in a graph transition. These interpretations are dynamically compatible, since they share the same spacetime type — but can they all be reconciled semantically in context? These mixed semantic states are illustrated below in the multi-path example.

CONTAINS relations explain what things are inside or outside other things, e.g. is a kind of, is part of, etc. This is important in coarse-graining or scaled representations of data. It has a transitive directionality too, but over a limited range — a largest scale (limited by finite speed) and smallest scale (a ground state of the invariant chemistry).

CONTAINS can be combined with FOLLOWS to tell multilinear stories with semantic generalization, e.g. iron CONTAINS(belongs to) metals and metals FOLLOWS(are required for/implies) strong structures CONTAINS(which contains/includes) bridges. Now we have a story connecting iron with bridges. There could be many more stories, and they will have the same structure, i.e. Iron (CONTAINS.Bwd) Node (FOLLOWS.Fwd) Node (CONTAINS) Node. The CONTAINS relation can be used to jump up and down the levels of description. Iron is a kind of metal, what are other metals that could tell the same story?

EXPRESSES: At each location, along a path formed by the above, we can stop and look parenthetically at a node to see what is EXPRESSed by it, e.g. Iron is a grey metal which rusts, melts at around 1500 degrees Celsius, and is used to form alloys like steel. Expressing interior detail is the equivalent of opening a document in ArangoDB and looking inside, but we can also imagine referring to properties that are not inside a document, but are represented as exterior graph nodes. That's a modelling choice. Not everything should be graphical in representation. We have to decide what scale we model at: what we consider to be interior and what we consider to be exterior to a location.

In the search above, we could add an inference based on exploration: Iron is used to make steel. Steel (belongs to) stable alloys and stable alloys (are required for) strong structures; notice that strong structures (are similar to) skeletons (which include) hip joints. Now we find an application for Iron as a possible hip replacement.

NEAR: Finally, a model sometimes has a notion of things that are similar to other things. NEARness can be physical or logical, e.g. perspex NEAR(is like) glass (in the context of kitchenware or display technology). Iron NEAR(may be confused with) steel. This connection between iron and steel doesn’t negate or override the causal relation that steel derives from iron. They are both active as a “mixed interpretational state” to be resolved by context.
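To make the four link types concrete, here is a minimal, self-contained Go sketch showing how the iron-to-bridges story might be composed from typed links. It is illustrative only: the struct, constants, and example data are assumptions, not the SST library's own definitions. The point is that reasoning only ever pattern-matches on the four invariant types; the free-text annotations are mere nuance.

package main

import "fmt"

// The four irreducible spacetime relation types.
type LinkType int

const (
    FOLLOWS   LinkType = iota // causal order: "comes before", "depends on", ...
    CONTAINS                  // containment/generalization: "is a kind of", "is part of"
    EXPRESSES                 // local attributes: "has property ..."
    NEAR                      // similarity or proximity: "is like"
)

var typeName = [...]string{"FOLLOWS", "CONTAINS", "EXPRESSES", "NEAR"}

// A directed, annotated link between two named nodes.
type Link struct {
    From, To   string
    Type       LinkType
    Annotation string // nuanced wording; the Type is the invariant part
}

func main() {
    // The iron -> metals -> strong structures -> bridges story from the text.
    // The first link uses CONTAINS traversed backwards: metals contain iron.
    story := []Link{
        {"iron", "metals", CONTAINS, "belongs to"},
        {"metals", "strong structures", FOLLOWS, "are required for"},
        {"strong structures", "bridges", CONTAINS, "includes"},
    }

    // Path composition only inspects the Type, never the wording.
    for _, l := range story {
        fmt.Printf("%s -(%s: %s)-> %s\n", l.From, typeName[l.Type], l.Annotation, l.To)
    }
}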

The future cone

No single relationship (say CONTAINment) is enough by itself to model a process, nor does it help to treat every relationship as different. This is why logical data types fail to capture properties fully.

When looking for predictive outcomes, we are usually trying to trace paths forwards in some generalized event timeline. From a given event, the possible subsequent events will typically spread out by some process into a "cone"-like subgraph structure. Searching for predictive outcomes means generating this cone (watching out for possibly acausal loops).
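The SST library call GetPossibilityCone(), used later in this post, does this for us. Purely to illustrate the idea, here is a self-contained breadth-first sketch that expands the forward cone one layer at a time over a plain adjacency map, with a visited set so that loops cannot make the search run forever. The map and node names are invented for the example.

package main

import "fmt"

// futureCone expands the forward "cone" of nodes reachable from start,
// one causal layer (proper-time step) at a time. The visited set guards
// against loops, which would otherwise make the search acausal.
func futureCone(follows map[string][]string, start string) [][]string {
    visited := map[string]bool{start: true}
    layers := [][]string{{start}}
    frontier := []string{start}

    for len(frontier) > 0 {
        var next []string
        for _, node := range frontier {
            for _, succ := range follows[node] {
                if !visited[succ] {
                    visited[succ] = true
                    next = append(next, succ)
                }
            }
        }
        if len(next) > 0 {
            layers = append(layers, next)
        }
        frontier = next
    }
    return layers
}

func main() {
    // A toy FOLLOWS graph: edges point from a node to what comes after it.
    follows := map[string][]string{
        "start": {"A", "B"},
        "A":     {"C"},
        "B":     {"C"},
        "C":     {"end"},
    }
    for t, layer := range futureCone(follows, "start") {
        fmt.Println("layer", t, ":", layer)
    }
}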

Figure 3 illustrates the geometry of these processes in semantic spacetime. You might recognize pictures like this from light-cone sketches in Einstein’s relativity theory. The similarity is obvious and uncomplicated. Einsteinian spacetime is also a kind of semantic spacetime, and past and future are about causal order.

Figure 3: when we start a search from some node in a graph, the forward or backward cones for FOLLOWS follow the progression of the possible stories, while the EXPRESS links describe the node in situ. CONTAINS provides an orthogonal cone of generalization or restriction, while NEAR tells us about projected similarity.

Example 1: Natural Language narrative

Let’s use these observations about the four types to see how we would go about building a narrative from a data stream, by scanning documents (Natural Language Processing) and expressing it as a multi-scale graph. We can then generate the future cone for a search term to see what happened around the relevant finds. This is very like the co-activation picture used in the passport example. A narrative is obviously a chain of words, but if we pay attention to its inner structure, it’s actually a more complicated graph. Not all stories can be told in a simple linear fashion. No one knows this better than those who rely on delivery systems and supply chains, but we can also apply this to the kind of narrative we read in books (where we can more easily get data). See figure 4. We can use the boilerplate code in the SST library to accomplish this easily.

The idea behind text analysis is to do what we do to oil or DNA: fractionate it, i.e. smash it into small pieces and see how the pieces recombine to provide common and repeated signatures. It can be argued that this is how concepts are formed in our minds as proto-cognitive representations over time — the same basic process by which complex organisms evolve out of patterns of protein, over very long times! To apply this, we need to identify some natural scales of the process of natural language — of writing. The bulk of the logic is about this fractionation process. We use the semantic spacetime functions to register events and their fractions.

Note that this is different from what is done by machine learning methods like Word2Vec, as used by Amazon and others. The technique here is a simple form of unsupervised learning that can be run on any laptop in real time.

Figure 4: A narrative is built from the four spacetime association types.

If we scan documents, we have choices about how to classify what we see. Data "events" could be considered single characters, words, sentences, chapters, etc. I'll take sentences as the smallest semantic unit that expresses intent specifically. Sentences CONTAIN words and word fragments of length N, called N-grams (some use the term for strings of N characters).

Fractionation

We can split up sentences into N-grams, much as we split DNA into fragments. Some of these will act as "genes" that can be reused in different contexts; others are just junk padding. When an N-gram appears in more than one sentence, it indicates a semantic relationship between them, not just a probabilistic occurrence, so the sentences form a cluster around the fragment, which becomes an EXPRESS or CONTAINS hub (see figure 4). We can begin parsing text from books and articles in this way, measuring the importance of words and N-grams as key-value histograms, and building a graph data structure of important parts of the narrative (ranked by the key-value histograms).
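As a sketch of the fractionation step only (scan.go's real implementation and its choices of N are not reproduced here), the following self-contained Go splits sentences into word N-grams and accumulates a key-value histogram; fragments appearing in more than one sentence are the candidate hubs.

package main

import (
    "fmt"
    "strings"
)

// ngrams fractionates a sentence into overlapping word N-grams of length n.
func ngrams(sentence string, n int) []string {
    words := strings.Fields(strings.ToLower(sentence))
    var grams []string
    for i := 0; i+n <= len(words); i++ {
        grams = append(grams, strings.Join(words[i:i+n], " "))
    }
    return grams
}

func main() {
    sentences := []string{
        "Our journey is not complete",
        "The journey is not over yet",
    }

    // Key-value histogram of fragment frequencies.
    histogram := map[string]int{}
    for _, s := range sentences {
        for _, n := range []int{2, 3} {
            for _, g := range ngrams(s, n) {
                histogram[g]++
            }
        }
    }

    // Fragments shared by more than one sentence become candidate hubs.
    for g, count := range histogram {
        if count > 1 {
            fmt.Println("shared fragment:", g, "count", count)
        }
    }
}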

Using the SST functions, we register a single threaded timeline for each sufficiently important sentence, then find its fragments and add these to the event container, by the principle of co-activation.

// Register the sentence as the next event node on the single-threaded timeline
event := S.NextDataEvent(&G, key, sentence)

// ... then, for each N-gram fragment of that sentence:
key := fmt.Sprintf("F:L%d,N%d,E%d", i, f, index)   // unique key for this fragment
frag := S.CreateFragment(G, key, fragment)
S.CreateLink(G, event, "CONTAINS", frag, 1.0)      // co-activation: event CONTAINS fragment

Sample code for this process is provided in scan.go. Key-value stores are used, along the way, to build frequency histograms of different N-grams. Important sentences are ranked by the importance of the N-grams they CONTAIN. Importance is a "Goldilocks" property — not too frequent and not too rare is what makes a phrase "just right". This is another occurrence of scale separation. We can combine these histograms with a graphical annotation of the text. Notice how little coding is needed to use the database: only the functions CreateFragment(), NextDataEvent() and CreateLink(). We never even see that we are using ArangoDB (though our choice makes the back end library much simpler).
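The exact ranking formula in scan.go isn't reproduced here, but the "Goldilocks" idea can be sketched as a band-pass weighting: a fragment scores highest when its frequency sits near some intermediate ideal, and the score falls off when it is much more common or much rarer. The ideal value and the log-ratio form below are assumptions chosen only for illustration.

package main

import (
    "fmt"
    "math"
)

// goldilocks returns a score that peaks when freq is close to ideal and
// decays for fragments that are much more common or much rarer than that.
// The log-ratio form is just one convenient band-pass choice.
func goldilocks(freq, ideal float64) float64 {
    if freq <= 0 {
        return 0
    }
    x := math.Log(freq / ideal)
    return math.Exp(-x * x) // 1.0 at the ideal, falling off on either side
}

func main() {
    ideal := 10.0 // assumed "just right" occurrence count
    for _, f := range []float64{1, 5, 10, 50, 500} {
        fmt.Printf("freq %5.0f -> importance %.3f\n", f, goldilocks(f, ideal))
    }
}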

Comprehending the graph structure computationally

Having parsed the important text and turned it into a graph, we can query the graph to reveal aspects of the narrative derived from books and articles. Part of the appeal of the semantic spacetime model is that the basic interpretations are always contained explicitly within the four basic path labels — not as arbitrary linguistic expressions, but as ordering, proximity, attributes and hierarchy. That means that the logical structure of the graph doesn’t have to depend on ad hoc choices of wording, or even author consistency. One can nuance the words but still keep the founding relationships as invariant characteristics. This simplifies reasoning algorithms, because there’s only a small number of possible combinations for path composition based on the four basic types. We simply look for the possibility cone based on a link type:

cone,pathdim = S.GetPossibilityCone(g,start,-S.GR_FOLLOWS,visited)

Take an example text. After scanning a document containing President Obama's short Inaugural Speech, a query finds the most important ranked sentences, i.e. those which overlap with the search phrase and contain important N-grams. We then generate the causality cone from a sentence that scores high on the search term fragments, in order to "explain" it. The causality cone is the bundle of all parallel paths from the starting point that FOLLOW the same causal direction (shown explicitly in example 2).

% go run query.go "journey"

(Nodes/obama_dat_Sentence_246) "Our journey is not complete until our gay brothers and sisters are treated like anyone else under the law –- applause — for if we are truly created equal, then surely the love we commit to one another must be equal as well."

========== SHOW SPACELIKE CONE LAYERS ==============
Timestep (layer)
0 : ( ) Our journey is not complete until our gay brothers and sisters are treated like anyone else under the law –- applause — for if we are truly created equal, then surely the love we commit to one another must be equal as well. ( Nodes/obama_dat_Sentence_246 )
1 : ( then ) We must act, knowing that today’s victories will be only partial and that it will be up to those who stand here in four years and years and years hence to advance the timeless spirit once conferred to us in a spare Philadelphia hall. ( Nodes/obama_dat_Sentence_262 )
Timestep (layer) 2 paths 1

You might not recognize this as a story, narrative, or explanation per se, because it’s too short. But it represents a kind of red-circled summary of a very short narrative, organized just by spacetime principles. We can express it as conjoined sentences to form the explicit narrative, or as a future causality cone from the starting point. What this means should become clearer in the next example. Before looking at that, I want to mention a cognitive limitation of graph representations for data of different sizes.

Graphs are not necessarily intuitive pictures

We might intuitively expect a graph database to lead to a visual approach to understanding structure. Figures 5 and 6 below try this for the text analysis example above. The layout uses the simple algorithms built into the ArangoDB dashboard. Figure 5 looks "nice", but lacks a sense of narrative. It has no direction, no clear start and no end. Figure 6, with more nodes, may on the other hand scupper the idea of visualization as a useful tool. As soon as we have more than a trivial number of nodes, the structure becomes very hard to make out visually. The arc of the story is not apparent, because node placement is harder to compute automatically without an additional visualization model. It turns out that human cognition is limited to quite a narrow range of scales that we can perceive.

Figure 5: Graph visualization is a challenge to represent at scale. What appears to be a ball is actually a coiled snake. The clouds of N-gram fragments around each sentence event are superficial detailing, like a mass spectrogram of a chemical analysis.

The four types make it easy to find sustained paths (stories) from FOLLOWS, CONTAINS, and NEAR relations, as well as to expose locally descriptive relations using EXPRESS relations. This sounds straightforward enough, but we have to be careful of loops in a graph. Loops only occur naturally in FOLLOWS or NEAR relations. When searching for the future cone of influence we have to be careful to detect acausal loops by keeping track of which nodes have been visited along the search. The interpretation of loops depends on the context, but typically loops will occur in data that have been de-contextualized. This is why we should always separate contextualized events from de-contextualized fragments when modelling.

Although the data are based on a linear narrative (a chain, or string), as the scale of data and possible narratives increases, we easily lose sight of any obviously linear structure once it is squeezed onto a two-dimensional page (figure 6). The stringy snake-like structures of causation, and the unification of concepts by shared fragments, start to generate a more complex structure that we have to search more carefully. Fragments become hubs and structure is turned upside down from what we expected. Eventually, as associations become dense, we can no longer perceive the separation between narratives and all that remains is a ball of conceptual stability, most of which is irrelevant ballast (figure 6). Visualizations are misleading at scale.

Figure 6: As the scale of data grows, a clump of points basically looks the same in all cases and reveals nothing of importance to casual inspection. Human cognition is challenged by scale.

Zooming in on a small scale can be equally perplexing. If we were expecting to see some regularity in the structure, we should think again, for what is in fact regular appears random to our human senses, with their cognitive limitations. This is where the four types help to retain the meaning within the picture. The small scale structure looks dense, but the apparent complexity is misleading. What saves us, under the covers, is the fact that semantic structure still dominates the shape of the graph, revealing simple reasoning paths. They are not visible to the human eye, but they are clearly present. This is both a virtue of the semantic spacetime method and a warning against assuming that "very detailed" actually means hopelessly complex. This visualization topic is beyond the scope of this series, but it's a big deal.

Figure 7: On a small scale, high density can be mistaken for hopeless or inscrutable complexity, when in fact the graph is highly structured. We need other measures to understand story properties. This is where a semantic spacetime representation helps, together with coarse-graining renormalization of the key processes.

Example 2: Multi-path route analysis

To give an explicit yet more complex illustration of the causal cone, we can look at a more carefully constructed scenario shown in figure 8. This is small, in terms of points, but has many semantic features in play. The nodes are obviously interpretable as locations, and the links may be viewed as transport or transition alternatives from point to point. The graph is organized into layers, a bit like a neural network, so that we can imagine it to be part of a real world map. We can interpret this in any number of ways:

  • As a network routing problem, e.g. traceroute,
  • As a quantum mechanical slit experiment (path integral),
  • As a journey planning map guide for tourists, or
  • As a supply chain logistics map.

Add your own cases. The code in doors.go generates this graph in figure 8, by manual placement of nodes, and then uses algorithms to analyze it impartially into coarse grains, stages, and fibres. The purpose of this is to see if we can develop algorithms that detect structures, even when they can’t be seen — and what these structures mean.

If the rendered picture doesn’t look like this, you can use the ArangoDB browser to drag the nodes into place manually to check that the layout matches expectations.

Figure 8: The future cone of the starting node. A journey or multi-path race from the left (start) to one of three (target) destinations on the right, passing through a number of intermediate destinations along the route, where a traveller might choose a door or other transport option. This could be meant to represent a real network routing problem, an esoteric quantum mechanical slit experiment in which each pink node is a slit, a supply chain logistics map, or simply a journey planning guide for tourists. Multiple links between slits lead to possible mixed states.

Run the example code (doors.go) to generate the graph in ArangoDB. This produces the output shown in the highlighted boxes below. The output tries to show how we can understand alternative pathways using the same spacetime semantics. First we rely on the FOLLOWS relation to generate a directed process, which we call simply "comes before", a bit like a race or a wavefront travelling outwards from the starting point, i.e. A (arrow) B means A comes before B.

Once again, we generate the future cone to compute all the forward possibilities from the starting point, as a generalization of the previous example (see doors.go). In the code, we can use the semantic relations to analyze and organize the graph into spacelike or timelike sections that we can see in the picture, thanks to its small size.

If we look at the spacelike hypersections, "stages", or proper time cross sections, we see the picture in figure 8 represented as text in vertical (spacelike) cross sections from left to right:

mark% go run doors.go

========== SHOW SPACELIKE CONE LAYERS ==============
Timestep (layer) 0 paths 17
0 : ( ) Nodes/start_L0
Timestep (layer) 1 paths 17
1 : ( comes before ) Nodes/door2_L1
1 : ( comes before ) Nodes/hole_L1
1 : ( comes before ) Nodes/gate_L1
1 : ( comes before ) Nodes/door1_L1
Timestep (layer) 2 paths 17
2 : ( comes before ) Nodes/bike_L2
2 : ( comes before or leads to ) Nodes/passage_L2
2 : ( comes before ) Nodes/road_L2
2 : ( comes before ) Nodes/river_L2
2 : ( comes before ) Nodes/tram_L2
Timestep (layer) 3 paths 17
3 : ( comes before ) Nodes/target1_L3
3 : ( comes before ) Nodes/target2_L3
3 : ( comes before ) Nodes/target3_L3
Timestep (layer) 4 paths 17

This kind of stratification is interesting for many applications. In neuroscience, for example, cortical brain tissue is largely organized into layered approximate stages like this, with perpendicular "columns" representing pathways. Compare that to the figure.

Next, we can generate the orthogonal timelike cross sections, showing the eight distinct paths through the network from start to finish. These are the “stories”, “worldlines”, or “paths” (choose your ontology) represented by the paths or strings from left to right on the graph. There are eight distinct paths that together spread out from the point of origin like a cone. The farther away from the origin we are, the more possible states are available in general.

========== SHOW TIMELIKE CONE PATHS ==============
0 Nodes/start_L0:(comes before):Nodes/hole_L1:(comes before):Nodes/tram_L2:(comes before):Nodes/target3_L3:(end)
1 Nodes/start_L0:(comes before):Nodes/gate_L1:(comes before):Nodes/tram_L2:(comes before):Nodes/target3_L3:(end)
2 Nodes/start_L0:(comes before):Nodes/gate_L1:(comes before):Nodes/bike_L2:(comes before):Nodes/target3_L3:(end)
3 Nodes/start_L0:(comes before):Nodes/door1_L1:(comes before or leads to):Nodes/passage_L2:(comes before):Nodes/target1_L3:(end)
4 Nodes/start_L0:(comes before):Nodes/door1_L1:(comes before):Nodes/road_L2:(comes before):Nodes/target2_L3:(end)
5 Nodes/start_L0:(comes before):Nodes/door1_L1:(comes before):Nodes/river_L2:(comes before):Nodes/target3_L3:(end)
6 Nodes/start_L0:(comes before):Nodes/door2_L1:(comes before):Nodes/river_L2:(comes before):Nodes/target3_L3:(end)
7 Nodes/start_L0:(comes before):Nodes/door2_L1:(comes before):Nodes/tram_L2:(comes before):Nodes/target3_L3:(end)
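For completeness, here is a minimal sketch (independent of the doors.go/SST implementation) of how such timelike sections can be enumerated by a depth-first traversal over a plain adjacency map built from the links visible in the output above. Because the graph is layered and loop-free, no visited set is needed here, and the traversal recovers the same eight worldlines.

package main

import (
    "fmt"
    "strings"
)

// timelikePaths enumerates every path from node to a node with no successors,
// i.e. the "worldlines" that make up the cone. Assumes a loop-free graph.
func timelikePaths(follows map[string][]string, node string, prefix []string, out *[][]string) {
    path := append(append([]string{}, prefix...), node)
    succ := follows[node]
    if len(succ) == 0 { // no successors: this worldline has reached a target
        *out = append(*out, path)
        return
    }
    for _, next := range succ {
        timelikePaths(follows, next, path, out)
    }
}

func main() {
    // The layered graph of figure 8, transcribed from the path output above.
    follows := map[string][]string{
        "start":   {"door1", "door2", "gate", "hole"},
        "door1":   {"passage", "road", "river"},
        "door2":   {"river", "tram"},
        "gate":    {"tram", "bike"},
        "hole":    {"tram"},
        "passage": {"target1"},
        "road":    {"target2"},
        "river":   {"target3"},
        "tram":    {"target3"},
        "bike":    {"target3"},
    }

    var paths [][]string
    timelikePaths(follows, "start", nil, &paths)
    for i, p := range paths {
        fmt.Println(i, strings.Join(p, " -> "))
    }
}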

Reducing nodes by symmetry factoring

Not all the paths in the scenario are actually distinct: some have only nuanced distinctions but end up in the same location. This may be important. Semantics scale in the same way that graph nodes scale. They represent distinct versions of narrative, but not distinct physical processes. So, next we can look for coarse-graining equivalences (see figure 9). There are several reasons to want to do this. The first might be to overcome the limitations of human cognition: we have to be able to reduce the number of moving parts in a narrative or description. Our human senses are designed to do this, in a way compatible with space and time aggregation. If two routes or nodes play the same role, we can combine them into a single supernode.

The next phase of the code finds these equivalences from the graph, as if the nodes were “scattering regions” with incoming paths and outgoing paths (particle physicists may recognize this as an S-matrix structure). Any nodes that have the same incoming and outgoing paths play the same role, so may be composed into an effective coarse grain (figure 9):

Figure 9: Several nodes can be composed into a single coarse grained supernode by different criteria. If all the members connect to the same incoming and outgoing nodes, that’s a pattern that can be represented by aggregation. This symmetry is a criterion for semantic graining.
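As a sketch of this criterion (not the doors.go code itself): two intermediate nodes are candidates for joining whenever the same node feeds into both and the same node receives from both, and overlapping joins are then merged into supernodes, here with a small union-find. The edges are transcribed from the path output above; the helper names are invented.

package main

import (
    "fmt"
    "sort"
)

// Edge in the FOLLOWS ("comes before") graph.
type Edge struct{ From, To string }

// find with path compression for a simple union-find over node names.
func find(parent map[string]string, x string) string {
    if parent[x] == x {
        return x
    }
    parent[x] = find(parent, parent[x])
    return parent[x]
}

func union(parent map[string]string, a, b string) {
    parent[find(parent, a)] = find(parent, b)
}

func main() {
    edges := []Edge{
        {"start", "door1"}, {"start", "door2"}, {"start", "gate"}, {"start", "hole"},
        {"door1", "passage"}, {"door1", "road"}, {"door1", "river"},
        {"door2", "river"}, {"door2", "tram"},
        {"gate", "tram"}, {"gate", "bike"}, {"hole", "tram"},
        {"passage", "target1"}, {"road", "target2"},
        {"river", "target3"}, {"tram", "target3"}, {"bike", "target3"},
    }

    // Group middle nodes by their (in, out) scattering endpoints:
    // m belongs to group <A|B> if A -> m and m -> B both exist.
    groups := map[[2]string][]string{}
    for _, in := range edges {
        for _, out := range edges {
            if in.To == out.From {
                key := [2]string{in.From, out.To}
                groups[key] = append(groups[key], in.To)
            }
        }
    }

    // Initialize union-find, then join middles that share a scattering pair.
    parent := map[string]string{}
    for _, e := range edges {
        parent[e.From], parent[e.To] = e.From, e.To
    }
    for key, middles := range groups {
        for i := 1; i < len(middles); i++ {
            fmt.Printf("Join %s and %s because between %s and %s\n",
                middles[0], middles[i], key[0], key[1])
            union(parent, middles[0], middles[i])
        }
    }

    // Collect and print the resulting supernodes (map order is arbitrary).
    super := map[string][]string{}
    for n := range parent {
        super[find(parent, n)] = append(super[find(parent, n)], n)
    }
    for root, members := range super {
        if len(members) > 1 {
            sort.Strings(members)
            fmt.Println("supernode:", root, "==", members)
        }
    }
}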

It's interesting to note that the message-passing graph computer Pregel (Apache Giraph) is supported by ArangoDB, and that the "scattering process" analogy is the basis of its flow computation model. Each node is "aware" of a vector of vertex information, and edge information is passed by incoming messages, like a multi-slit process. Nodes compute by flow only — they are not allowed to perform exterior data accesses, e.g. to look up something in a service (DNS or a TensorFlow server), so a pure graph traversal is not a data processing pipeline in its most general sense. It's an open question whether one should use a graph as a guide about the semantics of boundary conditions, or as a strict data element in an algorithm — the approach clearly has significant limitations. If we take a purist Pregel-like view, then non-local look-aheads are not possible from a causal flow framework, and mixed boundary criteria like the supernode aggregation in figure 9 will not be possible. ArangoDB has some interesting Pregel integration, which extends its pure limitations, making it a pretty nice tool. This could be a ripe area for future study, in case someone is looking for a thesis topic!

The example code performs a non-local search based on regions with common inputs and outputs. The snippet of output below shows which nodes are joined into coarse grains as a result of common inputs and outputs to a "path scattering" region. The stories are then rewritten to place parentheses around the supersets, using a little aggregation legerdemain, and finally the supernodes are summarized at the end.

========== SHOW TIMELIKE SUPERNODE / S-MATRIX PATHS ==============
Join nodes 0 Nodes/hole_L1 1 Nodes/gate_L1 because between Nodes/start_L0 and Nodes/tram_L2
Join nodes 0 Nodes/hole_L1 7 Nodes/door2_L1 because between Nodes/start_L0 and Nodes/tram_L2
Join nodes 1 Nodes/tram_L2 2 Nodes/bike_L2 because between Nodes/gate_L1 and Nodes/target3_L3
Join nodes 1 Nodes/gate_L1 7 Nodes/door2_L1 because between Nodes/start_L0 and Nodes/tram_L2
Join nodes 5 Nodes/door1_L1 6 Nodes/door2_L1 because between Nodes/start_L0 and Nodes/river_L2
Join nodes 6 Nodes/river_L2 7 Nodes/tram_L2 because between Nodes/door2_L1 and Nodes/target3_L3

Symmetrized super-node paths:
0 Nodes/start_L0 :(comes before) :super-Nodes/hole_L1=(Nodes/hole_L1,Nodes/gate_L1,Nodes/door2_L1,Nodes/door1_L1,) :(comes before) :super-Nodes/tram_L2=(Nodes/tram_L2,Nodes/bike_L2,Nodes/river_L2,) :(comes before) :Nodes/target3_L3 :(end) :
1 Nodes/start_L0 :(comes before) :super-Nodes/hole_L1=(Nodes/hole_L1,Nodes/gate_L1,Nodes/door2_L1,Nodes/door1_L1,) :(comes before or leads to) :Nodes/passage_L2 :(comes before) :Nodes/target1_L3 :(end) :
2 Nodes/start_L0 :(comes before) :super-Nodes/hole_L1=(Nodes/hole_L1,Nodes/gate_L1,Nodes/door2_L1,Nodes/door1_L1,) :(comes before) :Nodes/road_L2 :(comes before) :Nodes/target2_L3 :(end) :

Symmetrized nodes:
symm. supernode_1 == Nodes/tram_L2
symm. supernode_1 == Nodes/bike_L2
symm. supernode_1 == Nodes/river_L2
symm. supernode_3 == Nodes/door2_L1
symm. supernode_3 == Nodes/hole_L1
symm. supernode_3 == Nodes/door1_L1
symm. supernode_3 == Nodes/gate_L1

This symmetry search code would be very difficult to write without the semantic spacetime model. We see its usefulness here because it preserves the basic relationships in a scale invariant way.

To see how we could get the same result from the graph database query language (ArangoDB’s AQL), as a scattering problem, see scatter.go. Basically, we want a query that identifies multiple nodes with the same incoming and outgoing FOLLOWS links:

// Find pairs of FOLLOWS links that meet head to tail, i.e. In -> (Agg) -> Out
FOR doc1 IN Follows
  FOR doc2 IN Follows
    FILTER doc1 != doc2 && doc1._to == doc2._from
    RETURN { "In": doc1._from, "Out": doc2._to, "Agg": doc2._from }

If we collect the Agg fields for the same <In|Out> pairs, this results in the partial aggregations:

door1_L1->target1_L3 = [Nodes/passage_L2 Nodes/passage_L2]
door1_L1->target2_L3 = [Nodes/road_L2]
gate_L1->target3_L3 = [Nodes/tram_L2 Nodes/bike_L2]
start_L0->passage_L2 = [Nodes/door1_L1 Nodes/door1_L1]
start_L0->road_L2 = [Nodes/door1_L1]
start_L0->river_L2 = [Nodes/door1_L1 Nodes/door2_L1]
start_L0->tram_L2 = [Nodes/door2_L1 Nodes/hole_L1 Nodes/gate_L1]
start_L0->bike_L2 = [Nodes/gate_L1]
door1_L1->target3_L3 = [Nodes/river_L2]
door2_L1->target3_L3 = [Nodes/river_L2 Nodes/tram_L2]
hole_L1->target3_L3 = [Nodes/tram_L2]
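A small sketch of that collection step in Go (the Row struct and the hard-coded rows below stand in for whatever the AQL cursor actually returns):

package main

import "fmt"

// One row returned by the AQL query above.
type Row struct {
    In, Out, Agg string
}

func main() {
    // In practice these rows come back from the AQL cursor; a few are hard-coded here.
    rows := []Row{
        {"Nodes/start_L0", "Nodes/tram_L2", "Nodes/door2_L1"},
        {"Nodes/start_L0", "Nodes/tram_L2", "Nodes/hole_L1"},
        {"Nodes/start_L0", "Nodes/tram_L2", "Nodes/gate_L1"},
        {"Nodes/gate_L1", "Nodes/target3_L3", "Nodes/tram_L2"},
        {"Nodes/gate_L1", "Nodes/target3_L3", "Nodes/bike_L2"},
    }

    // Collect the aggregation candidates for each <In|Out> scattering pair.
    grains := map[string][]string{}
    for _, r := range rows {
        key := fmt.Sprintf("<%s|%s>", r.In, r.Out)
        grains[key] = append(grains[key], r.Agg)
    }

    for pair, members := range grains {
        fmt.Println(pair, "=", members)
    }
}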

There is still room to identify overlapping members in the sets, so we still need to conjoin overlapping grains to complete the symmetry. In scatter2.go, I used the S.Set interface to find this. It's not quite as refined as the golang method in doors.go, but it gets the result more directly by seeking symmetries of the transition function inside the database:

% go run scatter2.go

Nodes/river_L2
<start_L0|passage_L2>   <-- Nodes/gate_L1   <-- Nodes/door1_L1   <-- Nodes/door2_L1   <-- Nodes/hole_L1
Nodes/door1_L1
<door1_L1|target1_L3>   <-- Nodes/passage_L2
Nodes/passage_L2
<door1_L1|target2_L3>   <-- Nodes/road_L2
Nodes/road_L2
<door1_L1|target3_L3>   <-- Nodes/bike_L2    <-- Nodes/river_L2   <-- Nodes/tram_L2

The Solace of Quantum Computing?

The foregoing reveals an interesting connection between process narrative and the field of quantum mechanics. One could say that future paths actually explore and map out dynamical possibilities within a system, while individual measurements select from the “database” of possible states latent in the map. As an interpretation of quantum mechanics we would say that each of the possible paths (contributing to the path integral) is a possible story with potentially different semantics. Some stories are equivalent, others are different. So there can be mixed semantic states, which just imply a redundancy (called degeneracy in QM) of interpretation. The sum over all stories provides a map of the process space, which is normally represented as the wavefunction.

When two nodes are joined by more than one link with the same spacetime semantics, we see the graph analogue of quantum "mixed states" — or parallel alternative interpretations of a transition. A link may have two or more equally valid interpretations in different contexts, any of which (or all of which) may be active. The doors code aggregates these:

Nodes/door1_L1:(comes before or leads to):Nodes/passage_L2

"Comes before" and "leads to" are clearly compatible interpretations. They must be, because they were spacetime compatible in the four types. In this simple case, this is obvious, but as we explore more expressive link interpretations, it may be less so. Variant interpretations can represent equally possible transitions within the larger process of which the nodes are a part.
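One plausible way to produce such an aggregated annotation (an assumption about the mechanism, not the actual doors.go code) is to merge duplicate links between the same pair of nodes and concatenate their interpretations:

package main

import (
    "fmt"
    "strings"
)

// A FOLLOWS-type link with its contextual wording.
type Link struct {
    From, To, Annotation string
}

// mergeMixedStates collapses duplicate links between the same node pair
// into one link carrying all the compatible interpretations.
func mergeMixedStates(links []Link) []Link {
    merged := map[[2]string][]string{}
    for _, l := range links {
        key := [2]string{l.From, l.To}
        merged[key] = append(merged[key], l.Annotation)
    }
    var out []Link
    for key, notes := range merged {
        out = append(out, Link{key[0], key[1], strings.Join(notes, " or ")})
    }
    return out
}

func main() {
    links := []Link{
        {"Nodes/door1_L1", "Nodes/passage_L2", "comes before"},
        {"Nodes/door1_L1", "Nodes/passage_L2", "leads to"},
        {"Nodes/door1_L1", "Nodes/road_L2", "comes before"},
    }
    for _, l := range mergeMixedStates(links) {
        fmt.Printf("%s:(%s):%s\n", l.From, l.Annotation, l.To)
    }
}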

These mixed state interpretations really don't imply many-worlds in the sense of cloned universes, as a few sensational authors like to claim! But they are many-worlds in the sense of Kripke semantics — due to the separation of processes and observer interpretations. They're just contextual alternatives within a map that may or may not be in play during different stages of prediction. They were collected on a different timescale within the same system; they represent persistent possibilities for future behaviours (which is what maps do), thanks to the stability of the system over a longer timescale. Only when we eliminate them by direct selection from experience do they get removed, like chess pieces from the board.

One of the interesting aspects (no quantum pun intended) of quantum causality is the structural non-locality of maps. The wave function acts as a summary database rather than as a causal data pipeline in some respects. This means that strictly deterministic graph flow algorithms (e.g. Pregel) that rely entirely on in-band message propagation can’t emulate all such processes, because the enforced synchronization is not compatible with all possible paths. Nodes only have access to spacetime of a simply connected form. A calculation of supernodes based on common incoming and outgoing promises would be analogous to “Feynman propagation boundary conditions” or S-matrix causal structure. Directed graphs don’t necessarily have the unitarity property that would enable us to infer this without direct data access.

Summarizing the large scale structure of spacetime

The coarse graining shown in the doors example above brings us back to the concept of aggregate clusters and supernodes. Structure is there to be discovered. With a little forethought, it can often be anticipated. In computing, data structures come first and algorithms follow to match.

For the remainder of this series, I want to explore the issue of coarse graining (scale) and its meaning to different processes — this will give us some clues not only about how we reason about the properties of systems (from raw data to machine learning pipelines), but also how we can efficiently calculate those properties.

Figure 10: The four spacetime relations in practice.

As we move beyond the naive recentralization of technological infrastructure — from the migration to early cloud solutions, and back to a hybrid premises or co-location model of data computation and storage — the location of data is going to be a key issue. Where should we keep our data collections? Moving data is expensive because the size of data is typically much larger than the size of algorithms; thus it’s cheaper to move algorithms to data than vice versa. We see the benefits of embedding processing inside databases, as in the ArangoDB model — one of the reasons for my choice of tools.

There’s far too much to cover about modelling with the four relations in this series, and the semantics of neighbourhoods in a graph, but you can read more about how to use them in this summary on the project page.

In the next installment, I want to change tack and consider how we go from local details to coarse grained and global properties of graph and subgraph representations using a graph matrix representation. I'll show how we can compute numerical estimates of network flows directly from semantic spacetime, either in Go or directly in ArangoDB.

If you’re interested in learning more about the issues in this post, you can read more in the paper Motion of the Third Kind (Virtual Motion) and in the popular book Smart Spacetime.
