Semantic Spacetime 1: The Shape of Knowledge
Spacetime and information both have the basics of a geometry
“In our hunt for the final theory, physicists are more like hounds than hawks, sniffing around on the ground for traces of the beauty we expect in the laws of nature, but we do not seem to be able to see the path to truth from the heights of philosophy.”
–Steven Weinberg, Dreams of A Final Theory
Does knowledge have a shape? Is it possible to name things “correctly”? These seem like odd questions for an engineer or IT practitioner, but they could help to answer why attempts to organize knowledge across many levels have (ironically) failed in our Information Age. For, while we've certainly discovered a bit about how to classify things, at least in terms of bulk subject headings, our ability to connect individual statements of truth together has fallen foul of technologies — because those tools were developed from only half-formed ideas, statistics, and commercial successes, which normalized inadequate methodologies too hastily.
So, what might these questions even mean? How could knowledge have a shape?
Let’s try to propose some answers.
(This is an introduction to the SSTorytime project).
That is the question
One kind of shape is the way grammar and word order change meaning. Sometimes it matters how we stack and order information–which word or item comes first. Another shape is how we encapsulate it. Do we package it in books or web pages? What goes at the top and at the bottom of a page? There is the question of origin and destination:
- How do we know this? Where does it come from? How did it happen?
- What is this good for? Where will it lead to?
- What else is like this?
- What are the details?
What about specific words? Surely names and words don’t matter!? Does it matter how we spell color/colour? Probably not. Does it matter whether we say Sparrows or Passeridae for the birds outside our windows? That depends on the audience. Should we say Window or Vindauge (from whence the modern word came)? It turns out that words do matter sometimes, because of the context and the stories we want to tell, as well as the audiences we want to address.
This essay is about an idea called Semantic Spacetime, which is one way of understanding how meanings all relate back to some familiar concepts about space and time that everyone knows. It can help us to organize the rational interpretations about different viewpoints with fewer ad hoc choices. It’s an approach that scientists might actually understand and even like, because it’s based on a simple yet fundamental hypothesis about the shape of things that happen. And that’s what Natural Philosophy–now called Natural Science–has always been about.
Hawks rising above it all
As the Nobel laureate and physicist Steven Weinberg wrote (see the quote above), sometimes we can only see the truth by rummaging around in the weeds. Trying to fly too high above the fray could just isolate us from the truth. In practice, there is wisdom to be found in both perspectives.
For example, in the mathematical sciences (physics being perhaps the main culprit), there is a common affectation that a capable scientist should be above relying on the names of things. Mere names and symbols don’t or shouldn’t matter: the result is the same whether we call it “x” or “Alice” or “Bob”. If the reader is smart enough, they’ll understand that any name is as good as the next, in principle, but what about in practice?
This notion comes from the idea that it’s the structure of knowledge that matters more than its labels. By structure, we mean the scenario we are trying to describe, the roles and the behaviours, not whether we know the name of everything in the scene. If you’re investigating a murder, you want to describe the crime scene as carefully as possible–but the position of the body and the weapon don’t depend on what you call them. You could translate your knowledge into a foreign language and the story would still be the same, at least in some sense. If the words for gun and knife were the same in the foreign language, there might be some loss of information, but this is not usually the concern. In practice, this carefree abandon for words only works when the scenarios we are trying to express have a very clear, “simple” structure–basic “atoms” of knowledge like statements about the behaviours of things, or rules that follow a clear pattern. These are the scenarios I’m addressing in the Semantic Spacetime project–by contrast with the text produced by today’s Large Language Models, for instance.
Of course, there is a kind of snobbery to this: “I know what I’m talking about, because I have a deep understanding of the underlying principles, and I’m happy to posture in front of you knowing something you don’t! I’m not here to help you to remember, but to show you the depth of your ignorance!” Students certainly won’t thank anyone for being obscure, and especially not purposefully obtuse. The remembering of facts is sometimes frowned upon, because — if you could simply work out the answer from first principles, then — why try to remember pesky details? Sometimes people will want to “own” a word like “energy” and disdain or dismiss others’ usage of it, even when the usages are not all that different.
I fell for this scam when I was younger, and refused to remember facts. Later, I came to regret this, when I found that I actually couldn’t remember facts anymore, when I needed to! We all learned our multiplication tables in school, and that knowledge is still a valuable time-saver today. Remembering is our fast memory cache. No harm comes from remembering, as long as we still question our understanding.
So do names really matter? Over the years, I’ve decided that they matter a lot more than I thought earlier in life–and below, we’ll see how or why. When working on the many abstract topics that I’ve been privileged to spend time on, my feeling has been that it’s always worth spending time, up front, to decide the best possible notation, language, and terminology for any discourse, because recognizable names and symbols will equate to time saved later–and much confusion avoided! Later on, when we’ve forgotten how clever we were, we become the student again and the roles are reversed, for us to suffer our own punishment!
The kind of knowledge that has a shape, then, is the kind of knowledge that describes scenarios, patterns in the “bigger pictures” and so on. By thinking the world is made up of things and their ontologies, we’ve made a strategic error. It’s not The Shape of Things to Come, but The Shape of Processes Unfolding that we need to describe. This is knowledge of high value, because it’s often expensive to acquire. The enthusiasts of taxonomy and ontology, who swear by putting things into boxes, have had their heads too far down in the weeds to be able to see the wood for the trees, so they fuss over distinctions that may not matter in the wider sense. The real question is: where in this landscape of phenomena does real meaning lie? At what scale?
What we are truly interested in is meaning, but meaning can only be communicated by using some kind of common name. Even information theory starts with defining an alphabet of the basic symbols.
Lock, key, and anchor
M: My name is Mark (with a k).
L: With a k? Ok, Kark, how can I help you?
Names, or identifiers as we call them in informatics, words or phrases, written or spoken, are the atoms of meaning. They are arbitrary labels to which we attach and remember significance. Such atoms are most useful when they lead to something bigger.
Word choices, on their own, are like the coordinates we use to map out meaning. They may be familiar shapes, or any other pattern or thing recognizable to us — even a door handle. They are the recognisable “invariants”–anchors that we can rely on to pull up knowledge from the depths. Language is basically the accumulation of all these names, as patterns (with grammar) that can be associated with meanings. Meanings, then, are events, scenarios, things that happened, remembered as episodes that made us feel one way or the other. We can describe them up close (with microscopic properties), or from far away (where are they heading, and what are they part of). The association of meaning (which is closely related to intent) with these patterns is called semantics. That’s just its name–what matters is, of course, the process it represents, i.e. what we do with it.
People who study long and hard, in some field, end up normalizing their own understanding of semantics by developing a specialized jargon. So everyone’s view of knowledge is different. To use another mathematical or physical analogy, some people see the world in polar coordinates (distance, and angle: enemy at two o’clock!), some people in Cartesian coordinates (left, right, up, down, forwards, backwards). Physics is full of semantics that are so normalized, they are rarely discussed — we can take them almost entirely for granted. Philosophy, on the other hand, purposefully explores and sometimes muddles semantics as an exploration in its own right.
Whatever field we end up in, eventually, we find our views so entrenched, so familiar, that we can’t conceive of anyone else ever questioning them. Yet, we don’t always understand one another easily. Learning words is easy, but learning to understand the essence of a scenario from someone else’s description still mystifies even people who teach for a living!
How do people use graph databases today?
Graph databases are now associated with knowledge representations, because of the prevailing models around the so-called Resource Description Framework (RDF). The way we use graph databases today is somewhat muddled by commercial interests, but they handle some key cases that have to do with identifying correlations. A graph view is something like polar coordinates, because each node stands at the centre of its world and looks outward. An Entity-Relation or row-column database offers a Cartesian view in which the coordinates have no preferred observer.
If you are searching an entire space, then an unbiased Cartesian view makes sense, because you don’t want a preferred observer. This is what mathematical science calls the assumption of translation invariance. On the other hand, if you want to emphasize local context, so that your viewpoint is based around a specific entity under question, then the graph model gives you a Ptolemaic, geocentric “polar coordinate” perspective that’s more efficient. You don’t need to search all the distant stars to find Earth satellites.
In knowledge terms, having a direct pointer to the closely related things, in orbit of a single focal point, is equivalent to having a local index. Indices are just pointer tables to related items within a “space” of information. So, from a graph node, we have a local coordinate system that’s better suited to exploring than one that spans the whole universe of knowledge. Our local context is preserved, or better represented for related knowledge.
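The contrast between the global Cartesian search and the node-centred local index can be sketched in a few lines. This is an illustrative toy, not project code; the edge data and function names are invented for the example.

```python
# Global, observer-free view: a flat table of edges, as in a row-column store.
edges = [
    ("earth", "orbited_by", "moon"),
    ("earth", "orbited_by", "iss"),
    ("mars", "orbited_by", "phobos"),
    ("sun", "contains", "earth"),
]

# Cartesian search: scan the whole space of edges for every query.
def neighbours_by_scan(node):
    return [dst for src, rel, dst in edges if src == node]

# Polar/graph view: each node carries its own local index (adjacency list),
# so a lookup never touches the rest of the universe of knowledge.
local_index = {}
for src, rel, dst in edges:
    local_index.setdefault(src, []).append((rel, dst))

def neighbours_by_index(node):
    return [dst for rel, dst in local_index.get(node, [])]

# Both views agree on the answer; only the cost and the "observer" differ.
assert neighbours_by_scan("earth") == neighbours_by_index("earth")
```

The scan costs time proportional to the whole edge table; the local index answers from the node itself, which is the graph-database advantage described above.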
Another way in which we use databases is to capture real time updates about something. Graphs usually represent slowly changing structural information rather than ephemeral changes, so time-series information is not usually something a graph is suitable for. However, this changes when we can keep information “within the nodes”, and make a distinction between what is within a node boundary and what is beyond the boundary or part of the neighbourhood. It’s really a scale question about information. Physicists use “hidden dimensions” to represent this kind of dynamical property, and AI researchers call them “feature vectors”. Interior state can’t grow without bounds, but it can “update” its knowledge and even remember past states using timelike or Bayesian averages with finite memory.
My own software agent CFEngine was a pioneer in this method: in fact it was amongst the first software monitors to employ the learning of compressed, pattern-based averages. Earlier approaches emphasized the endless time-series perspective, which is more common in science. In a restricted system, no such spacelike Cartesian average has any reliable meaning. This is a case where a time-like Bayesian approach is more meaningful than a spacelike frequentist view.
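A finite-memory, time-like average of the kind described here can be sketched as follows. This is a generic exponential fading average, not CFEngine’s actual code; the weight 0.3 is an illustrative choice.

```python
# Sketch: a node's interior state as a compressed, finite memory.
# New observations update the state instead of being appended to an
# unbounded time series -- forgetting is built in.

class NodeState:
    def __init__(self, weight=0.3):
        self.weight = weight   # how strongly new data overrides old memory
        self.mean = None       # compressed memory of past states

    def update(self, value):
        if self.mean is None:
            self.mean = float(value)
        else:
            # geometric fading of the past: the distant past is forgotten
            self.mean = (1 - self.weight) * self.mean + self.weight * value
        return self.mean

state = NodeState()
for sample in [10, 10, 10, 50]:
    state.update(sample)
# The spike to 50 shifts the memory towards 22.0, but the history of 10s
# is not erased outright -- a time-like average with finite memory.
```

The interior state stays a single number no matter how many samples arrive, which is the point: the node boundary contains a pattern, not a log.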
The use of a graph to navigate structure is sometimes mentioned in connection with fraud detection methods for online payments. Online fraud algorithms start with a global search to find the transaction payer or payee, and then explore that locale, seeking the places where we want to update information and follow it in real time. The structural part of that is a graph; the time series is a separable internal state. When a particular transaction is ongoing, the parties involved are selected, so a centric view is appropriate. Searching an entire space for each update would be madness. Transactions can only involve two parties, so there is a clear focus. Of course, this assumes that we already know the spheres of influence of payer and payee–but that’s true of any data process. What you haven’t seen, you cannot know.
Even though transactions only involve two parties at a time, over many interactions (ensembles of data), there is a non-local effect that establishes memory traces. These memory traces become stigmergic feedback influences on how we interpret new data. Stigmergy is the process (like ant trails) that transmutes past process memory into current boundary conditions, adiabatically–a separation of fast and slow variables. It’s a common strategy when modelling approximately linear dynamics, but it also helps to sustain a sometimes problematic view about the pre-eminence of Markov processes in science.
The Cartesian view is appropriate when the dominant processes are adding new points to the whole “space”. The polar graphical view is appropriate when the dominant processes are centred around a particular location that already exists. We might need to update some of its attributes, but not its existence or key structural relationships.
What we definitely don’t want to do is represent each payment as a line in a graph. This would be a self-defeating strategy that would lead to an exponential growth in data. The value of machine learning lies in compressing data into patterns without retaining the bulk origins. Forgetting is one of the most important strategies in knowledge management, but it remains an almost heretical notion in IT.
In fraud detection, the detection of simultaneous transactions might already be a red flag, indicating that a single buyer or seller is trying to conduct large numbers of payments at the same time. It’s a guess, but it offers an indication to be followed up. The follow-up will typically centre around the associates of the present location, so again a graph is a good approach.
Learning patterns of dynamic activity as memories within each local agent is thus a valuable form of PROPERTY knowledge, which is used in fraud detection. The expected frequency of payments to each individual can be learned by updating. Anything high frequency needs to be done in primary RAM. Updates to the database cache can be placed to avoid secondary storage contention. Notice that, when we say “within”, this shouldn’t be taken too literally in the sense of our human scale. In a virtual “space” (actually a graph) the meaning of distance is non-trivial and topologically motivated.
How does semantic spacetime help?
If I were trying to get to the post office, I wouldn’t start from here…
The notion of Semantic Spacetime (SST) came out of an attempt to apply the principles of Promise Theory to the patterns of connectedness that we understand as processes in space and time. The key principles include causal independence (i.e. locality or autonomy) and causality. To cut a long story short, SST predicts that our semantic concepts ultimately all boil down to ways of characterizing what happens around us, i.e. in space and time. After all–there is nothing more in the universe than that.
Technically, semantic spacetime is about graphs, not “spaces” in the strict mathematical definition. But we often play fast and loose with the word “space” because we’ve been predisposed to imagine everything in Euclidean terms.
Promise Theory deals with the meaning of intent in a scientific rather than a psychological way. It’s emerged as a framework for trying to understand the structure of events and scenarios better. Having structure makes it easy for machinery to help us with the heavy lifting, and it offers simple steps for students to follow when they are basically lost in semantic space. Semantic Spacetime isn’t a new technology (it’s not a new programming language or a new database, but we can make tools to apply it). Rather, it’s a change in the way you use any other technology.
In short, Semantic Spacetime tells us to forget about the ingrained notion of ontology, for knowledge representations, and instead think about scenarios. The trappings of old technologies like RDF and OWL encourage the wrong habits, but the basic idea of knowledge graphs remains helpful and can be adapted to employ good habits!
Our problem today is that, in spite of having many good computational tools at our fingertips, we don’t necessarily know how to use them to our advantage. In the information age, we think that tech is something you install and the problem goes away–but it’s only by studying the fundamentals of knowledge that we can go beyond the present. When it comes to knowledge, machines can only help us to remember — or we have to delegate to them with complete trust.
In simple terms, SST tries to answer the question: once we’ve decided on a basic subject and context, i.e. a starting point, where can one go next to understand more about some topic? How to get there? How should one explore? Knowledge is rarely useful in the kind of small bite sized pieces we favour today–to make it useful, it needs to be specific, have some depth to it, and unfold into the meaning of the scenario we have in mind. The goal is not necessarily to produce the kind of fluid and natural language text that today’s Large Language Models or “AI“ are proficient at generating–but rather to help us decide what's going on, what our role is, and what we should do next.
Promise Theory predicts a kind of abstract space and time in which every description or event or concept, can be represented as a location (also called a node or “agent”) in a kind of knowledge landscape, with a coordinate name to identify it. This is related to other locations in one of four elementary kinds of paths. Having four basic types of relationship doesn’t mean that there are only four possible names for how we relate one idea to another (the number is potentially infinite), but it means that no matter what kind of name we use to speak about what connects A to B, it will fall into a class of names whose meaning has to do with space or time in the broadest possible sense.
The four horsemen of the apodictic
Prepare to stretch your thinking beyond the trivial. If we can relate any kind of description of a process with four kinds of relationship, then what are they?
The four categories of relationship need names. I use the following nicknames (and integer numbers) in the semantic spacetime work:
0. NEAR/SIMILAR TO: X is similar to/near Y, Y is similar to/near X, for some meaning of “similar”.
1. LEADS TO: X leads to Y, Y follows from X. This is an ordered proper time trajectory.
2. CONTAINS: X contains Y, or Y is contained by X. This is a spatial encapsulation.
3. EXPRESSES PROPERTY: X expresses property Y, or Y is a property of X.
The numbering of these names is chosen somewhat carefully, because 1, 2, and 3 can have signs: a positive (+) sign from A to B and a negative (-) sign from B to A, like the ends of a battery driving current in a certain direction. These polarized meanings are not reversible. 0, on the other hand, has the same meaning going forwards and backwards (which is why metric distance is a measure favoured by physicists).
For example, we might say: X decided Y, or Y was decided by X. This is a version of a generic relation “LEADS TO”.
If we say X owns Y (Y is owned by X), this has the semantics of virtual containment, so it’s a version of “CONTAINS”.
If we say the radio channel has a frequency of 92.1 MHz, this can only be an expression of a “PROPERTY”.
Finally, if we say that X sounds like Y (Y sounds like X) then this can only be “NEAR” — the things are similar by the criterion of a certain listener.
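The four relation classes, their numbering, and their signs can be sketched in code. This is an illustrative toy, not SSTorytime’s API: the verb vocabulary and the mapping are invented for the example, following the classification just given.

```python
# The four SST relation classes, with the numbering used above.
NEAR, LEADS_TO, CONTAINS, EXPRESSES = 0, 1, 2, 3

# Each concrete verb maps to (class, sign). The sign distinguishes the
# polarized direction for types 1-3, like the ends of a battery;
# NEAR (type 0) is symmetric, so its sign is 0.
VERBS = {
    "sounds like":    (NEAR, 0),
    "decided":        (LEADS_TO, +1),
    "was decided by": (LEADS_TO, -1),
    "owns":           (CONTAINS, +1),
    "is owned by":    (CONTAINS, -1),
    "has frequency":  (EXPRESSES, +1),
}

def inverse(verb):
    """Reverse a link: flip the sign for polarized types, keep NEAR as-is."""
    cls, sign = VERBS[verb]
    return (cls, -sign)

# "owns" read backwards has the semantics of "is owned by":
assert inverse("owns") == (CONTAINS, -1)
# ...but similarity reads the same in both directions:
assert inverse("sounds like") == (NEAR, 0)
```

However many verbs the vocabulary grows to, every one of them lands in one of the four classes, which is the claim being made here.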
Space, time, and process symmetry
How can we be sure these four types are enough? One way is to just try examples of this for as many cases as we can–like hounds in the weeds. Another way follows from a simple hawkish hypothesis: that anything we could possibly imagine can only have come from what we have experienced during the long evolution of life and language, and that means that all our sensory experiences and the concepts we’ve developed can only be extended metaphors for things that happen around us (i.e. in time and space): scenarios, if you will.
The four meta-categories of relationship fall into a simple matrix (see the figure below) that shows they are really expressions of space and time for two different worlds: the physical and the world of names–or information.
The rows in this figure describe the two venues we inhabit: call them situation-space for the physical world (i.e. what we traditionally call space and time), and information-space for the virtual world of names and labels. We tend to describe those two worlds using different names, but they are somewhat equivalent in the sense that they all represent states of being that are either equivalences or distinctions. This, of course, is the very basis of information: 0 = 0 and 0 != 1, and so on. Everything that finds its way into our brains is a representation of information.
The columns represent “being part of a region or not”, on the left, and “observing distinctions” on the right: i.e. staying still or moving to another region to experience change. The left is what we mean by location or space, and the right is what we mean by time. (To paraphrase Aristotle, time is simply the change that we measure.)
Notice that, in a space with two or more dimensions, we could draw a ring around a region and be inside or outside it. In one dimension we can chop the line up into partitions (like with parentheses or punctuation), so defining containers is always possible. Once we make something indivisible, we can only say whether things are close together in some sense, or far apart.
It turns out that we can go further in understanding the distinctions by studying the symmetries, or equivalences of how these relationships work. By looking at example graphs, we can understand the distinctions better. For example, how do we know whether being a vegetarian means being a member of a group or the expression of an attribute or characteristic about a person? This is where types enter implicitly.
When we say that A contains B, we are describing a spatial relationship with the semantics of encapsulation: a ring or parentheses around some region of things we consider to be equivalent or compositional. In information terms, they are related by a grouping that expresses a kind of common name for all the elements. Distinction, on the other hand, relies on some kind of process that first measures one property and then a different one. That process has a kind of velocity for sampling the world, and so it has the semantics of order or time. How we choose to measure these semantic distinctions is another story altogether, one that leads to natural science and mathematics by counting.
We can say that, if going to any one of the alternatives within a set doesn’t matter or is equivalent, with respect to continuing the path onward, then the set contains a group. The properties each element expresses are a different matter.
On the other hand, if a node fans out into inequivalent meanings, then it isn’t a true symmetry group.
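This equivalence test can be sketched directly. The graph and names below are invented for illustration: a fan-out forms a true group when every member is interchangeable with respect to where the path continues.

```python
# Toy onward-link table: node -> set of destinations the path can continue to.
onward = {
    # members of a "weekday" fan-out all lead onward to the same place
    "monday": {"work"}, "tuesday": {"work"}, "wednesday": {"work"},
    # members of a "tool" fan-out lead to different outcomes
    "hammer": {"nail"}, "saw": {"plank"},
}

def is_group(members):
    """True if all members are equivalent for continuing the path onward."""
    destinations = [onward.get(m, set()) for m in members]
    return all(d == destinations[0] for d in destinations)

# Weekdays are interchangeable alternatives -> a true (symmetry) group:
assert is_group(["monday", "tuesday", "wednesday"])
# Tools branch into inequivalent meanings -> not a group, but properties:
assert not is_group(["hammer", "saw"])
```

The test only looks at onward continuation, not at the members’ own attributes, which matches the distinction being drawn: containment groups alternatives, while inequivalent branches are expressed properties.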
Of course, we are always free to disregard certain attributes and even imagine others, just as biologists disregard attributes when grouping species together, and form larger phyla using the CONTAINS relation. A nitpicker (rummaging close by, perhaps with better smell or vision) could always find some reason to distinguish two members of a group, but a hawk flying overhead might see things differently and say that the distinction is uninteresting. This is where intent or function plays a role in distinction. When we choose to distinguish attributes, which express different aspects of the starting node, then we use EXPRESS/PROPERTY.
Names don’t automatically imply structures or even convey meaning, but paths usually do. Such equivalences are always representative of a specific process, there are no universal properties that transcend context (though we sometimes pretend that certain processes have a special status). This observation is why ontologies fail to do what they promise. Words alone have to be understood through the concepts they stand for, so we need to go back to the process of discovery and what it means to us–and those concepts are phenomena, happenings: processes in space and time. This, in turn, is how mere naming in language can become science.
The small-scale structure of spacetime
Let’s examine the four horsemen more closely.
NEAR — closeness or similarity between places or attributes is how we end up with the measurement of distance. Metaphorical closeness and physical closeness are the same thing because we can always find a way to represent distinctions by some information. Choosing to quantify closeness somehow (by some process, like comparing to a measuring stick or a rolling wheel) gives this a metric significance. Natural science relies on this a lot, and forgets about the others. Notice that there is no direction associated with nearness. If A is like B and B is like C, does it mean A is like C? Probably. But if C is like D, does it still mean that A is like D, or that C is more like A than D is? It doesn’t automatically mean that. If we want to order things, we need to be able to express a process in a sequence.
LEADSTO — describes sequence, and thus possible causality. If A leads to B and B leads to C, then usually A leads to C, but that might depend on the details. Does the process have a key to get to C from B? Does C deny entry?
A translated into B translates into C… but the intended meaning is local, so as we get farther away, there is an increasing uncertainty to the apparent transitivity.
CONTAINS — describes containment. For containment, being part of a container means that you are encapsulated by something that “unifies” or represents you as a part within a whole. There are lots of ways this can happen, but there is a directional hierarchy, unlike the case of NEAR.
EXPRESS — discriminators that identify compositional invariance, types like structs or classes in computing. This is not necessarily meaningfully transitive.
Labels offer a way of making distinctions or groups. Everyone with blue eyes! But there is a difference between having an attribute that allows you to identify a group and actually naming the group. Unless the group has been identified and joined up, it’s just a possible search result. Another way to make a group is to follow a path. The group is then the set of all locations along the path.
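The three ways of forming a group mentioned here can be sketched side by side. All of the data below is invented for illustration: a latent group implied by an attribute (a possible search result), an explicitly named and joined-up group, and the set of locations along a path.

```python
# People and the attributes they EXPRESS.
people = {"mark": {"blue eyes"}, "lena": {"blue eyes"}, "ola": {"brown eyes"}}

# 1. A latent group: everyone who expresses the attribute.
#    Until someone joins it up, this is only a possible search result.
latent = {p for p, attrs in people.items() if "blue eyes" in attrs}

# 2. A named group: an explicit CONTAINS link that has been identified.
named_groups = {"blue_eyed_club": {"mark", "lena"}}

# 3. A path group: the set of all locations along a LEADS-TO trajectory.
path = ["home", "bus stop", "office"]
path_group = set(path)

# The latent search result and the named group may coincide, but only the
# named group exists as a node in the graph in its own right.
assert latent == named_groups["blue_eyed_club"]
```

The difference between the first two is exactly the difference between having an attribute that could identify a group and actually naming the group.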
Our four semantic relations fit into a generalized notion of space and time. The question for semantics is: what do we choose to label as important information?
The meaning of generalization
One of the advanced concepts we understand is when something is a special case of something else, or conversely a generalization. What could this mean in terms of processes? Generalization is useful when we are thinking laterally. We look for other equivalent examples that we can learn from, to see a pattern.
Perhaps surprisingly, the equivalence reasoning above suggests that generalization does not really originate from CONTAINS, at least not in the usual sense we understand it. A group is not more general than its members. Rather it comes from attributes EXPRESSed. If we ask for the name of the group of things that are equivalent, then we aren’t generalizing as much as accumulating more of the same. However, if we ask which attributes I EXPRESS that make me special, then the inverse of that question tells us what generalizes me.
If Mark is exclusively part of the group of people with blue eyes, does that imply that an appropriate generalization of Mark is: all people with blue eyes?
- The members of a group are other people.
- The members of a property are the properties themselves, which are not people. A person is not an attribute in that sense.
There is some implicit typing that lies in the links between these namings. In informatics, one usually models things or classes of object as having a datatype or type-identity, but in fact the names can mean anything. It’s the allowed links between things that express their types. In the case of informatics, there is an implicit arrow to members of a structure or class.
USING EXPRESS TO EXPLAIN THE TYPES
// how shall we use it?
EXPRESS (has property) relates a thing or concept to one of its attributes. The members are not necessarily related, but they may be found together in clusters.
// what can we expect of its members?
CONTAINS relates an umbrella group to one of its members. The members are thus possible alternatives for one another.
// what is it responsible for?
LEADSTO relates a causal object to the outcome of some process. The members are thus possible alternative outcomes.
// what else is like this?
NEAR relates a concept or thing to another that resembles it in some respect. The members are thus loose neighbours of one another.
So, a member of a group is not a specialization, it’s an alternative. There is no branching of type or diversity in a group; groups are homogeneous encapsulations, or coarse grains. A branching of an expressed property, on the other hand, is a specialized contributor.
These symmetries have the style of local fibre bundles. Are there transformations that make sense within a group (space) when pursuing processes with a certain velocity (time)? I’ll leave that as an exercise for the reader.
Conceptual coordinates: bases and matroids
In the development of AI over the past decade, modellers have sought to represent concepts as vector spaces with large numbers of dimensions. This feels wasteful, because not every point in those spaces is meaningful. It’s reminiscent of the use of extra dimensions in Kaluza-Klein and String Theories of physics to represent attributes of localized “particles”, by embedding them in spaces–sometimes with more freedom than necessary. With careful boundary conditions, one can constrain a continuum to be a discrete set, using intermediate eigenstates that recover the discrete options, but this is arguably an unnecessary step. A more economical representation, for small problems and discrete characteristics, is to use graphs.
This is where a discrete representation like Semantic Spacetime can be economical. (This freedom of representation is also at the root of the controversy in the relationship between Feynman diagrams and quantum fields in physics.)
The link between graphs and spaces is through boundary conditions and so-called eigenfunctions that emerge as possible solutions to connect the invariant boundary constraints. There are sparse clusters that represent ideas, as a process of thought, not as invariant objects, but as clusters of states and circuits that interconnect them. Sparseness of the states ensures a semantic separability. Without that, everything becomes “grey goo”.
I wrote, in my book In Search of Certainty (2013), that taxonomists and ontologists have misunderstood the nature of things. Ticking boxes is not a sufficient way to understand differences. Meanings are not always invariant attributes–not even in physics. Ontologies are just someone’s convenient spanning trees. Only concepts are quasi-invariants, like eigenstates. But like eigenstates, there is usually freedom in how to span them: a choice of basis, whose labels are the proper names.
How do SST & N4L capture the shape of knowing?
Let’s wrap up this first part of the Semantic Spacetime conceptual introduction by asking: how can we apply these ideas to help humans learn and see their knowledge more clearly?
We’ve entered an age in which there is pressure, from people who have invested in Artificial Intelligence, for us to accept automated reasoning as a replacement for our own efforts. There are certainly occasions when automated reasoning can be of great benefit to human society, but it can never be an acceptable replacement, regardless of whether it eventually becomes an adequate one. That's because no one can know something for you. It can only be an advisor.
Moreover, there is a basic fact about ourselves that we must face: our cognitive abilities are limited, and the amount of information we are asked to process (in a given amount of time) is increasing in both volume and speed, challenging those limits. As long as we pursue projects of ever greater scale and complexity, we are forced to accept the role of fast machinery in applying the kind of brute force to certain problems that we cannot muster by ourselves, albeit at the price of complete trust.
The Semantic Spacetime (SST) SSTorytime project is an initiative, supported by NLnet, to make it easier for us, as poor humans, to form good habits in collecting experiences and expressing them in knowledge representations that are quickly useful, by adopting process-oriented language. Getting us all into the habit of expressing our thoughts and ideas in a more structured way will never be easy. It’s a bit like telling people they need to go to the gym: no one really wants to go through the hurt of exercising to begin with, but at some point it starts to feel good and the benefits become tangible.
- We need some kind of graph to link up what we are trying to know, which can be cached in a kind of database for quick searching when we need reminding.
- We need to form the habit of revisiting our notes, as we would go to the gym to stay strong.
- We need to learn the habit of expressing things in a pattern that’s suited for inquiry, by following the logic of the four relations, rather than an imposed and imagined logic of naming.
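To make the last point concrete, here is a minimal sketch of notes as a typed graph. This is illustrative only, not the real SSTorytime schema, and it assumes the four SST relation families are ordering (leads to), containment (contains), property expression (expresses), and similarity (near); the example nodes are invented.

```python
# Minimal sketch (illustrative, not the real SSTorytime model) of notes
# as a graph with four assumed relation families.
LEADS_TO, CONTAINS, EXPRESSES, NEAR = "leads to", "contains", "expresses", "near"

notes = [
    ("boiling water", LEADS_TO, "steam"),
    ("kitchen", CONTAINS, "kettle"),
    ("kettle", EXPRESSES, "colour black"),
    ("kettle", NEAR, "teapot"),
]

def follow(node, relation):
    """Return everything reachable from node by one hop of the given relation type."""
    return [obj for subj, rel, obj in notes if subj == node and rel == relation]

print(follow("kettle", EXPRESSES))   # -> ['colour black']
print(follow("kitchen", CONTAINS))   # -> ['kettle']
```

Inquiry then follows the logic of the relation types (what leads where, what contains what), rather than depending on having guessed the one true name for a thing.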
The SSTorytime library implements a suitable graph database using our old Open Source friend Postgres. On top of that, I’ve introduced a simple language, N4L, to make entering and editing knowledge support those good habits. N4L is a hopefully simple tool for compiling information, one that helps us to write in a structured shape suitable for searching the scenarios of the mind! It’s designed to be easy to learn, because getting knowledge right in our own minds is a long process. It’s more than just saving data to disk.
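The “cache it in a database for quick searching” idea can be sketched in a few lines. The real SSTorytime/Postgres schema differs; here sqlite3 merely stands in for Postgres, and the table layout and data are invented for illustration.

```python
# Illustrative only: caching typed links in a relational table for quick
# lookup (sqlite3 stands in for Postgres; the real schema differs).
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE links (subj TEXT, rel TEXT, obj TEXT)")
db.executemany("INSERT INTO links VALUES (?, ?, ?)", [
    ("sparrow", "is a", "bird"),
    ("bird", "expresses", "feathers"),
])

# When we need reminding, the cache answers a typed question directly.
rows = db.execute(
    "SELECT obj FROM links WHERE subj = ? AND rel = ?", ("sparrow", "is a")
).fetchall()
print(rows)  # -> [('bird',)]
```

The design point is that the database is a memory aid, not the knowledge itself: the structure of the links is what makes the reminder quick to find.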
Why does anything have to be hard? The answer is twofold: i) writing in a disciplined style is always harder than writing ad hoc, because we have to check whether we are obeying a lot of restrictions, and ii) it’s only by doing work that we can learn. When learning, we’re seeking a compromise between ad hoc freedom and disciplined structure: that requires work to process, to order, to move around until the pattern satisfies us. If we’re too ad hoc, everything is a mess and it’s hit and miss to find anything. If we’re too disciplined, and expect too much from users, searching is fragile to the smallest mistake. Remember the Soup Nazi in Seinfeld?
We need training to normalize the habit of taking useful notes. Everything’s easy when you “know how” and difficult when you don’t. Large Language Models produce seductively fluent text that claims to bypass the need for human effort, but that’s illusory. Information is not knowledge until you actually know it yourself and don’t need to take it on trust.
The components in SSTorytime are designed to allow anyone to use comfortable and familiar tools, both for capturing and editing information as plain text (don’t try to feed it a Word document!). One can then feed that plain text to N4L in a form that makes sense to the Semantic Spacetime computer model. Hopefully, this will make searching easier in the future–though that obviously remains to be seen! It’s still an experiment, filled with hope.
Semantics are ultimately about naming things well. Even in science, where we believe mathematics or repetition objectively removes any possible subjectivity, we rely on naming and identification to decide ad hoc limits. If you use the wrong concepts, the wrong coordinates, or poorly chosen labels, you will be just as lost. So, in fact, language underpins even mathematics, because mathematics is itself just a language of constraints.
Simplicity is always hard won–and it’s won in a series of battles, not in a single volley of text. That’s why AI can’t replace us, and why it’s not easy being a teacher. But simplicity and careful naming will help people who are trying to learn. This is the paradox of knowledge! Curating information and consuming information are very different tasks.
In the second part of this discussion, I’ll show how we can put these ideas to practical use for human advantage.