The Role of Intent and Context in Knowledge Graphs With Cognitive Agents
Taming context as The Hard Problem of Knowledge Representation
Raise your glasses to the Knowledge Age! Learning has been repackaged as a commodity, the Data-Information-Knowledge-Wisdom hierarchy has been flattened by the bulldozer of Deep Learning, and we strip-mine original content mainly for indulgent trivia. But it’s not all prompt-and-paste; thanks to the very advances in Large Language Models, and the desire to cash in, people are getting interested in important problems again! The quest to patch up LLM magic tricks seems to be drawing people back into an old but unsolved problem of great importance: that of how to represent the semantics of learning, scalably and usefully, not just for posterity, but for actual day-to-day use. In this essay, I’d like to explain a few of the challenges around Knowledge Graphs and encourage you to participate in the SSTorytime (Semantic Spacetime) project. I’ll fill in some background about my own story before getting a bit technical, so fetch the popcorn and tuck in.
Knowledge Representation is an old problem with a chequered history. In a classic misstep, the computing industry standardised the nascent technology of the Semantic Web too soon, trying to capitalise and get ahead. Standards bodies were common before Open Source innovation caught on. The result has been for consulting houses to double down on weakly appropriate methods and to strong-arm users into questionable projects.
As a Professor of Computer Science, and a teacher of some 40 years, I have found understanding knowledge to be both a research problem and an intriguing challenge. My own solution to the challenges of knowledge representation has been, like most, to use graphs to represent link semantics, but in a different way to the industry standard method. The quest has led me through five iterations of knowledge structures, culminating in the Semantic Spacetime (SST) concept. This can be found in my open source SSTorytime project, sponsored by NLnet. It strips away all specialised databases and query languages and replaces them with a vanilla PostgreSQL database that anyone can run.
Some history
I began to work on knowledge representation around 2006. Some of my early thoughts and research workshops quickly led me to seek out some form of knowledge graph, while dismissing RDF and Topic Maps as viable approaches. At the time, I had been working on developing autonomous infrastructure using smart agents since 1993. My software CFEngine had taken over the world, and I was in the process of starting a company around it. I wrote these words: “Once you have divided labour between man and machine, all that is left for man to do is to shepherd knowledge.”
My thinking was more specific to infrastructure management than it sounds today, but still the slogan doesn’t ring quite as true as it once did. Only later did I realize that, if you want to shepherd knowledge, you can’t let machines do all the work for you. Because “it ain’t knowledge if you don’t actually know it (yourself)”. This would hound the success of the company.
If you don’t do it yourself, you won’t know it yourself, and you won’t care. In the Information Age, we claim to know something too easily. If you look up a fact on Wikipedia, you might say, “yes, I know that. I read it on Wikipedia”. But if you looked up a person in the phone-book, or company register, you wouldn’t claim to know them just because you’d seen their phone number and address. We don’t really know someone unless we know them like a friend: we’ve interacted with them enough to experience good, bad, when to approach, when to stay away, how to use, etc. Knowledge and hearsay are different things.
Ironically, it was a rebellion by IT workers against my AI automation that made me realize this. I created robotic automation for self-healing data centre servers, because I had better things to do than perform repetitive repair. I’d already done my years of learning, getting my hands dirty and I wanted out. I had the knowledge, but the younger generation were still trying to learn, and so they rejected a simple scalable answer that made their learning less critical. They wanted to solve that problem themselves, in their own image.
What I’d done was to make configuration simple, but it did not (ironically) make it easy. In fact, by forcing simplicity onto people, it forced them to think harder–and they didn’t like that extra burden. What AI is doing today is potentially the opposite: enticing people to play, and distracting them from learning by making it too easy. Working with knowledge is still a challenge for humans, no matter what corporate interests might claim.
Knowledge doesn’t grow on trees
When I started the new SST, I realised that making the graphs was the hindrance to using them well. Sketching a mind-map in a notebook is easy enough. We have visual minds, and engaging the motor cortex is always good for learning. But, turning that sketch into a data structure that can be parsed and searched is a lot less easy.
As soon as we force people to jump through bureaucratic hoops of defining types, then allocating and registering objects and so on, they quickly lose sight of the original goal and become blinded by the process. Keeping simple things simple is surprisingly hard. It’s a design problem, not an algorithm problem.
Computer Science has (in my opinion) made (and doubled down on) several classic mistakes in approaching the problem:
- It jumps into excessive formalization and rule-based methods.
- It expects reasoning to narrowly mean logic, when there is little evidence for that.
- It uses data types (and ontologies) to enforce logical distinctions.
With the advent of machine learning, the attraction to datatyping has waned slightly, shifting from determinism to probabilistic models. Knowledge Maps have been unable to adapt to this, however. Can we understand reasoning without logical labels and datatypes? In the Semantic Spacetime theory, we can go quite far based only on the spacetime characteristics: what is ordered in time or space, what is contained by something else, and what interior/exterior properties something has (see this introduction). We don’t know the full answer to that yet, but perhaps with a combination of learning and spacetime structure, it might be possible. The patterns one finds in SST tend to be few in number, but they are less trivial than property graph triplets, while the overloaded typing in ontologies increases the complexity of reasoning massively and drives thinking away from the “naturalness” of human language.
Recently, I wrote a deliberately provocative essay: Why are we so bad at knowledge graphs? I wanted to better understand and point out why people repeat the same clichés over and over again, without moving forwards. Books and articles encourage people to make trivial graphs that are not really better than old-fashioned Entity-Relation database models. Then they claim a “naturalness” that could easily be refuted. This prompted me to ask what graphs are really good for, and so (like a good Professor) my research took a turn, which resulted in a model of world processes that I called Semantic Spacetime.
I knew that something mathematical would be a hard sell (I’d made that mistake before), so I started by writing a popular book about the work to garner interest first (see my book Smart Spacetime: How Information Challenges Our Ideas About Space, Time, and Process). It’s great, I recommend it!! :)
In the essay, I criticised simple-minded property graphs by challenging people to make a graph of a quote from Lewis Carroll. I didn’t provide a solution. Several people were intrigued and wondered what the answer might be. Here’s a possible answer. The quote is here:
Your first thought might be: but this is not knowledge! It isn’t a recipe, or an instruction manual, it’s not even really a fact. That poses a question very few have ever thought about. What is knowledge then? Most people get stuck on the fact that natural language doesn’t fit the strict template of a database schema, and so it’s easier to dismiss it as “not knowledge” than to think again.
If a quote from a book isn’t knowledge, what is it? Is an apple knowledge? Are uses for apples knowledge? Are books themselves not even knowledge? But this is thinking too literally. It seems to arise because we are taught, from an early age, to be obsessed with things, rather than processes. But knowledge doesn’t always come directly from a specific thing or source. It’s a process about understanding processes–about telling stories.
When knowledge comes from doing something, for instance, we don’t expect a single motion-capture recording of every movement of our bodies, from one episode, to be our knowledge about it. A recording isn’t knowledge. A single data point is little better than looking up someone in the telephone directory–that doesn’t mean we know them! We couldn’t then repeat the movements exactly–rather, we repeat and rehearse, train and extract principles and metaphors. Yet this kind of sequential capture is exactly what graphs do well. To learn, we annotate our experiences and learn those condensed and distilled notes. It’s a process, and a process can be captured as a graph. So that’s how we can extract knowledge from this quote too. It takes a certain motivation, because we have to work for our reward–and this illustrates the crucial role of intent in knowledge.
Before continuing that thought, let’s finish the Lewis Carroll problem. It took me only a few minutes with a text editor to type the following notes (see figure below). I wasn’t thinking about a graph at all–but, in fact, it is a graph. In this simple form (using the SSTorytime language N4L), it can still be revised in a way that is familiar to any user. There’s no racking our brains over programming APIs or query languages. Imagine fiddling around with a database API for a whole morning to encode this!
When compiled into the SSTorytime graph database, it becomes browsable.
Notice that the quote hasn’t been replaced. It has been retained and expanded upon with logical / structured commentary. The natural language can still be valuable to the user. We don’t have to artificially replace it when it works better than the alternative.
It’s also worth pointing out, in passing, that graphs about knowledge do not generally form the large and beautiful images you see in advertising about graph technology online. The logical semantics of graphs break them up into rather small pieces, which is why logic is a poor model for understanding bulk knowledge. Isolated facts don’t join naturally together, as you see from the tiny fragment in the illustration above. Only by learning over a long time do some of these isolated fragments get joined up by apparently irrelevant connections. Such connections are far too complex to be modelled logically. Only spacetime concepts seem to unify them, within something I call the γ(3,4) representation.
Learning is easy, remembering is hard
Okay, let’s return to intent. The idea of seeking a graphical representation is, in some ways, a distraction from the key issues. Knowledge and memory are cognitive processes, not archives. They happen to produce graphs, but that shouldn’t be the focus. We are far better at talking and writing than we are creating logical data structures. So let’s think about what we want from a representation.
When we try to remember something, the extent to which we remember depends on cognitive cues. There are two main parts to this, but the key (a throwback to Promise Theory) is our old friend: intent. The meaning in memory comes from:
- What we care about (i.e. intend to know): derived from our Intentional Focus
- What just happens to be there when we experience it: derived from our Ambient Context
Intent is about caring enough to learn, or less touchy-feely: being determined and aligned with the goal. Notice too that the intention behind something is distinct from the reason why we might be interested in it.
- What are you doing?
- Why are you doing it?
These are different questions. Philosophers like Searle and Anscombe have muddled these issues without ever getting to the point. Without a consistent intentionality, going through the motions of learning is just junk collecting. The dissonance between instinctive, directed intent (system 1) and our later rationalisation (system 2) is a cause of much confusion. The former is questioning an impulse, the latter is storytelling–perhaps just gossip, or maybe even lying to oneself. At best, intent and justification may align, but to outside observers, who may form their own opinions, this is an irrelevance.
For example, when I am learning a foreign language, I can easily confuse words that sound similar. I could just write down the words and casually notice the similarity, but if I want to remember the differences and similarities, I need to write a story about that specifically. So I document words that sound similar, and then they all pop up together in the same “orbit” when I search for them, along with an explanation. Seeing them together, I can compare and contrast and remind myself of the distinction immediately.
Knowledge is sparse
When we compress someone’s knowledge into books and libraries, we get the impression that knowledge is dense and plentiful. But, at any given moment, our awareness of stories and relationships is quite sparse. This is why we need the assistance of search engines and indexing.
Pragmatically, we simply want to be able to recover memories we once considered useful, because they were aligned with our intentions. However, what tends to happen, when we look something up, is one of three cases:
- We follow a trail intentionally, step by step, like breadcrumbs, trying to find our way back.
- We search broadly and find too many matches, we skim, and we are overwhelmed.
- We search specifically and find no matches at all.
Finding a criterion match in IT is a logical (discriminatory) constraint, and (being binary) logical constraints are highly unstable. In order to select a precise answer, we have to ask a precise question. Too loose and we get too much, too exacting and we get nothing at all. There’s a tightrope in between that we are constantly walking.
Ranking answers according to some measure of relevance makes this much easier for us. Google did this with PageRank for the Web. The PageRank algorithm counts the bulk flows from other locations, which point to a particular resource. Pointing is a representation of intent: the intent to point out relevance. That’s why it works. But the Web is already a graph, so that was easy. If we also want to rank documents, books, and other sources in a less structured environment, we need to measure intent differently.
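To make that link-counting idea concrete, here is a minimal power-iteration sketch of PageRank in Go over a toy link graph. The node names, damping factor and number of rounds are purely illustrative assumptions, not anything taken from Google or from the SSTorytime project.

```go
package main

import "fmt"

// pageRank runs the classic power iteration: each page repeatedly shares
// its current rank among the pages it points to, so pages that many
// others point at (intend to point out) accumulate rank.
func pageRank(links map[string][]string, damping float64, rounds int) map[string]float64 {
	rank := map[string]float64{}
	for page := range links {
		rank[page] = 1.0 / float64(len(links))
	}
	for i := 0; i < rounds; i++ {
		next := map[string]float64{}
		for page := range links {
			next[page] = (1 - damping) / float64(len(links))
		}
		for page, outs := range links {
			for _, target := range outs {
				next[target] += damping * rank[page] / float64(len(outs))
			}
		}
		rank = next
	}
	return rank
}

func main() {
	// A toy web: each arrow expresses the intent to point out relevance.
	links := map[string][]string{
		"a": {"b", "c"},
		"b": {"c"},
		"c": {"a"},
	}
	fmt.Println(pageRank(links, 0.85, 20))
}
```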
The intent to find what we need costs a certain amount of effort on the part of the searcher. There is a minimum effort needed to overcome the barrier to trying, to proceed and unlock an answer. This is something an agent could possibly measure. The idea is just like the work/energy model for trust, which was so successful at predicting human behaviour. It opens up the possibility of a physics of information. For example, we care less about accidental happenings (knocking a vase off a shelf) than about events in which careful motivation was expended to see them through.
I have friends who love going to the gym and can tell me all about how many exercises of “this and that” they did, with great unfeigned enthusiasm. I hate going to the gym. I might do the same exercises, but I have no intention to do them. It’s simply incidental passing of time while I am thinking about something else. As a result, I remember nothing (though hopefully my muscles do). My only intent is to improve my condition–which I hopefully accomplish, because I showed up. Muscle memory versus memory muscle!
Intent is nine tenths of achieving. When studying, we absorb what we have the actual intent to learn. So we would like to harness this for knowledge representation. We need to build intentionality into our knowledge representations.
I wrote a paper about this: On The Role of Intentionality in Knowledge Representation: Analyzing Scene Context for Cognitive Agents with a Tiny Language Model. The paper studies the tension between two aspects of cognition in natural language: intent and context. To understand why this is important for knowledge, we need to understand how context unlocks intent in the manner of a lock and key. To see how it works, let’s go back to the story of CFEngine as a simple case.
Through an agent’s context, pragmatically
Intent divides cognition into two parts: what we are focused on and what is merely ambient window dressing. The amount of window dressing we use to establish a context could easily grow without limit. How do we keep context logical and simple? One way is to use a trick from physics: separating processes into those that are fast and those that are slow (see my book In Search of Certainty). In science, we ultimately wish to break phenomena down into counting in units, by dimensional analysis.
My understanding of context, and its role in reasoning, go back to the work I did around the CFEngine “AI configuration robot” in the early 1990s. It turns out that CFEngine did a lot right, before I understood the theory of it, but in a much simpler test case than we are generally thinking about today. Why simpler? Because the domain of automating machine healing was already quite structured and the options were naturally limited. All a machine has to do is select between a few pretrained options–the number isn’t generally very large before users start to complain about too much complexity.
To detect the circumstances of machine states, I found a way to convert observations of the environment into a set of flags or signals. These became “conditionals” or semaphores. Back in the 1990s, I called them “classes”, because that was a term used in computer science. By labelling policy choices with collections of flags, one could say when the choices should be considered to apply. Flags are a way to remove all of the obnoxious verbosity of if…then…else statements from programming and make them “declarative”.
Context can be understood, basically, as a classification of the aspects of an environment that end up in conditional statements (classically if…then…else), using some spanning set of flags such as time, location, type of environment, etc. In CFEngine, an earlier agent of mine that configures computers all over the Internet, I designed a context system based on “smart sensory summaries”.
These expressions made use of a finite number of flags or tokens to represent the coarse grains of cognitive information, i.e. “state”: Monday, Friday, Hr10, Hr11, Min30_35. Note how coarse-graining time ranges avoids an infinity of classes. This gives us the virtual spacetime generalization of a calendar. These flags are easy enough to detect in a computer operating system. The time comes from the clock, from which we could simply use an integer value representing UTC, but that isn’t meaningful in the context of human activity. Monday is not a recognizable phenomenon from UTC time. An agent needs to calculate that from the clock, but that’s all straightforward.
So the approach is easy: an agent probes its environment and sets these tokens like flags or signals. If they are set, then they are “true” in a Boolean sense, and any policy labelled with them is activated.
Other sensors can detect other features of an operating environment like RedHatLinux, Windows11, CPUTemperature_High, and so on. Any number of these tests can produce tokens that are atomic states of the environment. The context model is just a glorified (and more elegant) substitute for if…then…else conditional clauses.
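As a minimal sketch of that idea (not actual CFEngine code; the class names and the example promise are only illustrative), an agent might derive coarse-grained flags from its clock and operating system and then use them as declarative guards:

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// senseContext probes the environment and returns a set of coarse-grained
// flags ("classes"), in the spirit of Monday, Hr10, Min30_35, etc.
func senseContext(now time.Time) map[string]bool {
	ctx := map[string]bool{}
	ctx[now.Weekday().String()] = true            // e.g. "Monday"
	ctx[fmt.Sprintf("Hr%02d", now.Hour())] = true // e.g. "Hr10"
	lo := (now.Minute() / 5) * 5
	ctx[fmt.Sprintf("Min%02d_%02d", lo, lo+5)] = true // e.g. "Min30_35"
	ctx[runtime.GOOS] = true                          // e.g. "linux"
	return ctx
}

// A promise is activated declaratively: it names the classes under which
// it applies, with no if...then...else chain in sight.
type promise struct {
	classes []string
	action  string
}

func active(p promise, ctx map[string]bool) bool {
	for _, c := range p.classes {
		if !ctx[c] {
			return false
		}
	}
	return true
}

func main() {
	ctx := senseContext(time.Now())
	backup := promise{classes: []string{"Monday", "Hr10"}, action: "run weekly backup"}
	if active(backup, ctx) {
		fmt.Println(backup.action)
	}
}
```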
That’s all fine for curated and controlled environments, like operating systems, where knowledge is restricted by the nature of the system. But what about other kinds of experience–the kind where we would be interested in making a knowledge map? You can see how we could extend this idea if only we could classify the state of the environment with a suitable granularity. Suppose we investigate a crime scene: we still want to assess what’s going on.
Again, a crime scene is a relatively specialized scenario (see the Murder Most Horrid (cluedo) example in SSTorytime). The clue is to try to classify arbitrary scenarios and use these labels to tag notes. It’s a way of searching for relevance.
A properly executed context reduction encompasses and eliminates what the industry calls “ontology” for knowledge, because smart sensors detect those cases and reduce them to flags directly. Ontology is part of the computation of context. If a person has a child AND that child has a child THEN the person is a grandfather. These statements are hopelessly trivial aliases, no deeper than “if computer is Windows and version is 11, then computer runs Windows11”. Duh. Typing (like all bureaucratic form filling) is intentionally stupid, so we shouldn’t expect intelligent answers to come from type schemas.
How to distill context from natural language without speaking a word
If we now try to take on the broader and more open-ended case of classifying natural language and human thought, then we suddenly have an embarrassment of cognitive riches to contend with. Extracting actual logic from natural language requires very large and unwieldy models, because of the unruly variations of unstructured language. It’s almost impossible to reduce context down to the kind of string search methods that we could use for a database lookup, a knowledge graph, etc. This is related to the objections around the Chomsky grammar hypothesis for natural language.
With natural language, there are many different kinds of text. Articles on a particular subject are very different from stories that are taken from experiences of everyday life, or even fantasies. The grammatical and cultural variations of natural language go far beyond logic. As readers, it’s up to us to intend how we want to receive it. One could try to coax a reader to behave simply, reining in complexity on a voluntary basis. Like any skill, it requires a little training. This is what we mean by taking notes. It’s still intensely personal, and the learning benefit lies in the performance of that work, not in capturing the information digitally. So we see why company intranets become graveyards for knowledge: they exchange the actual desire to broadcast information for a voluntary procedure of reading when you feel like it–but where there is no obvious intent for anyone to seek out and receive the information. A circular email would likely be more effective than the average Wiki. Only Wikipedia can statistically sustain such voluntary activity, due to the sheer size of its audience.
In Deep Learning, one attempts to automate the same kind of mining of intent, now with greater hubris, by trying to calculate accurate conditional probabilities for millions of possible context flags, using as much data as possible. Now we seem to have forgotten that fear of becoming too complex, because we luckily can’t see the complexity anymore. It’s hidden in the inner workings of Large Language Models. But so is the intent, so we may take both for granted.
In most cases, the amount of knowledge we can cope with is much smaller than what one needs to calculate a reliable probability. We can only say: yes, we have a certain number of hits for our search with different context labels, and we try to select the one that best matches our current context. As a much simpler flag, however, context still works as a conditional, just not in an accurate probabilistic sense. It becomes a simpler kind of “best effort” selector switch.
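A minimal sketch of such a best-effort selector (my own illustration, not the SSTorytime implementation): score each stored note by how many of its context labels overlap the currently sensed context, and pick the closest match rather than demanding an exact logical hit.

```go
package main

import "fmt"

// note pairs some content with the context labels under which it was filed.
type note struct {
	content string
	context []string
}

// bestMatch returns the note whose context labels overlap most with the
// current context -- a best-effort switch, not an exact logical predicate.
func bestMatch(notes []note, current map[string]bool) note {
	best, bestScore := notes[0], -1 // assumes at least one note exists
	for _, n := range notes {
		score := 0
		for _, label := range n.context {
			if current[label] {
				score++
			}
		}
		if score > bestScore {
			best, bestScore = n, score
		}
	}
	return best
}

func main() {
	notes := []note{
		{"En kopp kaffe, er du snill", []string{"restaurant", "ordering food", "norwegian"}},
		{"Hvor er toalettet?", []string{"travel", "norwegian", "emergency"}},
	}
	current := map[string]bool{"restaurant": true, "norwegian": true}
	fmt.Println(bestMatch(notes, current).content)
}
```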
What both of these cases do is to turn procedural stories into summarized declarations of a finite number of cases (a switch-case computation). The cases are declared either intentionally as a matter of policy, or they are picked up in training and classified intentionally, as the end result of a procedure that we are trying to cache. In a sense, we are identifying knowledge as a cached summary of some original process. But we keep the original cases too, as far as possible, as examples to justify the selections. We don’t throw away the bathwater when making a baby! The knowledge is a set of notes summarizing our understanding; the evidence lies in the episodes that the notes refer to. This is also the structure of a scientific paper.
The debate about whether declarative programming or imperative, procedural (flow chart) programming is “best” rages on in computer science (which is more tribal than rational in these matters). Declarations can be precise and simple, but nearly everyone finds them harder to formulate than the more immediate step-by-step statements of procedures–for exactly the same reason that we find it difficult to create knowledge graphs. We have to work harder to distill intent into a compact logical representation. We prefer dithering over classifying. We do need both kinds of representation in general.
To make a knowledge graph, we should probably start simply: by declaring notes and thoughts about some source information (as in the Alice example above). We’re not trying to ape human language in that logic; we’re trying to make a new kind of structure: an index. This is how SSTorytime and its N4L representation have been designed.
Transforming text into notes as graphs, with a Teeny Tiny Language Model
In a focused article, there is a single topic and a discussion about it. When we read a novel, there may be many characters, the context shifts, the voice changes, and almost anything can happen. Making notes about such different kinds of story raises the question: what is the novel actually about? Anyone could answer that in a different way. It depends on the intent of the writer and the reader. The intent we can measure is only our perception of someone else’s possible intent, filtered by our own intent to pay attention.
Is a novel about a sequence of scenes, or is there a deeper thread on a larger scale? Is Moby Dick about whaling, or is it about vengeance and hate? Is it an allegory for something else entirely? Speculating about its meaning requires a process of cogitation or post-processing. Large Language Models can get away with some of this by stealing other people’s thoughts, but they can’t do it for themselves. To make a knowledge graph, the creator needs to decide what to think about the subject. There is no universal answer, no correct logical set of relationships. Without the intent to know, every version would be equally ad hoc. What would be put into a knowledge graph except a list of characters?
The number of possible semaphore tokens one would need to pursue a context mining approach is potentially infinite, hence the “Large” in Large Language Model. But–no problem, that’s exactly what natural language does for us. Our brains are good at it! Natural language is a collection of token semaphores that we have learned. They are words and strings of n words, called n-grams. We can try to analyse the parts of the text to find those flags. The more interesting question is: how little linguistic capability can a reader get away with and still derive some meaning from the text?
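As a sketch of the mechanics (the example text and the repetition threshold below are arbitrary assumptions), collecting candidate n-gram semaphores is little more than sliding a window over the token stream and counting:

```go
package main

import (
	"fmt"
	"strings"
)

// ngrams counts every n-word phrase in a text, the raw material from
// which candidate "semaphore" phrases can later be selected.
func ngrams(text string, n int) map[string]int {
	words := strings.Fields(strings.ToLower(text))
	counts := map[string]int{}
	for i := 0; i+n <= len(words); i++ {
		phrase := strings.Join(words[i:i+n], " ")
		counts[phrase]++
	}
	return counts
}

func main() {
	text := "to be or not to be that is the question"
	for phrase, c := range ngrams(text, 2) {
		if c > 1 { // keep only repeated bigrams (threshold is arbitrary here)
			fmt.Println(phrase, c)
		}
	}
}
```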
As I was developing the Semantic Spacetime idea in late 2019, I did some research into natural language processing, and ended up writing a couple of papers (one and two) about how intentionality might be extracted and measured from text, using only symbolic patterns, and without even needing to understand the words. It sounds like magic, but it’s just a question of identifying what costs effort. Natural language has developed a very clever system for encoding intent, which builds on a lifetime of human learning in complex societies, but a database doesn’t have that advantage. Large Language Models try to capture that experience artificially, but they are both extremely expensive and quite unintelligent in their power to contextualize so as to be sympathetic to the user. But surely our lower cognitive functions could still instinctively tell us something blunt about intent, without any knowledge of language. How else could other non-linguistic animals make predictions about their environments?
A simple hypothesis, based on the Semantic Spacetime model, is that intent retains a signature over the course of each separate text. We don’t need to train across billions of texts to see intent, we just need to look for some simple-minded patterns. It works surprisingly well for such a blunt instrument.
Intentionality and ambient context as expended work
As a physicist, I recognize the tension between the dynamics of a process (its size, rate of change, and states) and its semantics (the distinctions and dimensionality, symbolic aspects and meaning).
Intent can be represented as a chosen direction in a virtual “space” of possible outcomes–quite distinct from the reason why this course was charted. If we know the possible destinations, we can try to measure the direction, but usually there are too many possibilities. However, all is not lost. We can still try to measure some proxy for that intent by estimating the effort required to use certain words and phrases. This is what we do in physics when we use energy as an accounting parameter. If someone puts enough effort into a particular pattern, then it must reflect their intent.
It sounds like an odd thing to do, but it is no worse than trying to compute probabilities–and much cheaper. There is statistical evidence that phrases can signal “attention-seeking-ness”, or the intentionality to grab our attention. The reasoning goes like this. In order to speak or write a phrase, the messenger has to expend some effort. It costs some work. The degree of intentionality is higher if it costs more, i.e. if the barrier to getting there is higher. That includes the amount of thinking in advance (which is hard to assess), and it involves the choice of words. Long words cost more to use than short words. Rare words cost more than common words.
Next, there is simple repetition. If we intend something, then we want to underline our intent by using repetition over a certain length of narrative or conversation. If we never repeat a word or phrase, then it was probably an anomaly, a mistake or a slip of the tongue. If we repeat too often, then what we’re saying is just noise or padding–the kind of habit phrases that mean nothing (e.g. when Americans say “like”, or Brits say “right?”). In between these extremes, there is a Goldilocks zone of repetition at which the intent to repeat is maximal, so we can try to pick out such phrases and measure their usage. This is what the Semantic Spacetime hypothesis predicts.
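A minimal sketch of how such scoring might look (the counts, weights and band limits below are arbitrary assumptions, not the values used in my papers): reward phrase length as a crude proxy for effort, and keep only phrases whose repetition count falls inside a Goldilocks band.

```go
package main

import (
	"fmt"
	"strings"
)

// workEstimate is a crude proxy for the effort of uttering a phrase:
// longer words cost more; with corpus statistics, rarer words would
// cost more still (rarity is deliberately left out of this sketch).
func workEstimate(phrase string) float64 {
	cost := 0.0
	for _, w := range strings.Fields(phrase) {
		cost += float64(len(w)) // word length as a stand-in for effort
	}
	return cost
}

// goldilocks keeps phrases repeated neither too rarely (accidents)
// nor too often (habitual padding). The band limits are arbitrary.
func goldilocks(counts map[string]int, lo, hi int) []string {
	var keep []string
	for phrase, c := range counts {
		if c >= lo && c <= hi {
			keep = append(keep, phrase)
		}
	}
	return keep
}

func main() {
	counts := map[string]int{
		"the": 412, "white whale": 7, "never mind": 1, "of the": 380,
	}
	for _, p := range goldilocks(counts, 3, 40) {
		fmt.Printf("%s (work %.0f)\n", p, workEstimate(p))
	}
}
```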
Repetition over the whole message (transverse counts) is like a probability. It’s a blunt instrument that throws away a lot of information by averaging. But we can also measure the longitudinal variation in usage–its process rate. Phrases that are uniformly distributed are more likely to be casual noise, but phrases that occur in a cluster, then a gap, then another cluster, etc. are more likely to signal intentional repetition of a theme in the narrative. Again, we can eliminate phrases that are too regular, and search for these common concepts. Low repetition is likely more intentional, and high repetition, along with everything else, could be called ambient or more contextual.
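One crude way to quantify that longitudinal signal (a simplification for illustration, not the published method): record the positions at which a phrase occurs and compare the spread of the gaps between occurrences. Evenly spaced occurrences look like ambient noise, while clusters separated by long gaps look like the intentional return of a theme.

```go
package main

import (
	"fmt"
	"math"
)

// burstiness compares the variability of the gaps between occurrences
// of a phrase (positions are word offsets in the text). A value near 0
// means evenly spread (more likely ambient noise); larger values mean
// clusters separated by gaps (more likely an intentional theme).
func burstiness(positions []int) float64 {
	if len(positions) < 3 {
		return 0
	}
	var gaps []float64
	for i := 1; i < len(positions); i++ {
		gaps = append(gaps, float64(positions[i]-positions[i-1]))
	}
	mean, varsum := 0.0, 0.0
	for _, g := range gaps {
		mean += g
	}
	mean /= float64(len(gaps))
	for _, g := range gaps {
		varsum += (g - mean) * (g - mean)
	}
	stddev := math.Sqrt(varsum / float64(len(gaps)))
	return stddev / mean // coefficient of variation of the gaps
}

func main() {
	uniform := []int{100, 200, 300, 400, 500}
	clustered := []int{100, 105, 110, 900, 905, 910}
	fmt.Println(burstiness(uniform), burstiness(clustered))
}
```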
To the unassisted eye, these phrases match quite well with what one would consider to be conceptual fragments or “proto-concepts”. One can imagine combining these atoms into molecules to create bigger concepts. Those won’t necessarily match actual words in a language, but will form their own language. This is how languages evolve. After a while someone could give the combinatorial concept a proper name–as we might name a child.
When we try to remember the scene later, we will use the flags about context and intent as a lookup key. Now we are getting closer to a logical structure that can be turned into a relational or graphical form.
Let’s take an example. In Moby Dick, we can look at pairs of words. But where do these pairs come from? How do we choose them? Of course, they have to appear in the text a significant number of times. Do we pick the most frequent? No, we select by the Goldilocks principle.
How do we know when a phrase like “killing it” means someone is doing really well or something is actually dying? What does “ATM” stand for? The answers have, obviously, something to do with context. There is no single correct answer, so we are not in the domain of logic or validation. The problem of matching meanings is not just about single words and phrases, but also about our questions on many levels. What does it mean to “do well”? That depends on context too. We need to figure out what context is, and how it’s used. In a communication, both sender and receiver are free to intend meanings differently.
We hope to use context in several ways. As the author or sender, we use a context to select a particular style of communication, a particular set of observations or signals we wish to make. Later, as a reader or receiver, we use contexts to try to decode the signals and make sense of them. Context is a switch: both as an encoder and as a discriminator or disambiguator. I call context the hard problem of knowledge representation, because it is very subtle. It uses strategies like indirection to attribute meaning.
In Darwin’s Origin of Species, we similarly find triplets (three-word n-grams):
As we increase the size of the n-grams, the number of unique combinations falls off like a power law, so their significance is much greater. In these sequences, we notice slightly different characters among the phrases of natural language: some are anomalous, some ambient. They are not merely flags, but descriptive background statements and common concepts, together with some intentionally highlighted actionable phrases. This all happens without any understanding of the phrases. The dominant sequences change over the course of the book, as the current focus of attention shifts. This is as we expect, but the relevant question is whether this amounts to a measure of context and intentional subjects within the sets. All we can do is to postulate and use our own inspection of the phrases to see if that might make sense. The presumption has no statistical validity, only a theoretical causal possibility.
In SSTorytime, the text2N4L tool performs this analysis and turns it into a basic set of graph-compatible notes.
One could try to give proper names to semaphore-like states derived from the books. To do so would need an extra layer of work to alias such combinations of things that lead us to recognize relevant characteristics about the scenario. Without considerable learning, that would be impossible–so one would prefer not to invoke such super powers if at all avoidable.
(intentional, contextual)
To a first approximation, we might say that intent (focus) and context (ambient remainder) are separable parts of a scene. The part we intended was our main focus, and then there was the rest. Context is the part we did not intend, it was merely the backdrop for what we were doing. But this is only the first approximation: there is more to it than that. Our involvement in the scene might be intentional, so we’ve already selected the background, in some sense, too. We may have arrived by choice. Context overlaps with intent. It refers to the ambient components of the scene, the history leading up to the moment, etc, but these are also chosen to some degree.
We use context to discriminate similar memories, based on the scene of the occurrence (that one time at band camp versus another time when I was late for my wedding). Another role for context seems to be to propagate memories by repetition. What we intend might only happen once, but what is common and familiar about a scene gets passed on. Far from being a backdrop, it might become the most important learning we do.
However, our capacity to remember something tends to be improved by our intent, since something we intend evokes a stronger emotional reaction than something we trivialize–so probably only a small part of context will be remembered. Even when we remember a scene like the Grand Canyon or a rocket launch to the moon, which makes a great emotional impact on us, we remember it mainly because we intentionally take the time to have those feelings and wallow in them. If we drive past at speed, we wouldn’t remember it in the same way. Our intent to pay attention connects several cognitive aspects of memory together.
Approximating Context in Semantic Spacetime Graphs
When we are searching for information that we trust (earned knowledge), we are in a new context, which emphasizes certain signals that lead to a heightened expectation of a certain kind of answer. In a technological assistant, we must try to encode those expectations with a “good and balanced mixture” of terms to search for. This is what a good index tries to do. Context enables or inhibits connections through the network. In the N4L note language, we write context that we intend to use in the future in the colon-marked section tags:
:: in a restaurant, ordering food, waiter, waitress ::

 En kopp kaffe, er du snill    (Norwegian for “A cup of coffee, please”)

These will automatically be encoded into the links of a graph. The tokens above, like “in a restaurant”, etc., are like “states” of being, or parts of a scene description. Ideally, any cognitive agent would be constantly evaluating this context and pulling up the relevant notes to use. It wouldn’t be waiting for an explicit search string or query to be entered.
In the SSTorytime project, we are using a knowledge graph representation, but not in the conventional way, because conventional knowledge graphs lack the ability to handle context and they do not support scalable representations. The idea is not to try to cut humans out of the loop, but to make it easier for humans to encode information that they intend to remember. In note taking form, context is a set of indexing tokens about when something is expected to be useful.
It’s easy to put a memory in a box for safe-keeping, like dropping it off with some data valet parking agent. But how to find it again? One car is easy enough. A billion cars is another matter. Computers solve the problem of storage, but they all suffer from the same problem of relevant retrieval.
Human memory is not like computer memory. Our memories are not about random access lookup of facts. They are basically about events, or episodes that happened to us. We encode these through contexts that matter to us. If we remember things, we do so because of the episode in which we stumbled upon them, or (better still) put in the work to discover them in the first place. This is why we go to school. Reading something is not as powerful as experiencing it in a classroom.
So, while intent is a measure of importance to us, context is both a discriminator and a handle for recall–like a primary key. It’s a lot more expressive and a lot more complex than a simple key-value association. In a recent paper, I wrote about a way of defining and measuring these two components of knowledge: On The Role of Intentionality in Knowledge Representation: Analyzing Scene Context for Cognitive Agents with a Tiny Language Model. This was a kind of idealization that science seeks.
Postscript
The key to knowledge lies in the kind of post-analysis that we reach when we’ve done something many times, long enough to “know it like a friend”. Something or someone we intend to know.
There are well-known philosophical treatises by Anscombe and by Searle. Without focusing on the nature of intent as a phenomenon, they map out the scope of what it might be possible to intend, and dig up adjacent moral issues such as why someone would make a certain choice, whether they would be able to go through with it in the face of moral qualms, and whether they could be prevented from doing so by someone else. These issues dance around intent without actually finding the elephant in the room. They stray into the territory of Promise Theory, which takes a scientific view of the issues and separates the concepts quite cleanly. If philosophy intends to be vague, science intends to quickly settle for pragmatism. Today, our situation is that we have very practical problems to solve.
It’s always preferable to strip away human affectations and pretensions to reveal the core meaning. In fact, if we accept that intent is about processes, then dimensional analysis places quite strong constraints on how it can be represented. The answer turns out to look like “energy” or work.
As for knowledge graphs, the goal of capturing knowledge should never be to replace original rich content with an impoverished model for the benefit of making it easy for computers. That would give us something like the hub-and-spoke molecular models we see in chemistry books and in textbook approaches to knowledge graphs. It would be like replacing actual experience in the chemistry lab with those plastic models–what fun would experiments be then?
Our challenge is to encourage a generation of spoiled IT users to relearn what knowledge is, to annotate and expand on original content with one’s own notes. Knowledge comes from our own work. From an intent to pursue. Give the SSTorytime model a try!
Some of this article was presented at the Kavli Institute Salon on Neuroscience, in Los Angeles in 2022.
