René Descartes famously wrote “Cogito, ergo sum”, better known as “I think, therefore I am”, a line of reasoning he develops in Meditations on First Philosophy. In a world where objectivity is our default understanding, Descartes’s skepticism and John Locke’s empiricism have often been misunderstood to mean that the existence of the external world is subjective to each individual’s mind. In fact, what they meant is that subjectivity and objectivity are inherently inseparable. I am not here to argue that the apple on the table exists while I am looking at it and ceases to exist when I close my eyes. But my confidence in the existence of the apple is gained through my first-person observation. I want to believe in the existence of an objective reality, but my observation, knowledge, and understanding of this objective reality are unavoidably subjective. I see, I hear, I touch, and I reason through my own senses and understanding. One of the great human flaws is that we want to believe, somehow, that our own senses and reasoning are more objective than those of others.
“Pure logical thinking cannot yield us any knowledge of the empirical world; all knowledge of reality starts from experience and ends in it.” – Albert Einstein
Before I dive into the foundation of language, I want to state that I am not a linguist, but a mathematician and computer scientist. Still, I think the foundation of language is a subject worth discussing, because language is the very foundation of human society. We express ourselves, communicate, and make decisions using language. And knowing what we mean when we utter a sentence is, I would argue, far more important than the sentence itself. But, as noted, I am not a linguist, and my understanding of the subject is limited. My hope is that this post can inspire thoughts, questions, and challenges, rather than simply get more people to agree with me.
Let me start with a question:
How do we know that the color we see as red is the same as the color someone else sees as red?
Many readers have probably contemplated this question before. Before I venture into the discussion, I want to distinguish the concept of a property from that of a value. A property is something that can be measured. The value of a property is the outcome of the measurement. For example, length is a property that can be measured using a ruler, while the outcome of the measurement, whether it is 3 inches or 3 cm, is the value. In the question above, color is the property and red is the value. A property and its value are unavoidably linked by a third party that performs the measurement. The result of the measurement (the value) is represented using symbols, otherwise known as language. Now allow me to present two tests to verify whether participants A and B agree that the color they see is the same:
Test 1: Participant A points at a color that A considers to be, let’s say, red, and asks whether participant B agrees that the given color is red. In response, participant B answers either “yes, I agree” or “no, I do not agree”. We conclude that A and B are seeing the same red if and only if B answers “yes, I agree”.
Test 2: Participant A points at a color that A considers to be, let’s say, red, and asks participant B what color B thinks it is. We conclude that A and B are seeing the same red if and only if B answers “red”.
For Test 1, the reader may immediately realize that participant B doesn’t really need any understanding of color or of red; participant B can always answer “yes, I agree” and pass the test. We can modify the test to include trick questions: sometimes A asks whether B agrees that the color is blue, even though A thinks it is red. But if B has no concept of color and is just very good at telling whether A is asking a trick question, B can still pass the test.
As for Test 2, participant B is no longer biased by A’s answer, so if participant B provides an answer identical to participant A’s, we have good confidence that they agree on the observed color. But Test 2 also has limitations: what if participant B speaks a different language? What if participant B uses the word “Rojo”, “红”, or “أحمر” to represent what participant A saw as red? It may seem trivial to just include the translations into different languages among the acceptable answers. But look at the following color:

To many speakers of Western languages this color is blue, while to many speakers of Eastern languages it is green. It is not that we see a fundamentally different color; rather, the concepts of blue and green are slightly different in those two linguistic groups, and therefore the same reference may be categorized differently. (Readers can refer to the Grounding Problem of Language section of my previous post for more information about the relationship between concepts and references.)
To overcome the boundaries and ambiguity of languages, here is the list of criteria (the Postulates of Perfect Linguistic Agreement) that I propose for verifying that two entities (A and B) agree on the observation and measurement of a property:
- For a given property (for example, color), if A produces identical symbols in two different measurements, B must also produce identical symbols (though not necessarily identical to A’s symbols).
- For a given property (for example, color), if A produces different symbols in two different measurements, B must also produce different symbols (though not necessarily identical to A’s symbols).
For the math nerds out there, we can express the relationship between A’s and B’s symbolic representations of the given property as a bijective map. Note that I used the word entities here, because A and B do not need to be humans; they can be anything that can make measurements and produce symbols.
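
The two postulates can be checked mechanically on a finite set of paired measurements. The following is only a minimal sketch under my own assumptions: the function name `perfectly_agree` is hypothetical, and the i-th entries of the two lists are assumed to be the symbols A and B produced for the same measurement. It illustrates the bijective-map idea rather than giving a definitive test.

```python
def perfectly_agree(symbols_a, symbols_b):
    """Return True if the paired symbols satisfy both postulates.

    symbols_a[i] and symbols_b[i] are assumed to be the symbols that
    A and B produced for the same i-th measurement.
    """
    if len(symbols_a) != len(symbols_b):
        raise ValueError("A and B must report on the same measurements")
    a_to_b = {}  # the B-symbol paired with each A-symbol
    b_to_a = {}  # the A-symbol paired with each B-symbol
    for a, b in zip(symbols_a, symbols_b):
        # Postulate 1: identical symbols from A need identical symbols from B.
        if a_to_b.setdefault(a, b) != b:
            return False
        # Postulate 2: different symbols from A need different symbols from B.
        if b_to_a.setdefault(b, a) != a:
            return False
    return True


english = ["red", "red", "blue", "green"]
spanish = ["rojo", "rojo", "azul", "verde"]
always_yes = ["yes", "yes", "yes", "yes"]

print(perfectly_agree(english, spanish))     # True
print(perfectly_agree(english, always_yes))  # False
```

Note that the participant who always answers “yes” fails here, because different symbols from A are met with the same symbol from B.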
For example, consider a mechanical watch and an iPhone. Both measure time, even though the mechanical watch shows time using the angles at which its hands point, while the iPhone shows time using numbers. At the same time of day, the angles of the hands will always be the same, and the numbers shown on the iPhone will also be the same. At different times of day (let’s assume within the same 12 hours for now), the angles of the hands will be different, and the numbers shown on the iPhone will also be different.
Here is an example using programming languages. To measure the position p of an element in an array a, a C++ or Java programmer writes a[p-1], while a Fortran or MATLAB programmer writes a(p). Furthermore, if the elements are stored in a linked list, one measure of the position can be the memory address of the pth element of a. The memory address as a measurement still agrees with the index as a measurement, but it is a lot harder to locate the previous and following elements using a memory address than using an index. That is, not all agreeing measurements are equally convenient for computation. I will touch a bit more on this when I discuss mathematics and science in later blog posts.
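
To make the indexing example concrete, here is a small Python sketch. It rests on assumptions of my own: the `Node` class and the `one_to_one` helper are hypothetical names, and Python's built-in `id()` is used as a stand-in for a memory address (it only guarantees a unique identity for objects that are alive at the same time). The sketch suggests that all three ways of labeling a position agree with one another, while only the indices support simple arithmetic for finding a neighbor.

```python
class Node:
    """One cell of a singly linked list."""

    def __init__(self, value):
        self.value = value
        self.next = None


values = ["w", "x", "y", "z"]

# Build the linked list; the `nodes` list keeps every node alive,
# so the identities below stay distinct.
nodes = [Node(v) for v in values]
for current, following in zip(nodes, nodes[1:]):
    current.next = following

# Three "measurements" of the position of each element.
zero_based = list(range(len(values)))      # C++/Java style: 0, 1, 2, 3
one_based = [i + 1 for i in zero_based]    # Fortran/MATLAB style: 1, 2, 3, 4
identities = [id(node) for node in nodes]  # stand-in for a memory address


def one_to_one(labels_a, labels_b):
    """True if the paired labels define a bijection between the two label sets."""
    pairs = set(zip(labels_a, labels_b))
    return len(pairs) == len(set(labels_a)) == len(set(labels_b))


print(one_to_one(zero_based, one_based))   # True
print(one_to_one(zero_based, identities))  # True

# With an index, the following element is one addition away...
p = 2                                # a 1-based position
next_by_index = values[(p - 1) + 1]  # element at 1-based position p + 1
# ...while an identity says nothing about its neighbor; we must follow the link.
next_by_link = nodes[p - 1].next.value
print(next_by_index == next_by_link)  # True
```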
The Postulates of Perfect Linguistic Agreement imply consistency of the measurements that A and B make of the given property, provided A and B share no hidden information.
Let me define consistency first:
Measurements are consistent if for all objects that have the same ground truth values of a given property, the measurements are also the same.
It basically says that if my measurements are consistent, then when I see a red flower I should say that it is red, when I see a red box I should say that it is red, and so on. However, the consistency of measurements cannot be validated directly, because I cannot obtain the ground truth value of a property without measuring it. I can’t know that a flower is red other than by looking at it myself, or by asking someone else to look at it and tell me. This comes back to the empiricist belief that we cannot be rid of subjective observers when observing the world.
But the Postulates of Perfect Linguistic Agreement give us a way to test consistency. Here is a short proof sketch:
Let’s assume that the measurements of A are inconsistent, and that A and B are in perfect linguistic agreement. Because the measurements of A are inconsistent, we know that there exists a value of a property for which A produces different symbols in different measurements. Because B and A are in perfect linguistic agreement, B must also produce different symbols in those same measurements. But if A and B share no hidden information, all B has observed is the same value. Even though B can produce different symbols by, for example, flipping a coin, the differences that A and B produce cannot be perfectly correlated; that is, without communicating with A, B can’t produce a different symbol for the same value exactly when A does. Therefore there is a contradiction. So, if the Postulates of Perfect Linguistic Agreement hold, and A and B share no hidden information, the measurements that A and B make are consistent.
But consistency is not objectivity. For example, an IQ test that is biased by cultural references may be consistent, but would not have strong construct validity. I don’t think it is possible to be completely objective, though, because which unit we use and which language we use to express the result are all subjective choices. For programmers, 0-based indexing and 1-based indexing are both consistent, but which one a programmer prefers is subjective. Ultimately, I don’t think subjectivity is inherently problematic in scientific research. The real issue is often that researchers use inductive and abductive reasoning to extrapolate information that is not directly included in the data.
In reality, there is often some hidden information shared between A and B. For example, when we ask someone whether they think a certain event is just, their measurement of justice is often heavily influenced by their upbringing and cultural values rather than by the event alone. Such biases are especially prominent in fields such as history, sociology, and psychology, the fields that focus on the study of humans. In recent years, there has been a growing realization that, because the majority of those studies are performed by Western scholars, the results may still have been influenced by their shared perspectives and values, even when the scholars agree with each other on the findings. To alleviate this bias, in some fields we have invented complex instrumentation to help us make measurements. Though it is still possible (for example, an IQ test tailored towards a certain culture), it is a lot harder for an instrument to share the same bias with a human. In general, the less information A and B share beyond the property that is being measured, the more objective the measurement is.
Also, measurements are subject to error. This is inevitable even with the most accurate machines. So instead of expecting Perfect Linguistic Agreement, in the real world we should expect Approximate Linguistic Agreement, that is:
- For a given property (for example, color), if A produces symbols in two different measurements that are within an error margin of each other, B must also produce symbols that are within an error margin of each other (though not necessarily identical to A’s symbols).
- For a given property (for example, color), if A produces symbols in two different measurements that differ beyond an error margin, B must also produce symbols that differ beyond an error margin (though not necessarily identical to A’s symbols).
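
For numeric measurements, the approximate version can be sketched as a pairwise comparison. This is one possible reading of the two criteria above, not a definitive formulation: the function name `approximately_agree`, the choice of one margin per entity, and the sample readings in inches and centimeters are all assumptions made for illustration.

```python
from itertools import combinations


def approximately_agree(readings_a, readings_b, margin_a, margin_b):
    """Check Approximate Linguistic Agreement on paired numeric readings.

    readings_a[i] and readings_b[i] are assumed to come from the same i-th
    measurement. Two readings count as "the same symbol" when they differ
    by no more than that entity's own error margin.
    """
    if len(readings_a) != len(readings_b):
        raise ValueError("A and B must report on the same measurements")
    for i, j in combinations(range(len(readings_a)), 2):
        close_a = abs(readings_a[i] - readings_a[j]) <= margin_a
        close_b = abs(readings_b[i] - readings_b[j]) <= margin_b
        # Both criteria together: A sees "same" exactly when B sees "same".
        if close_a != close_b:
            return False
    return True


# A measures in inches, B in centimeters (illustrative numbers only).
inches = [3.00, 3.01, 5.00]
centimeters = [7.63, 7.60, 12.71]
print(approximately_agree(inches, centimeters, 0.05, 0.127))  # True

# Here B's second reading is far from the first, although A's are close.
off_readings = [7.63, 9.00, 12.71]
print(approximately_agree(inches, off_readings, 0.05, 0.127))  # False
```

Readings that fall right at the edge of a margin can flip the result, which is one more reason to treat agreement as a matter of degree.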
For some fields, such as physics, the instrumentation is extremely accurate and the error margin can be really small, while in other fields it can be a lot bigger (for example, in the perception of color).
Therefore, instead of considering linguistic agreement as a black-and-white toggle switch, it is more realistic to think of it as a gradient. From inconsistent measurements at one extreme, to the more error-prone human senses, to particle detectors, each is more objective than the one before.
In my next blog post, I will continue the discussion by presenting my view on how linguistic agreement relates to the reproducibility of experiments, and maybe I will touch a little on the foundation of mathematics, specifically the question: do all mathematical concepts have a definition?