The Palantiri, The Lord of the Rings, and the Sins of a Data Scientist
Disclaimer: This has nothing to do with the technology company named Palantir. Peter Thiel and I have one thing in common: we are both obsessed with Tolkien, which might have inspired Thiel to christen his company after a magical artifact from Tolkien’s books The Lord of the Rings and The Silmarillion. Here I am ranting on analogies of the Palantiri in the realm of statistics and on their parallels in our modern day.
The Palantiri (the plural of Palantir) are the seven Seeing Stones, the indestructible magical crystal balls carried by the Numenoreans (the ancestors of Aragorn) to Middle-earth. The Palantiri were used by their respective “owners” to communicate with each other from afar. The person who wielded a stone could direct its gaze to a real happening or event; the images could be transmitted and, in turn, viewed through the other six. These stones were very deceptive. A person who viewed through his or her stone could see the happenings on the other side only muddily, and could make wrong inferences and terrible decisions. First, it could be unclear whether the happenings were from the past or the future. Second, through the stone you could choose what to show or what to conceal from the other “stone watchers.”
An example: when Sauron (the villain) used his stone to view, through Saruman’s (another bad guy) stone, what was happening in Isengard, he was unaware that Saruman had been defeated at Helm’s Deep and Isengard, and that Saruman’s stone was now with Pippin (a jolly little Hobbit). Seeing a Hobbit through the stone, Sauron assumed Pippin was the hobbit he had been searching for, the one who possessed the Ring, and jumped to the conclusion that Saruman had captured the Ring. Infuriated, Sauron told Pippin to tell Saruman that the dainty Ring belonged to him. This drove a wedge into the alliance between Sauron and Saruman. When Sauron viewed through Saruman’s stone, he got imperfect information and made an erroneous observation and a wrong decision.
Another example is when Denethor, the steward of Gondor, saw through his stone what Sauron wanted him to see. Sauron intentionally transmitted, through his stone, images of the Black Fleet approaching Gondor. Denethor assumed his enemies, the Corsairs of Umbar, were navigating the Black Fleet and, out of fear of defeat, killed himself. In this case Sauron “sampled” and filtered the pixels and images, and did not show Denethor that the Corsairs had been defeated by Aragorn, and that it was in fact Aragorn and his army navigating the fleet, coming to the rescue of Denethor and Gondor.
Two sins of the Palantiri. Sin 1: Thy observational error leadeth to wrong inferences and hence wrong decisions. Sin 2: Thou hast caused biased sampling of images and pixels, leading to wrong decisions. Now let us talk about two common sins that a data scientist can succumb to, in the realm of data as well as modelling. Sin 1: observational error. Sin 2: bias. So let me put on my statistician hat, and maybe a philosopher’s hat too.
Sin 1: Observational Error
To err is human. This sin, observational error, is inadvertent and not malicious. Observational error is the delta between the measured value and the true value. Say my twin and I are trying to lose weight. Our weighing machine is off by five kilos, so we will always read five kilos heavier. This is what is known as a systematic error. (By the way, I do not have a twin.) Some days I may weigh myself wearing heavier clothing, or I may forget to remove my “heavy” wallet from my pocket, or a parallax error may throw the reading off. This genre of measurement error is known as random error. Observational errors are quite common in survey data. Consider a survey question with a double negative: “He ain’t never told no lies!” can be confusing for non-native English-speaking survey respondents, and this can lead to systematic observational error in the survey responses. Meanwhile, respondents can make inadvertent mistakes, leading to random observational errors. How do you avoid this? My naïve recommendation is “be careful.” A safer recommendation is to establish statistical procedures to estimate and monitor measurement errors, using metrics like the Standard Error of Measurement or the Coefficient of Variation. From a decision-modelling point of view, can we put a value or risk on imperfect information, so that we can make better decisions about which source of data to consult? In decision modelling, the value of information can be modelled via Bayesian approaches like the Expected Value of Perfect Information and the Expected Value of Imperfect Information, leveraging frameworks like decision trees and influence diagrams.
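The weighing-machine story can be sketched in a few lines of Python. The true weight and the readings below are invented for illustration; the point is that the bias of the mean captures the systematic error, while the Coefficient of Variation and the standard error capture the random scatter (here I use the standard error of the mean as a simple stand-in for the psychometric Standard Error of Measurement, which properly requires a reliability estimate).

```python
import statistics

# Hypothetical repeated weigh-ins (kg) on a scale with a +5 kg
# systematic offset plus a small random error on each reading.
true_weight = 70.0
readings = [75.2, 74.8, 75.1, 74.9, 75.3, 74.7]

mean = statistics.mean(readings)
sd = statistics.stdev(readings)

# Systematic error: the bias of the mean relative to the true value.
systematic_error = mean - true_weight

# Coefficient of Variation: spread relative to the mean (random error).
cv = sd / mean

# Standard error of the mean: how much the average reading itself wobbles.
sem = sd / len(readings) ** 0.5

print(f"bias = {systematic_error:+.2f} kg, CV = {cv:.3%}, SEM = {sem:.3f} kg")
```

Tracking these two numbers separately matters: averaging more readings shrinks the random part (the SEM) but does nothing to the five-kilo bias, which only recalibration can remove.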
Sin 2: The Evil Twins, Confirmation and Selection Bias
Confirmation bias is when you conduct statistical analysis to prove a point. A biased analyst will keep looking for different statistical tests until one proves his or her pre-determined hypothesis: if Spearman’s rho does not prove the point, move on to Kendall’s tau. How do you respond to that? I saw a cartoon in which “Data scientist 1 is talking with Data scientist 2. Data scientist 1: This is your machine learning system? Data scientist 2: Yup! You pour data into this big pile of linear algebra, then collect the answers on the other side. Data scientist 1: What if the answers are wrong? Data scientist 2: Just stir the pile until they start looking right.” Jokes aside, on a philosophical note: “Whether you go through life believing that people are inherently good or that people are inherently bad, you will find daily proof to support your case. Both parties, the philanthropists and the misanthropes, simply filter dis-confirming evidence and focus on the do-gooders and the dictators who support their worldviews.” Consciously look for dis-confirming evidence. Consider multiple data sources. Surround yourself with a diverse group of people. Encourage the devil’s advocates on your team.
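Shopping from Spearman’s rho to Kendall’s tau until something “works” is a multiple-comparisons problem, and the arithmetic against it is simple. A minimal sketch (the test counts are arbitrary): if each test has a 5% false-positive rate, the chance that at least one of k independent tests spuriously confirms a false hypothesis is 1 - (1 - 0.05)^k, and a Bonferroni correction shrinks the per-test threshold to compensate.

```python
# Family-wise false-positive rate when shopping among k independent
# tests at alpha = 0.05, and the Bonferroni-corrected per-test alpha.
alpha = 0.05
for k in (1, 5, 10, 20):
    family_wise = 1 - (1 - alpha) ** k
    bonferroni = alpha / k
    print(f"{k:>2} tests: P(at least one false positive) = {family_wise:.2f}, "
          f"Bonferroni per-test alpha = {bonferroni:.4f}")
```

With twenty tests at your disposal, the odds of “proving” a pre-determined hypothesis by chance alone are roughly two in three, which is why pre-registering the test you will use, before seeing the data, is the honest move.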
“A company wanted to find out, on average, how many phones (landline and cell) each household owned. When the results were tallied, the firm was amazed that not a single household claimed to have no phone.” Was that subtle or what? Selection bias happens when you select data subjectively; we call it biased sampling. I was thinking about the many times I have been disappointed with the pollsters. One form of selection bias can be owed to feedback loops: I have seen models that influence data generation (ranking models, for instance) subsequently re-trained on the very data they helped generate. What are some remedies for this bias?
- Randomize: simple random sampling, stratified sampling, cluster sampling.
- Ask whether the data represents the entire population.
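The phone survey makes the remedies concrete. Below is a minimal stratified-sampling sketch in Python; the population, the strata names, and their sizes are all invented for illustration. A phone-based poll would draw zero households from the “no phone” stratum, whereas proportional allocation over a proper sampling frame keeps every stratum represented.

```python
import random

random.seed(7)

# Hypothetical sampling frame: households labelled by phone ownership.
# A phone-based poll could never reach the "no_phone" stratum at all.
population = (
    [{"stratum": "no_phone"}] * 50
    + [{"stratum": "landline_only"}] * 150
    + [{"stratum": "cell"}] * 800
)

def stratified_sample(pop, key, n):
    """Draw n units, allocated proportionally to each stratum's size."""
    strata = {}
    for unit in pop:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for units in strata.values():
        take = round(n * len(units) / len(pop))
        sample.extend(random.sample(units, take))
    return sample

sample = stratified_sample(population, "stratum", 100)
counts = {}
for unit in sample:
    counts[unit["stratum"]] = counts.get(unit["stratum"], 0) + 1
print(counts)  # every stratum, including "no_phone", is represented
```

The same discipline breaks the feedback loop mentioned above: if a ranking model decides which items get shown, hold out a small randomized slice of traffic so the re-training data is not purely the model’s own echo.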
A Final Thought
The Palantiri have analogies in the modern age: social media. A selfie of me holding a piña colada, in a tight close-up with terse hashtags #vacation #chilling #BesideTheSea, posted on Facebook, can be misconstrued by my fellow netizens, across the “crystal” pixels of their phones, as me bragging about a Caribbean vacation. In reality, the selfie could be from my balcony in Bangalore, while I am reading something as sinister as Olmi’s Beside the Sea.
The modern-day Palantiri are our smartphones and social media, where I may be propagating something as benign as a seaside vacation. But they can be malignant too: tools in the hands of a modern-day Joseph Goebbels, or of a political demagogue propagating fake news to his or her minions to oppress a group of people, to promote the Big Brother, or to control the citizens. This is when detention centers can be playfully called “summer camps” by news anchors, or insane people can use euphemisms like “special treatment” or “final solution.”
References

Tolkien, J. R. R., & Inglis, R. (2011). The Return of the King: Book Three of The Lord of the Rings, and The Annals of the Kings and Rulers: An Appendix to The Lord of the Rings. Prince Frederick, MD: Recorded Books.

Dobelli, R., & Griffin, N. (2013). The Art of Thinking Clearly. New York: Harper.

Orwell, G., Nesti, F., & Kamoun, J. (2020). 1984.