Chapter One - Getting to Know You
The Crystal Ball
The mid-18th century through the mid-20th century is often referred to as the “industrial age”. During the industrial age we saw the emergence of “big industry”: large manufacturing efforts on a global scale to sell mass-produced goods to the public. This era was largely defined by the invention of the steam engine, which allowed for faster delivery of goods and more efficient manufacturing processes. While the manufacturing boom is typically viewed as the driving force behind the industrial age, the true value of “big industry” lay not in the manufactured goods themselves, but in the ability to transport those goods to where they needed to go in a timely fashion.
The late 20th and early 21st centuries are often referred to as the “information age”. Many people have heard, or even used, this term, but do not fully comprehend what it means. Much as the industrial age is associated with manufacturing, the information age is often associated with data, or “big data”. However, the true value in “big data” is not the data itself, but the ability to process and understand this data. Data analytics is the 21st-century equivalent of the steam engine, moving big data into a format which has value to businesses or consumers. Predictive data analytics will continue to shape and define our world throughout the 21st century and beyond.
One of the earliest examples of predictive data analytics is weather forecasting. Now I know that weather forecasters are often given grief when they don’t accurately predict a snowstorm, or whether the weather is going to be sunny or cloudy. However, overall weather predictions are mostly accurate when it comes to high and low temperatures, precipitation chances, or movement of storms. These predictions become increasingly accurate the closer you are to the prediction window. For example, a weather prediction for tomorrow will be much more accurate than a prediction for 7 days from now. Why? Because more timely data is available for tomorrow’s forecast, while 7 days from now includes 6 days of unknown data. Each day’s prediction is based upon the previous day’s, and each prediction carries its own margin of error. Early weather prediction computations were worthless because it took 24 hours to generate weather predictions 24 hours away. By the time the prediction was generated, the weather events had already happened. As time has progressed and computers have become faster, the timeliness of these predictions has improved. While it took the ENIAC computer 24 hours to predict the weather in 1950, a Nokia 6300 mobile phone in 2008 could generate the same prediction in less than a second. At this point we have reached a limit: the accuracy of weather predictions cannot be increased without better data or better algorithms, while further gains in speed now provide diminishing returns.
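To see why a 7-day forecast is so much shakier than a 1-day forecast, consider a rough sketch of how error can stack up when each day’s prediction builds on the previous one. The 5% daily error and the simple multiplicative compounding are assumptions chosen purely for illustration; real forecasting models are far more sophisticated, and the function name is hypothetical.

```python
def compounded_error(daily_error: float, days: int) -> float:
    """Worst-case relative error after chaining `days` forecasts.

    Assumes (for illustration only) that each day's forecast can be off by
    up to `daily_error` relative to the day it was built on, and that these
    errors multiply from one day to the next.
    """
    return (1 + daily_error) ** days - 1


# Compare tomorrow's forecast with forecasts further out.
for day in (1, 3, 7):
    print(f"day {day}: up to {compounded_error(0.05, day):.1%} off")
```

Even with a small error on any single day, the uncertainty grows with every additional day the forecast has to reach across.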
Imagine having a crystal ball which allows you to predict, with a relatively small margin of error, the thoughts and reactions of the American people to news articles, movies, television, and political messages. Unfortunately, such a crystal ball is not far from reality, thanks to the advanced power of big data analytics. We have reached the point where these predictive algorithms are constrained mainly by their accuracy, and efforts to improve it are constant.
Everything you do online is tracked. Your personal information, your likes, your dislikes, your browsing habits, what services you use, how often you use the Internet and social media, even the physical locations you take your cell phone, are tracked, categorized, filtered, sorted, and distributed for sale. While some companies have restrictions on data usage, others do not.
It is not at all uncommon for political parties, candidates, or political action committees to purchase your information, not only for targeted advertisement campaigns, but also for market research efforts. This data is used not only to determine how you live your life, but also to predict which political issues you are most likely to support or reject.
Predictive Pregnancy
To understand the potential power these predictive analytics have, we only need to look at the 2012 incident where the retailer Target successfully predicted the pregnancy of a high school teenager outside Minneapolis, Minnesota. Target identified shopping patterns of pregnant women based upon purchases such as unscented lotion, cotton balls, and vitamin supplements and, terrifyingly, was able to predict not only that a woman was pregnant, but her due date within a few days. This data was then used to send targeted advertisements to these women based upon the predicted stage of their pregnancy.
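The general idea of scoring a shopper from purchase patterns can be sketched in a few lines. To be clear, this is not Target’s model, which has never been published in detail; the weights, the threshold, and the names below are all hypothetical, and only the three product categories come from the account above.

```python
# Hypothetical weights, invented for illustration only.
# The product categories are the ones mentioned in the text.
SIGNAL_WEIGHTS = {
    "unscented lotion": 0.3,
    "cotton balls": 0.2,
    "vitamin supplements": 0.3,
}

# Hypothetical cutoff above which a shopper would be flagged.
FLAG_THRESHOLD = 0.6


def pregnancy_score(purchases):
    """Sum the weights of the distinct signal items found in a purchase list."""
    return sum(SIGNAL_WEIGHTS.get(item, 0.0) for item in set(purchases))


def is_flagged(purchases):
    """Return True when the score reaches the (assumed) threshold."""
    return pregnancy_score(purchases) >= FLAG_THRESHOLD


basket = ["unscented lotion", "cotton balls", "vitamin supplements", "lawnmower"]
print(pregnancy_score(basket), is_flagged(basket))
```

No single purchase gives the shopper away. It is the combination of otherwise ordinary items that pushes the score over the line.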
Target’s usage of these powerful predictive analytics only came to light when the teen’s father complained about his daughter receiving these (unknown to him) targeted advertisements for maternity clothing and nursery furniture. Target has since revised its advertising technique. Instead of sending only maternity-related advertisements, Target now sends “predicted mothers” (there’s a phrase for the 21st century) seemingly generic coupon booklets with targeted coupons spread throughout. For example, a coupon for a lawnmower might be next to a coupon for diapers.
I have personally seen these predictive analytics in action, both online and in the mail. One excellent example is when I was looking to potentially take out a personal loan for a real estate transaction. A few months before Christmas, I performed preliminary research on a popular loan website to find out what rates were available to me. I input a random dollar amount and asked for available rates. Ultimately I decided not to go through with the personal loan, and secured alternative financing. Remarkably, every year before Christmas I now receive mailers from random loan companies and banks announcing that I’m pre-qualified for the exact amount I entered into the loan website! Apparently the loan website believed that this could be a seasonal expense, and that I was considering a rather expensive Christmas gift such as a vacation. If I were indeed looking to book a dream vacation, how fortunate I would be that such an offer would arrive in the mail at exactly the right time!
Now that you understand how powerful predictive analytics can be, and their commercial applications, let’s take a look at their political applications.
Cambridge Analytica
In 2014, Cambridge Analytica, a London-based political consulting firm, used the Facebook data of tens of millions of users to generate psychological profiles of voters. This information was then sold to the Donald Trump campaign to help influence the public to elect Trump as President of the United States in 2016. Similar to how Target can predict pregnancy based upon purchasing patterns, data research firms have discovered that it is possible to generate a predictive profile of a person’s life and personality based simply upon what they “like” on Facebook. These predictions can include aspects such as substance use, political attitudes, and physical health.
How could the Trump campaign have used this data? Most likely the data was used to target specific geographic areas to increase conservative voter turnout. By identifying where conservative voters lived and where there was an opportunity to increase turnout, the Trump campaign could simply focus additional resources on those areas. Even more powerful, the analytic data would allow the Trump campaign to carefully focus each rally speech on the topics which would matter most to nearby residents. One of the best examples of this strategy is most likely the campaign rally in Hagerstown, Maryland. This rally, held in April of 2016, took place at the Hagerstown Airport, which is right next to Interstate 81, a major north/south corridor. This placed the rally within driving distance of Virginia and Pennsylvania, two battleground states. The results? Trump defeated Clinton in Pennsylvania by only about 44,000 votes, less than one percentage point. This move and others like it secured Trump 20 electoral college votes from Pennsylvania, a feat which Mitt Romney was unable to pull off in 2012 against Barack Obama.
The 2016 Election and the “Silent Majority”
Nearly every single news outlet had predicted that Hillary Clinton would win the 2016 Presidential Election. Some election prediction models even placed Clinton at a 99% chance of victory. So what happened? How were the models, which had previously successfully predicted many elections, suddenly so wrong? Not only did Trump win, he won with 304 electoral college votes - 34 more votes than needed to secure the presidency.
One of the largest influences in this discrepancy, according to Patrick Murray, head of Monmouth University’s polling institute, was “Non-response among a major core of Trump voters.” Kristin Tate, a contributor to The Hill, shared Murray’s view, arguing that Republicans became silent and refused to answer polls after being publicly harassed for their political views. This “demonizing” of Trump supporters resulted in their silence until election day, through fear of losing friends or family, or even their jobs, all because they supported Trump. Supporters were called racists and bigots, and were made to feel socially unacceptable for publicly expressing their political views. However, this did not change their political views, but instead likely motivated them even more to show up on election day and vote for Donald Trump. With people being publicly harassed for supporting a certain political candidate, would you truthfully answer a random stranger calling you on the phone to ask who you intend to vote for? For many in 2016, the answer to this question was most likely a resounding “no”, and the large discrepancy in poll projections supports this conclusion.
Defending Against Predictive Analytics
One of the ever-growing concerns during the Industrial Age was the damage to our environment caused by machines mining minerals, processing chemicals, or even clearcutting forests. In order to stop this damage, environmental activists would intentionally sabotage the machines by throwing metal objects such as wrenches into them. While this ultimately did not stop the machines, it did temporarily halt them and draw attention to the cause. Eventually environmental regulations were implemented to reduce or eliminate the environmental impact of big industry.
The Information Age presents a similar issue to the Industrial Age, except instead of trying to protect the environment, we must work to protect our very own privacy, our minds. Unfortunately, our data will continue to be harvested, sold, analyzed, and utilized to influence everything from our purchases at the store to our votes at the ballot box. While it is possible to “opt out” of many of these services, there are so many data collection firms out there that it would be nearly impossible to opt out of them all. A better option may be to do as many did during the 2016 election, and intentionally provide either misleading data, or no data whatsoever. Refuse to answer polls and surveys, utilize “Incognito” browsing, or even just start randomly searching for things on the Internet you have no interest in actually buying. If someone tries to find out your personal interests or who you intend to vote for, refuse to answer, or even lie. Much like the environmental activists throwing wrenches into machinery, if you inject bad data into the predictive analytics, the predictive analytics will break.
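A toy example shows how a handful of wrenches can blur a profile. The profiler below is a deliberately naive assumption (it simply measures how strongly a search history points at its most common topic), and the search histories are made up, but it illustrates the principle of diluting real signals with noise.

```python
from collections import Counter


def top_interest_share(history):
    """Return the most frequent topic and the fraction of the history it fills.

    A naive stand-in for a profiler: the higher the fraction, the more
    confident it can be about what this person cares about.
    """
    counts = Counter(history)
    interest, hits = counts.most_common(1)[0]
    return interest, hits / len(history)


# Made-up genuine history: mostly one real interest.
genuine = ["camping"] * 8 + ["cooking"] * 2

# Made-up decoy searches: 3 each across 10 topics of no real interest.
decoys = [f"decoy topic {n}" for n in range(10)] * 3

print(top_interest_share(genuine))
print(top_interest_share(genuine + decoys))
```

In this sketch the real interest drops from 80% of the history to 20% once the decoys are mixed in, which leaves the profiler with a much weaker signal to sell.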
By breaking the predictive analytics, you devalue your data. Data which cannot be used to successfully profile or predict becomes garbage data, completely worthless to advertisers and pollsters alike. Throw a wrench into the cogs of the predictive analytics machine, and your data will no longer be of value to anyone, for a short while at least. But more importantly, you will help in drawing attention to the ever growing threat to our privacy presented by the massive data harvesting and predictive analytics used to influence our lives. Hopefully, if enough people act and speak loudly on the cause, more action will be taken to protect our privacy. In the meantime, keep throwing those wrenches.