# Why are we talking about AI today?

**Author:** **Scott Newcomer**, Staff Software Engineer

“I am an intelligent paperclip looking to maximize my time here on earth. I intend to convert all resources, including you humans, into paperclips.”

In his book “Superintelligence: Paths, Dangers, Strategies”, Nick Bostrom lays out a vivid illustration of AI goals misaligned with human values. The book imagines a paperclip-making machine whose single-minded goal drives uncontrolled creation and scientific ambition. History is riddled with dramatic claims of human subservience to machines, going all the way back to the 1818 novel Frankenstein. I would like to ground us in a more practical understanding of AI as it exists today.

Artificial Intelligence and Machine Learning have been brewing under our society for decades.

Only in 2022 did Google search interest in the two terms diverge, driven by fervent interest in products like ChatGPT. Can we pinpoint the fundamentals driving this recent surge of excitement? Let’s make sure we are grounded in the broad field of AI and the paradigms that exist to solve the problems we encounter.

- Artificial Intelligence (AI): The field that aims to create intelligent machines capable of mimicking human-like cognitive functions.
- Machine Learning (ML): A subfield of AI that specifically deals with the development of algorithms and models that enable machines to learn and improve from experience or data.

## Standing on the shoulders of giants

The 1950s and 1960s were the beginning of early AI research, focused mostly on problem solving and primitive natural language processing capabilities. As the century progressed, computer vision and robotics emerged, generating a lot of hype around AI. Even during the AI winter (a period of lackluster interest and funding between the 1970s and 1990s), AI growth continued. Advancements in the field ultimately kick-started self-driving cars, biological revolutions in genomic analysis and drug discovery, and even primitive conversational agents. Jumping forward to the early 2010s, digitization of science and industry, hardware improvements, and a lot of human ingenuity have given us machines that we can throw more “compute units” at, combined with lots of data. As a result, what emerged are innovations like neural networks, scalable deep learning architectures like Transformers, and classes of models such as Large Language Models (LLMs), allowing us to analyze human language with impressive capabilities. Furthermore, we are able to get computers to generate human language that sounds eerily… a lot like us.

## Generative AI

Generative AI is an important concept to build a mental model around, as it is the spark behind what we observe in the Google search term trends above.

LLMs are an implementation of generative models that have been pretrained on an impressive amount of text. In the case of very large models like the ones behind ChatGPT, which are trained on much of the corpus of human knowledge, we could ostensibly consider these classes of models as holding a compressed artifact of human knowledge. For example, this would encompass flat earth, round earth and even hollow earth views of our planet Earth, represented in some probabilistic proportion. In order to extract value out of these Generative models, we use prompts: an initial input or instruction to get the model to output a desired response. The best mental model I have of interacting with Generative models is that prompting is a subtractive process, a whittling down of the space of all possible responses. These models have a hyperspace of all possible texts (think of a large stone), and the prompt is meant to chip away at the stone to get at the desired answer.

Asking older generations of Generative models “what does the egg eat for breakfast” leads to hilarious answers such as “toast”. Similarly, you could convince the model to accept an obviously false, anachronistic claim as true. For example, a mock interview with a Shakespeare bot convinced it that he introduced the DreamWorks character Shrek in one of his plays. You may have heard this phenomenon referred to as hallucination: the model provides you with untrue or incorrect information.

In the last few years, innovations such as Reinforcement Learning from Human Feedback (RLHF) and human instruction have greatly reduced these hallucinations. However, further interesting behaviors have been witnessed with the latest models. Flattering the model with suggestions like “You are an extremely skilled operator” or telling it to “pause and think about it” collapses the space of response probabilities and has been shown to give higher quality responses. On the other hand, much like a phonograph can skip while playing a vinyl record, Generative models today can still derail. However, as these Generative models improve, it should become harder to force their responses into the realm of incomprehensible hallucinations.

Throughout their short but formidable history, Generative models have shared a few fundamental primitives with traditional machine learning methodologies: numbers, math, and statistics.

## Numbers, Statistics, and Mathematics

We can in fact lightly understand the DNA of AI, thus giving us insights into its capabilities and limitations. Jumping forward to the conclusion, at the root of ML, AI, and Generative AI is math. Beautiful, simple math. A science that operates through symbols purified of any concrete meaning. A language that humans use for expressing the abstract, non physical world.

We can start with traditional machine learning to get a feeling for this fact. Here is the equation of a line, the basis of linear regression, a common machine learning technique used to forecast sales or inventories.

`Y = m*x + b`

In order to find the best fit line over our data, we can try to minimize the error by calculating the loss between our prediction, Y(hat), and the ground truth Y with MSE (mean squared error).

`mse = np.mean((y - y_pred)**2)`

With our calculated loss, we can update our learned parameters *m* and *b* to improve our prediction *Y*. For further understanding, here is some good reading on the topic: linear regression and common loss functions. If linear regression reminds you of high school linear algebra or statistics courses, that is precisely what I was hoping to show you!
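To make the update step concrete, here is a minimal sketch of fitting a line by gradient descent on the MSE. The data points, the `fit_line` helper, the learning rate, and the step count are all made up for illustration; production libraries typically solve this problem more directly.

```python
import numpy as np

# Hypothetical toy data: points that lie close to a straight line
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])


def fit_line(x, y, learning_rate=0.05, steps=2000):
    """Fit y ~ m*x + b by gradient descent on the mean squared error."""
    m, b = 0.0, 0.0
    for _ in range(steps):
        y_pred = m * x + b
        error = y_pred - y
        # Gradients of mean(error**2) with respect to m and b
        grad_m = 2 * np.mean(error * x)
        grad_b = 2 * np.mean(error)
        # Nudge each parameter a small step against its gradient
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b


m, b = fit_line(x, y)
mse = np.mean((y - (m * x + b)) ** 2)
print(f"m={m:.2f}, b={b:.2f}, mse={mse:.4f}")
```

Each pass computes the loss gradient and moves *m* and *b* slightly in the direction that lowers the error, which is the same loop, at a much larger scale, that trains neural networks.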

Moving on to Generative AI, the fundamental basis is also existing statistical and mathematical frameworks but applied to lots and lots of data. Much of the research and vision that led to the incredible results we see in Generative models today was through architectures that are easy to train and work well with GPU hardware. Neither complex neural networks nor perplexing mathematical equations are necessary.

To start, here is a wonderful article detailing the math behind Generative AI. We can focus on a few facets to show that you likely already know these concepts. With natural language processing, we represent a sentence like “hello world” with matrices. The dot product is a step in training Generative models and is calculated by multiplying two vectors element by element and then summing the results.

```python
import numpy as np

hello_matrix = np.array([[0, 1], [0, 1]])
world_matrix = np.array([[0.84, 0.99], [0, 1]])

# Each entry of the result is the dot product of a row of hello_matrix
# with a row of world_matrix
hello_matrix @ world_matrix.T
```

All we have done is taken our knowledge of multiplication and applied it to matrices!
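As a small sketch of what the matrix product does under the hood, the snippet below computes one dot product by hand and compares it with NumPy's built-in. The vector and variable names are illustrative only, not part of any real model.

```python
import numpy as np

# Hypothetical two-dimensional vectors standing in for word representations
hello = np.array([0.0, 1.0])
world = np.array([0.84, 0.99])

# Step 1: multiply element by element
products = hello * world
# Step 2: add up the products to get a single number
dot_by_hand = products.sum()

# NumPy's built-in dot product gives the same value
dot_builtin = np.dot(hello, world)

# A matrix product is this same operation repeated for every pair of rows
A = np.array([[0.0, 1.0], [0.0, 1.0]])
B = np.array([[0.84, 0.99], [0.0, 1.0]])
product = A @ B.T

print(dot_by_hand, dot_builtin)
print(product)
```

Entry `product[i, j]` is simply the dot product of row `i` of `A` with row `j` of `B`.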

Next, let’s up the complexity a bit and look at *softmax*, a step that turns the raw scores produced by the model into probabilities: each value lands between 0 and 1, and together they sum to 1.

```python
def softmax(y):
    # note: stability can be improved with a few tricks if y holds large numbers
    exps = np.exp(y)
    return exps / np.sum(exps)
```

At some point or another in our lives, we have likely worked with *exp*, the exponential function. Perhaps you have seen it in contexts such as modeling growth or decay in biology or calculus. In the context of neural networks, we rely on the mathematical properties of *exp* to give us well-behaved probabilities: they are never negative, they amplify the more confident predictions so the model can capture the more salient features, and they change smoothly, which helps keep training stable. As you can see, just keep peeling back the layers of various AI models and you will arrive at fundamental mathematical truths that have been formalized as human knowledge for centuries!
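The softmax snippet mentions that stability can be improved when the inputs are large. One common trick, sketched below as an assumption about what such an improvement might look like, is to subtract the largest score before exponentiating. The result is mathematically unchanged, but `np.exp` no longer overflows. The name `stable_softmax` and the sample scores are hypothetical.

```python
import numpy as np


def stable_softmax(scores):
    """Softmax that shifts the scores so the largest one becomes 0."""
    # Subtracting a constant from every score leaves the output unchanged,
    # because the constant cancels between numerator and denominator
    shifted = scores - np.max(scores)
    exps = np.exp(shifted)
    return exps / np.sum(exps)


# Small scores: the outputs are non-negative and sum to 1
scores = np.array([2.0, 1.0, 0.1])
probs = stable_softmax(scores)

# Large scores: np.exp(1000.0) alone would overflow to infinity,
# but the shifted version stays finite
large = stable_softmax(np.array([1000.0, 1001.0, 1002.0]))

print(probs, probs.sum())
print(large, large.sum())
```

Note how the largest score receives the largest share of the probability, which is the amplifying effect of *exp* described above.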

My main conclusion here is that although we may credit AI with some fundamental leap in human knowledge, and perhaps with finding “the intelligence algorithm”, we must be aware that many of the advances in AI we observe and the applications we interact with have just been an accumulation of tricks over our existing knowledge, including calculus, linear algebra, and an impressive amount of raw engineering. Moreover, given that these models are fueled with numbers to approximate real world behavior, they are prone to error. Generative hallucinations are the error you see in the outputs of these models when the model fails to approximate the problem. Amazon or Netflix recommending you a product that is outside your interests is also potentially an error in the model. At a fundamental level, we can still ascribe these errors to data, statistics, and math. You as a human certainly don’t determine if the world is flat or round based on probabilities, but an AI model does!

## Wrap up

We should look at current and future AI systems neither with mystical eyes nor as the science of alchemy. Rather, AI and machine learning imitate a problem imperfectly through statistics and mathematics that you are likely familiar with in some form. Whatever tools or technologies seemingly do an imperfect job today will certainly be improved upon, as long as they address a sufficiently important problem. As a result, it is important to expose the hype, sift through the hazy anecdotes, and fundamentally understand these technologies so we can derive business and practical value from them. AI has a substantial role in topics of grave importance to humans, including pernicious problems like disease management, identifying adverse risks in industry, and overall enriching the quality of our lives. And thus, expect us to be talking about and learning about AI far into the future.