Introduction
The mathematical world has undergone a significant transformation over the last 80 years. While many historians consider the 19th century the Golden Age of mathematics, the rise of artificial intelligence over the last several decades has shifted the landscape of mathematics, the life sciences, literature, art, and even the ways humans and society at large interact in a way we have never seen before. In this paper, I will discuss my takeaways from What Is Life by Erwin Schrödinger and The Computer and the Brain by John von Neumann, material covered in my undergraduate course on Pattern Theory, and my future as a mathematician carrying these ideas with me as a student, professional, and life-long learner.
What Hasn’t Changed
With The Computer and the Brain and What Is Life having been published in 1958 and 1944 respectively, a great deal of work has since been added to the foundation laid by von Neumann and Schrödinger. What makes these pieces of literature so important, however, is how far ahead of their time they were and the influence they had on future discoveries. To this day, What Is Life remains highly influential in many different fields, including biology, physics, and even philosophy. Schrödinger set the stage for how we think about information in terms of storage and purpose. In What Is Life, he considers the possibility that the hereditary material is an "aperiodic crystal", able to store and transmit information in a crystal-like way that is easily repeatable and consistent over time. While Schrödinger was ultimately incorrect about some of the physical details of the genetic material, his realization that information runs all the way through living systems, and is ultimately the reason for life itself, remains largely true to this day.
Schrödinger devotes a significant amount of time in What Is Life to entropy and to how living things manage to remain such complex entities in the face of the second law of thermodynamics. The concept of order-from-disorder is discussed in chapter 6, where Schrödinger analyzes negative entropy and the natural organization of things. Metabolism is ultimately the word Schrödinger uses to characterize the process living things rely on to avoid decay, or death (p. 70). He goes on to state that an organism "can only keep [alive]... by continually drawing from its environment negative entropy" (p. 71). Through the metabolic process, life goes on by consuming and expending energy, or calories. This energy is what enables animals, with humans being the subject of particular interest here, to maintain the high level of order that makes living things what we are at the most basic level: conscious, interconnected creatures of thought, emotion, and action. On page 73 of What Is Life, Schrödinger restates the definition of negative entropy as follows: "entropy, taken with the negative sign, is itself a measure of order". As he describes it, the key to order-from-disorder is the ability of living organisms to harness energy from their surroundings and use that energy to do something. That something can be almost anything, from muscles working together in the hunt of a wild boar to neurons firing in the brain while completing a problem set. Computers cannot do this, since they are not influenced by the surrounding environment in the same way. Schrödinger hit the nail on the head with his insight that information is stored in living things and that living entities shift entropy around in the universe. While this may seem trivial on the surface, the proofs and mathematics backing the position are far from it.
Out of the idea of order-from-disorder stems the concept of self-organization. For example, when a newborn baby hears things for the first time, everything is out of order. Over time, as the baby hears things over and over and associates words with objects or concepts, order arises. This high-level experience of self-organization is unique to humans. Computers do not experience order-from-disorder in the way biological systems, such as humans, do. Despite the fact that computers perform computations on massive amounts of data, they simply do not have the capacity for self-organization. While computers can handle disorder when given explicit rules for dealing with it, self-organizing systems are able to adapt and respond to changing environmental conditions in a flexible manner, which is not possible with the rule-based, algorithmic nature of computing.
The brain is the strongest self-organizing entity in known existence. We do not know the bounds to which the human brain can stretch. The brain has the ability to organize and establish its own structure and functionality by adapting and learning over time. Synapses, the connections between neurons, are always changing in response to the experiences the brain undergoes. In a similar fashion, neurogenesis allows the brain to adapt to changes in the surrounding environment and maintain cognitive function. This is much like an artificial neural network, which is constantly updating as it receives new information. The brain is able to form complex neural networks that allow for the efficient processing of information, constantly adapting and reorganizing in response to changes in the environment. Ultimately, the brain is self-organizing in that it exists in a state of equilibrium between order and disorder; it is stable, in that it is always able to perform, yet also flexible, in that it adapts to the inflow of new information carried by the electrical signals pulsing through it.
What Has Changed
The most obvious factor that has changed since these works were written is the sheer computational power of computers. In The Computer and the Brain, John von Neumann predicted that computers would not be able to outperform humans in strategic games. The computational power of artificial intelligence today has proven this incorrect; just look at who, or what, beat the best chess player in the world, now 25 years ago. Another essential topic discussed in The Computer and the Brain is how information is processed in computers. At the time, computers were still very weak tools compared to today's machines and were not capable of processing information in parallel in any helpful manner. Despite this, von Neumann recognized the potential for parallel processing in computers. He proposed several ideas for building parallel computing systems, and these ideas contributed to later developments such as multi-core processors and GPU accelerators. Neurons, the processing units of our brain, are reactionary entities that respond to inputs in an all-or-none style; their output is digital in nature because a neuron either fires or it does not. Despite this, the brain appears to be at a significant disadvantage when compared to a computer: the building blocks used to process information in even the first digital computers were much faster than neurons, and the computer is far more efficient at handling large amounts of precise information. The brain compensates because, as von Neumann observed, its neurons work massively in parallel, whereas computers of his era processed information serially, one fast and precise step at a time.
The rise of artificial intelligence and deep learning has completely reshaped how we think about the potential of computers. The capabilities of new AI systems such as GPT-4, which appear human-like due to their conversational nature, are very impressive. While von Neumann was very thorough in his mathematical analysis of computers and the brain, he does not touch on the ethical dilemmas that arise from rapid improvements in technology. One of the ethical issues associated with AI is bias. Machine learning algorithms are trained on massive data sets that reflect the biases of the people and societies that produced them. This can lead to systems that perpetuate the inequalities present in the data. For example, facial recognition technology has been shown to be less accurate when analyzing people with darker skin tones, which can lead to discriminatory outcomes in areas such as law enforcement and hiring.
The foundation Schrödinger laid in What Is Life set the stage for important work in chemistry and molecular biology. The structure of DNA and the role of nucleotide bases in heredity were mostly unknown when What Is Life was published, and it was not until 1953 that Watson and Crick explained the structure of DNA. The discovery of the double helix and the pairing of nucleotide bases was a massive step in the understanding of the genetic code. While Schrödinger's work did not directly address the structure of DNA, his thoughts on the transmission of genetic information helped lay the groundwork for future discoveries in genetics, such as the development of CRISPR gene-editing technology.
My Perspective
When I was first introduced to math, it seemed to me like a game: a race against the person sitting next to me, a race against the calculator, or whatever the context may have been. I have a distinct memory of my second grade teacher, Ms. Gronett, telling me that I needed to keep up the good work in my math studies. I would sit on the bean bags in the back corner of the classroom and churn out times tables practice sheets like it was my life's purpose. I found satisfaction in the objectivity of the answers I obtained, which was not the case in the other subjects we were being taught. I found comfort in the repetitive nature of math, and while I enjoyed social studies, English, and other subjects, I discovered that math spoke to me in a way that was unique.
Over the years, many different things have sparked my curiosity in math. When I first started learning math and understanding its use cases, it was baseball statistics and dinosaurs. My third grade self worked tirelessly to formulate some sort of statistic that could capture the value of a player. While my mathematical scope was obviously very limited at the time, I did not let this hinder my passion or my attempts to create some sort of solution to the problem of player valuation. After baseball came my love for basketball, which still persists strongly to this day. Having been a water polo player for the last 11 years, I have a deep appreciation and respect for sport and competition. However, water polo is the quintessential stats-don't-tell-the-whole-story sport. Unlike baseball, basketball, or even football, there is simply no way to control for all possible variables in water polo. Being a defensive-minded player throughout my career, starting when I was 11 and all the way up until this past season here at Brown, led me to this conclusion. The goal is simple: stop the man in front of you from scoring. Over the years and countless hours of training in the pool, my abilities have not been honed in a statistical manner the way a baseball or basketball player's might be. With this in mind, the computer-versus-the-brain component of this discussion comes into play. The level of nuance in each unique play in water polo means that statistical methods like Markov chain Monte Carlo, regression analysis, or hypothesis testing are less effective than a neural network would be for the person, or computer-led player in this hypothetical scenario. I believe that a computer-generated player could not pick up on the nuances of each situation like an actual player, who has observed similar, but still uniquely different, situations in the past and can therefore use all of their stored information in a way that applies to the scenario at hand. That information is accessed in more of a neural network-like fashion, by referencing very similar scenarios and adapting to the present moment.
Finally, I have entered a new stage of my personal interests over the last few years. During my last two years of high school, I realized that I need to use my energy in a way that produces unique, positive results. At first I was not sure how I would go about this, but after a summer internship in the Stanford biology department, I realized that this energy can best be channeled into trying to help the world by protecting and upholding nature. As when I first realized my passion for math, I naturally gravitated toward sustainability because it just makes sense: we need our planet for the survival of the human race, and we are actively destroying it! It is only logical for me to commit my life's work to changing the way people view the natural resources around us in order to drive change and reverse so many of the negative effects humans have had on the planet. I am tired of negative or zero-sum games, and I hope to help build a world where people can create things, come together, and reverse the damage we have done.
Agriculture
Let's start with food. In What Is Life, Schrödinger poses the question "How does the living organism avoid decay?" and answers: "eating, drinking, breathing, and (in the case of plants) assimilating" (p. 70). Every natural system on the planet is connected in one way or another. Consider agriculture: the sun, soil, water, plants, and animals all work in harmony, and each piece provides co-benefits for the entire system, much like the concept of the invisible hand in markets. If one of these players is out of balance, the entire system can be thrown off. Regenerative agriculture offers a way to address the extractive nature of traditional farming through holistic management techniques that produce compounding benefits for the farmland being worked. Neural networks and the Bayesian paradigm can both help to improve and optimize these strategies. The two methods can serve very similar roles in the management of regenerative agricultural strategies, but there are a few distinctions that differentiate them. They can be applied in a wide range of ways, such as predicting crop yield year over year, analyzing soil nutrients, pest detection and control, energy optimization, livestock management, animal health modeling, and climate modeling.
First, let's start with Markov chain Monte Carlo. MCMC is a powerful statistical tool that can be used to analyze massive data sets and generate probability distributions for uncertain parameters. In the context of regenerative agriculture, MCMC can aid some of the modeling tasks mentioned at the end of the previous paragraph. Let's look into a few of these.
Soil lies at the heart of regenerative agriculture, since its health is the most direct factor in growing the produce and feed needed for animals. Let's consider nitrogen, one of the most important nutrients driving healthy soil. Say we want to model the amount of nitrogen in the soil over time as crop tilling and cattle grazing strategies are implemented on a farm. We can set up the following formula for the change in soil nitrogen over time:
\[N_{t+1} = N_{t} + f(N_{t},P_{t}) - L(N_{t})\] In this equation, \(N_t\) is the amount of nitrogen in the soil at time \(t\), \(P_t\) represents the regenerative practices in place at time \(t\), \(f(N_t, P_t)\) is the amount of nitrogen added to the soil through those practices, and \(L(N_t)\) is the amount of nitrogen lost from the soil due to various processes, such as leaching or seasonal changes.
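As a minimal illustration of the forward model, the recurrence can be simulated directly once \(f\) and \(L\) are specified. The functional forms and numbers below are made-up placeholders standing in for real agronomic data, not measured values.

```python
# Minimal forward simulation of N_{t+1} = N_t + f(N_t, P_t) - L(N_t).
# All functional forms and parameter values are illustrative assumptions.

def f(N, P):
    """Nitrogen added per season (kg/ha) by the practices described in P."""
    return 10.0 * P["cover_crop"] + 5.0 * P["grazing"]

def L(N):
    """Nitrogen lost per season, assumed proportional to the current stock."""
    return 0.08 * N

practices = {"cover_crop": 1.0, "grazing": 0.5}  # hypothetical management plan
N = 120.0                                        # illustrative starting level (kg/ha)
for t in range(10):
    N = N + f(N, practices) - L(N)
    print(f"season {t + 1}: {N:.1f} kg/ha of soil nitrogen")
```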
In order to estimate the parameters of this model, namely the function \(f\) and the rate of nitrogen loss \(L\), we can use MCMC. First, we define prior distributions based on historical data and knowledge of the system we are working in. Then, given observed soil data, we use MCMC to sample from the posterior distribution of the parameters. For example, we could set up the following prior distributions
\[f(N_{t}, P_{t}) \sim \mathcal{N}(\mu_{f}, \sigma_{f}^{2})\] \[L(N_{t}) \sim \mathrm{Exp}(\lambda)\]
where \(\mu_{f}\) and \(\sigma_{f}\) are the mean and standard deviation of the distribution of nitrogen inputs, and \(\lambda\) is the rate of the exponential distribution of nitrogen loss. Then, we can use MCMC to sample from the joint posterior distribution of \(f\) and \(L\), given the data we have on soil nitrogen at different times. By sampling from this distribution, we can estimate the parameters of the model and make predictions about future soil nitrogen levels under the implemented regenerative management practices. It is important to note that this is certainly an oversimplification of a model that would be used in practice. While this can be an effective way to determine the effects of regenerative practices, let's now consider how neural networks can be used to analyze the same problem.
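Below is a minimal sketch of how such an analysis could look in code, using a hand-written random-walk Metropolis sampler. To keep it short, the nitrogen input \(f\) is treated as a single unknown amount per season and the loss as a constant fraction of the current stock; the observations, hyperparameters, and noise level are all illustrative assumptions rather than real field data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative "observed" soil nitrogen measurements (kg/ha), one per season.
obs = np.array([120.0, 118.5, 121.0, 123.5, 126.0, 127.0, 129.5])
N0, sigma_obs = obs[0], 2.0  # assumed known starting level and measurement noise

def simulate(f_in, loss, n_steps):
    """Run N_{t+1} = N_t + f - loss * N_t forward from N0."""
    N = np.empty(n_steps)
    N[0] = N0
    for t in range(1, n_steps):
        N[t] = N[t - 1] + f_in - loss * N[t - 1]
    return N

def log_posterior(f_in, loss):
    """Normal prior on f, exponential prior on the loss rate, Gaussian likelihood."""
    if loss <= 0:
        return -np.inf
    mu_f, sigma_f, lam = 10.0, 3.0, 10.0                # illustrative hyperparameters
    log_prior = -0.5 * ((f_in - mu_f) / sigma_f) ** 2 - lam * loss
    resid = obs - simulate(f_in, loss, len(obs))
    log_lik = -0.5 * np.sum((resid / sigma_obs) ** 2)
    return log_prior + log_lik

# Random-walk Metropolis over (f, loss).
samples, (f_cur, loss_cur) = [], (10.0, 0.1)
lp_cur = log_posterior(f_cur, loss_cur)
for _ in range(20000):
    f_prop = f_cur + rng.normal(scale=0.5)
    loss_prop = loss_cur + rng.normal(scale=0.01)
    lp_prop = log_posterior(f_prop, loss_prop)
    if np.log(rng.uniform()) < lp_prop - lp_cur:        # accept/reject step
        f_cur, loss_cur, lp_cur = f_prop, loss_prop, lp_prop
    samples.append((f_cur, loss_cur))

f_draws, loss_draws = np.array(samples[5000:]).T        # discard burn-in
print(f"posterior mean nitrogen input: {f_draws.mean():.2f} kg/ha per season")
print(f"posterior mean loss fraction:  {loss_draws.mean():.3f} per season")
```

In practice one would more likely reach for an established sampler (for example from PyMC or Stan) and a richer model rather than writing Metropolis by hand, but the underlying logic is the same.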
There are several advantages to implementing a neural network in this scenario rather than using MCMC. A neural network may be better for a few reasons: efficiency, non-linearity, and generalization benefits. Neural networks tend to operate more efficiently than MCMC algorithms since they can parallelize computation over a large data set. Secondly, neural networks are better equipped to model complex, non-linear relationships between inputs and outputs, and it is often the case in agriculture that many factors influence soil nutrient levels. Thirdly, neural networks can be trained on a subset of data and then used to make predictions on new data, while MCMC usually requires fitting a specific model to all data points each time it is used.

In terms of setting up the neural network, an input, hidden, and output layer are established, and then the network is trained and tested on accessible historical data. The input layer consists of nodes representing the environmental factors that affect soil nitrogen levels, such as soil pH, rainfall, temperature, the regenerative strategies applied, and more. The hidden layer is made up of nodes that perform calculations on the input data to generate new features relevant for predicting the output; for example, the hidden layer could include nodes that pick up on the rate of nitrogen fixation or the amount of nitrogen leached from the soil. The output layer consists of nodes that give the predicted nitrogen levels in the soil after implementing the chosen regenerative practices. Using these layers as the foundation, the neural network is trained on a set of input-output pairs representing the relationship between the environmental factors in play and the soil nitrogen levels. The weights of the nodes are adjusted during training to minimize the disparity between predicted and actual nitrogen levels in the historical data. After the network is trained, it can be tested on new input data to predict soil nitrogen levels. The key to the strength of the neural network is the non-linearity of soil health and the way all the different variables interact in unique ways, as things do in a living, breathing system. A neural network is more likely to capture these complex interactions and make accurate predictions about the impact of regenerative agriculture on nitrogen levels. Neural networks are also more practical than MCMC in this case because of how computationally intensive MCMC can be and how much fine-tuning of the algorithm it requires to generate strong results in practice.
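As a sketch of that setup, here is a small feedforward network built with scikit-learn's MLPRegressor and trained on synthetic data. The feature set, the made-up relationship generating the targets, and the network size are all illustrative assumptions rather than a tested design.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Illustrative synthetic inputs: soil pH, seasonal rainfall (mm), mean temperature (C),
# and a 0/1 flag for whether a regenerative practice (e.g. cover cropping) was applied.
n = 500
X = np.column_stack([
    rng.uniform(5.5, 7.5, n),
    rng.uniform(200, 900, n),
    rng.uniform(8, 25, n),
    rng.integers(0, 2, n),
])
# Made-up non-linear relationship standing in for "true" soil nitrogen (kg/ha).
y = (60 + 15 * X[:, 3] + 0.03 * X[:, 1] + 5 * np.sin(2 * X[:, 0])
     - 0.2 * (X[:, 2] - 16) ** 2 + rng.normal(0, 3, n))

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Inputs are standardized, then passed through one hidden layer of 32 units;
# backpropagation adjusts the weights to minimize squared prediction error.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)
print(f"held-out R^2: {model.score(X_test, y_test):.2f}")
```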
It is critical to the accuracy of the neural network that it has a proper depth in order to avoid overfitting the data. As the depth of a neural network increases, the number of parameters also increases. As a result, the network learns more complex representations of the data and is able to capture more nuanced relationships. Deeper networks can be more difficult to train, since they naturally require more data and longer training time to produce results. They can also suffer from vanishing gradients, which can make training ineffective. If the network is too deep, it can become too specialized to the training data and perform poorly when analyzing new data. To minimize the potential damage of overfitting, it is imperative to implement strategies such as regularization and early stopping.
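With the scikit-learn estimator used in the sketch above, both of those safeguards are available as constructor options; the specific values below are illustrative rather than tuned.

```python
from sklearn.neural_network import MLPRegressor

# alpha sets the strength of the L2 penalty on the weights, and early_stopping
# holds out part of the training data, halting training once the validation
# score stops improving for n_iter_no_change consecutive iterations.
regularized_net = MLPRegressor(
    hidden_layer_sizes=(32, 32),   # modest depth rather than a very deep stack
    alpha=1e-3,                    # L2 regularization strength (illustrative)
    early_stopping=True,
    validation_fraction=0.15,
    n_iter_no_change=20,
    max_iter=5000,
    random_state=0,
)
```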
In What Is Life, Schrödinger conveys that living organisms appear to be able to decrease their entropy locally and maintain a higher degree of order than their surroundings. In regenerative agriculture, the goal is to maintain and improve soil health through practices such as cover cropping, crop rotation, and reduced tillage. These practices have positive impacts on the surrounding environment, such as increased biodiversity, better soil structure, and improved carbon sequestration, all of which can be seen as forms of negative entropy in the system. Using this framework, regenerative agriculture can be seen as decreasing disorder, or local entropy, in these ecosystems. Schrödinger would almost certainly view regenerative agriculture as a positive example of how living systems can use energy and information to decrease local entropy and maintain a higher degree of order within the laws of thermodynamics. Everything is connected.
Language and Understanding
In addition to food systems, language, and how people perceive and understand it, is another topic I have a new perspective on based on what we have covered this semester. A previous decoding project was a very interesting exercise in that it made me think about how we as humans understand and articulate language. I have a new-found perspective on the way babies are influenced by the speech around them and convert this information into their knowledge foundation.
Let's consider how a baby begins to absorb sounds, and eventually words, phrases, and sentences, into its memory. In the decoding project, the 'mother text' used to train the program and assign probabilities to letter locations was Moby Dick. While that text is lengthy as far as books go, at over 200,000 words, the number of words children are exposed to is in a whole different ballpark. According to a study done by Betty Hart and Todd Risley, children hear on the order of tens of millions of words in their first few years of life. As babies become able to truly understand words and associate them with objects or thoughts, the number of words that small children register explodes along with the growth of the brain. By the time a child is three years old, the brain has already reached about 80 percent of its adult size, and it reaches roughly 90 percent by age five.
Unlike the brain, computers are rigid in structure and do not undergo a growth period like a baby's brain does. In the case of text decryption, the computer already has adequate storage and processing capabilities to generate probability distributions for a letter or word appearing before or after another letter. The fact that the program's training takes in only letter and word distributions, and not the manner in which things are said or the way the words are delivered, makes the way computers understand language objectively inferior to that of a human when the information is given in text form. The way in which something is said carries significant weight in truly understanding the purpose or intention behind it.
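To make the kind of distribution the program relies on concrete, here is a minimal sketch of how letter-transition probabilities might be estimated from a mother text; the file name is a placeholder, and this shows only the counting step, not the decryption itself.

```python
import string
from collections import defaultdict

ALPHABET = string.ascii_lowercase + " "

def transition_probs(text):
    """Estimate P(next letter | current letter) from a 'mother text'."""
    # Lowercase the text and collapse anything outside the alphabet to a space.
    cleaned = "".join(c if c in ALPHABET else " " for c in text.lower())
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(cleaned, cleaned[1:]):
        counts[a][b] += 1
    probs = {}
    for a, row in counts.items():
        total = sum(row.values())
        probs[a] = {b: c / total for b, c in row.items()}
    return probs

# 'moby_dick.txt' is a placeholder path for whatever mother text is used.
with open("moby_dick.txt", encoding="utf-8") as fh:
    probs = transition_probs(fh.read())

# In English text, 'u' should dominate after 'q' -- the kind of regularity a decoder exploits.
print(sorted(probs["q"].items(), key=lambda kv: -kv[1])[:3])
```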
There are many things we can control in life. However, it takes a certain level of knowledge and ability to have control over something. In this light, it is natural that small children do not control much, since their parents, or an equivalent, are meant to take care of them and provide them with the things they need to survive. Babies cannot control the things being said around them that register in their brains. Essentially, the 'mother text' that the baby is using to train its brain is not chosen under its own jurisdiction. I find it fascinating that during the period when the brain is developing the most, the person carrying that brain has no control over the factors most strongly influencing its development. So what does this mean?
Humans are social creatures. Furthermore, we are a product of our environment. Even though that is something everyone has heard at some point in their life, it is difficult to fully internalize what it means and just how significant it is, especially as it pertains to the perception of others. My big takeaway from thinking about how much of who we are is shaped by things we have no control over is that people need to be more sympathetic and understanding of the fact that everyone's brain has developed in its own unique way. Everyone has their own 'mother text', and some people are dealt a poor hand in this regard by factors completely out of their control, such as the family they are born into, where they live growing up, and the communicative nature of the people around them during the early stages of brain development. Essentially, we can only control so much of who we are: a large part is scripted by where we happen to show up in the world. Let us not forget that we are truly a product of our environment, and that it is impossible to truly understand the environments others have been in. Let us be more understanding of one another, listen more deeply, and realize that there is only so much we can know about someone. Do not assume too much. Give others the benefit of the doubt. A certain amount of all of us is literally our environment.