Understanding Probabilities in Investing
An important yet often overlooked aspect of investing is that while we are all aware of the high uncertainty involved, we seldom examine the probability distributions.
Let's consider three scenarios:
1. A stock has a 30% probability of gaining 50% and a 70% probability of gaining 40%, resulting in an expected return of 43%.
2. A stock has a 50% probability of gaining 50% and a 50% probability of gaining 30%, resulting in an expected return of 40%.
3. A stock has a 30% probability of gaining 60% and a 70% probability of gaining 15%, resulting in an expected return of 28.5%.
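The arithmetic behind these expected returns is just probability-weighted averaging. A quick sketch, using the probabilities and returns from the three scenarios above:

```python
# Expected return = sum over branches of probability × return.
# The probabilities and returns below are the three scenarios from the text.
scenarios = {
    "1": [(0.30, 0.50), (0.70, 0.40)],
    "2": [(0.50, 0.50), (0.50, 0.30)],
    "3": [(0.30, 0.60), (0.70, 0.15)],
}

expected = {name: sum(p * r for p, r in branches)
            for name, branches in scenarios.items()}
worst = {name: min(r for _, r in branches)
         for name, branches in scenarios.items()}

for name in scenarios:
    print(f"Scenario {name}: expected {expected[name]:.1%}, worst case {worst[name]:.0%}")
```

Listing the worst case next to the expectation is exactly the shift in perspective the text argues for: the distribution, not just its mean.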
If you show me the expected returns, I would prefer the first one. However, if you show me the distributions, I would prefer the third one. Why? Because in a world with high uncertainty, the only way to survive consistently is to ensure that you can endure the worst-case scenario. Decisions based on distributions allow us to better manage downside risk.
Probability distributions also provide a perspective on how the world operates.
Many historical developments follow what is known as a power law distribution, while others fit a normal distribution. Both are beautiful curves, but their meanings are entirely different. Take human lifespan: it is well described by a normal distribution. The key feature is not that values near the average are the most frequent, but that the boundaries are sharp: whatever the expected lifespan, no one lives to twice the average. Height and weight follow a similar logic; if the average height is 170 cm, nobody will be 3.4 meters tall (conditions like gigantism aside).
Many situations involving network effects and dissemination follow a power law distribution: follower counts on social networks, video view counts, the basic reproduction number R₀ of an infectious disease. Power law distributions have a notable characteristic: extreme values occur far more often than one would expect.
Probability distributions themselves tell us nothing directly about the world, but if we observe that a phenomenon fits a certain distribution, we can infer a great deal about its underlying logic. For example, if something follows a normal distribution, the quantity is plausibly the sum of many small, independent contributions, which is exactly the setting in which the central limit theorem applies.
Consider an example of applying probability distributions to a real problem: the notion of "smart money." If we see that many hedge funds have bought a stock, we might decide to buy it too. But if everyone were buying randomly, what distribution of returns would we get? The person who bought most "accurately" may simply be the one who happened to bet correctly. If the market really were a near-random world, returns should be roughly normally distributed, and individuals earning twice the average return should be vanishingly rare. The same questioning applies elsewhere: does the probability of a person living to 70 depend on whether they have already lived to 60? If a fund manager has posted excellent returns in the past, what return distribution should we expect going forward?
If we can integrate probability distributions, the ideal scenario would actually be to consider Bayesian probability, which suggests that we can never know the exact probability of rolling a 1 on a die, but our understanding of that probability will change as we observe more data.
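A minimal sketch of that Bayesian picture: model your belief about the chance of rolling a 1 as a Beta distribution and update it with each observed roll. The uniform Beta(1, 1) prior and the simulated die below are illustrative choices, not anything from the text:

```python
# Bayesian updating sketch: we never "know" the true probability of rolling
# a 1, but a Beta prior updated with observed rolls converges toward it.
import random

random.seed(0)
alpha, beta = 1.0, 1.0      # Beta(1, 1): uniform prior on P(roll == 1)
true_p = 1 / 6              # the (unknown to us) true probability

for n in range(1, 10_001):
    rolled_one = random.random() < true_p
    if rolled_one:
        alpha += 1          # one more observed success
    else:
        beta += 1           # one more observed failure
    if n in (10, 100, 1000, 10_000):
        print(f"after {n:>5} rolls, posterior mean = {alpha / (alpha + beta):.3f}")
```

The posterior mean drifts toward 1/6 as data accumulates, which is exactly the "our understanding changes as we observe more" idea.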
Regarding the Kelly Criterion, there's still a question that remains unclear: how do we estimate the profit multiple b and the win probability p?
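For reference, the Kelly formula itself is simple once b and p are somehow estimated; the estimation is the hard part the question raises. A sketch with made-up numbers:

```python
# Kelly Criterion: bet the fraction f* = (b*p - q) / b of your bankroll,
# where b is the net profit multiple on a win, p the win probability,
# and q = 1 - p. The inputs below are purely illustrative.
def kelly_fraction(b: float, p: float) -> float:
    q = 1 - p
    return (b * p - q) / b

# e.g. an even-money bet (b = 1) with an estimated 55% win probability:
print(f"bet {kelly_fraction(b=1.0, p=0.55):.0%} of bankroll")
```

Note how sensitive the answer is to p: at p = 0.5 the formula says bet nothing, which is why a small error in estimating p can turn a "good" bet into a ruinous one.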
Mean Reversion and "Luck"
The first tool we can use is mean reversion.
Trees don't grow to the sky. A coin toss can't always land heads up.
This is easy to understand.
However, what's difficult to grasp is why, when each coin toss has a 50% chance, it seems like the probability of getting heads decreases after several consecutive heads.
Let's forget about calculating the binomial distribution of a Bernoulli trial for a moment. Can we truly understand this intuitively?
Is it because the coin is calculating its own luck, using a bit more with each toss until it's all used up?
The issue actually lies with that 50%. Ask a schoolchild the probability of a coin landing heads up, and they'll tell you it's 50%. But where does this number come from? What does it mean? Does the world split into two at the moment of the toss, with one person seeing heads and another seeing tails?
Any probability textbook will tell us:
1. Toss the same fair coin without controlling how it's tossed or bumping it (there are coins with one side heavier, where the probability of landing heads isn't 50%; people have even created machines that ensure the coin lands heads up every time).
2. Keep tossing, and count how many times it lands heads out of every hundred tosses.
3. You'll find that as the number of tosses increases, the number of heads in 100 tosses gets closer to 50.
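The three steps above can be simulated directly; the proportion of heads settles toward 50% as the count grows (the seed and sample sizes are arbitrary choices):

```python
import random

random.seed(42)

# Simulate the textbook procedure: toss a fair coin many times and
# watch the proportion of heads settle toward 50%.
proportions = {}
for n in (100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    proportions[n] = heads / n
    print(f"{n:>9} tosses: {proportions[n]:.4f} heads")
```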
So, it's not that the coin is counting, but rather that when we say a coin has a 50% probability of landing heads, we've already counted in advance!
Why does it seem like the probability of getting heads decreases after several consecutive heads when each toss is a 50% chance?
This is a classic example of the gambler's fallacy.
As long as the basic physical conditions remain unchanged, the coin's tendency to land heads or tails won't change; the coin itself has no memory.
There's no luck that gets used up, but when you get eight heads in a row, you should consider yourself lucky.
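The "no memory" claim can be checked empirically: among tosses that come immediately after three heads in a row, heads still appears about half the time. The simulation parameters below are arbitrary:

```python
import random

random.seed(1)

# One million simulated tosses of a fair coin (True = heads).
tosses = [random.random() < 0.5 for _ in range(1_000_000)]

# Keep only the tosses that come right after three consecutive heads.
after_streak = [tosses[i] for i in range(3, len(tosses))
                if tosses[i - 3] and tosses[i - 2] and tosses[i - 1]]
p_after = sum(after_streak) / len(after_streak)
print(f"P(heads | three heads in a row) ≈ {p_after:.3f}")
```

Conditioning on the streak changes nothing: the coin isn't keeping score.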
Reversion occurs because there is a mean. Although there is randomness, the random component (the part left after subtracting the mean) still stays within a "spectrum"; things won't get too outrageous.
No matter what, we're still tossing the same coin, and we believe that the factors determining heads or tails at different times and places are roughly the same (physicists call this symmetry).
However, the idea that "things generally won't be too outrageous" isn't a natural law like Newtonian mechanics; it merely comes from our definition of the mean. It's hard to say whether the existence of a fixed mean is due to some properties of the coin or our perspective.
So when do things deviate from the "spectrum"?
Several scenarios:
1. Trends. For example, China's GDP in recent years. This won't revert to the GDP of the Qing Dynasty. Although both are called GDP, the economy after industrialization and urbanization is not the same as a small-scale peasant economy.
2. Discontinuities. The average mortality rate before and after the discovery of the smallpox vaccine is completely different.
3. The random component can accumulate and won't dissipate over time.
The third scenario is also known as a random walk because this process is like a drunkard walking, staggering to the point where you don't know where they've gone, and they certainly won't come back.
In academic textbooks, stock prices are the classic example of a random walk with no stable mean, which is why no one can predict future prices from past prices alone.
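A sketch of such a random walk: each step adds an independent shock that never dissipates, so the typical distance from the start keeps growing (roughly as the square root of the number of steps) and there is no mean pulling the walker back:

```python
import random

random.seed(7)

def walk(steps: int) -> int:
    """Simple random walk: each step is an independent ±1 shock."""
    position = 0
    for _ in range(steps):
        position += random.choice((-1, 1))
    return position

# Run many walks and measure the typical (root-mean-square) endpoint.
endpoints = [walk(10_000) for _ in range(200)]
spread = (sum(x * x for x in endpoints) / len(endpoints)) ** 0.5
print(f"typical distance from start after 10,000 steps: ~{spread:.0f}")
```

The measured spread comes out near sqrt(10,000) = 100, and it would keep growing with more steps: the accumulated randomness never averages itself away.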
If a drunkard strays too far, they might fall into a ditch. Can a company's value really be infinite?
Coincidence
Mean reversion implies that coincidences are hard to come by.
But coincidences do occur. Just how small is the probability of a coincidence happening?
Is flipping a coin ten times and getting heads eight or more times a coincidence?
It seems reasonable to say so.
How small is the probability of getting eight or more heads?
Let's calculate:
P(at least 8 heads in 10 tosses) = 5.47%
P(at least 16 heads in 20 tosses) = 0.59%
P(at least 24 heads in 30 tosses) = 0.07%
It turns out to be more than 5%! Out of a hundred people, five or so might see eight or more heads in ten tosses. Only as the sample size grows does the proportion of "80% heads" outcomes shrink.
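These "at least 80% heads" tail probabilities can be computed exactly from the binomial distribution; a short sketch:

```python
# P(at least k heads in n fair tosses) = sum of C(n, i) / 2^n for i = k..n.
from math import comb

def p_at_least(k: int, n: int) -> float:
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

for k, n in ((8, 10), (16, 20), (24, 30)):
    print(f"P(≥{k} heads in {n} tosses) = {p_at_least(k, n):.4%}")
```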
In small samples, coincidences are far more common than we imagine. People are especially prone to inventing explanations for things (which is why success narratives and conspiracy theories are so popular), so we easily overlook sample size and read patterns into coincidence. In the face of coincidences, we easily lose sight of the "spectrum."
Subjective Probability Theory Developed in Gambling
How can we eliminate the influence of coincidences on our perceptions?
First, we need to re-evaluate our understanding of probability.
No one in the world can find a coin with a 50% probability of landing heads. To prove a probability of 50%, one would need to flip the coin an infinite number of times. Then this poor fellow would have to spend their entire life flipping coins, and when they die, their descendants would have to take over.
Why infinite times? Because we can't say that 1,000 or 10 million times is good enough. Imagine a rather grim scenario: a farm has a pig (assuming this pig can live forever if not slaughtered). If this pig were to predict the probability of being slaughtered tomorrow, what would it think?
But this doesn't prevent us from believing that the probability of a coin landing heads is 50%. We can consider probability not as something measurable in the real world (like some attribute of the coin), but as a perception in our minds, similar to the utility discussed before.
We constantly update our perceptions of an event based on new information. Although the poor pig can't figure out the probability of being slaughtered one day, the farmer can do the following:
1. Record how many days each pig lives before being slaughtered.
2. Sum the days and compute an average (or better yet, draw a bar chart of pig lifespans, short-lived pigs on the left, long-lived pigs on the right).
3. See how far a living pig is from this average.
The farmer can not only calculate probabilities and have a sense of the spectrum but also consider opening a farm-themed casino where the bet is that if a pig lives past 800 days, you win 100 bucks. From how much someone is willing to pay to participate in this game, we can understand what probability they believe in their mind for a pig living past 800 days.
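The farmer's procedure, sketched with made-up slaughter records (all numbers below are purely hypothetical):

```python
# Hypothetical slaughter records: age in days at slaughter for ten pigs.
slaughter_days = [620, 710, 750, 780, 800, 810, 820, 850, 900, 950]

# Step 2: the average slaughter age.
mean_days = sum(slaughter_days) / len(slaughter_days)

# Empirical probability that a pig lives past 800 days,
# and the fair price of the "win 100 if it passes 800 days" bet.
p_past_800 = sum(d > 800 for d in slaughter_days) / len(slaughter_days)
fair_price = p_past_800 * 100   # expected payout of the bet

print(f"mean slaughter age: {mean_days:.0f} days")
print(f"empirical P(lives past 800 days): {p_past_800:.0%}")
print(f"fair price for the 100-buck bet: {fair_price:.0f}")
```

Anyone willing to pay more than the fair price implicitly believes in a higher probability than the records suggest; that gap between price and frequency is the "probability in the mind" the text describes.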
Information and Conditional Probability
If we previously found that pigs were slaughtered after 800 days, and one day a pig was slaughtered after just 500 days, how does the probability of the next ten pigs living past 800 days change?
Perhaps the farmer's taste has changed (assuming the farmer wouldn't slaughter early just to win bets), and suddenly wants to eat something tender. Will pigs now always be slaughtered around 500 days?
But is there still a mean for slaughter days, like some humanitarian regulation requiring pigs to first enjoy a happy youth? Will slaughter days revert to the mean?
So for the bet where you win 100 bucks if a pig lives past 800 days, how much are you willing to pay?
If you've read this far and remember a sentence from the coincidence section, you'll find it has come true again.
We seem to be coming up with a causal explanation for a phenomenon once more.
But if probability is defined this way, it becomes an extremely subjective matter, dependent on whether a person is optimistic or pessimistic, whether they think the pig's lifespan is long or short, and how large or small they believe the mean reversion effect to be.
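One way to make this subjectivity concrete is to represent each person's belief as a Beta distribution updated by every observed pig. The optimistic and pessimistic priors below are illustrative assumptions, not anything from the text:

```python
# Belief about P(a pig lives past 800 days), held as Beta(alpha, beta)
# pseudo-counts; the posterior mean is alpha / (alpha + beta).

def posterior_mean(alpha: float, beta: float) -> float:
    return alpha / (alpha + beta)

# An optimist and a pessimist start from different (hypothetical) priors.
optimist = [8.0, 2.0]    # prior mean 0.80
pessimist = [2.0, 8.0]   # prior mean 0.20

# Both observe the same event: one pig slaughtered at 500 days (a "failure").
for belief in (optimist, pessimist):
    belief[1] += 1

print(f"optimist now believes  {posterior_mean(*optimist):.2f}")
print(f"pessimist now believes {posterior_mean(*pessimist):.2f}")
```

The same evidence moves both beliefs in the same direction, but they remain far apart: probability as perception, shaped but not dictated by the data.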