The Physical World of Normal Distributions and the Spiritual World of Hypergeometric Distributions

Doesn't it always feel like there's a bit of a disconnect somewhere?

- because half of the above quote is made up by the author.

If human understanding of vectors implies a desire to master and even transcend the limitations of space and time, then human study of the normal distribution must symbolize a desire to control and even escape from the mercy of fate.

The normal distribution, also known as the Gaussian distribution, is one of the most important cornerstones of science. Its name comes from the "Prince of Mathematics", Johann Carl Friedrich Gauss. (It's not called a positive distribution, though.)

The normal distribution will be no stranger to those who study science and engineering. With the spread of statistics into the social sciences and even the humanities (as well as the so-called "big data" boom), the application of the normal distribution has broken out of the traditional physical world and pushed directly into the spiritual world of mankind: from machine learning to neuroscience, and from behavioral psychology to ideology and philosophy.

The normal distribution has too many elegant properties and wide-ranging applications to mention here.

The most important of these is the Central Limit Theorem, which states, roughly, that the sum (or mean) of a large number of independent random variables with finite variance is approximately normally distributed, whatever the distribution of the individual variables.
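As a rough illustration, the theorem can be watched at work in a few lines of Python. This is only a sketch: the choice of uniform random numbers, 50 terms per mean, the seed, and the name sample_means are all assumptions made for the example, not anything from the theorem itself.

```python
import random
import statistics

def sample_means(n_terms, n_samples, seed=0):
    """Return n_samples means, each taken over n_terms independent uniform(0, 1) draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.random() for _ in range(n_terms))
        for _ in range(n_samples)
    ]

means = sample_means(n_terms=50, n_samples=20000)
mu = statistics.fmean(means)        # should sit near 0.5
sigma = statistics.pstdev(means)    # should sit near sqrt(1/12/50)

# For a normal distribution, roughly 68% of values lie within one
# standard deviation of the mean; the sample means behave the same way
# even though each individual draw is uniform, not normal.
within_one_sigma = sum(abs(m - mu) <= sigma for m in means) / len(means)
print(mu, sigma, within_one_sigma)
```

Replacing rng.random() with some other distribution of finite variance should give much the same picture, which is the whole point of the theorem.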

As an aside: although this is how it is used in reality, the two most fundamental conditions, large numbers and independence, have been virtually impossible to satisfy ever since the dawn of biology.

Higher math, calculus, higher algebra, higher geometry, probability and statistics, and so on, what's the big deal?

One word: limits.

- This must be the greatest invention that the Enlightenment brought to mankind. Unfortunately it is almost exclusively used by the scientific community, so perhaps the word 'discovery' would be more accurate.

The hypergeometric distribution is also one of the most basic distributions in probability theory, but it is much less widely used than the normal distribution. In fact, it is used almost exclusively in connection with sampling without replacement, such as the pass rate in product sampling inspection, or the odds of winning in Texas Hold'em poker.

It can also be approximated by the normal distribution under certain conditions - essentially an application of the Central Limit Theorem mentioned above.
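A numerical sketch of this approximation follows. The numbers are made up for illustration (a population of 1000 items, 400 of them "successes", and a sample of 100), and hypergeom_pmf and normal_pdf are helper names introduced here. The "certain conditions" are, loosely, that the population and sample are large and the share of successes is not close to 0 or 1.

```python
from math import comb, exp, pi, sqrt

def hypergeom_pmf(k, N, K, n):
    """P(exactly k successes) when n items are drawn without replacement
    from N items of which K are successes."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

N, K, n = 1000, 400, 100                                 # illustrative values only
mu = n * K / N                                           # hypergeometric mean
var = n * (K / N) * (1 - K / N) * (N - n) / (N - 1)      # hypergeometric variance
sigma = sqrt(var)

# Largest pointwise gap between the exact probabilities and the normal curve.
max_gap = max(
    abs(hypergeom_pmf(k, N, K, n) - normal_pdf(k, mu, sigma))
    for k in range(n + 1)
)
print(mu, sigma, max_gap)
```

With a tiny population, such as a handful of cards, the same comparison breaks down, which is where the rest of this essay is heading.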

The reason why the author relates the hypergeometric distribution to the world of the human mind is that its application, or rather its definition from the very beginning, is more probabilistic and thought-experimental than statistically observational and analytical.

Of course, the greatness of the normal distribution lies in doing both.

Allow me to recapitulate the reasons why probability theory is more abstract than statistics in a slightly folkloric way.

Statistics says: a fair coin was tossed 1,000 times and came up heads 461 times, so the frequency of heads is 0.461; if it is tossed 10,000 times or more, this frequency will get closer and closer to 0.5.

Probability says: for a fair coin, the probability of heads on every toss is 1/2.

In other words, probability theory is a priori and statistics is a posteriori.
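The two viewpoints can be put side by side in a short simulation. This is only a sketch: the seed and the helper name heads_frequency are assumptions, and the probability 1/2 is written into the code in advance, exactly as probability theory would have it, while the frequency is what statistics observes afterwards.

```python
import random

P_HEADS = 0.5  # the a priori probability of a fair coin

def heads_frequency(n_tosses, seed=0):
    """Simulate n_tosses fair coin tosses and return the observed frequency of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < P_HEADS for _ in range(n_tosses))
    return heads / n_tosses

for n in (1_000, 10_000, 100_000):
    print(n, heads_frequency(n))
```

The printed frequencies wander around 0.5 and, as a rule, wander less as the number of tosses grows.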

Incidentally, the thing that links the two is Bayes' theorem.

Regardless of what the real world is like, human descriptions of the physical world are often normally distributed.

When we estimate a distance, such as measuring the length of a table with a centimeter ruler, we have to estimate the millimeters. When we aim at a target to shoot, throw, or kick a ball, as when marksmen compete at the Olympic Games, each of us hopes to hit the bull's-eye. Shooting percentage, betting odds, per capita income, life expectancy, and so on: all of these rest on the concept of an expectation (that is, the average). The two conditions needed for this concept to be meaningful are precisely: 1. the probability distribution has only one peak; 2. the probability distribution is almost symmetrical. The normal distribution is the most convenient and durable of the eligible probability distributions.

As for the physical world itself, given the randomness of microscopic particles and the large numbers involved in macroscopic observation, at least in classical physics a normal distribution is enough to handle the uncertainty of a whole process, such as estimating and controlling a rocket's flight into space.

However, suppose the rocket carries an astronaut who can unlock a panic button (perhaps to launch a nuke at an asteroid that cannot be dodged) by entering a 4-digit code, and under the stress of the situation he remembers only the first 2 digits, so he has to enter the last 2 at random. He might get it wrong several times, and will keep trying until he succeeds or dies trying.

Throughout the process, successive attempts are not independent of one another (a code that has already failed need not be tried again), and there are only 100 possibilities for the missing digits, which is not a lot, so there is no room for a normal distribution to step in.
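The arithmetic behind the astronaut's predicament can be sketched as follows, under the assumption that all 100 endings are equally likely. The two helper names are introduced here for illustration: one models an astronaut who never repeats a failed code (draws without replacement, so attempts depend on each other), the other an astronaut who guesses afresh every time (independent draws).

```python
from fractions import Fraction

CODES = 100  # possible values of the two forgotten digits, 00 to 99

def p_within_no_repeat(m):
    """P(code found within m tries) if failed guesses are never repeated."""
    return Fraction(min(m, CODES), CODES)

def p_within_with_repeat(m):
    """P(code found within m tries) if every try is an independent random guess."""
    return 1 - Fraction(CODES - 1, CODES) ** m

# Without repeats, success is equally likely on each of tries 1..100,
# so the expected number of tries is the average of 1..100.
expected_tries_no_repeat = Fraction(sum(range(1, CODES + 1)), CODES)

print(p_within_no_repeat(10))             # 1/10
print(float(p_within_with_repeat(10)))    # slightly below 0.1
print(expected_tries_no_repeat)           # 101/2
```

Either way the outcome is a choice among 100 discrete options, and nothing in it resembles a bell curve.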

Similarly, in the process of playing Texas Hold'em, holding the J, Q, K, and A of spades, the probability that the next card will be the 10 of spades is also not related to the normal distribution.
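For the record, the poker probability is easy to write down. The sketch below assumes that the four spades are the only cards whose identity is known, so 48 cards remain unseen and each is equally likely to come next; the helper name p_hit is introduced here for illustration.

```python
from fractions import Fraction
from math import comb

DECK = 52
SEEN = 4                   # J, Q, K, A of spades
UNSEEN = DECK - SEEN       # 48 cards whose position is unknown

def p_hit(outs, draws, unseen):
    """P(at least one of `outs` wanted cards appears among `draws` cards
    dealt without replacement from `unseen` unknown cards)."""
    return 1 - Fraction(comb(unseen - outs, draws), comb(unseen, draws))

print(p_hit(1, 1, UNSEEN))   # 1/48: the next card is the 10 of spades
print(p_hit(1, 2, UNSEEN))   # 1/24: it shows up within the next two cards
```

This is a hypergeometric calculation with a single "success" in the population: a yes-or-no question over a small, fixed set of cards.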

The author suggests that hypergeometric distributions reflect conceptual perceptions such as strategic choices - they are subordinate to the mental world and are characterized by an either/or choice among a small number of options.

The biggest difference between a hypergeometric distribution and a normal distribution is that it has considerable skewness, especially when the sample base is small.
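The skewness can be computed from the standard closed-form expression for the hypergeometric distribution. In the sketch below the function name and the two parameter sets are illustrative choices: the first is the single-card poker draw (48 unseen cards, 1 wanted card, 1 draw), the second a large, fairly balanced sampling problem.

```python
from math import sqrt

def hypergeom_skewness(N, K, n):
    """Skewness of the number of successes when n items are drawn without
    replacement from N items of which K are successes (closed form)."""
    numerator = (N - 2 * K) * sqrt(N - 1) * (N - 2 * n)
    denominator = sqrt(n * K * (N - K) * (N - n)) * (N - 2)
    return numerator / denominator

small = hypergeom_skewness(48, 1, 1)        # one card from a short deck
large = hypergeom_skewness(1000, 400, 100)  # big, balanced sample

print(small, large)
```

The small case comes out strongly skewed, while the large one is close to the zero skewness of a normal distribution.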

There is a classic experiment in which subjects read a passage from a mystery novel, list all the suspects they see, and label each with the probability of being the murderer.

The experimenters found that almost none of the subjects' lists of suspects summed to 100% - which is clearly illogical.

But was this really just because the subjects lacked or ignored this most basic of statistical truisms?

When the subject estimates the probability of a suspect, he instinctively realizes the uncertainty of the value itself. That is, the value itself is a random number that has its own probability function describing its distribution.

And when we are asked to write down just one value to represent this function, we instinctively use the mode (the value at the peak of the probability distribution), not the mean.

- The lesson behind this instinct is that when we draw from a population that follows this distribution, the number obtained is more likely to be close to the mode.

But the mode is not the mean. The mode, the mean, and the median coincide for random numbers with symmetric, single-peaked probability distributions, but not for skewed ones such as the hypergeometric.

"If the sum of a set of random numbers is known to be equal to 1, then the sum of their means is also equal to 1" is undoubtedly true, but the same proposition does not hold for the mode.

Also for most subjects, "the probability that a suspect is a murderer" is not so much "the probability that the suspect is the murderer in this group" as "how confident am I that the suspect is the murderer".

Obviously, it's back to Bayesianism here.
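The earlier point about sums of means and sums of modes can be checked on a toy case. The three equally likely "scenarios" below are invented purely for illustration: in each, the probabilities assigned to three suspects add up to 1, but a different suspect is the favorite.

```python
from collections import Counter
from fractions import Fraction

# Each row is one equally likely scenario; each column is one suspect.
# Every row sums to 1 (hypothetical numbers).
scenarios = [
    (Fraction(3, 5), Fraction(1, 5), Fraction(1, 5)),
    (Fraction(1, 5), Fraction(3, 5), Fraction(1, 5)),
    (Fraction(1, 5), Fraction(1, 5), Fraction(3, 5)),
]

def column(i):
    """All values that suspect i's probability takes across the scenarios."""
    return [row[i] for row in scenarios]

means = [sum(column(i)) / len(scenarios) for i in range(3)]
modes = [Counter(column(i)).most_common(1)[0][0] for i in range(3)]

print(sum(means))   # 1
print(sum(modes))   # 3/5
```

A subject who reports the most typical value for each suspect would hand in a list summing to 60%, without having made any arithmetic mistake at all.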

In reality, too many research subjects simply don't satisfy the conditions of the Central Limit Theorem, yet it ends up being applied anyway, because it's convenient, it works well, and it doesn't turn out too badly in most cases.

Isn't it wonderful that so many complex stochastic processes can be described with just a mean and a variance?

Yet isn't it really just laziness?

"Memory is an unreliable thing, so the less you need to remember, the better." (Later, I realized that I still had to memorize ...... and in a language I didn't necessarily understand.)

But mathematicians are rigorous, and they don't misuse any theorems. Often, they are so overly rigorous that many mathematical theories are conjectures first and breakthroughs later. For centuries it was physicists who were the pioneers, and after the turn of the millennium computers joined the bandwagon.

So it's not mathematicians who abuse theorems, it's the guys who do applied disciplines with the tools of math.

And yet, as if to shift the blame, the author blames the mathematicians for this misuse.

Because they are so indifferent to the applied disciplines, the fundamental critical and constructive nature of mathematics is not transmitted. Mathematics, as a compulsory subject for all science and engineering majors, is not only not widely loved, it suffers from open contempt, both in the East and in the West.

- This, surely, must be the problem of mathematicians.

Of course, it is not beyond the author's comprehension that when one is truly immersed in what one loves, one does not need to compare oneself with others and belittle them in order to achieve self-fulfillment.

From single objects to multiple objects, from deterministic events to random events, human beings have begun to study their behavior mathematically.

In 1994, John Forbes Nash Jr. was awarded the Nobel Prize in Economics, along with two other game theorists.

So why doesn't the Nobel have a math prize?

Because there are always other ways for mathematicians to do things - this one is a joke, but the Nobel Prize has an extremely important condition for awarding the prize, which is to be 'alive'.

There are too many mathematicians who die young, and very few whose mathematical theories are widely used while they are alive, and Nash was definitely one of the lucky ones.

As a graduate of Princeton University, Nash represents a new generation of mathematicians moving into applied disciplines, especially in economics.

Of course game theory has applications far beyond economics, but finally, to this day, economics around the world has become accustomed to numbers and has come to fetishize data. One statistic suggests that today's Princeton economics graduates are more confident than their counterparts circa World War II - because they use mathematical tools rather than analyzing problems by historical experience.

It is not technology that has advanced the evolution of human civilization, but faith. Network technology matured long before the dot-com era, but at that time the Internet was used by only a small fraction of the population, was not commercialized, and therefore lacked the infrastructure and talent to spread it around the globe.

Computer technology came into common use not because it was so useful, but because its usefulness was recognized by the general population. After the Second World War, Nash furthered the mathematization of economics, which in turn drove the digitization of financial institutions as well as industry in general. More and more data was collected, and its analysis became more and more useful. The world was linked by giant wires and fiber optic cables, and the Earth literally became a giant computer, as predicted by The Hitchhiker's Guide to the Galaxy.

Thinking about it the other way around, if economics hadn't been mathematized by those generations, the numbers bouncing around in the financial markets would have been meaningless, and we'd still be living in a time when we had to go through private channels to get valid intelligence - non-digitized intelligence that is difficult to transmit through digital carriers! It is important to remember that multimedia such as music and video is something that has come about in recent years after the internet has become sufficiently advanced; early computers and networks could only handle numbers and characters. And even then, stock exchanges were among the first to adopt the Web.

In short, Nash's contribution was not just to economics as a discipline, but to the economy itself.

In March 2016, AlphaGo beat Lee Se-dol.

Go is the quintessential deterministic, two-player, zero-sum game of perfect information.

The first thing I want to emphasize here is the difference between perfect information and complete information. Simply put, with complete information, the participants in a game know each other's goals; whereas perfect information concerns only the state of the game itself.

For example, suppose a terrorist took control of the global network, like in the movies, and threatened the president of South Korea with a nuclear strike unless Lee Se-dol lost on purpose, while AlphaGo was just a Go AI and not the mastermind behind it; then the game would not be one of complete information. But as long as the game is Go, and you can't hide the positions of the stones by magic, the game must be one of perfect information.

Determinism is easy to understand: there are no take-backs, and a stone that you mean to play on a star point will not inexplicably land on the 3-4 point or the 3-3 point. Determinism has two advantages over stochasticity.

One, for the player, the game has a theoretically deterministic solution.

In practice, of course, the complexity of Go is enormous, far exceeding that of chess and its variants. Board-game AI is currently most mature in chess, where there are even tournaments in which teams of a human and an AI play each other: the AI gives advice, and the player is free to follow it or find another move. While the AI's advice is adopted most of the time, there are also times when a player has a flash of insight and wins with a move that neither AI expected.

The complexity of Go, on the other hand, has nothing to do with the way the pieces move, but stems entirely from its huge 19x19 board. Just as we can use a small board in the early stages of learning Go, early Go AIs started with small board challenges.

The second is that it's black and white: a win is a win, a loss is a loss. AlphaGo's match with Lee Se-dol was a five-game match, in which three wins take the series. If each game were a coin flip and three wins in five games were enough, then anyone could beat the AI, and anyone could be honored as world champion.

Most critically, because the payoff of the game (both fame and fortune) is tied directly to the result on the board, there is little difference here between perfect information and complete information. Even if Lee Se-dol really had gone into the match intending to lose, the fact that he lost would not change.

Finally, Go is a one-on-one zero-sum game.

In terms of war games (or real war itself), a two-sided game is zero-sum: it's either you or me, so direct combat is inevitable. But if there are multiple forces, it's perfectly possible to form an alliance against a common encroaching enemy.

The U.S. and the Soviet Union always had their own agendas, but in order to defeat the Nazis they still united in the end, if only for one time and place. Even then they could not truly cooperate: sophisticated politicians, it is said, calculate the postwar division of spoils before the war even begins. Had both sides not had such leaders, how could the nearly symmetrical balance of the Cold War have formed afterward?

Yet random games create problems.

It is generally recognized that the problem with randomness is the element of luck. That is, if Go were a game with a random component and AlphaGo won three games out of five, humans could roar "bad luck" and then demand another 300 rounds.

But the problem was solved by this deadbeat request - that is, by playing multiple times and looking at the average score.
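How much a longer series helps can be sketched with the binomial distribution, assuming every game is independent and the weaker side wins each game with the same fixed probability. The function name and the 40% figure below are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def series_win_prob(p, n_games):
    """P(winning more than half of n_games independent games), each won
    with probability p. n_games is assumed odd, so ties cannot occur."""
    need = n_games // 2 + 1
    return sum(
        comb(n_games, k) * p**k * (1 - p) ** (n_games - k)
        for k in range(need, n_games + 1)
    )

coin_flip = series_win_prob(Fraction(1, 2), 5)      # pure luck, best of five
underdog_5 = series_win_prob(Fraction(2, 5), 5)     # 40% per game, best of five
underdog_301 = series_win_prob(Fraction(2, 5), 301) # same player, 301 games

print(coin_flip)             # 1/2
print(underdog_5)            # 992/3125, about 0.32
print(float(underdog_301))   # a tiny fraction of a percent
```

In five games luck can still carry a clearly weaker player about one time in three; over a few hundred games it almost never does.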

In 2015, Professor Michael Bowling of the University of Alberta and his colleagues published a paper in Science reporting that they had "weakly" solved two-player (heads-up) limit Texas Hold'em: they had developed a program, Cepheus, that is in effect guaranteed not to lose. Note that "weakly solved" is a game-theory term meaning that an optimal strategy is known from the starting position of the game, not for every position that could conceivably arise; and "guaranteed not to lose" doesn't mean winning every hand, but rather that you cannot expect to come out ahead, on average, if you gamble against Cepheus over many games.

Of course the realistic question is how many matches are needed to prove superiority. Both the humans and the AIs participating in the tournament change, so there's technically no way to do an exact duplicate of the experiment.

The next problem is imperfect information.

In a sense, imperfect information can also be treated as randomness. To continue with the poker example, the fact that a hole card is the ace of spades, while certain for its holder, is uncertain for the opponent.

Yet the effect of this uncertainty on strategy is very different from the randomness described above, which is the same for both players. While, for the opponent, the probability that "this unknown hole card is the ace of spades" is the same as the probability that "the next card dealt is the ace of spades", the probability of a raise given that the hole card is the ace of spades is very different from the probability of a raise given that it is not. The uncertainty of the hole card itself, combined with the uncertainty of all the strategies that might be based on it, is the real uncertainty at hand.

In theory, such uncertainty simply increases the amount of computation, but there is no doubt that cracking such games becomes much harder for AI as a result.

Finally, there's the problem of multi-player games.

In imperfect-information games, adding players first of all adds unknown hole cards, which is to say it adds uncertainty through sheer quantity.

At the same time, as mentioned in the previous section, real multi-player games allow for alliances, and the parties may not know whether such agreements exist among the others. That is, multi-player games make the information more imperfect, compounding the uncertainty once more, this time through structure.

The key question is how to test whether a particular AI (or class of AIs) is better than humans. Suppose a team develops an extremely good mahjong AI: does it make sense to have two AIs compete against two humans? Would the humans collude with each other for the sake of human dignity? Could the AIs cheat by cooperating with each other?

It might make sense to put such an AI on an online game platform, where players cannot tell from an opponent's ID whether it is an AI, and where there are enough matches. But studying your opponent's game records in advance in order to win is what a professional player is supposed to do. Isn't such anonymity inherently unfair to the human side? For example, AlphaGo had a collection of classic game records to draw on, naturally including Lee Se-dol's own, while far fewer of AlphaGo's own games were available, so Lee Se-dol was at a disadvantage even before the match started.

In describing the process of randomized multiplayer gaming with imperfect information, the author gives two typical examples, poker and mahjong.

The author's question is: why are these types of games always directly related to gambling?

This question seems a bit unreasonable. But just look at the definition: almost all sports are random games.

Take shooting: good marksmen can put every shot very close to the center of the target, but nobody can guarantee that every shot lands dead center, so the competition is not a duel decided by a single shot. In track and field, results are affected by the athletes' physical condition, the weather, the venue, and so on. In sprinting in particular, there is also the tactic of anticipating the starting gun, which adds a lot of drama to the competition. Team sports, such as soccer, are influenced by too many random factors to count, natural or man-made.

It's not that sports have nothing to do with gambling; the shadow of betting on soccer will never go away. But as long as the athletes themselves don't bet on their own defeat, they are still trying to win, which means that in principle the sport can be played for glory alone. However, even if Texas Hold'em and mahjong are completely stripped of gambling, you still have to use chips or something similar to show how much you've won or lost.

Imperfect information is not the issue either, since it is very common in team sports: the flurry of signs flashed by a baseball coach, the hand signals a volleyball setter makes behind the back as a teammate serves, the little note that Lehmann was handed before the penalty shootout in the quarterfinals of the 2006 World Cup...

As for multi-player events, sport does not seem to have anything quite like poker or mahjong, where every player fights alone. Formula 1, long-distance running, cycling and the like all place great emphasis on teamwork; Armstrong didn't win the Tour de France seven times in a row just by doping. Similar cooperation in poker or mahjong is readily seen as cheating. This difference in perception is not explored in depth here, as it is likely that the taboo on teamwork arose because gambling came first, not the other way around.

With China and Japan as its base, the development of mahjong as a competitive sport is slowly progressing around the world.

Groups in China are pushing for something like "duplicate bridge", where every table plays each hand with an identically composed wall, so that the East, South, West, and North seats receive the same tiles at every table. As with bridge, the aim is naturally to reduce randomness.

However, I don't see much point in reducing randomness by increasing the correlation between players in the same seat at different tables. It may even be counterproductive, since generating fewer random walls actually reduces the sample size and increases the variance.

Of course, since the tournament is run entirely on computers, which record the players' tile efficiency as play proceeds, it still makes quite a bit of sense.

In order to reduce randomness, Japan's top tournaments use "competition rules" (競技ルール), also called "A rules", which differ little from regular Japanese mahjong; the main change is a reduced number of dora (bonus tiles).

That said, if you're thinking about reducing randomness, Washizu Mahjong is probably a good way to do it:

The last one is certainly not necessary.

When the author claims that sports are random, I'm sure many people agree in principle but are reluctant to accept it, at least not completely.

- If you keep improving, you will eventually be able to crush weak opponents. But that's not what spectators, or even top athletes, crave; what we crave from the bottom of our hearts is a showdown at the very top that will go down in history.

That's right, this is the charm of competition: two evenly matched masters, blades flashing, life and death decided in a split second...

And in poker, mahjong and other supposedly luck-based games, even a novice may be able to beat the world champion.

If it's just one hand, then sure, but a whole match? A top match, especially a title match, lasts a full day, or even two.

An overwhelming overall trend, complemented by uncertainty in the details: that is our way of thinking.

With it, our ancestors dodged wild beasts and crossed turbulent waters, and in the ancient days of fertility worship, more was good, more was holy, more was worthy of praise.

But at some point, scarcity became not just a natural balance of supply and demand, but more directly embedded in our culture and genes.

We impose this law on all the games we invent, war, hunting, sports, chess: the harder something is to obtain, the more valuable it is.

Ask: what is the biggest difference between chess and the other examples?

- That is that the rules of chess are made by us and are not subject to the laws of physics.

The physical world of normal distributions is what we are used to; in the mental world of hypergeometric distributions we are still greenhorns. And for the undisciplined, there is an ancient method of discipline:

Gambling: the best way to hold participants accountable is to make them win or lose real material stakes.

Having answered the question at the beginning of the chapter, let's conclude with the difference between the two distributions, or worlds.

In the physical world, we struggle with the trade-off between mean and variance. Larger returns often mean higher costs as well as higher risk.

This is not to say that the same does not hold under rules of our own making, and treating objects with hypergeometric distributions as belonging directly to the mental world is admittedly an extreme position on the author's part.

But the normal distribution is based on the things themselves, and the hypergeometric distribution is based on the combinations, that is, the relations between the things - that should be a fair enough summary.

The objects of the physical world, and our experience in general, derive from the repetition and stacking of things. But since the dawn of biology, or even earlier, since the emergence of biological macromolecules, evolution has spawned complexity not just in a quantitative sense, but in a structural sense. Human society is even more of a so-called 'superstructure'. Living beings, as physical beings, necessarily follow all the laws of physics, yet there is no need to use quantum mechanics and relativity to study the fur color of cats and dogs; likewise, our spiritual world is built on the foundation of the physical world, but that does not mean the two worlds share the same laws.

In conclusion, I would like to pay my sincerest respects and deepest regrets to the entire Age of Enlightenment.

The Enlightenment promoted rationality - not only in scientific research, but also in the self-discovery of human nature and the civilization of society as a whole - but all of this was just a pipe dream, and human beings were only able to hold themselves accountable in the most old-fashioned way.