Big Data’s Mathematical Mysteries

At a dinner I attended some years ago, the distinguished differential geometer Eugenio Calabi volunteered to me his tongue-in-cheek distinction between pure and applied mathematicians. A pure mathematician, when stuck on the problem under study, often decides to narrow the problem further and so avoid the obstruction. An applied mathematician interprets being stuck as an indication that it is time to learn more mathematics and find better tools.

I have always loved this point of view; it explains why applied mathematicians will always need to make use of the new concepts and structures that are constantly being developed in more foundational mathematics. This is particularly evident today in the ongoing effort to understand “big data” — data sets that are too large or complex to be understood using traditional data-processing techniques.

Our current mathematical understanding of many techniques that are central to the ongoing big-data revolution is inadequate, at best. Consider the simplest case, that of supervised learning, which has been used by companies such as Google, Facebook and Apple to create voice- or image-recognition technologies with a near-human level of accuracy. These systems start with a massive corpus of training samples — millions or billions of images or voice recordings — which are used to train a deep neural network to spot statistical regularities. As in other areas of machine learning, the hope is that computers can churn through enough data to “learn” the task: Instead of being programmed with the detailed steps necessary for the decision process, the computers follow algorithms that gradually lead them to focus on the relevant patterns.
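
As a toy illustration of that training loop, here is a minimal supervised-learning sketch in Python (a synthetic data set and a bare-bones logistic model, nothing like the billion-sample deep networks described above): the program is never given the decision rule, only labelled examples, and gradient descent gradually steers its weights toward the pattern that separates them.

```python
import numpy as np

# Hidden "task": the label is 1 exactly when x1 + x2 > 1.
# The learner never sees this rule -- only samples and labels.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(5000, 2))    # training samples
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)  # labels

w, b = np.zeros(2), 0.0                      # untrained model
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
    w -= 0.5 * (X.T @ (p - y)) / len(y)      # gradient step on the log loss
    b -= 0.5 * np.mean(p - y)

# w1 ~ w2 and b ~ -w1: the model has recovered the boundary x1 + x2 = 1.
print("learned weights:", w, "bias:", b)
```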

→ Quanta Magazine

Algorithms Need Managers, Too

In Shakespeare’s Julius Caesar, a soothsayer warns Caesar to “beware the ides of March.” The recommendation was perfectly clear: Caesar had better watch out. Yet at the same time it was completely incomprehensible. Watch out for what? Why? Caesar, frustrated with the mysterious message, dismissed the soothsayer, declaring, “He is a dreamer; let us leave him.” Indeed, the ides of March turned out to be a bad day for the ruler. The problem was that the soothsayer provided incomplete information. And there was no clue to what was missing or how important that information was.

Like Shakespeare’s soothsayer, algorithms often can predict the future with great accuracy but tell you neither what will cause an event nor why. An algorithm can read through every New York Times article and tell you which is most likely to be shared on Twitter without necessarily explaining why people will be moved to tweet about it. An algorithm can tell you which employees are most likely to succeed without identifying which attributes are most important for success.
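
That split between prediction and explanation is easy to reproduce with any off-the-shelf black-box model. A hypothetical sketch (scikit-learn on made-up data, not the systems the article alludes to):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Made-up "employee" data: five arbitrary attributes, with success driven
# (invisibly, as far as the model's user knows) mostly by columns 0 and 3.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

candidate = rng.normal(size=(1, 5))
print(model.predict_proba(candidate))  # a success probability -- and no reason
print(model.feature_importances_)      # global rankings, still not a "why"
```

The output is a soothsayer's warning: a well-calibrated score with nothing attached to say which attribute produced it, or what the candidate should do about it.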

→ Harvard Business Review

In Defense Of The Gaussian Copula

The Gaussian copula is not an economic model, but it has been similarly misused and is similarly demonised. In broad terms, the Gaussian copula is a formula to map the approximate correlation between two variables. In the financial world it was used to express the relationship between two assets in a simple form. This was foolish. Even the relationship between debt and equity changes with the market conditions. Often it has a negative correlation, but other times it can be positive.
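
For reference, the standard two-variable form (textbook notation, not taken from the article) compresses all of the dependence into a single correlation parameter:

$$
C_\rho(u, v) = \Phi_\rho\!\left(\Phi^{-1}(u),\, \Phi^{-1}(v)\right),
$$

where $u$ and $v$ are the two assets' marginal probabilities, $\Phi^{-1}$ is the inverse of the standard normal CDF, and $\Phi_\rho$ is the joint CDF of a bivariate standard normal with correlation $\rho$. Everything the formula knows about how the two assets move together lives in that one number, which is exactly what made it so convenient, and so fragile.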

That does not mean it was useless. The Gaussian copula provided a convenient way to describe a relationship that held under particular conditions. But it was fed data that reflected a period when housing prices were not correlated to the extent that they turned out to be when the housing bubble popped. You can have the most complicated and complete model in the world to explain asset correlation, but if you calibrate it assuming housing prices won't fall on a national level, the model cannot protect you against exactly that happening.
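
A short simulation makes the calibration point concrete (a sketch in Python; the rho values and the 1% tail threshold are illustrative assumptions, not estimates from the article): the same copula, fed a calm-market correlation versus a crisis-level one, assigns very different odds to two assets crashing together.

```python
import numpy as np
from scipy.stats import norm

def gaussian_copula_sample(rho, n, seed=0):
    """Draw n pairs of Uniform(0,1) variables coupled by a Gaussian copula."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    return norm.cdf(z)  # each normal marginal pushed through its CDF -> uniform

for rho in (0.2, 0.9):  # "calm market" vs "everything falls together"
    u = gaussian_copula_sample(rho, n=1_000_000)
    both_crash = np.mean((u[:, 0] > 0.99) & (u[:, 1] > 0.99))
    print(f"rho={rho}: P(both assets in their worst 1%) ~ {both_crash:.5f}")
```

The functional form never changes between the two runs; only the calibration does, and that alone moves the estimated chance of a joint crash by roughly an order of magnitude.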

→ The Economist

Fintech: Search For A Super-Algo

The rise of machine learning:

The quantitative investment world plays down the prospect of machines supplanting human fund managers, pointing out that the prospect of full artificial intelligence is still distant, and arguing that human ingenuity still plays a vital role. But the confident swagger of the money management nerds is unmistakable. Already there are quasi-AI trading strategies working their magic in financial markets, and the future belongs to them, they predict.

→ Financial Times

What I Learned from Losing $200 Million

That was one hell of a trade. Boy, what a wild ride.

The Sunday after Lehman fell, pacing my empty trading floor, I realized once and for all that my models and reports could no longer tell me what to do. The one unmistakable fact was that my risks would increase if oil continued its decline. I decided that when I came in on Monday, I’d place a big bet that WTI would do just that.

And on a Saturday morning bike ride up the Hudson, it occurred to me that Mexico might be willing to restructure its deal—selling us back the option it owned, and buying a new one—in a way that would lock in billions in profits for the country, while giving me a much-needed windfall too. I dropped my bike in a bush and texted our salesperson about the idea.

There were many other decisions and guesses, some made alone, others with help from my team, and still others made by my boss. All were guesswork, none could have been anticipated in stress testing, and all involved abandoning my original strategy along with the illusion of control it gave me.

→ Nautilus