Stock Market Mosaic: breaking the Da Vinci Code

written while reading "The Da Vinci Code" novel by Dan Brown

Sergey & Tatiana Tarassov

Lyrical Introduction

This story began 10 years ago. Once a year, my University mates get together. There are many very interesting people among them. Though we now work in different fields (mostly in science) and in different countries, we still have many things to discuss. We talk about everything: we remember our student past, share stories about our work, families and business, and discuss books and articles that we have read and movies that we have seen. Sometimes we discuss scientific ideas, and it is very interesting, as there are true professionals in many sciences there. The subject of this article came to me at one of these meetings. Because we used to drink some wine while talking, I do not guarantee the accuracy of the scientific details. But the idea as a whole fascinated me then and has kept coming back to my mind during all these years. Here it is...

There was a research center doing research on some natural process ("some" - not because of conspiracy, but only because I do not remember or, quite possibly, it was never mentioned). Finally, they came up with a set of numbers calculated in some special way. It was a set of whole numbers (like, for example, 5, 13, 17, 25, ...). The task was to understand the inner law of this set: how and why these particular numbers follow in this particular way (the example provided is not the real set; it is not important, and you will see why...). For a long time they could not figure out the law behind the set. One day, there was a problem with the computer, and they found a mistake in the program's code. This mistake gave them a hint as to how the problem might be solved.

Trying to figure out the mistake, the guys printed the list of natural numbers. Then they marked the numbers they were investigating with an X symbol and the others with *. And it was a very nice looking printout!!!

It looked like this:

****X*****

**X***X***

****X*****

Here the set of numbers from 1 to 30 is displayed. The numbers of our set (5, 13, 17, 25 as an example) are marked by X. I repeat once again: I do not know any scientific details; I heard this story for no more than 5 minutes. But I remember the name "cognitive geometry" and that this idea is used somehow in cryptography. Later, I tried to get some information on the subject, but did not find much either on the Internet or in the library. Maybe because this is the field of cryptography? I do not know.
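A printout of this kind is easy to reproduce. The following is a minimal sketch in Python, not the original researchers' program; the function name mosaic_rows and the row width of 10 are assumptions made for illustration.

```python
def mosaic_rows(marked, total=30, width=10):
    """Return rows of '*' and 'X', one character per natural number 1..total."""
    marked = set(marked)
    line = "".join("X" if n in marked else "*" for n in range(1, total + 1))
    # Cut the long line into rows of `width` characters each.
    return [line[i:i + width] for i in range(0, total, width)]


rows = mosaic_rows([5, 13, 17, 25])
for row in rows:
    print(row)
```

Changing the row width changes the picture: a regularity that is invisible at one width may line up into columns or diagonals at another.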

For me, this approach was totally different from what I had worked with before. To reveal regular patterns, I used to deal with spectrograms or autocovariance diagrams. But that "nice looking printout" was a part of something new, unknown to me and fascinating.

 

Applications

This story bothered me for years. I kept asking myself: what if there is some kind of mosaic picture like that one behind stock market behavior? How can this mosaic be revealed? I tried several times to program this idea and apply it. I started by making a printout similar to the one described by the inventors of this technique: put all the price bars on a sheet of paper and mark the most significant ones with a special color. Programming this technique, I faced some problems that I do not want to discuss in this article. Dealing with these problems, I understood that the best tool for describing regular structures of this kind might be Neural Net technology (I will show it below). But the technical difficulties at that time, such as huge calculations and the large amount of computer RAM required, did not allow me to get the answer. It was like "Digital Fortress" for me.

Then there were several years of working on Neural Nets. I tried to find one that would be able to deal with that idea. It became possible only after the creation of the Object Oriented Neural Net (the core idea of the new program). Moreover, this idea might work well for stock market forecasting.

Let me show you how to create a real forecasting Neural Net model based on this idea using the Timing Solution program. I applied this technique to three price data sets (MSFT 5 min bars, DJI 5 min bars, and crude oil daily prices). When you create inputs for the Neural Net, press this button:

This button lets you create the special categories of ULE (Universal Language of Events) that describe cycles through the natural numbers (a divisor and a remainder together do this job):

Press the "OK" button; the program will create a list of 1275 events (the list of all possible combinations of divisors and remainders when the maximum divisor is 50):

How does it work? The program counts all available price bars and gives each a number: 1 for the first price bar, 2 for the second price bar, and so on up to, say, 10,000 for the 10,000th price bar. Then it marks the price bars with respect to each event (some combination of a divisor and a remainder). Thus, the record "PRICE BAR: DIVISOR 4 REMAINDER 2" means that this event is active only for bars 2, 6 (2+4), 10 (2+2x4), etc., that is, for the bars whose number leaves remainder 2 when divided by 4.

In other words, with respect to this event, we mark every 4th price bar. All 1275 events correspond to different values of the divisor (max 50, in our example) and the remainder. Each event gives us a different set of marked bars. It is like a kaleidoscope - each turn brings out a new combination of colored pieces. Or like a mosaic - a new combination of small pieces makes a new colored pattern.
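The divisor/remainder events can be sketched in a few lines of Python. This is an illustration only, not Timing Solution's code: the function names are hypothetical, and it assumes that divisors run from 1 to the maximum and remainders from 0 to divisor - 1, which is the reading that yields exactly 1275 events for a maximum divisor of 50.

```python
def build_events(max_divisor=50):
    """List every (divisor, remainder) pair with 0 <= remainder < divisor."""
    return [(d, r) for d in range(1, max_divisor + 1) for r in range(d)]


def active_bars(divisor, remainder, n_bars):
    """Numbers of the bars (counted from 1) for which the event is active."""
    return [n for n in range(1, n_bars + 1) if n % divisor == remainder]


def encode_bar(bar_number, events):
    """0/1 inputs for one bar: 1 where the bar matches the event, else 0."""
    return [1 if bar_number % d == r else 0 for d, r in events]


events = build_events(50)
print(len(events))                 # 1275
print(active_bars(4, 2, 12))       # [2, 6, 10]
print(sum(encode_bar(7, events)))  # 50: one active remainder per divisor
```

Each bar therefore switches on exactly one event per divisor, and it is this vector of 0/1 flags that would serve as the input describing the bar's place in the mosaic.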

 

Language of Mosaic

The events described above, together with Neural Net technology, describe the mosaic pictures very well. These events are very similar to fixed cycle events, but the Neural Net technology makes it possible to learn how these cycles interact with one another. This approach gives much more than the standard procedure of combining fixed cycles, as it reveals the ability of these cycles to play with each other. Neural Net technology can reveal many non-obvious connections between the cycles; their interaction can produce very interesting pictures compared to the simple summation provided by linear modeling.

The following is an example of the pictures that can be produced by the interaction of just two fixed cycles.

Let's imagine a magical world. The main feature of this world is that the stock market is absolutely predictable there. It moves exactly like a sine curve with a period of 30 days, something like this:

The price varies between $9 and $11. All inhabitants of this country are millionaires, because all they need to do is buy a share of this stock once a month and then sell it 15 days later. No technical/fundamental analysis, no risk management; everything is absolutely predictable, because everything is explained by this sine curve:

Then something goes wrong in this wonderful world; a "bad" cycle appears. Look at this picture:

This bad cycle with a period of 13 days (shown as the blue diagram) destroys the happy picture we have seen before. It interacts with the main 30-day cycle this way: when the 13-day "bad" cycle is in its negative phase (marked as "inversion"), the main 30-day cycle "mirrors" its usual movement. This is an example of a non-linear interaction between two cycles. It makes the final picture much more complicated:

Any linear combination of these two cycles gives a much smoother and more predictable picture.
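The magical world can be simulated directly. The sketch below is one possible reading of the description, under stated assumptions: both cycles start at zero phase, "mirroring" means flipping the 30-day sine around the $10 mean, and the weight in the linear combination is an arbitrary illustrative value. None of these details are given in the text.

```python
import math

MEAN, AMPLITUDE = 10.0, 1.0        # price moves between $9 and $11
MAIN_PERIOD, BAD_PERIOD = 30, 13   # cycle periods in days


def main_cycle(day):
    """The predictable 30-day sine cycle (zero phase is an assumption)."""
    return AMPLITUDE * math.sin(2 * math.pi * day / MAIN_PERIOD)


def bad_cycle(day):
    """The 13-day 'bad' cycle (zero phase is an assumption)."""
    return math.sin(2 * math.pi * day / BAD_PERIOD)


def mosaic_price(day):
    """Non-linear interaction: the main cycle is mirrored around the mean
    whenever the bad cycle is in its negative phase."""
    sign = -1.0 if bad_cycle(day) < 0 else 1.0
    return MEAN + sign * main_cycle(day)


def linear_price(day, weight=0.5):
    """A linear combination of the same two cycles, for comparison."""
    return MEAN + main_cycle(day) + weight * bad_cycle(day)


# 390 days = 13 periods of the main cycle = 30 periods of the bad cycle.
prices = [mosaic_price(day) for day in range(390)]
for day in range(0, 30, 3):
    print(day, round(mosaic_price(day), 3), round(linear_price(day), 3))
```

In this reading the mirrored price still stays between $9 and $11, but it jumps whenever the bad cycle changes sign, while the linear combination remains a smooth sum of two sines.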

In other words, if there are any regular pictures or patterns in the stock market price movement, this Neural Net can reveal them. Therefore, we can create a forecasting model based on these events.

This is the beauty of Neural Net technology: we only need to provide an adequate language that can describe the phenomena we are trying to forecast. We do not need to care about the underlying process, because we describe the phenomena themselves and work with them (like the price mosaic in our example). We do not need to worry about all possible combinations and their interactions; the Neural Net does it better. This approach allows us to work with processes that we do not know in detail (or do not know at all) and for which we have no proper explanation. We deal here only with the phenomena, which are snapshots of the process in time. But the language that describes these phenomena should be as compatible with them as possible (hence the problem of preprocessing the data...).

Back Testing - Intraday Data

I trained the Neural Net using 5 min price data for MSFT. The projection line based on these events correlates with the price (to be more precise, with the price oscillator); the correlation coefficient is 0.108. The correlation coefficient is calculated for the 4 trading days after the LBC (Learning Border Cursor):

This is the magnified picture for one day:

Here is the same Back Testing performed for DJI 5 min data:

The correlation coefficient is 0.074. Besides that, in the fairly near future (beyond 5 trading days after the LBC), the system loses its forecasting ability.

It looks like some regular patterns are present in the price movement, and these patterns are formed by the interaction of cycles. This is the reason why the linear model does not produce any results:

 

One interesting observation: when I used all price points (50,000 price bars), the result was worse than for 5,000 price bars only. I think it might be caused by:

  1. Possible gaps in the price data. If some fragment of the mosaic is deleted, the whole picture is destroyed.

  2. Possible changes of the stock mosaic picture over time, for some still unknown reason.

For practical needs, I would recommend the easier way:

 

Back Testing - End of Day Data

I did this Back Testing for crude oil daily price data. The price points from 1983 to February 2001 were used to train the Neural Net. The forecasting ability was tested from February 2001 to April 8, 2004 (786 price points). This is the forecast (a part of it):

The correlation coefficient between the price oscillator (I used the relative price oscillator with period = 5 bars) and the forecasting line over the whole testing interval (786 points) is 0.08. Statistically, this means that a purely accidental result is unlikely. Possible artifacts here are:

  1. Weekly price cycles. These are excluded: we calculate the spectrum here, not a weekly cycle.

  2. The annual cycle. We analyze the price oscillator, which reflects short-term movements.
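How much a correlation of 0.08 over 786 points says about chance can be gauged with the usual t statistic for a correlation coefficient. The sketch below is a rough check, not part of the original study: it assumes the 786 pairs are independent, which values of a 5-bar oscillator on consecutive days are not, so it overstates the evidence. The function name is hypothetical.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing zero correlation, assuming n independent pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


t = correlation_t_statistic(0.08, 786)
print(round(t, 2))
```

Under the independence assumption the statistic comes out a little above 2, which corresponds to a chance probability of a few percent in a two-sided test.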

One interesting fact is that non-linear effects (the interaction between the cycles) are very important here. This diagram shows how the correlation (on the testing interval) changes over time:

The red curve is the correlation calculated for the Neural Net model, while the blue curve is calculated for the simple linear model. The jumps of the blue curve mean that the program cannot find a simple linear solution for this task, while the red curve steadily increases.

Practically, it means that for long-term forecasting (a year or more) the fixed cycles are not as important as the mosaic figures formed by the interactions of these cycles.

 

Afterword

This research will be continued. It takes time, because the Neural Nets applied to this task are very "heavy" and require a lot of calculation. The main parameters that need to be varied are:

  1. The number of hidden units in the Neural Net, which reflects how intensively the cycles interact with each other.

  2. The maximum value of the divisor:  

  3. The algorithm for forming these events. With the "Fractals" option checked, we use only the most important divisors (all other divisors can be represented as combinations of the existing ones).

 

December 1, 2004

Toronto, Canada

© Timing Solution