TL;DR: **No**

The Web is full of materials, manuals, ready-made solutions, builds and other stuff dedicated to forecasting the prices of cryptocurrencies and traditional exchange assets, all smelling of quick and easy income with minimal effort. And although they are written by different people, with different approaches, on different platforms and in different paradigms, they all share one invariable attribute – **they do not work.**

Why? Let’s figure it out.

## Introduction

Let’s get acquainted, my name is Denis and, in my free time, I do research in the field of artificial intelligence and, in particular, artificial neural networks.

In this article I will try to describe the problems that novice neural network researchers create for themselves in the pursuit of financial independence, spending valuable time with near-zero efficiency.

I hope this article manages to balance the complexity of the material with ease of reading, so that the text stays reasonably simple, clear and interesting both to people outside this field and to those who have been studying its problems for a while. I should say right away that there will be no formulas here, and specific terminology is kept to a minimum.

I do not work for Google. I do not have twenty degrees. I did not intern at NASA. I did not study at Stanford, and I bitterly regret it. However, I still hope that I understand what I’m talking about when it comes to forecasting systems and, at the same time, I’m pretty closely connected with the cryptocurrency world in general and the Cardano project in particular.

Of course, as a crypto enthusiast who works with neural networks, I simply could not help wandering into the foggy field of applying AI to cryptocurrencies.

### The essence of the problem

As mentioned earlier, there are so many seemingly thorough and seemingly deep materials on this subject, complete with examples, that your eyes glaze over. And the authors are so sure that their experiment, unlike the previous few hundred, is a success that one wonders why each new article does not end with photos of a “Lambo” on a private island, and why the list of authors of Kaggle kernels on Bitcoin price forecasting does not duplicate the Forbes lists.

Interestingly, regardless of the place and language of publication, all these articles end with approximately the same text: “Well, the result is quite good, **everything almost works,** you just need to tighten up a few hyperparameters and everything will be fine.”

And, of course, there are the charts on which the neural network tracks the price perfectly, such as:

We will return to them later and look at them carefully.

And, in order not to be unfounded, here are examples of such articles: one, two, three.

## How it all started

The idea of predicting new prices from old ones is far from new. In fact, it applies not only to cryptocurrencies. They just happen to be closer to me personally, but the homeland of what is called “technical analysis” is, after all, traditional exchanges. The ones where, according to the movies, everyone wears expensive suits yet screams like girls at a concert of their favorite band.

Trying to see the future in the past, people have invented a huge number of tricky oscillators, indicators and signals based on mathematical statistics, probability theory and, at times, outright pareidolia.

Perhaps the most popular pastime is hunting for chart patterns. Fifteen minutes of reading the Internet, and you are ready for Wall Street! It’s so simple: you just need to find “Bart Simpson’s head”, a “butterfly”, a “flag” (not to be confused with a wedge!!11) or an “azure falling in a vacuum turret” on the chart, draw many, many lines and, quite openly, interpret it all to your advantage!

Almost all of these solutions have one small but very weighty drawback – they capture trends perfectly … after the fact. And if something is declared to be predictive rather than descriptive, it is interpreted so freely that ten people looking at the same chart with the same indicator will give ten independent forecasts. And, characteristically, at least one of them will most likely be right!

But that, too, will be established **after the fact**. And the rest will simply say “ah, well, we just read the signals carelessly.”

Do not misunderstand me. It is quite possible that a real Wall Street trader, with 20 years of screaming and 200 suicide attempts behind him, can stack a bunch of indicators and oscillators on top of each other and, like the operator from “The Matrix”, read useful data there, seasoned with a sufficiently high mathematical expectation of a successful trade. I even admit that you specifically, the reader, can do it too. I admit it without a drop of sarcasm. After all, these indicators keep being invented and improved for some reason …

### Modern problems require modern solutions!

By 2015, everyone had already heard of neural networks. Rosenblatt had no idea how famous they would become. Thanks to responsible, professional, media-savvy people, mankind learned that neural networks are practically an electronic version of the human brain that can solve any problem faster and better, with unlimited potential, and that any day now we will jump through the singularity straight into a bright, dark future. How lucky we are.

But there was one “but.” For the time being, neural networks lived only in specialized mathematical packages, in a very, very low-level form, serving mathematicians and scientists with their graphs in MATLAB.

But popularization did its job and drew the attention of developers of varying degrees of independence to the field. These developers, being, unlike ordinary mathematicians, people endowed with noble laziness, began looking for ways to throw several levels of abstraction over the whole thing, making life easier for themselves and everyone else and giving the world very convenient, high-quality, high-level tools like Keras or FANN. They succeeded so well that they brought work with neural networks to the level of “do this once and it works”, opening the way into the world of miracles and magic to all comers.

Precisely miracles and magic, not mathematics and facts.

### The birth of a legend

Neural networks have become available, close at hand and easy to use for everyone. Seriously, there is even a FANN binding for PHP. Moreover, it is included in the list of basic extensions.

What about Keras? In 10 lines you can put together a recurrent-convolutional network without understanding how convolutions work or how LSTM differs from GRU! Artificial intelligence for each and every one! And let no one leave offended!

I think the terminology is partly responsible for the cruelest joke here. What are neural network outputs called? Right. Predictions. A neural network predicts one set of data from another. It sounds like exactly what you need.

Manuals for high-level libraries protect the user from complex terms, matrices, vectors, transformations, differential calculus, mathematical meanings of these gradients, regressions and regularization losses.

And, most importantly, they protect the romantic image of the “electronic model of the human brain capable of everything” from harsh reality, in which a neural network is just an approximator which, roughly speaking, is nothing more than one evolutionary step up from an ordinary linear classifier.

But it doesn’t matter when you assemble your first solver for CIFAR-10 from the listings from the documentation, without making any effort, or even understanding what is going on. There is only one thought in mind:

### what can I say, what can I say, people are so arranged …

Here it is, a technological miracle! You just give it some data at the input and other data at the output, and it finds the connection by itself and learns to predict outputs from inputs. How many problems can be solved! How many tasks can be knocked out!

So much to predict! I wonder, do other people even know? With this toolkit, my possibilities are endless! UNLIMITED!

But what if you feed the neural network candles from a cryptocurrency exchange / stock exchange / forex, giving it the candle from the next time interval as the target? It will then learn to predict new values from previous ones! After all, this is what it was made for! A neural network can predict anything as long as there is data, and historical quote data is a dime a dozen! Oh, inspiration, only a moment, but so beautiful!

### Why not?

Because in the real world, as opposed to the world created by the media, it doesn’t work like that. Neural networks are not prediction machines. Neural networks are approximators. Very good approximators. It is believed that neural networks can approximate almost anything. On one condition only: that this “something” lends itself to approximation.

And here the novice researcher gets hooked by a cognitive bias. The first and main mistake is that historical quote data seems to be more than just statistics. You can draw so many triangles and arrows on it after the fact that only a blind person would fail to see that it all follows a certain logic, one that people simply failed to work out in time. But which the Machine may grasp.

Looking at statistics, a person sees a function. The trap slams shut.

What is the second mistake / cognitive bias? Here it is.

### And it works with the weather!

This is a very common argument that I hear in crypto communities, in discussions about the possibility of predicting something from historical data using statistical analysis. It works with the weather. The essence of the bias is this: “if A works for B, and it seems to me that B is the same as C, then A should work for C as well.” A kind of pseudo-transitivity, which rests on an insufficient understanding of the processes that underlie the differences between B and C.

With the same success, one could assume, for example, that the pedals in an aircraft cockpit are the brake and the gas of an automatic transmission, and not the rudder control at all. The intuitive perception of some things, unfortunately, is not always correct, because it does not always rely on a sufficiently complete set of data about the situation / system / object. Hi Bayes! How are you?

Let’s get a little deeper into the theory.

### Chaos and the Law

It so happens that all processes and events in our reality can be classified into two groups: stochastic and deterministic. Since I am trying hard to avoid dreary terminology, let’s replace them with simpler terms: unpredictable and predictable.

As Obi-Wan correctly tells us, it’s not so simple. The fact is that in the real world, as opposed to the theoretical one, everything is a little more complicated, and absolutely predictable or completely unpredictable processes simply do not exist. At most, there are quasi-predictable and quasi-unpredictable ones. That is, almost unpredictable and almost predictable. Almost, but not quite.

For example, snow falls quasi-predictably from top to bottom. In almost 100% of observed cases. But not outside my kitchen window! There it snows from the bottom up because of the air flow and the shape of the building. But not always! Also in almost 100% of cases, but not always. Sometimes it falls downward outside my kitchen window too. It would seem such a simple thing, yet for the same observer it behaves completely differently in two different cases, and both behaviors are normal and quasi-predictable with almost 100% probability, although they completely contradict each other. Not bad, huh? The quasi-predictable event turned out to be … quasi-unpredictable? It gets better.

At this moment, our friend Bayes begins to laugh. What about unpredictable events? I will not use the prefix “quasi”, okay? Everyone already understands that I mean it. So. Take something completely unpredictable. Brownian motion? A great example of a completely unpredictable system. Is that so? Let’s ask the quantum physicists:

The fact is that even a system as complex as Brownian motion at real scale can, in theory, be simulated, and its state predicted at any point in time in the future or the past. In theory. As for how many calculations, how much computing power, time and how many sacrifices to the Dark Gods this would take, we tactfully keep silent.

Likewise, a system that is predictable in the general case but becomes unpredictable once you zoom in on particular cases is actually quite predictable again if you widen the scope of observation of that particular case to include external factors, obtaining a more complete description of the system in that very case.

And it is true: knowing the specifics of the air flows in a particular place, you can easily predict the direction in which the snow flies. Knowing the specifics of the “relief” of a particular place, one can predict the direction of the air flow. Knowing the specifics of the terrain, one can predict the specifics of the relief. And so on and so forth. At the same time, we have started zooming in again, but now for a specific event, separating it from the “general” definition of how this event behaves. Somebody stop Bayes, he is having a fit!

So what do we get? Any system is simultaneously predictable and unpredictable to one degree or another; the only difference is in the scale of observation and the completeness of the initial data describing it.

## What do the weather forecast and exchange trading have to do with it?

As we found out earlier, the line between a predictable and an unpredictable system is extremely thin. But it is clear enough to separate the weather forecast from trading.

As we already know, even the most unpredictable system actually consists of completely predictable fragments. To model it, it is enough to go down to the scale of these fragments, expand the scope of observation, understand the patterns and approximate them, for example, using a neural network. Or derive quite a specific formula that allows you to calculate the desired parameters.

And here lies the main difference between the weather forecast and the price forecast: the scale of the largest predictable component that can be modeled. For weather forecasting, these components are so large that, well … they can be seen from Earth orbit with the naked eye. And what cannot be seen, such as temperature and humidity, can be measured in real time across the whole planet thanks to weather stations. For trading, this scale … more on that later.

The cyclone will not say “I’m tired, I’m leaving” and vanish out of the blue at an unpredictable moment. The amount of heat a particular hemisphere receives from the Sun varies according to the same pattern. The movement of air masses on a planetary scale does not require atomic-level simulation and can be modeled quite well at the macro level. The system called “weather”, which looks like a random event at the scale of a specific point on Earth, is quite predictable at more global scales. And still, the accuracy of these predictions leaves much to be desired beyond a couple of days. The system, although predictable, is too complex to be modeled with reasonable accuracy at an arbitrary point in time.

And here we come to another important property of predictive models.

### Self-sufficiency or autonomy of predictions

This property, in general, is quite simple – a self-sufficient forecasting system, or an ideal forecasting system, can do without external data, not counting the initial state.

It is perfectly accurate. To predict the properties of a system in state N, it only needs the calculated data for state N-1. And knowing state N, you can get N+1, N+2, … N+m.

Such systems include, for example, any mathematical progression. Knowing the state at the reference point and the number of this point in a series of events, one can easily calculate the state at any other point. Cool!

This also answers the question of why the accuracy of a weather forecast falls dramatically over a long time horizon. Looking into the future, we make a forecast based not on the real state of the system but on the predicted one. And, unfortunately, one that was not predicted with 100% accuracy. As a result, forecast errors accumulate. And this happens even though we know almost all the significant “variables” and the description of the system can be called almost “complete”.
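The accumulation effect is easy to see on a toy system. The sketch below is purely illustrative: the growth rates `TRUE_RATE` and `MODEL_RATE` are made-up numbers, not anything from real meteorology. It compares a forecaster that gets a fresh measurement before every step with one that feeds its own previous forecast back in.

```python
# Toy system: the "true" state grows by 5% per step.
# The forecaster's model is slightly off and assumes 6% per step.
TRUE_RATE = 1.05   # hypothetical true dynamics
MODEL_RATE = 1.06  # hypothetical, slightly wrong model


def true_states(x0, steps):
    """Real trajectory: states x_1 .. x_steps starting from x0."""
    states, x = [], x0
    for _ in range(steps):
        x *= TRUE_RATE
        states.append(x)
    return states


def one_step_forecasts(x0, steps):
    """Each forecast starts from the real previous state (fresh measurement)."""
    previous_real = [x0] + true_states(x0, steps)[:-1]
    return [x * MODEL_RATE for x in previous_real]


def recursive_forecasts(x0, steps):
    """Each forecast starts from the previous forecast (no new measurements)."""
    forecasts, x = [], x0
    for _ in range(steps):
        x *= MODEL_RATE
        forecasts.append(x)
    return forecasts


def relative_errors(forecasts, actual):
    return [abs(f - a) / a for f, a in zip(forecasts, actual)]


steps = 30
actual = true_states(100.0, steps)
one_step_err = relative_errors(one_step_forecasts(100.0, steps), actual)
recursive_err = relative_errors(recursive_forecasts(100.0, steps), actual)

for n in (1, 10, 30):
    print(f"step {n:2d}: one-step error {one_step_err[n - 1]:.2%}, "
          f"recursive error {recursive_err[n - 1]:.2%}")
```

With these numbers the one-step error stays just under 1% forever, while the recursive error compounds to roughly a third of the value after 30 steps, even though the model itself never got any worse.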

### What about quotes?

And with quotes, things are much worse. The fact is that in weather forecasting, almost all of the measured and predicted data is **both the cause and the effect** of events: the effect of the events of the previous step and the cause of the events of the next step. Moreover, those significant data and events that are not both cause and effect are most likely pure causes and carry a powerful payload. For example, the amount of heat received from the Sun at a given moment. And it is invariable. It is this that raises the self-sufficiency of such forecasts. The effect flows into the cause of the events of the next step. This is a completely non-Markov process that can be described by differential equations.

Quote statistics, on the other hand, are mostly **either consequences or 50/50**. A rise in quotes can trigger a further rise and thus become a cause. Or it may trigger nothing and cause nothing. Or it can provoke profit taking and, as a result, a fall in prices. Historical exchange data looks solid. Volumes, prices, order books, so many numbers! The vast majority of them are good for nothing, being only the result, the echo, of events and causes that lie far beyond the plane of these statistics. On a completely different scale. In a completely different scope.

When modeling future quotes, we rely only on the consequences of events that are many times more complex than just a percentage deviation in purchase volume. Price does not shape itself. It cannot be **differentiated by itself**. If the market is pictured as a metaphorical lake, the stock chart is just ripples on the water. Maybe the wind blew, maybe someone threw a stone into the water, maybe a fish splashed, maybe Godzilla is jumping on a trampoline 200 kilometers away. We see only the ripples. But from these ripples we are trying to predict the strength of the wind in 4 days, the number of stones that will be thrown into the water in a month, the mood of the fish the day after tomorrow, or maybe the direction Godzilla will go when he gets tired of jumping. If he comes closer and sets up the trampoline again, the ripples will get stronger! We are catching the trend, hop hop hop!

This is a very important point: the self-sufficiency of a predictive model based only on the observed consequences tends to zero as the number of factors outside the monitored sphere of related events grows. In other words, you cannot model the system well enough without having a sufficiently complete description of it.
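Here is a rough numerical illustration of that point. It is a toy under loud assumptions: the outcome is simply the sum of one observed factor and several hidden ones, all independent and normally distributed, which real markets certainly are not. The function `explained_share` is a hypothetical helper that measures how much of the outcome can be explained from the observed part alone.

```python
import random

random.seed(0)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def explained_share(hidden_factors, samples=5000):
    """Share of the outcome's variance explained by the one observed factor.

    outcome = observed factor + sum of `hidden_factors` unobserved factors,
    all drawn independently from a standard normal distribution.
    """
    observed, outcome = [], []
    for _ in range(samples):
        x = random.gauss(0, 1)
        hidden = sum(random.gauss(0, 1) for _ in range(hidden_factors))
        observed.append(x)
        outcome.append(x + hidden)
    return pearson(observed, outcome) ** 2


shares = {k: explained_share(k) for k in (0, 1, 3, 20)}
for k, share in shares.items():
    print(f"{k:2d} hidden factors -> observed data explains {share:.0%}")
```

In this setup the explained share is about 1 / (1 + k) for k hidden factors, so with twenty things happening off-screen the visible data accounts for only a few percent of what happens next.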

Unfortunately, in the case of the market, the scale of the largest component that can be modeled comes down to a human being. Not even to the person, but to their psychophysical state, which determines their reaction to market behavior and which, through that very reaction, will itself influence the market. That very cause flowing into the consequence! Only here thousands, if not millions, of unique individual people would have to be modeled. With their personal problems, worries, hormones, interactions and everyday activities.

And it’s not just about traders in the market on a global scale. It is also about the people behind specific projects. It is about the problems and successes of projects in the future. It is about important events in the same future. Events, sometimes extremely unpredictable. It turns out that in order to predict the future, we need to know the future.

All in all, we need a sphere of observed conditions that is completely inaccessible to us. And a scale of simulation that is completely unattainable for us.

Well, in theory, of course, it is attainable. Brownian motion, in theory, is also a system that can be simulated and predicted, remember? Then remember the price of actually implementing such a simulation. That price is prohibitively higher than the cost of feeding exchange candles to a neural network. At least at the time of this writing.

## But what about the charts?

Indeed. At the very beginning of this article, we showed charts with extremely high forecasting accuracy, in places bordering on 100%.

What do you see? Take a closer look. A great match, isn’t it? Perfect, just perfect. And on the first and second charts, the neural network is literally one step ahead of the quotes!

Remember how I mentioned high-level libraries for working with neural networks, and then that thread went nowhere in the text? Here it comes. Wide availability of anything inevitably lowers the bar for the average user. The same thing happens with neural networks. Kaggle kernels hold the record here. Any section that is not narrowly specialized is simply buried under tons of solutions whose authors, in the vast majority, have no idea what they are doing at all. And underneath, each solution is propped up by columns of laudatory comments from people who understand the issue even less. “Great job, just what I needed!”, “I’ve been looking for a kernel suitable for my tasks for so long, here it is! How do I use it?” and so on.

To find among this something really interesting and beautiful is very, very difficult.

<rabid snobbery> As a result, we have such a phenomenon as people who easily operate with a rather complicated mathematical apparatus, but are not able to read graphs. </ rabid snobbery>

After all, time on the X axis moves to the right, and a prediction, ideally, should be obtained before the event.

## The hyperparameters just aren’t tuned yet

We are all happy when our neural network shows signs of convergence. But there are nuances. In programming as such, there is a rule: “it runs” does not mean “it works.” When we are just starting to learn programming, we are immensely pleased that the compiler / interpreter managed to understand what we fed it and did not throw any errors. At that stage, we believe that the only errors in a program are syntax errors.

In the design of neural networks, everything is the same. Only instead of compilation there is convergence. The fact that it converged does not mean it learned exactly what we need. So what did it learn?

An inexperienced researcher, looking at such beautiful charts, is likely to be delighted. But a more or less experienced one will be alarmed, because there are not that many options:

- The network is clearly overfitted (overtrained, in the sense of “too much” rather than “again”)
- The network exploits a flaw in the training method
- The network has approximated the Exchange Grail and is able to predict the state of the market at any moment in time, “spreading out” an endless chart from just one candle

Which option do you think is closer to reality? Unfortunately, not the third. Yes, the network really did learn. It really does produce amazingly accurate results, but why?

Although artificial neural networks are not an “electronic model of the human brain,” they still exhibit some properties of a “mind”. Chiefly, these are “laziness” and “cunning”. Both at once. And these are not consequences of self-awareness emerging in a couple of hundred “neurons”. They are consequences of the fact that the populist term “learning” actually hides the term “optimization”.

A neural network is not a student who is studying and trying to understand what we are explaining, at least not at the time of this writing. A neural network is a set of weights whose values must be adjusted, or optimized, so as to minimize the error of the network’s output relative to the reference result.
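To make “a set of weights adjusted to minimize the error” concrete, here is about the smallest possible sketch: a “network” consisting of a single weight, trained by plain gradient descent. The task (`y = 3 * x`), the data and the learning rate are all made up for illustration; real libraries do the same thing with millions of weights and fancier optimizers.

```python
# Reference data for the hypothetical task y = 3 * x.
inputs = [1.0, 2.0, 3.0, 4.0, 5.0]
targets = [3.0 * x for x in inputs]


def mean_squared_error(weight):
    return sum((weight * x - y) ** 2 for x, y in zip(inputs, targets)) / len(inputs)


def gradient(weight):
    """Derivative of the mean squared error with respect to the weight."""
    return sum(2 * x * (weight * x - y) for x, y in zip(inputs, targets)) / len(inputs)


weight = 0.0          # the whole "network": one adjustable number
learning_rate = 0.01
initial_error = mean_squared_error(weight)

for _ in range(100):
    weight -= learning_rate * gradient(weight)   # step against the gradient

final_error = mean_squared_error(weight)
print(f"weight = {weight:.4f}, error {initial_error:.2f} -> {final_error:.2e}")
```

Note that nothing in the loop knows what the task “means”. The only thing being pursued is a smaller error number, by whatever weight value gets there.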

We give the neural network a task, and then we ask it to “pass the exam.” According to the results of the “exam”, we decide how successful it is, rightly believing that in the process of preparing for the “exam”, our network has acquired sufficient knowledge, skills and experience.

See the catch? No? But here it is, right on the surface! While your goal is to teach your network what you consider useful skills, its goal is to pass the exam.

At any cost. By any means. Perhaps it has more in common with some students than was claimed two paragraphs ago, after all …

So how do you pass the notorious exam?

### Memorize

This is the first option on the list of possible reasons for such incredible accuracy. Almost any novice researcher of artificial neural networks knows for certain that the more neurons a network has, the better. And better still when it has many, many layers.

But what they do not take into account is that the number of neurons and layers increases not only the network’s potential for “abstract thinking” but also the amount of its memory. This is especially true for recurrent networks, because their memory capacity is truly monstrous.

As a result, during the optimization process it turns out that the best way to pass the exam is … ordinary cramming, also known as “overfitting”. The network simply learns all the “correct answers” by heart, with absolutely no understanding of the principles behind them. As a result, when the network is tested on a data sample it has never seen before, it starts spouting nonsense.
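Cramming is easy to reproduce without any neural network at all. In the sketch below, a nearest-neighbour lookup stands in for an oversized network (that swap is my simplification, as is the made-up relationship `y = 2x + noise`): it remembers every training answer exactly and is then compared with a humble straight line on data it has never seen.

```python
import random

random.seed(1)


def make_data(n):
    """Noisy samples of the hypothetical relationship y = 2x + noise."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + random.gauss(0, 1)) for x in xs]


train, test = make_data(500), make_data(500)


def memorizer(x):
    """Answers with the target of the closest training input (pure cramming)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def fit_line(data):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in data)
             / sum((x - mean_x) ** 2 for x, _ in data))
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(train)


def line(x):
    return slope * x + intercept


def mse(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


print(f"memorizer: train {mse(memorizer, train):.2f}, test {mse(memorizer, test):.2f}")
print(f"line:      train {mse(line, train):.2f}, test {mse(line, test):.2f}")
```

The memorizer passes the “exam” on the training set with zero error, because it has memorized the noise along with the signal, and then loses to the simple line on the unseen sample.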

For this reason, training deep / wide networks requires much more data, it requires regularization, and it requires control over the minimum error threshold, which should be small, but not too small. Better still, find the right balance between the size of the network and the quality of the solution.

Good. Noted. We throw out the extra layers. We simplify the architecture. We implement all sorts of tricks. Will it work now? Not necessarily. After all, number two on the list of easy ways to pass the exam is:

### Outsmart the teacher

Since a neural network is graded not for the process but for the result, the process by which it achieves that result may differ somewhat from what the developer intended. This is one of the nastiest moments of working with these beautiful creatures: the network has learned, just not the right thing.

When you see charts with price predictions that perfectly repeat the real price, think about what you actually taught the neural network. To predict prices with super accuracy? Or maybe just to repeat them like a parrot?

You can be sure that a network with almost 100% accuracy on the training set and the same on the test set is simply repeating everything it sees. Networks whose prediction chart is shifted one step to the right in time (example charts 1 and 2) simply repeat the price value from the previous step, which is passed to them at the new one. The charts, of course, look very encouraging and match almost perfectly, but they have no predictive power. You can announce yesterday’s price today all by yourself; you do not need to study at Hogwarts or polish a palantir for that, right?
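The “shifted one step to the right” symptom can be checked with numbers instead of eyeballing a chart. A minimal sketch, assuming a synthetic random-walk price series and a parrot “model” whose forecast is simply the previous price (both are stand-ins, and the helper names `mse_at_lag` and `best_lag` are mine):

```python
import random

random.seed(2)

# Hypothetical price series: a random walk starting at 100.
prices = [100.0]
for _ in range(999):
    prices.append(prices[-1] + random.gauss(0, 1))

# A "model" that has learned to parrot: its forecast for step t is the price at t-1.
predicted = [prices[0]] + prices[:-1]


def mse_at_lag(actual, forecast, lag):
    """Error between forecast[t] and actual[t - lag]."""
    pairs = [(forecast[t], actual[t - lag]) for t in range(lag, len(actual))]
    return sum((f - a) ** 2 for f, a in pairs) / len(pairs)


def best_lag(actual, forecast, max_lag=5):
    """The shift at which the forecast matches the real series best."""
    return min(range(max_lag + 1), key=lambda lag: mse_at_lag(actual, forecast, lag))


for lag in range(3):
    print(f"lag {lag}: mse {mse_at_lag(prices, predicted, lag):.3f}")
print("forecast matches the real series best at lag", best_lag(prices, predicted))
```

A forecast with real predictive power should match the actual series best at lag 0. If the best match sits at lag 1, the chart is showing yesterday’s price with a straight face.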

But that is what happens if you give them values from the previous step and compare with the value of the current step. Sometimes people just give the value from the current step and compare it with the next step. In that case, we get beautiful charts that match the originals almost perfectly (example charts 3 and 4).

Sometimes you can see charts that do not match perfectly but look softer, as if smoothed or interpolated. This is usually a clear sign of a recurrent network that is trying to link a new result with the previous one (example chart 3).

All these results have one thing in common: the neural network has learned to pass the exam with an A+. But it has not learned to solve the assigned task in the way that was required, and it is of no practical use to the researcher. Just like a student with a cheat sheet, right?

Why does the network repeat the previous values instead of trying to generate new ones? Simply because during training it comes to the reasonable conclusion that the point closest to the next point on the chart is usually the previous one. Yes, the size of the error floats in this case, but over a large sample it is consistently smaller than the error of trying to predict the next state of a quasi-random process.
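This “reasonable conclusion” can be reproduced with a one-parameter model. In the sketch below, an ordinary least-squares fit of `next = a * previous + b` stands in for training (my simplification), and the series is again a synthetic random walk rather than real quotes. It also compares the parrot strategy with an honest attempt to extrapolate the latest move.

```python
import random

random.seed(3)

# Hypothetical quasi-random price series (random walk).
prices = [100.0]
for _ in range(4999):
    prices.append(prices[-1] + random.gauss(0, 1))

previous, following = prices[:-1], prices[1:]

# Least-squares fit of next = a * previous + b, a stand-in for "training".
n = len(previous)
mean_p = sum(previous) / n
mean_f = sum(following) / n
a = (sum((p - mean_p) * (f - mean_f) for p, f in zip(previous, following))
     / sum((p - mean_p) ** 2 for p in previous))
b = mean_f - a * mean_p


def mse(forecasts, actual):
    return sum((f - y) ** 2 for f, y in zip(forecasts, actual)) / len(actual)


# Forecasts for prices[2:], so both strategies are scored on the same targets.
targets = prices[2:]
repeat_last = prices[1:-1]                       # x[t+1] = x[t]
follow_trend = [2 * prices[t] - prices[t - 1]    # x[t+1] = x[t] + (x[t] - x[t-1])
                for t in range(1, len(prices) - 1)]

print(f"fitted model: next = {a:.3f} * previous + {b:.3f}")
print(f"repeat last value: mse {mse(repeat_last, targets):.2f}")
print(f"extrapolate trend: mse {mse(follow_trend, targets):.2f}")
```

The fitted coefficient lands very close to 1, which is the parrot function in disguise, and extrapolating the trend roughly doubles the error on such a series. From the optimizer’s point of view, repeating the last value is not laziness but the best available answer.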

Neural networks generalize very well. And from the optimizer’s point of view, a generalization of this kind is an excellent solution to the problem.

Alas, no matter how you tune the hyperparameters, the future will not open up to it. The chart will not move one step back in time. Yes, the Grail is so close, and yet so far.

### Stop. No, no, no. But what about algorithmic trading? It exists!

Exactly. Of course it exists. But the key point is that algorithmic trading is not the same thing as algorithmic fortune-telling. Algorithmic trading is based on a trading system analyzing the market at the current moment and deciding to open or close a position based on a large number of objective parameters and indirect signs.

Yes, technically this is also an attempt to predict the behavior of the market but, unlike predictions for days and months ahead, the trading system tries to work on the smallest feasible time intervals.

Remember the weather forecast? Remember that its accuracy drops dramatically over long distances? It works both ways. The shorter the distance, the higher the accuracy. You, looking out the window, even without being a meteorologist, can predict what air temperature will be in a second, right?
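For a feel of how quickly accuracy decays with distance, here is a sketch on a synthetic random walk (an assumption; neither weather nor real quotes behave exactly like this). It measures the error of the simplest possible forecast, “in h steps the value will be the same as now”, for several horizons h.

```python
import random

random.seed(4)

# Hypothetical series: a random walk with unit-variance steps.
series = [0.0]
for _ in range(9999):
    series.append(series[-1] + random.gauss(0, 1))


def naive_rmse(values, horizon):
    """RMSE of the forecast "in `horizon` steps the value will be what it is now"."""
    errors = [(values[t + horizon] - values[t]) ** 2
              for t in range(len(values) - horizon)]
    return (sum(errors) / len(errors)) ** 0.5


rmse_by_horizon = {h: naive_rmse(series, h) for h in (1, 10, 100)}
for h, rmse in rmse_by_horizon.items():
    print(f"horizon {h:3d} steps: rmse {rmse:.2f}")
```

For a pure random walk this error grows roughly as the square root of the horizon, so looking a hundred steps ahead is about ten times less accurate than looking one step ahead.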

But how does it work? Is this not contrary to all that has been said? But what about ripples on the water, how about the lack of data? What about Godzilla, after all !?

But no, there are no contradictions. As long as a trading bot works on very small intervals, really small ones, from a minute down to fractions of a second depending on its type, it does not need to know the future and does not need a complete picture of the market. It only needs to understand how the system around it works: under which circumstances it is better to open a position and under which to close it. A trading bot operates on such a small scale that its field of view can cover enough factors to make a successful decision over an acceptably short distance. And for that it has absolutely no need to know the global state of the system.

## Conclusion

The article turned out big. Bigger than I expected. I hope it will be useful to someone and will save time for those who have decided to try their luck in search of the Holy Grail of trading.

Let’s highlight the main points:

- Godzilla has a trampoline
- You need to understand how the tools you use to solve the problem actually work
- It is necessary to understand the limits of applicability and adequately assess the solvability of the problem as such
- It is important to be able to correctly interpret the results of the toolkit
- Neural networks are function approximators, not predictors of the future
  - `f(x) = x` and `f(x_n) = x_{n-1}` are functions too

- To simulate the state of a system, you need to have a complete or close description of this system
- Statistics are only a partial, selective description of the consequences of what happens in the system
- A good forecasting system should be moderately self-sufficient
- Neural networks cannot be taken at their word: they are insidious, cunning and lazy
- Want AI to help you trade? Teach it to **trade**

Thanks to everyone who read to the end!

**P.S. No, this is not an article about “fundamental vs. technical analysis.” This article is about “there are no miracles.”**

## 11 comments

Summary of the article.

1) The market is pretty chaotic, and chaos cannot be predicted – therefore, neural networks (and everything else) do not work if they look only at the price chart.

2) I love letters.

scale. On a small scale, where the process can still be called non-Markovian, its behavior is more or less predictable, and algorithmic trading works quite tolerably there. But it is hard for people to trade at such levels; they need longer time frames, and that is where forecasting breaks down. The point of the article is to describe the reasons for this from the inside, in as much detail as possible. Otherwise such statements would be worth no more than an IMHO.

2) That is exactly why there is a TL;DR at the very beginning, with an even shorter summary of the article.

So the exchange is the environment and the neural network trades in it? How do you calculate the reward? Something like “managed / failed to make a successful trade”?

I think RL is the most sensible option. For example, OpenAI taught a computer to play Dota 2, where the opponent's actions can also be chaotic. Or am I crazy?

May I ask: do you have any experience showing that algorithmic trading really does work on a small scale? In my experience, the same chaos problems show up there as on large intervals.

I do. As a hobby, I am building a bot that learns to trade rather than to predict prices. So far the results are encouraging, although they are purely “laboratory” ones, obtained on a simulated local exchange that accounts for commissions and the like. It uses recurrent networks and reinforcement learning. But at this stage, at least, it is mostly of academic interest.

The trouble is that the smaller the interval, the higher the accuracy, but the amount of data and the training time grow as well. So far the M1 interval works as an adequate compromise. At larger intervals the network stops understanding what is happening; at smaller ones you will never have enough time.

Well, in general, algorithmic trading was invented a long time ago, and it does not necessarily use neural networks. A substantial share of trades on the markets is made by trading bots.

data with all related data, including prices, volumes, relative movements, number of trades, weighted average prices and the “agent's” own state. This makes it possible to get the maximum training speed and to check performance over different periods before running the simulation over the “long term”.

Yes, I also think RL is the best solution for this task.

To calculate the reward, we use our own methodology, which reduces the evaluation of a transaction to an analytical task. Let’s just say that, knowing the state of our deposit, our average price, the external price and the nature of the transaction, you can assign the transaction an objective score that actually describes it, rather than just “managed / failed”.

You can, of course, simply use the balance after the transaction, but then training degenerates toward an exhaustive enumeration of states, and the network has to approximate “extra” logic that could be moved to the application level.

Plus, a balance-based score only works after the fact, while a per-transaction score works in real time and lets you evaluate all available options in advance, which makes each iteration within an epoch more effective for learning.
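As a rough illustration of the difference between scoring a transaction and scoring the resulting balance, here is a minimal Python sketch. It is not the methodology described above, which is not disclosed. The `Position` class, the `trade_score` function, the scoring rule itself and the `fee_rate` value are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Position:
    quantity: float   # units of the asset currently held
    avg_price: float  # average price paid for those units


def trade_score(position: Position, side: str, amount: float,
                market_price: float, fee_rate: float = 0.001) -> float:
    """Hypothetical per-transaction score (an assumption, not the author's rule).

    Selling above the average price or buying below it scores positively;
    the commission is always subtracted. "hold" scores zero.
    """
    if side == "hold":
        return 0.0
    if amount <= 0:
        raise ValueError("amount must be positive")
    fee = amount * market_price * fee_rate
    if side == "sell":
        if amount > position.quantity:
            raise ValueError("cannot sell more than is held")
        return amount * (market_price - position.avg_price) - fee
    if side == "buy":
        if position.quantity == 0:
            # Nothing to compare against yet: only the commission counts.
            return -fee
        return amount * (position.avg_price - market_price) - fee
    raise ValueError(f"unknown side: {side!r}")


# All options can be scored before acting, without waiting for the balance to change.
position = Position(quantity=2.0, avg_price=100.0)
scores = {side: trade_score(position, side, 1.0, 110.0)
          for side in ("buy", "sell", "hold")}
best_action = max(scores, key=scores.get)
```

Because the score depends only on values known at decision time, every candidate action gets feedback in the same step, which is the property the comment above attributes to per-transaction evaluation.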

Try digging in a slightly different direction. There is no need to teach the network to trade. Train it on the market microstructure to determine the state of the market. For stable, profitable long-term trading it is more than enough to tell whether the market is flat (run MM), in aggressive accumulation / distribution (run a trend strategy), or in an “unclear” state (sit on the fence).

P.S. In general, I believe attempts to bolt neural networks onto trading come from laziness: “I don’t want to understand how the market works, so I’ll feed the neural network as much data as I can and maybe it will figure it out…” My bots cope with analyzing the market microstructure quite successfully without neural networks. Once you understand which data matter for the analysis, you can feed them to ML, and that will work well; but once you have that understanding, simple quantitative analysis or a regression is enough. Feeding bare price series to ML, and in OHLC format at that, is pointless, not only because they carry no cause-and-effect relationships but also because it is the very first, most banal idea that occurs to everyone who takes an interest in this area. So if there ever was something there, that inefficiency has already been found and eliminated.

That is actually what the article is about. And no, it is not laziness, it is interest. If I were lazy, I would not bother with the low-level TensorFlow API, data analysis and preparation, or writing helper functions to evaluate what is going on. Roughly speaking, the neural network as such makes up about 30 percent of the logic in my pet project. The rest is ordinary applied mathematics.

And yes, training is carried out under the “supervision” of a system that reduces trading to an analytical problem and can make decisions completely autonomously, without any training as such. With commission fees turned off, it works wonders.

The task of the neural network here is to approximate that system's logic while accumulating experience that helps it get past the “local minima” in which the “pure solution”, which does not take time into account, gets stuck.

The article does explain that chaos can be predicted (the weather, for example). The point is rather that chaos comes in different kinds. Some of them can apparently be predicted too, but there is no sense in doing so. And certainly not the way amateurs try to do it.

I am currently testing a neural network on the real market that predicts the color of the next M1 candle for the EURUSD pair. Predictions with a probability above 70% make up 2% of all predictions, i.e. every 50th candle. I keep only those, which gives about 20 predictions per day. It all works on M5 too. The author did not succeed, hence so many words. Advice to the author: there is really not enough chaos in crypto, and not enough history either; start with EURUSD.

1) M1 is an interval sufficient for work; on it the system really is more or less predictable. I have already covered this point above.

2) Predicting the color of a candle: is that like binary options? Just green / red, without the magnitude of the move?

3) “Predictions with a probability of more than 70% occur in 2% of predictions, i.e. every 50th candle.” And the rest? “Every 50th candle” does not sound very statistically significant, to be honest. In projects like this, testing should always be done alongside an agent that acts randomly, comparing the number of successful trades against it. In the case of a binary option, a random agent will, over a theoretically long distance, make approximately 50% correct decisions.
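The comparison with a random agent can be made concrete with a one-sided exact binomial test. The sketch below is only an illustration: the function name `p_value_vs_coin_flip` is hypothetical, and the hit counts are made-up numbers, not results reported by anyone in this thread. It assumes each candle is an independent green / red guess that a random agent gets right with probability 0.5.

```python
from math import comb


def p_value_vs_coin_flip(hits: int, total: int) -> float:
    """Probability that a purely random agent (p = 0.5) gets at least
    `hits` of `total` binary predictions right."""
    if not 0 <= hits <= total:
        raise ValueError("hits must be between 0 and total")
    # Count the outcomes with `hits` or more successes among 2**total equally likely ones.
    favourable = sum(comb(total, k) for k in range(hits, total + 1))
    return favourable / 2 ** total


# Illustrative numbers only: one day of filtered signals vs. ten such days.
one_day = p_value_vs_coin_flip(14, 20)      # 70% hit rate over 20 predictions
ten_days = p_value_vs_coin_flip(140, 200)   # the same hit rate over 200 predictions
print(f"20 predictions:  p = {one_day:.4f}")
print(f"200 predictions: p = {ten_days:.2e}")
```

With 14 correct out of 20 the p-value is about 0.058, so a random agent would do at least that well roughly one day in seventeen. The same 70% hit rate sustained over 200 predictions is far harder to explain by chance, which is why the sample size matters more than the headline percentage.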