Trading strategies, neural networks and genetic algorithms…

In the first post, I wrote about some of the features I hoped to implement into software for (simulating) the trading of share price volatility.

The first approach I want to try uses neural networks whose weights are adapted or ‘trained’ by a genetic algorithm over the course of hundreds of ‘generations’. Each neural network (from now on just ‘network’) is a ‘strategy’, providing buy (or sell) prices for the current day only (this is a daily model, although it could operate over different time periods, assuming the data is available).

As a very brief description of my proposed approach: a set of neural networks is first trained to fix their internal state (their set of weights). Once trained, each network provides buy (or sell) prices and probabilities. It does this by taking a set of pre-processed and scaled inputs, mostly derived from past price data (the price data of whatever share we are investigating, but it could also include other price ratios, volumes, dates, indicators, as well as previous states of the strategy – more on this in a later post). The inputs are multiplied by the weights, summed and pushed through ‘sigmoid’ functions. This is repeated through several layers of the network, the final layer being the output layer. The outputs are the high and low prices (for buy or sell), together with a probability for each price. So if the share price moves outside the high–low price band, a decision is made to buy (or sell), subject to the probability for that band.
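To make that a little more concrete, here is a minimal Python sketch of a network along those lines. The class name, the layer sizes, the reading of the four outputs and the ±5% scaling of the band around the open price are all illustrative assumptions rather than the real implementation:

```python
import numpy as np

def sigmoid(x):
    """Standard logistic squashing function."""
    return 1.0 / (1.0 + np.exp(-x))

class Network:
    """Minimal fixed-topology feed-forward network.

    layer_sizes might be [n_inputs, 8, 8, 4]; the four outputs are read
    here (illustratively) as: buy offset, sell offset, buy probability,
    sell probability.
    """

    def __init__(self, layer_sizes, rng):
        # One weight matrix per layer, with an extra column for the bias.
        self.weights = [rng.normal(0.0, 1.0, (n_out, n_in + 1))
                        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

    def forward(self, inputs):
        """Multiply by weights, sum and squash - repeated for every layer."""
        a = np.asarray(inputs, dtype=float)
        for w in self.weights:
            a = sigmoid(w @ np.append(a, 1.0))   # append 1.0 as the bias input
        return a

    def price_band(self, inputs, open_price, max_offset=0.05):
        """Map the four sigmoid outputs onto a band around today's open.
        The +/-5% maximum offset is purely illustrative."""
        low_u, high_u, p_low, p_high = self.forward(inputs)
        low = open_price * (1.0 - max_offset * low_u)
        high = open_price * (1.0 + max_offset * high_u)
        return low, high, p_low, p_high
```

Keeping the whole internal state in a plain list of weight matrices makes it easy for the genetic algorithm (see the Training Mode sketch below) to cross over and mutate networks without worrying about topology.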

Together, the selected set of networks operates as a team or ‘ensemble’, voting on the price band and probabilities. By ordering the networks by their prices and summing their probabilities, deterministic prices are derived for different levels of ‘voting’.
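A minimal sketch of that voting calculation, continuing in Python. Whether the probability mass is normalised by the ensemble’s total probability (rather than by the number of networks), and which way the list is sorted for buy versus sell prices, are assumptions on my part:

```python
def vote_price(prices_and_probs, voting_pct, descending=False):
    """Derive a single ensemble price at a given voting level.

    prices_and_probs: one (price, probability) pair per network.
    voting_pct: e.g. 0.30 means 'walk the sorted list until 30% of the
    total probability mass has voted'.  Normalising by total probability
    and the sort direction are assumptions about the scheme, not the
    definitive method.
    """
    threshold = voting_pct * sum(p for _, p in prices_and_probs)
    cumulative = 0.0
    price = None
    for price, prob in sorted(prices_and_probs, reverse=descending):
        cumulative += prob
        if cumulative >= threshold:
            break
    return price
```

So, for example, `vote_price(buy_votes, 0.30)` would give the buy price that at least 30% of the ensemble’s probability mass has ‘voted’ for.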

The software and the networks will operate in one of three modes:

Training Mode

During training there is a population of, say, one thousand networks with randomly generated weights. Each network runs through every simulated trading day of the training period (typically 1 or 2 years); on each simulated day the software decides whether to buy (or sell) based on the network’s price and probability outputs. After all networks have traded and their performance has been assessed, a new generation of networks is created using a genetic algorithm (GA). The GA selects pairs of networks from the existing generation based on their ‘fitness’ (how successful they were at trading) and performs a ‘mating’ operation to create a new generation of ‘child’ networks. Over the course of many generations, the GA slowly evolves the networks’ weights to maximise the ‘objective function’ – the success at trading during the training interval.

During this process, each network is also ‘back-tested’ to see how well it performs against data it has not encountered during training – typically the 7–12 months that immediately follow the training period. Only the networks that are most successful during the backtest period are saved into the ensemble used for voting.
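As a rough sketch of a single generation step (reusing `Network.weights` from the earlier sketch): fitness-proportional selection with uniform crossover and occasional Gaussian mutation is one plausible choice of operators rather than a final decision:

```python
import numpy as np

def next_generation(parent_weights, fitnesses, rng,
                    mutation_rate=0.01, mutation_scale=0.1):
    """One GA step over a population of weight sets.

    parent_weights: a list where each element is one network's list of
    weight matrices (e.g. Network.weights from the earlier sketch).
    fitnesses: one trading score per network, higher is better.
    Uniform crossover plus sparse Gaussian mutation is one common choice
    of 'mating' operator, offered here purely as an illustration.
    """
    fit = np.asarray(fitnesses, dtype=float)
    fit = fit - fit.min() + 1e-9              # shift so every fitness is positive
    probs = fit / fit.sum()                   # fitness-proportional selection

    children = []
    for _ in range(len(parent_weights)):
        i, j = rng.choice(len(parent_weights), size=2, replace=False, p=probs)
        child = []
        for wm, wd in zip(parent_weights[i], parent_weights[j]):
            mask = rng.random(wm.shape) < 0.5                 # uniform crossover
            w = np.where(mask, wm, wd)
            mutate = rng.random(w.shape) < mutation_rate      # sparse mutation
            w = w + np.where(mutate, rng.normal(0.0, mutation_scale, w.shape), 0.0)
            child.append(w)
        children.append(child)
    return children
```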

Backtest Voting Mode

In voting mode, all the saved networks operate over a specified (past) date interval using ‘voting’ at a predetermined percentage level. Typically, the voting percentage is incremented – e.g. by 5% from say 20% to 50% voting – in order to determine a historically optimal voting percentage (over whatever date range has been selected). Adaptive voting, where successful networks are reinforced at the cost of unsuccessful networks, is a more complex alternative approach. It is reasonable to expect the ensemble of networks to perform well over the date period that was used for training and for backtesting (as they were individually selected based on their ability within this period). What is unknown is how well the ensemble performs after the backtest period has finished – and what percentage of voting amongst the ensemble produces the best returns.
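The sweep itself is simple to sketch. `backtest_return` below is a hypothetical stand-in for whatever function runs the saved ensemble over the chosen date range at a given voting level and reports the simulated return:

```python
def best_voting_level(backtest_return, low_pct=20, high_pct=50, step_pct=5):
    """Sweep the voting percentage over a backtest window and keep the level
    with the best simulated return.  `backtest_return` is a hypothetical
    callable: voting level in, simulated return out.
    """
    results = {pct / 100: backtest_return(pct / 100)
               for pct in range(low_pct, high_pct + 1, step_pct)}
    best = max(results, key=results.get)
    return best, results
```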

Live Mode  

In Live Mode, the daily open price of the share is the trigger to calculate today’s low and high prices for the buy (or sell). To operate in this mode, the voting percentage must be fixed to some value. The code sorts all the networks in order of price and sums their probabilities until the specified voting percentage is reached. Actual live prices can then be pulled in from live data streams and an email sent when a buy or sell decision is reached. Or the software could interact with a share broker’s ‘API’ to trade automatically without human intervention. I doubt I’d want to go that far – maybe with a virtual portfolio.
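Pulling the earlier sketches together, a Live Mode pass might look something like this. The function names, the buy-below/sell-above trigger rule and the sort directions are my illustrative assumptions, building on the `price_band` and `vote_price` sketches above:

```python
def live_band(ensemble, todays_inputs, open_price, voting_pct):
    """Derive today's buy/sell band from the saved ensemble at a fixed
    voting percentage, reusing the earlier illustrative sketches."""
    bands = [net.price_band(todays_inputs, open_price) for net in ensemble]
    buy_votes = [(low, p_low) for low, _, p_low, _ in bands]
    sell_votes = [(high, p_high) for _, high, _, p_high in bands]
    buy_price = vote_price(buy_votes, voting_pct)                    # ascending sort
    sell_price = vote_price(sell_votes, voting_pct, descending=True) # descending sort
    return buy_price, sell_price

def check_tick(last_price, buy_price, sell_price):
    """Called on each live tick; returns an alert message or None.
    Buying below the band and selling above it is my assumed trigger rule."""
    if last_price <= buy_price:
        return f"BUY signal: price {last_price} at or below {buy_price}"
    if last_price >= sell_price:
        return f"SELL signal: price {last_price} at or above {sell_price}"
    return None
```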

Confidence

Which leads to my next thought – even if everything appears to work well, a key consideration is having confidence that the software is doing what I expect. While some bugs are inevitable, there absolutely must not be any foresight or bias in the process. Foresight arises when knowledge of future price data is needed to derive today’s price and probability bands. What I call bias arises when the order in which the code operates affects the outcome – e.g. if the code implicitly assumes the daily high always comes before the daily low.

One way to gain reasonable confidence is to check that the network outputs generated for today in Live Mode are reproduced tomorrow (and subsequently) by the Backtest Voting Mode. I could write a load more on this subject – together with the difficulties/feasibility of replicating the simulated returns in the ‘real world’ – but that’s probably too much waffle for one sitting.
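Still, a simple version of that check is easy enough to sketch, assuming both modes log their price bands by date (the dict-of-tuples storage here is just a placeholder):

```python
def check_reproducibility(live_log, backtest_results, tolerance=1e-9):
    """Compare bands recorded in Live Mode with those recomputed later in
    Backtest Voting Mode for the same dates.  Both arguments are assumed
    to be dicts keyed by date with (buy_price, sell_price) values."""
    mismatches = []
    for date, live_band in live_log.items():
        bt_band = backtest_results.get(date)
        if bt_band is None:
            mismatches.append((date, "missing from backtest"))
        elif any(abs(a - b) > tolerance for a, b in zip(live_band, bt_band)):
            mismatches.append((date, live_band, bt_band))
    return mismatches  # an empty list means the two modes agree so far
```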

So what all this is leading to is, hopefully, a system that can recognise complex patterns in data and use this ability to suggest a price band at the start of each trading day (in Live Mode). If past data patterns prove to be a predictor of future patterns then hopefully – and on average – there may be a long-term out-performance relative to a buy-and-hold approach.

Next post…