Preliminary Results with the Neural Network Ensemble

Please see the disclaimer. This blog does not contain investment advice.

My previous post discussed the plan for developing volatility-trading strategies using artificial intelligence, starting with an approach based on a stochastic neural-network ensemble. I should point out that I began working on this problem long before starting this blog, and the neural-network approach is now quite well developed.

At this stage I could go into more detail on the development of the software, but I prefer to leave that for later and jump into some preliminary results. So far I've managed to incorporate most of the desired features that I listed in my previous posts, the exception being the adaptive voting. There is also more work to do on some aspects, including the network inputs – developing new inputs and determining which are most significant.

To date I've generated strategies (neural-network voting ensembles) for about 40–50 companies (nearly all my testing thus far has been with FTSE 250 shares, plus a few from AIM). Perhaps obviously, the candidate shares with the most potential for beating buy-and-hold are those that are most volatile; a high beta is the closest readily available measure of this, though not an ideal one. Of the companies tested thus far, there is quite a lot of variability in the performance of the strategies, even at similar levels of volatility in the share price. What is more, this variation in performance across companies appears to be consistent across several different training periods (at least with data since 2011; I haven't done much testing with earlier data). It is not yet clear why there should be such variability.
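
For reference, beta is just the regression slope of a share's daily returns against the index's returns. A minimal sketch, assuming two aligned series of daily closes (not code from the actual software):

```python
import numpy as np

def beta(share_closes, index_closes):
    """Beta of a share vs. an index: covariance of the share's daily returns
    with the index's returns, divided by the variance of the index's returns."""
    share_closes = np.asarray(share_closes, dtype=float)
    index_closes = np.asarray(index_closes, dtype=float)
    share_ret = np.diff(share_closes) / share_closes[:-1]
    index_ret = np.diff(index_closes) / index_closes[:-1]
    return np.cov(share_ret, index_ret)[0, 1] / np.var(index_ret, ddof=1)
```

A share with a beta well above 1 tends to move more than the index, which is why it serves as a rough (if imperfect) screen for volatile candidates.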

To recap, in case you haven't read the previous posts: the neural network ensemble votes each day and provides a lower and an upper price band; the algorithm only recommends buying (or selling) when the price moves outside the band. The way the band is calculated (by summing probabilities) means that sometimes there is only an upper or only a lower price (typically when selling or buying, respectively). Trading cycles between a buying state (when fully in cash) and a selling state (when fully invested). I have not explored shorting.
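
As a rough illustration of that trading cycle (the band construction itself is the interesting part and isn't shown), the simulation loop amounts to something like the sketch below. I've assumed the rule is "buy when the price drops below the lower band, sell when it rises above the upper band"; the names are my own, not the actual software's.

```python
def simulate_trading(prices, bands, initial_cash=3000.0):
    """Cycle between a buying state (fully in cash) and a selling state
    (fully invested). `bands` is a sequence of (lower, upper) pairs; either
    element may be None, since the summed vote probabilities sometimes give
    only one side of the band. No shorting; dividends ignored."""
    cash, shares = initial_cash, 0.0
    for price, (lower, upper) in zip(prices, bands):
        if shares == 0.0 and lower is not None and price < lower:
            shares, cash = cash / price, 0.0      # buy: price has dropped below the band
        elif shares > 0.0 and upper is not None and price > upper:
            cash, shares = shares * price, 0.0    # sell: price has risen above the band
    return cash + shares * prices[-1]             # current value of the investment
```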

For the first trials, I wanted to see how well the ensemble of networks performs over several years of data – data that comes immediately after the training and backtest periods. In these trials, the networks are trained from Jan 1 2011 to Jan 1 2012 and backtested from Jan 1 2012 to July 1 2012 (more precisely, the algorithms simulate trading from the first available trading day after the start date up to the last trading day before the end date, for both the training period and the backtest period). The ensemble is then tested on the new data at different voting percentages, from July 1 2012 up to the present date. Note that the simulations ignore dividends, as dividend data is not included in my data feeds; this may be a significant omission for some companies.
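
The three periods (train, backtest, and out-of-sample test up to the present) are defined by just three dates. A small sketch of the split, assuming a pandas Series of daily closes indexed by date (the real data handling will differ; `rio_closes` is a hypothetical example series):

```python
import pandas as pd

def split_periods(closes, train_start, train_end, backtest_end):
    """Slice a daily close series (a pandas Series indexed by date) into
    training, backtest, and out-of-sample test periods. Each period effectively
    runs from the first trading day on or after its start date to the last
    trading day before its end date; the test period runs to the present."""
    idx = pd.to_datetime(closes.index)
    t0, t1, t2 = (pd.Timestamp(d) for d in (train_start, train_end, backtest_end))
    train = closes[(idx >= t0) & (idx < t1)]
    backtest = closes[(idx >= t1) & (idx < t2)]
    test = closes[idx >= t2]
    return train, backtest, test

# e.g. train, backtest, test = split_periods(rio_closes, "2011-01-01", "2012-01-01", "2012-07-01")
```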

Note also that all the results discussed are entirely auto-generated – I select the company (or companies), set up the training and backtest dates (three dates in total, since the backtest starts when training finishes), select the set of inputs into the networks, optionally adjust a few parameters … then press the run button and wait a few hours. I want to avoid biasing the results, so I don't do any manual filtering or selection of networks post-training. The software automatically saves the networks that meet the performance criteria, and once the training process has finished, the saved networks are ready to be tested in voting mode.
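
In outline, the automated run boils down to a loop like the one below. The helpers (`train_one_network`, `simulate_gain`, `meets_criteria`) are hypothetical stand-ins for the real training, simulation, and performance criteria; the point is that networks are saved or discarded automatically, with no manual cherry-picking.

```python
def build_ensemble(train_data, backtest_data, input_set,
                   train_one_network, simulate_gain, meets_criteria,
                   n_candidates=200):
    """Train candidate networks and automatically save only those that meet
    the performance criteria on both the training and backtest periods."""
    saved = []
    for _ in range(n_candidates):
        net = train_one_network(train_data, input_set)   # one stochastic training run
        if meets_criteria(simulate_gain(net, train_data)) and \
           meets_criteria(simulate_gain(net, backtest_data)):
            saved.append(net)                            # becomes one voter in the ensemble
    return saved
```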

I typically use an initial simulated investment of £3K. The black plot in the charts (see figure 1) is the value of the investment over time for buy-and-hold (and thus reflects a fixed multiple of the share price). The coloured lines represent the simulated values of the investment when following the ensemble's trading advice, for different voting percentages (subject to the usual caveats when simulating trading – data errors and insufficient volume may make some of the trades infeasible).

The second type of plot is a ratio plot (see figure 2), where the vertical value reflects the increase or decrease in the number of shares held relative to buy-and-hold, e.g. a value of 2 indicates that approximately twice as many shares are now held relative to the initial investment. Note that the horizontal stretches of each chart represent periods invested in (a constant number of) shares; the positive slopes represent periods in cash where shares are sold and repurchased at a lower price; the negative slopes are where shares are repurchased at a higher price (not desirable). A vertical value less than 1 means the simulation now has fewer shares than it started with. A break in the chart up to the present date indicates that the simulation is in cash, awaiting an opportunity to repurchase the shares.
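
Put another way, the vertical value is "how many shares the strategy's current wealth would buy" divided by the fixed buy-and-hold holding (the black line in figure 1 is just that fixed holding times the price). A sketch of the calculation, with the same illustrative names as before:

```python
def share_ratio(prices, cash_series, shares_series, initial_cash=3000.0):
    """Ratio of the shares the strategy could currently hold to the constant
    buy-and-hold holding. Flat while invested (a fixed number of shares);
    while in cash it moves with 1/price, so it rises as the price falls."""
    bh_shares = initial_cash / prices[0]         # buy-and-hold holding, fixed throughout
    ratio = []
    for price, cash, shares in zip(prices, cash_series, shares_series):
        equivalent_shares = shares if shares > 0 else cash / price
        ratio.append(equivalent_shares / bh_shares)
    return ratio
```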

The following plots are for the simulated trades of RIO. This is one of the most consistent and promising performers over the years tested thus far, with mostly positive gains over a spread of voting percentages.

Training Dates: Jan 1 2011 to Jan 1 2012

Backtest Dates: Jan 1 2012 to July 1 2012

Inputs: Set_9

These are the simulated results of running the ensemble of neural networks on data immediately after the training and backtest, i.e. from July 1 2012 to the present date (figure 1). The legend shows the voting percentage. A percentage of 15% was optimal for this simulation (which is atypical – the best voting percentage is more usually in the 25–40% range), with a current value above the 4300 line, whereas buy-and-hold has a current value below the 2050 line.


Figure 1

This is the corresponding ratio plot for the RIO simulation (figure 2) – it shows the ratio of the number of shares currently held to the number bought with the initial investment, i.e. the gain relative to buy-and-hold. For a share that has lost a considerable proportion of its value since the simulation started, it is not surprising that the multiple is greater than 1 for RIO over this period – any strategy that stays out of the market for stretches of a falling share price will do better than buy-and-hold.


Figure 2

The power of voting…

I find this next illustration interesting (figure 3). It lists the individual networks (from the ensemble of 66 voting networks) in descending order of gain in total value. I developed this view of the data to see the relative performance of individual networks over different timeframes (because the networks are stochastic when run individually, I've thresholded their output: a probability above 0.5 counts as true, otherwise false).

It shows that the best network (from July 2012 to the present) had a gain of 20%, as did the next best two, followed by 19% for the following two, and so on. So a simulation with the best network has a current value of ca. 3600 (a 20% gain on the initial investment of 3000); the worst-performing network has a current value of under 1600 (a 45% loss, which is even worse than buy-and-hold). Of course, this is with hindsight; all the networks did well during training and backtesting, and we have no idea at the outset which network will be the best or worst in future. Compare this with the 15% ensemble vote, whose current value of over 4300 is better than even the best individual neural network of the 66. The optimal voting percentage is also unknown at the outset, but at least this value takes an early lead, in this example for RIO at least. The same cannot be said for the ordering of the individual networks – the ordering can change significantly over different simulation time-frames.


Figure 3
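
For completeness, a ranking like figure 3 could be produced along these lines – running each network on its own with its stochastic output thresholded at 0.5, then sorting by out-of-sample gain. `simulate_single` is a hypothetical stand-in for the single-network simulation, not the actual software's API.

```python
def rank_networks(networks, test_prices, simulate_single, initial_cash=3000.0):
    """Run each network individually (output probability > 0.5 counts as a
    vote, otherwise not) and list the networks in descending order of gain in
    total value. The ordering is only known with hindsight and can change
    markedly over different simulation time-frames."""
    gains = []
    for net in networks:
        final_value = simulate_single(net, test_prices, threshold=0.5,
                                      initial_cash=initial_cash)
        gains.append((net, (final_value - initial_cash) / initial_cash))
    return sorted(gains, key=lambda pair: pair[1], reverse=True)
```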

Optimal Voting Percentage During Training & Backtesting

The obvious question is: could the 15% optimal voting percentage have been predicted from the performance of the ensemble over the training and backtest period? For this RIO example, the optimal performance over that period again comes from a 15% voting rate (figure 4). Unfortunately, this seems to be a fluke – in most of the simulations I've done, the best percentage during training & backtesting is not the same as the best percentage afterwards. (Note that the gain in value in this chart is largely meaningless – the networks were 'evolved' to optimise their performance over this period's data – although too strong a performance here could indicate over-fitting to the training data, which would be detrimental to future predictions.)


Figure 4
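
The comparison behind figures 1 and 4 is simply a sweep over voting percentages, run once on the training & backtest data and once on the out-of-sample data. A sketch, with a hypothetical `simulate_ensemble` that runs the voting simulation at a given percentage:

```python
def sweep_voting_percentages(networks, prices, simulate_ensemble,
                             percentages=range(5, 55, 5), initial_cash=3000.0):
    """Evaluate the ensemble at a range of voting percentages over one period
    and return the best percentage plus the final value at each percentage.
    Running this on the training & backtest period and again on later data
    shows whether the 'optimal' percentage carries over (for RIO it did; for
    most other simulations it doesn't)."""
    final_values = {}
    for pct in percentages:
        # a trade is only signalled when at least pct% of the networks agree
        final_values[pct] = simulate_ensemble(networks, prices, voting_pct=pct,
                                              initial_cash=initial_cash)
    best_pct = max(final_values, key=final_values.get)
    return best_pct, final_values
```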

RIO turns out to be one of the best-performing simulations (in terms of beating buy-and-hold and potentially making a gain at optimal, or near-optimal, voting percentages). Whether this continues to be the case presumably depends on whether the training data from 2011 to 2012 remains a good model for future share price dynamics. The 2011 to 2012 period could be crudely characterised as an overall downward share price path with periods of volatile sideways movement – a trend that has broadly continued to the current day (perhaps more downward than sideways!). If this trend ceases in the future – e.g. if the share price were to recover strongly – it is unlikely the ensemble would do well, as it has not experienced such a scenario during training. One test is to look back before the training period: 2009 to 2011 was characterised by a strong share price recovery following the 2008 crash, with the share price more than tripling in value. Running the simulation over the 2009–2011 period (figure 5) shows that the ensemble underperformed buy-and-hold at all voting percentages for most of the time.

Figure 5

Building an ensemble from networks trained over different periods may mitigate this effect – or may just dilute the information contained in the most recent data. This is something I hope to look at further at some point: developing a heuristic, or perhaps using a reinforcement learning algorithm, to adapt the voting weights of individual networks or of 'buckets' of networks.

In the next posts, I'll add some results for other companies from the FTSE.