Backtesting trading strategies

TL;DR

Backtesting is the process of evaluating trading strategies using a market simulator and historical data. Like any simulation, it has a degree of error, but it can serve as a necessary, though not sufficient, indicator of the strength of a strategy. Building on the results of the previous publications, this post covers the process of augmenting detailed trading data with accumulated aggression indicators and using the result to evaluate a simple trading strategy.

Getting data

The principle guiding the development of the trading strategy is that accumulated aggression is a strong indicator of price movement: it helps identify the key moments in which sudden surges happen and opening a position could be favorable. For this reason, the data used to evaluate the strategy includes the accumulated aggression calculated over a moving time window.

To start with, the historical data is collected from MT5 using the procedure below:

import datetime

import dateparser
import MetaTrader5 as mt5
import pandas as pd


def get_tick_data(
    ticker,
    start_date,
    end_date,
):
    tick_data = pd.DataFrame()

    if mt5.initialize():
        if start_date is None or end_date is None:
            date_to = datetime.datetime.now(datetime.timezone.utc)
            date_from = date_to - datetime.timedelta(seconds=3 * 60)
        else:
            date_to = dateparser.parse(end_date)
            date_from = dateparser.parse(start_date)

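        ticks = mt5.copy_ticks_range(ticker, date_from, date_to, mt5.COPY_TICKS_TRADE)

        # copy_ticks_range returns None on error, so only build the frame
        # when ticks were actually returned.
        if ticks is not None and len(ticks) > 0:
            tick_data = pd.DataFrame(ticks)
            tick_data["datetime"] = pd.to_datetime(tick_data["time_msc"], unit="ms")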

    else:
        print(f"MT5 error: \n{mt5.last_error()}")

    return tick_data

The first parameter of the procedure, ticker, is the instrument for which the data should be collected. Our example uses MESU23, the Micro E-mini S&P 500 Index contract active at the time of writing. The second and third parameters, start_date and end_date, delimit the time window over which the trading data of the ticker is collected. Both are expected in the following format: 2023-07-27 16:20:00 UTC+0000, representing date, time and time zone. If either of them is None, the procedure falls back to the last three minutes of trading.
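
To make the expected date format concrete, the sketch below parses such a string using only the standard library. The procedure above relies on dateparser; the use of datetime.strptime, the format string and the helper name parse_window_bound are assumptions made here for illustration only.

```python
import datetime

# Format string assumed from the example "2023-07-27 16:20:00 UTC+0000"
WINDOW_FORMAT = "%Y-%m-%d %H:%M:%S UTC%z"


def parse_window_bound(text: str) -> datetime.datetime:
    """Parse a window bound into a timezone-aware datetime (hypothetical helper)."""
    return datetime.datetime.strptime(text, WINDOW_FORMAT)


start = parse_window_bound("2023-07-27 16:20:00 UTC+0000")
end = parse_window_bound("2023-07-27 17:20:00 UTC+0000")

# Length of the collection window as a timedelta
window = end - start
```

Both bounds come back as timezone-aware values, which avoids ambiguity about whether the window is expressed in local or UTC time.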

If the chosen time window covers a low-volume trading period, in which orders are not executed every second, the procedure below fills the gaps with zero-volume rows and normalizes the data sample.

def cut_minute(grouped_df: pd.DataFrame, freq: int):
    result_df = grouped_df
    missing_seconds = list(set(range(60)).difference(set(grouped_df.second.unique())))

    if len(missing_seconds) == 0:
        complement_df = pd.DataFrame()
    else:
        complement_df = pd.DataFrame(
            {
                column_name: [grouped_df[column_name].unique()[0]]
                * len(missing_seconds)
                for column_name in result_df.columns
            }
        )

        complement_df["volume"] = 0
        complement_df["flags"] = TICK_BUY
        complement_df["second"] = missing_seconds
        complement_df["datetime"] = [
            datetime.datetime(
                result_df["year"].unique()[0],
                result_df["month"].unique()[0],
                result_df["day"].unique()[0],
                result_df["hour"].unique()[0],
                result_df["minute"].unique()[0],
                second,
            )
            for second in missing_seconds
        ]

    result_df = pd.concat([result_df, complement_df])
    result_df.sort_values(by="datetime", inplace=True)
    result_df["interval"] = pd.cut(
        result_df.second,
        pd.interval_range(0, 60, freq=freq, closed="left"),
        include_lowest=True,
    )

    return result_df
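
The two ideas behind the procedure above, finding the seconds without trades and assigning each second to a fixed-width interval, can be sketched without pandas. This is a simplified illustration, not the author's implementation: the helper names are hypothetical, and the bucket arithmetic assumes that freq divides 60 evenly, matching left-closed intervals such as [0, 2), [2, 4) and so on.

```python
def missing_seconds(observed):
    """Seconds of a minute (0-59) in which no trade was observed."""
    return sorted(set(range(60)) - set(observed))


def interval_start(second, freq):
    """Left edge of the left-closed bucket [k*freq, (k+1)*freq) containing `second`."""
    return (second // freq) * freq


# Seconds of one minute in which at least one trade was executed
observed = [0, 1, 4, 5, 59]

# Every gap would receive a zero-volume filler row in the real procedure
gaps = missing_seconds(observed)

# Map each second of the minute to the start of its 2-second interval
buckets = {second: interval_start(second, freq=2) for second in range(60)}
```

With freq=2 the minute is split into 30 intervals, so each later aggregation step always sees a complete set of rows.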

As the last step of the preprocessing, the procedure below aggregates the data into open, high, low and close values and computes the volume as accumulated aggression: sell volume is counted as negative, so the sum is the net of buy and sell aggression in the interval.

def ohlc_delta(grouped_df):
    open_price = 0.0
    close_price = 0.0
    max_price = 0.0
    min_price = 0.0

    # Consider only ticks that carry a valid (positive) last price
    last_prices = grouped_df.loc[grouped_df["last"] > 0, "last"]

    if len(last_prices) > 0:
        open_price = last_prices.iloc[0]
        close_price = last_prices.iloc[-1]
        max_price = last_prices.max()
        min_price = last_prices.min()

    volume_df = grouped_df.copy()

    volume_df.loc[volume_df["flags"] == TICK_SELL, "volume"] = (
        -1 * volume_df.loc[volume_df["flags"] == TICK_SELL, "volume"]
    )

    current_datetime = datetime.datetime(
        volume_df.year.unique()[0],
        volume_df.month.unique()[0],
        volume_df.day.unique()[0],
        volume_df.hour.unique()[0],
        volume_df.minute.unique()[0],
        volume_df.interval.unique()[0].left,
    )

    raw_data = {
        "datetime": current_datetime,
        "open": open_price,
        "high": max_price,
        "low": min_price,
        "close": close_price,
        "volume": volume_df.volume.sum(),
    }

    return pd.DataFrame(raw_data, index=[0])
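
As a small worked example of the aggregation, the sketch below computes the same quantities for three ticks in plain Python. The Tick type and the summarize function are hypothetical names introduced for illustration, and the is_sell field stands in for the flags comparison used in the procedure above.

```python
from typing import NamedTuple


class Tick(NamedTuple):
    price: float
    volume: int
    is_sell: bool  # True when the aggressor was a seller


def summarize(ticks):
    """Return OHLC prices and net (buy minus sell) volume for one interval."""
    prices = [t.price for t in ticks if t.price > 0]
    # Sell volume counts as negative, buy volume as positive
    delta = sum(-t.volume if t.is_sell else t.volume for t in ticks)

    if not prices:
        return {"open": 0.0, "high": 0.0, "low": 0.0, "close": 0.0, "volume": delta}

    return {
        "open": prices[0],
        "high": max(prices),
        "low": min(prices),
        "close": prices[-1],
        "volume": delta,
    }


bar = summarize(
    [
        Tick(4618.25, 3, False),  # buy aggression of 3
        Tick(4618.50, 5, True),   # sell aggression of 5
        Tick(4618.00, 4, False),  # buy aggression of 4
    ]
)
```

Here the accumulated aggression is 3 - 5 + 4 = 2, a mildly positive value that indicates buyers were slightly more aggressive in the interval.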

The resulting data created by the procedures above will have a format similar to Figure 1 below:

Figure 1: Raw MT5 data

The complete code used to preprocess the trading data can be found in this GitHub gist.

Backtesting module

Backtrader will be used as the backtesting platform, on which a simple trading strategy will be evaluated. Taking accumulated aggression as the main signal for placing an order, the strategy can be implemented as in the code below, which follows the documentation guidelines for the basic functionality.

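import sys

import backtrader as bt
import backtrader.analyzers as btanalyzers
import backtrader.feeds as btfeeds


class TestStrategy(bt.Strategy):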
    def __init__(self, volume_threshold=100, price_threshold=1.5):
        # Keep a reference to the "close" line in the data[0] dataseries
        self.dataclose = self.datas[0].close
        self.volume = self.datas[0].volume
        self.counter = 0

        self.order = None
        self.buyprice = None
        self.sellprice = None
        self.volume_threshold = volume_threshold
        self.price_threshold = price_threshold

    def log(self, txt, dt=None):
        """Logging function for this strategy"""
        dt = dt or self.datas[0].datetime.datetime(0)
        print(f'{dt.isoformat(" ", "seconds")}, {txt}')

    def notify_order(self, order):
        if order.status in [order.Submitted, order.Accepted]:
            # Buy/Sell order submitted/accepted to/by broker - Nothing to do
            return

        if order.status in [order.Completed]:
            if order.isbuy():
                self.log(
                    "BUY EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f"
                    % (order.executed.price, order.executed.value, order.executed.comm)
                )

                self.buyprice = order.executed.price
                self.sellprice = None
            else:  # Sell
                self.log(
                    "SELL EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f"
                    % (order.executed.price, order.executed.value, order.executed.comm)
                )

                self.sellprice = order.executed.price
                self.buyprice = None

        elif order.status in [order.Canceled, order.Margin, order.Rejected]:
            self.log("Order Canceled/Margin/Rejected")

        self.order = None

    def notify_trade(self, trade):
        if not trade.isclosed:
            return

        self.log(f"Operation profit, gross: {trade.pnl,}\n")

    def next(self):
        if self.order:
            return

        # If there is no open position
        if not self.position:
            # If accumulated aggression of the last period is greater
            # than the threshold, send a 'buy' order.
            if self.volume >= self.volume_threshold:
                self.order = self.buy()

            # If accumulated aggression of the last period is below
            # the negative threshold, send a 'sell' order.
            elif self.volume < -1 * self.volume_threshold:
                self.order = self.sell()

        # If there is already an open position
        else:
            if self.buyprice is not None and self.sellprice is None:
                # Long position
                target_gain = self.dataclose - self.buyprice > self.price_threshold
                max_loss = self.buyprice - self.dataclose > self.price_threshold
                if target_gain or max_loss:
                    self.order = self.sell()

            elif self.sellprice is not None and self.buyprice is None:
                # Short position
                target_gain = self.sellprice - self.dataclose > self.price_threshold
                max_loss = self.dataclose - self.sellprice > self.price_threshold
                if target_gain or max_loss:
                    self.order = self.buy()

The volume_threshold parameter sets the accumulated aggression, measured over a 2-second window, that triggers an order placement: if the value is at or above the threshold, a buy order is created; if it is below -1 times the threshold, a sell order is created.

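Once a position is open, the strategy monitors the closing price and closes the position as soon as the price has moved more than $1.5 (the default price_threshold) away from the entry price in either direction, which corresponds to the target_gain or the max_loss condition.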
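
The entry and exit rules can also be written as two pure functions, which makes them easy to check without running backtrader. This is a sketch that mirrors the comparisons in the strategy above; the function names entry_signal and should_close are hypothetical.

```python
from typing import Optional


def entry_signal(aggression: float, volume_threshold: float = 100) -> Optional[str]:
    """Return "buy", "sell" or None for a given accumulated aggression."""
    if aggression >= volume_threshold:
        return "buy"
    if aggression < -1 * volume_threshold:
        return "sell"
    return None


def should_close(
    entry_price: float,
    close: float,
    is_long: bool,
    price_threshold: float = 1.5,
) -> bool:
    """True when the gain or the loss exceeds the price threshold."""
    # Positive move means profit, negative move means loss
    move = close - entry_price if is_long else entry_price - close
    return abs(move) > price_threshold


long_exit = should_close(4618.25, 4620.00, is_long=True)
short_exit = should_close(4609.75, 4608.00, is_long=False)
```

Note that the comparisons are not symmetric: an aggression of exactly volume_threshold triggers a buy, while exactly -1 times the threshold does not trigger a sell. A price move of exactly 1.5 does not close the position either, because the comparison is strict.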

The execution of the strategy is done using the main code below:

def main():
    cerebro = bt.Cerebro()
    cerebro.addstrategy(TestStrategy)
    cerebro.addanalyzer(btanalyzers.TradeAnalyzer, _name="tradeanalyzer")
    cerebro.broker.setcash(10000.0)

    data = btfeeds.GenericCSVData(
        dataname=sys.argv[1],
        nullvalue=0.0,
        dtformat=("%Y-%m-%d %H:%M:%S"),
        timeframe=bt.TimeFrame.Seconds,
        datetime=1,
        open=2,
        high=3,
        low=4,
        close=5,
        volume=6,
    )

    cerebro.adddata(data)

    print("Starting Portfolio Value: %.2f" % cerebro.broker.getvalue())
    thestrat = cerebro.run()[0]
    t_a = thestrat.analyzers.tradeanalyzer.get_analysis()
    print("Final Portfolio Value: %.2f" % cerebro.broker.getvalue())

    print(f"Total trades: {t_a.total.total}")
    print(f"Winning trades: {t_a.won.total} ({100*t_a.won.total/t_a.total.total:.2f}%)")
    print(
        f"Losing trades: {t_a.lost.total} ({100*t_a.lost.total/t_a.total.total:.2f}%)"
    )


if __name__ == "__main__":
    main()

Running the code prints a log message for each trade, followed by a final report. Using it with the preprocessed data created earlier should generate an output similar to the one below:

$ python algotrading-05.py MESU23_tick_data.csv
Starting Portfolio Value: 10000.00
2023-07-27 16:22:02, BUY EXECUTED, Price: 4618.25, Cost: 4618.25, Comm 0.00
2023-07-27 16:24:36, SELL EXECUTED, Price: 4620.00, Cost: 4618.25, Comm 0.00
2023-07-27 16:24:36, Operation profit, gross: (1.75,)
(...)
2023-07-27 17:19:40, SELL EXECUTED, Price: 4609.75, Cost: -4609.75, Comm 0.00
2023-07-27 17:19:58, BUY EXECUTED, Price: 4608.00, Cost: -4609.75, Comm 0.00
2023-07-27 17:19:58, Operation profit, gross: (1.75,)

Final Portfolio Value: 10015.50
Total trades: 26
Winning trades: 17 (65.38%)
Losing trades: 9 (34.62%)
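
The percentages in the report follow directly from the trade counts. The sketch below reproduces them and guards against a run with no closed trades, a case that the division in the main code does not cover. The helper name win_rate is hypothetical.

```python
def win_rate(count: int, total: int) -> float:
    """Share of trades, in percent; 0.0 when no trade was closed."""
    return 100 * count / total if total else 0.0


# Values taken from the report above
total_trades, won, lost = 26, 17, 9

won_pct = round(win_rate(won, total_trades), 2)
lost_pct = round(win_rate(lost, total_trades), 2)

# Net result of the run: final minus starting portfolio value
net_result = 10015.50 - 10000.00
```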

Conclusion

Using a backtesting platform such as backtrader is a good practice that should be followed as often as possible when developing new strategies. It is a necessary but not sufficient criterion before venturing to trade real money: a strategy that fails in backtesting would almost certainly fail in a real context, while one that succeeds might still fail, although it has a better chance.

Disclaimer

Futures and options trading has large potential rewards, but also large potential risk. You must be aware of the risks and be willing to accept them in order to invest in the futures and options markets. Do not trade with money you cannot afford to lose. This post is neither a solicitation nor an offer to buy or sell futures or options. No representation is being made that any account will or is likely to achieve profits or losses similar to those discussed. The past performance of any trading system or methodology is not necessarily indicative of future results.

This post is provided for informational and educational purposes only. This material neither is, nor should be construed as, an offer, solicitation, or recommendation to buy or sell any securities. Any investment decisions made by the reader through the use of this content are based solely on the reader's independent analysis, taking into consideration their financial circumstances and risk tolerance. The author shall not be liable for any errors or for any actions taken in reliance thereon.

The references made to MT5 and AMP Global in this post do not represent a recommendation. I am not affiliated with either platform or service, nor with their parent companies.