Backtesting trading strategies
TL;DR
Backtesting is the process of evaluating trading strategies using a market simulator and historical data. Like any simulation, it has a degree of error, but it can serve as a necessary, though not sufficient, indicator of a strategy's strength. Building on the results of the previous publications, this post augments detailed trading data with accumulated aggression indicators and uses it to evaluate a simple trading strategy.
Getting data
The principle guiding the development of the trading strategy is that accumulated aggression is a strong indicator of price movement, making it possible to identify key moments in which sudden surges happen and opening a position could be favorable. As such, the data used to evaluate the trading strategy includes the accumulated aggression calculated over a moving time window.
To start with, the historical data is collected from MT5 using the procedure below:
import datetime

import dateparser
import MetaTrader5 as mt5
import pandas as pd


def get_tick_data(
    ticker,
    start_date,
    end_date,
):
    tick_data = pd.DataFrame()

    if mt5.initialize():
        if start_date is None or end_date is None:
            # No window given: default to the last 3 minutes
            date_to = datetime.datetime.now(datetime.timezone.utc)
            date_from = date_to - datetime.timedelta(seconds=3 * 60)
        else:
            date_to = dateparser.parse(end_date)
            date_from = dateparser.parse(start_date)

        # Request only ticks that correspond to executed trades
        ticks = mt5.copy_ticks_range(ticker, date_from, date_to, mt5.COPY_TICKS_TRADE)

        tick_data = pd.DataFrame(ticks)

        tick_data["datetime"] = pd.to_datetime(tick_data["time_msc"], unit="ms")
    else:
        print(f"MT5 error: \n{mt5.last_error()}")
    return tick_data
The first parameter of the procedure, ticker, represents the instrument for which the data should be collected. Our example will use MESU23, the Micro E-mini S&P 500 Index contract active at the time of writing. The second and third parameters, start_date and end_date, correspond to the start and end of the time window for which trading data of the ticker should be collected. The expected format is as follows: 2023-07-27 16:20:00 UTC+0000, representing date, time and time zone.
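To make the expected date format concrete, here is a minimal sketch that parses such a string using only the standard library. It is an illustration, not the method used above (which relies on dateparser); the DATE_FORMAT constant, the parse_window_bound helper and the end time of the window are assumptions made for this example.

```python
import datetime

# Assumed format string matching the example "2023-07-27 16:20:00 UTC+0000".
DATE_FORMAT = "%Y-%m-%d %H:%M:%S UTC%z"


def parse_window_bound(text: str) -> datetime.datetime:
    """Parse one window boundary into a timezone-aware datetime (hypothetical helper)."""
    return datetime.datetime.strptime(text, DATE_FORMAT)


start = parse_window_bound("2023-07-27 16:20:00 UTC+0000")
end = parse_window_bound("2023-07-27 17:20:00 UTC+0000")  # illustrative end time

print(start.isoformat())  # 2023-07-27T16:20:00+00:00
print(end - start)        # 1:00:00
```

Both boundaries carry an explicit UTC offset, so the difference between them is unambiguous regardless of the local time zone of the machine collecting the data.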
If the chosen time window covers a low-volume period of trading, in which orders are not executed every second, the procedure below fills the gaps and normalizes the data sample.
def cut_minute(grouped_df: pd.DataFrame, freq: int):
    result_df = grouped_df
    missing_seconds = list(set(range(60)).difference(set(grouped_df.second.unique())))

    if len(missing_seconds) == 0:
        complement_df = pd.DataFrame()
    else:
        complement_df = pd.DataFrame(
            {
                column_name: [grouped_df[column_name].unique()[0]] * len(missing_seconds)
                for column_name in result_df.columns
            }
        )
        complement_df["volume"] = 0
        complement_df["flags"] = TICK_BUY
        complement_df["second"] = missing_seconds
        complement_df["datetime"] = [
            datetime.datetime(
                result_df["year"].unique()[0],
                result_df["month"].unique()[0],
                result_df["day"].unique()[0],
                result_df["hour"].unique()[0],
                result_df["minute"].unique()[0],
                second,
            )
            for second in missing_seconds
        ]
    result_df = pd.concat([result_df, complement_df])
    result_df.sort_values(by="datetime", inplace=True)
    result_df["interval"] = pd.cut(
        result_df.second,
        pd.interval_range(0, 60, freq=freq, closed="left"),
        include_lowest=True,
    )
    return result_df
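The gap-filling idea can be illustrated without pandas. The sketch below is a simplified stand-in, not the procedure above: the fill_minute helper and the sample volumes are hypothetical, and each second is mapped to the left edge of its left-closed interval, which is what pd.interval_range(0, 60, freq=freq, closed="left") produces for integer seconds.

```python
def fill_minute(volume_by_second: dict[int, int], freq: int) -> list[tuple[int, int, int]]:
    """Return (second, volume, interval_left_edge) for all 60 seconds of a minute.

    Seconds without any trade receive a volume of 0, mirroring the
    zero-volume rows added for the missing seconds.
    """
    filled = dict(volume_by_second)
    for second in set(range(60)) - set(volume_by_second):
        filled[second] = 0
    # Left-closed buckets: with freq=2, seconds 0 and 1 fall into [0, 2).
    return [(s, filled[s], (s // freq) * freq) for s in sorted(filled)]


# Hypothetical minute in which trades happened only at seconds 0, 1 and 4.
rows = fill_minute({0: 3, 1: 5, 4: 2}, freq=2)

print(len(rows))  # 60
print(rows[:5])   # [(0, 3, 0), (1, 5, 0), (2, 0, 2), (3, 0, 2), (4, 2, 4)]
```

Filling the gaps with zero volume keeps every interval the same length, so the aggregated bars are evenly spaced even when the market is quiet.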
As a last step of the preprocessing, the data is formatted as open, high, low and close values, along with the calculation of volume as accumulated aggression in the procedure below.
def ohlc_delta(grouped_df):
    open_price = 0.0
    close_price = 0.0
    max_price = 0.0
    min_price = 0.0

    if len(grouped_df.loc[grouped_df["last"] > 0]) > 0:
        open_price = grouped_df.loc[grouped_df["last"] > 0]["last"].iloc[0]
        close_price = grouped_df.loc[grouped_df["last"] > 0]["last"].iloc[-1]
        max_price = grouped_df.loc[grouped_df["last"] > 0]["last"].max()
        min_price = grouped_df.loc[grouped_df["last"] > 0]["last"].min()

    volume_df = grouped_df.copy()

    volume_df.loc[volume_df["flags"] == TICK_SELL, "volume"] = (
        -1 * volume_df.loc[volume_df["flags"] == TICK_SELL, "volume"]
    )

    current_datetime = datetime.datetime(
        volume_df.year.unique()[0],
        volume_df.month.unique()[0],
        volume_df.day.unique()[0],
        volume_df.hour.unique()[0],
        volume_df.minute.unique()[0],
        volume_df.interval.unique()[0].left,
    )

    raw_data = {
        "datetime": current_datetime,
        "open": open_price,
        "high": max_price,
        "low": min_price,
        "close": close_price,
        "volume": volume_df.volume.sum(),
    }
    return pd.DataFrame(raw_data, index=[0])
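As a worked illustration of the aggregation, the sketch below computes the same quantities for a handful of ticks in plain Python. It is a simplified stand-in for the pandas procedure: the BUY and SELL labels, the ohlc_delta_sketch helper and the sample ticks are assumptions made for this example, not values taken from MT5.

```python
BUY, SELL = "buy", "sell"  # placeholder labels standing in for the tick flags


def ohlc_delta_sketch(ticks: list[tuple[float, int, str]]) -> dict:
    """Aggregate time-ordered (price, volume, side) ticks into one bar.

    The bar's volume is the accumulated aggression: buy volume minus
    sell volume within the interval.
    """
    prices = [price for price, _, _ in ticks if price > 0]
    delta = sum(volume if side == BUY else -volume for _, volume, side in ticks)
    if not prices:
        # No traded price in the interval: prices default to 0.0.
        return {"open": 0.0, "high": 0.0, "low": 0.0, "close": 0.0, "volume": delta}
    return {
        "open": prices[0],
        "high": max(prices),
        "low": min(prices),
        "close": prices[-1],
        "volume": delta,
    }


# Hypothetical ticks inside a single interval.
bar = ohlc_delta_sketch(
    [(4618.25, 3, BUY), (4618.50, 1, SELL), (4618.00, 4, SELL), (4618.75, 6, BUY)]
)
print(bar)
# {'open': 4618.25, 'high': 4618.75, 'low': 4618.0, 'close': 4618.75, 'volume': 4}
```

Here the buyers traded 9 contracts and the sellers 5, so the accumulated aggression for the interval is +4.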
The resulting data created by the procedures above will have a format similar to Figure 1 below:
The complete code used to preprocess the trading data can be found in this GitHub gist.
Backtesting module
Backtrader will be used as the backtesting platform on which a simple trading strategy is evaluated. Taking accumulated aggression as the main signal for placing an order, the strategy can be implemented as in the code below, following the documentation guidelines for the basic functionality.
class TestStrategy(bt.Strategy):
    def __init__(self, volume_threshold=100, price_threshold=1.5):
        # Keep a reference to the "close" line in the data[0] dataseries
        self.dataclose = self.datas[0].close
        self.volume = self.datas[0].volume
        self.counter = 0
        self.order = None
        self.buyprice = None
        self.sellprice = None
        self.volume_threshold = volume_threshold
        self.price_threshold = price_threshold

    def log(self, txt, dt=None):
        """Logging function for this strategy"""
        dt = dt or self.datas[0].datetime.datetime(0)
        print(f'{dt.isoformat(" ", "seconds")}, {txt}')

    def notify_order(self, order):
        if order.status in [order.Submitted, order.Accepted]:
            # Buy/Sell order submitted/accepted to/by broker - Nothing to do
            return
        if order.status in [order.Completed]:
            if order.isbuy():
                self.log(
                    "BUY EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f"
                    % (order.executed.price, order.executed.value, order.executed.comm)
                )
                self.buyprice = order.executed.price
                self.sellprice = None
            else:  # Sell
                self.log(
                    "SELL EXECUTED, Price: %.2f, Cost: %.2f, Comm %.2f"
                    % (order.executed.price, order.executed.value, order.executed.comm)
                )
                self.sellprice = order.executed.price
                self.buyprice = None
        elif order.status in [order.Canceled, order.Margin, order.Rejected]:
            self.log("Order Canceled/Margin/Rejected")
        # No pending order anymore
        self.order = None

    def notify_trade(self, trade):
        if not trade.isclosed:
            return
        self.log(f"Operation profit, gross: {trade.pnl,}\n")

    def next(self):
        # Wait while an order is still pending
        if self.order:
            return
        # If there is no open position
        if not self.position:
            # If accumulated aggression of the last period is greater
            # than or equal to the threshold, send a 'buy' order.
            if self.volume >= self.volume_threshold:
                self.order = self.buy()
            # If accumulated aggression of the last period is smaller
            # than the negative of the threshold, send a 'sell' order.
            elif self.volume < -1 * self.volume_threshold:
                self.order = self.sell()
        # If there is already an open position
        else:
            if self.buyprice is not None and self.sellprice is None:
                # Long position
                target_gain = self.dataclose - self.buyprice > self.price_threshold
                max_loss = self.buyprice - self.dataclose > self.price_threshold
                if target_gain or max_loss:
                    self.order = self.sell()
            elif self.sellprice is not None and self.buyprice is None:
                # Short position
                target_gain = self.sellprice - self.dataclose > self.price_threshold
                max_loss = self.dataclose - self.sellprice > self.price_threshold
                if target_gain or max_loss:
                    self.order = self.buy()
The volume_threshold indicates the value of accumulated aggression over a 2-second window that triggers an order placement: if the aggression is at or above that value, a buy order is created; if it is below -1 times that value, a sell order is created.
Once the position is taken, the strategy monitors the price and identifies either a target_gain or a max_loss of 1.5 (the default price_threshold, measured in price units). Reaching either one is the condition to close the position.
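The entry and exit rules can be isolated from backtrader and checked on their own. The sketch below restates them as two pure functions under the same default thresholds; entry_signal and should_close are hypothetical names introduced for this illustration and are not part of the strategy class or of backtrader.

```python
from typing import Optional


def entry_signal(aggression: float, volume_threshold: float = 100) -> Optional[str]:
    """Return 'buy', 'sell' or None for the accumulated aggression of one window."""
    if aggression >= volume_threshold:
        return "buy"
    if aggression < -volume_threshold:
        return "sell"
    return None


def should_close(
    entry_price: float, close_price: float, is_long: bool, price_threshold: float = 1.5
) -> bool:
    """True once the price has moved more than the threshold, for or against the position."""
    move = close_price - entry_price if is_long else entry_price - close_price
    # The gain target and the loss limit are symmetric around the entry price.
    return abs(move) > price_threshold


print(entry_signal(120))    # buy
print(entry_signal(-100))   # None (the sell comparison is strict)
print(entry_signal(-101))   # sell
print(should_close(4618.25, 4620.00, is_long=True))   # True, moved 1.75
print(should_close(4618.25, 4619.75, is_long=True))   # False, moved exactly 1.50
```

Note the small asymmetry inherited from the comparisons in the strategy: an aggression exactly equal to the threshold triggers a buy, while an aggression exactly equal to its negative does not trigger a sell.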
The execution of the strategy is done using the main code below:
import sys

import backtrader as bt
import backtrader.analyzers as btanalyzers
import backtrader.feeds as btfeeds


def main():
    cerebro = bt.Cerebro()

    cerebro.addstrategy(TestStrategy)
    cerebro.addanalyzer(btanalyzers.TradeAnalyzer, _name="tradeanalyzer")
    cerebro.broker.setcash(10000.0)

    # Column indexes of each field in the preprocessed CSV file
    data = btfeeds.GenericCSVData(
        dataname=sys.argv[1],
        nullvalue=0.0,
        dtformat=("%Y-%m-%d %H:%M:%S"),
        timeframe=bt.TimeFrame.Seconds,
        datetime=1,
        open=2,
        high=3,
        low=4,
        close=5,
        volume=6,
    )

    cerebro.adddata(data)
    print("Starting Portfolio Value: %.2f" % cerebro.broker.getvalue())
    thestrat = cerebro.run()[0]
    t_a = thestrat.analyzers.tradeanalyzer.get_analysis()
    print("Final Portfolio Value: %.2f" % cerebro.broker.getvalue())
    print(f"Total trades: {t_a.total.total}")
    print(f"Winning trades: {t_a.won.total} ({100*t_a.won.total/t_a.total.total:.2f}%)")
    print(
        f"Losing trades: {t_a.lost.total} ({100*t_a.lost.total/t_a.total.total:.2f}%)"
    )


if __name__ == "__main__":
    main()
The execution prints a log message for each trade and a final report. Running it with the preprocessed data created earlier should generate an output similar to the one below:
$ python algotrading-05.py MESU23_tick_data.csv
Starting Portfolio Value: 10000.00
2023-07-27 16:22:02, BUY EXECUTED, Price: 4618.25, Cost: 4618.25, Comm 0.00
2023-07-27 16:24:36, SELL EXECUTED, Price: 4620.00, Cost: 4618.25, Comm 0.00
2023-07-27 16:24:36, Operation profit, gross: (1.75,)
(...)
2023-07-27 17:19:40, SELL EXECUTED, Price: 4609.75, Cost: -4609.75, Comm 0.00
2023-07-27 17:19:58, BUY EXECUTED, Price: 4608.00, Cost: -4609.75, Comm 0.00
2023-07-27 17:19:58, Operation profit, gross: (1.75,)
Final Portfolio Value: 10015.50
Total trades: 26
Winning trades: 17 (65.38%)
Losing trades: 9 (34.62%)
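The figures in the final report can be cross-checked with a few lines of arithmetic. The sketch below only re-derives the percentages and the net result from the numbers printed above; the variable names are chosen for this illustration.

```python
# Values copied from the report above.
starting_cash = 10000.00
final_value = 10015.50
total_trades, winning_trades, losing_trades = 26, 17, 9

win_rate = 100 * winning_trades / total_trades
loss_rate = 100 * losing_trades / total_trades
net_result = final_value - starting_cash
average_per_trade = net_result / total_trades

print(f"Win rate: {win_rate:.2f}%")                    # Win rate: 65.38%
print(f"Loss rate: {loss_rate:.2f}%")                  # Loss rate: 34.62%
print(f"Net result: {net_result:.2f}")                 # Net result: 15.50
print(f"Average per trade: {average_per_trade:.2f}")   # Average per trade: 0.60
```

The net result is expressed in the same units as the portfolio value reported by the broker simulation, with the commission shown as 0.00 in the log.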
Conclusion
Using a backtesting platform such as backtrader is a good practice that should be followed as often as possible when developing new strategies. It is a necessary but not sufficient criterion before venturing to trade real money: a strategy that fails in backtesting would almost certainly fail in a real context, while one that succeeds might still fail, but has a better chance of succeeding.
Disclaimer
Futures and options trading has large potential rewards, but also large potential risk. You must be aware of the risks and be willing to accept them in order to invest in the futures and options markets. Do not trade with money you cannot afford to lose. This post is neither a solicitation nor an offer to buy or sell futures or options. No representation is being made that any account will or is likely to achieve profits or losses similar to those discussed. The past performance of any trading system or methodology is not necessarily indicative of future results.
This post is provided for informational and educational purposes only. This material neither is, nor should be construed as, an offer, solicitation, or recommendation to buy or sell any securities. Any investment decisions made by the reader through the use of this content are based solely on the reader's independent analysis, taking into consideration their financial circumstances and risk tolerance. The author shall not be liable for any errors or for any actions taken in reliance thereon.
The references made to MT5 and AMP Global in this post do not represent a recommendation. I am not affiliated with either platform or service, or with their parent companies.