
1 overview

Only care about real money performance?

What is RL Trading?

  • RL Trading is a side project of mine. It started when my idol Elon Musk began advertising Dogecoin.
  • As a physicist and programmer, I had simulations and statistical analysis in mind from the first day I stepped into the realm of investing.
  • At the beginning, I wanted to see whether reinforcement learning is good for trading. Unfortunately, it performed much worse than the algorithms I designed myself. This reminds me of my work on Quantum Optimal Control: classical methods, whose workings we understand exactly, perform much better than RL. See also "Deep Reinforcement Learning Doesn't Work Yet".
  • However, I would still like to keep the name RL Trader. You may read it as Riemann-Lagrange Trader, a tribute to my love for physics.


  • BTC trend for the backtesting period (I chose this period on purpose, because anybody can make money during a big bull market):
    • 2021-04-30 to 2022-06-12, 5-minute resolution
    • I trade 327 different cryptocurrencies on Binance Spot (trading fee 0.1% + 0.1%) without filtering out the "shitty" coins. This is to see how robust the algorithm is against "shitty" coins. Of course I can achieve much better results if I just filter out the bad coins, but no one can guarantee "good" coins like Bitcoin won't one day collapse like LUNA.
    • I also tried shorting on Binance Futures (137 "good" pairs, trading fee 0.02% + 0.04%) with leverage, but the results suggest Spot is more profitable and less risky.
    • The above translates to 32,467,184 steps for each simulation
  • Backtesting result:
    • 0.01 means 1%
    • 5 slots, no leverage. For example, if I buy BTC in one slot and the price goes up 20%, I make 20% / 5 = 4% on the total capital
  • Relationship between cumsum and cumprod:
\begin{align*}
& \text{let } p \text{ be the average profit, which is small} \\
& \text{then cumsum } S_n \approx \sum_{k=1}^n p = n p \\
& \text{and cumprod } P_n \approx \prod_{k=1}^n (1+p) = (1+p)^n = \left(1+ {1\over 1/p}\right)^{(1/p)(np)} \approx e^{np} \approx e^{S_n} \\
& \text{using } \lim_{x\to \infty} \left(1+{1\over x}\right)^x = e
\end{align*}
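The approximation can be checked numerically. The sketch below is only an illustration: it assumes a constant per-trade profit `p = 0.001` over `n = 1000` trades (both values are made up, not taken from the backtest) and compares the cumprod with the exponential of the cumsum.

```python
import numpy as np

# Hypothetical values for illustration only: constant profit of 0.1% per trade.
p = 0.001
n = 1000
profits = np.full(n, p)

cumsum = np.cumsum(profits)          # S_n, ends at n * p
cumprod = np.cumprod(1.0 + profits)  # P_n, ends at (1 + p) ** n

# For small p the two should agree closely: P_n is approximately exp(S_n).
relative_gap = abs(cumprod[-1] / np.exp(cumsum[-1]) - 1.0)
print(f"S_n = {cumsum[-1]:.4f}")
print(f"P_n = {cumprod[-1]:.4f}, exp(S_n) = {np.exp(cumsum[-1]):.4f}")
print(f"relative gap = {relative_gap:.2e}")
```

The gap grows with the size of the individual profits, since each factor satisfies 1 + p ≤ e^p, so the cumprod never exceeds exp(cumsum).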
  • drawdown_cumsum and drawdown_cumprod are defined as:
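import numpy as np

def drawdown_cumsum(cumsum):
    # Additive drawdown: distance of the cumulative sum below its running maximum.
    max_cumsum = -1
    drawdown = np.zeros_like(cumsum, dtype=float)
    for i, s in enumerate(cumsum):
        max_cumsum = max(max_cumsum, s)
        drawdown[i] = s - max_cumsum
    return drawdown

def drawdown_cumprod(cumprod):
    # Relative drawdown: fractional loss from the running maximum of the cumulative product.
    max_cumprod = 0
    drawdown = np.zeros_like(cumprod, dtype=float)
    for i, p in enumerate(cumprod):
        max_cumprod = max(max_cumprod, p)
        drawdown[i] = p / max_cumprod - 1
    return drawdown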
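
With more than 32 million steps per simulation, a Python loop can be slow. The following is a sketch of a vectorized equivalent built on `np.maximum.accumulate`. The `_vec` function names are hypothetical, and the cumprod version assumes all values are strictly positive.

```python
import numpy as np

def drawdown_cumsum_vec(cumsum):
    # Running maximum, floored at -1 to mirror the loop version's initial value.
    cumsum = np.asarray(cumsum, dtype=float)
    running_max = np.maximum.accumulate(np.maximum(cumsum, -1.0))
    return cumsum - running_max

def drawdown_cumprod_vec(cumprod):
    # Assumes cumprod > 0 everywhere, so the running maximum is never zero.
    cumprod = np.asarray(cumprod, dtype=float)
    running_max = np.maximum.accumulate(cumprod)
    return cumprod / running_max - 1.0

# Small hand-checkable example (made-up numbers).
dd_sum = drawdown_cumsum_vec([0.0, 0.1, 0.05, 0.2, 0.1])
dd_prod = drawdown_cumprod_vec([1.0, 1.1, 0.99, 1.21, 1.089])
print(dd_sum)
print(dd_prod)
```

In both examples the series dips after a new high, so the drawdown is zero at each peak and negative in between.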

Trading philosophy

Sanity check

  • As a sanity check, I compare RL Trader with a Random Trader:
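import numpy as np
# Assumption: Integer and Real are search-space dimensions such as those in scikit-optimize.
from skopt.space import Integer, Real

class RandomTrader:
    # Each entry holds a default value, optionally followed by a search range.
    x = {
        'slots': [5, Integer(3, 10)],
        'take_profit': [0.01, Real(0.01, 0.1)],
        'stop_loss': [-0.2, Real(-0.5, -0.1)],
        'timeout': [1440],
    }

    def make_signals(s, df):
        # One uniform draw per bar: buy on the top 1%, sell on the bottom 1%.
        r = np.random.rand(len(df['close']))
        df['buy'] = r > 0.99
        df['sell'] = r < 0.01
        return df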
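
To get a feel for how often the Random Trader acts, the sketch below reproduces the signal logic as a standalone function and measures the signal rates. The function name `make_random_signals`, the seeded generator, and the flat price series are assumptions made for illustration; they are not part of the backtester.

```python
import numpy as np
import pandas as pd

def make_random_signals(df, rng, p_signal=0.01):
    # One uniform draw per bar, shared by both signals, as in make_signals.
    r = rng.random(len(df["close"]))
    out = df.copy()
    out["buy"] = r > 1.0 - p_signal
    out["sell"] = r < p_signal
    return out

rng = np.random.default_rng(42)
# Hypothetical flat price series: 100 days of 5-minute bars (288 bars per day).
df = pd.DataFrame({"close": np.full(100 * 288, 100.0)})
signals = make_random_signals(df, rng)

buy_rate = signals["buy"].mean()
sell_rate = signals["sell"].mean()
overlap = int((signals["buy"] & signals["sell"]).sum())
print(f"buy rate = {buy_rate:.4f}, sell rate = {sell_rate:.4f}, overlap = {overlap}")
```

Because both signals come from the same draw, a bar can never be both a buy and a sell. At 5-minute resolution a 1% rate corresponds to roughly 0.01 × 288 ≈ 2.9 buy signals per coin per day.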