My Stock Trading Journey: December 2025

Tuesday, December 23, 2025

A Rule-Based Multi-Indicator Trading Strategy Built for Machine Learning

This post presents a rule-based forex trading strategy using Stochastic Oscillator, RSI, MACD, and EMA-200.
On its own, the strategy delivers a win rate of around 50% (sometimes slightly less). That is intentional.

The real objective is not immediate profitability, but to collect large volumes of structured, labeled trading data that can be used to train machine learning (ML) and AI models capable of identifying higher-probability winning trades.

1. Strategy Philosophy: Rules as a Data Generator

Most trading systems are judged solely by win rate.
This one is judged by data quality.

The strategy is designed to:

Systematically trigger trades under clear, repeatable conditions
Capture both winning and losing outcomes
Cover as many market scenarios as possible
Produce balanced, unbiased datasets

A ~50% win rate is acceptable and even desirable at this stage, because:

It avoids skewed labels
It forces the ML model to learn real distinctions
It reduces overfitting risk

In ML-driven trading, coverage and consistency matter more than raw performance.

2. Why a 50% Strategy Is Valuable for ML

A rule-based strategy that wins half the time creates:

Clean decision boundaries
Equal exposure to success and failure
Honest representations of market behavior

This allows an ML/AI model to learn:

When does this setup work — and when should it be ignored?

With sufficient data and proper training, the model can learn to filter out low-quality trades and identify conditions with a higher probability of success.

3. Indicators Used

The strategy combines four core indicators, each modeling a different market dimension:

Stochastic Oscillator – entry timing
RSI – momentum bias
MACD – trend confirmation
EMA-200 – higher-timeframe trend filter

Stochastic, RSI, and MACD are each encoded as a discrete state (x, y, z), and the EMA-200 filter is folded into the Stochastic state. This makes the system deterministic, explainable, and ML-friendly.
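As a minimal sketch, the EMA-200 filter can be computed with pandas in the same way the MACD EMAs are. The function name calculate_ema and the column name EMA_200 are my own illustrative choices, and the sample prices are placeholders.

```python
import pandas as pd


def calculate_ema(data: pd.DataFrame, period: int = 200) -> pd.Series:
    """Exponential moving average of the close (recursive form, adjust=False)."""
    return data["close"].ewm(span=period, adjust=False).mean()


# Placeholder prices, only to show the call; real input would be 5-minute bars.
prices = pd.DataFrame({"close": [1.10, 1.10, 1.10, 1.10]})
prices["EMA_200"] = calculate_ema(prices)
```

With adjust=False the first EMA value equals the first close, so the early rows are not yet a true 200-bar average and are best treated as warm-up.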

4. Indicator Rules and State Encoding

A. Stochastic Oscillator (x)

SELL (x = 1):

%K > 80
%D > 80
%K crosses below %D
EMA-200 > Close Price

BUY (x = 2):

%K < 20
%D < 20
%K crosses above %D
EMA-200 < Close Price

Stochastic provides timing, not trend prediction.

B. RSI (y)

SELL bias (y = 1): RSI < 50
BUY bias (y = 2): RSI > 50

RSI defines momentum alignment.

C. MACD (z)

SELL (z = 1): MACD line crosses below Signal line
BUY (z = 2): MACD line crosses above Signal line

MACD confirms momentum transition.
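The rules above could be encoded roughly as follows. This is a sketch under assumptions, not the exact production code: the column names (Stoch_K, Stoch_D, RSI, MACD_Line, Signal_Line, EMA_200), the use of 0 for "no signal", and the choice to test the 80/20 thresholds on the crossing bar itself are all mine.

```python
import numpy as np
import pandas as pd


def crosses_above(a: pd.Series, b: pd.Series) -> pd.Series:
    """True on bars where a moves from at-or-below b to strictly above b."""
    return (a.shift(1) <= b.shift(1)) & (a > b)


def encode_states(df: pd.DataFrame) -> pd.DataFrame:
    """Encode x (Stochastic), y (RSI), z (MACD): 1 = sell, 2 = buy, 0 = no signal."""
    k, d = df["Stoch_K"], df["Stoch_D"]
    macd, signal = df["MACD_Line"], df["Signal_Line"]
    ema, close = df["EMA_200"], df["close"]

    # %K crossing below %D is the same event as %D crossing above %K.
    x_sell = (k > 80) & (d > 80) & crosses_above(d, k) & (ema > close)
    x_buy = (k < 20) & (d < 20) & crosses_above(k, d) & (ema < close)

    out = pd.DataFrame(index=df.index)
    out["x"] = np.select([x_sell, x_buy], [1, 2], default=0)
    out["y"] = np.select([df["RSI"] < 50, df["RSI"] > 50], [1, 2], default=0)
    out["z"] = np.select(
        [crosses_above(signal, macd), crosses_above(macd, signal)], [1, 2], default=0
    )
    return out


# Two hand-made bars: on the second bar all three SELL conditions fire.
bars = pd.DataFrame({
    "close": [1.10, 1.10], "EMA_200": [1.20, 1.20],
    "Stoch_K": [85.0, 82.0], "Stoch_D": [83.0, 83.0],
    "RSI": [55.0, 45.0],
    "MACD_Line": [0.10, 0.02], "Signal_Line": [0.05, 0.04],
})
states = encode_states(bars)
```

The first bar can never register a cross because there is no previous bar to compare against, so its x and z states are 0.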

5. Entry Logic: Controlled and Repeatable

Trades are entered only when all indicators align.

BUY Entry


x = 2 AND y = 2 AND z = 2

SELL Entry


x = 1 AND y = 1 AND z = 1

This strict confluence ensures clear trade intent and produces clean training samples.
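In code, the confluence check reduces to a row-wise comparison. The sketch below assumes a DataFrame with integer columns x, y, and z encoded as described in section 4, and reuses the direction convention from the exit definitions (1 = sell, 2 = buy). The value 0 for "no trade" and the function name are my own additions.

```python
import pandas as pd


def entry_signal(states: pd.DataFrame) -> pd.Series:
    """Return 2 (buy) or 1 (sell) where x, y and z all agree, otherwise 0."""
    cols = states[["x", "y", "z"]]
    buy = (cols == 2).all(axis=1)
    sell = (cols == 1).all(axis=1)

    signal = pd.Series(0, index=states.index, name="a")
    signal[buy] = 2
    signal[sell] = 1
    return signal


# Hand-made states: full buy, full sell, mixed, and no signal at all.
example = pd.DataFrame({"x": [2, 1, 2, 0], "y": [2, 1, 1, 0], "z": [2, 1, 2, 0]})
signals = entry_signal(example)
```

Only the first two rows produce a trade; a single disagreeing indicator is enough to block the entry.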

6. Exit Logic: Outcome Labeling Over Optimization

Exits are not optimized for maximum profit.
They are designed for consistent, unambiguous outcome labeling.

Definitions

b → number of 5-minute candles after entry
a → trade direction (1 = sell, 2 = buy)
c / e → current profit or loss
k, j → counters for consecutive losses

Exits are triggered by:

Time in trade
Profit/loss behavior
Drawdown and loss-streak protection

This ensures:

Trades close within predictable windows
Outcomes are well defined
Labels remain reliable for ML training
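To make the labeling idea concrete, here is a deliberately simplified sketch that covers only the time-in-trade exit. The 12-candle limit, the function name, and the rule "label 1 if the trade closes in profit" are assumptions for illustration; the profit/loss and loss-streak exits (c, e, k, j) are not modeled here.

```python
import pandas as pd


def label_trade(close: pd.Series, entry_pos: int, direction: int, max_bars: int = 12) -> dict:
    """Close a trade after at most max_bars candles and label the outcome.

    direction follows the post's convention: 1 = sell, 2 = buy.
    """
    entry_price = close.iloc[entry_pos]
    # Never read past the end of the available data.
    exit_pos = min(entry_pos + max_bars, len(close) - 1)
    b = exit_pos - entry_pos  # candles elapsed since entry

    move = close.iloc[exit_pos] - entry_price
    pnl = move if direction == 2 else -move  # a sell profits when price falls
    return {"b": b, "pnl": pnl, "label": int(pnl > 0)}


# Placeholder closes, only to exercise the function.
closes = pd.Series([1.10, 1.11, 1.12, 1.09])
buy_result = label_trade(closes, entry_pos=0, direction=2, max_bars=2)
sell_result = label_trade(closes, entry_pos=0, direction=1, max_bars=3)
late_buy = label_trade(closes, entry_pos=1, direction=2, max_bars=5)
```

Because every trade is forced to close within a fixed window, each entry ends up with exactly one label, which is the property the ML stage depends on.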

7. The Role of ML / AI in This System

The rule-based layer:

Generates structure
Captures intent
Labels reality honestly

The ML/AI layer:

Learns patterns the rules cannot express
Identifies market contexts where the strategy performs better
Filters low-probability trades
Estimates win probability instead of blindly executing rules

When done properly — with enough data, correct labeling, and strict validation — an ML/AI model can learn to predict which trade setups are more likely to win, even if the underlying rule set itself has only a ~50% win rate.
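One simple way to check whether a model really discriminates is to compare the win rate of the trades it would keep against the unfiltered baseline. The sketch below assumes you already have a predicted win probability per trade from some model; the function name, the 0.6 threshold, and the sample numbers are invented for illustration and are not results.

```python
import numpy as np


def filtered_win_rate(win_prob, outcomes, threshold=0.6):
    """Win rate of trades whose predicted win probability is >= threshold.

    Returns (win_rate, n_taken); win_rate is NaN when no trade passes the filter.
    """
    win_prob = np.asarray(win_prob, dtype=float)
    outcomes = np.asarray(outcomes, dtype=int)  # 1 = win, 0 = loss

    taken = win_prob >= threshold
    n_taken = int(taken.sum())
    win_rate = float(outcomes[taken].mean()) if n_taken else float("nan")
    return win_rate, n_taken


# Made-up probabilities and outcomes, only to exercise the function.
probs = [0.72, 0.41, 0.65, 0.55, 0.80, 0.30]
wins = [1, 0, 1, 0, 0, 1]
rate, n = filtered_win_rate(probs, wins, threshold=0.6)
```

This comparison only means something on trades the model never saw during training; measured on the training set it will look better than it is.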

The edge does not come from the rules alone.
It comes from the model’s ability to discriminate.

8. Final Thoughts

This strategy is intentionally imperfect.

Its purpose is to:

Be consistent, not clever
Capture every meaningful scenario
Produce massive, diverse datasets
Serve as a foundation for AI-driven decision making

A rule-based system that wins 50% of the time but records everything correctly is far more valuable than an over-optimized strategy that collapses outside backtests.

The rules collect the data.
The AI finds the edge.

Sunday, December 14, 2025

Generating RSI, MACD, and Stochastic Indicators in Python Using Pandas

I know I’ve covered parts of this before, but this time I decided to start fresh with a new project. What I’ve built so far is a data pipeline designed specifically for a machine learning–based trading system. The main reason for creating this pipeline is a limitation in MT5: it can only provide up to 80,000 rows of historical data, which is nowhere near enough for training a reliable machine learning model. For my use case, I need at least 2 million historical records to properly train and validate the model.

In this post, I’ll walk you through how I compute three essential technical indicators (RSI, MACD, and the Stochastic Oscillator), which form the foundation of my training dataset. In upcoming posts, I’ll dive into the trading strategy itself. I’ve already implemented and backtested the system, and the results are very promising. That said, I still plan to add more features and refinements so readers can learn from each step. Who knows, following this blog might even help someone build a path toward becoming a billionaire.

Technical indicators are the backbone of most trading strategies. All three indicators here are generated with pure Python and Pandas, starting from raw OHLC price data.

This approach is lightweight, transparent, and ideal for backtesting, signal generation, or machine learning feature engineering.

📊 Prerequisites

Your input data must contain the following columns:

time
open
high
low
close

The data is loaded from a CSV file (init_data.csv) and processed into a new dataset (training_data.csv) with the computed indicators.

🔹 Relative Strength Index (RSI)

RSI measures momentum by comparing recent gains and losses. It oscillates between 0 and 100 and is commonly used to identify overbought and oversold conditions.

Formula logic:

Compute price differences
Separate gains and losses
Calculate rolling averages
Convert to RSI scale


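def calculate_rsi(data, period=14):
    # Bar-to-bar change in the close; the first value is NaN.
    delta = data['close'].diff()
    # Split into gains and losses, both non-negative (where() turns the initial NaN into 0).
    gain = delta.where(delta > 0, 0)
    loss = -delta.where(delta < 0, 0)

    # Simple rolling means, not Wilder's smoothing; the first period-1 rows are NaN.
    avg_gain = gain.rolling(window=period).mean()
    avg_loss = loss.rolling(window=period).mean()

    # Relative strength; a window with gains but no losses gives rs = inf and RSI = 100.
    rs = avg_gain / avg_loss
    rsi = 100 - (100 / (1 + rs))
    return rsi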
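The function above averages gains and losses with a simple rolling mean. Most charting platforms instead use Wilder's smoothing, so their RSI values will differ slightly. As a hedged alternative, the variant below approximates Wilder's smoothing with an exponential average using alpha = 1/period. The function name is my own, and the seeding differs from Wilder's original, which starts from a simple average of the first period values.

```python
import pandas as pd


def calculate_rsi_wilder(data: pd.DataFrame, period: int = 14) -> pd.Series:
    """RSI using exponential smoothing with alpha = 1/period."""
    delta = data["close"].diff()
    gain = delta.clip(lower=0)        # positive changes, else 0
    loss = (-delta).clip(lower=0)     # size of negative changes, else 0

    # min_periods keeps the warm-up rows as NaN so dropna() can remove them later.
    avg_gain = gain.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    avg_loss = loss.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()

    rs = avg_gain / avg_loss
    return 100 - (100 / (1 + rs))


# Synthetic straight-line prices: the extremes of the RSI scale.
rising = pd.DataFrame({"close": [1.0 + 0.01 * i for i in range(30)]})
falling = pd.DataFrame({"close": [2.0 - 0.01 * i for i in range(30)]})
rsi_up = calculate_rsi_wilder(rising)
rsi_down = calculate_rsi_wilder(falling)
```

A steadily rising series ends at an RSI of 100 and a steadily falling one at 0, which is a quick sanity check for either implementation.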

🔹 Moving Average Convergence Divergence (MACD)

MACD is a trend-following momentum indicator based on Exponential Moving Averages (EMAs).

MACD Line = EMA(12) − EMA(26)
Signal Line = EMA(9) of MACD


def calculate_macd(data, short_period=12, long_period=26, signal_period=9):
    short_ema = data['close'].ewm(span=short_period, adjust=False).mean()
    long_ema = data['close'].ewm(span=long_period, adjust=False).mean()
    macd_line = short_ema - long_ema
    signal_line = macd_line.ewm(span=signal_period, adjust=False).mean()
    return macd_line, signal_line

MACD crossovers are commonly used to detect trend reversals and momentum shifts.

🔹 Stochastic Oscillator

The Stochastic Oscillator compares the current close to the recent price range.

%K shows the current position within the range
%D is a moving average of %K


def calculate_stochastic(data, k_period=14, d_period=3):
    low_min = data['low'].rolling(window=k_period).min()
    high_max = data['high'].rolling(window=k_period).max()

    percent_k = ((data['close'] - low_min) / (high_max - low_min)) * 100
    percent_d = percent_k.rolling(window=d_period).mean()

    return percent_k, percent_d

Values above 80 typically indicate overbought conditions, while values below 20 suggest oversold levels.

🔄 Updating the Dataset with Indicators

All indicators are computed and appended to the dataset in a single function. Rolling calculations naturally produce NaN values, which are removed afterward.


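def update_data(data):
    # Fail early, naming every missing column rather than only the first one.
    required_cols = ['time', 'open', 'high', 'low', 'close']
    missing = [col for col in required_cols if col not in data.columns]
    if missing:
        raise ValueError(f"Missing required column(s): {missing}")

    data['RSI'] = calculate_rsi(data)
    data['MACD_Line'], data['Signal_Line'] = calculate_macd(data)
    data['Stoch_K'], data['Stoch_D'] = calculate_stochastic(data)

    # Drop the warm-up rows where the rolling windows are not yet full.
    # Note: this modifies the DataFrame that was passed in.
    data.dropna(inplace=True)
    return data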

🚀 Final Output

The main program:

Loads init_data.csv
Computes RSI, MACD, and Stochastic
Saves the enriched dataset as training_data.csv
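The three steps above could be wired together roughly like this. It is a sketch, not the actual main program: the function name build_training_data is hypothetical, and the indicator function is passed in as a parameter so the snippet stays self-contained.

```python
import pandas as pd


def build_training_data(update_fn, input_path="init_data.csv", output_path="training_data.csv"):
    """Load raw OHLC data, enrich it with update_fn, and save the result."""
    data = pd.read_csv(input_path)
    enriched = update_fn(data)
    # index=False keeps the row index out of the CSV, since it carries no information.
    enriched.to_csv(output_path, index=False)
    return enriched


# Typical call, assuming update_data from the previous section is defined:
# build_training_data(update_data)
```

Passing the transformation in as an argument also makes it easy to swap in a different feature set later without touching the file handling.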

This output can be used for:

Strategy backtesting
Signal detection
Machine learning model training
Trade analytics

🧠 Final Thoughts

Generating indicators manually gives you full control and transparency over your trading logic. It also helps avoid black-box dependencies and makes your system easier to debug and extend.

In future posts, I’ll build on this foundation by:

Combining indicators into trading signals
Adding backtesting logic
Preparing features for machine learning models

If you’re building your own trading system, this is a solid place to start.

Friday, December 12, 2025

The Myth of Guaranteed ML Profits in Forex Trading

The idea is seductive: What if you could build a machine learning (ML) system that guarantees winning in forex trading? In theory, such a breakthrough would change not only personal wealth but the structure of global financial markets. In reality, however, markets are far more complex, adaptive, and unforgiving.

This article breaks down what would theoretically happen, why guarantees don’t exist, and what is actually achievable—and extremely valuable—when ML is used correctly in forex trading.

Theoretically: What Would Happen If a Guaranteed System Existed

If a forex ML system were truly and provably consistent, the consequences would be dramatic.

You Would Outperform Banks and Hedge Funds

Most large financial institutions already deploy advanced ML, AI, and quantitative models. A guaranteed system would outperform even these players, giving its owner an unprecedented edge over banks, hedge funds, and market makers.

Compounding Would Make You Extremely Wealthy

Even without spectacular win rates, compounding does the heavy lifting. A modest, consistent edge—say a 55–60% win rate with solid risk–reward (RR)—can grow capital exponentially over time.

Liquidity Becomes Your Enemy

Success creates its own problem. As position sizes grow, your trades start to move the market. Slippage increases, fills worsen, and the very edge that made you profitable begins to decay.

Brokers and Regulators Take Notice

Unusual consistency triggers attention. Accounts may be flagged, spreads may widen, execution quality may degrade, or regulatory scrutiny may increase. Markets do not reward anomalies for long.

Reality Check: Why “Guaranteed” Does Not Exist

Forex markets are structurally hostile to certainty.

The Nature of Forex Markets

Non-stationary – Patterns change over time
Reflexive – Traders influence the very markets they trade
Noise-dominated – Randomness overwhelms short-term signals
Exposed to black swans – Wars, central bank shocks, flash crashes

The Limits of Machine Learning

ML learns historical correlations, not future truths
Models overfit easily to past data
Performance collapses when market regimes change (rate cycles, crises)

Even the most advanced firms—Renaissance Technologies, Citadel, JPMorgan—do not possess guaranteed models. They operate on probabilistic edges, not certainty.

What Is Actually Achievable (And Extremely Valuable)

The real holy grail is not perfection—it is robust expectancy.

A small, repeatable statistical edge combined with strict risk control.

A Realistic Example

Win rate: 52–58%
Risk–Reward: 1:1.5 or higher
Risk per trade: 0.25–1%
Drawdown: Strictly capped

A trader who can sustain these numbers is already ahead of the vast majority of retail traders.
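The numbers in the example translate into expectancy per trade, measured in units of risk (R). The sketch below ignores spread, commission, and slippage, and the function names are my own. At a reward-to-risk of 1.5, a 52% win rate works out to roughly 0.30R per trade and 58% to roughly 0.45R, while break-even sits at a 40% win rate.

```python
def expectancy_r(win_rate: float, reward_risk: float) -> float:
    """Expected profit per trade in units of risk: wins pay reward_risk, losses cost 1."""
    return win_rate * reward_risk - (1.0 - win_rate)


def breakeven_win_rate(reward_risk: float) -> float:
    """Win rate at which expectancy is exactly zero for a given reward-to-risk."""
    return 1.0 / (1.0 + reward_risk)


low = expectancy_r(0.52, 1.5)
high = expectancy_r(0.58, 1.5)
breakeven = breakeven_win_rate(1.5)
```

This is why a win rate only slightly above 50% can still be a real edge, provided the average win is larger than the average loss.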

ML’s Best Role in Forex (Where It Actually Works)

Machine learning excels as a support system, not a crystal ball.

1. Regime Detection

Trending vs. ranging markets
High vs. low volatility environments
News- and event-sensitive periods

2. Trade Filtering

Identifying when not to trade
Avoiding low-quality, low-probability setups

3. Position Sizing and Risk Control

Dynamic position sizing
Volatility-adjusted stop losses
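A minimal sketch of fixed-fractional sizing with a volatility-based stop is shown below. It assumes the stop distance is a multiple of ATR and that the account currency matches the quote currency (for example EURUSD in a USD account), so one unit gains or loses exactly the price move. The function name and the default multiple of 2 are illustrative choices.

```python
def position_units(equity: float, risk_fraction: float, atr: float, atr_multiple: float = 2.0):
    """Return (units, stop_distance) so that hitting the stop loses equity * risk_fraction."""
    stop_distance = atr * atr_multiple
    if stop_distance <= 0:
        raise ValueError("ATR and atr_multiple must be positive")

    risk_amount = equity * risk_fraction
    return risk_amount / stop_distance, stop_distance


# Example: 10,000 account, 0.5% risk, ATR of 10 pips (0.0010) on EURUSD.
units, stop = position_units(equity=10_000, risk_fraction=0.005, atr=0.0010)
```

In this example the stop sits 20 pips away and the position is about 25,000 units. When volatility doubles, the stop widens and the position size halves, so the amount at risk stays the same.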

4. Ensemble Decision Systems

Combining ML with rules-based strategies
Using confidence scoring, not absolute buy/sell predictions

A Practical Path for Serious Traders

If you have a technical background, the correct path is disciplined and structured:

Define a rule-based strategy first
Use ML to:
- Improve entries and exits
- Filter poor trades
Backtest rigorously across:
- Multiple currency pairs
- Multiple years
- Multiple market regimes
Forward test (paper trading → small capital → gradual scaling)
Expect months of drawdowns, even with a valid edge

This is how professionals build systems that last.

The Brutal Truth

If someone genuinely possessed a guaranteed-winning forex ML system, they would:

Not sell it
Not advertise it
Not trade retail-sized accounts
Use it quietly, with strict capital limits

Sunday, December 7, 2025

📊 Building a Robust EURUSD Data Pipeline with Python and MetaTrader 5

Are you tired of manually downloading historical data for your Forex trading strategies? To build truly effective backtests and AI models, you need a clean, persistent, and automated data pipeline.

This post will guide you through creating a Python script that uses the MetaTrader 5 (MT5) terminal to automatically manage a data file for the EURUSD pair. Our pipeline will handle two critical tasks: an initial bulk download and seamless daily updates.

🛠️ Prerequisites

To follow this tutorial, you'll need:

MetaTrader 5 Terminal: Installed and running (even in the background).
Python: (3.8+ recommended).
Required Libraries: Install them using pip:

pip install MetaTrader5 pandas

Step 1: Connecting to the Data Source

Our first step is establishing a secure, programmatic connection to your MT5 terminal using your demo account credentials. We'll use the initialize_mt5() function to handle this cleanly.

Note: We are using 5-minute bars (mt5.TIMEFRAME_M5) and targeting a MetaQuotes Demo server for this example.
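For orientation, a connection helper along these lines might look like the sketch below. It is an assumption about the shape of initialize_mt5(), not the script's actual code. It relies on mt5.initialize() accepting login, password, and server keyword arguments and on mt5.last_error() for diagnostics. The optional mt5_module parameter is my own addition so the function can be exercised without a running terminal.

```python
def initialize_mt5(login: int, password: str, server: str, mt5_module=None) -> bool:
    """Connect to the MT5 terminal; return True on success, False otherwise."""
    if mt5_module is None:
        # Imported lazily: the MetaTrader5 package only works where the terminal is installed.
        import MetaTrader5 as mt5_module

    if not mt5_module.initialize(login=login, password=password, server=server):
        print("MT5 initialize failed:", mt5_module.last_error())
        return False
    return True


# Typical call with placeholder demo credentials (replace with your own):
# initialize_mt5(login=12345678, password="your-password", server="MetaQuotes-Demo")
```

Remember to call mt5.shutdown() once the data has been fetched so the connection is released.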

Step 2: The Core Logic: Initial Fetch vs. Daily Update

The true power of this pipeline is its ability to switch modes. When the script runs, it first checks if the target file, init_data.csv, exists.

🚀 Mode A: Initial Data Load

If init_data.csv is not found, we assume this is the first run. We execute a bulk download of the last 80,000 bars (5-minute interval) using mt5.copy_rates_from_pos. This ensures you have a strong foundational dataset.

🔄 Mode B: Seamless Daily Update

If init_data.csv is found, the script switches to update mode. Since you plan to run this after the market closes (or once per day), we only fetch data for the previous full trading day to avoid gaps and duplicates.

We use a helper function, get_last_trading_day_dates(), to determine the precise start and end times, and then use mt5.copy_rates_range() to pull the specific 24-hour block of data.
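One possible shape for the date helper is sketched below. It is a simplification under stated assumptions: it takes the current time as an argument (the real helper may read the clock itself), treats Saturday and Sunday as non-trading days, and ignores broker server time zones and holidays.

```python
from datetime import datetime, timedelta


def get_last_trading_day_dates(now: datetime):
    """Return (start, end) covering the most recent full weekday before `now`."""
    # Start from midnight of yesterday, then walk back over the weekend if needed.
    day = (now - timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day, day + timedelta(days=1)


# Run on Monday, December 8, 2025: the previous trading day is Friday, December 5.
start, end = get_last_trading_day_dates(datetime(2025, 12, 8, 22, 0))
```

The resulting pair can then be passed as the date_from and date_to arguments of mt5.copy_rates_range() together with the symbol and mt5.TIMEFRAME_M5.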

Step 3: The Complete Data Pipeline Script

Here is the complete, robust code. You can save this as a Python file (e.g., eurusd_pipeline.py) and set it up to run once daily via a cron job (Linux/macOS) or Task Scheduler (Windows).