ChatGPT as a Short-Term Alpha Model? Intraday Prediction from Headlines

Can ChatGPT predict stocks? According to this paper, yes: GPT-4 scores on news headlines drive a daily long-short strategy with a Sharpe ratio of 3.28 and a cumulative return above 650% before transaction costs. The predictive power appears only in large models.

💡 Takeaway:
A daily long-short strategy driven by ChatGPT, a model with no explicit financial training, earned over 650% cumulative return (before transaction costs) using only news headlines.


Key Performance Metrics
📊 How Well Does This Strategy Perform?

  • Sharpe Ratio: 3.28 (long-short)
  • Average Daily Return: 0.38%
  • Max Drawdown: -17.4%
  • Long Leg Sharpe: 0.90
  • Short Leg Sharpe: 2.12
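
For readers who want to compute the same kind of statistics on their own backtest, here is a minimal sketch. It assumes a 252-day annualization, a zero risk-free rate, and compounded returns; the paper's exact conventions may differ, and the sample returns are made up for illustration.

```python
import math

TRADING_DAYS_PER_YEAR = 252  # assumed annualization convention


def annualized_sharpe(daily_returns, periods=TRADING_DAYS_PER_YEAR):
    """Annualized Sharpe ratio from daily returns (risk-free rate assumed zero)."""
    n = len(daily_returns)
    mean = sum(daily_returns) / n
    variance = sum((r - mean) ** 2 for r in daily_returns) / (n - 1)  # sample variance
    return mean / math.sqrt(variance) * math.sqrt(periods)


def max_drawdown(daily_returns):
    """Largest peak-to-trough decline of the compounded equity curve (<= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


def cumulative_return(daily_returns):
    """Total compounded return of $1 invested over the whole sample."""
    equity = 1.0
    for r in daily_returns:
        equity *= 1.0 + r
    return equity - 1.0


# Made-up daily returns, for illustration only (not data from the paper).
example = [0.010, -0.020, 0.015, 0.005, -0.010]
print(annualized_sharpe(example), max_drawdown(example), cumulative_return(example))
```

Applied separately to the long leg, the short leg, and the combined portfolio, these functions would produce figures comparable in kind to the ones listed above.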

Key Idea: What Is This Paper About?

The authors test whether ChatGPT can predict short-term stock price movements using just news headlines. It can. GPT-4 scores outperform traditional sentiment tools, with stronger returns on short recommendations and for small-cap stocks. This ability appears only in large models—hinting at a threshold where AI becomes financially useful.


Economic Rationale: Why Should This Work?

ChatGPT’s predictive power stems from how quickly markets absorb (or fail to absorb) new information. The model exploits behavioral and structural frictions that delay price adjustment, particularly for complex or negative news.

📌 Relevant Economic Theories and Justifications:

  • Underreaction (Bernard & Thomas, 1989):
    Investors often respond slowly to new information, especially when it’s unexpected or contradicts past trends. This creates short-term return predictability after news events.

  • Limits to Arbitrage (Shleifer & Vishny, 1997):
    Even if mispricings are visible, they persist when arbitraging them is costly or risky, for example in illiquid stocks or under short-selling constraints. This helps explain why ChatGPT’s predictive edge is stronger in small-cap stocks and around negative news.

  • Information Capacity Constraints (Epstein & Schneider, 2008):
    Human investors struggle to process complex or ambiguous information quickly. Large language models like ChatGPT can analyze and interpret these headlines faster and more accurately, extracting insights humans may miss.

  • Threshold Effect in AI Forecasting:
    The study finds that only large models (e.g., GPT-4) exhibit predictive skill. Once an LLM reaches a sufficient size, abilities such as interpreting difficult financial text and forecasting returns emerge. Smaller models like GPT-2 or BERT fall short.

📌 Why It Matters:
These frictions are real, persistent, and exploitable. ChatGPT can bypass human processing limits and extract alpha—especially in short-horizon trading where timely reaction to news is critical.


How to Do It: Data, Model, and Strategy Implementation

Data Used

  • Sources: News headlines scraped and matched with RavenPack relevance scores
  • Period: Oct 2021 – Dec 2023
  • Assets: All US common stocks with news coverage

Model / Methodology

  • LLM Used: ChatGPT-4 (gpt-4-0314, temperature = 0)

  • Prompt Used:

    Forget all your previous instructions.
    Pretend you are a financial expert. You are a financial expert with stock recommendation experience.
    Answer “YES” if good news, “NO” if bad news, or “UNKNOWN” if uncertain in the first line.
    Then elaborate with one short and concise sentence on the next line.
    Is this headline good or bad for the stock price of {COMPANY_NAME} in the short term?

    Headline: {HEADLINE_TEXT}

  • Output Mapping:

    • YES → 1
    • NO → -1
    • UNKNOWN → 0
  • Final Signal: “GPT-4 Score” ∈ {-1, 0, 1}
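
The prompt and mapping above can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' code: the helper names `build_prompt` and `parse_score` are hypothetical, the call to the model itself is left out, and treating any unrecognized first line as UNKNOWN (0) is a choice made here.

```python
import re

# Prompt text copied from the summary above; the two placeholders are filled per headline.
PROMPT_TEMPLATE = (
    "Forget all your previous instructions.\n"
    "Pretend you are a financial expert. "
    "You are a financial expert with stock recommendation experience.\n"
    "Answer “YES” if good news, “NO” if bad news, "
    "or “UNKNOWN” if uncertain in the first line.\n"
    "Then elaborate with one short and concise sentence on the next line.\n"
    "Is this headline good or bad for the stock price of {COMPANY_NAME} "
    "in the short term?\n"
    "\n"
    "Headline: {HEADLINE_TEXT}"
)

SCORE_BY_ANSWER = {"YES": 1, "NO": -1, "UNKNOWN": 0}


def build_prompt(company_name, headline_text):
    """Fill the template for one (company, headline) pair."""
    return PROMPT_TEMPLATE.format(
        COMPANY_NAME=company_name, HEADLINE_TEXT=headline_text
    )


def parse_score(response_text):
    """Map the first word of the model's first line to a score in {-1, 0, 1}.

    Assumption: anything other than YES or NO is scored 0, like UNKNOWN.
    """
    lines = response_text.strip().splitlines()
    first_line = lines[0] if lines else ""
    match = re.match(r"\W*([A-Za-z]+)", first_line)  # skip quotes and punctuation
    answer = match.group(1).upper() if match else "UNKNOWN"
    return SCORE_BY_ANSWER.get(answer, 0)


# Hypothetical company and reply, for illustration only.
print(build_prompt("Acme Corp", "Acme Corp raises full-year guidance"))
print(parse_score("YES\nHigher guidance signals stronger expected earnings."))
```

Parsing only the first word keeps the mapping robust to small formatting differences in the reply, such as a trailing period or surrounding quotes.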

Trading Strategy (suggestion)

  • Signal Generation:
    • Buy stocks with a YES score (+1)
    • Short-sell stocks with a NO score (-1)
  • Portfolio Construction:
    • Equal-weighted long-short
    • Rebalanced daily
    • Entry: Pre-market (if news before 9am) or next open (if post-close)
  • Intraday Strategy Variants:
    • Entry 15min post-release, hold to close
    • Hold overnight
  • Enhancements:
    • Filter for stock liquidity
    • Focus on complex or press-release-based headlines
    • Adjust weighting to mitigate single-stock impact
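
A minimal sketch of the equal-weighted long-short construction for a single day is shown below. It assumes one score per ticker per day and that an empty leg contributes zero; both are simplifications made here, and the tickers and returns are invented. Transaction costs, liquidity filters, and entry timing are ignored.

```python
def long_short_return(scores, returns):
    """One-day return of an equal-weighted long-short portfolio.

    scores:  dict ticker -> score in {-1, 0, 1}
    returns: dict ticker -> holding-period return for that day
    Assumption: an empty leg contributes 0 to the portfolio return.
    """
    longs = [returns[t] for t, s in scores.items() if s == 1 and t in returns]
    shorts = [returns[t] for t, s in scores.items() if s == -1 and t in returns]

    long_leg = sum(longs) / len(longs) if longs else 0.0
    # A short position gains when the stock falls, hence the sign flip.
    short_leg = -sum(shorts) / len(shorts) if shorts else 0.0
    return long_leg + short_leg


# Invented tickers and returns, for illustration only.
scores = {"AAA": 1, "BBB": 1, "CCC": -1, "DDD": 0}
returns = {"AAA": 0.02, "BBB": 0.00, "CCC": -0.03, "DDD": 0.05}
print(long_short_return(scores, returns))
```

Repeating this for each trading day gives the daily return series from which Sharpe ratios and drawdowns are computed. Stocks scored 0 (UNKNOWN) are simply left out of both legs.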

Key Figure from the Paper

📊 Reference: [Figure 2 – Cumulative Returns of $1 Investment in GPT-4 Strategy]

📌 Explanation:

  • The GPT-4 long-short strategy (blue) dominates the market return (gray).
  • Over 650% cumulative gain (without transaction costs).
  • Predictability comes from both long and short legs, but especially from shorting on negative news.

Final Thought

💡 ChatGPT doesn’t just chat—it can trade. LLMs are now return predictors. 🚀


Paper Details (For Further Reading)

  • Title: Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models
  • Authors: Alejandro Lopez-Lira, Yuehua Tang
  • Publication Year: 2024
  • Journal/Source: Working Paper (University of Florida)
  • Link: https://arxiv.org/abs/2304.07619
