Multi-Agent LLM vs Single-Prompt ChatGPT for Stock Analysis

If you've ever pasted "should I buy AAPL?" into ChatGPT, you got back a paragraph that hedges every direction and ends with "consult a financial advisor". That isn't because the model can't reason about stocks — it's because one prompt is trying to do five jobs and ends up doing none of them well.

The single-prompt failure mode

When you ask a single model to analyse a stock end-to-end, it tries to be a fundamentals analyst, a chart reader, a news scanner, a macro economist, and a portfolio manager simultaneously — in one context window. Three things go wrong:

First, attention is finite. The model spends its "thinking budget" on whichever angle the prompt emphasised most, neglecting the others. Second, conflicting signals get smoothed. If fundamentals say BUY and technicals say SELL, a single-prompt answer averages them into a HOLD with low confidence — which is rarely the optimal trade. Third, there's no adversarial pressure. A bull case never gets seriously challenged because the same model wrote both the bull and the bear sides at the same time.

This isn't a flaw in the model — it's a flaw in how you're using it. Putting a brilliant generalist in five jobs at once doesn't make them five times more productive.

What changes with role separation

Our pipeline runs five specialist analysts with separate prompts, separate context windows, and separate sources of evidence:

The fundamentals analyst sees only SEC EDGAR filings and computed ratios. It is not allowed to look at price action.

The technical analyst sees only OHLCV and our Alpha158-lite factor library. It is not allowed to look at news.

The sentiment analyst sees only Reddit / 东方财富股吧 posts. It is not allowed to see fundamentals.

The news analyst sees only headlines and dates. It is not allowed to see chart patterns.

The macro analyst sees only FRED + OpenBB time-series. It is not allowed to see ticker-specific data.

Each one produces a structured opinion: thesis, evidence, confidence. Then — and this is the key step — a bull persona and a bear persona read all five reports and write opposing pitches. Then a trader synthesises. Then a risk committee kills the trade if leverage or position size violates limits. Finally a manager signs off with a confidence-weighted BUY / HOLD / SELL.

Why the debate step matters

This is the piece a single prompt cannot do. When you ask one model to argue both sides, it produces symmetrical, hedge-y arguments. When you ask two separate instances — one explicitly told it's a bull, one explicitly told it's a bear — you get the strongest version of each case. The trader role then has real material to weigh, instead of synthesising from a pre-smoothed average.

The debate architecture is designed to make uncertainty explicit, but calibration still has to be proven out of sample. We publish completed live decisions now and leave performance blank until a reproducible backtest is verified. The current record is on /track-record.

"Just use a longer prompt"

People try this. The problem isn't prompt length; it's context contamination. Once a model has seen the price chart, its reading of the 10-K is anchored. Role separation is the mechanism that prevents anchoring — it's the same reason real investment committees give analysts independent assignments before convening.

The cost trade-off

Honest answer: multi-agent is more expensive. A single ChatGPT prompt costs <$0.01. Our pipeline averages $0.04–$0.10 per decision because it runs 8–11 model calls. That's still nothing per decision, but it's an order of magnitude more.

The router uses only providers configured in the running deployment. It may try compatible models within that real provider family; if all real attempts fail, the decision fails explicitly rather than switching to a mock response. See the cost model →

See the full 7-stage pipeline →

When single-prompt is fine

Asking ChatGPT "explain what a P/E ratio is" or "summarise Apple's latest 10-K": single prompt is the right tool. Conceptual lookup, summarisation, definitions — one model, one prompt, done.

Asking for a directional trading decision with calibrated confidence: that's where role separation pays for itself.