I built a Stock Market prediction tool using ChatGPT and other OpenAI APIs.
All right, it’s no surprise that AI has recently amazed many people and is capable of more than we would have expected it to at this stage.
In the past months, I’ve been researching the origins of AI, machine learning, neural networks, supervised and unsupervised training methods, fine-tuning base models and expanding capabilities with embedding features, and more.
I have been creating AI-powered tools for a lot of my day-to-day projects, like:
A GPT-4 powered automated database design system that writes the scripts for an entire database in minutes rather than days. I list the data tables I need and explain the desired outcome, and it designs the database and writes a script to create each table in PostgreSQL DDL and TypeORM.
Custom search features and data generation directly in our project management tool (a custom tool that we use for engineering purposes). Creating test data used to take anywhere from a few minutes to hours. With AI, it takes a few seconds.
What’s next?
I wanted to see what else I could build that wasn’t solely focused on technical development. The database automation tool is incredibly useful but still very “static.” Anyone who knows how to design a database can do the same work; it just takes them longer.
If you look at the more recent LLMs, you’ll discover they are incredible at completing text or code, summarizing, highlighting, and matching similarity. But one of their essential features is that most have been trained on billions of documents, including a large share of the public internet.
Some industries, like law, medicine, technology, aerospace, and engineering, rely heavily on “knowledge,” historical data, and documentation. This is why people study for so long to enter these industries; they need the knowledge to operate in them.
Another industry that relies heavily on historical data and reasoning is finance, and more specifically the stock market.
If you look into what investors (both long and short term) do, it is mostly two things: keeping track of data like company financials, SEC filings, trade volume, trade price, and indices, and filtering and reasoning about news articles.
AI can perform each of these tasks, only much faster and better, so I decided to get to work.
Stage 1
I initially started the project as a test to see if it could predict market moves based on news headlines. I built an automated script that fetched the latest top news from one financial news outlet every hour and asked the model to do the following for each article:
Rate from 1 to 5 how much it expected the article to affect a certain market or stock ticker, based on historical data.
State whether the impact is expected to be negative or positive.
Explain its prediction with historical examples.
Get the ticker, if any.
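A minimal sketch of how one such hourly scoring step could look in Python. The prompt wording, the JSON reply format, and the names PROMPT_TEMPLATE, Prediction, build_prompt, and parse_prediction are all assumptions made for illustration, not the actual prompt or code behind the tool. The call to the model itself is left out; the sketch only builds the request text and validates the reply.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical prompt; the real wording used by the tool is not known.
PROMPT_TEMPLATE = """You are a financial news analyst.
Headline: {headline}

Reply with a JSON object containing:
  "score": integer 1-5, how strongly this could move a market or stock
  "impact": "positive" or "negative"
  "explanation": short reasoning that cites historical examples
  "ticker": the stock ticker concerned, or null if there is none
"""


@dataclass
class Prediction:
    headline: str
    score: int
    impact: str
    explanation: str
    ticker: Optional[str]


def build_prompt(headline: str) -> str:
    """Fill the headline into the prompt that is sent to the model."""
    return PROMPT_TEMPLATE.format(headline=headline)


def parse_prediction(headline: str, raw_reply: str) -> Prediction:
    """Validate the model's JSON reply and turn it into a Prediction."""
    data = json.loads(raw_reply)
    score = int(data["score"])
    if not 1 <= score <= 5:
        raise ValueError(f"score out of range: {score}")
    impact = str(data["impact"]).lower()
    if impact not in ("positive", "negative"):
        raise ValueError(f"unexpected impact: {impact}")
    return Prediction(
        headline=headline,
        score=score,
        impact=impact,
        explanation=str(data.get("explanation", "")),
        ticker=data.get("ticker") or None,
    )
```

Validating the reply before storing it matters here because a model can return a score outside the 1–5 range or free text instead of JSON, and a bad record would distort any later accuracy check.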
I now had something solid: each article had a predicted impact score from 1 to 5 and an explanation of what could happen based on historical data. About 8 hours later, I started checking how the predictions made at the start of the day held up against the end-of-day results, and to my surprise, over 70% of the predictions were correct.
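For reference, a check like this can be expressed as a simple hit rate. The sketch below is an assumption about how such a comparison might be scored, not the method actually used: a prediction counts as correct when the predicted direction matches the sign of the move from open to close. The function name hit_rate and the data shapes are hypothetical.

```python
from typing import Dict, List, Tuple


def hit_rate(
    predictions: List[Tuple[str, str]],
    day_prices: Dict[str, Tuple[float, float]],
) -> float:
    """Share of predictions whose direction matched the day's move.

    predictions: (ticker, "positive" or "negative") pairs.
    day_prices:  ticker -> (open price, close price) for the same day.
    Tickers without price data and flat days are skipped (an assumption).
    """
    checked = 0
    hits = 0
    for ticker, impact in predictions:
        if ticker not in day_prices:
            continue
        open_price, close_price = day_prices[ticker]
        if close_price == open_price:
            continue
        checked += 1
        went_up = close_price > open_price
        if went_up == (impact == "positive"):
            hits += 1
    return hits / checked if checked else 0.0
```

A single day of headlines is a small sample, so a figure computed this way is better read as an encouraging signal than as a measured accuracy.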
Stage 2
I was now pretty excited and decided to develop the tool further.
First, I researched and connected many other news outlets and data sources: media outlets, government APIs, RSS feeds, stock exchange alerts, live SEC filings, and more. I now have over 300 sources connected.
The second thing I did was create a filtered list of articles with a prediction score of 3 or more.
I then connected a real-time stock market API to get the current price, volume, and all historical price data for a certain stock ticker.
I combined this data with the reasoning of the article and any other articles related to the ticker or company for that day.
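The filtering and combining steps can be sketched roughly as follows. This is an illustration under assumptions: articles are plain dictionaries with "ticker" and "score" keys, quotes is whatever the market data API returned per ticker, and build_ticker_context is a hypothetical name. A ticker qualifies when at least one of its articles scored 3 or more, and all of that day's articles for the ticker are then kept alongside its quote.

```python
from collections import defaultdict
from typing import Dict, List

MIN_SCORE = 3  # threshold taken from the text: a prediction score of 3 or more


def build_ticker_context(
    articles: List[dict], quotes: Dict[str, dict]
) -> Dict[str, dict]:
    """Group the day's articles by ticker and attach market data.

    Only tickers with at least one article at or above MIN_SCORE are kept,
    but every article about a kept ticker is included as related context.
    """
    by_ticker = defaultdict(list)
    for article in articles:
        if article.get("ticker"):
            by_ticker[article["ticker"]].append(article)

    context = {}
    for ticker, related in by_ticker.items():
        if max(a["score"] for a in related) < MIN_SCORE:
            continue
        context[ticker] = {
            "articles": related,
            "quote": quotes.get(ticker),  # None if the API had no data
        }
    return context
```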
I then asked it to run 2 more analyses:
Explain whether a security’s price was under- or overvalued based on each day’s closing price for the last 5 years, together with daily volume, company financials, and news.
Explain what it would expect to happen over the next few weeks, considering the articles and the daily volume and price of the last 3 years.
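The two analyses look at different history windows (5 years and 3 years of daily data). A small sketch of selecting those windows from a list of daily bars is shown below. The Bar layout and the function names are assumptions for illustration, and how the selected data is actually passed to the model is not covered here.

```python
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical daily bar: (trading day, closing price, volume)
Bar = Tuple[date, float, int]


def years_back(today: date, years: int) -> date:
    """Return the same calendar day `years` years earlier."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return today.replace(year=today.year - years, day=28)


def bars_in_window(bars: List[Bar], today: date, years: int) -> List[Bar]:
    """Keep only the bars from the last `years` years up to `today`."""
    start = years_back(today, years)
    return [bar for bar in bars if start <= bar[0] <= today]


def analysis_inputs(bars: List[Bar], today: date) -> Dict[str, List[Bar]]:
    """Select the history for each analysis: 5 years and 3 years."""
    return {
        "valuation": bars_in_window(bars, today, 5),
        "outlook": bars_in_window(bars, today, 3),
    }
```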
Stage 3
A few days later, I decided to combine the filtered data into different “lists”:
Hourly stock alerts. (Mostly focused on news headlines and historical data)
Daily “Long-term” analysis on undervalued stocks. (Based on news articles, financial statements and reports, and legal news)
A list of stock tickers I follow, for which I get a weekly valuation report and a prediction of the next moves based on all the data mentioned above.
Stage 4
I now had a solid system that essentially did the work of hundreds of stock brokers: It analyzed hundreds of articles per hour, filtered only the ones it expected to change markets, performed an in-depth analysis based on financial data, and based its decisions on all the historical data.
I then put each “List” into an email that I send to myself regularly.
Below is an example of the “Hourly stock alert” email that I send myself, which includes a list of tickers that are expected to make market moves with a prediction score of 3 or more.
The email lists all tickers with just the score and impact on top and a more in-depth explanation underneath. I’ve also included a link for each ticker, which takes me to a page where I can read the full analysis the tool made and perform more analysis if needed.
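A rough sketch of assembling an email with that layout, using Python’s standard library. The alert dictionaries, the function name build_alert_email, the subject line, and the example.com link are placeholders chosen for illustration; the real email format and URLs are not shown in this post.

```python
from email.message import EmailMessage
from typing import List

# Hypothetical base URL for the per-ticker analysis page.
BASE_URL = "https://example.com/analysis"
MIN_SCORE = 3  # only tickers with a prediction score of 3 or more


def build_alert_email(
    alerts: List[dict], sender: str, recipient: str
) -> EmailMessage:
    """Build a plain-text alert: summary lines on top, details underneath."""
    kept = [a for a in alerts if a["score"] >= MIN_SCORE]
    ranked = sorted(kept, key=lambda a: a["score"], reverse=True)

    # Top section: one line per ticker with just the score and impact.
    summary = [
        f'{a["ticker"]}: score {a["score"]}/5, {a["impact"]}' for a in ranked
    ]
    # Lower section: the explanation and a link to the full analysis.
    details = [
        f'{a["ticker"]}\n{a["explanation"]}\n'
        f'Full analysis: {BASE_URL}/{a["ticker"]}'
        for a in ranked
    ]
    body = "\n".join(summary) + "\n\n" + "\n\n".join(details)

    message = EmailMessage()
    message["Subject"] = f"Hourly stock alert: {len(ranked)} tickers"
    message["From"] = sender
    message["To"] = recipient
    message.set_content(body)
    return message
```

The resulting message can be handed to smtplib’s SMTP.send_message, with the server and credentials depending on the mail provider.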
The predictions shown below are 80% correct:
Further improvements:
Currently, I am focusing on strengthening the model by training it further, using the collected data to correct the kinds of inaccurate predictions it made earlier.
If you would like to get involved or have any recommendations for investing strategies, news sources, or other relevant information, please feel free to contact me.
This tool is not official financial advice. The tool’s predictions are based on machine learning outputs combined with real-time market data, news data, and 25 years of historical market data.