Benefits of SMA Machine Learning Algorithms

Social Market Analytics has extensive intellectual property in three distinct areas: topic model creation, account filtering and natural language processing (NLP). I have written blog posts about SMA topic model creation capabilities and the impact of our account filtering algorithms. This blog answers the question: “Do your machine learning algorithms really add value to the NLP process?” The answer is yes. The chart below illustrates the statistically significant benefits of Social Market Analytics Machine Learning Algorithms in isolation.

The start date for this analysis is 11/20/2018 and the end date is 4/30/2019. This period was chosen because of the significant market drawdown in December. We compare three dictionary rule sets: a static dictionary as of the start date, a static dictionary as of the end date, and the point-in-time dictionary used in production. Our patented NLP scores Tweets using the dictionary in effect for each rule set, and S-Scores are calculated from the resulting Tweet scores. The point-in-time dictionary reflects word additions, phrases, and grammatical logic as they are made. We then compare the resulting predictive returns.

We isolate the impact of our NLP process by turning off account filtering applied to the Twitter stream.  To ensure we are pulling Tweets only discussing companies and securities, we are using our topic model filtering algorithms.  We regularly publish our full return charts to illustrate the impact of our entire process. 

Let us start by defining the lines in our chart.

Red Line = Tweets are scored using our dictionary of words and phrases as of 11/20/2018. This illustrates the performance with no machine learning applied on a go-forward basis. This is the base case and represents the least amount of learned information.

Black Line = Tweets are scored using words and phrases applied Point-In-Time.  This is the production feed SMA customers receive.  We use Supervised and Unsupervised Machine Learning.  There are impacts from both during this period.

Green Line = Represents the Perfect Information scenario. Take the most up-to-date dictionary of words and phrases (4/30/2019) and apply it backwards. All information learned during the volatile period is included. This represents the values expected to be received on a go-forward basis.
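The three lines differ only in which dictionary entries are visible when a Tweet is scored. Here is a minimal Python sketch of that idea. The toy word list, the addition dates, and the simple additive scoring rule are all invented for illustration; this is not SMA's dictionary or patented NLP.

```python
from datetime import date

# Hypothetical dictionary: term -> (tone weight, date the term was added).
DICTIONARY = {
    "beat": (1.0, date(2018, 1, 5)),
    "downgrade": (-1.0, date(2018, 3, 1)),
    "selloff": (-1.5, date(2018, 12, 10)),  # learned during the drawdown
}

START, END = date(2018, 11, 20), date(2019, 4, 30)


def score_tweet(text, as_of):
    """Sum the weights of terms that were already in the dictionary on `as_of`."""
    words = text.lower().split()
    return sum(
        weight
        for term, (weight, added) in DICTIONARY.items()
        if added <= as_of and term in words
    )


def score_three_ways(text, tweet_date):
    """Score one Tweet under the red, black and green scenarios."""
    return {
        "red_static_start": score_tweet(text, START),
        "black_point_in_time": score_tweet(text, tweet_date),
        "green_static_end": score_tweet(text, END),
    }


# The same Tweet text, sent before and after "selloff" entered the dictionary.
early = score_three_ways("Broad selloff after downgrade", date(2018, 12, 3))
late = score_three_ways("Broad selloff after downgrade", date(2019, 1, 15))
```

In this toy setup the point-in-time score matches the red line until the new term is learned and matches the green line afterwards.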

The charts below represent the cumulative Open to Close return of securities selected based on S-Score 20 minutes prior to market open.  S-Score measures the tone of the current conversation relative to historical benchmarks.  We select securities with an |S-Score| > 2.  Securities with S-Score > 2 are purchased on the open.  Securities with S-Score < -2 are sold short on the open.  SMA Chart lines represent a theoretical long/short portfolio. Isolated long and short sides are available upon request. 
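The selection rule above can be sketched in a few lines of Python. Equal weighting across positions is an assumption made here for simplicity, and the tickers, prices and function name are hypothetical.

```python
def long_short_open_to_close(s_scores, opens, closes, threshold=2.0):
    """Average open-to-close return of an equally weighted long/short book.

    Securities with S-Score > threshold are bought on the open,
    securities with S-Score < -threshold are sold short on the open,
    and everything else is ignored.
    """
    position_returns = []
    for ticker, score in s_scores.items():
        oc_return = closes[ticker] / opens[ticker] - 1.0
        if score > threshold:
            position_returns.append(oc_return)    # long leg
        elif score < -threshold:
            position_returns.append(-oc_return)   # short leg gains when price falls
    if not position_returns:
        return 0.0  # no security passed the |S-Score| > 2 filter
    return sum(position_returns) / len(position_returns)


# Hypothetical pre-open S-Scores and same-day open/close prices.
s_scores = {"AAA": 2.5, "BBB": -3.0, "CCC": 0.4}
opens = {"AAA": 100.0, "BBB": 50.0, "CCC": 20.0}
closes = {"AAA": 101.0, "BBB": 49.0, "CCC": 21.0}
daily_return = long_short_open_to_close(s_scores, opens, closes)
```

Compounding or summing these daily values over the test period gives a cumulative open-to-close line of the kind charted in this post.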

For comparison purposes, the S&P 500 open-to-close chart for the analyzed period is below.

The chart below shows cumulative O-C performance and illustrates the impact of our ML algorithms. As expected, the lowest performance is the red line representing the dictionary at the start date. The black line represents SMA production data and the green line represents the perfect information case.

Sharpe and Sortino Ratios for our test period are below. S&P 500 return = 3.3%, Sharpe = 0.58 and Sortino = 0.96.
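For readers unfamiliar with the two ratios, here is a minimal Python sketch of one common way to compute them from a series of daily returns. The zero risk-free rate, the zero downside target, and the 252-day annualization are assumptions; the post does not state the exact conventions behind the figures above.

```python
import math


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized mean return divided by sample standard deviation (risk-free rate assumed 0)."""
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


def sortino_ratio(returns, periods_per_year=252):
    """Like Sharpe, but only returns below the 0 target count toward risk."""
    mean = sum(returns) / len(returns)
    downside = math.sqrt(sum(min(r, 0.0) ** 2 for r in returns) / len(returns))
    return mean / downside * math.sqrt(periods_per_year)


# Hypothetical daily open-to-close returns.
daily = [0.01, -0.005, 0.007, -0.002, 0.004]
sharpe = sharpe_ratio(daily)
sortino = sortino_ratio(daily)
```

Because Sortino penalizes only downside moves, a strategy with mostly upside volatility shows a Sortino ratio well above its Sharpe ratio.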

Again, this only looks at the impact of SMA NLP and does not include account filtering. At SMA we believe it’s not just what is being said but who is saying it. We employ a twelve-variable algorithm to score and filter all Twitter accounts Tweeting about companies/securities to identify our approved account universe. As you can see, SMA NLP is a learning system with demonstrable impact. To learn more, please contact us at contactUs@SocialMarketAnalytics.com.

Thanks,

Joe

Social Market Analytics Identifies Most Accurate Twitter Accounts

Social Market Analytics aggregates the intentions of professional investors as expressed on Twitter. We identify these professional investors using our proprietary twelve-factor ranking system. One factor is the forward accuracy of Twitter accounts. If a Twitter account is Tweeting bullishly based on our patented NLP process and the security subsequently moves higher over specified periods, that account is deemed to be accurate over that period. Overall accuracy is aggregated across time for each account. We have been tracking account accuracy out-of-sample for the past seven years, and it is impossible to recreate this data. SMA is the only provider with out-of-sample account accuracy. We found significant variability in account accuracy for supposed professional investors. Social Market Analytics account scoring algorithms are extremely effective in excluding non-professional professionals.

SMA’s Accurate Account algos aggregate expectations from the most accurate Twitter accounts for individual securities over a specified holding period: 1-Day, 2-Day, 1-Week, or 1-Month. We define ‘accurate’ as correctly identifying the directional movement of the security’s price. We do not include the size of the move: an account is accurate if its sentiment was positive and the security moved higher, or negative and the security moved lower.
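The directional definition of accuracy can be sketched as a simple hit rate per account. The account names and records below are hypothetical, and skipping neutral Tweets is an assumption; SMA's production scoring uses twelve factors and is not reproduced here.

```python
from collections import defaultdict


def account_accuracy(calls):
    """Directional hit rate per account.

    `calls` is an iterable of (account, sentiment, forward_return) tuples,
    where sentiment is +1 (bullish), -1 (bearish) or 0 (neutral) and
    forward_return is the security's return over the chosen holding period.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for account, sentiment, forward_return in calls:
        if sentiment == 0:
            continue  # assumption: neutral Tweets are not scored
        totals[account] += 1
        if sentiment * forward_return > 0:  # same sign means the direction was right
            hits[account] += 1
    return {account: hits[account] / totals[account] for account in totals}


# Hypothetical 1-Day holding-period records.
calls = [
    ("@alpha", +1, 0.02),
    ("@alpha", -1, -0.01),
    ("@alpha", +1, -0.03),
    ("@beta", +1, -0.01),
    ("@beta", 0, 0.05),
]
accuracy = account_accuracy(calls)
```

Running the same calculation with forward returns for each holding period would produce a separate accuracy table per period.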

We calculate consensus expectations of these accurate accounts on individual securities.  Accurate account universes differ across holding periods. Some accounts are more accurate in the short-term (Day trades), while others are more accurate for longer holding periods (up to one month).

Securities with significant consensus for both long and short are available through our APIs, widgets and reports. Below is a widget identifying securities with the most positive and negative consensus. In this example, SMA’s accurate account universe is currently 100 bullish on MCO over the next 24 hours. Positive, negative and neutral are identified separately.

accurate accounts

To discuss getting access to these or any other SMA data feed or widget, please contact us at contactus@socialMarketAnalytics.com.

Thanks,

Joe

Social Market Analytics Receives Second Patent

Most of my blogs center around the predictive nature of Social Market Analytics data. This blog is different. At Social Market Analytics we are continuously expanding and improving our technology. These innovations sometimes lead to such unique technology that a patent application is warranted. As many of you know, this is a lengthy and challenging process. We are proud to announce SMA recently received our second patent as an extension of our original patent. SMA now has two granted U.S. Patents.

Patent

Our first Patent was on our three-component system for extraction, evaluation and publication of metrics on social media feeds. The high-level diagram with Twitter as an input is below.

process

Each process above uses SMA-created technology. Extractor allows for the rapid ingestion of data. Evaluator filters noise on both the author and content level, and Calculator creates custom predictive metrics for multiple time frames and purposes. As our processes evolve we apply for patents to protect this unique technology. Our second granted U.S. Patent revolves around publication of metrics and alerting customers to breaking information available through our Twitter metrics and other sources. Although we are excited about our second U.S. Patent, we already have our third patent application in preparation!

Thanks for reading,

Joe

 

Extreme Positive Gold Sentiment on CME Active Trader Website.

Social Market Analytics (SMA) data is live on the CME Active Trader Website. Real-time sentiment and indicative Twitter volume are used by traders to generate new ideas. Sentiment data is predictive across various time frames. High sentiment commodities go on to outperform and negative sentiment commodities underperform. SMA covers 36 commodities on the CME website across Agricultural, Equity Indexes, Energy, Metals, Interest Rates & FX.

On Monday 9/24 Gold Sentiment crossed through extreme positive at 7:30 am central time.    https://activetrader.cmegroup.com/Products/Metals

GoldBlog1

Clicking on the chart expands the time frame for further analysis.

GoldBlog2

To learn more about Social Market Analytics commodity sentiment data or the CME implementation, contact ContactUs@SocialMarketAnalytics.com.

To receive alerts like this in real time follow us on Twitter at @sma_alpha.

UIUC Bitcoin Trading System Practicum Presentation

Every year Social Market Analytics (SMA) is proud to work with the University of Illinois Master of Science in Financial Engineering students on a practicum project. In the past we have explored using sentiment to predict the VIX, enhancements to traditional indexes and smart beta ETFs. This year we decided to tackle the most popular topic of the last year: Bitcoin trading! We worked with RCM Capital’s Strategy Studio Platform for backtesting to develop a Bitcoin trading strategy that combines price momentum with sentiment to keep you in the market when Bitcoin is trading up and minimize drawdowns when Bitcoin retreats, as it did in early 2018.

Social Market Analytics tracks sentiment on the top 275 currencies by market cap. The Bitcoin strategy below performs similarly on other cryptocurrencies.

The students did a wonderful job in strategy construction and explanation. I will undoubtedly leave something important out. Contact ContactUs@SocialMarketAnalytics.com for details.

At its core the strategy buys on a price breakout with a sentiment confirmation and exits when price breaks down and the breakdown is confirmed by sentiment. Buy when the price crosses K standard deviations above a 21-day moving average of price. Variable K ranged from 0.5 to 2. Results shown use a 0.5 standard deviation multiplier. Strategy visualization is below.

BitcoinStrategyVisual

Your first trigger is a breakout K standard deviations above the 21-day moving average.

The confirming signal is based on the Social Market Analytics S-Score value. S-Score is a normalized representation of Bitcoin’s sentiment time series over a look back period and is updated every minute. It measures the tone of the conversation on Twitter relative to the benchmark time period. If Bitcoin is breaking out and the sentiment is 2 standard deviations more positive than normal, you initiate or add to your position by 50%. If the conversation is 1 standard deviation more positive than normal, increase the position by 25%. If the price breakout is not confirmed by sentiment, the position does not change.

There was no short position initiated with futures. Exit criteria are the opposite of the entry criteria: the price breaks K standard deviations below the moving average, with confirmation from S-Score.
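The entry and exit rules can be sketched as a small position-sizing loop in Python. Several details are assumptions rather than the students' implementation: the band uses the prior 21 daily prices, the check is on the price level rather than the exact crossing event, step sizes are fractions of a full allocation capped between 0% and 100%, and exits mirror entries with negative S-Score thresholds.

```python
from statistics import mean, stdev


def target_position(prices, s_scores, k=0.5, window=21):
    """Return the position (0.0 to 1.0 of a full allocation) after each day."""
    position, path = 0.0, []
    for t in range(len(prices)):
        if t >= window:
            history = prices[t - window:t]          # prior 21 prices, no look-ahead
            mid, band = mean(history), k * stdev(history)
            score = s_scores[t]
            # 50% step at |S-Score| >= 2, 25% step at >= 1, otherwise no change.
            step = 0.5 if abs(score) >= 2 else 0.25 if abs(score) >= 1 else 0.0
            if prices[t] > mid + band and score > 0:
                position = min(1.0, position + step)   # breakout confirmed by sentiment
            elif prices[t] < mid - band and score < 0:
                position = max(0.0, position - step)   # breakdown confirmed, never short
        path.append(position)
    return path


# Hypothetical daily prices and S-Scores: flat, then a breakout, then a breakdown.
prices = [100.0] * 21 + [101.0, 102.0, 90.0]
s_scores = [0.0] * 21 + [2.3, 1.2, -2.5]
path = target_position(prices, s_scores)
```

With these toy inputs the position builds on the two confirmed breakout days and is cut on the confirmed breakdown, without ever going short.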

BitcoinResults

Dollar P/L results indicate this portfolio successfully navigates the Bitcoin drawdown of early 2018. 2018 in isolation is below.

Bitcoin-2018

Overall performance with Buy & Hold Bitcoin comparison.

BitcoinStats.png

Sharpe ratio and drawdown improve dramatically with the momentum and sentiment confirmation.

stats2

Again, please contact ContactUs@SocialMarketAnalytics.com for more information on our offerings.

Thanks again to the University of Illinois MSFE students and RCM Capital Markets for contributing to this project.

Regards,

Joe

 

 

 

Introducing the Social Market Analytics (SMA) 50 Long Index

Social Market Analytics has been creating security-level sentiment metrics for six years. As we build an out-of-sample history we are able to build longer holding period indexes. I have blogged about longer-term factors before, but this is the most comprehensive portfolio strategy built using sentiment data. This blog will discuss the application of sentiment to a long-only, 50-stock index that is rebalanced annually.

SMA50 Index is a new, capitalization weighted index comprised of 50 stocks with these features:

  1. The highest average unique message source counts, from SMA’s filtered Twitter data stream, observed over a 50-day look back interval, and
  2. High average daily dollar trading volume (ADV), greater than $20 million, over a 50-day look back interval. We are looking for liquid stocks.

The SMA50 index measures the aggregate performance of stocks with high levels of crowd sourced commentary and high market liquidity.
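The two selection features can be sketched as a filter followed by a ranking. The order of operations (liquidity filter first, then rank by message sources), the field names, and the sample data are assumptions for illustration only.

```python
def select_constituents(stocks, n=50, min_adv=20_000_000):
    """Pick the n liquid stocks with the most unique message sources.

    Each stock is a dict with hypothetical fields:
      ticker           - symbol
      avg_sources_50d  - average unique message source count over 50 days
      adv_50d          - average daily dollar volume over 50 days
    """
    liquid = [s for s in stocks if s["adv_50d"] > min_adv]
    ranked = sorted(liquid, key=lambda s: s["avg_sources_50d"], reverse=True)
    return [s["ticker"] for s in ranked[:n]]


# Hypothetical universe; n=2 keeps the example small.
universe = [
    {"ticker": "AAA", "avg_sources_50d": 900, "adv_50d": 150_000_000},
    {"ticker": "BBB", "avg_sources_50d": 1200, "adv_50d": 5_000_000},   # too illiquid
    {"ticker": "CCC", "avg_sources_50d": 400, "adv_50d": 80_000_000},
    {"ticker": "DDD", "avg_sources_50d": 650, "adv_50d": 30_000_000},
]
members = select_constituents(universe, n=2)
```

The selected names would then be capitalization weighted to form the parent index.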

  1. SMA50 is reconstituted each year on March 15th.  The core constituents are selected once a year.  They are re-weighted monthly based on the below tilt methodologies.
  2. SMA50 is the “Parent Index” for SMA50 Factor Tilt Products

Below is the historical performance of the SMA50 Index.  We will add tilting to the index based on sentiment and momentum.

SMA501

The following factor tilt indexes are derived from the equity universe of the SMA50 parent index.  Factor Tilt Indexes are re-balanced monthly on the first market day of the month.

SMA-MT: Momentum Tilt

– Designed to deliver the performance of an equity momentum strategy by emphasizing stocks with high risk-adjusted price momentum.

  • A momentum value is determined for each stock in the SMA50 parent index Universe by combining the stock’s recent 12-month and 6-month price performance. This is the standard implementation of a price momentum value.
  • This momentum value is then risk-adjusted to determine the stock’s Momentum Score.
  • All securities in the SMA50 Universe are weighted by the product of their Momentum Score and their market cap, as follows:

Momentum Weight for SMA-MT  = Momentum Score * Market Capitalization Weight in the SMA50.  Momentum weights are normalized to sum to 100%.
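The weight formula above is a product followed by a normalization, which can be sketched as follows. The scores and capitalization weights are hypothetical, and the sketch assumes all factor scores are positive so that the resulting index stays long only.

```python
def tilt_weights(scores, cap_weights):
    """Multiply each stock's factor score by its cap weight, then rescale to sum to 1."""
    raw = {ticker: scores[ticker] * cap_weights[ticker] for ticker in cap_weights}
    total = sum(raw.values())
    return {ticker: value / total for ticker, value in raw.items()}


# Hypothetical Momentum Scores and parent-index capitalization weights.
momentum_scores = {"AAA": 1.5, "BBB": 1.0, "CCC": 0.5}
cap_weights = {"AAA": 0.2, "BBB": 0.3, "CCC": 0.5}
weights = tilt_weights(momentum_scores, cap_weights)
```

In this example AAA ends up with the same weight as BBB despite its smaller market cap, because its higher score offsets the difference.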

SMA50_MT

SMA-ST: Sentiment Tilt

– Using SMA’s S-Score and SV-Score as factors, emphasize stocks with positive levels of social media sentiment and intensity, while attenuating stocks with low sentiment levels.

  • A composite factor score is determined for each stock in the SMA50 parent index Universe from the linear combination of the stock’s monthly S-Score and monthly SV-Score.
  • This composite factor score is used to determine the stock’s Sentiment Score.
  • All securities in the SMA50 Universe are weighted by the product of their Sentiment Score and their market cap, as follows:

Sentiment Weight for SMA-ST  =  Sentiment Score * Market Capitalization Weight in the SMA50.  Sentiment weights are normalized to sum to 100%.

SMA50_ST

SMA-SMT: Blended Tilt

– Defines a factor that combines the sentiment and momentum tilts.

  • A combined factor is determined for each stock in the SMA50 parent index Universe from a linear combination of the stock’s Momentum and Sentiment scores.  Initial results for the blended tilt factor used an equal weighting of Momentum and Sentiment scores.
  • This combined factor score is then standardized and used to determine the stock’s Senti-Momentum Score.
  • All securities in the SMA50 Universe are weighted by the product of their Senti-Momentum Score and their market cap, as follows:

Senti-Momentum Weight for SMA-SMT  =  Senti-Momentum Score * Market Capitalization Weight in the SMA50.  Senti-Momentum weights are normalized to sum to 100%.
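The blending and standardization steps can be sketched as an equal-weight linear combination followed by a z-score across the universe. The input scores are hypothetical. How the standardized value is mapped to the final Senti-Momentum Score is not described in this post, so the sketch stops at the standardized factor.

```python
from statistics import mean, pstdev


def blended_standardized(momentum, sentiment, w_momentum=0.5):
    """Linear blend of two scores per stock, standardized across the universe."""
    combined = {
        ticker: w_momentum * momentum[ticker] + (1 - w_momentum) * sentiment[ticker]
        for ticker in momentum
    }
    values = list(combined.values())
    mu, sigma = mean(values), pstdev(values)   # population std dev is an assumption
    return {ticker: (value - mu) / sigma for ticker, value in combined.items()}


# Hypothetical Momentum and Sentiment scores for three stocks.
momentum = {"AAA": 2.0, "BBB": 1.0, "CCC": 0.0}
sentiment = {"AAA": 1.0, "BBB": 1.0, "CCC": 1.0}
z_scores = blended_standardized(momentum, sentiment)
```

Changing `w_momentum` away from 0.5 would tilt the blend toward one factor or the other.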

SMA50_Combined

Comparative performance for all four theoretical portfolios is below.

SMA Relative Performance

Overlaying standard benchmark performance, you can see that the SMA 50 with its various tilt strategies outperforms the benchmarks.

SMA Relative Performance bench

The SMA 50 family of indexes provides a low-turnover way to benefit from exposure to social sentiment. To learn more, please contact us at ContactUs@SocialMarketAnalytics.com.

Social Market Analytics Now Has Six Years of Out-Of-Sample History!

Social Market Analytics, Inc. (SMA) is celebrating six years of out-of-sample data in US Equities.   This data is unique in that it is a true representation of the Twitter conversation at each historical point-in-time.

Since our launch, SMA has become a leader in providing sentiment data feeds to the financial community. Our data has become an integral part of our customers’ investment process. Our customers are Quantitative Trading Firms, Hedge Funds, Sell Side Brokers, Traders and many others. SMA data is suitable for HFT, Quantitative Trading, Risk, Short Lending, Smart Beta, Fama-French Models and VaR, among others. Predictive signals range from a few minutes to quarterly.

SMA’s analytics generate high-signal data streams based on the intentions of market professionals. Our patented machine learning process has produced six years of strongly predictive data as illustrated in the chart below. This chart illustrates the subsequent performance of stocks based on pre-market open (9:10 am Eastern) sentiment scores. Stocks with high sentiment subsequently outperform, as illustrated by the green line. Stocks with strong negative sentiment go on to underperform, as evidenced by the red line. The blue line represents a theoretical equally weighted long/short portfolio. The table below illustrates Sharpe and Sortino ratios.

 

Fullhistory