Historical Accuracy Filter Applied to Sentiment Further Increases Predictive Power

At Social Market Analytics (SMA) we create predictive signals by aggregating the intentions of professional investors as expressed on Twitter. We have accumulated eight years of out-of-sample data illustrating the predictive nature of these signals. We publish sentiment metrics that describe the tone of the current conversation relative to historical conversations. One of our key metrics is S-Score. S-Score is effectively a Z-Score: the number of standard deviations by which the current value deviates from the mean. An |S-Score| > 2 means the current conversation is more than two standard deviations from the mean over the predefined lookback period.
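To make the Z-Score analogy concrete, here is a minimal sketch of a plain z-score over a lookback window. It is not SMA's production calculation, which is proprietary; the function name `s_score_sketch`, the sample readings, and the use of a population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def s_score_sketch(current: float, history: list[float]) -> float:
    """Plain z-score of the current sentiment reading against a lookback window.

    Illustrative only: the real S-Score calculation is not described in this post.
    """
    mu = mean(history)
    sigma = pstdev(history)  # population standard deviation (an assumption)
    if sigma == 0:
        return 0.0  # no variation in the lookback window, so no deviation to measure
    return (current - mu) / sigma


# Made-up sentiment readings for the lookback period
history = [0.10, 0.20, 0.00, -0.10, 0.30]
score = s_score_sketch(0.9, history)

# The post treats |S-Score| > 2 as an unusually positive or negative conversation
is_extreme = abs(score) > 2
```

With these made-up readings the current value sits well beyond two standard deviations from the mean, so `is_extreme` is true.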

In the prior blog post we explored the benefits of SMA's patented machine learning algorithms on return characteristics. In this post we incorporate rolling backtests of the predictive signal to select portfolio securities. SMA data is predictive across sectors and industries but, as with any factor, some securities react more predictably than others.

For the chart below, we use a rolling one-year accuracy metric for predicting the subsequent open-to-close (O-C) return. The faded lines use S-Score values only. The bold lines represent a theoretical portfolio with accuracy filters overlaid on S-Score values: we select only securities with |S-Score| > 2 that have moved in the predicted direction at least 60% of the time over the last year. This is all out-of-sample data.
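The selection rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not SMA's implementation: the function names, the sample data, and the choice to measure accuracy only on days with |S-Score| > 2 are hypothetical.

```python
def hit_rate(s_scores, oc_returns, threshold=2.0):
    """Share of past signal days on which the O-C return had the predicted sign."""
    hits = 0
    signals = 0
    for s, r in zip(s_scores, oc_returns):
        if abs(s) > threshold:
            signals += 1
            # Positive sentiment predicts a positive return, negative predicts negative
            if (s > 0 and r > 0) or (s < 0 and r < 0):
                hits += 1
    return hits / signals if signals else None


def passes_filter(today_s, past_s, past_r, min_accuracy=0.60, threshold=2.0):
    """Select a security only if today's signal is extreme AND its trailing
    hit rate meets the minimum accuracy."""
    if abs(today_s) <= threshold:
        return False
    rate = hit_rate(past_s, past_r, threshold)
    return rate is not None and rate >= min_accuracy


# Made-up trailing history for one security (in practice, one year of days)
past_s = [2.5, -2.2, 0.5, 3.0, -2.6]
past_r = [0.01, -0.02, 0.03, -0.01, -0.004]

rate = hit_rate(past_s, past_r)               # 3 hits out of 4 signal days
selected = passes_filter(2.4, past_s, past_r)
```

Because the history window rolls forward each day, a security can enter or leave the filtered universe as its trailing hit rate changes.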

Returns for S-Score > 2 are very similar with and without the accuracy filter. For S-Score < -2, the accuracy filter had a large impact: securities that have historically reacted negatively to negative Twitter conversation, as measured by S-Score, continued to underperform relative to the sentiment-only selection. This is another example of sentiment combined with other metrics leading to statistically significant predictive signals.

To see how sentiment can be used in your models, contact us at ContactUs@SocialMarketAnalytics.com.

Benefits of SMA Machine Learning Algorithms

Social Market Analytics has extensive intellectual property in three distinct areas: topic model creation, account filtering, and natural language processing (NLP). I have written blog posts about SMA's topic model creation capabilities and the impact of our account filtering algorithms. This post answers the question: “Do your machine learning algorithms really add value to the NLP process?” The answer is yes. The chart below illustrates the statistically significant benefits of Social Market Analytics machine learning algorithms in isolation.

The start date for this analysis is 11/20/2018 and the end date is 4/30/2019. This period was chosen because of the significant market drawdown in December. We use dictionaries with three distinct rule sets: a static dictionary as of the start date, a static dictionary as of the end date, and a point-in-time dictionary (production). We compare the resulting predictive returns across the three. Our patented NLP scores Tweets using the dictionary in effect under each rule set, and S-Scores are calculated from the generated Tweet scores. The point-in-time dictionary reflects word additions, phrases, and grammatical logic as they are made.

We isolate the impact of our NLP process by turning off the account filtering normally applied to the Twitter stream. To ensure we are pulling only Tweets that discuss companies and securities, we use our topic model filtering algorithms. We regularly publish our full return charts to illustrate the impact of our entire process.

Let us start by defining the lines in our chart.

Red Line = Tweets are scored using our dictionary of words and phrases as of 11/20/2018. This illustrates the performance with no machine learning applied on a go-forward basis. This is the base case: the line represents the least amount of learned information.

Black Line = Tweets are scored using words and phrases applied Point-In-Time.  This is the production feed SMA customers receive.  We use Supervised and Unsupervised Machine Learning.  There are impacts from both during this period.

Green Line = Represents the Perfect Information scenario. Take the most up-to-date dictionary of words and phrases (4/30/2019) and apply it backwards. All information learned during the volatile period is included. This represents the values expected to be received on a go-forward basis.
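The difference between the three lines can be illustrated with a toy dictionary whose entries carry the date they were added. Everything in this sketch is hypothetical: the terms, weights, dates of addition, and the bag-of-words scoring are made up, and SMA's actual NLP also handles phrases and grammatical logic.

```python
from datetime import date

# (term, sentiment weight, date the term was added); all values are made up
DICTIONARY = [
    ("beat", 1.0, date(2018, 1, 5)),
    ("downgrade", -1.0, date(2018, 3, 1)),
    ("selloff", -1.5, date(2018, 12, 10)),  # learned during the test period
]


def dictionary_as_of(as_of):
    """Terms known on a given date."""
    return {term: w for term, w, added in DICTIONARY if added <= as_of}


def score_tweet(text, as_of):
    """Sum the weights of known terms; unknown words contribute nothing."""
    terms = dictionary_as_of(as_of)
    return sum(terms.get(tok, 0.0) for tok in text.lower().split())


START, END = date(2018, 11, 20), date(2019, 4, 30)
tweet = "selloff after downgrade"

red = score_tweet(tweet, START)                       # static start-date dictionary
black_early = score_tweet(tweet, date(2018, 12, 3))   # point-in-time, before "selloff" is learned
black_late = score_tweet(tweet, date(2018, 12, 17))   # point-in-time, after "selloff" is learned
green = score_tweet(tweet, END)                       # end-date dictionary applied backwards
```

In this toy case the point-in-time score matches the red line before the new term is learned and matches the green line afterwards, which is why the black line is expected to sit between the other two.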

The charts below represent the cumulative Open to Close return of securities selected based on S-Score 20 minutes prior to market open.  S-Score measures the tone of the current conversation relative to historical benchmarks.  We select securities with an |S-Score| > 2.  Securities with S-Score > 2 are purchased on the open.  Securities with S-Score < -2 are sold short on the open.  SMA Chart lines represent a theoretical long/short portfolio. Isolated long and short sides are available upon request. 
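As a rough illustration of the portfolio construction just described, the sketch below buys S-Score > 2 names and shorts S-Score < -2 names at the open, and closes both at the close. Equal weighting, compounding of daily returns, and the absence of transaction costs are assumptions; the post does not specify them, and the names and prices are made up.

```python
def open_to_close_return(open_price, close_price):
    """Simple return from the open to the close of the same session."""
    return close_price / open_price - 1.0


def daily_long_short_return(signals, threshold=2.0):
    """signals: list of (s_score, open_price, close_price) for one trading day.

    Longs earn the O-C return, shorts earn its negative.
    Positions are equal weighted (an assumption).
    """
    pnl = []
    for s, o, c in signals:
        r = open_to_close_return(o, c)
        if s > threshold:
            pnl.append(r)       # purchased on the open
        elif s < -threshold:
            pnl.append(-r)      # sold short on the open
    return sum(pnl) / len(pnl) if pnl else 0.0


def cumulative_return(daily_returns):
    """Compound a series of daily returns (compounding is an assumption)."""
    total = 1.0
    for r in daily_returns:
        total *= 1.0 + r
    return total - 1.0


# One made-up day: a long, a short, and a name with no signal
day = [(2.5, 100.0, 101.0), (-2.3, 50.0, 49.0), (0.4, 20.0, 21.0)]
day_return = daily_long_short_return(day)
```

Here the long gains 1% and the short gains 2%, so the equal-weighted day return is 1.5%; the third security is ignored because its S-Score is inside the threshold.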

For comparison purposes, the S&P 500 open-to-close chart for the analyzed period is below.

The chart below shows cumulative O-C performance and illustrates the impact of our ML algorithms. As expected, the lowest performance is the red line, representing the dictionary at the start date. The black line represents SMA production data and the green line represents the perfect information case.

Sharpe and Sortino Ratios for our test period are below. For the S&P 500: return = 3.3%, Sharpe = 0.58, and Sortino = 0.96.
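For readers who want to reproduce ratios like these from a daily return series, here is a minimal sketch using the common textbook definitions. The annualization factor of 252 trading days, the zero risk-free and target rates, and the sample returns are assumptions; the post does not state which conventions were used for the figures above.

```python
from math import sqrt
from statistics import mean, stdev


def sharpe_ratio(daily_returns, rf_daily=0.0, periods=252):
    """Annualized Sharpe: mean excess return over its sample standard deviation.

    Needs at least two returns with some variation.
    """
    excess = [r - rf_daily for r in daily_returns]
    return mean(excess) / stdev(excess) * sqrt(periods)


def sortino_ratio(daily_returns, target_daily=0.0, periods=252):
    """Annualized Sortino: like Sharpe, but only returns below the target
    count toward the risk measure (downside deviation).

    Needs at least one return below the target.
    """
    excess = [r - target_daily for r in daily_returns]
    downside = sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    return mean(excess) / downside * sqrt(periods)


# Made-up daily open-to-close returns
returns = [0.01, -0.005, 0.002, 0.004, -0.003]
sharpe = sharpe_ratio(returns)
sortino = sortino_ratio(returns)
```

Because Sortino ignores upside volatility, a strategy whose large moves are mostly gains will show a Sortino ratio noticeably higher than its Sharpe ratio.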

Again, this only looks at the impact of SMA NLP and does not include account filtering. At SMA we believe it’s not just what is being said but who is saying it. We employ a twelve-variable algorithm to score and filter all Twitter accounts Tweeting about companies and securities to identify our approved account universe. As you can see, SMA NLP is a learning system with demonstrable impact. To learn more, please contact us at contactUs@SocialMarketAnalytics.com.

Thanks,

Joe

Performance of SMA Data in a Volatile Market

As we move towards the fourth quarter, people have been asking how SMA data is performing in this volatile market. Those who have followed us over the years have come to expect the Open to Close (O-C) chart to illustrate the performance of our data. We have been publishing this chart since the launch of the company in early 2012. The machine learning applied to our NLP has provided increasing predictive power as our out-of-sample training set continues to grow, even as we have increased our asset class and security coverage. Twitter has continued to be the go-to source for breaking news and conversation. This rich and growing source of communication has allowed us to continue to improve our data.

The charts below illustrate the standard subsequent O-C performance of securities with |S-Score| > 2, measured 20 minutes prior to the open. As you can see, even in a volatile market the O-C portfolio significantly outperforms on both the long and short sides over the last year, and the full history continues to perform well. YTD, the long-short portfolio has a cumulative return of 25.26% with a 5.18 Sharpe Ratio. This is primarily driven by the long side. Securities with an S-Score < -2 returned 1.20%, significantly underperforming the benchmark S&P. This chart is updated through Friday 8/23 to illustrate performance during a large market down day.

YTD-8-26

Full history is below. This chart illustrates the Machine Learning component of our data. As more data is added to the out-of-sample historical set, the training becomes more effective.

Fullhistory-8-26

To learn more about our data, please contact ContactUs@SocialMarketAnalytics.com.

Thanks,

Joe

SMA releases ‘Crypto Fast’ Sentiment Data Feed

Social Market Analytics (SMA), the leader in predictive social media data feeds, has added ‘Crypto Fast’ to its suite of API data feeds. SMA’s S-Factor and Activity Cryptocurrency data feeds have been in production since December 2017. “Clients asked for a bespoke feed for a shorter baseline with 1-Hour price projections. Although clients can create their own baselines and metrics with our Activity Feed, clients wanted SMA to do the development work and produce and support the product, which has been named ‘Crypto Fast’.”
The SMA Crypto Fast Feed provides faster-moving signals than the SMA S-Factor feed. The S-Factor Feed uses a 24H lookback with a 20 baseline with decay, which supports horizons from intraday out to 2-3 trading days. SMA’s Activity Feed reports, in isolation, what happened in each minute; it can be narrowed for HFT use or customized to any period, including weekly, monthly, or quarterly.
Like SMA’s S-Score, ‘Crypto Fast’ is a normalized sentiment score, but it uses a shorter 1H lookback period with a 12H baseline to better account for the high volatility of the cryptocurrency market. For each crypto asset, SMA makes a 1-hour price projection based on its Crypto Fast score and price momentum. SMA provides a projected return, as well as a projected range on the return with a 95% confidence interval. The accuracy field reflects how often the subsequent return has historically fallen within the projected return range.
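The accuracy field described above amounts to an empirical coverage rate. The sketch below shows one way such a rate could be computed from past projections; the function name, the tuple layout, and the sample values are hypothetical and do not reflect the actual feed schema.

```python
def range_accuracy(history):
    """Fraction of past projections whose realized 1-hour return fell inside
    the projected range.

    history: list of (projected_low, projected_high, realized_return).
    Returns None when there is no history to evaluate.
    """
    if not history:
        return None
    inside = sum(1 for low, high, realized in history if low <= realized <= high)
    return inside / len(history)


# Made-up projection history for one crypto asset
history = [
    (-0.010, 0.020, 0.005),   # inside the range
    (-0.020, 0.010, 0.015),   # above the range
    (0.000, 0.030, 0.010),    # inside the range
    (-0.010, 0.010, -0.002),  # inside the range
]
accuracy = range_accuracy(history)
```

If the projected ranges are well calibrated 95% intervals, this rate should settle near 0.95 over a long enough history.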

 

Photo credit: SMA has partnered with TheTie to power sentiment www.thetie.io
SMA APIs went into production in 2011 for U.S. Equities and have grown to include UK Equities, ETFs, FX, Futures, and Cryptocurrencies. SMA produces over 25 distinct APIs across 6 asset classes: www.socialmarketanalytics.com

Social Market Analytics (SMA) Partners with Coin Metrics to provide Real-Time Sentiment Data Feeds

Coin Metrics and Social Market Analytics (SMA) announced today a partnership to incorporate SMA’s Crypto Currency Data Feed into the Coin Metrics Market Data Platform.

Alternative data such as social media platforms and data feeds have become a vital source of information for traders, particularly in the Crypto Currency Markets. The SMA Crypto Currency Sentiment Feed will offer the Crypto Currency community a tool for including social media sentiment data in their trading and portfolio strategies, and will expand Coin Metrics' market-leading Crypto Asset market and network data products.

“As the Crypto Investing market continues to mature, institutional investors are demanding data from trusted partners. These institutions are looking to make data-driven decisions by accessing sources of data that they understand from their legacy investing frameworks. We believe that the power of combining sentiment data with granular network and market data is fundamental to building a deeper understanding of crypto assets. Coin Metrics is excited to partner with SMA, who has a long history of providing sentiment data to traditional capital markets participants and shares Coin Metrics’ principles and values. The ability to provide an all-in-one Crypto Financial Data solution is a huge convenience for institutions,” comments Tim Rice, Co-Founder and CEO of Coin Metrics.

“Artificial intelligence and Natural Language Processing are moving into our everyday lives at light speed, and perhaps into financial markets even faster than that. We feel strongly at SMA that participants in Crypto Currency markets will benefit from our unique process in this emerging field, both in its approach to filtering social media data and in the analytical methodology used to develop our proprietary metrics. We’re excited to partner with the Coin Metrics team to offer this service through a versatile, industry-leading platform,” said Joe Gits, Co-Founder and CEO of SMA.

About Coin Metrics

Coin Metrics was founded in 2017 as an open-source project to provide the public with actionable and transparent network data. Today, Coin Metrics delivers market and network data, analytics and research to its community and wider industry. https://coinmetrics.io/

About Social Market Analytics, Inc.
Social Market Analytics quantifies social media data for traders, portfolio managers, hedge funds and risk managers using patent pending technology to detect abnormally positive or negative changes in investor sentiment. SMA produces a family of quantitative metrics, called S-Factors™, designed to capture the signature of financial market sentiment. SMA applies these metrics to data captured from social media sources to estimate sentiment for indices, sectors, and individual securities. A time series of these measurements is produced daily and on intraday time scales. For more information, including a User Guide to S-Factors™, please visit www.socialmarketanalytics.com

Social Market Analytics Identifies Most Accurate Twitter Accounts

Social Market Analytics aggregates the intentions of professional investors as expressed on Twitter. We identify these professional investors using our proprietary twelve-factor ranking system. One factor is the forward accuracy of Twitter accounts. If a Twitter account is Tweeting bullishly, based on our patented NLP process, and the security subsequently moves higher over a specified period, that account is deemed to be accurate over that period. Overall accuracy is aggregated across time for each account. We have been tracking account accuracy out-of-sample for the past seven years, and it is impossible to recreate this data. SMA is the only provider with out-of-sample account accuracy. We found significant variability in account accuracy for supposed professional investors. Social Market Analytics account scoring algorithms are extremely effective in excluding non-professional professionals.

SMA’s Accurate Account algos aggregate expectations from the most accurate Twitter accounts for individual securities over specified holding periods: 1-Day, 2-Day, 1-Week, and 1-Month. We define ‘accurate’ as correctly identifying the directional movement of the security’s price. We do not consider the size of the move, only whether, for example, the account's sentiment was positive and the security moved higher.

We calculate consensus expectations of these accurate accounts on individual securities.  Accurate account universes differ across holding periods. Some accounts are more accurate in the short-term (Day trades), while others are more accurate for longer holding periods (up to one month).
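The directional definition of accuracy, tracked separately per holding period, can be sketched as follows. This is an illustration only: the account handles, period labels, data layout, and the decisions to skip neutral calls and to count a flat price as a miss are assumptions, not SMA's scoring rules.

```python
from collections import defaultdict


def account_accuracy(calls):
    """Directional hit rate per (account, holding period).

    calls: iterable of (account, holding_period, sentiment, subsequent_return),
    where sentiment is positive for bullish and negative for bearish.
    The size of the move is ignored; only its direction matters.
    """
    tally = defaultdict(lambda: [0, 0])  # (account, period) -> [hits, total]
    for account, period, sentiment, ret in calls:
        if sentiment == 0:
            continue  # neutral calls are skipped (an assumption)
        key = (account, period)
        tally[key][1] += 1
        # A flat price (ret == 0) counts as a miss (an assumption)
        if (sentiment > 0 and ret > 0) or (sentiment < 0 and ret < 0):
            tally[key][0] += 1
    return {key: hits / total for key, (hits, total) in tally.items()}


# Made-up calls: account "@a" does better on 1-day than on 1-month horizons
calls = [
    ("@a", "1D", 1, 0.01),
    ("@a", "1D", -1, -0.02),
    ("@a", "1D", 1, -0.01),
    ("@a", "1M", 1, -0.05),
    ("@b", "1D", -1, 0.01),
]
accuracy = account_accuracy(calls)
```

Keeping the holding period in the key is what allows the accurate account universe to differ between day-trade and one-month horizons.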

Securities with significant consensus, both long and short, are available through our APIs, Widgets, and Reports. Below is a widget identifying securities with the most positive and negative consensus. In this example, SMA’s accurate account universe is currently 100 bullish on MCO over the next 24 hrs. Positive, negative, and neutral are identified separately.

accurate accounts

To discuss getting access to these or any other SMA data feed or widget, please contact us at contactus@socialMarketAnalytics.com.

Thanks,

Joe

IHS Markit Analysis of Social Market Analytics LSE Equity Data

Social Market Analytics partners with IHS Markit to distribute our S-Factor data through the IHS Markit Research Signals Feed. Recently, IHS Markit added our LSE 1000 Equity S-Factor Feed to their data offerings. In conjunction with the launch, IHS Markit authored a research paper exploring the predictive nature of the SMA S-Factor data on LSE equity securities. We are thrilled with the outcome of this independent research showing the predictive nature of Twitter-based factors. Below is a summary of the paper's conclusion section.

Download the paper here:  https://cdn.ihs.com/www/pdf/0219/Social_media_indicators_in_the_UK.pdf

IHS Markit focused primarily on SMA’s S-Score and S-Volume sentiment metrics. S-Score open-to-close return spreads at the +/-2 tail averaged 0.097%, persisting to 10-day (0.177%) and 20-day (0.298%) holding periods. On a cumulative basis, the paper reports a pre-close spread return of 75% for buy rated stocks versus 6% for sell rated stocks and 9% for the market. Results were robust to filters on minimum Tweets and to long-only strategies. Applying the S-Volume > 1 filter, open-to-close spreads for the +/-2 tail strategy averaged 0.83% and exceeded the stand-alone factor results at each of the longer holding periods. For buy portfolios, S-Score (with S-Volume > 1) open-to-close excess returns at the +2 tail averaged 0.044% (0.049%) and in general increased with each incremental extension in holding period, reaching 0.296% (0.389%) at 20 days, confirming the benefits of the signal to long-only portfolio managers.
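A tail spread of the kind summarized above is commonly computed as the average return of the buy tail minus the average return of the sell tail. The sketch below follows that common convention, with an optional S-Volume filter; it is not taken from the IHS Markit paper, and the function name, tuple layout, and sample values are hypothetical.

```python
def tail_spread(observations, threshold=2.0, min_s_volume=None):
    """Average return of the buy tail minus average return of the sell tail.

    observations: list of (s_score, s_volume, open_to_close_return).
    If min_s_volume is given, only observations with S-Volume above it are kept.
    Returns None when either tail is empty.
    """
    kept = [
        (s, r)
        for s, v, r in observations
        if min_s_volume is None or v > min_s_volume
    ]
    buys = [r for s, r in kept if s > threshold]
    sells = [r for s, r in kept if s < -threshold]
    if not buys or not sells:
        return None
    return sum(buys) / len(buys) - sum(sells) / len(sells)


# Made-up observations: (S-Score, S-Volume, open-to-close return)
obs = [
    (2.5, 1.4, 0.004),
    (2.1, 0.6, 0.000),
    (-2.4, 1.2, -0.002),
    (-2.2, 0.8, 0.001),
    (0.3, 2.0, 0.010),   # inside the tails, ignored
]
spread_all = tail_spread(obs)
spread_filtered = tail_spread(obs, min_s_volume=1.0)
```

In this toy data the filter removes the low-volume names from both tails and the spread widens, which mirrors the direction of the reported result but says nothing about its magnitude.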

Lastly, IHS Markit researched one of their proprietary SMA-based metrics, Relative Standard Deviation of Indicative Tweet Volume. They also found strong results: stocks with volatile pre-market Tweet volume tend to outperform open-to-close (spread: 0.224%, excess return: 0.07%), and these stocks also outperform over longer time horizons, reaching a spread of 0.342% out to 20 days (excess return: 0.196%).
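Relative standard deviation is usually defined as the standard deviation divided by the mean (the coefficient of variation). The sketch below applies that standard definition to a series of Tweet counts; IHS Markit's exact construction is not described here, so the sampling interval, the use of a sample standard deviation, and the data are assumptions.

```python
from statistics import mean, stdev


def relative_std_dev(volumes):
    """Standard deviation of the volume series divided by its mean.

    Needs at least two observations. Returns None when the mean is zero.
    """
    mu = mean(volumes)
    if mu == 0:
        return None
    return stdev(volumes) / mu


# Made-up pre-market Tweet counts with the same average volume
steady = [10, 11, 9, 10, 10]
bursty = [2, 30, 1, 15, 2]

rsd_steady = relative_std_dev(steady)
rsd_bursty = relative_std_dev(bursty)
```

Both series average 10 Tweets per interval, but the bursty one has a far higher relative standard deviation, which is the kind of volatile Tweet volume the metric is meant to flag.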

For more information, please contact ContactUs@SocialMarketAnalytics.com.

Thanks,

Joe