At Social Market Analytics (SMA) we create predictive signals by aggregating the intentions of professional investors as expressed on Twitter. We have accumulated eight years of out-of-sample data illustrating the predictive nature of these signals. We publish sentiment metrics to illustrate the tone of the current conversation relative to historical conversations. One of our key metrics is S-Score. S-Score is effectively a Z-Score: a measure of deviation from the mean, expressed in standard deviations. An |S-Score| > 2 means the current conversation is more than two standard deviations from the mean over the predefined lookback period.
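To make the Z-Score idea concrete, here is a minimal sketch of a plain z-score with the |score| > 2 threshold applied. It is not SMA's production S-Score calculation: the function name `z_score`, the sample readings, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def z_score(current: float, history: list[float]) -> float:
    """Return how many standard deviations `current` sits from the mean of `history`."""
    mu = mean(history)
    sigma = pstdev(history)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("history has no variation, so the z-score is undefined")
    return (current - mu) / sigma


# Hypothetical sentiment readings over a lookback window (not SMA data).
lookback = [0.10, 0.12, 0.08, 0.11, 0.09, 0.10, 0.13, 0.07]
score = z_score(0.16, lookback)
is_signal = abs(score) > 2  # the |S-Score| > 2 threshold described above
print(f"z-score = {score:.2f}, signal = {is_signal}")
```

A reading well above the lookback average produces a large positive score, and a reading well below it produces a large negative one.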
In the prior blog we explored the benefits of SMA's patented machine learning algorithms on return characteristics. In this blog we incorporate rolling back-tests on the predictive signal to select portfolio securities. SMA data is predictive across sectors and industries but, as with any factor, some securities react more predictively than others.
For the chart below we use a rolling one-year accuracy metric for predicting the subsequent open-to-close (O-C) return. The faded lines are S-Score values only. The bold lines represent a theoretical portfolio with accuracy filters overlaid on S-Score values. We select only |S-Score| > 2 securities that have moved in the predicted direction at least 60% of the time over
the last year (bold lines). This is all
For S-Score > 2, return values are very similar with the accuracy filter and with S-Score only. For S-Score < -2, the accuracy filter had a large impact: securities that react negatively to negative Twitter conversation, as measured by S-Score, continued to underperform relative to the sentiment-only selection.
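The accuracy filter described above can be sketched roughly as follows. This is an illustration under assumptions, not SMA's implementation: the names `hit_rate` and `passes_filter` are hypothetical, signals are encoded as +1 (long) and -1 (short), and the caller is assumed to pass only the trailing one-year history for a security.

```python
def hit_rate(signals: list[int], returns: list[float]) -> float:
    """Share of past signals whose open-to-close return had the predicted sign."""
    pairs = [(s, r) for s, r in zip(signals, returns) if s != 0]
    if not pairs:
        return 0.0
    hits = sum(1 for s, r in pairs if s * r > 0)
    return hits / len(pairs)


def passes_filter(
    s_score: float,
    past_signals: list[int],
    past_returns: list[float],
    min_accuracy: float = 0.60,  # the 60% threshold mentioned in the text
) -> bool:
    """Keep a security only if |S-Score| > 2 and its trailing accuracy is high enough."""
    return abs(s_score) > 2 and hit_rate(past_signals, past_returns) >= min_accuracy


# Hypothetical trailing history for one security: +1 = long signal, -1 = short signal.
signals = [1, -1, 1, -1, 1]
oc_returns = [0.01, -0.02, -0.01, -0.03, 0.02]
print(hit_rate(signals, oc_returns))             # 4 of 5 signals were right
print(passes_filter(-2.5, signals, oc_returns))  # strong negative score, accurate history
```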
This is another example of sentiment combined with other metrics leading to statistically significant predictive signals. To see how sentiment can be used in your models, contact us at ContactUs@SocialMarketAnalytics.com.
Social Market Analytics has extensive Intellectual Property in three distinct areas: Topic model creation, account filtering and natural language processing (NLP). I have written blog posts about SMA's topic model creation capabilities and the impact of our account filtering algorithms. This blog answers the question: “Do your machine learning algorithms really add value to the NLP process?” The answer is yes. The chart below illustrates the statistically significant benefits of Social Market Analytics Machine Learning Algorithms in isolation.
The start date for this analysis is 11/20/2018 and the end date is 4/30/2019. This period was chosen because of the significant market drawdown in December. We use dictionaries with three distinct rule sets: a static dictionary as of the start date, a static dictionary as of the end date, and a point-in-time dictionary (production). We compare the resulting predictive returns across the three. Our patented NLP scores Tweets using the dictionary that applies under each rule set, and S-Scores are calculated from the generated Tweet scores. The point-in-time dictionary reflects word additions, phrases, and grammatical logic as they are made.
We isolate the impact of our NLP process by turning off account filtering applied to the Twitter stream. To ensure we are pulling Tweets only discussing companies and securities, we are using our topic model filtering algorithms. We regularly publish our full return charts to illustrate the impact of our entire process.
Let us start by defining the lines in our chart.
Red Line = Tweets are scored using our dictionary of words and phrases as of 11/20/2018. This illustrates the performance with no machine learning applied on a go forward basis. This is the base case. This line represents the least amount of learned information.
Black Line = Tweets are scored using words and phrases applied Point-In-Time. This is the production feed SMA customers receive. We use Supervised and Unsupervised Machine Learning. There are impacts from both during this period.
Green Line = Represents the Perfect Information scenario. Take the most up-to-date dictionary of words and phrases (4/30/2019) and apply it backwards. All information learned during the volatile period is included. This represents the values expected to be received on a go-forward basis.
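The difference between the three lines comes down to which dictionary entries are visible when a Tweet is scored. The sketch below illustrates that idea only. The dictionary contents, weights, `score_tweet` function, and simple word-sum scoring are all hypothetical and far simpler than our patented NLP.

```python
from datetime import date

# Hypothetical dictionary: term -> (date the term was added, sentiment weight).
DICTIONARY = {
    "beat": (date(2018, 1, 5), 1.0),
    "miss": (date(2018, 1, 5), -1.0),
    "capitulation": (date(2018, 12, 20), -1.5),  # learned during the drawdown
}
START, END = date(2018, 11, 20), date(2019, 4, 30)


def score_tweet(text: str, tweet_date: date, mode: str) -> float:
    """Score a Tweet using only the terms visible under the chosen rule set.

    mode: "start" (red line), "point_in_time" (black line), or "end" (green line).
    """
    cutoff = {"start": START, "point_in_time": tweet_date, "end": END}[mode]
    words = text.lower().split()
    return sum(
        weight
        for term, (added, weight) in DICTIONARY.items()
        if added <= cutoff and term in words
    )


tweet = "Capitulation after earnings miss"
for mode in ("start", "point_in_time", "end"):
    print(mode, score_tweet(tweet, date(2018, 12, 10), mode))
```

In this toy example a Tweet dated before the new term was added scores the same under the start-date and point-in-time rule sets, while the end-date dictionary applies the later knowledge backwards.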
The charts below represent the cumulative Open to Close return of securities selected based on S-Score 20 minutes prior to market open. S-Score measures the tone of the current conversation relative to historical benchmarks. We select securities with an |S-Score| > 2. Securities with S-Score > 2 are purchased on the open. Securities with S-Score < -2 are sold short on the open. SMA Chart lines represent a theoretical long/short portfolio. Isolated long and short sides are available upon request.
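As a rough illustration of how such a theoretical long/short line can be computed, here is a minimal sketch. Equal weighting of positions and compounding of daily returns are assumptions, and the tickers, prices, and function names are hypothetical.

```python
def daily_long_short_return(
    s_scores: dict[str, float],
    opens: dict[str, float],
    closes: dict[str, float],
) -> float:
    """Equal-weighted open-to-close return: long S-Score > 2, short S-Score < -2."""
    legs = []
    for ticker, s in s_scores.items():
        oc = closes[ticker] / opens[ticker] - 1.0
        if s > 2:
            legs.append(oc)    # bought on the open, sold on the close
        elif s < -2:
            legs.append(-oc)   # sold short on the open, covered on the close
    return sum(legs) / len(legs) if legs else 0.0


def cumulative(daily_returns: list[float]) -> list[float]:
    """Compound a series of daily returns into a cumulative return path."""
    total, path = 1.0, []
    for r in daily_returns:
        total *= 1.0 + r
        path.append(total - 1.0)
    return path


# Hypothetical pre-open S-Scores and prices for one day.
scores = {"AAA": 2.5, "BBB": -3.0, "CCC": 0.4}
opens = {"AAA": 100.0, "BBB": 50.0, "CCC": 20.0}
closes = {"AAA": 102.0, "BBB": 49.0, "CCC": 25.0}
print(daily_long_short_return(scores, opens, closes))
```

Note that CCC is ignored even though it moved the most, because its S-Score does not cross either threshold.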
For comparison purposes, the S&P 500 open-to-close chart for the analyzed period is below.
The chart below shows cumulative O-C performance and illustrates the impact of our ML algorithms. As expected, the lowest performance is the red line, representing the dictionary at the start date. The black line represents SMA production data and the green line represents the perfect information case.
Sharpe and Sortino Ratios for our test period are below. SP Return = 3.3%, Sharpe = 0.58 and Sortino = 0.96.
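For readers less familiar with these two ratios, here is a minimal sketch of the textbook definitions applied to a series of daily returns. The zero risk-free rate, zero downside target, and 252-day annualization are assumptions, and this post does not state which conventions were used for the figures above. The sample returns are made up.

```python
from math import sqrt
from statistics import mean, stdev


def sharpe_ratio(returns: list[float], risk_free: float = 0.0, periods_per_year: int = 252) -> float:
    """Mean excess return divided by its standard deviation, annualized."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess) * sqrt(periods_per_year)


def sortino_ratio(returns: list[float], target: float = 0.0, periods_per_year: int = 252) -> float:
    """Mean excess return divided by downside deviation, annualized."""
    excess = [r - target for r in returns]
    # Only returns below the target contribute to downside deviation.
    downside = sqrt(mean([min(0.0, e) ** 2 for e in excess]))
    if downside == 0:
        raise ValueError("no returns below target, so the Sortino ratio is undefined")
    return mean(excess) / downside * sqrt(periods_per_year)


# Hypothetical daily open-to-close returns (not SMA results).
daily = [0.01, -0.01, 0.02, -0.02, 0.03]
print(round(sharpe_ratio(daily), 2), round(sortino_ratio(daily), 2))
```

Sortino penalizes only downside moves, which is why it comes out higher than Sharpe for a return stream whose volatility is skewed to the upside.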
Again, this only looks at the impact of SMA NLP and does not include account filtering. At SMA we believe it’s not just what is being said but who is saying it. We employ a twelve-variable algorithm to score and filter all Twitter accounts Tweeting about companies/securities to identify our approved account universe. As you can see, SMA NLP is a learning system with demonstrable impact. To learn more, please contact us at contactUs@SocialMarketAnalytics.com.
Social Market Analytics aggregates the intentions of professional investors as expressed on Twitter. We identify these professional investors using our proprietary twelve-factor ranking system. One factor is the forward accuracy of Twitter accounts. If a Twitter account is Tweeting bullishly, based on our patented NLP process, and the security subsequently moves higher over specified periods, that account is deemed to be accurate over that period. Overall accuracy is aggregated across time for each account. We have been tracking account accuracy out-of-sample for the past seven years, and it is impossible to recreate this data. SMA is the only provider with out-of-sample account accuracy. We found significant variability in account accuracy for supposed professional investors. Social Market Analytics account scoring algorithms are extremely effective in excluding non-professional professionals.
SMA’s Accurate Account algos aggregate expectations from the most accurate Twitter accounts for individual securities over a specified holding period: 1-Day, 2-Day, 1-Week, or 1-Month. Our definition of ‘Accurate’ is correctly identifying the directional movement of the security’s price. We do not consider the size of the move, only its direction: for example, their sentiment is positive and the security moved higher.
We calculate consensus expectations of these accurate accounts on individual securities. Accurate account universes differ across holding periods. Some accounts are more accurate in the short-term (Day trades), while others are more accurate for longer holding periods (up to one month).
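The directional definition of accuracy and the consensus step can be sketched as below. This is a simplified illustration, not our account scoring algorithm: the function names, the tuple layout, and the idea of expressing consensus as a bullish share are assumptions, and the sample calls are made up.

```python
def account_accuracy(calls: list[tuple[float, float, float]]) -> float:
    """Directional accuracy for one account over one holding period.

    calls: (sentiment, price at the Tweet, price at the end of the holding period).
    Only direction counts; the size of the move is ignored.
    """
    directional = [(s, p1 - p0) for s, p0, p1 in calls if s != 0 and p1 != p0]
    if not directional:
        return 0.0
    hits = sum(1 for s, move in directional if s * move > 0)
    return hits / len(directional)


def bullish_share(latest_sentiment: dict[str, float], accurate_accounts: set[str]) -> float:
    """Share of accurate accounts whose latest view on a security is positive."""
    views = [v for account, v in latest_sentiment.items() if account in accurate_accounts]
    if not views:
        return 0.0
    return sum(1 for v in views if v > 0) / len(views)


# Hypothetical 1-Day calls for a single account.
calls = [(0.8, 100.0, 103.0), (-0.5, 50.0, 48.0), (0.3, 20.0, 19.0)]
print(account_accuracy(calls))  # two of three calls had the right direction

# Hypothetical latest views on one security; only "a" and "b" are in the accurate universe.
print(bullish_share({"a": 0.6, "b": 0.2, "c": -0.9}, {"a", "b"}))
```

Because accuracy is measured per holding period, the same account can sit inside the accurate universe for 1-Day calls and outside it for 1-Month calls.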
Securities with significant consensus for both long and short are available through our APIs, Widgets and Reports. Below is a widget identifying securities with the most positive and negative consensus. In this example, SMA’s accurate account universe is currently 100 bullish on MCO over the next 24 hrs. Positive, negative and neutral are identified separately.
To discuss getting access to these or any other SMA data feeds or widgets, please contact us at contactus@socialMarketAnalytics.com.
Social Market Analytics aggregates the intentions of professional investors as expressed on Twitter. We apply our patented filtering and natural language processing (NLP) to Tweets to proactively select Twitter accounts to use in our predictive metrics. We track several metrics to gauge the predictive nature of our dataset. In this blog I am going to illustrate one of these metrics.
2018 was a rough year for the SP500: it lost about 9% (rolling one year). Given the market loss and the high volatility, we thought it would be an ideal dataset over which to run an experiment. Two questions we get regularly are: How would your data perform in a bear market? And what is the benefit of your NLP and account ratings systems? This blog will answer both questions from the perspective of 2018 market performance.
The table below illustrates the performance of six theoretical portfolios. These portfolios represent stocks with Social Market Analytics S-Scores of 2 or higher (Long signal) or Social Market Analytics S-Scores of -2 or lower (Short signal). S-Score compares the tone of current Twitter conversations with the average tone of Twitter conversations over the last twenty days. Social Market Analytics has multiple baselines for multiple prediction periods.
Each security in our universe is represented by a proprietary Topic Model. Each Topic is a collection of rules used to include or exclude specific Tweets from security buckets. For example, if you are looking for Tweets about Ethan Allen furniture (ETH), you do not want to include conversations about the Ethereum cryptocurrency (also symbol ETH).
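A toy version of include/exclude rules for the ETH example might look like the sketch below. Our Topic Models are proprietary and considerably richer than keyword sets; the `TopicModel` class, its fields, and the specific keywords here are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TopicModel:
    """Hypothetical rule set: a Tweet needs an include term and no exclude terms."""
    symbol: str
    include: set[str]
    exclude: set[str] = field(default_factory=set)

    def matches(self, tweet: str) -> bool:
        # Treat cashtags like "$ETH" as the bare word "eth".
        words = set(tweet.lower().replace("$", " ").split())
        return bool(words & self.include) and not (words & self.exclude)


ethan_allen = TopicModel(
    symbol="ETH",
    include={"eth", "ethan"},
    exclude={"ethereum", "crypto", "btc"},
)

print(ethan_allen.matches("$ETH Ethan Allen furniture sales up"))  # kept
print(ethan_allen.matches("$ETH ethereum breaking out vs BTC"))    # excluded
```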
We created portfolios with our account filtering algorithms and compared them with portfolios of all Twitter accounts discussing our Equity Topic Models. The purpose of the run was to quantify the ability of our patented account filtering algorithms to identify professional, and hence more accurate, investors. Spoiler alert: Our account filtering improved the long/short return by about 50% (18.73 for 2018 versus 12.53 NLP only).
NLP applied only:
The NLP only portfolios illustrate the power of our NLP process to accurately identify and fine-grain score Tweets discussing securities and companies. Our patented process reads each Tweet multiple times to identify if and how strongly someone is voicing a view of expected future performance. The NLP only portfolios illustrate the predictive power of our NLP in isolation. When you apply the account filtering you get a predictive boost.
Account Filtered + NLP applied:
Account Filtered plus NLP portfolios illustrate the benefit of applying our account filtering metrics. Early in the life of Social Market Analytics we learned it’s not just what is being said on Twitter but who is saying it. We developed proprietary metrics to identify investors more likely to be correct about the future direction of a security. When the conversation of these professional investors is significantly more positive than the average conversation over the last 20 days, those securities significantly outperform. When the conversation of these professional investors is significantly more negative than the average conversation over the last 20 days, those securities significantly underperform.
Portfolios are constructed of securities with an S-Score of 2 or higher (long) or -2 or lower (short). All portfolios are equally weighted. A negative value for a short portfolio denotes a positive return to that portfolio, since short portfolios are supposed to move lower. All securities are entered on the Open based on 9:10 am Eastern time S-Scores and exited on the Close. There is no overnight exposure.
We use the SP500 as our performance benchmark. The SP return is calculated from open to close in the same manner as the selected securities. Using open to close performance, the SP500 returned -16.89% for comparison. As you can see from the table, the S-Score > 2 securities outperformed the market and the negative S-Score securities significantly underperformed the market (generating positive alpha). The L/S portfolio with NLP only returned +12.54%; NLP plus account filtering improved that performance by about 50% to +18.73%. We do not present this as a single-factor model, but even after removing 10% a year for slippage and commissions the portfolio still significantly outperforms.
Please contact us with any questions or to see how SMA’s NLP and filtering capabilities can be used in your investment process. ContactUs@SocialMarketAnalytics.com
This year has been tough for most investment strategies. Firms using traditional sources of data are generating the same underwhelming returns. Two years ago, Social Market Analytics, Inc. (SMA) launched the SMLCW index in partnership with the CBOE. This index is re-balanced weekly and comprises the twenty-five securities from the CBOE large cap universe with the highest average S-Score over the prior week. It is a long-only index of super-cap stocks with unusually positive Twitter conversations.
SMA publishes a family of metrics providing a full representation of the Twitter conversation across equities (US and LSE), commodities, currencies, ETFs and Cryptos.
S-Score is a normalized representation of the current Twitter conversation of professional investors as identified by Social Market Analytics patented algorithms. SMA has access to the full Twitter feed through our licensed partnership with Twitter and listens in real-time for any mention of topics and securities of interest. These Tweets are scanned in real-time for sentiment and influence of the poster and compared to prior conversations over the look back period. Securities with higher S-Scores subsequently outperform and securities with negative S-Scores under-perform.
SMA S-Scores are predictive over multiple prediction periods. With seven years of out-of-sample data we can extend our comparison baselines and predict over longer periods.
Year-To-Date the SMLCW index is up over 7.5% while the SP500 is flat. Subtract a couple percent for commissions/slippage and the index is still significantly positive. This is not a back-test; this index has been live and on your quote screens for nearly two years. The YTD actual performance chart from the CBOE site is below.
As mentioned, this is a long only index. During the recent market drawdown this long index has been performing. SMA negative S-Score stocks have been moving lower at a significant rate, generating positive alpha. Below is a chart of the SMLCW index compared to the SP500. For any questions or to learn more, please contact us at: ContactUs@SocialMarketAnalytics.com.
Social Market Analytics, Inc. (SMA) partnered with the Cboe in January 2017 to release the SMLCW Index, the ‘Cboe – SMA Large Cap Weekly Index’. The SMLCW Index is a Long Only Index that has outperformed since it was released and has continued to outperform in the recent market volatility and sell-off. In the chart below the S&P500 is flat for the year and SMLCW is up nearly 5% YTD.
SMA has two U.S. Patents around its machine learning and NLP processes that produce predictive analytics at the security level across U.S. and UK stocks, ETFs, FX, Futures, and Crypto Currencies.
The SMLCW portfolio is an equally-weighted Long Only portfolio of 25 stocks drawn from the CBOE Large-Cap Universe with the highest average 5-period S-Scores. Stocks in this universe (a) are in the top 15% capitalization tranche of stocks that are the underlying for options listed on the CBOE (approximately 3000 stocks) and (b) have a market capitalization greater than or equal to $10 billion.
The CBOE Large-Cap Universe is reconstituted quarterly on the third Friday of the month. The SMLCW portfolio is reconstituted every Friday at 8:30 am CT, based on average 5-period SMA S-Scores at 8:10 am CT. A period is a date on which there is sufficient social media data to derive SMA S-Scores. Stocks are deemed sold and purchased at market-on-open prices. The portfolio is held until 8:30 am CT on the next Friday. If Friday is a business holiday, the portfolio is rebalanced on the preceding Thursday.
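The selection and rebalance rules above can be sketched as follows. This is an illustrative reading of the published rules, not the index calculation itself: the function names and data layout are hypothetical, the input is assumed to be already restricted to the optionable top-capitalization tranche, and the sample tickers and scores are made up.

```python
from datetime import date, timedelta
from statistics import mean


def select_portfolio(universe: dict[str, tuple[float, list[float]]], n: int = 25) -> list[str]:
    """Pick the n stocks with the highest average 5-period S-Score.

    universe: ticker -> (market cap in USD, S-Scores by period, oldest first).
    """
    eligible = {
        ticker: mean(scores[-5:])
        for ticker, (cap, scores) in universe.items()
        if cap >= 10e9 and len(scores) >= 5  # $10 billion market cap floor
    }
    ranked = sorted(eligible, key=eligible.get, reverse=True)
    return ranked[:n]  # held with equal weights until the next rebalance


def rebalance_date(friday: date, holidays: set[date]) -> date:
    """Rebalance on Friday, or on the preceding Thursday if Friday is a holiday."""
    if friday.weekday() != 4:
        raise ValueError("expected a Friday")
    return friday - timedelta(days=1) if friday in holidays else friday


sample = {
    "AAA": (50e9, [1.0, 2.0, 3.0, 2.0, 2.0]),
    "BBB": (5e9, [3.0, 3.0, 3.0, 3.0, 3.0]),   # fails the market cap floor
    "CCC": (20e9, [0.5, 0.5, 1.0, 1.0, 1.0]),
    "DDD": (15e9, [2.5, 2.5, 2.5, 2.5, 2.5]),
}
print(select_portfolio(sample, n=2))
print(rebalance_date(date(2018, 3, 30), {date(2018, 3, 30)}))
```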
Most of my blogs center around the predictive nature of Social Market Analytics data. This blog is different. At Social Market Analytics we are continuously expanding and improving our technology. These innovations sometimes lead to such unique technology that a patent application is warranted. As many of you know, this is a lengthy and challenging process. We are proud to announce SMA recently received our second patent as an extension of our original patent. SMA now has two granted U.S. Patents.
Our first Patent was on our three-component system for extraction, evaluation and publication of metrics on social media feeds. The high-level diagram with Twitter as an input is below.
Each process above uses SMA-created technology. Extractor allows for the rapid ingestion of data. Evaluator filters noise at both the author and content level, and Calculator creates custom predictive metrics for multiple time frames and purposes. As our processes evolve we apply for patents to protect this unique technology. Our second granted U.S. Patent revolves around publication of metrics and alerting customers to breaking information available through our Twitter metrics and other sources. Although we are excited about our 2nd U.S. Patent, we already have our 3rd patent application in preparation!