Since joining RavenPack in 2008, Peter Hafez, Chief Data Scientist, has been a pioneer in the field of applied news analytics, bringing alternative data insights to the world’s top banks and hedge funds. Here he answers our questions on the prospects for the alt data market and RavenPack’s latest innovations.
What do you think are the biggest challenges facing data scientists/AI experts/quantitative investors in 2018/2019?
One of the biggest challenges facing Data Scientists and AI experts over the coming years is whether or not they can live up to the massive hype in the marketplace and deliver on the promise.
Over the last few years, big data and artificial intelligence have attracted a lot of attention from the finance industry. Investors, who are expecting nothing less than a miracle, are driving large investments into new technologies.
Fundamental firms are hiring teams of Data Scientists, who are brought in to revolutionize entire organizations at record speed. However, rather than providing immediate business value, the newly hired Data Scientists often end up spending most of their time building out infrastructure in preparation for the future. This will result in disillusionment for some firms, which will find it challenging to deliver on the promised value. This is partly due to unrealistic expectations, and partly due to resistance to change within the organization itself. On top of that, many of the Data Scientists come from industries outside of finance, which also introduces a cultural challenge.
While it is important that companies stick to this path of investment in order to survive in the future, they have to think about how best to allocate their resources. Close collaboration with data and technology vendors should be an important part of the internal data science strategy, as it will free up resources to think more about how to integrate the new and valuable insights within the organization, rather than focusing entirely on engineering work.
Looking ahead a year from now, how do you see the structure of your market changing?
Looking a few years ahead, I believe we will see further consolidation of the alternative data market. Today, more traditional data sources are delivered primarily by a few large companies such as Bloomberg, Thomson Reuters, IHS Markit, FactSet, and Standard & Poor’s. While they have all been a little slow in getting into the alternative data space (with some exceptions), we can expect that as datasets showcase stronger value, the “usual suspects” will go on a shopping spree, acquiring the more prominent alternative data vendors.
We should also expect to see some of the newer marketplaces starting to catch on, helping to distribute content from smaller vendors. The consolidation will be driven by customers’ desire to make the data science process easier through pre-structured or normalized data across sources, but also by their desire to deal with fewer vendors. While 2017/18 were the years of the alternative data explosion, we’ll soon have to get used to seeing more and more smaller vendors struggling to find a foothold in the marketplace, among the 1000+ competitors out there, and eventually disappearing.
As the alternative data space further matures, we will see more Fundamental investors entering the marketplace. They will have different requirements when it comes to data delivery. While Quants or Data Scientists are happy to consume data via API, Fundamental investors are more likely to consume data via dashboards or other user interfaces. Today, we are already seeing some of the larger banks building their own internal platforms for bank-wide roll-outs. However, we have yet to see similar platforms become commercially available to the broader market. Kensho, which was recently acquired by Standard & Poor’s, is a step in this direction. Others will soon follow.
What is going to be the biggest area of investment for RavenPack over the next 12 months?
As a Big Data Analytics provider, investing in new technology, data and artificial intelligence is core to our success in the marketplace. As an example, over the last few years we have transitioned our business into relying more on cloud computing, which has allowed us to onboard and process larger amounts of unstructured data than ever before, taking advantage of the scalability offered by the cloud.
To stay on the cutting edge, we continuously invest in our core NLP engine to develop new technology that can help us extract additional insights from a wide range of textual content. We’re also investing in enterprise search centered around RavenPack’s entity and event tagging to return search results that are more relevant to finance for the end user. These investments will help both Quants and Fundamental analysts identify the information they need, making decision-making easier.
Can you share an example of how your system has been used by a new customer?
Over the years, a significant number of research papers describing various use cases for our data have been written by academics, sell-side analysts and the RavenPack Data Science team. As an example, Deutsche Bank looked at sentiment and media buzz to filter short-term reversal strategies, arguing that a divergence in stock price doesn’t always lead to convergence when one of the companies in the stock pair experiences negative sentiment combined with abnormal media attention. J.P. Morgan has looked at enhancing value strategies with RavenPack sentiment data, showing how cheap companies with negative news shouldn’t be considered buying opportunities, as the negative news might be a good indication of a potential “value trap”.
Using RavenPack data, CitiBank’s research team looked at strategies around CAPEX announcements vs. reported CAPEX and found that the former resulted in positive share price performance with a long drift of around three months, which stands in contrast to the price reversal typically observed following reported CAPEX. Empirical Research found that RavenPack sentiment can be particularly useful in Failure Modeling, with Failure candidates associated with negative media sentiment proving particularly toxic. Similarly, media darlings usually would not be considered good shorting candidates.
Most recently, J.P. Morgan showed how RavenPack’s macro sentiment data added value in a cross-asset class style timing model by providing a timelier (daily) forward-looking view on economic outlook and expectations.
There has been a lot of talk about NLP and how advances in technique can open up access to vast information resources. What are the most exciting and innovative ways that RavenPack is incorporating these strategies into its offering?
Finance professionals are exposed to a flood of information from both external and internal sources that adversely affects productivity in the investment process. RavenPack provides a systematic solution for analyzing all of your unstructured content to help generate alpha, reduce risk and increase efficiency in trade surveillance and compliance. We combine proprietary Artificial Intelligence (AI) tools including Natural Language Processing (NLP) with Big Data technologies, enabling the discovery of insights that turn your textual data into a strategic asset.
RavenPack Text Analytics consolidates the unstructured data sources you use every day (emails, instant messages, documents, contact databases, and more) into a single, enriched format tailored specifically for financial applications. Key enabling functionalities are: Search, Alerts and Data visualization.
What is your biggest professional achievement to date?
As the Chief Data Scientist, I would consider my contribution to the success of RavenPack as my biggest professional achievement. It’s been very exciting to have observed first-hand how the industry has evolved over the years to become much more aware of the value that Big Data can bring to the investment process. I’d like to think that we have been a contributing part of that.
In the early days, only the most sophisticated quantitative hedge funds knew what to do with the data. However, with improvements in technology and more use-cases becoming readily available, the Big Data landscape has opened up for many more investors.
Being one of the pioneers of the AI and Big Data revolution, RavenPack has had a significant impact on the industry over the last 15 years. In particular, we have made strong contributions to defining and creating the standard for News Analytics in finance. These days, when it comes to Natural Language Processing (NLP) and Big Data Analytics for Finance, RavenPack is seen as one of the market leaders, not only because we have been around for a long time with proven products and research, but also because we have shown that we are capable of delivering continuous innovation and value to our clients.
What are you most looking forward to at the AI & Data Science in Trading conference?
I am looking forward to hearing about the latest innovations and applications of AI and data science in finance. In particular, I’m excited to learn from fundamental firms that have been through a successful transition to becoming more data-driven as an organization.
The conference organizers have done an amazing job putting together an exciting agenda and bringing together thought leaders in the industry. This promises to be one of the “must-attend” conferences of 2019.
View an extract of this session held at the Generation AI: The New Data-Driven Investor event in September 2018.