Behind the Hype: Investing in AI

by | Mar 25, 2021 | Investing, Technology

How do venture capitalists identify the entrepreneurs behind the next billion-dollar companies? Kunal Mehta, a VC himself, spent two years looking for an answer by interviewing dozens of the venture capitalists who backed some of the decade’s top technology companies: Airbnb, Uber, Twitter, Facebook, SpaceX, Tesla, Pinterest, Snapchat, and Instagram. The result, Mehta’s book, Finding Genius: Venture Capitalists and the Future They Are Betting On, includes the VCs’ perspectives on new technologies such as artificial intelligence and blockchain, as well as the industry-specific investment frameworks they use to make decisions.

Mehta uses anecdotes from successful venture capitalists to identify useful trends for investors and entrepreneurs interested in backing and building technology companies. In the following excerpt, Rayfe Gaspar-Asaoka, a principal at the investment firm Canaan Partners, shares his insights for evaluating artificial intelligence companies.

Finding Genius: Venture Capitalists and the Future They Are Betting On

I had the pleasure of meeting Rayfe Gaspar-Asaoka, an investor with Canaan Partners, through our shared interests in the future of mobility and industrial automation. At the time we met, Rayfe had just closed an investment in Apex.AI, a company working with automotive developers to implement complex artificial intelligence (AI) software into vehicles to support autonomous driving. Bold initiatives such as these are often established by teams of Ph.D.s, scientists, and subject matter experts, and the founders of these sophisticated technology companies search for investors who can match them at an intellectual level. As entrepreneurs in this industry will attest, Rayfe is that type of investor. In collaborating with Rayfe, I have been consistently inspired by his ability to break down complex topics and technologies and reveal, to those without technical experience, the true value they provide. Case in point: AI.

The study of AI has a rich history that dates to well before it became a topic of conversation in popular media or culture. This field has attracted top researchers, scientists, entrepreneurs, academics, and programmers from all over the world. The widespread potential applicability of AI across all industries and its impact on the global economy are not overstated. When I approached Rayfe about contributing a chapter to Finding Genius, he immediately saw the value in sharing the framework he uses to help ask the right questions to determine if an AI company is built for long-term success. As you will learn through this chapter (a primer on AI, as well as a forward-looking perspective on industrial automation), Rayfe is a systems thinker and a technologist who also has an ability to tell a story. That’s a rare and valuable combination.

In earlier chapters of this book, I have discussed the importance of developing an “information edge.” Rayfe is able to set himself apart from other investors who cannot evaluate a technical product or application. As Rayfe reveals in this chapter, venture capitalists are “often investing in AI companies before any commercial maturity. This means that understanding the AI technology at a fundamental level is critical to the investment decision, especially given all of the hype and promise around AI.” This specific market segment requires deep technical know-how if investors want to succeed by identifying winners early on. Rayfe shares a framework for investing that cuts through the noise to determine what is truly an AI-first company with the potential to create long-term value. These frameworks and questions are not only relevant to AI companies but also provide a foundation from which entrepreneurs and venture capitalists can think about other industries or nascent technologies.

In 2017, Andrew Ng, a Stanford professor and one of today’s giants in the field of artificial intelligence, famously said, “Artificial Intelligence is the new electricity. Just as electricity transformed almost everything 100 years ago, today I actually have a hard time thinking of an industry that I don’t think AI will transform in the next several years.” On the surface, that seems like a very bold claim. But taking a step back: What is artificial intelligence and why does it have the potential to enable change in every facet of the way we live, work, and interact with the world?

AI is a technology that allows machines to perform tasks at a level comparable to, and in some cases superior to, that of a human. While AI has been dramatized over the decades through futuristic science fiction stories, AI is already here, and in fact powers a lot of today’s world without us even realizing it. Every time we open up our Netflix app, there is an AI algorithm running in the background that personalizes our recommendations based on our past history and preferences. Or whenever we give a command to Siri, Alexa, or Google Assistant, there is an AI algorithm that processes our voice into a machine-readable command and action. And just like a human, the AI technology powering these actions is constantly ingesting data (more shows watched, more commands heard) and continuously improving.

I’m an early-stage investor in startups, and I think it is important to understand not only the differences between the various types of AI but how to sift through the current AI hype. The challenge is finding the companies that leverage AI as an essential pillar of their long-term success versus those that are using AI as a marketing buzzword. I’d like to share a couple of the frameworks that I use to help me understand a company’s AI technology (their AI toolkit), its long-term potential for success, and a particular application of AI that I am excited about.


What’s in your AI Toolkit?

Given the near-infinite combinatorics of tasks humans can perform, the field of study of AI is a broad one that can be broken down into various subspecialties, each with its own set of algorithms and models that are optimized for a particular task. The major fields of AI today are machine learning, deep learning, and reinforcement learning, although up-and-coming areas such as transfer learning and Generative Adversarial Networks (GANs) are quickly becoming table stakes in today’s AI applications. I think of each of the different subfields and algorithms as tools for engineers to use as part of their AI toolkits. Each tool in the AI toolkit has its own strengths and weaknesses based on the setup of the problem and the data given. Below is a short definition of each, with a few examples to help you better understand the use cases of each.

Machine learning

Machine learning is one of the foundational domains within AI. Many of the building blocks that were developed in the field of machine learning have served as a framework for other domains as well. In our toolkit analogy, machine learning is the hammer: a versatile tool that everyone must have.

There are many different types of machine learning models, but they all share the same workflow: 1) ingest data (known as “training data”), 2) make predictions based on that data, and 3) optimize the predictions over time. The recommendation algorithm for Netflix is a real-world example of a machine learning algorithm. Every time you open the Netflix app, it ingests data (your watch history), makes predictions on that data (shows relevant to your preferences), and then optimizes that over time (based on how closely your selections match the recommendations).
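The three-step workflow above can be sketched in a few lines of Python. This is a minimal illustration, not how any production recommender works: the data points, the straight-line model, and the names `predict` and `train` are all assumptions made up for the example.

```python
# Step 1: ingest training data.
# Hypothetical pairs of (hours watched in a genre, rating the viewer gave).
training_data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

def predict(weight, bias, x):
    """Step 2: make a prediction from the current model parameters."""
    return weight * x + bias

def mean_squared_error(weight, bias, data):
    """Average squared gap between predictions and the observed values."""
    return sum((predict(weight, bias, x) - y) ** 2 for x, y in data) / len(data)

def train(data, learning_rate=0.01, steps=2000):
    """Step 3: repeatedly nudge the parameters to shrink the prediction error."""
    weight, bias = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to each parameter.
        grad_w = sum(2 * (predict(weight, bias, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(weight, bias, x) - y) for x, y in data) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias

error_before = mean_squared_error(0.0, 0.0, training_data)
weight, bias = train(training_data)
error_after = mean_squared_error(weight, bias, training_data)
print(f"error before training: {error_before:.2f}, after: {error_after:.2f}")
```

The error shrinks as the loop runs, which is the "optimize over time" step in miniature.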

While there are many machine learning models (and more coming out of research every day), each problem can be broken down along two axes that help to structure the type of algorithm to use. On the first axis is classification versus regression; this is simply defining whether the answer to the problem should be discrete (classification) or continuous (regression). For instance, if you are trying to build a model to predict the price of Bitcoin in 10 years, the answer is a continuous one, with it ranging from $0 to priceless, and including every number in between. This would fall into the camp of regression machine learning algorithms. On the other hand, a model that predicts whether or not an email you receive should be marked as spam or pass through into your inbox is discrete (spam or inbox), and best defined as a classification problem. Machine learning algorithms can be built to work with both linear data (linear regression) and non-linear data (logistic regression, neural networks, etc.).
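To make the first axis concrete, here is a small sketch that contrasts the two kinds of output. The price series, the word counts, and the function names are hypothetical, and both models are deliberately simple: a least-squares line for regression and a single cutoff for classification.

```python
def fit_line(points):
    """Regression: least-squares line through (x, y) points; the output is continuous."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_spam_threshold(examples):
    """Classification: learn a cutoff on one feature; the output is a discrete label."""
    spam = [count for count, label in examples if label == "spam"]
    inbox = [count for count, label in examples if label == "inbox"]
    # Place the cutoff halfway between the average of each class.
    cutoff = (sum(spam) / len(spam) + sum(inbox) / len(inbox)) / 2
    return lambda count: "spam" if count > cutoff else "inbox"

# Hypothetical data: (year index, price) and (suspicious-word count, label).
price_model = fit_line([(0, 100.0), (1, 150.0), (2, 200.0)])
spam_model = fit_spam_threshold([(0, "inbox"), (1, "inbox"), (6, "spam"), (9, "spam")])

predicted_price = price_model(3)   # a number anywhere on a continuous scale
predicted_label = spam_model(7)    # exactly one of two discrete labels
print(predicted_price, predicted_label)
```

The regression model can return any number, while the classifier can only ever answer "spam" or "inbox".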

The second axis is supervised learning versus unsupervised learning. “Supervised” means that you have historical, accurate, labeled data, known as “training data,” which you can feed to the machine learning model in order to produce an accurate initial prediction. “Unsupervised” simply means that you do not have that labeled, accurate data set to begin with, so the machine learning algorithm puts a stronger emphasis on the observations of the initial outputs in order to quickly optimize. For example, in the spam filter example above, if the input data used to train the machine learning model is a set of emails and a corresponding classification of spam or not spam for each, then this would be a supervised learning problem. However, if you only had an initial data set of emails and did not know whether each one should be labeled as spam, this would be an unsupervised learning problem, as you are relying on the feedback from the machine learning algorithm to identify the differences between a regular email and a spam email without explicitly labeling one or the other. The benefit of an unsupervised learning algorithm over a supervised one is that you do not need to individually label the training data set. But the tradeoff is that you often need more data and feedback loops in order to produce a strong prediction from your machine learning model.
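The difference between the two can be sketched with the same numbers, once with labels and once without. The data is hypothetical, and the unsupervised half uses clustering (a one-dimensional k-means with two groups), which is one common unsupervised technique chosen here for illustration rather than something the chapter prescribes.

```python
def labeled_centers(examples):
    """Supervised: the labels tell us which group each value belongs to."""
    centers = {}
    for label in {label for _, label in examples}:
        values = [value for value, lab in examples if lab == label]
        centers[label] = sum(values) / len(values)
    return centers

def two_means(values, rounds=10):
    """Unsupervised: split unlabeled values into two groups (1-D k-means, k=2)."""
    low, high = min(values), max(values)  # deterministic starting centers
    for _ in range(rounds):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return low, high

# Hypothetical feature: number of suspicious words in each email.
labeled = [(0, "inbox"), (1, "inbox"), (2, "inbox"), (7, "spam"), (8, "spam"), (9, "spam")]
unlabeled = [value for value, _ in labeled]

supervised = labeled_centers(labeled)   # group centers, each with a name attached
unsupervised = two_means(unlabeled)     # two group centers, but no names
print(supervised, unsupervised)
```

Both approaches find the same two groups here, but only the supervised version knows which group is "spam"; the unsupervised version discovers that the emails separate into two clusters and leaves the naming to someone else.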

Deep learning

If machine learning can be thought of as the hammer in your AI toolkit, deep learning is the set of screwdrivers. It can be incredibly useful in certain situations, but there is a bit more complexity (head type, size of screw) involved in order to use it properly. The recent rise in the use and effectiveness of deep learning algorithms is one of the biggest drivers of today’s excitement around AI.

Most of today’s deep learning algorithms are based on a neural network, which is a type of non-linear machine learning algorithm. A neural network algorithm is built to mimic the structure of the neurons in our brain. Each node (neuron) is connected to other nodes, and those connections are weighted based on the type of data being processed. In the same way the neurons in our brain associate information, a deep learning neural network is built to do the same. Each node of the neural network captures certain features of a dataset in order to make a prediction on new, incoming data. Relating this back to the way it works in the human brain: When we meet someone new, that experience is broken up into certain features that we unconsciously store, such as the shape of the person’s face, where and when you met them, and the sound and spelling of their name. And with each new meeting, we continue to capture those features in neurons and form connections between people and places based on how related these experiences are. The more neurons (or nodes) available, the more features from that information we can capture, resulting in more accurate predictions.
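A minimal sketch of the weighted connections described above, assuming a tiny network with three inputs, two hidden nodes, and one output. The weights are hand-picked for illustration; in a real network they would be learned from training data.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1); this is the non-linear step."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """Each node sums its weighted incoming connections, then applies the non-linearity."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]

# Hypothetical weights: one row per node, one weight per incoming connection.
hidden_weights = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]  # 2 hidden nodes, 3 inputs each
hidden_biases = [0.0, 0.1]
output_weights = [[1.2, -0.7]]                          # 1 output node, 2 inputs
output_biases = [0.05]

features = [1.0, 0.5, -1.0]                             # hypothetical input features
hidden = layer(features, hidden_weights, hidden_biases)
prediction = layer(hidden, output_weights, output_biases)[0]
print(f"hidden activations: {hidden}, prediction: {prediction:.3f}")
```

Adding more nodes or more layers means adding more rows of weights, which is why larger networks can capture more features but also need more data and compute to train.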

There are various flavors of neural networks, such as convolutional nets (CNN) and recurrent nets (RNN), each with its own specific set of use cases. CNNs are often used to make image predictions; a real-world example of a CNN is the FaceID algorithm on the iPhone. FaceID uses an initial scan of a user’s face to build a CNN with each node capturing features that are used to uniquely identify a person. Each time going forward, when that user places their face in front of the phone, the algorithm will make a prediction of whether or not that face matches the features stored in the CNN. If there is a strong match, the phone unlocks. In contrast, RNNs are better suited for time-series data. An example of an RNN application is the algorithm powering your Alexa or Google Home speaker; as you speak, it captures the speech data over the course of that sentence, and once you are finished speaking, it then processes that data.

What has made deep learning neural networks so powerful recently is the number of nodes and layers of nodes that can now be computed. However, deep learning algorithms are not only very compute- and data-intensive, but the complexity of the interactions between nodes leads to what’s commonly referred to as a “black-box” solution. Just as you unconsciously store features from an experience into neurons, it is not easy to determine what features a neural net extracts from a dataset. You may remember someone because of their name, while I may remember that person because of the features of their face. A deep learning algorithm is the same; the complexity of the neural net makes it difficult to understand why the model outputs something, despite it often being highly accurate.

Despite all of that, the prediction power of deep learning algorithms compared to traditional machine learning often outweighs the “black-box” cost. And with the exponential increase in compute capabilities thanks to processors like GPUs and cloud computing, as well as the massive amount of available data today (80-90% of today’s data has been created in the last two years), deep learning algorithms are now one of an engineer’s preferred tools from the AI toolkit.


Reinforcement learning

The third major field of AI that I often see companies use is reinforcement learning. In your AI toolkit, reinforcement learning is the set of wrenches. Just like deep learning, it is a more specialized tool that requires upfront preparation to use effectively. Reinforcement learning algorithms are often used when an objective can be reached through rewards in a cause-and-effect manner. We see this type of behavior in the real world every day. For example, when dog owners are teaching dogs to sit, they will often do this by giving the dog a treat (a reward) when the dog performs the behavior and no treat (a punishment) when the dog does not. Over time, the dog will learn to associate sitting on command with a treat.

Reinforcement learning algorithms are similar. Unlike deep learning, which requires an extensive set of upfront training data, reinforcement learning models are best used when there is a goal (e.g., teach the dog to sit), along with the ability to train the model by providing cues along the way (e.g., treats when the dog sits, no treats for when the dog does not). Today, reinforcement learning is widely used, from the control algorithms that optimize the path of a robot within a warehouse to use in the most complex video games and puzzles.
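The dog-training example can be sketched as a very small reward-driven learner. This is a one-step, bandit-style value update rather than a full reinforcement learning system, and the action names, reward rule, and fixed exploration schedule are all assumptions chosen to keep the run reproducible.

```python
ACTIONS = ["bark", "roll", "sit"]

def reward_for(action):
    """The trainer's cue: a treat (1.0) for sitting, nothing (0.0) otherwise."""
    return 1.0 if action == "sit" else 0.0

def train_dog(episodes=30, learning_rate=0.5):
    """Learn an estimated value for each action from rewards alone."""
    values = {action: 0.0 for action in ACTIONS}
    for episode in range(episodes):
        if episode % 4 == 0:
            # Explore: every fourth episode, try the next action in turn.
            action = ACTIONS[(episode // 4) % len(ACTIONS)]
        else:
            # Exploit: pick the action currently believed to be best.
            action = max(ACTIONS, key=values.get)
        reward = reward_for(action)
        # Move the estimate a step toward the reward just observed.
        values[action] += learning_rate * (reward - values[action])
    return values

values = train_dog()
best_action = max(ACTIONS, key=values.get)
print(values, best_action)
```

No labeled training set is supplied up front; the learner starts with no preference, stumbles onto the rewarded behavior while exploring, and then repeats it because its estimated value keeps rising.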

This is just a short introduction that only scratches the surface of the complexity of the subdomains within AI. But while billions of dollars and decades of effort are being spent pushing the limits of one particular specialty within AI in research, it is the ensemble models that combine multiple domains of AI expertise and research together into a single solution that have produced some of the best-performing AI today. Just as you would use more than just a hammer to build a house, the best solutions are often constructed by combining multiple tools from an engineer’s AI toolkit. For example, AlphaGo, the Google DeepMind program that famously beat Lee Sedol 4-1 at the board game Go in 2016, was built on a combination of deep learning methodologies coupled with reinforcement learning.

As an early-stage investor in AI startups, I am often investing in companies before any commercial maturity. This means that understanding the AI technology and differentiation at a fundamental level is critical to the investment decision, especially given all of the hype and promise around AI. This leads to the next framework I want to share, which is around investing through the hype.

Investing through the AI hype

There is no shortage of companies using some sort of AI to build a new product or service, ranging from the latest consumer app to the next enterprise software that promises to reinvent the way an enterprise works. The power of machine learning, deep learning, and reinforcement learning is real. The challenge today is determining what is truly an AI-first company creating long-term value versus one masquerading as an AI company in order to take advantage of the current market hype.

One way that I look at the potential of an AI startup is by using a two-by-two framework that evaluates the company along its technology and business model innovation. On one axis, I look for companies that have differentiated data sets or algorithms. Differentiated datasets can be both proprietary datasets and unique access to scalable, labeled data. Examples of this are Netflix’s dataset of user preferences based on users’ watch history or Facebook’s photo tagging feature, which allows a large number of labeled photos to be amassed, done almost entirely by leveraging the user base.

On the algorithm side, this often takes the form of a new mathematical model out of academia or a unique combination of existing AI models that has been optimized to solve a particular problem. Access to this proprietary dataset and/or algorithm allows the company to build a long-term competitive moat around their technology. As with the AI toolkit description above, there is a strong feedback loop between AI algorithms and data. The better the data, the better the algorithm will perform at future predictions. And the better those predictions, the better the output data, which is then fed back into the algorithm. What this means is that a company with even the slightest head start with a better proprietary data set or algorithm will have an ever-increasing advantage over their competitors. This winner-take-all characteristic of AI is one of the things that makes these companies so powerful.

The second way to evaluate an AI company is through the innovation of their business model. Companies that can build a business model that leverages their differentiated AI in a way that is fundamentally disruptive to the traditional economics of their competitors will build long-term value that cuts through the noise within a sector. For example, Amazon’s Kiva robots are used to bring products from one end to the other of their one-million-square-foot fulfillment center. This drastically reduces the number of humans needed to do retrieval, instead allowing them to focus on tasks that require more cognitive load, such as picking and packing items into a customer’s box. The use of these AI algorithms that enable their robots to autonomously navigate the warehouse disrupts the traditional unit economics of the business. Amazon not only has AI powering their back-end logistics, but like Netflix, they have built a recommendation engine on the front end that personalizes the site for each individual user. The use of AI throughout the business is one of the reasons Amazon has built one of the largest e-commerce websites in the world and can offer a superior customer experience with a disruptive model of two-day, one-day, and even one-hour shipping.

Companies that excel on both axes will not only have a differentiated business model but will enjoy the data set/algorithm defensibility in a space where competitors struggle to survive the new world order. As an investor, I use this framework as a starting place to help me ask the right questions to determine if an AI company is built for long-term success. It is often the case that companies excel along one axis but not the other. This results in short-term success, but competitors that come along with better access to unique data sets/algorithms or innovative business models will ultimately win out. The companies that will succeed in this next wave of AI will need to excel along both axes. Not only will these companies change the way an industry views their business, but by the time the competition figures it out and tries to challenge them, it will be too late to break the AI company’s defensive moat of better data and algorithms.

Building on the framework for investing in AI startups, one area that I am particularly excited about is at the intersection of AI and the physical world, aka intelligent robots. Today’s world is still largely manual and human labor intensive. Take the largest industries in the world today that leverage physical labor; from construction to manufacturing to agriculture, 80% of the tasks are still done by humans, with relatively simple machines to aid in very specific pieces of the remaining 20% of work. Today’s robot is often designated to do repetitive tasks in very constrained environments. But as companies continue to use their AI toolkit to enhance robots to deal with more complex scenarios, I predict that the 80/20 split of human/machine will not only flip but intelligent robots will unlock new business models. Humans no longer will be limited to simplifying the manufacturing line based on the low-level capabilities of robots but will be free to set up complex environments that are better optimized for rapid production of improved service. Early-stage investors are often searching for the next big platform shifts in technology and industry, and I believe this has all of the makings of a big one.

One of the biggest barriers to intelligent robotics penetrating industries such as agriculture or retail has been the high capex with unproven ROI. However, there has been a commoditization of sensor hardware over the last decade, largely driven by the rapid innovation cycles in consumer smartphones and personal electronics. HD cameras, flash memory, and compute processors cost pennies compared to what they used to cost. This not only has greatly lowered the barrier for startups to take on the capex required to build robots, but it has enabled new business models such as RaaS (robots-as-a-service) that allow once-skeptical industry incumbents to now consider intelligent robots as a viable solution to augment human labor. In addition, this has exponentially increased the amount of sensor training data that a young startup can capture and process for its AI algorithm, which rapidly levels the playing field against the incumbents. Today, startups like Blue River Technology in agriculture and Bossa Nova in retail are leading the charge, but this is just the beginning.


AI is at the heart of these robots’ ability to make decisions and take actions in massively unstructured environments. It cannot be overstated how different intelligent robots are from the machines we think of today. Human perception is a highly complex process dependent on our past, current, and future predictions of the world. The physical world is incredibly unstructured. The analogy of a nicely organized Excel table doesn’t exist in real life. While humans are innately skilled at perceiving and making decisions with imperfect information, machines are historically not. The reason robots were used to automate only 20% of the physical world was that the environment needed to be structured enough for a robot to make sense of it. Take the industrial robot arm from ABB or Kuka that is used to build an automobile; it takes months to program that robot to do a single task along the manufacturing line. Because of that, a company needs to produce thousands, even millions of a single line in order to be profitable. But as these robots improve their ability to rapidly adapt to learn and execute new tasks in a complex, unstructured environment, it will open up new ways to build a business with entirely new economics. We have already seen this happen with Amazon’s acquisition of Kiva changing the economics of logistics, and this is continuing with companies like Zume in the food space and Google’s Waymo and GM’s Cruise in transportation.

When I meet with startups building intelligent robots, I go back to first understanding where the company falls along the two frameworks I shared above. First: What is in their AI toolkit? What combination of machine learning, deep learning, and reinforcement learning are they using? And second: Do they have access to proprietary data/algorithms coupled with a disruptive business model? There are a number of startups that are building intelligent robots applied to traditionally labor-intensive industries that excel along both of these frameworks. From my point of view, intelligent robots are the “how” that goes with Andrew Ng’s statement about AI transforming industries. And while we are in the early innings of it all, I predict that today’s startups that are leveraging AI to build intelligent robots will be tomorrow’s giants.
