Given the scarcity of data science and engineering talent, how will AI ever reach its full potential at scale?
Last month, Google introduced Pathways, a “next-generation AI architecture” that promises to help solve the problem of today’s models being overly specialized for a single task. At this point, Pathways is an architectural concept at Google Research and, presumably, some time off from practical application. That said, the concept is certainly intriguing to me, most of all because, in my humble opinion, the amount of scarce resources needed to apply AI to a single specific task is the single biggest inhibitor of AI becoming more ubiquitous.
Why artificial intelligence and machine learning aren’t scalable today
Before we get into discussing pathways and their lofty goals, let’s break down why AI and ML aren’t scalable today. With any new AI/ML use case, a significant amount of work and talent is needed simply to build a working proof-of-concept, much less build a scalable, production-ready implementation.
Let’s say you work for a large, institutional buyer of residential real estate in the burgeoning single-family home rental market. You have great models that take into account the area of the country, location, and size of the property in pricing monthly rent. That said, you know from experience that the quality of finishings in areas such as the kitchen and master bathroom also has a significant influence on potential rent. So how do you go about incorporating this information into how you price your rental inventory?
In residential real estate, you likely have a significant amount of information at your disposal. You’ll certainly have pictures of the kitchen and master bathroom, a text description from a real estate agent or appraiser, and even quantified data such as square footage. So how do you take all of this rich data and make educated inferences?
To start, you’ll need a data engineer who can centralize all of this data from multiple source systems and handle the data wrangling, which might include reconciling square footage that arrives in different formats depending on the source system. Then you need a data scientist to extract features from the text, photos, and structured data and apply multiple machine learning models to estimate the impact on potential rent. Then you need a software engineer or machine learning engineer to deploy the model into production, including any upstream data ingestion and downstream delivery of the model outputs. This process of data collection, data munging, model development, testing, and production deployment often takes many months, if not more than a year.
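As a rough sketch, the pipeline above, from wrangling inconsistent units to scoring a listing, might look something like this. Every function name, keyword list, and coefficient here is hypothetical, and the scoring rule is a toy stand-in for a trained model:

```python
def normalize_sqft(value, unit):
    """Data wrangling: reconcile square footage arriving in different units."""
    return value * 10.7639 if unit == "sqm" else float(value)

def extract_text_features(description):
    """Feature extraction: flag premium keywords in the listing text."""
    premium_terms = {"granite", "marble", "stainless", "custom"}
    words = set(description.lower().split())
    return {"premium_term_count": len(words & premium_terms)}

def score_rent_lift(features):
    """Stand-in for the trained model: a toy linear scoring rule."""
    return 50.0 * features["premium_term_count"] + 0.1 * features["sqft"]

# Wrangle, extract, and score one listing.
features = {"sqft": normalize_sqft(120, "sqm")}
features.update(extract_text_features("Granite counters and stainless appliances"))
estimated_lift = score_rent_lift(features)
```

Even in this trivialized form, each stage maps to a different specialist: the data engineer owns the wrangling, the data scientist owns the features and the model, and the engineers own getting it all into production.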
All of this assumes your initial hypothesis is correct: that kitchen and master bath finishings actually impact rental value. For each successful model that makes it to production, you’ve gone through the weeks- or months-long process of developing proofs of concept for five other models that didn’t yield any commercial value. Solving this specific task of valuing the potential lift in rent from higher-quality kitchen and bath finishings has become a multi-million-dollar R&D and software development investment. If the task involved has eight-figure return potential, this investment might make sense. But how many business problems will go ignored and unoptimized simply for lack of large-scale impact?
What’s the cumulative impact of ignoring the applicability of AI to such problems?
After all, for a large enterprise, the cumulative impact of several AI projects delivering six and seven-figure returns adds up. For small and medium enterprises, any single project of that size is transformational. So how do we get to the point where we can tackle these types of projects at scale? The answer is both simple and difficult: we have to make AI more ubiquitous by making it more broadly applicable and, at the same time, more efficiently developed and implemented.
The pathways to scaling AI application
Pun intended. But back to Pathways…
In its announcement, Google Research introduces some concepts I believe are critical to address in order to see exponential growth in AI adoption.
First, today’s ML models are trained to complete a single task. Take our residential real estate example from earlier. Let’s suppose we’ve trained a model to recognize high-end finishings in a kitchen. That’s a single task. Doing the same for high-end finishings in a bathroom is a completely distinct task, even though there’s most certainly some overlap in the features that distinguish high-end finishings (premium tile, premium cabinets, faucet hardware, etc.). Because the tasks are distinct, the new bathroom model is trained completely from scratch, with nearly the same development time as the original, simply because the model isn’t generalizable across tasks.
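To make the contrast concrete, here’s a toy sketch of the alternative the announcement points toward: a shared feature extractor reused across tasks, with only a small task-specific piece trained per new task. Every class, tag, and weight below is invented for illustration; this is not Pathways itself:

```python
class Backbone:
    """Pretend feature extractor, imagined as already trained on kitchen photos."""
    def __init__(self):
        # Weights "learned" once; in reality this would be a deep network.
        self.weights = {"tile": 0.8, "cabinet": 0.6, "faucet": 0.7}

    def features(self, tags):
        return sum(self.weights.get(t, 0.0) for t in tags)

class TaskHead:
    """Small task-specific layer: only this part is trained per new task."""
    def __init__(self, scale):
        self.scale = scale

    def predict(self, backbone, tags):
        return self.scale * backbone.features(tags)

shared = Backbone()                  # trained once, reused across tasks
kitchen_head = TaskHead(scale=1.0)   # original kitchen task
bathroom_head = TaskHead(scale=0.9)  # new bathroom task: train only the head
kitchen_score = kitchen_head.predict(shared, ["tile", "faucet"])
bathroom_score = bathroom_head.predict(shared, ["tile"])
```

The point of the sketch: the expensive part (the backbone) is built once, and adding the bathroom task means training only a small head rather than repeating the whole effort.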
Second, models today leverage only a single modality of input data, be it structured data, audio, text, images, or video. How much more effective would our model to recognize premium finishings and their potential impact on rents be if it could leverage photos, text descriptions, and structured data such as square footage all in the same model? Using more modalities of information would not only make our models more accurate but also help them become more generalizable.
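One simple, admittedly simplistic, way to picture multi-modal input is early fusion: concatenating features from each modality into one input vector before modeling. The function and numbers below are purely illustrative; real multi-modal architectures learn joint representations rather than just concatenating:

```python
def fuse(image_feats, text_feats, structured_feats):
    """Early fusion: concatenate per-modality feature lists into one vector."""
    return image_feats + text_feats + structured_feats

# Hypothetical features: two from photos, one from the text description,
# and one structured value (square footage).
vector = fuse([0.9, 0.2], [1.0], [1291.7])
```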
Lastly, Google’s Pathways architecture seeks to address the density and inefficiency of today’s neural networks, which activate the entire network to complete a task regardless of how complex the task is. As models become generalizable to multiple tasks, it may not be necessary to use the entire model for simpler tasks; a sparsely activated model can learn a wider variety of tasks while completing each one more efficiently.
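Sparse activation is often realized with mixture-of-experts-style routing, where a learned router sends each input through only the relevant sub-networks. Here’s a deliberately trivial sketch of the idea; the experts and the routing rule are made up for illustration:

```python
def expert_kitchen(x):
    # Sub-network specialized for kitchen imagery (toy stand-in).
    return x * 2

def expert_bathroom(x):
    # Sub-network specialized for bathroom imagery (toy stand-in).
    return x * 3

def expert_text(x):
    # Sub-network specialized for text descriptions (toy stand-in).
    return x + 1

EXPERTS = {"kitchen": expert_kitchen, "bathroom": expert_bathroom, "text": expert_text}

def route(task, x):
    """Trivial router: activate only the expert relevant to the task,
    leaving the rest of the 'network' inactive."""
    return EXPERTS[task](x)
```

In a real sparse model the router is itself learned and may pick several experts per input, but the efficiency argument is the same: most of the network stays idle on any given task.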
Scaling AI and ML isn’t a new challenge
Attempting to scale AI and ML beyond the limits of scarce, highly specialized talent is nothing new; the industry has recognized these constraints for years.
When I started the Customer Analytics function at UPS back in 2014, I started getting approached by vendors promising that their software could help turn business users into “citizen data scientists.” The promise was to use automated machine learning pipelines to simplify model building to the point that any business user (more realistically, one with decent data skills) could produce a model comparable to what a data scientist could build.
At the very least, ML automation tools could make a decent data scientist more efficient by running multiple models in parallel to find the best-performing one, reducing the overall time it took to build and deploy a well-performing model. Companies such as DataRobot have been very successful with this value proposition and will be key to helping scale the application of AI for years to come.
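At its core, that value proposition is a model-selection loop: fit several candidate models, score each on a validation metric, keep the best. A minimal sketch, with made-up candidate models and toy data (real AutoML systems also tune hyperparameters, cross-validate, and engineer features):

```python
# Each candidate is a "fit" function that returns a fitted predictor.
candidates = {
    "mean_baseline": lambda xs, ys: (lambda x: sum(ys) / len(ys)),
    "linear_1x": lambda xs, ys: (lambda x: x),
    "linear_2x": lambda xs, ys: (lambda x: 2 * x),
}

def mse(model, xs, ys):
    """Validation metric: mean squared error."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs, ys = [1, 2, 3], [2, 4, 6]  # toy data where the truth is y = 2x

# Fit every candidate in parallel (conceptually), then keep the best scorer.
fitted = {name: fit(xs, ys) for name, fit in candidates.items()}
best = min(fitted, key=lambda name: mse(fitted[name], xs, ys))
```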
Similarly, the major IaaS players, Amazon, Microsoft, and Google, have all released some form or another of AutoML tooling, helping developers and other technically capable individuals without much machine learning experience build and deploy models.
Even Business Intelligence tools such as Tableau and Microsoft Power BI have incorporated automated machine learning into their visualization products, giving users point-and-click tooling to apply machine learning to tasks such as explaining an uptick in a trend.
But at the end of the day, all of these tools, while valuable contributions, still operate within a core constraint that each machine learning model exists to accomplish one, and only one, task.
Changing the game
If future advances in AI and Machine Learning architecture such as Google’s Pathways truly break through this one-model-to-one-task constraint, the implications are simply game-changing.
The countless hours and dollars invested in R&D to yield a single model will no longer be justified by a single ROI case but potentially by several. Each project can now support multiple use cases and be backed by multiple business cases. Projects that were marginal in terms of investment worthiness could prove foundational in support of others.
The scarce Data Engineering, Data Science, and Machine Learning Engineering resources within your organization will have their impact magnified as they’ll be empowered with the tools to tackle problems and models with multiple applications, immediately boosting their impact on the business. All of the massive investment in data lakes, data warehouses, data integration, ETL/ELT, machine learning platforms, supporting MLOps, etc. will have its return magnified.
Looking ahead to the next constraint
As we look to take these new models supporting multiple tasks into production, companies will quickly run into another core constraint to scaling the value achieved from AI and ML — scaling production implementation.
As scarce as Data Engineering, Data Science, and Machine Learning Engineering resources are, talented Software Engineering resources are increasingly hard to find as well. Even if you could scale the applicability of your AI and ML models 10x using new architectures such as Pathways, how can you realize the value without developers to implement those models within the day-to-day workflows of your business users?
In my next post, we’ll focus on the second fundamental constraint on accelerating AI adoption, system integration and workflow automation, and the solution we already have in front of us: low-code/no-code tooling.