A new way to optimize and prioritize AI projects for the GPU shortage

September 2, 2023 8:20 AM

Generative AI, enabled by large language models (LLMs) like GPT-4, has sent shockwaves through the tech world. ChatGPT’s meteoric rise has prompted the global tech industry to reassess and prioritize gen AI, reshaping product strategies in real time.

Integration of LLMs has given product developers an easy way to incorporate AI-powered features into their products. But it’s not all smooth sailing. A glaring challenge looms large for product leaders: the GPU shortage and spiraling costs.

Rise of LLMs and GPU shortage

The growing number of AI startups and services has led to high demand for high-end GPUs such as A100s and H100s, overwhelming Nvidia and its manufacturing partner TSMC, both of which are struggling to meet demand. Online forums like Reddit are abuzz with frustrations over GPU availability, echoing the sentiment across the tech community. It has become so dire that both AWS and Azure have had no choice but to implement quota systems.

This bottleneck doesn’t just squeeze startups; it’s a stumbling block for tech giants like OpenAI. At a recent off-the-record meeting in London, OpenAI’s CEO Sam Altman candidly acknowledged that the computer chip shortage is stymieing ChatGPT’s progress. Altman reportedly lamented that the lack of computing power has resulted in subpar API availability and has prevented OpenAI from rolling out larger “context windows” for ChatGPT.

Prioritizing AI features

On the one hand, product leaders find themselves caught in a relentless push to innovate, facing expectations to deliver cutting-edge features that leverage the power of gen AI. On the other hand, they grapple with the harsh realities of GPU capacity constraints. It’s a complex juggling act, where ruthless prioritization becomes not just a strategic decision but a necessity.

Given that GPU availability is poised to remain a challenge for the foreseeable future, product leaders must think strategically about GPU allocation. Traditionally, product leaders have leaned on prioritization techniques like the Customer Value/Need vs. Effort Matrix. This method, however logical in a world where computational resources were abundant, now calls for a bit of reevaluation.

In our current paradigm, where compute is the constraint and not software talent, product leaders must redefine how they prioritize various products or features, bringing GPU limitations to the forefront of strategic decision-making.

Planning around capacity constraints may seem unusual for the tech industry, but it’s a commonplace strategy in other industries. The underlying concept is straightforward: The most valuable factor is the time spent on the constrained resource, and the objective is to maximize the value per unit of time spent on that constraint.

Technology success metrics

As a former consultant, I’ve successfully applied this framework across various industries. I believe that tech product leaders can also use a similar approach to prioritize products or features while GPU constraints exist. When applying this framework, the most straightforward measure of value is profitability.

However, in tech, profitability might not always be the appropriate metric, particularly when venturing into a new market or product. Thus, I’ve adapted the framework to align with the success metrics often used in tech, outlining a simple four-step process:

1. Contribution

First and foremost, identify your North Star metric. This is the contribution of each product or feature, something that encapsulates the essence of its value. Some concrete examples might include:

An increase in revenue and profit
Gains in market share
Growth in the number of daily/monthly active users

2. Number of GPUs required

Gauge the number of GPUs needed for each product or feature. Focus on key factors including:

Number of queries per user per day
Number of daily active users
Complexity of the query (how many tokens each query consumes)
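
As a rough illustration of how these three factors might combine into a GPU count, consider the sketch below. The per-GPU throughput, the peak-to-average ratio and every input number are hypothetical assumptions rather than figures from this article; real sizing depends on the model, the hardware and the serving stack.

```python
import math

def estimate_gpus(daily_active_users, queries_per_user_per_day, tokens_per_query,
                  tokens_per_gpu_per_second, peak_to_average=1.0):
    """Back-of-the-envelope GPU estimate from daily token demand.

    tokens_per_gpu_per_second and peak_to_average are hypothetical inputs
    that would have to be measured for a specific model and hardware setup.
    """
    seconds_per_day = 24 * 60 * 60
    tokens_per_day = daily_active_users * queries_per_user_per_day * tokens_per_query
    average_tokens_per_second = tokens_per_day / seconds_per_day
    # Size for peak load rather than the daily average.
    peak_tokens_per_second = average_tokens_per_second * peak_to_average
    return math.ceil(peak_tokens_per_second / tokens_per_gpu_per_second)

# Illustrative numbers only.
gpus = estimate_gpus(
    daily_active_users=1_000_000,
    queries_per_user_per_day=10,
    tokens_per_query=864,
    tokens_per_gpu_per_second=1_000,
    peak_to_average=2.0,
)
print(gpus)
```

Even a crude estimate like this makes the relative GPU appetite of each feature comparable, which is all the later steps require.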

3. Calculate contribution per GPU

Break it down to the specifics. How does each GPU contribute to the overall goal? Divide each product’s contribution by the number of GPUs it requires. Understanding this will give you a clear picture of where your GPUs are best allocated.

4. Prioritize products based on contribution per GPU

Now, it’s time to make the tough decisions. Rank your products by their contribution per GPU, and then line them up accordingly. Focus on the products with the highest contribution per GPU first, ensuring that your limited resources are channeled into the areas where they’ll make the most impact.

With GPU constraints no longer a blind spot but a quantifiable factor in the decision-making process, your company can more strategically navigate the GPU shortage. To bring this framework to life, let’s visualize a scenario where you, as a product leader, are grappling with the challenge of prioritizing among four different products:

	Product A	Product B	Product C	Product D
Revenue Potential (Contribution)	$100M	$80M	$50M	$25M
Number of GPUs Required	1,000	450	500	50
Contribution Per GPU	$0.1M/GPU	$0.18M/GPU	$0.1M/GPU	$0.5M/GPU

Although Product A has the highest revenue potential, it doesn’t yield the highest contribution per GPU. Surprisingly, Product D, with the least revenue potential, offers the most substantial return per GPU. By prioritizing based on this metric, you can maximize total potential revenue.
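
The ranking described above can be reproduced in a few lines. This is a minimal sketch using the figures from the example table; the dictionary layout and function name are illustrative choices, not part of the framework itself.

```python
# Figures from the example table: revenue potential in $M and GPUs required.
products = {
    "Product A": {"contribution": 100, "gpus": 1000},
    "Product B": {"contribution": 80, "gpus": 450},
    "Product C": {"contribution": 50, "gpus": 500},
    "Product D": {"contribution": 25, "gpus": 50},
}

def contribution_per_gpu(product):
    """Contribution divided by the number of GPUs the product needs."""
    return product["contribution"] / product["gpus"]

# Highest contribution per GPU first.
ranked = sorted(
    products,
    key=lambda name: contribution_per_gpu(products[name]),
    reverse=True,
)

for name in ranked:
    print(f"{name}: ${contribution_per_gpu(products[name]):.2f}M per GPU")
```

Products A and C tie at $0.10M per GPU, so their relative order depends only on how ties are broken.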

Let’s say you have a total of 1,000 GPUs at your disposal. A simple choice might have you opting for Product A, generating a revenue potential of $100 million. However, by applying the prioritization strategy described above, you can achieve $155 million in revenue:

Priority Order	Product	Revenue Gain	GPUs
1	Product D	$25M	50
2	Product B	$80M	450
3	Product C	$50M	500
Total		$155M	1,000
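
The allocation in the table above can be sketched as a simple greedy routine. Two assumptions here are not stated in the article: each product is treated as all-or-nothing, and a product that does not fit in the remaining GPU budget is skipped rather than partially funded. The function and field names are hypothetical.

```python
def allocate_gpus(products, gpu_budget):
    """Fund products in descending order of contribution per GPU.

    Assumes all-or-nothing funding: a product that needs more GPUs than
    remain in the budget is skipped (an assumption of this sketch).
    """
    ranked = sorted(products, key=lambda p: p["contribution"] / p["gpus"], reverse=True)
    funded, remaining = [], gpu_budget
    for product in ranked:
        if product["gpus"] <= remaining:
            funded.append(product["name"])
            remaining -= product["gpus"]
    total = sum(p["contribution"] for p in products if p["name"] in funded)
    return funded, total, remaining

# Revenue potential in $M and GPUs required, from the example.
products = [
    {"name": "Product A", "contribution": 100, "gpus": 1000},
    {"name": "Product B", "contribution": 80, "gpus": 450},
    {"name": "Product C", "contribution": 50, "gpus": 500},
    {"name": "Product D", "contribution": 25, "gpus": 50},
]

funded, total, remaining = allocate_gpus(products, gpu_budget=1000)
print(funded, total, remaining)
```

With a 1,000-GPU budget, Product A is skipped because it would need the entire budget on its own after Products D and B have been funded.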

The same method can be applied to other contribution metrics, such as market share gain:

	Product A	Product B	Product C	Product D
Market Share Gain (Contribution)	5%	4%	2.5%	1.25%
Number of GPUs Required	1,000	450	500	50
Contribution Per GPU	0.005%/GPU	0.009%/GPU	0.005%/GPU	0.025%/GPU

Similarly, selecting Product A would have led to a market share gain of 5%. However, applying the prioritization strategy described above, you can achieve 7.75% in market share gain:

Priority Order	Product	Market Share Gain	GPUs
1	Product D	1.25%	50
2	Product B	4%	450
3	Product C	2.5%	500
Total		7.75%	1,000
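
Ranking by contribution per GPU is a greedy heuristic, and a greedy pick is not guaranteed to find the best combination for every set of numbers. With only a handful of products, checking every combination is cheap. The sketch below is not part of the framework as described; it uses the market share figures with the GPU counts from the priority table (450 GPUs for Product B), and the function name is hypothetical.

```python
from itertools import combinations

# (name, market share gain in percentage points, GPUs required)
products = [
    ("Product A", 5.0, 1000),
    ("Product B", 4.0, 450),
    ("Product C", 2.5, 500),
    ("Product D", 1.25, 50),
]

def best_portfolio(products, gpu_budget):
    """Try every combination of products and keep the best one that fits."""
    best_names, best_gain = (), 0.0
    for size in range(1, len(products) + 1):
        for combo in combinations(products, size):
            gpus = sum(p[2] for p in combo)
            gain = sum(p[1] for p in combo)
            if gpus <= gpu_budget and gain > best_gain:
                best_names, best_gain = tuple(p[0] for p in combo), gain
    return best_names, best_gain

names, gain = best_portfolio(products, gpu_budget=1000)
print(names, gain)
```

For this example the exhaustive search agrees with the ranking: Products B, C and D together give the largest gain that fits within 1,000 GPUs.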

Benefits and limitations

This alternative prioritization framework introduces a more nuanced and strategic approach. By zeroing in on contribution per GPU, you’re aligning resources where they’ll make the most substantial difference, whether in terms of revenue, market share or any other defining metric.

But the benefits don’t stop there. This method also fosters a greater sense of clarity and objectivity across product teams. In my experience, including my early days leading digital transformation at a healthcare company and later while working with various McKinsey clients, this approach has been a game-changer in scenarios where capacity constraints are a critical factor. It has enabled us to prioritize initiatives in a more data-driven and rational way, sidelining the traditional politics where decisions might otherwise fall to the loudest voice in the room.

However, no one-size-fits-all solution exists, and it’s worth acknowledging the potential limitations of this method. For instance, this approach may not always capture the strategic importance of certain investments. Thus, while exceptions to the framework can and should be made, they should be carefully considered rather than the norm. This maintains the integrity of the process and ensures that any deviations are made with a broader strategic context in mind.

Conclusion

Product leaders are facing an unprecedented situation with the GPU shortage, so finding new ways of managing resources is required. In the words of the great strategist Sun Tzu, “In the midst of chaos, there is also opportunity.”

The GPU shortage is indeed a challenge, but with the right approach, it can also be a catalyst for differentiation and success. The proposed prioritization framework, focusing on contribution per GPU, offers a strategic way to prioritize. By zeroing in on contribution per GPU, companies can maximize their return on investment, aligning resources where they’ll make the most impact and focusing on what matters most to the long-term success of their company.

Prerak Garg is senior director of cloud and AI corporate strategy at Microsoft and a former McKinsey and Company engagement manager.
