UPDATED 14:00 EDT / NOVEMBER 27 2023


Exclusive Q&A: Ahead of re:Invent, AWS chief Adam Selipsky lays out a broad AI strategy

As Amazon Web Services Chief Executive Adam Selipsky prepares for his third re:Invent conference, it’s no surprise that artificial intelligence dominates his thinking.

AWS needs a big boost given all the competition lately from the likes of Microsoft Corp., Google LLC and a raft of smaller clouds latching onto generative AI. Meanwhile, growth at Amazon.com Inc.’s cloud unit has slowed or flattened for seven quarters in a row as customers enjoy the flipside of the cloud: the ability to scale down as quickly as they scaled up. Selipsky, who has been CEO of AWS for well over two years — enough to have full responsibility for its results by now — has a lot to prove.

Although Selipsky couldn’t reveal much about new products coming at re:Invent, his focus in this interview on AWS’ ambitions with processors, the importance of AI model choices, and the central role of data management provides some hints at what’s to come. The company is likely to announce its usual raft of new products, almost certainly focusing on AI, AI chips and other new infrastructure services to support them, and new data analysis tools.

Here’s the full interview, lightly edited for clarity:

Let’s start with the business. How do you think about the numbers now that we are in this new mode of AI growth driving changes in spending?

It’s been a very unusual couple of years economically: supply chain shocks, interest rate increases, inflation. And I’m hard-pressed to remember a year when customers have felt so uncertain about where the economy’s heading and so uncertain about their prospects. I’ve seen much better times and I’ve seen much worse times, but not such uncertain times. And as a result, I think some companies have decided to be conservative about their spending and their levels of investment.

As part of that, we have had a lot of customers who have been cost-optimizing their clouds. Why? Because of a great feature of AWS: elasticity, which we’ve been enabling literally from day one of the cloud. That’s the ability to rapidly grow and shrink your infrastructure. And guess what? We meant it. Airbnb shrank its infrastructure expense by 27% in just a few months, and then bounced back rapidly once the travel and hospitality industry came back. We help our customers lower their costs when it’s time for them to do so. And by the way, we think it’s good business for us. We think it engenders really strong long-term relationships.

If that’s the case, how should we as the industry think about the health of the new cloud business?

Our health is strong. Our growth rate obviously came down over a four- to six-quarter period. You saw in our last quarter that the growth rates had stabilized, and now we’re cautiously optimistic that a lot of customers, though not all, are through the cost optimization. And certainly if you look to the medium and long term, we feel very optimistic about the outlook for strong AWS growth. We’re still in the early days of the cloud.

We’ll get into the whole AI trend, especially with large language models and foundation models serving as the abstraction layer on the infrastructure. But first, what’s the most important thing you’re doing at the infrastructure level — silicon obviously, but what else?

First, there’s still all of the non-generative AI activity, which remains really, really important — as important as generative AI is. So all of the other use cases, all of the other infrastructure that customers need: We continue to have by far the broadest and deepest set of capabilities across all cloud use cases, and we continue to work on our own custom chips. I think you saw some other providers just announcing that in the future they will have version-one chips, and maybe they’ll only be for their own internal use. You’ve got to wonder: Are they not good enough for customers?

Meanwhile, Graviton3 has been in the market for over a year now. And I guarantee you, we’re not stopping at Graviton3. Each generation of our custom chips has provided industry-leading price-performance and industry-leading energy consumption, and each generation has had on average 25% price-performance improvements over the prior generation. So it’s really important that we keep innovating for general-purpose computing. We also keep advancing our technology in storage. S3 was the granddaddy of them all, the first cloud service we announced. We have really important innovations that help save customers money, like S3 Intelligent-Tiering, which automatically moves your storage to more cost-effective tiers. S3 customers have saved over $2 billion since Intelligent-Tiering launched.
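[For readers who want to try it, opting an object into S3 Intelligent-Tiering is a one-parameter change at upload time. A minimal sketch using boto3, with hypothetical bucket and key names:]

```python
import boto3

s3 = boto3.client("s3")

# Upload an object directly into the Intelligent-Tiering storage class.
# S3 then moves it between access tiers automatically based on usage.
s3.put_object(
    Bucket="example-bucket",      # hypothetical bucket name
    Key="logs/2023/app.log",      # hypothetical object key
    Body=b"example payload",
    StorageClass="INTELLIGENT_TIERING",
)
```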

I just came back from the supercomputing conference in Denver, and you’re starting to see some other formations around architectural changes at the infrastructure level, chips and other things. Beyond and around the chips, people are doing things differently with networking and compute to get what they need for gen AI. How are you using things like your partnership with Anthropic, for example, to work on the silicon? How is this impacting the product?

Within the generative AI world, much like we think about layers of the stack for other AWS services, we think of gen AI as having three layers of the stack as well. It’s really important for our customers collectively that we innovate at all three layers of the stack. Not all of our competitors, seemingly, have taken the path of innovating at all layers of the stack. Some of them may be coming around, but you have to wonder how long it’ll take them to get there.

So the bottom layer of the stack is the infrastructure itself and the silicon. Earlier this year we shipped our Inferentia2 chip for inference, which is inside our Inf2 compute instances with really good price-performance. And we’re well over a year into our Trainium line, and yeah, you can imagine we’re not going to stop at our first version of Trainium. The gen AI workloads are so compute-intensive, particularly training models, but people will also see that running inference can be really expensive as well. That price-performance is going to be absolutely vital, and companies will actually decide not to do certain things with generative AI if the price-performance can’t be brought into line.

That’s where Anthropic comes in.

They’ve been a customer literally since they launched, which seems like decades ago, but it was actually only 2021. Anthropic came to us and realized they needed very large amounts of cost-effective and performant compute capacity to train their leading foundation models and [make them available in] Amazon Bedrock, our managed service for foundation models. They’ll be developing their future models on AWS. They’ve named AWS their primary cloud provider for mission-critical workloads. And they’re going to be running the majority of their workloads on AWS.

We also realized that they’re world experts … and if we collaborated tightly, they could actually help us with our Trainium and Inferentia development. Part of that deepened and expanded partnership is the investment we said we’d make: $1.25 billion, with the option for it to go up to as high as $4 billion.

What kind of product feedback are you seeing, or expecting, from Anthropic on the silicon?

Continued and renewed optimization to make the chips work better: faster and at lower cost. Someone like Anthropic is going to encounter all sorts of insights that a lot of other people are just not going to bump into. So I think there’ll be insights about … efficiency and the raw performance of the chips.

AWS has a great relationship with Nvidia, providing access to its graphics processing units. At the same time, Nvidia is trying to enable some bare-metal server companies to do specialty clouds. How do you see the potential for purpose-built clouds inside data centers for specialized workloads?

Let’s talk about GPUs for a moment and then I’ll talk about where they’re going to be deployed. AWS is the leading place to host GPU-based capacity for generative AI. We’ve been the first cloud provider for basically every significant chip that Nvidia’s come out with, going back to the gaming applications before anybody was talking about generative AI. It’s not just about having chips; it’s also about having incredibly performant networking inside of the clusters, for example. Various small providers have stood up GPU capacity, and we’ve seen a lot of customers go and investigate it and then come running back to us saying, “Having chips is great, but it doesn’t actually work.” So we continue to deepen our relationship with Nvidia. That’s not going to change.

It’s pretty clear that generative AI is going to be the next set of massive workloads in the cloud. So many gen AI workloads are incredibly compute-intensive. And it’s not just about standing up some servers that have GPUs inside. It’s about making clusters highly performant, highly reliable, highly cost-effective, and more and more highly energy-efficient.

All of the chips, from the general-purpose Graviton chips to Trainium and Inferentia, as well as everything we do in our data centers, make us incredibly more energy-efficient. The average U.S. enterprise will reduce its computing energy consumption by 80% simply by moving from on-premises to the cloud, to AWS. And this is becoming really important to our customers.

What things do you do in the data center for efficiency beyond your own chips?

Not every use case calls for long-term, persistent usage of GPUs. There are a lot of companies and a lot of workloads that really want short-term, episodic usage of GPUs. And you need a cluster of GPUs. It’s not just like general compute, where you can have one virtual server here and one virtual server over there. With GPUs, these generative AI apps need a cluster, and the cluster needs to have networking internal to the cluster. This has been an unsolved problem.

So just a couple of weeks ago we released EC2 Capacity Blocks for ML, which is really the first solution for this short-term usage of GPU-based ML clusters. You can reserve capacity blocks in advance that can be very small or go up to hundreds of GPUs, or hundreds of servers with GPUs. You no longer have to choose between, do I want to take the chance that I have no capacity, versus do I want to buy and hold onto capacity that’s going to be dramatically underutilized when I don’t need it.
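[As a rough sketch of how that reservation flow looks in code, the example below uses the boto3 EC2 calls that shipped alongside Capacity Blocks for ML. The instance type, count and date window are hypothetical, and parameter details may differ from the API as launched:]

```python
from datetime import datetime, timedelta

import boto3

ec2 = boto3.client("ec2")

# Search for short-term GPU cluster capacity: here, a hypothetical
# four-instance p5 cluster for 24 hours sometime in the next two weeks.
offerings = ec2.describe_capacity_block_offerings(
    InstanceType="p5.48xlarge",
    InstanceCount=4,
    CapacityDurationHours=24,
    StartDateRange=datetime.utcnow(),
    EndDateRange=datetime.utcnow() + timedelta(days=14),
)

# Reserve the first matching block; instances can then be launched
# into the reservation once its start time arrives.
offering = offerings["CapacityBlockOfferings"][0]
purchase = ec2.purchase_capacity_block(
    CapacityBlockOfferingId=offering["CapacityBlockOfferingId"],
    InstancePlatform="Linux/UNIX",
)
print(purchase["CapacityReservation"]["CapacityReservationId"])
```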

Inference is the killer app; that’s where the extraction of the value will come from, along with the value of the data. How will the user interface change with generative AI?

I agree the data will be the key thing. But the question of training versus inference is like asking which foot you need to run a race: You really need both feet. First you do need really great models, and the price-performance characteristics of the models will absolutely continue to improve rapidly. People are going to want models from multiple providers. There’s not going to be one model to rule the world. There will absolutely be some very large, highly capable general-purpose models. And there will also be smaller specialized models. You may actually be able to train them to be better at answering certain specialized questions — for example, a model around molecular structure for drug research. And the smaller models are going to have really attractive price-performance characteristics.

That’s why we released Amazon Bedrock, which is our foundation-model-as-a-service offering. And we’ve got multiple leading providers in there: Anthropic, Cohere, Stability AI, AI21 Labs. We recently added Meta’s Llama 2 — we’re the only place where that’s offered as a managed service. And we also have our own Amazon Titan models. All of that flexibility is absolutely vital to customers, because at this early stage of the game, with all of the experimentation going on … the key characteristic that people need to have is adaptability.

We’ve already talked to a Fortune 100 financial services company that had an early gen AI app around analyzing their code. They used model A and got decent results, and they used model B and got decent results. Then they experimented and used the two models in conjunction with each other, and they got, in their words, magic.
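[The swap that company describes is largely a one-line change in code, since Bedrock exposes every hosted model behind the same runtime call. A minimal sketch using boto3; the request body shown follows Anthropic’s Claude v2 conventions at the time of writing and differs by provider:]

```python
import json

import boto3

bedrock = boto3.client("bedrock-runtime")

# Invoke a hosted foundation model. Trying a different provider mostly
# means changing modelId and the provider-specific request body.
response = bedrock.invoke_model(
    modelId="anthropic.claude-v2",
    contentType="application/json",
    accept="application/json",
    body=json.dumps({
        "prompt": "\n\nHuman: Summarize what Amazon Bedrock does.\n\nAssistant:",
        "max_tokens_to_sample": 300,
    }),
)

result = json.loads(response["body"].read())
print(result["completion"])
```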

The developer opportunity is going to be massive. So what will be built from these experiments is the next question. If you have this foundation layer, there’s going to be a generative application tsunami. What’s your view on that?

That’s the third layer of the stack, so we’ll get there, but I didn’t really answer your second-layer-of-the-stack question. Let me just finish that. Once we’ve got great models that have reasonable price-performance, and as customers go and start deploying generative AI applications, the majority of what is spent on gen AI is going to flip to be inference, not training. That’s why it’s so important not to focus only on training, but to focus on having really great inference with price-performance that works. And that’s why things like our Inferentia-based compute capacity are so important.

We’re building a whole lot of capabilities to help people run inference. Just one, to give a little flavor: A few months ago we announced a capability inside of Bedrock called Agents. FMs are great at providing answers, whether that be summarizing things or making predictions, but they have not been good at actually taking action: performing tasks inside of your business.

Say you’re an e-commerce site and have a consumer who bought brown shoes and wants to exchange those shoes for black shoes. With Agents for Amazon Bedrock, the e-commerce provider can very simply use generative AI to set up a workflow where the customer asks to do that exchange. And on the back end, the AI-driven agent will access all the databases, figure out what is allowable and not allowable, query the inventory, and return an answer and directions to the customer as to how to do that exchange. We’re trying to push into how we can actually help enable your business with generative AI. How can we help you take action with generative AI?
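[For a sense of what calling such an agent looks like from application code, here is a minimal sketch using boto3’s bedrock-agent-runtime client. The agent and alias IDs are hypothetical placeholders, and the agent itself, with action groups wired to the retailer’s systems, would be configured separately in Bedrock:]

```python
import boto3

agents = boto3.client("bedrock-agent-runtime")

# Send the customer's request to a preconfigured Bedrock agent. The
# agent plans the steps, calls the business APIs defined in its action
# groups, and streams back a final answer.
response = agents.invoke_agent(
    agentId="AGENT_ID_PLACEHOLDER",       # hypothetical agent ID
    agentAliasId="ALIAS_ID_PLACEHOLDER",  # hypothetical alias ID
    sessionId="customer-session-1234",
    inputText="I'd like to exchange my brown shoes for black ones.",
)

# The completion arrives as an event stream of text chunks.
answer = "".join(
    event["chunk"]["bytes"].decode("utf-8")
    for event in response["completion"]
    if "chunk" in event
)
print(answer)
```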

If it’s my business, I want to make sure it’s not going to be just publicly available data. I want to have proprietary data. 

Glad you brought that up. It boggles my mind how other providers have released early versions of their AI offerings essentially without a security model. And then they circled back and said, “Oh, well, we’ve got subsequent versions and those have security.” If I went to one of my enterprise customers and said, “Hey, I’ve got this amazing new database service for you. It’s fantastic. I really think you should use it. By the way, it doesn’t have any of the security that you’ve come to expect with AWS, but don’t worry about it. In future versions I’ll build security in,” they would throw me out.

So everything we’ve done, including our own Titan models and all of the services, particularly Amazon Bedrock, is built with the same security model as any other service inside of AWS.

After the hockey game last night, I met with some folks who were in town for another conference, and their security guys are turning off Copilot. There’s no way they’re going to deal with that data being in the product. That’s the data problem that I see: If there’s a public model like Anthropic’s or OpenAI’s, I’ll go to their data and ping them for some answers, but I’ll take it into my model.

Let me give you a couple of specifics around the security model. We make a separate private version of your model and we place it inside your Amazon Virtual Private Cloud, or VPC, which is your own isolated AWS environment. Any movement is encrypted over a private connection, so you’re in a fully isolated environment. And that means two things.

One, it means that any data you use in conjunction with the model is not going out over the public internet. It’s all staying in your isolated environment. And two, it means that any improvements you make to the model — through fine-tuning, through additional pretraining on the model or through RAG, retrieval-augmented generation, any of those customization techniques — stay and reside purely with you. The improvements to the model staying with you: those are the important security concepts that we don’t think you can bolt on.
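[One concrete piece of that isolation: calls to Bedrock can be kept off the public internet by giving the VPC a private interface endpoint over AWS PrivateLink. A minimal sketch using boto3, with hypothetical VPC, subnet and security group IDs:]

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create an interface VPC endpoint so calls to Bedrock's runtime API
# travel over AWS PrivateLink instead of the public internet.
endpoint = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",                      # hypothetical
    ServiceName="com.amazonaws.us-east-1.bedrock-runtime",
    SubnetIds=["subnet-0123456789abcdef0"],             # hypothetical
    SecurityGroupIds=["sg-0123456789abcdef0"],          # hypothetical
    PrivateDnsEnabled=True,
)
print(endpoint["VpcEndpoint"]["VpcEndpointId"])
```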

What do you think the impact is going to be for data management?

Both you and your competitors will have equal access to those great models, so your data is going to be your differentiator. And what that requires is that you have a great data strategy and a great data platform. That’s why so many AWS customers are moving so quickly with generative AI: because they’ve already got a great data strategy up and running on AWS. I’m talking Adidas, BMW, Booking.com, Boeing, Merck, Pfizer, Nexus, Lexus, Zillow.

And you have to know what data you have. It has to be harmonized, and it all has to be usable in conjunction with one another. You need multiple database engines, not just one. You need multiple analytics services, each with deep capabilities. You need governance, which runs through the whole thing, so that everybody in the company knows what data’s available [and] permissions are taken care of.

What about databases and analytics?

All of our data services, from databases to analytics, remain a really important priority for our customers. We introduced at the last re:Invent this concept of zero-ETL [extract/transform/load, a process to organize data for storage and use], which was a vision for a future in which you don’t have to build data pipelines and rip your hair out trying to rewrite code. You can easily query your data without having to do complex ETL on it. We put a down payment on that last year, and we for sure are going to continue to make good on that down payment.
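[As an illustration of what that down payment looks like in practice, the Aurora-to-Amazon Redshift zero-ETL integration is created with a single API call rather than a hand-built pipeline. A minimal sketch, assuming boto3’s rds create_integration call and hypothetical ARNs:]

```python
import boto3

rds = boto3.client("rds")

# Create a zero-ETL integration that continuously replicates an Aurora
# cluster into a Redshift data warehouse, with no pipeline to maintain.
# ARNs and the 'Status' response field are assumptions for illustration.
integration = rds.create_integration(
    IntegrationName="orders-to-warehouse",  # hypothetical name
    SourceArn="arn:aws:rds:us-east-1:111122223333:cluster:orders-db",
    TargetArn=(
        "arn:aws:redshift-serverless:us-east-1:111122223333:"
        "namespace/example-namespace"
    ),
)
print(integration["Status"])
```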

In the generative AI setting, now you’re talking data pipelining, basically. So if you have a workflow end to end, you understand your data. If I can generate code to be a glue layer, or have generated code to help set up and scale that aspect of the infrastructure, developers won’t even need a console; they’re going to get the machine to code inline somewhere. How soon do you see that coming?

We just finished with the middle of the stack, which is accessing foundation models. The top layer of the stack is basically using applications that are using the FMs, but you, the customer, don’t have to worry about using the FM itself. One of the first areas we looked at, in terms of where generative AI applications can be really helpful, was how we could help developers. About a year ago, we released CodeWhisperer, which is our coding companion. You type in a natural-language query and it returns code to you. We’ve seen developers complete their tasks up to 60% faster. We also announced a capability this fall, CodeWhisperer customization. It understands your libraries, your SDKs, your code base, so it’s delivering Amazon-specific code, or Zillow- or Accenture-specific code.
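[To illustrate the interaction model, a coding companion of this kind is typically driven by a natural-language prompt or comment in the editor, with the tool proposing code beneath it. The example below is illustrative only, not output captured from CodeWhisperer:]

```python
# Prompt typed by the developer:
# "function that returns the total size in bytes of all objects
#  under a given S3 prefix"

import boto3

def total_size_under_prefix(bucket: str, prefix: str) -> int:
    """Sum the sizes of every object beneath an S3 prefix."""
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    total = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            total += obj["Size"]
    return total
```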

So I think the premise is right: Really having full control and power over your data and bringing that to bear is going to impact every business function in the company.

We’re seeing applications that are AI wrappers, really easy ways to put AI around something. Then you see enterprises with good data, with data exhaust being used to get value. And then the third area is something like reasoning apps. How do you see AI helping the ecosystem of partners building applications on top of AWS?

There are huge partnering opportunities across every facet of generative AI. There will be a huge number of partners who build generative AI-driven applications on top of Bedrock, on top of the models — for every business function, for developers, for customer service applications, for legal departments, for HR, for design, for content generation.

In addition, we’re already deeply partnering on generative AI with almost every systems integrator we work with. You’ve got folks like Accenture, Deloitte, Slalom and Rackspace, who are both using our capabilities, like CodeWhisperer, internally. Accenture’s deployed CodeWhisperer to 50,000 people.

More important than that, we’re going out jointly to customers, going everywhere from the brainstorming to the proofs of concept and pilots to production deployment of generative AI applications. And our customers really need that expertise.

If you were a startup in 2007 and you weren’t on Amazon, you were not a startup. Now you have startups coming on the AI side with people who are 23, 24 years old, and they have more choices. What’s your value proposition to them?

Well, there is no birthright. We have to earn everybody’s business every day. Over 80% of today’s unicorns use AWS — and I’m not talking about the ones that are now Fortune 500 companies. So I think we’re still doing very well with today’s startups. They’re also just fun to work with. And it’s up to us to continue to have the offerings and the price-performance, and also the human interaction and ways to help them bootstrap and grow their businesses and be successful. That will keep them running on AWS, and I think we can do that.

How do you feel about re:Invent? Are you still pumped up?

I could not be more excited about re:Invent. We’re going to have over 50,000 of the AWS community live in Las Vegas with us. We’re going to have over 2,000 sessions for them. We’re going to have hundreds of thousands of people watching online. It’s one of the favorite weeks of the year for everybody here. We get to be with our customers and our partners, and we get to launch a bunch of new stuff.

Photo: AWS
