
re:Invent 2023: AWS aims to capture AI leadership by offering choice

Chalk Talk: ITWC at AWS re:Invent

AWS revealed its strategy for leadership in generative AI at its re:Invent conference last week, making it clear that, while rival Microsoft may have gotten off to an early start, AWS hasn’t been standing still.

Generative AI is relatively new – ChatGPT has only been on the market for a year. But artificial intelligence is not new. It’s been actively in use in business for more than a decade.

In fact, Amazon, AWS’ parent company, has been applying and integrating leading-edge artificial intelligence (AI) on a global scale. Amazon’s retail operations leverage AI in every aspect of the company’s business, from sales to logistics. AWS’ message at re:Invent was that it has brought that breadth of experience with AI-enabled business to the new opportunities presented by generative AI solutions.

In that context, AWS rolled out an overwhelming list of offerings, features and options at re:Invent. Presenters were often forced to acknowledge that if they covered all of the elements of their offerings, they’d far exceed the time they had for their presentations.

So how do you summarize AWS’ offerings in generative AI? If you had to do it in a single word, that word might very well be “choice.”

Choice of models

AWS has taken the position that no single model will dominate. This is consistent with how AI is evolving in the market. While some models loom larger in the public consciousness, there has been an explosion of different AI models. They range from large general-purpose models with billions of parameters to smaller specialized models with far fewer parameters that often rival bigger models in accuracy and function, at least for specific purposes.

There are also proprietary models, as well as many open source variations. There are models that are integrated into larger software packages and there are standalone models that can be adapted for multiple uses. There are text-based, voice-based and image-based models.

This list grows almost on a daily basis. AWS has chosen to embrace that diversity, and focus on the cloud infrastructure and tools that will enable companies to choose the right model for the right task.

So although AWS has made strategic investments in Anthropic, maker of the ChatGPT competitor Claude, it also offers support for a wide range of others, from Meta’s Llama 2 to Stability AI’s Stable Diffusion text-to-image models.

AWS’ Bedrock gives customers access to multiple models, and even allows changing the model in use without changing the underlying infrastructure. Companies can select the right model for the right purpose, based on technical and business goals such as accuracy, cost and speed.

Understanding that this wealth of models can be confusing and even overwhelming, Bedrock also includes tools to evaluate models against more precise criteria, with both automatic and human evaluation. Turning those criteria into metrics such as “accuracy, robustness and even toxicity” lets companies understand the advantages and the trade-offs they make when choosing one model over another.
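
To make that model choice concrete, here is a minimal sketch, using AWS’ boto3 SDK, that sends the same question to two different Bedrock models. The model IDs match Bedrock’s catalogue around re:Invent 2023, but the question and the small builder functions are illustrative assumptions; each model family defines its own request format, and that difference is the only thing that changes between providers.

```python
# Minimal sketch: swapping foundation models on Amazon Bedrock with boto3.
# Model IDs and request shapes reflect the Bedrock API circa late 2023;
# check the models enabled in your own account and their payload formats.
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# Each model family expects its own request body, so we map model IDs to
# small builder functions; the calling code stays the same either way.
REQUEST_BUILDERS = {
    "anthropic.claude-v2": lambda q: {
        "prompt": f"\n\nHuman: {q}\n\nAssistant:",
        "max_tokens_to_sample": 300,
    },
    "meta.llama2-13b-chat-v1": lambda q: {
        "prompt": q,
        "max_gen_len": 300,
    },
}

def ask(model_id: str, question: str) -> str:
    """Invoke the chosen model; only the model ID and body vary."""
    response = bedrock.invoke_model(
        modelId=model_id,
        body=json.dumps(REQUEST_BUILDERS[model_id](question)),
        contentType="application/json",
        accept="application/json",
    )
    return response["body"].read().decode("utf-8")

# Same question to two different models -- useful when weighing accuracy,
# cost and latency before committing to one.
for model_id in REQUEST_BUILDERS:
    print(model_id, ask(model_id, "Summarize last quarter's sales trends."))
```

Because only the model ID and its payload builder change, swapping Claude for Llama 2, or comparing them side by side, leaves the rest of the application untouched.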

Data privacy and performance

Another theme highlighted in many presentations was the need for infrastructure that delivers privacy, security and performance.

One of the issues with LLMs that could hold back adoption is understanding how a customer’s data can be protected while dealing with models that learn from the data they process. In the early launches of generative AI products, there have been examples of model “leakage” and fears about the loss of key intellectual property.

AWS has focused on tools and structures to protect and isolate customer data. It has also added the concept of “clean rooms,” which allow models to be applied without sharing raw data. Its message that “it’s your data” came across loud and clear.

In addition to privacy, there is also the issue of performance. The strength of LLMs is the incredible amount of data that they can process. The challenge is to do this at scale and at a speed that supports tasks that often must be performed in real time.

In a consumer setting with a novel offering, users can tolerate some of the delays that have marked early generative AI models. But at an enterprise level, the ability to scale with split-second response is critical. You can wait for a model to search and present you with an interesting fact, but to support a natural conversation, or to do things like fraud prevention, even minor latency can be a real issue.

Some of the innovations showcased integrate vector database capabilities into standard databases. Keeping the models and the data close together is one way to vastly improve performance.

Taking what one presenter called a “zero-ETL” approach, which avoids extracting and reloading data between systems, is another way to bring processing into real-time applications.
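
As a sketch of what keeping vectors next to the data can look like, the following assumes a PostgreSQL-compatible database (such as Aurora) with the pgvector extension installed, and uses Amazon’s Titan embedding model on Bedrock. The table name, columns and connection string are hypothetical placeholders, not any specific AWS demo.

```python
# Illustrative sketch: embeddings stored and searched inside the same
# operational database via the pgvector extension, so no separate vector
# store or ETL pipeline sits between the data and the model.
import json
import boto3
import psycopg2  # assumes a PostgreSQL database with pgvector installed

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def embed(text: str) -> list[float]:
    # Amazon Titan text embeddings on Bedrock; any embedding model works.
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(resp["body"].read())["embedding"]

# Hypothetical connection and table: tickets(id, body, embedding vector).
conn = psycopg2.connect("dbname=app user=app host=my-aurora-endpoint")
query_vec = embed("customers asking about refund delays")

with conn.cursor() as cur:
    # "<->" is pgvector's distance operator: the nearest tickets are
    # retrieved from the same database that stores them.
    cur.execute(
        "SELECT id, body FROM tickets "
        "ORDER BY embedding <-> %s::vector LIMIT 5",
        (json.dumps(query_vec),),
    )
    for ticket_id, body in cur.fetchall():
        print(ticket_id, body[:80])
```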

Infrastructure options

Driving performance and scale while offering choice extends beyond the software and database right up to the hardware layer.

AWS has always allowed customers to choose the CPUs that drive their servers, offering Intel, AMD and its own Arm-based Graviton processors. Each has its own strengths, and having the choice allows the end customer to balance performance and cost.

AWS announced that it also offers choice in AI acceleration. It has a close partnership with NVIDIA, whose GPUs are the “gold standard” of AI processing. But it has also designed and implemented its own machine learning chips, such as Trainium and Inferentia, which, depending on the usage, may offer lower cost, faster processing and even better energy efficiency.

Ease of use, accessibility and democratization

Generative AI solutions in business have divided into two major paths, with some possible variations. Everyone is familiar with the natural language applications that allow anyone to converse with the AI and conduct a wide range of tasks, even creating entire applications. That human level interaction and the democratization it supports has been a driving force in generative AI.

But once again, operating at scale or having highly specialized applications can also require the ability to customize, integrate, and fine-tune models using expert skills.

AWS has introduced a range of offerings that it believes will deliver both full natural-language solutions and assistance with code development.

One demonstration really encapsulated this approach. There are still, and will continue to be, databases that need SQL queries to retrieve and interpret data. Tools that take natural-language instructions speed up the process for the programmer and do the heavy lifting, even down to testing the code and checking for security and other issues, while still allowing the programmer to intervene, change and adapt the solution.

Equally, for an untrained user, the same facility can generate, test, and run a query, even reading the database schema and suggesting how to write the appropriate code.
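
A minimal sketch of that natural-language-to-SQL pattern, assuming a Bedrock-hosted Claude model; the schema, prompt wording and example question are illustrative, not AWS’ actual tooling, which also reads the schema and tests the result for you.

```python
# Sketch: hand a model the table schema plus a plain-English question and
# get back a SQL query a developer (or untrained user) can review and run.
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# A toy schema the model can reason over; real tools read it from the DB.
SCHEMA = """
CREATE TABLE orders (
    id INT PRIMARY KEY,
    customer_id INT,
    total_cents INT,
    created_at TIMESTAMP
);
"""

def generate_sql(question: str) -> str:
    """Ask a Bedrock-hosted model to draft a SQL query for review."""
    prompt = (
        f"\n\nHuman: Given this schema:\n{SCHEMA}\n"
        f"Write one SQL query that answers: {question}\n"
        "Return only the SQL.\n\nAssistant:"
    )
    resp = bedrock.invoke_model(
        modelId="anthropic.claude-v2",
        body=json.dumps({"prompt": prompt, "max_tokens_to_sample": 200}),
        contentType="application/json",
        accept="application/json",
    )
    # Claude's text-completion responses return the output in "completion".
    return json.loads(resp["body"].read())["completion"]

# The draft query is reviewed and tested before it touches production.
print(generate_sql("What was the average order value last month?"))
```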

Having both of these options allows AWS to appeal to enterprise technology groups and business users alike.

For the highly skilled, the emphasis is on productivity and security. For others, the natural language and no-code solutions emphasize the democratization possible with LLMs.

Summing it up

Those are my reflections from the time spent at re:Invent. I’ll be posting other stories in the coming week from some particular areas of interest.

For those who want to dive a little deeper, I’ll be updating this article with a list of resources, including links to presentations and papers that provide more detail. Check back for updates.
