Enterprises are racing to adopt AI to enhance operational efficiency and gain competitive advantage. In fact, in a recent survey, 71% of respondents reported that their organizations were already using generative AI (GenAI) regularly.
Beyond employing existing GenAI services, many organizations are building their own AI-powered apps and AI agents to deliver hyper-personalized customer experiences. They recognize the tremendous opportunity for creating apps and agents that provide highly tailored buying recommendations, product information, and support for their global customer base.
This focus on hyper-personalization coincides with a shift toward a new model for application development. Developers are increasingly breaking up applications into smaller, single-function components that can serve as AI agents. This new model enables developers to produce new agents quickly while enhancing their performance, resilience, and security.
There’s just one problem: Developers often turn to hyperscalers to build these apps, and hyperscalers can’t adequately support this new application model. They were not designed to deliver personalized apps or agents to millions of individual users.
We need a new foundation for this AI era — a new infrastructure for building and deploying these next-generation apps and agents. Finding that foundation starts with understanding key shifts in application development, identifying the limitations of hyperscalers, and then defining essential requirements. Once we have the foundation in place, we can build apps and agents that address our business goals and meet our customers’ expectations.
The days of building large, monolithic applications that provide the same experience to every user are ending. Two key trends are working together to produce a new paradigm in app development: Organizations are using AI to create hyper-personalized experiences, and they are adopting a new architectural model to build and deliver apps.
Teams that are building apps can no longer produce generic, one-size-fits-all experiences that flood users with excessive information and irrelevant offers. Customers today — whether in retail, financial services, gaming, or some other field — expect apps to deliver engaging experiences that reflect their precise preferences and interests.
AI apps and agents help deliver those hyper-personalized experiences. And they can do so at scale, without requiring teams to increase manual development work.
Let’s say your company offers calendar software. You might want to enable your customers to schedule specific types of events, like a child’s soccer games. Each individual would have a unique AI agent, which could collect game information from the league’s online event calendar and then populate the individual’s custom, personalized calendar. Using AI agents lets you deliver these individualized experiences to thousands or millions of customers.
The drive to create hyper-personalized experiences with AI apps and agents is aided by an important change in software development. After transitioning from monolithic apps to microservices, developers are now embracing a “nanoservices” model. This model breaks app functions into small, self-contained, single-purpose components — even smaller and more independent than microservices. Like the microservices approach, nanoservices help accelerate development and improve app resiliency; but they also enable hyper-personalization.
Each nanoservice component functions as an agent, conducting a very specific, highly customized task. Instead of building a single app that serves millions of users, this architecture enables you to deliver millions of unique, tailored agents.
Integrating AI into this architectural model allows you to use agents to deliver individualized experiences at scale. You can send the input from the user (like a prompt, browsing history, or chat interaction) to a large language model (LLM), then execute a personalized task based on the output. You can run the process once and shut it down — or use it millions of times.
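That flow can be sketched in a few lines. This is a minimal, hypothetical illustration, not a real platform API: `callModel` here is a stub standing in for an actual LLM endpoint, and the "personalized task" is reduced to formatting a confirmation.

```typescript
// Sketch of a single-purpose, per-user agent (hypothetical helper names).
// In practice, callModel would send the prompt to a real LLM endpoint.

type UserContext = { userId: string; input: string };

// Stand-in for a real LLM call; returns a "personalized task" description.
async function callModel(prompt: string): Promise<string> {
  return `schedule:${prompt}`; // stub response for illustration
}

async function runAgent(ctx: UserContext): Promise<string> {
  // 1. Send the user's input (prompt, browsing history, etc.) to the model.
  const output = await callModel(ctx.input);
  // 2. Execute a personalized task based on the model's output. A real agent
  //    might update a calendar; here we just format a confirmation.
  return `agent for ${ctx.userId} executed: ${output}`;
}
```

Because the agent is a self-contained function with no shared state, the same code can be run once and shut down, or instantiated millions of times for millions of users.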
Coupling nanoservices with AI, then, can have a revolutionary impact on the kinds of experiences you deliver to users. Still, making the most of nanoservices and AI will be difficult with typical hyperscaler offerings.
It’s not surprising that hyperscalers are the preferred destination for building and delivering apps; they are generally much more flexible and cost-effective than legacy on-premises environments. Still, scalability constraints, statelessness, security limitations, and additive costs mean they are not the best foundation for AI.
Hyperscalers are great at scaling a single app: With a hyperscaler, you can deliver the same single app to thousands of users. But hyperscalers don’t have the right application architecture to efficiently deliver millions of independent apps or agents to millions of individuals.
They also lack the automated processes to spin those apps and agents up — or down — quickly. You might have thousands of users who suddenly want a customized AI agent for scheduling soccer games when the season starts. To meet demand, you need to spin up those independent agents within milliseconds of one another. When the season ends, and demand ebbs, you need to spin those apps down immediately to conserve resources.
With most hyperscalers, provisioning resources for an individual app is a time-consuming manual process. That environment simply isn’t designed for supporting numerous individual, customized apps.
Hyperscalers cannot provide the statefulness that you need for running AI agents. An AI agent should remember context, user preferences, and user interactions from one session to the next. That statefulness enables users to pick up right where they left off with their task instead of having to start from scratch — even if there has been a significant period of inactivity.
But hyperscalers prioritize scalability and reliability over statefulness. Hyperscalers are designed so that resources can be added or removed easily to accommodate changing demand, and rapidly replaced when something goes wrong. That stateless flexibility comes at the expense of retaining data in memory.
When you are creating hyper-personalized AI-powered apps and agents, you need to make sure that data from one user’s app or agent is never accessible to another user. If you are keeping track of your child’s soccer schedule with an AI agent, for example, you don’t want other people to have access to your child’s location.
Because hyperscalers don’t offer app architectures designed for nanoservice-based agents, they can’t adequately secure each agent. If you are building AI apps and agents, you also need to protect against the wide array of threats facing AI models — from prompt injection and insecure output handling, to data poisoning and supply chain vulnerabilities. Not all hyperscalers have sufficient built-in security capabilities to handle these and other threats.
And if your developers are using a hyperscaler and its AI model for “vibe coding,” where an LLM is helping write the code, you need to be sure the model doesn’t inadvertently introduce security vulnerabilities.
The costs of running an AI-powered app on a hyperscaler add up quickly. Some hyperscalers charge you for every input to and output from an AI model. Others charge for the time that resources are allocated, even if those resources are not processing requests. Let’s say the AI agent that tracks soccer schedules repeatedly checks the league’s online calendar for any updates to game times or locations. The company providing the app is charged by the cloud provider for all the time those resources are reserved, even when the code is just waiting for user input or a response from an external source (like the league’s online calendar).
You might also incur high egress fees for data. Hyperscalers charge those fees when delivering data to a user through an app hosted on a different cloud provider, or when transferring data from their cloud to another. Even transfers within a single cloud can incur regional egress fees.
When selecting a cloud-based platform for building and running AI-powered apps, there are four key requirements.
First, the platform must provide an application layer designed to support the nanoservices that enable hyper-personalized AI apps and agents. You need to easily produce millions of precisely tailored apps and agents, all running at once — and all delivered close to users, at the edge. And you should be able to spin those apps up or down automatically, in milliseconds.
Second, the platform should enable you to build stateful agents that retain context in memory for minutes, hours, days, or weeks. Unlike traditional stateless serverless architectures, a platform with statefulness will allow agents to resume interactions after an extended period of time and perform complex, long-running tasks.
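The difference from a stateless function is easy to see in code. In this minimal sketch, an in-memory Map stands in for the durable, per-user storage such a platform would provide; the interface names are hypothetical.

```typescript
// Sketch of per-user agent state that survives between sessions.
// The Map is a stand-in for durable storage a stateful platform would offer.

interface AgentState { lastTask?: string; interactions: number }

const store = new Map<string, AgentState>();

function recordTask(userId: string, task: string): void {
  // Save what the agent was doing so a later session can pick it up.
  const state = store.get(userId) ?? { interactions: 0 };
  state.lastTask = task;
  store.set(userId, state);
}

function resumeSession(userId: string): AgentState {
  // Load existing state, or start fresh for a new user.
  const state = store.get(userId) ?? { interactions: 0 };
  state.interactions += 1;
  store.set(userId, state);
  return state;
}
```

With state retained this way, a user who returns after weeks of inactivity resumes from `lastTask` rather than starting from scratch.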
Third, the platform must enable you to easily build security into apps and agents. It will allow you to create secure, discrete nanoservices that protect sensitive data. Meanwhile, it will provide tools that let developers vibe code with confidence, preventing data leakage, prompt injection attacks, and other threats that can emerge when integrating AI models with apps.
Fourth, a foundational AI platform must offer multiple ways to minimize the costs of running AI apps. For example, caching AI model responses can reduce token fees. If users provide the same input repeatedly, you won’t pay for numerous calls to the AI model.
The platform should also let you avoid fees for “wall time,” when apps and agents are waiting for input or responses: it should charge you only when your apps are actually doing something. And because you might need to move data from one storage environment to another, the platform must eliminate data egress fees.
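The caching idea mentioned above can be sketched briefly. This is an illustrative stub, not a real billing or model API: `callModel` simulates a paid LLM call, and the counter shows that repeated identical prompts hit the cache instead of incurring another call.

```typescript
// Sketch of response caching: identical prompts are served from the cache
// rather than paying for another model call. callModel is a stub.

let modelCalls = 0;

async function callModel(prompt: string): Promise<string> {
  modelCalls += 1; // each real call would incur token fees
  return `answer for: ${prompt}`;
}

const responseCache = new Map<string, string>();

async function cachedCompletion(prompt: string): Promise<string> {
  const hit = responseCache.get(prompt);
  if (hit !== undefined) return hit; // cache hit: no model call, no token cost
  const answer = await callModel(prompt);
  responseCache.set(prompt, answer);
  return answer;
}
```

A production cache would also need an eviction policy and a decision about whether near-identical prompts should share an entry, but the cost-saving principle is the same.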
Cloudflare offers a scalable, secure, and cost-effective foundation for building, running, and securing AI-powered apps and agents. Cloudflare provides a broad portfolio of services to streamline AI app and AI agent development — including a platform for accelerating AI development, an SDK for building AI-powered agents, security services for protecting AI apps, and object storage for storing fast-growing data sets without incurring egress fees.
Importantly, the Cloudflare platform is designed to support the nanoservices model that enables you to build and scale millions of hyper-personalized apps and agents. It also provides the statefulness needed to retain context between sessions. And while hyperscalers make you string together multiple, distinct services to build, secure, and deploy apps, Cloudflare delivers all the capabilities in a single, unified connectivity cloud platform. Cloudflare’s global network enables you to run those apps and agents close to individual users, delivering responsive, low-latency experiences.
Using Cloudflare as an AI foundation can help you build and deliver innovative, hyper-personalized apps and agents rapidly, securely, and cost-effectively. With that foundation in place, you can then work to further maximize the ROI of AI for your enterprise while reducing risk.
This article is part of a series on the latest trends and topics impacting today’s technology decision-makers.
Learn how to support enterprise AI initiatives while enhancing security in “Ensuring safe AI practices: A CISO’s guide on how to create a scalable AI strategy.”
After reading this article, you will be able to understand:
How AI and nanoservices are transforming app development
Why hyperscalers can’t adequately support AI apps
4 requirements for AI app development platforms