The future of local AI will live on rented servers

For the past few years, artificial intelligence has been dominated by a very simple idea: use ChatGPT, Claude, Gemini or any other closed API, pay per token and let a big company take care of everything. For the average user, that model is convenient. But for many companies, startups and developers it has a serious problem: they don’t fully control their data, their costs or their infrastructure.

This is where a trend emerges that could define the sector’s next stage: local AI running on rented infrastructure. The distinction matters because “local models” doesn’t always mean a laptop at home. Very often it means open-weight models deployed on private servers, rented GPUs, bare metal machines or specialized clouds.

The thesis is clear: if open-weight models keep improving, demand for GPU infrastructure could skyrocket. Not because everyone will have a supercomputer on their desk, but because thousands of companies will want to run their own models without depending entirely on a closed API.

The thesis: better open-weight models mean more demand for GPU infrastructure

If an open-weight model gets close enough to the level of the best closed models, many companies will consider deploying it themselves. They don’t need it to be the best model in the world across every benchmark. They need it to be good enough for their real tasks.

There’s no shortage of examples: internal support, agents for employees, document analysis, RAG, content generation, classification, translation, data extraction, code copilots, automations and specialized chatbots.

Running your own model can offer several advantages:

More control over data, especially in regulated sectors.
Greater privacy, by reducing dependence on external providers.
Customization, with models tuned to the business.
More predictable costs when volume is high.
Less dependence on price changes, limits or terms from OpenAI, Anthropic, Google or other providers.
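
The cost point can be made concrete with a rough break-even calculation. The sketch below is illustrative only: the per-token price, the monthly GPU rent and the throughput per GPU are hypothetical figures, not quotes from any provider, and the function names are invented for this example.

```python
import math


def api_monthly_cost(tokens_per_month: float, price_per_million_tokens: float) -> float:
    """Pay-per-token pricing: the bill grows linearly with volume."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


def rented_gpu_monthly_cost(
    tokens_per_month: float,
    tokens_per_gpu_per_month: float,
    gpu_monthly_rent: float,
) -> float:
    """Rented capacity: the bill moves in steps of whole GPUs (at least one)."""
    gpus_needed = max(1, math.ceil(tokens_per_month / tokens_per_gpu_per_month))
    return gpus_needed * gpu_monthly_rent


# Hypothetical figures, chosen only to illustrate the shape of the comparison.
PRICE_PER_MILLION_TOKENS = 2.0      # USD, closed API
GPU_MONTHLY_RENT = 1_500.0          # USD per rented GPU
TOKENS_PER_GPU_PER_MONTH = 2_000_000_000

for volume in (100_000_000, 1_000_000_000, 5_000_000_000):
    api = api_monthly_cost(volume, PRICE_PER_MILLION_TOKENS)
    gpu = rented_gpu_monthly_cost(volume, TOKENS_PER_GPU_PER_MONTH, GPU_MONTHLY_RENT)
    cheaper = "API" if api < gpu else "rented GPU"
    print(f"{volume:>13,} tokens/month  API: {api:>9,.0f}  GPU: {gpu:>9,.0f}  cheaper: {cheaper}")
```

With these made-up figures, the API is cheaper at 100 million tokens per month (200 versus 1,500), while rented capacity wins at 1 billion (2,000 versus 1,500) and at 5 billion (10,000 versus 4,500). The sketch ignores engineering time, idle capacity and model quality, all of which can shift the break-even point.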

This doesn’t mean closed APIs will disappear. The most likely scenario is that both worlds coexist. For very hard tasks, many companies will keep using closed frontier models. But for a huge share of day-to-day work, a well-deployed open-weight model can be enough.

The common mistake: thinking local AI only means “on my PC”

When someone hears “local model”, they usually picture Ollama on a Mac, a PC with a powerful GPU or a gaming tower. That part exists and will grow, but it’s not the big market.

The big market is in companies that need to serve thousands or millions of queries, startups embedding AI into their product, banks that want privacy, hospitals handling sensitive data, law firms, e-commerce, call centers, SaaS platforms and autonomous agents running 24/7.

None of that is solved with a laptop. It requires servers, GPUs, storage, fast networks, cooling, power and operations.

That’s why the complete thesis isn’t “there will be good local models”. The complete thesis is: there will be open-weight models that are good enough, and companies will need to rent infrastructure to run them at scale.

Who wins if this trend plays out

The value chain of this new stage of AI can be broken down into several layers. Each one has different opportunities and risks.

1. Chip makers

The most obvious name is NVIDIA. There’s also AMD, the in-house chips from Google, Amazon and Microsoft, plus specialized accelerators. Without GPUs or accelerators, there’s no modern AI at scale.

This part is already well known to the market. NVIDIA was the big visible winner of the first wave. The question now is which companies can benefit from the next layer: those who rent, operate and deploy that capacity.

2. AI-specialized clouds

This is where companies like CoreWeave and Nebius come in. Their platforms are designed for AI workloads: training, fine-tuning, inference, GPU clusters, low-latency networks and model deployments.

They are very direct bets. If a startup or company wants to run its own model without buying a data center, it can rent capacity on one of these platforms.

3. Clouds for developers and SMEs

DigitalOcean holds a different position. It doesn’t compete the same way as AWS, Azure or Google Cloud on large enterprise contracts, but it has a very strong relationship with developers, small startups and SMEs that want simplicity.

If local AI becomes democratized, many teams won’t start by building complex clusters. They’ll look for an easy-to-deploy GPU, preconfigured models, clear costs and a simple experience.

4. Traditional hyperscalers

Amazon, Microsoft, Google and Oracle also win with this trend. They have customers, data centers, enterprise contracts and financial muscle. Oracle, in particular, has an interesting position in bare metal infrastructure and corporate clients.

The difference is that these companies are enormous. AI cloud can grow a lot, but within Amazon or Microsoft the impact is diluted more than in a smaller, more focused company.

5. Data centers, power and physical capacity

Companies like IREN and Applied Digital target another part of the problem: the physical bottleneck. AI doesn’t only consume software. It consumes power, land, cooling, permits, fiber, transformers and data centers built for high density.

If the world needs “AI factories”, whoever manages to build and operate that infrastructure can play a very relevant role.

Companies that fit this thesis

CoreWeave: the pure AI cloud

CoreWeave is one of the clearest names within AI infrastructure. Its business is based on renting cloud capacity optimized for artificial intelligence. More inference, more training and more in-house models mean more potential demand for this type of provider.

The upside is its direct exposure. The downside is that the business demands an enormous amount of capital: GPUs, data centers, power, networks, contracts and financing. It can grow a lot, but it can also suffer if the market punishes spending or if demand doesn’t advance at the expected pace.

Nebius: a neocloud with direct exposure

Nebius also fits very well into the AI neocloud idea. Its focus is on infrastructure for training and deploying models, both for advanced startups and for companies that need powerful capacity.

Its appeal lies in the purity of the thesis. Its risk lies in execution: high capex, the need for financing and pressure to scale fast without destroying margins.

DigitalOcean: the bridge to developers and SMEs

DigitalOcean can be important if local AI becomes more accessible to developers and small companies. It isn’t necessarily the option for the largest clusters, but it can be a natural entry point for deploying open-weight models, setting up inference services and embedding AI into small SaaS products.

Its role isn’t to be the most explosive bet, but to represent the democratization of AI in the cloud.

Oracle: the enterprise leg

Oracle isn’t a pure bet, but it is a more stable leg of the thesis. OCI has positioned itself with powerful cloud infrastructure, bare metal and relationships with large companies.

In regulated sectors such as banking, healthcare, public administration or large corporations, Oracle can benefit from demand for private AI and controlled environments.

IREN: power, data centers and aggressive GPU cloud

IREN is one of the most speculative bets. Its roots are in energy infrastructure and bitcoin mining, but it is pivoting toward AI and data centers.

Its appeal is obvious: if AI needs more and more power and physical capacity, companies with access to electricity and land can become key pieces. But the risk is also high: financing, debt, dilution, project execution and customer acquisition.

Applied Digital: physical infrastructure for AI

Applied Digital completes the thesis from the physical side. It designs, builds and operates data centers for AI, HPC and cloud. If GPU-ready data centers are in short supply, companies like this can benefit.

The risk is that building data centers is expensive, slow and complex. You have to secure power, permits, cooling, financing and customers to fill the contracted capacity.

Why open-weight models can shift the balance

For a long time, the best models were closed. If you wanted maximum quality, you used OpenAI, Anthropic or Google. Local models were useful, but they lagged behind.

That’s changing. Families like Llama, Mistral, Qwen, DeepSeek or Kimi have shown that open-weight models can get very close to proprietary ones on many tasks. They don’t always win, but each generation narrows the gap.

And that changes the business decision. A company doesn’t always need the most powerful model on the planet. It needs a model that does its job well, is secure and customizable, and comes at a reasonable cost.

If an open-weight model is good enough for 70%, 80% or 90% of certain internal workflows, many companies will use it on their own or rented infrastructure. That shift increases demand for GPU servers.

More efficient models can increase demand, not reduce it

A common objection is that models will become more and more efficient and, therefore, fewer GPUs will be needed. That seems logical in the short term: if each query consumes less, the infrastructure needed per request goes down.

But in technology the opposite usually happens: when something gets cheaper, it gets used far more.

If inference drops in price, more use cases will appear:

Agents working all day long.
AI embedded in every SaaS.
Automated customer service.
Enterprise semantic search.
Image, video, voice and code generation.
Robotics and industrial automation.
Internal copilots for employees.
Massive document analysis.
Multi-agent systems.

Even if each query costs less, the total number of queries can multiply. That’s why efficiency doesn’t necessarily destroy the thesis. It can broaden it.
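
A quick back-of-the-envelope calculation shows how this can play out. The numbers below are invented for illustration: they assume each query becomes four times cheaper in GPU time while the number of queries grows tenfold.

```python
def total_gpu_hours(queries: float, gpu_seconds_per_query: float) -> float:
    """Total GPU time needed to serve a given number of queries, in hours."""
    return queries * gpu_seconds_per_query / 3600


# Hypothetical "before": fewer queries, each one relatively expensive.
before = total_gpu_hours(queries=1_000_000, gpu_seconds_per_query=2.0)

# Hypothetical "after": each query is 4x cheaper, but usage grows 10x.
after = total_gpu_hours(queries=10_000_000, gpu_seconds_per_query=0.5)

print(f"GPU hours before: {before:,.0f}")
print(f"GPU hours after:  {after:,.0f}")
print(f"Change in total demand: {after / before:.1f}x")
```

Under these assumptions, total GPU demand rises 2.5 times even though each query is four times cheaper. Whether usage really grows faster than efficiency is an empirical question; the sketch only shows that lower cost per query and higher total demand are compatible.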

Risks of investing in infrastructure for local AI

The thesis is attractive, but it isn’t guaranteed. There are significant risks worth watching.

Overcapacity

If too many companies build data centers at the same time, oversupply can appear. That would pressure rental prices for GPUs and cloud capacity.

Debt and dilution

Many companies need large amounts of capital. They can issue debt, sell shares or sign complex agreements. That can hurt shareholders if growth doesn’t make up for it.

GPU obsolescence

Hardware ages fast. A cutting-edge GPU today can be less competitive within a few years. Refreshing capacity will be a constant requirement.
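
Both overcapacity and obsolescence show up in the same simple number: the hardware cost behind each GPU-hour actually rented. The sketch below is a deliberately simplified model with hypothetical inputs (purchase price, useful life, utilization); it ignores power, financing, staff and resale value.

```python
HOURS_PER_YEAR = 24 * 365


def hardware_cost_per_rented_hour(
    purchase_price: float,
    useful_life_years: float,
    utilization: float,
) -> float:
    """Spread the purchase price over the hours the GPU is actually rented out."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the interval (0, 1]")
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    rented_hours = useful_life_years * HOURS_PER_YEAR * utilization
    return purchase_price / rented_hours


# Hypothetical scenarios for the same GPU (price is made up).
healthy = hardware_cost_per_rented_hour(30_000, useful_life_years=5, utilization=0.8)
stressed = hardware_cost_per_rented_hour(30_000, useful_life_years=3, utilization=0.5)

print(f"Long life, high utilization:  {healthy:.2f} per rented hour")
print(f"Short life, low utilization:  {stressed:.2f} per rented hour")
```

In this toy model, a shorter useful life and lower utilization together more than double the hardware cost that each rented hour has to recover, which is why utilization and refresh cycles matter so much to these businesses.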

Dependence on few customers

Some infrastructure companies have huge contracts with few customers. That can be positive if everything goes well, but dangerous if a customer renegotiates, delays or cancels.

Competition from hyperscalers

AWS, Azure, Google Cloud and Oracle also build capacity. If they cut prices or integrate their services better, they can pressure the neoclouds.

Closed APIs getting ever cheaper

If OpenAI, Anthropic or Google cut prices significantly, some companies will prefer not to bother deploying their own models.

What signals to watch

To know whether this thesis is advancing, looking at the stock price isn’t enough. You have to watch operational signals.

CoreWeave and Nebius: backlog, contracts, GPU utilization, gross margin, debt, capex and customer diversification.
DigitalOcean: AI-linked revenue, GPU Droplet adoption, customer retention and ease of deployment.
Oracle: OCI growth, AI cloud contracts and enterprise adoption.
IREN and Applied Digital: contracted MW, real delivery dates, financing, investment-grade customers, construction cost, available power and dilution.

How to organize a basket of companies

A simple way to organize this opportunity is by layers:

Pure AI cloud core: CoreWeave and Nebius.
Developer and SME layer: DigitalOcean.
Enterprise layer: Oracle.
Physical and energy layer: IREN and Applied Digital.

They don’t all carry the same risk. CoreWeave and Nebius are more direct. DigitalOcean can benefit from democratization. Oracle adds stability. IREN and Applied Digital are more aggressive and depend heavily on physical execution.

Conclusion: local AI will also live in the cloud

Local AI doesn’t mean everything will run at home. The big version of this trend will be companies, startups and developers running their own models on rented infrastructure.

If open-weight models keep improving, many companies will want independence, privacy and control. To achieve that they’ll need GPUs, servers, data centers, power, cooling and cloud software.

That supports a clear thesis for companies like CoreWeave, Nebius, DigitalOcean, Oracle, IREN and Applied Digital. The opportunity can be enormous, but so can the risk. There will be cycles, sharp drops, capital raises, debt and doubts about overcapacity.

But the underlying direction makes sense: if AI becomes a basic layer of the economy, someone will have to host it. And not all of that AI will live in the giants’ closed APIs. Much of it will live in companies’ own models, deployed on rented servers.

The future won’t only be “using ChatGPT”. It will also be thousands of companies running their own models on private clouds, rented GPUs and specialized data centers.

And your business? If you’re thinking about making the leap to AI with real control over your data, at DominaInternet we help you set up your own AI infrastructure, connect it with your processes through automation and integrate it into your management software. Tell us about your case and we’ll prepare a no-commitment quote, or write to us directly through our contact page.