Compute Autoscaling Report

Summary

Production databases on Neon use 2.4x less compute and 50% less cost than if they were running on a provisioned platform.
Putting the same Neon production workloads on a provisioned platform would result in 55 performance degradations per db per month because even provisioning at P99.5 + 20% doesn't account for the most extreme load spikes.
Read replicas on Neon use 4x less compute than if they were running on a provisioned platform because of how well autoscaling aligns with their use cases.
Running the same small scale-to-zero workloads on provisioned would cost 7.5x more than Neon.

We arrive at these numbers by comparing the amount of compute used on Neon to the amount of compute it would take to run the same workloads on a provisioned (non-autoscaling) platform like RDS or Heroku. These numbers use December 2025 data.

Note: This report focuses on compute autoscaling only. Storage also autoscales seamlessly on Neon, and customers are only charged for the exact storage they use.

About Autoscaling

The amount of CPU, memory and local disk needed to run any database changes constantly over time. For example, here is the compute utilization of a typical database over a 24-hour period:

In the chart, a CU is an index of CPU, memory, and local file cache (LFC) utilization. 1 CU ≈ 1 CPU, 4GB RAM.

How much compute should you buy to run this database?

On an autoscaling platform, compute is allocated automatically, while on a provisioned (non-autoscaling) platform the user must decide how much compute to buy.

A provisioned platform is one where databases run on instances with a fixed amount of CPU, memory, and sometimes even disk. To help provisioned database users make informed compute size decisions, AWS has an RDS Rightsizing tool that works by finding the P99.5 CPU and memory utilization over a lookback window and adding 20%.

Here's what P99.5 + 20% looks like for our example database:

You can think of orange as wasted compute. It's cloud compute that we paid for, but it delivered zero value. It just sat there idle.

Even when we over-provision, we see two performance degradations in red. This is because the AWS rightsizing algorithm provisions 20% above the P99.5 value of resource utilization, so the most extreme 0.5% of resource peaks may still exhaust the available resources.

To save money, we could also under-provision (i.e., buy a smaller instance):

But now we see even more incidents in red where the database needs more compute than what is available. At these points we might experience degraded performance, even total failure.

Autoscaling removes the sizing decision from the customer. It uses an algorithm to buy and allocate the right amount of compute at each point in time to optimally run the database workload.

As you can see, with autoscaling we have:

Less wasted compute - The area in green (compute that was bought but not used) is minimal.
Little or no resource exhaustion - There are no points where the workload needed more compute than what was available.

Throughout the rest of this report, we focus on the difference between the amount of compute (and costs) used in an autoscaling vs provisioned database platform running the same workloads.

Production Autoscaling

Most production databases have a predictable periodic pattern of load, especially at 24-hour and 7-day intervals. Here is the autoscaling history of a Neon database that illustrates it well:

Three patterns are visible:

Intra-day: Within a 24-hour period, load hits a mid-day peak and a nightly trough.
Weekend: On the weekend, load is noticeably lower.
Daily spike: A scheduled task causes a spike at the same time most days.

Production Statistics

When we take every production database on Neon and run the AWS RDS rightsizing algorithm on each one using its autoscaling history from December 2025, we can calculate the equivalent compute usage and cost. For this report, a database is classified as production if it is running at 1 CU or more on average.

Compute

Across the entire Neon platform in December 2025, the average production database used 2.4x less compute than if sized at 20% above P99.5 load on a provisioned platform like RDS.

The average production database on Neon uses 2.4x less compute than its provisioned equivalent: 96 CU-hours provisioned versus 40 CU-hours with autoscaling.

Cost

When we factor in the cost of each production database (which varies depending on whether the account is on the Scale or Launch plan) and compare it with a conservative $0.1/CU-hour equivalent for provisioned databases, that equates to 50% lower compute costs on Neon on average.

Why are cost savings smaller than compute savings?

Provisioned platforms run Postgres for you on a Virtual Machine (VM) managed by the provider. So the cost of compute in provisioned closely tracks commodity VM prices.

Autoscaling platforms run a distributed system that automatically handles high availability and keeps warm pools of capacity ready for databases that are autoscaling up. This requires additional compute and puts a small price premium on the base CU-hour rate for autoscaling relative to provisioned.

See methodology for our exact approach and rationale for estimating cost.
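The relationship between compute savings and cost savings can be sketched with a few lines of arithmetic. This is an illustrative calculation, not the report's exact method: the function names are hypothetical, the CU-hour figures are the averages from the Compute section, and the two Neon rates are the per-plan rates quoted elsewhere in this report. The reported 50% is averaged across individual databases, so it will not match either single-rate result exactly.

```python
def cost_savings(prov_cu_hours: float, neon_cu_hours: float,
                 prov_rate: float, neon_rate: float) -> float:
    """Fraction of cost saved with autoscaling relative to provisioned."""
    prov_cost = prov_cu_hours * prov_rate
    neon_cost = neon_cu_hours * neon_rate
    return 1 - neon_cost / prov_cost


def break_even_rate(prov_cu_hours: float, neon_cu_hours: float,
                    prov_rate: float) -> float:
    """Autoscaling rate per CU-hour at which both platforms cost the same."""
    return prov_rate * prov_cu_hours / neon_cu_hours


# Averages from the Compute section and the provisioned rate used in this report.
PROV_CU_HOURS = 96.0
NEON_CU_HOURS = 40.0
PROV_RATE = 0.1

# The two Neon per-plan rates quoted in this report.
savings_high_rate = cost_savings(PROV_CU_HOURS, NEON_CU_HOURS, PROV_RATE, 0.222)
savings_low_rate = cost_savings(PROV_CU_HOURS, NEON_CU_HOURS, PROV_RATE, 0.106)
max_rate = break_even_rate(PROV_CU_HOURS, NEON_CU_HOURS, PROV_RATE)

print(f"savings at $0.222/CU-hour: {savings_high_rate:.1%}")
print(f"savings at $0.106/CU-hour: {savings_low_rate:.1%}")
print(f"break-even autoscaling rate: ${max_rate:.3f}/CU-hour")
```

Under these assumptions, autoscaling stays cheaper as long as its CU-hour rate is below the provisioned rate multiplied by the compute ratio (here 2.4 times $0.1).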

Performance Degradations

Database compute loads can be spiky. Operations like index creates, schema changes and migrations, bulk exports, and even just user load patterns can cause spikes in memory and CPU in particular. When we follow the AWS rightsizing algorithm and provision at P99.5 + 20%, the top 0.5% of loads are often spiky enough to exceed that 20% buffer.

When we counted up the number of times each production database on Neon autoscaled up beyond the provisioned P99.5 + 20% equivalent, we found that the average production database would experience 55 incidents per month where compute resources would be exhausted if it were running on a provisioned platform.

Autoscaling helps turn load spikes that would cause late-night on-call pages and customer-facing issues on a provisioned platform into a few extra pennies in cost on Neon.

Autoscaling Events per Database

The average production database running on Neon adjusts its compute size 32,016 times per month, or about once every 81 seconds. To understand how it works, the documentation on the Neon autoscaling algorithm is the best place to start.

Production Example

Here is a detailed price comparison for a real Neon customer with a production workload.

The results: Provisioned uses 3.5x more compute to serve the same workload, because much of the time only a fraction of the allocated resources is being used. Translating that to costs, this workload incurs 37% lower cost on Neon thanks to autoscaling ($217.16/month versus an estimated $345.60/month). We're using the $0.222/CU-hour rate from the Neon Scale plan, which is recommended for businesses, and a conservative $0.1/CU-hour rate for provisioned instances like RDS.

Not only is autoscaling cheaper and more efficient, but this exact workload running on a provisioned platform at exactly the AWS-recommended P99.5 + 20% compute utilization would experience ~73 performance degradations per month as a result of exhausting the allocated resources.

Checking the math with actual RDS instances

The compute for this exact database costs $217.16/month on Neon. The closest latest-generation m-series RDS instance that fits the provisioned specs necessary to run this workload is db.m8g.2xlarge, with 8 CPU and 32GB RAM at $0.672/hour. That costs $504/month, which is even more expensive than our $345.60/month estimate.

This highlights another weak point of provisioned databases: you can't buy exactly the compute you need. There is no 4.8CPU 19GB RAM RDS instance, so you are forced to "round up" to the next larger instance.
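The figures in this example can be cross-checked with a short calculation. This is a sketch under stated assumptions: a 720-hour (30-day) month, the CU-hour rates quoted above, and a power-of-two vCPU ladder that is assumed here for illustration rather than taken from an RDS catalog. The variable and function names are hypothetical.

```python
HOURS_PER_MONTH = 720  # assumption: 30-day month

# Monthly costs and rates quoted in this example.
PROV_ESTIMATE = 345.60   # $/month, provisioned estimate
PROV_RATE = 0.1          # $/CU-hour
NEON_COST = 217.16       # $/month
NEON_RATE = 0.222        # $/CU-hour (Scale plan)

# Assumed instance ladder in vCPUs, for illustration only.
VCPU_LADDER = [2, 4, 8, 16]


def round_up_to_instance(needed_vcpu: float, ladder=VCPU_LADDER) -> int:
    """Smallest ladder size that covers the needed vCPU count."""
    return next(size for size in sorted(ladder) if size >= needed_vcpu)


prov_cu_hours = PROV_ESTIMATE / PROV_RATE          # CU-hours implied by the estimate
avg_prov_cu = prov_cu_hours / HOURS_PER_MONTH      # constant provisioned size in CU
neon_cu_hours = NEON_COST / NEON_RATE              # CU-hours implied by the Neon bill
compute_ratio = prov_cu_hours / neon_cu_hours      # provisioned vs autoscaled compute
instance_vcpu = round_up_to_instance(avg_prov_cu)  # what you would actually have to buy

print(f"provisioned size: {avg_prov_cu:.1f} CU, {avg_prov_cu * 4:.1f} GB RAM")
print(f"compute ratio: {compute_ratio:.2f}x")
print(f"rounded-up instance: {instance_vcpu} vCPU")
```

With these inputs the provisioned size works out to about 4.8 CU, the compute ratio to roughly 3.5x, and the rounded-up instance to 8 vCPU, in line with the figures in the text.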

Read Replicas

Neon read replicas are different from those on provisioned platforms because they don't replicate or duplicate data. They read from the same storage as the primary compute. This has a few benefits:

Feature	Neon	Provisioned
Storage costs	No increase when adding replicas	Adding a replica doubles storage costs
Compute scaling	Each replica independently autoscales and scales to zero	Replicas typically sized similarly to primary to avoid issues
Creation time	Seconds, regardless of database size	Hours for large databases

This makes read replicas on Neon particularly valuable not just for horizontally scaling reads, but also for offloading ad-hoc or analytical queries and anything else that you may not want to impact primary performance.

Read Replica Statistics

When we apply the same comparison logic as we did with production databases above, we find that read replicas on Neon are 4x more efficient than if they were running on a provisioned platform, and 78% lower cost.

The average read replica on Neon uses 4x less compute than its provisioned equivalent: 160 CU-hours provisioned versus 40 CU-hours with autoscaling.

Read replicas are more compute-efficient than standard production databases because of the different ways they are used: The compute efficiency of a read replica that is only used to scale out reads is fairly similar to the 2.4x stat we saw in the standard production category. But many read replicas on Neon have particularly spiky loads, leading us to infer that they are likely used for things like analytics, ad-hoc analysis, and batch work. The spikier the workload, the more pronounced the compute savings relative to a provisioned platform.

This efficiency even accounts for cases where read replicas are created and destroyed on-demand in Neon. If a replica only exists for one day, we only compare it to one day of provisioned cost.

Scale to Zero

With scale to zero, one of the features unique to Neon, compute can be configured to shut down entirely when there are no active connections and turn back on in 350ms when needed. Many small databases have an autoscaling history that looks like the one below, oscillating between a minimum configured size and zero:

This pattern shows up mostly in non-production databases: dev and staging DBs that shut down outside work hours, but also prototypes, side projects, early MVPs, and so on. It takes a surprising amount of activity to keep a database running 24/7.

Scale to Zero Statistics

If we tally up the compute used by small non-production databases that scale to zero on Neon and compare it with the compute required to run the same databases continually on a provisioned platform like RDS, we find that the savings are even more extreme than for production databases.

Compute

A provisioned platform that cannot scale to zero would use 13.7x more compute to run the same small database workloads as Neon. This is using the same P99.5 + 20% methodology as before.

Costs

When we factor in costs using the rates of each database on Neon ($0.222 or $0.106 per CU-hour depending on the plan) and a conservative $0.065 per CU-hour equivalent on RDS, we find that scale-to-zero reduces costs by 7.5x. The savings numbers from scale to zero are dramatic enough to make it clear that this feature is changing customer behavior. Scale to zero changes the equation on what types of database usage patterns are economically viable.

Scale to Zero Example

Here is an actual autoscaling history from a scale-to-zero database on Neon:

Because of how often this database goes idle and scales to zero, this exact workload only uses 25 CU-hours/month on Neon. (Because it is running at 0.25 CU when it's on, that means it's active for 100 hours per month.) That drives the cost down to $2.68/month.

Provisioned platforms cannot scale to zero, so your best option for this workload is to buy the smallest instance that fits the workload (zero over-provisioning). Using that approach, running a similar workload on RDS would use 7.1x more compute and cost 4.4x more.

Checking the math with actual RDS instances

The smallest instance we can buy on RDS is the db.t4g.micro which runs $11.68 per month.
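The arithmetic behind this example can be reproduced approximately as follows. This is a sketch, assuming a 720-hour (30-day) month and the rounded 25 CU-hours figure; the constant names are hypothetical. Because the inputs are rounded, the results land near, not exactly on, the 7.1x and 4.4x quoted above.

```python
HOURS_PER_MONTH = 720     # assumption: 30-day month

ACTIVE_CU = 0.25          # size while the database is on
ACTIVE_HOURS = 100        # hours active per month
NEON_MONTHLY_COST = 2.68  # $/month, from the example
PROVISIONED_RATE = 0.065  # $/CU-hour, small-database rate used in this report

# Neon only consumes compute while the database is active.
neon_cu_hours = ACTIVE_CU * ACTIVE_HOURS

# A provisioned instance of the same size runs for the whole month.
prov_cu_hours = ACTIVE_CU * HOURS_PER_MONTH

compute_ratio = prov_cu_hours / neon_cu_hours
prov_cost = prov_cu_hours * PROVISIONED_RATE
cost_ratio = prov_cost / NEON_MONTHLY_COST

print(f"Neon: {neon_cu_hours:.0f} CU-hours, provisioned: {prov_cu_hours:.0f} CU-hours")
print(f"compute ratio: {compute_ratio:.1f}x")
print(f"provisioned cost: ${prov_cost:.2f}/month, cost ratio: {cost_ratio:.1f}x")
```

The estimated provisioned cost of about $11.70/month is close to the $11.68/month price of the smallest RDS instance mentioned above.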

Methodology

Conservative Estimates

We've been careful to make these numbers as conservative as possible. For example:

We ignore the fact that Neon comes with storage durability and high availability built-in, while provisioned platforms require you to triple your compute footprint to get durability.
We compute the size of provisioned instance needed per database each month. That assumes on a provisioned platform the operator would be resizing the database monthly for maximum efficiency.
When a Neon database scales to zero and never comes back on, we immediately stop tallying up equivalent provisioned costs. In reality many idle databases on provisioned platforms are forgotten about until an invoice or audit exposes them and someone manually terminates them.

Classifying workloads

Production: Any database that averages 1 CU or more (that is, at least 1 CU-hour consumed per hour).
Scale to Zero/Non-Production: Any database that averages less than 1 CU and is running less than 95% of the time.

We've excluded all databases on the Neon Free Plan from this analysis.
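These rules can be expressed as a small function. This is a minimal sketch: the function name, parameter names, and the "excluded" and "unclassified" labels are assumptions for illustration. In particular, the report does not say how small databases that run 95% of the time or more are treated, so they are labeled "unclassified" here.

```python
def classify_workload(avg_cu: float, fraction_running: float,
                      on_free_plan: bool = False) -> str:
    """Classify a database using the thresholds described above.

    avg_cu: average compute size over the month, in CU.
    fraction_running: share of the month the compute was on (0.0 to 1.0).
    """
    if on_free_plan:
        return "excluded"          # Free Plan databases are left out of the analysis
    if avg_cu >= 1:
        return "production"
    if fraction_running < 0.95:
        return "scale-to-zero"
    return "unclassified"          # assumption: not covered by either definition


print(classify_workload(avg_cu=2.5, fraction_running=1.0))
print(classify_workload(avg_cu=0.05, fraction_running=0.14))
print(classify_workload(avg_cu=0.25, fraction_running=1.0))
```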

Sizing workloads

We use P99.5 + 20% as the default over-provisioning setting, following the default logic of the AWS RDS rightsizing tool. To compute the P99.5 + 20% for each database we:

Start with the dataset of that endpoint's autoscaling history for the month.
Discard the 0.5% of time where the database was scaled largest.
Find the maximum remaining size.
Add 20% to it.

So if a database spent 1% of time scaled up to 8CU, the P99.5 would be 8CU and the P99.5 + 20% would be 9.6CU. If a database spent only 0.25% of time scaled up to 8CU the P99.5 would be lower.
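The steps above can be sketched as a time-weighted percentile. This is an illustrative implementation, not the code used for the report: the function name and the input layout (a list of `(cu_size, seconds_at_that_size)` pairs) are assumptions.

```python
from typing import List, Tuple

# Hypothetical layout: (compute size in CU, seconds spent at that size).
History = List[Tuple[float, float]]


def provisioned_equivalent(history: History,
                           discard_fraction: float = 0.005,
                           headroom: float = 0.20) -> float:
    """Return the P99.5 + 20% size for one month of autoscaling history."""
    if not 0 <= discard_fraction < 1:
        raise ValueError("discard_fraction must be in [0, 1)")
    total = sum(seconds for _, seconds in history)
    if total <= 0:
        raise ValueError("history must cover a positive amount of time")

    budget = total * discard_fraction  # amount of time we may discard
    discarded = 0.0
    # Walk from the largest size down, discarding time until the budget is spent.
    for size, seconds in sorted(history, key=lambda s: s[0], reverse=True):
        if discarded + seconds > budget:
            # This size cannot be fully discarded, so it is the maximum that remains.
            return size * (1 + headroom)
        discarded += seconds
    raise ValueError("history has no time left after discarding")


# 1% of the time at 8 CU: the peak survives the 0.5% discard.
spiky = [(8.0, 1.0), (2.0, 99.0)]
# Only 0.25% of the time at 8 CU: the peak is discarded entirely.
brief = [(8.0, 0.25), (2.0, 99.75)]

print(provisioned_equivalent(spiky))
print(provisioned_equivalent(brief))
```

In the first case the result is about 9.6 CU, matching the example in the text. In the second it is about 2.4 CU, because the brief 8 CU peak falls inside the discarded 0.5%.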

Provisioned costs

Small Databases - We used a $0.065 per CU-hour equivalent rate based on the equivalent hourly cost of small provisioned databases across RDS, Google Cloud SQL, Heroku, DigitalOcean and PlanetScale.
Large Databases - We used a $0.1 per CU-hour equivalent rate, derived from the equivalent hourly cost of larger production-grade instances on provisioned database platforms like RDS, Google Cloud SQL, Heroku, DigitalOcean and Aiven.

Counting Incidents

To get a count of performance degradation incidents, we:

Calculate the P99.5 + 20% "provisioned equivalent" size of each Neon database for each month
Count the number of distinct time periods where the autoscaling history showed the database scaling up to larger than the P99.5 + 20% size.

This means that if a database spent 1 minute above the P99.5 + 20% threshold, it would count as one incident, and if it scaled above and below the threshold for 5 seconds at a time on three separate occasions, it would count as three incidents.
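Counting distinct periods above the threshold amounts to counting upward crossings. Here is a minimal sketch, assuming the history is a time-ordered list of `(timestamp_seconds, cu_size)` samples. That layout and the function name are hypothetical.

```python
from typing import Iterable, Tuple


def count_incidents(samples: Iterable[Tuple[float, float]],
                    threshold: float) -> int:
    """Count distinct periods where size exceeds the threshold.

    samples: time-ordered (timestamp_seconds, cu_size) pairs.
    """
    incidents = 0
    above = False
    for _, size in samples:
        now_above = size > threshold
        if now_above and not above:
            incidents += 1  # a new excursion above the threshold starts here
        above = now_above
    return incidents


THRESHOLD = 9.6  # example P99.5 + 20% size in CU

# One continuous minute above the threshold: a single incident.
one_long = [(0, 2.0)] + [(t, 10.0) for t in range(5, 65, 5)] + [(65, 2.0)]

# Three separate 5-second excursions: three incidents.
three_short = [(0, 2.0), (5, 10.0), (10, 2.0),
               (15, 10.0), (20, 2.0),
               (25, 10.0), (30, 2.0)]

print(count_incidents(one_long, THRESHOLD))
print(count_incidents(three_short, THRESHOLD))
```

Note that duration does not matter in this count: a one-minute excursion and a five-second excursion each add exactly one incident.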

Terminology

—Autoscaling: The automated adjustment of compute resources to fit the needs of current load. Neon also autoscales storage, but this report focuses only on the compute side.
—Provisioned Database: A database that does not have compute autoscaling, where the user must select the CPU, RAM, (and storage) configuration upon creation.
—Compute Unit (CU): Used in autoscaling systems to refer to an allocation of compute. In Neon, 1CU = 1vCPU, 4GB RAM.
—CU-hour: A consumption unit corresponding to one hour of 1 CU. Autoscaling systems charge a rate per CU-hour, and a CU-hour can be consumed flexibly, e.g. running at 4CU for 15 minutes or 0.25 CU for 4 hrs.
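As a small worked example of the CU-hour definition, the helper below (a hypothetical name, for illustration only) converts a size and a duration into CU-hours and shows that both usages mentioned above consume the same amount.

```python
def cu_hours(cu: float, minutes: float) -> float:
    """CU-hours consumed by running at `cu` compute units for `minutes`."""
    return cu * minutes / 60.0


burst = cu_hours(cu=4, minutes=15)           # 4 CU for 15 minutes
trickle = cu_hours(cu=0.25, minutes=4 * 60)  # 0.25 CU for 4 hours

print(burst, trickle)
```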
