Summary
- Production databases on Neon use 2.4x less compute and 50% less cost than if they were running on a provisioned platform.
- Putting the same Neon production workloads on a provisioned platform would result in 55 performance degradations per db per month because even provisioning at P99.5 + 20% doesn't account for the most extreme load spikes.
- Read replicas on Neon use 4x less compute than if they were running on a provisioned platform because of how well autoscaling aligns with their use cases.
- Running the same small scale-to-zero workloads on provisioned would cost 7.5x more than Neon.
We arrive at these numbers by comparing the amount of compute used on Neon to the amount of compute it would take to run the same workloads on a provisioned (non-autoscaling) platform like RDS or Heroku. These numbers use December 2025 data.
Note: This report focuses on compute autoscaling only. Storage also autoscales seamlessly on Neon. (Customers are only charged for the exact storage they use.)
About Autoscaling
The amount of CPU, memory and local disk needed to run any database changes constantly over time. For example, here is the compute utilization of a typical database over a 24-hour period:
In the chart, a CU is an index of CPU, memory, and local file cache (LFC) utilization. 1 CU ≈ 1 CPU, 4GB RAM.
How much compute should you buy to run this database?
In an autoscaling platform compute is allocated automatically, while in a provisioned (non-autoscaling) platform the user must decide how much compute to buy.
A provisioned platform is one where databases run on instances with a fixed amount of CPU, memory, and sometimes even disk. To help provisioned database users make informed compute size decisions, AWS has an RDS Rightsizing tool that works by finding the P99.5 CPU and memory utilization over a lookback window and adding 20%.
Here's what P99.5 + 20% looks like for our example database:
You can think of orange as wasted compute. It's cloud compute that we paid for, but it delivered zero value. It just sat there idle.
Even when we over-provision, we see two performance degradations in red. This is because the AWS rightsizing algorithm provisions only 20% above the P99.5 value of resource utilization, so the most extreme 0.5% of resource peaks may still exhaust the available resources.
To save money, we could also under-provision (i.e., buy a smaller instance):
But now we see even more incidents in red where the database needs more compute than what is available. At these points we might experience degraded performance, even total failure.
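The trade-off between the two provisioning choices can be sketched in a few lines of Python. This is a minimal illustration, not Neon's or AWS's tooling: the function name `provisioned_outcome` and the hourly demand values are hypothetical, chosen only to show how a fixed instance size turns into idle compute on one side and exhausted resources on the other.

```python
def provisioned_outcome(demand_cu, provisioned_cu, sample_hours=1.0):
    """Compare a fixed instance size against a demand series (in CU).

    Returns (wasted_cu_hours, exhausted_samples).
    """
    wasted = 0.0
    exhausted = 0
    for demand in demand_cu:
        if demand > provisioned_cu:
            exhausted += 1  # the workload wanted more than the instance has
        else:
            wasted += (provisioned_cu - demand) * sample_hours  # paid for but idle
    return wasted, exhausted


# Hypothetical hourly demand for one day (illustrative values, not Neon data).
demand = [1, 1, 1, 1, 2, 2, 3, 4, 4, 5, 4, 4, 3, 3, 4, 7, 3, 2, 2, 2, 1, 1, 1, 1]

over = provisioned_outcome(demand, provisioned_cu=6)   # bigger instance
under = provisioned_outcome(demand, provisioned_cu=3)  # smaller instance
print(over, under)
```

With these made-up numbers, the larger instance wastes far more CU-hours but is exhausted only once, while the smaller instance wastes less and is exhausted in seven of the 24 samples.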
Autoscaling removes the sizing decision from the customer. It uses an algorithm to buy and allocate the right amount of compute at each point in time to optimally run the database workload.
As you can see, with autoscaling we have:
- Less wasted compute - The area in green (compute that was bought but not used) is minimal.
- Little or no resource exhaustion - There are no points where the workload needed more compute than what was available.
Throughout the rest of this report, we focus on the difference between the amount of compute (and costs) used in an autoscaling vs provisioned database platform running the same workloads.
Production Autoscaling
Most production databases have a predictable periodic pattern of load, especially at 24-hour and 7-day intervals. Here is the autoscaling history of a Neon database that illustrates it well:
Three patterns are visible:
- Intra-day: Within a 24hr period, load hits a mid-day peak and a nightly trough.
- Weekend: On the weekend, load is noticeably lower.
- Daily spike: A scheduled task causes a spike at the same time most days.
Production Statistics
When we take every production database on Neon and run the AWS RDS rightsizing algorithm on each one using their autoscaling history from December 2025, we can calculate the equivalent compute usage and cost. For this report, a database is classified as production if it is running at greater than 1CU on average.
Compute
Across the entire Neon platform in December 2025, the average production database used 2.4x less compute than if sized at 20% above P99.5 load on a provisioned platform like RDS.
Cost
When we factor in the cost of each production database (which varies depending on if the account is on the Scale or Launch plan) and compare it with a conservative $0.1/CU-hour equivalent for provisioned databases, that equates to 50% lower compute costs on Neon on average.
Why are cost savings smaller than compute savings?
Provisioned platforms run Postgres for you on a Virtual Machine (VM) managed by the provider. So the cost of compute in provisioned closely tracks commodity VM prices.
Autoscaling platforms run a distributed system that automatically handles high availability and keeps warm pools of capacity ready for databases that are autoscaling up. This requires additional compute and puts a small price premium on the base CU-hour rate for autoscaling relative to provisioned.
See methodology for our exact approach and rationale for estimating cost.
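To see how the price premium shrinks the savings, here is a rough Python sketch. It applies the platform-wide 2.4x compute figure to each of the two Neon rates quoted in this report ($0.222 and $0.106 per CU-hour) against the $0.1 provisioned rate. This is an illustration only: the 50% figure above is an average over individual databases, not the output of this formula, and the function name `cost_savings_pct` is hypothetical.

```python
def cost_savings_pct(compute_ratio, autoscaling_rate, provisioned_rate):
    """Percent saved on autoscaling when provisioned needs `compute_ratio`x the compute."""
    # Provisioned cost per CU-hour actually consumed on the autoscaling platform.
    provisioned_cost = compute_ratio * provisioned_rate
    return (1 - autoscaling_rate / provisioned_cost) * 100


higher_rate = cost_savings_pct(2.4, 0.222, 0.10)  # Scale plan rate
lower_rate = cost_savings_pct(2.4, 0.106, 0.10)   # the other quoted Neon rate
print(f"{higher_rate:.1f}% / {lower_rate:.1f}%")
```

Under these assumptions the same 2.4x compute advantage yields savings of roughly 7.5% at the higher rate and roughly 56% at the lower rate, which is why the plan mix matters for the average.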
Performance Degradations
Database compute loads can be spiky. Operations like index creates, schema changes and migrations, bulk exports, and even just user load patterns can cause spikes in memory and CPU in particular. When we follow the AWS rightsizing algorithm and provision at P99.5 + 20%, the top 0.5% of loads are often spiky enough to exceed that 20% buffer.
When we counted up the number of times each production database on Neon autoscaled up beyond the provisioned P99.5 + 20% equivalent, we found that the average production database would experience 55 incidents per month where compute resources would be exhausted if it were running on a provisioned platform.
Autoscaling helps turn load spikes that would cause late-night on-call pages and customer-facing issues on a provisioned platform into a few extra pennies in cost on Neon.
Autoscaling Events per Database
The average production database running on Neon adjusts its compute size 32,016 times per month, or about once every 81 seconds. To understand how it works, the documentation on the Neon autoscaling algorithm is the best place to start.
Production Example
Here is a detailed price comparison for a real Neon customer with a production workload.
The results: Provisioned uses 3.5x more compute to serve the same workload, because much of the time only a fraction of the allocated resources is being used. Translating that to costs, this workload costs about 37% less on Neon thanks to autoscaling ($217.16/month versus an estimated $345.60/month, so provisioned costs roughly 60% more). We're using the $0.222/CU-hour rate from the Neon Scale plan, recommended for businesses, and a conservative $0.1/CU-hour rate for provisioned instances like RDS.
Not only is autoscaling cheaper and more efficient, but this exact workload running on a provisioned platform at exactly the AWS-recommended P99.5 + 20% compute utilization would experience ~73 performance degradations per month as a result of exhausting the allocated resources.
Checking the math with actual RDS instances
The compute for this exact database costs $217.16/month on Neon.
The closest latest-generation m-series RDS instance that fits the provisioned specs needed to run this workload is db.m8g.2xlarge, with 8 CPU and 32GB RAM at $0.672/hour. That comes to $504/month, which is even more expensive than our $345.60/month estimate.
This highlights another weak point of provisioned databases: you can't buy exactly the compute you need. There is no 4.8 CPU, 19GB RAM RDS instance, so you are forced to "round up" to the next larger instance.
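The "round up" effect can be sketched as a lookup over fixed instance shapes. The list below is an abbreviated, assumed set of general-purpose m-series shapes (vCPU and RAM only, no prices), and `smallest_fitting_size` is a hypothetical helper, not an AWS API.

```python
# Assumed subset of m-series shapes: (size name, vCPU, RAM in GB).
M_SERIES_SIZES = [
    ("large", 2, 8),
    ("xlarge", 4, 16),
    ("2xlarge", 8, 32),
    ("4xlarge", 16, 64),
]


def smallest_fitting_size(required_cu):
    """Return the first shape with enough vCPU and RAM, at 4 GB RAM per CU."""
    required_ram_gb = required_cu * 4
    for name, vcpu, ram_gb in M_SERIES_SIZES:
        if vcpu >= required_cu and ram_gb >= required_ram_gb:
            return name, vcpu, ram_gb
    raise ValueError("no listed size is large enough")


name, vcpu, ram_gb = smallest_fitting_size(4.8)
overbuy = vcpu / 4.8  # how much more CPU is bought than the sizing rule asked for
print(name, vcpu, ram_gb, round(overbuy, 2))
```

For a 4.8 CU requirement the 4 vCPU shape is too small, so the sketch lands on the 8 vCPU, 32GB shape, about 1.67x the compute the sizing rule called for.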
Read Replicas
Neon read replicas are different from those on provisioned platforms because they don't replicate or duplicate data. They read from the same storage as the primary compute. This has a few benefits:
| Feature | Neon | Provisioned |
|---|---|---|
| Storage costs | No increase when adding replicas | Adding a replica doubles storage costs |
| Compute scaling | Each replica independently autoscales and scales to zero | Replicas typically sized similarly to primary to avoid issues |
| Creation time | Seconds, regardless of database size | Hours for large databases |
This makes read replicas on Neon particularly valuable not just for horizontally scaling reads, but also for offloading ad-hoc or analytical queries and anything else that you may not want to impact primary performance.
Read Replica Statistics
When we apply the same comparison logic as we did with production databases above, we find that read replicas on Neon are 4x more compute-efficient than if they were running on a provisioned platform, and cost 78% less.
Read replicas are more compute-efficient than standard production databases because of the different ways they are used: The compute efficiency of a read replica that is only used to scale out reads is fairly similar to the 2.4x stat we saw in the standard production category. But many read replicas on Neon have particularly spiky loads, leading us to infer that they are likely used for things like analytics, ad-hoc analysis, and batch work. The spikier the workload, the more pronounced the compute savings relative to a provisioned platform.
This efficiency even accounts for cases where read replicas are created and destroyed on-demand in Neon. If a replica only exists for one day, we only compare it to one day of provisioned cost.
Scale to Zero
One feature unique to Neon is scale to zero: compute can be configured to shut down entirely when there are no active connections and to turn back on in 350ms when needed. Many small databases have an autoscaling history that looks like the one below, oscillating between a minimum configured size and zero:
This pattern shows up mostly in non-production databases: dev and staging DBs that shut down outside work hours, but also prototypes, side projects, early MVPs, etc. It takes a surprising amount of activity to keep a database running 24/7.
Scale to Zero Statistics
If we tally up the compute used by small non-production databases that scale to zero on Neon and compare it with the compute required to run the same databases continually on a provisioned platform like RDS, we find that the savings are even more extreme than for production databases.
Compute
A provisioned platform that cannot scale to zero would use 13.7x more compute to run the same small database workloads as Neon. This is using the same P99.5 + 20% methodology as before.
Costs
When we factor in costs using the rates of each database on Neon ($0.222 or $0.106 per CU-hour depending on the plan) and a conservative $0.065 per CU-hour equivalent on RDS, we find that scale-to-zero reduces costs by 7.5x. The savings numbers from scale to zero are dramatic enough to make it clear that this feature is changing customer behavior. Scale to zero changes the equation on what types of database usage patterns are economically viable.
Scale to Zero Example
Here is an actual autoscaling history from a scale-to-zero database on Neon:
Because of how often this database goes idle and scales to zero, this exact workload only uses 25 CU-hours/month on Neon. (Because it is running at 0.25 CU when it's on, that means it's active for 100 hours per month.) That drives the cost down to $2.68/month.
Provisioned platforms cannot scale to zero, so your best option for this workload is to buy the smallest instance that fits the workload (zero over-provisioning). Using that approach, running a similar workload on RDS would use 7.1x more compute and cost 4.4x more.
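The ratios above can be reproduced approximately with a short calculation. This sketch rests on assumptions that the report does not state for this specific database: a 720-hour (30-day) month, billing at the $0.106 per CU-hour Neon rate, and the $0.065 per CU-hour small-database provisioned rate from the methodology.

```python
HOURS_PER_MONTH = 720     # assumption: 30-day month
NEON_RATE = 0.106         # assumption: this database is billed at the lower Neon rate
PROVISIONED_RATE = 0.065  # small-database provisioned rate used in this report
RUNNING_SIZE_CU = 0.25    # size while the database is on

neon_cost = 2.68                                # monthly cost quoted above
neon_cu_hours = neon_cost / NEON_RATE           # a little over 25 CU-hours
active_hours = neon_cu_hours / RUNNING_SIZE_CU  # roughly 100 hours switched on

# A provisioned instance of the same size runs for the whole month.
provisioned_cu_hours = RUNNING_SIZE_CU * HOURS_PER_MONTH
provisioned_cost = provisioned_cu_hours * PROVISIONED_RATE

compute_ratio = provisioned_cu_hours / neon_cu_hours
cost_ratio = provisioned_cost / neon_cost
print(round(compute_ratio, 1), round(cost_ratio, 1))
```

Under those assumptions the sketch lands on about 7.1x the compute and about 4.4x the cost, matching the figures in the text.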
Checking the math with actual RDS instances
The smallest instance we can buy on RDS is the db.t4g.micro which runs $11.68 per month.
Methodology
Conservative Estimates
We've been careful to make these numbers as conservative as possible. For example:
- We ignore the fact that Neon comes with storage durability and high availability built-in, while provisioned platforms require you to triple your compute footprint to get durability.
- We compute the size of provisioned instance needed per database each month. That assumes on a provisioned platform the operator would be resizing the database monthly for maximum efficiency.
- When a Neon database scales to zero and never comes back on, we immediately stop tallying up equivalent provisioned costs. In reality many idle databases on provisioned platforms are forgotten about until an invoice or audit exposes them and someone manually terminates them.
Classifying workloads
- Production: Any database with an average CU-hour rate greater than or equal to 1.
- Scale to Zero/Non-Production: Any database with an average CU-hour rate less than 1 that is running less than 95% of the time.
We've excluded all databases on the Neon Free Plan from this analysis.
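As a minimal sketch, the classification rules above could be written as follows. The function and parameter names are hypothetical. Databases that match neither rule (for example, small databases that run nearly all the time) are returned as `None`, which is an assumption about how unlisted cases are treated.

```python
def classify(avg_cu, fraction_of_time_running, on_free_plan=False):
    """Classify a database using the thresholds listed above."""
    if on_free_plan:
        return None  # Free Plan databases are excluded from the analysis
    if avg_cu >= 1:
        return "production"
    if fraction_of_time_running < 0.95:
        return "scale_to_zero"
    return None  # small but almost always on: not covered by either rule


examples = [
    classify(2.0, 1.00),
    classify(0.25, 0.14),
    classify(0.5, 0.99),
    classify(3.0, 1.00, on_free_plan=True),
]
print(examples)
```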
Sizing workloads
We use P99.5 + 20% as the default over-provisioning setting, following the default logic of the AWS RDS rightsizing tool. To compute the P99.5 + 20% for each database we:
- Start with the dataset of that endpoint's autoscaling history for the month.
- Discard the 0.5% of time where the database was scaled largest.
- Find the maximum remaining size.
- Add 20% to it.
So if a database spent 1% of time scaled up to 8CU, the P99.5 would be 8CU and the P99.5 + 20% would be 9.6CU. If a database spent only 0.25% of time scaled up to 8CU the P99.5 would be lower.
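The steps above can be sketched in Python as a time-weighted percentile. This is an illustrative reading of the described procedure, not the actual analysis code: the input format (a list of `(cu_size, duration)` segments) and the name `provisioned_equivalent` are assumptions.

```python
def provisioned_equivalent(history, discard_fraction=0.005, headroom=0.20):
    """Time-weighted P99.5 + 20% for a list of (cu_size, duration) segments.

    Durations can be in any unit as long as it is consistent.
    """
    history = list(history)
    total = sum(duration for _, duration in history)
    if total <= 0:
        raise ValueError("history must cover a positive amount of time")
    to_discard = total * discard_fraction
    discarded = 0.0
    # Walk from the largest size down, discarding time until the budget is spent.
    for size, duration in sorted(history, key=lambda seg: seg[0], reverse=True):
        discarded += duration
        if discarded > to_discard:
            # Part of this segment survives, so it is the largest remaining size.
            return size * (1 + headroom)
    raise ValueError("no segment survived the discard step")


# 1% of the month at 8 CU, the rest at 2 CU: P99.5 is 8 CU.
spiky = provisioned_equivalent([(2, 99.0), (8, 1.0)])
# Only 0.25% of the month at 8 CU: the spike is discarded entirely.
brief = provisioned_equivalent([(2, 99.75), (8, 0.25)])
print(spiky, brief)
```

The first call reproduces the 9.6 CU example from the text. In the second, the 8 CU time falls entirely inside the discarded 0.5%, so the result is based on the 2 CU baseline instead.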
Provisioned costs
- Small Databases - We used a $0.065 per CU-hour equivalent rate based on the equivalent hourly cost of small provisioned databases across RDS, Google Cloud SQL, Heroku, DigitalOcean and PlanetScale.
- Large Databases - We used a $0.1 per CU-hour equivalent rate by starting with an equivalent hourly cost of larger production-grade instances on provisioned database platforms like RDS, Google Cloud SQL, Heroku, DigitalOcean and Aiven.
Counting Incidents
To get a count of performance degradation incidents, we:
- Calculate the P99.5 + 20% "provisioned equivalent" size of each Neon database for each month
- Count the number of distinct time periods where the autoscaling history showed the database scaling up to larger than the P99.5 + 20% size.
This means that if a database spent 1 minute above the P99.5 + 20% threshold, it would count as one incident, and if it scaled above and below the threshold for 5 seconds at a time on three separate occasions, it would count as three incidents.
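A minimal sketch of this counting rule, assuming the autoscaling history is available as an ordered sequence of CU sizes (the function name `count_incidents` and the sample values are hypothetical): each run of consecutive samples above the threshold counts once, regardless of how long it lasts.

```python
def count_incidents(sizes, threshold):
    """Count distinct runs of consecutive samples strictly above `threshold`."""
    incidents = 0
    above = False
    for size in sizes:
        if size > threshold and not above:
            incidents += 1  # a new excursion above the threshold begins
        above = size > threshold
    return incidents


threshold = 9.6  # the P99.5 + 20% size for this hypothetical database
one_long = count_incidents([2, 2, 10, 10, 10, 2], threshold)
three_short = count_incidents([2, 10, 2, 10, 2, 10, 2], threshold)
print(one_long, three_short)
```

The long excursion counts as one incident and the three brief ones count as three, mirroring the example in the text.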
Terminology
- Autoscaling: The automated adjustment of compute resources to fit the needs of current load. Neon also autoscales storage, but this report focuses only on the compute side.
- Provisioned Database: A database that does not have compute autoscaling, where the user must select the CPU, RAM, (and storage) configuration upon creation.
- Compute Unit (CU): Used in autoscaling systems to refer to an allocation of compute. In Neon, 1CU = 1vCPU, 4GB RAM.
- CU-hour: A consumption unit corresponding to one hour of 1 CU. Autoscaling systems charge a rate per CU-hour, and a CU-hour can be consumed flexibly, e.g. running at 4CU for 15 minutes or 0.25 CU for 4 hrs.
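As a quick illustration of the CU-hour definition, the following sketch (with a hypothetical helper name) shows that both of the usage patterns mentioned consume the same amount:

```python
def cu_hours(cu_size, minutes):
    """CU-hours consumed by running at `cu_size` CU for `minutes` minutes."""
    return cu_size * minutes / 60


burst = cu_hours(4, 15)        # 4 CU for 15 minutes
trickle = cu_hours(0.25, 240)  # 0.25 CU for 4 hours
print(burst, trickle)
```

Both calls come out to exactly one CU-hour.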








