Amazon Bedrock Leaves Builders Stuck in 1st Gear
AWS's gap between promised and actual quotas is preventing customers from exploiting the full potential of Amazon Bedrock.

The title of today’s post is how AWS has left me feeling over the last few months. AWS lists the default quota for Anthropic’s Claude Sonnet 4.5 with cross-region inference as 1,000 requests per minute (RPM). I only get 50 RPM. For the newer Sonnet 4.6 model, the figures are 500 and 25 respectively. Getting the advertised quota is challenging.
Imagine this: you’re looking for a new car. You find one you like. The spec sheet says the top speed is 200 km/h. You buy the car. You’re excited.
Driving home, you find your new pride and joy won’t go faster than 10 km/h. When you finally get it home, you start to wonder if you did something wrong. You check the owner’s manual. It clearly shows the top speed is 200 km/h.
You Google the issue. You find forum posts full of owners having the same issue. The solution is buried 8 levels deep in the infotainment system menu. There is a screen that shows "Top Speed 10 km/h [contact support to increase]". You can’t believe it. You tap your way through the menus on your car. Sure enough, it too shows the top speed is 10 km/h.
You tap the contact support button. It pops up a little box asking you what you want the top speed to be. It says small increases will be approved quickly and higher ones might take longer. At first you think this whole situation is insane. You enter 60 km/h. The form fails validation. It states you must request at least the stated minimum speed of 200 km/h. You enter 200 km/h and hit submit.
2 minutes later your phone dings with a new email notification. It is from the car dealer. They acknowledge your ticket and say you should get a response in 48 hours.
Fast forward 2 to 3 weeks and you get an email saying they’d love to allow you to drive your car at 200 km/h, but first you need to answer a few questions. When you ask why you need to answer the questions to drive the car at the advertised speed, you’re informed it is for your own good and you need to answer the questions. The questions are:
- Average speed you expect to drive the car at:
- Highest speed you expect to drive the car at:
- Average distance of each trip you plan to make:
- Longest distance trip you plan to make:
- Number of trips you plan to make in an average week:
- Maximum number of trips you plan to make in a week:
- All surfaces you plan to drive the car on:
- Use cases for your car:
No one in their right mind would accept this from a car dealer. A customer shouldn’t have to jump through these hoops to use something as advertised. Unfortunately, this is what AWS expects customers to do for every single Bedrock quota request. For the Anthropic models alone there are dozens of these quotas. All up, Bedrock has hundreds of unique quotas. Each increase requires a separate request.
In my case I’m only being given 5% of the advertised quota levels. To get more I’m waiting weeks, only for the support team to demand I provide an abridged version of War and Peace each time.
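A rough sketch of how that gap can be measured, for anyone who wants to check their own account. The `quota_gaps` helper and the model labels are made up for illustration, and the numbers are simply the ones quoted in this post. The optional `fetch_bedrock_quotas` function assumes boto3 is installed, that AWS credentials are configured, and that the Service Quotas API reports the same values the console shows.

```python
def quota_gaps(defaults, applied):
    """Return each quota's applied value as a percentage of its default."""
    gaps = {}
    for name, default_value in defaults.items():
        # Skip quotas with no applied value or a zero default.
        if name in applied and default_value > 0:
            gaps[name] = 100 * applied[name] / default_value
    return gaps


def fetch_bedrock_quotas(region="us-east-1"):
    """Fetch (defaults, applied) Bedrock quota values, keyed by quota name."""
    import boto3  # imported lazily so quota_gaps() works without AWS access

    client = boto3.client("service-quotas", region_name=region)

    def collect(operation):
        values = {}
        paginator = client.get_paginator(operation)
        for page in paginator.paginate(ServiceCode="bedrock"):
            for quota in page["Quotas"]:
                values[quota["QuotaName"]] = quota["Value"]
        return values

    return (
        collect("list_aws_default_service_quotas"),
        collect("list_service_quotas"),
    )


# Figures quoted in this post, in requests per minute.
# The keys are shorthand labels, not real Bedrock quota names.
advertised = {"Claude Sonnet 4.5": 1000, "Claude Sonnet 4.6": 500}
actual = {"Claude Sonnet 4.5": 50, "Claude Sonnet 4.6": 25}

for name, percent in quota_gaps(advertised, actual).items():
    print(f"{name}: {percent:.0f}% of the advertised default")
```

Against a real account, `quota_gaps(*fetch_bedrock_quotas())` should produce the same comparison for every Bedrock quota the API returns.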
If Amazon doesn’t have enough GPUs to deliver on its advertised quotas, it should stop pretending it can. Reduce the published default quotas. At least then potential customers can make informed decisions based on facts, not false hopes. The same applies if this is about protecting customers from themselves. Lower the quotas rather than giving us a distorted picture of what’s really available.
It really feels like Amazon only wants Bedrock used by enterprise customers with an account management team. They’re the users with an escalation path. For the rest of us, it is starting to feel easier to go direct and use Anthropic’s APIs. Cut out the company in the middle that denies access to the resources.
I’m trying to build multiple products with Anthropic's models on Bedrock. The delayed responses and red tape not only slow me down, they’re demotivating. This isn't how a "customer obsessed" company behaves, let alone one that calls itself "the most customer-centric company in the world".
Until AWS can deliver on its published default quotas, I will struggle to recommend using third-party models on its platform. Unless you’re too big for AWS to ignore, you’ll be building with AI tools stuck in first gear.