Cosmos DB – Provisioning throughput

Before we look at provisioning throughput, the first step is to understand Cosmos DB pricing. Cosmos DB pricing is based on Request Units.

The cost of all database operations is normalized by Azure Cosmos DB and is expressed in Request Units (or RUs, for short).

You can think of RUs per second as the currency for throughput. RUs per second is a rate-based currency. It abstracts the system resources such as CPU, IOPS, and memory that are required to perform the database operations supported by Azure Cosmos DB.

Cosmos DB provides more than one way to provision throughput:

  1. Dedicated collection-level throughput
  2. Shared database-level throughput

Collection-level throughput provides dedicated request units per second (RU/s) for that collection. For example, if you have data coming from multiple IoT devices and one of them sends data at, say, 500 data points per second, then you need to provision dedicated capacity for that collection.
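As a rough sizing sketch for the IoT example above, the RU/s to provision is approximately the operations per second multiplied by the RU cost of each operation. The per-write cost of 5 RUs used here is an assumed figure for illustration only; the real charge depends on document size and indexing and should be measured. Rounding up to a multiple of 100 and the 400 RU/s floor are likewise assumptions about how manual throughput is set.

```python
import math


def estimate_rus_per_second(ops_per_second, ru_per_op, increment=100, minimum=400):
    """Estimate the RU/s to provision for a steady workload.

    ru_per_op, increment, and minimum are assumptions for illustration,
    not values taken from the article.
    """
    raw = ops_per_second * ru_per_op
    # Round up to the next multiple of `increment`, then apply the floor.
    rounded = math.ceil(raw / increment) * increment
    return max(minimum, rounded)


# 500 data points per second, assuming each write costs about 5 RUs.
required = estimate_rus_per_second(500, 5)
print(required)  # 2500
```

If the measured cost per write turns out to be different, only `ru_per_op` needs to change.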

Database Throughput = sum( dedicated throughput for each collection within the database)

Each collection has its own throughput, and the maximum throughput for each collection is determined by the dedicated capacity provisioned for it.
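The formula above can be sketched in a few lines. The collection names and RU/s values are hypothetical; the point is only that, with dedicated throughput, the database total is the sum of the per-collection figures and each collection is capped at its own figure.

```python
def total_dedicated_throughput(collections):
    """Sum the RU/s provisioned on each collection of a database."""
    return sum(collections.values())


# Hypothetical collections and their dedicated RU/s.
dedicated = {"telemetry": 2500, "alerts": 400, "reference": 400}

total = total_dedicated_throughput(dedicated)
print(total)                   # 3300
print(dedicated["telemetry"])  # 2500, the cap for that collection alone
```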

Shared database-level throughput is useful when you have a few collections that are not accessed at high frequency. For example, reference data might be read each time a transaction occurs, about once per second. If you have multiple reference data collections, none of which needs high dedicated throughput, you can use shared database throughput instead.

The biggest difference between collection-level and database-level provisioned throughput is how capacity is aggregated. Mathematically,

Collection throughput = Database throughput – sum(current throughput used by other collections)

Each collection can draw up to the maximum throughput provisioned at the database level, minus whatever the other collections are using at that moment.
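A minimal sketch of the shared-throughput formula follows. It is a simplified model of the arithmetic only, not a description of how Cosmos DB schedules requests internally, and the collection names and usage figures are hypothetical.

```python
def available_to_collection(database_rus, current_usage, collection):
    """RU/s a collection can still draw from the shared pool right now.

    current_usage maps collection name -> RU/s currently being consumed.
    """
    used_by_others = sum(
        rus for name, rus in current_usage.items() if name != collection
    )
    # Never report a negative amount if the others already use everything.
    return max(0, database_rus - used_by_others)


# Hypothetical reference-data collections sharing 1000 RU/s.
usage = {"countries": 50, "currencies": 120, "products": 30}
print(available_to_collection(1000, usage, "products"))  # 830
```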

Provisioning throughput using Collection

The RUs are set when the database and the collections are provisioned, and can be scaled up or down as needed.

Creating a collection within a new or existing DB

Alternatively, you can use the Azure CLI to create collections along with the throughput option.
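As a hedged sketch, the helper below assembles an `az cosmosdb mongodb collection create` invocation with a dedicated throughput value. The `--shard` and `--throughput` flags are written as I understand the command; confirm them with `az cosmosdb mongodb collection create --help` for your CLI version. The account, database, collection, and resource group names reuse the sample values from the CLI walkthrough later in this article, and the function name is hypothetical.

```python
import shlex


def collection_create_command(account, database, collection,
                              resource_group, shard_key, throughput):
    """Build the argument list for creating a collection with dedicated RU/s."""
    return [
        "az", "cosmosdb", "mongodb", "collection", "create",
        "--account-name", account,
        "--database-name", database,
        "--name", collection,
        "--resource-group", resource_group,
        "--shard", shard_key,               # shard (partition) key path
        "--throughput", str(throughput),    # dedicated RU/s for this collection
    ]


create_cmd = collection_create_command(
    "mongodb123a", "newdb1", "col001", "ADFLAB", "zipcode", 400
)
print(shlex.join(create_cmd))
```

The list can be passed to `subprocess.run(create_cmd, check=True)` on a machine where the Azure CLI is installed and logged in.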

Provisioning throughput using Database

Alternatively, you can provision database-level throughput using the Azure CLI.
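A small sketch of what that might look like: the helper below builds an `az cosmosdb mongodb database create` command and adds `--throughput` only when a shared throughput value is given. The flag is written as I understand the command, so check `az cosmosdb mongodb database create --help` before relying on it; the function name is hypothetical and the sample names match the CLI walkthrough later in this article.

```python
def database_create_command(account, database, resource_group, throughput=None):
    """Build the argument list for creating a database.

    When throughput is given, the database is created with shared RU/s;
    otherwise no throughput flag is passed.
    """
    cmd = [
        "az", "cosmosdb", "mongodb", "database", "create",
        "--account-name", account,
        "--name", database,
        "--resource-group", resource_group,
    ]
    if throughput is not None:
        cmd += ["--throughput", str(throughput)]
    return cmd


shared_db_cmd = database_create_command("mongodb123a", "newdb1", "ADFLAB", 400)
print(" ".join(shared_db_cmd))
```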

Scaling throughput at the database level works the same way as scaling it at the collection level.

In the Azure CLI, the throughput sub-entity supports two methods: show and update. The show method returns the current throughput, and the update method scales the throughput at the database level.

The throughput show method displays the current throughput provisioned at the database level.
The throughput update method scales the throughput provisioned at the database level.
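The two methods can be sketched with one helper that builds either command. This assumes the database-level commands mirror the collection-level ones shown later in this article (`az cosmosdb mongodb database throughput show` and `... update --throughput`); the helper name and the 800 RU/s target are hypothetical.

```python
def database_throughput_command(action, account, database, resource_group,
                                throughput=None):
    """Build a 'show' or 'update' command for database-level throughput."""
    if action not in ("show", "update"):
        raise ValueError("action must be 'show' or 'update'")
    if action == "update" and throughput is None:
        raise ValueError("update requires a throughput value")

    cmd = [
        "az", "cosmosdb", "mongodb", "database", "throughput", action,
        "--account-name", account,
        "--name", database,
        "--resource-group", resource_group,
    ]
    if action == "update":
        cmd += ["--throughput", str(throughput)]
    return cmd


show_cmd = database_throughput_command("show", "mongodb123a", "newdb1", "ADFLAB")
update_cmd = database_throughput_command(
    "update", "mongodb123a", "newdb1", "ADFLAB", throughput=800
)
print(" ".join(show_cmd))
print(" ".join(update_cmd))
```

Separating command construction from execution makes it easy to review the exact command before running it.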

Next Steps

After starting to write this article, I realized I should also cover migration from collection throughput to database throughput. I will try to write up the migration steps in my next article.

Cosmos DB – Operations using Azure CLI

If you have deployed Cosmos DB and provisioned a database and collection, you will want to know how to operate them using the Azure CLI. The Azure CLI provides commands to create the database and the collection and to update the throughput, depending on how the Cosmos DB account has been set up (i.e., with the SQL API, the MongoDB API, and so on).

I was creating a MongoDB database in Cosmos DB, and the examples below follow that setup.

Creating a new Cosmos DB MongoDB Database

I provisioned my Cosmos DB account and then ran the following CLI command from PowerShell (or Azure Cloud Shell) to create a new MongoDB database:
az cosmosdb mongodb database create --account-name mongodb123a --name newdb1 --resource-group ADFLAB

Here "mongodb123a" is the new account I had created in my subscription, "newdb1" is the database we just created, and "ADFLAB" is the resource group it belongs to.

Displaying an existing Database using Az CLI

Once the Cosmos DB MongoDB database has been created, or whenever you want to check the settings of the database, you can use the database show command:
az cosmosdb mongodb database show --account-name mongodb123a --name newdb1 --resource-group ADFLAB


Listing the collection within the Database

Once you have created the database and collections (either through the CLI or the Portal), you can list the collections within the database using the collection list command:
az cosmosdb mongodb collection list --account-name mongodb123a --database-name newdb1 --resource-group ADFLAB

Command group 'cosmosdb mongodb' is in preview. It may be changed/removed in a future release.
[
  {
    "id": null,
    "indexes": [
      {
        "key": {
          "keys": [
            "_id"
          ]
        },
        "options": {
          "expireAfterSeconds": null,
          "unique": null
        }
      },
      {
        "key": {
          "keys": [
            "DocumentDBDefaultIndex"
          ]
        },
        "options": {
          "expireAfterSeconds": null,
          "unique": null
        }
      },
      {
        "key": {
          "keys": [
            "DocumentDBDefaultIndex"
          ]
        },
        "options": {
          "expireAfterSeconds": null,
          "unique": null
        }
      }
    ],
    "location": null,
    "mongoDbCollectionId": "col001",
    "name": null,
    "shardKey": {
      "zipcode": "Hash"
    },
    "tags": null,
    "type": null
  }
]
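When a database holds many collections, the full listing gets long. A minimal sketch of pulling out just the collection IDs and shard keys from that JSON follows. It assumes the input is only the JSON array (without the preview warning line), uses a trimmed copy of the output above as sample data, and the function name is hypothetical.

```python
import json

# Trimmed copy of the list output shown above, keeping only the fields used.
sample_output = """
[
  {
    "mongoDbCollectionId": "col001",
    "shardKey": {"zipcode": "Hash"}
  }
]
"""


def summarize_collections(raw_json):
    """Return (collection id, shard key fields) pairs from the list output."""
    return [
        (item["mongoDbCollectionId"], list((item.get("shardKey") or {}).keys()))
        for item in json.loads(raw_json)
    ]


summary = summarize_collections(sample_output)
print(summary)  # [('col001', ['zipcode'])]
```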

Listing the Throughput values at the collection level

If you would like to review the throughput for a collection within the database, use the collection throughput sub-entity:
az cosmosdb mongodb collection throughput show --account-name mongodb123a --database-name newdb1 --name col001 --resource-group ADFLAB
Command group 'cosmosdb mongodb' is in preview. It may be changed/removed in a future release.
{
  "id": null,
  "location": null,
  "name": null,
  "tags": null,
  "throughput": 400,
  "type": null
}

Changing the Throughput values at the collection level

To update the throughput of a collection, use the throughput sub-entity of the cosmosdb mongodb collection group:
az cosmosdb mongodb collection throughput update --account-name mongodb123a --database-name newdb1 --name col001 --resource-group ADFLAB --throughput 500
Command group 'cosmosdb mongodb' is in preview. It may be changed/removed in a future release.
{
  "id": null,
  "location": null,
  "name": null,
  "tags": null,
  "throughput": 500,
  "type": null
}
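To confirm that an update took effect, the throughput field of the show output before and after the change can be compared. The sketch below uses the two JSON results from this section as sample data; it assumes the preview warning line has been stripped, and the function name is hypothetical.

```python
import json


def read_throughput(raw_json):
    """Extract the provisioned RU/s from a throughput show/update result."""
    return json.loads(raw_json)["throughput"]


# JSON bodies as returned before and after the update in this section.
before_json = '{"id": null, "location": null, "name": null, "tags": null, "throughput": 400, "type": null}'
after_json = '{"id": null, "location": null, "name": null, "tags": null, "throughput": 500, "type": null}'

before = read_throughput(before_json)
after = read_throughput(after_json)
print(f"Throughput changed from {before} to {after} RU/s ({after - before:+d})")
```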