CosmosDB Queries – Handling Arrays, Special Characters and Keywords

I recently worked for a customer where I had to build a custom application based on several Azure services. To store data, the application uses CosmosDB with the SQL API. The stored items/documents had a pretty simple structure using several arrays. However, when I started querying the data from an Azure Function, I struggled a bit with the syntax, especially when querying data from the arrays where some properties used special characters and keywords. Browsing the Internet for a while, I found some fragmented solutions., so I decided to write this post to consolidate everything into one single place.

1. CosmosDB Items

As I cannot re-use the real customer items here, I use some example items. Every item represents one product that will be used in a webshop. The items contain some base data about the product and also all customer feedbacks. The feedback data is stored as in array. The structure of the documents look like this:

{
  "manufacturer":"Mondelez",
  "name":"Toblerone",
  "description":"Our bestselling product for tourists",
  "price":4.0,
  "feedback":[
    {
      "by":"Marcel Zehner",
      "rating":"5",
      "would-you-recommend":"yes"
    },
    {
      "by":"Petra Zadora",
      "rating":"5",
      "comments":"Looooooove it",
      "would-you-recommend":"yes"
    },
    {
      "by":"Susan Cole",
      "rating":"3",
      "would-you-recommend":"no"
    }
  ]
}

2. Query Data

I created 5 of these items. So let’s try some basic queries.

SELECT *
FROM products p
SELECT *
FROM products p
WHERE p.manufacturer = "LindtSpruengli"
Basic Query
Pretty much straightforward so far.

3. Filtering Array Data

When you want to filter data based on the feedback array, it’s get a little tricky. Let’s say we want to get all feedback with a rating that is lower than 3. You would maybe expect something like this.
SELECT *
FROM products p
WHERE p.feedback.rating < "3"
This does not work. What you need to do to make this work is using a “join” . You might know this from SQL databases where “join” is used to join data from 2  tables. However, in case of a CosmosDB which follows a denormalized, schema-free approach, this s a bit different. “Join” in CosmosDB allows you to join a document with a sub-document/array (self-join). Let’s see how this works.
SELECT p.manufacturer, p.name, p.feedback
FROM p
JOIN f IN p.feedback
With this you create a join between the document root (product base data) and the sub-document/array “Feedback”. Now we can query data.
SELECT p.manufacturer, p.name, f.comments, f.rating
FROM p
JOIN f IN p.feedback
WHERE f.rating < "3"
Join

4. Special Characters and Keywords

So let’s extend our query a bit. “Easy” you might think, but nope, the next problem waits around the corner.
SELECT p.manufacturer, p.name, f.by, f.would-you-recommend, f.rating, f.comments
FROM p
JOIN f IN p.feedback
One property uses hyphens which is one of the special characters that needs to be handled differently. There are others such as @, blanks etc. that also need to be handeled differently. The second problem is the propery with the name “by”. This is a keyword used by CosmosDB, e.g. to order results using “ORDER BY”. So how to handle those cases?
You can escape/handle those situations using by wrapping the properties. Let’s quickly have a look how to address properties in CosmosDB. These two queries deliver identical results. The second approach is used to handle special cases.
SELECT p.manufacturer
FROM p

SELECT p["manufacturer"]
FROM p
Take care: The property name needs to be in square brackets AND quotes, and the period needs to be removed. Back to our query, it needs to be changed to this to now make it deliver the correct results.
SELECT p.manufacturer, p.name, f["by"], f["would-you-recommend"], f.rating, f.comments
FROM p
JOIN f IN p.feedback
Special Characters and Keywords
That’s it. Happy querying!
Cheers, Marcel

About Marcel Zehner

Microsoft Azure MVP
This entry was posted in Azure and tagged , , , , , , , , , , , , . Bookmark the permalink.

2 Responses to CosmosDB Queries – Handling Arrays, Special Characters and Keywords

  1. PA says:

    Helpful article. I do have a question. In your filtering example for #3, what query would return the document plus the complete feedback array (properly formatted with square brackets) for a rating = 4? (see expected data sample below)

    Thank you for your insight:

    [
    {
    “manufacturer”: “LindtSpruengli”,
    “name”: “Lindor Kugeln”,
    “description”: “Unsere feinen Lindor Kugeln fur jeden Anlass”,
    “price”: 11.5,
    “Feedback”: [
    {
    “by”: “Marcel Zehner”
    “rating”: “4”,
    “comments”: “Fantastic product!”,
    “would-you-recommend”: “yes”
    }
    ]
    }
    ]

  2. Helpful article. I do have a question. In your filtering example for #3, what query would return the document plus the complete feedback array (properly formatted with square brackets) for a rating = 4? (see expected data sample below)

    Thank you for your insight:

    [
    {
    “manufacturer”: “LindtSpruengli”,
    “name”: “Lindor Kugeln”,
    “description”: “Unsere feinen Lindor Kugeln fur jeden Anlass”,
    “price”: 11.5,
    “Feedback”: [
    {
    “by”: “Marcel Zehner”
    “rating”: “4”,
    “comments”: “Fantastic product!”,
    “would-you-recommend”: “yes”
    }
    ]
    }
    ]

Leave a comment