I recently worked for a customer where I had to build a custom application based on several Azure services. To store data, the application uses CosmosDB with the SQL API. The stored items/documents had a pretty simple structure using several arrays. However, when I started querying the data from an Azure Function, I struggled a bit with the syntax, especially when querying data from the arrays where some properties used special characters and keywords. Browsing the Internet for a while, I found some fragmented solutions., so I decided to write this post to consolidate everything into one single place.
1. CosmosDB Items
As I cannot re-use the real customer items here, I use some example items. Every item represents one product that will be used in a webshop. The items contain some base data about the product and also all customer feedbacks. The feedback data is stored as in array. The structure of the documents look like this:
{ "manufacturer":"Mondelez", "name":"Toblerone", "description":"Our bestselling product for tourists", "price":4.0, "feedback":[ { "by":"Marcel Zehner", "rating":"5", "would-you-recommend":"yes" }, { "by":"Petra Zadora", "rating":"5", "comments":"Looooooove it", "would-you-recommend":"yes" }, { "by":"Susan Cole", "rating":"3", "would-you-recommend":"no" } ] }
2. Query Data
I created 5 of these items. So let’s try some basic queries.
SELECT * FROM products p
SELECT * FROM products p WHERE p.manufacturer = "LindtSpruengli"

3. Filtering Array Data
SELECT * FROM products p WHERE p.feedback.rating < "3"
SELECT p.manufacturer, p.name, p.feedback FROM p JOIN f IN p.feedback
SELECT p.manufacturer, p.name, f.comments, f.rating FROM p JOIN f IN p.feedback WHERE f.rating < "3"

4. Special Characters and Keywords
SELECT p.manufacturer, p.name, f.by, f.would-you-recommend, f.rating, f.comments FROM p JOIN f IN p.feedback
SELECT p.manufacturer FROM p SELECT p["manufacturer"] FROM p
SELECT p.manufacturer, p.name, f["by"], f["would-you-recommend"], f.rating, f.comments FROM p JOIN f IN p.feedback

Helpful article. I do have a question. In your filtering example for #3, what query would return the document plus the complete feedback array (properly formatted with square brackets) for a rating = 4? (see expected data sample below)
Thank you for your insight:
[
{
“manufacturer”: “LindtSpruengli”,
“name”: “Lindor Kugeln”,
“description”: “Unsere feinen Lindor Kugeln fur jeden Anlass”,
“price”: 11.5,
“Feedback”: [
{
“by”: “Marcel Zehner”
“rating”: “4”,
“comments”: “Fantastic product!”,
“would-you-recommend”: “yes”
}
]
}
]
Helpful article. I do have a question. In your filtering example for #3, what query would return the document plus the complete feedback array (properly formatted with square brackets) for a rating = 4? (see expected data sample below)
Thank you for your insight:
[
{
“manufacturer”: “LindtSpruengli”,
“name”: “Lindor Kugeln”,
“description”: “Unsere feinen Lindor Kugeln fur jeden Anlass”,
“price”: 11.5,
“Feedback”: [
{
“by”: “Marcel Zehner”
“rating”: “4”,
“comments”: “Fantastic product!”,
“would-you-recommend”: “yes”
}
]
}
]