$densify
The `$densify` stage in an aggregation pipeline is used to fill in missing data points within a sequence of values. It helps in creating a more complete dataset by generating missing values based on a specified field, range, and step. This is useful in scenarios like time-series data analysis, where gaps in data points need to be filled to ensure accurate analysis.
Syntax
{
$densify: {
field: <field>,
range: {
step: <number>,
unit: <string>,
bounds: [<lowerBound>, <upperBound>]
},
partitionByFields: [<field1>, <field2>, ...]
}
}
Parameters
field
stringrequiredThe field on which densification is performed.
range
objectrequiredSpecifies the step size and optional unit/bounds for generating missing values.
partitionByFields
objectOptional. Fields used to group data for densification.
Examples
Sample Data
{
"_id": "0fcc0bf0-ed18-4ab8-b558-9848e18058f4",
"name": "First Up Consultants | Beverage Shop",
"sales": { "totalSales": 75670 }
}
Densify a time-series dataset
Fill in missing days in the date field.
This query fills in missing dates between existing data points using daily steps.
Query:
db.aggregate([
{
$documents: [
{ date: new ISODate("2024-01-01"), value: 10 },
{ date: new ISODate("2024-01-03"), value: 15 }
]
},
{
$densify: {
field: "date",
range: {
step: 1,
unit: "day",
bounds: "full"
}
}
}
])
Output:
[
{ "date": "ISODate('2024-01-01T00:00:00.000Z')", "value": 10 },
{ "date": "ISODate('2024-01-02T00:00:00.000Z')" },
{ "date": "ISODate('2024-01-03T00:00:00.000Z')", "value": 15 }
]
Densify numeric data
Fill in missing numeric values in a sequence.
This query fills in missing level values between 1 and 5.
Query:
db.aggregate([
{
$documents: [
{ level: 1, score: 10 },
{ level: 3, score: 30 }
]
},
{
$densify: {
field: "level",
range: {
step: 1,
bounds: [1, 5]
}
}
}
])
Output:
[
{ "level": 1, "score": 10 },
{ "level": 2 },
{ "level": 3, "score": 30 },
{ "level": 4 }
]