aggregation framework - MongoDB aggregate works slowly with 200 GB collection
I have a sharded MongoDB collection with 10M elements (200 GB).
Document structure:
{
  _id
  updatedate
  cleandate
  events1: [{...}, {...}, ...]
  events2: [{...}, {...}, ...]
  events3: [{...}, {...}, ...]
}
There are no indexes except _id.
The 1st Java application creates, reads, and updates documents in the collection.
The 2nd Java application has a scheduled task that finds documents where updatedate > cleandate and removes old objects from the eventsX arrays. When the task cleans an object, it updates cleandate.
I use mycollection.aggregate({ $project : { delta : { $cmp : ['$updatedate', '$cleandate'] } } }, { $match : { delta : { $gt : 0 } } }, { $limit : 10000 }) to query the next portion for cleaning.
The query takes a lot of time (sometimes 10 minutes or more) when the first elements in the collection have already been cleaned.
How can I speed up the 2nd app?
The point is that the whole collection is projected first, and the $match only runs after that. If you know the last updatedate or cleandate (maybe store the last scheduled task run time somewhere), you can limit the initial aggregation set on that field prior to projecting.
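A minimal sketch of that idea, assuming the scheduled task keeps the time of its last run somewhere. The function name buildCleanupPipeline, its parameters lastRunTime and batchSize, and the example date are hypothetical and not from the original post. The only change to the original pipeline is an extra $match on updatedate placed before the $project, so the date comparison runs only on documents updated since the last run. With no index on updatedate this first stage still has to scan documents; an index on that field (an assumption, the collection currently has none) would let the leading $match use it.

```javascript
// Sketch only: build a pipeline that filters on updatedate before projecting.
// lastRunTime and batchSize are hypothetical parameters for illustration.
function buildCleanupPipeline(lastRunTime, batchSize) {
  return [
    // 1. Cut the working set first: only documents updated since the last run.
    { $match: { updatedate: { $gt: lastRunTime } } },
    // 2. Compare the two dates only for the remaining documents.
    { $project: { delta: { $cmp: ["$updatedate", "$cleandate"] } } },
    // 3. Keep documents whose updatedate is later than cleandate.
    { $match: { delta: { $gt: 0 } } },
    // 4. Take the next portion for cleaning.
    { $limit: batchSize }
  ];
}

// Placeholder date; in practice this would be the stored last run time.
const pipeline = buildCleanupPipeline(new Date("2015-01-01T00:00:00Z"), 10000);

// In the mongo shell the pipeline would be passed as an array:
// db.mycollection.aggregate(pipeline)
```

Because of the $limit, one run may not reach every document updated since lastRunTime, so the stored time should only be advanced once no uncleaned documents older than it remain.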