aggregation framework - MongoDB aggregate works slowly with a 200 GB collection


I have a sharded MongoDB collection with 10 million documents (200 GB).

Document structure:

{
    _id
    updatedate
    cleandate
    events1: [{...}, {...}, ...]
    events2: [{...}, {...}, ...]
    events3: [{...}, {...}, ...]
}

There are no indexes except the one on _id.

The first Java application creates, reads, and updates documents in the collection.

The second Java application has a scheduled task that finds documents where updatedate > cleandate and removes old objects from the eventsX arrays. When the task cleans a document, it updates cleandate.

To get the next portion for cleaning, I use the query mycollection.aggregate({ $project : { delta : { $cmp : ['$updatedate', '$cleandate'] } } }, { $match : { delta : { $gt : 0 } } }, { $limit : 10000 }).

The query takes a lot of time to execute (sometimes 10 minutes or more) once the first elements in the collection have been cleaned.

How can I speed up the second app?

The point is that the whole collection is projected, and the $match is applied only after that. If you know the last updatedate or cleandate (maybe store the last scheduled task run time somewhere), you can limit the initial aggregation set by that field prior to projecting.
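As a rough sketch of that idea in mongo shell syntax: the pipeline below puts a $match on updatedate in front of the existing stages, so only recently updated documents reach the $project. The helper name buildCleaningPipeline and the lastRunTime value are assumptions for illustration (the scheduled task would have to store that time itself). The early $match probably only pays off with an index on updatedate, which the collection does not have yet.

```javascript
// Hypothetical helper (not from the original post): builds the cleaning
// pipeline with an early $match so fewer documents reach $project.
// lastRunTime is an assumed value that the scheduled task stores somewhere.
function buildCleaningPipeline(lastRunTime, batchSize) {
  return [
    // Narrow the input first: only documents updated since the last run.
    { $match: { updatedate: { $gte: lastRunTime } } },
    // Same stages as in the question: compare the two dates...
    { $project: { delta: { $cmp: ['$updatedate', '$cleandate'] } } },
    // ...and keep documents updated after their last cleaning.
    { $match: { delta: { $gt: 0 } } },
    { $limit: batchSize }
  ];
}

// Usage in the mongo shell (collection name taken from the question):
// db.mycollection.aggregate(buildCleaningPipeline(lastRunTime, 10000));
```

This only stays correct if the stored time is moved forward after every document updated before it has been cleaned. With a limit of 10000 per run, a batch may end before that point, and advancing the time too early would skip the remaining documents.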

