Order | Type | Description | Query Example |
E | Equality (Exact Match) | Leftmost position of the index; exact matches quickly narrow the scan range | { status: "paid" } |
S | Sort (Sorting) | Immediately follows E; leverages the ordered nature of the index to avoid in-memory sorting | .sort({ createTime: -1 }) |
R | Range (Range Query) | Rightmost position of the index; after a range scan the index order is broken, and subsequent fields degrade to row-by-row filtering | { createTime: { $gte: ISODate("2024-01-01") } } |
// Business query statementdb.t_orders.find({userId: 10086, // E: Equalitystatus: "paid", // E: EqualitycreateTime: { $gte: ISODate("2024-02-01") } // R: Range}).sort({ amount: -1 }) // S: Sorting// Error: With Range before Sort, the index diverges on createTime and cannot support amount sorting, triggering in-memory sorting.db.t_orders.createIndex({ userId: 1, status: 1, createTime: 1, amount: -1 },{ background: true })// Correct: ESR order, equality filtering → index sorting → range truncationdb.t_orders.createIndex({ userId: 1, status: 1, amount: -1, createTime: 1 },{ background: true })
Query Condition | Index Utilization | Execution Description |
{ a: 1 } | Full utilization | Hits the leftmost prefix a. |
{ a: 1, b: 1 } | Full utilization | Hits the leftmost prefix a, b. |
{ a: 1, b: 1, c: 1 } | Full utilization | Fully matches all index fields a, b, c. |
{ a: 1, c: 1 } | Partial utilization | Only the leftmost field a can be used for index branch positioning. The absence of the middle field b blocks index matching for field c. |
{ b: 1 } | Unavailable | The absence of the leftmost field a causes the loss of the search baseline and triggers a full table scan. |
{ b: 1, c: 1 } | Unavailable | The absence of the leftmost field a triggers a full table scan. |
// A compound index already exists.db.t_orders.createIndex({ userId: 1, status: 1, createTime: 1 },{ background: true })// Correct: Contains the leftmost field userId, and the index is fully hit.db.t_orders.find({ userId: 10086, status: "paid" })// Partial hit: Contains the leftmost field userId but skips status, using only userId for positioning.db.t_orders.find({ userId: 10086, createTime: { $gte: ISODate("2024-01-01") } })// Error: Missing the leftmost field userId, triggering a full table scan.db.t_orders.find({ status: "pending" })// Solution: Create a separate index for the independent query field.db.t_orders.createIndex({ status: 1 },{ background: true })
Distinctiveness Level | Data Characteristic | Field Example | Index Placement Policy |
High | Unique or rarely duplicated | User ID, Order Number | Should be placed as the leading field in composite indexes. |
Medium | Contains some duplicates but has many categories. | City, Product Category | Effective. Usually placed in the middle of composite indexes and must be used in combination. |
Low | Extremely small value domain with massive duplicates | Gender, Boolean Value | Poor. If it must be included in a composite index, it should be strictly placed at the very end. |
db.orders.find({ userId: "U123" })db.orders.find({ status: "completed" })// Error: The background parameter is not specified, which may block all operations.db.t_orders.createIndex({ customerId: 1 })// Single creation: Enable the background mode to avoid blocking business operations.db.t_orders.createIndex({ customerId: 1, createTime: -1 },{ background: true })
Resource Dimension | Performance Overhead Description |
Write Latency (Write I/O) | Each time an Insert, Update, or Delete is executed, the database must synchronously traverse and update the B-Tree structures of all related indexes. The more indexes there are, the more the I/O overhead per write grows linearly. |
Storage consumption (Disk) | Each index is an independent data tree and occupies a large amount of physical disk space. |
// Step 1: View all current indexes of the collection.db.t_orders.getIndexes()// Step 2: Investigate the actual usage of each index using $indexStats.db.t_orders.aggregate([{ $indexStats: {} }]).forEach(function(idx) {print("Index Name:", idx.name,"| Invocation Count:", idx.accesses.ops,"| Last Used:", idx.accesses.since)})// Step 3: View the storage ratio between indexes and data.var stats = db.t_orders.stats()print("Data Size:", (stats.size / 1024 / 1024).toFixed(2), "MB")print("Index Size:", (stats.totalIndexSize / 1024 / 1024).toFixed(2), "MB")print("Index/Data Ratio:", (stats.totalIndexSize / stats.size * 100).toFixed(1), "%")// Step 4: Delete redundant indexes with an invocation count of 0.// Before performing the operation, ensure that the index has no business dependencies.db.t_orders.dropIndex("idx_legacy_report_1")db.t_orders.dropIndex("idx_legacy_report_2")// Note: Do not delete the default _id index.// db.t_orders.dropIndex("_id") // This operation will cause an error because the _id index cannot be deleted.
// Step 1: Execute a statistical aggregation to identify unused indexes.db.t_orders.aggregate([{ $indexStats: {} }])/* Output Example Analysis:{"name": "idx_old_field","accesses": {"ops": NumberLong(0), // Key metric: The invocation count is 0."since": ISODate("2024-01-01T00:00:00Z")}}*/// Step 2: Logical Hiding (Supports fast rollback, MongoDB 4.4+)db.t_orders.hideIndex("idx_old_field")// Observation Period: Continuously monitor for 1-7 days. If business is impacted, you can immediately execute db.t_orders.unhideIndex("idx_old_field") to restore it.// Step 3: Physical Deletion (Execute after confirming no damage).db.t_orders.dropIndex("idx_old_field")
Feature / Limitation | Specification |
Single-field constraint | A TTL index must be a single-field index. If the TTL field is placed in a compound index, the TTL automatic reclamation feature is directly invalidated. |
Data Type | The data type of the target field must be the Date type (or an array containing the Date type); otherwise the engine will not perform cleanup. |
Cleanup Delay | Deletion is not precise to the second upon expiration. By default, the TTLMonitor thread polls every 60 seconds, so the physical deletion of data has a minute-level delay. |
Engine Parameter Control | If there are performance fluctuations in production, the DBA can further smooth I/O resource consumption by dynamically adjusting ttlMonitorSleepSecs (controls the polling sleep interval) and ttlDeleteBatch (controls the deletion batch size per round). For specific operations, see parameter settings. |
// Pattern 1: Fixed-duration expiration (e.g., based on the createTime field, expires after 30 days)db.t_sessions.createIndex({ createTime: 1 },{ expireAfterSeconds: 2592000, background: true })// Pattern 2: Precise timestamp expiration (Recommended, where the business layer dynamically determines the lifetime)// Set expireAfterSeconds to 0. The engine will perform cleanup when the expireAt timestamp is reached.db.t_sessions.createIndex({ expireAt: 1 },{ expireAfterSeconds: 0, background: true })// When inserting at the business layer, directly specify the precise expiration time.db.t_sessions.insertOne({sessionId: "sess_001",userId: "user_001",expireAt: new Date(Date.now() + 3600000) // Dynamically specifies expiration after 1 hour})
Feature / Restriction | Specification |
Required parameter timeField | When a time series collection is created, you must specify the timeField. The value of this field must be of the Date type and is used to timestamp each measurement. MongoDB automatically performs bucket storage based on this. |
Recommended parameter metaField | The metaField is used to identify the data source (such as device ID and sensor number). MongoDB partitions data based on this. Select an identifier that rarely changes and avoid using the array type. |
Granularity setting granularity | Select the granularity based on the time interval between adjacent measurements from the same data source: seconds (bucket span: 1 hour), minutes (bucket span: 24 hours), hours (bucket span: 30 days). Too coarse a granularity leads to excessive data per bucket and slower queries, while too fine a granularity causes a surge in the number of buckets and wasted storage. |
Automatic indexing | MongoDB 6.0+ automatically creates secondary indexes on the timeField and metaField. For earlier versions, you must create them manually. |
Unsupported operations | Time series collections do not support efficient execution of distinct() (use $group aggregation as an alternative). Versions earlier than 8.0 do not support zone sharding (Zone Sharding). Once defined, the metaField cannot be changed to another field. |
// Recommendation: Create a time series collection (sensors report every 5 minutes)db.createCollection("t_sensor_data", {timeseries: {timeField: "timestamp", // Required: Timestamp fieldmetaField: "sensorId", // Recommended: Data source identifiergranularity: "minutes" // Granularity matches the collection frequency (5 minutes → minutes)}})// Recommendation: Create a time series collection (for system monitoring that collects data every second)db.createCollection("t_metrics", {timeseries: {timeField: "ts",metaField: "metadata", // metadata can be an object, such as { host: "web01", region: "bj" }granularity: "seconds" // Granularity matches the collection frequency (second-level → seconds)}})// Insert time series datadb.t_sensor_data.insertMany([{ sensorId: "sensor_001", timestamp: new Date(), temperature: 23.5, humidity: 65 },{ sensorId: "sensor_001", timestamp: new Date(), temperature: 23.6, humidity: 64 },{ sensorId: "sensor_002", timestamp: new Date(), temperature: 18.2, humidity: 72 }], { ordered: false }) // ordered: false improves batch write performance// Error: Manually managing time series data on a regular collectiondb.createCollection("t_sensor_data_old")db.t_sensor_data_old.createIndex({ sensorId: 1, timestamp: -1 })// Requires manual handling of data archiving, compression, and cleanup, resulting in high maintenance costs.// Query data for a specific device from the last 24 hours (automatically using time series indexes)db.t_sensor_data.find({sensorId: "sensor_001",timestamp: { $gte: new Date(Date.now() - 86400000) }}).sort({ timestamp: -1 })// Use $group aggregation instead of distinct() (time series collections do not support efficient distinct operations)db.t_sensor_data.aggregate([{ $match: { timestamp: { $gte: new Date(Date.now() - 86400000) } } },{ $group: { _id: "$sensorId" } }])
MongoDB Version | Default Memory Limit | Consequence of Exceeding Limit |
4.2 and earlier | 32MB | An error is reported, and the query fails. |
4.4+ | 100MB | An error is reported, and the query fails. |
// Error: The sort field is not indexed, which may trigger in-memory sorting.db.t_orders.find({ status: "paid" }).sort({ createTime: -1 })// Correct: The sort field is included in the index.db.t_orders.createIndex({ status: 1, createTime: -1 }, { background: true })db.t_orders.find({ status: "paid" }).sort({ createTime: -1 }) // Index sorting, no memory limit// Verify whether index sorting is used.db.t_orders.find({ status: "paid" }).sort({ createTime: -1 }).explain("executionStats")// Check whether a SORT stage exists. The absence of a SORT stage indicates that index sorting is used.
$ne, $nin, unprefixed $regex, and $where.Operator | Issue | Optimization Solution |
$ne | Requires scanning all non-matching values. | Use $in to list valid values. |
$nin | Requires scanning all values not in the list. | Use $in for positive matching. |
$not | Usually cannot utilize indexes. | Use positive conditions. |
$regex (without prefix) | Prefix without anchoring cannot utilize indexes. | Use prefix anchoring /^prefix/. |
$where | JavaScript execution is extremely slow. | Use standard query operators. |
$exists: false | Requires scanning all documents. | Use sparse indexes or redesign. |
// Inefficient: $ne cannot be effectively utilized by indexes.db.t_orders.find({ status: { $ne: "cancelled" } })// Optimized: $in lists all valid statuses.db.t_orders.find({ status: { $in: ["pending", "paid", "shipped", "completed"] } })// Inefficient: Unprefixed regex, full table scan.db.t_users.find({ name: /Zhang/ })// Optimized: Prefix-anchored regex can utilize indexes.db.t_users.find({ name: /^Zhang/ })
Check Item | Verification Method | Passing Criteria |
All queries hit indexes. | explain("executionStats") | The stage is IXSCAN, with no COLLSCAN. |
No in-memory sorting | explain("executionStats") | No SORT Stage |
Reasonable scan efficiency | totalDocsExamined / nReturned | A ratio close to 1 |
Compound indexes follow ESR. | Reviewing the order of index fields | Equality → Sort → Range |
Controllable number of indexes | db.collection.getIndexes() | ≤ 10 |
Creating indexes using background mode | Reviewing the index creation command | Including background: true |
Check Item | Verification Method | Recommended Action |
Unused index cleanup | $indexStats | Delete indexes with ops: 0 after evaluation. |
Index utilization | $indexStats | Consider deleting indexes with low utilization. |
Index size | db.collection.stats().indexSizes | Analyze abnormally large indexes. |
db.t_orders.find({ status: "paid" }).explain("executionStats")
// Key metric, close to 1: Index is efficient, scanning results in a hit.totalDocsExamined / nReturned ≈ 1 // Ideal value
// Locate the sorting stage in the explain results.executionStats.executionStages.stage// Or nested within inputStage / inputStages.
stage Value | Description | Optimization Required or Not |
SORT | In-memory sorting, not utilizing index orderliness | Optimization required |
SORT_KEY_GENERATOR | Extracting the sort key, occurring in conjunction with SORT. | Optimization required |
No `SORT` Stage | Sorting is completed by the inherent orderliness of the index. | Optimization not required |
// Does not comply with ESR: range field precedes sorting field.db.t_orders.createIndex({ createTime: 1, status: 1 })db.t_orders.find({ createTime: { $gte: ISODate("2024-01-01") }, status: "paid" }).sort({ amount: -1 })// Complies with ESR: equality → sorting → rangedb.t_orders.createIndex({ status: 1, amount: -1, createTime: 1 })
Comparison Dimension | Multiple Single-Field Indexes | Compound Index |
Indexing scheme | { status: 1 } + { userId: 1 } + { createTime: 1 } | { status: 1, userId: 1, createTime: -1 } |
Query process | Generally, only one field in a query uses the index. | A single B-Tree lookup directly locates the target. |
Sort | The index cannot cover sorting, potentially triggering an in-memory SORT. | The index is inherently ordered, eliminating the need for in-memory sorting. |
Covered query | Not achievable, requiring a table lookup to fetch the complete document. | If all query fields are included in the index, the results can be returned directly without requiring a table lookup. |
Memory overhead | Higher overall memory overhead | No additional overhead. |
// Multiple single-field indexes: MongoDB attempts index intersection, resulting in unstable efficiency.db.t_orders.createIndex({ status: 1 })db.t_orders.createIndex({ userId: 1 })db.t_orders.createIndex({ createTime: 1 })// Compound index: a single index covers both querying and sorting, adhering to the ESR principle.db.t_orders.createIndex({ status: 1, userId: 1, createTime: -1 })// Single-field indexes offer greater flexibility when query patterns are not fixed and field combinations vary:// Scenario: For dynamic filtering in an operations backend, where users may query based on any combination of fields, creating a compound index for each combination is impractical. Single-field indexes + index intersection is the appropriate choice.db.t_orders.find({ status: "paid" }) // Filters only by statusdb.t_orders.find({ userId: "u_10001" }) // Filters only by userdb.t_orders.find({ createTime: { $gte: ISODate("...") } }) // Filters only by timedb.t_orders.find({ status: "paid", userId: "u_10001" }) // Filters by status + user
sh.stopBalancer()
db.orders.createIndex({ userId: 1, createTime: -1 }, { background: true })
// View currently running index build tasks.db.currentOp({ "command.createIndexes": { $exists: true } })
Esta página foi útil?
Você também pode entrar em contato com a Equipe de vendas ou Enviar um tíquete em caso de ajuda.
comentários