Skip to content

B-Tree Indexes & ESR Rule

B-Tree Indexes & ESR Rule

Indexing is the single most important factor for query performance in MongoDB. Without an index, MongoDB must perform a Collection Scan (COLLSCAN), scanning every document in a collection.

πŸ—οΈ 1. Essential Index Types

MongoDB uses B-Tree data structures for its indexes, allowing for efficient range queries and sorting.

Compound Index

An index on multiple fields. The order of fields is critical.

// Mongo Shell: Create a compound index
db.orders.createIndex({ customer_id: 1, status: 1, amount: -1 });

Specialized Indexes

  • TTL (Time-To-Live): Automatically expires documents after a set time.
  • Multikey: Used for indexing arrays (creates an entry for each element).
  • Partial: Only indexes documents that match a specific filter (saves space).
  • Text: Supports full-text search capabilities.

πŸš€ 2. The ESR Rule (Equality, Sort, Range)

For compound indexes, follow the ESR Rule to ensure the most efficient index usage.

  1. Equality (E): Fields used for exact matches (e.g., status: "A") should come first.
  2. Sort (S): Fields used for sorting (e.g., sort: { date: -1 }) should come next.
  3. Range (R): Fields used for range filters (e.g., price: { $gt: 100 }) should come last.

Pymongo Example: Creating an ESR Index

from pymongo import MongoClient, ASCENDING, DESCENDING

client = MongoClient('mongodb://localhost:27017/')
db = client['ecommerce']

# Equality: user_id, Sort: order_date, Range: total_amount
db.orders.create_index([
    ("user_id", ASCENDING), 
    ("order_date", DESCENDING), 
    ("total_amount", ASCENDING)
])

⚑ 3. Explain Plans & Profiling

To optimize a query, you must analyze its Execution Plan.

Using .explain()

Appending .explain("executionStats") to a query provides details on its performance.

// Mongo Shell
db.users.find({ email: "alice@example.com" }).explain("executionStats");

Key Metrics to Analyze:

  • nReturned: Documents actually returned.
  • totalKeysExamined: Index entries scanned.
  • totalDocsExamined: Documents scanned.
  • winningPlan.stage: IXSCAN (Good), COLLSCAN (Bad), FETCH (Retrieving docs after index scan).

The Database Profiler

The profiler logs slow operations to the system.profile collection.

// Set profiler to level 1 (slow operations) with 100ms threshold
db.setProfilingLevel(1, 100);

// View slow operations
db.system.profile.find({ millis: { $gt: 200 } }).sort({ ts: -1 });

πŸ’‘ Best Practices

  1. Covered Queries: An index β€œcovers” a query when the index itself contains all fields required by the query (including projection). This avoids the FETCH stage entirely.
  2. Limit Indexes: Each index slows down write operations and consumes space in the WiredTiger Cache.
  3. Watch for In-Memory Sorts: If a query doesn’t use an index for sorting, MongoDB performs the sort in memory. If the data exceeds 100MB, the query will fail unless you allow disk usage.