# Indexing
Indexes improve query performance by giving the query engine fast access paths to your data, so it can avoid scanning every node. PhoenixmlDb supports multiple index types optimized for different query patterns.
## Index Types

### Path Index
Path indexes accelerate queries that navigate to specific elements or attributes by path.
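To make the idea concrete, here is a minimal conceptual sketch of what a path index does. It is not PhoenixmlDb's implementation: it uses `System.Xml.Linq` from the .NET base library, and `PathIndexSketch`, `PathOf`, and `Build` are hypothetical names chosen for illustration. It maps each element path to the elements found at that path, so a path query becomes a dictionary lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Conceptual sketch only; names are hypothetical, not part of PhoenixmlDb.
public static class PathIndexSketch
{
    // Builds "/a/b/c" for an element by walking from the root down to it.
    public static string PathOf(XElement element) =>
        "/" + string.Join("/",
            element.AncestorsAndSelf().Reverse().Select(e => e.Name.LocalName));

    // One pass over the document: group every element under its path.
    public static Dictionary<string, List<XElement>> Build(XDocument doc)
    {
        var index = new Dictionary<string, List<XElement>>();
        foreach (var element in doc.Descendants())
        {
            var path = PathOf(element);
            if (!index.TryGetValue(path, out var nodes))
            {
                nodes = new List<XElement>();
                index[path] = nodes;
            }
            nodes.Add(element);
        }
        return index;
    }

    public static void Main()
    {
        var doc = XDocument.Parse(
            "<product><name>Mouse</name><price>19.99</price></product>");
        var index = Build(doc);

        // The path query is answered by a lookup instead of a tree walk.
        foreach (var node in index["/product/price"])
            Console.WriteLine(node.Value); // prints 19.99
    }
}
```

The cost of the single indexing pass is paid at write time, which is why each additional index adds write overhead.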
```csharp
// Create path index
container.CreateIndex(new PathIndex("price-idx", "/product/price"));

// Queries that benefit:
// - collection('products')/product/price
// - collection('products')//product/price
// - $doc/product/price
```

Multi-path index:

```csharp
container.CreateIndex(new PathIndex("product-paths",
    "/product/name",
    "/product/category",
    "/product/price"));
```

### Value Index
Value indexes enable efficient range queries and sorting on typed values.
```csharp
// Numeric value index
container.CreateIndex(new ValueIndex("price-val",
    "/product/price",
    ValueType.Decimal));

// Date value index
container.CreateIndex(new ValueIndex("date-val",
    "/order/orderDate",
    ValueType.Date));

// String value index
container.CreateIndex(new ValueIndex("name-val",
    "/customer/name",
    ValueType.String));
```

Supported value types:

| ValueType | XSD Type | Use Case |
|---|---|---|
| `ValueType.String` | xs:string | Text comparisons, sorting |
| | xs:integer | Whole number ranges |
| `ValueType.Decimal` | xs:decimal | Precise decimal ranges |
| | xs:double | Scientific calculations |
| `ValueType.Date` | xs:date | Date ranges |
| | xs:dateTime | Timestamp ranges |
| | xs:boolean | True/false filtering |

Queries that benefit:
```xquery
(: Range queries :)
//product[price > 10 and price < 100]

(: Sorting :)
for $p in //product order by $p/price return $p

(: Min/Max :)
min(//product/price)
```

### Full-Text Index
Full-text indexes support natural language search with tokenization, stemming, and relevance ranking.
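As a rough illustration of tokenization and stop-word filtering, the sketch below builds a tiny inverted index (token to document ids) in plain C#. It is an assumption-laden sketch rather than PhoenixmlDb's actual pipeline: `FullTextSketch`, `Tokenize`, `Build`, and the stop-word list are hypothetical, and stemming and relevance ranking are left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Conceptual sketch only; names and stop-word list are hypothetical.
public static class FullTextSketch
{
    static readonly HashSet<string> StopWords =
        new() { "a", "an", "and", "the", "with" };

    static readonly char[] Separators =
        { ' ', ',', '.', ';', ':', '!', '?', '-', '"', '(', ')' };

    // Lower-case the text, split it, then drop stop words and short tokens.
    public static IEnumerable<string> Tokenize(string text, int minTokenLength = 2) =>
        text.ToLowerInvariant()
            .Split(Separators, StringSplitOptions.RemoveEmptyEntries)
            .Where(t => t.Length >= minTokenLength && !StopWords.Contains(t));

    // Inverted index: each token maps to the ids of documents containing it.
    public static Dictionary<string, SortedSet<int>> Build(
        IReadOnlyList<string> documents, int minTokenLength = 2)
    {
        var index = new Dictionary<string, SortedSet<int>>();
        for (var id = 0; id < documents.Count; id++)
        {
            foreach (var token in Tokenize(documents[id], minTokenLength))
            {
                if (!index.TryGetValue(token, out var ids))
                    index[token] = ids = new SortedSet<int>();
                ids.Add(id);
            }
        }
        return index;
    }

    public static void Main()
    {
        var descriptions = new[]
        {
            "Wireless mouse with Bluetooth",
            "Wired keyboard",
            "The wireless, noise-canceling headset"
        };
        var index = Build(descriptions);

        Console.WriteLine(string.Join(", ", index["wireless"])); // prints 0, 2
        Console.WriteLine(index.ContainsKey("the"));             // prints False
    }
}
```

Because the text is lower-cased before indexing, lookups in this sketch are case-insensitive, which is the behavior `CaseSensitive = false` suggests.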
```csharp
// Basic full-text index
container.CreateIndex(new FullTextIndex("desc-ft",
    "/product/description"));

// With custom options
container.CreateIndex(new FullTextIndex("content-ft",
    "/article/content",
    new FullTextOptions
    {
        Language = "en",
        Stemming = true,
        StopWords = true,
        CaseSensitive = false,
        MinTokenLength = 2
    }));
```

Queries that benefit:
```xquery
(: Contains search :)
//product[contains(description, 'wireless')]

(: Full-text functions :)
//product[ft:contains(description, 'wireless bluetooth')]

(: Phrase search :)
//product[ft:contains(description, '"noise canceling"')]

(: Boolean operators :)
//product[ft:contains(description, 'wireless AND NOT wired')]
```

### Structural Index
Structural indexes accelerate navigation queries (parent, child, sibling, ancestor, descendant).
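One well-known way to answer ancestor/descendant questions without walking the tree is pre/post-order numbering. The sketch below shows that technique in plain C# with `System.Xml.Linq`. Whether PhoenixmlDb uses this scheme is not stated here, so treat it as an illustration of the general idea; `StructuralSketch`, `Number`, and `IsAncestor` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Conceptual sketch only; not PhoenixmlDb's documented implementation.
public static class StructuralSketch
{
    // Assigns each element a pre-order and post-order rank in one depth-first pass.
    public static Dictionary<XElement, (int Pre, int Post)> Number(XElement root)
    {
        var ranks = new Dictionary<XElement, (int Pre, int Post)>();
        int pre = 0, post = 0;

        void Visit(XElement e)
        {
            var myPre = pre++;                     // rank on entering the element
            foreach (var child in e.Elements()) Visit(child);
            ranks[e] = (myPre, post++);            // rank on leaving the element
        }

        Visit(root);
        return ranks;
    }

    // a is an ancestor of d exactly when a is entered before d and left after d.
    public static bool IsAncestor((int Pre, int Post) a, (int Pre, int Post) d) =>
        a.Pre < d.Pre && a.Post > d.Post;

    public static void Main()
    {
        var root = XElement.Parse(
            "<section><item><nested-item/></item><note/></section>");
        var ranks = Number(root);

        var item = root.Element("item");
        var nested = item.Element("nested-item");
        var note = root.Element("note");

        Console.WriteLine(IsAncestor(ranks[root], ranks[nested])); // prints True
        Console.WriteLine(IsAncestor(ranks[item], ranks[note]));   // prints False
    }
}
```

With the ranks stored, an `ancestor::` or descendant test reduces to two integer comparisons per candidate node.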
```csharp
// Enable structural indexing for a container
container.CreateIndex(new StructuralIndex("struct-idx"));
```

Queries that benefit:
```xquery
(: Parent/child navigation :)
$element/parent::*
$element/child::item

(: Ancestor/descendant :)
$element/ancestor::section
$element//nested-item

(: Sibling navigation :)
$element/following-sibling::*
$element/preceding-sibling::*
```

### Metadata Index
Metadata indexes allow efficient queries on document metadata.
```csharp
container.CreateIndex(new MetadataIndex("meta-idx",
    "author", "created", "version"));
```

Queries that benefit:
```xquery
(: Filter by metadata :)
for $doc in collection('products')
where doc-metadata($doc, 'author') = 'admin'
return $doc
```

## Creating Indexes

### During Container Creation
```csharp
var container = db.CreateContainer("products", new ContainerOptions
{
    Indexes =
    [
        new PathIndex("paths", "/product/name", "/product/category"),
        new ValueIndex("price", "/product/price", ValueType.Decimal),
        new FullTextIndex("search", "/product/description")
    ]
});
```

### On Existing Container
```csharp
var container = db.GetContainer("products");

// Create if not exists
container.CreateIndexIfNotExists(new PathIndex("name-idx", "/product/name"));

// Force recreation
container.CreateIndex(new PathIndex("name-idx", "/product/name"),
    recreateIfExists: true);
```

### Deferred Indexing
For bulk imports, defer indexing for better performance:
```csharp
// Disable auto-indexing
container.SetOption("IndexOnStore", false);

// Bulk import
using (var txn = db.BeginTransaction())
{
    foreach (var doc in documents)
    {
        container.PutDocument(doc.Name, doc.Content);
    }
    txn.Commit();
}

// Rebuild indexes
container.RebuildIndexes();

// Re-enable auto-indexing
container.SetOption("IndexOnStore", true);
```

## Managing Indexes

### List Indexes
```csharp
foreach (var index in container.ListIndexes())
{
    Console.WriteLine($"{index.Name}: {index.Type}");
    Console.WriteLine($"  Paths: {string.Join(", ", index.Paths)}");
    Console.WriteLine($"  Size: {index.SizeBytes} bytes");
    Console.WriteLine($"  Entries: {index.EntryCount}");
}
```

### Drop Index

```csharp
container.DropIndex("old-index");
```

### Rebuild Index
```csharp
// Rebuild specific index
container.RebuildIndex("price-idx");

// Rebuild all indexes
container.RebuildIndexes();
```

### Index Statistics
```csharp
var stats = container.GetIndexStats("price-idx");
Console.WriteLine($"Entries: {stats.EntryCount}");
Console.WriteLine($"Size: {stats.SizeBytes}");
Console.WriteLine($"Depth: {stats.TreeDepth}");
Console.WriteLine($"Fragmentation: {stats.Fragmentation:P}");
```

## Query Optimization

### Explain Plan
```csharp
var plan = db.Explain("""
    for $p in collection('products')//product
    where $p/price > 100
    order by $p/name
    return $p
    """);
Console.WriteLine(plan.ToString());
```

Output:

```text
Query Plan:
├─ For: $p
│  ├─ Source: collection('products')//product
│  │  └─ Index: path-idx (estimated: 1000 nodes)
│  ├─ Where: $p/price > 100
│  │  └─ Index: price-val (estimated: 250 matches)
│  └─ OrderBy: $p/name
│     └─ Index: name-val (sorted access)
└─ Return: $p

Estimated cost: 250
Index usage: 3 indexes
```

### Index Hints
```csharp
// Force specific index usage
var results = db.Query("""
    (: pragma index=price-idx :)
    for $p in collection('products')//product
    where $p/price > 100
    return $p
    """);

// Disable index usage (for testing).
// Uses a distinct variable name so both queries compile in the same scope.
var unindexedResults = db.Query("""
    (: pragma no-index :)
    for $p in collection('products')//product
    return $p
    """);
```

## Index Selection Guidelines

| Query Pattern | Recommended Index |
|---|---|
| Navigate to element/attribute | Path Index |
| Equality comparison | Value Index or Path Index |
| Range comparison (`<`, `>`, between) | Value Index |
| Sorting (`order by`) | Value Index |
| Text search | Full-Text Index |
| Tree navigation (parent, ancestor) | Structural Index |
| Metadata filtering | Metadata Index |

## Best Practices

Do:

- Index frequently queried paths: start with your most common queries
- Use appropriate value types: match the index type to your data
- Create composite indexes: combine related paths in one index
- Monitor index usage: use explain plans to verify indexes are used
- Rebuild after bulk updates: fragmented indexes slow queries

Don't:

- Over-index: each index adds storage and write overhead
- Index rarely queried paths: unused indexes waste resources
- Use wrong value types: string indexes won't help numeric ranges
- Forget to rebuild: after bulk deletes, indexes may be fragmented
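
The point about value types is easiest to see with ordering. The following self-contained C# sketch (plain .NET, no PhoenixmlDb calls; `ValueTypeSketch` is a hypothetical name) sorts the same three prices as strings and as decimals. A string-ordered index would place `100` before `20`, so it cannot serve a numeric range such as `price > 10 and price < 100`.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Illustration of why the index value type must match the data.
public static class ValueTypeSketch
{
    public static void Main()
    {
        var prices = new[] { "9.50", "100", "20" };

        // Ordered as strings: compared character by character.
        var asStrings = prices.OrderBy(p => p, StringComparer.Ordinal);
        Console.WriteLine(string.Join(", ", asStrings));
        // prints 100, 20, 9.50

        // Ordered as decimals: compared by numeric magnitude.
        var asDecimals = prices
            .Select(p => decimal.Parse(p, CultureInfo.InvariantCulture))
            .OrderBy(p => p)
            .Select(p => p.ToString(CultureInfo.InvariantCulture));
        Console.WriteLine(string.Join(", ", asDecimals));
        // prints 9.50, 20, 100
    }
}
```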

## Performance Impact

| Operation | Without Index | With Index |
|---|---|---|
| Path lookup | O(n) scan | O(log n) |
| Range query | O(n) scan | O(log n + k) |
| Full-text search | O(n) scan | O(k) |
| Sort | O(n log n) | O(n) or O(k) |

Where n = total nodes and k = matching nodes.
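
The O(log n + k) figure for range queries can be illustrated with a sorted array: a binary search finds the first match in O(log n), and reading the matches costs O(k). This is a minimal sketch of that general idea, not a description of PhoenixmlDb's internal index structure; `RangeQuerySketch`, `FirstGreaterThan`, and `Range` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Conceptual sketch only: a sorted array standing in for a value index.
public static class RangeQuerySketch
{
    // Binary search for the index of the first element > value: O(log n).
    public static int FirstGreaterThan(IReadOnlyList<decimal> sorted, decimal value)
    {
        int lo = 0, hi = sorted.Count;
        while (lo < hi)
        {
            var mid = lo + (hi - lo) / 2;
            if (sorted[mid] <= value) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Returns values v with min < v < max: O(log n) to seek, O(k) to read.
    public static List<decimal> Range(
        IReadOnlyList<decimal> sorted, decimal min, decimal max)
    {
        var result = new List<decimal>();
        for (var i = FirstGreaterThan(sorted, min);
             i < sorted.Count && sorted[i] < max;
             i++)
        {
            result.Add(sorted[i]);
        }
        return result;
    }

    public static void Main()
    {
        // Values must already be sorted, as they would be in an index.
        var prices = new decimal[] { 5m, 10m, 25m, 60m, 100m, 250m };

        // Same predicate as //product[price > 10 and price < 100]
        Console.WriteLine(string.Join(", ", Range(prices, 10m, 100m)));
        // prints 25, 60
    }
}
```

Without the sorted structure, the same predicate has to be evaluated against every value, which is the O(n) scan in the table.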