Skip to content

3.2 集合操作

概述

集合(Collection)是MongoDB中文档的分组,类似于关系型数据库中的表。集合是MongoDB存储文档的基本单位,每个集合都属于一个特定的数据库。本章节将详细介绍集合的创建、管理、配置和优化操作,帮助开发者掌握集合的核心概念和实践技巧。

集合操作是MongoDB数据库管理的基础,包括集合的创建、删除、重命名、查看集合信息、配置集合属性等。通过合理的集合管理,可以优化数据库性能、提高数据组织效率、简化应用开发流程。

基本概念

集合定义

集合是MongoDB中存储文档的逻辑容器,具有以下特点:

  • 无模式性:集合中的文档可以有不同的字段和结构
  • 动态性:可以随时添加、修改、删除文档
  • 命名规则:集合名称必须符合特定规则
  • 存储限制:单个集合大小受限于磁盘空间

集合类型

MongoDB支持多种类型的集合:

  1. 普通集合:标准的文档集合,支持所有操作
  2. 固定集合:具有固定大小的集合,自动覆盖旧数据
  3. 分片集合:分布在多个分片上的集合,支持水平扩展
  4. 时间序列集合:专门用于存储时间序列数据的集合
  5. 视图集合:基于其他集合的虚拟集合,只读访问

集合命名规范

集合名称需要遵循以下规则:

  • 长度限制:集合名称不能超过128个字符
  • 字符限制:可以使用字母、数字、下划线、点号等
  • 特殊限制:不能以system.开头(系统集合保留)
  • 大小写敏感:集合名称区分大小写
  • 空格限制:不能包含空格和特殊字符

原理深度解析

集合存储结构

MongoDB集合在磁盘上的存储结构:

php
<?php
// 集合存储结构示例
class CollectionStorageStructure {
    public static function analyzeCollection($collectionName) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $database = $client->selectDatabase("testdb");
        $collection = $database->selectCollection($collectionName);
        
        // 获取集合统计信息
        $stats = $database->command([
            'collStats' => $collectionName,
            'verbose' => true
        ])->toArray()[0];
        
        return [
            'collection_name' => $collectionName,
            'document_count' => $stats['count'],
            'total_size' => $stats['size'],
            'storage_size' => $stats['storageSize'],
            'index_count' => count($stats['indexSizes']),
            'avg_obj_size' => $stats['avgObjSize'],
            'padding_factor' => $stats['paddingFactor'] ?? 1,
            'capped' => $stats['capped'] ?? false,
            'max' => $stats['max'] ?? null,
            'size' => $stats['size'] ?? null
        ];
    }
}

// 使用示例
$structure = CollectionStorageStructure::analyzeCollection('users');
print_r($structure);
?>

集合元数据

集合元数据存储在system.namespaces集合中:

php
<?php
// 查看集合元数据
class CollectionMetadata {
    public static function getCollectionMetadata($databaseName) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $database = $client->selectDatabase($databaseName);
        
        // 查询系统集合获取元数据
        $namespaces = $database->selectCollection('system.namespaces');
        $cursor = $namespaces->find([], [
            'projection' => ['name' => 1, 'options' => 1]
        ]);
        
        $metadata = [];
        foreach ($cursor as $namespace) {
            $collectionName = str_replace($databaseName . '.', '', $namespace['name']);
            if (!str_starts_with($collectionName, 'system.')) {
                $metadata[$collectionName] = $namespace['options'] ?? [];
            }
        }
        
        return $metadata;
    }
}

// 使用示例
$metadata = CollectionMetadata::getCollectionMetadata('testdb');
print_r($metadata);
?>

集合创建原理

MongoDB创建集合的内部机制:

php
<?php
// 集合创建过程分析
class CollectionCreationProcess {
    public static function createCollectionWithAnalysis(
        $databaseName, 
        $collectionName, 
        $options = []
    ) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $database = $client->selectDatabase($databaseName);
        
        try {
            // 记录创建开始时间
            $startTime = microtime(true);
            
            // 创建集合
            $result = $database->createCollection($collectionName, $options);
            
            // 记录创建结束时间
            $endTime = microtime(true);
            $creationTime = $endTime - $startTime;
            
            // 获取集合信息
            $collection = $database->selectCollection($collectionName);
            $stats = $database->command([
                'collStats' => $collectionName
            ])->toArray()[0];
            
            return [
                'status' => 'success',
                'collection_name' => $collectionName,
                'creation_time' => $creationTime,
                'capped' => $stats['capped'] ?? false,
                'document_count' => $stats['count'],
                'storage_size' => $stats['storageSize'],
                'options_applied' => $options
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage(),
                'collection_name' => $collectionName
            ];
        }
    }
}

// 使用示例 - 创建普通集合
$result1 = CollectionCreationProcess::createCollectionWithAnalysis(
    'testdb', 
    'products'
);
print_r($result1);

// 使用示例 - 创建固定集合
$result2 = CollectionCreationProcess::createCollectionWithAnalysis(
    'testdb', 
    'logs',
    [
        'capped' => true,
        'size' => 1024 * 1024 * 100, // 100MB
        'max' => 10000
    ]
);
print_r($result2);
?>

集合删除机制

集合删除的内部过程和影响:

php
<?php
// 集合删除过程分析
class CollectionDeletionProcess {
    public static function deleteCollectionWithAnalysis(
        $databaseName, 
        $collectionName,
        $backup = false
    ) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $database = $client->selectDatabase($databaseName);
        $collection = $database->selectCollection($collectionName);
        
        try {
            // 检查集合是否存在
            $collections = $database->listCollectionNames(['name' => $collectionName]);
            if (empty(iterator_to_array($collections))) {
                return [
                    'status' => 'error',
                    'message' => 'Collection does not exist',
                    'collection_name' => $collectionName
                ];
            }
            
            // 备份数据(如果需要)
            $backupData = null;
            if ($backup) {
                $backupData = iterator_to_array($collection->find());
            }
            
            // 获取删除前的统计信息
            $statsBefore = $database->command([
                'collStats' => $collectionName
            ])->toArray()[0];
            
            // 执行删除操作
            $result = $database->dropCollection($collectionName);
            
            return [
                'status' => 'success',
                'collection_name' => $collectionName,
                'documents_deleted' => $statsBefore['count'],
                'storage_freed' => $statsBefore['storageSize'],
                'backup_created' => $backup,
                'backup_count' => $backup ? count($backupData) : 0
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage(),
                'collection_name' => $collectionName
            ];
        }
    }
}

// 使用示例
$result = CollectionDeletionProcess::deleteCollectionWithAnalysis(
    'testdb', 
    'temp_collection',
    true
);
print_r($result);
?>

常见错误与踩坑点

错误1:集合名称不符合规范

问题描述:使用不符合规范的集合名称导致创建失败。

php
<?php
// 错误示例 - 使用特殊字符
try {
    $client = new MongoDB\Client("mongodb://localhost:27017");
    $database = $client->selectDatabase("testdb");
    
    // 错误:集合名称包含空格
    $database->createCollection("user data");
    
    // 错误:集合名称以system.开头
    $database->createCollection("system.users");
    
    // 错误:集合名称过长
    $longName = str_repeat('a', 150);
    $database->createCollection($longName);
    
} catch (Exception $e) {
    echo "错误: " . $e->getMessage() . "\n";
}

// 正确示例 - 使用规范的集合名称
$client = new MongoDB\Client("mongodb://localhost:27017");
$database = $client->selectDatabase("testdb");

// 正确:使用下划线替代空格
$database->createCollection("user_data");

// 正确:使用描述性的名称
$database->createCollection("user_profiles");

// 正确:使用驼峰命名
$database->createCollection("userSettings");
?>

错误2:固定集合大小设置不当

问题描述:固定集合大小设置过小或过大导致性能问题。

php
<?php
// 错误示例 - 固定集合大小设置不当
try {
    $client = new MongoDB\Client("mongodb://localhost:27017");
    $database = $client->selectDatabase("testdb");
    
    // 错误:固定集合大小过小
    $database->createCollection("tiny_logs", [
        'capped' => true,
        'size' => 1024 // 1KB,太小了
    ]);
    
    // 错误:固定集合大小过大
    $database->createCollection("huge_logs", [
        'capped' => true,
        'size' => 1024 * 1024 * 1024 * 1000 // 1TB,太大
    ]);
    
    // 错误:max参数与size参数不匹配
    $database->createCollection("mismatched_logs", [
        'capped' => true,
        'size' => 1024 * 1024, // 1MB
        'max' => 1000000 // 文档数量过大
    ]);
    
} catch (Exception $e) {
    echo "错误: " . $e->getMessage() . "\n";
}

// 正确示例 - 合理设置固定集合参数
$client = new MongoDB\Client("mongodb://localhost:27017");
$database = $client->selectDatabase("testdb");

// 正确:根据实际需求设置大小
$database->createCollection("app_logs", [
    'capped' => true,
    'size' => 1024 * 1024 * 100, // 100MB
    'max' => 10000 // 最多10000条记录
]);

// 正确:监控固定集合使用情况
class CappedCollectionMonitor {
    public static function monitorCappedCollection($databaseName, $collectionName) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $database = $client->selectDatabase($databaseName);
        
        $stats = $database->command([
            'collStats' => $collectionName
        ])->toArray()[0];
        
        $usagePercentage = ($stats['size'] / $stats['storageSize']) * 100;
        
        return [
            'collection_name' => $collectionName,
            'current_size' => $stats['size'],
            'max_size' => $stats['storageSize'],
            'usage_percentage' => round($usagePercentage, 2),
            'document_count' => $stats['count'],
            'max_documents' => $stats['max'],
            'status' => $usagePercentage > 90 ? 'warning' : 'normal'
        ];
    }
}

$monitor = CappedCollectionMonitor::monitorCappedCollection('testdb', 'app_logs');
print_r($monitor);
?>

错误3:集合删除前未检查依赖关系

问题描述:删除集合时未检查应用程序依赖关系导致数据丢失。

php
<?php
// 错误示例 - 直接删除集合
try {
    $client = new MongoDB\Client("mongodb://localhost:27017");
    $database = $client->selectDatabase("testdb");
    
    // 错误:直接删除集合,未检查依赖
    $database->dropCollection("users");
    
} catch (Exception $e) {
    echo "错误: " . $e->getMessage() . "\n";
}

// 正确示例 - 删除前检查依赖关系
class SafeCollectionDeleter {
    public static function checkDependencies($databaseName, $collectionName) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $database = $client->selectDatabase($databaseName);
        
        $dependencies = [];
        
        // 检查是否有其他集合引用此集合
        $collections = $database->listCollectionNames();
        foreach ($collections as $coll) {
            if ($coll !== $collectionName && !str_starts_with($coll, 'system.')) {
                $collection = $database->selectCollection($coll);
                
                // 检查是否有引用字段
                $sampleDoc = $collection->findOne();
                if ($sampleDoc) {
                    foreach ($sampleDoc as $field => $value) {
                        if (is_array($value) && isset($value['$ref']) && 
                            $value['$ref'] === $collectionName) {
                            $dependencies[] = [
                                'collection' => $coll,
                                'field' => $field,
                                'type' => 'DBRef'
                            ];
                        }
                    }
                }
            }
        }
        
        return $dependencies;
    }
    
    public static function safeDelete($databaseName, $collectionName) {
        $dependencies = self::checkDependencies($databaseName, $collectionName);
        
        if (!empty($dependencies)) {
            return [
                'status' => 'warning',
                'message' => 'Collection has dependencies',
                'dependencies' => $dependencies
            ];
        }
        
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $database = $client->selectDatabase($databaseName);
        
        $database->dropCollection($collectionName);
        
        return [
            'status' => 'success',
            'message' => 'Collection deleted successfully',
            'collection_name' => $collectionName
        ];
    }
}

// 使用示例
$result = SafeCollectionDeleter::safeDelete('testdb', 'users');
print_r($result);
?>

错误4:集合重命名导致索引失效

问题描述:集合重命名后未更新索引配置导致性能下降。

php
<?php
// 错误示例 - 直接重命名集合
try {
    $client = new MongoDB\Client("mongodb://localhost:27017");
    $database = $client->selectDatabase("testdb");
    
    // 错误:直接重命名,未考虑索引影响
    $database->command([
        'renameCollection' => 'testdb.old_users',
        'to' => 'testdb.users'
    ]);
    
} catch (Exception $e) {
    echo "错误: " . $e->getMessage() . "\n";
}

// 正确示例 - 重命名时处理索引
class SafeCollectionRenamer {
    public static function renameWithIndexPreservation(
        $databaseName, 
        $oldName, 
        $newName
    ) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $database = $client->selectDatabase($databaseName);
        
        try {
            // 获取原集合的索引信息
            $oldCollection = $database->selectCollection($oldName);
            $indexes = $oldCollection->listIndexes();
            
            $indexSpecs = [];
            foreach ($indexes as $index) {
                if ($index->getName() !== '_id_') {
                    $indexSpecs[] = [
                        'name' => $index->getName(),
                        'key' => $index->getKey(),
                        'options' => [
                            'unique' => $index->isUnique(),
                            'sparse' => $index->isSparse(),
                            'background' => true
                        ]
                    ];
                }
            }
            
            // 重命名集合
            $database->command([
                'renameCollection' => $databaseName . '.' . $oldName,
                'to' => $databaseName . '.' . $newName
            ]);
            
            // 重建索引
            $newCollection = $database->selectCollection($newName);
            foreach ($indexSpecs as $spec) {
                $newCollection->createIndex(
                    $spec['key'],
                    array_merge($spec['options'], ['name' => $spec['name']])
                );
            }
            
            return [
                'status' => 'success',
                'old_name' => $oldName,
                'new_name' => $newName,
                'indexes_preserved' => count($indexSpecs)
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
}

// 使用示例
$result = SafeCollectionRenamer::renameWithIndexPreservation(
    'testdb', 
    'old_users', 
    'users'
);
print_r($result);
?>

常见应用场景

场景1:日志管理系统

使用固定集合存储应用日志:

php
<?php
// 日志管理系统 - 使用固定集合
class LogManagementSystem {
    private $database;
    private $logCollection;
    
    public function __construct($databaseName = 'app_logs') {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $this->database = $client->selectDatabase($databaseName);
        $this->initializeCollections();
    }
    
    private function initializeCollections() {
        // 创建不同级别的日志集合
        $logLevels = ['error', 'warning', 'info', 'debug'];
        
        foreach ($logLevels as $level) {
            $collectionName = 'logs_' . $level;
            
            try {
                $this->database->createCollection($collectionName, [
                    'capped' => true,
                    'size' => $this->getCollectionSize($level),
                    'max' => $this->getMaxDocuments($level)
                ]);
                
                // 创建索引
                $collection = $this->database->selectCollection($collectionName);
                $collection->createIndex(['timestamp' => -1]);
                $collection->createIndex(['level' => 1, 'timestamp' => -1]);
                
            } catch (Exception $e) {
                // 集合可能已存在
            }
        }
    }
    
    private function getCollectionSize($level) {
        $sizes = [
            'error' => 1024 * 1024 * 200, // 200MB
            'warning' => 1024 * 1024 * 150, // 150MB
            'info' => 1024 * 1024 * 100, // 100MB
            'debug' => 1024 * 1024 * 50 // 50MB
        ];
        return $sizes[$level] ?? 1024 * 1024 * 100;
    }
    
    private function getMaxDocuments($level) {
        $maxDocs = [
            'error' => 50000,
            'warning' => 100000,
            'info' => 200000,
            'debug' => 500000
        ];
        return $maxDocs[$level] ?? 100000;
    }
    
    public function log($level, $message, $context = []) {
        $collectionName = 'logs_' . $level;
        $collection = $this->database->selectCollection($collectionName);
        
        $logEntry = [
            'timestamp' => new MongoDB\BSON\UTCDateTime(),
            'level' => $level,
            'message' => $message,
            'context' => $context,
            'server' => gethostname(),
            'process_id' => getmypid()
        ];
        
        $result = $collection->insertOne($logEntry);
        
        return [
            'success' => $result->getInsertedCount() > 0,
            'log_id' => $result->getInsertedId(),
            'level' => $level
        ];
    }
    
    public function queryLogs($level, $filters = [], $limit = 100) {
        $collectionName = 'logs_' . $level;
        $collection = $this->database->selectCollection($collectionName);
        
        $query = [];
        if (!empty($filters)) {
            $query = $filters;
        }
        
        $cursor = $collection->find($query, [
            'sort' => ['timestamp' => -1],
            'limit' => $limit
        ]);
        
        return iterator_to_array($cursor);
    }
    
    public function getLogStatistics() {
        $stats = [];
        $logLevels = ['error', 'warning', 'info', 'debug'];
        
        foreach ($logLevels as $level) {
            $collectionName = 'logs_' . $level;
            $collection = $this->database->selectCollection($collectionName);
            
            $collStats = $this->database->command([
                'collStats' => $collectionName
            ])->toArray()[0];
            
            $stats[$level] = [
                'count' => $collStats['count'],
                'size' => $collStats['size'],
                'storage_size' => $collStats['storageSize'],
                'usage_percentage' => round(
                    ($collStats['size'] / $collStats['storageSize']) * 100, 
                    2
                )
            ];
        }
        
        return $stats;
    }
}

// 使用示例
$logSystem = new LogManagementSystem();

// 记录不同级别的日志
$logSystem->log('error', 'Database connection failed', [
    'error_code' => 'CONNECTION_TIMEOUT',
    'retry_count' => 3
]);

$logSystem->log('warning', 'Memory usage high', [
    'usage_percentage' => 85,
    'threshold' => 80
]);

$logSystem->log('info', 'User login successful', [
    'user_id' => '12345',
    'ip_address' => '192.168.1.1'
]);

// 查询错误日志
$errorLogs = $logSystem->queryLogs('error', [], 10);
print_r($errorLogs);

// 获取日志统计信息
$stats = $logSystem->getLogStatistics();
print_r($stats);
?>

场景2:多租户数据隔离

使用集合隔离不同租户的数据:

php
<?php
// 多租户数据隔离系统
class MultiTenantCollectionManager {
    private $database;
    
    public function __construct($databaseName = 'multi_tenant_db') {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $this->database = $client->selectDatabase($databaseName);
    }
    
    public function createTenantCollection($tenantId, $collectionType) {
        $collectionName = $this->getCollectionName($tenantId, $collectionType);
        
        try {
            // 创建租户集合
            $this->database->createCollection($collectionName);
            $collection = $this->database->selectCollection($collectionName);
            
            // 创建租户索引
            $collection->createIndex(['tenant_id' => 1]);
            $collection->createIndex(['created_at' => -1]);
            
            // 根据集合类型创建特定索引
            $this->createTypeSpecificIndexes($collection, $collectionType);
            
            return [
                'status' => 'success',
                'collection_name' => $collectionName,
                'tenant_id' => $tenantId
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    private function getCollectionName($tenantId, $collectionType) {
        return sprintf('tenant_%s_%s', $tenantId, $collectionType);
    }
    
    private function createTypeSpecificIndexes($collection, $collectionType) {
        $indexConfigs = [
            'users' => [
                ['email' => 1],
                ['username' => 1],
                ['status' => 1]
            ],
            'orders' => [
                ['order_number' => 1],
                ['user_id' => 1],
                ['status' => 1],
                ['created_at' => -1]
            ],
            'products' => [
                ['sku' => 1],
                ['name' => 'text'],
                ['category' => 1]
            ]
        ];
        
        if (isset($indexConfigs[$collectionType])) {
            foreach ($indexConfigs[$collectionType] as $index) {
                $collection->createIndex($index);
            }
        }
    }
    
    public function insertTenantData($tenantId, $collectionType, $data) {
        $collectionName = $this->getCollectionName($tenantId, $collectionType);
        $collection = $this->database->selectCollection($collectionName);
        
        // 添加租户ID和时间戳
        $data['tenant_id'] = $tenantId;
        $data['created_at'] = new MongoDB\BSON\UTCDateTime();
        $data['updated_at'] = new MongoDB\BSON\UTCDateTime();
        
        try {
            $result = $collection->insertOne($data);
            
            return [
                'status' => 'success',
                'inserted_id' => $result->getInsertedId(),
                'tenant_id' => $tenantId
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    public function queryTenantData($tenantId, $collectionType, $filter = [], $options = []) {
        $collectionName = $this->getCollectionName($tenantId, $collectionType);
        $collection = $this->database->selectCollection($collectionName);
        
        // 确保查询包含租户ID
        $filter['tenant_id'] = $tenantId;
        
        try {
            $cursor = $collection->find($filter, $options);
            return iterator_to_array($cursor);
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    public function deleteTenantCollections($tenantId) {
        $collections = $this->database->listCollectionNames([
            'filter' => ['name' => new MongoDB\BSON\Regex("^tenant_{$tenantId}_")]
        ]);
        
        $deletedCollections = [];
        foreach ($collections as $collectionName) {
            $this->database->dropCollection($collectionName);
            $deletedCollections[] = $collectionName;
        }
        
        return [
            'status' => 'success',
            'tenant_id' => $tenantId,
            'deleted_collections' => $deletedCollections,
            'count' => count($deletedCollections)
        ];
    }
    
    public function getTenantStatistics($tenantId) {
        $collections = $this->database->listCollectionNames([
            'filter' => ['name' => new MongoDB\BSON\Regex("^tenant_{$tenantId}_")]
        ]);
        
        $stats = [];
        foreach ($collections as $collectionName) {
            $collection = $this->database->selectCollection($collectionName);
            $collStats = $this->database->command([
                'collStats' => $collectionName
            ])->toArray()[0];
            
            $collectionType = str_replace("tenant_{$tenantId}_", '', $collectionName);
            $stats[$collectionType] = [
                'document_count' => $collStats['count'],
                'storage_size' => $collStats['storageSize'],
                'index_count' => count($collStats['indexSizes'])
            ];
        }
        
        return $stats;
    }
}

// 使用示例
$tenantManager = new MultiTenantCollectionManager();

// 为租户创建集合
$tenantId = 'tenant_001';
$tenantManager->createTenantCollection($tenantId, 'users');
$tenantManager->createTenantCollection($tenantId, 'orders');
$tenantManager->createTenantCollection($tenantId, 'products');

// 插入租户数据
$tenantManager->insertTenantData($tenantId, 'users', [
    'name' => 'John Doe',
    'email' => 'john@example.com',
    'status' => 'active'
]);

$tenantManager->insertTenantData($tenantId, 'orders', [
    'order_number' => 'ORD-001',
    'user_id' => 'user_001',
    'status' => 'pending',
    'total_amount' => 99.99
]);

// 查询租户数据
$users = $tenantManager->queryTenantData($tenantId, 'users', ['status' => 'active']);
print_r($users);

// 获取租户统计信息
$stats = $tenantManager->getTenantStatistics($tenantId);
print_r($stats);
?>

场景3:时间序列数据管理

使用时间序列集合存储传感器数据:

php
<?php
// 时间序列数据管理系统
class TimeSeriesDataManager {
    private $database;
    
    public function __construct($databaseName = 'timeseries_db') {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $this->database = $client->selectDatabase($databaseName);
    }
    
    public function createTimeSeriesCollection($collectionName, $timeField = 'timestamp', $metaField = 'metadata') {
        try {
            $this->database->createCollection($collectionName, [
                'timeseries' => [
                    'timeField' => $timeField,
                    'metaField' => $metaField,
                    'granularity' => 'seconds'
                ]
            ]);
            
            $collection = $this->database->selectCollection($collectionName);
            
            // 创建索引
            $collection->createIndex([$timeField => -1]);
            $collection->createIndex([
                $metaField . '.sensor_id' => 1,
                $timeField => -1
            ]);
            
            return [
                'status' => 'success',
                'collection_name' => $collectionName,
                'time_field' => $timeField,
                'meta_field' => $metaField
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    public function insertTimeSeriesData($collectionName, $sensorId, $value, $timestamp = null) {
        $collection = $this->database->selectCollection($collectionName);
        
        if ($timestamp === null) {
            $timestamp = new MongoDB\BSON\UTCDateTime();
        }
        
        $dataPoint = [
            'timestamp' => $timestamp,
            'value' => $value,
            'metadata' => [
                'sensor_id' => $sensorId,
                'location' => $this->getSensorLocation($sensorId),
                'type' => $this->getSensorType($sensorId)
            ]
        ];
        
        try {
            $result = $collection->insertOne($dataPoint);
            
            return [
                'status' => 'success',
                'inserted_id' => $result->getInsertedId(),
                'sensor_id' => $sensorId
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    public function batchInsertTimeSeriesData($collectionName, $sensorId, $dataPoints) {
        $collection = $this->database->selectCollection($collectionName);
        
        $documents = [];
        foreach ($dataPoints as $point) {
            $documents[] = [
                'timestamp' => $point['timestamp'] ?? new MongoDB\BSON\UTCDateTime(),
                'value' => $point['value'],
                'metadata' => [
                    'sensor_id' => $sensorId,
                    'location' => $this->getSensorLocation($sensorId),
                    'type' => $this->getSensorType($sensorId)
                ]
            ];
        }
        
        try {
            $result = $collection->insertMany($documents);
            
            return [
                'status' => 'success',
                'inserted_count' => $result->getInsertedCount(),
                'sensor_id' => $sensorId
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    public function queryTimeSeriesData($collectionName, $sensorId, $startTime, $endTime, $options = []) {
        $collection = $this->database->selectCollection($collectionName);
        
        $filter = [
            'metadata.sensor_id' => $sensorId,
            'timestamp' => [
                '$gte' => $startTime,
                '$lte' => $endTime
            ]
        ];
        
        $defaultOptions = [
            'sort' => ['timestamp' => 1],
            'limit' => 10000
        ];
        
        $queryOptions = array_merge($defaultOptions, $options);
        
        try {
            $cursor = $collection->find($filter, $queryOptions);
            return iterator_to_array($cursor);
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    public function aggregateTimeSeriesData($collectionName, $sensorId, $startTime, $endTime, $interval = 'hour') {
        $collection = $this->database->selectCollection($collectionName);
        
        $pipeline = [
            [
                '$match' => [
                    'metadata.sensor_id' => $sensorId,
                    'timestamp' => [
                        '$gte' => $startTime,
                        '$lte' => $endTime
                    ]
                ]
            ],
            [
                '$group' => [
                    '_id' => $this->getGroupExpression($interval),
                    'avg_value' => ['$avg' => '$value'],
                    'min_value' => ['$min' => '$value'],
                    'max_value' => ['$max' => '$value'],
                    'count' => ['$sum' => 1]
                ]
            ],
            [
                '$sort' => ['_id' => 1]
            ]
        ];
        
        try {
            $cursor = $collection->aggregate($pipeline);
            return iterator_to_array($cursor);
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    private function getGroupExpression($interval) {
        $expressions = [
            'minute' => [
                'year' => ['$year' => '$timestamp'],
                'month' => ['$month' => '$timestamp'],
                'day' => ['$dayOfMonth' => '$timestamp'],
                'hour' => ['$hour' => '$timestamp'],
                'minute' => ['$minute' => '$timestamp']
            ],
            'hour' => [
                'year' => ['$year' => '$timestamp'],
                'month' => ['$month' => '$timestamp'],
                'day' => ['$dayOfMonth' => '$timestamp'],
                'hour' => ['$hour' => '$timestamp']
            ],
            'day' => [
                'year' => ['$year' => '$timestamp'],
                'month' => ['$month' => '$timestamp'],
                'day' => ['$dayOfMonth' => '$timestamp']
            ]
        ];
        
        return $expressions[$interval] ?? $expressions['hour'];
    }
    
    private function getSensorLocation($sensorId) {
        $locations = [
            'sensor_001' => 'Building A, Floor 1',
            'sensor_002' => 'Building A, Floor 2',
            'sensor_003' => 'Building B, Floor 1'
        ];
        return $locations[$sensorId] ?? 'Unknown Location';
    }
    
    private function getSensorType($sensorId) {
        $types = [
            'sensor_001' => 'temperature',
            'sensor_002' => 'humidity',
            'sensor_003' => 'pressure'
        ];
        return $types[$sensorId] ?? 'unknown';
    }
}

// 使用示例
$tsManager = new TimeSeriesDataManager();

// 创建时间序列集合
$tsManager->createTimeSeriesCollection('sensor_data', 'timestamp', 'metadata');

// 插入时间序列数据
$now = new MongoDB\BSON\UTCDateTime();
for ($i = 0; $i < 100; $i++) {
    $timestamp = new MongoDB\BSON\UTCDateTime($now->toDateTime()->modify("-{$i} minutes")->getTimestamp() * 1000);
    $tsManager->insertTimeSeriesData('sensor_data', 'sensor_001', 20 + rand(-5, 5), $timestamp);
}

// 查询时间序列数据
$startTime = new MongoDB\BSON\UTCDateTime((time() - 3600) * 1000);
$endTime = new MongoDB\BSON\UTCDateTime();
$data = $tsManager->queryTimeSeriesData('sensor_data', 'sensor_001', $startTime, $endTime);
print_r($data);

// 聚合时间序列数据
$aggregated = $tsManager->aggregateTimeSeriesData('sensor_data', 'sensor_001', $startTime, $endTime, 'hour');
print_r($aggregated);
?>

企业级进阶应用场景

场景1:分布式集合管理

管理跨多个MongoDB实例的集合:

php
<?php
// 分布式集合管理系统
class DistributedCollectionManager {
    private $connections = [];
    private $config;
    
    public function __construct($configFile) {
        $this->config = json_decode(file_get_contents($configFile), true);
        $this->initializeConnections();
    }
    
    private function initializeConnections() {
        foreach ($this->config['instances'] as $instance) {
            $connectionString = sprintf(
                "mongodb://%s:%s@%s:%d/%s",
                $instance['username'],
                $instance['password'],
                $instance['host'],
                $instance['port'],
                $instance['database']
            );
            
            $this->connections[$instance['name']] = [
                'client' => new MongoDB\Client($connectionString),
                'database' => $instance['database'],
                'config' => $instance
            ];
        }
    }
    
    public function createDistributedCollection($collectionName, $options = []) {
        $results = [];
        
        foreach ($this->connections as $instanceName => $connection) {
            try {
                $database = $connection['client']->selectDatabase($connection['database']);
                $database->createCollection($collectionName, $options);
                
                $collection = $database->selectCollection($collectionName);
                $this->createStandardIndexes($collection);
                
                $results[$instanceName] = [
                    'status' => 'success',
                    'instance' => $instanceName,
                    'collection_name' => $collectionName
                ];
                
            } catch (Exception $e) {
                $results[$instanceName] = [
                    'status' => 'error',
                    'instance' => $instanceName,
                    'message' => $e->getMessage()
                ];
            }
        }
        
        return $results;
    }
    
    private function createStandardIndexes($collection) {
        $collection->createIndex(['created_at' => -1]);
        $collection->createIndex(['updated_at' => -1]);
        $collection->createIndex(['status' => 1]);
    }
    
    public function syncCollectionData($sourceInstance, $targetInstance, $collectionName, $batchSize = 1000) {
        $source = $this->connections[$sourceInstance];
        $target = $this->connections[$targetInstance];
        
        $sourceDatabase = $source['client']->selectDatabase($source['database']);
        $targetDatabase = $target['client']->selectDatabase($target['database']);
        
        $sourceCollection = $sourceDatabase->selectCollection($collectionName);
        $targetCollection = $targetDatabase->selectCollection($collectionName);
        
        $syncResults = [
            'total_documents' => 0,
            'synced_documents' => 0,
            'failed_documents' => 0,
            'batches' => 0
        ];
        
        try {
            // 获取总文档数
            $syncResults['total_documents'] = $sourceCollection->countDocuments();
            
            // 批量同步数据
            $skip = 0;
            while (true) {
                $documents = $sourceCollection->find([], [
                    'skip' => $skip,
                    'limit' => $batchSize
                ])->toArray();
                
                if (empty($documents)) {
                    break;
                }
                
                try {
                    $result = $targetCollection->insertMany($documents);
                    $syncResults['synced_documents'] += $result->getInsertedCount();
                    
                } catch (Exception $e) {
                    $syncResults['failed_documents'] += count($documents);
                }
                
                $syncResults['batches']++;
                $skip += $batchSize;
            }
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
        
        return [
            'status' => 'success',
            'results' => $syncResults
        ];
    }
    
    public function getDistributedCollectionStats($collectionName) {
        $stats = [];
        
        foreach ($this->connections as $instanceName => $connection) {
            try {
                $database = $connection['client']->selectDatabase($connection['database']);
                $collStats = $database->command([
                    'collStats' => $collectionName
                ])->toArray()[0];
                
                $stats[$instanceName] = [
                    'document_count' => $collStats['count'],
                    'storage_size' => $collStats['storageSize'],
                    'index_count' => count($collStats['indexSizes']),
                    'avg_obj_size' => $collStats['avgObjSize']
                ];
                
            } catch (Exception $e) {
                $stats[$instanceName] = [
                    'error' => $e->getMessage()
                ];
            }
        }
        
        return $stats;
    }
    
    public function rebalanceCollections($collectionName, $targetDistribution) {
        $results = [];
        
        foreach ($targetDistribution as $instanceName => $percentage) {
            $results[$instanceName] = [
                'target_percentage' => $percentage,
                'status' => 'pending'
            ];
        }
        
        // 实现负载均衡逻辑
        // 这里简化处理,实际需要更复杂的算法
        
        return [
            'status' => 'success',
            'rebalance_results' => $results
        ];
    }
}

// 使用示例
$config = [
    'instances' => [
        [
            'name' => 'primary',
            'host' => 'mongodb1.example.com',
            'port' => 27017,
            'database' => 'app_db',
            'username' => 'admin',
            'password' => 'password'
        ],
        [
            'name' => 'secondary',
            'host' => 'mongodb2.example.com',
            'port' => 27017,
            'database' => 'app_db',
            'username' => 'admin',
            'password' => 'password'
        ]
    ]
];

file_put_contents('mongodb_config.json', json_encode($config, JSON_PRETTY_PRINT));

$manager = new DistributedCollectionManager('mongodb_config.json');

// 在所有实例上创建集合
$result = $manager->createDistributedCollection('distributed_users');
print_r($result);

// 同步集合数据
$syncResult = $manager->syncCollectionData('primary', 'secondary', 'distributed_users');
print_r($syncResult);

// 获取分布式集合统计信息
$stats = $manager->getDistributedCollectionStats('distributed_users');
print_r($stats);
?>

场景2:集合生命周期管理

自动化管理集合的创建、维护和删除:

php
<?php
// 集合生命周期管理系统
class CollectionLifecycleManager {
    private $database;
    private $policies;
    
    public function __construct($databaseName, $policyFile) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $this->database = $client->selectDatabase($databaseName);
        $this->policies = json_decode(file_get_contents($policyFile), true);
    }
    
    public function enforceCollectionPolicies() {
        $results = [];
        
        foreach ($this->policies['collections'] as $collectionPolicy) {
            $collectionName = $collectionPolicy['name'];
            
            switch ($collectionPolicy['action']) {
                case 'create':
                    $results[$collectionName] = $this->createCollectionWithPolicy($collectionPolicy);
                    break;
                    
                case 'maintain':
                    $results[$collectionName] = $this->maintainCollection($collectionPolicy);
                    break;
                    
                case 'archive':
                    $results[$collectionName] = $this->archiveCollection($collectionPolicy);
                    break;
                    
                case 'delete':
                    $results[$collectionName] = $this->deleteCollectionWithPolicy($collectionPolicy);
                    break;
            }
        }
        
        return $results;
    }
    
    private function createCollectionWithPolicy($policy) {
        try {
            $this->database->createCollection($policy['name'], $policy['options'] ?? []);
            $collection = $this->database->selectCollection($policy['name']);
            
            // 创建策略定义的索引
            if (isset($policy['indexes'])) {
                foreach ($policy['indexes'] as $indexConfig) {
                    $collection->createIndex($indexConfig['key'], $indexConfig['options'] ?? []);
                }
            }
            
            return [
                'status' => 'success',
                'action' => 'create',
                'collection_name' => $policy['name']
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'action' => 'create',
                'message' => $e->getMessage()
            ];
        }
    }
    
    private function maintainCollection($policy) {
        $collection = $this->database->selectCollection($policy['name']);
        $maintenanceResults = [];
        
        // 数据清理
        if (isset($policy['retention'])) {
            $cleanupResult = $this->cleanupOldData($collection, $policy['retention']);
            $maintenanceResults['cleanup'] = $cleanupResult;
        }
        
        // 索引优化
        if (isset($policy['index_optimization'])) {
            $optimizeResult = $this->optimizeIndexes($collection, $policy['index_optimization']);
            $maintenanceResults['index_optimization'] = $optimizeResult;
        }
        
        // 数据压缩
        if (isset($policy['compression'])) {
            $compressResult = $this->compressCollection($collection, $policy['compression']);
            $maintenanceResults['compression'] = $compressResult;
        }
        
        return [
            'status' => 'success',
            'action' => 'maintain',
            'collection_name' => $policy['name'],
            'results' => $maintenanceResults
        ];
    }
    
    private function cleanupOldData($collection, $retentionPolicy) {
        $dateField = $retentionPolicy['date_field'] ?? 'created_at';
        $retentionDays = $retentionPolicy['days'] ?? 30;
        
        $cutoffDate = new MongoDB\BSON\UTCDateTime(
            (time() - $retentionDays * 86400) * 1000
        );
        
        $result = $collection->deleteMany([
            $dateField => ['$lt' => $cutoffDate]
        ]);
        
        return [
            'deleted_count' => $result->getDeletedCount(),
            'cutoff_date' => $cutoffDate->toDateTime()->format('Y-m-d H:i:s')
        ];
    }
    
    private function optimizeIndexes($collection, $optimizationPolicy) {
        $results = [];
        
        // 重建索引
        if ($optimizationPolicy['rebuild'] ?? false) {
            $indexes = $collection->listIndexes();
            $indexNames = [];
            
            foreach ($indexes as $index) {
                if ($index->getName() !== '_id_') {
                    $indexNames[] = $index->getName();
                }
            }
            
            foreach ($indexNames as $indexName) {
                $collection->dropIndex($indexName);
            }
            
            // 重新创建索引(需要从策略中获取索引定义)
            $results['rebuild'] = [
                'dropped_indexes' => count($indexNames)
            ];
        }
        
        return $results;
    }
    
    private function compressCollection($collection, $compressionPolicy) {
        $result = $this->database->command([
            'compact' => $collection->getCollectionName()
        ])->toArray()[0];
        
        return [
            'status' => $result['ok'] == 1 ? 'success' : 'error',
            'result' => $result
        ];
    }
    
    private function archiveCollection($policy) {
        $sourceCollection = $this->database->selectCollection($policy['name']);
        $archiveCollectionName = $policy['archive_name'] ?? $policy['name'] . '_archive';
        
        try {
            // 创建归档集合
            $this->database->createCollection($archiveCollectionName, [
                'capped' => false
            ]);
            
            $archiveCollection = $this->database->selectCollection($archiveCollectionName);
            
            // 复制数据到归档集合
            $documents = $sourceCollection->find()->toArray();
            if (!empty($documents)) {
                $archiveCollection->insertMany($documents);
            }
            
            // 根据策略决定是否删除原集合数据
            if ($policy['delete_after_archive'] ?? false) {
                $sourceCollection->deleteMany([]);
            }
            
            return [
                'status' => 'success',
                'action' => 'archive',
                'source_collection' => $policy['name'],
                'archive_collection' => $archiveCollectionName,
                'archived_count' => count($documents)
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'action' => 'archive',
                'message' => $e->getMessage()
            ];
        }
    }
    
    private function deleteCollectionWithPolicy($policy) {
        try {
            // 备份数据(如果策略要求)
            if ($policy['backup_before_delete'] ?? false) {
                $this->backupCollection($policy['name']);
            }
            
            // 删除集合
            $this->database->dropCollection($policy['name']);
            
            return [
                'status' => 'success',
                'action' => 'delete',
                'collection_name' => $policy['name']
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'action' => 'delete',
                'message' => $e->getMessage()
            ];
        }
    }
    
    private function backupCollection($collectionName) {
        $collection = $this->database->selectCollection($collectionName);
        $backupCollectionName = $collectionName . '_backup_' . date('YmdHis');
        
        $this->database->createCollection($backupCollectionName);
        $backupCollection = $this->database->selectCollection($backupCollectionName);
        
        $documents = $collection->find()->toArray();
        if (!empty($documents)) {
            $backupCollection->insertMany($documents);
        }
        
        return $backupCollectionName;
    }
    
    public function getCollectionHealthStatus() {
        $collections = $this->database->listCollectionNames();
        $healthStatus = [];
        
        foreach ($collections as $collectionName) {
            if (str_starts_with($collectionName, 'system.')) {
                continue;
            }
            
            $collection = $this->database->selectCollection($collectionName);
            $stats = $this->database->command([
                'collStats' => $collectionName
            ])->toArray()[0];
            
            $healthStatus[$collectionName] = [
                'document_count' => $stats['count'],
                'storage_size' => $stats['storageSize'],
                'index_count' => count($stats['indexSizes']),
                'avg_obj_size' => $stats['avgObjSize'],
                'padding_factor' => $stats['paddingFactor'] ?? 1,
                'health_score' => $this->calculateHealthScore($stats)
            ];
        }
        
        return $healthStatus;
    }
    
    private function calculateHealthScore($stats) {
        $score = 100;
        
        // 根据存储使用率调整分数
        if ($stats['paddingFactor'] > 1.5) {
            $score -= 10;
        }
        
        // 根据文档大小调整分数
        if ($stats['avgObjSize'] > 10000) {
            $score -= 5;
        }
        
        return max(0, $score);
    }
}

// 使用示例
$policyConfig = [
    'collections' => [
        [
            'name' => 'temp_data',
            'action' => 'maintain',
            'retention' => [
                'date_field' => 'created_at',
                'days' => 7
            ],
            'index_optimization' => [
                'rebuild' => true
            ]
        ],
        [
            'name' => 'archived_logs',
            'action' => 'archive',
            'archive_name' => 'logs_archive_2024',
            'delete_after_archive' => true
        ],
        [
            'name' => 'obsolete_collection',
            'action' => 'delete',
            'backup_before_delete' => true
        ]
    ]
];

file_put_contents('collection_policies.json', json_encode($policyConfig, JSON_PRETTY_PRINT));

$lifecycleManager = new CollectionLifecycleManager('testdb', 'collection_policies.json');

// 执行集合策略
$results = $lifecycleManager->enforceCollectionPolicies();
print_r($results);

// 获取集合健康状态
$healthStatus = $lifecycleManager->getCollectionHealthStatus();
print_r($healthStatus);
?>

行业最佳实践

最佳实践1:集合命名规范

php
<?php
// 集合命名规范管理器
class CollectionNamingConvention {
    private $namingRules = [
        'max_length' => 120,
        'allowed_chars' => '/^[a-zA-Z0-9_\-\.]+$/',
        'forbidden_prefixes' => ['system.'],
        'forbidden_names' => ['$', 'local.', 'admin.'],
        'case_sensitive' => true
    ];
    
    public function validateCollectionName($name) {
        $errors = [];
        
        // 检查长度
        if (strlen($name) > $this->namingRules['max_length']) {
            $errors[] = "Collection name exceeds maximum length of {$this->namingRules['max_length']}";
        }
        
        // 检查字符
        if (!preg_match($this->namingRules['allowed_chars'], $name)) {
            $errors[] = "Collection name contains invalid characters";
        }
        
        // 检查禁止的前缀
        foreach ($this->namingRules['forbidden_prefixes'] as $prefix) {
            if (str_starts_with($name, $prefix)) {
                $errors[] = "Collection name cannot start with '{$prefix}'";
            }
        }
        
        // 检查禁止的名称
        foreach ($this->namingRules['forbidden_names'] as $forbidden) {
            if (str_contains($name, $forbidden)) {
                $errors[] = "Collection name cannot contain '{$forbidden}'";
            }
        }
        
        return [
            'valid' => empty($errors),
            'errors' => $errors
        ];
    }
    
    public function generateCollectionName($type, $context = []) {
        $patterns = [
            'users' => 'users',
            'user_profiles' => 'user_profiles',
            'orders' => 'orders',
            'order_items' => 'order_items',
            'products' => 'products',
            'categories' => 'categories',
            'logs' => 'logs_{level}',
            'cache' => 'cache_{key}',
            'sessions' => 'sessions',
            'temp' => 'temp_{timestamp}'
        ];
        
        $pattern = $patterns[$type] ?? strtolower($type);
        
        // 替换占位符
        foreach ($context as $key => $value) {
            $pattern = str_replace('{' . $key . '}', $value, $pattern);
        }
        
        // 添加时间戳(如果需要)
        if (str_contains($pattern, '{timestamp}')) {
            $pattern = str_replace('{timestamp}', date('YmdHis'), $pattern);
        }
        
        return $pattern;
    }
    
    public function suggestCollectionName($description) {
        $suggestions = [];
        
        $keywords = [
            'user' => ['users', 'user_profiles', 'user_settings'],
            'product' => ['products', 'product_catalog', 'product_inventory'],
            'order' => ['orders', 'order_items', 'order_history'],
            'log' => ['logs', 'app_logs', 'system_logs'],
            'cache' => ['cache', 'temp_cache', 'data_cache'],
            'session' => ['sessions', 'user_sessions', 'active_sessions']
        ];
        
        foreach ($keywords as $keyword => $names) {
            if (str_contains(strtolower($description), $keyword)) {
                $suggestions = array_merge($suggestions, $names);
            }
        }
        
        return array_unique($suggestions);
    }
}

// 使用示例
$namingConvention = new CollectionNamingConvention();

// 验证集合名称
$validation1 = $namingConvention->validateCollectionName('users');
print_r($validation1);

$validation2 = $namingConvention->validateCollectionName('system.users');
print_r($validation2);

$validation3 = $namingConvention->validateCollectionName(str_repeat('a', 150));
print_r($validation3);

// 生成集合名称
$collectionName1 = $namingConvention->generateCollectionName('logs', ['level' => 'error']);
echo "Generated collection name: " . $collectionName1 . "\n";

$collectionName2 = $namingConvention->generateCollectionName('temp', ['timestamp' => '']);
echo "Generated collection name: " . $collectionName2 . "\n";

// 获取集合名称建议
$suggestions = $namingConvention->suggestCollectionName('store user information and profiles');
print_r($suggestions);
?>

最佳实践2:集合性能优化

php
<?php
// 集合性能优化管理器
class CollectionPerformanceOptimizer {
    private $database;
    
    public function __construct($databaseName) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $this->database = $client->selectDatabase($databaseName);
    }
    
    public function analyzeCollectionPerformance($collectionName) {
        $collection = $this->database->selectCollection($collectionName);
        $stats = $this->database->command([
            'collStats' => $collectionName,
            'verbose' => true
        ])->toArray()[0];
        
        $analysis = [
            'collection_name' => $collectionName,
            'document_count' => $stats['count'],
            'storage_size' => $stats['storageSize'],
            'data_size' => $stats['size'],
            'index_size' => $stats['totalIndexSize'],
            'avg_obj_size' => $stats['avgObjSize'],
            'padding_factor' => $stats['paddingFactor'] ?? 1,
            'performance_issues' => [],
            'recommendations' => []
        ];
        
        // 检查填充因子
        if ($stats['paddingFactor'] > 1.5) {
            $analysis['performance_issues'][] = 'High padding factor indicates document growth';
            $analysis['recommendations'][] = 'Consider compacting the collection';
        }
        
        // 检查文档大小
        if ($stats['avgObjSize'] > 10000) {
            $analysis['performance_issues'][] = 'Large average document size';
            $analysis['recommendations'][] = 'Consider document normalization or using GridFS for large data';
        }
        
        // 检查索引效率
        $indexCount = count($stats['indexSizes']);
        if ($indexCount > 10) {
            $analysis['performance_issues'][] = 'Too many indexes may impact write performance';
            $analysis['recommendations'][] = 'Review and remove unnecessary indexes';
        }
        
        // 检查存储效率
        if ($stats['storageSize'] > $stats['size'] * 2) {
            $analysis['performance_issues'][] = 'Poor storage efficiency';
            $analysis['recommendations'][] = 'Consider compacting the collection';
        }
        
        return $analysis;
    }
    
    public function optimizeCollection($collectionName, $options = []) {
        $collection = $this->database->selectCollection($collectionName);
        $results = [];
        
        // 压缩集合
        if ($options['compact'] ?? true) {
            $compactResult = $this->compactCollection($collection);
            $results['compact'] = $compactResult;
        }
        
        // 重建索引
        if ($options['rebuild_indexes'] ?? false) {
            $rebuildResult = $this->rebuildIndexes($collection);
            $results['rebuild_indexes'] = $rebuildResult;
        }
        
        // 调整填充因子
        if (isset($options['padding_factor'])) {
            $paddingResult = $this->adjustPaddingFactor($collection, $options['padding_factor']);
            $results['padding_factor'] = $paddingResult;
        }
        
        return [
            'status' => 'success',
            'collection_name' => $collectionName,
            'optimization_results' => $results
        ];
    }
    
    private function compactCollection($collection) {
        try {
            $result = $this->database->command([
                'compact' => $collection->getCollectionName()
            ])->toArray()[0];
            
            return [
                'status' => $result['ok'] == 1 ? 'success' : 'error',
                'message' => $result['ok'] == 1 ? 'Collection compacted successfully' : 'Compaction failed'
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    private function rebuildIndexes($collection) {
        try {
            // 获取所有索引
            $indexes = $collection->listIndexes();
            $indexDefinitions = [];
            
            foreach ($indexes as $index) {
                if ($index->getName() !== '_id_') {
                    $indexDefinitions[] = [
                        'name' => $index->getName(),
                        'key' => $index->getKey(),
                        'options' => [
                            'unique' => $index->isUnique(),
                            'sparse' => $index->isSparse(),
                            'background' => true
                        ]
                    ];
                }
            }
            
            // 删除并重建索引
            foreach ($indexDefinitions as $def) {
                $collection->dropIndex($def['name']);
                $collection->createIndex($def['key'], $def['options']);
            }
            
            return [
                'status' => 'success',
                'rebuilt_indexes' => count($indexDefinitions)
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    private function adjustPaddingFactor($collection, $paddingFactor) {
        try {
            $result = $this->database->command([
                'collMod' => $collection->getCollectionName(),
                'paddingFactor' => $paddingFactor
            ])->toArray()[0];
            
            return [
                'status' => $result['ok'] == 1 ? 'success' : 'error',
                'padding_factor' => $paddingFactor
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    public function monitorCollectionPerformance($collectionName, $duration = 60) {
        $collection = $this->database->selectCollection($collectionName);
        
        $metrics = [
            'collection_name' => $collectionName,
            'monitoring_duration' => $duration,
            'operations' => [
                'reads' => 0,
                'writes' => 0,
                'updates' => 0,
                'deletes' => 0
            ],
            'performance' => [
                'avg_read_time' => 0,
                'avg_write_time' => 0,
                'slow_queries' => 0
            ]
        ];
        
        // 模拟监控过程
        $startTime = time();
        while (time() - $startTime < $duration) {
            // 执行测试查询
            $readStart = microtime(true);
            $collection->findOne();
            $readTime = microtime(true) - $readStart;
            
            $metrics['operations']['reads']++;
            $metrics['performance']['avg_read_time'] += $readTime;
            
            if ($readTime > 0.1) {
                $metrics['performance']['slow_queries']++;
            }
            
            sleep(1);
        }
        
        // 计算平均值
        if ($metrics['operations']['reads'] > 0) {
            $metrics['performance']['avg_read_time'] /= $metrics['operations']['reads'];
        }
        
        return $metrics;
    }
}

// 使用示例
$optimizer = new CollectionPerformanceOptimizer('testdb');

// 分析集合性能
$performance = $optimizer->analyzeCollectionPerformance('users');
print_r($performance);

// 优化集合
$optimizationResult = $optimizer->optimizeCollection('users', [
    'compact' => true,
    'rebuild_indexes' => false
]);
print_r($optimizationResult);

// 监控集合性能
$monitoringResult = $optimizer->monitorCollectionPerformance('users', 10);
print_r($monitoringResult);
?>

最佳实践3:集合安全策略

php
<?php
// 集合安全策略管理器
class CollectionSecurityManager {
    private $database;
    private $securityPolicies;
    
    public function __construct($databaseName, $policyFile) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $this->database = $client->selectDatabase($databaseName);
        $this->securityPolicies = json_decode(file_get_contents($policyFile), true);
    }
    
    public function enforceCollectionSecurity() {
        $results = [];
        
        foreach ($this->securityPolicies['collections'] as $collectionPolicy) {
            $collectionName = $collectionPolicy['name'];
            $results[$collectionName] = $this->enforcePolicy($collectionName, $collectionPolicy);
        }
        
        return $results;
    }
    
    private function enforcePolicy($collectionName, $policy) {
        $collection = $this->database->selectCollection($collectionName);
        $enforcementResults = [];
        
        // 执行访问控制
        if (isset($policy['access_control'])) {
            $accessResult = $this->enforceAccessControl($collection, $policy['access_control']);
            $enforcementResults['access_control'] = $accessResult;
        }
        
        // 执行数据加密
        if (isset($policy['encryption'])) {
            $encryptionResult = $this->enforceEncryption($collection, $policy['encryption']);
            $enforcementResults['encryption'] = $encryptionResult;
        }
        
        // 执行审计日志
        if (isset($policy['audit'])) {
            $auditResult = $this->enableAuditLogging($collection, $policy['audit']);
            $enforcementResults['audit'] = $auditResult;
        }
        
        // 执行数据验证
        if (isset($policy['validation'])) {
            $validationResult = $this->enforceValidation($collection, $policy['validation']);
            $enforcementResults['validation'] = $validationResult;
        }
        
        return [
            'status' => 'success',
            'collection_name' => $collectionName,
            'enforcement_results' => $enforcementResults
        ];
    }
    
    private function enforceAccessControl($collection, $accessPolicy) {
        // MongoDB的访问控制主要通过用户角色实现
        // 这里提供基本的访问控制建议
        
        $recommendations = [];
        
        if ($accessPolicy['read_only'] ?? false) {
            $recommendations[] = 'Create read-only user role for this collection';
        }
        
        if ($accessPolicy['restricted_fields'] ?? false) {
            $recommendations[] = 'Implement field-level security using views';
        }
        
        if ($accessPolicy['ip_whitelist'] ?? false) {
            $recommendations[] = 'Configure IP whitelist in MongoDB configuration';
        }
        
        return [
            'status' => 'success',
            'recommendations' => $recommendations
        ];
    }
    
    private function enforceEncryption($collection, $encryptionPolicy) {
        // MongoDB支持字段级加密
        $recommendations = [];
        
        if ($encryptionPolicy['field_level'] ?? false) {
            $recommendations[] = 'Enable field-level encryption for sensitive fields';
        }
        
        if ($encryptionPolicy['transparent'] ?? false) {
            $recommendations[] = 'Enable transparent data encryption at rest';
        }
        
        if ($encryptionPolicy['transport'] ?? false) {
            $recommendations[] = 'Enable TLS/SSL for data in transit';
        }
        
        return [
            'status' => 'success',
            'recommendations' => $recommendations
        ];
    }
    
    private function enableAuditLogging($collection, $auditPolicy) {
        // 创建审计日志集合
        $auditCollectionName = $collection->getCollectionName() . '_audit';
        
        try {
            $this->database->createCollection($auditCollectionName, [
                'capped' => true,
                'size' => 1024 * 1024 * 100 // 100MB
            ]);
            
            $auditCollection = $this->database->selectCollection($auditCollectionName);
            $auditCollection->createIndex(['timestamp' => -1]);
            $auditCollection->createIndex(['operation' => 1]);
            $auditCollection->createIndex(['user' => 1]);
            
            return [
                'status' => 'success',
                'audit_collection' => $auditCollectionName,
                'audit_events' => $auditPolicy['events'] ?? ['insert', 'update', 'delete']
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    private function enforceValidation($collection, $validationPolicy) {
        try {
            $validationRules = $validationPolicy['rules'] ?? [];
            $validationAction = $validationPolicy['action'] ?? 'error';
            $validationLevel = $validationPolicy['level'] ?? 'strict';
            
            $result = $this->database->command([
                'collMod' => $collection->getCollectionName(),
                'validator' => $validationRules,
                'validationAction' => $validationAction,
                'validationLevel' => $validationLevel
            ])->toArray()[0];
            
            return [
                'status' => $result['ok'] == 1 ? 'success' : 'error',
                'validation_rules' => $validationRules,
                'validation_action' => $validationAction,
                'validation_level' => $validationLevel
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    public function auditCollectionAccess($collectionName, $user, $operation, $documentId = null) {
        $auditCollectionName = $collectionName . '_audit';
        $auditCollection = $this->database->selectCollection($auditCollectionName);
        
        $auditEntry = [
            'timestamp' => new MongoDB\BSON\UTCDateTime(),
            'user' => $user,
            'operation' => $operation,
            'collection' => $collectionName,
            'document_id' => $documentId,
            'ip_address' => $_SERVER['REMOTE_ADDR'] ?? 'unknown',
            'user_agent' => $_SERVER['HTTP_USER_AGENT'] ?? 'unknown'
        ];
        
        try {
            $result = $auditCollection->insertOne($auditEntry);
            
            return [
                'status' => 'success',
                'audit_id' => $result->getInsertedId()
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    public function getAuditReport($collectionName, $filters = [], $limit = 100) {
        $auditCollectionName = $collectionName . '_audit';
        $auditCollection = $this->database->selectCollection($auditCollectionName);
        
        try {
            $cursor = $auditCollection->find($filters, [
                'sort' => ['timestamp' => -1],
                'limit' => $limit
            ]);
            
            return [
                'status' => 'success',
                'audit_entries' => iterator_to_array($cursor)
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
}

// 使用示例
$securityPolicy = [
    'collections' => [
        [
            'name' => 'users',
            'access_control' => [
                'read_only' => false,
                'restricted_fields' => true,
                'ip_whitelist' => true
            ],
            'encryption' => [
                'field_level' => true,
                'transparent' => true,
                'transport' => true
            ],
            'audit' => [
                'events' => ['insert', 'update', 'delete', 'find']
            ],
            'validation' => [
                'rules' => [
                    '$jsonSchema' => [
                        'bsonType' => 'object',
                        'required' => ['name', 'email'],
                        'properties' => [
                            'name' => ['bsonType' => 'string'],
                            'email' => ['bsonType' => 'string'],
                            'age' => ['bsonType' => 'int', 'minimum' => 0]
                        ]
                    ]
                ],
                'action' => 'error',
                'level' => 'strict'
            ]
        ]
    ]
];

file_put_contents('security_policies.json', json_encode($securityPolicy, JSON_PRETTY_PRINT));

$securityManager = new CollectionSecurityManager('testdb', 'security_policies.json');

// 执行安全策略
$enforcementResults = $securityManager->enforceCollectionSecurity();
print_r($enforcementResults);

// 记录审计日志
$auditResult = $securityManager->auditCollectionAccess('users', 'admin', 'insert', new MongoDB\BSON\ObjectId());
print_r($auditResult);

// 获取审计报告
$auditReport = $securityManager->getAuditReport('users', [], 10);
print_r($auditReport);
?>

常见问题答疑

问题1:如何选择合适的集合类型?

回答:选择集合类型需要根据应用场景和数据特点来决定:

  • 普通集合:适用于大多数应用场景,支持所有操作
  • 固定集合:适用于日志、缓存等需要自动清理旧数据的场景
  • 分片集合:适用于大数据量、需要水平扩展的场景
  • 时间序列集合:适用于传感器数据、监控数据等时间序列场景
  • 视图集合:适用于需要聚合多个集合数据的只读场景
php
<?php
// 集合类型选择助手
class CollectionTypeSelector {
    public static function recommendCollectionType($requirements) {
        $recommendations = [];
        
        // 检查数据量
        if ($requirements['data_volume'] ?? 'small' === 'large') {
            $recommendations[] = [
                'type' => 'sharded',
                'reason' => 'Large data volume requires horizontal scaling',
                'priority' => 'high'
            ];
        }
        
        // 检查数据保留策略
        if ($requirements['retention_policy'] ?? 'permanent' === 'automatic') {
            $recommendations[] = [
                'type' => 'capped',
                'reason' => 'Automatic data cleanup required',
                'priority' => 'high'
            ];
        }
        
        // 检查时间序列特性
        if ($requirements['is_time_series'] ?? false) {
            $recommendations[] = [
                'type' => 'timeseries',
                'reason' => 'Time series data optimized storage',
                'priority' => 'high'
            ];
        }
        
        // 检查只读需求
        if ($requirements['read_only'] ?? false) {
            $recommendations[] = [
                'type' => 'view',
                'reason' => 'Read-only access with data aggregation',
                'priority' => 'medium'
            ];
        }
        
        // 默认推荐
        if (empty($recommendations)) {
            $recommendations[] = [
                'type' => 'standard',
                'reason' => 'Standard collection suitable for most use cases',
                'priority' => 'default'
            ];
        }
        
        return $recommendations;
    }
}

// 使用示例
$requirements = [
    'data_volume' => 'large',
    'retention_policy' => 'automatic',
    'is_time_series' => false,
    'read_only' => false
];

$recommendations = CollectionTypeSelector::recommendCollectionType($requirements);
print_r($recommendations);
?>

问题2:如何监控集合的性能?

回答:监控集合性能需要关注多个指标:

php
<?php
// 集合性能监控器
class CollectionPerformanceMonitor {
    private $database;
    
    public function __construct($databaseName) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $this->database = $client->selectDatabase($databaseName);
    }
    
    public function getPerformanceMetrics($collectionName) {
        $collection = $this->database->selectCollection($collectionName);
        $stats = $this->database->command([
            'collStats' => $collectionName
        ])->toArray()[0];
        
        $serverStatus = $this->database->command(['serverStatus' => 1])->toArray()[0];
        
        return [
            'collection_name' => $collectionName,
            'storage_metrics' => [
                'document_count' => $stats['count'],
                'storage_size' => $stats['storageSize'],
                'data_size' => $stats['size'],
                'index_size' => $stats['totalIndexSize'],
                'avg_document_size' => $stats['avgObjSize']
            ],
            'performance_metrics' => [
                'index_count' => count($stats['indexSizes']),
                'padding_factor' => $stats['paddingFactor'] ?? 1,
                'capped' => $stats['capped'] ?? false
            ],
            'server_metrics' => [
                'connections' => $serverStatus['connections']['current'] ?? 0,
                'memory_usage' => $serverStatus['mem']['resident'] ?? 0
            ]
        ];
    }
    
    public function generatePerformanceReport($collectionName) {
        $metrics = $this->getPerformanceMetrics($collectionName);
        
        $report = [
            'collection_name' => $collectionName,
            'overall_health' => 'good',
            'issues' => [],
            'recommendations' => []
        ];
        
        // 检查存储效率
        if ($metrics['storage_metrics']['storage_size'] > $metrics['storage_metrics']['data_size'] * 2) {
            $report['issues'][] = 'Poor storage efficiency';
            $report['recommendations'][] = 'Consider compacting the collection';
        }
        
        // 检查文档大小
        if ($metrics['storage_metrics']['avg_document_size'] > 10000) {
            $report['issues'][] = 'Large average document size';
            $report['recommendations'][] = 'Consider document normalization';
        }
        
        // 检查索引数量
        if ($metrics['performance_metrics']['index_count'] > 10) {
            $report['issues'][] = 'Too many indexes';
            $report['recommendations'][] = 'Review and optimize indexes';
        }
        
        // 设置整体健康状态
        if (!empty($report['issues'])) {
            $report['overall_health'] = 'needs_attention';
        }
        
        return $report;
    }
}

// 使用示例
$monitor = new CollectionPerformanceMonitor('testdb');

// 获取性能指标
$metrics = $monitor->getPerformanceMetrics('users');
print_r($metrics);

// 生成性能报告
$report = $monitor->generatePerformanceReport('users');
print_r($report);
?>

问题3:如何处理集合重命名?

回答:集合重命名需要谨慎处理,特别是生产环境:

php
<?php
// 安全的集合重命名工具
class SafeCollectionRenamer {
    private $database;
    
    public function __construct($databaseName) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $this->database = $client->selectDatabase($databaseName);
    }
    
    public function renameCollection($oldName, $newName, $options = []) {
        $collection = $this->database->selectCollection($oldName);
        
        // 验证新名称
        $validation = $this->validateNewName($newName);
        if (!$validation['valid']) {
            return [
                'status' => 'error',
                'message' => 'Invalid collection name',
                'errors' => $validation['errors']
            ];
        }
        
        // 检查依赖关系
        $dependencies = $this->checkDependencies($oldName);
        if (!empty($dependencies)) {
            return [
                'status' => 'warning',
                'message' => 'Collection has dependencies',
                'dependencies' => $dependencies
            ];
        }
        
        // 备份索引信息
        $indexBackup = $this->backupIndexes($collection);
        
        // 执行重命名
        try {
            $this->database->command([
                'renameCollection' => $this->database->getDatabaseName() . '.' . $oldName,
                'to' => $this->database->getDatabaseName() . '.' . $newName
            ]);
            
            // 恢复索引
            $newCollection = $this->database->selectCollection($newName);
            $this->restoreIndexes($newCollection, $indexBackup);
            
            return [
                'status' => 'success',
                'old_name' => $oldName,
                'new_name' => $newName,
                'indexes_restored' => count($indexBackup)
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    private function validateNewName($name) {
        $errors = [];
        
        if (strlen($name) > 128) {
            $errors[] = 'Collection name too long';
        }
        
        if (str_starts_with($name, 'system.')) {
            $errors[] = 'Cannot use system. prefix';
        }
        
        return [
            'valid' => empty($errors),
            'errors' => $errors
        ];
    }
    
    private function checkDependencies($collectionName) {
        $dependencies = [];
        $collections = $this->database->listCollectionNames();
        
        foreach ($collections as $coll) {
            if ($coll !== $collectionName && !str_starts_with($coll, 'system.')) {
                $collection = $this->database->selectCollection($coll);
                $sampleDoc = $collection->findOne();
                
                if ($sampleDoc) {
                    foreach ($sampleDoc as $field => $value) {
                        if (is_array($value) && isset($value['$ref']) && 
                            $value['$ref'] === $collectionName) {
                            $dependencies[] = [
                                'collection' => $coll,
                                'field' => $field
                            ];
                        }
                    }
                }
            }
        }
        
        return $dependencies;
    }
    
    private function backupIndexes($collection) {
        $indexes = $collection->listIndexes();
        $indexBackup = [];
        
        foreach ($indexes as $index) {
            if ($index->getName() !== '_id_') {
                $indexBackup[] = [
                    'name' => $index->getName(),
                    'key' => $index->getKey(),
                    'options' => [
                        'unique' => $index->isUnique(),
                        'sparse' => $index->isSparse(),
                        'background' => true
                    ]
                ];
            }
        }
        
        return $indexBackup;
    }
    
    private function restoreIndexes($collection, $indexBackup) {
        foreach ($indexBackup as $index) {
            $collection->createIndex($index['key'], $index['options']);
        }
    }
}

// 使用示例
$renamer = new SafeCollectionRenamer('testdb');

// 重命名集合
$result = $renamer->renameCollection('old_users', 'users');
print_r($result);
?>

实战练习

练习1:创建和管理固定集合

创建一个固定集合用于存储日志数据,并实现自动清理功能。

php
<?php
// 练习1:固定集合管理
class CappedCollectionExercise {
    private $database;
    
    public function __construct($databaseName) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $this->database = $client->selectDatabase($databaseName);
    }
    
    public function createLogCollection($collectionName, $sizeMB, $maxDocs) {
        try {
            $this->database->createCollection($collectionName, [
                'capped' => true,
                'size' => $sizeMB * 1024 * 1024,
                'max' => $maxDocs
            ]);
            
            $collection = $this->database->selectCollection($collectionName);
            $collection->createIndex(['timestamp' => -1]);
            
            return [
                'status' => 'success',
                'collection_name' => $collectionName,
                'size_mb' => $sizeMB,
                'max_documents' => $maxDocs
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    public function insertLogEntry($collectionName, $level, $message) {
        $collection = $this->database->selectCollection($collectionName);
        
        $logEntry = [
            'timestamp' => new MongoDB\BSON\UTCDateTime(),
            'level' => $level,
            'message' => $message
        ];
        
        $result = $collection->insertOne($logEntry);
        
        return [
            'status' => 'success',
            'log_id' => $result->getInsertedId()
        ];
    }
    
    public function getCollectionStatus($collectionName) {
        $stats = $this->database->command([
            'collStats' => $collectionName
        ])->toArray()[0];
        
        return [
            'document_count' => $stats['count'],
            'current_size' => $stats['size'],
            'max_size' => $stats['storageSize'],
            'usage_percentage' => round(($stats['size'] / $stats['storageSize']) * 100, 2)
        ];
    }
}

// 使用示例
$exercise = new CappedCollectionExercise('testdb');

// 创建固定集合
$result = $exercise->createLogCollection('app_logs', 10, 1000);
print_r($result);

// 插入日志条目
for ($i = 0; $i < 50; $i++) {
    $exercise->insertLogEntry('app_logs', 'info', "Log entry {$i}");
}

// 查看集合状态
$status = $exercise->getCollectionStatus('app_logs');
print_r($status);
?>

练习2:实现集合备份和恢复

创建一个集合备份和恢复系统,支持增量备份。

php
<?php
// 练习2:集合备份和恢复
class CollectionBackupRestore {
    private $database;
    
    public function __construct($databaseName) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $this->database = $client->selectDatabase($databaseName);
    }
    
    public function backupCollection($collectionName, $backupName = null) {
        $collection = $this->database->selectCollection($collectionName);
        
        if ($backupName === null) {
            $backupName = $collectionName . '_backup_' . date('YmdHis');
        }
        
        try {
            // 创建备份集合
            $this->database->createCollection($backupName);
            $backupCollection = $this->database->selectCollection($backupName);
            
            // 复制数据
            $documents = $collection->find()->toArray();
            if (!empty($documents)) {
                $backupCollection->insertMany($documents);
            }
            
            // 复制索引
            $indexes = $collection->listIndexes();
            foreach ($indexes as $index) {
                if ($index->getName() !== '_id_') {
                    $backupCollection->createIndex(
                        $index->getKey(),
                        [
                            'name' => $index->getName(),
                            'unique' => $index->isUnique(),
                            'sparse' => $index->isSparse()
                        ]
                    );
                }
            }
            
            return [
                'status' => 'success',
                'source_collection' => $collectionName,
                'backup_collection' => $backupName,
                'document_count' => count($documents)
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    public function restoreCollection($backupName, $targetCollectionName = null) {
        $backupCollection = $this->database->selectCollection($backupName);
        
        if ($targetCollectionName === null) {
            $targetCollectionName = str_replace('_backup_' . date('YmdHis'), '', $backupName);
        }
        
        try {
            // 创建目标集合
            $this->database->createCollection($targetCollectionName);
            $targetCollection = $this->database->selectCollection($targetCollectionName);
            
            // 复制数据
            $documents = $backupCollection->find()->toArray();
            if (!empty($documents)) {
                $targetCollection->insertMany($documents);
            }
            
            // 复制索引
            $indexes = $backupCollection->listIndexes();
            foreach ($indexes as $index) {
                if ($index->getName() !== '_id_') {
                    $targetCollection->createIndex(
                        $index->getKey(),
                        [
                            'name' => $index->getName(),
                            'unique' => $index->isUnique(),
                            'sparse' => $index->isSparse()
                        ]
                    );
                }
            }
            
            return [
                'status' => 'success',
                'backup_collection' => $backupName,
                'target_collection' => $targetCollectionName,
                'document_count' => count($documents)
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
    
    public function incrementalBackup($collectionName, $lastBackupTime) {
        $collection = $this->database->selectCollection($collectionName);
        
        $backupName = $collectionName . '_inc_backup_' . date('YmdHis');
        
        try {
            $this->database->createCollection($backupName);
            $backupCollection = $this->database->selectCollection($backupName);
            
            // 查找增量数据
            $documents = $collection->find([
                'updated_at' => ['$gt' => $lastBackupTime]
            ])->toArray();
            
            if (!empty($documents)) {
                $backupCollection->insertMany($documents);
            }
            
            return [
                'status' => 'success',
                'backup_collection' => $backupName,
                'document_count' => count($documents),
                'last_backup_time' => $lastBackupTime
            ];
            
        } catch (Exception $e) {
            return [
                'status' => 'error',
                'message' => $e->getMessage()
            ];
        }
    }
}

// 使用示例
$backupRestore = new CollectionBackupRestore('testdb');

// 完整备份
$backupResult = $backupRestore->backupCollection('users');
print_r($backupResult);

// 增量备份
$lastBackup = new MongoDB\BSON\UTCDateTime((time() - 3600) * 1000);
$incBackupResult = $backupRestore->incrementalBackup('users', $lastBackup);
print_r($incBackupResult);

// 恢复集合
$restoreResult = $backupRestore->restoreCollection($backupResult['backup_collection']);
print_r($restoreResult);
?>

练习3:集合性能优化

分析集合性能并实施优化措施。

php
<?php
// 练习3:集合性能优化
class CollectionPerformanceOptimization {
    private $database;
    
    public function __construct($databaseName) {
        $client = new MongoDB\Client("mongodb://localhost:27017");
        $this->database = $client->selectDatabase($databaseName);
    }
    
    public function analyzePerformance($collectionName) {
        $stats = $this->database->command([
            'collStats' => $collectionName
        ])->toArray()[0];
        
        $analysis = [
            'collection_name' => $collectionName,
            'metrics' => [
                'document_count' => $stats['count'],
                'storage_size' => $stats['storageSize'],
                'data_size' => $stats['size'],
                'avg_document_size' => $stats['avgObjSize'],
                'index_count' => count($stats['indexSizes']),
                'padding_factor' => $stats['paddingFactor'] ?? 1
            ],
            'issues' => [],
            'recommendations' => []
        ];
        
        // 分析存储效率
        if ($stats['storageSize'] > $stats['size'] * 2) {
            $analysis['issues'][] = 'Poor storage efficiency';
            $analysis['recommendations'][] = 'Compact the collection';
        }
        
        // 分析文档大小
        if ($stats['avgObjSize'] > 10000) {
            $analysis['issues'][] = 'Large documents';
            $analysis['recommendations'][] = 'Consider document normalization';
        }
        
        // 分析索引
        if (count($stats['indexSizes']) > 10) {
            $analysis['issues'][] = 'Too many indexes';
            $analysis['recommendations'][] = 'Review and optimize indexes';
        }
        
        return $analysis;
    }
    
    public function optimizeCollection($collectionName) {
        $collection = $this->database->selectCollection($collectionName);
        $results = [];
        
        // 压缩集合
        try {
            $this->database->command([
                'compact' => $collectionName
            ]);
            $results['compact'] = 'success';
        } catch (Exception $e) {
            $results['compact'] = 'failed: ' . $e->getMessage();
        }
        
        // 重建索引
        try {
            $indexes = $collection->listIndexes();
            $indexNames = [];
            
            foreach ($indexes as $index) {
                if ($index->getName() !== '_id_') {
                    $indexNames[] = $index->getName();
                }
            }
            
            foreach ($indexNames as $indexName) {
                $collection->dropIndex($indexName);
            }
            
            $results['rebuild_indexes'] = 'success';
        } catch (Exception $e) {
            $results['rebuild_indexes'] = 'failed: ' . $e->getMessage();
        }
        
        return [
            'status' => 'completed',
            'collection_name' => $collectionName,
            'optimization_results' => $results
        ];
    }
}

// 使用示例
$optimizer = new CollectionPerformanceOptimization('testdb');

// 分析性能
$analysis = $optimizer->analyzePerformance('users');
print_r($analysis);

// 优化集合
$optimization = $optimizer->optimizeCollection('users');
print_r($optimization);
?>

知识点总结

核心概念

  1. 集合定义:MongoDB中文档的逻辑容器,类似于关系型数据库中的表
  2. 集合类型:普通集合、固定集合、分片集合、时间序列集合、视图集合
  3. 集合命名:遵循特定规则,不能以system.开头,区分大小写
  4. 集合操作:创建、删除、重命名、查看、配置等基本操作

技术要点

  1. 集合创建:使用createCollection()方法,支持多种配置选项
  2. 固定集合:具有固定大小,自动覆盖旧数据,适合日志和缓存
  3. 集合管理:包括备份、恢复、性能优化、安全策略等
  4. 集合监控:监控文档数量、存储大小、索引效率等指标

实践技巧

  1. 命名规范:使用描述性名称,遵循命名规则,避免特殊字符
  2. 性能优化:定期压缩集合,优化索引,监控存储效率
  3. 安全策略:实施访问控制、数据加密、审计日志等安全措施
  4. 生命周期管理:根据数据保留策略自动创建、维护、删除集合

最佳实践

  1. 集合设计:根据应用场景选择合适的集合类型
  2. 索引管理:合理创建索引,避免过度索引
  3. 数据清理:使用固定集合或定期清理旧数据
  4. 监控告警:建立监控机制,及时发现性能问题

拓展参考资料

官方文档

推荐阅读

  • 《MongoDB权威指南》- 集合管理章节
  • 《MongoDB设计与实现》- 集合存储原理
  • 《MongoDB性能优化实战》- 集合优化技巧

在线资源

  • MongoDB University集合管理课程
  • MongoDB官方博客集合最佳实践
  • Stack Overflow MongoDB集合相关问题

相关工具

  • MongoDB Compass - 图形化集合管理工具
  • mongosh - 命令行集合操作工具
  • MongoDB Ops Manager - 企业级集合管理平台