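MongoDB Null Type Explained
1. Overview
Null is a special data type in MongoDB that represents an empty or absent value. Understanding and using null correctly is essential for data integrity and for queries that return what you expect.
MongoDB handles null in a few distinctive ways:
- Explicit null: the field exists and its value is null, meaning "deliberately empty"
- Implicit absence: the field does not exist in the document, meaning "never defined"
- Query distinction: different query conditions can tell a null value apart from a missing field
Null values show up frequently in practice, for example:
- Optional fields the user has not filled in (such as middle_name: null)
- Soft-delete markers (deleted_at: null means not deleted)
- Results of asynchronous operations that have not finished (result: null means still processing)
- Conditionally present configuration items (proxy: null means no proxy)
Understanding the semantic differences and the query behavior of null is the key to avoiding query pitfalls. Many unexpected results come from confusing "the field is null" with "the field does not exist".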
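The distinction between "null" and "missing" can be checked on the client side as well. The sketch below uses a hypothetical helper, fieldState(), and assumes the document has already been decoded into a plain PHP array. Note that isset() cannot make this distinction, because it returns false for both cases.

```php
<?php
// Hypothetical helper: classify a field of a decoded document as
// 'missing', 'null', or 'value'. Assumes $doc is a plain PHP array.
function fieldState(array $doc, string $field): string
{
    if (!array_key_exists($field, $doc)) {
        return 'missing';   // the key is absent from the document
    }
    return $doc[$field] === null ? 'null' : 'value';
}

echo fieldState(['name' => 'A', 'middle_name' => null], 'middle_name'), "\n"; // null
echo fieldState(['name' => 'B'], 'middle_name'), "\n";                        // missing
echo fieldState(['name' => 'C', 'middle_name' => 'X'], 'middle_name'), "\n";  // value
```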
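This topic builds on "MongoDB Data Types Overview" and "MongoDB Boolean Type" and leads into "MongoDB Query Operators" and "MongoDB Data Modeling". Suggested order: MongoDB basic operations → Boolean type → this topic → query operators → data modeling.
2. Basic Concepts
2.1 Syntax
2.1.1 Inserting and Representing Null Values
In MongoDB a null value is written with the null keyword, which maps to PHP's null. A field can be set to null explicitly, or it can be left out of the document entirely.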
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
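// Decode documents as plain PHP arrays: array_key_exists() rejects objects such as BSONDocument in PHP 8
$client = new Client("mongodb://localhost:27017", [], ['typeMap' => ['root' => 'array', 'document' => 'array', 'array' => 'array']]);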
$collection = $client->test->users;
$collection->drop();
$documents = [
[
'name' => '张三',
'middle_name' => null,
'age' => 28
],
[
'name' => '李四',
'age' => 25
],
[
'name' => '王五',
'middle_name' => '小明',
'age' => 30
],
[
'name' => '赵六',
'middle_name' => null,
'age' => null
]
];
$result = $collection->insertMany($documents);
echo "插入文档数量: " . $result->getInsertedCount() . "\n";
$allDocs = $collection->find([])->toArray();
foreach ($allDocs as $doc) {
echo "姓名: " . $doc['name'] . "\n";
echo "middle_name字段" . (array_key_exists('middle_name', $doc) ? "存在" : "不存在") . "\n";
if (array_key_exists('middle_name', $doc)) {
echo "middle_name值: " . ($doc['middle_name'] === null ? "null" : $doc['middle_name']) . "\n";
}
echo "---\n";
}
Output:
插入文档数量: 4
姓名: 张三
middle_name字段存在
middle_name值: null
---
姓名: 李四
middle_name字段不存在
---
姓名: 王五
middle_name字段存在
middle_name值: 小明
---
姓名: 赵六
middle_name字段存在
middle_name值: null
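---
Comparing the two approaches:
php
// Explicit null: the field exists but its value is empty
$explicitNull = [
    'name' => '张三',
    'nickname' => null
];
// Implicit absence: the field does not exist at all
$implicitMissing = [
    'name' => '李四'
];
// How the two differ at query time
$collection->insertOne($explicitNull);
$collection->insertOne($implicitMissing);
// Equality match against null
$nullDocs = $collection->find(['nickname' => null])->toArray();
// Result: both documents match. This is a common beginner pitfall.
// Match only fields that are explicitly null (BSON type 10)
$explicitNullDocs = $collection->find([
    'nickname' => ['$type' => 10]
])->toArray();
// Result: only the document whose nickname was explicitly set to null
Notes on the comparison:
- Explicit null: the field exists, its value is null, and its BSON type is 10
- Implicit absence: the field does not exist; the query {field: null} matches both cases
- Use the $type or $exists operator to tell the two apart precisely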
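The query conditions discussed above can be collected into small filter builders. This is a minimal sketch; the function names are hypothetical, and each function only returns the filter array that would be passed to find().

```php
<?php
// Hypothetical helpers that return MongoDB filter arrays for one field.

// Matches documents where the field is null OR absent.
function nullOrMissingFilter(string $field): array
{
    return [$field => null];
}

// Matches only documents where the field exists with BSON type Null (10).
function explicitNullFilter(string $field): array
{
    return [$field => ['$type' => 10]];
}

// Matches only documents that do not contain the field.
function missingFilter(string $field): array
{
    return [$field => ['$exists' => false]];
}

// Matches documents where the field exists and holds a non-null value.
function hasValueFilter(string $field): array
{
    return [$field => ['$ne' => null]];
}

echo json_encode(explicitNullFilter('nickname')), "\n"; // {"nickname":{"$type":10}}
echo json_encode(missingFilter('nickname')), "\n";      // {"nickname":{"$exists":false}}
```

With the PHP library, a call such as $collection->find(explicitNullFilter('nickname')) would then return only the documents whose nickname is an explicit null.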
2.1.2 Querying Null Values
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->test->products;
$collection->drop();
$products = [
['name' => '商品A', 'discount' => null, 'price' => 100],
['name' => '商品B', 'price' => 200],
['name' => '商品C', 'discount' => 0.8, 'price' => 150],
['name' => '商品D', 'discount' => null, 'price' => 300]
];
$collection->insertMany($products);
echo "=== 查询discount为null的文档(包含缺失字段)===\n";
$nullDiscount = $collection->find(['discount' => null])->toArray();
foreach ($nullDiscount as $doc) {
echo "商品: " . $doc['name'] . "\n";
}
echo "\n=== 查询discount字段存在且值为null的文档 ===\n";
$explicitNull = $collection->find([
'discount' => ['$type' => 10]
])->toArray();
foreach ($explicitNull as $doc) {
echo "商品: " . $doc['name'] . "\n";
}
echo "\n=== 查询discount字段不存在的文档 ===\n";
$missingField = $collection->find([
'discount' => ['$exists' => false]
])->toArray();
foreach ($missingField as $doc) {
echo "商品: " . $doc['name'] . "\n";
}
echo "\n=== 查询discount字段存在(无论值是什么)===\n";
$fieldExists = $collection->find([
'discount' => ['$exists' => true]
])->toArray();
foreach ($fieldExists as $doc) {
$discountVal = $doc['discount'] === null ? 'null' : $doc['discount'];
echo "商品: " . $doc['name'] . ", 折扣: " . $discountVal . "\n";
}
Output:
=== 查询discount为null的文档(包含缺失字段)===
商品: 商品A
商品: 商品B
商品: 商品D
=== 查询discount字段存在且值为null的文档 ===
商品: 商品A
商品: 商品D
=== 查询discount字段不存在的文档 ===
商品: 商品B
=== 查询discount字段存在(无论值是什么)===
商品: 商品A, 折扣: null
商品: 商品C, 折扣: 0.8
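商品: 商品D, 折扣: null
2.2 Semantics
2.2.1 Three Meanings of "Empty"
In MongoDB, "empty" can mean three different things:
| Semantic type | Representation | Meaning | Query condition |
|---|---|---|---|
| Explicit null | field: null | The field exists and explicitly has no value | {field: {$type: 10}} |
| Missing field | field is absent | The field was never set | {field: {$exists: false}} |
| Empty string | field: "" | The field exists and its value is an empty string | {field: ""} |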
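When all three cases should count as "no usable value", they can be matched with a single $in condition, because null inside $in also matches documents that lack the field. The sketch below pairs that filter with a client-side check that mirrors it. Both function names are hypothetical, and the client-side check assumes documents decoded as plain PHP arrays.

```php
<?php
// Hypothetical helper: server-side filter for "null, missing, or empty string".
function noUsableValueFilter(string $field): array
{
    // null inside $in also matches documents where the field is missing
    return [$field => ['$in' => [null, '']]];
}

// Hypothetical helper: the same rule applied to an already decoded document.
function hasNoUsableValue(array $doc, string $field): bool
{
    return !array_key_exists($field, $doc)
        || $doc[$field] === null
        || $doc[$field] === '';
}

$customers = [
    ['name' => 'A', 'email' => null],
    ['name' => 'B'],
    ['name' => 'C', 'email' => ''],
    ['name' => 'D', 'email' => 'test@example.com'],
];
foreach ($customers as $customer) {
    echo $customer['name'], ': ', hasNoUsableValue($customer, 'email') ? 'no email' : 'has email', "\n";
}
```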
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->test->customers;
$collection->drop();
$customers = [
['name' => '客户A', 'email' => null, 'source' => 'web'],
['name' => '客户B', 'source' => 'app'],
['name' => '客户C', 'email' => '', 'source' => 'phone'],
['name' => '客户D', 'email' => 'test@example.com', 'source' => 'web']
];
$collection->insertMany($customers);
echo "=== 不同语义的查询示例 ===\n\n";
echo "1. 查询email为null(包含缺失字段):\n";
$query1 = $collection->find(['email' => null])->toArray();
echo "匹配数量: " . count($query1) . "\n";
echo "\n2. 查询email显式为null:\n";
$query2 = $collection->find(['email' => ['$type' => 10]])->toArray();
echo "匹配数量: " . count($query2) . "\n";
echo "\n3. 查询email字段缺失:\n";
$query3 = $collection->find(['email' => ['$exists' => false]])->toArray();
echo "匹配数量: " . count($query3) . "\n";
echo "\n4. 查询email为空字符串:\n";
$query4 = $collection->find(['email' => ''])->toArray();
echo "匹配数量: " . count($query4) . "\n";
echo "\n5. 查询email\"无有效值\"(null或缺失或空字符串):\n";
$query5 = $collection->find([
'$or' => [
['email' => null],
['email' => '']
]
])->toArray();
echo "匹配数量: " . count($query5) . "\n";
Output:
=== 不同语义的查询示例 ===
1. 查询email为null(包含缺失字段):
匹配数量: 2
2. 查询email显式为null:
匹配数量: 1
3. 查询email字段缺失:
匹配数量: 1
4. 查询email为空字符串:
匹配数量: 1
5. 查询email"无有效值"(null或缺失或空字符串):
匹配数量: 3
2.2.2 Null Values in the Aggregation Pipeline
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->test->orders;
$collection->drop();
$orders = [
['order_id' => '001', 'amount' => 100, 'discount' => null],
['order_id' => '002', 'amount' => 200, 'discount' => 0.1],
['order_id' => '003', 'amount' => 150],
['order_id' => '004', 'amount' => 300, 'discount' => null]
];
$collection->insertMany($orders);
// Escape the dollar sign so PHP does not interpolate it as a variable
echo "=== \$ifNull操作符处理null值 ===\n";
$pipeline1 = [
[
'$project' => [
'order_id' => 1,
'amount' => 1,
'discount' => 1,
'effective_discount' => [
'$ifNull' => ['$discount', 0]
],
'final_amount' => [
'$multiply' => [
'$amount',
['$subtract' => [1, ['$ifNull' => ['$discount', 0]]]]
]
]
]
]
];
$result1 = $collection->aggregate($pipeline1)->toArray();
foreach ($result1 as $doc) {
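// isset() is false for null, so use array_key_exists() to tell null from missing
$discount = array_key_exists('discount', (array) $doc) ? ($doc['discount'] === null ? 'null' : $doc['discount']) : 'missing';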
echo "订单: " . $doc['order_id'] . ", 折扣: " . $discount . ", 有效折扣: " . $doc['effective_discount'] . ", 最终金额: " . $doc['final_amount'] . "\n";
}
echo "\n=== \$switch处理多种null情况 ===\n";
$pipeline2 = [
[
'$project' => [
'order_id' => 1,
'discount' => 1,
'discount_status' => [
'$switch' => [
'branches' => [
['case' => ['$eq' => ['$discount', null]], 'then' => '显式无折扣'],
['case' => ['$eq' => ['$discount', 0]], 'then' => '零折扣'],
['case' => ['$gt' => ['$discount', 0]], 'then' => '有折扣']
],
'default' => '折扣字段缺失'
]
]
]
]
];
$result2 = $collection->aggregate($pipeline2)->toArray();
foreach ($result2 as $doc) {
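// isset() is false for null, so use array_key_exists() to tell null from missing
$discount = array_key_exists('discount', (array) $doc) ? ($doc['discount'] === null ? 'null' : $doc['discount']) : 'missing';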
echo "订单: " . $doc['order_id'] . ", 折扣: " . $discount . ", 状态: " . $doc['discount_status'] . "\n";
}
Output:
=== $ifNull操作符处理null值 ===
订单: 001, 折扣: null, 有效折扣: 0, 最终金额: 100
订单: 002, 折扣: 0.1, 有效折扣: 0.1, 最终金额: 180
订单: 003, 折扣: missing, 有效折扣: 0, 最终金额: 150
订单: 004, 折扣: null, 有效折扣: 0, 最终金额: 300
=== $switch处理多种null情况 ===
订单: 001, 折扣: null, 状态: 显式无折扣
订单: 002, 折扣: 0.1, 状态: 有折扣
订单: 003, 折扣: missing, 状态: 折扣字段缺失
订单: 004, 折扣: null, 状态: 显式无折扣
2.3 Conventions
2.3.1 Conventions for Using Null Values
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
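// Decode documents as plain PHP arrays: array_keys() below requires an array, not a BSONDocument
$client = new Client("mongodb://localhost:27017", [], ['typeMap' => ['root' => 'array', 'document' => 'array', 'array' => 'array']]);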
$collection = $client->test->profiles;
$collection->drop();
class UserProfileBuilder
{
private $data = [];
public function setName(string $name): self
{
$this->data['name'] = $name;
return $this;
}
public function setNickname(?string $nickname): self
{
if ($nickname !== null) {
$this->data['nickname'] = $nickname;
}
return $this;
}
public function setAvatar(?string $avatar): self
{
if ($avatar !== null && $avatar !== '') {
$this->data['avatar'] = $avatar;
} elseif ($avatar === '') {
$this->data['avatar'] = null;
}
return $this;
}
public function setBio(?string $bio): self
{
$this->data['bio'] = $bio;
return $this;
}
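public function setDeletedAt(?DateTimeInterface $deletedAt): self
{
    // Wrap the value in UTCDateTime: the driver does not store a raw PHP DateTime as a BSON date
    $this->data['deleted_at'] = $deletedAt === null ? null : new MongoDB\BSON\UTCDateTime($deletedAt);
    return $this;
}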
public function build(): array
{
$this->data['updated_at'] = new MongoDB\BSON\UTCDateTime();
return $this->data;
}
}
// Use a fresh builder for each profile; reusing one instance would carry fields over between profiles
$profile1 = (new UserProfileBuilder())->setName('张三')
    ->setNickname(null)
    ->setAvatar(null)
    ->setBio(null)
    ->setDeletedAt(null)
    ->build();
$profile2 = (new UserProfileBuilder())->setName('李四')
    ->setNickname('小李')
    ->setAvatar('https://example.com/avatar.jpg')
    ->setBio('这是个人简介')
    ->setDeletedAt(null)
    ->build();
$profile3 = (new UserProfileBuilder())->setName('王五')
    ->setNickname(null)
    ->setAvatar('')
    ->setBio(null)
    ->setDeletedAt(new DateTime())
    ->build();
$collection->insertMany([$profile1, $profile2, $profile3]);
echo "=== Null值使用规范示例 ===\n\n";
echo "规范1: 可选字段不设置比设为null更好\n";
echo " - nickname字段不设置表示用户未填写\n";
echo " - 避免存储大量null值,节省存储空间\n\n";
echo "规范2: 显式null用于表示'明确为空'\n";
echo " - avatar设为null表示用户主动清空头像\n";
echo " - 与从未上传过头像(字段缺失)区分\n\n";
echo "规范3: 软删除使用null表示未删除\n";
echo " - deleted_at为null表示记录有效\n";
echo " - deleted_at有值表示记录已删除\n\n";
$allProfiles = $collection->find([])->toArray();
foreach ($allProfiles as $doc) {
echo "用户: " . $doc['name'] . "\n";
echo "字段: " . json_encode(array_keys($doc), JSON_UNESCAPED_UNICODE) . "\n\n";
}
Output:
=== Null值使用规范示例 ===
规范1: 可选字段不设置比设为null更好
- nickname字段不设置表示用户未填写
- 避免存储大量null值,节省存储空间
规范2: 显式null用于表示'明确为空'
- avatar设为null表示用户主动清空头像
- 与从未上传过头像(字段缺失)区分
规范3: 软删除使用null表示未删除
- deleted_at为null表示记录有效
- deleted_at有值表示记录已删除
用户: 张三
字段: ["_id","name","bio","deleted_at","updated_at"]
用户: 李四
字段: ["_id","name","nickname","avatar","bio","deleted_at","updated_at"]
用户: 王五
字段: ["_id","name","avatar","bio","deleted_at","updated_at"]
3. How It Works: A Deeper Look
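3.1 The Null Type in BSON
3.1.1 BSON Type Codes
In the BSON (Binary JSON) specification, the Null type is assigned type number 10 (0x0A). This is the basis for understanding how MongoDB stores and handles null values.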
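To make the type code concrete, the sketch below assembles the raw bytes of a single BSON element by hand, following the element layout in the BSON specification: a type byte, the field name as a NUL-terminated string, then the value bytes. The helper names are hypothetical. The code is only an illustration of the layout and is not how the driver is normally used. It assumes field names and values contain no NUL bytes.

```php
<?php
// Null element: type byte 0x0A plus the field name; there are no value bytes.
function bsonNullElement(string $name): string
{
    return "\x0A" . $name . "\x00";
}

// String element: type byte 0x02, field name, little-endian int32 length
// (counting the trailing NUL), the bytes of the string, and a trailing NUL.
function bsonStringElement(string $name, string $value): string
{
    return "\x02" . $name . "\x00" . pack('V', strlen($value) + 1) . $value . "\x00";
}

echo bin2hex(bsonNullElement('a')), "\n";                  // 0a6100
echo strlen(bsonNullElement('field_a')), "\n";             // 9
echo strlen(bsonStringElement('field_a', 'value')), "\n";  // 19
```

An explicit null therefore costs the type byte plus the field name, while a missing field costs nothing.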
BSON type codes (partial):
| Type number | Hex | Type name |
|---|---|---|
| 1 | 0x01 | Double |
| 2 | 0x02 | String |
| 3 | 0x03 | Object |
| 4 | 0x04 | Array |
| 5 | 0x05 | Binary |
| 7 | 0x07 | ObjectId |
| 8 | 0x08 | Boolean |
| 9 | 0x09 | Date |
| 10 | 0x0A | Null |
| 11 | 0x0B | Regex |
| 12 | 0x0C | DBPointer |
| 13 | 0x0D | JavaScript |
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->test->bson_test;
$collection->drop();
$document = [
'string_field' => 'hello',
'null_field' => null,
'int_field' => 42,
'bool_field' => true
];
$collection->insertOne($document);
$insertedDoc = $collection->findOne(['string_field' => 'hello']);
echo "=== BSON类型检测 ===\n\n";
echo "字段类型检测结果:\n";
foreach ($insertedDoc as $field => $value) {
$typeStr = gettype($value);
$bsonType = $value instanceof MongoDB\BSON\Type ? get_class($value) : 'native';
echo "字段: $field, PHP类型: $typeStr, BSON类型: $bsonType\n";
}
echo "\n=== 使用\$type操作符查询 ===\n";
$nullTypeDocs = $collection->find([
'null_field' => ['$type' => 10]
])->toArray();
echo "null_field类型为10(Null)的文档数: " . count($nullTypeDocs) . "\n";
$stringTypeDocs = $collection->find([
'string_field' => ['$type' => 2]
])->toArray();
echo "string_field类型为2(String)的文档数: " . count($stringTypeDocs) . "\n";
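echo "\n=== 使用\$type查询多种类型 ===\n";
// 'missing' is not a valid $type alias, so combine $type with $exists to match null or missing
$multiTypeDocs = $collection->find([
    '$or' => [
        ['null_field' => ['$type' => 'null']],
        ['null_field' => ['$exists' => false]]
    ]
])->toArray();
echo "null_field为null或缺失的文档数: " . count($multiTypeDocs) . "\n";
Output: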
=== BSON类型检测 ===
字段类型检测结果:
字段: _id, PHP类型: object, BSON类型: MongoDB\BSON\ObjectId
字段: string_field, PHP类型: string, BSON类型: native
字段: null_field, PHP类型: NULL, BSON类型: native
字段: int_field, PHP类型: integer, BSON类型: native
字段: bool_field, PHP类型: boolean, BSON类型: native
=== 使用$type操作符查询 ===
null_field类型为10(Null)的文档数: 1
string_field类型为2(String)的文档数: 1
=== 使用$type查询多种类型 ===
null_field为null或缺失的文档数: 1
3.1.2 How Null Values Are Stored
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
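// Decode documents as plain PHP arrays: array_key_exists() rejects objects such as BSONDocument in PHP 8
$client = new Client("mongodb://localhost:27017", [], ['typeMap' => ['root' => 'array', 'document' => 'array', 'array' => 'array']]);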
$collection = $client->test->storage_test;
$collection->drop();
echo "=== Null值存储结构分析 ===\n\n";
$doc1 = ['name' => '文档1', 'field_a' => null];
$doc2 = ['name' => '文档2'];
$doc3 = ['name' => '文档3', 'field_a' => 'value'];
$collection->insertMany([$doc1, $doc2, $doc3]);
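// The command name is collStats, and command() returns a cursor, so read its first result document
$stats = $client->test->command(['collStats' => 'storage_test'])->toArray()[0];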
echo "集合统计信息:\n";
echo "文档数: " . $stats['count'] . "\n";
echo "平均文档大小: " . $stats['avgObjSize'] . " 字节\n";
echo "存储大小: " . $stats['size'] . " 字节\n\n";
echo "=== 文档结构对比 ===\n\n";
$allDocs = $collection->find([])->toArray();
foreach ($allDocs as $doc) {
$fieldA = array_key_exists('field_a', $doc)
? ($doc['field_a'] === null ? 'null' : "'" . $doc['field_a'] . "'")
: '字段缺失';
echo "文档: " . $doc['name'] . ", field_a: " . $fieldA . "\n";
}
echo "\n存储结构说明:\n";
echo "1. 显式null值: 字段名 + 类型标识(0x0A) + 无值部分\n";
echo "2. 字段缺失: 完全不存储该字段\n";
echo "3. 显式null占用存储空间,字段缺失不占用\n";
Output:
=== Null值存储结构分析 ===
集合统计信息:
文档数: 3
平均文档大小: 49 字节
存储大小: 148 字节
=== 文档结构对比 ===
文档: 文档1, field_a: null
文档: 文档2, field_a: 字段缺失
文档: 文档3, field_a: 'value'
存储结构说明:
1. 显式null值: 字段名 + 类型标识(0x0A) + 无值部分
2. 字段缺失: 完全不存储该字段
3. 显式null占用存储空间,字段缺失不占用
3.2 Indexes and Null Values
3.2.1 How Null Values Are Indexed
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
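// Decode documents as plain PHP arrays: array_key_exists() rejects objects such as BSONDocument in PHP 8
$client = new Client("mongodb://localhost:27017", [], ['typeMap' => ['root' => 'array', 'document' => 'array', 'array' => 'array']]);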
$collection = $client->test->index_null_test;
$collection->drop();
$documents = [
['name' => 'A', 'status' => null],
['name' => 'B'],
['name' => 'C', 'status' => 'active'],
['name' => 'D', 'status' => null],
['name' => 'E', 'status' => 'inactive']
];
$collection->insertMany($documents);
$collection->createIndex(['status' => 1]);
echo "=== Null值索引行为 ===\n\n";
echo "索引创建后,查询计划分析:\n\n";
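// A Cursor has no explain() method in the PHP library; explain a Find operation through Collection::explain()
$dbName = $collection->getDatabaseName();
$collName = $collection->getCollectionName();
$explain1 = $collection->explain(new MongoDB\Operation\Find($dbName, $collName, ['status' => null]));
echo "查询 {status: null}:\n";
// The IXSCAN stage usually sits below a FETCH stage, so search the whole winning plan for it
echo "是否使用索引: " . (strpos(json_encode($explain1['queryPlanner']['winningPlan']), '"IXSCAN"') !== false ? '是' : '否') . "\n\n";
$explain2 = $collection->explain(new MongoDB\Operation\Find($dbName, $collName, ['status' => ['$type' => 10]]));
echo "查询 {status: {\$type: 10}}:\n";
echo "是否使用索引: " . (strpos(json_encode($explain2['queryPlanner']['winningPlan']), '"IXSCAN"') !== false ? '是' : '否') . "\n\n";
$explain3 = $collection->explain(new MongoDB\Operation\Find($dbName, $collName, ['status' => ['$exists' => false]]));
echo "查询 {status: {\$exists: false}}:\n";
echo "是否使用索引: " . (strpos(json_encode($explain3['queryPlanner']['winningPlan']), '"IXSCAN"') !== false ? '是' : '否') . "\n\n";
echo "=== 索引中Null值的排序 ===\n\n";
// sort is a find() option, not a cursor method
$sortedDocs = $collection->find([], ['sort' => ['status' => 1]])->toArray();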
echo "按status升序排序结果:\n";
foreach ($sortedDocs as $doc) {
$status = array_key_exists('status', $doc)
? ($doc['status'] === null ? 'null' : $doc['status'])
: 'missing';
echo "name: " . $doc['name'] . ", status: " . $status . "\n";
}
echo "\n说明: 在索引中,null值和缺失字段都被索引,排序时null/缺失值排在最前面\n";
Output:
=== Null值索引行为 ===
索引创建后,查询计划分析:
查询 {status: null}:
是否使用索引: 是
查询 {status: {$type: 10}}:
是否使用索引: 是
查询 {status: {$exists: false}}:
是否使用索引: 是
=== 索引中Null值的排序 ===
按status升序排序结果:
name: A, status: null
name: D, status: null
name: B, status: missing
name: C, status: active
name: E, status: inactive
说明: 在索引中,null值和缺失字段都被索引,排序时null/缺失值排在最前面
3.2.2 Sparse Indexes and Null Values
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->test->sparse_index_test;
$collection->drop();
$documents = [
['name' => 'A', 'score' => 100],
['name' => 'B', 'score' => null],
['name' => 'C'],
['name' => 'D', 'score' => 85],
['name' => 'E', 'score' => null]
];
$collection->insertMany($documents);
$collection->createIndex(['score' => 1], ['sparse' => true]);
echo "=== 稀疏索引与Null值 ===\n\n";
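echo "Sparse index: only documents that lack the field are skipped; explicit null values are still indexed\n\n";
// A Cursor has no explain() method in the PHP library; explain a Find operation through Collection::explain()
$explain1 = $collection->explain(new MongoDB\Operation\Find($collection->getDatabaseName(), $collection->getCollectionName(), ['score' => null]));
echo "查询 {score: null} 是否使用稀疏索引: ";
// {score: null} must also match documents without the field, which a sparse index does not contain
echo (strpos(json_encode($explain1['queryPlanner']['winningPlan']), '"COLLSCAN"') !== false ? '否(全表扫描)' : '是') . "\n\n";
$explain2 = $collection->explain(new MongoDB\Operation\Find($collection->getDatabaseName(), $collection->getCollectionName(), ['score' => ['$gt' => 80]]));
echo "查询 {score: {\$gt: 80}} 是否使用稀疏索引: ";
echo (strpos(json_encode($explain2['queryPlanner']['winningPlan']), '"IXSCAN"') !== false ? '是' : '否') . "\n\n";
echo "=== 稀疏索引查询结果 ===\n\n";
$gt80 = $collection->find(['score' => ['$gt' => 80]])->toArray();
echo "score > 80 的文档:\n";
foreach ($gt80 as $doc) {
    echo "name: " . $doc['name'] . ", score: " . $doc['score'] . "\n";
}
echo "\n=== 普通索引对比 ===\n\n";
$collection2 = $client->test->normal_index_test;
$collection2->drop();
$collection2->insertMany($documents);
$collection2->createIndex(['score' => 1]);
$explain3 = $collection2->explain(new MongoDB\Operation\Find($collection2->getDatabaseName(), $collection2->getCollectionName(), ['score' => null]));
echo "普通索引查询 {score: null} 是否使用索引: ";
echo (strpos(json_encode($explain3['queryPlanner']['winningPlan']), '"IXSCAN"') !== false ? '是' : '否') . "\n";
echo "\nSummary:\n";
echo "- Sparse index: omits documents that lack the field (explicit nulls are still indexed); suits fields present in only a few documents\n";
echo "- Regular index: covers every document and indexes a missing field as null; suits queries for null\n";
Output:
=== 稀疏索引与Null值 ===
Sparse index: only documents that lack the field are skipped; explicit null values are still indexed
查询 {score: null} 是否使用稀疏索引: 否(全表扫描)
查询 {score: {$gt: 80}} 是否使用稀疏索引: 是
=== 稀疏索引查询结果 ===
score > 80 的文档:
name: D, score: 85
name: A, score: 100
=== 普通索引对比 ===
普通索引查询 {score: null} 是否使用索引: 是
Summary:
- Sparse index: omits documents that lack the field (explicit nulls are still indexed); suits fields present in only a few documents
- Regular index: covers every document and indexes a missing field as null; suits queries for null
3.3 Comparing and Sorting Null Values
3.3.1 Comparison Rules for Null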
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
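// Decode documents as plain PHP arrays: array_key_exists() rejects objects such as BSONDocument in PHP 8
$client = new Client("mongodb://localhost:27017", [], ['typeMap' => ['root' => 'array', 'document' => 'array', 'array' => 'array']]);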
$collection = $client->test->comparison_test;
$collection->drop();
$documents = [
['name' => 'A', 'value' => null],
['name' => 'B', 'value' => 0],
['name' => 'C', 'value' => ''],
['name' => 'D', 'value' => false],
['name' => 'E'],
['name' => 'F', 'value' => []]
];
$collection->insertMany($documents);
echo "=== Null值比较规则 ===\n\n";
echo "MongoDB比较排序顺序(从小到大):\n";
echo "1. Null值\n";
echo "2. 数值(Numbers)\n";
echo "3. 字符串(Strings)\n";
echo "4. 对象(Objects)\n";
echo "5. 数组(Arrays)\n";
echo "6. 二进制数据(BinData)\n";
echo "7. ObjectId\n";
echo "8. 布尔值(Boolean)\n";
echo "9. 日期(Date)\n\n";
// sort is a find() option, not a cursor method
$sorted = $collection->find([], ['sort' => ['value' => 1]])->toArray();
echo "按value升序排序结果:\n";
foreach ($sorted as $doc) {
$value = array_key_exists('value', $doc) ? json_encode($doc['value']) : 'missing';
echo "name: " . $doc['name'] . ", value: " . $value . "\n";
}
echo "\n=== 比较操作符与Null ===\n\n";
$gtNull = $collection->find(['value' => ['$gt' => null]])->toArray();
echo "value > null 的文档数: " . count($gtNull) . "\n";
$ltNull = $collection->find(['value' => ['$lt' => null]])->toArray();
echo "value < null 的文档数: " . count($ltNull) . "\n";
$neNull = $collection->find(['value' => ['$ne' => null]])->toArray();
echo "value != null 的文档数: " . count($neNull) . "\n";
echo "(注意:\$ne null 会排除null值和缺失字段)\n";
Output:
=== Null值比较规则 ===
MongoDB比较排序顺序(从小到大):
1. Null值
2. 数值(Numbers)
3. 字符串(Strings)
4. 对象(Objects)
5. 数组(Arrays)
6. 二进制数据(BinData)
7. ObjectId
8. 布尔值(Boolean)
9. 日期(Date)
按value升序排序结果:
name: F, value: []
name: A, value: null
name: E, value: missing
name: B, value: 0
name: C, value: ""
name: D, value: false
=== 比较操作符与Null ===
value > null 的文档数: 0
value < null 的文档数: 0
value != null 的文档数: 4
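(注意:$ne null 会排除null值和缺失字段)
4. Common Mistakes and Pitfalls
4.1 Confusing Null Values with Missing Fields
Symptom: you want the documents where a field was explicitly set to null, but the query {field: null} also returns documents that lack the field.
Cause: in MongoDB's query semantics, {field: null} matches two cases: the field's value is null, and the field does not exist. This is a deliberate design decision, but it is easy to misread.
Solution: use the $type or $exists operator to tell the two cases apart.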
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
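// Decode documents as plain PHP arrays: array_key_exists() rejects objects such as BSONDocument in PHP 8
$client = new Client("mongodb://localhost:27017", [], ['typeMap' => ['root' => 'array', 'document' => 'array', 'array' => 'array']]);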
$collection = $client->test->error_demo1;
$collection->drop();
$documents = [
['name' => '张三', 'nickname' => null],
['name' => '李四'],
['name' => '王五', 'nickname' => '小王']
];
$collection->insertMany($documents);
echo "=== 错误示例:混淆null值与字段缺失 ===\n\n";
echo "错误查询: {nickname: null}\n";
$wrongResult = $collection->find(['nickname' => null])->toArray();
echo "返回文档数: " . count($wrongResult) . "\n";
foreach ($wrongResult as $doc) {
$nickname = array_key_exists('nickname', $doc) ? 'null值' : '字段缺失';
echo " - " . $doc['name'] . ": " . $nickname . "\n";
}
echo "\n正确查询: 使用\$type操作符\n";
$correctResult = $collection->find(['nickname' => ['$type' => 10]])->toArray();
echo "返回文档数: " . count($correctResult) . "\n";
foreach ($correctResult as $doc) {
echo " - " . $doc['name'] . ": 显式null值\n";
}
echo "\n正确查询: 查询字段缺失\n";
$missingResult = $collection->find(['nickname' => ['$exists' => false]])->toArray();
echo "返回文档数: " . count($missingResult) . "\n";
foreach ($missingResult as $doc) {
echo " - " . $doc['name'] . ": 字段缺失\n";
}
Output:
=== 错误示例:混淆null值与字段缺失 ===
错误查询: {nickname: null}
返回文档数: 2
- 张三: null值
- 李四: 字段缺失
正确查询: 使用$type操作符
返回文档数: 1
- 张三: 显式null值
正确查询: 查询字段缺失
返回文档数: 1
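- 李四: 字段缺失
4.2 Errors Caused by Null Values in Aggregation
Symptom: arithmetic in an aggregation pipeline yields null instead of the expected default when it meets a null value.
Cause: MongoDB's arithmetic operators propagate null: if any operand is null or refers to a missing field, the result is null.
Solution: supply a default with the $ifNull or $cond operator.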
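The two operators can be wrapped in small expression builders. This is a sketch with hypothetical helper names: withDefault() applies one default to both null and missing fields through $ifNull, while defaultByState() uses $cond together with the aggregation $type operator, which returns the string "missing" for an absent field, so that each case can get its own default.

```php
<?php
// Hypothetical helper: replace a null or missing field with $default.
function withDefault(string $fieldPath, $default): array
{
    return ['$ifNull' => [$fieldPath, $default]];
}

// Hypothetical helper: use $ifMissing when the field is absent,
// $ifNull when it is an explicit null, and the field's own value otherwise.
function defaultByState(string $fieldPath, $ifMissing, $ifNull): array
{
    return [
        '$cond' => [
            ['$eq' => [['$type' => $fieldPath], 'missing']],
            $ifMissing,
            withDefault($fieldPath, $ifNull),
        ],
    ];
}

echo json_encode(withDefault('$discount', 0)), "\n"; // {"$ifNull":["$discount",0]}
echo json_encode(defaultByState('$discount', -1, 0)), "\n";
```

Either expression can be placed wherever a field reference would go, for example as the second operand of $subtract inside a $project stage.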
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->test->error_demo2;
$collection->drop();
$orders = [
['order_id' => '001', 'price' => 100, 'discount' => 0.1],
['order_id' => '002', 'price' => 200, 'discount' => null],
['order_id' => '003', 'price' => 150],
['order_id' => '004', 'price' => 300, 'discount' => 0.2]
];
$collection->insertMany($orders);
echo "=== 错误示例:聚合操作中null值导致的错误 ===\n\n";
echo "错误聚合: 直接使用discount字段计算\n";
$wrongPipeline = [
[
'$project' => [
'order_id' => 1,
'price' => 1,
'discount' => 1,
'final_price' => [
'$multiply' => ['$price', ['$subtract' => [1, '$discount']]]
]
]
]
];
$wrongResult = $collection->aggregate($wrongPipeline)->toArray();
foreach ($wrongResult as $doc) {
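// isset() is false for null, so use array_key_exists() to tell null from missing
$discount = array_key_exists('discount', (array) $doc) ? ($doc['discount'] === null ? 'null' : $doc['discount']) : 'missing';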
$finalPrice = $doc['final_price'] === null ? 'null' : $doc['final_price'];
echo "订单: " . $doc['order_id'] . ", 折扣: " . $discount . ", 最终价格: " . $finalPrice . "\n";
}
echo "\n正确聚合: 使用\$ifNull提供默认值\n";
$correctPipeline = [
[
'$project' => [
'order_id' => 1,
'price' => 1,
'discount' => 1,
'effective_discount' => ['$ifNull' => ['$discount', 0]],
'final_price' => [
'$multiply' => [
'$price',
['$subtract' => [1, ['$ifNull' => ['$discount', 0]]]]
]
]
]
]
];
$correctResult = $collection->aggregate($correctPipeline)->toArray();
foreach ($correctResult as $doc) {
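// isset() is false for null, so use array_key_exists() to tell null from missing
$discount = array_key_exists('discount', (array) $doc) ? ($doc['discount'] === null ? 'null' : $doc['discount']) : 'missing';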
echo "订单: " . $doc['order_id'] . ", 折扣: " . $discount . ", 有效折扣: " . $doc['effective_discount'] . ", 最终价格: " . $doc['final_price'] . "\n";
}
Output:
=== 错误示例:聚合操作中null值导致的错误 ===
错误聚合: 直接使用discount字段计算
订单: 001, 折扣: 0.1, 最终价格: 90
订单: 002, 折扣: null, 最终价格: null
订单: 003, 折扣: missing, 最终价格: null
订单: 004, 折扣: 0.2, 最终价格: 240
正确聚合: 使用$ifNull提供默认值
订单: 001, 折扣: 0.1, 有效折扣: 0.1, 最终价格: 90
订单: 002, 折扣: null, 有效折扣: 0, 最终价格: 200
订单: 003, 折扣: missing, 有效折扣: 0, 最终价格: 150
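订单: 004, 折扣: 0.2, 有效折扣: 0.2, 最终价格: 240
4.3 Misusing Null in Update Operations
Symptom: a field is set to null to "clear" it, but the application actually needs to distinguish between clearing a value and removing the field.
Cause: confusing $set: {field: null} with $unset: {field: ""}.
Solution: choose the update operator that matches the business meaning.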
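One way to keep that choice explicit in application code is to build the update document in a single place. The helper below is a hypothetical sketch. It only returns the update array that would be passed to updateOne().

```php
<?php
// Hypothetical helper: build an update document that either keeps the field
// with an explicit null ("deliberately empty") or removes it from the document.
function clearFieldUpdate(string $field, bool $keepAsNull): array
{
    return $keepAsNull
        ? ['$set' => [$field => null]]
        : ['$unset' => [$field => '']];
}

echo json_encode(clearFieldUpdate('avatar', true)), "\n";  // {"$set":{"avatar":null}}
echo json_encode(clearFieldUpdate('avatar', false)), "\n"; // {"$unset":{"avatar":""}}
```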
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
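// Decode documents as plain PHP arrays: array_key_exists() rejects objects such as BSONDocument in PHP 8
$client = new Client("mongodb://localhost:27017", [], ['typeMap' => ['root' => 'array', 'document' => 'array', 'array' => 'array']]);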
$collection = $client->test->error_demo3;
$collection->drop();
$document = [
'name' => '张三',
'nickname' => '小张',
'avatar' => 'https://example.com/avatar.jpg'
];
$collection->insertOne($document);
echo "=== 错误示例:更新操作中null值的误用 ===\n\n";
$original = $collection->findOne(['name' => '张三']);
echo "原始文档:\n";
print_r($original);
echo "\n场景1: 使用\$set设置为null\n";
$collection->updateOne(
['name' => '张三'],
['$set' => ['nickname' => null]]
);
$afterSetNull = $collection->findOne(['name' => '张三']);
echo "nickname字段" . (array_key_exists('nickname', $afterSetNull) ? "存在" : "不存在") . "\n";
echo "nickname值: " . ($afterSetNull['nickname'] === null ? 'null' : $afterSetNull['nickname']) . "\n";
echo "\n场景2: 使用\$unset删除字段\n";
$collection->updateOne(
['name' => '张三'],
['$unset' => ['avatar' => '']]
);
$afterUnset = $collection->findOne(['name' => '张三']);
echo "avatar字段" . (array_key_exists('avatar', $afterUnset) ? "存在" : "不存在") . "\n";
echo "\n最终文档:\n";
print_r($afterUnset);
echo "\n选择建议:\n";
echo "- \$set {field: null}: 字段存在,值为null,表示'明确为空'\n";
echo "- \$unset {field: ''}: 字段被删除,表示'字段不存在'\n";
echo "- 根据业务语义选择: 用户清空头像用null,用户从未设置用unset\n";
Output:
=== 错误示例:更新操作中null值的误用 ===
原始文档:
Array
(
[_id] => MongoDB\BSON\ObjectId Object
(
[oid] => 6789abcdef1234567890abcd
)
[name] => 张三
[nickname] => 小张
[avatar] => https://example.com/avatar.jpg
)
场景1: 使用$set设置为null
nickname字段存在
nickname值: null
场景2: 使用$unset删除字段
avatar字段不存在
最终文档:
Array
(
[_id] => MongoDB\BSON\ObjectId Object
(
[oid] => 6789abcdef1234567890abcd
)
[name] => 张三
[nickname] =>
)
选择建议:
- $set {field: null}: 字段存在,值为null,表示'明确为空'
- $unset {field: ''}: 字段被删除,表示'字段不存在'
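- 根据业务语义选择: 用户清空头像用null,用户从未设置用unset
4.4 Unique Indexes and Null Values
Symptom: after creating a unique index, only one document with a null value can be inserted; later inserts fail.
Cause: a unique index treats null as an ordinary value, and a document that lacks the field is indexed as null too, so at most one document may have a null or missing value for the indexed field.
Solution: use a sparse unique index when the field is simply omitted (it still allows only one explicit null), or a partial index with a filter expression when explicit nulls must also be allowed.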
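The partial-index variant can be expressed as a reusable options array. This is a sketch with a hypothetical helper name. It assumes uniqueness should apply only to string values, so that documents with a null or missing field stay outside the index.

```php
<?php
// Hypothetical helper: options for a unique index that is enforced only for
// documents where $field is a string; null and missing values are not indexed.
function uniqueStringIndexOptions(string $field): array
{
    return [
        'unique' => true,
        'partialFilterExpression' => [$field => ['$type' => 'string']],
    ];
}

// With the PHP library the options would be passed as the second argument:
// $collection->createIndex(['email' => 1], uniqueStringIndexOptions('email'));
echo json_encode(uniqueStringIndexOptions('email')), "\n";
```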
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
use MongoDB\Driver\Exception\BulkWriteException;
$client = new Client("mongodb://localhost:27017");
$collection = $client->test->error_demo4;
$collection->drop();
echo "=== 错误示例:唯一索引与Null值 ===\n\n";
$collection->createIndex(['email' => 1], ['unique' => true]);
$collection->insertOne(['name' => '张三', 'email' => 'zhangsan@example.com']);
$collection->insertOne(['name' => '李四', 'email' => null]);
echo "成功插入: 张三 (email: zhangsan@example.com)\n";
echo "成功插入: 李四 (email: null)\n";
echo "\n尝试插入另一个email为null的文档:\n";
try {
$collection->insertOne(['name' => '王五', 'email' => null]);
echo "插入成功\n";
} catch (BulkWriteException $e) {
echo "插入失败: " . $e->getMessage() . "\n";
}
echo "\n=== 解决方案1: 稀疏唯一索引 ===\n\n";
$collection2 = $client->test->sparse_unique_test;
$collection2->drop();
$collection2->createIndex(['email' => 1], ['unique' => true, 'sparse' => true]);
$collection2->insertOne(['name' => '张三', 'email' => 'zhangsan@example.com']);
$collection2->insertOne(['name' => '李四']);
$collection2->insertOne(['name' => '王五']);
echo "稀疏唯一索引允许多个字段缺失的文档\n";
echo "文档数: " . $collection2->countDocuments() . "\n";
echo "\n=== 解决方案2: 部分索引表达式 ===\n\n";
$collection3 = $client->test->partial_index_test;
$collection3->drop();
$collection3->createIndex(
['email' => 1],
[
'unique' => true,
'partialFilterExpression' => ['email' => ['$type' => 'string']]
]
);
$collection3->insertOne(['name' => '张三', 'email' => 'zhangsan@example.com']);
$collection3->insertOne(['name' => '李四', 'email' => null]);
$collection3->insertOne(['name' => '王五', 'email' => null]);
$collection3->insertOne(['name' => '赵六']);
echo "部分索引只对email为string类型的文档强制唯一性\n";
echo "文档数: " . $collection3->countDocuments() . "\n";
Output:
=== 错误示例:唯一索引与Null值 ===
成功插入: 张三 (email: zhangsan@example.com)
成功插入: 李四 (email: null)
尝试插入另一个email为null的文档:
插入失败: E11000 duplicate key error collection: test.error_demo4 index: email_1 dup key: { email: null }
=== 解决方案1: 稀疏唯一索引 ===
稀疏唯一索引允许多个字段缺失的文档
文档数: 3
=== 解决方案2: 部分索引表达式 ===
部分索引只对email为string类型的文档强制唯一性
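文档数: 4
4.5 PHP Type Conversion Pitfalls
Symptom: PHP's loose comparisons can misjudge null values, especially when handling documents returned by MongoDB.
Cause: empty() and == treat null, the empty string, 0, false, and an empty array alike, which can lead to false positives.
Solution: use strict comparison (===) and explicit type checks.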
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
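// Decode documents and arrays as plain PHP arrays: array_key_exists() rejects BSONDocument in PHP 8, and empty([]) only holds for real arrays
$client = new Client("mongodb://localhost:27017", [], ['typeMap' => ['root' => 'array', 'document' => 'array', 'array' => 'array']]);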
$collection = $client->test->error_demo5;
$collection->drop();
$documents = [
['name' => 'A', 'value' => null],
['name' => 'B', 'value' => 0],
['name' => 'C', 'value' => ''],
['name' => 'D', 'value' => false],
['name' => 'E', 'value' => []]
];
$collection->insertMany($documents);
echo "=== 错误示例:PHP类型转换陷阱 ===\n\n";
$allDocs = $collection->find([])->toArray();
echo "错误判断: 使用empty()\n";
foreach ($allDocs as $doc) {
$isEmpty = empty($doc['value']);
$valueStr = json_encode($doc['value']);
echo "name: " . $doc['name'] . ", value: " . $valueStr . ", empty(): " . ($isEmpty ? 'true' : 'false') . "\n";
}
echo "\n错误判断: 使用 == null\n";
foreach ($allDocs as $doc) {
$isNull = $doc['value'] == null;
$valueStr = json_encode($doc['value']);
echo "name: " . $doc['name'] . ", value: " . $valueStr . ", == null: " . ($isNull ? 'true' : 'false') . "\n";
}
echo "\n正确判断: 使用 === null\n";
foreach ($allDocs as $doc) {
$isNull = $doc['value'] === null;
$valueStr = json_encode($doc['value']);
echo "name: " . $doc['name'] . ", value: " . $valueStr . ", === null: " . ($isNull ? 'true' : 'false') . "\n";
}
echo "\n正确判断: 使用array_key_exists检查字段存在\n";
foreach ($allDocs as $doc) {
$fieldExists = array_key_exists('value', $doc);
$isNull = $doc['value'] === null;
echo "name: " . $doc['name'] . ", 字段存在: " . ($fieldExists ? '是' : '否') . ", 值为null: " . ($isNull ? '是' : '否') . "\n";
}
echo "\n最佳实践:\n";
echo "1. 使用 === null 判断是否为null值\n";
echo "2. 使用 array_key_exists() 判断字段是否存在\n";
echo "3. 避免使用 empty() 判断null,因为0、''、false也会返回true\n";
Output:
=== 错误示例:PHP类型转换陷阱 ===
错误判断: 使用empty()
name: A, value: null, empty(): true
name: B, value: 0, empty(): true
name: C, value: "", empty(): true
name: D, value: false, empty(): true
name: E, value: [], empty(): true
错误判断: 使用 == null
name: A, value: null, == null: true
name: B, value: 0, == null: true
name: C, value: "", == null: true
name: D, value: false, == null: true
name: E, value: [], == null: true
正确判断: 使用 === null
name: A, value: null, === null: true
name: B, value: 0, === null: false
name: C, value: "", === null: false
name: D, value: false, === null: false
name: E, value: [], === null: false
正确判断: 使用array_key_exists检查字段存在
name: A, 字段存在: 是, 值为null: 是
name: B, 字段存在: 是, 值为null: 否
name: C, 字段存在: 是, 值为null: 否
name: D, 字段存在: 是, 值为null: 否
name: E, 字段存在: 是, 值为null: 否
最佳实践:
1. 使用 === null 判断是否为null值
2. 使用 array_key_exists() 判断字段是否存在
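3. 避免使用 empty() 判断null,因为0、''、false也会返回true
5. Common Use Cases
5.1 Handling Optional Fields
Scenario: optional fields in a user profile, such as middle name, nickname, or bio, which the user may or may not fill in.
Approach: for optional fields, prefer leaving the field out over storing an explicit null. This saves storage space and simplifies queries.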
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
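// Decode documents as plain PHP arrays: getProfile() declares an ?array return type and the code below calls array_key_exists()
$client = new Client("mongodb://localhost:27017", [], ['typeMap' => ['root' => 'array', 'document' => 'array', 'array' => 'array']]);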
$collection = $client->test->user_profiles;
$collection->drop();
class UserProfileService
{
private $collection;
public function __construct($collection)
{
$this->collection = $collection;
}
public function createProfile(array $data): string
{
$profile = [
'user_id' => $data['user_id'],
'username' => $data['username'],
'created_at' => new MongoDB\BSON\UTCDateTime()
];
if (!empty($data['nickname'])) {
$profile['nickname'] = $data['nickname'];
}
if (!empty($data['bio'])) {
$profile['bio'] = $data['bio'];
}
if (!empty($data['avatar'])) {
$profile['avatar'] = $data['avatar'];
}
$result = $this->collection->insertOne($profile);
return (string)$result->getInsertedId();
}
public function updateProfile(string $userId, array $data): bool
{
$updateOps = ['$set' => ['updated_at' => new MongoDB\BSON\UTCDateTime()]];
if (array_key_exists('nickname', $data)) {
if ($data['nickname'] === null || $data['nickname'] === '') {
$updateOps['$unset']['nickname'] = '';
} else {
$updateOps['$set']['nickname'] = $data['nickname'];
}
}
if (array_key_exists('bio', $data)) {
if ($data['bio'] === null || $data['bio'] === '') {
$updateOps['$unset']['bio'] = '';
} else {
$updateOps['$set']['bio'] = $data['bio'];
}
}
if (array_key_exists('avatar', $data)) {
if ($data['avatar'] === null) {
$updateOps['$set']['avatar'] = null;
} elseif ($data['avatar'] === '') {
$updateOps['$unset']['avatar'] = '';
} else {
$updateOps['$set']['avatar'] = $data['avatar'];
}
}
$result = $this->collection->updateOne(
['user_id' => $userId],
$updateOps
);
return $result->getModifiedCount() > 0;
}
public function getProfile(string $userId): ?array
{
return $this->collection->findOne(['user_id' => $userId]);
}
public function findProfilesWithNickname(): array
{
return $this->collection->find([
'nickname' => ['$exists' => true, '$ne' => null]
])->toArray();
}
}
$service = new UserProfileService($collection);
$service->createProfile([
'user_id' => 'user001',
'username' => 'zhangsan',
'nickname' => '小张',
'bio' => '这是我的个人简介'
]);
$service->createProfile([
'user_id' => 'user002',
'username' => 'lisi'
]);
$service->createProfile([
'user_id' => 'user003',
'username' => 'wangwu',
'nickname' => '小王'
]);
echo "=== 可选字段处理示例 ===\n\n";
echo "用户user001的资料:\n";
$profile1 = $service->getProfile('user001');
print_r($profile1);
echo "\n用户user002的资料:\n";
$profile2 = $service->getProfile('user002');
print_r($profile2);
echo "\n更新user002的昵称:\n";
$service->updateProfile('user002', ['nickname' => '小李']);
$profile2Updated = $service->getProfile('user002');
print_r($profile2Updated);
echo "\n清空user001的昵称:\n";
$service->updateProfile('user001', ['nickname' => '']);
$profile1Updated = $service->getProfile('user001');
echo "nickname字段" . (array_key_exists('nickname', $profile1Updated) ? "存在" : "不存在") . "\n";
echo "\n查询有昵称的用户:\n";
$withNickname = $service->findProfilesWithNickname();
foreach ($withNickname as $doc) {
echo "用户: " . $doc['username'] . ", 昵称: " . $doc['nickname'] . "\n";
}
Output:
=== 可选字段处理示例 ===
用户user001的资料:
Array
(
[_id] => MongoDB\BSON\ObjectId Object
(
[oid] => 6789abcdef1234567890abcd
)
[user_id] => user001
[username] => zhangsan
[created_at] => MongoDB\BSON\UTCDateTime Object
(
[milliseconds] => 1704067200000
)
[nickname] => 小张
[bio] => 这是我的个人简介
)
用户user002的资料:
Array
(
[_id] => MongoDB\BSON\ObjectId Object
(
[oid] => 6789abcdef1234567890abcd
)
[user_id] => user002
[username] => lisi
[created_at] => MongoDB\BSON\UTCDateTime Object
(
[milliseconds] => 1704067200000
)
)
更新user002的昵称:
Array
(
[_id] => MongoDB\BSON\ObjectId Object
(
[oid] => 6789abcdef1234567890abcd
)
[user_id] => user002
[username] => lisi
[created_at] => MongoDB\BSON\UTCDateTime Object
(
[milliseconds] => 1704067200000
)
[nickname] => 小李
[updated_at] => MongoDB\BSON\UTCDateTime Object
(
[milliseconds] => 1704067200000
)
)
清空user001的昵称:
nickname字段不存在
查询有昵称的用户:
用户: lisi, 昵称: 小李
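用户: wangwu, 昵称: 小王
5.2 Soft Delete Implementation
Scenario: Implement soft deletion with a deleted_at field that marks the deletion state: null means the record has not been deleted, and a timestamp means it has.
Usage: deleted_at defaults to null, is set to the current timestamp on deletion, and is used in queries to filter out deleted records.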
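The filtering rule above can be sketched without a database connection. The helper names `activeFilter` and `deletedFilter` are hypothetical and not part of the MongoDB library; they only build the filter arrays that would be passed to `find()` or `countDocuments()`. One detail worth keeping in mind: in MongoDB, the filter `['deleted_at' => null]` also matches documents in which the field is missing entirely.

```php
<?php
// Hypothetical helpers: they only build filter arrays and never touch a database.

/** Adds the "not deleted" condition to a caller-supplied filter. */
function activeFilter(array $filter = []): array
{
    // {deleted_at: null} matches an explicit null as well as a missing field.
    $filter['deleted_at'] = null;
    return $filter;
}

/** Adds the "already deleted" condition to a caller-supplied filter. */
function deletedFilter(array $filter = []): array
{
    // {$ne: null} excludes explicit nulls and documents that lack the field.
    $filter['deleted_at'] = ['$ne' => null];
    return $filter;
}

echo json_encode(activeFilter(['author' => 'zhangsan'])), "\n";
echo json_encode(deletedFilter(['author' => 'zhangsan'])), "\n";
```

Keeping the condition in one place makes it harder to forget the deleted_at check in an individual query.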
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->test->articles;
$collection->drop();
$collection->createIndex(['deleted_at' => 1]);
class SoftDeleteService
{
private $collection;
public function __construct($collection)
{
$this->collection = $collection;
}
public function create(array $data): string
{
$data['deleted_at'] = null;
$data['created_at'] = new MongoDB\BSON\UTCDateTime();
$result = $this->collection->insertOne($data);
return (string)$result->getInsertedId();
}
public function findActive(array $filter = []): array
{
$filter['deleted_at'] = null;
return $this->collection->find($filter)->toArray();
}
public function findDeleted(array $filter = []): array
{
$filter['deleted_at'] = ['$ne' => null];
return $this->collection->find($filter)->toArray();
}
public function softDelete(string $id): bool
{
$result = $this->collection->updateOne(
[
'_id' => new MongoDB\BSON\ObjectId($id),
'deleted_at' => null
],
['$set' => ['deleted_at' => new MongoDB\BSON\UTCDateTime()]]
);
return $result->getModifiedCount() > 0;
}
public function restore(string $id): bool
{
$result = $this->collection->updateOne(
['_id' => new MongoDB\BSON\ObjectId($id)],
['$set' => ['deleted_at' => null]]
);
return $result->getModifiedCount() > 0;
}
public function hardDelete(string $id): bool
{
$result = $this->collection->deleteOne(
['_id' => new MongoDB\BSON\ObjectId($id)]
);
return $result->getDeletedCount() > 0;
}
public function countActive(): int
{
return $this->collection->countDocuments(['deleted_at' => null]);
}
public function countDeleted(): int
{
return $this->collection->countDocuments(['deleted_at' => ['$ne' => null]]);
}
}
$service = new SoftDeleteService($collection);
$id1 = $service->create(['title' => '文章1', 'content' => '内容1']);
$id2 = $service->create(['title' => '文章2', 'content' => '内容2']);
$id3 = $service->create(['title' => '文章3', 'content' => '内容3']);
echo "=== 软删除实现示例 ===\n\n";
echo "初始状态:\n";
echo "活跃文章数: " . $service->countActive() . "\n";
echo "已删除文章数: " . $service->countDeleted() . "\n";
echo "\n软删除文章1:\n";
$service->softDelete($id1);
echo "活跃文章数: " . $service->countActive() . "\n";
echo "已删除文章数: " . $service->countDeleted() . "\n";
echo "\n活跃文章列表:\n";
$activeArticles = $service->findActive();
foreach ($activeArticles as $article) {
echo " - " . $article['title'] . "\n";
}
echo "\n已删除文章列表:\n";
$deletedArticles = $service->findDeleted();
foreach ($deletedArticles as $article) {
echo " - " . $article['title'] . "\n";
}
echo "\n恢复文章1:\n";
$service->restore($id1);
echo "活跃文章数: " . $service->countActive() . "\n";
echo "\n永久删除文章2:\n";
$service->hardDelete($id2);
echo "总文章数: " . $collection->countDocuments() . "\n";
Output:
=== 软删除实现示例 ===
初始状态:
活跃文章数: 3
已删除文章数: 0
软删除文章1:
活跃文章数: 2
已删除文章数: 1
活跃文章列表:
- 文章2
- 文章3
已删除文章列表:
- 文章1
恢复文章1:
活跃文章数: 3
永久删除文章2:
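总文章数: 2
5.3 Conditional Configuration Management
Scenario: Some system settings only take effect under certain conditions. A null value means "use the default" or "not set".
Usage: A null configuration value falls back to the system default; any non-null value is treated as a custom setting.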
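The fallback rule can be illustrated in plain PHP. This is a minimal sketch: `effectiveValue` is a hypothetical helper, and `$stored` stands in for values that would be loaded from the configuration collection.

```php
<?php
// Hypothetical sketch: $stored stands in for values loaded from a config collection.

/** Returns the custom value when it is set and non-null, otherwise the default. */
function effectiveValue(array $stored, array $defaults, string $key)
{
    // ?? treats a missing key and an explicit null the same way, which matches
    // the "null means use the default" convention. A stored false is preserved.
    return $stored[$key] ?? $defaults[$key] ?? null;
}

$defaults = ['cache_ttl' => 3600, 'debug_mode' => false];
$stored = ['cache_ttl' => null, 'debug_mode' => true];

var_dump(effectiveValue($stored, $defaults, 'cache_ttl'));  // int(3600)
var_dump(effectiveValue($stored, $defaults, 'debug_mode')); // bool(true)
var_dump(effectiveValue([], $defaults, 'timeout'));         // NULL
```

The null coalescing operator is used rather than `?:` because `?:` would also replace legitimate falsy values such as `0` or `false` with the default.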
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->test->system_configs;
$collection->drop();
class ConfigService
{
private $collection;
private $defaults = [
'cache_ttl' => 3600,
'max_connections' => 100,
'timeout' => 30,
'retry_count' => 3,
'debug_mode' => false
];
public function __construct($collection)
{
$this->collection = $collection;
}
public function setConfig(string $key, $value, ?string $description = null): void
{
$updateData = [
'key' => $key,
'value' => $value,
'updated_at' => new MongoDB\BSON\UTCDateTime()
];
if ($description !== null) {
$updateData['description'] = $description;
}
$this->collection->updateOne(
['key' => $key],
['$set' => $updateData],
['upsert' => true]
);
}
public function getConfig(string $key)
{
$config = $this->collection->findOne(['key' => $key]);
if ($config === null) {
return $this->defaults[$key] ?? null;
}
if ($config['value'] === null) {
return $this->defaults[$key] ?? null;
}
return $config['value'];
}
public function resetToDefault(string $key): void
{
$this->setConfig($key, null, '已重置为默认值');
}
public function getAllConfigsWithDefaults(): array
{
$configs = [];
foreach ($this->defaults as $key => $defaultValue) {
$config = $this->collection->findOne(['key' => $key]);
$configs[$key] = [
'default_value' => $defaultValue,
'custom_value' => $config ? $config['value'] : null,
'effective_value' => $this->getConfig($key),
'is_custom' => $config && $config['value'] !== null
];
}
return $configs;
}
}
$service = new ConfigService($collection);
echo "=== 条件配置管理示例 ===\n\n";
echo "获取默认配置:\n";
echo "cache_ttl: " . $service->getConfig('cache_ttl') . "\n";
echo "max_connections: " . $service->getConfig('max_connections') . "\n";
echo "\n设置自定义配置:\n";
$service->setConfig('cache_ttl', 7200, '生产环境缓存时间');
$service->setConfig('max_connections', 200, '高并发连接数');
echo "cache_ttl: " . $service->getConfig('cache_ttl') . "\n";
echo "max_connections: " . $service->getConfig('max_connections') . "\n";
echo "\n重置为默认值:\n";
$service->resetToDefault('cache_ttl');
echo "cache_ttl: " . $service->getConfig('cache_ttl') . " (使用默认值)\n";
echo "\n所有配置概览:\n";
$allConfigs = $service->getAllConfigsWithDefaults();
foreach ($allConfigs as $key => $config) {
echo "$key:\n";
echo " 默认值: " . $config['default_value'] . "\n";
echo " 自定义值: " . ($config['custom_value'] === null ? 'null(使用默认)' : $config['custom_value']) . "\n";
echo " 有效值: " . $config['effective_value'] . "\n";
echo " 是否自定义: " . ($config['is_custom'] ? '是' : '否') . "\n";
}
Output:
=== 条件配置管理示例 ===
获取默认配置:
cache_ttl: 3600
max_connections: 100
设置自定义配置:
cache_ttl: 7200
max_connections: 200
重置为默认值:
cache_ttl: 3600 (使用默认值)
所有配置概览:
cache_ttl:
默认值: 3600
自定义值: null(使用默认)
有效值: 3600
是否自定义: 否
max_connections:
默认值: 100
自定义值: 200
有效值: 200
是否自定义: 是
timeout:
默认值: 30
自定义值: null(使用默认)
有效值: 30
是否自定义: 否
retry_count:
默认值: 3
自定义值: null(使用默认)
有效值: 3
是否自定义: 否
debug_mode:
默认值:
自定义值: null(使用默认)
有效值:
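是否自定义: 否
5.4 Asynchronous Task State Management
Scenario: While an asynchronous task is being processed, its result field starts out as null. It is filled with the actual result once the task completes, and an error message is recorded if the task fails.
Usage: Store the task result in the result field, which is null initially and set to the actual value on completion. Record failures in the error field, where null means no error.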
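To make the null conventions concrete, the sketch below derives a task's state purely from its nullable fields. `describeTask` is a hypothetical helper, and it assumes a task array with the keys `error`, `started_at`, and `completed_at`; a real schema may also carry an explicit status field.

```php
<?php
// Hypothetical helper: interprets the null conventions of a task document.

function describeTask(array $task): string
{
    // A non-null error always means the task failed.
    if ($task['error'] !== null) {
        return 'failed: ' . $task['error'];
    }
    // No error and a completion time: the task finished successfully.
    if ($task['completed_at'] !== null) {
        return 'completed';
    }
    // Neither finished nor failed: distinguish by whether work has started.
    return $task['started_at'] === null ? 'pending' : 'processing';
}

$fresh = ['error' => null, 'started_at' => null, 'completed_at' => null];
echo describeTask($fresh), "\n"; // pending
```

The completion check uses `completed_at` rather than `result`, since a successful task could legitimately produce a null result.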
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->test->async_tasks;
$collection->drop();
class AsyncTaskService
{
private $collection;
public function __construct($collection)
{
$this->collection = $collection;
}
public function createTask(string $type, array $payload): string
{
$task = [
'type' => $type,
'payload' => $payload,
'status' => 'pending',
'result' => null,
'error' => null,
'created_at' => new MongoDB\BSON\UTCDateTime(),
'started_at' => null,
'completed_at' => null
];
$result = $this->collection->insertOne($task);
return (string)$result->getInsertedId();
}
public function startTask(string $taskId): void
{
$this->collection->updateOne(
['_id' => new MongoDB\BSON\ObjectId($taskId)],
[
'$set' => [
'status' => 'processing',
'started_at' => new MongoDB\BSON\UTCDateTime()
]
]
);
}
public function completeTask(string $taskId, $result): void
{
$this->collection->updateOne(
['_id' => new MongoDB\BSON\ObjectId($taskId)],
[
'$set' => [
'status' => 'completed',
'result' => $result,
'completed_at' => new MongoDB\BSON\UTCDateTime()
]
]
);
}
public function failTask(string $taskId, string $error): void
{
$this->collection->updateOne(
['_id' => new MongoDB\BSON\ObjectId($taskId)],
[
'$set' => [
'status' => 'failed',
'error' => $error,
'completed_at' => new MongoDB\BSON\UTCDateTime()
]
]
);
}
public function getPendingTasks(): array
{
return $this->collection->find([
'status' => 'pending'
])->toArray();
}
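public function getTask(string $taskId): ?array
{
// Under the library's default type map, findOne() returns a BSONDocument object,
// which would violate the ?array return type. Request plain PHP arrays instead.
return $this->collection->findOne(
['_id' => new MongoDB\BSON\ObjectId($taskId)],
['typeMap' => ['root' => 'array', 'document' => 'array', 'array' => 'array']]
);
}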
public function getTaskResult(string $taskId)
{
$task = $this->getTask($taskId);
if ($task === null) {
return null;
}
return [
'status' => $task['status'],
'result' => $task['result'],
'error' => $task['error']
];
}
public function getProcessingTasks(): array
{
return $this->collection->find([
'status' => 'processing',
'result' => null
])->toArray();
}
}
$service = new AsyncTaskService($collection);
echo "=== 异步任务状态管理示例 ===\n\n";
$taskId1 = $service->createTask('email_send', [
'to' => 'user@example.com',
'subject' => '测试邮件',
'body' => '这是一封测试邮件'
]);
$taskId2 = $service->createTask('data_export', [
'format' => 'csv',
'query' => ['status' => 'active']
]);
$taskId3 = $service->createTask('image_process', [
'image_url' => 'https://example.com/image.jpg',
'operations' => ['resize', 'compress']
]);
echo "创建任务:\n";
echo "任务1: $taskId1 (邮件发送)\n";
echo "任务2: $taskId2 (数据导出)\n";
echo "任务3: $taskId3 (图片处理)\n";
echo "\n待处理任务数: " . count($service->getPendingTasks()) . "\n";
echo "\n处理任务1:\n";
$service->startTask($taskId1);
$service->completeTask($taskId1, [
'message_id' => 'msg_12345',
'sent_at' => '2024-01-01 10:00:00'
]);
$result1 = $service->getTaskResult($taskId1);
echo "状态: " . $result1['status'] . "\n";
echo "结果: " . json_encode($result1['result']) . "\n";
echo "\n处理任务3(模拟失败):\n";
$service->startTask($taskId3);
$service->failTask($taskId3, '图片下载失败: 404 Not Found');
$result3 = $service->getTaskResult($taskId3);
echo "状态: " . $result3['status'] . "\n";
echo "错误: " . $result3['error'] . "\n";
echo "\n查询处理中且无结果的任务:\n";
$processingTasks = $service->getProcessingTasks();
echo "处理中任务数: " . count($processingTasks) . "\n";
echo "\n任务2状态:\n";
$result2 = $service->getTaskResult($taskId2);
echo "状态: " . $result2['status'] . "\n";
echo "结果: " . ($result2['result'] === null ? 'null(待处理)' : json_encode($result2['result'])) . "\n";
Output:
=== 异步任务状态管理示例 ===
创建任务:
任务1: 6789abcdef1234567890abcd (邮件发送)
任务2: 6789abcdef1234567890abce (数据导出)
任务3: 6789abcdef1234567890abcf (图片处理)
待处理任务数: 3
处理任务1:
状态: completed
结果: {"message_id":"msg_12345","sent_at":"2024-01-01 10:00:00"}
处理任务3(模拟失败):
状态: failed
错误: 图片下载失败: 404 Not Found
查询处理中且无结果的任务:
处理中任务数: 0
任务2状态:
状态: pending
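结果: null(待处理)
5.5 User Preference Settings
Scenario: Some of a user's personalized preferences may never have been configured. A null value means "use the system default".
Usage: A null preference field falls back to the system default; a non-null field holds the user's custom value.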
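Preferences are often nested (for example `notifications.email`), so the fallback has to work along a dotted path. The following is a minimal sketch on plain arrays; `lookup` and `preference` are hypothetical names chosen for illustration.

```php
<?php
// Hypothetical sketch: resolves a dotted key against nested arrays.

/** Walks a dotted path; returns null when any segment is missing. */
function lookup(array $data, string $path)
{
    $value = $data;
    foreach (explode('.', $path) as $segment) {
        if (!is_array($value) || !array_key_exists($segment, $value)) {
            return null;
        }
        $value = $value[$segment];
    }
    return $value;
}

/** User value when it is non-null, otherwise the system default. */
function preference(array $userPrefs, array $defaults, string $path)
{
    return lookup($userPrefs, $path) ?? lookup($defaults, $path);
}

$defaults = ['theme' => 'light', 'notifications' => ['email' => true]];
$userPrefs = ['theme' => null, 'notifications' => ['email' => false]];

var_dump(preference($userPrefs, $defaults, 'theme'));               // string(5) "light"
var_dump(preference($userPrefs, $defaults, 'notifications.email')); // bool(false)
```

Because the check is for null rather than for falsiness, a user who explicitly turns a setting off keeps that `false` value.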
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->test->user_preferences;
$collection->drop();
class UserPreferenceService
{
private $collection;
private $systemDefaults = [
'theme' => 'light',
'language' => 'zh-CN',
'timezone' => 'Asia/Shanghai',
'notifications' => [
'email' => true,
'push' => true,
'sms' => false
],
'privacy' => [
'profile_visible' => true,
'activity_visible' => false
]
];
public function __construct($collection)
{
$this->collection = $collection;
}
public function initUserPreferences(string $userId): void
{
$this->collection->insertOne([
'user_id' => $userId,
'preferences' => new \stdClass(),
'created_at' => new MongoDB\BSON\UTCDateTime()
]);
}
public function setPreference(string $userId, string $key, $value): void
{
$updatePath = 'preferences.' . $key;
$this->collection->updateOne(
['user_id' => $userId],
[
'$set' => [
$updatePath => $value,
'updated_at' => new MongoDB\BSON\UTCDateTime()
]
]
);
}
public function clearPreference(string $userId, string $key): void
{
$updatePath = 'preferences.' . $key;
$this->collection->updateOne(
['user_id' => $userId],
[
'$unset' => [$updatePath => ''],
'$set' => ['updated_at' => new MongoDB\BSON\UTCDateTime()]
]
);
}
public function getPreference(string $userId, string $key)
{
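// Request plain PHP arrays: under the default type map, 'preferences' would be a
// BSONDocument, is_array() below would fail, and every lookup would return the default.
$doc = $this->collection->findOne(
['user_id' => $userId],
['typeMap' => ['root' => 'array', 'document' => 'array', 'array' => 'array']]
);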
if ($doc === null) {
return $this->getDefaultValue($key);
}
$keys = explode('.', $key);
$value = $doc['preferences'];
foreach ($keys as $k) {
if (!is_array($value) || !array_key_exists($k, $value)) {
return $this->getDefaultValue($key);
}
$value = $value[$k];
}
return $value === null ? $this->getDefaultValue($key) : $value;
}
private function getDefaultValue(string $key)
{
$keys = explode('.', $key);
$value = $this->systemDefaults;
foreach ($keys as $k) {
if (!isset($value[$k])) {
return null;
}
$value = $value[$k];
}
return $value;
}
public function getAllPreferences(string $userId): array
{
$doc = $this->collection->findOne(['user_id' => $userId]);
$userPrefs = $doc ? ($doc['preferences'] ?? []) : [];
$result = [];
foreach ($this->systemDefaults as $key => $defaultValue) {
if (is_array($defaultValue)) {
$result[$key] = [];
foreach ($defaultValue as $subKey => $subDefaultValue) {
$fullKey = "$key.$subKey";
$result[$key][$subKey] = [
'default' => $subDefaultValue,
'custom' => $userPrefs[$key][$subKey] ?? null,
'effective' => $this->getPreference($userId, $fullKey)
];
}
} else {
$result[$key] = [
'default' => $defaultValue,
'custom' => $userPrefs[$key] ?? null,
'effective' => $this->getPreference($userId, $key)
];
}
}
return $result;
}
}
$service = new UserPreferenceService($collection);
echo "=== 用户偏好设置示例 ===\n\n";
$service->initUserPreferences('user001');
echo "初始偏好(使用系统默认):\n";
echo "theme: " . $service->getPreference('user001', 'theme') . "\n";
echo "language: " . $service->getPreference('user001', 'language') . "\n";
echo "notifications.email: " . ($service->getPreference('user001', 'notifications.email') ? 'true' : 'false') . "\n";
echo "\n设置自定义偏好:\n";
$service->setPreference('user001', 'theme', 'dark');
$service->setPreference('user001', 'language', 'en-US');
$service->setPreference('user001', 'notifications.email', false);
echo "theme: " . $service->getPreference('user001', 'theme') . "\n";
echo "language: " . $service->getPreference('user001', 'language') . "\n";
echo "notifications.email: " . ($service->getPreference('user001', 'notifications.email') ? 'true' : 'false') . "\n";
echo "\n清除theme偏好(恢复默认):\n";
$service->clearPreference('user001', 'theme');
echo "theme: " . $service->getPreference('user001', 'theme') . " (恢复为默认)\n";
echo "\n所有偏好概览:\n";
$allPrefs = $service->getAllPreferences('user001');
foreach ($allPrefs as $key => $pref) {
if (isset($pref['default']) && !is_array($pref['default'])) {
echo "$key: 默认={$pref['default']}, 自定义=" . ($pref['custom'] ?? 'null') . ", 有效={$pref['effective']}\n";
} else {
echo "$key:\n";
foreach ($pref as $subKey => $subPref) {
if (is_array($subPref)) {
echo " $subKey: 默认=" . json_encode($subPref['default']) . ", 有效=" . json_encode($subPref['effective']) . "\n";
}
}
}
}
Output:
=== 用户偏好设置示例 ===
初始偏好(使用系统默认):
theme: light
language: zh-CN
notifications.email: true
设置自定义偏好:
theme: dark
language: en-US
notifications.email: false
清除theme偏好(恢复默认):
theme: light (恢复为默认)
所有偏好概览:
theme: 默认=light, 自定义=null, 有效=light
language: 默认=zh-CN, 自定义=en-US, 有效=en-US
timezone: 默认=Asia/Shanghai, 自定义=null, 有效=Asia/Shanghai
notifications:
email: 默认=true, 有效=false
push: 默认=true, 有效=true
sms: 默认=false, 有效=false
privacy:
profile_visible: 默认=true, 有效=true
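activity_visible: 默认=false, 有效=false
6. Enterprise-Grade Advanced Scenarios
6.1 Multi-Tenant Configuration Inheritance System
Scenario: In an enterprise SaaS system, settings are inherited across several levels: system defaults → tenant configuration → user configuration. A null value means "inherit from the level above".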
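The inheritance rule boils down to "take the first non-null value, starting at the most specific level". A minimal sketch on plain arrays, with `resolveInherited` as a hypothetical helper name:

```php
<?php
/**
 * Hypothetical helper: returns the first non-null value and the level it came from.
 * $levels must be ordered from most specific to least specific.
 */
function resolveInherited(array $levels): array
{
    foreach ($levels as $source => $value) {
        if ($value !== null) {
            return ['value' => $value, 'source' => $source];
        }
    }
    // Every level was null: nothing to inherit.
    return ['value' => null, 'source' => 'none'];
}

$result = resolveInherited(['user' => null, 'tenant' => 50, 'system' => 10]);
echo $result['value'], ' from ', $result['source'], "\n"; // 50 from tenant
```

Returning the source alongside the value makes it easy to show users where an effective setting comes from.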
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$systemConfigCollection = $client->enterprise->system_configs;
$tenantConfigCollection = $client->enterprise->tenant_configs;
$userConfigCollection = $client->enterprise->user_configs;
$systemConfigCollection->drop();
$tenantConfigCollection->drop();
$userConfigCollection->drop();
class MultiTenantConfigService
{
private $systemCollection;
private $tenantCollection;
private $userCollection;
private $systemDefaults = [
'storage_limit' => 10737418240,
'max_users' => 10,
'features' => [
'advanced_analytics' => false,
'custom_domain' => false,
'api_access' => true,
'sso' => false
],
'security' => [
'mfa_required' => false,
'session_timeout' => 3600,
'password_policy' => 'standard'
]
];
public function __construct($systemCol, $tenantCol, $userCol)
{
$this->systemCollection = $systemCol;
$this->tenantCollection = $tenantCol;
$this->userCollection = $userCol;
}
public function initSystemDefaults(): void
{
foreach ($this->systemDefaults as $key => $value) {
$this->systemCollection->insertOne([
'config_key' => $key,
'config_value' => $value,
'level' => 'system',
'created_at' => new MongoDB\BSON\UTCDateTime()
]);
}
}
public function setTenantConfig(string $tenantId, string $key, $value): void
{
$this->tenantCollection->updateOne(
['tenant_id' => $tenantId, 'config_key' => $key],
[
'$set' => [
'config_value' => $value,
'updated_at' => new MongoDB\BSON\UTCDateTime()
]
],
['upsert' => true]
);
}
public function setUserConfig(string $tenantId, string $userId, string $key, $value): void
{
$this->userCollection->updateOne(
['tenant_id' => $tenantId, 'user_id' => $userId, 'config_key' => $key],
[
'$set' => [
'config_value' => $value,
'updated_at' => new MongoDB\BSON\UTCDateTime()
]
],
['upsert' => true]
);
}
public function resetToInherit(string $tenantId, ?string $userId, string $key): void
{
if ($userId) {
$this->userCollection->updateOne(
['tenant_id' => $tenantId, 'user_id' => $userId, 'config_key' => $key],
['$set' => ['config_value' => null, 'updated_at' => new MongoDB\BSON\UTCDateTime()]]
);
} else {
$this->tenantCollection->updateOne(
['tenant_id' => $tenantId, 'config_key' => $key],
['$set' => ['config_value' => null, 'updated_at' => new MongoDB\BSON\UTCDateTime()]]
);
}
}
public function getEffectiveConfig(string $tenantId, ?string $userId, string $key)
{
if ($userId) {
$userConfig = $this->userCollection->findOne([
'tenant_id' => $tenantId,
'user_id' => $userId,
'config_key' => $key
]);
if ($userConfig && $userConfig['config_value'] !== null) {
return [
'value' => $userConfig['config_value'],
'source' => 'user',
'source_id' => $userId
];
}
}
$tenantConfig = $this->tenantCollection->findOne([
'tenant_id' => $tenantId,
'config_key' => $key
]);
if ($tenantConfig && $tenantConfig['config_value'] !== null) {
return [
'value' => $tenantConfig['config_value'],
'source' => 'tenant',
'source_id' => $tenantId
];
}
$systemConfig = $this->systemCollection->findOne(['config_key' => $key]);
if ($systemConfig) {
return [
'value' => $systemConfig['config_value'],
'source' => 'system',
'source_id' => 'default'
];
}
return [
'value' => $this->systemDefaults[$key] ?? null,
'source' => 'hardcoded',
'source_id' => 'none'
];
}
public function getConfigChain(string $tenantId, ?string $userId, string $key): array
{
$chain = [];
$systemConfig = $this->systemCollection->findOne(['config_key' => $key]);
$chain[] = [
'level' => 'system',
'value' => $systemConfig ? $systemConfig['config_value'] : ($this->systemDefaults[$key] ?? null),
'is_override' => false
];
$tenantConfig = $this->tenantCollection->findOne([
'tenant_id' => $tenantId,
'config_key' => $key
]);
$tenantValue = $tenantConfig ? $tenantConfig['config_value'] : null;
$chain[] = [
'level' => 'tenant',
'value' => $tenantValue,
'is_override' => $tenantValue !== null
];
if ($userId) {
$userConfig = $this->userCollection->findOne([
'tenant_id' => $tenantId,
'user_id' => $userId,
'config_key' => $key
]);
$userValue = $userConfig ? $userConfig['config_value'] : null;
$chain[] = [
'level' => 'user',
'value' => $userValue,
'is_override' => $userValue !== null
];
}
return $chain;
}
}
$service = new MultiTenantConfigService(
$systemConfigCollection,
$tenantConfigCollection,
$userConfigCollection
);
echo "=== 多租户配置继承系统 ===\n\n";
$service->initSystemDefaults();
echo "1. 系统默认配置初始化完成\n\n";
echo "2. 租户A设置自定义配置:\n";
$service->setTenantConfig('tenant_a', 'storage_limit', 21474836480);
$service->setTenantConfig('tenant_a', 'max_users', 50);
echo " storage_limit: 20GB\n";
echo " max_users: 50\n\n";
echo "3. 租户A用户user001设置个人配置:\n";
$service->setUserConfig('tenant_a', 'user001', 'max_users', 100);
echo " max_users: 100\n\n";
echo "4. 查询配置继承链:\n\n";
echo "租户A用户user001的storage_limit:\n";
$chain1 = $service->getConfigChain('tenant_a', 'user001', 'storage_limit');
foreach ($chain1 as $level) {
$value = $level['value'] === null ? 'null(继承)' : $level['value'];
echo " {$level['level']}: $value" . ($level['is_override'] ? ' (覆盖)' : '') . "\n";
}
$effective1 = $service->getEffectiveConfig('tenant_a', 'user001', 'storage_limit');
echo " 有效值: {$effective1['value']} (来源: {$effective1['source']})\n\n";
echo "租户A用户user001的max_users:\n";
$chain2 = $service->getConfigChain('tenant_a', 'user001', 'max_users');
foreach ($chain2 as $level) {
$value = $level['value'] === null ? 'null(继承)' : $level['value'];
echo " {$level['level']}: $value" . ($level['is_override'] ? ' (覆盖)' : '') . "\n";
}
$effective2 = $service->getEffectiveConfig('tenant_a', 'user001', 'max_users');
echo " 有效值: {$effective2['value']} (来源: {$effective2['source']})\n\n";
echo "租户A用户user002的max_users:\n";
$effective3 = $service->getEffectiveConfig('tenant_a', 'user002', 'max_users');
echo " 有效值: {$effective3['value']} (来源: {$effective3['source']})\n\n";
echo "5. 用户user001重置max_users为继承:\n";
$service->resetToInherit('tenant_a', 'user001', 'max_users');
$effective4 = $service->getEffectiveConfig('tenant_a', 'user001', 'max_users');
echo " 有效值: {$effective4['value']} (来源: {$effective4['source']})\n";
Output:
=== 多租户配置继承系统 ===
1. 系统默认配置初始化完成
2. 租户A设置自定义配置:
storage_limit: 20GB
max_users: 50
3. 租户A用户user001设置个人配置:
max_users: 100
4. 查询配置继承链:
租户A用户user001的storage_limit:
system: 10737418240
tenant: 21474836480 (覆盖)
user: null(继承)
有效值: 21474836480 (来源: tenant)
租户A用户user001的max_users:
system: 10
tenant: 50 (覆盖)
user: 100 (覆盖)
有效值: 100 (来源: user)
租户A用户user002的max_users:
有效值: 50 (来源: tenant)
5. 用户user001重置max_users为继承:
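有效值: 50 (来源: tenant)
6.2 Data Quality Monitoring System
Scenario: An enterprise data platform needs to monitor data quality. Null values flag data completeness problems, and records receive a quality score across several dimensions.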
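Quality checks of this kind depend on telling three states apart: the field is missing, the field is explicitly null, or the field has a value. In PHP, `isset()` returns false for both of the first two states, so it cannot separate them. The sketch below uses `array_key_exists()` instead; `fieldState` is a hypothetical helper name.

```php
<?php
/** Hypothetical helper: classifies a field as 'missing', 'null', or 'present'. */
function fieldState(array $record, string $field): string
{
    if (!array_key_exists($field, $record)) {
        return 'missing';
    }
    return $record[$field] === null ? 'null' : 'present';
}

$record = ['customer_id' => null, 'amount' => 150];

echo fieldState($record, 'customer_id'), "\n"; // null
echo fieldState($record, 'amount'), "\n";      // present
echo fieldState($record, 'notes'), "\n";       // missing

// isset() cannot tell the first and third case apart:
var_dump(isset($record['customer_id'])); // bool(false)
var_dump(isset($record['notes']));       // bool(false)
```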
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$dataCollection = $client->data_quality->records;
$qualityCollection = $client->data_quality->quality_reports;
$dataCollection->drop();
$qualityCollection->drop();
class DataQualityMonitor
{
private $dataCollection;
private $qualityCollection;
private $requiredFields = ['customer_id', 'order_date', 'amount'];
private $optionalFields = ['discount', 'coupon_code', 'notes'];
private $criticalFields = ['customer_id', 'amount'];
public function __construct($dataCol, $qualityCol)
{
$this->dataCollection = $dataCol;
$this->qualityCollection = $qualityCol;
}
public function ingestData(array $records): array
{
$results = [
'total' => count($records),
'inserted' => 0,
'quality_issues' => []
];
foreach ($records as $record) {
$qualityScore = $this->calculateQualityScore($record);
$record['_quality_score'] = $qualityScore['score'];
$record['_quality_issues'] = $qualityScore['issues'];
$record['_ingested_at'] = new MongoDB\BSON\UTCDateTime();
$this->dataCollection->insertOne($record);
$results['inserted']++;
if ($qualityScore['score'] < 100) {
$results['quality_issues'][] = [
'record_id' => $record['order_id'] ?? 'unknown',
'score' => $qualityScore['score'],
'issues' => $qualityScore['issues']
];
}
}
return $results;
}
private function calculateQualityScore(array $record): array
{
$score = 100;
$issues = [];
foreach ($this->requiredFields as $field) {
if (!array_key_exists($field, $record) || $record[$field] === null) {
$score -= 20;
$issues[] = [
'field' => $field,
'type' => 'missing_required',
'severity' => 'critical'
];
}
}
foreach ($this->criticalFields as $field) {
// isset() returns false for null values, so it can never detect an explicit null.
if (array_key_exists($field, $record) && $record[$field] === null) {
$score -= 15;
$issues[] = [
'field' => $field,
'type' => 'null_critical',
'severity' => 'high'
];
}
}
foreach ($this->optionalFields as $field) {
if (array_key_exists($field, $record) && $record[$field] === null) {
$score -= 5;
$issues[] = [
'field' => $field,
'type' => 'null_optional',
'severity' => 'low'
];
}
}
return [
'score' => max(0, $score),
'issues' => $issues
];
}
public function generateQualityReport(): array
{
$totalRecords = $this->dataCollection->countDocuments();
$nullFieldStats = [];
$allFields = array_merge($this->requiredFields, $this->optionalFields);
foreach ($allFields as $field) {
$nullCount = $this->dataCollection->countDocuments([
'$or' => [
[$field => null],
[$field => ['$exists' => false]]
]
]);
$nullFieldStats[$field] = [
'null_count' => $nullCount,
'null_rate' => $totalRecords > 0 ? round($nullCount / $totalRecords * 100, 2) : 0
];
}
$scoreDistribution = $this->dataCollection->aggregate([
[
'$bucket' => [
'groupBy' => '$_quality_score',
'boundaries' => [0, 60, 80, 95, 101],
'default' => 'other',
'output' => ['count' => ['$sum' => 1]]
]
]
])->toArray();
$avgScore = $this->dataCollection->aggregate([
['$group' => ['_id' => null, 'avgScore' => ['$avg' => '$_quality_score']]]
])->toArray();
$report = [
'generated_at' => new MongoDB\BSON\UTCDateTime(),
'total_records' => $totalRecords,
'average_score' => $avgScore[0]['avgScore'] ?? 0,
'null_field_statistics' => $nullFieldStats,
'score_distribution' => $scoreDistribution
];
$this->qualityCollection->insertOne($report);
return $report;
}
public function getRecordsWithNullCritical(): array
{
$orConditions = [];
foreach ($this->criticalFields as $field) {
$orConditions[] = [$field => null];
$orConditions[] = [$field => ['$exists' => false]];
}
return $this->dataCollection->find([
'$or' => $orConditions
])->toArray();
}
public function getRecordsByQualityScore(int $minScore, int $maxScore): array
{
return $this->dataCollection->find([
'_quality_score' => ['$gte' => $minScore, '$lte' => $maxScore]
])->toArray();
}
}
$monitor = new DataQualityMonitor($dataCollection, $qualityCollection);
echo "=== 数据质量监控系统 ===\n\n";
$sampleData = [
['order_id' => 'ORD001', 'customer_id' => 'C001', 'order_date' => '2024-01-01', 'amount' => 100, 'discount' => 0.1, 'coupon_code' => 'SAVE10'],
['order_id' => 'ORD002', 'customer_id' => 'C002', 'order_date' => '2024-01-02', 'amount' => 200, 'discount' => null, 'coupon_code' => null],
['order_id' => 'ORD003', 'customer_id' => null, 'order_date' => '2024-01-03', 'amount' => 150, 'discount' => 0.05],
['order_id' => 'ORD004', 'customer_id' => 'C003', 'order_date' => '2024-01-04', 'amount' => null, 'discount' => 0.2, 'coupon_code' => 'SAVE20'],
['order_id' => 'ORD005', 'customer_id' => 'C004', 'order_date' => null, 'amount' => 300, 'discount' => null, 'notes' => null],
['order_id' => 'ORD006', 'customer_id' => 'C005', 'order_date' => '2024-01-06', 'amount' => 250, 'discount' => 0.15, 'coupon_code' => null],
];
echo "1. 数据摄入与质量评估:\n";
$ingestResult = $monitor->ingestData($sampleData);
echo " 总记录数: {$ingestResult['total']}\n";
echo " 成功插入: {$ingestResult['inserted']}\n";
echo " 质量问题记录数: " . count($ingestResult['quality_issues']) . "\n\n";
echo "2. 质量问题详情:\n";
foreach ($ingestResult['quality_issues'] as $issue) {
echo " 记录 {$issue['record_id']}: 分数 {$issue['score']}\n";
foreach ($issue['issues'] as $detail) {
echo " - 字段 {$detail['field']}: {$detail['type']} ({$detail['severity']})\n";
}
}
echo "\n3. 生成质量报告:\n";
$report = $monitor->generateQualityReport();
echo " 总记录数: {$report['total_records']}\n";
echo " 平均质量分: " . round($report['average_score'], 2) . "\n\n";
echo "4. 字段Null值统计:\n";
foreach ($report['null_field_statistics'] as $field => $stats) {
echo " $field: null数量={$stats['null_count']}, null率={$stats['null_rate']}%\n";
}
echo "\n5. 关键字段为null的记录:\n";
$criticalNulls = $monitor->getRecordsWithNullCritical();
foreach ($criticalNulls as $record) {
echo " 订单 {$record['order_id']}: 质量分 {$record['_quality_score']}\n";
}
echo "\n6. 低质量记录(分数<80):\n";
$lowQuality = $monitor->getRecordsByQualityScore(0, 79);
foreach ($lowQuality as $record) {
echo " 订单 {$record['order_id']}: 质量分 {$record['_quality_score']}\n";
}
Output:
=== 数据质量监控系统 ===
1. 数据摄入与质量评估:
总记录数: 6
成功插入: 6
质量问题记录数: 5
2. 质量问题详情:
记录 ORD002: 分数 90
- 字段 discount: null_optional (low)
- 字段 coupon_code: null_optional (low)
记录 ORD003: 分数 65
- 字段 customer_id: missing_required (critical)
- 字段 customer_id: null_critical (high)
记录 ORD004: 分数 65
- 字段 amount: missing_required (critical)
- 字段 amount: null_critical (high)
记录 ORD005: 分数 70
- 字段 order_date: missing_required (critical)
- 字段 discount: null_optional (low)
- 字段 notes: null_optional (low)
记录 ORD006: 分数 95
- 字段 coupon_code: null_optional (low)
3. 生成质量报告:
总记录数: 6
平均质量分: 80.83
4. 字段Null值统计:
customer_id: null数量=1, null率=16.67%
order_date: null数量=1, null率=16.67%
amount: null数量=1, null率=16.67%
discount: null数量=2, null率=33.33%
coupon_code: null数量=4, null率=66.67%
notes: null数量=6, null率=100%
5. 关键字段为null的记录:
订单 ORD003: 质量分 65
订单 ORD004: 质量分 65
6. 低质量记录(分数<80):
订单 ORD003: 质量分 65
订单 ORD004: 质量分 65
订单 ORD005: 质量分 70
6.3 Audit Log System
Scenario: An enterprise audit system needs to record the history of data changes. Null represents the state of a field before or after a change: the old value is null when a field is added, and the new value is null when a field is deleted.
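The change-record convention can be sketched as a diff between two plain arrays. `diffFields` is a hypothetical helper, not part of any library; it uses null as the old value of an added field and as the new value of a deleted field, and it records the change type separately so that a genuine null value is not mistaken for a missing field.

```php
<?php
/** Hypothetical helper: lists field-level changes between two versions of a document. */
function diffFields(array $old, array $new): array
{
    $changes = [];
    $fields = array_unique(array_merge(array_keys($old), array_keys($new)));
    foreach ($fields as $field) {
        $inOld = array_key_exists($field, $old);
        $inNew = array_key_exists($field, $new);
        $oldValue = $inOld ? $old[$field] : null; // null stands for "did not exist"
        $newValue = $inNew ? $new[$field] : null; // null stands for "was removed"
        if ($inOld && $inNew && $oldValue === $newValue) {
            continue; // unchanged field, nothing to log
        }
        $changes[] = [
            'field' => $field,
            'old_value' => $oldValue,
            'new_value' => $newValue,
            'change_type' => !$inOld ? 'added' : (!$inNew ? 'deleted' : 'modified'),
        ];
    }
    return $changes;
}

$changes = diffFields(
    ['status' => 'draft', 'owner' => 'zhangsan'],
    ['status' => 'review', 'reviewer' => 'lisi']
);
foreach ($changes as $change) {
    echo $change['field'], ': ', $change['change_type'], "\n";
}
```

Storing `change_type` next to the values matters because the values alone cannot distinguish "set to null" from "removed".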
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$dataCollection = $client->audit->documents;
$auditCollection = $client->audit->audit_logs;
$dataCollection->drop();
$auditCollection->drop();
class AuditLogService
{
private $dataCollection;
private $auditCollection;
public function __construct($dataCol, $auditCol)
{
$this->dataCollection = $dataCol;
$auditCol->createIndex(['document_id' => 1, 'timestamp' => -1]);
$this->auditCollection = $auditCol;
}
public function createDocument(string $docId, array $data, string $userId): string
{
$data['_id'] = $docId;
$data['created_at'] = new MongoDB\BSON\UTCDateTime();
$data['created_by'] = $userId;
$this->dataCollection->insertOne($data);
$changes = [];
foreach ($data as $field => $value) {
if (!in_array($field, ['_id', 'created_at', 'created_by'])) {
$changes[] = [
'field' => $field,
'old_value' => null,
'new_value' => $value,
'change_type' => 'created'
];
}
}
$this->logAudit([
'document_id' => $docId,
'action' => 'create',
'user_id' => $userId,
'changes' => $changes,
'timestamp' => new MongoDB\BSON\UTCDateTime()
]);
return $docId;
}
public function updateDocument(string $docId, array $updates, string $userId): bool
{
$oldDoc = $this->dataCollection->findOne(['_id' => $docId]);
if ($oldDoc === null) {
return false;
}
$changes = [];
$setUpdates = [];
foreach ($updates as $field => $newValue) {
$oldValue = null;
$changeType = 'modified';
if (array_key_exists($field, (array)$oldDoc)) {
$oldValue = $oldDoc[$field];
} else {
$changeType = 'added';
}
if ($oldValue !== $newValue) {
$changes[] = [
'field' => $field,
'old_value' => $oldValue,
'new_value' => $newValue,
'change_type' => $changeType
];
$setUpdates[$field] = $newValue;
}
}
if (empty($changes)) {
return false;
}
$setUpdates['updated_at'] = new MongoDB\BSON\UTCDateTime();
$setUpdates['updated_by'] = $userId;
$this->dataCollection->updateOne(
['_id' => $docId],
['$set' => $setUpdates]
);
$this->logAudit([
'document_id' => $docId,
'action' => 'update',
'user_id' => $userId,
'changes' => $changes,
'timestamp' => new MongoDB\BSON\UTCDateTime()
]);
return true;
}
public function deleteField(string $docId, string $field, string $userId): bool
{
$oldDoc = $this->dataCollection->findOne(['_id' => $docId]);
if ($oldDoc === null || !array_key_exists($field, (array)$oldDoc)) {
return false;
}
$oldValue = $oldDoc[$field];
$this->dataCollection->updateOne(
['_id' => $docId],
[
'$unset' => [$field => ''],
'$set' => [
'updated_at' => new MongoDB\BSON\UTCDateTime(),
'updated_by' => $userId
]
]
);
$this->logAudit([
'document_id' => $docId,
'action' => 'delete_field',
'user_id' => $userId,
'changes' => [[
'field' => $field,
'old_value' => $oldValue,
'new_value' => null,
'change_type' => 'deleted'
]],
'timestamp' => new MongoDB\BSON\UTCDateTime()
]);
return true;
}
public function getDocumentHistory(string $docId): array
{
return $this->auditCollection->find(
['document_id' => $docId],
['sort' => ['timestamp' => -1]]
)->toArray();
}
public function getFieldHistory(string $docId, string $field): array
{
return $this->auditCollection->find(
[
'document_id' => $docId,
'changes.field' => $field
],
['sort' => ['timestamp' => -1]]
)->toArray();
}
private function logAudit(array $log): void
{
$this->auditCollection->insertOne($log);
}
}
$auditService = new AuditLogService($dataCollection, $auditCollection);
echo "=== 审计日志系统 ===\n\n";
echo "1. 创建文档:\n";
$auditService->createDocument('DOC001', [
'title' => '项目计划书',
'status' => 'draft',
'owner' => '张三'
], 'user001');
$doc = $dataCollection->findOne(['_id' => 'DOC001']);
echo " 文档创建成功\n";
echo " title: {$doc['title']}\n";
echo " status: {$doc['status']}\n\n";
echo "2. 更新文档:\n";
$auditService->updateDocument('DOC001', [
'status' => 'review',
'reviewer' => '李四'
], 'user002');
$doc = $dataCollection->findOne(['_id' => 'DOC001']);
echo " status更新为: {$doc['status']}\n";
echo " reviewer新增为: {$doc['reviewer']}\n\n";
echo "3. 删除字段:\n";
$auditService->deleteField('DOC001', 'reviewer', 'user001');
$doc = $dataCollection->findOne(['_id' => 'DOC001']);
echo " reviewer字段已删除\n";
echo " reviewer字段" . (array_key_exists('reviewer', (array)$doc) ? "存在" : "不存在") . "\n\n";
echo "4. 文档变更历史:\n";
$history = $auditService->getDocumentHistory('DOC001');
foreach ($history as $log) {
echo " 动作: {$log['action']}, 用户: {$log['user_id']}\n";
foreach ($log['changes'] as $change) {
$oldVal = $change['old_value'] === null ? 'null' : json_encode($change['old_value']);
$newVal = $change['new_value'] === null ? 'null' : json_encode($change['new_value']);
echo " 字段 {$change['field']}: $oldVal -> $newVal ({$change['change_type']})\n";
}
}
echo "\n5. status字段变更历史:\n";
$fieldHistory = $auditService->getFieldHistory('DOC001', 'status');
foreach ($fieldHistory as $log) {
foreach ($log['changes'] as $change) {
if ($change['field'] === 'status') {
echo " {$log['action']}: {$change['old_value']} -> {$change['new_value']}\n";
}
}
}
Output:
=== 审计日志系统 ===
1. 创建文档:
文档创建成功
title: 项目计划书
status: draft
2. 更新文档:
status更新为: review
reviewer新增为: 李四
3. 删除字段:
reviewer字段已删除
reviewer字段不存在
4. 文档变更历史:
动作: delete_field, 用户: user001
字段 reviewer: "李四" -> null (deleted)
动作: update, 用户: user002
字段 status: "draft" -> "review" (modified)
字段 reviewer: null -> "李四" (added)
动作: create, 用户: user001
字段 title: null -> "项目计划书" (created)
字段 status: null -> "draft" (created)
字段 owner: null -> "张三" (created)
5. status字段变更历史:
update: draft -> review
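create: null -> draft
7. Industry Best Practices
7.1 Standardizing Null Semantics
Practice: Establish a clear convention for what null means, and distinguish the cases that call for a "missing field" from those that call for an "explicit null".
Rationale: A shared convention reduces misunderstandings between team members and makes the code easier to maintain.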
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
class NullValueGuidelines
{
public static function getGuidelines(): array
{
return [
'field_missing' => [
'description' => '字段完全不存在于文档中',
'use_cases' => [
'可选字段用户从未填写',
'不适用于当前记录的属性',
'节省存储空间的稀疏字段'
],
'query' => '{field: {$exists: false}}',
'example' => '用户未设置昵称时,不存储nickname字段'
],
'explicit_null' => [
'description' => '字段存在但值为null',
'use_cases' => [
'用户主动清空了字段',
'明确表示"无值"的语义',
'需要区分"从未设置"和"已清空"'
],
'query' => '{field: {$type: 10}}',
'example' => '用户删除头像后,avatar设为null'
],
'empty_string' => [
'description' => '字段存在,值为空字符串',
'use_cases' => [
'表单提交的空输入',
'需要区分空字符串和null的场景',
'字符串类型字段的空值'
],
'query' => '{field: ""}',
'example' => '用户提交空白的备注信息'
],
'default_value' => [
'description' => '使用业务默认值替代null',
'use_cases' => [
'数值类型使用0',
'布尔类型使用false',
'数组类型使用空数组[]'
],
'query' => '根据默认值类型查询',
'example' => '折扣字段默认为0而非null'
]
];
}
}
echo "=== Null值语义规范化指南 ===\n\n";
$guidelines = NullValueGuidelines::getGuidelines();
foreach ($guidelines as $type => $guideline) {
echo "【{$type}】\n";
echo "描述: {$guideline['description']}\n";
echo "适用场景:\n";
foreach ($guideline['use_cases'] as $case) {
echo " - {$case}\n";
}
echo "查询方式: {$guideline['query']}\n";
echo "示例: {$guideline['example']}\n\n";
}
Output:
=== Null值语义规范化指南 ===
【field_missing】
描述: 字段完全不存在于文档中
适用场景:
- 可选字段用户从未填写
- 不适用于当前记录的属性
- 节省存储空间的稀疏字段
查询方式: {field: {$exists: false}}
示例: 用户未设置昵称时,不存储nickname字段
【explicit_null】
描述: 字段存在但值为null
适用场景:
- 用户主动清空了字段
- 明确表示"无值"的语义
- 需要区分"从未设置"和"已清空"
查询方式: {field: {$type: 10}}
示例: 用户删除头像后,avatar设为null
【empty_string】
描述: 字段存在,值为空字符串
适用场景:
- 表单提交的空输入
- 需要区分空字符串和null的场景
- 字符串类型字段的空值
查询方式: {field: ""}
示例: 用户提交空白的备注信息
【default_value】
描述: 使用业务默认值替代null
适用场景:
- 数值类型使用0
- 布尔类型使用false
- 数组类型使用空数组[]
查询方式: 根据默认值类型查询
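示例: 折扣字段默认为0而非null
7.2 Using Sparse Indexes to Optimize Queries
Practice: For a field that exists in only a small fraction of documents, use a sparse index so that documents missing the field are left out of the index. A sparse index still contains entries for documents where the field is present with an explicit null value.
Rationale: A sparse index can be much smaller than a regular index, which helps both write performance and query efficiency.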
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->best_practices->sparse_index_demo;
$collection->drop();
echo "=== 稀疏索引最佳实践 ===\n\n";
$documents = [];
for ($i = 1; $i <= 10000; $i++) {
$doc = ['name' => "用户{$i}", 'score' => rand(1, 100)];
if (rand(1, 100) <= 5) {
$doc['vip_level'] = rand(1, 5);
}
$documents[] = $doc;
}
$collection->insertMany($documents);
echo "1. 创建普通索引 vs 稀疏索引对比:\n\n";
$collection->createIndex(['vip_level' => 1], ['name' => 'normal_index']);
$collection->createIndex(['vip_level' => 1], ['sparse' => true, 'name' => 'sparse_index']);
// command() returns a cursor; the statistics document is its first element
$stats = $client->best_practices->command(['collStats' => 'sparse_index_demo'])->toArray()[0];
echo "集合文档数: {$stats['count']}\n\n";
$indexStats = $collection->aggregate([
['$indexStats' => new \stdClass()]
])->toArray();
foreach ($indexStats as $stat) {
echo "索引: {$stat['name']}\n";
echo " 访问次数: {$stat['accesses']['ops']}\n";
}
echo "\n2. 查询性能对比:\n\n";
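// find() returns a cursor, which has no explain() method, so run the explain command directly
$explain1 = $client->best_practices->command([
'explain' => ['find' => 'sparse_index_demo', 'filter' => ['vip_level' => 3]]
])->toArray()[0];
echo "查询vip_level=3:\n";
echo " 使用索引: {$explain1['queryPlanner']['winningPlan']['inputStage']['indexName']}\n";
$explain2 = $client->best_practices->command([
'explain' => ['find' => 'sparse_index_demo', 'filter' => ['vip_level' => null]]
])->toArray()[0];
echo "\n查询vip_level=null:\n";
echo " 使用索引: " . ($explain2['queryPlanner']['winningPlan']['stage'] === 'COLLSCAN' ? '全表扫描' : $explain2['queryPlanner']['winningPlan']['inputStage']['indexName']) . "\n";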
echo "\n3. 稀疏索引使用建议:\n";
echo " - 适用于字段存在率低于20%的场景\n";
echo " - 不适合需要查询null值的场景\n";
echo " - 与唯一索引结合使用,允许多个文档字段缺失\n";
echo " - 注意: 查询null值不会使用稀疏索引\n";
Output:
=== 稀疏索引最佳实践 ===
1. 创建普通索引 vs 稀疏索引对比:
集合文档数: 10000
索引: _id_
访问次数: 0
索引: normal_index
访问次数: 0
索引: sparse_index
访问次数: 0
2. 查询性能对比:
查询vip_level=3:
使用索引: normal_index
查询vip_level=null:
使用索引: 全表扫描
3. 稀疏索引使用建议:
- 适用于字段存在率低于20%的场景
- 不适合需要查询null值的场景
- 与唯一索引结合使用,允许多个文档字段缺失
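- 注意: 查询null值不会使用稀疏索引
7.3 Handling Null in Aggregation Pipelines
Practice: In aggregation pipelines, use operators such as $ifNull, $cond, and $switch to handle null values so that calculations produce correct results. MongoDB has no $coalesce operator; since MongoDB 5.0, $ifNull accepts several input expressions and returns the first non-null one, which covers the same need.
Rationale: Arithmetic operators such as $add and $multiply return null when any operand is null or missing, so these values must be handled explicitly to get the expected result.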
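As a minimal sketch of the default-value idea, the helper below builds an `$ifNull` expression as a plain PHP array. The function name `firstNonNull` and the field `customer_discount` are hypothetical and used only for illustration; passing more than one field path assumes MongoDB 5.0 or later, since older servers accept exactly one input expression plus the replacement.

```php
<?php
/**
 * Build an $ifNull expression: the first non-null input wins, otherwise $default.
 * Hypothetical helper for illustration; several paths assume MongoDB 5.0+.
 */
function firstNonNull(array $fieldPaths, mixed $default): array
{
    if ($fieldPaths === []) {
        throw new InvalidArgumentException('At least one field path is required');
    }
    // $ifNull takes its inputs in priority order, with the replacement value last.
    return ['$ifNull' => [...array_values($fieldPaths), $default]];
}

// Example stage: prefer the order-level discount, then an assumed
// customer-level discount field, and fall back to 0.
$stage = [
    '$project' => [
        'order_id' => 1,
        'effective_discount' => firstNonNull(['$discount', '$customer_discount'], 0),
    ],
];

echo json_encode($stage), "\n";
```

The resulting array can be passed as one stage of the pipeline given to `aggregate()`. Keeping the fallback order in one place makes it harder for different pipelines to disagree about which default applies.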
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->best_practices->aggregation_null_demo;
$collection->drop();
$orders = [
['order_id' => '001', 'subtotal' => 100, 'discount' => 0.1, 'tax' => 0.08, 'shipping' => 10],
['order_id' => '002', 'subtotal' => 200, 'discount' => null, 'tax' => 0.08, 'shipping' => null],
['order_id' => '003', 'subtotal' => 150, 'tax' => null],
['order_id' => '004', 'subtotal' => 300, 'discount' => 0.15, 'tax' => 0.08, 'shipping' => 20],
['order_id' => '005', 'subtotal' => 50, 'discount' => null, 'shipping' => 5]
];
$collection->insertMany($orders);
echo "=== 聚合管道Null处理最佳实践 ===\n\n";
echo "1. 使用\$ifNull提供默认值:\n";
$pipeline1 = [
[
'$project' => [
'order_id' => 1,
'subtotal' => 1,
'discount' => ['$ifNull' => ['$discount', 0]],
'tax' => ['$ifNull' => ['$tax', 0]],
'shipping' => ['$ifNull' => ['$shipping', 0]]
]
],
[
'$project' => [
'order_id' => 1,
'subtotal' => 1,
'discount' => 1,
'tax' => 1,
'shipping' => 1,
'total' => [
'$add' => [
['$multiply' => ['$subtotal', ['$subtract' => [1, '$discount']]]],
['$multiply' => ['$subtotal', ['$subtract' => [1, '$discount']], '$tax']],
'$shipping'
]
]
]
]
];
$result1 = $collection->aggregate($pipeline1)->toArray();
foreach ($result1 as $doc) {
echo "订单 {$doc['order_id']}: 小计={$doc['subtotal']}, 总计=" . round($doc['total'], 2) . "\n";
}
echo "\n2. 使用\$ifNull处理多个可能的null值:\n";
// MongoDB has no $coalesce operator; $ifNull returns the first non-null input
$pipeline2 = [
[
'$project' => [
'order_id' => 1,
'effective_discount' => [
'$ifNull' => ['$discount', 0]
],
'effective_tax' => [
'$ifNull' => ['$tax', 0.08]
],
'effective_shipping' => [
'$ifNull' => ['$shipping', 10]
]
]
]
];
$result2 = $collection->aggregate($pipeline2)->toArray();
foreach ($result2 as $doc) {
echo "订单 {$doc['order_id']}: 折扣={$doc['effective_discount']}, 税率={$doc['effective_tax']}, 运费={$doc['effective_shipping']}\n";
}
echo "\n3. 使用\$cond条件处理null:\n";
$pipeline3 = [
[
'$project' => [
'order_id' => 1,
'discount_status' => [
'$cond' => [
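// In an expression, $eq against null is false for a missing field, so normalize with $ifNull first
'if' => ['$eq' => [['$ifNull' => ['$discount', null]], null]],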
'then' => '无折扣',
'else' => [
'$cond' => [
'if' => ['$gt' => ['$discount', 0.1]],
'then' => '高折扣',
'else' => '普通折扣'
]
]
]
]
]
]
];
$result3 = $collection->aggregate($pipeline3)->toArray();
foreach ($result3 as $doc) {
echo "订单 {$doc['order_id']}: {$doc['discount_status']}\n";
}
echo "\n4. 使用\$switch处理多种null情况:\n";
$pipeline4 = [
[
'$project' => [
'order_id' => 1,
'shipping_status' => [
'$switch' => [
'branches' => [
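// $ifNull maps a missing shipping field to null so that this branch also catches it
['case' => ['$eq' => [['$ifNull' => ['$shipping', null]], null]], 'then' => '运费待定'],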
['case' => ['$eq' => ['$shipping', 0]], 'then' => '免运费'],
['case' => ['$lt' => ['$shipping', 10]], 'then' => '低运费'],
['case' => ['$gte' => ['$shipping', 10]], 'then' => '标准运费']
],
'default' => '未知'
]
]
]
]
];
$result4 = $collection->aggregate($pipeline4)->toArray();
foreach ($result4 as $doc) {
echo "订单 {$doc['order_id']}: {$doc['shipping_status']}\n";
}
Output:
=== 聚合管道Null处理最佳实践 ===
1. 使用$ifNull提供默认值:
订单 001: 小计=100, 总计=107.2
订单 002: 小计=200, 总计=216
订单 003: 小计=150, 总计=150
订单 004: 小计=300, 总计=295.4
订单 005: 小计=50, 总计=55
2. 使用$ifNull处理多个可能的null值:
订单 001: 折扣=0.1, 税率=0.08, 运费=10
订单 002: 折扣=0, 税率=0.08, 运费=10
订单 003: 折扣=0, 税率=0.08, 运费=10
订单 004: 折扣=0.15, 税率=0.08, 运费=20
订单 005: 折扣=0, 税率=0.08, 运费=5
3. 使用$cond条件处理null:
订单 001: 普通折扣
订单 002: 无折扣
订单 003: 无折扣
订单 004: 高折扣
订单 005: 无折扣
4. 使用$switch处理多种null情况:
订单 001: 标准运费
订单 002: 运费待定
订单 003: 运费待定
订单 004: 标准运费
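订单 005: 低运费
7.4 Document Schema Design Recommendations
Practice: When designing the document model, decide explicitly how null values are used, and avoid overusing null, which complicates queries.
Rationale: A well-designed schema reduces the complexity that null values introduce and improves query efficiency and code readability.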
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
echo "=== 文档Schema设计最佳实践 ===\n\n";
echo "【实践1: 区分必需字段和可选字段】\n\n";
class UserSchema
{
const REQUIRED_FIELDS = [
'user_id' => ['type' => 'string', 'description' => '用户唯一标识'],
'username' => ['type' => 'string', 'description' => '用户名'],
'email' => ['type' => 'string', 'description' => '邮箱地址'],
'created_at' => ['type' => 'date', 'description' => '创建时间']
];
const OPTIONAL_FIELDS = [
'nickname' => ['type' => 'string', 'null_strategy' => 'omit', 'description' => '昵称(未设置则省略字段)'],
'avatar' => ['type' => 'string', 'null_strategy' => 'explicit', 'description' => '头像URL(清空后设为null)'],
'bio' => ['type' => 'string', 'null_strategy' => 'omit', 'description' => '个人简介'],
'phone' => ['type' => 'string', 'null_strategy' => 'omit', 'description' => '手机号'],
'deleted_at' => ['type' => 'date', 'null_strategy' => 'explicit', 'description' => '删除时间(null表示未删除)']
];
public static function createDocument(array $data): array
{
$doc = [];
foreach (self::REQUIRED_FIELDS as $field => $config) {
if (!isset($data[$field])) {
throw new InvalidArgumentException("必需字段 {$field} 缺失");
}
$doc[$field] = $data[$field];
}
foreach (self::OPTIONAL_FIELDS as $field => $config) {
if (array_key_exists($field, $data)) {
$value = $data[$field];
if ($value !== null || $config['null_strategy'] === 'explicit') {
$doc[$field] = $value;
}
}
}
return $doc;
}
}
$user1 = UserSchema::createDocument([
'user_id' => 'u001',
'username' => 'zhangsan',
'email' => 'zhangsan@example.com',
'created_at' => new MongoDB\BSON\UTCDateTime()
]);
$user2 = UserSchema::createDocument([
'user_id' => 'u002',
'username' => 'lisi',
'email' => 'lisi@example.com',
'created_at' => new MongoDB\BSON\UTCDateTime(),
'nickname' => '小李',
'avatar' => null
]);
echo "用户1(无可选字段):\n";
echo json_encode($user1, JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT) . "\n\n";
echo "用户2(有可选字段):\n";
echo json_encode($user2, JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT) . "\n\n";
echo "【实践2: 使用嵌入文档组织相关字段】\n\n";
$goodDesign = [
'user_id' => 'u001',
'name' => [
'first' => '三',
'last' => '张',
'middle' => null
],
'contact' => [
'email' => 'zhangsan@example.com',
'phone' => null,
'address' => null
],
'preferences' => [
'theme' => 'dark',
'language' => 'zh-CN',
'notifications' => null
]
];
echo "良好设计(嵌入文档):\n";
echo json_encode($goodDesign, JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT) . "\n\n";
$badDesign = [
'user_id' => 'u001',
'first_name' => '三',
'last_name' => '张',
'middle_name' => null,
'email' => 'zhangsan@example.com',
'phone' => null,
'address' => null,
'theme' => 'dark',
'language' => 'zh-CN',
'notifications' => null
];
echo "不良设计(扁平字段):\n";
echo json_encode($badDesign, JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT) . "\n\n";
echo "【实践3: 数值字段使用默认值替代null】\n\n";
class OrderSchema
{
public static function createOrder(array $data): array
{
return [
'order_id' => $data['order_id'],
'subtotal' => $data['subtotal'],
'discount' => $data['discount'] ?? 0,
'tax' => $data['tax'] ?? 0,
'shipping' => $data['shipping'] ?? 0,
'total' => self::calculateTotal($data),
'created_at' => new MongoDB\BSON\UTCDateTime()
];
}
private static function calculateTotal(array $data): float
{
$subtotal = $data['subtotal'];
$discount = $data['discount'] ?? 0;
$tax = $data['tax'] ?? 0;
$shipping = $data['shipping'] ?? 0;
$afterDiscount = $subtotal * (1 - $discount);
$taxAmount = $afterDiscount * $tax;
return $afterDiscount + $taxAmount + $shipping;
}
}
$order = OrderSchema::createOrder([
'order_id' => 'ORD001',
'subtotal' => 100
]);
echo "订单(使用默认值):\n";
echo json_encode($order, JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT) . "\n";
Output:
=== 文档Schema设计最佳实践 ===
【实践1: 区分必需字段和可选字段】
用户1(无可选字段):
{
"user_id": "u001",
"username": "zhangsan",
"email": "zhangsan@example.com",
"created_at": {
"$date": {
"$numberLong": "1704067200000"
}
}
}
用户2(有可选字段):
{
"user_id": "u002",
"username": "lisi",
"email": "lisi@example.com",
"created_at": {
"$date": {
"$numberLong": "1704067200000"
}
},
"nickname": "小李",
"avatar": null
}
【实践2: 使用嵌入文档组织相关字段】
良好设计(嵌入文档):
{
"user_id": "u001",
"name": {
"first": "三",
"last": "张",
"middle": null
},
"contact": {
"email": "zhangsan@example.com",
"phone": null,
"address": null
},
"preferences": {
"theme": "dark",
"language": "zh-CN",
"notifications": null
}
}
不良设计(扁平字段):
{
"user_id": "u001",
"first_name": "三",
"last_name": "张",
"middle_name": null,
"email": "zhangsan@example.com",
"phone": null,
"address": null,
"theme": "dark",
"language": "zh-CN",
"notifications": null
}
【实践3: 数值字段使用默认值替代null】
订单(使用默认值):
{
"order_id": "ORD001",
"subtotal": 100,
"discount": 0,
"tax": 0,
"shipping": 0,
"total": 100.0,
"created_at": {
"$date": {
"$numberLong": "1704067200000"
}
}
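}
8. Frequently Asked Questions (FAQ)
8.1 Why does a {field: null} query return documents where the field is missing?
Problem: Running db.collection.find({field: null}) returns not only the documents whose field value is null, but also the documents that do not contain the field at all.
Answer: This is a MongoDB design decision. In query semantics, {field: null} behaves like {$or: [{field: {$type: 10}}, {field: {$exists: false}}]}, so it matches both explicit null values and missing fields. The design makes it easy to find documents with "no value", whether the field was explicitly set to null or never existed.
For an exact match, use:
- {field: {$type: 10}}: matches only explicit null values
- {field: {$exists: false}}: matches only missing fields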
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->faq->question1;
$collection->drop();
$collection->insertMany([
['name' => 'A', 'value' => null],
['name' => 'B'],
['name' => 'C', 'value' => 'test']
]);
echo "=== FAQ 1: 查询null的行为 ===\n\n";
echo "查询 {value: null}:\n";
$result1 = $collection->find(['value' => null])->toArray();
foreach ($result1 as $doc) {
echo " - {$doc['name']}\n";
}
echo "\n查询 {value: {\$type: 10}} (只匹配显式null):\n";
$result2 = $collection->find(['value' => ['$type' => 10]])->toArray();
foreach ($result2 as $doc) {
echo " - {$doc['name']}\n";
}
echo "\n查询 {value: {\$exists: false}} (只匹配字段缺失):\n";
$result3 = $collection->find(['value' => ['$exists' => false]])->toArray();
foreach ($result3 as $doc) {
echo " - {$doc['name']}\n";
}
Output:
=== FAQ 1: 查询null的行为 ===
查询 {value: null}:
- A
- B
查询 {value: {$type: 10}} (只匹配显式null):
- A
查询 {value: {$exists: false}} (只匹配字段缺失):
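- B
8.2 How can a unique index allow multiple null values?
Problem: After a unique index is created, only one document can hold a null value. How can several documents have a null indexed field?
Answer: There are two solutions:
- Use a sparse unique index: createIndex({field: 1}, {unique: true, sparse: true}). A sparse index does not index documents that are missing the field, so any number of documents may omit it. Documents that store an explicit null are still indexed, so only one of those is allowed.
- Use a partial index: createIndex({field: 1}, {unique: true, partialFilterExpression: {field: {$type: "string"}}}). Uniqueness is enforced only for documents that match the filter, so explicit null values are not restricted.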
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
echo "=== FAQ 2: 唯一索引允许多个null ===\n\n";
echo "方案1: 稀疏唯一索引\n";
$collection1 = $client->faq->sparse_unique;
$collection1->drop();
$collection1->createIndex(['email' => 1], ['unique' => true, 'sparse' => true]);
$collection1->insertMany([
['name' => 'A'],
['name' => 'B'],
['name' => 'C', 'email' => 'test@example.com']
]);
echo "插入成功: 3个文档,其中2个没有email字段\n";
echo "\n方案2: 部分索引\n";
$collection2 = $client->faq->partial_unique;
$collection2->drop();
$collection2->createIndex(
['email' => 1],
[
'unique' => true,
'partialFilterExpression' => ['email' => ['$type' => 'string']]
]
);
$collection2->insertMany([
['name' => 'A', 'email' => null],
['name' => 'B', 'email' => null],
['name' => 'C', 'email' => 'test@example.com']
]);
echo "插入成功: 3个文档,其中2个email为null\n";
Output:
=== FAQ 2: 唯一索引允许多个null ===
方案1: 稀疏唯一索引
插入成功: 3个文档,其中2个没有email字段
方案2: 部分索引
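插入成功: 3个文档,其中2个email为null
8.3 What is the difference between null and an empty string?
Problem: In MongoDB, how do null and the empty string "" differ, and which one should you choose?
Answer: null and the empty string are different BSON types:
- null is BSON type 10 and means "no value"
- The empty string is BSON type 2 (string) and means "the value is empty"
Recommendations:
- Use null to express "there is no value" or "the value does not exist"
- Use an empty string to express "a value exists but is empty", typically for string fields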
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->faq->null_vs_empty;
$collection->drop();
$collection->insertMany([
['name' => 'A', 'comment' => null],
['name' => 'B', 'comment' => ''],
['name' => 'C', 'comment' => '这是评论'],
['name' => 'D']
]);
echo "=== FAQ 3: Null vs 空字符串 ===\n\n";
echo "BSON类型对比:\n";
echo " null: BSON类型10\n";
echo " 空字符串: BSON类型2\n\n";
echo "查询comment为null:\n";
$result1 = $collection->find(['comment' => null])->toArray();
foreach ($result1 as $doc) {
echo " - {$doc['name']}\n";
}
echo "\n查询comment为空字符串:\n";
$result2 = $collection->find(['comment' => ''])->toArray();
foreach ($result2 as $doc) {
echo " - {$doc['name']}\n";
}
echo "\n查询comment显式为null:\n";
$result3 = $collection->find(['comment' => ['$type' => 10]])->toArray();
foreach ($result3 as $doc) {
echo " - {$doc['name']}\n";
}
echo "\n选择建议:\n";
echo " - null: 表示'没有值'或'值不存在'\n";
echo " - 空字符串: 表示'值存在但为空'\n";
Output:
=== FAQ 3: Null vs 空字符串 ===
BSON类型对比:
null: BSON类型10
空字符串: BSON类型2
查询comment为null:
- A
- D
查询comment为空字符串:
- B
查询comment显式为null:
- A
选择建议:
- null: 表示'没有值'或'值不存在'
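- 空字符串: 表示'值存在但为空'
8.4 How do I handle null values correctly in an aggregation pipeline?
Problem: A calculation in an aggregation pipeline returns null because one of its inputs is null. How should this be handled?
Answer: MongoDB provides several operators for handling null values:
- $ifNull: returns a replacement value when the input is null or missing; with several inputs (MongoDB 5.0+) it returns the first non-null one
- $cond: handles null through a conditional test
- $switch: handles null as one branch among several conditions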
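One detail is easy to miss here: inside an aggregation expression, `$eq` against null is false for a missing field, unlike the query-language match `{field: null}`. The sketch below normalizes the field with `$ifNull` before comparing. The helper names `isNullOrMissing` and `labelByPresence` are hypothetical and exist only for this illustration.

```php
<?php
/**
 * Expression that is true when the field is null OR missing.
 * $ifNull maps a missing field to null first, so $eq sees null in both cases.
 */
function isNullOrMissing(string $fieldPath): array
{
    return ['$eq' => [['$ifNull' => [$fieldPath, null]], null]];
}

/** Expression that labels a document by whether $fieldPath holds a value. */
function labelByPresence(string $fieldPath, string $whenEmpty, string $whenSet): array
{
    return ['$cond' => [
        'if' => isNullOrMissing($fieldPath),
        'then' => $whenEmpty,
        'else' => $whenSet,
    ]];
}

// A $project stage that tags each item by the presence of a discount.
$pipeline = [
    ['$project' => [
        'item' => 1,
        'discount_status' => labelByPresence('$discount', 'no discount', 'discounted'),
    ]],
];

echo json_encode($pipeline, JSON_PRETTY_PRINT), "\n";
```

The pipeline array can be handed to `aggregate()` unchanged. Wrapping the test in one helper keeps `$cond` and `$switch` branches consistent about what "no value" means.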
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->faq->aggregation_null;
$collection->drop();
$collection->insertMany([
['item' => 'A', 'price' => 100, 'discount' => 0.1],
['item' => 'B', 'price' => 200, 'discount' => null],
['item' => 'C', 'price' => 150]
]);
echo "=== FAQ 4: 聚合管道处理null ===\n\n";
$pipeline = [
[
'$project' => [
'item' => 1,
'price' => 1,
'discount' => 1,
'final_price' => [
'$multiply' => [
'$price',
['$subtract' => [1, ['$ifNull' => ['$discount', 0]]]]
]
]
]
]
];
$result = $collection->aggregate($pipeline)->toArray();
foreach ($result as $doc) {
// isset() is false for an explicit null, so check key existence instead
$discount = array_key_exists('discount', (array) $doc) ? ($doc['discount'] ?? 'null') : 'missing';
echo "商品 {$doc['item']}: 折扣={$discount}, 最终价格={$doc['final_price']}\n";
}
Output:
=== FAQ 4: 聚合管道处理null ===
商品 A: 折扣=0.1, 最终价格=90
商品 B: 折扣=null, 最终价格=200
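商品 C: 折扣=missing, 最终价格=150
8.5 Where do null values appear when sorting?
Problem: When sorting on a field that contains null values, where are the nulls placed?
Answer: In MongoDB's sort order, null comes before numbers, strings, and other ordinary values (only MinKey sorts lower), and a missing field is treated as null for sorting. Null values therefore come first in an ascending sort and last in a descending sort; the relative order of null and missing documents is not guaranteed.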
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->faq->sort_null;
$collection->drop();
$collection->insertMany([
['name' => 'A', 'score' => 100],
['name' => 'B', 'score' => null],
['name' => 'C', 'score' => 50],
['name' => 'D']
]);
echo "=== FAQ 5: Null值排序位置 ===\n\n";
echo "升序排序:\n";
// The PHP library takes the sort order as a find() option; cursors have no sort() method
$asc = $collection->find([], ['sort' => ['score' => 1]])->toArray();
foreach ($asc as $doc) {
$score = array_key_exists('score', (array) $doc) ? ($doc['score'] ?? 'null') : 'missing';
echo " {$doc['name']}: {$score}\n";
}
echo "\n降序排序:\n";
$desc = $collection->find([], ['sort' => ['score' => -1]])->toArray();
foreach ($desc as $doc) {
$score = array_key_exists('score', (array) $doc) ? ($doc['score'] ?? 'null') : 'missing';
echo " {$doc['name']}: {$score}\n";
}
Output:
=== FAQ 5: Null值排序位置 ===
升序排序:
B: null
D: missing
C: 50
A: 100
降序排序:
A: 100
C: 50
B: null
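D: missing
8.6 How do I bulk-update null values to a default value?
Problem: The database contains many fields with null values. How can they be updated to a default value in bulk?
Answer: Use updateMany with the $set operator, and select the documents with a $type filter (explicit null) or an $exists filter (missing field).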
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->faq->batch_update_null;
$collection->drop();
$collection->insertMany([
['name' => 'A', 'status' => null],
['name' => 'B', 'status' => 'active'],
['name' => 'C', 'status' => null],
['name' => 'D']
]);
echo "=== FAQ 6: 批量更新null值 ===\n\n";
echo "更新前:\n";
$before = $collection->find([])->toArray();
foreach ($before as $doc) {
$status = array_key_exists('status', (array) $doc) ? ($doc['status'] ?? 'null') : 'missing';
echo " {$doc['name']}: {$status}\n";
}
$collection->updateMany(
['status' => ['$type' => 10]],
['$set' => ['status' => 'inactive']]
);
echo "\n更新显式null后:\n";
$after = $collection->find([])->toArray();
foreach ($after as $doc) {
$status = array_key_exists('status', (array) $doc) ? ($doc['status'] ?? 'null') : 'missing';
echo " {$doc['name']}: {$status}\n";
}
Output:
=== FAQ 6: 批量更新null值 ===
更新前:
A: null
B: active
C: null
D: missing
更新显式null后:
A: inactive
B: active
C: inactive
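D: missing
9. Hands-on Exercises
9.1 Basic Exercise
Task: Create a user collection with name, email, and phone fields, where email and phone are optional. Implement the following:
- Insert 3 users, each with a different combination of the optional fields
- Query all users who have no email
- Query the users whose phone is explicitly null
Approach:
- Use a missing field to mean "never filled in" and an explicit null to mean "cleared"
- Use $exists: false to query for missing fields
- Use $type: 10 to query for explicit null
Reference code: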
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->practice->users_exercise;
$collection->drop();
$collection->insertMany([
['name' => '张三', 'email' => 'zhangsan@example.com', 'phone' => '13800138000'],
['name' => '李四', 'email' => 'lisi@example.com'],
['name' => '王五', 'phone' => null]
]);
echo "=== 基础练习答案 ===\n\n";
echo "1. 所有用户:\n";
$all = $collection->find([])->toArray();
foreach ($all as $doc) {
echo " {$doc['name']}\n";
}
echo "\n2. 没有邮箱的用户:\n";
$noEmail = $collection->find(['email' => ['$exists' => false]])->toArray();
foreach ($noEmail as $doc) {
echo " {$doc['name']}\n";
}
echo "\n3. 手机号显式为null的用户:\n";
$nullPhone = $collection->find(['phone' => ['$type' => 10]])->toArray();
foreach ($nullPhone as $doc) {
echo " {$doc['name']}\n";
}
Output:
=== 基础练习答案 ===
1. 所有用户:
张三
李四
王五
2. 没有邮箱的用户:
王五
3. 手机号显式为null的用户:
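王五
9.2 Intermediate Exercise
Task: Implement a product rating system in which a user can rate a product (1 to 5) or leave it unrated. Requirements:
- Compute the average rating of each product, excluding unrated records
- Find all unrated records
- Implement it with an aggregation pipeline
Approach:
- Exclude null ratings from the average, either with a $match stage or with $cond inside the accumulator
- Use $group to compute the average
- Handle the "no rating" case when displaying the result
Reference code: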
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$collection = $client->practice->ratings_exercise;
$collection->drop();
$collection->insertMany([
['product_id' => 'P001', 'user_id' => 'U001', 'rating' => 5],
['product_id' => 'P001', 'user_id' => 'U002', 'rating' => 4],
['product_id' => 'P001', 'user_id' => 'U003', 'rating' => null],
['product_id' => 'P002', 'user_id' => 'U001', 'rating' => 3],
['product_id' => 'P002', 'user_id' => 'U002', 'rating' => null]
]);
echo "=== 进阶练习答案 ===\n\n";
$pipeline = [
[
'$group' => [
'_id' => '$product_id',
'avgRating' => [
'$avg' => [
'$cond' => [
'if' => ['$ne' => ['$rating', null]],
'then' => '$rating',
'else' => '$$REMOVE'
]
]
],
'totalRatings' => ['$sum' => 1],
'ratedCount' => [
'$sum' => [
'$cond' => [['$ne' => ['$rating', null]], 1, 0]
]
]
]
]
];
echo "商品评分统计:\n";
$result = $collection->aggregate($pipeline)->toArray();
foreach ($result as $doc) {
$avg = $doc['avgRating'] ? round($doc['avgRating'], 2) : '无评分';
echo " 商品 {$doc['_id']}: 平均分={$avg}, 评分数={$doc['ratedCount']}/{$doc['totalRatings']}\n";
}
echo "\n未评分记录:\n";
$unrated = $collection->find(['rating' => ['$type' => 10]])->toArray();
foreach ($unrated as $doc) {
echo " 商品 {$doc['product_id']}, 用户 {$doc['user_id']}\n";
}
Output:
=== 进阶练习答案 ===
商品评分统计:
商品 P001: 平均分=4.5, 评分数=2/3
商品 P002: 平均分=3, 评分数=1/2
未评分记录:
商品 P001, 用户 U003
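商品 P002, 用户 U002
9.3 Challenge Exercise
Task: Design a system configuration module that supports multi-level inheritance:
- System-level default configuration
- Tenant-level overrides
- User-level overrides
- A configuration value of null means "inherit from the level above"
- Query the effective configuration at any level
Approach:
- Store system, tenant, and user configuration in three separate collections
- A null value means "inherit from the level above"
- Look up by priority: user → tenant → system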
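Before involving the database, the lookup order can be sketched in memory. The function below is a simplified illustration under the stated rule that null (or an absent key) means "inherit"; `resolveConfig` and the layer array layout are assumptions made for this sketch, not part of any library.

```php
<?php
/**
 * Resolve a key through layers ordered from most to least specific.
 * A null or absent value in a layer means "inherit from the next layer".
 */
function resolveConfig(array $layers, string $key): array
{
    foreach ($layers as $source => $values) {
        // ?? treats a missing key and an explicit null the same way.
        if (($values[$key] ?? null) !== null) {
            return ['value' => $values[$key], 'source' => $source];
        }
    }
    return ['value' => null, 'source' => 'none'];
}

// Layers in priority order: user first, system last.
$layers = [
    'user' => ['timezone' => 'Asia/Shanghai'],
    'tenant' => ['theme' => 'dark', 'language' => null],
    'system' => ['theme' => 'light', 'language' => 'zh-CN', 'timezone' => 'UTC'],
];

foreach (['theme', 'language', 'timezone'] as $key) {
    $result = resolveConfig($layers, $key);
    echo "{$key}: {$result['value']} (source: {$result['source']})\n";
}
```

Because null and "missing" both fall through to the next layer, a tenant can reset an override either by storing null or by deleting the entry.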
Reference code:
php
<?php
require_once 'vendor/autoload.php';
use MongoDB\Client;
$client = new Client("mongodb://localhost:27017");
$systemCol = $client->practice->system_configs;
$tenantCol = $client->practice->tenant_configs;
$userCol = $client->practice->user_configs;
$systemCol->drop();
$tenantCol->drop();
$userCol->drop();
$systemCol->insertMany([
['key' => 'theme', 'value' => 'light'],
['key' => 'language', 'value' => 'zh-CN'],
['key' => 'timezone', 'value' => 'UTC']
]);
$tenantCol->insertMany([
['tenant_id' => 'T001', 'key' => 'theme', 'value' => 'dark'],
['tenant_id' => 'T001', 'key' => 'language', 'value' => null]
]);
$userCol->insertMany([
['tenant_id' => 'T001', 'user_id' => 'U001', 'key' => 'timezone', 'value' => 'Asia/Shanghai']
]);
function getEffectiveConfig($systemCol, $tenantCol, $userCol, $tenantId, $userId, $key) {
$userConfig = $userCol->findOne([
'tenant_id' => $tenantId, 'user_id' => $userId, 'key' => $key
]);
if ($userConfig && $userConfig['value'] !== null) {
return ['value' => $userConfig['value'], 'source' => 'user'];
}
$tenantConfig = $tenantCol->findOne(['tenant_id' => $tenantId, 'key' => $key]);
if ($tenantConfig && $tenantConfig['value'] !== null) {
return ['value' => $tenantConfig['value'], 'source' => 'tenant'];
}
$systemConfig = $systemCol->findOne(['key' => $key]);
if ($systemConfig) {
return ['value' => $systemConfig['value'], 'source' => 'system'];
}
return ['value' => null, 'source' => 'none'];
}
echo "=== 挑战练习答案 ===\n\n";
$configs = ['theme', 'language', 'timezone'];
foreach ($configs as $key) {
$result = getEffectiveConfig($systemCol, $tenantCol, $userCol, 'T001', 'U001', $key);
echo "配置 {$key}: {$result['value']} (来源: {$result['source']})\n";
}
Output:
=== 挑战练习答案 ===
配置 theme: dark (来源: tenant)
配置 language: zh-CN (来源: system)
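配置 timezone: Asia/Shanghai (来源: user)
10. Summary
10.1 Key Points
Two forms of null
- Explicit null: the field exists and its value is null (BSON type 10)
- Implicit absence: the field does not exist in the document
Query behavior
- {field: null} matches both null values and missing fields
- {field: {$type: 10}} matches only explicit null
- {field: {$exists: false}} matches only missing fields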
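The three query forms correspond to three states a field can be in. As a small sketch, the hypothetical helpers below (`fieldState` and `filterForState` are illustrative names) classify a field of an already fetched document and build the matching filter. Documents returned by the PHP library are objects by default, so the sketch assumes they are cast with `(array)` before being passed in.

```php
<?php
/** Classify a field of a fetched document as 'missing', 'null', or 'value'. */
function fieldState(array $doc, string $field): string
{
    if (!array_key_exists($field, $doc)) {
        return 'missing';
    }
    return $doc[$field] === null ? 'null' : 'value';
}

/** Build the query filter that selects documents in the given state. */
function filterForState(string $field, string $state): array
{
    return match ($state) {
        'missing' => [$field => ['$exists' => false]],
        'null' => [$field => ['$type' => 10]],        // explicit null only
        'null_or_missing' => [$field => null],        // both cases at once
        default => throw new InvalidArgumentException("Unknown state: {$state}"),
    };
}

$docs = [
    ['name' => 'A', 'value' => null],
    ['name' => 'B'],
    ['name' => 'C', 'value' => 'test'],
];
foreach ($docs as $doc) {
    echo $doc['name'], ': ', fieldState($doc, 'value'), "\n";
}
```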
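Indexes and null
- A regular index indexes both null values and missing fields (a missing field is indexed as null)
- A sparse index skips documents that are missing the field, but still indexes explicit null values
- A unique index treats null as a concrete value, so only one document may have a null or missing indexed field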
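These rules can be modeled in a few lines of plain PHP. The sketch below is a simplified in-memory model, not how the server implements indexes; the function names and the three index kinds (`regular`, `sparse`, `partial_string`) are assumptions chosen for illustration, with `partial_string` standing for a partial index filtered on `{$type: "string"}`.

```php
<?php
/** Simplified model: does a document get an entry in this kind of index on $field? */
function hasIndexEntry(array $doc, string $field, string $kind): bool
{
    return match ($kind) {
        'regular' => true,                                   // missing fields are indexed as null
        'sparse' => array_key_exists($field, $doc),          // explicit null still counts as present
        'partial_string' => is_string($doc[$field] ?? null), // only string values are indexed
        default => throw new InvalidArgumentException("Unknown index kind: {$kind}"),
    };
}

/** Would a unique index of this kind reject this set of documents? */
function violatesUnique(array $docs, string $field, string $kind): bool
{
    $seen = [];
    foreach ($docs as $doc) {
        if (!hasIndexEntry($doc, $field, $kind)) {
            continue; // not indexed, so it cannot collide
        }
        $key = serialize($doc[$field] ?? null); // a missing field is keyed as null
        if (isset($seen[$key])) {
            return true;
        }
        $seen[$key] = true;
    }
    return false;
}

$missingTwice = [['name' => 'A'], ['name' => 'B']];
$nullTwice = [['name' => 'A', 'email' => null], ['name' => 'B', 'email' => null]];

foreach (['regular', 'sparse', 'partial_string'] as $kind) {
    printf(
        "%-15s missing twice: %s, null twice: %s\n",
        $kind,
        violatesUnique($missingTwice, 'email', $kind) ? 'conflict' : 'ok',
        violatesUnique($nullTwice, 'email', $kind) ? 'conflict' : 'ok'
    );
}
```

The model makes the practical difference visible: a sparse unique index tolerates repeated missing fields but not repeated explicit nulls, while the partial index tolerates both.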
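Aggregation pipeline handling
- Use $ifNull to supply a default value; with several inputs (MongoDB 5.0+) it returns the first non-null one
- Use $cond and $switch for conditional handling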
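10.2 Common Pitfalls
Confusing null with a missing field
- Wrong: using {field: null} and expecting it to match only explicit null
- Right: use {field: {$type: 10}}
Null propagation in aggregation calculations
- Wrong: calculating directly with a field that may be null
- Right: use $ifNull to supply a default value
Unique index conflicts on null
- Wrong: expecting several documents to hold null under a plain unique index
- Right: use a sparse index (for missing fields) or a partial index
PHP type-check traps
- Wrong: using empty() or == null to test for null
- Right: use the strict comparison === null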
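The PHP pitfall is easy to reproduce without a database. The sketch below runs the different checks against a sample array; the field names are illustrative only.

```php
<?php
// Sample document: a zero, an empty string, and an explicit null; 'nickname' is absent.
$doc = ['count' => 0, 'note' => '', 'avatar' => null];

$checks = [];
foreach (['count', 'note', 'avatar', 'nickname'] as $field) {
    $value = $doc[$field] ?? null; // null for both an explicit null and a missing key
    $checks[$field] = [
        'empty' => empty($value),                                             // true for 0, '' and null
        'loose' => $value == null,                                            // also true for 0 and ''
        'strict' => array_key_exists($field, $doc) && $doc[$field] === null,  // explicit null only
        'missing' => !array_key_exists($field, $doc),                         // key absent
    ];
}

foreach ($checks as $field => $result) {
    printf(
        "%-8s empty=%-5s ==null=%-5s ===null=%-5s missing=%s\n",
        $field,
        var_export($result['empty'], true),
        var_export($result['loose'], true),
        var_export($result['strict'], true),
        var_export($result['missing'], true)
    );
}
```

Only the combination of `array_key_exists()` and `=== null` separates an explicit null from a missing key and from legitimate values such as 0 or an empty string.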
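11. Further Reading
11.1 Official Documentation
11.2 Advanced Learning Path
- MongoDB data modeling - learn how to design document models that contain optional fields
- MongoDB index optimization - understand in depth when sparse and partial indexes apply
- MongoDB aggregation pipeline - master complex data processing and null-handling techniques
- MongoDB Schema Validation - learn how to use schema validation to ensure data integrity
11.3 Related Topics
- BSON type system - understand MongoDB's underlying data types
- $type operator - query data by type
- $exists operator - check whether a field exists
- Partial index expressions - advanced indexing strategies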
