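Core Concepts and Principles

1. Inverted Index

1.1 What Is an Inverted Index

The inverted index is the core data structure behind Elasticsearch's fast full-text search.

Forward index vs. inverted index:

Forward index (traditional database):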
DocID | Content
1     | "Java is great"
2     | "Python is easy"
3     | "Java and Python"

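Querying "Java" requires scanning every document → O(n)

Inverted index (Elasticsearch; the common words "is" and "and" are left out for brevity):
Term    | DocIDs      | TF       | Positions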
Java    | [1, 3]      | [1, 1]   | [[0], [0]]
Python  | [2, 3]      | [1, 1]   | [[0], [2]]
great   | [1]         | [1]      | [[2]]
easy    | [2]         | [1]      | [[2]]

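Querying "Java" looks the term up and jumps straight to its posting list, with no document scan → often summarized as O(1)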
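
The table above can be reproduced with a few lines of Python. This is a minimal sketch that assumes whitespace tokenization and lowercasing; the function name `build_inverted_index` is made up for illustration, and unlike the table it also indexes "is" and "and".

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercased term to {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, token in enumerate(text.lower().split()):
            index[token].setdefault(doc_id, []).append(position)
    return index

docs = {1: "Java is great", 2: "Python is easy", 3: "Java and Python"}
index = build_inverted_index(docs)

postings = index["java"]                           # doc_id -> positions
doc_ids = sorted(postings)                         # documents containing the term
term_freqs = [len(postings[d]) for d in doc_ids]   # TF per document
```

A lookup is a single dictionary access on the term; the documents themselves are never scanned.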

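1.2 Structure of an Inverted Index

A complete inverted index consists of:

Term Dictionary
- The sorted set of all terms that occur in the documents
- Stored on disk in compressed blocks
- Supports prefix search
Posting List
- IDs of the documents that contain the term
- Document frequency (DF)
- Term frequency (TF)
- Positions (used for phrase queries)
Term Index
- An index over the Term Dictionary, built as an FST (Finite State Transducer)
- Held in memory to speed up lookups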

Data structure diagram:

In memory:
Term Index (FST)
    ↓
On disk:
Term Dictionary
├─ java → Posting List: [1→TF:2→Pos:[0,5], 3→TF:1→Pos:[0]]
├─ python → Posting List: [2→TF:1→Pos:[0], 3→TF:1→Pos:[2]]
└─ elasticsearch → Posting List: [1→TF:1→Pos:[3]]
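
Why a sorted term dictionary supports both exact and prefix lookups can be sketched with a binary search. This is a toy stand-in, not an FST; the term list, the doc IDs, and the helper names `lookup` and `prefix_search` are assumptions made for the example.

```python
from bisect import bisect_left

# Toy term dictionary: sorted terms, each mapped to a posting list of doc IDs.
terms = ["easy", "elasticsearch", "great", "java", "javascript", "python"]
postings = {"easy": [2], "elasticsearch": [1], "great": [1],
            "java": [1, 3], "javascript": [4], "python": [2, 3]}

def lookup(term):
    """Exact lookup by binary search over the sorted term list."""
    i = bisect_left(terms, term)
    return postings[terms[i]] if i < len(terms) and terms[i] == term else []

def prefix_search(prefix):
    """Terms sharing a prefix form one contiguous run in sorted order."""
    i = bisect_left(terms, prefix)
    matches = []
    while i < len(terms) and terms[i].startswith(prefix):
        matches.append(terms[i])
        i += 1
    return matches
```

For example, `prefix_search("ja")` walks only the adjacent entries `java` and `javascript` instead of the whole dictionary.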

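1.3 Advantages of the Inverted Index

Fast queries: a term lookup leads directly to the matching documents (often summarized as O(1))
Rich query support: boolean, phrase, and fuzzy queries
Space efficient: compression reduces storage

Compression techniques:

Frame of Reference (FOR): integer compression for posting lists
Roaring Bitmap: compression of document ID sets
FST: compression of the Term Index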
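
The idea behind Frame of Reference can be illustrated with delta encoding: sorted doc IDs are stored as gaps, and small gaps need fewer bits per value. The sketch below is a simplification under that assumption; Lucene's actual block layout differs, and the doc IDs are invented.

```python
def delta_encode(doc_ids):
    """Keep the first ID, then the gap to each following ID (input must be sorted)."""
    return [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]

def delta_decode(deltas):
    """Rebuild the original IDs with a running sum."""
    ids, total = [], 0
    for d in deltas:
        total += d
        ids.append(total)
    return ids

def bits_per_value(values):
    """Bits needed to store every value of a block at one fixed width."""
    return max(max(values).bit_length(), 1)

doc_ids = [73, 300, 302, 332, 343, 372]
deltas = delta_encode(doc_ids)                        # gaps instead of absolute IDs
raw_bits = bits_per_value(doc_ids) * len(doc_ids)     # 9 bits per ID
packed_bits = bits_per_value(deltas) * len(deltas)    # 8 bits per gap
```

The saving grows with denser posting lists, because the gaps shrink while the absolute IDs keep growing.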

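2. Core Concepts

2.1 Index

An Index is a collection of documents, loosely comparable to a database in MySQL.

Example:

// Create an index
PUT /products
{
  "settings": {
    "number_of_shards": 3,      // number of primary shards (fixed once the index is created)
    "number_of_replicas": 1     // number of replicas (can be changed at any time)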
  },
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "price": { "type": "float" },
      "created_at": { "type": "date" }
    }
  }
}

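2.2 Document

A Document is the basic unit that can be indexed, stored as JSON.

Example:

PUT /products/_doc/1
{
  "name": "iPhone 15 Pro",
  "price": 7999,
  "brand": "Apple",
  "tags": ["5G", "A17 chip"],
  "created_at": "2024-09-15"
}

Document metadata:

_index: the index the document belongs to
_id: unique identifier of the document
_source: the original JSON body
_version: document version number, incremented on every change (current versions use _seq_no and _primary_term for optimistic concurrency control)
_score: relevance score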

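2.3 Mapping

A Mapping defines the field types of a document and how each field is indexed, comparable to a table schema in MySQL.

Core field types:

Type	Description	Examples
text	Full-text search, analyzed into terms	Product titles, article bodies
keyword	Exact match, not analyzed	Phone numbers, order numbers, tags
long/integer	Integers	Price, stock
float/double	Floating-point numbers	Ratings, latitude/longitude values
date	Dates	Creation time, update time
boolean	Boolean	Whether a product is listed
object	Inner JSON object	Address information
geo_point	Geographic point	Latitude/longitude coordinates

text vs keyword: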

{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",              // analyzed and indexed for full-text search
        "analyzer": "ik_max_word",   // uses the IK analyzer
        "fields": {
          "keyword": {               // sub-field, not analyzed
            "type": "keyword"
          }
        }
      },
      "status": {
        "type": "keyword"            // not analyzed, exact match
      }
    }
  }
}

Usage:

// text field: full-text search
GET /products/_search
{
  "query": {
    "match": { "title": "手机" }  // can match "智能手机" (smartphone) and "手机壳" (phone case)
  }
}

// keyword field: exact match, aggregation, sorting
GET /products/_search
{
  "query": {
    "term": { "status": "published" }  // the value must match exactly
  },
  "aggs": {
    "by_status": {
      "terms": { "field": "status" }   // aggregate by status
    }
  }
}
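
The difference between the two field types can be modeled in a few lines. This is a toy sketch: the `analyze` function only lowercases and splits on whitespace, which is far simpler than a real analyzer such as IK, and all function names are made up for illustration.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()

def match_text(field_value, query):
    """text field + match query: both sides are analyzed; any shared term is a hit."""
    return bool(set(analyze(field_value)) & set(analyze(query)))

def term_keyword(field_value, query):
    """keyword field + term query: the whole value is one term and must be equal."""
    return field_value == query

title = "iPhone 15 Pro"
hit_on_text = match_text(title, "iphone")        # True: one analyzed term matches
hit_on_keyword = term_keyword(title, "iPhone")   # False: not the full value
```

The same value therefore answers different questions depending on its mapping, which is why a `text` field often carries a `keyword` sub-field.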

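2.4 Shard

A Shard is a physical partition of an index; it enables horizontal scaling and parallel processing.

Primary Shard:

The count is set when the index is created and cannot be changed afterwards
Each document lives in exactly one primary shard
Suggested count: number of nodes × (1 to 3)

Replica Shard:

A full copy of a primary shard
Provides high availability and extra read capacity
The count can be adjusted at any time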

Shard layout:

Index: products (3 primary shards, 1 replica)

Node 1: P0, R1
Node 2: P1, R2
Node 3: P2, R0

P = Primary Shard
R = Replica Shard
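
The key constraint in any such layout is that a replica never shares a node with its own primary. Below is a minimal round-robin sketch of that rule; the `allocate` function and node names are hypothetical, and the real allocator weighs many more factors (disk usage, awareness attributes, balance).

```python
def allocate(num_primaries, num_replicas, nodes):
    """Round-robin layout: primary i on node i, its k-th replica k nodes before it."""
    if num_replicas >= len(nodes):
        raise ValueError("need more nodes than replicas to keep copies apart")
    layout = {node: [] for node in nodes}
    for shard in range(num_primaries):
        layout[nodes[shard % len(nodes)]].append(f"P{shard}")
        for k in range(1, num_replicas + 1):
            layout[nodes[(shard - k) % len(nodes)]].append(f"R{shard}")
    return layout

layout = allocate(3, 1, ["node1", "node2", "node3"])
total_shards = sum(len(shards) for shards in layout.values())  # 3 primaries + 3 replicas
```

With 3 primaries and 1 replica the cluster holds 6 shard copies in total, two per node.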

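Choosing the number of shards:

Scenario 1: small dataset (< 10GB)
└─ 1-3 primary shards

Scenario 2: medium dataset (10-100GB)
└─ 3-5 primary shards

Scenario 3: large dataset (> 100GB)
└─ 5-10 primary shards

Guidelines:
- Keep each shard between 10 and 50GB
- Avoid oversharding (management overhead)
- Leave room for future growth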

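3. Write Path

3.1 The Write Path in Detail

1. The client sends a write request
   ↓
2. A Coordinating Node receives the request
   ↓
3. Routing: shard = hash(_routing) % number_of_primary_shards
   ↓
4. The request is forwarded to the node that holds the primary shard
   ↓
5. The primary shard indexes the document
   ├─ Writes it to the in-memory buffer
   └─ Appends it to the Translog (durable log)
   ↓
6. The primary shard forwards the request to all in-sync replica shards
   ↓
7. Once the replicas acknowledge the write, the primary shard reports success and the client receives the response

Code example: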

// Custom routing keeps related documents on the same shard
PUT /orders/_doc/order-123?routing=user-456
{
  "user_id": "user-456",
  "product": "iPhone 15",
  "price": 7999
}

// Routing calculation
shard = hash("user-456") % 3  // assuming 3 primary shards
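
The routing formula can be tried out in Python. This is a sketch only: `zlib.crc32` is a stand-in hash chosen for the example, so the shard numbers it produces will not match the ones Elasticsearch computes with its own hash function.

```python
import zlib

def route(routing_key, num_primary_shards):
    """Pick a shard from a stable hash of the routing key.

    zlib.crc32 is an assumption of this sketch, not the hash Elasticsearch uses.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards

shard_a = route("user-456", 3)
shard_b = route("user-456", 3)   # same key, same shard, on every call
```

Because the hash is deterministic, every document written with `routing=user-456` ends up on the same shard.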

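3.2 Refresh

A newly written document is not searchable right away; it becomes searchable after a Refresh.

Refresh flow:

In-memory buffer (not searchable)
    ↓ Refresh (every 1 second by default)
Segment (searchable, not yet durable)
    ↓ Flush
Disk (durable)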
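
The visibility rule in this flow can be modeled with a toy class. It is a sketch under heavy simplification: the class `RefreshDemoShard` is invented for illustration, segments are plain lists, and search is a substring check.

```python
class RefreshDemoShard:
    """Toy model: documents sit in a buffer until refresh() turns them into a segment."""

    def __init__(self):
        self.buffer = []      # indexed, but not searchable yet
        self.segments = []    # searchable; one list of documents per segment

    def index(self, doc):
        self.buffer.append(doc)

    def refresh(self):
        if self.buffer:
            self.segments.append(self.buffer)
            self.buffer = []

    def search(self, word):
        return [d for seg in self.segments for d in seg if word in d]

shard = RefreshDemoShard()
shard.index("error on node 1")
before = shard.search("error")   # nothing has been refreshed yet
shard.refresh()
after = shard.search("error")    # the document is now in a segment
```

Each refresh creates a new small segment, which is why very frequent refreshes increase later merge work.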

Refresh settings:

// Change the refresh interval (trades freshness for write throughput)
PUT /logs/_settings
{
  "index.refresh_interval": "30s"
}

// Disable refresh during bulk loading
PUT /logs/_settings
{
  "index.refresh_interval": "-1"
}

// Manual refresh
POST /logs/_refresh

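3.3 Translog

The Translog protects against data loss.

Purpose:

Crash recovery: after a node failure, operations not yet persisted in segments are replayed from the Translog
Low write latency: a write is acknowledged once it is in the Translog, without waiting for a Flush

Flush flow:

1. Write the contents of the in-memory buffer to a new Segment
2. Clear the in-memory buffer
3. Write a Commit Point
4. Fsync the Segments to disk
5. Clear the Translog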
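
How the Translog and Flush work together can be sketched with a toy model. The class `TranslogDemoShard` is hypothetical, it treats the translog list as if it were already on disk, and it collapses the segment write, commit point, and fsync into one step.

```python
class TranslogDemoShard:
    """Toy model of buffer + translog + flush; not how Lucene stores data."""

    def __init__(self):
        self.buffer = []      # in memory, lost on a crash
        self.translog = []    # treated as durable in this sketch
        self.committed = []   # documents already persisted by a flush

    def index(self, doc):
        self.buffer.append(doc)
        self.translog.append(doc)

    def flush(self):
        self.committed.extend(self.buffer)  # steps 1, 3, 4 collapsed into one
        self.buffer = []                     # step 2
        self.translog = []                   # step 5

    def crash_and_recover(self):
        # The in-memory buffer is gone; rebuild it by replaying the translog.
        self.buffer = list(self.translog)

shard = TranslogDemoShard()
shard.index("doc-1")
shard.flush()
shard.index("doc-2")          # not flushed before the crash
shard.crash_and_recover()
all_docs = shard.committed + shard.buffer
```

Only operations written after the last flush need to be replayed, which is why the Translog can be cleared at the end of each flush.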

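Translog settings:

PUT /logs/_settings
{
  "index.translog.durability": "async",     // background fsync: faster, but writes since the last sync can be lost in a crash
  "index.translog.sync_interval": "5s",    // fsync every 5 seconds
  "index.translog.flush_threshold_size": "512mb"  // Translog size that triggers a flush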
}

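4. Document Routing

4.1 Routing Algorithm

# Default routing
shard_num = hash(_id) % num_primary_shards

# Custom routing
shard_num = hash(_routing) % num_primary_shards

Why can't the number of primary shards be changed?

Changing it changes the result of the routing formula, so existing documents could no longer be located
Solution: Reindex into a new index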
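
The effect of changing the shard count can be measured with a small experiment. The sketch assumes `zlib.crc32` as a stand-in hash and uses invented document IDs; for a uniform hash, about three quarters of the IDs are expected to map to a different shard when going from 3 to 4 shards, since `h % 3 == h % 4` only holds when `h % 12` is 0, 1, or 2.

```python
import zlib

def route(doc_id, num_primary_shards):
    # zlib.crc32 is a stand-in hash for this sketch, not the one Elasticsearch uses.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards

doc_ids = [f"order-{i}" for i in range(1000)]
moved = [d for d in doc_ids if route(d, 3) != route(d, 4)]
moved_ratio = len(moved) / len(doc_ids)   # share of documents whose shard changes
```

Every moved document would be looked up on a shard that does not hold it, which is why the data has to be rewritten into a new index instead.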

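4.2 Custom Routing Scenarios

Scenario 1: keeping one user's data together

// All orders of a user land on the same shard, so a routed search only has to query that shard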
PUT /orders/_doc/order-1?routing=user-123
{
  "user_id": "user-123",
  "product": "iPhone"
}

GET /orders/_search?routing=user-123
{
  "query": {
    "term": { "user_id": "user-123" }
  }
}

Scenario 2: multi-tenant systems

// All data of a tenant lands on one shard (different tenants can still share a shard)
PUT /logs/_doc/log-1?routing=tenant-A
{
  "tenant_id": "tenant-A",
  "message": "Error occurred"
}

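5. Frequently Asked Interview Questions

Why is Elasticsearch so fast at querying?

Answer:

Inverted index: a term lookup leads directly to the matching documents, with no full scan (often summarized as O(1))
FST: the Term Index is kept compactly in memory and speeds up term lookups
Shard parallelism: a query runs on multiple shards in parallel
Caching:
- Query Cache: caches the results of filter clauses
- Request Cache: caches shard-level results such as aggregations
- Field Data Cache: caches field values used for sorting and aggregations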

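When should you use text, and when keyword?

Scenario	Field type
Product title search	text
Exact match on order number	keyword
User review search	text
User gender (male/female)	keyword
Article body search	text
Tag aggregation	keyword

Rule of thumb:

Needs analysis and full-text search → text
Exact match, aggregation, sorting → keyword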

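How do you choose the number of primary shards?

Formula:

Primary shards = estimated data volume / target shard size (30GB)

Example:
- Estimated data volume after 1 year: 300GB
- Primary shards = 300GB / 30GB = 10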
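
The formula is easy to wrap in a helper. This is a minimal sketch that assumes rounding up (so no shard exceeds the target size) and a floor of one shard; the function name `primary_shard_count` is made up for illustration.

```python
import math

def primary_shard_count(estimated_gb, target_shard_gb=30):
    """Round up so that no shard exceeds the target size; always at least 1."""
    return max(1, math.ceil(estimated_gb / target_shard_gb))

shards = primary_shard_count(300)        # the 300GB example from the text
shards_small = primary_shard_count(5)    # tiny index still needs one shard
```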

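Notes:

The count cannot be changed after creation, so leave room for growth
Too many shards add management overhead
Suggested range: 3-10 primary shards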

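How do you make sure written data is not lost?

Answer:

Translog durability:
- With the default index.translog.durability of "request", every write is fsynced to the Translog before it is acknowledged
- After a node crash, data is recovered from the Translog
Replication:
- Set wait_for_active_shards=2
- The write only proceeds when at least 2 shard copies (the primary plus one replica) are active
Cluster redundancy:
- Deploy with replicas (at least 1)
- Spread nodes across racks/AZs

Example: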

PUT /orders/_doc/1?wait_for_active_shards=2
{
  "order_id": "12345",
  "amount": 999
}

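What is the difference between Refresh, Flush, and Merge?

Operation	What it does	Frequency	Performance impact
Refresh	Buffer → Segment; documents become searchable	Every 1 second	Low
Flush	Segment → disk; clears the Translog	About every 30 minutes, or when the Translog reaches its size threshold	Medium
Merge	Merges small Segments and drops documents marked as deleted	Automatic, in the background	High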
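
What a Merge does can be sketched as combining several small segments while skipping deleted documents. In this toy model a segment is a dict of doc ID to content and deletions are a set of IDs; both representations and the name `merge_segments` are assumptions of the example.

```python
def merge_segments(segments, deleted_ids):
    """Combine several segments into one, skipping documents marked as deleted."""
    merged = {}
    for segment in segments:
        for doc_id, doc in segment.items():
            if doc_id not in deleted_ids:
                merged[doc_id] = doc
    return merged

segments = [{1: "a", 2: "b"}, {3: "c"}, {4: "d"}]
merged = merge_segments(segments, deleted_ids={2})   # one segment, doc 2 is gone
```

Deleted documents only free their space at merge time, which is part of why merging is the most expensive of the three operations.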

Optimization tips:

// During a bulk import
PUT /logs/_settings
{
  "index.refresh_interval": "-1",           // disable refresh
  "index.number_of_replicas": 0            // no replicas
}

// Restore the settings after the import completes
PUT /logs/_settings
{
  "index.refresh_interval": "1s",
  "index.number_of_replicas": 1
}
POST /logs/_refresh
POST /logs/_forcemerge?max_num_segments=1

6. Practical Tips

6.1 Mapping Design Best Practices

PUT /products
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1,
    "refresh_interval": "1s"
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart",      // coarser-grained analysis at search time
        "fields": {
          "keyword": { "type": "keyword" }  // supports exact match and aggregation
        }
      },
      "price": {
        "type": "scaled_float",             // stored as a scaled long, usually more compact than float
        "scaling_factor": 100
      },
      "tags": {
        "type": "keyword"                   // arrays are supported automatically
      },
      "created_at": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss||epoch_millis"
      },
      "location": {
        "type": "geo_point"                 // geographic location
      }
    }
  }
}
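
The `scaled_float` type with `scaling_factor: 100` can be pictured as storing a rounded integer. The sketch below illustrates that idea only; the helper names are invented, and Python's `round` may differ from Elasticsearch on exact halfway cases.

```python
def encode_scaled_float(value, scaling_factor=100):
    """Multiply by the scaling factor and round to the nearest integer for storage."""
    return round(value * scaling_factor)

def decode_scaled_float(stored, scaling_factor=100):
    """Divide the stored integer by the scaling factor to get the value back."""
    return stored / scaling_factor

stored = encode_scaled_float(79.99)       # kept as an integer number of hundredths
restored = decode_scaled_float(stored)
truncated = decode_scaled_float(encode_scaled_float(79.994))  # third decimal is lost
```

A factor of 100 fits prices with two decimals; any precision beyond the factor is dropped at index time.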

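6.2 Dynamic Mapping Pitfalls

Problem: dynamic mapping guesses the field type, and for an ID-like string the guess is not keyword

// Mapping created automatically
PUT /logs/_doc/1
{
  "order_id": "123456"  // by default mapped as text with a keyword sub-field; mapped as long if numeric_detection is enabled
}

// Later queries may not behave as intended
GET /logs/_search
{
  "query": {
    "term": { "order_id": "123456" }  // written for a keyword field, but the field is text or long
  }
}

Solutions: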

// Option 1: define the Mapping up front
PUT /logs
{
  "mappings": {
    "properties": {
      "order_id": { "type": "keyword" }
    }
  }
}

// Option 2: use an Index Template
PUT /_index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "mappings": {
      "properties": {
        "order_id": { "type": "keyword" }
      }
    }
  }
}

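6.3 Key Metrics to Monitor

# Show shard allocation
GET _cat/shards/products?v

# Show index statistics
GET /products/_stats

# Show node statistics
GET _nodes/stats

# Key metrics
- indexing.index_total: number of documents indexed
- search.query_total: number of queries
- search.query_time_in_millis: total time spent on queries
- segments.count: number of Segments (too many means a Merge is needed)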
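
The two search counters are cumulative, so a useful derived number is the average latency per query. The sketch below assumes a dict shaped like the `search` section of a stats response; the sample numbers are invented and the function name is hypothetical.

```python
def average_query_latency_ms(stats):
    """Average query time in milliseconds, or 0.0 when no queries have run."""
    search = stats["search"]
    total = search["query_total"]
    return search["query_time_in_millis"] / total if total else 0.0

# Hypothetical numbers shaped like the "search" section of a stats response.
sample = {"search": {"query_total": 2000, "query_time_in_millis": 9000}}
latency = average_query_latency_ms(sample)
```

To track latency over time rather than since startup, take two snapshots and divide the difference in time by the difference in query count.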