原文
| ||||||||||||||||||
| ||||||||||||||||||
![]() |
原始链接: https://news.ycombinator.com/item?id=43623099
Hacker News用户“dicroce”分享了他用大约500行代码实现的向量嵌入HNSW(分层可导航小世界)索引。HNSW被描述为一种分层图结构,其顶层节点稀疏,底层节点密集。节点连接到同一层内的附近节点。插入操作涉及随机选择一个层级,并将节点插入到该层级及其下方的所有层级。搜索从顶层开始,找到最近的节点,然后向下搜索附近的节点,同时跟踪已看到的K个最近节点。搜索返回level 0上的值或最近值以及K个最近节点。一位评论者赞赏了对该数据结构的清晰解释。另一位建议使用ann-benchmarks.com将该实现的性能和召回特性与现有实现进行比较。
| ||||||||||||||||||
| ||||||||||||||||||
![]() |
> HNSW is a graph structure that consists of levels that are more sparsely populated at the top and more densely populated at the bottom. Nodes within a layer have connections to other nodes that are near them on the same level. When a node is inserted a random level is picked and the node is inserted there. It is also inserted into all levels beneath that level down to 0.
> When searches arrive they start at the top and search that level (following connections) until they find the closest node in the top level. The search then descends and keeps searching nearby nodes. As the search progresses the code keeps track of the K nearest nodes it has seen. Eventually it either finds the value OR it finds the closest value on level 0 and the K nearest nodes seen are returned.
reply