SEARCH SYSTEM FOR QUANTIZATION OF VECTOR INDICES AND EFFICIENT MULTI-GRAPH SEARCHING AND MERGING

Information

  • Patent Application
  • Publication Number
    20250231948
  • Date Filed
    January 13, 2025
  • Date Published
    July 17, 2025
Abstract
According to an aspect, a search system is provided that executes quantization techniques and/or multi-graph searching and/or merging techniques that may increase the speed of searching and/or indexing while reducing the amount of computing resources that are used to perform these computer tasks as compared with conventional approaches.
Description

An index (e.g., index vectors) may be used to store representations of data (e.g., structured data or, in some examples, unstructured data). Machine-learning (ML) model(s) may generate index vectors, and these index vectors are learned such that the distance between the index vectors may map to certain properties. For example, text can be represented by the index vectors where neighboring index vectors are semantically similar. Some vector search algorithms (e.g., Hierarchical Navigable Small World (HNSW)) use a multi-layered graph structure to efficiently locate approximate nearest neighbors (ANNs) in high-dimensional data spaces, and these vector search algorithms may store the vectors in memory so that they are accessible when traversing a graph. Quantization techniques can be applied to approximate such vectors in a space efficient manner. However, some conventional quantization techniques may still require relatively high memory storage requirements and/or may have lower accuracy of representing the underlying data.


In some examples, when building index vectors, a single graph is generated with the index vectors. In some examples, the index vectors are built with multiple graphs. Some conventional search algorithms that search in a single graph may be computationally inefficient. Some conventional indexing algorithms that merge multiple graphs may be relatively complex and use increased computing resources.


SUMMARY

In some aspects, the techniques described herein relate to a method including: identifying, using an optimization solver, one or more quantization constants for quantizing a vector in a graph of a search system; and generating a quantized vector of the vector using the one or more quantization constants.


In some aspects, the techniques described herein relate to a method including: computing a similarity distance between a first quantized vector and a second quantized vector stored in a search system; estimating an error approximation based on metadata associated with the first quantized vector and metadata associated with the second quantized vector; and updating the similarity distance based on the error approximation.


In some aspects, the techniques described herein relate to a method including: receiving vector segments to merge; receiving one or more system constraints; identifying an indexing strategy based on the one or more system constraints; and generating a merged segment for the vector segments using the indexing strategy.


In some aspects, the techniques described herein relate to a method including: receiving a user query for a vector search on a first graph and a second graph; executing a first search on the first graph and a second search on the second graph; at an interval, broadcasting a first message about the first search to the second graph and broadcasting a second message about the second search to the first graph; and updating the first search and the second search based on the second message and the first message, respectively.


In some aspects, the techniques described herein relate to a method including: identifying a first graph and a second graph to merge; adding a subset of vectors from the second graph to the first graph; for a first vector in the second graph not added to the first graph, identifying neighbor vectors in the first graph and the second graph using neighbors of the first vector included in the subset; and adding the neighbor vectors to the first graph.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1A illustrates a search system according to an aspect.



FIG. 1B illustrates a distributed computing system of a search system according to an aspect.



FIG. 2 illustrates an example of a quantization constant selector according to an aspect.



FIG. 3A illustrates aspects of providing quantized distance accuracy using a quantization engine and a quantized distance calculator according to an aspect.



FIG. 3B illustrates an example of a layout of a packed vector according to an aspect.



FIG. 3C illustrates examples of a layout of an unpacked vector and a query vector according to an aspect.



FIG. 4 illustrates an example of an indexing strategy selector according to an aspect.



FIG. 5 illustrates an example of a multigraph searcher according to an aspect.



FIG. 6A illustrates a graph merging engine according to an aspect.



FIG. 6B illustrates a graph merging engine according to another aspect.



FIG. 6C depicts an illustration of a gain calculation from adding a vertex to a join set according to an aspect.



FIG. 7 illustrates a flowchart depicting example operations of graph merging according to an aspect.





DETAILED DESCRIPTION

This disclosure relates to a search system that executes quantization techniques and/or multi-graph searching and/or merging techniques that may increase the speed of searching and/or indexing while reducing the amount of computing resources that are used to perform these computer tasks as compared with conventional approaches. In some examples, the search system may automatically select one or more quantization constants which may improve the efficiency of obtaining search results (e.g., nearest neighbor queries of a query vector). In some examples, the search system implements a quantization strategy that can improve the accuracy of a distance calculation for a fixed compression ratio and/or may increase compression of vectors without loss of retrieval accuracy (or minimized loss), thereby reducing the amount of memory requirements. In some examples, the search system may implement an adaptive indexing strategy for storing vectors when graphs are merged in a manner that is computationally more efficient than conventional approaches. In some examples, the search system may execute a parallel graph search that is computationally more efficient than conventional approaches. In some examples, the search system may execute a graph merging mechanism for merging multiple graphs that is computationally more efficient than conventional approaches.



FIGS. 1A and 1B illustrate an example of a search system 100 according to an aspect. In some examples, the search system 100 is a search and analytics engine. In some examples, the search system 100 is an indexing and search engine. In some examples, the search system 100 is an indexing, search, and analytics engine. The search system 100 includes a search and index tier 101. The search and index tier 101 may be a single layer or multiple layers for managing and storing index information 120 and searching using the index information 120. In some examples, the search and index tier 101 includes common computer resources (e.g., CPU, memory) that are shared between indexing and searching operations. In some examples, the search and index tier 101 includes computer resources for indexing that are separate from the computer resources for searching, thereby allowing a search tier and an index tier to be independently scaled.


The search and index tier 101 may be executable by one or more server computers 160. The search and index tier 101 may communicate with one or more client devices 130. In some examples, the search and index tier 101 may receive data 134 (e.g., new data) to be indexed by the search and index tier 101. In some examples, the search and index tier 101 may receive a client query 132 from a client device 130, and the search and index tier 101 may identify (and return) one or more search results responsive to the client query 132 by searching the index information 120.


The search and index tier 101 may include one or more indexing engines, one or more searching engines, and one or more local storage devices (e.g., local disks, buffers). In some examples, the search and index tier 101 includes a distributed computing system. In some examples, as shown in FIG. 1B, the search and index tier 101 includes a plurality of machines 162 (e.g., computing devices) that are in communication with each other. The machines 162 may include a machine 162-1, a machine 162-2, a machine 162-3, and a machine 162-4. Although four machines 162 are illustrated in FIG. 1B, the search system 100 may include any number of machines 162. In some examples, a machine 162 is referred to as a node, which may be an indexing node or a search node. A machine 162 includes one or more processors and one or more local storage devices.


The search and index tier 101 may receive (e.g., ingest) data 134 (e.g., documents) from one or more client devices 130. The search and index tier 101 may generate one or more index structures about the data 134 and store the index structures in the index information 120. The index information 120 includes one or more index structures about the data stored by the search and index tier 101. In some examples, the index information 120 includes one or more graphs 122 storing vectors 124. In some examples, the vectors 124 are referred to as index vectors. The graphs 122 may be used by the search and index tier 101 to efficiently find the documents that are relevant to a particular client query 132. The type of index structure is dependent upon the type of documents ingested by the search and index tier 101, but may generally include a document identifier, document type, timestamp, index terms, ranking, etc.


In some examples, the search system 100 includes a quantization constant selector 108 configured to automatically select one or more quantization constants 128, which are used by a quantized distance calculator 112 to determine the similarity of vectors 124 to a vector query. FIG. 2 illustrates an example of the quantization constant selector 108. Referring to FIG. 2, the quantization constant selector 108 includes a re-quantization decider 126 configured to receive vector segments to merge and generate values for the quantization constants 128, which are used to update the quantized vectors. In some examples, a vector segment is a component of a vector 124, and a vector 124 may have multiple components.


For example, in a floating point representation of a vector 124, each of its components (e.g., vector segments) is represented by a floating point number (e.g., a thirty-two bit institute of electrical and electronics engineers (IEEE) floating point number). The search and index tier 101 includes a quantization engine 110 configured to execute scalar quantization, which includes converting floating point numbers into integers with bits (e.g., b bits). In some examples, scalar quantization may be executed based on the following transformation:










x_q = ⌊(2^b − 1) · (clamp(x, l, u) − l) / (u − l)⌉        Eq. (1)








The operator ⌊·⌉ is a rounding operation that rounds its argument to the nearest integer, and clamp(x, l, u) is a function that restricts an argument x to the interval [l, u]. As further described below, the quantization constants 128 (e.g., l and u) may be selected by the quantization constant selector 108.
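A minimal sketch of the scalar quantization in Eq. (1), together with the approximate inverse that Eq. (2) below makes explicit. The function names quantize and dequantize and the example constants are illustrative, not part of the disclosure:

```python
def clamp(x, l, u):
    # Restrict the argument x to the interval [l, u].
    return max(l, min(x, u))

def quantize(vector, l, u, b):
    # Eq. (1): map each float component to a b bit unsigned integer.
    scale = (2 ** b - 1) / (u - l)
    return [round((clamp(x, l, u) - l) * scale) for x in vector]

def dequantize(q_vector, l, u, b):
    # Approximate reconstruction (Eq. (2)): x ≈ l + alpha * x_q.
    alpha = (u - l) / (2 ** b - 1)
    return [l + alpha * q for q in q_vector]

x = [0.12, -0.40, 0.95, 0.33]
xq = quantize(x, l=-1.0, u=1.0, b=4)
print(xq)                              # integers in [0, 2^b - 1]
print(dequantize(xq, -1.0, 1.0, 4))    # close to x, within alpha / 2 per component
```

Each unclamped component is recovered to within half a quantization step (alpha / 2), which is the source of the score error analyzed below.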


Since the result of this transformation (e.g., Eq. (1)) is an integer in the range [0, 2^b − 1], the result can be represented by a b bit unsigned integer. In some examples, the quantization constants 128 are used to map components (e.g., every component) of a vector 124 to an integer, and a quantized distance calculator 112 may compute (e.g., directly compute) the distance between vectors 124 from their quantized representation. The variable x_q is the result of quantizing a floating point vector x, and Eq. (1) may be expressed with the vector's components as follows:









x = l + ((u − l) / (2^b − 1)) · x_q        Eq. (2)








The vector l denotes the vector whose components are all equal to l. In some examples, vector proximity may use cosine or dot product distance metrics. Cosine distance may reduce to a dot product if each floating point vector is normalized. Although some aspects discuss dot product distance, the search system 100 may use cosine distance metrics. The dot product distance between vectors x and y may be defined by x^t y. Substituting from Eq. (2) and denoting







α = (u − l) / (2^b − 1),

x^t y = (l + α x_q)^t (l + α y_q) = l^t l + α l^t (x_q + y_q) + α² x_q^t y_q        Eq. (3)








The first term of the right-hand side (RHS) is a constant, and the second term depends (e.g., depends only) on a single vector and can be pre-computed and stored with each vector as a single float. As such, the quantized distance calculator 112 may compute the distance between any two vectors by correcting the dot product of their quantized vectors using Eq. (3).
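The correction in Eq. (3) can be checked numerically: the dot product of two float vectors is recovered from their integer quantized vectors plus terms that are either constant or precomputable per vector. This is an illustrative sketch (hypothetical names; in practice l and u would come from the quantization constant selector 108):

```python
def quantize(vec, l, u, bits):
    scale = (2 ** bits - 1) / (u - l)
    return [round((max(l, min(x, u)) - l) * scale) for x in vec]

def corrected_dot(xq, yq, l, u, bits, d):
    # Eq. (3): l^t l is a constant, l^t (xq + yq) reduces to one
    # precomputable sum per vector, and xq^t yq is an integer dot product.
    alpha = (u - l) / (2 ** bits - 1)
    return (d * l * l
            + alpha * l * (sum(xq) + sum(yq))
            + alpha * alpha * sum(p * q for p, q in zip(xq, yq)))

x = [0.3, -0.2, 0.8, 0.1]
y = [0.5, 0.4, -0.6, 0.9]
l, u, bits = -1.0, 1.0, 8
xq, yq = quantize(x, l, u, bits), quantize(y, l, u, bits)
exact = sum(a * c for a, c in zip(x, y))
approx = corrected_dot(xq, yq, l, u, bits, d=len(x))
print(exact, approx)   # approx tracks exact to within the quantization error
```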


As indicated above, the quantization constant selector 108 may select quantization constants 128 (e.g., l and u) to minimize the chance of reordering the dot product between two random vectors in the index information 120. Vector indices may be used for nearest neighbor queries. A nearest neighbor query may search for the closest k vectors or top-k set in the index information 120 for a given query vector y (or sometimes referred to as a query). An error in the vector approximation (e.g., which preserves the order of the vectors' distance to a query vector y, such as a constant shift), may not affect the top-k set. In some examples, the search system 100 may minimize the chance that vectors are reordered on average for a query. In some examples, the search system 100 may not know the query distribution at index time. In some examples, the search system 100 may assume that queries are drawn from the same distribution as the index vectors.


The quantization constant selector 108 may compute the values of the quantization constants, l and u, which may minimize the chance of reordering the dot product between two random vectors 124 in the index information 120. In some examples, the quantization constant selector 108 may compute the values of the quantization constants 128 based on an expected score variance introduced by the quantization. The search system 100 may use xq(l, u) to denote the quantized version of the vector x as a function of quantization parameters and the quantization constant selector 108 may select values l and u which satisfy the following equation:










l*, u* = argmin_{l,u} 𝔼_{X,Y}[ | X^t Y − X_q(l, u)^t Y_q(l, u) − 𝔼_{X,Y}[ X^t Y − X_q(l, u)^t Y_q(l, u) ] |² ]        Eq. (4)




In some examples, the quantization constant selector 108 may approximate Eq. (4) by selecting a set P of random pairs of vectors from the index information 120 and averaging over the selected pairs. In some examples, the quantization constant selector 108 may use an optimization solver to identify values of the quantization constants 128 that minimize a loss (representing the expected score variance introduced by quantization) based on the following equations:










loss(l, u) = (1/|P|) Σ_{i∈P} | x_i^t y_i − x_{q,i}(l, u)^t y_{q,i}(l, u) − (1/|P|) Σ_{j∈P} ( x_j^t y_j − x_{q,j}(l, u)^t y_{q,j}(l, u) ) |²        Eq. (5)

l*, u* = argmin_{l,u} loss(l, u)        Eq. (6)




In some examples, the quantization constant selector 108 may solve Eq. (5) and Eq. (6) using an optimization solver such as a grid search solver or a Bayesian optimization solver. A grid search solver may evaluate the function (e.g., Eq. (5) and Eq. (6)) at a predefined grid of points in the input space and the point with the best output is then considered the optimal solution. A Bayesian optimization solver may use Bayesian statistics to construct a probabilistic model of the function and the model is then used to intelligently select promising points for evaluation, allowing for more efficient exploration of the input space.
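A grid search over candidate (l, u) pairs can be sketched as follows; the loss mirrors Eq. (5) (the variance of the score error over sampled pairs) and the argmin mirrors Eq. (6). The sampling scheme, grid values, and helper names are hypothetical:

```python
import itertools
import random

def quantize(vec, l, u, b=4):
    scale = (2 ** b - 1) / (u - l)
    return [round((max(l, min(x, u)) - l) * scale) for x in vec]

def dequantize(q, l, u, b=4):
    alpha = (u - l) / (2 ** b - 1)
    return [l + alpha * v for v in q]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def loss(pairs, l, u):
    # Eq. (5): variance of the dot product error over the sampled pairs P.
    errs = [dot(x, y) - dot(dequantize(quantize(x, l, u), l, u),
                            dequantize(quantize(y, l, u), l, u))
            for x, y in pairs]
    mean = sum(errs) / len(errs)
    return sum((e - mean) ** 2 for e in errs) / len(errs)

def grid_search(pairs, grid):
    # Eq. (6): evaluate the loss at each grid point and keep the best.
    return min(((l, u) for l, u in itertools.product(grid, grid) if l < u),
               key=lambda c: loss(pairs, *c))

random.seed(0)
vectors = [[random.gauss(0, 0.3) for _ in range(16)] for _ in range(40)]
pairs = [(random.choice(vectors), random.choice(vectors)) for _ in range(50)]
best = grid_search(pairs, grid=[-1.5, -1.0, -0.5, 0.5, 1.0, 1.5])
print(best)
```

A Bayesian optimization solver would replace the exhaustive grid with a probabilistic model that proposes promising (l, u) points to evaluate next.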


In some examples, the search system 100 may improve quantized distance accuracy (e.g., improving the error analysis in Eq. (3) to obtain a better approximation of the dot product). For example, as shown in FIG. 3A, a quantization engine 110 may receive a vector 124 and generate metadata 136 and a quantized vector 124b using quantization parameters 138, which may include the quantization constants 128 selected by the quantization constant selector 108. The quantized distance calculator 112 may execute a similarity function 190 (e.g., compute a distance (e.g., a dot product distance)) between a quantized vector 124b-1 and a quantized vector 124b-2. The quantized distance calculator 112 may execute a correction function 192 to obtain an approximation error based on metadata 136-1 associated with the quantized vector 124b-1 and metadata 136-2 associated with the quantized vector 124b-2. The quantized distance calculator 112 may update the result of the similarity function 190 with the result of the correction function 192, thereby providing an improved similarity accuracy.


For example, the dot product distance (e.g., result of the similarity function 190) between two vectors may be expressed as follows:














x^t y = (l + x − l)^t (l + y − l)
      = l^t l + l^t (y − l + x − l) + (x − l)^t (y − l)        Eq. (7)




The first two terms are functions of a single vector 124 and can be pre-computed and stored for each vector 124. The third term may be expressed as follows:















(x − l)^t (y − l) = (α x_q + x − l − α x_q)^t (α y_q + y − l − α y_q)
                  = (α x_q + e_x)^t (α y_q + e_y)
                  = α² x_q^t y_q + α e_x^t y_q + α x_q^t e_y + O(|e|²)        Eq. (8)




In some examples, the quantities e_x = x − l − α x_q and e_y = y − l − α y_q may be relatively small. In some examples, the search system 100 may not know the value of either y_q or the vectors {x_q} in the index, which may be required to compute x_q^t e_y in advance. In some examples, the search system 100 may use the expected values, which may yield:













(x − l)^t (y − l) ≈ α² x_q^t y_q + α 𝔼[ e_x^t Y_q + X_q^t e_y ]        Eq. (9)




The search system 100 may minimize the approximation error for vectors close to the query vector, e.g., x_q ≈ y_q. Averaged over any distribution for X_q, the expectation becomes a constant term that may be added to dot products (e.g., all dot products) for the query. This may not affect their order, and, in some examples, the constant term may be discarded.


In some examples, the search system 100 may not derive the query distribution at index time. In some examples, the search system 100 may assume queries are drawn from the same distribution as the vectors in the index. Given typical index data structures, such as an HNSW graph, the search system 100 may quickly extract the near neighbors of a vector. A set of near neighbors of x_q may be denoted by N. Then, for y_q ≈ x_q, the expectation can be approximated by:










𝔼[ e_x^t Y_q ] ≈ (1/|N|) Σ_{x′_q ∈ N, x′_q ≠ x_q} e_x^t x′_q        Eq. (10)




In some examples, the search system 100 uses a simpler approximation, 𝔼[ e_x^t Y_q ] ≈ e_x^t x_q, which may yield the following corrected dot product distance:











x^t y ≈ constant + l^t (x − l) + α e_x^t x_q + α² x_q^t y_q        Eq. (11)




The value l^t (x − l) + α e_x^t x_q is a single floating point number. In some examples, the search system 100 may compute the value l^t (x − l) + α e_x^t x_q for each vector and may store this value with its quantized representation. The computation and storage of this value may reduce the memory requirements as compared to some conventional approaches.


In some examples, the search system 100 may further improve the approximation of Eq. (11) in the context of a specific nearest neighbor search. In some examples, 𝔼[Y_q] is better approximated when the search system rescales, e.g.,







𝔼[ Y_q ] ≈ ( ∥y_q∥ / ∥x_q∥ ) x_q.





If the search system 100 stores two floating point numbers instead of one, l^t (x − l) and α e_x^t x_q / ∥x_q∥, the search system 100 can efficiently apply the rescaling when the search system computes each dot product. In some examples, the search system 100 applies rescaling for non-normalized vectors. In some examples, the search system 100 does not apply rescaling for cosine distance dot products.


In some examples, the search system 100 may retrieve more than k nearest neighbor vectors, and the search system 100 may re-rank these nearest neighbor vectors using floating point vectors. In some examples, the search system 100 may retrieve the nearest neighbor vectors by reading them from disk.
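The over-retrieval step described above can be sketched as follows: the quantized search supplies more than k candidates, and the candidates are then re-scored with full floating point vectors (names and data are illustrative):

```python
def rerank(query, candidate_ids, float_vectors, k):
    # Re-score each candidate with the exact float dot product; keep top-k.
    scored = [(sum(q * v for q, v in zip(query, float_vectors[i])), i)
              for i in candidate_ids]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]

# Suppose the quantized search over-retrieved three candidates for k = 1,
# and the float vectors were read back from disk.
float_vectors = {0: [0.9, 0.1], 1: [0.2, 0.8], 2: [0.7, 0.6]}
top = rerank([1.0, 0.0], candidate_ids=[0, 1, 2],
             float_vectors=float_vectors, k=1)
print(top)   # the candidate whose float vector best matches the query
```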


In some examples, the search system 100 may store the vectors {xq} in a manner to enable efficient unpacking and computing of dot products with a query vector using instruction sets (e.g., SIMD instruction sets) of variable width (efficient distance calculation for packed quantized vectors). FIG. 3B illustrates an example of a packed vector 124a having a packed vector layout 174 with the top four bits (X1) for one component and the bottom four bits (X2) for another component. FIG. 3C illustrates a layout of an unpacked vector 124c and a corresponding query vector 176.


If the search system 100 uses fewer than a certain number of bits (e.g., eight bits) to represent each component, the search system 100 may combine the bits into a supported integer data type such as an unsigned integer (e.g., an eight bit unsigned integer). The search system 100 may combine multiple values into a single larger data type (e.g., packing) and extract them back into their individual values (e.g., unpacking) using shift operations 168 and/or mask operations 170. In some examples, unpacking instructions may add overhead to the vector distance calculation, which can be a processing bottleneck in nearest neighbor searching.
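Packing and unpacking can be sketched with plain integer shifts and masks; here two four bit components share one byte, with the first component in the top four bits (the helper names are hypothetical):

```python
def pack(components):
    # Pad odd-length vectors with a single zero component.
    if len(components) % 2:
        components = components + [0]
    # Top four bits hold one component, bottom four bits the next.
    return [(components[i] << 4) | components[i + 1]
            for i in range(0, len(components), 2)]

def unpack(packed):
    out = []
    for byte in packed:
        out.append(byte >> 4)     # shift operation: top four bits
        out.append(byte & 0x0F)   # mask operation: bottom four bits
    return out

xq = [3, 12, 7, 0, 15]
packed = pack(xq)
print(packed)
print(unpack(packed))   # original components (plus the zero pad)
```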


To efficiently use central processing unit (CPU) power, in some examples, the search system 100 may use hardware parallelism. In some examples, as shown in FIG. 1B, the search system may include registers 166 (e.g., wide registers) to provide hardware parallelism. A wide register is a register that can store more data than a standard register. In some examples, a wide register can hold 128, 256, or more than 256 bits. A register 166 may perform a shift operation 168 and/or a mask operation 170. A shift operation 168 may move bits within a data structure to a different position. A mask operation 170 may select or mask out specific bits within a data structure. Registers 166 (e.g., wide registers) may allow the same operation (e.g., a shift operation or a mask operation) to be performed on multiple pieces of data or data streams 107 (also referred to as lanes) at the same time (e.g., at least partially at once). The search system 100 may access the registers 166 using one or more instruction sets 164 (e.g., single instruction, multiple data (SIMD) instruction sets). Generally, if an instruction set 164 uses more data streams 107 (e.g., lanes), the register 166 can perform a higher number of operations per second.


Different processors (e.g., machines 162) have different instruction sets 164 and therefore use different numbers of data streams 107. In some distributed data stores, segments of the index information 120 can be copied (e.g., freely copied) from one machine 162 to another machine 162. These machines 162 may implement different instruction sets 164. In some examples, the search system 100 is configured to not reindex the data when the search system 100 copies the data. However, the search system 100 may be configured to use the instruction sets 164 with a higher number of data streams 107 to unpack and compute the dot products on a specific machine 162, which, in some examples, may involve finding a single vector representation for indexed vectors 124 that allows the search system 100 to unpack and compute the dot product with different width SIMD instructions (e.g., instruction sets 164).


For any permutation π of the first d integers, the following equation is provided:











x^t y = Σ_{i=1}^{d} x_i y_i = Σ_{i=1}^{d} x_{π(i)} y_{π(i)}        Eq. (12)




In some examples, the search system 100 may represent vector components using four bit integers. However, the search system 100 may represent vector components using any number of bits other than four. In some examples, if the search system 100 uses four bits to represent vector components, the search system 100 may pack two components into each eight bit unsigned integer, e.g., using the top four bits for one component and the bottom four bits for the other. The top four bits can be accessed via a shift right operator and the bottom four bits via an AND with a four bit mask. In some examples, the search system 100 may store vectors 124 such that the x_{2i−1} components are stored in the top four bits of an eight bit unsigned integer and the x_{2i} components in the bottom four bits. If the vector dimension is odd, the search system 100 may add a single zero component.


Using this arrangement, the search system 100 may efficiently load the vector 124 into the registers 166 and perform the dot product using instruction sets 164 (e.g., SIMD instructions) for any width instruction by adjusting the query vector representation of the query vector 176. In some examples, adjusting the query vector representation is computationally inexpensive, which may allow the search system 100 to increase hardware parallelism from the SIMD instruction sets (e.g., instruction sets 164) available for the machine 162 where the vector 124 is stored.


To efficiently process the instruction sets 164, the search system 100 may reduce or minimize the number of load instructions used to access the components of the packed vector 124a. In some examples, the search system 100 may load contiguous blocks of memory from the vector 124. The maximum width of a register 166 on the machine 162 may be n bits. In some examples, the search system 100 may load n/4 contiguous components of the packed vector 124a each time.


For the jth load, the components {x_{(j−1)n/4+1}, x_{(j−1)n/4+3}, . . . , x_{jn/4−1}} are in the top bits and the components {x_{(j−1)n/4+2}, x_{(j−1)n/4+4}, . . . , x_{jn/4}} are in the bottom bits of an n bit register 166.


Common SIMD instruction sets 164 provide lane-wise shift operations 168 and mask operations 170. Therefore, to unpack these vectors 124, the search system 100 may apply a single lane-wise right shift operator by a certain number of bits (e.g., four bits) and a single lane-wise AND with the mask 0x0F0F . . . 0F. The results of these two operations are stored in two n bit registers as shown in FIG. 3C.


In some examples, for any number of bits n, the search system 100 can arrange the components of the query vector 176 so that the search system 100 can read them in the same order in which the search system 100 reads the components of the packed vector 124a.


The search system 100 may define a permutation function as follows:













π(i) = 2i − 1               for i ∈ [1, n/8]
π(i) = 2(i − n/8)           for i ∈ [n/8 + 1, 2n/8]
π(i) = 2(i − n/8) − 1       for i ∈ [2n/8 + 1, 3n/8]
π(i) = 2(i − 2n/8)          for i ∈ [3n/8 + 1, 4n/8]        Eq. (13)








In some examples, if the search system 100 lays out the query vector based on x_i → x_{π(i)}, the components from the 2j−1th and 2jth loads of the query vector 176 may correspond to (e.g., match) the components in the unpacked registers 166 from the jth load from the packed vector 124a. Using Eq. (12), the search system 100 may perform SIMD multiplication operations on the components to compute the dot product. The register width n is undetermined in the previous discussion, and, in some examples, the same packing scheme operates for any n provided the query layout is adjusted.
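The permutation in Eq. (13) can be checked directly: it is a valid permutation, and by Eq. (12) applying it to both operands leaves the dot product unchanged. This sketch assumes a register width of n = 32 bits (two loads of eight four bit components), an illustrative choice:

```python
n = 32        # assumed register width in bits
d = n // 2    # components covered by the four ranges of Eq. (13)

def pi(i):
    # Eq. (13), with 1-indexed components.
    if i <= n // 8:
        return 2 * i - 1
    if i <= 2 * n // 8:
        return 2 * (i - n // 8)
    if i <= 3 * n // 8:
        return 2 * (i - n // 8) - 1
    return 2 * (i - 2 * n // 8)

perm = [pi(i) for i in range(1, d + 1)]
print(perm)   # odd components of each load first, then even components

x = [float(i) for i in range(1, d + 1)]
y = [float(d - i) for i in range(d)]
plain = sum(a * b for a, b in zip(x, y))
permuted = sum(x[p - 1] * y[p - 1] for p in perm)
print(plain, permuted)   # equal, per Eq. (12)
```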


In some examples, the search system 100 may implement an adaptive indexing strategy for storing vectors 124 when graphs 122 are merged in a manner that is computationally more efficient than conventional approaches. In some examples, the search and index tier 101 includes an indexing strategy selector 114 configured to determine an indexing strategy. For example, as shown in FIG. 4, the indexing strategy selector 114 may receive system constraints 178 and vector segments 131 to be merged and identify an index strategy 180 among a plurality of indexing strategies 180. The search system 100 may merge the vector segments 131 according to the selected index strategy 180, thereby generating a merged segment 131a.


In general, an indexing strategy 180 (e.g., an optimal indexing strategy) may change as the amount of data increases. For example, merging graphs 122 (e.g., HNSW graph) may be computationally expensive (e.g., requiring a higher amount of computing resources) whereas merging unordered lists of vectors is less (e.g., significantly less) computationally expensive. Furthermore, if the number of vectors 124 is relatively low, a brute force search can be as efficient as graph search because the brute force search has a locality of reference that is advantageous as compared to a graph search. In some examples, the search system 100 may delay building a graph structure until a brute force search becomes too expensive. In some examples, as further discussed below, the indexing strategy selector 114 may adjust an indexing strategy 180 based on the data characteristics. In some examples, the indexing strategy selector 114 may adjust the indexing strategy 180 when merging immutable segments (e.g., before, during, or after merging immutable segments).


An optimal indexing strategy 180 may depend on one or more system constraints 178. The system constraints 178 include the required queries per second (QPS), maximum latency, memory availability and/or data scale (e.g., the number of vectors and their dimension). For example, graph based data structures may allow for fast queries at large scale but may be slow to build, and, in some examples, may require that each vector be stored in memory because they read vectors at random. Quantizing vectors may reduce memory but may require additional retrieval and ranking of vectors 124 to achieve high recall, which may slow down queries particularly if they require relatively high recall. Storing vectors 124 in floating point precision and/or as a contiguous block of memory address (flat in memory) may provide fast times for building and merging vectors 124 but may require a relatively large amount of memory space and the query time may be linear in the index size. If a vector index is built sequentially in immutable segments, when segments are merged, a system can modify the way the vectors 124 are indexed without incurring additional overhead (e.g., the index may be rebuilt at these points anyway).


The search system may define an indexing strategy preference based on the following list of index strategies 180: 1) flat floating point vectors, denoted S(flat, f32); 2) flat b-bit scalar quantized, with higher bit counts preferred, S(flat, ib); 3) HNSW floating point vectors, S(HNSW, f32); 4) HNSW b-bit scalar quantized, with higher bit counts preferred, S(HNSW, ib); and 5) IVF with product quantization, S(IVF, PQ).


For each index strategy 180, the indexing strategy selector 114 may compute an implied QPS, latency, and memory usage as a function of vector count n and dimension d, which are represented by q(·|n, d), l(·|n, d), and m(·|n, d), respectively. In some examples, the QPS, latency, and memory usage are functions of the compute environment used by the index.


Typically, the characteristics of the compute environment are known in advance and can be used to parameterize these functions. Various approaches could be used for this, but in general, the indexing strategy selector 114 may learn a model (e.g., a machine-learning model) that predicts each quantity, e.g., QPS, latency, and memory usage. The query latency for S(flat, f32) may be a linear function of the product of n and d. Specifically, for the term l(S(flat, f32)|n, d) = c × nd, the indexing strategy selector 114 may estimate c for the compute environment and store that information with the index.


The system constraints 178 may specify a minimum QPS qmin, a maximum query latency lmax, and available RAM mmax. When the search system merges segments, the search system 100 has a new vector count n = Σi ni, which the indexing strategy selector 114 uses to reassess the indexing strategy 180. In some examples, the indexing strategy selector 114 may evaluate the above indexing strategies 180 in descending preference order, and, in response to achieving q(S|n, d) ≥ qmin, l(S|n, d) ≤ lmax, and m(S|n, d) ≤ mmax, the indexing strategy selector 114 may select one of the indexing strategies 180 for the merged segment 131a. If the indexing strategy selector 114 determines that the memory usage is equal to or greater than a memory usage threshold, the indexing strategy selector 114 may select an indexing strategy 180 of S(IVF, PQ). If the indexing strategy selector 114 determines that the memory usage is less than the memory usage threshold, the indexing strategy selector 114 may select an indexing strategy 180 that minimizes the percentage miss in QPS and latency.


Below is an example of an algorithm 1 that selects a vector index strategy at segment merge according to an aspect.


Algorithm 1

MERGED-SEGMENT-STRATEGY ({ni}, d)
Input:
 the segment vector counts {ni}
 the vector dimension d
Output:
 the index strategy to use for the merged segment

n ← Σi ni
miss ← ∞ // the relative miss of constraint
F ← S(IVF, PQ) // the fallback strategy
for S ∈ (S(flat, f32), S(flat, i8), S(flat, i4), S(HNSW, f32), S(HNSW, i8), S(HNSW, i4), S(IVF, PQ))
 if q(S|n, d) ≥ qmin and l(S|n, d) ≤ lmax and m(S|n, d) ≤ mmax
  return S
 if m(S|n, d) ≤ mmax
  cand ← max(0, 1 − q(S|n, d)/qmin) + max(0, l(S|n, d)/lmax − 1)
  if cand < miss
   miss ← cand
   F ← S
return F
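The selection logic of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the strategy names, the cost-model callables q, l, and m, and the exact form of the relative-miss score are assumptions standing in for quantities the indexing strategy selector 114 would estimate for its compute environment.

```python
def merged_segment_strategy(segment_counts, d, strategies, qmin, lmax, mmax):
    """Return the first strategy (in descending preference order) whose
    predicted QPS, latency, and memory satisfy the constraints; otherwise
    fall back to the in-memory-feasible strategy that minimizes the
    relative miss of the QPS and latency constraints, or IVF+PQ."""
    n = sum(segment_counts)  # vector count of the merged segment
    miss = float("inf")
    fallback = "IVF_PQ"
    for name, q, l, m in strategies:  # each of q, l, m is a function of (n, d)
        if q(n, d) >= qmin and l(n, d) <= lmax and m(n, d) <= mmax:
            return name
        if m(n, d) <= mmax:
            # relative miss of the QPS and latency constraints
            cand = max(0.0, 1 - q(n, d) / qmin) + max(0.0, l(n, d) / lmax - 1)
            if cand < miss:
                miss, fallback = cand, name
    return fallback
```

A caller would supply one (name, q, l, m) tuple per index strategy, with the cost models fitted for the local hardware as described above.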


In some examples, the search system 100 discussed herein provides an algorithm to efficiently search multiple graphs 122 (e.g., HNSW graphs) at least partially in parallel. For example, multiple graphs 122 may represent an index or multiple indexes (e.g., index information 120). A first graph may include a number of nodes, node attributes, and edges and their properties. A second graph may include a number of nodes, node attributes, and edges and their properties. For example, the search and index tier may include a multigraph searcher 104 configured to efficiently execute a search on multiple graphs 122. In some examples, the search system 100 includes a graph merging engine 116 configured to efficiently merge multiple graphs 122 (e.g., HNSW graphs). Searching a single graph may be more efficient in terms of computing power than searching n graphs independently because of a search pruning heuristic used by a graph. If a direction in the graph is not competitive with the kth match (found so far), in some examples, a search system may not explore that direction. With one graph, a global top-k set is used. With multiple graphs, a top-k set per graph is used. In some examples, where an adversary has arranged the data between segments by clustering, top-k matches (e.g., all top-k matches) may come from one graph, and using n graphs may result in using n times more computing power than a single graph.


The multigraph searcher 104 may implement techniques that share state information between graphs 122 to reduce the cost of searching non-competitive directions in a given graph 122. FIG. 5 illustrates an example of the multigraph searcher 104. For example, in operation 111, the multigraph searcher 104 receives a user query. In operation 113, a query is received by a number of graphs 122. In operation 115, the multigraph searcher 104 may execute a search (simultaneously) on each graph 122. In operation 117, at fixed intervals, each search broadcasts its progress. In operation 119, each graph 122 updates how it searches based on information received.


For example, a search (e.g., a HNSW search) may traverse (e.g., only traverse) edges of the graph 122 if their end vertex's vector is competitive. In some examples, a search layer routine may obtain and store a set of matches of size ef (e.g., the closest neighbors to the query). If the end vertex of an edge is further from the query vector 176 than the furthest vector in this set, the search system 100 may terminate searching the graph 122 in that direction. The parameter ef may control exploration in the search. At each step, since a search system 100 may select the closest unvisited vertex to the query, a value of one may indicate the search explores the path along which distance to the query decreases fastest (e.g., a fully greedy algorithm). Conversely, a value of infinity (e.g., ∞) may indicate that a search system 100 may eventually explore the whole graph. The smaller the value of ef, the faster the search, but the greater the chance the search becomes stuck in a local minimum. In some examples, there is no approximation bound associated with a stopping condition used by HNSW. In some examples, the search system 100 reaches the global nearest vectors with high probability for a suitable value of ef.


As discussed herein, the multigraph searcher 104 may dynamically reduce ef for graphs 122 where the search is not currently globally competitive. In some examples, the multigraph searcher 104 may manage two priority queues, e.g., a first priority queue and a second priority queue. The first priority queue may be shorter than the second priority queue. In some examples, the multigraph searcher 104 may prune globally non-competitive matches using the shorter queue. In some examples, the multigraph searcher 104 uses a soft search stop based on the other graphs' search state.


A full layer search routine is depicted in Algorithm 2 below.



Algorithm 2

SEARCH-LAYER (q, Ep, ef, g, lf)
Input:
 Query element q
 Set of enter points Ep
 Number of nearest neighbors to q to return ef
 Greediness of globally non-competitive search g ∈ (0, 1]
 Layer number lf
Output:
 ef closest neighbors to q

V ← Ep // set of visited vertices
C ← Ep // set of candidates
W ← Ep // dynamic list of found nearest neighbors
Wg ← nearest min((1 − g)ef, |Ep|) elements from Ep to q
Wglob ← Ø // the global nearest neighbor set of size at most ef
while |C| > 0
 c ← nearest element from C to q
 f ← furthest element from W to q
 if dist(c, q) > dist(f, q)
  break
 for each e ∈ neighbourhood(c) at layer lf with e ∉ V
  V ← V ∪ {e}
  f ← furthest element from W to q
  fg ← furthest element from Wg to q
  fglob ← furthest element from Wglob to q, or a point at ∞ if Wglob is empty
  if (|W| < ef or dist(e, q) < dist(f, q)) and
    (|Wg| < (1 − g)ef or dist(e, q) < dist(fg, q) or dist(e, q) < dist(fglob, q))
   C ← C ∪ {e}
   W ← W ∪ {e}
   if dist(e, q) < dist(fg, q)
    Wg ← Wg ∪ {e}
   remove furthest element from W to q if |W| > ef
   remove furthest element from Wg to q if |Wg| > (1 − g)ef
   periodically update Wglob with the delta in W
update Wglob with the delta in W
return W
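A single-threaded sketch of this dual-queue layer search is shown below, assuming a dict-based adjacency list and treating the shared global set as a read-only list of (distance, vertex) pairs. The function and parameter names are illustrative, and a real implementation would synchronize Wglob across the concurrent per-graph searches rather than pass it in as a plain list.

```python
import heapq

def search_layer(q, enter_points, ef, g, neighbors, dist, w_glob):
    """Greedy layer search keeping the usual result set W of size ef plus a
    shorter set Wg of size (1-g)*ef that, together with the shared global
    set w_glob, prunes globally non-competitive directions."""
    visited = set(enter_points)
    cand = [(dist(p, q), p) for p in enter_points]   # min-heap of candidates
    heapq.heapify(cand)
    W = [(-d, p) for d, p in cand]                   # max-heap via negation
    heapq.heapify(W)
    g_cap = max(1, int((1 - g) * ef))                # (1-g)*ef is at least 1
    Wg = [(-d, p) for d, p in heapq.nsmallest(g_cap, cand)]
    heapq.heapify(Wg)
    while cand:
        dc, c = heapq.heappop(cand)
        if dc > -W[0][0]:
            break                                    # no competitive candidate left
        for e in neighbors[c]:
            if e in visited:
                continue
            visited.add(e)
            de = dist(e, q)
            f = -W[0][0]                             # furthest distance in W
            fg = -Wg[0][0]                           # furthest distance in Wg
            fglob = max((d for d, _ in w_glob), default=float("inf"))
            if (len(W) < ef or de < f) and (
                    len(Wg) < g_cap or de < fg or de < fglob):
                heapq.heappush(cand, (de, e))
                heapq.heappush(W, (-de, e))
                if de < fg:
                    heapq.heappush(Wg, (-de, e))
                if len(W) > ef:
                    heapq.heappop(W)
                if len(Wg) > g_cap:
                    heapq.heappop(Wg)
    return sorted((-d, p) for d, p in W)
```

With g close to 1, Wg shrinks toward a single entry and the search in a globally non-competitive graph degenerates toward a greedy descent, which is the intended cost reduction.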

In some examples, the multigraph searcher 104 may synchronize processes (e.g., all processes) at the point at which they share information. Synchronizing the processes at the point at which they share information may provide a balance in a tradeoff in performance for the sharing frequency (e.g., sharing information more frequently may cause the graph searches to execute more efficiently but with an increase in contention).


In some examples, the multigraph searcher 104 may use message passing, where each process maintains its own copy of Wglob, an approach reminiscent of opportunistic concurrency. Messages describing the change in the nearest neighbor set are periodically broadcast from each process to all others, at iterations p·i for i ∈ [n] and period p, in the search loop. These messages are tagged with their period i. A special search-complete message is passed when a search exits. Each loop starts to use information with tag i only after it reaches iteration p·i + ⌈fp⌉ for f ∈ [0, 1). If a process has not received information from all processes tagged with either period i or end of search when it reaches iteration p·i + ⌈fp⌉, it waits until it does. Provided the discrepancy in progress is always less than ⌈fp⌉, no process should ever wait. By only counting iterations that do not skip visited vertices, each loop iteration performs approximately the same amount of work. In some examples, each process may share new information with the others as soon as it is available by updating Wglob when it finds a new nearer element. Here, (1 − g)ef is at least 1.
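The gating rule above can be expressed as a small helper. This is a hedged sketch: it assumes broadcasts happen at iterations p·i for tags i = 1, 2, …, which is one plausible reading of the scheme; the function name is illustrative.

```python
import math

def highest_usable_tag(iteration, p, f):
    """Highest broadcast tag i a search loop may consume at this iteration,
    under the rule that information tagged i (broadcast at iteration p*i)
    may be used only from iteration p*i + ceil(f*p) onward. Returns 0 if
    no tag is usable yet."""
    return max(0, (iteration - math.ceil(f * p)) // p)
```

A process at iteration p·i + ⌈fp⌉ would block only if some peer's tag-i message (or search-complete message) has not yet arrived, which, per the analysis above, should not occur while the progress discrepancy stays below ⌈fp⌉.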


The graph merging engine 116 may reduce the cost of merges by using the small graph's connectivity to avoid inserting vectors 124 in the small graph into the large one. FIGS. 6A and 6B illustrate an example of a graph merging engine 116. In operation 121, the graph merging engine 116 receives n graphs 122 to merge, and, in operation 123, the graph merging engine 116 selects two graphs, e.g., graph 122-1 (G1) and graph 122-2 (G2). The graph 122-1 includes vectors 124x. The graph 122-2 includes vectors 124y. In operation 125, the graph merging engine 116 adds a subset M of vectors 124y from the graph 122-2 to the graph 122-1. In operation 127, for a vector 124y-1 (e.g., each vector) in the graph 122-2 not added to graph 122-1, the graph merging engine 116 uses neighbors of the vector 124y-1 in the subset M to find neighbor vectors 124z in the graph 122-1 and the graph 122-2. Operation 129 includes updating neighbor vectors with vectors found in operation 127. The neighbor vectors 124z are added to the graph 122-1, and then the graph 122-1 is outputted as a merged graph 122a.


For example, the graph merging engine 116 may implement a fast merge algorithm for two graphs Gs and Gl. The layer assignment of vectors 124 in a graph 122 (e.g., every graph 122) follows the same distribution or substantially the same distribution. Therefore, the search system 100 may retain the current layer assignment of vertices in both graphs 122. In some examples, it is assumed without loss of generality that |Gs| ≤ |Gl|. In some examples, it may be more efficient to merge the small graph into the large one. In some examples, adding vertices in Gs one at a time has complexity O(|Gs| log(|Gl|)).


The graph merging engine 116 may merge two single layer or NSW graphs 122 based on a location of where to insert a vertex v. For example, if the graph merging engine 116 determines the location of where to insert a vertex v, the graph merging engine 116 can estimate locations of where to insert its neighbor vectors 124z. The neighbors of a vertex v in a graph Gi may be denoted by Ni(v), and Gs also denotes the vertex set of Gs.


The graph merging engine 116 may proceed with the following stages. In a first stage (step 1), the graph merging engine 116 finds J ⊂ Gs, which is inserted into the corresponding layer of Gl using COMPUTE-JOIN-SET (described below). In a second stage (step 2), the graph merging engine 116 inserts each vertex v ∈ J into Gl. In a third stage (step 3), for each vertex u ∈ Gs \ J, the graph merging engine 116 finds J ∩ Ns(u), finds Eu = ∪v∈J∩Ns(u) Nl(v), sets C = Ns(u) ∪ Eu, and runs SELECT-NEIGHBORS-HEURISTIC for u using this candidate set.
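The three stages can be sketched for a single-layer graph as follows. This is an illustration under stated assumptions, not the patented implementation: graphs are dicts mapping vertex ids to neighbor sets, the join set J is supplied by the caller (e.g., from COMPUTE-JOIN-SET), and the full graph search of step 2 and SELECT-NEIGHBORS-HEURISTIC of step 3 are both replaced by a brute-force nearest-M link purely for readability.

```python
def merge_nsw(Gs, Gl, J, coords, M, dist):
    """Merge small graph Gs into large graph Gl (mutated in place).
    coords maps vertex ids to vectors; dist is the metric."""
    def link(v, candidates):
        # connect v to its M nearest candidates (stand-in for a proper
        # neighbor-selection heuristic)
        near = sorted(candidates, key=lambda u: dist(coords[u], coords[v]))[:M]
        Gl.setdefault(v, set()).update(near)
        for u in near:
            Gl.setdefault(u, set()).add(v)

    for v in sorted(J):                  # step 2: full insert of the join set
        link(v, [u for u in Gl if u != v])
    for u in sorted(set(Gs) - J):        # step 3: candidate-based insert
        cand = set(Gs[u])                # neighbors in the small graph...
        for v in J & Gs[u]:              # ...plus their neighborhoods in Gl
            cand |= Gl[v]
        cand.discard(u)
        link(u, cand)
    return Gl
```

The key saving is in step 3: the candidate set is assembled from the already-inserted J-neighbors' adjacency lists, so no full graph search over Gl is needed for the bulk of Gs.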


For every vertex u ∈ Gs \ J, the graph merging engine 116 has |J ∩ Ns(u)| ≥ k for some small k < M, where M is the layer connectivity. In some examples, the graph merging engine 116 uses M ∈ [5, 48], with higher values better for high dimensional data; the default Lucene chooses is M = 16. (Note that the bottom layer has vertices of up to twice this degree.) In some examples, k = 4.


The graph merging engine 116 may select vertices in J in a manner to achieve roughly E[|J ∩ Ns(u)|] = M|J|/(|Gs| − |J|). The graph merging engine 116 may expect

|J| = (k/(M + k))|Gs|,

which may allow the analysis of the complexity of the above approach.


In some examples, step 1 examines (e.g., only examines) the graph connectivity and computes no dot products. The graph merging engine 116 may use one or more different strategies, including a simple greedy approach discussed below, which may be around O(|J|log(|Gs|)).


In some examples, for step 2, the graph merging engine 116 performs (k/(M + k))|Gs| = (4/20)|Gs| = |Gs|/5 inserts into Gl, but with a reduced computational cost.


In some examples, step 3 has complexity O(|Gs|), since |Gs \ J| = (4/5)|Gs|. In some examples, the graph merging engine 116 may iterate over u ∈ Gs \ J using a graph traversal, which may provide better locality of reference than step 2 (e.g., since neighbors are likely to share neighbors), thereby providing increased cache performance and lower constants. Provided the constants for steps 1 and 3 are less than those of step 2, this process yields an increased speed as compared to some conventional approaches.


With respect to step 1, the graph merging engine 116 may select vertices and increment a count at their neighbors, which are not already in J (e.g., instead of selecting edges). If the count at a vertex v is denoted by c(v), then the gain from adding a vertex to J is:



gain(v) = max(k − c(v), 0) + Σu∈Ns(v)\J 1{c(u) < k}   Eq. (14)


The notation 1{·} denotes the indicator function. Note that this includes the change in the count of the vertex added to J, e.g., max(k − c(v), 0). FIG. 6C illustrates a gain calculation from adding a vertex to the join set.


Each vertex has the following state: 1) whether it is stale, 2) its gain, 3) the count of adjacent vertices not in J, 4) its count, and 5) a random number in the range [0, 1], which is used for tie breaking. In some examples, vertices' states (e.g., all vertices' states) are initialized with "is stale" false, gain equal to "vertex degree" + k, count of adjacent vertices not in J equal to "vertex degree", and count equal to 0. The vertices are inserted into a priority queue ordered by gain (with ties broken first by count of adjacent vertices not in J, then at random). For each iteration, a vertex v is popped. If the gain is not stale, v is added to J and its neighbors' counts are incremented by 1; otherwise its gain is recomputed, and it is added back into the priority queue if its gain is greater than zero. Then, all neighbors of a vertex whose count increased to k in the iteration are marked as stale.


An algorithm 3 is provided below.



Algorithm 3

COMPUTE-JOIN-SET (Gs, k)
Input:
 Small graph to join Gs
 Required coverage k
Output:
 the vertices in Gs to join J

J ← Ø
C ← Ø
for v in Gs
 C ← C ∪ {(false, deg(v) + k, deg(v), 0, rand(0, 1))}
gtot ← 0
while gtot < k|Gs|
 v ← maximum gain node in C (see tie breaking logic above)
 C ← C \ {v}
 if gain of v is not stale
  J ← J ∪ {v}
  gtot ← gtot + gain of v
  mark neighbors of v stale if c(v) < k
  for u ∈ Ns(v)
   mark neighbors of u stale if c(u) = k − 1
   c(u) ← c(u) + 1
 else
  gain ← max(k − c(v), 0) + Σu∈Ns(v)\J 1{c(u) < k}
  if gain > 0
   C ← C ∪ {(false, gain, |Ns(v) \ J|, c(v), rand(0, 1))}
return J
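Algorithm 3 can be sketched in Python with a lazy priority queue. This is a hedged illustration: it marks staleness slightly more aggressively than the pseudocode above (every neighbor of a newly joined vertex is flagged) and recomputes the exact gain when a vertex is accepted, both for simplicity and safety with duplicate heap entries.

```python
import heapq
import random

def compute_join_set(Gs, k):
    """Greedily pick a join set J for the small graph Gs (a dict mapping
    each vertex to its neighbor set) so that every vertex is in J or has
    at least k neighbors in J, using lazily recomputed ("stale") gains."""
    count = {v: 0 for v in Gs}              # c(v): neighbors of v already in J
    stale = {v: False for v in Gs}
    J = set()
    # max-heap via negation: (gain desc, uncovered-degree desc, random tiebreak)
    heap = [(-(len(Gs[v]) + k), -len(Gs[v]), random.random(), v) for v in Gs]
    heapq.heapify(heap)
    gtot, target = 0, k * len(Gs)
    while gtot < target and heap:
        _, _, _, v = heapq.heappop(heap)
        if v in J:
            continue
        if not stale[v]:
            # recompute the exact gain for the running total (Eq. 14)
            gtot += max(k - count[v], 0) + sum(
                1 for u in Gs[v] if u not in J and count[u] < k)
            J.add(v)
            for u in Gs[v]:
                stale[u] = True             # v joined J: neighbor gains shrink
                if count[u] == k - 1:       # u becomes covered, so its
                    for w in Gs[u]:         # neighbors' gains shrink too
                        stale[w] = True
                count[u] += 1
        else:
            stale[v] = False
            gain = max(k - count[v], 0) + sum(
                1 for u in Gs[v] if u not in J and count[u] < k)
            if gain > 0:
                heapq.heappush(
                    heap, (-gain, -len(Gs[v] - J), random.random(), v))
    return J
```

Because gains only decrease, a popped entry that is not stale still carries the maximum gain among the remaining vertices, so the greedy choice stays valid without re-sorting the whole queue.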

The graph merging engine 116 may determine and store the total gain to determine when to stop. In response to the total gain being equal to a threshold level (e.g., k|Gs|), the graph merging engine 116 determines to stop the compute-join-set operation. In some examples, the graph merging engine 116 does not add a vertex to J with zero gain. Because gains only decrease, stale gains are an upper bound on the vertices' current gains, and the first vertex popped that is not stale has the highest gain of any remaining vertex. This has complexity at least O(|J|log(|Gs|)), but the constants are different from those of step 2 and may be lower.


The search system 100 may include one or more processors and one or more memory devices. The processor(s) may be formed in a substrate configured to execute one or more machine executable instructions or pieces of software, firmware, or a combination thereof. The processor(s) can be semiconductor-based—that is, the processors can include semiconductor material that can perform digital logic. The memory device(s) may include a main memory that stores information in a format that can be read and/or executed by the processor(s). The memory device(s) may store the search and index tier 101, which, when executed by the processor(s), performs certain operations discussed herein. In some examples, the memory device(s) includes a non-transitory computer-readable medium that includes executable instructions that cause at least one processor to execute operations.


In some examples, the search system 100 may execute on one or more server computers 160. The client devices 130 may communicate with the search system 100 over a network. The server computer(s) 160 may be computing devices that take the form of a number of different devices, for example, a standard server, a group of such servers, or a rack server system. In some examples, the server computer may be a single system sharing components such as processors and memories. The network may include the Internet and/or other types of data networks, such as a local area network (LAN), a wide area network (WAN), a cellular network, satellite network, or other types of data networks. The network may also include any number of computing devices (e.g., computers, servers, routers, network switches, etc.) that are configured to receive and/or transmit data within the network. The network may further include any number of hardwired and/or wireless connections.



FIG. 7 is a flowchart 700 depicting example operations of a system for merging graphs according to an aspect. The flowchart 700 may depict operations of a computer-implemented method. Although the flowchart 700 of FIG. 7 illustrates the operations in sequential order, it will be appreciated that this is merely an example, and that additional or alternative operations may be included. Further, operations of FIG. 7 and related operations may be executed in a different order than that shown, or in a parallel or overlapping fashion.


Operation 702 includes selecting, by a search system, a first graph and a second graph to merge. Operation 704 includes adding a subset of vectors from the second graph to the first graph. Operation 706 includes, for a first vector in the second graph not added to the first graph, identifying first neighbor vectors in at least one of the first graph or the second graph using second neighbor vectors of the first vector included in the subset of vectors. Operation 708 includes generating a merged graph by adding the first neighbor vectors to the first graph.


Clause 1. A method comprising: identifying, using an optimization solver, one or more quantization constants for quantizing a vector in a graph of a search system; generating a quantized vector of the vector using the quantization constants.


Clause 2. A method comprising: computing a similarity distance between a first quantized vector and a second quantized vector stored in a search system; estimating an error approximation based on metadata associated with the first quantized vector and metadata associated with the second quantized vector; and updating the similarity distance based on the error approximation.


Clause 3. A method comprising: receiving vector segments to merge; receiving one or more system constraints; identifying an indexing strategy based on the one or more system constraints; and generating a merged segment for the vector segments using the indexing strategy.


Clause 4. A method comprising: receiving a user query for a vector search on a first graph and a second graph; executing a first search on the first graph and a second search on the second graph; at an interval, broadcasting a first message about the first search to the second graph and broadcasting a second message about the second search to the first graph; and updating the first search and the second search based on the second message and the first message, respectively.


Clause 5. A method comprising: identifying a first graph and a second graph to merge; adding a subset of vectors from the second graph to the first graph; for a first vector in the second graph not added to the first graph, identify neighbor vectors in the first graph and the second graph using neighbors of the first vector included in the subset; and adding the neighbor vectors to the first graph.


Clause 1. A method comprising: selecting, by a search system, a first graph and a second graph to merge; adding a subset of vectors from the second graph to the first graph; for a first vector in the second graph not added to the first graph, identifying first neighbor vectors in at least one of the first graph or the second graph using second neighbor vectors of the first vector included in the subset of vectors; and generating a merged graph by adding the first neighbor vectors to the first graph.


Clause 2. The method of clause 1, further comprising: in response to a user query received from a user device, searching the merged graph for data that is responsive to the user query.


Clause 3. The method of clause 1, wherein the second graph is a size that is smaller than the first graph.


Clause 4. The method of clause 1, wherein the first graph represents first indexing information, and the second graph represents second indexing information.


Clause 5. The method of clause 1, wherein the first neighbor vectors are identified without using dot product calculations.


Clause 6. The method of clause 1, further comprising: computing a gain value of a vector from the subset of vectors that was added to the first graph; and determining whether to add a subsequent vector to the first graph based on the gain value.


Clause 7. An apparatus comprising: at least one processor; and a non-transitory computer-readable medium storing executable instructions that when executed by the at least one processor cause the at least one processor to: select, by a search system, a first graph and a second graph to merge; add a subset of vectors from the second graph to the first graph; for a first vector in the second graph not added to the first graph, identify first neighbor vectors in at least one of the first graph or the second graph using second neighbor vectors of the first vector included in the subset of vectors; and generate a merged graph by adding the first neighbor vectors to the first graph.


Clause 8. The apparatus of clause 7, wherein the executable instructions include instructions that cause the at least one processor to: in response to a user query received from a user device, search the merged graph for data that is responsive to the user query.


Clause 9. The apparatus of clause 7, wherein the second graph is a size that is smaller than the first graph.


Clause 10. The apparatus of clause 7, wherein the first graph represents first indexing information, and the second graph represents second indexing information.


Clause 11. The apparatus of clause 7, wherein the first neighbor vectors are identified without using dot product calculations.


Clause 12. The apparatus of clause 7, wherein the executable instructions include instructions that cause the at least one processor to: compute a gain value of a vector from the subset of vectors that was added to the first graph; and determine whether to add a subsequent vector to the first graph based on the gain value.


Clause 13. A non-transitory computer-readable medium storing executable instructions that cause at least one processor to execute operations, the operations comprising: selecting, by a search system, a first graph and a second graph to merge; adding a subset of vectors from the second graph to the first graph; for a first vector in the second graph not added to the first graph, identifying first neighbor vectors in at least one of the first graph or the second graph using second neighbor vectors of the first vector included in the subset of vectors; and generating a merged graph by adding the first neighbor vectors to the first graph.


Clause 14. The non-transitory computer-readable medium of clause 13, wherein the operations further comprise: in response to a user query received from a user device, searching the merged graph for data that is responsive to the user query.


Clause 15. The non-transitory computer-readable medium of clause 13, wherein the second graph is a size that is smaller than the first graph.


Clause 16. The non-transitory computer-readable medium of clause 13, wherein the first graph represents first indexing information, and the second graph represents second indexing information.


Clause 17. The non-transitory computer-readable medium of clause 13, wherein the first neighbor vectors are identified without using dot product calculations.


Clause 18. The non-transitory computer-readable medium of clause 13, wherein the operations further comprise: computing a gain value of a vector from the subset of vectors that was added to the first graph; and determining whether to add a subsequent vector to the first graph based on the gain value.


Clause 19. A method comprising: receiving, from a user device, a user query for a vector search on indexing information, the indexing information including a first graph and a second graph; executing a first search on the first graph and a second search on the second graph; at an interval, broadcasting a first message about the first search to the second graph and broadcasting a second message about the second search to the first graph; and updating the first search and the second search based on the second message and the first message, respectively.


Clause 20. The method of clause 19, wherein the first search is executed at least partially in parallel with the second search.


Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.


These computer programs (also known as programs, software, software applications or code) include machine instructions for a programmable processor and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.


To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having a display device (e.g., an OLED (Organic light emitting diode) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Alternatively, this can be implemented with a 3D user interaction system making use of trackers that are tracked in orientation and 3D position. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.


The systems and techniques described here can be implemented in a computing system that includes a back end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back end, middleware, or front end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), and the Internet.


The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship with each other.


In this specification and the appended claims, the singular forms “a,” “an” and “the” do not exclude the plural reference unless the context clearly dictates otherwise. Further, conjunctions such as “and,” “or,” and “and/or” are inclusive unless the context clearly dictates otherwise. For example, “A and/or B” includes A alone, B alone, and A with B. Further, connecting lines or connectors shown in the various figures presented are intended to represent example functional relationships and/or physical or logical couplings between the various elements. Many alternative or additional functional relationships, physical connections or logical connections may be present in a practical device. Moreover, no item or component is essential to the practice of the implementations disclosed herein unless the element is specifically described as “essential” or “critical”.


Terms such as, but not limited to, approximately, substantially, generally, etc. are used herein to indicate that a precise value or range thereof is not required and need not be specified. As used herein, the terms discussed above will have ready and instant meaning to one of ordinary skill in the art. Moreover, terms such as up, down, top, bottom, side, end, front, back, etc. are used herein with reference to a currently considered or illustrated orientation. If they are considered with respect to another orientation, it should be understood that such terms must be correspondingly modified.


Although certain example methods, apparatuses and articles of manufacture have been described herein, the scope of coverage of this patent is not limited thereto. It is to be understood that terminology employed herein is for the purpose of describing particular aspects and is not intended to be limiting. On the contrary, this patent covers all methods, apparatus and articles of manufacture fairly falling within the scope of the claims of this patent.

Claims
  • 1. A method comprising: selecting, by a search system, a first graph and a second graph to merge; adding a subset of vectors from the second graph to the first graph; for a first vector in the second graph not added to the first graph, identifying first neighbor vectors in at least one of the first graph or the second graph using second neighbor vectors of the first vector included in the subset of vectors; and generating a merged graph by adding the first neighbor vectors to the first graph.
  • 2. The method of claim 1, further comprising: in response to a user query received from a user device, searching the merged graph for data that is responsive to the user query.
  • 3. The method of claim 1, wherein the second graph is a size that is smaller than the first graph.
  • 4. The method of claim 1, wherein the first graph represents first indexing information, and the second graph represents second indexing information.
  • 5. The method of claim 1, wherein the first neighbor vectors are identified without using dot product calculations.
  • 6. The method of claim 1, further comprising: computing a gain value of a vector from the subset of vectors that was added to the first graph; and determining whether to add a subsequent vector to the first graph based on the gain value.
  • 7. An apparatus comprising: at least one processor; and a non-transitory computer-readable medium storing executable instructions that when executed by the at least one processor cause the at least one processor to: select, by a search system, a first graph and a second graph to merge; add a subset of vectors from the second graph to the first graph; for a first vector in the second graph not added to the first graph, identify first neighbor vectors in at least one of the first graph or the second graph using second neighbor vectors of the first vector included in the subset of vectors; and generate a merged graph by adding the first neighbor vectors to the first graph.
  • 8. The apparatus of claim 7, wherein the executable instructions include instructions that cause the at least one processor to: in response to a user query received from a user device, search the merged graph for data that is responsive to the user query.
  • 9. The apparatus of claim 7, wherein the second graph is a size that is smaller than the first graph.
  • 10. The apparatus of claim 7, wherein the first graph represents first indexing information, and the second graph represents second indexing information.
  • 11. The apparatus of claim 7, wherein the first neighbor vectors are identified without using dot product calculations.
  • 12. The apparatus of claim 7, wherein the executable instructions include instructions that cause the at least one processor to: compute a gain value of a vector from the subset of vectors that was added to the first graph; and determine whether to add a subsequent vector to the first graph based on the gain value.
  • 13. A non-transitory computer-readable medium storing executable instructions that cause at least one processor to execute operations, the operations comprising: selecting, by a search system, a first graph and a second graph to merge; adding a subset of vectors from the second graph to the first graph; for a first vector in the second graph not added to the first graph, identifying first neighbor vectors in at least one of the first graph or the second graph using second neighbor vectors of the first vector included in the subset of vectors; and generating a merged graph by adding the first neighbor vectors to the first graph.
  • 14. The non-transitory computer-readable medium of claim 13, wherein the operations further comprise: in response to a user query received from a user device, searching the merged graph for data that is responsive to the user query.
  • 15. The non-transitory computer-readable medium of claim 13, wherein the second graph is a size that is smaller than the first graph.
  • 16. The non-transitory computer-readable medium of claim 13, wherein the first graph represents first indexing information, and the second graph represents second indexing information.
  • 17. The non-transitory computer-readable medium of claim 13, wherein the first neighbor vectors are identified without using dot product calculations.
  • 18. The non-transitory computer-readable medium of claim 13, wherein the operations further comprise: computing a gain value of a vector from the subset of vectors that was added to the first graph; and determining whether to add a subsequent vector to the first graph based on the gain value.
  • 19. A method comprising: receiving, from a user device, a user query for a vector search on indexing information, the indexing information including a first graph and a second graph; executing a first search on the first graph and a second search on the second graph; at an interval, broadcasting a first message about the first search to the second graph and broadcasting a second message about the second search to the first graph; and updating the first search and the second search based on the second message and the first message, respectively.
  • 20. The method of claim 19, wherein the first search is executed at least partially in parallel with the second search.
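The merge recited in claims 1 and 6 can be read as: add a subset of the second graph's vectors to the first graph, then connect each remaining vector through the neighbors of its already-added neighbors rather than through fresh distance (e.g., dot product) computations. The sketch below illustrates one possible reading of that procedure; the function name `merge_graphs`, the adjacency-dict representation, the random choice of the subset, and the assumption that the two graphs have disjoint vector ids are all illustrative and are not taken from the application.

```python
import random

def merge_graphs(first, second, subset_frac=0.5, seed=0):
    """Sketch of the claimed merge. `first` and `second` are adjacency
    dicts mapping a vector id to a set of neighbor ids; vector ids of
    the two graphs are assumed disjoint for simplicity."""
    rng = random.Random(seed)
    merged = {v: set(nbrs) for v, nbrs in first.items()}

    # Step 1: add a subset of vectors from the second graph directly.
    second_ids = list(second)
    k = max(1, int(len(second_ids) * subset_frac))
    subset = set(rng.sample(second_ids, k))
    for v in subset:
        merged.setdefault(v, set())
        # Link each subset vector to its second-graph neighbors that are
        # already present in the merged graph.
        for n in second[v]:
            if n in merged:
                merged[v].add(n)
                merged[n].add(v)

    # Step 2: for each remaining vector, derive candidate neighbors from
    # the already-added subset neighbors and their merged-graph neighbors
    # instead of recomputing distances.
    for v in second_ids:
        if v in subset:
            continue
        candidates = set()
        for n in second[v]:        # v's neighbors in the second graph
            if n in subset:        # ...that were added to the first graph
                candidates.add(n)
                candidates |= merged[n]
        candidates.discard(v)
        merged[v] = candidates
        for c in candidates:       # keep adjacency symmetric
            merged[c].add(v)
    return merged
```

A production merge would also rank the candidates (e.g., claim 6's gain value) before linking; the sketch simply accepts all of them.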
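Claims 19 and 20 describe searches on multiple graphs run at least partially in parallel, with each search periodically broadcasting a message that the other searches use to update themselves. Below is a minimal single-threaded sketch that interleaves one best-first expansion per graph and, at a fixed interval, broadcasts the best distance found so far so each frontier can be pruned. The function name `multi_graph_search`, the lock-step scheduling, and the 1.5x pruning slack are assumptions for illustration, not details from the application.

```python
import heapq

def multi_graph_search(graphs, query, entry_points, vectors, interval=2, steps=20):
    """Interleave one best-first expansion per graph per round; every
    `interval` rounds, broadcast the globally best distance so each
    search can prune frontier entries unlikely to beat it."""
    def dist(a, b):  # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))

    frontiers, bests, visited = [], [], []
    for ep in entry_points:
        d = dist(query, vectors[ep])
        frontiers.append([(d, ep)])
        bests.append((d, ep))
        visited.append({ep})

    for step in range(steps):
        # One expansion per graph, modeling the partially parallel searches.
        for i, g in enumerate(graphs):
            if not frontiers[i]:
                continue
            _, v = heapq.heappop(frontiers[i])
            for n in g[v]:
                if n in visited[i]:
                    continue
                visited[i].add(n)
                dn = dist(query, vectors[n])
                if dn < bests[i][0]:
                    bests[i] = (dn, n)
                heapq.heappush(frontiers[i], (dn, n))
        # Broadcast step: share the best distance found by any search.
        if step % interval == interval - 1:
            global_best = min(b[0] for b in bests)
            # Prune entries that cannot plausibly improve on the shared
            # best (the 1.5x slack is an arbitrary illustrative choice).
            frontiers = [[e for e in f if e[0] <= global_best * 1.5]
                         for f in frontiers]
            for f in frontiers:
                heapq.heapify(f)
    return min(bests)  # (distance, vector id) of the best hit overall
```

In a deployed system the per-graph searches would run on separate threads or machines and the broadcast would be an actual message exchange; the interleaved loop here only models the interval-based sharing.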
CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of and priority to U.S. Provisional Application No. 63/620,435, filed Jan. 12, 2024, which is incorporated by reference herein in its entirety.

Provisional Applications (1)
Number Date Country
63620435 Jan 2024 US