## Cosine similarity on a subset of documents for multipass search

How do you calculate cosine similarity on a subset of vectors in a vector index?

Each pass narrows the candidate set while the vectors get larger:

Documents: 1M -> 10,000 -> 100 -> 10

Vector dimension (D): 64D -> 128D -> 256D -> 512D
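To make the funnel concrete, here is a minimal NumPy sketch of the idea. It assumes every pass has its own embedding matrix whose rows are aligned by document ID; the names `cosine_scores`, `multipass_search`, `embeddings`, and `keep` are hypothetical, and the sizes are scaled down so it runs quickly. It is brute force on purpose, not a replacement for an ANN index.

```python
import numpy as np

def cosine_scores(query, vectors):
    """Cosine similarity between one query vector and each row of `vectors`."""
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return v @ q

def multipass_search(queries, embeddings, keep):
    """Narrow the candidate set pass by pass.

    queries[i]    : query vector for pass i
    embeddings[i] : (n_docs, dim_i) matrix for pass i (hypothetical layout,
                    rows aligned by document ID across passes)
    keep[i]       : number of candidates that survive pass i
    """
    candidates = np.arange(embeddings[0].shape[0])  # start with every doc ID
    for q, emb, k in zip(queries, embeddings, keep):
        scores = cosine_scores(q, emb[candidates])  # score only the survivors
        top = np.argsort(-scores)[:k]               # best k positions in this pass
        candidates = candidates[top]                # map positions back to doc IDs
    return candidates

# Scaled-down stand-in for 1M -> 10,000 -> 100 -> 10 with growing dimensions.
rng = np.random.default_rng(0)
dims = (64, 128, 256)
embeddings = [rng.normal(size=(10_000, d)) for d in dims]
queries = [rng.normal(size=d) for d in dims]
top_ids = multipass_search(queries, embeddings, keep=(1_000, 100, 10))
```

The only step that touches a "subset of an index" is `emb[candidates]`, which is exactly the operation the question is about.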

FAISS and Annoy don't support restricting a search to a subset of vectors.

You can filter by ID in Elasticsearch and then run a cosine similarity query on the matches, but should you?
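For reference, the Elasticsearch route could look roughly like the request body below: an `ids` query picks the subset and a `script_score` query re-scores it with `cosineSimilarity`. This is a sketch that only builds the JSON body. The helper name `subset_cosine_query` and the field name `embedding` are assumptions, the field is assumed to be mapped as `dense_vector`, and the exact script syntax depends on your Elasticsearch version.

```python
import json

def subset_cosine_query(doc_ids, query_vector, vector_field="embedding", size=10):
    """Build a search body that scores only `doc_ids` by cosine similarity.

    `vector_field` is a hypothetical dense_vector field name.
    """
    return {
        "size": size,
        "query": {
            "script_score": {
                # Pass 1 survivors: restrict scoring to these document IDs.
                "query": {"ids": {"values": list(doc_ids)}},
                "script": {
                    # +1.0 keeps the score non-negative, which Elasticsearch requires.
                    "source": f"cosineSimilarity(params.query_vector, '{vector_field}') + 1.0",
                    "params": {"query_vector": list(query_vector)},
                },
            }
        },
    }

body = subset_cosine_query(["doc-1", "doc-7"], [0.1, 0.2, 0.3])
print(json.dumps(body, indent=2))
```

The script runs once per filtered document, so the cost grows with the size of the subset and the vector dimension, which is where the "should you?" comes from.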

One option: run the first pass over the 1M docs in FAISS with ANN (approximate nearest neighbours), then run the remaining passes in Elasticsearch.

Is a NumPy masked array a solution for this?
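One way to probe the masked-array question is to compare it with plain integer indexing. In the sketch below (function names and sizes are made up for illustration), the masked version still computes a score for every row and only hides the unwanted ones, while the indexed version does work proportional to the subset. Treat it as an experiment to run, not a verdict.

```python
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.normal(size=(1000, 64))
query = rng.normal(size=64)
subset_ids = np.array([3, 17, 256, 999])  # sorted candidate IDs from an earlier pass

def cosine_by_index(query, vectors, ids):
    """Gather the selected rows first, then score only those."""
    sub = vectors[ids]
    return (sub @ query) / (np.linalg.norm(sub, axis=1) * np.linalg.norm(query))

def cosine_by_masked_array(query, vectors, ids):
    """Score every row, then hide everything outside `ids` behind a mask."""
    mask = np.ones(len(vectors), dtype=bool)  # True means "masked out" in numpy.ma
    mask[ids] = False
    dots = vectors @ query                    # full-size computation
    norms = np.linalg.norm(vectors, axis=1) * np.linalg.norm(query)
    return np.ma.masked_array(dots / norms, mask=mask)

by_index = cosine_by_index(query, vectors, subset_ids)
by_mask = cosine_by_masked_array(query, vectors, subset_ids)

# Both give the same scores for the subset; only the amount of work differs.
print(np.allclose(by_index, by_mask.compressed()))
```

For the later passes, where only 100 or 10 documents are left, gathering the rows by index and scoring them directly looks like the simpler option.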

Let me know your thoughts on Twitter!

January 14th 2021
