Introduction
Search Q&A can be used to answer natural language questions about your data.
Forefront provides a Search Q&A deployment that is optimized to answer natural language questions from a large amount of text.
This deployment provides a text embedding and Q&A (question answering) API that enables you to ask a question about a set of text and receive a natural language answer. This is helpful for chatbots, search results, and any use case that requires navigating text.
Most real-world question answering use cases require searching large amounts of text for relevant information, much more than you'd want to fit into a single API call. To filter out the noise, it is helpful to use a search engine to retrieve the top search results before passing them into the Q&A model.
Semantic search is an effective search method because it attempts to find search results based on "meaning" instead of simply finding matching keywords. It accomplishes this by first converting documents into vectors, and returns results based on finding the closest matching vectors.
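The "closest matching vectors" idea comes down to a distance measure between embeddings. As a minimal sketch using toy 3-dimensional vectors in place of real 4096-dimensional embeddings, cosine similarity (the measure behind the cosinesimil space type used later in this guide) scores two vectors by the angle between them:

```python
import math

# cosine similarity scores two vectors by direction, ignoring magnitude;
# 1.0 means identical direction, values near 0 or below mean unrelated
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# toy 3-dimensional vectors standing in for real embeddings
query_vec = [1.0, 0.5, 0.0]
related_doc = [0.9, 0.6, 0.1]    # similar meaning, similar direction
unrelated_doc = [-0.5, 0.2, 1.0] # different meaning, different direction

print(cosine_similarity(query_vec, related_doc))   # close to 1
print(cosine_similarity(query_vec, unrelated_doc)) # much lower
```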
The Text Embedding API provides an endpoint to easily convert any text into a vector that can then be stored in a database.
Opensearch is an open source search engine that provides built-in methods for implementing semantic search. We will use Opensearch as our search engine in this guide, although there are alternatives.

Overview

This guide covers the two parts of building out the search and question answering functionality:
  1. Index and store data in Opensearch
  2. Run the Q&A workflow
Note: Opensearch is only needed if you wish to pull top search results from a large number of documents (more than 2048 tokens including the search query, or roughly 1500 words). If this is not your use case, continue to part 2.
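To estimate whether your text fits in a single call, a rough word-count pre-check based on the ratio above (2048 tokens to roughly 1500 words) can help. This is only a sketch; the function names are our own, and a real tokenizer will give exact counts:

```python
# rough heuristic derived from the guide's ratio of 2048 tokens to ~1500 words;
# use the model's actual tokenizer if you need exact counts
def rough_token_estimate(text):
    return int(len(text.split()) * 2048 / 1500)

def fits_in_one_call(query, documents, limit=2048):
    # total the estimated tokens for the query plus every document
    total = rough_token_estimate(query) + sum(rough_token_estimate(d) for d in documents)
    return total < limit

print(fits_in_one_call("What did the rabbit eat?", ["The rabbit ate the carrots."]))  # True
```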

Prerequisites

  1. You have a Forefront account and have deployed a Search Q&A model
  2. You have access to an Opensearch instance to connect to. For information on creating an Opensearch instance on AWS, see docs

Part 1: Index and Store Data in Opensearch

Install the Python Opensearch client

This guide will use the Python client to programmatically interact with Opensearch; however, there are other options.
To install the Python client, run: pip install opensearch-py

Connect to Opensearch

Your authentication method will vary depending on how you set up your Opensearch instance.
For basic auth (username/password):
from opensearchpy import OpenSearch, RequestsHttpConnection

host = '<hostname>' # for example, instance-name.region.es.amazonaws.com
auth = ('<username>', '<password>')

# for basic HTTP auth (username/password)
client = OpenSearch(
    hosts = [{'host': host, 'port': 443}],
    http_auth = auth,
    use_ssl = True,
    connection_class = RequestsHttpConnection,
    http_compress = True, # optional: enables gzip compression for request bodies
)
For AWS auth, you'll first need to run pip install boto3 requests-aws4auth
from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth
import boto3

host = '' # for example, my-test-domain.us-east-1.es.amazonaws.com
region = '' # e.g. us-west-1

service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

client = OpenSearch(
    hosts = [{'host': host, 'port': 443}],
    http_auth = awsauth,
    use_ssl = True,
    verify_certs = True,
    connection_class = RequestsHttpConnection
)
You should now be connected. You can run the following to see if you are connected properly:
client.indices.stats() # should dump cluster statistics
For more information on connection options, see docs.

Indexing Data

Before running the search model, you will need to index and store your documents. There are three steps to indexing data in Opensearch:
  1. Create an index to store your documents
  2. Get embeddings for your documents
  3. Store the embeddings in Opensearch

Create an index

You must create an Opensearch index before attempting to store documents. An index specifies a data type that your documents must match. Since we are performing vector-based search, your index must specify a vector data type.
For this example, let's assume we are indexing a set of blog posts. We will create an index containing four fields:
  1. a text field called title containing the blog title
  2. a text field called url containing the blog url
  3. a text field called text containing the blog content
  4. a vector field called embedding containing the embeddings for text, used for search
index_name = 'my-index'

index_body = {
    "mappings": {
        "properties": {
            "title": {
                "type": "text"
            },
            "url": {
                "type": "text"
            },
            "text": {
                "type": "text"
            },
            "embedding": {
                "type": "knn_vector",
                "dimension": 4096
            }
        }
    }
}

response = client.indices.create(index_name, body=index_body)
Note that the embedding field must specify a dimension of 4096, as this is the dimension of the embeddings returned by the API.
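Because Opensearch will reject a vector whose length does not match the index mapping, it can help to catch a dimension mismatch locally before sending the request. A minimal sketch; the helper name here is our own, not part of any API:

```python
EMBEDDING_DIM = 4096  # dimension of vectors returned by the Text Embedding API

def validate_embedding(embedding, dim=EMBEDDING_DIM):
    # fail fast locally instead of waiting for an Opensearch mapping error
    if len(embedding) != dim:
        raise ValueError(f"expected a {dim}-dimensional embedding, got {len(embedding)}")
    return embedding
```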

Get text embeddings

Our Text Embedding API can be used to create embeddings for a piece of text. For best results, we recommend breaking up your text into chunks of 2-3 sentences using NLTK.
To install NLTK, run: pip install nltk
import nltk
from nltk import tokenize

nltk.download('punkt') # required for sent_tokenize

# helper function to split text into chunks of sentences
def split_by_sentences(text, num_sentences = 2):
    def chunks(l, n):
        n = max(1, n)
        return (l[i:i+n] for i in range(0, len(l), n))

    sentences = tokenize.sent_tokenize(text)
    documents = chunks(sentences, num_sentences)

    return list(map(lambda x: ' '.join(x), documents))

# your documents
documents = ''' A lot of text '''

# split your documents into an array of 2-sentence strings
split_documents = split_by_sentences(documents, 2)
To convert text into an embedding, use the /embedding endpoint:
import requests

url = '' # for example, https://example.forefront.link/embedding
api_key = '' # obtained from the Settings -> API Key page in Forefront
headers = { "authorization": f"Bearer {api_key}" }

# to get embeddings for one string of text
data = { "text": "example text to embed" }

# you can optionally embed strings in batch
data = { "texts": ["first example to embed", "second example to embed", "..."] }

res = requests.post(url, json=data, headers=headers)
# {
#   "result": [
#     [
#       0.52392578125,
#       0.439697265625,
#       -0.0298004150390625,
#       -0.96630859375,
#       -0.289794921875,
#       2.494140625,
#       1.0576171875,
#       0.57763671875,
#       -0.0999755859375,
#       -0.9931640625,
#       ...
# }

Store Embeddings in Opensearch

title = "my document"
embedding = [0.52392578125, 0.439697265625, -0.0298004150390625, ...]

document = {
    "title": title,
    "embedding": embedding
}

client.index('my-index', document, refresh=True)

Example: Upload multiple documents from an array

import json
import requests

search_url = '' # your Forefront Search Q&A url
api_key = '' # your Forefront API key
headers = { "authorization": f"Bearer {api_key}" }

sample_documents = [
    {
        "title": "First document",
        "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.",
        "url": "https://example.com/document1"
    },
    {
        "title": "Second document",
        "text": "A purely peer-to-peer version of electronic cash would allow online payments to be sent directly from one party to another without going through a financial institution. Digital signatures provide part of the solution, but the main benefits are lost if a trusted third party is still required to prevent double-spending.",
        "url": "https://example.com/document2"
    },
    {
        "title": "Third document",
        "text": "The bear loved the honey. The rabbit ate the carrots. The horse ate the apple.",
        "url": "https://example.com/document3"
    }
]

# helper function to convert text to an embedding
def get_embedding_for_text(text):
    res = requests.post(f"{search_url}/embedding", json={"text": text}, headers=headers)
    return json.loads(res.text)["result"][0]

for doc in sample_documents:
    embedding = get_embedding_for_text(doc["text"])

    document = {
        "title": doc["title"],
        "url": doc["url"],
        "text": doc["text"],
        "embedding": embedding
    }

    client.index('my-index', document, refresh=True)
For bulk importing of documents, see docs.
Once you have indexed your documents, you are ready to use the question-answering model.

Part 2: Run the Q&A Workflow

There are two steps to running the Q&A workflow:
  1. Get top search results for the query
  2. Run the query and search results through the Q&A model to get an answer
Note: If you're not using Opensearch, you can skip to "Run the Q&A model"

Get top search results

The objective here is to identify the most relevant information before passing it to our Q&A model.
There are a few intermediate steps that need to happen:
  1. Convert the search query to an embedding
  2. Run the search query embedding through Opensearch to get the top results
  3. Extract the text field from the results
A few helper functions are provided to help streamline this process:
import json
import requests

client = ... # your Opensearch client from earlier in this guide

search_url = '' # your Forefront Search Q&A url
headers = { "authorization": "Bearer <api_key>" } # use your Forefront API key

# convert a query to an embedding
def get_embedding_for_query(text):
    res = requests.post(f"{search_url}/embedding", json={"text": text, "is_query": True}, headers=headers)
    return json.loads(res.text)["result"][0]

# get the top n search results for a query embedding within a specified index
def search(query, n = 3, index_name = 'my-index'):
    search_query = {
        "size": n,
        "query": {
            "script_score": {
                "query": {
                    "match_all": {}
                },
                "script": {
                    "source": "knn_score",
                    "lang": "knn",
                    "params": {
                        "field": "embedding",
                        "query_value": query,
                        "space_type": "cosinesimil"
                    }
                }
            }
        }
    }
    return client.search(body=search_query, index=index_name)

# extract the text field from the results
def get_text_from_results(results):
    arr = []
    docs = results['hits']['hits']
    for doc in docs:
        arr.append(doc["_source"]["text"])
    return arr

# get the top 3 text results for a query
def get_search_results(query):
    embedding = get_embedding_for_query(query)
    results = search(embedding)
    texts = get_text_from_results(results)
    return texts

# running the top results workflow
query = "What did the rabbit eat?"
text_results = get_search_results(query)
# Result: ['The bear loved the honey. The rabbit ate the carrots. The horse ate the apple.']
Note: When getting an embedding for your search query via the Text Embedding API, make sure to set is_query to True.

Run the Q&A model

With the relevant information in hand, running the Q&A model is as simple as an API call. You can of course run this without Opensearch if you have a small set of text to answer from.
import requests

url = '' # your Forefront Search Q&A url, for example https://example.forefront.link/answer
api_key = '' # your Forefront API key
headers = { "authorization": f"Bearer {api_key}" }

# example
query = "What did the rabbit eat?" # the same query used to pull top Opensearch results in the last step
documents = ["The bear loved the honey", "The rabbit enjoyed the carrots", "..."] # a list of text strings to answer from. You can have more than three documents as long as query and documents combined are less than 2048 tokens

# construct request object
data = {
    "query": query,
    "documents": documents
}

# send the request
res = requests.post(url, json=data, headers=headers)

# print result
print(res.text)
# {
#   "result": [
#     " The rabbit ate carrot."
#   ],
#   "timestamp": 1634750866,
#   "model": "gpt-j-answer"
# }
You can now create your own products using the Search Q&A deployment on Forefront!

Considerations

  • In this guide, we indexed the document embeddings alongside the document text. This is not the most space-efficient approach. You may prefer to index a document ID or url alongside the embeddings, then pull the full text separately for the question answering step.
  • The Opensearch index created in this guide uses exact k-Nearest Neighbor (k-NN) scoring. This will be slower at scale (tens of thousands of documents), at which point you will want to use approximate k-NN search methods.
  • This guide indexes one document at a time; however, a bulk API is available for importing large numbers of documents quickly. See the docs for more info.
  • We recommend indexing documents in 2-3 sentence chunks to achieve the best search results. Larger chunks are possible, but search quality will likely decrease.
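As a sketch of the first point above, a slimmer document might store only an identifier next to the vector, with the full text fetched from your own datastore at answer time. The field names here are hypothetical, not part of any API:

```python
# hypothetical slimmer document: the vector is stored for k-NN search,
# while the full text lives in your primary datastore keyed by doc_id
slim_document = {
    "doc_id": "blog-123",       # key into your own database
    "embedding": [0.1] * 4096,  # 4096-dimensional vector from the embedding API
}
```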
