Getting started
In this section, you will look at different ways to set up, run and consume this service on your local machine. You can choose between using Docker, Makefile, or Poetry.
Project configuration
This application requires some configuration. The recommended way to provide it is with a .env file. You can find an example of this file in the application's directory (apps/semantic_search). Copy this file, rename it to .env, and edit the values of the variables to match your desired configuration.
The following table describes the variables that can be set in the .env file:
Variable | Required | Description |
---|---|---|
SERVER_PORT | ✅ | The port where the server will be running |
EMBEDDING_GENERATOR | ✅ | The embedding generator to use. You can choose one of the available models |
EMBEDDING_STORE | ✅ | The embedding store to use. You can choose one of the available stores |
MATCH_THRESHOLD | ✅ | The threshold to use when matching embeddings. The value must be between 0 and 1, where 1 is an exact match |
BATCH_SIZE | ✅ | The number of texts to process in each batch when generating embeddings |
SPACES_URL | ❌ | The URL where the embedding generator is located. This is only required if you are using the SPACES_TEXT or SPACES_INSTRUCT embedding generator |
SPACES_KEY | ❌ | The key to use when generating embeddings. This is only required if you are using the SPACES_TEXT or SPACES_INSTRUCT embedding generator |
OPENAI_TYPE | ❌ | The type of OpenAI model to use when generating embeddings (either from OpenAI or Azure). This is only required if you are using the OPENAI embedding generator |
OPENAI_MODEL | ❌ | The model to use when generating embeddings. This is only required if you are using the OPENAI embedding generator |
OPENAI_API_KEY | ❌ | The key to use when generating embeddings. This is only required if you are using the OPENAI embedding generator |
AZURE_OPENAI_ENDPOINT | ❌ | The endpoint to use when generating embeddings. This is only required if you are using the Azure OpenAI integration |
AZURE_OPENAI_VERSION | ❌ | The version to use when generating embeddings. This is only required if you are using the AZURE_OPENAI embedding generator |
AZURE_OPENAI_USE_ACTIVE_DIRECTORY | ❌ | Whether to use Active Directory when generating embeddings. This is only required if you are using the AZURE_OPENAI embedding generator |
STORE_PATH | ❌ | The path where the embeddings will be stored. This is only required if you are using the LOCAL embedding store |
CHROMA_URL | ❌ | The URL of the Chroma instance to use. This is only required if you are using the CHROMA embedding store |
CHROMA_PORT | ❌ | The port of the Chroma instance to use. This is only required if you are using the CHROMA embedding store |
CHROMA_COLLECTION | ❌ | The collection to use when storing embeddings. This is only required if you are using the CHROMA embedding store |
SUPABASE_URL | ❌ | The URL of the Supabase instance to use. This is only required if you are using the SUPABASE embedding store |
SUPABASE_KEY | ❌ | The key to use when connecting to Supabase. This is only required if you are using the SUPABASE embedding store |
SUPABASE_TABLE | ❌ | The table to use when storing embeddings. This is only required if you are using the SUPABASE embedding store |
SUPABASE_FUNCTION | ❌ | The function to use when querying embeddings. This is only required if you are using the SUPABASE embedding store |
PINECONE_KEY | ❌ | The key to use when connecting to Pinecone. This is only required if you are using the PINECONE embedding store |
PINECONE_ENVIRONMENT | ❌ | The Pinecone project environment. This is only required if you are using the PINECONE embedding store |
PINECONE_INDEX | ❌ | The index (Pinecone database) where embeddings will be stored and queried. This is only required if you are using the PINECONE embedding store |
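The rules in the table can be checked before starting the service. The following is a minimal sketch, not part of the project: it assumes a plain KEY=VALUE .env format, and the helper names parse_env and check_env are hypothetical. It covers the always-required variables, the MATCH_THRESHOLD range, and the store-specific variables. The generator-specific variables (SPACES_*, OPENAI_*, AZURE_OPENAI_*) are left out because their conditions depend on generator names this section does not fully list. The sample values at the bottom are placeholders.

```python
from typing import Dict, List

# Variables marked as required in the table above.
REQUIRED = ["SERVER_PORT", "EMBEDDING_GENERATOR", "EMBEDDING_STORE",
            "MATCH_THRESHOLD", "BATCH_SIZE"]

# Variables needed only for a given EMBEDDING_STORE value (taken from the table).
STORE_VARS = {
    "LOCAL": ["STORE_PATH"],
    "CHROMA": ["CHROMA_URL", "CHROMA_PORT", "CHROMA_COLLECTION"],
    "SUPABASE": ["SUPABASE_URL", "SUPABASE_KEY", "SUPABASE_TABLE", "SUPABASE_FUNCTION"],
    "PINECONE": ["PINECONE_KEY", "PINECONE_ENVIRONMENT", "PINECONE_INDEX"],
}


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def check_env(env: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means nothing is missing."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]

    store = env.get("EMBEDDING_STORE", "")
    problems += [f"{name} is not set (needed for {store})"
                 for name in STORE_VARS.get(store, []) if not env.get(name)]

    threshold = env.get("MATCH_THRESHOLD")
    if threshold:
        try:
            in_range = 0.0 <= float(threshold) <= 1.0
        except ValueError:
            in_range = False
        if not in_range:
            problems.append("MATCH_THRESHOLD must be a number between 0 and 1")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    sample = """
    SERVER_PORT=8000
    EMBEDDING_GENERATOR=OPENAI
    EMBEDDING_STORE=CHROMA
    MATCH_THRESHOLD=0.8
    BATCH_SIZE=32
    CHROMA_URL=localhost
    CHROMA_PORT=8001
    """
    for problem in check_env(parse_env(sample)):
        print(problem)
```

With the sample above, the script reports that CHROMA_COLLECTION is not set, because the CHROMA store is selected but one of its variables is missing. To check a real file, pass the contents of your .env file to parse_env instead of the sample string.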
Running the project
You can run the project with Docker, Make, or Poetry. Instructions for each option are given below.
Docker
The recommended way to set up the project is with Docker. The project comes with a Dockerfile, which lets you build an image containing all the configuration and dependencies needed to run the project.
To use Docker, you need to have it installed on your system. You can find installation instructions in the official Docker documentation.
Once you have Docker installed, move to the semantic_search directory. To do that, open a terminal in the project's root directory and run the following command:
cd apps/semantic_search
Once you are in the semantic_search directory, you can build and run the Docker image. The build creates an image with everything the project needs to run correctly. Once the build is finished, run the image to get the project up and running. The -p 8000:8000 option publishes container port 8000 on host port 8000, so it should match the port the server listens on.
docker build -t semantic-search .
docker run -p 8000:8000 semantic-search
Makefile
Another way to run the application is with the Makefile in the project directory. First, ensure that both Make and Poetry are installed on your system.
Once you have Make and Poetry installed, move to the semantic_search directory. To do that, open a terminal in the project's root directory and run the following command:
cd apps/semantic_search
If this is the first time you are running the project with Make, run the init command. This command installs all the dependencies needed to run the project.
make init
Once the dependencies are installed, you can run the project with the dev command.
make dev
Poetry
The most basic way to run the project is by installing the dependencies on your machine and running it with Poetry. To do this, you must have Poetry installed on your system.
Once you have Poetry installed, move to the semantic_search directory. To do that, open a terminal in the project's root directory and run the following command:
cd apps/semantic_search
If this is the first time you are running the project with Poetry, run the install command. This command installs all the dependencies needed to run the project.
poetry install
Once the dependencies are installed, you can run the project with the start command.
poetry run start
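Whichever of the three options you use, you can confirm that the server is accepting connections on the configured port. The following is a minimal sketch, assuming the service runs on the local machine and listens on port 8000 (the port published in the Docker example). The helper is_listening is hypothetical and not part of the project. A successful TCP connection only shows that something is listening on that port, not that the API itself is healthy.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed values: change the port to match SERVER_PORT in your .env file.
    print("up" if is_listening("127.0.0.1", 8000) else "not reachable")
```

If the check reports that the server is not reachable, verify that SERVER_PORT in your .env file matches the port you are connecting to and, when using Docker, the port given to the -p option.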