
Getting started

This section covers the different ways to set up, run, and consume this service on your local machine. You can choose between Docker, Makefile, or Poetry.

Project configuration

This application requires some configuration. The recommended way to provide it is through a .env file. An example of this file is included in the application's directory (apps/semantic_search); copy it, rename the copy to .env, and edit the variable values to match your desired configuration.
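For example, assuming the example file is named .env.example (an assumption; check the actual file name in apps/semantic_search), you could run the following from the project's root directory:

cp apps/semantic_search/.env.example apps/semantic_search/.env   # adjust the file name if it differs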

The following table describes the variables that can be set in the .env file:

| Variable | Description |
| --- | --- |
| SERVER_PORT | The port where the server will be running. |
| EMBEDDING_GENERATOR | The embedding generator to use. Choose one of the available models. |
| EMBEDDING_STORE | The embedding store to use. Choose one of the available stores. |
| MATCH_THRESHOLD | The threshold used when matching embeddings. The value must be between 0 and 1, where 1 is an exact match. |
| BATCH_SIZE | The number of texts processed in each batch when generating embeddings. |
| SPACES_URL | The URL where the embedding generator is located. Only required when using the SPACES_TEXT or SPACES_INSTRUCT embedding generator. |
| SPACES_KEY | The key to use when generating embeddings. Only required when using the SPACES_TEXT or SPACES_INSTRUCT embedding generator. |
| OPENAI_TYPE | The type of OpenAI integration to use when generating embeddings (either OpenAI or Azure). Only required when using the OPENAI embedding generator. |
| OPENAI_MODEL | The model to use when generating embeddings. Only required when using the OPENAI embedding generator. |
| OPENAI_API_KEY | The key to use when generating embeddings. Only required when using the OPENAI embedding generator. |
| AZURE_OPENAI_ENDPOINT | The endpoint to use when generating embeddings. Only required when using the Azure OpenAI integration. |
| AZURE_OPENAI_VERSION | The version to use when generating embeddings. Only required when using the AZURE_OPENAI embedding generator. |
| AZURE_OPENAI_USE_ACTIVE_DIRECTORY | Whether to use Active Directory when generating embeddings. Only required when using the AZURE_OPENAI embedding generator. |
| STORE_PATH | The path where the embeddings will be stored. Only required when using the LOCAL embedding store. |
| CHROMA_URL | The URL of the Chroma instance to use. Only required when using the CHROMA embedding store. |
| CHROMA_PORT | The port of the Chroma instance to use. Only required when using the CHROMA embedding store. |
| CHROMA_COLLECTION | The collection used to store embeddings. Only required when using the CHROMA embedding store. |
| SUPABASE_URL | The URL of the Supabase instance to use. Only required when using the SUPABASE embedding store. |
| SUPABASE_KEY | The key to use when connecting to Supabase. Only required when using the SUPABASE embedding store. |
| SUPABASE_TABLE | The table used to store embeddings. Only required when using the SUPABASE embedding store. |
| SUPABASE_FUNCTION | The function used to query embeddings. Only required when using the SUPABASE embedding store. |
| PINECONE_KEY | The key to use when connecting to Pinecone. Only required when using the PINECONE embedding store. |
| PINECONE_ENVIRONMENT | The Pinecone project environment. Only required when using the PINECONE embedding store. |
| PINECONE_INDEX | The index (Pinecone database) where embeddings will be stored and queried. Only required when using the PINECONE embedding store. |
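As a quick illustration, a .env for a setup that generates embeddings with OpenAI and stores them locally could look like the sketch below. The values shown for EMBEDDING_GENERATOR, EMBEDDING_STORE, OPENAI_TYPE, and OPENAI_MODEL are placeholders; use the identifiers your installation actually accepts.

# Illustrative configuration only; adjust the values to your setup.
SERVER_PORT=8000
EMBEDDING_GENERATOR=OPENAI
EMBEDDING_STORE=LOCAL
MATCH_THRESHOLD=0.8
BATCH_SIZE=32
# OpenAI settings (only needed for the OPENAI generator)
OPENAI_TYPE=openai
OPENAI_MODEL=text-embedding-ada-002
OPENAI_API_KEY=<your-openai-api-key>
# Local store settings (only needed for the LOCAL store)
STORE_PATH=./embeddings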

Running the project

There are several ways to run the project: Docker, Makefile, or Poetry. Instructions for each option are given below.

Docker

The recommended way to set up the project is with Docker. The project ships with a Dockerfile, which lets you build an image containing all the configuration and dependencies needed to run the project.

To use Docker, you need to have it installed on your system; installation instructions are available in the official Docker documentation.
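You can confirm the installation by checking the version from a terminal:

docker --version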

Once Docker is installed, change to the semantic_search directory. To do so, open a terminal in the project's root directory and run the following command:

cd apps/semantic_search

Once you are in the semantic_search directory, you can build and run the Docker image. The build produces an image with everything the project needs to run correctly; when it finishes, run the image to get your project up and running.

docker build -t semantic-search .
docker run -p 8000:8000 semantic-search
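The run command above publishes port 8000; if you set SERVER_PORT to a different value in your .env, adjust the -p mapping to match. If the image does not already contain your .env, you can also pass it at run time with Docker's --env-file option, for example:

docker run --env-file .env -p 8000:8000 semantic-search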

Makefile

Another way to run the application is with the Makefile in the project directory. First, ensure that both Make and Poetry are installed on your system.
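If you are not sure whether they are available, you can check both from a terminal:

make --version
poetry --version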

Once you have Make and Poetry installed, change to the semantic_search directory. To do so, open a terminal in the project's root directory and run the following command:

cd apps/semantic_search

If this is the first time you are running the project with Make, run the init command first. It installs all the dependencies needed to run the project.

make init

Once the dependencies are installed, you can run the project by using the dev command.

make dev

Poetry

The most basic way to run the project is by installing the dependencies on your machine and running it with Poetry. To do this, you must have Poetry installed on your system.

Once you have Poetry installed, change to the semantic_search directory. To do so, open a terminal in the project's root directory and run the following command:

cd apps/semantic_search

If this is the first time you are running the project with Poetry, run the install command first. It installs all the dependencies needed to run the project.

poetry install

Once the dependencies are installed, you can run the project by using the start command.

poetry run start
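Whichever method you used to start it, the service should now be listening on the configured SERVER_PORT (8000 in the examples above). You can check that it is reachable with a simple request; the exact endpoints to query depend on the service's API:

curl -i http://localhost:8000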