
Connecting with AI: How to Create Embeddings with OpenAI

This blog post is part of an independent series called Connecting with AI, authored by Felipe Mantilla, an AI engineering expert at Gorilla Logic. The series' goal is to make the world of AI more accessible, support those who want to learn more about the field, and establish a foundation for the most interesting advancements in AI. You can find the original version of this post in Spanish on his Medium blog.

Getting Started with Embeddings in OpenAI

Let's start with a simple example: We'll convert a series of inputs into embeddings and save them in a JSON file (which will serve as our temporary database). This will allow us to run some interesting tests.

[Code screenshot: converting the inputs to embeddings and saving them to a JSON file]
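In case the screenshot is hard to read, here is a minimal sketch of the same idea in Python. The model name (`text-embedding-3-small`), the file name (`embeddings.json`), and the function names are illustrative choices, not the post's exact code:

```python
import json

def embed(texts, model="text-embedding-3-small"):
    """Embed a batch of texts with the OpenAI API (model name is our choice)."""
    from openai import OpenAI  # requires OPENAI_API_KEY in the environment
    client = OpenAI()
    response = client.embeddings.create(model=model, input=texts)
    return {text: item.embedding for text, item in zip(texts, response.data)}

def save_embeddings(data, path="embeddings.json"):
    """Write {text: vector} pairs to a JSON file -- our temporary database."""
    with open(path, "w") as f:
        json.dump(data, f)

def load_embeddings(path="embeddings.json"):
    """Read the {text: vector} pairs back from the JSON file."""
    with open(path) as f:
        return json.load(f)

if __name__ == "__main__":
    save_embeddings(embed(["dog", "cat", "bird", "fish"]))
```

A plain JSON file works fine at this scale; a real application would typically swap it for a vector database.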

Which will give us a result like the following:

[Output: each input paired with its generated embedding array]

Each entry will be linked to its respective array of embeddings, generated by OpenAI’s model.

To check the similarity between these objects and a new input, we'll implement methods that calculate cosine similarity, including dot product and norm calculations, as previously shown. The result would look something like this:

[Code screenshot: cosine similarity, dot product, and norm calculations]
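The three calculations can be sketched in a few lines of plain Python (function names here are illustrative; the post's own implementation may differ in detail):

```python
import math

def dot(a, b):
    """Sum of the pairwise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in a))

def cosine_similarity(a, b):
    """1.0 means same direction; values near 0.0 mean unrelated."""
    return dot(a, b) / (norm(a) * norm(b))
```

Because OpenAI's embedding vectors all have the same dimensionality, comparing any two of them with this function is straightforward.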

Now, we can optimize the methods for storing and validating embeddings:

[Code screenshot: optimized methods for storing and validating embeddings]
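One plausible shape for that validation step is a ranking function: score every stored embedding against the query vector and sort by similarity. This is a sketch with assumed names, not the post's exact code:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vector, stored, top_k=None):
    """Score every stored entry against the query vector, best match first.

    stored is the {text: vector} dict loaded from our JSON file.
    """
    scored = [(text, cosine_similarity(query_vector, vector))
              for text, vector in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k] if top_k is not None else scored
```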

This gives us the following result:

[Output: similarity scores between "animal" and the stored embeddings]

We observe the similarity between the word "animal" and our stored embeddings, which represent different types of animals. Now, let’s see what happens if we change the input to a more specific term like "feline."

[Output: similarity scores between "feline" and the stored embeddings]

Here, we can see how the values adjust for feline-related animals. You can check out the implementation in this commit.

Let’s look at a more interesting example: Imagine we have an array of phrases containing information about John.

[Code screenshot: an array of phrases containing information about John]

Let’s save the embeddings and run some queries on our data to analyze the results.

[Code screenshot: saving the embeddings and running queries against them]

[Output: similarity scores for the query against the phrases about John]

As you can see, the highest similarity corresponds to the object that could best answer our question. You can experiment and test different ways of making queries. I encourage you to dive deeper into the capabilities of embeddings on your own; the implementation and code are available in the following commit.
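Putting the pieces together, the query flow can be sketched as: embed the question, embed each fact, and return the best-scoring fact. Everything here is illustrative; `embed_fn` stands in for the call to OpenAI's embeddings endpoint shown earlier:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def answer_query(question, facts, embed_fn):
    """Return the (fact, score) pair whose embedding is closest to the question's.

    embed_fn maps a string to a vector; in practice it would wrap the
    OpenAI embeddings call.
    """
    question_vector = embed_fn(question)
    scored = [(fact, cosine_similarity(question_vector, embed_fn(fact)))
              for fact in facts]
    return max(scored, key=lambda pair: pair[1])
```

Passing the embedding function in as a parameter makes the ranking logic easy to test with a cheap stand-in before wiring up the real API.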

Conclusion

In this article, you learned how to implement embeddings using OpenAI’s API to create powerful semantic searches. We explored practical examples—from a simple list of animals, to more complex queries about personal information—and discovered how cosine similarity helps us find meaningful semantic relationships.

This is just the beginning of what we can achieve with this technology: the systems you build will be able to understand context and answer questions more intelligently than a traditional search.

All the code is available in the commits previously mentioned and linked above—now it’s your turn to experiment and build something amazing! 🚀
