
Connecting with AI: How to Create Embeddings with OpenAI
This blog post is part of an independent series called Connecting with AI, authored by Felipe Mantilla, an AI engineering expert at Gorilla Logic. The series' goal is to make the world of AI more accessible, support those who want to learn more about the field, and lay the groundwork for understanding the most interesting advancements in AI. You can find the original version of this post in Spanish on his Medium blog.
Getting Started with Embeddings in OpenAI
Let's start with a simple example: We'll convert a series of inputs into embeddings and save them in a JSON file (which will serve as our temporary database). This will allow us to run some interesting tests.
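Below is a minimal sketch of what that could look like in TypeScript with the official openai package. The animal list, the file name embeddings.json, and the model text-embedding-3-small are illustrative assumptions, not necessarily what the linked commits use:

```ts
// embeddings-store.ts: generate embeddings for a list of inputs and persist them
import OpenAI from "openai";
import { writeFileSync } from "fs";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const inputs = ["dog", "cat", "lion", "eagle", "salmon"];

async function buildStore() {
  // Request one embedding vector per input string
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: inputs,
  });

  // Pair each original text with its embedding vector
  const store = response.data.map((item, i) => ({
    text: inputs[i],
    embedding: item.embedding,
  }));

  // Save to a JSON file that acts as our temporary "database"
  writeFileSync("embeddings.json", JSON.stringify(store, null, 2));
}

buildStore();
```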
Running this produces a JSON file in which each entry is linked to its respective array of embeddings, generated by OpenAI's model.
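For reference, each record in that file has roughly this shape (the dimension noted below is the default for text-embedding-3-small; other models produce vectors of different lengths):

```ts
// Shape of each record stored in embeddings.json
interface EmbeddingRecord {
  text: string;        // the original input, e.g. "dog"
  embedding: number[]; // the vector returned by OpenAI (1536 numbers for text-embedding-3-small)
}
```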
To check the similarity between these objects and a new input, we'll implement methods that calculate cosine similarity, using the dot product and norm calculations shown previously.
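A minimal sketch of those helpers in TypeScript might look like this (the file name cosine.ts is an assumption for illustration; the exact helpers in the linked commit may differ):

```ts
// cosine.ts: cosine similarity between two vectors, dot(a, b) / (||a|| * ||b||)
export function dotProduct(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

export function norm(a: number[]): number {
  return Math.sqrt(dotProduct(a, a));
}

export function cosineSimilarity(a: number[], b: number[]): number {
  return dotProduct(a, b) / (norm(a) * norm(b));
}
```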
Now, we can optimize the methods for storing and validating embeddings:
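One way to structure this is a small query helper that loads the JSON store, embeds the incoming query, and ranks every stored entry by cosine similarity. The names searchSimilar and loadStore below are assumptions for illustration, not necessarily those used in the linked commit:

```ts
// search.ts: a sketch of a reusable query helper over the JSON store
import OpenAI from "openai";
import { readFileSync } from "fs";
import { cosineSimilarity } from "./cosine"; // helper defined above

const openai = new OpenAI();

interface EmbeddingRecord {
  text: string;
  embedding: number[];
}

function loadStore(path: string): EmbeddingRecord[] {
  return JSON.parse(readFileSync(path, "utf-8"));
}

export async function searchSimilar(query: string, path = "embeddings.json") {
  // Embed the query with the same model used for the stored entries
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });
  const queryEmbedding = response.data[0].embedding;

  // Score every stored entry and sort from most to least similar
  return loadStore(path)
    .map((record) => ({
      text: record.text,
      similarity: cosineSimilarity(queryEmbedding, record.embedding),
    }))
    .sort((a, b) => b.similarity - a.similarity);
}

searchSimilar("animal").then(console.log);
```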
Running a query for the word "animal" shows the similarity between it and each of our stored embeddings, which represent different types of animals. Now, let's see what happens if we change the input to a more specific term like "feline."
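With the helper sketched above, that is just a change of the query string:

```ts
// Rank the stored animal embeddings against a more specific term
searchSimilar("feline").then(console.log);
```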
Here, we can see how the values adjust for feline-related animals. You can check out the implementation in this commit.
Let’s look at a more interesting example: Imagine we have an array of phrases containing information about John.
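The exact phrases from the original post aren't reproduced here; the sentences below are illustrative placeholders that show the idea:

```ts
// john-facts.ts: a small set of facts about John (placeholder sentences for illustration)
export const johnFacts = [
  "John lives in Bogotá and works as a software engineer.",
  "John's favorite food is pizza.",
  "John has two dogs named Max and Luna.",
  "John enjoys hiking in the mountains on weekends.",
];
```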
Let’s save the embeddings and run some queries on our data to analyze the results.
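Reusing the helpers from the earlier snippets, that could look like the sketch below; buildStoreFrom, the module paths, and the file name john-embeddings.json are assumed names for illustration:

```ts
// Build a store for the John phrases, then query it with a natural-language question
import OpenAI from "openai";
import { writeFileSync } from "fs";
import { searchSimilar } from "./search";     // query helper sketched above
import { johnFacts } from "./john-facts";     // the array of phrases defined above

const openai = new OpenAI();

async function buildStoreFrom(texts: string[], path: string) {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: texts,
  });
  const store = response.data.map((item, i) => ({
    text: texts[i],
    embedding: item.embedding,
  }));
  writeFileSync(path, JSON.stringify(store, null, 2));
}

buildStoreFrom(johnFacts, "john-embeddings.json")
  .then(() => searchSimilar("What pets does John have?", "john-embeddings.json"))
  .then(console.log);
```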
As you can see, the highest similarity corresponds to the object that could best answer our question. You can experiment and test different ways of making queries. I encourage you to dive deeper into the capabilities of embeddings on your own; the implementation and code are available in the following commit.
Conclusion
In this article, you learned how to implement embeddings using OpenAI's API to create powerful semantic searches. We explored practical examples, from a simple list of animals to more complex queries about personal information, and saw how cosine similarity helps us find meaningful semantic relationships.
This is just the beginning of what we can achieve with this technology: the systems you build will be able to understand context and answer questions more intelligently than a traditional search.
All the code is available in the commits linked above. Now it's your turn to experiment and build something amazing! 🚀