An AI embedding vector is a fancy name for a list of numbers (coordinates) that represents a piece of content as a point in a space with many dimensions, commonly 1536 or 3072 depending on the model. The distance between two vectors indicates how similar the corresponding pieces of content are: the smaller the distance, the more similar the content.
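
To make the idea of distance concrete, here is a minimal sketch of cosine similarity, one commonly used measure for comparing embeddings. The three-dimensional vectors and their labels are made up for illustration; real embeddings come from an AI model and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    """Smaller values mean the vectors point in more similar directions."""
    return 1.0 - cosine_similarity(a, b)

# Toy vectors (assumed values, not output from a real model).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(cosine_distance(cat, kitten))   # small: similar content
print(cosine_distance(cat, invoice))  # large: unrelated content
```

With these toy values, the "cat" vector ends up much closer to "kitten" than to "invoice", which is the property a relevancy search relies on.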

This approach enables relevancy search locally using simple math:
- Get a vector representation of the question from an AI model,
- Calculate the distance between the question vector and the vectors in your database.
The most basic implementation loops through every vector in the database, so search time grows in direct proportion to the amount of content stored.
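The basic loop described above can be sketched as follows. This is only an illustration under assumptions: the post IDs, the toy vectors, and the `nearest` helper are hypothetical, and Euclidean distance is used here as one possible distance measure.

```python
import heapq
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(query_vector, stored, limit=3):
    """Scan every stored vector and return the `limit` closest (id, distance) pairs."""
    scored = (
        (post_id, euclidean_distance(query_vector, vector))
        for post_id, vector in stored.items()
    )
    # nsmallest keeps only the best `limit` candidates while scanning all rows.
    return heapq.nsmallest(limit, scored, key=lambda pair: pair[1])

# Hypothetical post IDs mapped to toy vectors.
stored_vectors = {
    101: [0.9, 0.1, 0.0],
    102: [0.1, 0.8, 0.1],
    103: [0.0, 0.2, 0.9],
}
question_vector = [0.8, 0.2, 0.1]

results = nearest(question_vector, stored_vectors, limit=2)
for post_id, distance in results:
    print(post_id, round(distance, 3))
```

Every stored vector is visited once per query, which is why the cost of this approach scales with the size of the database.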
Store AI Embedding Vectors for All WordPress Content
The first step is to calculate and store the embedding vectors for all your content.

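The MySQL database used by WordPress was not originally designed for vector storage or vector math, so versions before 9.0 lack specialized column types, indexing, and functions for vector operations.
Version 9.0 and above supports both the VECTOR column type and the DISTANCE() function for finding the nearest neighbors. Note that the MySQL documentation lists DISTANCE() as available only in certain editions (such as HeatWave), so check what your host's MySQL build actually provides.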
In earlier versions of MySQL, it is possible to store vectors efficiently in a VARBINARY column and define custom MySQL functions for calculating the vector distance. However, most hosting providers disable custom function definitions due to security concerns.
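As a rough sketch of what efficient binary storage could look like, the example below packs a vector into raw bytes suitable for a binary column and restores it again. The helper names are hypothetical, and the choice of little-endian 32-bit floats is an assumption; it trades some precision for half the size of 64-bit floats.

```python
import struct

BYTES_PER_FLOAT = 4  # assumption: 32-bit floats are precise enough for embeddings

def pack_vector(vector):
    """Encode a list of floats as little-endian 32-bit values."""
    return struct.pack(f"<{len(vector)}f", *vector)

def unpack_vector(blob):
    """Decode bytes produced by pack_vector back into a list of floats."""
    if len(blob) % BYTES_PER_FLOAT != 0:
        raise ValueError("blob length must be a multiple of 4 bytes")
    count = len(blob) // BYTES_PER_FLOAT
    return list(struct.unpack(f"<{count}f", blob))

vector = [0.25, -0.5, 1.0]
blob = pack_vector(vector)
restored = unpack_vector(blob)

print(len(blob))   # 4 bytes per dimension
print(restored)
```

Under this encoding a 1536-dimension vector takes 6,144 bytes, considerably less than the same numbers written out as text or JSON.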
Search WordPress Content by Embedding Vector Distance
The search works by retrieving the embedding vector for the search query from an AI model and then calculating its distance to each of the vectors stored in the database. The stored vectors closest to the query vector identify the most relevant content.

Powering Retrieval-Augmented Generation (RAG)
The vector search results can be used to retrieve relevant content from your WordPress database that LLMs can use to generate better responses to any question. This is known as retrieval-augmented generation (RAG).
This is a good alternative to training a custom AI model on your content.
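A minimal sketch of the "augmentation" step is shown below: the passages returned by the vector search are combined with the question into a single prompt for the LLM. The `build_rag_prompt` helper, the prompt wording, the sample passages, and the character budget are all assumptions for illustration; sending the prompt to a model is left out.

```python
def build_rag_prompt(question, passages, max_chars=2000):
    """Combine retrieved (title, text) passages and the question into one prompt."""
    context_parts = []
    used = 0
    for title, text in passages:
        entry = f"## {title}\n{text}"
        if used + len(entry) > max_chars:
            break  # stop before exceeding the context budget
        context_parts.append(entry)
        used += len(entry)
    context = "\n\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical search results, ordered from most to least relevant.
passages = [
    ("Shipping policy", "Orders ship within two business days."),
    ("Returns", "Items can be returned within 30 days."),
]

prompt = build_rag_prompt("How fast do you ship?", passages)
print(prompt)
```

Because passages are added in relevance order, the budget check drops the least relevant content first when the retrieved text is too long.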
Suggested Reading
- Google Cloud SQL vector support
- MySQL Vector PHP library, which uses a custom MySQL function to calculate the cosine similarity between vectors stored in a JSON column type.