Embedding Vectors
Embedding vectors (sometimes just called embeddings) are one of the core technologies of modern LLMs. They’re not as flashy as products like ChatGPT, but they’re one of the key components of the AI revolution that’s upon us.
An embedding vector lets you take an arbitrary object (text, images, audio) and get a numerical representation that you can compare against others. For text, you can take a message like “I’m excited about today” and get back an array of numbers, like so:
-0.0037502456, -0.0031740714, -0.01605105, -0.022092355, -0.008311908, -0.0015324868, -0.015941953, -0.004718491.....
These numbers don’t mean anything on their own. Think of them as a very specific point in space: the closer two points are, the more “conceptually similar” the two blocks of text are, and we can measure that without looking at the individual words.
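To make "measuring similarity" concrete, here is a minimal sketch using cosine similarity, a common way to compare embedding vectors. The post doesn't say which measure it uses, so treat that choice as an assumption. The three short vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means same direction, near 0 means unrelated."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors standing in for real embeddings (assumption, not model output).
excited = [0.9, 0.1, 0.2]   # e.g. "I'm excited about today"
thrilled = [0.8, 0.2, 0.1]  # e.g. a message with a similar meaning
invoice = [0.1, 0.9, 0.7]   # e.g. an unrelated message

similar_score = cosine_similarity(excited, thrilled)
unrelated_score = cosine_similarity(excited, invoice)
print(similar_score > unrelated_score)
```

The two messages that point in roughly the same direction score much higher than the unrelated pair, and no word matching is involved.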
Searching based on embeddings is then straightforward - for a given query, just look for the blocks of text most similar to the input.
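As a rough sketch of that lookup, assume every block of text has already been turned into a vector and stored next to its text. The `search` function, the `index` list and the toy vectors below are hypothetical names and values for illustration, and cosine similarity is an assumed choice of measure. A brute-force scan like this works for small collections; larger ones typically use a dedicated vector index.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def search(query_vector, index, k=3):
    """Return the k (score, text) pairs most similar to query_vector, best first."""
    scored = ((cosine_similarity(query_vector, vec), text) for text, vec in index)
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


# Hypothetical pre-computed embeddings for four blocks of text (toy values).
index = [
    ("segment A", [0.9, 0.1, 0.0]),
    ("segment B", [0.1, 0.9, 0.1]),
    ("segment C", [0.8, 0.3, 0.1]),
    ("segment D", [0.0, 0.2, 0.9]),
]

# In practice this would be the embedding of the query text itself.
query_vector = [1.0, 0.2, 0.0]

results = search(query_vector, index, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

The query never has to share a word with the results; it only has to land near them in the vector space.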
This is what it means to “Start with the Answer”. You no longer have to think about the various ways to construct a query to find what you want; you just give an example of what you want to find and surface all similar results.
Searching for the Answer
Previous iterations of search had us use individual words to find what we’re looking for. It’s cumbersome and requires the user to come up with the exact words needed to make a result surface - just look at Notion’s search to see how crappy the experience can be.
Embedding search lets you find what you mean instead of finding what you said.
Starting with the answer lets you cast a broad net when pulling back results. You’re no longer searching for keywords; instead you’re crafting an example of what you want to find and pulling back results that are similar in concept.
Here’s an example: Let’s try to find people who are excited about the future of AI. Rather than trying to figure out a bunch of terms and ways to express sentiment, we can just use the phrase “AI is going to change the world for the better” and find what matches that.
Query: AI is going to change the world for the better
Results:
* AI has the potential to change many ways in which we thought about society, about what we're able to do, the problems we can solve.
* I've been working on AI for decades now, and I've always believed that it's going to be the most important invention that humanity will ever make.
* The increase in quality of life that AI can deliver is extraordinary. We can make the world amazing and we can make people's lives amazing. We can cure diseases. We can increase material wealth. We can help people be happier, more fulfilled, all of these sorts of things.
* 10 most valuable startups in the world that they're doing. 29 billion. Whether or not, I mean, it'll be interesting to see what I predicted my prediction stack that AI would be the technology of 2023. And I don't, and what I want to be clear, I don't know if it will actually have tangible impact that makes it the technology of the air, but it's going to be the next thing that everyone's going to get huge fomo around.
* A lot of the recent buzz has been around such generative AI models, particularly large language models. AI though is much broader than just large language models. I believe it's the most transformative technology of our time. Fully on par with the internet. Fully on par with the mass production of automobiles.
It’s as easy as that - no need to try to express exactly what you want to find in some complex query, just write out what you want to find and the embedding vectors do the rest.
Here’s another sample, for finding advertisements in transcripts:
Query: This podcast is sponsored by product
Results:
* This podcast is brought to you by Squarespace. You might be familiar with Squarespace because we talk about it a lot because it is one of the core products in the who weekly offerings we need it to literally exist.
* Yeah, and fire back. Let's stop for a minute and listen to work from one of our fine sponsors. This episode is supported by Noah, News Over Audio, which is an audio app offering narrated articles from some of the world's best publications, such as the economist Bloomberg, foreign affairs in the New York Times.
* today's episode is sponsored by Nerdwallet smart money podcast smart money is a fun way to gain knowledge you can use to level up your finances every week nerd wallets in house experts answer your real world money questions and deep dive into the things you've always wondered about
* Support for this podcast comes from the Corporation for Public Broadcasting, a private corporation funded by the American people.
* This podcast is supported by Emerson. From technology that delivers cures at warp speed to software that makes clean energy reliable, Emerson innovation helps make the world healthier, safer, smarter and more sustainable. Emerson, go boldly.
Or, if you just want to capture the mood on the economy:
Query: The US economy is not going into recession
Results:
* Economists and politicians have been warning about the possibility of a looming recession for months. But that hasn't happened yet.
* Yeah, Bank of America actually published a piece today saying like, yeah, you can just not say it like you can just get over the idea that we are not in a recession. And that we might not go into one. So I think that's the big thing is like, we're constantly talking about like, oh, are we going to enter a recession? Like a recession is coming, a recession is coming, and a recession has been coming since we hit last ad a recession. It seems like the economy is doing okay.
* There's no longer this assumption that we're in a recession. The data shows we're not in recession. And so, still hiring going on.
* That was the chair of the Federal Reserve Jerome Powell yesterday suggesting that a recession is not necessarily imminent anymore. It's a view that's been backed up by a report from the Commerce Department today that our gross domestic product, GDP, grew at a faster than expected 2.4% of its pace last quarter. Fourth consecutive quarter of growth after we had those two quarters of deaths. Commerce Department also reported today that durable goods orders. It is for purchases like new cars and appliances, rose 4.7% last month. That was much higher than expected.
Start with the Answer - by providing an example of what you mean and what you want to see, embedding vectors can surface it without you having to fight an awful term search.