Problem Solving with GPT-3, Customizing Semantic Search
Favorite Tweet of the Week
How can we overcome the limitations of language models through clever prompt design? GPT-3 was built to see language in terms of “tokens,” or multi-character chunks, rather than individual characters or words. The author of the tweet shows how GPT-3 struggles with something as simple as reversing a word (top tweet), then literally writes out an algorithm in plain English (bottom tweet) that allows GPT-3 to reverse words successfully!
The takeaway isn’t that we’ll be using Large Language Models like GPT-3 to reverse words; it’s that getting LLMs to solve problems leaves a lot of room for creativity and algorithmic thinking, much like writing code. The key difference is that LLMs let you do that creative problem solving in plain English, and I believe this is going to greatly increase the number of programmers in the world.
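As a rough illustration of the idea (my own sketch, not the tweet’s actual prompt — the model name, prompt wording, and worked example are all assumptions):

```python
import openai  # legacy (<1.0) client; reads OPENAI_API_KEY from the environment

# Instead of just asking "reverse this word", spell out the algorithm and
# walk through one example. GPT-3 sees tokens, not letters, so forcing it
# to first split the word into letters gives it something it can work with.
prompt = """To reverse a word:
1. Split the word into individual letters, separated by spaces.
2. Write the letters in reverse order.
3. Join the reversed letters back into a single word.

Word: elephant
Letters: e l e p h a n t
Reversed letters: t n a h p e l e
Reversed word: tnahpele

Word: kitchen
Letters:"""

response = openai.Completion.create(
    model="text-davinci-002",  # assumption; any GPT-3 completion model works here
    prompt=prompt,
    max_tokens=60,
    temperature=0,
)
print(response.choices[0].text)
```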
What I Learned This Week
If you work in a domain with specific terminology and dense documentation, such as insurance, then a handy domain-specific semantic search model that lets you ask questions about the documentation could be a huge productivity boost. Unfortunately, customizing or fine-tuning semantic search models requires a large data-labeling effort: you need to come up with many possible questions that would each return a specific paragraph in the documentation.
These question-passage pairs require a lot of time and money to generate.
GenQ to the rescue! I came across this tutorial from the vector database company Pinecone. It highlights a method called GenQ that automatically generates these question-passage pairs!
The only input required is the set of documents you want to search against (I sketch the workflow in code below). The implications are huge, because you could:
- Approach the quality of a model trained on a huge curated question-passage dataset
- Customize search models at scale without spending lots of money and time
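To make this concrete, here is a minimal sketch of the generation step. The checkpoint (a T5 model fine-tuned on MS MARCO to turn passages into queries) and the example passage are my assumptions, not necessarily exactly what the Pinecone tutorial uses:

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

# A T5 checkpoint fine-tuned to generate search queries from passages
# (assumed name; swap in whichever query-generation model you prefer).
model_name = "BeIR/query-gen-msmarco-t5-base-v1"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

passage = (
    "A deductible is the amount you pay for covered health care services "
    "before your insurance plan starts to pay."
)

inputs = tokenizer(passage, return_tensors="pt")

# Sampling (rather than greedy decoding) yields more varied questions,
# and several queries per passage multiplies the training data for free.
outputs = model.generate(
    **inputs,
    max_length=64,
    do_sample=True,
    top_p=0.95,
    num_return_sequences=3,
)

for output in outputs:
    query = tokenizer.decode(output, skip_special_tokens=True)
    print(query, "->", passage[:60], "...")
```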
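Once the synthetic question-passage pairs exist, fine-tuning a retriever can follow the standard sentence-transformers recipe with MultipleNegativesRankingLoss, which treats the other passages in each batch as negatives, so only positive pairs are needed. A sketch, with the base encoder and hard-coded pairs as assumptions:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# In practice these come from the generation step above; two hypothetical
# pairs are hard-coded here so the snippet is self-contained.
pairs = [
    InputExample(texts=[
        "what is a deductible in health insurance",
        "A deductible is the amount you pay for covered health care services "
        "before your insurance plan starts to pay.",
    ]),
    InputExample(texts=[
        "when does coinsurance apply",
        "Coinsurance is your share of the costs of a covered service, "
        "calculated as a percentage of the allowed amount.",
    ]),
]

loader = DataLoader(pairs, batch_size=2, shuffle=True)

model = SentenceTransformer("distilbert-base-uncased")  # assumed base encoder

# Only positive pairs are required: the loss ranks each query's true passage
# above the other passages in the batch.
loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
```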