Getting Your Data Into ChatGPT's Results Like a Pro

How ChatGPT can use your private data to give you the best possible results

Hey there, teammate! Welcome to the next edition of the most practical newsletter on AI in the world!

Each week, I'll cut through the noise surrounding AI to serve you practical insights that not only help transform your business but also empower you, the business leader, to evolve into a data-driven, AI-enabled visionary.

The AI noise is constant and loud, and you are here for a quiet meal of AI goodness, so let’s get right to it! Bon Appétit!

Today’s Menu:

Private Data in Your Results

Action Item - Preparing for Grounding

Learning Cube - Grounding, Retrieval Augmented Generation (RAG), Embedding

Tool Spotlight - Chroma, Weaviate and Pinecone; LangChain

Cool Factor - Google Generative AI Learning Path

June 26, 2023

Everyone has started to see the limitations of ChatGPT, where it “hallucinates” and returns fake data, or returns some real data mixed in with some hallucinated data. That makes it hard to trust ANY results you get from it!

This is a frustrating but normal occurrence for a language model.

You have to keep in mind that LLMs (large language models) like GPT were trained on data to build language patterns; they are not databases or huge data sets built for searching down the road.

The data that helps with a language pattern becomes part of the language pattern — so you are kind of seeing a reflection when you get results from ChatGPT. It is just a reflection of the language pattern, not current real data. A good amount of the time, it is real data that is easy to use. But as our prompts and questions get more complex, the waters get a bit murky and sometimes the results show that murkiness.

One of the best ways to counteract this limitation of the LLM is to feed it real data along with your question. You can do this today with ChatGPT. Give it real data to work with as context, and your question will get MUCH better results.

This is how you get results that sound like you, or follow the way your processes work. You can give it examples of your writing, and it will do things your way when generating content. You can feed it P&L statements and have it correlate them across months and years, reporting the results in natural language.
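For example, here is a minimal sketch of that “paste your data into the prompt” approach using the OpenAI Python library. The context, numbers, and question are all made up for illustration:

```python
import openai  # pip install openai; assumes openai.api_key is already set

# Hypothetical private data pasted in as context
context = (
    "Q2 P&L summary (made-up numbers): "
    "Revenue $412,000, Expenses $318,000, Net $94,000."
)
question = "What was our net income in Q2, and what was our expense-to-revenue ratio?"

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Answer using ONLY the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(response.choices[0].message["content"])
```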

The next wave in AI for 2023 is “Grounding”. You add real data as context to your prompts, thus grounding the LLM with real data. Instead of cutting and pasting as mentioned above, you would have a whole huge database of your “stuff” that is queried against for that context.

Relationships between items across your entire organization are formed, so your prompts can access data from all over the place to provide the BEST and most accurate result for you.

The idea is to take your prompt, create a “semantic search” (searching for things similar to what you are talking about) across all of your real data, and use those results as context for your prompt.

So you still just put in your prompt, and get a result. But the system pulls out and includes real context from your real data. The stuff behind the firewall. Your financial data, company data, processes and SOPs, databases and customer lists.
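Here is a bare-bones sketch of that flow in Python. I’m using OpenAI’s embeddings endpoint to create the vectors, the documents and prompt are made up, and a real system would use a vector database instead of a Python list, but it shows the moving parts:

```python
import numpy as np
import openai  # assumes openai.api_key is already set

def embed(text: str) -> np.ndarray:
    """Turn a piece of text into an embedding vector."""
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return np.array(resp["data"][0]["embedding"])

# Hypothetical data from "behind the firewall"
documents = [
    "SOP-14: Refunds over $500 require manager approval within 2 business days.",
    "Acme Corp renews every January; the contact is J. Smith.",
    "May P&L: revenue $412k, expenses $318k.",
]
doc_vectors = [embed(d) for d in documents]

prompt = "What is our approval process for large refunds?"
query_vec = embed(prompt)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Semantic search: pick the document most similar to the prompt
best = documents[int(np.argmax([cosine(v, query_vec) for v in doc_vectors]))]

# Ground the prompt with the retrieved context before sending it to the LLM
grounded_prompt = f"Context:\n{best}\n\nQuestion: {prompt}"
```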

For a more detailed, but not too deep or geeky read, see Microsoft’s “Grounding LLMs” blog post https://techcommunity.microsoft.com/t5/fasttrack-for-azure/grounding-llms/ba-p/3843857

Want to feature your service or product in the world’s most practical AI newsletter? Email me at [email protected] for more information.

Action Item

Preparing For Grounding

Grounding sounds immediately beneficial, doesn’t it?

You take all your documents, all your data, all your everything, and stuff it all in a big database that is searchable across the whole thing. Then when you write a prompt, the underlying system queries all that data goodness and updates your prompt with all that yummy context, giving you the best results in the whole wide world!

That is the ideal, yes. And it IS possible in most cases, for sure.

But it doesn’t just HAPPEN.

For this to go well for you, for you to become the data-driven, AI-enabled visionary, you must plan your work carefully. Why?

1) Embedding can be expensive. Embedding all of your documents and data can cost a couple thousand dollars just to create and store the vectors for everything. (See today’s Learning Cube for info on Embeddings)

2) You don’t want to embed the things you will never search for. Start with what you know you will search against. What is 80% of your question space around? It will be 20% of your stuff. Create embeddings for THOSE.

3) Plan for the simple, wait on the complex. There are so many freaking advantages you can get by automating or adding AI to the simple things you do every day. Save yourself 15 min a day for a year, and you have gained back 65 hours for revenue-generating activities. Make it so that you work less.

Knowing what you do, what you have to work with, and where the low-hanging fruit is will make all the difference.

Start doing this today, even before you know the tools and people you will need.

Learning Cube

Grounding

Grounding is the process of using LLMs with relevant information that is use-case specific, and not available as part of the LLM's trained knowledge. You usually ground an LLM with your own data, relevant website pages, and databases of information to give your prompts the best context for the LLM to work from.

Grounding is like tailoring a suit. An off-the-rack suit fits most people well enough, but it takes tailoring to fit you. In the same way, despite their vast general knowledge, LLMs still need 'grounding' with use-case-specific details to deliver the best results for each scenario.

Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) is a grounding technique. It is a process for retrieving information relevant to a prompt, providing it to the language model along with the prompt, and relying on the LLM to use it as context when responding.

Grounding is the WHAT: what all your data is for…
RAG is the HOW: how your data gets to the LLM for use…

Embedding

Embedding is the technique of making all of your data searchable. Your documents and data get transformed into “vectors” with thousands of fields per vector. Then field #1 of the vector for an SOP can be compared to field #1 of the vector for an email you wrote, and it will be an apples-to-apples comparison.

Think of how you describe something having a particular color. Color could be your field #1. My pool ball and your blade of grass will each have a color…a value in field #1. We can compare these two very different things using that same field. Both might be “green”. But one might be “dark green” while the other is “light green”. So the amount of green matters.

Embedding gives you a way of searching across all these things. If someone says “Give me all my green items”, then the pool ball and the blade of grass might both be returned. But if someone says “Give me all the dark green items”, then only one of the two will be returned, because the prompt had a specific restriction that the search saw in field #1.
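Here is a toy illustration of that comparison using made-up 3-field vectors. Real embeddings have thousands of fields, but the math works the same way:

```python
import numpy as np

# Toy vectors: [amount of green, roundness, is it alive]
pool_ball      = np.array([0.9, 1.0, 0.0])  # dark green, round, not alive
blade_of_grass = np.array([0.4, 0.0, 1.0])  # light green, flat, alive
query          = np.array([1.0, 0.0, 0.0])  # "give me green things"

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(query, pool_ball))       # ~0.67, scores higher on "green"
print(cosine(query, blade_of_grass))  # ~0.37, scores lower
```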

So you GROUND the LLM with all of your documents and data, to ensure it has what it needs to give good results.
You use EMBEDDINGS of your documents and data to make them searchable and usable.
You use the RAG technique to do the actual search and add the results to the prompt for the LLM to use.

Tool Spotlight

Chroma, Weaviate and Pinecone; LangChain

In order to quickly process and efficiently store your documents as embeddings, several companies have brought out vector database offerings. There is nothing special about the vectors…they are just numbers being stored. But searching them effectively and quickly makes a big difference.

Chroma, Weaviate and Pinecone — These 3 are the current big names in the vector database world. There are others, and more will come out. (Side note: I have used Google Sheets as my vector database in small use cases just fine.)

Weaviate and Pinecone are better for lots of data with lots of relationship exploration. Chroma is better for smaller loads and general solid use. I am using Chroma for most of my current work these days simply because I don’t need the bigger database offerings. All 3 have their place and are excellent.

Chroma is designed as a database that learns. It provides a simple API with basic commands for creating collections, adding documents, and querying the database. Users can bring their own vector embeddings (I use OpenAI’s embeddings API, for example) or use Chroma's built-in embedding capabilities. It also supports filtering on arbitrary metadata. Chroma is currently a database that sits on your own servers, with a hosted version being planned.
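Here is roughly what working with Chroma looks like in Python. The collection name and documents are made up, and the API shown is the mid-2023 version:

```python
import chromadb  # pip install chromadb

client = chromadb.Client()  # in-memory client; persistent/server setups also exist

collection = client.create_collection(name="company_docs")

# Add documents; Chroma can embed them for you, or you can pass your own embeddings
collection.add(
    documents=[
        "SOP-14: Refunds over $500 require manager approval.",
        "Acme Corp renews every January; the contact is J. Smith.",
    ],
    metadatas=[{"type": "sop"}, {"type": "crm"}],
    ids=["doc-1", "doc-2"],
)

# Semantic query plus a metadata filter
results = collection.query(
    query_texts=["What is the approval process for large refunds?"],
    n_results=1,
    where={"type": "sop"},
)
print(results["documents"])
```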

I like what their own site says about their product:

Developers use Chroma to give LLMs pluggable knowledge about their data, facts, tools, and prevent hallucinations. Many developers have said they want "ChatGPT but for my data" - and Chroma provides the "for my data" bridge through embedding-based document retrieval. 

Weaviate is a vector database that can scale to handle billions of data objects. Weaviate supports vector search, allowing users to index billions of data objects for searching. It also supports hybrid search techniques, combining keyword-based search with vector search for state-of-the-art results. Especially relevant to grounding, it offers a generative search feature, where search results can be piped through language models like GPT-3 to generate next-gen search experiences.

“Weaviate is an open-source vector database. It allows you to store data objects and vector embeddings from your favorite ML-models, and scale seamlessly into billions of data objects.”

- their website
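Here is a quick sketch of the same idea with the Weaviate Python client (the v3-era API; the class name and data are made up, and it assumes a running Weaviate instance with the OpenAI vectorizer module enabled):

```python
import weaviate  # pip install weaviate-client

client = weaviate.Client("http://localhost:8080")  # your Weaviate instance

# Define a class (schema) whose objects Weaviate will vectorize automatically
client.schema.create_class({
    "class": "CompanyDoc",
    "vectorizer": "text2vec-openai",
})

client.data_object.create(
    {"text": "SOP-14: Refunds over $500 require manager approval."},
    "CompanyDoc",
)

# Semantic (near-text) search across the class
result = (
    client.query
    .get("CompanyDoc", ["text"])
    .with_near_text({"concepts": ["refund approval process"]})
    .with_limit(3)
    .do()
)
print(result)
```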

Pinecone is a cloud-based vector database and indexing service provided by Pinecone Systems Inc. It is designed for high-performance vector similarity search. Pinecone aims to simplify the process of building and deploying vector similarity search applications. It provides an API for indexing and searching high-dimensional vectors efficiently. Pinecone leverages an approximate nearest neighbor (ANN) algorithm to speed up similarity searches, enabling real-time responses even with large-scale datasets. It also offers additional features such as automatic indexing, dynamic data updates, and support for incremental learning, making it suitable for real-time applications.
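And a sketch of the Pinecone flow (2023-era client; the API key, environment, index name, and vectors are placeholders). Pinecone stores and searches vectors you create yourself, for example with OpenAI’s embeddings API:

```python
import pinecone  # pip install pinecone-client

pinecone.init(api_key="YOUR_API_KEY", environment="us-west1-gcp")  # placeholders

# 1536 dimensions matches OpenAI's text-embedding-ada-002 vectors
pinecone.create_index("company-docs", dimension=1536, metric="cosine")
index = pinecone.Index("company-docs")

# Upsert (id, vector, metadata) tuples; the vectors here are dummy values
index.upsert(vectors=[
    ("doc-1", [0.01] * 1536, {"source": "sop-14.pdf"}),
    ("doc-2", [0.02] * 1536, {"source": "crm-notes.txt"}),
])

# Query with the embedding of your prompt to get the closest documents back
results = index.query(vector=[0.01] * 1536, top_k=2, include_metadata=True)
print(results)
```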

LangChain is an incredible tool for building applications across LLMs and vector databases and, and, and… It is a growing tool!

So this app builder is open-sourced on GitHub, where you can read up on it and all it can do. I will let its own documentation give you the basic summary:

“There are six main areas that LangChain is designed to help with. These are, in increasing order of complexity:

📃 LLMs and Prompts:

This includes prompt management, prompt optimization, a generic interface for all LLMs, and common utilities for working with LLMs.

🔗 Chains:

Chains go beyond a single LLM call and involve sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.

📚 Data Augmented Generation:

Data Augmented Generation involves specific types of chains that first interact with an external data source to fetch data for use in the generation step. Examples include summarization of long pieces of text and question/answering over specific data sources.

🤖 Agents:

Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.

🧠 Memory:

Memory refers to persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.

🧐 Evaluation:

[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.”

This is an emerging, growing, and now well-financed application builder. It is complex, and you need to know a bit of coding to work it, so I would not recommend it for power-user work. However, those used to building applications will find it rather straightforward (for now).
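To make that concrete, here is a minimal grounding (RAG) sketch using the mid-2023 LangChain API with Chroma and OpenAI. The documents and question are made up:

```python
# pip install langchain chromadb openai tiktoken
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

docs = [
    "SOP-14: Refunds over $500 require manager approval within 2 business days.",
    "Acme Corp renews every January; the contact is J. Smith.",
]

# Embed the documents and store them in a Chroma vector database
vectorstore = Chroma.from_texts(docs, OpenAIEmbeddings())

# A chain that retrieves relevant documents and feeds them to the LLM as context
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
    retriever=vectorstore.as_retriever(),
)

print(qa.run("What is our approval process for large refunds?"))
```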

This is what everyone who is building real AI apps will be using for the coming year or two. Don’t get me wrong, there are other, well-worn tools that will do the job just fine to build you a killer AI app. But LangChain is current thought, current technology, and built around AI.

Cool Factor

Google has done something that is cool for so many reasons. They have created, and put out for free, a learning path for people to better understand Generative AI.

It has the general introduction you will need to head into this AI-enabled future, but it also has some more intermediate-level “introductions” to things like the “Attention Mechanism”.

I highly recommend going through the learning path just so you get the big picture better, and can begin to think like an AI-enabled visionary and entrepreneur!

Feature your service/product in the world’s most practical AI newsletter

Practical AI is the world’s most practical AI newsletter with subscribers from many different industries and countries, all looking to make use of tools and services to bring AI into their businesses. You can book your ad spot by emailing me at [email protected]

In Closing…

What did you think of this week’s newsletter?

Your feedback helps me create better content for you!

Just reply back to this, or email me directly at [email protected], and let me know what you think:

  • 5 Stars - Loved It!

  • 3 Stars - Meh…not bad

  • 1 Star - This sucks

If you want to sign up for this newsletter or share it with a friend, you can find us right here

Thanks for reading. Let me know if you applied anything from this newsletter!

See you next week!

Greg