Responses that are more factual, less toxic, and better aligned to specific requirements. There are two main approaches that you can take for fine-tuning depending on your use case and chosen foundation model. If you are interested in fine-tuning your model on domain-specific data, see domain adaptation fine-tuning. If you are interested in instruction-based fine-tuning using prompt and response examples, see instruction-based fine-tuning.

Retrieval Augmented Generation

Foundation models are usually trained offline, making the model agnostic to any data that is created after the model was trained. Additionally, foundation models are trained on very general domain corpora, making them less effective for domain-specific tasks. You can use Retrieval Augmented Generation (RAG) to retrieve data from outside of the foundation model and augment your prompts by adding the relevant retrieved data in context. For more information about RAG model architectures, see Retrieval Augmented Generation for Knowledge-Intensive NLP Tasks.

With RAG, the external data used to augment your prompts can come from multiple data sources, such as document repositories, databases, or APIs. The first step is to convert your documents and any user queries into a compatible format to perform relevancy search. To make the formats compatible, a document collection, or knowledge library, and user-submitted queries are converted to numerical representations using embedding language models. Embedding is the process by which text is given a numerical representation in a vector space. RAG model architectures compare the embeddings of user queries with the vectors of the knowledge library. The original user prompt is then appended with relevant context from similar documents within the knowledge library. This augmented prompt is then sent to the foundation model. You can update knowledge libraries and their relevant embeddings asynchronously.
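To illustrate that retrieve-and-augment flow, the following minimal sketch embeds a small knowledge library, finds the documents most similar to a user query by cosine similarity, and appends them to the prompt that would then be sent to the foundation model. The `embed` function, the sample documents, and the prompt template are assumptions made for the example; in practice the vectors would come from an embedding language model and the knowledge library would typically live in a vector store that is updated asynchronously.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical embedding function: a stand-in for any embedding
    language model. Here it returns a hash-seeded random unit vector
    so the sketch runs end to end without an external service."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

# Knowledge library: documents converted to embeddings ahead of time.
documents = [
    "Our return policy allows refunds within 30 days of purchase.",
    "Shipping to international destinations takes 7-14 business days.",
    "Premium support is available 24/7 for enterprise customers.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Compare the query embedding with the knowledge-library vectors
    (dot product equals cosine similarity here, since all vectors are
    unit-normalized) and return the k most relevant documents."""
    scores = doc_vectors @ embed(query)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

def augment_prompt(query: str) -> str:
    """Append the retrieved context to the original user prompt before
    sending the augmented prompt to the foundation model."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(augment_prompt("How long do I have to return an item?"))
```

Because the retrieval step only depends on the stored vectors, the knowledge library and its embeddings can be refreshed on their own schedule without retraining or redeploying the foundation model.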