New York
Wednesday, November 13, 2024

Optimizing your generative AI deployment with new accelerators


The journey from a great idea for a generative AI use case to its implementation in a production environment often resembles navigating a maze. Each turn presents new challenges (technical obstacles, security concerns, or shifting priorities) that can stall progress or even force you to start over.

Cloudera recognizes the difficulties many companies face when embarking on this path, which is why we created Accelerators for Machine Learning Projects (AMPs). AMPs are fully developed machine learning prototypes that can be deployed with a single click directly from Cloudera Machine Learning. AMPs allow data scientists to go from an idea to a fully functional ML use case in a fraction of the time. By providing pre-built workflows, best practices, and integration with enterprise-grade tools, AMPs eliminate much of the complexity involved in building and deploying machine learning models.

In line with our ongoing commitment to supporting machine learning professionals, Cloudera is pleased to announce the launch of five new accelerators. These cutting-edge tools focus on hot topics in generative AI, enabling companies to unlock innovation and accelerate the development of impactful solutions.

Fine Tuning Studio

Fine-tuning has become an important methodology for creating specialized large language models (LLMs). Because LLMs are trained on essentially the entire Internet, they are generalists capable of doing many different things well. However, for them to truly excel at specific tasks, such as code generation or translation of rare dialects, they must be fine-tuned on a more focused, specialized data set. This process lets the model refine its understanding and adapt its outputs to the nuances of the specific task, making it more accurate and efficient in that domain.
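A common first step in fine-tuning is rendering a specialized data set into a consistent instruction-following format. The sketch below is purely illustrative (the template and field names are assumptions, not a Cloudera or Fine Tuning Studio API):

```python
# Sketch: preparing a small domain-specific dataset for instruction fine-tuning.
# The prompt template and field names are illustrative assumptions only.

def format_example(instruction: str, response: str) -> str:
    """Render one training example in a simple instruction-following template."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"

# A toy specialized dataset, e.g. for a rare-dialect translation task.
raw_examples = [
    {"instruction": "Translate 'good morning' into the target dialect.",
     "response": "bonghjornu"},
    {"instruction": "Translate 'thank you' into the target dialect.",
     "response": "vi ringraziu"},
]

# These formatted strings would be tokenized and fed to a fine-tuning job.
training_texts = [format_example(e["instruction"], e["response"])
                  for e in raw_examples]
```

A real run would then hand `training_texts` to a training framework; the key point is that every example shares one template so the model learns a predictable input/output structure.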

Fine Tuning Studio is an AMP developed by Cloudera that gives users a comprehensive application and “ecosystem” for managing, fine-tuning, and evaluating LLMs. The application acts as a launcher that helps users organize and dispatch other Cloudera Machine Learning workloads (primarily through the Jobs feature) that are specifically configured for LLM training and evaluation tasks.

RAG with knowledge graph

Retrieval Augmented Generation (RAG) has become one of the default methodologies for adding context to an LLM’s responses. This application architecture uses prompt engineering and vector stores to provide an LLM with new information at inference time. However, the performance of RAG applications is far from perfect, prompting innovations such as the integration of knowledge graphs, which structure data into interconnected entities and relationships. This addition improves retrieval accuracy, contextual relevance, reasoning capabilities, and domain-specific understanding, raising the overall effectiveness of RAG systems.
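To make the retrieval step concrete, here is a minimal sketch of it, using a toy bag-of-words “embedding” and cosine similarity in place of a real embedding model and vector store (all names and documents below are made up for illustration):

```python
# Minimal sketch of the retrieval step in a RAG pipeline.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "Fine-tuning adapts a pretrained model to a narrow task.",
    "Knowledge graphs store entities and relationships.",
    "Vector stores retrieve documents by embedding similarity.",
]

def retrieve(query: str, k: int = 1) -> list:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

# The retrieved passage is prepended to the question at inference time.
question = "How do vector stores find relevant documents?"
context = retrieve(question)
prompt = f"Context: {context[0]}\n\nQuestion: {question}"
```

A production system swaps the toy pieces for a real embedding model and vector database, but the shape of the flow (embed, rank, stuff into the prompt) stays the same.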

RAG with Knowledge Graph demonstrates how integrating knowledge graphs can improve RAG performance, using a solution designed for academic research paper retrieval. The solution ingests significant AI/ML papers from arXiv into Neo4j, which serves as both the knowledge graph and the vector store. For the LLM, we use Meta-Llama-3.1-8B-Instruct, which can be run either remotely or locally. To highlight the improvements that knowledge graphs bring to RAG, the user interface compares results with and without the knowledge graph.
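The sketch below shows, in miniature, how a knowledge graph supplements vector retrieval: given an entity mentioned in the query, pull its directly connected facts as structured context. The triples and entity names are invented for illustration; the AMP itself would query Neo4j (e.g. via Cypher) rather than an in-memory list.

```python
# Toy knowledge graph as (subject, relation, object) triples.
# A real deployment would hold these in Neo4j and query them with Cypher.
triples = [
    ("Transformer", "INTRODUCED_IN", "Attention Is All You Need"),
    ("Attention Is All You Need", "PUBLISHED_ON", "arXiv"),
    ("Llama 3.1", "BASED_ON", "Transformer"),
]

def neighborhood(entity: str) -> list:
    """Return human-readable facts involving the entity, one hop away."""
    facts = []
    for s, r, o in triples:
        if entity in (s, o):
            facts.append(f"{s} {r.replace('_', ' ').lower()} {o}")
    return facts

# Structured facts appended to the vector-retrieved passages before prompting.
facts = neighborhood("Transformer")
```

Because the graph returns explicit relationships rather than just similar text, the LLM receives context it can reason over (who introduced what, what builds on what), which is where the accuracy gains over plain vector RAG come from.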

PromptBrew by Vertav

80% of generative AI success depends on prompts, and yet most AI developers can’t write good prompts. This gap in prompt engineering skills often leads to suboptimal results, since the effectiveness of generative AI models depends largely on how well they are guided by their instructions. Crafting precise, clear, and contextually appropriate prompts is crucial to getting the most out of a model’s capabilities. Without well-designed prompts, even the most advanced models can produce irrelevant, ambiguous, or low-quality results.

PromptBrew provides AI-powered support to help developers create reliable, high-performing prompts with ease. Whether you’re starting from a specific project goal or a draft prompt, PromptBrew guides you through a streamlined process, offering suggestions and optimizations to refine your prompts. By generating multiple candidate prompts and recommending improvements, it helps ensure your inputs are tailored for the best possible results. These optimized prompts can then be integrated seamlessly into your project workflow, improving performance and accuracy in generative AI applications.

Chat with your documents

This AMP shows how to create a chatbot using an open source, pre-trained, instruction-following large language model (LLM). The chatbot’s responses are improved by providing context from an internal knowledge base, created from documents uploaded by users. This context is retrieved using semantic search, powered by an open source vector database.
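Before documents can be searched semantically, they have to be split into chunks small enough to embed and retrieve. The following is a minimal sketch of that ingestion step; the chunk size and overlap values are illustrative assumptions, not the AMP’s actual configuration:

```python
# Sketch: splitting an uploaded document into overlapping chunks before
# they are embedded into the vector database. Values are illustrative.

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list:
    """Split text into fixed-size character chunks that overlap, so that
    context spanning a chunk boundary is not lost."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

document = ("word " * 100).strip()  # stand-in for an uploaded user document
chunks = chunk_text(document, chunk_size=120, overlap=30)
```

Each chunk is then embedded and stored; at question time, semantic search returns the most relevant chunks as context for the chatbot’s answer.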

Compared with the original LLM Chatbot Augmented with Enterprise Data AMP, this version adds new features such as user document ingestion, automatic question generation, and streamed responses. It also leverages Llama Index to implement the RAG pipeline.

