How Hypotenuse AI uses AWS to make LLMs more factually accurate

Hypotenuse AI is an AI writer built for ecommerce brands and SEO teams to manage and create content. Here's how they do it in a more contextual and factually accurate way with AWS OpenSearch.

Glendon Thaiw
Amazon Employee
Published Jul 3, 2024
Last Modified Jul 4, 2024
Co-authored by Hypotenuse AI & Amazon Web Services (AWS)
  • Joshua Wong, Founder & CEO, Hypotenuse AI
  • Ng Shi Hui, Marketing Lead, Hypotenuse AI
  • AI writer, Hypotenuse AI
  • Glendon Thaiw, Startup Solutions Architect, AWS
Have you been living under a rock?
You might have heard that phrase before. Maybe it wasn’t directed at you, but at someone who didn’t seem to know what was happening in the outside world.
They may know plenty about what they’ve been exposed to within their own circle, but some popular, widely reported news never reached their radar.
Large Language Models (LLMs) have a bit of that quirk.
LLMs are machine learning models that can understand and generate natural human language. They’re trained on large amounts of data from across the web using deep learning techniques, specifically a variant of neural networks known as transformers. While LLMs have been extended to be multi-modal (e.g. taking in text and images), we will focus on text use cases here.
They are the underlying model behind the wave of new AI chatbots, search engines, and text generators we use today. Instead of just giving templated responses, LLMs can understand your question in its context and provide a tailored response.

The problem with LLMs

LLMs are a powerful way to streamline and speed up your current workflows in areas such as content creation, programming, and summarizing large documents.
To do this, they first undergo extensive pre-training on large datasets, which can contain a significant proportion of the text available on the Internet. During this phase, their primary task is to learn to predict the next word (or, more precisely, the next token). In learning to do this well, LLMs also pick up the structures, patterns, and nuances of language, some level of reasoning, and “remember” some of the world knowledge contained in their training data.
After pre-training, LLMs often undergo further training, such as Reinforcement Learning from Human Feedback (RLHF), to align them with certain objectives: responding conversationally, following instructions, or steering away from certain topics.
When it’s time to generate text (whether to answer a question, write a summary, or create content), the input text is broken into tokens, and the LLM predicts the next token based on the input and what it learned during training. Because of the way LLMs are trained and how they produce output, they come with flaws, including hallucinations, bias, and plain mistakes.
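To make next-token prediction concrete, here is a toy sketch in Python. It is not how a real LLM works internally: real models use transformer networks over subword tokens, whereas this simply counts which word follows which in a three-sentence corpus invented for illustration. The names `tokenize`, `train_bigram`, and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict


def tokenize(text):
    # Toy tokenizer (assumption): lowercase and split on whitespace.
    # Real LLM tokenizers split text into subword tokens instead.
    return text.lower().split()


def train_bigram(corpus):
    # For every token, count which tokens follow it in the corpus.
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = tokenize(sentence)
        for current, following in zip(tokens, tokens[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, prompt):
    # Predict the most frequent follower of the prompt's last token.
    tokens = tokenize(prompt)
    if not tokens or tokens[-1] not in counts:
        return None  # the toy model has never seen this token
    return counts[tokens[-1]].most_common(1)[0][0]


# Invented training data, for illustration only.
corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the cat slept",
]
model = train_bigram(corpus)

print(predict_next(model, "I saw the"))  # most frequent word after "the"
print(predict_next(model, "the dog"))    # "dog" never appeared in training
```

Trained on those three sentences, the toy model suggests "cat" after "the" and has nothing to offer for a word it never saw, such as "dog". An LLM in the same position still emits some token, which is one way made-up answers arise.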
Hallucinations & Bias
Large language models (LLMs) tend to “hallucinate”. That means generating inaccurate or made-up facts that don't align with reality. This is a major issue, as it can mislead users who expect factual and trustworthy information.
Here are some reasons why this happens:
  • Outdated information: An LLM is trained on a fixed dataset before it goes live. Once live, it doesn’t continue to train on new data. No matter how up-to-date the dataset is, it quickly goes stale, because new information is created every day, whether about newly elected politicians, today’s weather, or recent celebrity events. When asked about something new, an LLM may simply not know about it and may hallucinate a response, because it still has to predict a next word.
  • Incomplete representation: LLMs don’t have explicit memory or a database they can reference. Even if a model was trained on the exact facts, it can still make mistakes or misremember them.
  • Limitations of its training data: LLMs are trained on large datasets that may contain inaccuracies and biases, which the model might learn and reproduce.
  • Insufficient data: To satisfy the inquirer, the LLM might try to produce a response even if it doesn’t have the data to support it. That makes it likely to be false and inaccurate.
  • Misinterpretation: The way a question or prompt is phrased can lead to the model misinterpreting the user’s intent, therefore pulling from unrelated contexts and data points and generating fabricated responses.

Generic & Irrelevant

LLMs often produce responses that may not be relevant or specific to the user. They lack context and are often trained on generic datasets across a wide range of topics, as they’re designed to be scalable and broadly applicable.
With these limitations, users will have to do their own research and triple-check what the LLMs are producing. That sometimes leads to lower trust levels.

What is Hypotenuse AI?

Hypotenuse AI is an AI writer built for ecommerce brands and SEO teams to manage and create content with AI. They can create SEO blog articles, optimize product listing pages for ecommerce websites, and generate social media captions in a way that captures a brand’s voice.
They’ve built and trained bespoke AI models specialized for ecommerce that align strongly with each brand’s writing style. Beyond the models, the platform combines and serves this in smooth workflows that integrate with ecommerce systems and handle product information in bulk, without needing a huge change in existing processes.
On Hypotenuse AI, ecommerce brands can write or rewrite product descriptions in bulk while maintaining the quality of content on every product detail page and listing page. This is done by enriching your product with information from the web or even your product image.

Factual accuracy

When you write blog articles on the platform, it doesn’t just produce strings of text based on the datasets it gets trained on. Their factual research feature pulls facts from across the web in real-time and applies them to the content produced.
In their SEO pro mode, the generated blog articles also include hyperlinks that connect to updated reputable sources on the internet.

Contextual knowledge

With their company knowledge feature, brands get a bespoke AI model trained on company material, past writings, perspectives, and anything else related to the company, so that the content produced is written in the right context.

Hypotenuse AI’s approach to factual accuracy

Despite the limitations mentioned above, Hypotenuse AI can create content that is relevant and factually accurate with the help of AWS.

Proprietary knowledge bases

Instead of using generic datasets, Hypotenuse AI models are trained on high-converting marketing content and proprietary knowledge bases that are unavailable to others.
This means that the content produced is designed for a very targeted purpose—like a highly-experienced marketer writing content compared to a generalist.


While LLMs produce responses based on a fixed dataset, there are ways that Hypotenuse AI extends this to other datasets that are relevant to the request at hand.
One way to do this is through AWS OpenSearch. It enables LLMs to retrieve data from external sources that may have been too large for LLMs alone to deal with. This process is called Retrieval Augmented Generation.

Retrieval Augmented Generation

Retrieval Augmented Generation (or RAG) allows language models to tap into external data sources like the internet or internal knowledge bases to augment their response.

Augmenting LLMs using OpenSearch

AWS OpenSearch service acts as the retriever component that supplies relevant contextual data to the language model.
During setup, the knowledge source (e.g. our internal knowledge base) is first indexed by OpenSearch, creating a precise and fast way of retrieving relevant information.
At inference time, each incoming query is sent to OpenSearch. OpenSearch finds the information most relevant to what was asked and returns it to Hypotenuse AI’s LLM.
This retrieved information is combined with the original user query and given to the LLM to generate a relevant response for the user.

By using relevant and precise data provided by OpenSearch to augment the LLM’s inputs, the LLM can generate a much more accurate response and make use of that information rather than hallucinating its own.
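The index, retrieve, and augment steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Hypotenuse AI's implementation: an in-memory term-overlap retriever stands in for OpenSearch so the example runs without a cluster, the product facts are invented, and all function names are hypothetical. `build_search_body` shows the shape of a basic OpenSearch `match` query; the field name `text` is an assumption.

```python
import re


def terms(text):
    # Lowercased alphanumeric terms, used for both documents and queries.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents):
    # Setup step: pre-compute the terms of each document once.
    return [(doc, terms(doc)) for doc in documents]


def retrieve(index, query, k=2):
    # Retrieval step: rank documents by shared terms with the query.
    # (Stand-in for OpenSearch's relevance ranking.)
    query_terms = terms(query)
    scored = [(len(query_terms & doc_terms), doc) for doc, doc_terms in index]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_search_body(query, k=2):
    # Request body for a basic OpenSearch full-text `match` query.
    # The "text" field name is hypothetical.
    return {"size": k, "query": {"match": {"text": query}}}


def build_prompt(query, passages):
    # Augmentation step: combine retrieved passages with the user query.
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Invented knowledge base, for illustration only.
docs = [
    "The Aurora backpack weighs 900 grams and holds 25 litres.",
    "Returns are accepted within 30 days of delivery.",
    "The Aurora backpack is made from recycled nylon.",
]
index = build_index(docs)
query = "How much does the Aurora backpack weigh?"
passages = retrieve(index, query)
prompt = build_prompt(query, passages)
print(prompt)
```

The resulting prompt carries the two backpack facts and leaves out the unrelated returns policy, so the model has the specific figure in front of it instead of having to recall or invent one. In a real deployment, the body from `build_search_body` would be sent to an OpenSearch index in place of the `retrieve` call.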

Example: Hypotenuse AI’s HypoChat

HypoChat is an AI chatbot on Hypotenuse AI, where you can enter a query and receive a response.
It employs RAG to pull information from external sources to generate more accurate, specific, and contextual replies, as illustrated in this example.
The same question was asked twice, once with RAG and once without.

Response without using RAG

Without RAG, the user gets a more generic, one-dimensional response that is probably not very helpful. There is also no substantiation and no examples to back up what is said, making the response hard to trust.

Response using RAG

With RAG, HypoChat could pull information from the web, producing a more specific and detailed response with exact figures. It also backs every data point with its live source so the user can easily verify and reference it.

How this helps ecommerce brands and marketers

Personalized for Better Engagement

In today's crowded online space, generic, one-size-fits-all content won't cut it. Ecommerce brands and marketers have to produce content that is relevant and targeted towards their audiences.
By tapping into company knowledge and external data sources on Hypotenuse AI, brands can now speed up their content workflows with AI and still write in a way that empathizes with their audiences’ needs and pain points, is factually accurate, and substantiates points that are made.
This helps brands resonate with readers and improves trust with more accurate and useful content, which in turn improves engagement.

SEO-optimized for More Traffic

As Google cracks down on poor-quality content, it’s even more crucial to create factually accurate content that is relevant and useful to readers.
That means linking your content to updated sources, providing strong examples, and writing content that matches the intent. On Hypotenuse AI, you can produce articles that understand search intent and are already embedded with reputable sources. This helps to boost the relevancy and credibility of your articles, which gives them the best chance at ranking high on search engines.


Retrieval Augmented Generation with AWS OpenSearch is a great way to enhance the power of LLMs by addressing their limitations around insufficient data and outdated information. This helps them produce responses that are more factual and better fit the context of a query, thereby supporting businesses and users in a more concrete manner.

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.