On the Performance of Large Language Models, Hallucinations, and Mitigation Techniques - PART 2
Not all hallucinations in LLMs are the same or have the same cause, so it is necessary to understand how this phenomenon works in order to apply the proper mitigation.
Published Aug 6, 2024
The concept of hallucination traces its roots to the field of psychology, where it is defined as "the perception of an entity or event that is absent in reality" (Macpherson, 2013). Within the realm of NLP, hallucination typically refers to "a phenomenon in which the generated content appears nonsensical or unfaithful" (Maynez et al., 2020). This concept is similar to the phenomenon of hallucination observed in human psychology. Generally, hallucinations in natural language generation tasks can be categorized into two primary types: intrinsic hallucination and extrinsic hallucination. Specifically, intrinsic hallucinations pertain to outputs of LLMs that conflict with the source content. Conversely, extrinsic hallucinations refer to LLM generations that cannot be verified from the source content.
However, the documentation evaluated for this literature review introduces a more granular taxonomy. This refined taxonomy seeks to encapsulate the distinct intricacies associated with LLM hallucinations. Table 1 is taken from the original paper, and it presents intuitive examples of the definition of LLM hallucination, accompanied by corresponding explanations. The details of the proposed categories are based on the survey by Huang et al. (2023) and are elaborated below.
Existing LLMs can occasionally produce outputs that are either inconsistent with real-world facts or potentially misleading, presenting challenges to the trustworthiness of applications using artificial intelligence. In this context, these factual errors are categorized as factuality hallucinations. Based on whether the generated factual content can be verified against a reliable source, they can be further divided into two primary types:
Factual Inconsistency: the LLM’s output contains facts that can be grounded in real-world data but that contradict it. This type of hallucination occurs most frequently and arises from diverse sources. As shown in Table 1, when asked about "the first person to land on the Moon", the model wrongly generated "Yuri Gagarin".
Factual Fabrication: the LLM’s output contains facts that are unverifiable against established real-world knowledge. As shown in Table 1, while "the origins of unicorns" traditionally lack empirical grounding, the model fabricated a historical origin for unicorns.
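To make this distinction concrete, the sketch below shows how a generated claim could be labelled as consistent, factually inconsistent, or unverifiable against a small reference source. It is not from the survey; the reference dictionary, function name, and example claims are hypothetical placeholders.

```python
# Minimal sketch: classifying a generated claim as consistent, inconsistent,
# or unverifiable against a toy reference source. The reference dictionary
# and example claims are illustrative placeholders.

REFERENCE_FACTS = {
    "first person to land on the moon": "Neil Armstrong",
    "inventor of the telephone": "Alexander Graham Bell",
}

def classify_claim(question: str, generated_answer: str) -> str:
    """Return a rough factuality label for a generated answer."""
    reference = REFERENCE_FACTS.get(question.lower())
    if reference is None:
        # Nothing to check against: a candidate factual fabrication.
        return "unverifiable (possible factual fabrication)"
    if generated_answer.strip().lower() == reference.lower():
        return "consistent with the reference source"
    # Verifiable, but contradicts the source: factual inconsistency.
    return "factual inconsistency"

print(classify_claim("first person to land on the Moon", "Yuri Gagarin"))
# -> factual inconsistency
print(classify_claim("the origins of unicorns", "They first appeared in Atlantis"))
# -> unverifiable (possible factual fabrication)
```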
LLMs are trained to align with user instructions, so ensuring consistency with user-provided instructions and contextual information becomes very important. Furthermore, an LLM’s faithfulness also reflects the logical consistency of its generated content. From this perspective, there are three subtypes of faithfulness hallucinations:
Instruction inconsistency refers to LLM outputs that deviate from a user’s prompt. While some deviations might be harmless, these inconsistencies show unintentional misalignment with user instructions. As described in the example in Table 1, the user’s intention is translation, but the LLM erroneously deviated from the user’s instruction and performed a question-answering task.
Context inconsistency points to instances where the LLM’s output is unfaithful to the user’s provided contextual information. For example, as shown in Table 1, the user mentioned the Nile’s source being in the Great Lakes region of central Africa, yet the LLM’s response contradicted the context.
Logical inconsistency: the LLM’s outputs exhibit internal logical contradictions in reasoning tasks. For example, as shown in Table 1, while the reasoning step of dividing both sides of the equation by 2 is correct, the final answer of x=4 is not. A minimal programmatic check of this kind of inconsistency is sketched below.
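The following sketch recomputes the solution of a linear equation and compares it with a model’s stated final answer. The equation, coefficients, and the "model answer" are hypothetical, since Table 1 of the survey is not reproduced here.

```python
# Minimal sketch: recompute the solution of a linear equation and compare it
# with the model's stated final answer. The equation and "model answer" below
# are hypothetical.

from fractions import Fraction

def solve_linear(a: int, b: int, c: int) -> Fraction:
    """Solve a*x + b = c exactly (integer coefficients, a != 0)."""
    return Fraction(c - b, a)

def final_answer_is_consistent(a: int, b: int, c: int, model_answer: int) -> bool:
    """Check whether the model's final answer actually satisfies a*x + b = c."""
    return a * model_answer + b == c

# Hypothetical trace: the model correctly divides both sides by 2 along the way,
# yet states x = 4 as its final answer.
a, b, c = 2, 3, 9          # 2x + 3 = 9  =>  x = 3
model_answer = 4

print("recomputed solution:", solve_linear(a, b, c))                                   # 3
print("final answer consistent?", final_answer_is_consistent(a, b, c, model_answer))   # False
```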
The root causes of hallucinations in LLMs are primarily divided into two key aspects: Data and Training.
The foundation of Large Language Models (LLMs) lies in pre-training data, which provides them with general capabilities and factual knowledge. Data-related hallucinations primarily manifest in two dimensions: potential risks arising from flawed data sources and the suboptimal utilization of the factual knowledge embedded in the data.
Scaling the pre-training data significantly enhances the capabilities of LLMs. However, challenges emerge in ensuring consistent data quality, introducing the risk of misinformation and biases. Additionally, the lack of domain-specific knowledge and current facts in the data may result in knowledge boundaries, limiting LLMs in specific scenarios. These limitations can be categorized into misinformation and biases, duplication bias, and knowledge boundary limitations. Refer to Table 2, extracted from the original paper, for examples of data-induced hallucination.
Misinformation and Biases. In response to the growing demand for large-scale corpora, heuristic data collection methods are employed to efficiently gather large volumes of data. While these methods yield comprehensive datasets, they also run the risk of introducing inadvertent errors, heightening the potential for imitative falsehoods. Additionally, biases may unintentionally infiltrate the learning process of Large Language Models; these include duplication bias and various social biases, both of which can contribute to hallucinations.
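As an illustration of what "heuristic" means here, the snippet below sketches the kind of rule-based quality filter commonly applied when assembling web-scale corpora. The thresholds and rules are assumptions for illustration; real pipelines are far more elaborate, and such heuristics can still let misinformation and biased text slip through.

```python
# Minimal sketch of a rule-based quality filter for web-scraped documents.
# Thresholds and rules are illustrative assumptions, not a real pipeline.

def keep_document(text: str) -> bool:
    words = text.split()
    if len(words) < 50:                                  # drop very short pages
        return False
    alpha_ratio = sum(ch.isalpha() for ch in text) / max(len(text), 1)
    if alpha_ratio < 0.6:                                # drop mostly symbols/markup
        return False
    if "lorem ipsum" in text.lower():                    # drop obvious boilerplate
        return False
    return True

corpus = [
    "lorem ipsum dolor sit amet",                        # boilerplate, too short
    "a long, well-formed article about renewable energy " * 20,
]
kept = [doc for doc in corpus if keep_document(doc)]
print(f"kept {len(kept)} of {len(corpus)} documents")
```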
When LLMs undergo training on factually incorrect data, there is a risk of unintentionally amplifying these inaccuracies, potentially resulting in hallucinations labeled as "imitative falsehoods". For instance, as illustrated in Table 2, the assertion "Thomas Edison invented the light bulb" is a widely debunked misconception. LLMs trained on such factually incorrect data may generate misleading outputs.
Duplication Bias. The memorization capability of large language models poses challenges when confronted with duplicated information in the pre-training data. This duplication can cause a shift in LLMs from generalization to memorization, leading to a duplication bias. Under this bias, LLMs tend to excessively prioritize the recall of duplicated data, resulting in hallucinations that deviate from the intended content.
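A minimal sketch of how exact duplicates can be detected in a pre-training corpus is shown below. It is illustrative only; real pipelines typically rely on near-duplicate detection (e.g. MinHash) rather than exact hashing.

```python
# Minimal sketch: exact-match deduplication of pre-training documents via hashing.

import hashlib

def dedupe(documents: list[str]) -> list[str]:
    seen = set()
    unique_docs = []
    for doc in documents:
        digest = hashlib.sha256(doc.strip().lower().encode("utf-8")).hexdigest()
        if digest not in seen:          # keep only the first occurrence
            seen.add(digest)
            unique_docs.append(doc)
    return unique_docs

docs = [
    "The Eiffel Tower is in Paris.",
    "The Eiffel Tower is in Paris.",    # exact duplicate the model would over-memorize
    "Mount Everest is the highest mountain on Earth.",
]
print(len(dedupe(docs)), "unique documents")   # 2 unique documents
```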
Social Biases. Hallucinations, particularly those associated with gender and nationality, are closely linked to certain social biases. For instance, LLMs may erroneously associate the nursing profession with females, even in contexts where gender is not explicitly mentioned. Such biases can inadvertently originate from internet-based texts, which often harbor diverse and biased perspectives, and subsequently manifest in the generated content.
Knowledge Boundary. While the vast pre-training corpora empower LLMs with extensive factual knowledge, they inherently possess boundaries. This limitation primarily surfaces in two dimensions: the absence of up-to-date factual knowledge and of specialized domain knowledge. An example is presented in Table 3, which is taken from the original paper.
Domain Knowledge Deficiency. While LLMs exhibit impressive performance in various generic domain tasks, their proficiency is inherently limited in specialized domains. Since these general-purpose LLMs primarily rely on extensive publicly available datasets, their expertise falls short in domains with proprietary training data. Consequently, when faced with issues requiring domain-specific knowledge, such as medical or legal inquiries, these models may prominently display hallucinations, often characterized by factual fabrications.
Outdated Factual Knowledge. Another inherent limitation in LLMs pertains to their restricted capacity for up-to-date information. The factual knowledge ingrained in LLMs possesses evident temporal boundaries and can become obsolete as time progresses. Once these models undergo training, their internal knowledge remains static and is not subject to updates. This presents a challenge, especially in the context of our dynamically changing world. Faced with inquiries that extend beyond their temporal scope, LLMs frequently resort to fabricating or guessing facts, offering responses that might have been accurate in the past but are now outdated.
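The snippet below is a minimal illustration of this temporal boundary: a hypothetical training cutoff date is compared against the date of the event a question refers to, and anything beyond the cutoff is exactly where outdated-knowledge hallucinations tend to appear. The dates and questions are placeholders.

```python
# Minimal sketch of the "knowledge cutoff" problem: facts stored at training
# time are frozen, so events after the cutoff cannot be answered reliably.
# Dates and questions are hypothetical placeholders.

from datetime import date

TRAINING_CUTOFF = date(2023, 1, 1)   # hypothetical snapshot date of the corpus

def answer_scope(question: str, event_date: date) -> str:
    if event_date > TRAINING_CUTOFF:
        # Without access to fresher sources, the model can only guess here,
        # which is where outdated-knowledge hallucinations appear.
        return f"'{question}': beyond training cutoff, any answer would be a guess"
    return f"'{question}': within training cutoff, the fact was at least learnable"

print(answer_scope("Who won the most recent World Cup?", date(2026, 7, 19)))
print(answer_scope("Who won the 2018 World Cup?", date(2018, 7, 15)))
```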
The training process of LLMs mainly encompasses two primary stages: 1) the pre-training stage, where LLMs learn general-purpose representations and capture world knowledge, and 2) the alignment stage, where LLMs are adapted to better align with user instructions and style preferences. While this process provides LLMs with remarkable performance, any shortfalls in these stages can inadvertently lead to hallucinations.
Hallucination from Pre-training. Pre-training acts as the foundational phase for Large Language Models (LLMs), often utilizing a transformer-based architecture for conducting causal language modeling on extensive corpora. Nevertheless, challenges associated with hallucination may emerge due to the inherent architectural design and the specific training strategies applied.
Transformer-based Architecture Flaws (Inadequate Unidirectional Representation). LLMs predict the subsequent token based solely on preceding tokens in a left-to-right approach. This unidirectional methodology, while facilitating efficient training, also has its limitations. It relies solely on context from a single direction, limiting its capacity to grasp complex contextual dependencies and potentially heightening the risk of hallucination.
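The sketch below shows, using NumPy, how this left-to-right constraint is typically enforced with a causal (lower-triangular) attention mask. The function names are illustrative and not taken from any particular framework.

```python
# Minimal sketch of the unidirectional (causal) masking used in decoder-only
# Transformers: token i may only attend to tokens 0..i.

import numpy as np

def causal_mask(seq_len: int) -> np.ndarray:
    """Lower-triangular mask: 1 where attention is allowed, 0 where it is blocked."""
    return np.tril(np.ones((seq_len, seq_len), dtype=np.int8))

def masked_attention_weights(scores: np.ndarray) -> np.ndarray:
    """Apply the causal mask to raw attention scores, then softmax row-wise."""
    seq_len = scores.shape[-1]
    mask = causal_mask(seq_len).astype(bool)
    masked = np.where(mask, scores, -np.inf)            # block future positions
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return weights / weights.sum(axis=-1, keepdims=True)

scores = np.random.randn(4, 4)                          # toy scores for 4 tokens
print(causal_mask(4))
print(masked_attention_weights(scores).round(2))        # upper triangle is all zeros
```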
Attention Glitches. The Transformer-based architecture, featuring the self-attention module, may at times display unpredictable reasoning errors in the context of algorithmic reasoning. These errors can manifest across both long-range and short-range dependencies, irrespective of the model’s scale.
Next, in Part 3, I will explore two main techniques that can be used to mitigate hallucinations in Large Language Models.