
Sequence[string] #1844

Open · amin-kh96 opened this issue Jan 13, 2025 · 7 comments
Labels: question (Further information is requested)

Comments

@amin-kh96

Does RAGAS specifically require a flattened format for the context, or is the Sequence[string] format acceptable? I am working with version 0.1.7.

amin-kh96 added the question label on Jan 13, 2025

@amin-kh96 (Author)

@dosu
Why don't you give me the answer?

dosubot bot commented Jan 13, 2025

RAGAS version 0.1.7 expects the context input to be in the List[str] format (a list of strings per sample) [1]. The Sequence[string] format is not explicitly documented as supported, so it is recommended to use List[str] to ensure compatibility.

This is a bit tricky, @jjmachan, could you lend your expertise on whether RAGAS specifically requires a flattened format for the context?
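
For illustration, a minimal sketch of flattening one sample's nested contexts into the flat List[str] shape (this flatten_contexts helper is illustrative, not part of the RAGAS API):

from typing import List, Sequence

def flatten_contexts(nested: Sequence[Sequence[str]]) -> List[str]:
    # Flatten one sample's nested context chunks into a flat list of strings
    return [chunk for group in nested for chunk in group]

nested = [["chunk_a", "chunk_b"], ["chunk_c"]]
print(flatten_contexts(nested))  # ['chunk_a', 'chunk_b', 'chunk_c']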


@hundredeuk2 (Contributor)

This issue seems related to #1807. To assist you more effectively, I replicated your described data format in the same environment (RAGAS version 0.1.7), and everything works fine on my end. To expedite troubleshooting and ensure we can address your specific issue accurately, it would be extremely helpful if you could share a portion of your dataset or the full code you used.

from ragas import evaluate
import datasets
from ragas.metrics import (
    answer_relevancy,
    faithfulness,
    context_recall,
    context_precision,
)


sample = {
    "contexts": [["context1_chunk1", "context1_chunk2"], ["context2_chunk1"]],
    "question": ["question1", "question2"],
    "answer": ["answer1", "answer2"],
}
dt = datasets.Dataset.from_dict(sample)

result = evaluate(
    dt,
    metrics=[
        faithfulness,
        answer_relevancy,
    ],
)
Evaluating:   0%|          | 0/4 [00:00<?, ?it/s]No statements were generated from the answer.
No statements were generated from the answer.
Evaluating: 100%|██████████| 4/4 [00:03<00:00,  1.05it/s]
|   | contexts                           | question   | answer   | faithfulness | answer_relevancy |
|---|------------------------------------|------------|----------|--------------|------------------|
| 1 | [context1_chunk1, context1_chunk2] | question1  | answer1  | NaN          | 0.859496         |
| 2 | [context2_chunk1]                  | question2  | answer2  | NaN          | 0.000000         |

@amin-kh96 (Author) commented Jan 14, 2025

@hundredeuk2
Here is the full code. I extracted my embeddings and the textual data related to them; the output is in the attached file user_llm_interaction_embeddings_c1521dd5_b819_4241_b3a4_3e5c1388037c.json.

The full code:

import json
import os
import typing as t
from pathlib import Path

import numpy as np
import torch
from datasets import Dataset
from langchain_core.outputs import Generation, LLMResult
from transformers import AutoModel, AutoTokenizer

from ragas import evaluate
from ragas.embeddings import BaseRagasEmbeddings
from ragas.llms import BaseRagasLLM
from ragas.llms.prompt import PromptValue
from ragas.metrics import context_utilization

current_script_path = Path(__file__).resolve()
config_base_path = current_script_path.parents[1]

# Load the ground truth data
file_path = os.path.join('src', 'assets', 'GT.json')
with open(config_base_path / file_path) as f:
    ground_truth_data = json.load(f)

# Load the questions, the answers, and the chunks
file_path = os.path.join('src', 'assets', 'user_llm_interaction_embeddings_c1521dd5_b819_4241_b3a4_3e5c1388037c.json')
with open(config_base_path / file_path) as f:
    llm = json.load(f)

# Creating a dataset of str type
# new_data_set = []
question = []
context = []
answer = []

# Extracting the str data of 'question' and 'answer'
for item in llm:
    if item['role'] == 'user':
        for c in item['content']:
            question.append(c['text'])
    else:
        for c in item['content']:
            answer.append(c['text'])

# Iterate through each dictionary in the ground truth data
# for item in ground_truth_data:
#     # Check if the 'content' key exists in the dictionary
#     if 'content' in item:
#         # Access the value of the 'content' key and append it to the context list
#         context.append(item['content'])
#     else:
#         print(f"'content' key not found in item with id: {item.get('id')}")

# Check the length of context to see if anything was appended
# print(f"Number of context entries extracted: {len(context)}")

# Iterate through each dictionary in the interaction data
for item in llm:
    # Collect the chunk IDs from assistant turns
    if item['role'] == 'assistant':
        context.append(item['chunks'])
    else:
        pass  # print(f"'content' key not found in item with id: {item.get('id')}")

# Check the length of context to see if anything was appended
# print(f"Number of context entries extracted: {len(context)}")

# Replace the chunk IDs with the corresponding content and embeddings
chunk_embeddings = []
chunk_string = []

for sublist in context:
    strings = []     # Initialize for each sublist
    embeddings = []  # Initialize for each sublist
    for idx, i in enumerate(sublist):
        for item in ground_truth_data:
            if item['id'] == i:
                # Append matching content and embeddings
                strings.append(item['content'])
                embeddings.append(item['text_vector'])
    # Append the results for the current sublist
    chunk_embeddings.append(embeddings)
    chunk_string.append(strings)

# Initialize empty lists for the dataset
new_ragas_dataset = {
    "question": [],
    "contexts": [],
    "answer": []
}

# Assuming the question, context, and answer lists are already available
for i in range(len(question)):
    new_ragas_dataset['question'].append(question[i])

    # For now, assign all the chunks (contexts) of this turn to the question
    new_ragas_dataset['contexts'].append(chunk_string[i])  # a list of chunk strings

    # Assign the corresponding answer
    new_ragas_dataset['answer'].append(answer[i])

# Print to verify the format
# print(f"Dataset length: {len(new_ragas_dataset['question'])}")
# print(f"Sample entry:\n{new_ragas_dataset['question'][0]}")          # Question sample
# print(f"Related contexts: {len(new_ragas_dataset['contexts'][0])}")  # Contexts for the first question
# print(f"Answer sample: {new_ragas_dataset['answer'][0]}")            # Answer sample

# Initialize an empty list to hold the combined dataset
data_set = []

# Iterate through the list and combine every two dictionaries
for i in range(0, len(llm), 2):
    # Map the llm index to the chunk_embeddings index
    chunk_index = i // 2
    combined_dict = {
        "text_vector_1": llm[i].get("text_vector", []),
        "text_vector_2": llm[i + 1].get("text_vector", []),
        "chunks_embd": chunk_embeddings[chunk_index]
    }
    data_set.append(combined_dict)

# for j in llm:
#     # Check if the 'chunks' key exists in the dictionary
#     if j['role'] == 'assistant':
#         # Append the value of the 'chunks' key to the dataset
#         data_set.append(j['chunks'])
#     else:
#         pass

# def map_chunks(data_set, ground_truth_data):
#     for item in data_set:                      # Iterate over each dictionary in data_set
#         c = []                                 # Reset c for each item
#         for chunk_id in item['chunks']:        # Loop through 'chunks' in the current dictionary
#             for element in ground_truth_data:  # Loop through ground_truth_data
#                 if element['id'] == chunk_id:  # Match chunk_id with the element's id
#                     c.append(element['text_vector'])  # Append the matching text_vector
#         item['chunks'] = c  # Replace the chunk IDs with the mapped text_vector values
#     return data_set  # Return the updated data_set
#
# data_set = map_chunks(data_set, ground_truth_data)
# data_set.append(chunk_embeddings)

# Assuming data_set is a list of dictionaries
ragas_data = [
    {
        "question": entry["text_vector_1"],  # Assuming this is a list of strings
        "answer": entry["text_vector_2"],    # Assuming this is a list of strings
        "contexts": entry["chunks_embd"]     # Assuming this is a list of lists of strings
    }
    for entry in data_set
]

# Structure the data for the Hugging Face Dataset creation
formatted_data = {
    "question": [entry["question"] for entry in ragas_data],
    "contexts": [entry["contexts"] for entry in ragas_data],
    "answer": [entry["answer"] for entry in ragas_data]
}

# Define the column_map to match custom columns to the expected ones
column_map = {
    "question": "question",
    "answer": "answer",
    "contexts": "contexts"
}

# Create a Dataset using the Hugging Face datasets library
new_ragas_dataset = Dataset.from_dict(new_ragas_dataset)

# Assuming new_ragas_dataset is your dataset
# def flatten_contexts(example):
#     example['contexts'] = [item for sublist in example['contexts'] for item in sublist]
#     return example
#
# new_ragas_dataset = new_ragas_dataset.map(flatten_contexts)

model_name = 'distilbert-base-uncased'

class CustomHuggingFaceRagasEmbeddings(BaseRagasEmbeddings):

    def __init__(self, model_name: str, custom_embeddings: dict = None):
        """
        Initialize the custom Hugging Face Ragas embeddings with the specified
        model and optional pre-computed custom embeddings.

        Parameters:
            model_name (str): The name of the Hugging Face model to use
                (e.g., 'distilbert-base-uncased').
            custom_embeddings (dict): An optional dictionary of pre-computed
                embeddings, where keys are text strings and values are the
                corresponding embeddings.
        """
        self.model_name = model_name
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModel.from_pretrained(model_name)
        # Store custom embeddings as a dict
        self.custom_embeddings = custom_embeddings if custom_embeddings else {}

    def embed_text(self, text: str) -> np.ndarray:
        """
        Return the existing custom embedding for the text if available;
        otherwise, compute one with the model.

        Parameters:
            text (str): The text to embed.

        Returns:
            np.ndarray: The embedding for the text.
        """
        if text in self.custom_embeddings:
            # Return the custom embedding if it exists
            return np.array(self.custom_embeddings[text])

        # Generate a new embedding using the Hugging Face model
        inputs = self.tokenizer(text, return_tensors='pt', padding=True, truncation=True)
        with torch.no_grad():
            outputs = self.model(**inputs)

        # Use the CLS token embedding
        embedding = outputs.last_hidden_state[:, 0, :].numpy()

        # Save this new embedding to custom_embeddings for future use
        self.custom_embeddings[text] = embedding
        return embedding

    def embed_documents(self, texts: list) -> np.ndarray:
        """
        Generate embeddings for a list of documents, checking for custom
        embeddings first.
        """
        return np.array([self.embed_text(text) for text in texts])

    def embed_query(self, query: str) -> np.ndarray:
        """Generate an embedding for a single query."""
        return self.embed_text(query)
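
# Hypothetical quick check (not in the original script): embed one string to
# confirm the wrapper returns a vector before running the full evaluation.
# sanity_embedder = CustomHuggingFaceRagasEmbeddings(model_name='distilbert-base-uncased')
# print(sanity_embedder.embed_query("sanity check").shape)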

# Initialize the custom embeddings class with your pre-existing embeddings (if any)

# Define the custom LLM class

class CustomRagasLLM(BaseRagasLLM):

    def __init__(self, api_key: str = None):
        """
        Initialize the custom LLM, optionally using an API key if necessary.
        """
        self.api_key = api_key

    async def _call(self, prompt: str) -> str:
        """
        Process the prompt and return a result. This can be customized to
        use a local model or perform any required logic.
        """
        if not self.api_key:
            return f"Processed: {prompt} (without API key)"
        else:
            # Handle the LLM response if using an API
            return f"Processed: {prompt} (with API key: {self.api_key})"

    async def generate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.List = []
    ) -> LLMResult:
        text = await self._call(prompt)
        return LLMResult(generations=[[Generation(text=text)]])

    async def agenerate_text(
        self,
        prompt: PromptValue,
        n: int = 1,
        temperature: float = 1e-8,
        stop: t.Optional[t.List[str]] = None,
        callbacks: t.List = []
    ) -> LLMResult:
        """
        Asynchronous method to generate text.
        """
        # For now, this directly awaits the same _call used by generate_text
        text = await self._call(prompt)
        return LLMResult(generations=[[Generation(text=text)]])

if __name__ == "__main__":

    # custom_embeddings_dict = {
    #     # Example: "question text": [embedding values], "context text": [embedding values], etc.
    # }
    custom_embeddings_dict = formatted_data

    ragas_embeddings = CustomHuggingFaceRagasEmbeddings(model_name='distilbert-base-uncased', custom_embeddings=custom_embeddings_dict)
    custom_llm = CustomRagasLLM(api_key=None)

    # Define the evaluation metrics
    metrics = [context_utilization]

    # ragas_dataset = Dataset.from_dict(formatted_data)

    # Run the evaluation, passing the custom embeddings and LLM
    evaluation_report = evaluate(new_ragas_dataset, metrics=metrics, embeddings=ragas_embeddings, llm=custom_llm, column_map=column_map)

    print("RAGAS Evaluation Report:")
    print(evaluation_report)

@amin-kh96 (Author)

@hundredeuk2
The file I attached contains the questions and the answers; the chunks were extracted from another file, which was a bit too heavy to attach. This should give you some insight.

@hundredeuk2 (Contributor)

While reviewing the evaluation process, it appears that the variable new_ragas_dataset is central to performing the assessment. I replicated the steps on my end, and the same error was reproduced. Upon investigation, it seems the root cause lies in the contexts column being returned as an empty list.

This suggests there might be an issue with retrieving or processing the documents during the refinement stage. It would be helpful to double-check if something is being missed or mishandled in that step.

Additionally, since the ground_truth_data from your source code wasn’t included, I couldn’t replicate your setup with complete accuracy. It’s possible that this could also be influencing the observed behavior.

print(new_ragas_dataset)
{'question': ['Quali sono gli errori del macchinario futura, riguardanti la tensione a 3V?',
  "La macchina futura prevede qualche specifica per la conduttività dell'acqua?",
  'Quali sono gli errori che mi segnalano questi problemi?'],
 'contexts': [[], [], []],
 'answer': ['Gli errori riguardanti la tensione a 3V per il macchinario Futura sono i seguenti:\n\n1. **E306**: Tensione 3.3V sotto il limite\n   - Verificare le connessioni in uscita\n   - Sostituire la scheda idrico\n\n2. **E307**: Tensione 3.3V interno sopra il limite\n   - Verificare le connessioni in uscita\n   - Sostituire la scheda idrico\n\n3. **E716**: Tensione 3.3V sotto il limite\n   - Verificare connessioni in uscita\n   - Sostituire scheda base tavoletta\n\n4. **E717**: Tensione 3.3V interno sopra il limite\n   - Verificare connessioni in uscita\n   - Sostituire scheda base tavoletta\n\n5. **E449**: Tensione 3.3V sotto il limite\n   - Verificare i collegamenti in uscita\n   - Sostituire la scheda poltrona\n\n6. **E450**: Tensione 3.3V interno sopra il limite\n   - Verificare i collegamenti in uscita\n   - Sostituire la scheda poltrona\n\n7. **E116**: Tensione 3.3V sotto il limite\n   - Verificare le connessioni in uscita\n   - Sostituire la scheda base tavoletta medico\n\n8. **E117**: Tensione 3.3V interno sopra il limite\n   - Verificare le connessioni in uscita\n   - Sostituire la scheda base tavoletta medico\n\n9. **E224**: Tensione 3.3V sotto il limite\n   - Verificare le connessioni in uscita\n   - Sostituire la scheda idrico\n\n10. **E225**: Tensione 3.3V interno sopra il limite\n    - Verificare le connessioni in uscita\n    - Sostituire la scheda idrico',
  "Sì, la macchina Futura prevede specifiche precise per la conduttività dell'acqua utilizzata.\n\nEcco i punti salienti relativi alla conduttività dell'acqua:\n\n1. **Conduttività Massima:** L'acqua demineralizzata distillata utilizzata per l'alimentazione della sterilizzatrice dovrebbe avere una **conduttività inferiore a 15 μS/cm**.\n2. **Sensore di Conduttività:** La sterilizzatrice è dotata di un sensore di conduttività che verifica automaticamente la qualità dell'acqua trattata dal filtro demineralizzatore integrato.\n3. **Allarme Conduttività:** Se i valori di conduttività superano i 20 μS/cm, un allarme segnalerà la necessità di procedere alla sostituzione della cartuccia di resina a scambio ionico.\n4. **Rapporto Ciclo:** La misura di conducibilità dell’acqua viene effettuata automaticamente ad ogni avvio del ciclo di sterilizzazione o di test e il valore relativo è riportato sul rapporto ciclo.\n\nL'utilizzo di acqua con conduttività superiore ai limiti indicati può causare danni al macchinario e compromettere l'efficacia della sterilizzazione, incrementando anche il rischio di ossidazione e la formazione di residui calcarei.",
  "Gli errori che possono segnalare problemi relativi alla conduttività dell'acqua sono generalmente associati a malfunzionamenti del sistema idrico o delle sonde di livello. Tuttavia, in base ai documenti forniti, non sembra esserci un errore specifico che segnali direttamente un problema di conduttività dell'acqua. Gli errori che potrebbero indirettamente indicare problemi legati alla qualità dell'acqua o alla conduttività sono:\n\n1. **W64**: **Manca H₂O, riempire serbatoio 2**\n   - Azioni: Riempire il serbatoio H₂O₂, verificare la scheda igiene, verificare la sonda di livello.\n\n2. **W212**: **Serbatoio disinfettante pieno**\n   - Possibile implicazione di conduttività alta se il serbatoio pieno impedisce il normale funzionamento.\n\n3. **E55**: **S1 livello sonde incongruente Max = on Min = off**\n   - Incongruenza sonde serbatoio 1 W.H.E.: Verificare lo stato della scheda igiene, verificare le sonde, verificare lo stato dei led della scheda igiene.\n\n4. **E57**: **S2 Livello sonde incongruente Max = on Min = off**\n   - Incongruenza sonde serbatoio 2 del sistema W.H.E.: Verificare lo stato della scheda igiene, verificare le sonde, verificare lo stato dei led della scheda igiene.\n\n5. **E60**: **Durante il funzionamento normale del riunito, lettura di sonda di FULL coperta**\n   - Tentare la procedura di svuotamento del sistema W.H.E.. Verificare la presenza di trafilamenti acqua nel sistema W.H.E..\n\n6. **E67**: **Anomalia della sonda di massimo serbatoio 1**\n   - Serbatoio 1 eliminato per sonda STOP attiva: Verificare l'ingresso acqua al sistema W.H.E., verificare le sonde del serbatoio W.H.E.\n\n7. **E68**: **Anomalia della sonda di massimo serbatoio 2**\n   - Serbatoio 2 eliminato per sonda STOP attiva: Verificare l'ingresso acqua al sistema W.H.E., verificare le sonde del serbatoio W.H.E.\n\nIn caso emerga uno di questi errori, potrebbe essere utile controllare anche la qualità dell'acqua utilizzata per assicurarsi che rientri nei parametri specificati di conduttività per evitare ulteriori problemi di funzionamento del macchinario Futura."]}
