
[R-254] Issue in Evaluation using local LLM #955

Open · sheetalkamthe55 opened this issue May 15, 2024 · 2 comments
Labels: linear (Created by Linear-GitHub Sync), question (Further information is requested)
sheetalkamthe55 commented May 15, 2024

- [ ] I checked the documentation and related resources and couldn't find an answer to my question.

Your Question

I keep getting the following warning during evaluation:

```
WARNING:ragas.llms.output_parser:Failed to parse output. Returning None.
```

I added Langsmith tracing to inspect the requests and responses. For the given input prompt the model returns a blank response, so I believe this is a context-length issue. I tried different LLMs, but the error remains the same.

Code Examples

I hosted a Llama 2 model with llama-cpp-python's server. Below is the command I used:

```bash
python3 -m llama_cpp.server --model /tmp/llama_index/models/llama-13b.Q5_K_M.gguf --port 8009 --host 129.69.217.24 --chat_format llama-2
```
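For reference, the server's context window can be raised at startup; a minimal sketch, assuming llama-cpp-python's `--n_ctx` flag and an arbitrary 4096-token window:

```bash
# Same command with an explicitly larger context window (sketch; the
# 4096 value is an assumption, not from the original report).
python3 -m llama_cpp.server \
  --model /tmp/llama_index/models/llama-13b.Q5_K_M.gguf \
  --port 8009 --host 129.69.217.24 \
  --chat_format llama-2 \
  --n_ctx 4096
```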

Following is a sample testset I am using:
Ragas_dataset.csv

You can ignore the dataset part; I tried the same with

```python
fiqa_eval = load_dataset("explodinggradients/fiqa", "ragas_eval")
```

and the same issue persists.

Code:

```python
from datasets import load_dataset

# Load the CSV testset attached above.
dataset = load_dataset("csv", data_files="Ragas_dataset.csv")

from tqdm import tqdm
import pandas as pd
from datasets import Dataset

def create_ragas_dataset(eval_dataset):
    # Reshape each row into the column format ragas expects.
    rag_dataset = []
    for row in tqdm(eval_dataset):
        rag_dataset.append(
            {
                "question": row["question"],
                # "answer": result["answer"],  # would normally come from the RAG chain
                "answer": row["ground_truth"],
                "contexts": [row["contexts"]],
                "ground_truth": row["ground_truth"],
            }
        )
    rag_df = pd.DataFrame(rag_dataset)
    rag_eval_dataset = Dataset.from_pandas(rag_df)
    return rag_eval_dataset

basic_qa_ragas_dataset = create_ragas_dataset(dataset["train"].select(range(2)))

from langchain.chat_models import ChatOpenAI
from ragas.llms import LangchainLLMWrapper

# Point the OpenAI-compatible client at the local llama-cpp-python server.
inference_server_url = "http://localhost:8009/v1"

chat = ChatOpenAI(
    model="/tmp/llama_index/models/llama-13b.Q5_K_M.gguf",
    openai_api_key="no-key",
    openai_api_base=inference_server_url,
    max_tokens=5,  # caps every completion at 5 tokens
    temperature=0,
)

vllm = LangchainLLMWrapper(chat)

from ragas.metrics import (
    context_precision,
    faithfulness,
    context_recall,
)
from ragas.metrics.critique import harmfulness

# change the LLM
faithfulness.llm = vllm
context_precision.llm = vllm
context_recall.llm = vllm
harmfulness.llm = vllm

from ragas import evaluate

result = evaluate(
    basic_qa_ragas_dataset,
    metrics=[faithfulness],
)
result
```
(Screenshot of the evaluation output attached.)
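To narrow it down, one can call the wrapped model directly and inspect the raw completion that ragas has to parse; a minimal sketch (the prompt below is illustrative, not the actual ragas prompt):

```python
# Probe the raw completion (sketch): send one JSON-style instruction to the
# same ChatOpenAI client and print exactly what comes back.
probe = chat.invoke(
    'Answer in JSON with a single key "verdict" whose value is 1 or 0. '
    "Question: Is Berlin the capital of Germany? Answer: Yes."
)
print(repr(probe.content))  # an empty string would match the blank responses in the Langsmith trace
```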

Additional context

Please let me know if I should provide more information.

sheetalkamthe55 added the question (Further information is requested) label on May 15, 2024
jjmachan added the linear (Created by Linear-GitHub Sync) label on May 17, 2024
jjmachan changed the title from "Issue in Evaluation using local LLM" to "[R-254] Issue in Evaluation using local LLM" on May 17, 2024
jjmachan self-assigned this on May 17, 2024
pauljaspersahr commented May 29, 2024

Having the same issue with the LangChain Ollama LLM wrapper and llama3.
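For reference, a minimal sketch of that setup (assuming a local Ollama server with the stock llama3 tag pulled):

```python
# Sketch of the Ollama setup described above (the llama3 tag and local
# server are assumptions, not from the original report).
from langchain_community.chat_models import ChatOllama
from ragas.llms import LangchainLLMWrapper

ollama_chat = ChatOllama(model="llama3", temperature=0)
vllm = LangchainLLMWrapper(ollama_chat)  # same wrapper as in the original report
```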

visionKinger commented

Having the same issue with the LangChain Azure OpenAI wrapper.
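A minimal sketch of that variant (every Azure endpoint, deployment, and key value below is a placeholder):

```python
# Sketch of the Azure OpenAI variant mentioned above (all Azure values
# are placeholders, not from the original report).
from langchain_openai import AzureChatOpenAI
from ragas.llms import LangchainLLMWrapper

azure_chat = AzureChatOpenAI(
    azure_endpoint="https://<resource>.openai.azure.com/",
    azure_deployment="<deployment-name>",
    openai_api_version="2024-02-01",
    openai_api_key="<key>",
    temperature=0,
)
vllm = LangchainLLMWrapper(azure_chat)
```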
