[BUG]: Unable to Use Gemini/Ollama: OpenAI Hardcoded in ResumeFacade, Hardcoded OpenAI API Calls Prevent Using Other LLMs, LLM_MODEL_TYPE Configuration Not Respected: Always Calls OpenAI #1069
Comments
Also running into the same problem. I will try to check the code on my own, but I am not sure if I will find anything, since it is my first time looking at this code.
Results: it turns out the application never checks the config file that we set up. (Only tested with Ollama, and it still causes errors afterwards, but now it actually sends requests to either OpenAI or Ollama.)

Changes I made to `llm_generate_resume.py` (lines 5-10, 35-38):

```python
import os
import textwrap
from config import LLM_MODEL_TYPE, LLM_MODEL, LLM_API_URL
from src.libs.resume_and_cover_builder.utils import LoggerChatModel
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_ollama import OllamaLLM
from dotenv import load_dotenv
from concurrent.futures import ThreadPoolExecutor, as_completed
from loguru import logger
from pathlib import Path

# Load environment variables from .env file
load_dotenv()

# Configure log file
log_folder = 'log/resume/gpt_resume'
if not os.path.exists(log_folder):
    os.makedirs(log_folder)
log_path = Path(log_folder).resolve()
logger.add(log_path / "gpt_resume.log", rotation="1 day", compression="zip", retention="7 days", level="DEBUG")


class LLMResumer:
    def __init__(self, openai_api_key, strings):
        self.llm_cheap = LoggerChatModel(
            ChatOpenAI(
                model_name="gpt-4o-mini", openai_api_key=openai_api_key, temperature=0.4
            )
        )
        if LLM_MODEL_TYPE == "ollama":
            self.llm_cheap = LoggerChatModel(
                OllamaLLM(model=LLM_MODEL)
            )
        self.strings = strings
```
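A side note on the Ollama branch (my observation, not part of the change above): `OllamaLLM` is the plain-text completion wrapper and its `invoke()` returns a `str`, while `ChatOpenAI` returns `AIMessage` objects that the logging code later inspects. `langchain_ollama` also provides `ChatOllama`, which might be the closer drop-in here. A minimal sketch, assuming a local Ollama server and the `llama2` model from the config:

```python
# Minimal sketch (assumptions: langchain-ollama is installed, Ollama is running locally).
# ChatOllama returns AIMessage objects like ChatOpenAI does, whereas OllamaLLM
# returns plain strings; whether usage metadata is populated depends on the version.
from langchain_ollama import ChatOllama

llm_cheap = ChatOllama(model="llama2", base_url="http://127.0.0.1:11434", temperature=0.4)
reply = llm_cheap.invoke("Say hello in one word.")
print(type(reply).__name__)  # AIMessage
print(reply.content)
```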
Changes I made to `/src/libs/utility.py` (lines 5-10, 80):

```python
# app/libs/resume_and_cover_builder/utils.py
import json
import openai
import time
from datetime import datetime
from typing import Dict, List, Union
from langchain_core.messages.ai import AIMessage
from langchain_core.prompt_values import StringPromptValue
from langchain_openai import ChatOpenAI
from langchain_ollama import OllamaLLM
from .config import global_config
from loguru import logger
from requests.exceptions import HTTPError as HTTPStatusError


class LLMLogger:
    def __init__(self, llm: ChatOpenAI):
        self.llm = llm

    @staticmethod
    def log_request(prompts, parsed_reply: Dict[str, Dict]):
        calls_log = global_config.LOG_OUTPUT_FILE_PATH / "open_ai_calls.json"
        if isinstance(prompts, StringPromptValue):
            prompts = prompts.text
        elif isinstance(prompts, Dict):
            # Convert prompts to a dictionary if they are not in the expected format
            prompts = {
                f"prompt_{i+1}": prompt.content
                for i, prompt in enumerate(prompts.messages)
            }
        else:
            prompts = {
                f"prompt_{i+1}": prompt.content
                for i, prompt in enumerate(prompts.messages)
            }

        current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

        # Extract token usage details from the response
        token_usage = parsed_reply["usage_metadata"]
        output_tokens = token_usage["output_tokens"]
        input_tokens = token_usage["input_tokens"]
        total_tokens = token_usage["total_tokens"]

        # Extract model details from the response
        model_name = parsed_reply["response_metadata"]["model_name"]
        prompt_price_per_token = 0.00000015
        completion_price_per_token = 0.0000006

        # Calculate the total cost of the API call
        total_cost = (input_tokens * prompt_price_per_token) + (
            output_tokens * completion_price_per_token
        )

        # Create a log entry with all relevant information
        log_entry = {
            "model": model_name,
            "time": current_time,
            "prompts": prompts,
            "replies": parsed_reply["content"],  # Response content
            "total_tokens": total_tokens,
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "total_cost": total_cost,
        }

        # Write the log entry to the log file in JSON format
        with open(calls_log, "a", encoding="utf-8") as f:
            json_string = json.dumps(log_entry, ensure_ascii=False, indent=4)
            f.write(json_string + "\n")


class LoggerChatModel:
    def __init__(self, llm: Union[ChatOpenAI, OllamaLLM]):
        self.llm = llm
```

Sample of the new error that I am getting:
Fixed the previous issue; the application now runs with Ollama. The generated PDF text is good, but very small, which is probably caused by the default template used.

Changes I made to `/src/libs/utility.py` (lines 90-91):

```python
"""
This module contains utility functions for the Resume and Cover Letter Builder service.
"""
# app/libs/resume_and_cover_builder/utils.py
# ... (imports and the LLMLogger class are unchanged from the snippet above)

class LoggerChatModel:
    def __init__(self, llm: Union[ChatOpenAI, OllamaLLM]):
        self.llm = llm

    def __call__(self, messages: List[Dict[str, str]]) -> str:
        max_retries = 2
        retry_delay = 10
        for attempt in range(max_retries):
            try:
                reply = self.llm.invoke(messages)
                # parsed_reply = self.parse_llmresult(reply)
                # LLMLogger.log_request(prompts=messages, parsed_reply=parsed_reply)
```

Generated with sample data.
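Rather than dropping the logging entirely, one option would be to guard it so it only runs when the reply is an `AIMessage` carrying usage metadata (which `ChatOpenAI` returns, while `OllamaLLM` returns a plain string). A minimal sketch, assuming it lives in the same `utils.py` module where `LLMLogger` and the existing `parse_llmresult` helper are defined:

```python
# Sketch only: skip the OpenAI-specific token/cost logging for providers whose
# replies are plain strings (e.g. OllamaLLM), instead of commenting it out.
from langchain_core.messages.ai import AIMessage

def log_reply_if_possible(messages, reply, parse_llmresult):
    # parse_llmresult is the existing helper that builds the dict LLMLogger.log_request expects
    if isinstance(reply, AIMessage) and reply.usage_metadata is not None:
        parsed_reply = parse_llmresult(reply)
        LLMLogger.log_request(prompts=messages, parsed_reply=parsed_reply)
```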
I will try to commit the code once I fix it in all files.
Thanks, I will be waiting for the new committed code.
Describe the bug
The application is currently hardcoded to use the OpenAI API for LLM interactions, preventing the use of other LLM providers like Gemini and Ollama, despite the configuration options provided in config.py (LLM_MODEL_TYPE, LLM_MODEL, LLM_API_URL). The ResumeFacade and associated LLM classes within the src/libs/resume_and_cover_builder/llm directory directly call the OpenAI API without checking the configured LLM provider. This results in either the OpenAI API being called regardless of configuration, or errors if the OpenAI API key is invalid or rate limits are exceeded. The application should instead use the specified LLM_MODEL_TYPE to dynamically select the appropriate LLM provider for resume and cover letter generation.
Steps to reproduce
**Bug Report: LLM_MODEL_TYPE Configuration Not Respected (OpenAI Always Called)**

**Summary:**
When configuring the application to use either Gemini or Ollama by setting `LLM_MODEL_TYPE` in `config.py`, the application continues to call the OpenAI API instead of the configured LLM. This behavior indicates that the `LLM_MODEL_TYPE` setting is not being respected, and the OpenAI API is being used regardless of the configuration.

**Steps to Reproduce:**
1. Configure for Gemini/Ollama:
   - In the `config.py` file, set `LLM_MODEL_TYPE` to either `"gemini"` or `"ollama"`.
   - Ensure `LLM_MODEL` and `LLM_API_URL` are correctly configured (e.g., `LLM_MODEL = "llama2"` and `LLM_API_URL = "http://127.0.0.1:11434/"`).
2. Confirm API Keys: check `llm_api_key` in the `secrets.yaml` file.
3. Run the Application.
4. Select an Action.
5. Observe Logs and Output:
   - `401 Unauthorized` errors in the logs.

**Expected Behavior:**
When `LLM_MODEL_TYPE` is set to `"gemini"` or `"ollama"`, the application should not make any calls to the OpenAI API.

**Actual Behavior:**
The OpenAI API is called regardless of the `LLM_MODEL_TYPE` setting.

**Root Cause:**
The issue appears to be in the LLM selection logic. Despite setting `LLM_MODEL_TYPE` to `"gemini"` or `"ollama"`, the code is bypassing this configuration and defaulting to OpenAI. This suggests that the `LLM_MODEL_TYPE` setting is not being used to determine which LLM to use.
**Relevant Code Snippets:**
Here are the relevant parts of the code that need to be reviewed and fixed (a rough sketch of the configuration follows the list):
- `config.py`
- `secrets.yaml`
- LLM Selection Logic (Example)
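For orientation, the relevant `config.py` settings presumably look roughly like this (an assumption on my part; the values are taken from the reproduction steps, and `llm_api_key` lives in `secrets.yaml`):

```python
# config.py (assumed shape, not the project's actual file)
LLM_MODEL_TYPE = "ollama"                # or "gemini" / "openai"
LLM_MODEL = "llama2"                     # e.g. "gemini-1.5-pro" when using Gemini
LLM_API_URL = "http://127.0.0.1:11434/"  # local Ollama endpoint
```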
**Proposed Fix:**
1. Review the LLM Selection Logic: ensure the `LLM_MODEL_TYPE` setting is correctly used to determine which LLM client to instantiate (see the sketch after this list).
2. Remove Hardcoded OpenAI Calls.
3. Test with Gemini and Ollama: verify that the application respects the `LLM_MODEL_TYPE` setting.
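A minimal sketch of what that selection logic could look like, assuming the `langchain_openai`, `langchain_ollama`, and `langchain-google-genai` packages are available; the function name `create_llm` and its parameters are hypothetical, not the project's actual API:

```python
# Sketch only: pick the LLM client from LLM_MODEL_TYPE instead of hardcoding OpenAI.
from typing import Optional

from langchain_openai import ChatOpenAI
from langchain_ollama import ChatOllama
from langchain_google_genai import ChatGoogleGenerativeAI  # assumes langchain-google-genai is installed


def create_llm(llm_model_type: str, llm_model: str, llm_api_url: str, api_key: Optional[str] = None):
    if llm_model_type == "ollama":
        return ChatOllama(model=llm_model, base_url=llm_api_url)
    if llm_model_type == "gemini":
        return ChatGoogleGenerativeAI(model=llm_model, google_api_key=api_key)
    if llm_model_type == "openai":
        return ChatOpenAI(model_name=llm_model, openai_api_key=api_key, temperature=0.4)
    raise ValueError(f"Unsupported LLM_MODEL_TYPE: {llm_model_type}")
```

The resume and cover-letter builder classes could then receive the returned client instead of constructing `ChatOpenAI` themselves.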
**Additional Notes:**
Let me know if you need further details or assistance in resolving this issue! 😊
Expected behavior
The application should work with the chosen model instead of calling the OpenAI endpoint regardless of the settings/config.
Actual behavior
It tries to use the OpenAI API endpoint to build the resume because OpenAI is hardcoded in resume_facade.py and the other LLM files in the llm folder. The bot is simply not functional.
Branch
main
Branch name
No response
Python version
Latest version
LLM Used
gemini
Model used
gemini-1.5-pro
Additional context
Even if you switch to Ollama using locally hosted models, the same malfunction occurs.