
[BUG]: Unable to Use Gemini/Ollama: LLM_MODEL_TYPE Configuration Not Respected Because OpenAI API Calls Are Hardcoded in ResumeFacade and the LLM Files #1069

Open
krixx646 opened this issue Jan 20, 2025 · 5 comments
Labels
bug Something isn't working

Comments

@krixx646

Describe the bug

The application is currently hardcoded to use the OpenAI API for LLM interactions, preventing the use of other LLM providers like Gemini and Ollama, despite the configuration options provided in config.py (LLM_MODEL_TYPE, LLM_MODEL, LLM_API_URL). The ResumeFacade and associated LLM classes within the src/libs/resume_and_cover_builder/llm directory directly call the OpenAI API without checking the configured LLM provider. This results in either the OpenAI API being called regardless of configuration, or errors if the OpenAI API key is invalid or rate limits are exceeded. The application should instead use the specified LLM_MODEL_TYPE to dynamically select the appropriate LLM provider for resume and cover letter generation.

Steps to reproduce

Bug Report: LLM_MODEL_TYPE Configuration Not Respected (OpenAI Always Called)


Summary:
When configuring the application to use either Gemini or Ollama by setting LLM_MODEL_TYPE in config.py, the application continues to call the OpenAI API instead of the configured LLM. This behavior indicates that the LLM_MODEL_TYPE setting is not being respected, and the OpenAI API is being used regardless of the configuration.


Steps to Reproduce:

  1. Configure for Gemini/Ollama:

    • In the config.py file, set LLM_MODEL_TYPE to either "gemini" or "ollama".
    • If using Ollama, ensure the LLM_MODEL and LLM_API_URL are correctly configured (e.g., LLM_MODEL = "llama2" and LLM_API_URL = "http://127.0.0.1:11434/").
  2. Confirm API Keys:

    • Ensure valid API keys for Gemini or Ollama are present in the secrets.yaml file.
    • For Gemini, the key should be under llm_api_key.
    • For Ollama, no API key is required, but the local server must be running (a quick reachability check is sketched after these steps).
  3. Run the Application:

    • Execute the application using:
      python main.py
  4. Select an Action:

    • Choose any action that triggers LLM interaction, such as:
      • "Generate Resume"
      • "Generate Tailored Cover Letter"
  5. Observe Logs and Output:

    • Check the logs for evidence of OpenAI API calls. For example:
      DEBUG:openai._base_client:Sending HTTP Request: POST https://api.openai.com/v1/chat/completions
      
    • The generated resume or cover letter may be incorrect or empty, as the application fails to interact with the configured LLM (Gemini/Ollama).
    • If the OpenAI API key is invalid, you may see 401 Unauthorized errors in the logs.
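
As referenced in step 2, a quick way to confirm the local Ollama server is reachable before running the app (Ollama's root endpoint should reply "Ollama is running"):

import requests

# Quick reachability check for the local Ollama server
resp = requests.get("http://127.0.0.1:11434/")
print(resp.status_code, resp.text)  # expected: 200 Ollama is running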

Expected Behavior:

  • When LLM_MODEL_TYPE is set to "gemini" or "ollama", the application should not make any calls to the OpenAI API.
  • The logs should show interactions with the Gemini or Ollama API, depending on the configuration.
  • The generated resume or cover letter should reflect the output from the chosen LLM.

Actual Behavior:

  • The application always calls the OpenAI API, regardless of the LLM_MODEL_TYPE setting.
  • The logs show OpenAI API requests, such as:
    DEBUG:openai._base_client:Sending HTTP Request: POST https://api.openai.com/v1/chat/completions
    
  • The generated output is either incorrect or fails entirely, as the application does not interact with the configured LLM.

Root Cause:

The issue appears to be in the LLM selection logic: despite LLM_MODEL_TYPE being set to "gemini" or "ollama", the code bypasses this configuration and defaults to OpenAI, so the setting never determines which LLM client is instantiated.


Relevant Code Snippets:

Here are the relevant parts of the code that need to be reviewed and fixed:

config.py

LLM_MODEL_TYPE = 'gemini'  # or 'ollama'
LLM_MODEL = 'gemini-pro'   # or 'llama2' for Ollama
LLM_API_URL = ''           # Only required for Ollama

secrets.yaml

llm_api_key: YOUR_GEMINI_API_KEY  # or leave blank for Ollama

LLM Selection Logic (Example)

if LLM_MODEL_TYPE == 'openai':
    llm = OpenAIClient(api_key=OPENAI_API_KEY)
elif LLM_MODEL_TYPE == 'gemini':
    llm = GeminiClient(api_key=GEMINI_API_KEY)
elif LLM_MODEL_TYPE == 'ollama':
    llm = OllamaClient(api_url=LLM_API_URL, model=LLM_MODEL)
else:
    raise ValueError(f"Unsupported LLM model type: {LLM_MODEL_TYPE}")
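
For reference, here is a minimal sketch of what such a provider-aware factory could look like using LangChain's model classes (assuming the langchain-openai, langchain-google-genai, and langchain-ollama packages; the create_llm helper name is hypothetical):

from config import LLM_MODEL_TYPE, LLM_MODEL, LLM_API_URL

def create_llm(llm_api_key: str):
    """Instantiate the model class that matches the configured provider."""
    if LLM_MODEL_TYPE == "openai":
        from langchain_openai import ChatOpenAI
        return ChatOpenAI(model_name=LLM_MODEL, openai_api_key=llm_api_key, temperature=0.4)
    if LLM_MODEL_TYPE == "gemini":
        from langchain_google_genai import ChatGoogleGenerativeAI
        return ChatGoogleGenerativeAI(model=LLM_MODEL, google_api_key=llm_api_key)
    if LLM_MODEL_TYPE == "ollama":
        from langchain_ollama import OllamaLLM
        return OllamaLLM(model=LLM_MODEL, base_url=LLM_API_URL)
    raise ValueError(f"Unsupported LLM model type: {LLM_MODEL_TYPE}")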

Proposed Fix:

  1. Review the LLM Selection Logic:

    • Ensure the LLM_MODEL_TYPE setting is correctly used to determine which LLM client to instantiate.
    • Add debug logs to verify which LLM client is being used (see the one-liner after this list).
  2. Remove Hardcoded OpenAI Calls:

    • Check for any hardcoded OpenAI API calls in the code and replace them with the configured LLM client.
  3. Test with Gemini and Ollama:

    • Verify that the application interacts with the correct LLM based on the LLM_MODEL_TYPE setting.
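
For step 1, a single debug line (using the loguru logger the project already depends on) is enough to confirm which client the selection logic produced; llm here stands for whatever the factory returned:

from loguru import logger

# Verify which LLM client was actually instantiated
logger.debug(f"LLM_MODEL_TYPE={LLM_MODEL_TYPE!r} -> using {type(llm).__name__}")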

Additional Notes:

  • The hardcoded OpenAI call bypasses this selection logic entirely.
  • This issue prevents the use of alternative LLMs (Gemini/Ollama) and forces the application to rely on OpenAI, even when not configured to do so.

Let me know if you need further details or assistance in resolving this issue! 😊

Expected behavior

The application should work with the chosen model instead of hitting the OpenAI endpoint regardless of the settings/config.

Actual behavior

It tries to use the OpenAI API endpoint to generate the resume because the developers, for some reason, hardcoded it in resume_facade.py and the other LLM files in the llm folder. The bot is simply not functional.

Branch

main

Branch name

No response

Python version

Latest version

LLM Used

gemini

Model used

gemini-1.5-pro

Additional context

Even if you switch to Ollama with locally hosted models, the same malfunction occurs.

@krixx646 krixx646 added the bug Something isn't working label Jan 20, 2025
@Gui153

Gui153 commented Jan 21, 2025

Also running into the same problem. I will try to check the code on my own, but I am not sure I will find anything, since this is my first time looking at this code.

@Gui153

Gui153 commented Jan 21, 2025

Results: it turns out the application never checks the config file we set up.
The problem can be fixed by changing all the LLM files and the utility file so that they check the config file and assign the proper class for each option.

(Only tested with Ollama, and it still errors afterwards, but now it actually sends requests to either OpenAI or Ollama.)

Changes I made to llm_generate_resume.py (lines 5-10, 35-38):

import os
import textwrap
from config import LLM_MODEL_TYPE, LLM_MODEL, LLM_API_URL
from src.libs.resume_and_cover_builder.utils import LoggerChatModel
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_ollama import OllamaLLM
from dotenv import load_dotenv
from concurrent.futures import ThreadPoolExecutor, as_completed
from loguru import logger
from pathlib import Path

# Load environment variables from .env file
load_dotenv()

# Configure log file
log_folder = 'log/resume/gpt_resume'
if not os.path.exists(log_folder):
    os.makedirs(log_folder)
log_path = Path(log_folder).resolve()
logger.add(log_path / "gpt_resume.log", rotation="1 day", compression="zip", retention="7 days", level="DEBUG")

class LLMResumer:
    def __init__(self, openai_api_key, strings):
        # Select the model based on the configured provider instead of
        # always instantiating ChatOpenAI first
        if LLM_MODEL_TYPE == "ollama":
            self.llm_cheap = LoggerChatModel(OllamaLLM(model=LLM_MODEL))
        else:
            self.llm_cheap = LoggerChatModel(
                ChatOpenAI(
                    model_name="gpt-4o-mini", openai_api_key=openai_api_key, temperature=0.4
                )
            )
        self.strings = strings
Changes I made to /src/libs/utility.py (lines 5-10, 80):

# app/libs/resume_and_cover_builder/utils.py
import json
import openai
import time
from datetime import datetime
from typing import Dict, List, Union
from langchain_core.messages.ai import AIMessage
from langchain_core.prompt_values import StringPromptValue
from langchain_openai import ChatOpenAI
from langchain_ollama import OllamaLLM
from .config import global_config
from loguru import logger
from requests.exceptions import HTTPError as HTTPStatusError


class LLMLogger:

    def __init__(self, llm: ChatOpenAI):
        self.llm = llm

    @staticmethod
    def log_request(prompts, parsed_reply: Dict[str, Dict]):
        calls_log = global_config.LOG_OUTPUT_FILE_PATH / "open_ai_calls.json"
        if isinstance(prompts, StringPromptValue):
            prompts = prompts.text
        else:
            # Prompt values such as ChatPromptValue carry a .messages list
            prompts = {
                f"prompt_{i+1}": prompt.content
                for i, prompt in enumerate(prompts.messages)
            }

        current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

        # Extract token usage details from the response
        token_usage = parsed_reply["usage_metadata"]
        output_tokens = token_usage["output_tokens"]
        input_tokens = token_usage["input_tokens"]
        total_tokens = token_usage["total_tokens"]

        # Extract model details from the response
        model_name = parsed_reply["response_metadata"]["model_name"]
        prompt_price_per_token = 0.00000015
        completion_price_per_token = 0.0000006

        # Calculate the total cost of the API call
        total_cost = (input_tokens * prompt_price_per_token) + (
            output_tokens * completion_price_per_token
        )

        # Create a log entry with all relevant information
        log_entry = {
            "model": model_name,
            "time": current_time,
            "prompts": prompts,
            "replies": parsed_reply["content"],  # Response content
            "total_tokens": total_tokens,
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "total_cost": total_cost,
        }

        # Write the log entry to the log file in JSON format
        with open(calls_log, "a", encoding="utf-8") as f:
            json_string = json.dumps(log_entry, ensure_ascii=False, indent=4)
            f.write(json_string + "\n")


class LoggerChatModel:

    def __init__(self, llm: Union[ChatOpenAI, OllamaLLM]):
        self.llm = llm

Sample of the new error I am getting:

2025-01-21 01:43:12.517 | ERROR    | src.libs.resume_and_cover_builder.utils:__call__:103 - Unexpected error occurred: 'str' object has no attribute 'content', retrying in 80 seconds... (Attempt 4/15)
INFO:httpx:HTTP Request: POST http://127.0.0.1:11434/api/generate "HTTP/1.1 200 OK"
2025-01-21 01:43:14.426 | ERROR    | src.libs.resume_and_cover_builder.utils:__call__:103 - Unexpected error occurred: 'str' object has no attribute 'content', retrying in 40 seconds... (Attempt 3/15)
2025-01-21 01:43:15.679 | ERROR    | src.libs.resume_and_cover_builder.utils:__call__:103 - Unexpected error occurred: 'str' object has no attribute 'content', retrying in 80 seconds... (Attempt 4/15)
2025-01-21 01:43:17.354 | ERROR    | src.libs.resume_and_cover_builder.utils:__call__:103 - Unexpected error occurred: 'str' object has no attribute 'content', retrying in 80 seconds... (Attempt 4/15)
2025-01-21 01:43:24.076 | ERROR    | src.libs.resume_and_cover_builder.utils:__call__:103 - Unexpected error occurred: 'str' object has no attribute 'content', retrying in 80 seconds... (Attempt 4/15)
[repeated httpcore DEBUG request/response lines trimmed: every POST to http://127.0.0.1:11434/api/generate returns "HTTP/1.1 200 OK", yet each attempt fails with the same error]
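
I suspect the 'str' object has no attribute 'content' error comes from OllamaLLM.invoke() returning a plain string, while ChatOpenAI.invoke() returns an AIMessage whose text lives in .content. An untested sketch of a guard that would handle both return types:

reply = self.llm.invoke(messages)
# OllamaLLM returns a plain str; ChatOpenAI returns an AIMessage,
# so normalize before any downstream .content access
content = reply if isinstance(reply, str) else reply.content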

@Gui153

Gui153 commented Jan 21, 2025

Fixed the previous issue; the application now runs with Ollama. The generated PDF text is good but very small, which is probably caused by the default template.

Changes I made to /src/libs/utility.py (lines 90-91):

"""
This module contains utility functions for the Resume and Cover Letter Builder service.
"""

# app/libs/resume_and_cover_builder/utils.py
import json
import openai
import time
from datetime import datetime
from typing import Dict, List, Union
from langchain_core.messages.ai import AIMessage
from langchain_core.prompt_values import StringPromptValue
from langchain_openai import ChatOpenAI
from langchain_ollama import OllamaLLM
from .config import global_config
from loguru import logger
from requests.exceptions import HTTPError as HTTPStatusError


class LLMLogger:

    def __init__(self, llm: ChatOpenAI):
        self.llm = llm

    @staticmethod
    def log_request(prompts, parsed_reply: Dict[str, Dict]):
        calls_log = global_config.LOG_OUTPUT_FILE_PATH / "open_ai_calls.json"
        if isinstance(prompts, StringPromptValue):
            prompts = prompts.text
        elif isinstance(prompts, Dict):
            # Convert prompts to a dictionary if they are not in the expected format
            prompts = {
                f"prompt_{i+1}": prompt.content
                for i, prompt in enumerate(prompts.messages)
            }
        else:
            prompts = {
                f"prompt_{i+1}": prompt.content
                for i, prompt in enumerate(prompts.messages)
            }

        current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

        # Extract token usage details from the response
        token_usage = parsed_reply["usage_metadata"]
        output_tokens = token_usage["output_tokens"]
        input_tokens = token_usage["input_tokens"]
        total_tokens = token_usage["total_tokens"]

        # Extract model details from the response
        model_name = parsed_reply["response_metadata"]["model_name"]
        prompt_price_per_token = 0.00000015
        completion_price_per_token = 0.0000006

        # Calculate the total cost of the API call
        total_cost = (input_tokens * prompt_price_per_token) + (
            output_tokens * completion_price_per_token
        )

        # Create a log entry with all relevant information
        log_entry = {
            "model": model_name,
            "time": current_time,
            "prompts": prompts,
            "replies": parsed_reply["content"],  # Response content
            "total_tokens": total_tokens,
            "input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "total_cost": total_cost,
        }

        # Write the log entry to the log file in JSON format
        with open(calls_log, "a", encoding="utf-8") as f:
            json_string = json.dumps(log_entry, ensure_ascii=False, indent=4)
            f.write(json_string + "\n")


class LoggerChatModel:

    def __init__(self, llm: Union[ChatOpenAI, OllamaLLM]):
        self.llm = llm

    def __call__(self, messages: List[Dict[str, str]]) -> str:
        max_retries = 2
        retry_delay = 10

        for attempt in range(max_retries):
            try:
                reply = self.llm.invoke(messages)
                #parsed_reply = self.parse_llmresult(reply)
                #LLMLogger.log_request(prompts=messages, parsed_reply=parsed_reply)

Generated with sample data:

resume_base.pdf

@Gui153

Gui153 commented Jan 21, 2025

I will try to commit the code once I fix it in all files.

@krixx646
Author

Thanks, I will be waiting for the newly committed code.
