Greetings! Thanks for your great work!
Recently I have been using AIDE to test some competitions that are more complex than typical Kaggle ones, but I ran into two problems that stall the process.
The first is an issue concerning "function_call is empty, it is not a function call".
For example,
Error occurred: function_call is empty, it is not a function call: ChatCompletionMessage(content='### Evaluation of Code Execution and Findings\n\n#### 1. Code Overview:\n The code executes a typical machine learning pipeline that involves:\n - Loading the training and test datasets.\n - Splitting the training dataset into training and validation sets.\n - Training a linear regression model on the training data.\n - Evaluating the model using Root Mean Squared Error (RMSE) on the validation set.\n - Making predictions for the test dataset.\n - Generating a submission file in the required format.\n\n#### 2. Bug Check:\n Based on a quick analysis of the code and its execution, there do not seem to be any major bugs. The key steps of data processing, model training, prediction, and submission preparation appear to be implemented correctly. \n\n However, there are a few considerations to ensure the robustness of the code:\n\n - Data Preprocessing: The code does not handle any potential missing values in the dataset. If the training or test data contains missing values, this could result in errors during training or prediction. It might be beneficial to inspect for missing values before the model is trained.\n \n - Feature Scaling: Linear Regression does not necessarily require feature scaling, but in some cases, especially with features of different scales, it could improve performance. Since the dataset might contain features with varying scales, you could consider applying scaling techniques like StandardScaler or MinMaxScaler.\n\n - Model Choice: The code uses Linear Regression, which is a simple model that might not capture the complexity of the dataset if there are non-linear relationships. While this might be a reasonable starting point, it's important to check whether the model performance could be improved using more complex algorithms like Decision Trees, Random Forest, or Gradient Boosting.\n\n#### 3. Empirical Findings:\n\n - Validation RMSE: The validation RMSE is reported as approximately 0.71. This value seems reasonable for a basic model like Linear Regression. However, it would be helpful to compare this baseline performance with more complex models to evaluate the effectiveness of this approach.\n\n - Execution Time: The execution time is not reported explicitly but is stated as "a moment seconds," indicating that the model training and prediction happened quickly, which is expected with the simplicity of Linear Regression.\n\n#### 4. Recommendations for Improvement:\n\n - Handling Missing Data: You can check for missing values and handle them using imputation techniques if necessary:\n python\n X = X.fillna(X.mean()) # Impute missing values with the mean of the respective feature\n \n \n - Feature Engineering: Explore additional feature engineering techniques (e.g., interaction terms or polynomial features) to capture more complex relationships within the data.\n \n - Model Comparison: After validating the Linear Regression model, you might want to try other models like Random Forest or Gradient Boosting for potentially better results. 
You can also experiment with hyperparameter tuning to improve model performance.\n\n - Cross-Validation: Instead of using a single train-validation split, you could use cross-validation (e.g., K-fold cross-validation) to ensure that the model generalizes well to different subsets of the data:\n python\n from sklearn.model_selection import cross_val_score\n scores = cross_val_score(model, X, y, cv=5, scoring='neg_mean_squared_error')\n rmse = np.sqrt(-scores.mean())\n print(f"Cross-validated RMSE: {rmse}")\n \n\n### Conclusion:\nThe code runs without any bugs and achieves an RMSE of 0.71 on the validation set. However, improvements can be made in terms of handling missing data, experimenting with different models, and incorporating feature scaling or engineering techniques. These enhancements could potentially improve the model's performance, especially when competing in a real-world Kaggle competition.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None)
This function-call error directly causes the process to get stuck.
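In case a concrete illustration helps, here is a minimal sketch of the kind of defensive handling I have in mind. It assumes the `openai` Python client; the `submit_review` schema, model name, and retry count are placeholders I made up, not AIDE's real ones. The idea is to check whether the response actually contains a function call and retry instead of failing:

```python
import json
from openai import OpenAI

client = OpenAI()

# Hypothetical function schema for illustration; AIDE's real schema will differ.
REVIEW_FUNC = {
    "name": "submit_review",
    "description": "Submit a structured review of the executed code.",
    "parameters": {
        "type": "object",
        "properties": {
            "is_bug": {"type": "boolean"},
            "summary": {"type": "string"},
        },
        "required": ["is_bug", "summary"],
    },
}

def query_with_function_call(messages, max_retries=3):
    """Retry until the model returns a function call instead of plain text."""
    for _ in range(max_retries):
        response = client.chat.completions.create(
            model="gpt-4o",  # placeholder model name
            messages=messages,
            functions=[REVIEW_FUNC],
            function_call={"name": "submit_review"},  # explicitly request this function
        )
        message = response.choices[0].message
        if message.function_call is not None:
            # Parse the JSON arguments of the returned function call.
            return json.loads(message.function_call.arguments)
        # The model answered in plain markdown (as in the error above); try again.
    raise RuntimeError("Model never returned a function call")
```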
The other problem concerns execution running beyond the timeout limit.
For example, in one case where the agent tried to train a RandomForest classifier on a large dataset in my local Python kernel, the "executing code" progress bar in the command line ran far longer than the timeout, which was set to 600 seconds (10 minutes) in the config.
My config is as follows. I'm not sure whether I set it correctly; are there other places to set the limit?
/aide/aideml/aide/utils/config.yaml
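I am not sure how AIDE enforces the limit internally, so purely as an illustration of what I expected, here is a minimal sketch (not AIDE's actual mechanism) that enforces a hard wall-clock limit by running the generated script in a subprocess with a timeout:

```python
import subprocess
import sys

def run_with_timeout(script_path: str, timeout_s: int = 600) -> str:
    """Run a generated script in a separate process and kill it after timeout_s seconds."""
    try:
        result = subprocess.run(
            [sys.executable, script_path],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # hard wall-clock limit
        )
        return result.stdout + result.stderr
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process when the timeout expires.
        return f"Execution exceeded the {timeout_s}s limit and was terminated."
```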
Looking forward to your reply, many thanks!
Johnson
AIDE Installation: Commandline
AIDE Version: latest
Operating System: Linux
Logs, Errors, Screenshots, and Additional Context: No response
The execution time update works well, but the "function_call is empty, it is not a function call" issue is still there. Have you pushed any new updates to resolve it?
Thanks for the feedback. I think the function-call issue might be related to the prompt, and I am still testing some edge cases to make sure it is resolved.
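One direction worth testing, just a sketch under the assumption that the newer tools API is available (the `submit_review` schema here is the same hypothetical one as above), is to pin `tool_choice` to the review function, which should make the model less likely to answer in plain markdown:

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical schema, wrapped in the newer "tools" format.
tools = [{
    "type": "function",
    "function": {
        "name": "submit_review",
        "description": "Submit a structured review of the executed code.",
        "parameters": {
            "type": "object",
            "properties": {
                "is_bug": {"type": "boolean"},
                "summary": {"type": "string"},
            },
            "required": ["is_bug", "summary"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": "Review the executed code and report findings."}],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "submit_review"}},  # force this tool
)

tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.arguments)  # JSON string with the structured review
```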