Duplicate logs with GitHub API data ingestions #11404

Open
l-koppuravuri-BL opened this issue Nov 8, 2024 · 20 comments

@l-koppuravuri-BL

Describe the bug
We tried ingesting GitHub data into Sentinel using both function apps and logic apps, and found that both solutions produce duplicate data. Before digging into the code, we wanted to check whether this is a known problem or whether there are any limitations.

To Reproduce
Steps to reproduce the behavior:
1. As described in the documentation, deployed the solutions and updated the orgs and lastjobruntime.json files. The jobs run on the default schedule of 10 minutes.

Expected behavior
Duplicate logs should not appear.

Screenshots
Image
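
For readers who want to quantify the duplication in an exported sample of the logs, here is a minimal sketch of one way to count duplicate records. It is not part of either connector. The column names `action_s` and `actor_s` are hypothetical, and the sketch assumes `TimeGenerated` is set at ingestion time, so it is ignored when comparing records.

```python
import hashlib
import json
from collections import Counter

# Assumption: TimeGenerated is set at ingestion time, so it can differ
# between two otherwise identical copies of the same event.
INGESTION_FIELDS = ("TimeGenerated",)


def record_fingerprint(record, ignore=INGESTION_FIELDS):
    """Return a stable hash of a record's content, skipping ignored fields."""
    payload = {k: v for k, v in record.items() if k not in ignore}
    canonical = json.dumps(payload, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def count_duplicates(records):
    """Map fingerprint -> number of copies, for fingerprints seen more than once."""
    counts = Counter(record_fingerprint(r) for r in records)
    return {fp: n for fp, n in counts.items() if n > 1}


# Hypothetical sample rows; real GitHub_CL column names may differ.
sample = [
    {"TimeGenerated": "2024-11-08T10:00:05Z", "action_s": "repo.create", "actor_s": "alice"},
    {"TimeGenerated": "2024-11-08T10:10:07Z", "action_s": "repo.create", "actor_s": "alice"},
    {"TimeGenerated": "2024-11-08T10:10:07Z", "action_s": "team.add_member", "actor_s": "bob"},
]
duplicates = count_duplicates(sample)
print(len(duplicates), list(duplicates.values()))
```

If the exported records carry an event-side timestamp or ID, it should stay in the fingerprint; otherwise two genuinely separate events with the same content would be counted as duplicates.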

@l-koppuravuri-BL
Author

I have already looked into issue #9356, but the solution offered there has not helped: I am already using an org.json file with the new structure, and I do not see any rate limits.

@v-rusraut v-rusraut added the Connector Connector specialty review needed label Nov 11, 2024
@v-rusraut
Contributor

Hi @l-koppuravuri-BL, thanks for flagging this issue. We will investigate and get back to you with updates. Thanks!

@v-sudkharat
Contributor

@l-koppuravuri-BL, are you using this data connector to pull the data? - https://github.com/Azure/Azure-Sentinel/tree/master/DataConnectors/GithubFunction

Also, is any other connector configured and pointing to the same workspace?

@l-koppuravuri-BL
Author

l-koppuravuri-BL commented Nov 12, 2024

@v-sudkharat No additional connectors are configured on this workspace.

@onyigbo

onyigbo commented Nov 12, 2024

In the logic app step where you have the API URL, were you able to specify the time interval and its format (start and end time; some APIs expect ISO format), based on the GitHub documentation on how the API should be used?
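
To illustrate the time-interval point, here is a minimal sketch of how a scheduled job could derive non-overlapping query windows from a stored checkpoint and format them as ISO 8601 strings. This is only an assumption about one possible approach, not the actual logic of either connector. The function names and the checkpoint values are hypothetical.

```python
from datetime import datetime, timezone


def to_iso(ts):
    """Format an aware datetime as an ISO 8601 UTC string."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def next_window(last_end, now):
    """Return a half-open [start, end) window starting where the previous one ended.

    Returns None when there is nothing new to fetch.
    """
    if now <= last_end:
        return None
    return last_end, now


# Hypothetical checkpoint and run times for a 10-minute schedule.
checkpoint = datetime(2024, 11, 8, 10, 0, tzinfo=timezone.utc)
run_times = [
    datetime(2024, 11, 8, 10, 10, tzinfo=timezone.utc),
    datetime(2024, 11, 8, 10, 20, tzinfo=timezone.utc),
]

windows = []
for now in run_times:
    window = next_window(checkpoint, now)
    if window is not None:
        start, end = window
        windows.append((to_iso(start), to_iso(end)))
        # Advance the checkpoint; a real job would persist it after a successful run.
        checkpoint = end

print(windows)
```

Because each window starts exactly where the previous one ended and the end is exclusive, an event is fetched by at most one run, provided the checkpoint is only advanced after a run succeeds.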

@v-sudkharat
Contributor

@l-koppuravuri-BL, could you please confirm that this connector has been configured? - https://github.com/Azure/Azure-Sentinel/tree/master/DataConnectors/GithubFunction

Or please let us know whether a logic app has been set up in the environment, as mentioned in the comment above. Thanks!

@l-koppuravuri-BL
Author

@v-sudkharat We have set up connectors based on both logic apps and function apps, and we have observed that they behave in the same way.

Since both connectors report to distinct log analytics workspaces, there should not be any conflicts or duplicate logs.

@onyigbo: I am not sure whom the time-interval question is addressed to, but I have not set it up that way and did not notice any such input parameters during deployment.
Image

@v-sudkharat
Contributor

Hey @l-koppuravuri-BL,
Both of the data ingestion methods below:

  1. Function app - https://github.com/Azure/Azure-Sentinel/tree/master/DataConnectors/GithubFunction
  2. Logic apps - https://github.com/Azure/Azure-Sentinel/tree/18a715bfdccdfe5b1e5474cc01b418b965ab7572/DataConnectors/GitHub

ingest the GitHub audit logs via GraphQL ( https://developer.github.com/v4/interface/auditentry/ ).

Since you have configured both, they use the same table, "GitHub_CL", to ingest the data, which is why the duplication occurs.
Could you please stop one of the ingestions and check whether duplicates still appear in the workspace?

If the logic app and function app are configured for different workspaces, you can verify that a different workspace ID and key were entered before deployment.
For the function app, check the environment variables of the deployed function app:

Image

For the logic app, check the deployed azureloganalyticsdatacollector-GitHubPlaybooks:

Image

If both values are the same, the data is being ingested into the same workspace, which causes the duplicates.
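
The comparison described above can be sketched as a small check, assuming the workspace IDs have been copied by hand from each deployment's settings. The function name, the `workspace_id` key, and the sample IDs are all hypothetical.

```python
def shared_workspaces(deployments):
    """Group deployment names by workspace ID and return IDs used more than once."""
    by_workspace = {}
    for name, settings in deployments.items():
        # Normalize so that case or stray whitespace does not hide a match.
        workspace_id = settings["workspace_id"].strip().lower()
        by_workspace.setdefault(workspace_id, []).append(name)
    return {ws: names for ws, names in by_workspace.items() if len(names) > 1}


# Hypothetical values copied by hand from each deployment's settings.
deployments = {
    "function_app": {"workspace_id": "11111111-1111-1111-1111-111111111111"},
    "logic_app": {"workspace_id": "22222222-2222-2222-2222-222222222222"},
}
print(shared_workspaces(deployments))
```

An empty result means no two deployments in the input point at the same workspace.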

Thanks!

@l-koppuravuri-BL
Author

l-koppuravuri-BL commented Nov 18, 2024

Hello @v-sudkharat

As shared earlier, we have configured function app and logic app solutions on distinct log analytics workspaces.

In fact, we first tried the logic app-based solution and, because we saw duplicate logs, deployed the function app-based solution nearly two months later.

Function App
Image

logic app

Image

@v-sudkharat
Contributor

@l-koppuravuri-BL, thanks for clarifying the details. We will check the function app and get back to you.

@l-koppuravuri-BL
Author

@v-sudkharat: I wanted to check whether you have any further updates. If you like, we can also set up a meeting and go through the configuration.

@v-sudkharat
Contributor

@l-koppuravuri-BL, I was on leave and need some more time to investigate; I am also checking with the concerned team. Once that is done, we can schedule a meeting if required. Thanks for your cooperation.

@v-sudkharat
Contributor

@l-koppuravuri-BL, could you please send the following to the mail ID [email protected]:

  1. The files shown below:
    Image

  2. The GitHub_CL logs.

Tagging connector authors for visibility - @dicolanl / @sreedharande

Thanks!

@l-koppuravuri-BL
Author

@v-sudkharat : sent both files.

@v-sudkharat
Contributor

@l-koppuravuri-BL, please share the GitHub_CL logs. The most recent 24 hours will do.

@v-sudkharat
Contributor

@l-koppuravuri-BL , waiting for GitHub_CL logs. Thanks!

@l-koppuravuri-BL
Author

@v-sudkharat : sent logs

@l-koppuravuri-BL , waiting for GitHub_CL logs. Thanks!

sent the logs

@v-sudkharat
Contributor

@l-koppuravuri-BL , Received. Thanks!

@l-koppuravuri-BL
Author

@v-sudkharat : would like to check if you made any progress with the investigation.

@v-sudkharat
Contributor

@l-koppuravuri-BL, we are following up with the connector author to investigate this issue. We will update you once we get a response from them. Thanks!
