Getting Spark check error from time to time #3766

Open
Am1rr3zA opened this issue Jul 18, 2018 · 1 comment

Comments

@Am1rr3zA

I installed dd-agent (version 6) on one of our EC2 instances and changed the Spark check configuration so that I can monitor my EMR cluster:

init_config:
instances:
  - spark_url: http://10.0.0.1:8088
    cluster_name: AnalyticDataDog-Test
    spark_cluster_mode: spark_yarn_mode
    tags:
      - instance:Test
      - cluster:Analytics
      - env:EMR

After I restarted dd-agent, everything worked fine and I started to get Spark metrics,
but after a couple of minutes I started to get this:

ERROR | (runner.go:277 in work) | Error running check spark: 
[
  {
    "message": "Expecting value: line 1 column 1 (char 0)",
    "traceback": "Traceback (most recent call last):\n  File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/checks/base.py\", line 294, in run\n    self.check(copy.deepcopy(self.instances[0]))\n  File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/spark/spark.py\", line 153, in check\n    spark_apps = self._get_running_apps(instance, requests_config)\n  File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/spark/spark.py\", line 260, in _get_running_apps\n    return self._get_spark_app_ids(running_apps, requests_config, tags)\n  File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/spark/spark.py\", line 428, in _get_spark_app_ids\n    SPARK_SERVICE_CHECK, requests_config, tags)\n  File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/spark/spark.py\", line 635, in _rest_request_to_json\n    response_json = response.json()\n  File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/requests/models.py\", line 896, in json\n    return complexjson.loads(self.text, **kwargs)\n  File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/simplejson/__init__.py\", line 505, in loads\n    return _default_decoder.decode(s)\n  File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/simplejson/decoder.py\", line 370, in decode\n    obj, end = self.raw_decode(s)\n  File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/simplejson/decoder.py\", line 400, in raw_decode\n    return self.scan_once(s, idx=_w(s, idx).end())\nJSONDecodeError: Expecting value: line 1 column 1 (char 0)\n"
  }
]
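
For context, "Expecting value: line 1 column 1 (char 0)" is the message a JSON decoder gives when the body it is handed is empty or does not begin with a JSON value, and the traceback shows it being raised from response.json(). Below is a minimal sketch of that failure mode. It uses Python 3's standard json module as a stand-in for the Agent's bundled simplejson, and parse_json_body is a hypothetical helper written for illustration, not part of the Spark check:

```python
import json


def parse_json_body(body):
    """Return (data, error): the parsed JSON, or None plus a short diagnostic.

    Hypothetical helper for illustration only; it is not the Agent's code.
    """
    # An empty or whitespace-only body is reported separately from other
    # non-JSON content so the two cases are easy to tell apart.
    if not body or not body.strip():
        return None, "empty body"
    try:
        return json.loads(body), None
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
        # Keep a short preview of the body so the log shows what came back.
        preview = body.strip()[:60]
        return None, "not JSON ({}): {!r}".format(exc, preview)


if __name__ == "__main__":
    samples = ['{"apps": null}', "", "<html>Connection refused</html>"]
    for sample in samples:
        print(parse_json_body(sample))
```

Printing a preview of the body like this, instead of letting the decoder raise, would show whether the endpoint returned nothing, an HTML error page, or something else.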

When I also check the Datadog service status (service datadog-agent status), I get:

● datadog-agent.service - "Datadog Agent"
   Loaded: loaded (/lib/systemd/system/datadog-agent.service; enabled)
   Active: active (running) since Wed 2018-07-18 17:53:21 UTC; 1h 37min ago
 Main PID: 18098 (agent)
   CGroup: /system.slice/datadog-agent.service
           └─18098 /opt/datadog-agent/bin/agent/agent start -p /opt/datadog-agent/run/agent.pid

Jul 18 19:29:08 ip-10-0-0-209 agent[18098]: 2018-07-18 19:29:08 UTC | INFO | (runner.go:309 in work) | Done running check memory
Jul 18 19:29:08 ip-10-0-0-209 agent[18098]: 2018-07-18 19:29:08 UTC | INFO | (runner.go:246 in work) | Running check network
Jul 18 19:29:08 ip-10-0-0-209 agent[18098]: 2018-07-18 19:29:08 UTC | INFO | (runner.go:309 in work) | Done running check network
Jul 18 19:29:08 ip-10-0-0-209 agent[18098]: 2018-07-18 19:29:08 UTC | INFO | (runner.go:246 in work) | Running check ntp
Jul 18 19:29:08 ip-10-0-0-209 agent[18098]: 2018-07-18 19:29:08 UTC | INFO | (runner.go:309 in work) | Done running check ntp
Jul 18 19:29:08 ip-10-0-0-209 agent[18098]: 2018-07-18 19:29:08 UTC | INFO | (runner.go:246 in work) | Running check uptime
Jul 18 19:29:08 ip-10-0-0-209 agent[18098]: 2018-07-18 19:29:08 UTC | INFO | (runner.go:309 in work) | Done running check uptime
Jul 18 19:29:37 ip-10-0-0-209 agent[18098]: 2018-07-18 19:29:37 UTC | INFO | (transaction.go:121 in Process) | Successfully posted payload to "https://6-3-3-app.agent.datadoghq.com/api/v1/series?api_key=*************************5c3b2"
Jul 18 19:29:59 ip-10-0-0-209 agent[18098]: 2018-07-18 19:29:59 UTC | ERROR | (runner.go:277 in work) | Error running check spark: [{"message": "Expecting value: line 1 column 1 (char 0)", "traceback": "Traceback (most recent call ...checks/base.py\",
Jul 18 19:30:10 ip-10-0-0-209 agent[18098]: 2018-07-18 19:30:10 UTC | ERROR | (runner.go:277 in work) | Error running check spark: [{"message": "500 Server Error: Connection refused (Connection refused) for url: http://ip-10-0-0-17...back": "Traceback
Hint: Some lines were ellipsized, use -l to show in full.
@SiddChugh

Hey, I am getting the same error as well.
Did you find a way around it?
