Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Temporal POC #3778

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Temporal POC #3778

wants to merge 4 commits into from

Conversation

y242yang
Copy link
Contributor

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

This PR adds temporal support and runs temporal workers on top of yarn containers. Because it does not affect any existing behaviors, there is no gobblin JIRA addressing this.

Description

We start temporal workers and assign 1 worker per yarn container.

Tests

E2E testing is done with customized temporal workflow and temporal workers.

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Copy link
Contributor

@phet phet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work, ann... but very large review here.
hence this review is only part one: I'll pause now and return later to finish

@@ -234,5 +234,7 @@ public class GobblinClusterConfigurationKeys {

public static final String HELIX_JOB_SCHEDULING_THROTTLE_TIMEOUT_SECONDS_KEY = "helix.job.scheduling.throttle.timeout.seconds";
public static final long DEFAULT_HELIX_JOB_SCHEDULING_THROTTLE_TIMEOUT_SECONDS_KEY = Duration.ofMinutes(40).getSeconds();;

public static final String TEMPORAL_WORKER_SIZE = "temporal.worker.size";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's choose an unambiguous name. is this num workers or size of each one? if the latter, what units are we discussing (e.g. mb of memory)?

Comment on lines +82 to +83
* This class runs the {@link GobblinHelixJobScheduler} for scheduling
* and running Gobblin jobs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're not changing anything here, leave this file off the commit.

non-substantive, formatting-only changes are OK, but belong in a separate commit

@@ -28,7 +28,7 @@
/**
* Metrics that relates to jobs launched by {@link GobblinHelixJobLauncher}.
*/
class GobblinHelixJobLauncherMetrics extends StandardMetricsBridge.StandardMetrics {
public class GobblinHelixJobLauncherMetrics extends StandardMetricsBridge.StandardMetrics {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the need to call from the temporal sub-package you've created... but that sidesteps the potential confusion of using "helix-named" classes, when we're not actually involved w/ helix at runtime.

a good compromise for this PR, might be to add a TODO to all three--this, the job scheduler metrics, and the GobblinHelixJobLauncherListener--about renaming to be helix/temporal agnostic... (or if merited to instead create a separate temporal-specific variant)

Copy link
Contributor

@phet phet Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this and a bunch of other files represent cumulatively a specific worker w/ its particular workflows and activities (plus supporting abstractions, like Workload and WFAddr)--let's put them into their own package, separate from the worker-agnostic scaffolding for running arbitrary workers.

I know we presently hard-code this one, but our next follow-on PR will be to load whatever configured specific worker using reflection. separating today's "example worker" from the reusable scaffolding, anticipates that future.

Comment on lines +11 to +12
public class NestingExecWorkflowImpl
extends AbstractNestingExecWorkflowImpl<IllustrationTask, String> {
Copy link
Contributor

@phet phet Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming-wise, this is really just one among a family of NestingExecWorkflow impls, the one for IllustrationTasks.

the class name should reflect that

import org.apache.gobblin.util.SerializationUtils;

/**
* An implementation of {@link JobLauncher} that launches a Gobblin job using the Temporal task framework.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it sounds here like temporal is the fundamental proposition, but I don't immediately see where it comes in within this class def. is the class actually more general? please update the javadoc to guide on where that stands

@Slf4j
public class GobblinTemporalJobLauncher extends GobblinJobLauncher {

private static final Logger LOGGER = LoggerFactory.getLogger(GobblinTemporalJobLauncher.class);
Copy link
Contributor

@phet phet Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, isn't this redundant, given @Slf4j?

Comment on lines +78 to +81
final GobblinHelixJobSchedulerMetrics jobSchedulerMetrics;
final GobblinHelixJobLauncherMetrics launcherMetrics;
final GobblinHelixPlanningJobLauncherMetrics planningJobLauncherMetrics;
final HelixJobsMapping jobsMapping;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noting again, what I previously mentioned: there's a confusing amount of reference to helix, given the framework isn't actually involved in execution

Comment on lines +84 to +87
public GobblinTemporalJobScheduler(Config sysConfig,
EventBus eventBus,
Path appWorkDir, List<? extends Tag<?>> metadataTags,
SchedulerService schedulerService) throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line-continuation indentation should be four spaces. see the GobblinHelixJobLauncher ctor as an example

import org.apache.gobblin.yarn.event.DelegationTokenUpdatedEvent;


public class YarnTemporalAppMasterSecurityManager extends YarnContainerSecurityManager{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs javadoc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants