[BUG] JobRunrException: MigrateFromV6toV7Task #1036

Closed
uben01 opened this issue May 10, 2024 · 8 comments

Comments

@uben01

uben01 commented May 10, 2024

JobRunr Version

7.1.1

JDK Version

OpenJDK 17.0.11

Your SQL / NoSQL database

Postgres 14

What happened?

After a successful staging deploy, we got an exception in production. We don't know which parts of our code are relevant. At first everything seemed normal after the exception, but the production deployment was in fact broken: no jobs could be started until the related jobs were deleted from the db.

How to reproduce?

Relevant log output

org.jobrunr.JobRunrException: JobRunr encountered a problematic exception. Please create a bug report (if possible, provide the code to reproduce this and the stacktrace)
	at org.jobrunr.JobRunrException.shouldNotHappenException(JobRunrException.java:43)
	at org.jobrunr.utils.mapper.jackson.JacksonJsonMapper.deserialize(JacksonJsonMapper.java:90)
	at org.jobrunr.jobs.mappers.JobMapper.deserializeJob(JobMapper.java:20)
	at org.jobrunr.server.tasks.startup.MigrateFromV6toV7Task$V6BatchJobMapper.deserializeJob(MigrateFromV6toV7Task.java:111)
	at org.jobrunr.storage.sql.common.JobTable.toJob(JobTable.java:366)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
	at org.jobrunr.storage.sql.common.db.SqlSpliterator.tryAdvance(SqlSpliterator.java:48)
	at java.base/java.util.Spliterator.forEachRemaining(Spliterator.java:332)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
	at org.jobrunr.storage.sql.common.JobTable.selectJobList(JobTable.java:232)
	at org.jobrunr.storage.sql.common.DefaultSqlStorageProvider.getJobList(DefaultSqlStorageProvider.java:337)
	at org.jobrunr.storage.StorageProvider.getJobs(StorageProvider.java:230)
	at org.jobrunr.storage.ThreadSafeStorageProvider.getJobs(ThreadSafeStorageProvider.java:206)
	at org.jobrunr.server.tasks.startup.MigrateFromV6toV7Task.migrateBatchJobs(MigrateFromV6toV7Task.java:57)
	at org.jobrunr.server.tasks.startup.MigrateFromV6toV7Task.run(MigrateFromV6toV7Task.java:44)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field "currentValue" (class org.jobrunr.jobs.context.JobDashboardProgressBar$JobDashboardProgress), not marked as ignorable (4 known properties: "succeededAmount", "totalAmount", "failedAmount", "progress"])
 at [Source: (String)"{"@class":"org.jobrunr.jobs.BatchJob","version":6,"jobDetails":{"className":"com.emarsys.estm.service.jobrunr.JobSchedulerWrapper","staticFieldName":null,"methodName":"enqueueLaunchProcessingJobs","jobParameters":[{"className":"java.util.UUID","actualClassName":"java.util.UUID","object":"172b6ca9-643e-4b9e-9046-893e779d280e"},{"className":"[Ljava.lang.Integer;","actualClassName":"[Ljava.lang.Integer;","object":[0,2,8,10,12,14,16,18,20,22,-1]}],"cacheable":false},"jobSignature":"com.emarsys.estm."[truncated 1939 chars]; line: 1, column: 2319] (through reference chain: org.jobrunr.jobs.BatchJob["metadata"]->java.util.concurrent.ConcurrentHashMap["jobRunrDashboardProgressBar-3"]->org.jobrunr.jobs.context.JobDashboardProgressBar$JobDashboardProgress["currentValue"])
	at com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException.from(UnrecognizedPropertyException.java:61)
	at com.fasterxml.jackson.databind.DeserializationContext.handleUnknownProperty(DeserializationContext.java:1138)
	at com.fasterxml.jackson.databind.deser.std.StdDeserializer.handleUnknownProperty(StdDeserializer.java:2224)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownProperty(BeanDeserializerBase.java:1719)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownVanilla(BeanDeserializerBase.java:1697)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:320)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:215)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
	at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:170)
	at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:136)
	at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromAny(AsPropertyTypeDeserializer.java:240)
	at com.fasterxml.jackson.databind.deser.std.UntypedObjectDeserializerNR.deserializeWithType(UntypedObjectDeserializerNR.java:115)
	at com.fasterxml.jackson.databind.deser.std.MapDeserializer._readAndBindStringKeyMap(MapDeserializer.java:625)
	at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:449)
	at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:32)
	at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:170)
	at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:136)
	at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserializeWithType(MapDeserializer.java:492)
	at com.fasterxml.jackson.databind.deser.impl.FieldProperty.deserializeAndSet(FieldProperty.java:147)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:314)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:215)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
	at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:170)
	at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:136)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeWithType(BeanDeserializerBase.java:1306)
	at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:74)
	at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:323)
	at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4825)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3772)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3740)
	at org.jobrunr.utils.mapper.jackson.JacksonJsonMapper.deserialize(JacksonJsonMapper.java:86)
	... 22 common frames omitted
@uben01
Author

uben01 commented May 13, 2024

We'll try to reproduce it this week.

At first we thought it caused no problems, but later we found that no job could be started after the exception occurred. We deleted the old batch jobs from the db, and that solved the problem.

@auloin
Contributor

auloin commented May 13, 2024

Hi @uben01, is it possible to share the JSON of the BatchJob causing the issue with us?
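
One way to find such a job before deleting it might be to scan the stored job JSON for progress-bar fields outside the four properties named in the exception ("succeededAmount", "totalAmount", "failedAmount", "progress"). The following is a minimal sketch, assuming the job JSON has already been read from the database into strings. The class and method names are hypothetical and not part of JobRunr, the sample JSON is hand-written to follow the reference chain in the stack trace rather than copied from a real job, and the regex is a rough heuristic, not a JSON parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JobRunr.
public class ProgressBarFieldScan {
    // The four properties Jackson reported as known for JobDashboardProgress.
    static final Set<String> KNOWN =
            Set.of("succeededAmount", "totalAmount", "failedAmount", "progress");

    // A flat (non-nested) JSON object that mentions the JobDashboardProgress type.
    static final Pattern PROGRESS_OBJECT = Pattern.compile(
            "\\{[^{}]*JobDashboardProgressBar\\$JobDashboardProgress[^{}]*\\}");

    // A JSON field name: a quoted string directly followed by a colon.
    static final Pattern FIELD_NAME = Pattern.compile("\"([^\"]+)\"\\s*:");

    // Returns the field names found in progress-bar objects that are not in KNOWN.
    static List<String> unknownFields(String jobJson) {
        List<String> unknown = new ArrayList<>();
        Matcher objects = PROGRESS_OBJECT.matcher(jobJson);
        while (objects.find()) {
            Matcher fields = FIELD_NAME.matcher(objects.group());
            while (fields.find()) {
                String name = fields.group(1);
                // "@class" is the type id property, not a data field.
                if (!name.equals("@class") && !KNOWN.contains(name)) {
                    unknown.add(name);
                }
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        // Hand-written sample, assumed shape only.
        String sample = "{\"metadata\":{\"jobRunrDashboardProgressBar-3\":{"
                + "\"@class\":\"org.jobrunr.jobs.context.JobDashboardProgressBar$JobDashboardProgress\","
                + "\"totalAmount\":10,\"currentValue\":4}}}";
        System.out.println(unknownFields(sample)); // prints [currentValue]
    }
}
```

A non-empty result marks a row whose JSON is worth saving and attaching to the issue before it is removed.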

@uben01
Author

uben01 commented May 13, 2024

Sadly, we only deleted the related jobs from the db and did not keep a copy, but I'll try to reproduce the issue later.

@uben01
Author

uben01 commented May 13, 2024

We were not able to reproduce it, but we also found a different stacktrace. It might shed some light on what happened.

Some additional context: we have multiple pods running the same application, so the migrations may have run in the wrong order.

org.jobrunr.storage.StorageException: org.postgresql.util.PSQLException: ERROR: operator does not exist: uuid = character varying
  Hint: No operator matches the given name and argument types. You might need to add explicit type casts.
  Position: 51
	at org.jobrunr.storage.sql.common.DefaultSqlStorageProvider.announceBackgroundJobServer(DefaultSqlStorageProvider.java:87)
	at org.jobrunr.storage.ThreadSafeStorageProvider.announceBackgroundJobServer(ThreadSafeStorageProvider.java:60)
	at org.jobrunr.server.ServerZooKeeper.announceBackgroundJobServer(ServerZooKeeper.java:78)
	at org.jobrunr.server.ServerZooKeeper.run(ServerZooKeeper.java:53)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: org.postgresql.util.PSQLException: ERROR: operator does not exist: uuid = character varying
  Hint: No operator matches the given name and argument types. You might need to add explicit type casts.
  Position: 51
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2713)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2401)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:368)
	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:498)
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:415)
	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:190)
	at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:152)
	at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeUpdate(ProxyPreparedStatement.java:61)
	at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeUpdate(HikariProxyPreparedStatement.java)
	at org.jobrunr.storage.sql.common.db.Sql.delete(Sql.java:144)
	at org.jobrunr.storage.sql.common.BackgroundJobServerTable.announce(BackgroundJobServerTable.java:48)
	at org.jobrunr.storage.sql.common.DefaultSqlStorageProvider.announceBackgroundJobServer(DefaultSqlStorageProvider.java:84)
	... 9 common frames omitted

@auloin
Contributor

auloin commented May 13, 2024

@uben01 I don't think this is related to the previous issue. The latter issue is probably caused by the fact that not all your pods are on v7 yet (as pointed out by other users):

"I think the issue was caused because the migration is not backwards compatible, that caused errors because we had pods of the same service in different versions of JobRunr (they are updated one by one)"

Regarding the initial issue, could you tell us which version of JobRunr you were running in production?

@uben01
Author

uben01 commented May 13, 2024

We were upgrading from v6.3.5 to v7.1.1
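
As an illustration of the mixed-version explanation given earlier in the thread, a rollout check could compare the JobRunr major version reported by each pod and flag a mismatch. This is only a sketch: how the per-pod versions are collected is left open, and the class and method names are hypothetical, not a JobRunr API.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of JobRunr.
public class VersionSkewCheck {
    // Extracts the major version, e.g. "7.1.1" or "v7.1.1" -> 7.
    static int major(String version) {
        String v = version.startsWith("v") ? version.substring(1) : version;
        int dot = v.indexOf('.');
        return Integer.parseInt(dot < 0 ? v : v.substring(0, dot));
    }

    // Collects the distinct major versions across all pods.
    static Set<Integer> distinctMajors(List<String> podVersions) {
        return podVersions.stream()
                .map(VersionSkewCheck::major)
                .collect(Collectors.toSet());
    }

    // True when pods report more than one major version.
    static boolean hasMajorSkew(List<String> podVersions) {
        return distinctMajors(podVersions).size() > 1;
    }

    public static void main(String[] args) {
        System.out.println(hasMajorSkew(List.of("v6.3.5", "v7.1.1"))); // prints true
        System.out.println(hasMajorSkew(List.of("v7.1.1", "v7.1.1"))); // prints false
    }
}
```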

@rdehuyss
Contributor

Do you have a backup of your db before the upgrade? That way, it should be reproducible?

@rdehuyss
Contributor

We just double-checked and, to be honest, we can't see what is happening. We have an integration test for this exact scenario, and it is green.

If you encounter it again, feel free to reopen, but in that case we do need a copy of the JSON of the BatchJob. Then we should be able to reproduce the issue really easily.
