
Coredump for OCRSearch feature extraction #359

Open · sauterl opened this issue Nov 22, 2022 · 10 comments · Labels: bug

sauterl (Collaborator) commented Nov 22, 2022

In case others run into this issue as well, I am documenting it here:

Using the OCRSearch feature in an extraction configured as follows:

{
	"input":{
		"path": "path/to/videos/",
		"depth": 1,
		"skip": 0,
		"id": {
			"name": "FileNameObjectIdGenerator",
			"properties": {}
		}
	},
	"extractors":[
		{"name": "OCRSearch"}
	],
	"metadata": [
		{"name": "TechnicalVideoMetadataExtractor"},
		{"name": "EXIFMetadataExtractor"}
	],
	"exporters":[
		{
			"name": "ShotThumbnailsExporter",
			"properties": {
				"destination":"thumbnails/"
			}
		}
	],
	"segmenter": {
		"name":"org.vitrivr.cineast.core.extraction.segmenter.video.V3CMSBSegmenter",
		"properties": {
			"folder": "path/to/msb"
		}
	},
	"database": {
		"writer": "JSON",
		"selector": "NONE",
		"host":"./ocr"
	},
	"pipeline":{
		"shotQueueSize": 20,
		"threadPoolSize": 20,
		"taskQueueSize": 20
	}
}

After roughly 7000 segments, a core dump stopped the extraction, which was executed with:

java -Xmx32G -Xms32G -jar cineast-runtime/build/libs/cineast-runtime- ../cineast.json extract -e ../extraction-most.json

This occurred on Ubuntu 20.04.5 LTS, using OpenJDK:

openjdk version "17.0.5" 2022-10-18
OpenJDK Runtime Environment (build 17.0.5+8-Ubuntu-2ubuntu120.04)
OpenJDK 64-Bit Server VM (build 17.0.5+8-Ubuntu-2ubuntu120.04, mixed mode, sharing)

with 40 cores of type Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz

GPU:
NVIDIA GeForce RTX 2080 TI, nvidia-driver 450.51.06 and CUDA 11.0.228

@sauterl sauterl changed the title [Bug] Coredump for OCRSearch feature extraction Coredump for OCRSearch feature extraction Nov 22, 2022
@sauterl sauterl added the bug label Nov 22, 2022
lucaro (Member) commented Nov 22, 2022

and what did the core dump say?

sauterl (Collaborator, Author) commented Nov 22, 2022

Unfortunately, there is no further information beyond a generic "core dumped" message. I checked the user's home and the working directory, and none of the usual error logs appeared. My assumption (since a proper Java core dump would produce a log file) is that some native library had an issue.
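For future runs, it may help to pin the JVM's crash artifacts to known locations, so that a native crash at least leaves an hs_err file and an OOM leaves a heap dump behind. These are standard HotSpot flags; the paths and the jar version are placeholders:

java -Xmx32G -Xms32G \
  -XX:ErrorFile=/tmp/cineast_hs_err_%p.log \
  -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp \
  -jar cineast-runtime/build/libs/cineast-runtime-<version>.jar \
  ../cineast.json extract -e ../extraction-most.json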

lucaro (Member) commented Nov 22, 2022

Yes, that's usually the only way to trigger a core dump. It should still dump something though. If it were killed by the OS before it was able to do so, you would get a different message.
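
A couple of quick checks can help tell these cases apart (coredumpctl requires systemd-coredump, which is not necessarily installed by default):

# Was the JVM terminated by the kernel OOM killer?
dmesg -T | grep -iE 'out of memory|killed process'

# Native core dumps captured by systemd-coredump, if it is in use:
coredumpctl list java

# Otherwise, core files are only written if the soft limit allows it:
ulimit -c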

silvanheller (Member) commented:

Probably the same issue as #273, where we did not have sufficient information to reproduce the bug. Thanks for the information!

x4e-jonas (Contributor) commented:

I'm observing a similar issue, so I'm not sure whether it is the same as the one described here or in #273.

It seems to me that the extraction process loads the objects into memory one by one, and as soon as their combined size exceeds the available memory, the process goes OOM.
This happens after some number of large images, or, in the case of videos (which tend to be several hundred MB each), after just a handful. Allocating more memory (up to 32G or so) does not solve the issue; it only lets the extraction run a bit longer. The only workaround so far is a wrapper script that shards the files into a sequence of separate extraction runs.
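
For reference, a minimal sketch of such a sharding wrapper. The directory names, batch size, config file names and jar version are placeholders, and it assumes that "input.path" in the extraction config points at the staging directory:

#!/usr/bin/env bash
# Hypothetical sharding wrapper: run the extraction over small batches of files
# so that a single run never has to hold too many large objects at once.
set -euo pipefail

SRC_DIR=path/to/videos     # full collection
BATCH_DIR=path/to/batch    # referenced by "input.path" in the extraction config
BATCH_SIZE=50

mapfile -t files < <(find "$SRC_DIR" -maxdepth 1 -type f | sort)

for ((i = 0; i < ${#files[@]}; i += BATCH_SIZE)); do
  rm -rf "$BATCH_DIR" && mkdir -p "$BATCH_DIR"
  for f in "${files[@]:i:BATCH_SIZE}"; do
    ln -s "$(readlink -f "$f")" "$BATCH_DIR/"
  done
  java -Xmx32G -jar cineast-runtime/build/libs/cineast-runtime-<version>.jar \
    ../cineast.json extract -e ../extraction-batch.json
done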

lucaro (Member) commented Jul 6, 2023

If this is indeed the problem, possible workarounds would include:

  1. making the extraction thread pool smaller in order to reduce the number of feature instances operating simultaneously
  2. reducing the video resolution in the decoder settings

Can you check whether one or both of these measures resolve your issue? A rough sketch of both changes follows below.
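
Sketch of both changes in the configuration. The threadPoolSize key is the one from the pipeline section shown above; the decoder block is written from memory, so the exact decoder name and property keys (maxFrameWidth/maxFrameHeight) should be double-checked against your own cineast.json:

	"pipeline": {
		"shotQueueSize": 20,
		"threadPoolSize": 4,
		"taskQueueSize": 20
	},
	"decoders": {
		"VIDEO": {
			"decoder": "FFMpegVideoDecoder",
			"properties": {
				"maxFrameWidth": 640,
				"maxFrameHeight": 480
			}
		}
	}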

x4e-jonas (Contributor) commented:

Option 1 results in:

DEBUG o.v.c.s.r.GenericExtractionItemHandler - ExtractionPipeline is full - deferring emission of segment. Consider increasing the thread-pool count for the extraction pipeline.

lucaro (Member) commented Jul 10, 2023

This is expected behavior in this case, and it is what you would want to happen. It only tells you that the feature extraction is slower than the decoder; if you had more compute resources, you could increase throughput. If you are compute-limited (or, in this case, memory-limited), you want the pipeline to slow down rather than try to consume more resources than are available. Were you able to run your extraction without anything crashing?

x4e-jonas (Contributor) commented:

The video resolution is already limited to 640x480. The affected instance has 32 CPU cores, 64G of RAM, 4352 GPU cores, and 11G of GPU memory. What parameters would you recommend in order to run at full capacity?

lucaro (Member) commented Jul 11, 2023

Whatever works 😉
The question cannot be answered as posed since the answer also depends on the content to be processed. Is 640x480 enough for all the text to be readable? If yes, can you go lower and still have it readable? If not, how much larger do you need to go? How long is the mean shot of your content? Does it have a lot of short sequences, or is it more of one continuous recording?
The limiting factor you have here is the longest shot you need to keep in memory, so the total maximum memory consumption is proportional to the duration of the longest shot times the video resolution. As long as you can keep this below your hardware limit, you should not experience any problems.
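
As a rough back-of-the-envelope example (assuming 25 fps and 3 bytes per pixel for raw RGB frames; the actual per-frame overhead may be higher, e.g. 4 bytes per pixel for ARGB):

640 x 480 x 3 bytes ≈ 0.9 MB per decoded frame
60 s shot x 25 fps ≈ 1500 frames
1500 x 0.9 MB ≈ 1.4 GB for that single shot

With shotQueueSize and threadPoolSize both at 20 (as in the config above), several such shots can be in flight at once, so a handful of long shots is enough to exhaust even a 32G heap.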
