
Rotation may need an overhaul #322

Open
ferdnyc opened this issue Sep 8, 2019 · 4 comments

Comments

@ferdnyc (Contributor) commented Sep 8, 2019

Given the issues reported in OpenShot/openshot-qt#2969 (and others), and based on my own tests, it seems like the current rotation support in OpenShot is... insufficient.

The problem arises when a video is rotated in a way that changes its dimensions (which is effectively any rotation other than 0° or 180°). Because libopenshot still processes the video using its original width and height, things don't go well.

This can be seen by creating a vertical video profile, then importing a video in true vertical aspect vs. a video with standard wide aspect and an initial rotation. The true-upright video loads in just fine and formats correctly to the output frame. The rotated widescreen video loads shrunk down (with "Best Fit" scaling) in the center of the output frame, at only about half its full size. This seems to be because libopenshot is scaling it as if it needs to fit a 1920px wide image inside a 1080px wide frame.

My guess is that what's missing, for starters, is a post-rotation coordinate mapping of the QTransform:

QTransform transform;

// Transform source image (if needed)
ZmqLogger::Instance()->AppendDebugMethod(
    "Timeline::add_layer (Build QTransform - if needed)",
    "source_frame->number", source_frame->number,
    "x", x, "y", y, "r", r, "sx", sx, "sy", sy);

if (!isEqual(r, 0)) {
    // ROTATE CLIP (around its center point)
    float origin_x = x + (scaled_source_width / 2.0);
    float origin_y = y + (scaled_source_height / 2.0);
    transform.translate(origin_x, origin_y);
    transform.rotate(r);
    transform.translate(-origin_x, -origin_y);
    transformed = true;
}

QTransform has all sorts of map() and mapRect() functions to translate spatial data into the transformation's coordinate system. mapRect() is even documented to return the bounding box for rotating or shearing transforms, which sounds like exactly what we'd want. (A full-HD video rotated to 45° no longer fits in 1920px × 1080px, after all — its bounding box comes out to roughly 2121px × 2121px.)

And for initial rotation, as in vertical video (most of which is stored at horizontal dimensions with a 90° initial rotation), ideally we'd want libopenshot to consider that video's width to be 1080 and its height to be 1920. That may not be possible with the current rotation logic, which simply applies the built-in rotation as a default initial value of OpenShot's Rotation property. That doesn't seem good enough, since we also need to translate the dimensions.

@alansharkey

So, is it going to be fixed?

Alan

@jeffski (Contributor) commented Feb 19, 2021

+1 for this. I've noticed this issue as well using phone footage. I've been fixing the source video with FFmpeg for the time being, but something in libopenshot to handle this would be good. It would be interesting to know how efficient it would be, however. With ffmpeg I need to re-encode to fix the issue, something like this:

ffmpeg -i rotated.mp4 -c copy -metadata:s:v:0 rotate=0 temp.mp4
ffmpeg -i temp.mp4 -vf "transpose=clock" -crf 23 -c:a copy fixed.mp4

First line resets the rotation but as mentioned the width/height is still the wrong way round. Second line rotates the video correctly with the correct width/height.

@ferdnyc (Contributor, Author) commented Mar 16, 2021

@jeffski

Yeah, the efficiency question is a good one, I think it all comes down to when and how the rotation is done.

The argument for not "pre-applying" rotation to input Frames (effectively doing the internal equivalent of your re-encoding) is that, in the course of processing frame images, a Writer is eventually going to use QPainter::drawImage() to dump the Frame's pixel contents into an output image anyway. That operation is a heavy one regardless of what rotation (or any other transform) is being applied, so the transforms don't really add much cost. (Shunting all those pixels from one part of memory to another is expensive enough that tweaking where they land in the output buffer is relatively cheap.) Plus, we may need to rotate the video anyway to apply keyframe rotation. So it makes sense to delay rotation as long as possible, although that introduces a lot of potential complexity.

Regardless when the rotation is applied, combining intrinsic rotation and user rotation (as we're currently doing) isn't really a good idea. When an input video contains intrinsic rotation, we want to also rotate its frame dimensions so that we can apply proper scaling. When the user rotates a frame with keyframe rotation, we want that to happen after scaling without having any effect on the size. (Otherwise, a rotating video in Best Fit mode would change size based on how its bounding box's dimensions changed from frame to frame.)

So, it really needs to go:

  1. Compute frame dimensions
  2. Check for intrinsic rotation
  3. Apply rotation and recompute dimensions
  4. Scale based on rotated dimensions
  5. Apply keyframe rotation

Delaying rotation until output

As I said, to be most efficient that entire list would be done all at once, while painting each frame into its output image. It might even be possible to compute steps 1-3 a single time in FFmpegReader, and have it create a QTransform that represents the intrinsic rotation. That could then be attached to each Reader and/or Frame object. Then, instead of constructing a brand new QTransform so we can apply things like keyframe scaling, translation, and rotation, we could instead use the initial pre-transformation as the starting point. Automatic scaling would take the raw image size, apply the pre-transform to it, then take the resulting dimensions and scale them by applying additional transformations. Any keyframe modifications would then become additional transformations, and finally the QTransform representing all of the combined alterations gets applied to the QPainter.

But if we do that...

  1. Code for things like Frame::GetWidth() and Frame::GetHeight(), as well as the code that outputs frame images via Frame::GetImage(), Frame::GetPixels(), Frame::Thumbnail() and Frame::Save(), would apply the initial transform when generating their results.

    So, we might have a thing where a rotated video has a reader.info.width == 1920, reader.info.height == 1080, reader.info.rotation == 270, and that causes frames to be generated with Frame::GetWidth() == 1080, Frame::GetHeight() == 1920, and each Frame contains a QImage of size 1920x1080, plus an initial QTransform that's had a 270° pre-rotation applied to it.

  2. I think we'd probably end up having to differentiate routines that need to access the raw image data as-is, vs. ones that would get the image with the transform already applied (so it matches the dimensions provided by Frame::GetWidth() and Frame::GetHeight()). Maybe means adding a Frame::GetTransformedImage(), or making GetImage() apply the transform by default and having a Frame::GetUntransformedImage() instead... dunno, have to think on that some more. (Ditto for Frame::Get{Transformed/Untransformed}Pixels.)

  3. Then there's the question of the various Frame::AddImage() methods, which should either (a) take an optional rotation parameter, or (b) have forms that clear out any previous rotation, and other forms that retain existing transform parameters (used when storing the results of some operation performed on the untransformed image data).

  4. Effects would have to be sorted into two piles, to determine which methods to use:

    • Ones that have to apply intrinsic rotation prior to operating on the frame, and should clear the transform when storing the results
      (Most of them — basically anything that alters the image spatially: Crop, Bars, Shift / ColorShift, Deinterlace, Mask, Pixelate since the pixels can be rectangular, Caption, all of the OpenCV effects, etc.)
    • Ones that really don't care, but will want any existing transform to be preserved when they operate directly on the raw untransformed image.
      (A few whole-image effects like Brightness, Saturation, Hue, Negate... and surprisingly enough ChromaKey, which does exactly the same thing regardless of orientation)

    The first group would be more expensive when operating on rotated video, but it would have no impact at all on the second group.

...Plus any other issues I haven't thought of. There could definitely be more than what's here.

Even if this is a complete list, though... given all the complexity involved, maybe we're better off just taking the hit on efficiency by building each Frame with the intrinsic rotation already applied.

@jeffski (Contributor) commented Mar 16, 2021

Thorough explanation. Doesn't sound like a simple task.
