An alternative to "Extract Every N"

The Extraction process failing on you, and you aren't getting an error back with clear instructions? Tell us about it here


Fed
Posts: 16
Joined: Sat Jun 18, 2022 3:31 pm
Been thanked: 1 time

An alternative to "Extract Every N"

Post by Fed »

I've wished for this option several times now, so I'm going to ask whether it could be added in the future, if it's not too inconvenient and doesn't conflict with some other design decision.

I would love to have an "Extract M amount" option when extracting faces.
Let's say I have a 30-minute interview with a person and I want to get 1k faces out of it. Let's say I know the person I need takes up 50%-70% of the screen time.
To avoid extracting the whole 30-minute video, I need to check the framerate, calculate the number of frames in the video and divide it by the rough number of frames I need.
It's not super hard math, but why not let the machine do it?
Why can't I just say "I need 2000 frames extracted"? The machine can work out which frames those are.
And then, after I've sorted and deleted faces and have 1200 left, why can't I say "I need to extract 1000 faces from the alignments" and again let the machine calculate that I need every 1.2th face (well, give or take a face or two)?

And a somewhat related question - why can't I use a float as N in "Extract Every N"?
I mean, I can enter a float for the extraction in the tools->alignment tab.
But it doesn't seem to work as I would expect. I think it even broke something (or I have some unrelated bug, or maybe I just need some sleep, but I don't think so). Now I get the same number of faces extracted when I use 1 and 2 for N in "Extract Every N". 3 works as it should, though.

Would be convenient if I could just extract every 1.7th face out of the alignments.
I have 10 videos and I want 10k faces out of them since it seems like the easiest way to maximize variety.
One of the videos ends up with 1700 faces in the alignments (and let's say these faces are different enough, because it was a long video and I already extracted every 20th frame to get here). I see two sub-optimal choices here. First: extract every 2nd frame and get fewer than 1000 faces from this video. Second: extract all of them and delete 700 faces by hand. Well, there's a third option too: keep the 700 extra faces, which brings me over 10k faces (and I remember 1k to 10k being the recommended amount somewhere in the guides).

torzdf
Posts: 2649
Joined: Fri Jul 12, 2019 12:53 am
Answers: 159
Has thanked: 128 times
Been thanked: 623 times

Re: An alternative to "Extract Every N"

Post by torzdf »

Fed wrote: Sat Apr 29, 2023 9:06 pm

I would love to have an "Extract M amount" option when extracting faces.

The main reason that this isn't in FS (and I agree it may be useful) is that when you want to extract (for example) 1000 items from a video, you really want to be extracting 1000 faces, not extracting from 1000 frames. If you extract from 1000 frames, there is no guarantee that there are faces in those frames, so you will most likely end up with fewer than 1000 faces.

At extract time, the process does not know which frames have faces, and of those frames with faces, which of those faces are of the individual you are interested in.

If using the alignment tool to re-extract, then you could specify the exact number of faces, and they could be selected evenly distributed (because at this point we know which frames have faces and can assume the alignments file has been cleaned). I will try to remember to look at the possibility of adding this to the alignments tool.
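For illustration only, the "evenly distributed" selection described above could look roughly like the sketch below. This is not Faceswap code: `pick_evenly` is a hypothetical helper, and it assumes the cleaned alignments can be treated as a simple ordered list of faces.

```python
def pick_evenly(items, count):
    """Return `count` items spread as evenly as possible across `items`.

    Hypothetical helper: `items` stands in for an ordered list of faces
    taken from a cleaned alignments file.
    """
    items = list(items)
    total = len(items)
    if count <= 0:
        return []
    if count >= total:
        # Asking for more than we have: return everything.
        return items
    # Integer arithmetic avoids floating-point rounding. Because
    # total / count >= 1, each computed index is strictly larger
    # than the previous one, so no item is picked twice.
    return [items[(i * total) // count] for i in range(count)]


if __name__ == "__main__":
    faces = list(range(1700))          # stand-in for 1700 aligned faces
    chosen = pick_evenly(faces, 1000)  # keep exactly 1000 of them
    print(len(chosen), chosen[:5])
```

With 1700 faces and a target of 1000 this keeps exactly 1000 entries, which is the case where a whole-number "every N" has to choose between too few and too many.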

And a somewhat related question - why can't I use a float as N in "Extract Every N"?

Because it wouldn't make sense? You can't extract from a partial frame. Frames are whole numbers. You either extract from a frame or you don't. EEN=5 would skip 4 frames and extract from the 5th. You can't extract every 2.5 frames (for example)

My word is final

Fed
Posts: 16
Joined: Sat Jun 18, 2022 3:31 pm
Been thanked: 1 time

Re: An alternative to "Extract Every N"

Post by Fed »

torzdf wrote: Mon May 01, 2023 12:22 pm

At extract time, the process does not know which frames have faces, and of those frames with faces, which of those faces are of the individual you are interested in.

It's the same level of uncertainty as with extracting every Nth frame, but it's easier to work out in your head.

The way I see it, both "extract every Nth frame" and "extract M frames" serve the same purpose: to get enough frames with faces while avoiding extracting far too many frames.
So to pick either N or M you need a rough idea of how many frames are probably enough and how many are "way too many".
And once you've made that guess, you need to do some math with the video length and framerate to decide what N should be in "extract every Nth frame".
Otherwise, how do you decide what N should be? How do you choose between 2, 7 and 10? Or 50? What if 2 is already too much?

Let me rephrase this.
FrameRate * VideoLength / N = ExtractedFrames.
FrameRate and VideoLength are defined by the video.
And to decide what N should be you need to decide what result you want. You need to know how many ExtractedFrames you need. Roughly. Like "2000 frames will have enough faces and limit the amount of unnecessary work". And then you can calculate N.
And by "enough" I mean a sufficient amount of frames to ensure that there are at least as many faces in the extracted frames as you need.

And if you do have some rough understanding of how many ExtractedFrames you need, why not just use this number and let the machine do the math?
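To make the arithmetic concrete, here is a minimal sketch of the math "the machine" would be doing. The function name and the 25 fps figure are my own assumptions for the example; neither comes from Faceswap.

```python
def every_n_for_target(frame_rate, duration_seconds, target_frames):
    """Return (N, approximate extracted frame count) for a target.

    Rearranges FrameRate * VideoLength / N = ExtractedFrames to solve
    for N. N is rounded down so the result never falls below the target
    (unless the video is shorter than the target to begin with).
    """
    if target_frames <= 0:
        raise ValueError("target_frames must be positive")
    total_frames = round(frame_rate * duration_seconds)
    n = max(1, total_frames // target_frames)
    return n, total_frames // n


if __name__ == "__main__":
    # Assumed example: a 30-minute video at 25 fps, roughly 2000 frames wanted.
    n, extracted = every_n_for_target(25, 30 * 60, 2000)
    print(f"Extract every {n} frames -> about {extracted} frames")
```

For that assumed 25 fps video, 45000 total frames and a target of 2000 gives N = 22 and about 2045 extracted frames.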

torzdf wrote: Mon May 01, 2023 12:22 pm

Because it wouldn't make sense? You can't extract from a partial frame. Frames are whole numbers. You either extract from a frame or you don't. EEN=5 would skip 4 frames and extract from the 5th. You can't extract every 2.5 frames (for example)

Of course you can! - on average.
For example, I have 2500 faces and I only want 1000.
So I need to extract every 2.5th frame. I don't need it to be literally every 2.5th frame because, yeah, of course you can't do that.
What I mean is: keep a variable, increase it by 1 for every frame and reduce it by N for every extracted frame. Whenever it's higher than 0, extract the frame. With a whole-number N, that is effectively extracting every Nth frame.
But if you accept this variation of the algorithm, you can use 2.5 (or any other float) there. That's what I mean by using a float as N in "Extract Every N".
It makes sense from the standpoint of "I have 2500 faces, but I want 1000 faces that are taken from those 2500 as evenly as possible".
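A rough sketch of that counter idea, just to show it behaves sensibly. The function name is made up for this example and none of it is taken from Faceswap.

```python
def select_with_fractional_step(total, step):
    """Return the indices picked when keeping "every `step`-th" item.

    `step` may be a float >= 1. A running counter gains 1 per item and
    loses `step` per selected item; an item is selected whenever the
    counter is above zero.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    selected = []
    counter = 0.0
    for index in range(total):
        counter += 1
        if counter > 0:
            selected.append(index)
            counter -= step
    return selected


if __name__ == "__main__":
    picked = select_with_fractional_step(2500, 2.5)
    print(len(picked), picked[:6])
```

With 2500 items and a step of 2.5 this keeps 1000 of them, alternating gaps of 2 and 3. Note that it starts with the very first item, and that for steps like 1.7 the float rounding may shift the total by a face or so.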

Last edited by Fed on Mon May 01, 2023 10:40 pm, edited 1 time in total.