| Printable Version of Topic |
| Unofficial VirtualDub Support Forums > VirtualDub Filters and Filter Development > Unexpected Behavior Using Prefetch2 |
| Posted by: jpsdr Nov 28 2011, 09:23 AM |
| Hello. I've very recently updated some of my filters to replace the LAG parameter, using prefetch2 and fa->mpSourceFrames[]->mpPixmap / fa->mpOutputFrames[]->mpPixmap instead -- mainly my IVTC filter. A long time ago I created a filter which removes frames, also using prefetch2; that one worked perfectly. Now, the problem I have: a) If I run my new IVTC filter, save the result to a file, and afterwards run my remove frame filter on that file, everything works fine and as expected. b) If I run, in the same process, my new IVTC filter followed by the remove frame filter, the result is not the expected one (not the same as in a), and it can even crash! c) My old IVTC filter followed by the remove frame filter in the same process works perfectly fine. I've put the sources and filters here: http://dl.free.fr/b5pAivE8o It's the IVTC v5.0.0 followed by the Remove Frame v1.2.2 (the code for that one is very simple...). If you can take a look, maybe I'm not using the cached frames properly, or something else. If you want me to provide a sample file which crashes in case b but not in case a, I can -- tell me (around 500 MB). Edit: So either I'm somehow triggering a hidden problem in VDub, or I'm doing something "not allowed" in the way I'm using prefetch2, but I don't know what. |
| Posted by: jpsdr Nov 30 2011, 02:28 PM |
| I've investigated more. My remove frame filter contains the following:
I've put the following in my IVTC code:
Result, with only my IVTC filter:
Result with the remove frame filter after the IVTC filter:
The information reported by pxsrc.mFrameNumber is incorrect in the filter when the remove frame filter comes after it. The IVTC filter believes it's on frame number 12 when in reality it's frame number 10... I need this information to be accurate in my filter. Is this behavior "normal"? In that case I'd have to switch back to what I did before -- manually increment a counter each time I enter the run procedure, to be sure of having the correct frame number -- or is it a bug? Or is there another structure field which would give me the right information? Edit: Using a counter didn't work, the behavior was the same. |
| Posted by: jpsdr Dec 1 2011, 08:41 AM |
| The last test I made, using an internal counter instead of pxsrc.mFrameNumber, was to compare the result between: a) Using my IVTC filter alone, saving the result to a file, then afterwards processing this file with my remove frame filter. b) Using IVTC and remove frame in the same process. To test, I used a YV12 telecined "RGB cube", so the number appears in each frame. For input frames 0 to 20, the result is (X/Y means a frame reconstructed from frames X & Y): a) Good result, frames in the final result file are: 0 1 3 4 5 6 8 9 10 11 13 14 15 16/17 18 19 20. b) Bad result, frames in the final result file are: 0 1 3 5 5 6 7 10 11 11 12 14 16 17 17 20 21. I'm totally lost, I've no idea what's happening. I need help on this... |
| Posted by: jpsdr Dec 4 2011, 08:12 AM |
| All of this is with version 1.10.1-test16 |
| Posted by: phaeron Dec 11 2011, 08:59 PM |
| I think I see what's going on. What's happening is that the remove frame filter is only requesting some of the frames that the IVTC filter would produce. This results in gaps in the frame list that the IVTC filter is being asked to produce. Your filter needs to deal with this -- it cannot assume that frames are requested in order or expect an output frame number that increments without gaps. What your IVTC filter is seeing is the same as what you would get if you deleted the frames manually on the timeline in the same way that the Remove Frame filter does. |
| Posted by: jpsdr Dec 12 2011, 09:06 AM |
| ... I don't understand very well. My old IVTC filter works perfectly fine with remove frame; both the old and the new have the same algorithm and do things the same way. The only difference between them is that the old one uses an internal buffer and the LAG function, while the new one uses the new mechanism you implemented, with prefetch2 and fa->mpSourceFrames[]->mpPixmap. Only the source of the input changes... ............................................... hmmmmmmmmmm................ That means that when I'm not using prefetch2, the input is all the frames in the correct order, but when I'm using prefetch2, I ask for specific frames using the frame parameter, and this is what produces the problems, because the frame value may not be the previously asked value +1. And even if remove frame is after IVTC, it has an effect before it? ...... I think I have a little trouble really understanding this, and I'll stay with my old method; using the new one presents too many risks and troubles and a high risk of incompatibility with what I'm doing. Or I'd have to include the remove frame part inside the IVTC filter for the new mechanism. It also means I must use internal counters inside my filters, because the value reported by pxsrc.mFrameNumber is not reliable for doing things like "do this on the 1000th frame processed". Whatever version of VDub I use: if the input is 20000 frames, and I put the following filters in this order: Filter A, IVTC, Remove frame, Filter B -- how many frames will Filter A and IVTC see and process? 20000 or 16000? Filter B should see only 16000 frames. |
| Posted by: jpsdr Dec 12 2011, 11:56 AM |
| I've made some tests with my manual IVTC filter. Old version: has LAG, internal buffer. New version: uses prefetch2. Video input file: 200 frames. "Number of frames processed" = number of times the run function has been called (and processed a frame). Old version alone: 200 frames processed by the run function. Old version + Remove frame: 200 frames processed => GOOD. The correct behavior for me, and for the filter to work properly. New version alone: 200 frames processed. New version + Remove frame: 160 frames processed => NOT GOOD. I see a dead end for me: I can't use prefetch2, as I absolutely need all the frames to be processed, at least in some filters placed before the remove frame... And apparently all the filters in the chain using prefetch2 are affected, not only those after it. Otherwise there would be no point in putting the filter at a specific position -- it acts as if it had been put at the top of the filter list. |
| Posted by: jpsdr Dec 13 2011, 02:43 PM |
| I've done more testing. Results are for a video input file of 200 frames; filters are listed in order, and within () is the number of frames that filter processed. Note: my old IVTC has LAG, and is the only filter with LAG in these tests. Filter A: a filter I've made using prefetch2. Filter B: a filter I've made not using prefetch2. Chains tested: [Old IVTC (200) -> Remove Frame], [Filter A (200) -> Old IVTC (200) -> Remove Frame], [Old IVTC (200) -> Filter A (160) -> Remove Frame], [Old IVTC (200) -> Filter A (160) -> null transform -> Remove Frame], [Old IVTC (200) -> Filter A (160) -> Filter B -> Remove Frame]. It seems that what saved me is the fact that my old filter has LAG, so all filters up to it process the expected number of frames. From what I see, with three no-LAG filters A, B, C, doing A + B + C + Remove frame is the same as doing Remove frame + A + B + C. I don't think this is really the expected behavior... EDIT: I've also tried: [Filter A (200) -> Your IVTC with reduce frame], [Filter A (200) -> Your bob doubler]. How do you manage to have no impact on the filters before? I haven't been able to find out in your filter's source code what you're doing differently from what I'm doing in mine. Your IVTC, when reducing the frame rate, finally does exactly what I'm doing in my remove frame: changing fa->dst.mFrameRateHi, fa->dst.mFrameRateLo and fa->dst.mFrameCount in GetParams, and selecting the frames to output in prefetch2... Can you take a look at my remove frame and tell me why it has an effect on the filters before it, and yours doesn't? I've looked at your IVTC filter, and I have absolutely no idea what the trick/difference is between what you are doing and what I am doing which makes my filter affect the filters before it, and not yours! |
| Posted by: jpsdr Dec 14 2011, 12:17 PM |
| More tests. Video input: a YV12 color cube file, 3:2 TFF, 200 frames. [Filter A -> Your IVTC with reduce frames]: frames processed by filter A: 200. [Filter A -> My remove frame filter]: frames processed by filter A: 160. I've tried everything I could see in the code of your IVTC filter -- adding SWAP_BUFFERS, having a little bit of code in the Run() proc so it's not empty -- nothing changes; the filter before always processes only 160 frames, and I'm unable to get the same behavior as your IVTC filter. Video input: create a 3:2 TFF RGB cube in YV16 mode, cut to 200 frames. [Filter A -> My remove frame filter]: frames processed by filter A: 160. [Filter A -> Your IVTC with reduce frames]: frames processed by filter A: 204. Save the created input video to a file, remove all filters, close the video and use the saved file as input. [Filter A -> My remove frame filter]: frames processed by filter A: 160. [Filter A -> Your IVTC with reduce frames]: frames processed by filter A: 200. |
| Posted by: jpsdr Dec 19 2011, 08:50 AM |
| So.... any idea why my remove frame filter also affects the previous filters, and your IVTC doesn't? |
| Posted by: phaeron Dec 19 2011, 06:52 PM |
| Sorry for the delay in responding. The reason for the difference is that VirtualDub's built-in IVTC filter does some unnecessary work when running in reduce rate mode with a manual offset: it still fetches a window of 11 frames. The overlap of these windows effectively causes all frames to be requested from the upstream filter. It could omit the window in which case you would see the same behavior as your filter, which is that some frames are skipped in the upstream. You cannot assume that your filter will be called for a contiguous set of frames. This was never guaranteed with edits or the frame rate conversion option, and it is even less true with frame rate conversion being possible in filters. VirtualDub always attempts to fetch the minimal set of frames from the upstream filter, with caching used to reduce the number of duplicate frame requests. If the downstream filter only requests four out of every five frames, that is all that the upstream filter will be asked to produce. The reason that it does this when the lag flag is set is to avoid breaking such filters, but that carries a significant performance penalty when sparse frames are requested: a filter with a lag of M that is asked to produce N sparse frames will run M*N times. This is up to M times as expensive as a filter written to use prefetch2-style upstream windowing. For your filter to work completely correctly in all cases, the rule you need to follow is as follows: the output produced by your filter must effectively be a pure function of the source frames and your constant filter data. You have to be careful about any cached data between frames to avoid introducing errors on seeks. The way that the internal IVTC filter works is that it prefetches all of the source frames it might need and internally caches data produced from these frames so that no extra work is done in sequential operation. 
However, should seeking occur, it is designed to always produce the same result regardless of the subset or order of frames requested. |
| Posted by: jpsdr Dec 20 2011, 07:40 AM |
| "You cannot assume that your filter will be called for a contiguous set of frames." ........ Veryyy troublesome... This is what my algorithm and the structure of the filter are based on (a pipeline structure), which greatly increases speed... This breaks a looooot of things in the structure of my filter... It may even be almost totally incompatible with it... If, for now, I use as a workaround a prefetch window of 11 frames in my remove frame filter, will it work? (I'll try and test this...) Thanks for your answer. |
| Posted by: jpsdr Dec 20 2011, 09:09 AM |
| A little more explanation. A long time ago I tested several IVTC filters without being convinced. When I developed my filters, it was of course for anime, where the telecine pattern changes on almost every scene. At the time, all the filters seemed to work on the same idea: detect the 2 frames with the highest correlation. I tested another idea: detect frames where doing IVTC significantly reduces correlation. At the time my filter worked in places where all the others failed, so I think I've been able to produce something which gives very good results. Feel free to test it (I insist... note: there is a crash bug I've corrected in the new version, so use that one if you want to test how good it is). It wasn't possible in the old days, because only RGB was available, but now I'm working with -- I'll not say YV12, but 4:2:0 interlaced data; the source can be either DVD or Blu-ray (from an avisynth script with DGMPEG or DGIndexNV). My purpose is to get the best result, so no upscaling of 4:2:0 to 4:2:2 (when necessary) while the content is interlaced -- only doing it once I'm sure all pictures are progressive. Even though my filter gives good results, it's not perfect; sometimes it fails. This is why I have a manual mode, where you provide a file which forces a manual pattern on some frames. For example, on an input video of 2000 frames, I'll say: the pattern is 3 for frames 894 to 902. Now, after the IVTC, some frames may still be interlaced. The most common reason: fading. Either black/white fades, which were made on fields in this kind of anime and will always be interlaced, or fades between scenes with different telecine patterns. Also, sometimes (very rarely) there are scenes where the interlacing doesn't follow any pattern. So, after IVTC, a few frames may need to be deinterlaced. This is why I also have a manual mode for my deinterlace filter, using an input file which tells it which frames should be deinterlaced.
So, on the resulting 1600-frame file, I will use my deinterlace filter and say: deinterlace only frames 19 to 32, 541 to 550 and 1580 to 1600 (the standard black ending fade). Now, and only now, I can upscale my 4:2:0 video to 4:2:2 (if necessary) using a progressive method, because I'm sure all my frames are progressive. I don't always need to upscale to 4:2:2; sometimes my whole process stays in 4:2:0. So I have: my old IVTC filter, with LAG, which can produce either a stable output telecine pattern of 2, or progressive output where frame 3 is always created as a duplicate of frame 2, using an internal counter to count frames; and my old deinterlace filter, without LAG, but which also uses an internal counter to count frames. Before the possibility of reducing frames existed, I had 2 steps: - Saving a 4:2:0 file with a stable telecine pattern of 2. - Opening this file with an avisynth script containing the correct doubleweave+pulldown pattern and deinterlacing it. Then, still with the old versions, the remove frame possibility was introduced. Great!! It lets me avoid creating an intermediate file with the pattern of 2, and do everything in one step -- more practical and faster. I just changed the default setting of my IVTC filter to create progressive output, and created a remove frame filter which removes the 3rd frame of every 5 frames. Now, my IVTC and remove filters are still the old versions, but my chain is the following, in this order: - Old IVTC filter, with a manual file which tells it to apply a pattern of 3 to frames 894 to 902, numbered according to the 2000-frame video input. - Remove frame. - Old deinterlace, with a manual file which tells it to deinterlace only frames 19 to 32, 541 to 550 and 1580 to 1600, but now numbered according to the 1600-frame video. For now, this works fine. Everything broke down when I tried to use prefetch2, but the rest of the story is told above.
But with my manual frame-settings options, and the fact that my IVTC algorithm needs contiguous input, at least at its level in the filter chain, I think you can see what my troubles are, and that I'm facing some kind of dead end... |
| Posted by: jpsdr Dec 20 2011, 12:14 PM |
| My remove frame filter removes k frames every n frames. I first tried adding a +/-n frame window in prefetch2. It worked, and after that I tried better: a +/-k window. That worked too. Why fetch +/-5 frames if +/-1 is enough, in the standard 1-frame-in-5 removal case? I made several tests with several values; it seems to work fine. Better still: for a 200-frame input file, all the filters before my remove frame process the 200 frames, and those after it process 160, and the big thing is that the .mFrameNumber field varies from 0 to 199 before, and 0 to 159 after. So there's no need for the internal counter anymore, and the manual settings will work fine. Things behave properly, at least from my point of view. If, in the filter order A B C D E, I've put the remove frame C after B and before D, it's not for it to have an effect on A and B; otherwise I would have put it before A. Here are links to everything working properly: http://dl.free.fr/nVBDi9dLZ http://dl.free.fr/r1MM27SDb (If the page is in French, click on "Telecharger ce fichier".) If after testing my IVTC filter you find it interesting, tell me... |
| Posted by: phaeron Dec 20 2011, 09:35 PM |
| This will work for now, but is highly likely to break in future versions. For instance, it would fail if VirtualDub fetched frames backwards. Simply put, if you are relying on the order in which frames are requested you are relying on unspecified behavior. You can still do pipelining in a way that doesn't require a specific order, namely by caching intermediate results. |
| Posted by: jpsdr Dec 21 2011, 09:20 AM |
| Note: after (I've lost count) a number of edits, this post is a total mess... sorry. Globally, what I have big trouble with is the fact that in a filter chain, where you choose a specific order (A B C D E) -- and if you have put it in that order, it's for a reason -- you can have a filter which affects the previous ones. If C is a decimate filter, doing A B C D E, or C A B D E, or A B D E C is the same thing. This is where, still from my point of view: no, there is a problem. If I chose to put C after B, it's because I want the effect of C applied only to the output stream of B. So if C is a decimate filter, decimation should be applied only to the output of B. Otherwise there is no point in having a decimate filter (or the possibility of decimating) in a filter, considering that a filter is something you put in a chain with a specific order. In that case, decimation should be an option available only at the same level as "Frame rate..." (video frame rate control), where you know for sure that it is done before the filter chain. This is still, of course, my point of view. As I use VDub often and find it very useful, this behavior annoys me more than a little... So, if I understand properly, unless you change your mind about the behavior, my options are: - Use the new versions of my filters, but without decimating, and go back to splitting the process in two parts with an intermediate file. - Use the new versions and decimate, but stop updating VDub when it stops working. - Use my old versions with LAG and decimate -- but won't LAG be deprecated one day? - Have a very hard time redesigning the filter, if it's even possible. I think I've just had a revelation... understanding prefetch. You said my trick may not work in the future. The question is why? Prefetching frame n at stage C means: filter C needs frame n, so filter B must produce frame n, so filter B will process frame n, and so on going up. ... I see...
I understand now why a decimate filter which prefetches only the kept frames will also affect upstream and produce gaps... It makes sense... I think I'm beginning to understand... Now, if I decimate k frames, prefetching a window of +/-k makes sure that all frames will be asked of the previous filter. More optimised, but still thinking of a job/process, not seeking: if in my decimate filter I keep a previous-frame variable, compare it to the actual frame I have to prefetch, and prefetch only the frames in the gap, this would also make sure all the frames are asked of the previous filter. So, again, why might this not work in the future? After some thought, I would say, from my point of view, that the problem is not the way things are handled now, but how a decimating filter is designed -- and mine was badly designed. When creating a decimating filter, which is supposed to be a filter in a processing chain, you must be sure that it will not affect upstream, and so it must prefetch the frames it'll drop. What makes me wonder more and more: why might what I've done not work in the future........... ####################### Questions: What is the status of the .mFrameNumber information before and after a pure decimate filter? I call "pure" a decimate filter which prefetches only one frame, not doing the trick I've done. Is the following correct, for 1 in 5 frames decimated, with a 200-frame input: - Upstream filters will only see 160 frames, but with .mFrameNumber going from 0 to 199 with gaps in the numbers. - Downstream filters will see 160 frames, with .mFrameNumber going from 0 to 159, without gaps. And for a filter which increases the number of frames, for example doubling them: will all the upstream filters have the frames they process doubled? Now that I've understood prefetch a little, I would say no; but while I may have understood decimating, I've not yet understood multiplying -- and as I have no interest in it now, I'll leave it aside.
If, internally in my filter, I have a previous-frame variable which stores .mFrameNumber just before exiting Run(): can I at least assume that on the next call of the Run() procedure I'll have previous frame < .mFrameNumber? This way, I could try to detect gaps and try to fill them (even if it would absolutely not be easy). Of course, I'm talking about running a batch/job process, not a seeking situation! I have, in my deinterlace filter, a special case I use in manual mode: mode 0. If my input file contains "100 102 0", it means: apply filter mode 0 from frame 100 to 102. Filter mode 0 consists of duplicating the frame just before the filter window into the window => here, duplicate frame 99 into frames 100 to 102. Currently it works by checking whether the next frame will be mode 0. So, while processing frame 99, it detects that the next frame has filter mode 0 (but the current one doesn't), and buffers the output of frame 99 (frame 99 may itself have to be deinterlaced). I see a possible big unsolvable problem... how can I be sure of always being able to catch the output of frame number 99? If a decimating filter downstream means frame 99 is never requested at all... ################################# Finally, if (or when) you have time, I'd really like feedback from you on what you think of the results of my IVTC filter. I've tested and compared with a little video I used when I ported the new version, and mine works in places where yours failed. If you're interested in the video I used, and the specific process I used, tell me and I'll post them and provide links. |
| Posted by: phaeron Dec 22 2011, 12:22 AM |
You can keep a previous frame number, but there is an important gotcha: you can't use it in Prefetch/Prefetch2. Why? Because the output of Prefetch/Prefetch2 has to be consistent and independent of Run(). Even in current versions, there is a delay between the time that upstream frame requests are prefetched and Run() gets called. Currently these requests get dispatched in order but this is not guaranteed. One of the features I would like to implement is 32/64-bit filter bridging, and in order to do that I would need to be able to speculatively call Prefetch[2] in order to reduce IPC overhead.
Yes, this is correct. A filter always produces the same sequence of frames -- it's just that the downstream filters may select only a subset of them.
No, this shouldn't happen. VirtualDub has two mechanisms to combat this, one being a frame cache and the other being the merging of requests in the request queue. If the downstream filter requests frame 4 from the upstream filter twice, both of the requests are connected so that the frame is not released until both requests are discarded. Generally this means that when all filters run at the same frame step, each will only process frames at a 1:1 rate regardless of any prefetch overlaps.
Usually, yes, although as you've guessed this is definitely not the case in preview windows.
You can't. There is no really good way around this. If you think about it, resolving this chain could effectively force all frames to be processed all the way from the start of the time line. This is one reason that VirtualDub doesn't allow you to prefetch your own filter's output (recursive filter). There are a couple of ways around this. For many filters, the effect of previous output frames diminishes considerably over several frames, and so it's OK to preroll only from a certain distance. This is the logic used in the current runtime for laggy filters, and the way you would replicate this with prefetch is to prefetch N frames and use those frames if necessary to prime internal buffers, using a previous frame check as you have described. For this to work you need to prefetch those frames always since you don't know at that point if you will need them. The other way, which is more directly applicable in your case of "duplicate previous frame," is to precompute or compute in prefetch the original output frame and use that instead. Since you know frames 100-102 are the same as frame 99, you can just prefetch for frame 99 instead. The cookies in the prefetches can be set so that Run() knows this is happening instead of having to re-deduce this. |
| Posted by: jpsdr Dec 22 2011, 09:01 AM |
First, thanks for all your answers and time. I think I see some things more clearly now.
For a decimation of k frames every n frames, my idea was the following: previous frame initialised to -1 in the start proc. In prefetch2: compute the frame I want to output. If previous frame + 1 != frame => prefetch the k previous frames. I think I don't need to prefetch the next frames, indeed. Prefetching from the previous frame to the actual frame could end in disaster in case of seeking... And after that, still in the prefetch2 function, update previous frame to frame. Independent of Run(). Can this -- still only in a job process, not seeking -- guarantee that all frames will be processed, and in order, by the upstream filters? If not, why? Edit: Bad idea... I'll stay with always prefetching a +/-k window.
Ok... I found no really specific information on this in SDK v1.1 (is there a newer version, btw?). So:
Where is the cookie? How can it be retrieved in the Run() proc? Do you think you'll have time to give me feedback (what you think) on the test results of my IVTC filter, or will you not have time, or are you just not interested? I'm not pushing, I just want to know what to expect.
| Posted by: phaeron Dec 22 2011, 10:06 PM |
Sorry, but this is not allowed. It is not allowed because you are not guaranteed that prefetching happens in order or that there is a Run() call for every prefetch, and therefore your previous frame prediction in prefetch may be wrong. This will break if I implement speculative prefetch caching or multithreaded prefetching. Also, I'd like to remind you that prefetch is required to be thread-safe, for similar reasons.
Nope. This is not guaranteed for three reasons:
Working on it. I just did a big conversion to a new XML/.NET-based help builder, so I'm still working out the kinks.
The cookie is the third argument to PrefetchFrame(). It's an arbitrary value you can supply with the prefetch request that is guaranteed to be returned with the VDXFBitmap::mCookie field that contains the corresponding frame. VirtualDub doesn't interpret this value in any way, so you can store any 64-bit value you want in it. Be careful with storing pointers, though, since you have no idea if the prefetches will actually be used. You can't use the cookie to store pointers to allocated memory if you have no other way of freeing that memory.
Sorry, I'm afraid not, since I'm working on releases. I actually don't have telecined output available on my computer at the moment, anyway. |
| Posted by: ale5000 Dec 22 2011, 11:32 PM |
Lol, I would like to see it |
| Posted by: phaeron Dec 23 2011, 08:21 AM |
| http://www.virtualdub.org/beta/backwardsfilt.zip Warning: This filter will perform pretty badly due to reverse decoding, which is another reason I don't encourage doing this with current versions. |
| Posted by: jpsdr Dec 23 2011, 09:04 AM |
From what I've understood: if I use my filters the way they are designed, don't use any unknown filters which may change the frame order in my chain, and the only decimating filter is mine, working as I've already described, I should never encounter problems.
To improve the performance of my filters, globally IVTC-likes, I decided not to use SWAP_BUFFERS, because you have 3 cases: - Nothing changes. - Only one field changes. - Both fields change. To minimize memory transfers, not using SWAP_BUFFERS seems a good idea, because in the first case you do nothing, and in the second case you move only one field. But... the effect I've discovered is that in this case you can't prefetch in ascending order if the first frame is current-N... because what you have in dst is the first prefetched frame. So the prefetch is made in the following order: N, N-2, N-1, N+1, N+2, N+6, N+7 for IVTC. Would it be better, for performance, to prefetch N-2, N-1, N, N+1, N+2, N+6, N+7 and always do the memory transfer? As this filter does a lot of processing, and given the way it's implemented now, changing the prefetch order and always forcing the transfer would only take a few minutes, and the time added by the transfer will probably have little effect on speed. So, the question: which is better? My decimate filter does the same thing: it first prefetches N, and then N-k to N+k (except N, of course). As this filter does nothing, adding memory transfers would greatly increase its processing time, and as Run() is currently empty, I would have to add all the transfer modes. But again, which is better? Still, by doing this, despite the prefetches not being issued in order, upstream processes all frames in order.
Ok, if the problem was only providing a little video source to test, I would gladly have provided a link (and of course, I was not asking you to look at the code...). Well, you take the time to answer my questions; that's already something.
| Posted by: jpsdr Dec 23 2011, 09:10 AM |
| I've just taken a look at your backwards filter. Other questions: why the need for Prefetch -- isn't Prefetch2 enough? Is PrefetchFrameDirect more interesting to use than PrefetchFrame? |
| Posted by: phaeron Dec 23 2011, 09:53 PM |
| Hmmmm. This is an interesting case. Yes, it would be slightly better if you could prefetch the in-place frame in the middle. As it turns out, it's a moot point. The problem with an in-place filter (!SWAP_BUFFERS) is that it forces frames to be removed from the cache because the frames are modified. What this means is that making your filter in-place does no good as it merely causes VirtualDub to do the copy instead. There is a predictor in the runtime which tracks whether frames are getting re-requested multiple times in quick succession, and if the predictor says yes, the frame is copied to a new buffer so the original can stay in the cache. This prevents the frame from being recomputed, which would be much more expensive than a memcpy(). I have a feeling that it won't be possible to come up with a good solution for this in a general prefetch model until late fetches are possible, i.e. the ability to re-request new frames from Run(). This would allow prefetching a minimal set of frames up front and then pushing the request back into the pending queue from Run() if it turns out more frames are needed. I'm not sure what effect this could have on pipelining, though. Hopefully if it only happens on seeks it would only occasionally stall the pipeline. BTW, a memcpy should not be that slow. A modern CPU can copy large blocks of memory very quickly. One thing that may be slowing down a copy is if you are doing one row at a time. This is unavoidable for field copies, but for full frame copies, you can check whether the scan lines are contiguous and do a full plane copy if so, which is much faster. VirtualDub's VDMemcpyRect() does this.
Prefetch2 is only available starting with the V14 API (1.9.1); the filter I posted only requires V12 and should work back to 1.7.4.
Yes. If you know that your filter isn't going to modify the frame, then PrefetchFrameDirect() allows smart rendering to copy the compressed frame and avoid dropping out of direct stream copy mode. In this case it would likely only be advantageous for a key frame only format like Huffyuv or DV. The filter runtime could potentially also skip calling runProc() entirely although the current code doesn't do this. |
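A minimal sketch of the choice between the two prefetch calls. This uses a mock stand-in for the prefetcher interface: the real V14 API passes an IVDXVideoPrefetcher, and the method names below mirror the ones discussed, but the signatures, the mock type, and the helper function names are illustrative assumptions, not the SDK's actual declarations.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the prefetcher the V14 filter API hands to
// prefetchProc2; it just records which calls a filter would have made.
struct MockPrefetcher {
    struct Request { int64_t frame; bool direct; };
    std::vector<Request> requests;

    void PrefetchFrame(int64_t frame)       { requests.push_back({frame, false}); }
    void PrefetchFrameDirect(int64_t frame) { requests.push_back({frame, true}); }
};

// A filter that never modifies the frame can use PrefetchFrameDirect(),
// letting smart rendering keep direct stream copy mode for that frame.
void PrefetchPassThrough(MockPrefetcher& p, int64_t outputFrame) {
    p.PrefetchFrameDirect(outputFrame);
}

// A filter that reads several source frames (e.g. an IVTC-style filter
// needing the current frame plus `lag` previous ones) must use
// PrefetchFrame() for each of them, clamped at the start of the clip.
void PrefetchWithLag(MockPrefetcher& p, int64_t outputFrame, int lag) {
    for (int i = lag; i >= 0; --i) {
        const int64_t f = outputFrame - i;
        if (f >= 0)
            p.PrefetchFrame(f);   // oldest first, so sources arrive in order
    }
}
```

The pass-through case is the one where, as noted above, the runtime could in principle skip runProc() entirely.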
| Posted by: jpsdr Dec 24 2011, 07:46 AM | ||||||||||
Like IVTC filters...
Taken from the SDK v1.2:
Does it mean: testing if pitch == w*4 (or w, or w*2)? I know the pitch can be negative (this is why you forbid the first method), and I force it to be negative and rearrange the src pointer in my IVTC filters, to match the original pattern from when I first developed the filter. I've corrected this in my manual IVTC and in another field-manipulation filter I've made, to avoid backward memory transfers (not very good for CPU memory caching), but for my IVTC filter, changing it back would have too much impact and would be too complicated. So, considering that for my IVTC filter: - it's a field transfer, and - the memory transfers are made backward, I think I'll stay with the current way of prefetching.
Is it in the src/include files provided with the SDK, or in some other file in the source code? |
| Posted by: phaeron Dec 24 2011, 08:54 PM |
| VDMemcpyRect() is a VirtualDub internal function, not part of the SDK. The warning in the SDK about not doing one big memcpy() in general is correct, but contiguous scan lines are a special case where that does work. If pitch == w*4, then there is no functional difference between copying one scan line at a time and copying the whole frame at once. This only works when there are no gaps, so that's why you need a pitch check before doing this. |
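As a rough illustration of the pitch check described above, here is a hypothetical helper for 32-bit pixels (this is not VirtualDub's actual VDMemcpyRect() code, just the same idea: one big memcpy() when the scan lines are contiguous, a per-row loop otherwise):

```cpp
#include <cstdint>
#include <cstddef>
#include <cstring>

// Copy a w x h plane of 32-bit pixels. If both pitches equal w*4, there
// are no gaps between scan lines: the plane is one contiguous block and
// a single memcpy() is legal and faster than a per-row loop. A negative
// pitch (bottom-up frame) never passes this test, so the row-by-row
// path below handles it naturally.
void CopyPlane32(void* dst, std::ptrdiff_t dstPitch,
                 const void* src, std::ptrdiff_t srcPitch,
                 uint32_t w, uint32_t h) {
    const std::size_t rowBytes = std::size_t(w) * 4;

    if (dstPitch == std::ptrdiff_t(rowBytes) &&
        srcPitch == std::ptrdiff_t(rowBytes)) {
        std::memcpy(dst, src, rowBytes * h);   // contiguous: one big copy
        return;
    }

    char* d = static_cast<char*>(dst);
    const char* s = static_cast<const char*>(src);
    for (uint32_t y = 0; y < h; ++y) {         // gaps or negative pitch
        std::memcpy(d, s, rowBytes);
        d += dstPitch;
        s += srcPitch;
    }
}
```

For a field copy the pitch is effectively doubled, so the contiguous fast path can never trigger, which matches the remark above that row-at-a-time copying is unavoidable for fields.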
| Posted by: jpsdr Dec 25 2011, 09:18 AM |
| Thanks, and merry Xmas. |
| Posted by: jpsdr Mar 25 2012, 08:42 AM |
| I totally forgot one little thing... To finish explaining why my filter needs a continuous input of frames from the beginning, and can't work properly in preview: one important part of the algorithm is history. I have a kind of threshold to validate the detection of an IVTC pattern, and afterward, if this threshold is not reached, the previously detected pattern is kept. So you can have a good pattern detected (strong interlaced information), followed by 100 frames where the IVTC pattern is almost undetectable (under the threshold); in that case, the "old" detected pattern is kept until a scene change or another validated pattern detection. |
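The history mechanism described in that post could be sketched like this (the class, names, and threshold are illustrative assumptions, not the filter's actual code): a pattern is adopted only when its detection score passes the threshold; below it, the previously validated pattern is kept until a scene change or a new validated detection.

```cpp
#include <optional>

// Hypothetical sketch of the "history" behaviour described above.
class PatternHistory {
public:
    explicit PatternHistory(double threshold) : mThreshold(threshold) {}

    // Returns the pattern to apply for this frame, or no value if none
    // has been validated yet (or history was reset by a scene change).
    std::optional<int> Update(int detectedPattern, double score, bool sceneChange) {
        if (sceneChange)
            mCurrent.reset();             // old history no longer trusted
        if (score >= mThreshold)
            mCurrent = detectedPattern;   // strong detection: adopt it
        return mCurrent;                  // weak detection: keep the old one
    }

private:
    double mThreshold;
    std::optional<int> mCurrent;
};
```

Because each frame's decision depends on this carried-over state, skipping or reordering input frames corrupts every decision that follows, which is exactly why the filter needs a continuous input from the beginning.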
| Posted by: phaeron Mar 31 2012, 09:18 PM |
| There's unfortunately no way to perfectly support seeking in this model without requiring all frames to be reprocessed potentially back to the beginning of the file. You can approximate it by fetching a bunch of frames on a seek and omitting those frames when you have cached information, but the longer the look-back window the more memory this will require. |
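A sketch of the look-back window such an approximation would need (the function name and the clamping policy are assumptions for illustration, not VirtualDub API): on a seek to frame n, the filter would prefetch up to `lookback` preceding frames in addition to n itself, and could then skip re-analyzing any of them for which it still has cached information.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Compute the frames to prefetch for output frame `frame` with a
// look-back window of `lookback` frames, clamped at the start of the
// clip. Oldest first, so Run() sees its sources in temporal order.
std::vector<int64_t> LookbackWindow(int64_t frame, int64_t lookback) {
    std::vector<int64_t> window;
    const int64_t first = std::max<int64_t>(0, frame - lookback);
    for (int64_t f = first; f <= frame; ++f)
        window.push_back(f);
    return window;
}
```

As noted above, the memory cost grows linearly with the window length, since every frame in the window may have to be held (or recomputed) at once.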
| Posted by: jpsdr Apr 1 2012, 08:24 AM |
| I think I may not have been clear enough about something: seeking is absolutely not my concern. I've always considered seeking/preview simply incompatible with this filter, which is why I haven't implemented the preview button. My only concern, now that I've changed my frame-removal filter to request a window of additional frames via prefetch (described in post 14), which solved the original problem of not having all frames processed, is to be sure that all frames will be processed, and in the correct order, when I start a process in a case where I know exactly which filters I'm using (there is no filter that will request frames backward).