

Important

The forums will be closing permanently the weekend of March 15th. Please see the notice in the announcements forum for details.

VirtualDub-MPEG2 1.5.9, come and get it
fccHandler
Posted: Nov 24 2003, 05:40 AM


Administrator n00b


Group: Moderators
Posts: 3961
Member No.: 280
Joined: 13-September 02



Thank you for the valuable info. This will be useful if I ever get a P4 (I am currently negotiating with Santa). :)

I'm curious about a few things, though:

QUOTE
  • Slow IMUL. Scalar multiplies go through the FPU and are slow again.

I've heard this, that IMUL is implemented using the FPU, but does that mean EMMS is necessary first? If so, do you need to guard against the C++ compiler producing an IMUL in the middle of what is (mostly) MMX function calls? I don't want to have to execute EMMS between every C++ and __asm block.

QUOTE
  • 64K aliasing. All lines in a set in the L1 cache share the upper 16 bits.  That means if two 64-byte aligned sets are separated by 64K, they cannot both be in L1 at the same time, so allocating your MPEG buffers with VirtualAlloc() calls is a bad idea.  The stack is a big problem here on HT systems.

What would be the best way to allocate the buffers, to help P4 performance but not hinder P3?

--------------------
May the FOURCC be with you...
 
phaeron
Posted: Nov 24 2003, 10:19 AM


Virtualdub Developer


Group: Administrator
Posts: 7773
Member No.: 61
Joined: 30-July 02



No, emms is not required, as IMUL doesn't require the FPU registers. In fact, this is the model that was used by the original Pentium. The problem is that IMUL was made really fast on P6, so compilers started preferring it over add/lea sequences, and now it sucks again. If you are using VC7/7.1, /G7 tells the compiler to prefer add/lea over imul.
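To illustrate what "prefer add/lea over imul" means, here is a minimal sketch of the kind of rewrite the compiler performs for small constant multipliers. The function names are hypothetical and exist only for this example; in practice you would let the compiler flag do this rather than write it by hand.

```cpp
#include <cassert>
#include <cstdint>

// x * 5 written as x + x*4. On x86 this shape fits a single
// "lea eax, [eax+eax*4]" instead of an imul.
constexpr std::uint32_t mul5(std::uint32_t x) { return x + (x << 2); }

// x * 10 written as (x + x*4) * 2: the lea above followed by one
// more add or shift.
constexpr std::uint32_t mul10(std::uint32_t x) { return mul5(x) << 1; }

// x * 7 written as x*8 - x: a shift and a subtract.
constexpr std::uint32_t mul7(std::uint32_t x) { return (x << 3) - x; }

// Unsigned arithmetic wraps modulo 2^32, so these agree with a real
// multiply for every input, including ones that overflow.
static_assert(mul5(123u) == 123u * 5u, "mul5 must match x * 5");
static_assert(mul10(123u) == 123u * 10u, "mul10 must match x * 10");
static_assert(mul7(123u) == 123u * 7u, "mul7 must match x * 7");
```

Whether such a sequence actually beats imul depends on the CPU, which is the point of the P6 vs. P4 difference described above.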

For MPEG, allocate your buffers together and skew them appropriately. For instance, allocate Y/Cb/Cr planes for a single buffer in a single allocation and interleave or skew them as needed so you don't alias within a single plane. These alignment optimizations won't hurt P6 and may even help if you end up making better use of L1 cache in general. Meia interleaves all three planes together; you could go farther and interleave all three buffers together as well but that may be a loss. VirtualAlloc() is the only function you really have to watch out for as it allocates memory at perfect alignment for aliasing -- 64K.
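A rough sketch of the "one allocation, skewed planes" idea follows. It is one interpretation, not Meia's actual layout: the names (PlaneLayout, layout_planes, kSkew) are hypothetical, and the 256-byte skew is an assumed value chosen for illustration. The 64-byte line and 64K distance come from the aliasing description quoted earlier.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kAliasSpan = 65536;  // addresses 64K apart alias
constexpr std::size_t kLine = 64;          // cache line size from the quote
constexpr std::size_t kSkew = 4 * kLine;   // assumed minimum separation mod 64K

// Byte offsets of each plane inside one shared allocation.
struct PlaneLayout {
    std::size_t y, cb, cr, total;
};

constexpr std::size_t round_up(std::size_t n, std::size_t a) {
    return (n + a - 1) / a * a;
}

// Distance between two offsets on the 64K "ring", in the range [0, 32K].
inline std::size_t alias_distance(std::size_t a, std::size_t b) {
    std::size_t d = (a > b ? a - b : b - a) % kAliasSpan;
    return d < kAliasSpan - d ? d : kAliasSpan - d;
}

// Lay Y, Cb, Cr end to end, nudging each plane start forward one cache
// line at a time until it is at least kSkew away (mod 64K) from the
// starts of the planes placed before it.
inline PlaneLayout layout_planes(std::size_t y_size, std::size_t c_size) {
    PlaneLayout p{};
    p.y = 0;
    p.cb = round_up(y_size, kLine);
    while (alias_distance(p.cb, p.y) < kSkew) p.cb += kLine;
    p.cr = round_up(p.cb + c_size, kLine);
    while (alias_distance(p.cr, p.y) < kSkew ||
           alias_distance(p.cr, p.cb) < kSkew) {
        p.cr += kLine;
    }
    p.total = p.cr + c_size;
    return p;
}
```

You would then make a single allocation of p.total bytes (plus any slack needed to align the base) and address each plane as base + offset. The offsets only guarantee separation relative to each other, so the skew holds wherever the base lands.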

I generally don't pay that much attention except if I'm already doing alignment tricks or if it starts showing up on VTune traces. It's very difficult to control 64K aliasing unless you are dealing with the innermost loop, because not only do your buffers cause aliasing, but so does the stack. The way I deal with this is to avoid the stack if possible in the lowest level conversion loops. This usually requires either reusing ESP as a general purpose register, by stashing it in the SEH chain, and/or spilling scalar registers to MMX/SSE registers in the second-level loop.
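For a quick check of whether two addresses (say, a stack local and a buffer row) would collide, something like the following can be dropped into a debug build. This is a sketch under the assumptions in the quoted description (64-byte lines, conflicts at 64K multiples); aliases_64k is a hypothetical helper name, not part of any library.

```cpp
#include <cassert>
#include <cstdint>

// True when a and b sit on different cache lines but share address
// bits 6..15, i.e. the lines are a multiple of 64K apart.
inline bool aliases_64k(const void* a, const void* b) {
    const std::uintptr_t pa = reinterpret_cast<std::uintptr_t>(a);
    const std::uintptr_t pb = reinterpret_cast<std::uintptr_t>(b);
    const bool different_line = (pa >> 6) != (pb >> 6);
    const bool same_low_bits = ((pa ^ pb) & 0xFFC0u) == 0;
    return different_line && same_low_bits;
}
```

Calling it with the address of a local variable and a pointer into the working buffer gives a rough indication of whether the stack is fighting the buffer for the same L1 slots in the inner loop.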

 
fccHandler
Posted: Nov 24 2003, 09:17 PM





Thanks again. I will be looking at Meia in depth.

--------------------
May the FOURCC be with you...
 
Shining Arcanine
Posted: Nov 27 2003, 03:33 PM


Unregistered









Contact the Creator of Kirbi, Eric Bron, for help:

http://www.aceshardware.com/read.jsp?id=60000261 (article that praises how well it is SSE2, SMT/SMP optimized)
http://www.adeptdevelopment.com/

If he can get the 2.4GHz P4C to crank out 16.7 billion polygons at 1.1fps, imagine what he could do for VirtualDub, or at least your modification to it. :D
 
phaeron
Posted: Nov 28 2003, 06:49 AM





Not all of those 16.7 billion polygons are visible -- most of them will be culled by scene traversal or backface culling. Consider that if they were all visible, the CPU would be rendering 6.3 triangles per clock. That would be 1000x faster than achieved by RAD's Pixomatic software 3D engine (http://www.radgametools.com/pixofeat.htm), which is co-written by none other than optimization expert Michael Abrash and is a very optimized engine in its own right.

Not to mention that video compression and image convolution are very different from vertex transformation, polygon rasterization, and texture mapping.

Keep in mind as well that fccHandler only has a Pentium III, so it would be very unfortunate to have his version P4 optimized. :)
 
Shining Arcanine
Posted: Nov 29 2003, 08:52 PM


Unregistered









QUOTE (phaeron @ Nov 28 2003, 02:49 AM)
Not all of those 16.7 billion polygons are visible -- most of them will be culled by scene traversal or backface culling. Consider that if they were all visible, the CPU would be rendering 6.3 triangles per clock. That would be 1000x faster than achieved by RAD's Pixomatic software 3D engine (http://www.radgametools.com/pixofeat.htm), which is co-written by none other than optimization expert Michael Abrash and is a very optimized engine in its own right.

Not to mention that video compression and image convolution are very different from vertex transformation, polygon rasterization, and texture mapping.

Keep in mind as well that fccHandler only has a Pentium III, so it would be very unfortunate to have his version P4 optimized. :)

Considering how much praise he received from Ace's Hardware, I figured he would be able to optimize Virtualdub or at least fccHandler's modification for the P4.
 
enrico75ps
  Posted: Mar 6 2004, 08:23 PM


Unregistered









Thank you for your wonderful work. Finally I can encode my MPEG-2 video!!! :D :D :D
 
NuPogodi
Posted: Mar 8 2004, 01:55 PM


Advanced Member


Group: Members
Posts: 536
Member No.: 6558
Joined: 1-October 03



QUOTE (fccHandler @ Nov 1 2003, 02:53 PM)
You haven't been listening.  I say again, "VirtualDub can only create AVI files." :rolleyes:

TMPGEnc can convert AVI to MPEG.

With all due respect, fccHandler... I have to correct you... there is a freeware plugin (YMPEG) which allows VDub to output MPEG-1/2 files. It is still an alpha version, but the results look quite promising...
http://www.motiwala.com/ympeg.htm

There is probably another similar plugin (Yane), but rfmmars is the only one who knows the details. I've asked him where I can download the Yane plugin to compare the results, but I'm still waiting...
http://virtualdub.everwicked.com/index.php...&t=6210&hl=yane

Anyway, YMPEG exists and it works. Amen!

--------------------
Optimists believe that they live in the best of existing worlds. Pessimists are afraid of that's right...
 
rfmmars
Posted: Mar 8 2004, 03:11 PM


Advanced Member


Group: Members
Posts: 324
Member No.: 5438
Joined: 29-July 03



I am sorry that I failed to respond to your request about the Yane plug-in. I think we are all talking about the same one, only with a different spelling. There is a beta version now, though I am not sure whether I downloaded it from Donald Graft's site. Send me an e-mail if you can't find it.

richard
rfmmars@cox.net
 
TCmullet
Posted: Mar 10 2004, 03:46 PM


Advanced Member


Group: Members
Posts: 312
Member No.: 3970
Joined: 2-May 03



After reading about ympeg, I raced to his site. But when I went to install, it failed because I don't have SSE (only an Athlon 850, not an athlon XP). Drat!
 
NuPogodi
Posted: Mar 12 2004, 11:26 AM





QUOTE (TCmullet @ Mar 10 2004, 09:46 AM)
After reading about ympeg, I raced to his site.  But when I went to install, it failed because I don't have SSE (only an Athlon 850, not an athlon XP).  Drat!

What can I say aside from a banality like "I regret..."?
You can try ffvfw by Milan Cutka, which also has a built-in MPEG-2 encoder... but I must confess that its encoding quality did not satisfy me.

--------------------
Optimists believe that they live in the best of existing worlds. Pessimists are afraid of that's right...
 
40 replies since Oct 2 2003, 06:38 AM