Sarreq Teryx
Posted: Feb 12 2003, 03:26 PM

VirtualdubMod Alpha tester
Group: Vdubmod Alpha Testing Team
Posts: 175
Member No.: 41
Joined: 16-July 02
I originally posted this on the divx forums here, toward the bottom of the thread:

QUOTE: "Oh, and another thing: allow it to use the full luminance range. I hate seeing a frame that's supposed to be pitch black (r0,g0,b0) get encoded as very dark grey (10,10,10). Possibly make the color range scene-adaptive to maximize the possible quality, by only encoding the color range that is actually in use in each scene. For example, if it's a very dark scene that only uses colors from 0,0,0 to 90,90,90 (out of a possible 24-bit RGB range), you only need to encode 20-bit RGB, which, when translated to YUV, would need much less. Or using the full YV12 range to represent less RGB color depth could work too. Unless this is already done, of course."
To expand on that and lessen the confusion (keeping in mind that I have no idea whether this is done now or not, and that I'm writing this at 2:35am):

Say you have a frame whose color range is between r60,g40,b90 and r110,g210,b180. That makes a range of r50,g170,b90, or 765,000 colors, which would only need 20 bits of data in RGB. Now, rounding those 20 bits of RGB color to 12 bits of YUV color would be quite a bit less lossy than using those same 12 bits to represent the full 24 bits of RGB color, as it does now. And with less of an RGB color range in a frame it just gets better (probably), at least until you reach a frame with a 12-bit RGB range (which would come out as a perfect picture [before DCT compression]); past that point, using YV12, you can't go any lower without wasting bits.

I'd guess that specifying the RGB color range to be decoded would require about 6 more bytes of data per frame, which isn't much compared to the size of frames already.

Another idea using this color-range encoding, to improve its accuracy (at the expense of more data per frame), might be to apply it not to a full frame but to portions of a frame: maybe 4x4 (or larger) grids of macroblocks, or possibly freeform areas (circles, ellipses, rectangles, bezier areas, etc. -- whatever stays within a user-adjustable minimum of a 12-bit range).

I realize all this will probably totally break MPEG-4 compatibility, and I really don't mind if it would improve quality; just make them ADVANCED (bigger emphasis on "advanced" is not available in font form) features.
So am I completely nuts, or is this feasible??
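The range arithmetic above can be sketched in Python (the function name is mine; this just reproduces the post's calculation of per-channel differences multiplied together):

```python
import math

def range_bits(lo, hi):
    """Colors and bits implied by an RGB bounding box, computed the way
    the post does it: difference per channel, product across channels."""
    spans = [h - l for l, h in zip(lo, hi)]
    colors = spans[0] * spans[1] * spans[2]
    return colors, math.ceil(math.log2(colors))

# The example from the post: r60,g40,b90 through r110,g210,b180
colors, bits = range_bits((60, 40, 90), (110, 210, 180))
print(colors, bits)  # 765000 colors, which fit in 20 bits
```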
--------------------
And as I walk through the Valley of the Shadow of Death, Lord, thy balls and shaft, they comfort me, you anoint my head with oil, some salt, a dash of pepper, a sprig of parsley......Lord?............Lord??? What dost thou intend to do with that fork???
muf
Posted: Feb 12 2003, 06:00 PM

MCF team member
Group: Moderators
Posts: 179
Member No.: 46
Joined: 21-July 02
Therapy material.
--------------------
phaeron
Posted: Feb 13 2003, 03:47 AM

Virtualdub Developer
Group: Administrator
Posts: 7773
Member No.: 61
Joined: 30-July 02
MPEG already does this -- it's called adaptive quantization. The DC component for each block -- the average level of all pixels -- is stored as differences between adjacent blocks. The differences are then divided by the current quantizer and then VLC encoded. When the codec has lots of bits to spare and the local area is relatively shallow, the codec kicks down the DC quantizer and encodes small differences. When bits run short or large variations are seen, the DC quantizer is raised and big differences are encoded. The quantizers can range over [1,31] and can be changed between any macroblock.
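A toy model of the DC handling described above -- differences between adjacent blocks, divided by a quantizer -- purely to illustrate the principle, not the real MPEG syntax or VLC stage:

```python
def encode_dc(dc_levels, quant):
    """Differentially code block DC levels, quantizing each delta.
    The predictor tracks what the decoder will reconstruct, so
    quantization error doesn't accumulate across blocks."""
    coded, pred = [], 0
    for dc in dc_levels:
        coded.append(round((dc - pred) / quant))
        pred += coded[-1] * quant  # decoder-side reconstruction
    return coded

def decode_dc(coded, quant):
    """Invert encode_dc: accumulate dequantized deltas."""
    out, pred = [], 0
    for c in coded:
        pred += c * quant
        out.append(pred)
    return out

levels = [128, 130, 131, 96, 95]   # DC averages of adjacent blocks
rec = decode_dc(encode_dc(levels, 2), 2)
print(rec)  # each value within one quantizer step of the original
```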
YV12 has an 8-bit Y plane with a black-to-white excursion of [16,235], which translates to 7.8 bits of precision for grays. The reduced luma precision in YV12 is thus certainly not responsible for the raised black floor you are seeing. Keep in mind that MPEG-4 is third-generation video technology designed by video professionals around the world. They would not put up with a video standard that couldn't encode black properly!
Also, you are slightly confused in your macroblock motion proposal. pel is indeed the same as pixel, meaning picture element. Half-pel motion prediction means, very simply, that you can shift blocks by half-pixel amounts. So you can sample at a relative offset of (-1,4.5), or (-2.5, 3.5). Quarter-pel means you can do quarter pixel offsets like (-1.25, 0). These fractional offsets are resolved via bilinear filtering.
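Those fractional offsets can be illustrated with a minimal bilinear sampler (a simplified sketch; real codecs do this in fixed-point with defined rounding rules):

```python
def sample(img, x, y):
    """Bilinearly sample a 2-D list `img` at fractional (x, y).
    Requires 0 <= x < width-1 and 0 <= y < height-1."""
    x0, y0 = int(x), int(y)
    fx, fy = x - x0, y - y0
    # Blend horizontally on two rows, then blend the rows vertically.
    top = img[y0][x0] * (1 - fx) + img[y0][x0 + 1] * fx
    bot = img[y0 + 1][x0] * (1 - fx) + img[y0 + 1][x0 + 1] * fx
    return top * (1 - fy) + bot * fy

img = [[0, 10],
       [20, 30]]
print(sample(img, 0.5, 0.5))  # 15.0, the average of all four pixels
```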
As for dynamic prediction, you are too late. Even MPEG-1 allows the encoder to select motion vector resolution on a per-frame basis.
Sarreq Teryx
Posted: Feb 13 2003, 06:49 AM

VirtualdubMod Alpha tester
Group: Vdubmod Alpha Testing Team
Posts: 175
Member No.: 41
Joined: 16-July 02
QUOTE (phaeron): "MPEG already does this -- it's called adaptive quantization. [...]"

For some reason that just doesn't sound quite like what I'm suggesting, but I'll bow to your quite-a-bit-more-than-mine wisdom anyway (and no, that wasn't meant to be sarcastic).

QUOTE (phaeron): "YV12 has an 8-bit Y plane with a black-to-white excursion of [16,235] [...] They would not put up with a video standard that couldn't encode black properly!"

I'd think so too, if it weren't something I've noticed happening on a regular basis. I use Paul Curie's coring filter quite a bit, which takes a threshold of pixels close to black and sets them to 0,0,0 black. Every time I take the time to look after compressing, those areas that should be perfectly black are actually encoded as 10,10,10 really-dark-grey.

QUOTE (phaeron): "Also, you are slightly confused in your macroblock motion proposal. [...] These fractional offsets are resolved via bilinear filtering."

That's what I thought originally, but I was going by the divx.com guides, which are written so confusingly that I thought hpel and qpel referred to the accuracy within the 8x8 blocks. From your description, that seems a bit useless unless you're shrinking the video from the source, though. If you're not resizing down, does it actually give any benefit??

QUOTE (phaeron): "As for dynamic prediction, you are too late. [...]"

Well, I did say I didn't know if it was done already.
phaeron
Posted: Feb 14 2003, 04:00 AM

Virtualdub Developer
Group: Administrator
Posts: 7773
Member No.: 61
Joined: 30-July 02
QUOTE (Sarreq Teryx @ Feb 13 2003, 12:49 AM): "That's what I thought originally, but I was going by the divx.com guides, which are written so confusingly, I thought hpel and qpel referred to the accuracy within the 8x8 blocks. From your description, that seems to be a bit useless, unless you're reducing the size of the video from the source, though."
Half-pel and quad-pel prediction don't give you more pixels in a block. Let's say you have a video where the background is slowly moving left at 1.5 pixels/frame. Well, you can't directly pull the previous frame over by 1.5 pixels, but what you can do is pull it 1 pixel over, make a copy 2 pixels over, and then average the two. That gives you an approximation of a pan by 1.5 pixels. If you don't weight the two copies equally, you can shift the image by a little more or less. That's how you get quad-pel.
Better interpolation filters can be used to improve the result, and in fact, the exact same ones used as for resizing video. Rarely is motion in a video a perfect pan, though, and thus the higher quality result isn't worth it. If you dump the motion vectors from an MPEG file, you'll find they generally are a bit more sporadic than the smooth motion you perceive in the video. And if the motion has zoom or rotation components, you're going to have much bigger sources of error than the interpolation filter.
Keep in mind that most encoders don't even explore all of the motion vectors possible in MPEG-1 full pel or half pel prediction for every block, much less all quad pel offsets. Most of them use hierarchical or telescopic searching to zoom in on a spot, then jiggle the search block a bit to figure out the subpixel offset. The encoding speed adjustment primarily adjusts how thorough the motion search is.
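The hierarchical/telescopic idea can be sketched like this -- a generic coarse-to-fine search with a halving step, not any particular encoder's algorithm; `cost` stands in for a block-matching measure such as a sum of absolute differences:

```python
def coarse_to_fine_search(cost, start=(0, 0), step=8):
    """Probe the eight neighbors of the current best vector at a given
    step size; when the center wins, halve the step and refine, down to
    one pel. cost(dx, dy) scores a candidate motion vector."""
    best = start
    while step >= 1:
        candidates = [best] + [
            (best[0] + dx * step, best[1] + dy * step)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        ]
        best = min(candidates, key=lambda v: cost(*v))
        if best == candidates[0]:
            step //= 2  # no neighbor beat the center: zoom in
    return best

# Toy cost surface with its minimum at vector (5, 3)
found = coarse_to_fine_search(lambda dx, dy: (dx - 5) ** 2 + (dy - 3) ** 2)
print(found)  # (5, 3)
```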
QUOTE (Sarreq Teryx): "if not resizing down, does it actually give any benefit??"

Yes, it does. MPEG also encodes the difference between the copied block and the desired block (motion compensation), so a closer prediction reduces the amount of compensation required. Of course, as you raise the prediction precision, you pay with more bits for motion vectors, a more complex decoder, and slower encoding. And there are definitely diminishing returns as you use finer positioning.
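A toy illustration of that benefit, using a 1-D ramp that pans half a pixel between frames (purely illustrative, not codec code):

```python
def residual_energy(block, pred):
    """Sum of squared differences the codec would have to encode."""
    return sum((a - b) ** 2 for a, b in zip(block, pred))

prev = [0, 2, 4, 6, 8, 10]   # previous frame's row: a linear ramp
cur = [1, 3, 5, 7, 9]        # same row after a 0.5-pixel pan

full_pel = prev[:5]          # best whole-pixel shift (no shift at all)
half_pel = [(a + b) / 2 for a, b in zip(prev, prev[1:])]  # averaged copies

print(residual_energy(cur, full_pel))  # 5: residual left to code
print(residual_energy(cur, half_pel))  # 0: prediction is exact here
```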
thegreenling
Posted: Mar 15 2003, 01:03 PM

Unregistered
QUOTE (Sarreq Teryx @ Feb 12 2003, 09:26 AM): "allow it to use the full luminance range"

...sounds like what's needed is a "color maximizer", like the volume maximizer in TMPGEnc: calculate the channels of the colorspace and then stretch them to the maximum, or to a defined value. (Processing would need much time, and you shouldn't use filters first -- a few stray spots near the maximum would spoil it. PROBLEM.) Is that what you're talking about? I do the calculations first in TMPGEnc, the filtering in VirtualDub, and the encoding in TMPGEnc to MPEG-1 (mostly!). That's the way I like it.

I find the thought of an anime-only codec (from KILE, on Doom9) brilliant. If you can get the picture into vectors, you only need keyframes and software that moves the vector parts toward the next keyframe. The framerate can go as high as your machine can manage, and no more telecines. An AVI codec that doesn't decompress but draws pictures, like a WAV-to-MIDI codec, or so. (Maybe it's evil to post it here, but I have done.)
thegreenling
Posted: Mar 15 2003, 01:20 PM

Unregistered
Wait, I have one more idea!
What if you only want to stretch the luminance plane (not the colors)? Does anybody know what the K in the CMYK colorspace is for (gamma correction first!)?
phaeron
Posted: Mar 15 2003, 10:34 PM

Virtualdub Developer
Group: Administrator
Posts: 7773
Member No.: 61
Joined: 30-July 02
Anime can't be represented with just vectors, as you also have text, gradients, fades, and CG to deal with. Believe me when I say this isn't the first time a person has gotten this idea. If you can write a program to pull vector objects from anime then a renderer already exists for it: Flash.
The K in CMYK stands for black and really only applies to color separation printing. Theoretically, cyan/magenta/yellow subtractive pigments can produce black when combined together, but in practice (A) they form a muddy brown instead of black, (B) the three color overlays aren't aligned in practice and have color fringes, and (C) it's expensive to use three times the ink for black. Black ink, on the other hand, is CHEAP. So black is added as a fourth pass to improve quality and lower costs.
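That fourth pass corresponds to simple black generation in RGB-to-CMYK conversion. A naive sketch with full gray-component replacement (real printer profiles use far more elaborate under-color removal curves):

```python
def rgb_to_cmyk(r, g, b):
    """Naive RGB (0..1) to CMYK: pull the gray component that all three
    inks share out of C/M/Y and print it with cheap black ink instead."""
    c, m, y = 1 - r, 1 - g, 1 - b
    k = min(c, m, y)            # the shared gray component
    if k == 1:                  # pure black: use only the K plane
        return 0.0, 0.0, 0.0, 1.0
    s = 1 - k
    return (c - k) / s, (m - k) / s, (y - k) / s, k

print(rgb_to_cmyk(0.5, 0.5, 0.5))  # mid-gray becomes pure K: (0, 0, 0, 0.5)
```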
thegreenling
Posted: Apr 8 2003, 12:00 PM

Unregistered
QUOTE (phaeron @ Mar 15 2003, 04:34 PM): "1) Anime can't be represented with just vectors, as you also have text, gradients, fades, and CG to deal with. 2) The K in CMYK stands for black and really only applies to color separation printing."

1) OK, forget vectors and let's talk about how anime works, so we can compress it more effectively! Like you said, you have text, gradients, fades, and CG to deal with. How could a codec recognize those things so it can store them separately (for example)?

2) Yes, CMYK is a colorspace used by color printers and in TMPGEnc's color correction. What if you raise the contrast on the K plane? And how would you convert that filtering process to RGB mode, so it works without a colorspace conversion?
phaeron
Posted: Apr 9 2003, 02:15 AM

Virtualdub Developer
Group: Administrator
Posts: 7773
Member No.: 61
Joined: 30-July 02
QUOTE (thegreenling): "OK, forget vectors and let's talk about how anime works [...] How can a CODEC recognize those things to save them separated?"

If I knew that, don't you think I would have written my own codec by now?

QUOTE (thegreenling): "Yes, CMYK is a colorspace used on colorprinters [...] so it is without colorspaceconversation?"

What are you talking about?
thegreenling
Posted: Apr 9 2003, 01:53 PM

Unregistered
QUOTE (phaeron @ Apr 8 2003, 08:15 PM): "If I knew that, don't you think I would have written my own codec by now?"

You are a DEVELOPER? OK, maybe you really are able to do such nasty things. A weird question on this: how does the MS Video 1 codec work? It is partially usable on cartoons.

QUOTE (phaeron @ Apr 8 2003, 08:15 PM): "CMYK - What are you talking about?"

Yes, TMPGEnc has a filter that can "correct" this colorspace. If the BLACK plane isn't binary, we can say it's the gray plane -- or am I completely off track?
fccHandler
Posted: Apr 9 2003, 04:00 PM

Administrator n00b
Group: Moderators
Posts: 3961
Member No.: 280
Joined: 13-September 02
QUOTE (thegreenling @ Apr 9 2003, 09:53 AM): "You are a DEVELOPER? [...]"

ROTFLMAO

--------------------
May the FOURCC be with you...
phaeron
Posted: Apr 10 2003, 02:58 AM

Virtualdub Developer
Group: Administrator
Posts: 7773
Member No.: 61
Joined: 30-July 02
I sense Babelfish. If you are using an automated translation program, please say so. Machine-translated text often looks either weird or like an insult.
CMYK filtering doesn't buy you anything, because the CMYK planes are derived from RGB or YCbCr planes. If you had video stored as CMYK, it might be different -- but that would be a strange video format indeed.
Flipbooks from a cereal box, perhaps.
Microsoft Video 1, if I remember correctly, tiles the video frame into 4x4 blocks and then encodes each:
1) As the same block at the same position in the previous frame (punch-through).
2) As a solid block of color.
3) As a two-color block, where each pixel may choose from one of two colors for the block.
4) As an eight-color block, where each 2x2 quadrant within the tile has two colors and each of the four pixels within the quadrant may choose from the two colors for that quadrant.

Notably missing is the 16-color option -- Microsoft Video 1 cannot encode more than 8 colors per tile, and has no provision for motion prediction. That means it will never encode smooth gradients, and can never be lossless. That having been said, MSVideo1 has extremely fast decompression, and can help you determine whether decoding time is your bottleneck or not. It certainly is not anywhere near the state of the art.
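The two-color block type, for instance, can be decoded roughly like this (the bit ordering here is illustrative, not the exact on-disk CRAM layout):

```python
def decode_two_color_block(color_a, color_b, mask):
    """Decode a 'two-color' 4x4 tile: a 16-bit mask selects color_a or
    color_b for each pixel. Returns the tile as 4 rows of 4 pixels."""
    tile = []
    for row in range(4):
        tile.append([
            color_b if (mask >> (row * 4 + col)) & 1 else color_a
            for col in range(4)
        ])
    return tile

# All mask bits clear: a tile filled entirely with color_a
print(decode_two_color_block(0, 255, 0))
```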
thegreenling
Posted: Apr 10 2003, 07:43 PM

Unregistered
QUOTE (fccHandler @ Apr 9 2003, 10:00 AM): "ROTFLMAO"

Doesn't it hurt somehow?

QUOTE (phaeron @ Apr 9 2003, 08:58 PM): "looks either weird or like an insult"

I'm sorry if it sounds like an insult; that was not intended. I just wanted to say (in a strange way): yes, as a DEVELOPER you have the needed knowledge. I don't use an interpreter to translate my weird stuff. The words may disagree in time and space, and it's difficult to separate the FLYCATCHER from the information, but somehow I like it this way.

QUOTE (phaeron @ Apr 9 2003, 08:58 PM): "Video"

1) The most important thing for cartoon encoding is good keyframes; with capture files that's difficult. Filtering along the timeline could help, but how to do it?
2) At the edges of outlines there is flicker (very bad for compression). WarpSharp helps, but the line can even disappear, so you have to use ToonTool or something like that first.
3) Moving parts of the picture have the same problems; a selectively higher smoothing threshold on changed parts could eliminate spots.
4) Some codecs are written for MOVING PICTURES; on cartoons I don't need prediction, because the changes are too strong, and smooth scrolling is very rare.
5) Cartoons mostly have pure colors, or they merge together; textures are hard to separate from noise, especially when scrolled (you get smear effects from temporal filters). The point is: how to blur/blend/merge colors without doing it to textures? Textures live mostly in the gray plane / luma value (like Y in YCbCr).
6) Many parts of the picture only change for 1 or 2 or 3 frames, so they can be recycled, or the change stored as a negative.
7) Sometimes it is easier to filter for what you don't want than for what you expect to get.
fccHandler
Posted: Apr 11 2003, 07:40 AM

Administrator n00b
Group: Moderators
Posts: 3961
Member No.: 280
Joined: 13-September 02
QUOTE (thegreenling @ Apr 10 2003, 03:43 PM): "Doesn't it hurt somehow?"

Yes, and ouch!

Actually, you've made some valid points (if I interpret your posts correctly), so I don't think you are a fool. But will you answer phaeron's question: are you speaking through an interpreter?