User:Graywolf~zhwiki/sandbox

Preface

2011/11/24: the start of a milestone in my life. I currently work on video conferencing and am responsible for the VideoPhone part.

X264vfw (Video for Windows)

Version : x264vfw_34_2008bm_29021
Reference URL : http://sourceforge.net/projects/x264vfw/files/x264vfw/

Encode

Command Line

Command : --preset ultrafast --tune zerolatency --keyint 300 --scenecut 40 --ipratio 1.0 --qcomp 0.3 --cplxblur 30 --aq-mode 1 --qpstep 8 --deblock 1:-1

ultrafast : once applied, the preset overwrites parameters that were set before it, so it has to be specified first. It expands to:

             --no-8x8dct --aq-mode 0 --b-adapt 0
             --bframes 0 --no-cabac --no-deblock
             --no-mbtree --me dia --no-mixed-refs
             --partitions none --rc-lookahead 0 --ref 1
             --scenecut 0 --subme 0 --trellis 0
             --no-weightb --weightp 0

zerolatency :

             --bframes 0 --force-cfr --no-mbtree
             --sync-lookahead 0 --sliced-threads
             --rc-lookahead 0

Parameters :

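       --keyint 300 (maximum interval between keyframes: at least one I-frame every 300 frames), default 250
       --scenecut 40 (threshold used to decide when to insert an I-frame or IDR frame (scene change detection)), default 40
       --ipratio 1.0 (modifies the target average increase in quantizer for I-frames compared to P-frames; larger values raise I-frame quality), default 1.4
       --qcomp 0.3 (quantizer curve compression factor; 0.0 is constant bitrate, 1.0 is constant quantizer), default 0.6
       --cplxblur 30 (applies a gaussian blur with the given radius to the quantizer curve; the quantizer assigned to each frame is
                     blurred with that of its neighbouring frames, which limits quantizer fluctuation), default 20
       --aq-mode 1 (adaptive quantization mode), default 1
       --qpstep 8 (maximum change in quantizer between two frames), default 4
       --deblock 1:1 (taken from settings found online), default 0:0

       --subme 5 (precision of sub-pixel motion estimation. As a simplified illustration, suppose point A (10, 10.25) in Pic1 moves to point B (10, 12.27) in Pic2:
                 with subme 0 the recorded offset is (0, 2), with 1 it is (0, 2.0), and with 2 it is (0, 2.02); in other words, the larger subme is, the finer the image), default 7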

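Notes : When an I-frame is sent, the bitrate spikes to 1.5 to 2 times the configured value (depending on the scene), so I hoped to stabilize the bandwidth through x264 parameters. I tried VBV (Video Buffering Verifier). It does give accurate bandwidth control, but it also suppresses the amount of data an I-frame needs, so I-frames look much worse (blurrier) than P-frames, which was unacceptable.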

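       --vbv-maxrate Bitrate (maximum rate at which the buffer is filled; the larger it is, the higher the bandwidth can spike), default 0
       --vbv-bufsize Bitrate/2 (size of the buffer, which also determines the delay time; with a bitrate of 512 kbps and a bufsize of 512, the delay is 1 second),
                             default 0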
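The relation between buffer size and delay mentioned above is plain arithmetic: the time needed to fill the VBV buffer at the maximum rate is bufsize divided by maxrate. A minimal sketch of that calculation; the function name vbv_delay_seconds is made up for illustration and is not part of x264, and both values are assumed to be in kbit and kbit/s as with the x264 options.

```cpp
// Hypothetical helper, not part of x264: buffering delay implied by a VBV
// configuration. vbv_bufsize is in kbit, vbv_maxrate in kbit/s.
double vbv_delay_seconds(double vbv_bufsize_kbit, double vbv_maxrate_kbps) {
    if (vbv_maxrate_kbps <= 0.0) {
        return 0.0;  // VBV is disabled when maxrate is 0 (the default)
    }
    return vbv_bufsize_kbit / vbv_maxrate_kbps;
}
```

With maxrate 512 and bufsize 512 this gives 1 second; with bufsize set to Bitrate/2 (256) it gives 0.5 seconds.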

Decode

Video

Capture

Display

DirectX(9.0)

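Format conversion

The decoded image is in YV12 format, but not every graphics card supports that format, so we convert YV12 to RGB32, which most graphics cards can draw.

1. Traditional conversion formula:

YV12 to RGB24: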

// Converts a tightly packed YV12 frame to 24-bit pixels stored in B, G, R order, rows written bottom-up.
// Table_fv1, Table_fv2, Table_fu1 and Table_fu2 are 256-entry lookup tables defined elsewhere.
// Returns true on success, false on invalid arguments.
bool YV12_to_RGB24_table(unsigned char* pYV12, unsigned char* pRGB24, int Width, int Height)
{
    if(!pYV12 || !pRGB24)
        return false;

    const long nYLen = long(Width) * Height;
    const int nHfWidth = (Width >> 1);
    if((nYLen < 1) || (nHfWidth < 1))
        return false;

    unsigned char* yData = pYV12;               // Y data
    unsigned char* vData = &yData[nYLen];       // V data follows Y
    unsigned char* uData = &vData[nYLen >> 2];  // U data follows V

    int rgb[3];
    int i, j, m, n, x, y, py, rdif, invgdif, bdif;
    m = -Width;
    n = -nHfWidth;
    bool addhalf = true;

    for(y = 0; y < Height; y++)
    {
        m += Width;
        // move to the next chroma row on every second luma row
        if(addhalf)
        {
            n += nHfWidth;
            addhalf = false;
        }
        else
        {
            addhalf = true;
        }

        for(x = 0; x < Width; x++)
        {
            i = m + x;
            j = n + (x >> 1);
            py = yData[i];

            // search tables to get rdif, invgdif and bdif
            rdif = Table_fv1[vData[j]];                          // fv1
            invgdif = Table_fu1[uData[j]] + Table_fv2[vData[j]]; // fu1+fv2
            bdif = Table_fu2[uData[j]];                          // fu2

            rgb[2] = py + rdif;    // R
            rgb[1] = py - invgdif; // G
            rgb[0] = py + bdif;    // B

            // destination index: rows are written bottom-up, 3 bytes per pixel
            j = nYLen - Width - m + x;
            i = (j << 1) + j;

            // copy this pixel to the rgb data, clamping to 0..255
            for(j = 0; j < 3; j++)
            {
                if((rgb[j] >= 0) && (rgb[j] <= 255))
                {
                    pRGB24[i + j] = rgb[j];
                }
                else
                {
                    pRGB24[i + j] = (rgb[j] < 0)? 0 : 255;
                }
            }
        }
    }
    return true;
}
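The lookup tables Table_fv1, Table_fv2, Table_fu1 and Table_fu2 used by the function above are not defined on this page. The sketch below shows one way they could be filled, assuming the common full-range BT.601 coefficients (approximately R = Y + 1.402(V-128), G = Y - 0.344(U-128) - 0.714(V-128), B = Y + 1.772(U-128)). The coefficients and the name InitYuvTables are assumptions; the original tables may have been built differently.

```cpp
// Assumed definitions of the four tables. The coefficients are the
// full-range BT.601 ones, which is an assumption, not taken from this page.
int Table_fv1[256];  // 1.402 * (V - 128), added to Y to get R
int Table_fv2[256];  // 0.714 * (V - 128), subtracted from Y to get G
int Table_fu1[256];  // 0.344 * (U - 128), subtracted from Y to get G
int Table_fu2[256];  // 1.772 * (U - 128), added to Y to get B

// Hypothetical initializer: fill the tables once before the first conversion.
void InitYuvTables() {
    for (int i = 0; i < 256; i++) {
        const int c = i - 128;  // chroma samples are stored with an offset of 128
        Table_fv1[i] = static_cast<int>(1.402 * c);
        Table_fv2[i] = static_cast<int>(0.714 * c);
        Table_fu1[i] = static_cast<int>(0.344 * c);
        Table_fu2[i] = static_cast<int>(1.772 * c);
    }
}
```

Precomputing the products trades 4 KB of memory for the floating-point multiplications that would otherwise run for every pixel.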

RGB24 to RGB32

// Expands packed 24-bit pixels to 32-bit pixels; the fourth byte of each pixel is set to 0.
// Returns true on success, false on invalid arguments.
bool RGB24_to_RGB32(unsigned char* pRGB24, unsigned char* pRGB32, int Width, int Height)
{
    if(!pRGB24 || !pRGB32)
        return false;

    for(int i = 0; i < Height; i++) // RGB24 to RGB32
    {
        for(int j = 0; j < Width; j++)
        {
            const int src = (i * Width + j) * 3;
            const int dst = (i * Width + j) * 4;
            pRGB32[dst] = pRGB24[src];
            pRGB32[dst + 1] = pRGB24[src + 1];
            pRGB32[dst + 2] = pRGB24[src + 2];
            pRGB32[dst + 3] = 0; // unused fourth byte
        }
    }
    return true;
}

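2. Conversion using a Surface:

Set the parameters (format) of the image to be converted with CreateOffscreenPlainSurface

Drawing

Drawing uses either a Texture or a Surface; the choice mainly depends on the color format of the source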

1. Create the device: CreateDevice(UINT Adapter, D3DDEVTYPE DeviceType, HWND hFocusWindow, DWORD BehaviorFlags, D3DPRESENT_PARAMETERS *pPresentationParameters, IDirect3DDevice9** ppReturnedDeviceInterface);

/*

Adapter, D3DADAPTER_DEFAULT

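DeviceType, D3DDEVTYPE_HAL for hardware rendering (GPU) or D3DDEVTYPE_SW for software rendering (CPU)

hFocusWindow, the window to draw into

BehaviorFlags, vertex processing in hardware (D3DCREATE_HARDWARE_VERTEXPROCESSING) or in software (D3DCREATE_SOFTWARE_VERTEXPROCESSING), plus other related settings

pPresentationParameters, environment settings such as Windowed (windowed mode or not), BackBufferFormat (data format), BackBufferWidth (data width)...

ppReturnedDeviceInterface, receives the DirectX Device interface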

*/

2. Set the render states: SetRenderState(D3DRENDERSTATETYPE State, DWORD Value);

/*

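D3DRS_LIGHTING, lighting control

D3DRS_ALPHABLENDENABLE, alpha blending (transparent textures)

D3DRS_SRCBLEND, source blend mode

D3DRS_DESTBLEND, destination blend mode

D3DRS_BLENDOP, blend operation

D3DRS_CULLMODE, culling mode (which side of a triangle is drawn)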

*/

3. Create the Texture: CreateTexture(UINT Width, UINT Height, UINT Levels, DWORD Usage, D3DFORMAT Format, D3DPOOL Pool, IDirect3DTexture9** ppTexture, HANDLE* pHandle);

/*

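Width, width of the data

Height, height of the data

Levels, 1; the level count can be read back with GetLevelCount()

Usage, 0

Format, data format

Pool, D3DPOOL_MANAGED keeps a backup copy in system memory, so the resource survives a lost device

ppTexture, receives the Texture object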

pHandle, 0

*/

4. Create the vertex information: CreateVertexBuffer(UINT Length, DWORD Usage, DWORD FVF, D3DPOOL Pool, IDirect3DVertexBuffer9** ppVertexBuffer, HANDLE* pHandle);

/*

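Length, size in bytes: for 4 vertices, 4 times the size of the vertex structure

Usage, 0

FVF, the vertex format, for example D3DFVF_XYZ for coordinates in 3D space

Pool, D3DPOOL_MANAGED keeps a backup copy in system memory, so the resource survives a lost device

ppVertexBuffer, receives the vertex buffer object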

pHandle, 0

*/

CreateIndexBuffer(UINT Length, DWORD Usage, D3DFORMAT Format, D3DPOOL Pool, IDirect3DIndexBuffer9** ppIndexBuffer, HANDLE* pHandle);

/*

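Length, size in bytes: the number of indices times the index size (2 bytes for D3DFMT_INDEX16)

Usage, 0

Format, D3DFMT_INDEX16 (indices are 16 bits each)

Pool, D3DPOOL_MANAGED keeps a backup copy in system memory, so the resource survives a lost device

ppIndexBuffer, receives the index buffer object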

pHandle, 0

*/
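For drawing a video frame, the four vertices and the index buffer typically describe one rectangle made of two triangles. The sketch below shows the data that could be copied into the two buffers. The struct name QuadVertex, the coordinates and the use of texture coordinates are assumptions for illustration; the layout corresponds to the FVF D3DFVF_XYZ | D3DFVF_TEX1, and no Direct3D header is needed to compile it.

```cpp
#include <cstdint>

// Hypothetical vertex layout matching the FVF D3DFVF_XYZ | D3DFVF_TEX1:
// a 3D position followed by one pair of texture coordinates.
struct QuadVertex {
    float x, y, z;  // position
    float u, v;     // texture coordinates (0..1 across the video frame)
};

// Four corners of a rectangle (assumed coordinates).
const QuadVertex kQuadVertices[4] = {
    {-1.0f,  1.0f, 0.0f, 0.0f, 0.0f},  // 0: top-left
    { 1.0f,  1.0f, 0.0f, 1.0f, 0.0f},  // 1: top-right
    {-1.0f, -1.0f, 0.0f, 0.0f, 1.0f},  // 2: bottom-left
    { 1.0f, -1.0f, 0.0f, 1.0f, 1.0f},  // 3: bottom-right
};

// Two clockwise triangles as a triangle list: 6 indices of 16 bits each,
// which is what D3DFMT_INDEX16 expects.
const std::uint16_t kQuadIndices[6] = {0, 1, 2, 2, 1, 3};

// Byte sizes for the Length arguments, and the Stride for SetStreamSource.
const unsigned kVertexBufferBytes = sizeof(kQuadVertices);
const unsigned kIndexBufferBytes = sizeof(kQuadIndices);
const unsigned kVertexStride = sizeof(QuadVertex);
```

With a triangle list, 4 vertices and 6 indices yield the 2 primitives that the draw call later asks for.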

5. Start drawing

Clear(DWORD Count, const D3DRECT *pRects, DWORD Flags, D3DCOLOR Color, float Z, DWORD Stencil);

/*

Count, 0

pRects, NULL

Flags, D3DCLEAR_TARGET (clear the render target to Color)

Color, the color to clear to

Z, 1.0f

Stencil, 0

*/

BeginScene()

SetTexture(DWORD Stage,IDirect3DBaseTexture9 *pTexture);

/*

Stage, 0 (pTexture is bound to stage 0)

pTexture, the Texture object to bind

*/

LockRect(UINT Level, D3DLOCKED_RECT *pLockedRect, CONST RECT *pRect, DWORD Flags);

memcpy(pLockedRect->pBits, pData, size);

UnlockRect(UINT Level)

SetStreamSource(UINT StreamNumber, IDirect3DVertexBuffer9 *pStreamData, UINT OffsetInBytes, UINT Stride);

/*

StreamNumber, 0

pStreamData, ppVertexBuffer

OffsetInBytes, 0

Stride, size of the vertex structure

*/

SetFVF(DWORD FVF);

SetIndices(IDirect3DIndexBuffer9 *pIndexData);

DrawIndexedPrimitive(D3DPRIMITIVETYPE Type, INT BaseVertexIndex, UINT MinIndex, UINT NumVertices, UINT StartIndex, UINT PrimitiveCount);

/*

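Type, the primitive type (for example D3DPT_TRIANGLELIST); triangles are drawn in index buffer order

BaseVertexIndex, 0 (start from vertex 0)

MinIndex, 0 (the smallest index value used)

NumVertices, 4 (number of vertices involved)

StartIndex, 0 (start reading at index 0 of the index buffer)

PrimitiveCount, 2 (two primitives of the given Type)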

*/

EndScene();

Present(NULL, NULL, NULL, NULL);
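One detail of the LockRect / memcpy / UnlockRect step above: a single memcpy of the whole frame is only safe when the Pitch reported in D3DLOCKED_RECT equals the row size of the source data, and a driver may pad each row. A minimal sketch of a row-by-row copy that handles differing strides; CopyRows is a hypothetical helper, written without Direct3D headers so it compiles anywhere.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy an image row by row when source and destination
// use different row strides. With Direct3D 9, dst would be
// D3DLOCKED_RECT::pBits and dstPitch would be D3DLOCKED_RECT::Pitch.
void CopyRows(unsigned char* dst, std::size_t dstPitch,
              const unsigned char* src, std::size_t srcPitch,
              std::size_t rowBytes, std::size_t rows) {
    for (std::size_t y = 0; y < rows; y++) {
        std::memcpy(dst + y * dstPitch, src + y * srcPitch, rowBytes);
    }
}
```

For an RGB32 frame, rowBytes would be Width * 4 and rows would be Height.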

Text overlay

OpenGL

Color Space

1. RGB24

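Each pixel is made up of the samples at the same position in three planes R, G and B of identical width and height; for example, point P(a, b) of the image is composed of R(a, b), G(a, b) and B(a, b)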

2. YV12

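An image of size (Width, Height) in YV12 format is made up of three planes, Y(Width, Height), V(Width/2, Height/2) and U(Width/2, Height/2); their sizes and relative positions are shown in Figures 1.1 to 1.3

 For example, point P(a, b) of a YV12 image is composed as follows (each 2x2 block of pixels shares one V and one U sample):
 P(0, 0) is composed of Y(0, 0), V(0, 0), U(0, 0)

 P(0, 1) is composed of Y(0, 1), V(0, 0), U(0, 0)

 P(1, 0) is composed of Y(1, 0), V(0, 0), U(0, 0)

 P(1, 1) is composed of Y(1, 1), V(0, 0), U(0, 0)

Figure 1.1 Relative position of Y in the image
Figure 1.2 Relative position of V in the image
Figure 1.3 Relative position of U in the image

The memory layout is shown in Figure 1.4: the Y plane is Width * Height bytes, the V plane is Width/2 * Height/2 bytes, and the U plane is Width/2 * Height/2 bytes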
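The plane layout described above can be expressed as offsets into one buffer. A minimal sketch, assuming a tightly packed YV12 buffer with even width and height and no row padding; the struct and function names are made up for illustration.

```cpp
#include <cstddef>

// Hypothetical helper describing where the samples of pixel (x, y) live
// in a tightly packed YV12 buffer (even width and height, no row padding).
struct Yv12Offsets {
    std::size_t y;  // offset of the Y sample
    std::size_t v;  // offset of the V sample
    std::size_t u;  // offset of the U sample
};

Yv12Offsets Yv12OffsetsAt(int width, int height, int x, int y) {
    const std::size_t ySize = static_cast<std::size_t>(width) * height;
    const std::size_t chromaWidth = static_cast<std::size_t>(width) / 2;
    const std::size_t chromaSize = ySize / 4;
    // Each chroma sample is shared by a 2x2 block of pixels.
    const std::size_t chromaIndex = (y / 2) * chromaWidth + (x / 2);
    Yv12Offsets o;
    o.y = static_cast<std::size_t>(y) * width + x;
    o.v = ySize + chromaIndex;               // the V plane follows Y
    o.u = ySize + chromaSize + chromaIndex;  // the U plane follows V
    return o;
}
```

For a 4x4 image, pixel (1, 1) maps to Y offset 5, V offset 16 and U offset 20, so all four pixels of the top-left 2x2 block read the same V and U bytes.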

3. I420

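An image in I420 format is the same as YV12 as far as plane sizes go; the only difference is the memory position of the V and U planes, which are swapped. The memory layout is shown in Figure 2.1.

 For example, point P(a, b) of an I420 image is composed as follows:
 P(0, 0) is composed of Y(0, 0), U(0, 0), V(0, 0)

 P(0, 1) is composed of Y(0, 1), U(0, 0), V(0, 0)

 P(1, 0) is composed of Y(1, 0), U(0, 0), V(0, 0)

 P(1, 1) is composed of Y(1, 1), U(0, 0), V(0, 0)

Figure 2.1 I420, memory layout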
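Because the two formats differ only in the order of the chroma planes, converting between them amounts to swapping those planes. A minimal sketch, assuming tightly packed buffers; SwapChromaPlanes is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: convert a tightly packed YV12 frame to I420 (or back,
// since the operation is its own inverse) by swapping the two chroma planes.
// src and dst must not overlap and must each hold width * height * 3 / 2 bytes.
void SwapChromaPlanes(const unsigned char* src, unsigned char* dst,
                      int width, int height) {
    const std::size_t ySize = static_cast<std::size_t>(width) * height;
    const std::size_t chromaSize = ySize / 4;
    // The Y plane is identical in both formats.
    std::memcpy(dst, src, ySize);
    // The second chroma plane of the source becomes the first of the result.
    std::memcpy(dst + ySize, src + ySize + chromaSize, chromaSize);
    // The first chroma plane of the source becomes the second of the result.
    std::memcpy(dst + ySize + chromaSize, src + ySize, chromaSize);
}
```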

4. Conversion

Color format conversion is done with FFmpeg

(1) YV12 to YUY2

SwsContext *img_convert_ctx;
unsigned char *YUY2Buf; // output buffer, must hold Width * Height * 2 bytes

// Initialize (newer FFmpeg releases name these AV_PIX_FMT_YUV420P and AV_PIX_FMT_YUYV422)
img_convert_ctx = sws_getContext(Width, Height, PIX_FMT_YUV420P, Width, Height, PIX_FMT_YUYV422, SWS_POINT, NULL, NULL, NULL);

// Transfer (four entries each, because swscale reads up to four planes)
unsigned char *inbuf[4] = {NULL, NULL, NULL, NULL};
unsigned char *outbuf[4] = {NULL, NULL, NULL, NULL};
int inlinesize[4] = {Width, Width/2, Width/2, 0};
int outlinesize[4] = {Width * 2, 0, 0, 0};

inbuf[0] = (unsigned char*)malloc(Width * Height);
inbuf[1] = (unsigned char*)malloc(Width * Height >> 2);
inbuf[2] = (unsigned char*)malloc(Width * Height >> 2);
outbuf[0] = (unsigned char*)malloc(Width * Height * 2);

// YUV420P expects the planes in Y, U, V order, while YV12 stores them as Y, V, U
memcpy(inbuf[0], YV12Buf, Width * Height);                                   // Y
memcpy(inbuf[1], YV12Buf + (Width * Height * 5 >> 2), Width * Height >> 2);  // U
memcpy(inbuf[2], YV12Buf + (Width * Height), Width * Height >> 2);           // V

sws_scale(img_convert_ctx, inbuf, inlinesize, 0, Height, outbuf, outlinesize);
memcpy(YUY2Buf, outbuf[0], Width * Height * 2);

// free every buffer that was allocated
free(inbuf[0]);
free(inbuf[1]);
free(inbuf[2]);
free(outbuf[0]);

// Release
sws_freeContext(img_convert_ctx);

Audio

RTP Info

1. Marker bit & PayLoad type

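The Marker bit and the PayLoad type can be read from the second byte of the RTP header. The Marker bit indicates that the packet is the last packet of a frame, that is, its end; the PayLoad type for video is in the dynamic range 96 to 127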

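// Field order for compilers that allocate bit-fields starting from the least significant bit
// (for example MSVC on x86); the marker is the most significant bit of the byte.
unsigned char payloadtype:7;
unsigned char marker:1;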
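Since bit-field layout depends on the compiler, the same two fields can also be extracted with masks, which behaves the same on every platform. A minimal sketch; the struct and function names are made up for illustration, and the bit positions follow the fixed RTP header of RFC 3550.

```cpp
#include <cstdint>

// Fields carried in the second byte of the fixed RTP header (RFC 3550).
struct RtpSecondByte {
    bool marker;               // M bit: most significant bit of the byte
    std::uint8_t payloadType;  // PT: lower 7 bits of the byte
};

// Hypothetical helper: header points at the first byte of the RTP header.
RtpSecondByte ParseRtpSecondByte(const std::uint8_t* header) {
    RtpSecondByte out;
    out.marker = (header[1] & 0x80) != 0;
    out.payloadType = header[1] & 0x7F;
    return out;
}
```

A second byte of 0xE0 decodes to marker set and payload type 96; 0x60 decodes to marker clear and payload type 96.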

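2. Exponential-Golomb code

For video, what we care about most is how to obtain the width and height. In an H264 bitstream the width and height are stored using Exponential-Golomb coding, inside the SPS (Sequence Parameter Set)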

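Put simply, an Exponential-Golomb code is read by skipping n "0" bits until a "1" is met and then reading n more bits; the decoded value is 2^n - 1 plus the value of those n bits. Examples, grouped by the number of bits a code occupies:

1 bit -> 1, the value is "0"

3 bits -> prefix 01; for example 011: one bit is skipped before the "1", so 1 more bit is read and the value is "2"; for 010 the value is "1"

5 bits -> prefix 001; for example 00101: two bits are skipped before the "1", so 2 more bits are read and the value is "4"

7 bits -> prefix 0001; for example 0001011: three bits are skipped before the "1", so 3 more bits are read and the value is "10"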

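When the stream is analyzed with Wireshark, the size fields are named "pic_width_in_mbs_minus1" and "pic_height_in_map_units_minus1"; adding 1 to the decoded value and multiplying by 16 gives the width or height of the frame (for frame-coded video without cropping)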
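The decoding rule can be sketched in a few lines. This is a simplified illustration, not a full SPS parser: the names BitReader, ReadUe and PixelsFromMbsMinus1 are made up, there is no bounds checking, and the removal of H.264 emulation-prevention bytes that a real parser needs is left out.

```cpp
#include <cstddef>
#include <cstdint>

// Minimal MSB-first bit reader over a byte buffer (hypothetical helper).
// No bounds checking and no removal of emulation-prevention bytes.
struct BitReader {
    const std::uint8_t* data;
    std::size_t bitPos;  // index of the next bit to read

    unsigned ReadBit() {
        const unsigned bit = (data[bitPos / 8] >> (7 - bitPos % 8)) & 1u;
        bitPos++;
        return bit;
    }
};

// Decode one unsigned Exp-Golomb value: count n leading zero bits, consume
// the terminating 1, read n more bits, result = 2^n - 1 + those bits.
unsigned ReadUe(BitReader& br) {
    unsigned zeros = 0;
    while (br.ReadBit() == 0 && zeros < 31) {
        zeros++;
    }
    unsigned suffix = 0;
    for (unsigned i = 0; i < zeros; i++) {
        suffix = (suffix << 1) | br.ReadBit();
    }
    return (1u << zeros) - 1u + suffix;
}

// Size in pixels from a "..._minus1" macroblock count, ignoring cropping.
unsigned PixelsFromMbsMinus1(unsigned mbsMinus1) {
    return (mbsMinus1 + 1u) * 16u;
}
```

For instance, a decoded pic_width_in_mbs_minus1 of 39 corresponds to a width of 640 pixels.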