Thursday, March 27, 2008

Important notes

Negotiating Pin Connections:
When the Filter Graph Manager tries to connect two filters, the pins must agree on various things. If they cannot, the connection attempt fails. Generally, pins negotiate the following:
1. Transport
2. Media type
3. Allocator

Transport:
The transport is the mechanism that the filters will use to move media samples from the output pin to the input pin. For example, they can use the IMemInputPin interface ("push model") or the IAsyncReader interface ("pull model").

Media type:
Almost all pins use media types to describe the format of the data they will deliver.

Allocator:
The allocator is the object that creates the buffers that hold the data. The pins must agree which pin will provide the allocator. They must also agree on the size of the buffers, the number of buffers to create, and other buffer properties.
The base classes implement a framework for these negotiations. You must complete the details by overriding various methods in the base class. The set of methods that you must override depends on the class and on the functionality of your filter.
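
To make the media type step concrete, here is a standalone toy model of the negotiation. It does not use the DirectShow base classes or types; every name in it (ToyMediaType, ToyInputPin, NegotiateMediaType) is hypothetical and only mirrors the idea that the output pin proposes types and the input pin accepts or rejects each one.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a media type; not the DirectShow AM_MEDIA_TYPE structure.
struct ToyMediaType {
    std::string major;    // e.g. "video"
    std::string subtype;  // e.g. "YUY2"
    bool operator==(const ToyMediaType& other) const {
        return major == other.major && subtype == other.subtype;
    }
};

// Plays the role of an input pin's type check: a proposed type is accepted
// only if it appears in the pin's supported list.
struct ToyInputPin {
    std::vector<ToyMediaType> supported;
    bool Accepts(const ToyMediaType& mt) const {
        for (const ToyMediaType& s : supported) {
            if (s == mt) return true;
        }
        return false;
    }
};

// The output pin offers its types in order of preference. The index of the
// first accepted type is returned, or -1 if the pins cannot agree (the case
// in which the connection attempt fails).
int NegotiateMediaType(const std::vector<ToyMediaType>& offered,
                       const ToyInputPin& input) {
    for (std::size_t i = 0; i < offered.size(); ++i) {
        if (input.Accepts(offered[i])) return static_cast<int>(i);
    }
    return -1;
}
```

In a real filter the same decision is made inside the base-class methods you override, using the SDK's own media type structures.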

Processing and Delivering Data:
The primary function of most filters is to process and deliver media data. How that occurs depends on the type of filter:
A push source has a worker thread that continuously fills samples with data and delivers them downstream.
A pull source waits for its downstream neighbor to request a sample. It responds by writing data into a sample and delivering the sample to the downstream filter. The downstream filter creates the thread that drives the data flow.
A transform filter has samples delivered to it by its upstream neighbor. When it receives a sample, it processes the data and delivers it downstream.
A renderer filter receives samples from upstream, and schedules them for rendering based on the time stamps.

Tuesday, March 25, 2008

RTP audio filter problem in latency...

1. Yesterday we completed the packet-loss adjusting mechanism.
2. Today we improved the audio quality by implementing IAMPushSource on my live source filter's output pin and setting the latency as follows:
GetLatency(){ *plLatency = 3500000; return S_OK;} // 3500000 x 100 ns = 350 ms
I still faced a latency of 2 to 3 seconds, so I modified things as follows:
STDMETHODIMP CRTPAudioStream::GetPushSourceFlags(ULONG *pFlags)
{
//The filter time stamps the samples using a private clock.
//The clock is not available to the rest of the graph through IReferenceClock.
*pFlags = AM_PUSHSOURCECAPS_PRIVATE_CLOCK;
return S_OK;
}
Now everything works fine. Within the filter, we implemented the interfaces as follows:

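class CRTPFilter: public CSourceStream, public IAMFilterMiscFlags
{
public: // IAMFilterMiscFlags override
    // A source filter reports AM_FILTER_MISC_FLAGS_IS_SOURCE here.
    STDMETHODIMP_(ULONG) GetMiscFlags();
};

class CRTPAudioStream : public CSourceStream,public IAMPushSource
{
public: // IAMPushSource methods
    STDMETHODIMP GetMaxStreamOffset(REFERENCE_TIME *prtMaxOffset)
    {return E_NOTIMPL;}
    STDMETHODIMP GetPushSourceFlags(ULONG *pFlags)
    {return E_NOTIMPL;}
    STDMETHODIMP GetStreamOffset(REFERENCE_TIME *prtOffset)
    {return E_NOTIMPL;}
    STDMETHODIMP SetMaxStreamOffset(REFERENCE_TIME rtMaxOffset)
    {return E_NOTIMPL;}
    STDMETHODIMP SetPushSourceFlags(ULONG Flags)
    {return E_NOTIMPL;}
    STDMETHODIMP SetStreamOffset(REFERENCE_TIME rtOffset)
    {return E_NOTIMPL;}
    STDMETHODIMP GetLatency(REFERENCE_TIME *prtLatency)
    { * prtLatency = 4500000; return S_OK;} //Set 450ms latency for audio (100 ns units)
};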
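
Because REFERENCE_TIME counts 100-nanosecond units, it is easy to slip by a factor of ten when writing latency constants by hand. A minimal sketch of helper functions for the conversion follows; the ReferenceTime alias and the function names are hypothetical stand-ins so the code builds without the DirectShow headers.

```cpp
#include <cstdint>

// Stand-in for the SDK's REFERENCE_TIME: a 64-bit count of 100 ns units.
using ReferenceTime = std::int64_t;

// 1 ms = 1,000,000 ns = 10,000 units of 100 ns.
constexpr ReferenceTime kUnitsPerMillisecond = 10000;

constexpr ReferenceTime MillisecondsToReferenceTime(std::int64_t ms) {
    return ms * kUnitsPerMillisecond;
}

constexpr std::int64_t ReferenceTimeToMilliseconds(ReferenceTime rt) {
    return rt / kUnitsPerMillisecond;
}
```

With these helpers, a 450 ms latency is MillisecondsToReferenceTime(450), that is 4500000, and the earlier value of 3500000 corresponds to 350 ms.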

Saturday, March 15, 2008

CSource Filter architecture with memory mapped file

For a network source filter, we are not able to receive and render the data efficiently. If we receive the packets, queue them, and render them all inside the source filter, it takes too much CPU time. So we can do it a different way, as follows:

Within the source filter, we read data from a memory-mapped file. A separate application reads data from the network and writes it to the memory-mapped file. Both the source filter and the network receiving application must reside on the same system. Done this way, we can render the received audio and video data without any problem, and we can also queue the received network packets.

I have seen an application built like this, in which the source filter renders the data held in a memory-mapped file:
1. A camera source filter renders the data from the memory-mapped file.
2. Another application creates the filter graph with the source filter and runs a thread that receives packets from the network and queues them in a data structure.
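
The queueing data structure mentioned in item 2 could look roughly like the sketch below. It is only an illustration in portable C++: the names Packet and PacketQueue are hypothetical, and dropping the oldest packet when the queue is full is an assumed policy, not something the application described above is known to do.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

using Packet = std::vector<std::uint8_t>;

// Bounded, thread-safe FIFO shared by the network thread and the consumer.
class PacketQueue {
public:
    explicit PacketQueue(std::size_t capacity) : capacity_(capacity) {}

    // Called by the network thread. When the queue is full, the oldest
    // packet is dropped so that a stalled consumer cannot grow memory
    // without bound.
    void Push(Packet packet) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (capacity_ == 0) return;
        if (packets_.size() == capacity_) packets_.pop_front();
        packets_.push_back(std::move(packet));
    }

    // Called by the consumer. Returns false immediately when nothing is
    // queued instead of blocking.
    bool TryPop(Packet& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (packets_.empty()) return false;
        out = std::move(packets_.front());
        packets_.pop_front();
        return true;
    }

    std::size_t Size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return packets_.size();
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::deque<Packet> packets_;
};
```

The receiving thread would call Push() for each packet, and the code that feeds the memory-mapped file or the filter would call TryPop().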

Add Worker Thread in Source Filter derived from CSource class

How to create threads within a CSource-derived class:
Within the PushDesktop source filter, I created a thread. Within this thread, I incremented an integer. Within the FillBuffer() function, I displayed the integer value.
ThreadProc() {
i++; SetEvent( m_hEvent); }
FillBuffer() { WaitForSingleObject( m_hEvent, INFINITE); /* display i */ }
The following code is not working:
ThreadProc() {
while(true) {
dwWaitObject = WaitForSingleObject(m_hStopEvent, 5);
if(dwWaitObject == WAIT_OBJECT_0) { return 0; }
SetEvent( m_hEvent);
}
}
FillBuffer() { WaitForSingleObject( m_hEvent, INFINITE); /* display i */ }

Every second, the video renderer waits for data from the source filter. So if you change
dwWaitObject = WaitForSingleObject(m_hStopEvent, 5);
to
dwWaitObject = WaitForSingleObject(m_hStopEvent, 2);
then it works.
We must not take that much time in the thread.
One more thing to try:
ThreadProc() { ThreadCB(); }
ThreadCB() { WaitForSingleObject(); printf("\n sundar"); }

Wednesday, March 12, 2008

RTP Audio Source Filter Problem without Glitches

Solution for Removing Audio Glitches in a RTP Audio Source Filter
1. Test the timestamps of large files.
- They also have some strange values, like Start Time : 40000 and End Time : 0.
2. Set only the start time and leave the stop time always 0.
3. Dump the RTP packets to a file. (No problem with the audio data.)
4. Implement the IAMPushSource interface.
I tested a large MP2 file in GraphEdit:
If I inserted the dummy filter before the MPEG1 Audio Decoder filter, I got the following:
Start Time : 28114700 Stop Time : 23998200 Diff : 4116500
Another time I got it as follows:
Start Time : 31588108 Stop Time : 28454648
Difference : 3133460

If I inserted the dummy filter after the MPEG1 Audio Decoder, I got the timestamp values as follows:
Grabber Sample size is : 18432 Start Time :0 Stop Time :0
Grabber Sample size is : 18432 Start Time :960000 Stop Time :0
Grabber Sample size is : 18432 Start Time : 1920000 Stop Time : 0
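
The 960000 spacing between these start times is consistent with the sample size, under the assumption that the decoder outputs 16-bit stereo PCM at 48 kHz (the format is not stated here). A small sketch of the arithmetic, with a hypothetical function name:

```cpp
#include <cstdint>

// Duration of a block of uncompressed PCM audio in 100 ns units.
// bytesPerSample is per channel (2 for 16-bit audio).
std::int64_t PcmDuration100ns(std::int64_t byteCount, int sampleRate,
                              int channels, int bytesPerSample) {
    // One sample frame holds one sample for every channel.
    const std::int64_t sampleFrames = byteCount / (channels * bytesPerSample);
    // 10,000,000 units of 100 ns make up one second.
    return sampleFrames * 10000000 / sampleRate;
}
```

Under those assumptions, 18432 bytes hold 4608 sample frames, which last 96 ms, or 960000 units of 100 ns.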
For the RTP Source Filter I got the following:
-----------------------------------------------
1. If I inserted the dummy filter before the MPEG1 Audio Decoder filter, I got the following:
Start Time    Stop Time
2520000       2520000
2520000       2835000
2835000       3150000
2. If I inserted the dummy filter after the MPEG1 Audio Decoder, I got the timestamp values as follows:
Grabber Sample Size is : 18432 Start Time : 2520000 Stop Time : 0
Start Time : 2835000 Stop Time : 0
Start Time : 3150000 Stop Time : 0
Start Time : 3465000 Stop Time : 0
Start Time : 3780000 Stop Time : 0
None of the above works.

I am sending MP2 audio data at a 48 kHz sampling frequency. At the RTP Audio Source Filter, I changed the frequency to 22.05 kHz (22050 samples per second).

The samples-per-second value acts as the speed controller of the filter graph: it determines how fast the audio is played.

At 48000 (48 kHz) the filter runs fast.
At 44100 (44.1 kHz) the filter runs somewhat slower than at 48000.
At 22050 (22.05 kHz) it is very slow.
With this setting we are rendering the audio at about half the speed of the transmitted audio.
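
The "about half" figure can be checked with a one-line ratio. This is only a sketch of the arithmetic, and the function name is hypothetical.

```cpp
// Ratio of playback speed to original speed when audio recorded at
// actualRateHz is rendered as if it had been sampled at declaredRateHz.
double PlaybackSpeedRatio(double declaredRateHz, double actualRateHz) {
    return declaredRateHz / actualRateHz;
}
```

For 48000 Hz audio declared as 22050 Hz the ratio is 22050 / 48000, roughly 0.46, so the audio plays at a little under half speed.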

Tuesday, March 11, 2008

RTP source Filter

RTP Source Filter :
The RTP Source Filter just calls the APIs in the RTP Stack DLL.
We call the RTP Stack DLL APIs as follows:

RTPReceiveVideo();
if( HasMPEG4Frames() == TRUE)
{ GetMPEG4Frame(pbBufferdata,pbBufferSize,iTimestamp); }
The RTP parser is implemented inside this RTP Stack DLL.
Normally only key frames are split across more than one RTP packet.
So if a frame is split across more than one RTP packet, we put the packets in a queue.
Only when we have received all the consecutive RTP packets up to the end of that frame does HasMPEG4Frames() return TRUE; otherwise we discard the packets.
ffmpeg starts every MPEG4 encoded frame in a new packet.
If a frame spans more than one packet, say three RTP packets, the third packet does not also carry the start of the next frame; the next frame begins in the fourth RTP packet.
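
The reassembly rule described above can be sketched as follows. This is not the code of the RTP Stack DLL. RtpPacket and FrameAssembler are hypothetical names, and the sketch assumes the sender sets the RTP marker bit on the last packet of each frame, which is how the MPEG-4 RTP payload format signals the end of a VOP.

```cpp
#include <cstdint>
#include <vector>

// Simplified view of a parsed RTP packet.
struct RtpPacket {
    std::uint16_t sequence;             // RTP sequence number
    bool marker;                        // assumed set on the last packet of a frame
    std::vector<std::uint8_t> payload;  // slice of the encoded frame
};

class FrameAssembler {
public:
    // Feeds one packet. Returns true when `frame` has been filled with a
    // complete frame. A frame with a missing packet is discarded.
    bool Push(const RtpPacket& packet, std::vector<std::uint8_t>& frame) {
        const bool inOrder = haveLast_ &&
            packet.sequence == static_cast<std::uint16_t>(lastSequence_ + 1);
        // The very first packet, or a gap in sequence numbers, means the
        // frame currently being collected cannot be trusted.
        if (!inOrder) damaged_ = true;
        lastSequence_ = packet.sequence;
        haveLast_ = true;

        if (!damaged_) {
            buffer_.insert(buffer_.end(), packet.payload.begin(),
                           packet.payload.end());
        }
        if (!packet.marker) return false;

        // Marker seen: the frame ends here and the next packet starts a new one.
        const bool complete = !damaged_;
        if (complete) frame = buffer_;
        buffer_.clear();
        damaged_ = false;
        return complete;
    }

private:
    std::vector<std::uint8_t> buffer_;
    std::uint16_t lastSequence_ = 0;
    bool haveLast_ = false;
    bool damaged_ = false;
};
```

Because the first packet ever received is treated as if it arrived mid-frame, nothing is delivered until the first marker has been seen. That is deliberately conservative: a frame is returned only when every one of its packets arrived in sequence.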

MPEG4 Decoder Filter Problem

MPEG4 Decoder Filter:
Within the MPEG4 decoder filter, we allow only a key frame the first time, by checking the following:
if( m_bFirstFrame)
{
if ( (pSourceBuffer[0] == (unsigned char) 0x0) && (pSourceBuffer[1] == (unsigned char) 0x0) && (pSourceBuffer[2] == (unsigned char) 0x1) &&
( (pSourceBuffer[3] == (unsigned char) 0xB0) || // VOS start code
( (pSourceBuffer[3] == (unsigned char) 0xB6) && // VOP start code with key frame (I-VOP)
( (pSourceBuffer[4] & 0xC0) == 0x00) ) ||
(pSourceBuffer[3] == (unsigned char) 0x20) // VOL start code
) )
{
m_bFirstFrame = false;
}
else {
//Wait For KeyFrame
if (NOERROR == pSource->GetTime(&TimeStart, &TimeEnd))
{ pDest->SetTime(&TimeStart, &TimeEnd); }
return NOERROR;
}
}

Once we have received the first key frame, we pass whatever we get from the RTP source filter to the MPEG4 decoder DLL APIs.
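
The start-code test can be isolated into a small function that is easy to unit-test outside the filter. The sketch below is one possible form, with a hypothetical function name. It accepts the whole VOL start-code range 0x20 to 0x2F rather than only 0x20, which is a widening compared with the check in the filter.

```cpp
#include <cstddef>
#include <cstdint>

// Returns true if the buffer begins with an MPEG-4 Part 2 start code from
// which decoding can safely begin.
bool StartsDecodableMpeg4(const std::uint8_t* data, std::size_t size) {
    if (data == nullptr || size < 4) return false;
    // Every start code begins with the prefix 00 00 01.
    if (data[0] != 0x00 || data[1] != 0x00 || data[2] != 0x01) return false;

    const std::uint8_t code = data[3];
    if (code == 0xB0) return true;                  // VOS start code
    if (code >= 0x20 && code <= 0x2F) return true;  // VOL start code
    if (code == 0xB6) {                             // VOP start code
        // The top two bits of the next byte hold the coding type; 00 = I-VOP.
        return size >= 5 && (data[4] & 0xC0) == 0x00;
    }
    return false;
}
```

The size check also guards against reading pSourceBuffer[4] when a sample holds only the four start-code bytes.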

Note :
The MPEG4 decoder DLL APIs support only frame-by-frame decompression.
That means we can pass only one frame at a time to the MPEG4 decoder DLL APIs; otherwise the MPEG4 decoder fails.

Audio Source Problem

Description about the Audio Problem :
1. Within the audio source filter we set the configuration as follows:
i) The sampling frequency is 48 kHz.
ii) We set the timestamps from the RTP packet as follows:
m_rtStart = m_rtStop; m_rtStop = m_iRTPPacketTimestamp ;
The difference between m_rtStart and m_rtStop is 15120.
AudioFrameNo StartTime Stop Time
1 0 15120
2 15120 30240
3 30240 45360 and so on
Within the source filter, we dumped the received audio data to an MP2 file without timestamp information. When we played the MP2 file, it worked well.
2. The ffmpeg command for capturing data from the audio source is as follows:
ffmpeg -f audio_device -i /dev/dsp -acodec mp2 -ar 48000 -ab 64kb -f rtp rtp://
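
In item 1 the RTP timestamp is assigned directly to m_rtStop. DirectShow sample times are in 100 ns units, while RTP timestamps count ticks of the payload clock, so a conversion along the following lines may be worth trying. This is a sketch under assumptions not stated in the post: a 90 kHz RTP clock, which is the rate RFC 3551 lists for MPEG audio (payload type 14), and 1152 samples per MPEG-1 Layer II frame. The function names are hypothetical.

```cpp
#include <cstdint>

// Converts a span of RTP timestamp ticks to 100 ns units.
// clockRateHz is the RTP clock rate of the payload format.
std::int64_t RtpTicksTo100ns(std::uint32_t ticks, std::uint32_t clockRateHz) {
    return static_cast<std::int64_t>(ticks) * 10000000 / clockRateHz;
}

// Duration of one MPEG-1 Layer II frame (1152 samples) in 100 ns units.
std::int64_t Mp2FrameDuration100ns(std::uint32_t sampleRateHz) {
    return static_cast<std::int64_t>(1152) * 10000000 / sampleRateHz;
}
```

Under these assumptions one MP2 frame at 48 kHz lasts 24 ms, which is 2160 RTP ticks or 240000 units of 100 ns.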

Application Architecture and filter development ideas

Some Application architecture :
Client - Wrap the VLC player in a client application and control it.
Server - Wrap ffmpeg and control it using a server application.
We can broadcast with the server and receive with the client.
RTSP Application Architecture
RTSP client: Wrap the live555 RTSP receiver.
RTSP server: Wrap the live555 server.

As a hobby, develop filters around this open source code; then we will get more ideas.

Wednesday, March 05, 2008

MPEG4 Decoder Problem

Somewhere in the middle of the stream, the MPEG4 decoder fails.
If we use the MPEG4 decoder with the MPEG4 demux, there is no problem.
But if we use it with the RTP video source filter, we get the problem.
Reason:
-----------
This is a problem with the MPEG4 decoder; if we use any free DivX decoder filter, it works well. If we start receiving from the middle of the stream, the RTP source filter does not send a key frame first. This is the problem.
I modified the decoder filter code to handle it as follows:
The first time, we allow a frame to be displayed only if we receive a VOS header, identified by the VOS start code (00 00 01 B0); otherwise we just drop the frame. But this did not work. It seems that ffmpeg is not sending the VOS header.
So I added code to check for the VOP start code and the frame type.
if (bFirstFrame)
if( VOS start code or ( VOP start code and frametype == KeyFrame))
bFirstFrame = false;
else { return S_FALSE; } //Drop the frame.

Video Rendering downstream Filter Problem

I am decoding an MPEG2 video stream with resolution of 704x480 to the YUY2 format type (using ffmpeg).
If I use the equation:
lSampleSize = pVIH->bmiHeader.biWidth * pVIH->bmiHeader.biBitCount * pVIH->bmiHeader.biHeight / 8
I get a sample size of 675840 bytes.
This is the same size of data I am getting from my decode routine for the resulting YUY2 image. By intercepting other samples coming from other decoders, it is apparent that the correct size of the media sample, when calling SetActualDataLength, should be 737280. This number can be calculated from the same equation above if the width is rounded up to a number divisible by 128. I've discovered enough to know that this is due to the stride requirement of the VMR.
My question is this... How do I know the stride requirement as it surely cannot always be 128?
And more importantly, how would I go about populating a DirectShow media sample buffer from a pointer to the raw bytes of YUY2 data which is 675840 bytes in length?
Result :
When your output pin connects to the video renderer's input pin, after CheckMediaType() and before DecideBufferSize(), the video renderer will call your QueryAccept() method with the adjusted biWidth (that is, biWidth will be the stride and rcSource.right-rcSource.left will be the width). You need to override QueryAccept() and save the adjusted media type, so you will know the correct stride and sample size (which you will need right after in DecideBufferSize()).
> And more importantly, how would I go about populating a DirectShow media sample buffer from a pointer to the raw bytes of YUY2 data which is 675840 bytes in length?
You copy 1 scanline at a time (buffer, sample, source and target are pixel pointers, so width and stride count pixels):
for(y = 0; y < height; y++) {
source = buffer + y * width; target = sample + y * stride; memcpy(target,source,width*sizeof(pixel));
}
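
The same copy can be written with byte pointers, which avoids any doubt about whether offsets are in pixels or bytes. The sketch below is self-contained and uses hypothetical names. The alignment of 128 is only the value observed in the question; the real stride should come from the media type the renderer proposes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Rounds value up to the next multiple of alignment (alignment > 0).
std::size_t AlignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Copies a tightly packed image into a destination whose rows start
// strideBytes apart. rowBytes is the width in pixels times bytes per pixel.
void CopyScanlines(const std::uint8_t* source, std::uint8_t* target,
                   std::size_t rowBytes, std::size_t strideBytes,
                   std::size_t height) {
    for (std::size_t y = 0; y < height; ++y) {
        std::memcpy(target + y * strideBytes, source + y * rowBytes, rowBytes);
    }
}
```

For the 704x480 YUY2 case (2 bytes per pixel), rowBytes is 1408 and the packed image is 675840 bytes. Rounding the width up to 768 gives a stride of 1536 bytes and a sample size of 737280 bytes, matching the numbers in the question.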

Thread Programming

Thread Programming :
  I faced threading problems.

I developed a thread for reading from the network:

 int ThreadProc() {
   while (bRun) { /* read data from the network */ }
   return 0;
 }

 With the above code, CPU usage was almost 100%.
The CPU could not execute other processes. I added a Sleep(100) so that the OS executes other processes while the program sleeps.
Sleep(0) also gives up the time slice so that other processes can execute.


 while(1){ } or while(bRun) {} is technically called a busy loop. The technique is known as busy waiting or spinning.

Busy waiting or spinning is a technique in which a process repeatedly checks to see if a condition is true,
such as waiting for keyboard input or waiting for a lock to become available. It can also be used to delay execution for some
amount of time; this was necessary on old computers that had no method of waiting a specific length of time other than
by repeating a useless loop a specific number of times, but on modern computers with clocks and different processor speeds,
this form of time delay is often inaccurate and a sign of a naïve attempt at programming.

Alternatives to busy waiting:
 Most operating systems and threading libraries provide a wide set of system calls which will block the process
on an event, such as lock acquisitions, timers, I/O availability, or signals. This is often the simplest, most efficient, fair, and
race-free way. A single call checks, informs the scheduler of the event it is waiting for, inserts a memory barrier where applicable,
and may perform a requested I/O operation before returning. Other processes can use the CPU while the caller is blocked.
The scheduler is given the information needed to implement priority inheritance or other mechanisms to avoid starvation.

 Busy waiting itself can be made much less wasteful by using a "delay" function found on most operating systems.
This puts a thread to sleep for a specified time, during which the thread will waste no CPU time.
 If the loop is checking something simple then it will spend most of its time asleep and will not waste a large proportion
of the available CPU time. It will still consume some CPU time though.

When busy waits are appropriate :
 In low-level hardware driver programming, sometimes busy waits are actually desirable.
It is not practical to implement hardware interrupt-based signalling for every hardware device, particularly for devices
 that are seldom accessed. Sometimes it is necessary to write some sort of control data to a hardware device and
then read back some sort of status data, which is not valid until several, perhaps even tens of clock cycles later.
The programmer could call an operating system delay function, but more time would be spent simply performing the
 function call (let alone switching to an interim thread) than is required by the hardware. In such cases, it is common to
implement a busy wait that keeps reading the status data until it is valid. Calling a delay function in this case would actually
waste CPU time due to the comparatively large overhead involved in the function call and thread switching.

 For some important operations, we can set the thread priority to highest. This gives the thread preference when the scheduler picks what to run.

For Windows programming, we can do the following:

  while (true) {
    DWORD dwWait = WaitForSingleObject(m_hThreadStopEvent,5);
    if (dwWait == WAIT_OBJECT_0)  //ThreadStopEvent occurred
      break;
    // otherwise read data from the network
  }
  return 0;

The description of the above code is as follows:
 It waits up to 5 milliseconds for the ThreadStop event. If the event occurs, control leaves the while loop.
Otherwise control stays in the loop and reads data from the network.
 Whenever we want to leave the ThreadProc() function, we just raise the ThreadStop event.
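
For readers without the Win32 API at hand, the same pattern can be sketched in standard C++, where std::condition_variable::wait_for plays the role of WaitForSingleObject with a timeout. StopEvent and RunUntilStopped are hypothetical names used only for this illustration.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>

// Portable stand-in for a manual-reset stop event.
class StopEvent {
public:
    void Set() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signaled_ = true;
        }
        cv_.notify_all();
    }

    // Blocks for at most `timeout`. Returns true if the event is signaled.
    bool WaitFor(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return signaled_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Thread body: wait up to 5 ms for the stop request; if it has not arrived,
// do one unit of work and loop. Returns the number of work items done.
int RunUntilStopped(StopEvent& stop, const std::function<void()>& readOnce) {
    int iterations = 0;
    while (!stop.WaitFor(std::chrono::milliseconds(5))) {
        readOnce();
        ++iterations;
    }
    return iterations;
}
```

RunUntilStopped would normally run on its own std::thread, with another thread calling Set() to end it. While waiting, the thread is blocked rather than spinning, so it uses no CPU time between reads.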

convert video to mobile phone formats

    If your mobile phone supports video formats, you can convert videos to those formats. Many freeware converters watermark the video with their company name, logo, or URL.
 So how can we convert a video to a mobile phone video format without any watermark?

 Just download "ffmpeg for windows".

The zip file contains ffmpeg.exe.

Using ffmpeg we can convert any type of video to the mobile phone video format.
Just run ffmpeg with the following arguments:

 ffmpeg -i "D:\Source.avi"  -f 3gp -vcodec mpeg4  -s 176x144 -acodec libfaac -ar 8000 -ac 1 -ab 12.2k "D:\Output.3gp"

"D:\Source.avi" may be any type of video file: AVI, WMV, or any other type.