Spawning Threads
Overview
Last time we added audio support by taking advantage of SDL's audio functions. SDL started a thread that made callbacks to a function we defined every time it needed audio. Now we're going to do the same sort of thing with the video display. This makes the code more modular and easier to work with - especially when we want to add syncing. So where do we start?
First we notice that our main function is handling an awful lot: it's running through the event loop, reading in packets, and decoding the video. So what we're going to do is split all those apart: we're going to have a thread that will be responsible for decoding the packets; these packets will then be added to the queue and read by the corresponding audio and video threads. The audio thread we have already set up the way we want it; the video thread will be a little more complicated since we have to display the video ourselves. We will add the actual display code to the main loop. But instead of just displaying video every time we loop, we will integrate the video display into the event loop. The idea is to decode the video, save the resulting frame in another queue, then create a custom event (FF_REFRESH_EVENT) that we add to the event system, then when our event loop sees this event, it will display the next frame in the queue. Here's a handy ASCII art illustration of what is going on:
_______ ________ audio _______ _____
| | pkts | | | | to spkr
| DECODE |----->| AUDIO |--->| SDL |-->
|________| |_______| |_____|
| video _______
| pkts | |
+---------->| VIDEO |
________ |_______| _______
| | | | |
| EVENT | +------>| VIDEO | to mon.
| LOOP |----------------->| DISP. |-->
|_______|<---FF_REFRESH----|_______|
The main purpose of moving controlling the video display via the event loop is that using an SDL_Delay thread, we can control exactly when the next video frame shows up on the screen. When we finally sync the video in the next tutorial, it will be a simple matter to add the code that will schedule the next video refresh so the right picture is being shown on the screen at the right time.
Simplifying Code
We're also going to clean up the code a bit. We have all this audio and video codec information, and we're going to be adding queues and buffers and who knows what else. All this stuff is for one logical unit, viz. the movie. So we're going to make a large struct that will hold all that information called the VideoState.
typedef struct VideoState {
AVFormatContext *pFormatCtx;
int videoStream, audioStream;
AVStream *audio_st;
PacketQueue audioq;
uint8_t audio_buf[(AVCODEC_MAX_AUDIO_FRAME_SIZE * 3) / 2];
unsigned int audio_buf_size;
unsigned int audio_buf_index;
AVPacket audio_pkt;
uint8_t *audio_pkt_data;
int audio_pkt_size;
AVStream *video_st;
PacketQueue videoq;
VideoPicture pictq[VIDEO_PICTURE_QUEUE_SIZE];
int pictq_size, pictq_rindex, pictq_windex;
SDL_mutex *pictq_mutex;
SDL_cond *pictq_cond;
SDL_Thread *parse_tid;
SDL_Thread *video_tid;
char filename[1024];
int quit;
} VideoState;
Here we see a glimpse of what we're going to get to. First we see the basic information - the format context and the indices of the audio and video stream, and the corresponding AVStream objects. Then we can see that we've moved some of those audio buffers into this structure. These (audio_buf, audio_buf_size, etc.) were all for information about audio that was still lying around (or the lack thereof). We've added another queue for the video, and a buffer (which will be used as a queue; we don't need any fancy queueing stuff for this) for the decoded frames (saved as an overlay). The VideoPicture struct is of our own creations (we'll see what's in it when we come to it). We also notice that we've allocated pointers for the two extra threads we will create, and the quit flag and the filename of the movie.
So now we take it all the way back to the main function to see how this changes our program. Let's set up our VideoState struct:
int main(int argc, char *argv[]) {
SDL_Event event;
VideoState *is;
is = av_mallocz(sizeof(VideoState));
av_mallocz() is a nice function that will allocate memory for us and zero it out.
Then we'll initialize our locks for the display buffer (pictq), because since the event loop calls our display function - the display function, remember, will be pulling pre-decoded frames from pictq. At the same time, our video decoder will be putting information into it - we don't know who will get there first. Hopefully you recognize that this is a classic race condition. So we allocate it now before we start any threads. Let's also copy the filename of our movie into our VideoState.
pstrcpy(is->filename, sizeof(is->filename), argv[1]);
is->pictq_mutex = SDL_CreateMutex();
is->pictq_cond = SDL_CreateCond();
pstrcpy is a function from ffmpeg that does some extra bounds checking beyond strncpy.
Our First Thread
Now let's finally launch our threads and get the real work done:
schedule_refresh(is, 40);
is->parse_tid = SDL_CreateThread(decode_thread, is);
if(!is->parse_tid) {
av_free(is);
return -1;
}
schedule_refresh is a function we will define later. What it basically does is tell the system to push a FF_REFRESH_EVENT after the specified number of milliseconds. This will in turn call the video refresh function when we see it in the event queue. But for now, let's look at SDL_CreateThread().
SDL_CreateThread() does just that - it spawns a new thread that has complete access to all the memory of the original process, and starts the thread running on the function we give it. It will also pass that function user-defined data. In this case, we're calling decode_thread() and with our VideoState struct attached. The first half of the function has nothing new; it simply does the work of opening the file and finding the index of the audio and video streams. The only thing we do different is save the format context in our big struct. After we've found our stream indices, we call another function that we will define, stream_component_open(). This is a pretty natural way to split things up, and since we do a lot of similar things to set up the video and audio codec, we reuse some code by making this a function.
The stream_component_open() function is where we will find our codec decoder, set up our audio options, save important information to our big struct, and launch our audio and video threads. This is where we would also insert other options, such as forcing the codec instead of autodetecting it and so forth. Here it is:
int stream_component_open(VideoState *is, int stream_index) {
AVFormatContext *pFormatCtx = is->pFormatCtx;
AVCodecContext *codecCtx;
AVCodec *codec;
SDL_AudioSpec wanted_spec, spec;
if(stream_index < 0 || stream_index >= pFormatCtx->nb_streams) {
return -1;
}
// Get a pointer to the codec context for the video stream
codecCtx = pFormatCtx->streams[stream_index]->codec;
if(codecCtx->codec_type == CODEC_TYPE_AUDIO) {
// Set audio settings from codec info
wanted_spec.freq = codecCtx->sample_rate;
/* .... */
wanted_spec.callback = audio_callback;
wanted_spec.userdata = is;
if(SDL_OpenAudio(&wanted_spec, &spec) < 0) {
fprintf(stderr, "SDL_OpenAudio: %s/n", SDL_GetError());
return -1;
}
}
codec = avcodec_find_decoder(codecCtx->codec_id);
if(!codec || (avcodec_open(codecCtx, codec) < 0)) {
fprintf(stderr, "Unsupported codec!/n");
return -1;
}
switch(codecCtx->codec_type) {
case CODEC_TYPE_AUDIO:
is->audioStream = stream_index;
is->audio_st = pFormatCtx->streams[stream_index];
is->audio_buf_size = 0;
is->audio_buf_index = 0;
memset(&is->audio_pkt, 0, sizeof(is->audio_pkt));
packet_queue_init(&is->audioq);
SDL_PauseAudio(0);
break;
case CODEC_TYPE_VIDEO:
is->videoStream = stream_index;
is->video_st = pFormatCtx->streams[stream_index];
packet_queue_init(&is->videoq);
is->video_tid = SDL_CreateThread(video_thread, is);
break;
default:
break;
}
}This is pretty much the same as the code we had before, except now it's generalized for audio and video. Notice that instead of aCodecCtx, we've set up our big struct as the userdata for our audio callback. We've also saved the streams themselves as audio_st and video_st. We also have added our video queue and set it up in the same way we set up our audio queue. Most of the point is to launch the video and audio threads. These bits do it:
SDL_PauseAudio(0);
break;
/* ...... */
is->video_tid = SDL_CreateThread(video_thread, is);
We remember SDL_PauseAudio() from last time, and SDL_CreateThread() is used as in the exact same way as before. We'll get back to our video_thread() function.
Before that, let's go back to the second half of our decode_thread() function. It's basically just a for loop that will read in a packet and put it on the right queue:
for(;;) {
if(is->quit) {
break;
}
// seek stuff goes here
if(is->audioq.size > MAX_AUDIOQ_SIZE ||
is->videoq.size > MAX_VIDEOQ_SIZE) {
SDL_Delay(10);
continue;
}
if(av_read_frame(is->pFormatCtx, packet) < 0) {
if(url_ferror(&pFormatCtx->pb) == 0) {
SDL_Delay(100); /* no error; wait for user input */
continue;
} else {
break;
}
}
// Is this a packet from the video stream?
if(packet->stream_index == is->videoStream) {
packet_queue_put(&is->videoq, packet);
} else if(packet->stream_index == is->audioStream) {
packet_queue_put(&is->audioq, packet);
} else {
av_free_packet(packet);
}
}
这里没有什么新东西,除了我们给音频和视频队列限定了一个最大值并且我们添加一个检测读错误的函数。格式上下文里面有一个叫做pb的 ByteIOContext类型结构体。这个结构体是用来保存一些低级的文件信息。函数url_ferror用来检测结构体并发现是否有些读取文件错误。
在循环以后,我们的代码是用等待其余的程序结束和提示我们已经结束的。这些代码是有益的,因为它指示出了如何驱动事件--后面我们将显示影像。
while(!is->quit) {
SDL_Delay(100);
}
fail:
if(1){
SDL_Event event;
event.type = FF_QUIT_EVENT;
event.user.data1 = is;
SDL_PushEvent(&event);
}
return 0;
我们使用SDL常量SDL_USEREVENT来从用户事件中得到值。第一个用户事件的值应当是SDL_USEREVENT,下一个是 SDL_USEREVENT+1并且依此类推。在我们的程序中FF_QUIT_EVENT被定义成SDL_USEREVENT+2。如果喜欢,我们也可以传递用户数据,在这里我们传递的是大结构体的指针。最后我们调用SDL_PushEvent()函数。在我们的事件分支中,我们只是像以前放入 SDL_QUIT_EVENT部分一样。我们将在自己的事件队列中详细讨论,现在只是确保我们正确放入了FF_QUIT_EVENT事件,我们将在后面捕捉到它并且设置我们的退出标志quit。
得到帧:video_thread
当我们准备好解码器后,我们开始视频线程。这个线程从视频队列中读取包,把它解码成视频帧,然后调用queue_picture函数把处理好的帧放入到图片队列中:
int video_thread(void *arg) {
VideoState *is = (VideoState *)arg;
AVPacket pkt1, *packet = &pkt1;
int len1, frameFinished;
AVFrame *pFrame;
pFrame = avcodec_alloc_frame();
for(;;) {
if(packet_queue_get(&is->videoq, packet, 1) < 0) {
// means we quit getting packets
break;
}
// Decode video frame
len1 = avcodec_decode_video(is->video_st->codec, pFrame, &frameFinished,
packet->data, packet->size);
// Did we get a video frame?
if(frameFinished) {
if(queue_picture(is, pFrame) < 0) {
break;
}
}
av_free_packet(packet);
}
av_free(pFrame);
return 0;
}
在这里的很多函数应该很熟悉吧。我们把avcodec_decode_video函数移到了这里,替换了一些参数,例如:我们把AVStream保存在我们自己的大结构体中,所以我们可以从那里得到编解码器的信息。我们仅仅是不断的从视频队列中取包一直到有人告诉我们要停止或者出错为止。
把帧队列化
让我们看一下保存解码后的帧pFrame到图像队列中去的函数。因为我们的图像队列是SDL的覆盖的集合(基本上不用让视频显示函数再做计算了),我们需要把帧转换成相应的格式。我们保存到图像队列中的数据是我们自己做的一个结构体。
typedef struct VideoPicture {
SDL_Overlay *bmp;
int width, height;
int allocated;
} VideoPicture;
我们的大结构体有一个可以保存这些缓冲区。然而,我们需要自己来申请SDL_Overlay(注意:allocated标志会指明我们是否已经做了这个申请的动作与否)。
为了使用这个队列,我们有两个指针--写入指针和读取指针。我们也要保证一定数量的实际数据在缓冲中。要写入到队列中,我们先要等待缓冲清空以便于有位置来保存我们的VideoPicture。然后我们检查看我们是否已经申请到了一个可以写入覆盖的索引号。如果没有,我们要申请一段空间。我们也要重新申请缓冲如果窗口的大小已经改变。然而,为了避免被锁定,尽量避免在这里申请(我现在还不太清楚原因;我相信是为了避免在其它线程中调用SDL覆盖函数的原因)。
int queue_picture(VideoState *is, AVFrame *pFrame) {
VideoPicture *vp;
int dst_pix_fmt;
AVPicture pict;
SDL_LockMutex(is->pictq_mutex);
while(is->pictq_size >= VIDEO_PICTURE_QUEUE_SIZE &&
!is->quit) {
SDL_CondWait(is->pictq_cond, is->pictq_mutex);
}
SDL_UnlockMutex(is->pictq_mutex);
if(is->quit)
return -1;
// windex is set to 0 initially
vp = &is->pictq[is->pictq_windex];
if(!vp->bmp ||
vp->width != is->video_st->codec->width ||
vp->height != is->video_st->codec->height) {
SDL_Event event;
vp->allocated = 0;
event.type = FF_ALLOC_EVENT;
event.user.data1 = is;
SDL_PushEvent(&event);
SDL_LockMutex(is->pictq_mutex);
while(!vp->allocated && !is->quit) {
SDL_CondWait(is->pictq_cond, is->pictq_mutex);
}
SDL_UnlockMutex(is->pictq_mutex);
if(is->quit) {
return -1;
}
}
这里的事件机制与前面我们想要退出的时候看到的一样。我们已经定义了事件FF_ALLOC_EVENT作为SDL_USEREVENT。我们把事件发到事件队列中然后等待申请内存的函数设置好条件变量。
让我们来看一看如何来修改事件循环:
for(;;) {
SDL_WaitEvent(&event);
switch(event.type) {
case FF_ALLOC_EVENT:
alloc_picture(event.user.data1);
break;
记住event.user.data1是我们的大结构体。就这么简单。让我们看一下alloc_picture()函数:
void alloc_picture(void *userdata) {
VideoState *is = (VideoState *)userdata;
VideoPicture *vp;
vp = &is->pictq[is->pictq_windex];
if(vp->bmp) {
// we already have one make another, bigger/smaller
SDL_FreeYUVOverlay(vp->bmp);
}
// Allocate a place to put our YUV image on that screen
vp->bmp = SDL_CreateYUVOverlay(is->video_st->codec->width,
is->video_st->codec->height,
SDL_YV12_OVERLAY,
screen);
vp->width = is->video_st->codec->width;
vp->height = is->video_st->codec->height;
SDL_LockMutex(is->pictq_mutex);
vp->allocated = 1;
SDL_CondSignal(is->pictq_cond);
SDL_UnlockMutex(is->pictq_mutex);
}
你可以看到我们把SDL_CreateYUVOverlay函数从主循环中移到了这里。这段代码应该完全可以自我注释。记住我们把高度和宽度保存到VideoPicture结构体中因为我们需要保存我们的视频的大小没有因为某些原因而改变。
好,我们几乎已经全部解决并且可以申请到YUV覆盖和准备好接收图像。让我们回顾一下queue_picture并看一个拷贝帧到覆盖的代码。你应该能认出其中的一部分:
int queue_picture(VideoState *is, AVFrame *pFrame) {
if(vp->bmp) {
SDL_LockYUVOverlay(vp->bmp);
dst_pix_fmt = PIX_FMT_YUV420P;
pict.data[0] = vp->bmp->pixels[0];
pict.data[1] = vp->bmp->pixels[2];
pict.data[2] = vp->bmp->pixels[1];
pict.linesize[0] = vp->bmp->pitches[0];
pict.linesize[1] = vp->bmp->pitches[2];
pict.linesize[2] = vp->bmp->pitches[1];
// Convert the image into YUV format that SDL uses
img_convert(&pict, dst_pix_fmt,
(AVPicture *)pFrame, is->video_st->codec->pix_fmt,
is->video_st->codec->width, is->video_st->codec->height);
SDL_UnlockYUVOverlay(vp->bmp);
if(++is->pictq_windex == VIDEO_PICTURE_QUEUE_SIZE) {
is->pictq_windex = 0;
}
SDL_LockMutex(is->pictq_mutex);
is->pictq_size++;
SDL_UnlockMutex(is->pictq_mutex);
}
return 0;
}
这部分代码和前面用到的一样,主要是简单的用我们的帧来填充YUV覆盖。最后一点只是简单的给队列加1。这个队列在写的时候会一直写入到满为止,在读的时候会一直读空为止。因此所有的都依赖于is->pictq_size值,这要求我们必需要锁定它。这里我们做的是增加写指针(在必要的时候采用轮转的方式),然后锁定队列并且增加尺寸。现在我们的读者函数将会知道队列中有了更多的信息,当队列满的时候,我们的写入函数也会知道。
显示视频
这就是我们的视频线程。现在我们看过了几乎所有的线程除了一个--记得我们调用schedule_refresh()函数吗?让我们看一下实际中是如何做的:
static void schedule_refresh(VideoState *is, int delay) {
SDL_AddTimer(delay, sdl_refresh_timer_cb, is);
}
函数SDL_AddTimer()是SDL中的一个定时(特定的毫秒)执行用户定义的回调函数(可以带一些参数user data)的简单函数。我们将用这个函数来定时刷新视频--每次我们调用这个函数的时候,它将设置一个定时器来触发定时事件来把一帧从图像队列中显示到屏幕上。
但是,让我们先触发那个事件。
static Uint32 sdl_refresh_timer_cb(Uint32 interval, void *opaque) {
SDL_Event event;
event.type = FF_REFRESH_EVENT;
event.user.data1 = opaque;
SDL_PushEvent(&event);
return 0;
}
这里向队列中写入了一个现在很熟悉的事件。FF_REFRESH_EVENT被定义成SDL_USEREVENT+1。要注意的一件事是当返回0的时候,SDL停止定时器,于是回调就不会再发生。
现在我们产生了一个FF_REFRESH_EVENT事件,我们需要在事件循环中处理它:
for(;;) {
SDL_WaitEvent(&event);
switch(event.type) {
case FF_REFRESH_EVENT:
video_refresh_timer(event.user.data1);
break;
于是我们就运行到了这个函数,在这个函数中会把数据从图像队列中取出:
void video_refresh_timer(void *userdata) {
VideoState *is = (VideoState *)userdata;
VideoPicture *vp;
if(is->video_st) {
if(is->pictq_size == 0) {
schedule_refresh(is, 1);
} else {
vp = &is->pictq[is->pictq_rindex];
schedule_refresh(is, 80);
video_display(is);
if(++is->pictq_rindex == VIDEO_PICTURE_QUEUE_SIZE) {
is->pictq_rindex = 0;
}
SDL_LockMutex(is->pictq_mutex);
is->pictq_size--;
SDL_CondSignal(is->pictq_cond);
SDL_UnlockMutex(is->pictq_mutex);
}
} else {
schedule_refresh(is, 100);
}
}
现在,这只是一个极其简单的函数:当队列中有数据的时候,他从其中获得数据,为下一帧设置定时器,调用video_display函数来真正显示图像到屏幕上,然后把队列读索引值加1,并且把队列的尺寸size减1。你可能会注意到在这个函数中我们并没有真正对vp做一些实际的动作,原因是这样的:我们将在后面处理。我们将在后面同步音频和视频的时候用它来访问时间信息。你会在这里看到这个注释信息“timing密码here”。那里我们将讨论什么时候显示下一帧视频,然后把相应的值写入到schedule_refresh()函数中。现在我们只是随便写入一个值80。从技术上来讲,你可以猜测并验证这个值,并且为每个电影重新编译程序,但是:1)过一段时间它会漂移;2)这种方式是很笨的。我们将在后面来讨论它。
我们几乎做完了;我们仅仅剩了最后一件事:显示视频!下面就是video_display函数:
void video_display(VideoState *is) {
SDL_Rect rect;
VideoPicture *vp;
AVPicture pict;
float aspect_ratio;
int w, h, x, y;
int i;
vp = &is->pictq[is->pictq_rindex];
if(vp->bmp) {
if(is->video_st->codec->sample_aspect_ratio.num == 0) {
aspect_ratio = 0;
} else {
aspect_ratio = av_q2d(is->video_st->codec->sample_aspect_ratio) *
is->video_st->codec->width / is->video_st->codec->height;
}
if(aspect_ratio <= 0.0) {
aspect_ratio = (float)is->video_st->codec->width /
(float)is->video_st->codec->height;
}
h = screen->h;
w = ((int)rint(h * aspect_ratio)) & -3;
if(w > screen->w) {
w = screen->w;
h = ((int)rint(w / aspect_ratio)) & -3;
}
x = (screen->w - w) / 2;
y = (screen->h - h) / 2;
rect.x = x;
rect.y = y;
rect.w = w;
rect.h = h;
SDL_DisplayYUVOverlay(vp->bmp, &rect);
}
}
因为我们的屏幕可以是任意尺寸(我们设置为640x480并且用户可以自己来改变尺寸),我们需要动态计算出我们显示的图像的矩形大小。所以一开始我们需要计算出电影的纵横比aspect ratio,表示方式为宽度除以高度。某些编解码器会有奇数采样纵横比,只是简单表示了一个像素或者一个采样的宽度除以高度的比例。因为宽度和高度在我们的编解码器中是用像素为单位的,所以实际的纵横比与纵横比乘以样本纵横比相同。某些编解码器会显示纵横比为0,这表示每个像素的纵横比为1x1。然后我们把电影缩放到适合屏幕的尽可能大的尺寸。这里的& -3表示与-3做与运算,实际上是让它们4字节对齐。然后我们把电影移到中心
位置,接着调用SDL_DisplayYUVOverlay()函数。
结果是什么?我们做完了吗?嗯,我们仍然要重新改写声音部分的代码来使用新的VideoStruct结构体,但是那些只是尝试着改变,你可以看一下那些参考示例代码。最后我们要做的是改变ffmpeg提供的默认退出回调函数为我们的退出回调函数。
VideoState *global_video_state;
int decode_interrupt_cb(void) {
return (global_video_state && global_video_state->quit);
}
我们在主函数中为大结构体设置了global_video_state。
这就是了!让我们编译它:
gcc -o tutorial04 tutorial04.c -lavutil -lavformat -lavcodec -lz -lm /
`sdl-config --cflags --libs`
请享受一下没有经过同步的电影!下次我们将编译一个可以最终工作的电影播放器。
原文出自