Guides: How to moderate live videos with AI

0:00

Hey, Phil from Mux here. We build video infrastructure that any developer can leverage to build amazing video experiences on their website. I'm here with a quick demo of how you can use one of our new features, latest thumbnails, to analyze and moderate the content that's coming from your live streams using AI. Let's take a look.

0:21

We're gonna need three things to build this out. First of all, we're gonna need a live stream. We're gonna need an account and some API keys for OpenAI's API, and then we're just gonna need a little Node script to glue it all together.

0:33

We've got our live stream up and running in OBS at the moment. This is just showing Big Buck Bunny, which is, of course, the best movie ever made. But what if somebody started streaming something that wasn't Big Buck Bunny? What if they streamed a violent movie or something even worse? Let's hook up some content moderation so that if or when that happens, we can catch it.

0:57

Mux has always had an image API. This lets you pull thumbnails or storyboards or animated GIFs from videos or live streams. We added a new feature a couple of weeks ago which allows you to request the latest thumbnail from a live stream. All you need to do is build your image URL the same way you always would, but pass a new query parameter called latest and set it to true. This works just like it always has done, but by adding the latest parameter, every 10 seconds this thumbnail will get updated with the latest thumbnail from the livestream.
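As a concrete sketch, building that URL in Node.js might look like this (the playback ID below is a placeholder; real ones come from your Mux dashboard or API):

```javascript
// A sketch of building the latest-thumbnail URL. The playback ID here is a
// placeholder; real ones come from your Mux dashboard or API.
function buildLatestThumbnailUrl(playbackId, width = 640) {
  const url = new URL(`https://image.mux.com/${playbackId}/thumbnail.jpg`);
  url.searchParams.set("latest", "true"); // always serve the newest frame
  url.searchParams.set("width", String(width));
  return url.toString();
}

console.log(buildLatestThumbnailUrl("abc123"));
// https://image.mux.com/abc123/thumbnail.jpg?latest=true&width=640
```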

1:30

This is great 'cause it means you can pretty much just pass this straight into any AI moderation framework you have. It doesn't matter whether that's OpenAI's, it doesn't matter if that's Hive's or somebody else's. Just having this constant URL you can use to always get that live image means that updating your moderation flow is as simple as adding a new job to your moderation queue.

1:50

So for this example, we're gonna use OpenAI's moderation model. What's really cool about this API is it's actually free. There are rate limits, so if you're building something significant in scale, you might want to keep that in mind. We're going to use OpenAI's latest omni moderation model. This can take and moderate either text or images, and obviously we're using it in image mode 'cause we're gonna pass it this latest thumbnail from our live stream.

2:19

Okay, so let's take a look at the code. This whole script is less than 50 lines of Node.js. The first thing you need to do is just import the OpenAI SDK and then set up our API keys so we can access the OpenAI client. We've got an async function: you pass in a playback ID and it moderates the live stream.

2:39

The first thing this function does is construct the URL. It passes in a playback ID, obviously, and sets latest to true, and then it sets the width to 640 pixels wide. Mux will then automatically pick a height for it based on the aspect ratio of the content. 640 is a pretty good balance here. It's probably more than you actually need for moderation, but if you are then using this image for other functions like summarization or categorization or tagging, that sort of thing, you might want a little bit higher image quality, especially if you're using a multimodal LLM for this.

3:11

Then we pretty much just call the OpenAI client. We tell it to use omni-moderation-latest; that's the model we just talked about. We pass it the image URL, and that's kind of it. This is asynchronous, so we wait for the response and then we take a look at it. There is a top-level boolean that tells you whether this content was flagged or not. It's set to true if any of the categories come back over 80% confidence. We're looking at two categories here: sexual and violence.
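Putting that walkthrough together, a condensed sketch of the script might look like this. It assumes the official `openai` npm package (v4+) and an `OPENAI_API_KEY` environment variable; the playback ID is a placeholder:

```javascript
// Condensed sketch of the moderation script described above. Assumes the
// official `openai` npm package and an OPENAI_API_KEY environment variable.
function latestThumbnailUrl(playbackId, width = 640) {
  return `https://image.mux.com/${playbackId}/thumbnail.jpg?latest=true&width=${width}`;
}

async function moderateLiveStream(playbackId) {
  // Load the SDK lazily so the helpers above can be used on their own.
  const { default: OpenAI } = await import("openai");
  const openai = new OpenAI(); // picks up OPENAI_API_KEY from the environment

  const response = await openai.moderations.create({
    model: "omni-moderation-latest",
    input: [{ type: "image_url", image_url: { url: latestThumbnailUrl(playbackId) } }],
  });

  const result = response.results[0];
  // `flagged` is the top-level boolean; `category_scores` holds the 0-to-1
  // confidence values for categories like sexual and violence.
  console.log({
    flagged: result.flagged,
    sexual: result.category_scores.sexual,
    violence: result.category_scores.violence,
  });
  return result;
}

// Only hit the API when a key is actually configured.
if (process.env.OPENAI_API_KEY) {
  moderateLiveStream("your-playback-id").catch(console.error);
}
```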

3:41

So let's take a look and see if anyone has put anything sinister into our copy of Big Buck Bunny.

3:50

Okay, so we've got our moderation script running. It's pretty quiet right now. Every 15 seconds, this is requesting the latest frame of a live stream that's updated every 10 seconds, so you always know that you're gonna get a new thumbnail this way. We can actually also turn up the frequency we refresh those images, so if you need to react slightly faster, we can help you do that.
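A minimal sketch of that 15-second polling loop, assuming a `moderate` function along the lines of the one in the script (the names here are illustrative):

```javascript
// Poll the stream every 15 seconds, which comfortably covers Mux's
// 10-second thumbnail refresh. `moderate` stands in for the moderation
// call shown earlier.
const POLL_INTERVAL_MS = 15_000;

function startModerationLoop(playbackId, moderate) {
  // setInterval returns a timer handle we can use to stop the loop later.
  const timer = setInterval(async () => {
    try {
      const result = await moderate(playbackId);
      if (result.flagged) {
        console.warn(`Stream ${playbackId} was flagged`, result.category_scores);
        // This is where you'd disable the stream or alert a human reviewer.
      }
    } catch (err) {
      console.error("Moderation check failed:", err);
    }
  }, POLL_INTERVAL_MS);
  return timer;
}
```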

4:11

There's not a whole lot going on here, so we're gonna just use a horror movie for this and see what happens in our script. Let's take a look.

4:19

So now we've flipped over to watching a horror movie on our live stream, and you'll see that we're starting to get flags back for violence. This is quite a gory horror movie, and you'll see that the violence value is coming back at 0.8, 0.9.

4:34

What happens here is you get a confidence value from zero to one, where one is very high confidence that the model is seeing violence and zero is very low confidence. So we can see, for example, the sexual confidence is very low, which means the content probably isn't sexual, but yeah, it's quite violent. It's a horror movie. There are monsters doing unspeakable things to people, so we're correctly detecting that this movie is a bit violent.
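If you want to act on those raw scores yourself, rather than relying only on the built-in flag, a hypothetical helper might look like this (the threshold is an illustrative choice, not something the API prescribes):

```javascript
// Hypothetical helper for acting on the 0-to-1 confidence scores directly,
// e.g. with your own threshold instead of the default flag behaviour.
function isViolent(categoryScores, threshold = 0.8) {
  return (categoryScores.violence ?? 0) >= threshold;
}

console.log(isViolent({ sexual: 0.02, violence: 0.9 })); // true, like the horror movie
console.log(isViolent({ sexual: 0.01, violence: 0.1 })); // false, like Big Buck Bunny
```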

4:59

You can obviously see how, if you're building out a user-generated content platform with live streaming functionality, this is really useful: you can use it to catch streams that shouldn't be on your platform.

5:11

What's really cool is this same approach works for a bunch of other functionality. I recently wrote a blog post about how you can also use these same latest thumbnails with multimodal models and prompts to do content summarization or content tagging, all using the same latest thumbnail function. It's not just for moderation. You can kind of string all sorts of AI workflows off the back of this.

5:34

As always, I'd love to see what you build on top of latest thumbnails. You can use it to build all sorts of things; this is just one way of leveraging that feature. And if you want more tips and tricks and cool things built with video, don't forget to like and subscribe to our YouTube channel, but also go and check out Mux.com, where you can play around with all this infrastructure. We've also got an amazing new free plan for getting started building with Mux, so check that out too.