Wednesday, March 21, 2018

Watching daily Mar 22 2018

>> Okay. So, the next speaker is Hai Le, from Washington University in St. Louis?

>> Yeah.

>> Thank you.

>> And he's going to talk about precision-recall.

>> Hi, everyone. My name is Hai, from Washington University in St. Louis. This is joint work with my advisor, Professor Brendan Juba. The problem we worked on is class imbalance.

We all know that class imbalance is the situation where one class severely out-represents the others. Practitioners claim that class imbalance makes it harder to classify imbalanced data, yet existing learning theory does not refer to class imbalance at all. Although many methods have been proposed to correct the imbalance, such as various sampling methods or cost-sensitive methods, we have a question: do those methods really help?

In our work, we first derive a relationship between precision-recall and accuracy, and then we claim that a large data set is the only cure for class imbalance. We support that claim both theoretically and empirically.

As for related work on this problem, Raeder et al. empirically showed that every metric except accuracy depends on the class imbalance. Alvarez also derived a relationship between precision, recall, and accuracy; however, he did not interpret the consequences of that relationship.

So first, let's look at our derivation of the relationship between precision-recall and accuracy. Suppose D is a distribution over examples with Boolean labels and base positive rate mu, and suppose we have a classifier h with precision greater than 0.5. Define epsilon_max as the larger of the precision error and the recall error. Our theorem says that epsilon_max has to satisfy: mu times epsilon_max is at most the accuracy error, which in turn is at most three times mu times epsilon_max. So, in general, the precision-recall error and the accuracy error are equivalent up to a factor of three times mu, the base positive rate.
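Written out, the bound stated above reads as follows (using epsilon_acc for the accuracy error of h; this notation is mine, not from the slides):

\[
\mu\,\epsilon_{\max} \;\le\; \epsilon_{\mathrm{acc}}(h) \;\le\; 3\,\mu\,\epsilon_{\max},
\qquad
\epsilon_{\max} = \max\{\,1 - \mathrm{precision}(h),\; 1 - \mathrm{recall}(h)\,\},
\]

so driving the precision and recall errors below epsilon_max is the same task, up to a factor of at most 3, as driving the accuracy error below mu times epsilon_max.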

The consequence of this theorem, when we combine it with the standard VC bound for accuracy, is that with probability at least 1 - delta, on the order of 1/(mu * epsilon_max) times (d + log(1/delta)) examples are necessary and sufficient to achieve precision and recall greater than 1 - epsilon_max, where d is the VC dimension of the classifier family.
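As a worked version of that bound (the constants and the concrete numbers below are mine, just to make the scaling visible):

\[
m \;=\; \Theta\!\left(\frac{1}{\mu\,\epsilon_{\max}}\left(d + \log\tfrac{1}{\delta}\right)\right).
\]

For example, with mu = 0.01, epsilon_max = 0.1, d = 100, and delta = 0.05, this is roughly (100 + 3)/0.001, i.e., about 10^5 examples, whereas the same precision-recall target on a balanced problem (mu = 0.5) would need only about 2 x 10^3.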

The presence of the 1/mu term is very important, because it leads to the consequence that when the objective is high precision and high recall, class imbalance does impose a cost on learning: achieving high precision and recall demands that the size of the training data scale with the imbalance. So we conclude that a large training set is the cure for imbalanced data, because the training set size is what controls the generalization error.

Now let's look at some case studies where we really need high precision, because, remember, our theorem assumes that the precision is greater than 0.5. The first example is an ICU alarm system for predicting cardiac arrest. Cardiac arrest is a very rare and very costly event, so we really want that, when the alarm is triggered, the patient actually needs to be transferred to the ICU; in other words, we want high precision.

Another example we want to talk about is machine translation. Here we also need really high precision and recall: you can have high accuracy but low recall if the rare words are simply never predicted, and you can also have high accuracy but low precision, in which case the word choices in the translation may be meaningless. Brants et al. built a system that kept improving as the training data grew, up to 100 billion examples. It is not entirely obvious why so much data is needed, so we did some investigation, and we observed that most sentences contain rare words, and that the frequency of the rarest word in the target sentence determines the amount of data required to reliably translate that sentence.
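One way to make that determination concrete (this is my extrapolation from the theorem above, not a formula given in the talk) is to treat the rarest target word as the rare positive class in the sample-complexity bound:

\[
m_{\text{sentence}} \;\approx\; \Theta\!\left(\frac{1}{\mu_w\,\epsilon_{\max}}\left(d + \log\tfrac{1}{\delta}\right)\right),
\]

where mu_w is the corpus frequency of the rarest word in the target sentence; a word appearing once per million tokens would then push the requirement into the millions of examples.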

We ran an experiment on the New York Times corpus and observed that at least 1.5 million examples are necessary to correctly learn 55% of the sentences.

Now, for the experimental part. The aim of these experiments is to verify our conclusion that a large training set is what is needed to cope with class imbalance. To do that, we compare the performance of different techniques for fixing class imbalance against a no-modification version trained on a larger training set.

The data set we use in this experiment is a drug discovery data set with about 1 million negative labels and about 62,000 positive labels.

In the first part of the experiment, we stratify the data into 20 folds of about 50,000 samples each. We select one fold and create different models by re-sampling it with different imbalance-correction techniques, such as oversampling, undersampling, and SMOTE, and then train each model using KNN. On the rest of the data, we just train KNN with no modification. We repeat the experiment 200 times and compare the performance.
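To make the protocol concrete, here is a minimal sketch of this kind of comparison using scikit-learn and imbalanced-learn. It is illustrative only: the synthetic data, fold sizes, and KNN settings are placeholders, not the authors' actual pipeline.

# Illustrative sketch of the comparison described above (not the authors' code).
# Assumes scikit-learn and imbalanced-learn are installed.
from imblearn.over_sampling import RandomOverSampler, SMOTE
from imblearn.under_sampling import RandomUnderSampler
from sklearn.datasets import make_classification
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the drug discovery data: ~6% positives.
X, y = make_classification(n_samples=600_000, n_features=20, n_informative=10,
                           weights=[0.94, 0.06], random_state=0)

# Held-out test fold plus a training pool.
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=50_000, stratify=y, random_state=0)

# One ~50k fold for the resampling variants.
X_small, _, y_small, _ = train_test_split(
    X_pool, y_pool, train_size=50_000, stratify=y_pool, random_state=0)

def evaluate(X_train, y_train, sampler=None):
    """Optionally resample, train KNN, and report test precision and recall."""
    if sampler is not None:
        X_train, y_train = sampler.fit_resample(X_train, y_train)
    clf = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
    pred = clf.predict(X_test)
    return precision_score(y_test, pred), recall_score(y_test, pred)

variants = {
    "oversampling (small fold)": RandomOverSampler(random_state=0),
    "undersampling (small fold)": RandomUnderSampler(random_state=0),
    "SMOTE (small fold)": SMOTE(random_state=0),
    "no modification (small fold)": None,
}
for name, sampler in variants.items():
    p, r = evaluate(X_small, y_small, sampler)
    print(f"{name}: precision={p:.3f} recall={r:.3f}")

# The comparison of interest: no modification, but trained on much more data.
p, r = evaluate(X_pool, y_pool, sampler=None)
print(f"no modification (large): precision={p:.3f} recall={r:.3f}")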

You can see that the no-modification version, when trained on the larger data set, achieves significantly higher precision and recall than the models that use the various imbalance-correction techniques.

In the second part of the experiment, we compare performance across different training set sizes, from 100,000 to 500,000 samples. You can see that precision and recall in the no-modification version scale up as the training set size increases.
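Continuing the illustrative sketch above (it reuses the synthetic X_pool, y_pool and the evaluate helper defined there, so it is again only a stand-in for the authors' setup), the second experiment roughly corresponds to a learning-curve loop:

# Learning curve for the no-modification KNN model (illustrative only).
for n in (100_000, 200_000, 300_000, 400_000, 500_000):
    X_sub, _, y_sub, _ = train_test_split(
        X_pool, y_pool, train_size=n, stratify=y_pool, random_state=0)
    p, r = evaluate(X_sub, y_sub, sampler=None)
    print(f"n={n}: precision={p:.3f} recall={r:.3f}")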

To conclude, we find that it is not possible to achieve high precision and recall under severe class imbalance unless you have a certain amount of data. We claim that the methods for correcting class imbalance don't help, and that you shouldn't trust classifiers trained on typically sized data sets with high imbalance, because the situation is very similar to having low accuracy on a small data set.

So our advice, when you have to deal with class imbalance but don't have enough data, is this: rather than trying different techniques for correcting the imbalance, try to exploit some prior knowledge about the domain. For example, in the machine translation setting I talked about before, you can look up the definition of a rare word in a dictionary rather than trying different imbalance-correction techniques. That's all for my talk. Thank you, everyone.

For more information >> Precision-Recall versus Accuracy and the Role of Large Data Sets - Duration: 9:24.

-------------------------------------------

How to record your iPhone screen in iOS 11 - Duration: 3:22.

Welcome, guys. In this video I'm going to show you how to record your iPhone screen on iOS 11. You might want to record your screen because you want to share something you were doing on your phone with someone else, or, like a lot of people, you play games and want to record your gameplay, or whatever apps you use, and you just wish you could do a quick recording and share it with anyone, or keep it for your own use.

All right, guys, let me show you how to do this. To record your iPhone screen this way, you need to be on iOS 11. You can still record your screen on a lower version, but not with this method, because Apple hasn't included it in older versions of iOS; you need iOS 11 for the method I'm going to show you.

So let's go over to the iPhone and open Settings. When Settings comes up, scroll down and look for Control Center, and tap on that. When Control Center comes up, tap Customize Controls. You'll see some controls that are already in your Control Center; if you swipe up from the bottom of your phone, this is your Control Center, and you can see the controls that are currently there.

So I'm going to swipe down again, and if you scroll down to More Controls you'll see Screen Recording. All you need to do is tap the plus sign and it will move to the included list, which means it will be added to your Control Center. Now I can go back to my home screen, swipe up, and all I need to do is press the recording button right there and my screen will begin to record.

Also, the first time you do this, guys: hold the record button for about three seconds so that the menu comes up where you can turn your microphone on. If you just tap it normally, it will begin to record, but it will not record your microphone. So the first time you record, please hold that little record button down for about three seconds, and when the menu comes up, turn on your microphone, and it will record both your screen and your voice. If you don't want your voice to be recorded, you can always turn the microphone off, and vice versa.

All right, so basically that's it for this one. I hope you found this video useful. If you did, please remember to like, share, and subscribe. Thank you, and until next time.
