The Punters Lounge > Systems, Strategy and General Betting Help > Systems & Strategy Forum

19-01-2008, 19:33   #1
Newbie Punter
Join Date: 13 Nov 2007
Posts: 23
Random Sampling

I've got a dataset of 8120 matches and I want to predict away wins. I'm building a logistic model and there are 2339 away wins in my dataset, so my other 2339 should consist of home wins and draws. But how do I choose the split? Should it be 2339 away wins, 1170 draws and 1170 home wins?

please help

thanks
Ace123
19-01-2008, 20:37   #2
Mens Doubles Punter
Join Date: 21 Dec 2003
Location: Newcastle upon Tyne
Age: 24
Posts: 10,003
Re: Random Sampling

I don't understand. Why do you need the same number of homes/draws as aways?
__________________
I use statistics much as a drunken man uses lamp-posts - as support rather than illumination. (Andrew Lang)

Everyone thought Einstein was crazy until he started kicking ass.
Mr Intensity
19-01-2008, 21:29   #3
Re: Random Sampling

What should my population be, then, if I'm trying to model away wins?
Ace123
21-01-2008, 23:28   #4
Re: Random Sampling

As you've got a pretty decent number of games I'd start by using half of the data for training and half for validation.
Last edited by Mr Intensity; 21-01-2008 at 23:29.
Mr Intensity
22-01-2008, 00:26   #5
Re: Random Sampling

The whole point of doing logistic regression is that your "goods" are the same volume as your "bads" so that there is no bias. I'm just wondering whether in football modelling you should follow this standard statistical practice or include everybody in your sample?

In a season, on average, there are 50% home wins and the other 50% is made up of draws and away wins.

My question is whether your sample should include all observations even though there may be a bias, or whether you should evenly split the population so that you're dealing with equal volumes.
Ace123
22-01-2008, 11:10   #6
Re: Random Sampling

Quote:
Originally Posted by Ace123
The whole point of doing logistic regression is that your "goods" are the same volume as your "bads" so that there is no bias. I'm just wondering whether in football modelling you should follow this standard statistical practice or include everybody in your sample?

In a season, on average, there are 50% home wins and the other 50% is made up of draws and away wins.

My question is whether your sample should include all observations even though there may be a bias, or whether you should evenly split the population so that you're dealing with equal volumes.
It's been over a year but as far as I remember the whole point of logistic regression is to work out the probability of a "good" from binomially distributed data using an appropriate link function. You know n so you're wanting to work out p. If you sample the data so you know p=0.5 then what's the point?
Last edited by Mr Intensity; 22-01-2008 at 11:11.
Mr Intensity
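
For reference, the model described in this post can be written out explicitly. This is a sketch in standard notation; the symbols y_i, p_i, x_ij and beta_j are introduced here and are not the poster's. Each match i has a 0/1 outcome y_i (1 for a "good"), and the logit link ties its probability p_i to k factors:

```latex
y_i \sim \mathrm{Bernoulli}(p_i), \qquad
\operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i}
  = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij}
```

The logit is the usual default link; probit and complementary log-log are the common alternatives when comparing link functions.
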
22-01-2008, 11:28   #7
Junior Punter
Join Date: 30 Oct 2004
Posts: 12,352
Re: Random Sampling

If you artificially force your samples of "bads" to have 50% home wins and 50%
draws, then you'll be introducing much more of a bias. Since home wins are
actually more frequent than draws, you'll probably be heavily biasing the
"bads" in favour of factors that correlate with the home team doing badly.

I know what logistic regression is about, more or less, though I don't know
much about the nuts and bolts. But I don't understand why you need the
samples of goods and bads to have the same size?
slapdash
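
The bias described in this post is easy to see with round numbers. Below is a minimal Python sketch, assuming the mix from Ace123's later example (4000 home wins, 2000 draws, 2000 away wins) and away wins as the "goods". The result codes H/D/A and the helper names sample_bads and share are made up for illustration. It compares the draw share among the "bads" when they are sampled at random with the share under a forced 50/50 split.

```python
import random

def sample_bads(results, target, n, seed=0):
    """Pick n non-target results uniformly at random (no forced home/draw split)."""
    bads = [r for r in results if r != target]
    rng = random.Random(seed)
    return rng.sample(bads, n)

def share(results, label):
    """Fraction of results equal to label."""
    return sum(r == label for r in results) / len(results)

# Hypothetical mix: 4000 home wins (H), 2000 draws (D), 2000 away wins (A).
results = ["H"] * 4000 + ["D"] * 2000 + ["A"] * 2000
all_bads = [r for r in results if r != "A"]

random_bads = sample_bads(results, target="A", n=2000)
forced_bads = ["H"] * 1000 + ["D"] * 1000  # the forced 50/50 split

print(f"draw share among all non-away results: {share(all_bads, 'D'):.3f}")
print(f"draw share in a random sample of bads: {share(random_bads, 'D'):.3f}")
print(f"draw share in the forced 50/50 sample: {share(forced_bads, 'D'):.3f}")
```

The random sample keeps draws at roughly a third of the "bads", as in the full data, while the forced split pushes them up to a half.
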
22-01-2008, 11:36   #8
Re: Random Sampling

Quote:
Originally Posted by Mr Intensity
It's been over a year but as far as I remember the whole point of logistic regression is to work out the probability of a "good" from binomially distributed data using an appropriate link function. You know n so you're wanting to work out p. If you sample the data so you know p=0.5 then what's the point?
Isn't the point of logistic regression to work out how the probability of a
"good" varies when you have knowledge of other factors? So fixing the total
sample so that the overall probability is 0.5 doesn't necessarily prejudge the
answer.
slapdash
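
This point can be made precise. The sketch below assumes the logit link and assumes "goods" and "bads" are each thinned uniformly at random; the symbols alpha, beta, s_1 and s_0 are introduced here for illustration. Suppose the full-data model is logit P(y=1 | x) = alpha + beta^T x, with p(x) = P(y=1 | x), and a "good" is kept with probability s_1 and a "bad" with probability s_0, independently of the factors x. Then in the kept sample:

```latex
P(y = 1 \mid x, \text{kept})
  = \frac{s_1\, p(x)}{s_1\, p(x) + s_0\,\bigl(1 - p(x)\bigr)}
\quad\Longrightarrow\quad
\operatorname{logit} P(y = 1 \mid x, \text{kept})
  = \alpha + \log\frac{s_1}{s_0} + \beta^{\top} x
```

So the coefficients on the factors are unchanged and only the intercept shifts, by log(s_1/s_0). With the figures from the first post (all 2339 away wins kept, 2339 of the other 5781 matches kept), s_1 = 1 and s_0 = 2339/5781, giving a shift of log(5781/2339), roughly 0.90. That shift has to be subtracted from the fitted intercept before reading off real-world probabilities. The argument only holds if the "bads" are thinned at random; a forced 50/50 home/draw split makes the thinning depend on the result type and breaks it.
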
22-01-2008, 11:45   #9
Re: Random Sampling

Quote:
Originally Posted by slapdash
Isn't the point of logistic regression to work out how the probability of a
"good" varies when you have knowledge of other factors? So fixing the total
sample so that the overall probability is 0.5 doesn't necessarily prejudge the
answer.
Just re-read what I wrote and it sounds really dumb. Setting the sample with 50% home wins is wrong but for the reasons you've stated.
Mr Intensity
22-01-2008, 21:48   #10
Re: Random Sampling

So basically for each sample (build set and validation set) you should have the true proportions of home wins, draws and away wins?
Ace123
22-01-2008, 22:44   #11
Re: Random Sampling

Quote:
Originally Posted by Ace123
So basically for each sample (build set and validation set) you should have the true proportions of home wins, draws and away wins?
No, then you're forcing the data, which is bad. Do as the thread title says - choose randomly.
Mr Intensity
22-01-2008, 23:06   #12
Re: Random Sampling

OK, so what should be my "good" and "bad" outcomes?

I still don't get what the splits should be.

Let's take an example. Say I want to model the probability of a home win and my dataset size is 8000: 4000 are home wins, 2000 are draws and 2000 are away wins. Could you possibly explain how I would build a logistic model based on the above info?

thanks
Ace123
22-01-2008, 23:23   #13
Re: Random Sampling

What software are you using?

Easiest way to do it is to take your data, order it by date, take the first 4000 results and use that as your training data.

You need to decide which factors you want to include. This is easy. I'd start by including everything you might want to include. You then want to create your model using the software and do an Analysis of Deviance. Your software should add terms sequentially, so you have a forward stepwise approach and can do chi-squared tests to get a P-value and use hypothesis tests to determine which factors to keep in. Then when you have decided which factors to keep in you can run the model again using different link functions to determine which is the best.

That should give you a model to start with. Then you can start messing around and use the testing data.

Sorry if that's a bit patronizing; from your posts I'm not sure how much you know.
Mr Intensity
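
The first two steps in this post (split by date, then test whether a factor earns its place) can be sketched in Python rather than SAS. This is an illustration under stated assumptions and not the poster's own procedure. The field names date, away_win and away_favourite, the helper names and the data are all made up. It tests a single 0/1 factor, for which the logistic fit is simply the observed rate in each group. Several factors or continuous ones need the iterative fit that statistical software provides.

```python
import math
from datetime import date, timedelta

def chronological_split(matches, n_train):
    """Order matches by date; the earliest n_train train, the rest validate."""
    ordered = sorted(matches, key=lambda m: m["date"])
    return ordered[:n_train], ordered[n_train:]

def deviance(ys, ps):
    """Binomial deviance: -2 * log-likelihood of 0/1 outcomes under probabilities ps."""
    eps = 1e-12  # keeps log() finite if a fitted rate is exactly 0 or 1
    return -2.0 * sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1.0 - p, eps))
        for y, p in zip(ys, ps)
    )

def deviance_drop_binary_factor(ys, xs):
    """Compare an intercept-only model with a model that adds one 0/1 factor.

    Returns (drop in deviance, p-value from chi-squared with 1 d.f.).
    Assumes both factor levels occur in the data.
    """
    overall = sum(ys) / len(ys)
    rate = {}
    for level in (0, 1):
        group = [y for y, x in zip(ys, xs) if x == level]
        rate[level] = sum(group) / len(group)
    d_null = deviance(ys, [overall] * len(ys))
    d_factor = deviance(ys, [rate[x] for x in xs])
    drop = d_null - d_factor
    # Upper tail of chi-squared(1): P(Z^2 > drop) = erfc(sqrt(drop / 2)).
    p_value = math.erfc(math.sqrt(max(drop, 0.0) / 2.0))
    return drop, p_value

# Made-up data: 400 matches on consecutive days, deliberately listed newest first.
start = date(2007, 1, 1)
matches = [
    {
        "date": start + timedelta(days=i),
        "away_favourite": i % 2,
        "away_win": int(i % 4 == 1) if i % 2 else int(i % 8 == 0),
    }
    for i in reversed(range(400))
]

train, validation = chronological_split(matches, n_train=200)
ys = [m["away_win"] for m in train]
xs = [m["away_favourite"] for m in train]
drop, p_value = deviance_drop_binary_factor(ys, xs)
print(f"deviance drop = {drop:.2f}, p-value = {p_value:.5f}")
```

A small p-value suggests keeping the factor. The validation half stays untouched until the model is settled.
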
22-01-2008, 23:31   #14
Re: Random Sampling

Thanks very much for that advice, Mr Intensity.

I'm actually using SAS.

So what would be my target variable, and how would you define it? As in, what would the "1" represent and what would the "0" represent?
Ace123
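
The answer to this question is not recorded in the thread. One common convention, consistent with the "goods" and "bads" wording used earlier, is to let 1 stand for the outcome being modelled and 0 for everything else. A minimal Python sketch, assuming hypothetical result codes H, D and A and a made-up helper name encode_target:

```python
def encode_target(result, target="A"):
    """Return 1 if the match result is the outcome being modelled, else 0."""
    return 1 if result == target else 0

# Hypothetical result codes: "H" home win, "D" draw, "A" away win.
results = ["H", "A", "D", "A", "H"]
away_target = [encode_target(r) for r in results]              # models away wins
home_target = [encode_target(r, target="H") for r in results]  # models home wins
print(away_target)  # [0, 1, 0, 1, 0]
print(home_target)  # [1, 0, 0, 0, 1]
```

For an away-win model, home wins and draws both map to 0, so no decision about their relative proportions is needed at this stage.
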
22-01-2008, 23:34   #15
Re: Random Sampling

Replied on MSN.
Mr Intensity
All times are GMT.

