I appreciate everyone's input, especially the latter statistical analysis. Thankfully I will not be publishing any of this data in epidemiology journals. As for validity... well, the degree to which this data supports the intended conclusion... all I can say is that these numbers represent the searches of players who at least had some inclination to look for other MMORPGs.
Limitations would include:
Single players doing multiple searches
"Playing around": people just plugging in different criteria to see what comes up, with no desire to actually play a "P2P 2D graphic Pirate MMORPG" (yes, that's a real search)
Categories are not all-inclusive
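The first limitation (one player searching many times) could be reduced if some visitor identifier were logged alongside each search. A minimal sketch, with entirely made-up data and a hypothetical log format:

```python
# Hypothetical search log: (visitor_id, search_criteria) pairs.
log = [
    ("u1", "P2P 3D Fantasy"),
    ("u1", "P2P 3D Fantasy"),   # same player repeating the same search
    ("u1", "F2P 2D Pirate"),
    ("u2", "P2P 3D Fantasy"),
]

# Count each distinct (visitor, criteria) pair only once.
unique_searches = set(log)
print(len(log), "raw searches ->", len(unique_searches), "deduplicated")
```

This only addresses exact repeats by the same visitor; "playing around" searches would still be counted.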
To this point, the majority of visitors have found the finder code helpful and easy to use. And although that may not be statistically significant, it's the result I care most about.
Thanks for everyone's input. I will keep working on it.
Torrential
P.S. "you need millions of searches for anyone to care or listen" ..... Boy? What are you smoking?!?!
Originally posted by apwrsmage
In statistical analysis, depending on what you're doing, the ideal sample size is 150 to 3,000. A sample size in the millions starts to pollute the numbers.
Originally posted by Mahni
"Ideal" sample size depends on the estimated effect size in the population - there could be cases where a very small effect size posed such a great risk (say, a small % increase in aircraft failures) that it would warrant a large sample size to detect it. But for this case, your point is valid - a sample size of 150 would be enough to detect an effect size of interest (assuming no analysis of higher-order interactions). Larger sample sizes (for example, "in the millions") would not "pollute the numbers" or introduce any type of systematic bias. If you have a large sample (say, hundreds of thousands of transaction records) and you were to fit the *entire* sample with certain techniques (for example, a decision tree algorithm such as CART or CHAID), you could end up with a result that was not *generalizable* due to overfitting, but this would be the fault of the statistician for not using an appropriate validation technique, *not* of having "too much data". As I said in my previous post, my greatest concerns here are how representative the data is, and statistical validity. I don't have concerns about power or whether differences are statistically significant.
Using a sample set in the millions, unless you're dealing with a very straightforward, black-and-white situation, can introduce a certain amount of random variance that, depending on the situation, can play with power. Or, again depending on the situation, the differences in conclusions from the larger sample size are negligible enough not to warrant the extra effort... say a 20.05% trend versus a 20.03% trend.
Nevertheless, a control certainly can't be established given the nature of the situation and its uses. But the tool isn't being used for clinical trials, and there are so many possible ways the tool can be used that it can't declare a definitive result; it can only show a trend.
Which brings us all back to the original point: a trend. The tool shows the trend of curiosity among those that have used it as to what games and systems they're currently looking for. It seems to me that was the whole point of the tool in the first place. No more, no less.
Sorry for the derail here...
Large sample sets do not "introduce" "random variance". If you are saying large sample sets somehow increase unexplained variance OR variance from individual differences (think error terms in structural equation modeling), they do not. If you are saying they introduce some form of systematic bias, they do not. If you are saying that large sample sizes can create a problem due to sampling error, they can, but that's a sampling error problem. As sample sizes go up, confidence intervals go down, whether it's a sample size of one hundred or ten million.
Not having a control (group) is irrelevant here. Descriptives and estimation of population parameters (population mean, variance) can always be done with a sample of the population without a control group. If you are implying that only "clinical trials" give definitive results, that is incorrect. Observational or pseudo-experimental designs are just as valid (and in some cases *more* valid as a design) than experimental designs (such as a factorial design including a control group).
When you say "there are so many possible ways the tool can be used...", if you are saying this is not a controlled experiment, I completely agree. But that's a validity/generalizability issue (which I pointed out in earlier posts). It doesn't lead me to conclude that this is directional (in your words, a trend), but it does lead me to be *very* hesitant to draw any conclusions from the data (how respondents use the tool doesn't necessarily translate to what people are looking for in "new" MMOs).
Just to reiterate, I think the tool is great and has a lot of utility.
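The point that confidence intervals shrink as sample size grows, whether n is one hundred or ten million, can be checked in a few lines (using the normal-approximation 95% CI for a proportion; the 20% figure is arbitrary):

```python
import math

def ci_half_width(p, n, z=1.96):
    """Half-width of a normal-approximation 95% CI for a proportion p at sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

# The same observed 20% share, at wildly different sample sizes:
for n in (100, 3_000, 10_000_000):
    print(f"n={n:>10,}: 20% ± {ci_half_width(0.20, n) * 100:.4f} percentage points")
```

The interval only narrows with more data; nothing about a large n biases or "pollutes" the estimate.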
And this is the part where you overthink it and miss my point, and most of my meanings, entirely.
I'll try again.
Originally posted by apwrsmage
A sample size in the millions starts to pollute the numbers.
Not true.
In your opinion. My opinion is otherwise. It's obvious you're trying to "win" an intellectual "debate". But, this is the last I'm posting on this thread. The point that the tool may not be scientific but provides good information and is interesting has been ground into a fine paste. So feel free to say, "Nuh uh! You're all kinds of wrong! Cuz I know!" though I continue to disagree. If you want to keep bashing away at the subject in an attempt to puff yourself up, be my guest.
I've taken my time to try to explain why large sample sizes are not bad (though they are also frequently unnecessary for statistical analysis). There isn't any need to make ad hominem attacks.
__________________________________________________ In memory of Laura "Taera" Genender. Passed away on Aug/13/08 - Rest In Peace; you will not be forgotten
1. Fantastic new, thinking-player classes, and their skill/ability system: sub-skills, secret skills, unique abilities, learnable abilities, etc.
2. A fantastic adventure system, a la WH's new camping system, plus a personal system for both player and class. I for one would also like to see a "what you see is what you get" system; I mean, how many bears have money, a sword, and armor on them?
3. Graphics that knock the S... out of you, but only to the limit where a decent computer can run it. Not a la AOC.
4. A fearsome, unholy race, and the same goes for a holy race, a la the Heroes of Might and Magic system.
5. A crafting system where the things you make can also be unique, plus a system where players for once have to work for it, big time.
6. And so on.
Beta: GW, WOW, AOC, DAOC, DOD, L2, VG. Played: GW necro maxed, WOW never maxed, DAOC reaver/paladin/cabalist/vamp maxed, L2 never maxed, VG never maxed.
F2P is a scam. Game companies are out to make a profit, and free to play is a way to scam people out of money in a shady way. F2P should be referred to as "incremental".
Let me know the cost up front.
I don't want some corporate scam dangling cash-bought goodies all over the game, designed to draw you into spending more cash in the long run.
Remember: the goal for the companies making games is PROFIT.
I do like free expansions, not paid expansions. If people pay monthly, this is what they constantly pay for. You should not be double-hit like in EverQuest or Galaxies, where they force you to pay for an expansion on top of a monthly fee.
Guild Wars is the only non-monthly method I like. Again, I like to know the cost up front, not some company trying to inch its hand into your pocket.
Free to play is not always a scam. It depends on how the cash shop is put into the game. If it is a shop selling bonus, cooler-looking items that confer no advantage over other items, then in no way is this a scam; it is just the way the companies make a profit, because that's what they have to do. Even in a case like Mabinogi, where you have to buy a card to reduce your age or to access certain parts of the game, it's not a scam: the companies have to make a profit, and most of those things are optional. Even the ones in Mabinogi that aren't optional probably cost as much as, or less than, a P2P game.
It is wrong to call a game a scam just because it has a cash shop.
I had a good chuckle at this. Just to nail the coffin shut: a sample size of a million would really only be bad (albeit often unnecessary) if the population you were sampling were less than a million, in which case the person taking the sample would more than likely be laughed at.
As for what you can draw from this data: you can safely say that, of the people actively searching for a new MMORPG, 93% either don't want sci-fi or are indifferent to it. :P
What would be really cool is if you could release the raw data. Then we could work out things like what percentage of people looking for free MMORPGs also want particular features, or tell a prospective sci-fi developer that most people looking for that type of game want a 3D P2P game... things like that.
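For example, with hypothetical raw records (the field names and values here are made up, not the finder's actual schema), a conditional breakdown like "% of F2P searchers by genre" takes only a few lines:

```python
from collections import Counter

# Hypothetical raw search records, one dict per search (illustrative only).
searches = [
    {"price": "F2P", "graphics": "3D", "genre": "Fantasy"},
    {"price": "F2P", "graphics": "2D", "genre": "Sci-fi"},
    {"price": "P2P", "graphics": "3D", "genre": "Sci-fi"},
    {"price": "F2P", "graphics": "3D", "genre": "Fantasy"},
    {"price": "P2P", "graphics": "3D", "genre": "Fantasy"},
]

# "% of people looking for free MMORPGs are looking for the following..."
f2p = [s for s in searches if s["price"] == "F2P"]
genre_share = Counter(s["genre"] for s in f2p)
for genre, count in genre_share.most_common():
    print(f"{genre}: {count / len(f2p):.0%} of F2P searches")
```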
after 6 or so years, I had to change it a little...
I don't see any use for this information at all I'm afraid. It has absolutely no value in representing the views of anyone.
You don't know why people searched for certain terms, how many times the same person or people searched for the same terms, or how many other terms they also searched for not listed in your data.
Torrential: DAOC (Pendragon)
Awned: World of Warcraft (Lothar)
Torren: Warhammer Online (Praag)
But for this case, your point is valid - a sample size of 150 would be enough to detect an effect size of interest (assuming no analysis of higher order interactions).
Larger sample sizes (for example, "in the millions") would not "pollute the numbers" or introduce any type of systematic bias. If you have a large sample (say hundreds of thousands of transaction records, as an example) and you were to fit the *entire* sample with certain techniques (for example, a decision tree algorithm such as CART or CHAID) you could end up with a result that was not *generalizable* due to overfitting, but this would be the fault of the statistician for not using an appropriate validation technique, *not* for having "too much data".
As I said in my previous post, my greatest concerns here are how representative the data is, and with statistical validity. I don't have concerns about power or whether differences are statistically significant.
Using a sample set in the millions, unless you're dealing with a very straight-forward, black and white situation can introduce a certain amount of random variance that, depending on the situation, can play with power. Or, again depending on the situation, the differences in conclusions from the larger sample size are negligible enough to not warrant the extra effort... say a 20.05% trend versus a 20.03% trend.
Nevertheless, in this situation a control certainly can't be established given the nature of the situation and its uses, but the tool isn't being used for clinical trials, and there are so many possible ways the tool can be used it can't declare a definitive result, it can only show a trend.
Which brings us all back to the original point: A trend. The tool shows the trend of curiosity among those that have used it as to what games and systems they're currently looking for. It seems to me that was the whole point of the tool in the first place. No more, no less.
But for this case, your point is valid - a sample size of 150 would be enough to detect an effect size of interest (assuming no analysis of higher order interactions).
Larger sample sizes (for example, "in the millions") would not "pollute the numbers" or introduce any type of systematic bias. If you have a large sample (say hundreds of thousands of transaction records, as an example) and you were to fit the *entire* sample with certain techniques (for example, a decision tree algorithm such as CART or CHAID) you could end up with a result that was not *generalizable* due to overfitting, but this would be the fault of the statistician for not using an appropriate validation technique, *not* for having "too much data".
As I said in my previous post, my greatest concerns here are how representative the data is, and with statistical validity. I don't have concerns about power or whether differences are statistically significant.
Using a sample set in the millions, unless you're dealing with a very straight-forward, black and white situation can introduce a certain amount of random variance that, depending on the situation, can play with power. Or, again depending on the situation, the differences in conclusions from the larger sample size are negligible enough to not warrant the extra effort... say a 20.05% trend versus a 20.03% trend.
Nevertheless, in this situation a control certainly can't be established given the nature of the situation and its uses, but the tool isn't being used for clinical trials, and there are so many possible ways the tool can be used it can't declare a definitive result, it can only show a trend.
Which brings us all back to the original point: A trend. The tool shows the trend of curiosity among those that have used it as to what games and systems they're currently looking for. It seems to me that was the whole point of the tool in the first place. No more, no less.
Sorry for the derail here...Large sample sets do not "introduce" "random variance". If you are saying large sample sets somehow increase unexplained variance OR variance from individual differences (think error terms in structural equation modeling), they do not. If you are saying they introduce some form of systematic bias, they do not. If you are saying that large sample sizes can create a problem due to sampling error, they can but thats a sampling error problem. As sample sizes go up, confidence intervals go down, whether its a sample size of one hundred or ten million.
Not having a control (group) is irrelevant here. Descriptives and estimation of population parameters (population mean, variance) can always be done with a sample of the population without a control group. If you are implying that only "clinical trials" give definitive results, that is incorrect. Observational or psuedo-experimental designs are just as valid (and in some cases *more* valid as a design) than experimental designs (such as a factorial design including a control group).
When you say "there are so many possible ways the tool can be used...", if you are saying this is not a controlled experiment, I completely agree. But thats a validity / generalizability issue (that I pointed out in earlier posts). It doesn't lead me to conclude that this is directional (in your words, a trend), but it does lead me to be *very* hesitant in drawing any conclusions from the data (how respondents use the tool doesn't necessarily translate with what people are looking for from "new" mmos).
Just to reiterate, I think the tool is great and has a lot of utility.
But for this case, your point is valid - a sample size of 150 would be enough to detect an effect size of interest (assuming no analysis of higher order interactions).
Larger sample sizes (for example, "in the millions") would not "pollute the numbers" or introduce any type of systematic bias. If you have a large sample (say hundreds of thousands of transaction records, as an example) and you were to fit the *entire* sample with certain techniques (for example, a decision tree algorithm such as CART or CHAID) you could end up with a result that was not *generalizable* due to overfitting, but this would be the fault of the statistician for not using an appropriate validation technique, *not* for having "too much data".
As I said in my previous post, my greatest concerns here are how representative the data is, and with statistical validity. I don't have concerns about power or whether differences are statistically significant.
Using a sample set in the millions, unless you're dealing with a very straight-forward, black and white situation can introduce a certain amount of random variance that, depending on the situation, can play with power. Or, again depending on the situation, the differences in conclusions from the larger sample size are negligible enough to not warrant the extra effort... say a 20.05% trend versus a 20.03% trend.
Nevertheless, in this situation a control certainly can't be established given the nature of the situation and its uses, but the tool isn't being used for clinical trials, and there are so many possible ways the tool can be used it can't declare a definitive result, it can only show a trend.
Which brings us all back to the original point: A trend. The tool shows the trend of curiosity among those that have used it as to what games and systems they're currently looking for. It seems to me that was the whole point of the tool in the first place. No more, no less.
Sorry for the derail here...
Large sample sets do not "introduce" "random variance". If you are saying large sample sets somehow increase unexplained variance OR variance from individual differences (think error terms in structural equation modeling), they do not. If you are saying they introduce some form of systematic bias, they do not. If you are saying that large sample sizes can create a problem due to sampling error, they can but thats a sampling error problem. As sample sizes go up, confidence intervals go down, whether its a sample size of one hundred or ten million.
Not having a control (group) is irrelevant here. Descriptives and estimation of population parameters (population mean, variance) can always be done with a sample of the population without a control group. If you are implying that only "clinical trials" give definitive results, that is incorrect. Observational or psuedo-experimental designs are just as valid (and in some cases *more* valid as a design) than experimental designs (such as a factorial design including a control group).
When you say "there are so many possible ways the tool can be used...", if you are saying this is not a controlled experiment, I completely agree. But that's a validity / generalizability issue (which I pointed out in earlier posts). It doesn't lead me to conclude that this is directional (in your words, a trend), but it does lead me to be *very* hesitant in drawing any conclusions from the data (how respondents use the tool doesn't necessarily translate into what people are looking for from "new" MMOs).
Just to reiterate, I think the tool is great and has a lot of utility.
And this is the part where you overthink it and miss my point, and most of my meanings, entirely.
Not true.
In your opinion. My opinion is otherwise. It's obvious you're trying to "win" an intellectual "debate". But, this is the last I'm posting on this thread. The point that the tool may not be scientific but provides good information and is interesting has been ground into a fine paste. So feel free to say, "Nuh uh! You're all kinds of wrong! Cuz I know!" though I continue to disagree. If you want to keep bashing away at the subject in an attempt to puff yourself up, be my guest.
I've taken my time to try to explain why large sample sizes are not bad (though they are also frequently unnecessary for statistical analysis). There isn't any need to make ad hominem attacks.
Good job Czzarre
__________________________________________________
In memory of Laura "Taera" Genender. Passed away on Aug/13/08 - Rest In Peace; you will not be forgotten
1. Fantastic, new-thinking classes, and their skill/ability system: sub-skills, secret skills, unique abilities, learning abilities, etc.
2. A fantastic adventure system, like WH's new system when camping, and a personal system for both player and class. And I for one would like to see a "what you see is what you get" system. I mean, how many bears have money, a sword, and armor on them?
3. Graphics that knock the S... out of you, but only to the limit where a decent computer can run it. Not like AOC.
4. A fearsome and unholy race, and the same goes for a holy race, like the Heroes of Might and Magic system.
5. A crafting system where you also can be unique in the things you make, plus a system where players for once have to work for it, big time.
6. And so on.
Beta:
GW, WOW, AOC, DAOC, DOD, L2, VG.
Played:
GW necro max, WOW never maxed, DAOC reaver-paladin-cabalist-vamp max, L2 never maxed, VG never maxed.
Play:
Daoc - Paladin. Classic.
I appreciate the statistical debate. As such I have now edited the title of this thread to read...
What are players looking for in a new MMORPG?...Here are some numbers which may or may not have significance depending on the power analysis construct
All in fun guys
Torrential
Torrential: DAOC (Pendragon)
Awned: World of Warcraft (Lothar)
Torren: Warhammer Online (Praag)
People need to stop using the term "free to play"
F2P is a scam. Game companies are out to make a profit. Free to play is a way to scam people out of money in a shady way. F2P should be referred to as "incremental".
Let me know the cost upfront.
I don't want some corporate scam dangling cash-bought goodies all over the game, made to draw you into spending more cash in the long run.
Remember. The goal for the companies making games is PROFIT.
I do like free expansions, not paid expansions. If people pay monthly, this is what you constantly pay for. You should not be double-hit like in Everquest or Galaxies, where they force you to pay for an expansion on top of a monthly fee.
Guild Wars is the only non-monthly method I like. Again, I like to know the cost up front, not have some company trying to inch its hand into your pocket.
SHOHADAKU
The site is good and the tool works decently enough. I answered honestly and it told me all the games I play already.
As for the debate, despite agreeing with you mahni, people who use the term "ad hominem" smell.
-----
The person who is certain, and who claims divine warrant for his certainty, belongs now to the infancy of our species.
Free to play is not always a scam. It depends on how the cash shop is put into the game. If it is a shop to buy bonus, cooler-looking items which have no advantage over other items, then in no way is this a scam; it is just a way the companies make a profit, because that's what they have to do. If it is like Mabinogi, where you have to buy a card to reduce your age or to do certain parts of the game, this is still not a scam. The companies have to make a profit, and most of those things are optional. Even if the ones in Mabinogi are not, they are probably cheaper than, or cost about the same as, a P2P game.
It is wrong to call a game a scam just because it has a cash shop.
There is no such thing as "free to play".
I had a good chuckle at this exchange. Just to nail the coffin shut: a sample size of a million would really only be bad (albeit often unnecessary) if the population you were sampling was less than a million, in which case the person taking the sample would more than likely be laughed at.
As for what you can draw from this data: you can safely say that, of the people who are actively searching for a new MMORPG, 93% either don't want sci-fi or are indifferent to it. :P
What would be really cool is if you could release the raw data. Then we could do things like work out what percentage of people looking for free MMORPGs are looking for each other feature, or, if you were to release a sci-fi MMORPG, whether more of the people looking for that type of game want a 3D P2P game... things like that.
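That kind of cross-tabulation would be simple if the raw data were released. Here is a minimal Python sketch; the records and field names (`price`, `graphics`, `genre`) are invented placeholders, not the tool's actual data format:

```python
from collections import Counter

# Hypothetical raw search records, one dict per search.
searches = [
    {"price": "free", "graphics": "3d", "genre": "fantasy"},
    {"price": "free", "graphics": "2d", "genre": "sci-fi"},
    {"price": "p2p",  "graphics": "3d", "genre": "fantasy"},
    {"price": "free", "graphics": "3d", "genre": "sci-fi"},
]

# Restrict to searches for free games, then tally graphics preference.
free = [s for s in searches if s["price"] == "free"]
graphics_share = Counter(s["graphics"] for s in free)
total = len(free)

for style, count in graphics_share.most_common():
    # e.g. "3d: 67% of 'free' searches"
    print(f"{style}: {100 * count / total:.0f}% of 'free' searches")
```

The same filter-then-Counter pattern answers any "% of group X wanting Y" question, which is exactly the kind of breakdown suggested above.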
after 6 or so years, I had to change it a little...
I don't see any use for this information at all, I'm afraid. It has absolutely no value in representing the views of anyone.
You don't know why people searched for certain terms, how many times the same person or people searched for the same terms, or how many other terms they also searched for not listed in your data.
Completely pointless and of no value.