To request a specific blog topic or if you have any questions email James@StatisticsSolutions.com

Wednesday, April 10, 2013

Selecting a Survey Instrument



When selecting a survey instrument for dissertation research, several important factors should influence the decision.  First and foremost, the instrument should accurately measure the variable of interest.  If the goal of the research is to assess the job satisfaction of top executives at Fortune 500 companies, you will need to select an instrument that measures job satisfaction.  In this instance, the Job Satisfaction Survey would be a good choice.  The instrument is composed of 36 Likert scale items.  With this particular instrument, you can calculate a total score, or you can calculate nine sub-scale scores.  If you want to know about overall job satisfaction, the total score would be a sufficient measure.

It is important to select an instrument that has been found to be reliable and valid.  Reliability refers to the extent to which the instrument yields the same results over multiple trials.  Validity refers to the extent to which the instrument measures what it was designed to measure.  There are several ways to assess the reliability and validity of the instrument once data have been collected; however, these factors are important to know prior to data collection.  To determine whether the instrument has been shown to be reliable and valid, research the instrument and find out what previous studies ascertained.  A quick review of previous research that used the instrument should allow you to do this.

When selecting a survey instrument, it is important to know how total scores or averages are calculated and what higher or lower scores indicate.  Oftentimes, survey instruments comprise negatively worded items as well as positively worded items.  When scoring these instruments, it is important to know which items need to be reverse scored prior to calculation.  It is also important to understand how the instrument has been scored in previous studies and to duplicate that scoring method for your study.
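As a rough illustration of reverse scoring, the sketch below assumes a hypothetical five-item scale with response options from 1 to 5 in which items 2 and 4 are negatively worded.  The item numbers and the responses are invented for the example; for a real instrument, the items to reverse and the scoring rules come from the instrument's own scoring guide and from previous studies.

```python
# Hypothetical scale: 1 = strongly disagree ... 5 = strongly agree.
SCALE_MIN, SCALE_MAX = 1, 5

# Assumption for illustration only: items 2 and 4 are negatively worded.
REVERSE_ITEMS = {2, 4}


def reverse_score(value, low=SCALE_MIN, high=SCALE_MAX):
    """Flip a response so that 1 becomes 5, 2 becomes 4, and so on."""
    return low + high - value


def score_scale(responses):
    """Return (total, mean) for one participant after reverse scoring.

    `responses` maps item number to the raw response.
    """
    scored = [
        reverse_score(value) if item in REVERSE_ITEMS else value
        for item, value in responses.items()
    ]
    total = sum(scored)
    return total, total / len(scored)


# One made-up participant: item number -> raw response.
participant = {1: 4, 2: 2, 3: 5, 4: 1, 5: 3}
total, mean_score = score_scale(participant)
print(total, mean_score)
```

Reversing before summing keeps the direction of every item the same, so a higher total always means more of the construct being measured.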

One other factor to consider when selecting an instrument is the type of data you will obtain.  If you are planning a purely descriptive study, the design of the response options can vary from question to question.  If you plan to use inferential statistics, it is beneficial to be able to create total scores.  In order to create total scores or average scores, you typically want all response options that make up a particular scale or sub-scale to have the same range, perhaps 1 - 5 where 1 = strongly disagree and 5 = strongly agree.

Remember, the two most important factors in selecting an instrument are that the instrument measures your variable of interest and that it is reliable and valid.  That information, coupled with the other suggestions, will assist you in selecting an excellent instrument.

Tuesday, March 12, 2013

Coefficient of Determination



  • The coefficient of determination is used to determine how much variance two variables share, or how much variance in an outcome variable is explained, or accounted for, by a set of variables (predictors).
  • Values can range from 0.00 to 1.00, or 0 to 100%.

  • In terms of regression analysis, the coefficient of determination is an overall measure of the accuracy of the regression model.

  • In simple linear regression analysis, this coefficient is calculated by squaring the r value between the two variables, where r is the correlation coefficient.


  • It helps to describe how well a regression line fits (a.k.a., goodness of fit). An R² value of 0 indicates that the regression line explains none of the variance in the set of data points, and a value of 1 indicates that the regression line perfectly fits the set of data points.

  • By definition, R² is calculated as one minus the Sum of Squares of Residuals (SSerror) divided by the Total Sum of Squares (SStotal):  R² = 1 – (SSerror / SStotal).

  • In the case of a multiple linear regression, adding predictors can never lower the coefficient of determination, so it can be higher in value even when the predictor variables are too correlated with one another (referred to as multicollinearity) and contribute little unique information.

  • If, for whatever reason, there is multicollinearity in the regression model, the Adjusted R Squared (Adjusted Coefficient of Determination) should be interpreted.  The Adjusted R² can take on negative values, but it is always less than or equal to the Coefficient of Determination.  Note: Unlike R², the Adjusted R² increases when a predictor variable is added to the regression model only if that predictor improves the model more than would be expected by chance.

  • Inversely, the Coefficient of Non-Determination describes the amount of unexplained, or unaccounted for, variance between two variables, or between a set of variables (predictors) and an outcome variable. The Coefficient of Non-Determination is simply 1 – R².
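The calculations described in this list can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the five data points are made up, the helper names are hypothetical, and the Adjusted R² uses the common formula 1 – (1 – R²)(n – 1) / (n – p – 1) for n observations and p predictors, which this post does not spell out.

```python
from math import sqrt
from statistics import mean


def fit_line(x, y):
    """Least-squares slope and intercept for simple linear regression."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(y, y_hat):
    """R^2 = 1 - (SS_error / SS_total)."""
    y_bar = mean(y)
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_error / ss_total


def pearson_r(x, y):
    """Pearson correlation coefficient between two variables."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p predictors (assumed formula)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Made-up data: one predictor (x) and one outcome (y).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)
y_hat = [intercept + slope * xi for xi in x]
r2 = r_squared(y, y_hat)

print("R squared:", round(r2, 3))
print("r squared:", round(pearson_r(x, y) ** 2, 3))
print("Adjusted R squared:", round(adjusted_r_squared(r2, n=len(y), p=1), 3))
print("Coefficient of non-determination:", round(1 - r2, 3))
```

For this made-up data, R² and the squared correlation agree at about 0.60, and the Adjusted R² is lower than R², as expected.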

Monday, March 4, 2013

When to Use Descriptive Statistics to Answer Research Questions (RQs)



·         Descriptive statistics are the appropriate analyses when the goal of the research is to present the participants’ responses (as frequencies and percentages and/or as means and standard deviations) to survey items in order to address the research questions.  There are no hypotheses in descriptive statistics.

·         Descriptive statistics include: frequencies and percentages for categorical (ordinal and nominal) data; and averages (means, medians, and/or modes) and measures of spread (standard deviations and/or ranges) for continuous data.  Frequency is the number of participants that fit into a certain category or group; it is also beneficial to know the percent of the sample that falls into that category/group.  Percentages can be calculated to assess the percent of the sample that corresponds with the given frequency; they are typically presented without decimal places (according to APA 6th ed. standards).  Typically, the average that is calculated/presented is the mean.  Means describe the average value for a continuous item, and standard deviations describe the spread of the values around the mean.

·         You cannot (statistically) infer results with descriptive statistics. Inferential (parametric and non-parametric) statistics are conducted when the goal of the research is to draw conclusions about the statistical significance of the relationships and/or differences among variables of interest.

·         Power analyses (sample size and effect size) can be conducted when the analyses used to address the research questions are inferential, but not when they are descriptive.  There is no minimum sample size required to conduct descriptive statistics.

·         Descriptive statistics are appropriate when the research questions ask questions similar to the following:

      • What is the percentage of X, Y, and Z participants?
      • How long have X, Y, and Z participants been in a certain group/category?
      • What are, or describe, the factors of X?
      • What is the average of variable Y?
      • How much do X participants agree about a certain topic?
      • What are, or describe, the similarities and/or differences on a certain topic by group/category?
·         Example: a study was conducted on a group of college students about specific courses offered, where the questions had “check all that apply” responses.  The study’s research question asked “What courses offered to college students are most prevalent?”  Descriptive statistics would be the appropriate analysis to address the research question.  Frequencies and percentages could be calculated for the survey’s listed courses that students took/registered for.  See the table below for details.

Frequencies and Percentages on the Survey’s Listed Courses

Course                       n     %
English composition 101     35    25
Chemistry 101               53    66
Algebra 101                 16     4
Pottery                      2     1
Intro to Psychology         70    85
Art 101                     72    86
Note.  Percentages may not total 100 due to rounding error and because participants were allowed to select multiple responses.
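Frequencies and percentages for “check all that apply” items can be tallied along the following lines.  This is a minimal sketch: the four students and their selections are made up, the function name is hypothetical, and it assumes each percentage is taken out of the number of participants rather than the number of selections, which is why the percentages can add up to more than 100.

```python
from collections import Counter

# Made-up "check all that apply" responses: one set of courses per student.
responses = [
    {"Art 101", "Intro to Psychology"},
    {"Art 101", "Chemistry 101"},
    {"Intro to Psychology"},
    {"Art 101", "Intro to Psychology", "Pottery"},
]


def frequency_table(responses):
    """Return {course: (n, percent of participants)}.

    Percentages are rounded to whole numbers and use the number of
    participants as the denominator (an assumption of this sketch).
    """
    n_participants = len(responses)
    counts = Counter(course for selected in responses for course in selected)
    return {
        course: (n, round(100 * n / n_participants))
        for course, n in counts.most_common()
    }


table = frequency_table(responses)
for course, (n, pct) in table.items():
    print(f"{course:<24}{n:>3}{pct:>6}")
```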


Tuesday, February 19, 2013

Writing a quantitative research question



Formulating a quantitative research question can often be a difficult task.  When composing a research question, a researcher needs to determine whether they want to describe data, compare differences among groups, assess a relationship, or determine whether a set of variables predicts another variable.  The type of question the researcher asks will help to determine the type of statistical analysis that needs to be conducted.  It is also important to consider what specific variables need to be assessed when writing a research question.  The researcher must be certain all variables are quantifiable, or measurable.  Measuring variables can be as simple as having participants report their age or as involved as having participants answer survey questions that make up a reliable instrument.  Some examples of different types of research questions are presented below:

Descriptive:
Describe the teachers’ perceptions of the newly implemented reading assessment program.
The goal of a descriptive research question is to describe the data.  The researcher cannot infer any conclusions from this type of analysis; it simply presents data.  Descriptive questions do not have corresponding null and alternative hypotheses because the researcher is not making inferences.  Descriptive studies can be conducted on categorical or continuous data.

Comparative:
Are there differences in students’ grades by gender (male vs. female)?
Are there differences in job level (entry vs. mid vs. executive) by gender (male vs. female)?
Comparative questions can be assessed using a continuous variable and a categorical grouping variable, as well as with two categorical grouping variables.  The type of analysis will vary depending on the types of data.

Relationship:
Is there a relationship between age and fitness level?
Is there a relationship between ice cream sales and temperature at noon?
Questions that assess relationships do not require a definitive independent and dependent variable, but two variables are required; they can be considered variables of interest as opposed to independent and dependent variables.  Data used for this type of analysis can be dichotomous, ordinal, or continuous.  The type of analysis will vary depending on the types of data.

Predictive:
Do age, gender, and education predict income?
Does a pitcher’s ERA predict the number of wins the team has?
Predictive questions have a definitive independent and dependent variable.  Typically, the independent variable should be continuous or dichotomous, but nominal and ordinal variables can be used.  When nominal and ordinal variables are used as predictors, they must be dummy coded.  Like the independent variable, the dependent variable is typically continuous or dichotomous, but can also be ordinal or nominal.  The type of analysis that is appropriate will vary based upon the type of data.
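To make dummy coding concrete, here is a small sketch that turns a nominal predictor into 0/1 indicator variables.  The education categories, the choice of reference category, and the function name are all made up for illustration; statistical packages typically offer their own tools for this step.

```python
def dummy_code(values, reference):
    """Dummy code a nominal variable into 0/1 indicator columns.

    The reference category gets no column of its own; it is represented
    by zeros in every indicator.
    """
    categories = sorted(set(values) - {reference})
    return [
        {f"is_{category}": int(value == category) for category in categories}
        for value in values
    ]


# Made-up education levels for four participants.
education = ["high_school", "bachelors", "graduate", "bachelors"]
coded = dummy_code(education, reference="high_school")
for row in coded:
    print(row)
```

A variable with k categories yields k - 1 indicators, so the three education levels here become two dummy variables, each interpreted relative to the reference category.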