To prepare for this discussion, read the instructor guidance, and Sections 2.2, 2.3, 2.4, 3.3, "Steps" in Section 3.4, 3.5, 4.3, and Chapter 5 of the Newman (2016) textbook.

In contrast to qualitative designs, which are all basically non-experimental and descriptive, quantitative research designs may be either experimental or non-experimental. Within the non-experimental category, descriptive and correlational research are sub-categories.

Using the University of Arizona Global Campus Library databases, look for a scholarly/peer-reviewed quantitative research study on the topic you selected in Week 1. In your initial post,

Appraise the differences between experimental and non-experimental research.
Differentiate between a correlational study and an experimental study.
State the hypothesis being tested in the selected quantitative research study.
Identify the major variables and categorize them as independent or dependent.
Describe the methods and results of the study.
Determine whether the study is descriptive, correlational, or experimental, and explain why it fits this classification.
Instructor guidance

Welcome to Week 4 of Research Methods! This week, you will learn about a variety of popular quantitative research designs, both non-experimental and experimental. Required resources are Sections 2.2, 2.3, 2.4, 3.3, the part of Section 3.4 about "Steps in Observational Research," Sections 3.5, 4.3, and all of Chapter 5 in the Newman (2016) textbook. There are also some additional resources cited in this guidance that you may find helpful in the assignments.

Assignments for the week include a discussion, an interactive learning activity and quiz, and a written assignment. To see how your assignments will be graded, look at the rubrics accessible through a link on the screen for each discussion or assignment.

The Week 4 discussion is Experimental Versus Non-experimental Quantitative Research. Your initial post is due by Day 3, and all replies are due by Day 7. To prepare for the discussion, read this instructor guidance and the sections of the Newman (2016) textbook listed above. Use the University of Arizona Global Campus Library databases to find a peer-reviewed quantitative research study on the topic you chose in Week 1. In your initial post, discuss the differences between experimental and non-experimental research, and between correlational and experimental studies. Summarize the main points of your selected study, including the hypothesis, the major variables, the research methods used, and the results. Determine whether the study is descriptive, correlational, or experimental, and explain why it fits into that category. Document your sources in APA style. At least two replies to peers' initial posts are required, in addition to replying to others who post on your thread.

After you have learned about quantitative research from the assigned readings and participated in the discussion, you will be ready to complete the interactive activity and take the quiz called Quantitative Research Fundamentals, due by Day 6. In the first part of the learning activity, you will match variables to their measurement scales. In the second part, you will answer multiple-choice questions about quantitative research, hypothesis testing, validity, and reliability. When you have mastered the interactive learning activity, take the graded quiz. As with all quizzes in this course, you may retake it as many times as you wish until the end of the course to improve your score. Your highest score will be retained.

The written assignment is a Quantitative Research Critique paper, which is due on Day 7. Review the assigned readings and discussion forum posts. The assignment prompt also provides links to Writing Center and Library resources on how to read a scholarly article and write a critique, which will be helpful to review before starting the assignment. Your instructor will post an announcement with the reference for the assigned article to be critiqued. Retrieve the article from the University of Arizona Global Campus Library, and also download the Quantitative Research Critique Template provided in the Course Materials and the assignment prompt. The template is set up in APA format with a series of questions to answer about the assigned study. Submit your completed template form to Waypoint.

After completing this instructional unit, you will be able to:

Compare and contrast experimental and non-experimental research approaches.
Identify the key features, pros, and cons of selected quantitative research designs.
Differentiate terms used in quantitative research and hypothesis testing.
Critique a quantitative research study.
Keep these objectives in mind as you go through this week's learning activities.

Although you read this in an earlier week, you may want to review Section 1.2 of the Newman (2016) textbook, which outlines the steps in the scientific method for quantitative studies. These steps are Hypothesize, Operationalize, Measure, and Explain (HOME). In Week 2, you formulated a research question and a hypothesis to fulfill step 1 of this process. The second step is to operationally define the variables you will use. The concepts you are researching must be represented by measurable variables, which you choose in this step of the research. The third step is to collect properly measured data, ensuring validity and reliability. Psychometrics, which is the field of psychological measurement, includes setting measurement scales for variables and composing measurement instruments such as tests, questionnaires, and surveys.

Essentially, variables in research mean the same thing that variables meant in your high school algebra class. They are numerical or logical values that may be unknown and can change. A variable must also have more than one possible value; otherwise, it would not be able to vary. There are four scales of measurement for variables: nominal, ordinal, interval, and ratio. It is important to define the variables and how they will be measured before designing any data collection instruments, such as surveys. To end up with the most precise and accurate results possible, you should try to operationalize your variables at the highest feasible measurement level. The measurement scale of the variables also determines which statistical tests can be used to analyze the data. Higher level scales allow more flexibility in the choice of statistical procedures.

Nominal scale variables are the lowest measurement level and denote differences in kind rather than differences in amount or intensity. Even though they are sometimes referred to as qualitative or categorical variables, the fact that they are variables at all signifies that they are being used in quantitative research methods. Be careful not to confuse qualitative variables with qualitative research methods. A nominal variable can identify a group or a characteristic. For instance, hair color could be a variable, and the possible values could include auburn, brown, blond, and so on. Another example of a nominal variable is race. In a data set, races may be represented by numbers (for instance, 1 = white, 2 = black, 3 = Asian, 4 = American Indian, 5 = other), but the numbers are only labels; they are just different categories.

Ordinal scale variables are also categorical, but the order of the categories is meaningful. Ordinal variables indicate a ranking of values without a precise measurement of the distance between values. An example is the subjective assessment of pain. In a doctor's office, you may be asked to rate your current level of pain on a scale of 1 to 10, with 1 being the least amount of pain and 10 being the worst pain you can imagine. This is not a precise, objective measurement. The number you choose is based on your own experience and perception. A rating of 6 means that you are having more pain than you did on a day when you picked 3, but it does not mean that your pain level is exactly twice as much as it was before. This scale is useful for determining whether something is ranked higher or lower than something else, but it does not measure exactly how large the difference is. The difference between 4 and 5 may not be the same as the difference between 7 and 8.

The next step up is the interval scale. Interval scale variables have an equal amount or distance between consecutive values. This allows some mathematical operations to be used, specifically addition, subtraction, and averages (Newman, 2016). However, the interval scale does not have a true zero point. An example of an interval scale is Fahrenheit temperature. Each degree is the same distance from the next degree, but a temperature of zero degrees Fahrenheit does not mean a complete absence of temperature. Because there is no true zero, fractions and multiples of interval values do not accurately represent reality. You could not say that 40 degrees is exactly half as warm as 80 degrees, although the difference between 50 degrees and 54 degrees is the same amount as the difference between 80 degrees and 84 degrees.

Ratio scale variables are at the highest level of measurement. The ratio scale has all the properties of the interval scale plus a true zero point. Values on a ratio scale can be added, subtracted, multiplied, and divided, with meaningful results (Newman, 2016). An example of a ratio scale variable would be age (in months or years). An infant who is 12 months old is twice as old as one who is six months old.

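To make the four scales concrete, here is a minimal Python sketch (all variable names and values are invented for illustration) showing one way each scale might be represented in a data set:

```python
import pandas as pd

# Nominal: categories that differ in kind only; no order, no arithmetic.
hair_color = pd.Categorical(["brown", "blond", "auburn", "brown"])

# Ordinal: ordered categories (a 1-10 pain rating); the distances between
# values are not defined, so only rank comparisons are meaningful.
pain = pd.Categorical([3, 6, 8], categories=list(range(1, 11)), ordered=True)

# Interval: equal spacing but no true zero (Fahrenheit temperature),
# so differences are meaningful but ratios are not.
temp_f = pd.Series([40.0, 50.0, 54.0, 80.0])
print(temp_f[3] - temp_f[2])  # 26.0 -- a meaningful difference

# Ratio: true zero point (age in months), so ratios are meaningful too.
age_months = pd.Series([6, 12, 24])
print(age_months[1] / age_months[0])  # 2.0 -- twice as old
```
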
As mentioned before, the most accurate statistical results can be found by using the highest feasible measurement scale. Variables generated from questions on a survey, such as age, annual income, or years of experience, are sometimes given as multiple-choice ranges (ordinal scale) instead of having the participants fill in the actual number (ratio scale). This may help respondents feel that their privacy is more protected, but it limits the options for statistical procedures and the precision of the analysis.

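The hypothetical sketch below illustrates that trade-off: exact ages (ratio scale) are collapsed into multiple-choice ranges (ordinal scale), after which only counts and rank order, not means or standard deviations, can be recovered. The ages and bin boundaries are invented.

```python
import pandas as pd

# Ratio scale: participants filled in their exact ages.
ages = pd.Series([23, 31, 38, 45, 52, 67])
print(ages.mean(), ages.std())  # precise statistics are available

# Ordinal scale: the same data as multiple-choice ranges. The exact
# values are gone, so only frequencies and rank order remain.
age_ranges = pd.cut(ages, bins=[18, 30, 45, 60, 75],
                    labels=["18-30", "31-45", "46-60", "61-75"])
print(age_ranges.value_counts().sort_index())
```
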
A construct is something that psychologists and psychometricians believe exists, but it cannot be measured directly because it is something that occurs inside the mind. To try to measure a construct, it is necessary to measure a collection of things (variables) that are related to the construct, and then put those things together. An example is happiness. How can you tell if a person is happy? How happy is the person? You might note that the person is smiling or saying positive things. When you see or hear this evidence, you might conclude that the person is happy. On the other hand, if the person is frowning or crying, or complaining about something, you might infer that the person is not happy. But the verdict on whether the person is happy or not is a conclusion drawn based on the available evidence, not a direct measurement of happiness or the lack of it. The quality of psychometric measures is determined by calculations of validity and reliability.

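As a rough illustration of putting related indicators together, the sketch below averages three invented happiness-related items into a single composite score. The items, ratings, and scoring rule are hypothetical, not a published instrument.

```python
import pandas as pd

# Three hypothetical indicator items, each rated 1-7, standing in for the
# construct "happiness," which cannot be observed directly.
items = pd.DataFrame({
    "smiles_often":        [6, 2, 5],
    "positive_statements": [7, 3, 4],
    "life_satisfaction":   [6, 2, 6],
})

# A simple composite: each person's mean across the related items.
# The score is an inference about the construct, not a direct measurement.
happiness = items.mean(axis=1)
print(happiness)
```
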
The concept of validity refers to how appropriately and meaningfully a construct is measured by an instrument. Does the instrument really measure what it claims to measure? For instance, some of the first intelligence tests really measured how well people could read English or how familiar they were with middle class American culture instead of how intelligent they were (Gould, 1982). Imagine deciding that a recent immigrant to the United States who was just starting to learn English and was too poor to own a television set was of low intelligence. It took a while for people to realize that this was unfair and did not make sense. These days, psychometricians are much more careful about watching out for things like this when they construct tests and questionnaires. The important thing with validity is how the measurement is interpreted. In the example noted above, it would have been appropriate to interpret a low score on knowledge of popular television shows as an indication that the person probably did not watch much television, instead of concluding that the person lacked intelligence.

There are multiple types of evidence for the validity of using an instrument for a particular interpretation. Some of these are content validity, construct validity, and criterion validity. When you are reading a quantitative research report, it should include information about the validity of the instruments used for data collection in the study.

Content validity has to do with whether or not the questions (called items) are well-written and represent all of the important aspects of the field being tested. For example, a licensing exam for a nurse would not have content validity if it left out some of the most important things that nurses have to do in their work, such as taking vital signs. Content validity is usually checked by having a group of experts in the field go over the test to see if they think anything important is missing or if anything in it is not important and needs to be removed.

Construct validity has to do with whether the combination of items on the instrument is consistent with the theory that explains the construct. How accurately does the instrument measure what it is supposed to measure? Sub-categories of construct validity are face validity, convergent validity, and discriminant validity. Face validity is simply that experts in the field believe the items are asking about the construct in question and not some other construct or situation. Convergent and discriminant validity actually involve calculating a correlation coefficient. With convergent validity, you would compare your instrument with instruments that are known to measure related constructs, and you would expect to find a positive correlation. With discriminant validity, you would compare your instrument to one which is known to accurately measure a different construct; in this case, you would expect the two instruments not to be correlated. If they are, you have a problem.

Finally, criterion validity relates the measurement of the construct to the measurement of the behavior that is expected when the construct is present. Two sub-categories within criterion validity are concurrent validity and predictive validity (Newman, 2016). Concurrent validity means you are determining how similar your instrument is to another instrument given at about the same time that measures the same thing and has already been shown to be valid. The two instruments should have a high positive correlation if they really are measuring the same thing. One measure might be a series of knowledge questions, while the other might be a test of hands-on skills. Predictive validity is a little different, because usually the two measurements being compared are taken at different times. An example is the correlation between an aptitude test and actual job performance. If it can be shown that people who score high on that particular aptitude test tend to have good performance evaluations on a job using those aptitudes, then the aptitude test has predictive validity. The point is to use the right kind of test to predict the behavioral outcome (i.e., job performance) in which you are interested.

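Since convergent, discriminant, and concurrent validity all come down to correlation coefficients, here is a minimal sketch of the calculation. The instruments and scores are made up for illustration only.

```python
from scipy.stats import pearsonr

# Hypothetical scores for ten people on a new anxiety questionnaire and on
# an established, already-validated measure given at about the same time.
new_scale   = [12, 18, 25, 9, 30, 22, 15, 27, 11, 20]
established = [14, 20, 24, 10, 29, 21, 13, 28, 12, 19]

r, p = pearsonr(new_scale, established)
print(f"r = {r:.2f}")  # a high positive r supports concurrent validity

# For discriminant validity, you would instead correlate the new scale with
# a measure of an unrelated construct and expect r to be near zero.
```
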
There is a saying in psychometrics: "Reliability is a necessary but not sufficient condition for validity." This means that unless the measures are reliable, they cannot be valid for a particular use or interpretation, and that the fact that measures are reliable does not necessarily mean that they are valid for the given interpretation or decision. If I teach a unit on t tests, but then I give a test on ANOVA instead, my test might be a reliable measurement of my students' knowledge of ANOVA, but it is not a valid measurement of their knowledge of t tests. In other words, it might be a good test for one thing, but it is not the right test for what I am trying to measure.

The concept of reliability is defined as the extent to which measurement is free from error, or how consistent a measure is. The difference between the true measurement of a construct and the test score for an individual is called error. This type of error does not mean a wrong answer on a test. It means that the measurement includes some things that were not intended and that the measurements may not be consistent because of this. Some items may be interpreted differently by different participants due to unfamiliar vocabulary or the use of words that might have more than one meaning. The environment in which the instrument is given may not be perfectly consistent between testing sites, such as different temperatures, lighting, or sound levels. Even within the same room, there are things that could make the testing experience different for different participants. For example, people whose seats were near a window might have been distracted by something happening outdoors. Characteristics of the participants may also influence test scores, such as illness, problems at home, or hearing something about the topic of the test on the radio on the way to the test site. These things may influence test scores in ways that have nothing to do with the characteristic being measured.

Measurement of reliability almost always takes the form of a correlation coefficient. The values of reliability coefficients range from zero, indicating no consistency between two sets of measures, to 1.00, indicating perfect consistency between the two sets of measures. As a rule of thumb, reliability coefficients should be greater than .50, although it is commonly accepted that reliability of .70 or higher is necessary for most purposes.

Although there are several types of reliability coefficients for tests or surveys, the most popular type is Cronbach's alpha, a measure of internal consistency (also called interitem reliability). Before Cronbach developed the procedure for the alpha coefficient, researchers had to administer two different but equivalent tests to the same group of people (alternate forms reliability), give the same test to the same people on two different occasions (test-retest reliability), or split a large test into two parts and compare the half scores (split-half reliability). Coefficient alpha quickly caught on because it only requires one test, and even though the calculation is complex, it can be done easily and quickly by modern computers. (When first developed, it is said that it took a group of Cronbach's graduate assistants about two weeks to perform all of the calculations by hand.) Cronbach's alpha involves calculating the correlation of each item on the instrument with every other item on the instrument.

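One standard way to compute alpha uses the item variances and the variance of the total scores: alpha = (k / (k - 1)) x (1 - sum of item variances / variance of total scores), where k is the number of items. Here is a small sketch of that formula; the rating data are invented.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Internal consistency for a (respondents x items) array of scores."""
    x = np.asarray(item_scores, dtype=float)
    k = x.shape[1]                              # number of items
    item_variances = x.var(axis=0, ddof=1)      # variance of each item
    total_variance = x.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Five respondents answering a hypothetical three-item scale (1-5 ratings).
scores = [[4, 5, 4],
          [2, 3, 2],
          [5, 5, 4],
          [1, 2, 1],
          [3, 3, 3]]
print(round(cronbach_alpha(scores), 2))  # compare against the .70 rule of thumb
```
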
Here is an excellent video that explains reliability and validity beyond your text in different terms (Centennial, 2007). It is not required, but it is worth watching.

Reliability and validity

Quantitative research methods encompass descriptive, correlational, and experimental designs. What quantitative designs all have in common is that measurement concepts and statistical techniques are used to analyze the data pertaining to the research question.

Descriptive research designs can use either a qualitative or quantitative approach. We covered the qualitative approach last week. Quantitative descriptive methods measure and summarize things with numbers. Descriptive statistics are used to analyze data in descriptive studies. These include measures of central tendency (mean, median, and mode), measures of variation (range, variance, and standard deviation), and frequency (counts and percentages).

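Here is a brief sketch of those descriptive statistics, computed with Python's standard library on a set of invented test scores:

```python
import statistics
from collections import Counter

scores = [72, 85, 85, 90, 61, 78, 85, 94]

# Measures of central tendency
print(statistics.mean(scores), statistics.median(scores), statistics.mode(scores))

# Measures of variation
print(max(scores) - min(scores))    # range
print(statistics.variance(scores))  # sample variance
print(statistics.stdev(scores))     # sample standard deviation

# Frequency: counts and percentages
for value, n in sorted(Counter(scores).items()):
    print(value, n, f"{100 * n / len(scores):.1f}%")
```
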
In archival research, the researcher works with existing written or computerized records (Newman, 2016). For quantitative archival research, raw data sets with personal identifying information removed can be obtained from various government agencies, as well as other sources. For example, the National Center for Education Statistics (NCES) recruits large, nationally representative samples of students and collects data from them on a regular basis. Those interested in doing this type of research must try to find a data set from a reliable source that contains the variables needed to answer the research question or test the hypothesis. It is important to determine whether the operational definitions of the variables are compatible with the new theory or hypothesis and to ensure that the sample is appropriate to the target population of interest (Kluwin & Morris, 2006).

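In practice, secondary analysis often starts with loading the downloaded file and confirming that it contains the variables your operational definitions require. The file name and column names in this sketch are entirely hypothetical:

```python
import pandas as pd

# Hypothetical: a de-identified data file downloaded from a public source
# such as an NCES survey; the file and variable names are invented here.
data = pd.read_csv("nces_student_sample.csv")

# Check that the data set contains variables matching your operational
# definitions before building the analysis around it.
required = {"grade_level", "hours_studied", "science_score"}
missing = required - set(data.columns)
if missing:
    raise ValueError(f"Data set lacks needed variables: {missing}")
```
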
In quantitative observational studies, the observer assigns numerical values to what is being observed, such as the number of times a person does a certain action, the number of different actions that take place in a specified time period, or the intensity of an action. A checklist or a rating scale may be used to keep track of counts or scores.

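A checklist of counts is easy to represent in code. In this sketch, the behavior codes and the observation session are invented:

```python
from collections import Counter

# Hypothetical codes recorded during a ten-minute observation session.
observed = ["talk", "gesture", "talk", "write", "talk", "gesture"]

tally = Counter(observed)   # a simple checklist of counts per behavior
print(tally["talk"])        # how many times that action occurred
print(tally.most_common())  # behaviors ordered by frequency
```
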
Correlational research involves collecting data about at least two different things, then measuring the relationship between them. This is a non-experimental quantitative type of research. It goes beyond descriptive research because it relates different variables to each other instead of describing each variable separately. Yet, it is not in the experimental category because the researcher does not manipulate an independent variable or provide an intervention. Much survey research is in the correlational category.

Experimental research has historically been considered the "gold standard" of research methods, if the purpose of the study is to establish causation. This is because the elements of control imposed in a well-conducted experiment make it possible to determine that the independent variable (the hypothesized cause) is the only factor that is influencing the dependent variable (the observed effect). Correlational designs can establish a relationship between variables, but they cannot prove that the relationship is one of cause and effect.

Experiments use all of the qualities of the scientific method: objectivity, precise measurements, control of other possible influencing factors, careful logical reasoning, and replication, as described in your textbook (Newman, 2016). To be considered a true experiment, a research study must have all three of these characteristics: (1) manipulation of an independent variable; (2) random assignment of participants to groups or conditions; and (3) control of extraneous variables.

A quasi-experiment is a study that has some, but not all, of these characteristics. Variables which are presumed causes but cannot be manipulated by the researcher are sometimes called "quasi-independent" variables. If quasi-independent variables such as gender or race are of interest in a study, they must be used in conjunction with an independent variable that can be manipulated, such as an aspect of the environment, in order for the study to count as a true experiment. Without a manipulated independent variable, the study would be a quasi-experiment. Think about situations when it is either impossible or unethical to randomly assign people to conditions or to manipulate a variable. In these situations, a quasi-experimental design may be called for instead of a true experiment.

Random assignment means that after you have recruited a sample of participants, you use a random process to divide them into two (or more) groups, typically one group for each level of the independent variable. Randomly assigning participants to groups guards against possible bias and helps assure that the groups will be equal on unknown or extraneous factors. The two groups are called treatment (or experimental) and control.

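A random process can be as simple as shuffling the recruited sample and splitting it. This sketch assumes twenty hypothetical participants and two conditions:

```python
import random

participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 recruited participants

random.shuffle(participants)         # random order guards against bias
treatment_group = participants[:10]  # will receive the treatment being tested
control_group = participants[10:]    # will receive no or standard treatment
print(treatment_group, control_group, sep="\n")
```
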
The treatment group receives the treatment being tested, and the control group receives either no treatment or a standard treatment; this difference between the groups is the manipulation of the independent variable. Everything about the two groups should be the same except for their condition on the independent variable. After the treatment period is over, the researcher measures the dependent variable for all participants and compares the scores for the two groups. If the treatment group has a significantly different average score than the control group, you can be fairly certain that the treatment was what made the difference.

For example, suppose you want to find out if a new method of teaching science is really better than the way it is currently being done. You would get a random sample of appropriate students and randomly assign them to groups. (If you can randomly assign students to groups, this would be an experiment, but if you have to use existing classes as the groups, it would be considered a quasi-experiment.) You have to watch out for possible problems, called threats to validity, such as students from different groups talking to each other about how they are being taught (this is called diffusion of treatment). Random assignment prevents bias (i.e., assigning your favorite students to a particular group), and it also helps you make sure that you don't have all of the students who were already better at science in the same group. The groups should be as equal as possible at the beginning of the study.

In this example, the independent variable is the teaching method, which is assigned by the researcher. The dependent variable would be the score on a science test that all of the participants would take at the end of the study. If all goes well and the groups are really random and have an equal average on unknown factors that might affect knowledge or ability in the subject, you will be able to rely on the results of the comparison of test scores. So, if the treatment group's average score is much higher than the control group's average score, you can be reasonably confident that the new teaching method works better than the old method. If the average scores for the two groups are about the same, then you can conclude that both methods work just as well. Of course, if the control group has a higher average score than the treatment group, you won't want to switch to the new teaching method!

There are two aspects of experimental validity: internal and external. Internal validity concerns whether you can be sure that the independent variable in your experiment was really the cause of any observed change in the dependent variable. Without internal validity, you may as well not bother doing the experiment, because the results cannot be trusted.

External validity, on the other hand, is desirable, but not essential in the same way that internal validity is. External validity concerns whether the results of the research can be generalized to people (or animals) other than the participants in the research sample.

Both aspects of validity need to be protected from threats, a few of which are described in Chapter 5 of the textbook. An article by Onwuegbuzie (2000) contains a much more complete and detailed description of threats to internal and external validity. Controlling extraneous variables (things that might influence the dependent variable but are not part of the experiment) will help resolve many internal validity threats, but the more controls that are instituted to increase internal validity, the less likely it is that external validity will be strong. Researchers must find a balance between internal and external validity.

Both correlational and experimental research studies require the use of inferential statistics. These are statistical tests that include a hypothesis test. The researcher must formulate a null hypothesis and a research (or alternative) hypothesis. These two must be exact opposites, so that if the evidence supports one, it cannot support the other. The hypothesis test produces a p-value: the probability of obtaining results at least as extreme as those observed if the null hypothesis were true. The researcher compares the p-value to a pre-set alpha level. The alpha level is the maximum risk the researcher is willing to take of wrongly rejecting a true null hypothesis (a Type I error). A typical alpha level is .05, or a 5% risk. If the calculated p-value is higher than the researcher's alpha level, the researcher cannot reject the null hypothesis. If the p-value is less than alpha, the null hypothesis can be rejected, because the observed data would be very unlikely if the null hypothesis were true.

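To tie the pieces together, here is a sketch of a hypothesis test comparing the two groups from the teaching-method example. The scores are invented, and an independent-samples t test is assumed to be the appropriate procedure:

```python
from scipy import stats

# Hypothetical end-of-study science test scores for the two groups.
treatment_scores = [88, 92, 79, 85, 90, 94, 81, 87, 91, 86]
control_scores   = [78, 82, 75, 80, 74, 79, 83, 77, 81, 76]

alpha = 0.05  # pre-set maximum risk of a Type I error

t_stat, p_value = stats.ttest_ind(treatment_scores, control_scores)
if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject the null hypothesis")
else:
    print(f"p = {p_value:.4f} >= {alpha}: fail to reject the null hypothesis")
```
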
If you have any questions about this week's readings or assignments, email your instructor or post your question on the "Ask Your Instructor" forum. Remember, use the forum only for questions that may concern the whole class. For personal issues, use email.

References

Centennial, L. (2007). Reliability and validity [Video]. Retrieved from http://www.youtube.com/watch?v=H7fiJLUNQxI

Gould, S. J. (1982, May 6). A nation of morons. New Scientist, 349-352. Retrieved from http://systematicbiology.co.nf/Gould(1982)_Nation_of_Morons.pdf

Kluwin, T. N., & Morris, C. S. (2006). Lost in a giant database: The potentials and pitfalls of secondary analysis for deaf education. American Annals of the Deaf, 151(2), 121-128.

Newman, M. (2016). Research methods in psychology (2nd ed.). Bridgepoint Education.

Onwuegbuzie, A. J. (2000). Expanding the framework of internal and external validity in quantitative research. Paper presented at the annual meeting of the Association for the Advancement of Educational Research (AAER), Ponte Vedra, Florida. Retrieved from the ERIC database.