KEY FEATURES OF CLASSIFICATION SCHEMES

Characteristics of classification systems that hold significant implications for planning theses and dissertations include (a) systems' structures, (b) the time and method of creating categories, and (c) precision and clarity.

Systems' Structures

The term structure refers to a classification system's (a) types of categories and (b) the relationship that exists among the categories. From among many possible kinds of structure, those we describe in the following paragraphs concern the quantity of dimensions and the patterning of dimensions.

Quantity of dimensions

As suggested in our earlier examples, a dimension--or variable or taxon (whose plural is taxa)--is the aspect of life that serves as the focus of classification. Thus, dimensions can be age, gender, intelligence, attitude toward dieting, friendship patterns, use of illicit drugs, place of residence, level of formal education, favorite movie, type of physical exercise, frequency of sexual intercourse, and thousands more. The number of dimensions adopted in a thesis or dissertation can vary from one dimension to a great many. For instance, the variable time, as represented by successive historical eras, can be the single classification dimension used in an account of a Crow Indian tribe's migration pattern over a period of three centuries. In our earlier example of attitudes toward abortion, there were 17 variables--five representing personal-identification dimensions and 12 representing answers to questions about attitudes.

Patterning of dimensions

The relationships among variables in a classification system can assume diverse patterns, including linear versus hierarchical, discrete versus continuous, and exclusive versus nonexclusive classes.

In a linear typology, each variable can be seen as consisting of a single line that extends from an extreme position on the left to an opposite extreme on the right. The categories or choices along the line can vary from two (a dichotomous pattern), to three (a trichotomous pattern), or to any larger number of intervals. The greater the number of intervals along the line, the more the pattern deserves the label continuous variable.

An instance of a dichotomous variable is gender, defined as consisting of two categories--male and female. However, sometimes a variable that initially appears dichotomous actually involves far more categories. A case in point is Carl Jung's (Jung, 1971) conception of personality types in terms of introversion and extroversion. Introverted persons are concerned primarily with their own thoughts and feelings. Extroverts direct their interests outward, to things outside themselves. But rarely does anyone match either the extreme introvert or extrovert description. Most people are some mixture of both traits and therefore belong at some point along the continuum that connects the two, so that an introvert/extrovert pattern involves many intervals along a scale. Consequently, a researcher who uses Jung's notion of introversion/extroversion is obliged to classify people's personalities in terms not of just two intervals but of numerous intervals representing different degrees of the two opposing traits.
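The idea that one linear variable can be cut into two, three, or many intervals can be sketched in a few lines of Python. This is only an illustration: the 0-100 scoring scale, the equal-width intervals, and the function name classify_on_scale are assumptions made for the example, not part of Jung's typology or of any published instrument.

```python
def classify_on_scale(score, low, high, intervals):
    """Return the 1-based interval (category) that a score falls into.

    The scale runs from `low` to `high` and is cut into `intervals`
    equal-width categories; 2 gives a dichotomy, 3 a trichotomy.
    """
    if intervals < 2:
        raise ValueError("a linear typology needs at least two categories")
    if not low <= score <= high:
        raise ValueError("score lies outside the scale")
    width = (high - low) / intervals
    # The top end of the scale belongs to the last interval.
    index = min(int((score - low) / width), intervals - 1)
    return index + 1


# Hypothetical introversion/extroversion score on an assumed 0-100 scale.
score = 62
dichotomous = classify_on_scale(score, 0, 100, 2)  # coarse: two categories
seven_point = classify_on_scale(score, 0, 100, 7)  # finer: seven categories
```

With only two intervals, a person scoring 62 is simply placed in the second category; with seven intervals the same person lands in a middle-to-upper category, which is closer to the "mixture of both traits" the text describes.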

In contrast to linear classification schemes are hierarchical systems. Two of the best known hierarchies are in the field of biology: Carolus Linnaeus' Animalia and Plantae taxonomies for classifying types of animal and plant life. Each of these systems is organized in seven tiers that form the image of an inverted tree that branches downward. The most inclusive tier at the top--the trunk--is the kingdom, encompassing all the phenomena within the field of interest--all fauna in the Animalia taxonomy and all flora in the Plantae. Six additional tiers that progressively branch out below the kingdom are, in descending order, phylum, class, order, family, genus, and species. The farther one descends in each hierarchy, the greater the number of defining characteristics that determine the category in which an organism belongs.

Let's now envision a hierarchical scheme that we might create for a study of the extra costs students incur when they enroll in various courses in a university. What we are interested in discovering is how much students can expect to be charged in their courses in addition to their tuition fees--extra charges for books, supplies, equipment, laboratories, field trips, and more. We not only want to know the average extra course expenses across the entire campus, but also how much those costs vary from one department to another, between specializations within a department, and between courses within a specialization. To accomplish this, we create a six-tier hierarchical classification system--an inverted tree--as displayed in Figure 10-2.

The quantities of colleges, departments, specializations, and courses in a typical university are too large to be accommodated in a single diagram, so we have simplified Figure 10-2 by portraying only a sampling of those variables. Tier 1 represents the entire university. On Tier 2 we limit the pictured colleges to a pair--humanities and sciences. Tier 3 limits the number of depicted departments to six in each college. On Tier 4, we illustrate specializations for only two departments--the foreign language department in the humanities college and the biology department in the sciences college. On Tier 5 we distinguish three levels of courses for three foreign-language specialties (French, German, Hindi) and three biology specialties (botany, genetics, biochemistry). The three course levels are defined as follows: (1) lower-division (freshmen and sophomores) introductory survey courses, (2) upper-division (juniors and seniors) advanced courses, and (3) advanced graduate courses. The individual courses within each of these three groups are then found on Tier 6. In the humanities college, we have drawn a line under the German-language specialty to identify Level-2 (upper-division advanced) courses on Tier 6. In the sciences college, we identify Level-3 (advanced graduate) courses in the genetics specialization. Although our diagram encompasses only this restricted sampling of colleges, departments, specializations, and courses, in our study we intend to initially consider all courses in the university's curriculum.

Figure 10-2 A University Course-Analysis Hierarchy

Consider, now, two ways that our course-analysis classification scheme can help us answer our research questions. First, it seems obvious that trying to discover the extra costs for every course in the university curriculum would be an overwhelming task, far more costly in funds, time, and effort than the desired information is worth. So we need to identify a sample of courses to study, a sample that fairly represents the entire population of courses. Our course-analysis hierarchy can help us with this task. To begin, it is apparent that extra course costs will vary between colleges (humanities versus sciences), among departments, and among specializations within departments. Therefore, we probably should include all departments from both colleges in our sample. And under the specializations (Tier 4), we will likely want to distinguish between lecture and laboratory (practicum) courses so as to be sure that we include some of each type. Some departments--history, literature, philosophy, sociology, and the like--may have no classes that require special equipment or supplies so that we need not make the lecture/laboratory distinction in those departments. Hence, descending from Tier 1 through Tier 3, we have included the entire population of colleges and departments. Now, at Tier 4, we begin sampling. Within each department, we draw a selected number of specializations by means of either random or systematic sampling techniques (as described in Chapter 7 under surveys). How many specializations we choose will depend on the number of Tier 6 courses we ultimately plan to analyze. Then, for those selected specializations we move to Tier 5. By random or systematic sampling, we choose a specified number of course levels. Finally, among those course levels in our selected specializations, we choose a sample of courses that we will study.

By the foregoing process, we end up with a sample of courses whose extra costs we can discover by polling a sample of students who have been enrolled in the courses. We can draw a sample of, let us say, four or five students from each course by randomly selecting them from last semester's class rosters in the university registrar's computer data bank. The polling can be done by sending postcard questionnaires (containing prepaid return postage) to the selected students to learn what extra costs they paid--and for what kinds of items--in the course designated on their particular postcard.

Let's imagine that our sampling method worked out well and that a large proportion of the students we polled did, indeed, return the postcards. Our classification hierarchy now equips us to tell a detailed story about extra costs of courses. When the results of our postcard survey are combined in various patterns, we can report the average extra course costs for the typical:

class in the university as a whole
lecture versus laboratory course in the university as a whole
course in the humanities versus typical course in the sciences
lecture versus laboratory course in the humanities versus typical lecture versus laboratory course in the sciences
course in each department
lecture versus laboratory course in each department that has both types of courses
course level (lower division, advanced upper division, graduate) in the university as a whole
course level (lower division, advanced upper division, graduate) in humanities versus course level in the sciences
course level (lower division, advanced upper division, graduate) in each department
course level (lower division, advanced upper division, graduate) in selected specializations

We are also prepared to report the variability of extra costs in each of the above categories; and with examples of specific classes, we can illustrate the course features that lead to such variability. Without the aid of our hierarchical classification system, we would not have been prepared to report so much information in such a systematic fashion.
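The two uses of the hierarchy described above, drawing a sample tier by tier and then averaging the survey returns at whichever tier is of interest, can be sketched in Python. Everything named here is hypothetical: the course titles, the CURRICULUM structure, the sample sizes, and the cost figures are invented for illustration, and simple random sampling stands in for whichever random or systematic technique a study actually adopts.

```python
import random
from collections import defaultdict
from statistics import mean

# Hypothetical slice of the hierarchy: college -> department ->
# specialization -> course level -> course titles.
CURRICULUM = {
    "humanities": {
        "foreign languages": {
            "French": {1: ["FR 1", "FR 2"], 2: ["FR 101"], 3: ["FR 201"]},
            "German": {1: ["GE 1"], 2: ["GE 101", "GE 102"], 3: ["GE 201"]},
        },
    },
    "sciences": {
        "biology": {
            "botany": {1: ["BOT 1"], 2: ["BOT 101"], 3: ["BOT 201"]},
            "genetics": {1: ["GEN 1"], 2: ["GEN 101"], 3: ["GEN 201", "GEN 202"]},
        },
    },
}


def draw_sample(curriculum, n_specs, n_levels, n_courses, rng):
    """Keep every college and department (Tiers 2-3), then sample
    specializations, course levels, and courses (Tiers 4-6)."""
    sample = []
    for college, departments in curriculum.items():
        for department, specs in departments.items():
            for spec in rng.sample(sorted(specs), min(n_specs, len(specs))):
                levels = specs[spec]
                for level in rng.sample(sorted(levels), min(n_levels, len(levels))):
                    courses = levels[level]
                    for course in rng.sample(courses, min(n_courses, len(courses))):
                        sample.append((college, department, spec, level, course))
    return sample


def average_by(records, tier):
    """Average reported extra cost grouped by one tier of the hierarchy.

    Each record is (college, department, spec, level, course, cost);
    `tier` is the tuple position to group on (0 = college, 3 = level).
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[tier]].append(record[-1])
    return {key: mean(costs) for key, costs in groups.items()}


rng = random.Random(0)  # fixed seed so the draw is repeatable
sampled = draw_sample(CURRICULUM, n_specs=1, n_levels=2, n_courses=1, rng=rng)

# Made-up postcard returns, for illustration only.
records = [
    ("humanities", "foreign languages", "German", 2, "GE 101", 40.0),
    ("humanities", "foreign languages", "German", 2, "GE 101", 60.0),
    ("sciences", "biology", "genetics", 3, "GEN 201", 150.0),
    ("sciences", "biology", "genetics", 3, "GEN 201", 250.0),
]
by_college = average_by(records, tier=0)
by_level = average_by(records, tier=3)
```

Because every record carries its full path through the hierarchy, the same returns can be regrouped by college, department, specialization, or course level without collecting anything further, which is the practical payoff the text attributes to the classification scheme.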

We turn now to another issue regarding codification schemes, that of exclusive versus nonexclusive classes, as reflected in the question: Within the field of interest, when an item--a bit of information--is to be classified, is there a single category in which the item must be placed, or might it be placed in more than one category? If there is only one site in which an item is properly situated, the system's classes can be considered exclusive or distinct. But if an item might reasonably be put in more than one location, the system's classes are not exclusive or, at best, are only partially so.

Both Linnaeus' Plantae taxonomy and a comparative chart of the world's languages are exclusive-class structures. A flower with the characteristics of the buttercup can be properly situated only under the genus Ranunculus. A Welshman's native language falls exclusively in the Brythonic branch of the Celtic division of the Indo-European family of languages.

In contrast to such discrete structures, other schemes prove less than perfect in terms of class exclusiveness. Consider, for example, the widely used taxonomy of educational objectives created by Benjamin Bloom (1956), David Krathwohl (1964), and their colleagues. The method of classifying the goals of education was created to help "teachers, administrators, professional specialists, and research workers who deal with curricula and evaluation problems . . . especially to help them discuss these problems with greater precision" (Bloom, 1956, p. 1). The system was divided into three domains--cognitive (intellectual skills), affective (emotional outcomes), and psychomotor (manipulative or motor skills). A separate subtaxonomy was developed for each domain. The problem of properly classifying educational goals within the cognitive domain can be illustrated with the following objective from a high school literature class:

Students will be able to compare and contrast three short stories--Poe's The Purloined Letter, Steinbeck's The Red Pony, and Maugham's A Friend in Need.

It can be argued that this goal might reasonably be located under any one or more of the following taxa:

1.24 Knowledge of the criteria by which facts, principles, opinions, and conduct are tested or judged.

4.30 Analysis of organizational principles . . . which hold [a] communication together. . . . It includes the bases, necessary arrangement, and the mechanics which make the communication a unit.

6.20 Judgments in terms of external criteria. Evaluation of material with reference to selected or remembered criteria. (Bloom, 1956, pp. 203, 206-207)

Hence, when an observed phenomenon is described in terms of this classification structure, the phenomenon might qualify for placement in more than one category. In effect, the scheme's taxa are not exclusive. Instead, they can represent diverse ways of viewing a phenomenon. This means that people who use the system are required to decide whether an item should be assigned to only one class or to more than one.
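The difference between exclusive and nonexclusive classes can be expressed as a simple check on how items have been assigned. The sketch below is an assumption-laden illustration: the function names and the dictionary layout are invented here, and the category codes are just the three taxa quoted above.

```python
def is_exclusive(assignments):
    """Return True when every item has been placed in exactly one category.

    `assignments` maps each item to the set of categories a coder
    judged it to fit.
    """
    return all(len(categories) == 1 for categories in assignments.values())


def items_with_multiple_classes(assignments):
    """List the items that qualify for more than one category."""
    return sorted(item for item, cats in assignments.items() if len(cats) > 1)


# The buttercup fits one genus; the literature objective fits three
# of the quoted taxa, so the second assignment is not exclusive.
plant_scheme = {"buttercup": {"Ranunculus"}}
objective_scheme = {"compare three short stories": {"1.24", "4.30", "6.20"}}
```

Running items_with_multiple_classes on a batch of coded data is one way to find the items for which coders still have to decide between a single class and several.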

The Time and Method of Creating Categories

As shown by our earlier examples of classification systems, the dimensions and the categories within them can be established at various stages of your research, as influenced by (a) how accurately you can predict ahead of time what sorts of dimensions and their categories will best aid you in answering your research questions and (b) whether new questions arise during the process of conducting your project.

To illustrate the matter of predictive accuracy, let's consider three guide questions that we might include in a study of the incidence and modes of transmittal of infectious diseases in Africa, Europe, and South America over the decade 1990-2000. Our special concern is with the spread of HIV (human immunodeficiency virus) infection compared to the spread of other types of infectious diseases. Among the questions that guide our study, here are three whose answers will require classification schemes. We are interested in deciding how and when we will specify the dimensions and the categories within those dimensions for the three schemes.

1. What was the annual incidence of HIV in Africa, Europe, and South America over the decade 1990-2000?
2. What was the annual incidence of different varieties of infectious disease in Africa, Europe, and South America, 1990 to 2000?
3. What was the incidence of the circumstances in which infected persons incurred HIV in Africa, Europe, and South America during the first half compared to the last half of the decade 1990-2000?

To answer the first two questions, we plan to gather data by examining statistical reports from world health organizations. To answer the third, we plan to interview a representative sample of HIV victims from each of the continents who contracted the infection at different times over the 1990-2000 period.

When, then, do we set up our three classification schemes? The one for recording answers to the first question can be formulated before we collect any data, because we can specify all three dimensions and categories ahead of time. In effect, the focal concern (dimension) of the first question is incidence of HIV infection. The second dimension is time (10 annual intervals) and the third is continents (3 continents). We can also foresee the exact form in which to display our summarized results--a 3-by-10 table in which we enter the focal variable (incidence numbers) in the cells that represent the intersection of the time and continent variables, as suggested by the dummy table in Figure 10-3.

Our second question has the incidence of different varieties of infectious diseases as the central concern. Before collecting our information, we can specify the time dimension's categories (10 annual intervals) but not how many different infectious diseases, because we do not know ahead of time what variety of infectious diseases have been identified in world health reports for the three continents. Therefore, we do not know how many diseases or their titles to include in our classification system until we have compiled the data. Consequently, we cannot specify ahead of time the exact size of the table in which we will summarize our results.

Figure 10-3 Incidence of HIV Infection by Continent 1990-2000

(number of cases per 100,000 persons in the population)

Year    Africa    Europe    South America
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
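A dummy table of this kind can also be set up in code before any data arrive, since both of its classifying dimensions are known in advance. The following Python sketch is illustrative only; the function names make_dummy_table and render and the column width are assumptions chosen for the example.

```python
CONTINENTS = ["Africa", "Europe", "South America"]
YEARS = list(range(1990, 2000))  # ten annual intervals, 1990 through 1999


def make_dummy_table(rows, columns):
    """Build an empty table: one cell per (row, column) pair, all None."""
    return {row: {column: None for column in columns} for row in rows}


def render(table, columns, row_label="Year"):
    """Return the table as aligned plain text, leaving empty cells blank."""
    header = f"{row_label:<6}" + "".join(f"{c:>15}" for c in columns)
    lines = [header]
    for row, cells in table.items():
        values = ("" if cells[c] is None else str(cells[c]) for c in columns)
        lines.append(f"{row:<6}" + "".join(f"{v:>15}" for v in values))
    return "\n".join(lines)


hiv_table = make_dummy_table(YEARS, CONTINENTS)
print(render(hiv_table, CONTINENTS))
```

For the second guide question the same make_dummy_table call could not be written ahead of time, because the list of disease categories is unknown until the health reports have been compiled.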

The third question poses the same problem as the second. The center of attention (focal variable) is incidence of circumstances in which HIV was contracted. Whereas we know at the outset the number (2) and titles (1990-1994 and 1995-1999) of categories along the time variable, we cannot predict with any confidence the variety of circumstances in which HIV was incurred. We must analyze the results of our interviews before making that determination. The question to guide our task of analysis can be "What groups of circumstances can we abstract from the interviews and how should each group be defined and labeled?"

It is useful to recognize advantages and disadvantages of defining categories before data have been collected compared to defining categories after data collection. Consider the variable circumstances under which HIV was contracted. If we specify categories of circumstances ahead of time, then when collecting data we can be prepared with a convenient list of the categories, and our respondents need only check the one or more categories that apply in their case. Later, when we compile the results, our task is a simple matter of counting how many respondents checked the different options. However, the disadvantage of this approach is that it can easily miss or distort important features of circumstances that are not accommodated by the preconceived choices that we offer to respondents. Consequently, the final report of the research will fail to acknowledge the effect of those neglected features.

On the other hand, if no categories are defined ahead of time but are extracted only later from the collected data, the task of collecting information via interviews or questionnaires will likely involve the complex process of gathering complete, verbatim accounts of what respondents say or write in narrative form. Then it can be a laborious, demanding challenge to analyze the narratives so as to derive categories that are true to the spirit of the respondents' experiences. However, adopting such an approach does enable us to accommodate features of respondents' experiences that might have been missed if their answers had been limited to those options we might have included in a set of multiple choices.

Two ways that researchers can seek to combine the advantages of both the "check-the-option" and "describe-your-experience" approaches are by (a) preceding the final data collection with a pilot study and/or (b) inviting respondents to attach explanatory comments to each set of multiple-choice categories they have marked.

In the pilot-study approach, the researcher asks a sample of respondents (via interviews or questionnaires) open-ended questions, such as, "As far as you can tell, what was the occasion of your contracting HIV?" Or "Could you describe the situation in which you think you caught HIV--where it occurred, who was involved, how it came about?" The answers to these questions are then analyzed to abstract categories, with each category defined in a way that clearly distinguishes it from the others. Then the defined categories are listed on a questionnaire that can be administered either in printed form or as an interview schedule to the participants in the study. Those participants are asked to check the category--or categories--that apply to them.

The second approach to capturing information that could be missed if respondents were offered only multiple-choice options is that of inviting them to offer comments about additional factors that apply in their case. The directions on such a multiple-choice questionnaire could then read: "Place an X in the box at the left of each of the following statements that tells about the circumstance in which you think you caught HIV. Then, in the space under each statement, write any comments you think will help explain how you contracted HIV."

Precision and Clarity

A classification problem that can seriously diminish the trustworthiness of a study's results is one caused by the inadequate definition of a variable's categories. There are two principal conditions under which this problem occurs: (a) when respondents fill out questionnaires or answer interview questions and (b) when researchers attempt to classify the contents of people's narratives that are in the form of testimony, letters, written answers to essay questions, interview responses, and the like.

In the first instance, participants are given multiple-choice answers to a question. Their task is to select the choice that is true of their own situation. So, if the outcomes of the research project are to be trusted, then the classification categories--the multiple choices--must be defined precisely enough to enable any reasonably informed person to select the proper choice. Some types of categories are easy to define and recognize. Such is the case with divisions based on gender, age, time interval (year, decade, century), place (school, city, nation), and measures that result in definite quantities (heights, test scores, days absent from school). Any errors made in placing data in their proper categories are the result of respondents' carelessness in selecting a category, not the result of imprecisely defined classes. However, the meanings of the choices are often not so obvious, so respondents may not all interpret the choices in the same way. The greater their confusion, and the more they must depend on subjective judgments, the less accurate the results of the study. One way to reduce such ambiguity is to conduct a tryout study of your questionnaire or your interview schedule before you administer it to the participants on whom the results of your study will be founded. The tryout is best conducted with people who are much like your intended respondents. When you administer your questionnaire or interview to the tryout subjects, you can provide some such instructions as these:

I'm asking you to help me discover if the questions on this sheet--and the choices among the answers--are easy to understand. So, as you read each question and the answers below the question, decide whether you can easily decide which answer is best. If you have no trouble deciding on an answer, then write the word clear next to that item. But if you have trouble deciding among the answers, write a question mark beside the item. Then, after you finish we can talk about the items with the question marks. That will help me know how I might change those items so they are easier to understand.

Now let's turn to our second circumstance, the situation in which the information the researcher receives from respondents is in narrative form. Such will be the case when an interviewer asks, "What is your opinion of Ms. Kelley as a senatorial candidate?" Or "What was your reaction to the Carswell murder trial?" Or "What problems do you expect if a voucher system is used for determining which schools children will attend?" Or "What social class do you think you belong to, and why do you think you are in that class?" After researchers have collected answers to such open-ended queries, they are obliged to resolve two issues: (a) From the viewpoint of the research study's purpose, into what categories should respondents' answers be placed? (b) How should those categories be defined so that respondents' answers can be located in the proper categories?

To illustrate one way to cope with the first of these issues, consider a study conducted by Johnston (1988) to discover the moral reasoning of adolescents who were asked to offer their opinions about the wrongdoing displayed in two Aesop fables. Johnston's aim was to learn whether 60 adolescents' judgments were determined more by their sense of compassionate caring than by their sense of even-handed justice, so her classification system initially contained two categories--caring and justice. But after collecting the 60 students' opinions, she discovered that not all answers fit neatly into either the caring class or the justice class. Some answers included an equal measure of caring and justice, so she created a third class to accommodate those mixed responses. Furthermore, a few answers failed to fit any of the three categories, so she relegated them to a fourth class entitled uncodable, "which meant that the answer did not clearly represent any identified logic" (Johnston, 1988, p. 54).

Now let's look at the second issue, the matter of how accurately members of the research team assign each respondent's answer to a correct category. We should recognize that, in Johnston's study, when students were asked about the moral problem that was embedded in each fable, they could reply in any way they chose. It then became the researcher's task to extract from students' replies (some of which were quite complex) the portions that seemed to reflect either a justice attitude or a caring viewpoint--or both, or neither. But such a task could involve a substantial measure of subjectivity on the part of the person making that decision--subjectivity which might result in one person coding data into different categories than did another. The greater the amount of subjective opinion involved in the coding process, the less trustworthy are the reported outcomes of the research. Three of the ways investigators seek to increase the accuracy of assigning data to categories are those of (a) defining each class in a manner that clearly distinguishes it from any other class, (b) accompanying each definition with examples of the sorts of data (such as the sorts of student replies in the Johnston study) that belong in that category, and (c) conducting training sessions for the people who will do the coding.

Therefore, if the research for your thesis or dissertation involves classifying narrative responses, you may profit from adopting one or both of the following methods for assessing how precisely you have defined your categories. And if this assessment suggests that your classification system is less than satisfactory, you may wish to (a) redefine your classes, (b) add illustrative examples to your definitions, and (c) retrain your coders.

The first assessment method concerns the successive-judgments consistency of coders' decisions. Specifically, to what extent does a coder place a particular kind of observation or response consistently in the same category? Imagine that a teacher is judging how logically students argue their position in essays they have written about human rights. The teacher has established five categories of logic in which to locate students' arguments--superior, above average, moderate, below average, and very poor. Her appraisal technique is deemed perfect in terms of successive-judgments consistency if, when she reevaluates the essays this week, she gives each one exactly the same rating that she gave it last week. In effect, her method is 100% consistent over time. However, such an outcome would be quite unusual. Studies of such matters have shown that raters' successive judgments of narrative data (essays, letters, descriptive oral responses, and the like) are often inconsistent. So if the results of your coding the same interview or questionnaire responses on more than one occasion show that your successive judgments are consistent less than 85% or 90% of the time, you may wish to adopt one or more of the three remedial techniques mentioned above.

The second method concerns interrater reliability: To what extent do two or more judges agree on where to locate a particular observation or response? If multiple judges are in substantial agreement about the placement of data, then the definitions of categories and the training of judges are considered trustworthy. A typical way to assess interrater reliability is to have two or more raters independently classify the same data (narrative answers to questions) into the study's categories. The interrater reliability is then reported as the percent of times the coders' judgments matched. In the Aesop fables project, Johnston reported intercoder reliability for two raters as 100% agreement on coding students' solutions to The Dog in the Manger story and 90% agreement on coding solutions to The Porcupine and the Moles fable (Johnston, 1988, p. 54).
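Both assessment methods reduce to the same arithmetic: the percent of items that two sets of codings place in the same category. A minimal Python sketch follows. The function name percent_agreement and the ten codings are invented for illustration and are not Johnston's data; the category labels simply echo the four classes described earlier, and 85% is taken as the lower of the two thresholds the text mentions.

```python
def percent_agreement(first, second):
    """Percent of items that two sets of codings place in the same category.

    `first` and `second` are equal-length lists; position i in each
    holds the category assigned to item i.
    """
    if len(first) != len(second):
        raise ValueError("both codings must cover the same items")
    if not first:
        raise ValueError("no codings to compare")
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


# Made-up codings of ten answers, for illustration only.
rater_a = ["caring", "justice", "both", "caring", "justice",
           "caring", "uncodable", "justice", "both", "caring"]
rater_b = ["caring", "justice", "both", "justice", "justice",
           "caring", "uncodable", "justice", "both", "caring"]

agreement = percent_agreement(rater_a, rater_b)  # 9 of 10 codings match
needs_revision = agreement < 85.0
```

For successive-judgments consistency, the two lists would instead hold one coder's ratings of the same items on two occasions; for interrater reliability, they hold two coders' independent ratings.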

PLANNING CHECKLIST

In planning the way you intend to classify the information that you gather for your project, you may find it helpful to complete the following steps.

1. For each question that you hope to answer by the use of your collected information, complete the steps that follow.
2. State the question.
3. Name and define each variable (dimension, characteristic) that you will include in your data analysis. Tell when you intend to name and define each of those variables. (Before you gather data? After you gather data?)
4. Name and define the classes (categories) you will employ within each variable. Tell when you intend to name and define each of those categories. (Before you gather data? After you gather data?)
5. State whether the variables are linear or form a hierarchy. If a hierarchy, then draw a diagram showing the hierarchy's structure.
6. Create dummy tables illustrating the form in which you intend to display and analyze your collected information after it has been classified.
 
