Defining the Psychological Construct

Psychological Constructs

Concepts; unseen processes postulated to explain behavior
not straightforward or simple to measure
cannot be observed directly

Conceptual Definition

describes the behaviors and internal processes that make up that construct, along with how it relates to other variables

Operational Definition

a definition of a variable in terms of precisely how it is to be measured
the process of developing indicators or items for measuring these constructs

Theory

to develop a psychological test, use a theory of the target construct
key to this is that there must be a rational link between the items’ content and the definition and understanding of the construct

Unidimensional and Multidimensional

Unidimensional Construct: expected to have a single underlying dimension; measured using a single measure or test
Multidimensional Construct: consists of 2 or more underlying dimensions

Make a Concept Paper

Conceptualization

the mental process by which fuzzy and imprecise constructs (concepts) and their constituent components are defined in concrete and precise terms
The process of understanding what is included and what is excluded in the concept

Essential Test Item Considerations

Test Items

Test Items: units that make up a test and the means through which samples of test taker’s behavior are gathered
It follows that the overall quality of the test depends primarily on the quality of the items that make it up, although the number of items in a test, and their sequencing or position within the test, are also matters of fundamental importance
Item Analysis: a general term that refers to all the techniques used to assess the characteristics of test items and evaluate their quality during the process of test development and test construction
- involves both qualitative and quantitative procedures
- Qualitative Item Analysis Procedures
  - rely on the judgments of reviewers concerning the substantive and stylistic characteristics of items as well as their accuracy and fairness
  - appropriateness of item content and format to the purpose of the test and the populations for whom the test is designed
  - clarity of expression
  - grammatical correctness
  - adherence to some basic rules for writing items that have evolved over time
- Quantitative Item Analysis
  - involves a variety of statistical procedures designed to ascertain the psychometric characteristics of items based on the responses obtained from the samples used in the process of test development
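As an illustration of quantitative item analysis, the sketch below computes two statistics that are commonly reported for dichotomously scored items: item difficulty (the proportion of test takers answering correctly) and a corrected item-total correlation as a discrimination index. These two statistics are chosen here as examples and are not named in the notes above; the response matrix and the names `pearson` and `item_statistics` are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists; NaN if either has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)

def item_statistics(responses):
    """responses: one row per test taker, each row a list of 0/1 item scores."""
    n_items = len(responses[0])
    results = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        # Total score on the remaining items, so the item is not correlated with itself.
        rest = [sum(row) - row[j] for row in responses]
        results.append({
            "item": j,
            "difficulty": sum(item) / len(item),
            "discrimination": pearson(item, rest),
        })
    return results

# Hypothetical tryout data: 5 test takers, 4 items.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
]
stats = item_statistics(responses)
for row in stats:
    print(row)
```

In this made-up sample the last item correlates negatively with the rest of the test, which is the kind of result that would flag an item for revision or replacement.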

Test Development

Test Planning

the constructs or knowledge domains that the test will assess
the type of population with which the test will be used
the objectives of the items to be developed, within the framework of the test’s purpose
the concrete means through which the behavior samples will be gathered and scored

the last point includes decisions about the method of administration, the format of the test item stimuli and responses, and the scoring procedures to be used
after these issues are decided and a preliminary plan for the test is made, the process of test development usually involves the following steps:
1. Generating the item pool by writing or otherwise creating the test items, as well as the administration and scoring procedures to be used
2. Submitting the item pool to reviewers for qualitative item analysis, and revising or replacing items as needed
3. Trying out the items that have been generated and reviewed on samples that are representative of the population for whom the test is intended
4. Evaluating the results of trial administrations of the item pool through quantitative item analysis and additional quantitative analysis
5. Adding, deleting, and/or modifying items as needed, on the basis of both qualitative and quantitative item analysis
6. Conducting additional trial administrations for the purpose of checking whether item statistics remain stable across different groups -- cross-validation -- until a satisfactory set of items is obtained
7. Standardizing or fixing the length of the test and the sequencing of items, as well as the administration and scoring procedures to be used in the final form of the test, on the basis of the foregoing analyses
8. Administering the test to a new sample of individuals -- carefully selected to represent the population of test takers for whom the test is intended -- in order to develop normative data or performance criteria, indexes of test score reliability and validity, as well as item-level statistics for the final version of the test
9. Publishing the test in its final form, along with an administration and scoring manual, accompanying documentation of standardization data, reliability and validity studies, and the materials needed for test administration and scoring

Test Item Types

Selected Response Items or Objective or Fixed-Response Items

close-ended in nature
they present a limited number of alternatives from which the test taker must choose
in ability tests, items of this type include multiple-choice, true-false, ranking, and matching items, as well as items that call for a rearrangement of the options provided
in personality tests, objective items may be either dichotomous or polytomous
- Dichotomous Items: require a choice between 2 alternatives
- Polytomous Items: present the test taker with 3 or more alternative responses to a statement
- These alternatives are typically scaled in terms of degree of acceptance (e.g., like, indifferent, or dislike), intensity of agreement (e.g., from strongly agree to strongly disagree), frequency (e.g., from never to very often), and so forth -- with the midpoint usually signifying a neutral, uncertain, or middle-of-the-road position

Forced Choice Items

Objective items that require test takers to choose which one of 2 or more alternatives is most or least characteristic of them
This kind of item is used mainly in multidimensional personality inventories to control for the tendency of test takers to respond in the direction they perceive as more socially desirable

Selected-Response Items

Advantages
- objective items are by far the most popular and frequently used type of test item
- Their advantages derive from the ease and objectivity with which they can be scored, which result in significant time savings and enhance test score reliability
- make efficient use of testing time because more of them can be administered within any given time period than is the case with constructed-response items
- Although they can also be administered individually, most tests that use selected-response items are intended for group testing
- All the responses to objective items can easily and reliably be transformed into a numerical scale for scoring purposes, a fact that greatly simplifies the quantitative analysis of these items
- in ability tests, correct and incorrect answers are usually assigned values of 1 and 0, respectively; occasionally, variations such as 2, 1, or 0 are used to allow partial credit
- in personality tests, dichotomous items are also scored 1 or 0, depending on whether or not the test taker’s response is in the direction of the construct that the test is designed to assess
Disadvantages
- more susceptible than constructed-response items to certain problems
- the possibility of correct guessing
- incorrect answers to objective items can easily occur as a result of haste, inattention, carelessness, malingering, or other chance factors unrelated to the test taker’s level of knowledge or ability in the area covered by the item
- test-taking response sets can be intentionally or unintentionally misleading
- carelessly written multiple-choice items, in particular, often include alternatives that are
  - grammatically incompatible with the stem of the item
  - susceptible to various interpretations due to imprecise wording
- selected-response items are clearly less flexible than constructed-response items with regard to the possible range of responses
- they offer no opportunity for assessing characteristics that may be special or unique to an individual test taker or that lie outside the range of alternatives provided
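The 1-or-0 scoring rule described under Advantages can be sketched as follows. This is a minimal illustration, assuming each item has a single keyed response; the function name `score_dichotomous` and the sample answers are hypothetical.

```python
def score_dichotomous(responses, key):
    """Return the number of responses that match the key (1 point each, otherwise 0)."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    return sum(1 if r == k else 0 for r, k in zip(responses, key))

# Ability test: the key holds the correct option for each item (hypothetical data).
ability_score = score_dichotomous(["b", "d", "a", "c"], ["b", "a", "a", "c"])

# Personality test: the key holds the response that points toward the construct.
personality_score = score_dichotomous([True, False, True], [True, True, True])

print(ability_score, personality_score)
```

The same function serves both cases because only the meaning of the key changes: a correct answer in an ability test, a construct-consistent answer in a personality test.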

Constructed-Response Items or Free-Response Items

open ended
may involve writing samples, free oral responses, performances of any kind, and products of all sorts
the most common types of constructed-response items are essay questions and fill-in-the-blank items
directions for administering constructed-response tests should include stipulations on matters such as
- time limits
- medium, manner, or length of the required response
- whether access to materials or instruments pertinent to the test (e.g., textbooks, calculators, computers, etc.) is permitted
interviews, biographical data questionnaires, and behavioral observations are tools for the assessment of personality that often rely on open-ended responses
in personality testing proper, the use of constructed responses is limited mainly to projective techniques
Advantages
- provide richer samples of the behavior of examinees and allow for their unique characteristics to emerge
- open-ended items offer a wider range of possibilities and more creative approaches to testing and assessment than selected-response items
Disadvantages
- related to score reliability, and as a consequence, to validity as well
- scoring constructed responses, both in ability and personality tests, is always a more time-consuming and complex matter than scoring selected responses because some degree of subjectivity is invariably necessary
- there is always the possibility that a response will be evaluated differently by different scorers due to its uniqueness or to some other factor
- test length, response length
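One way to check how differently two scorers evaluate the same constructed responses is to compute their agreement. The sketch below uses simple percent agreement and Cohen's kappa, which corrects for agreement expected by chance. Kappa is brought in here as an assumption; the notes above do not name a specific index, and the scorer ratings are made up for illustration.

```python
from collections import Counter

def percent_agreement(ratings_a, ratings_b):
    """Proportion of responses given the same score by both scorers."""
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)

def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two scorers rating the same responses."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(ratings_a) | set(ratings_b)
    # Agreement expected if the scorers assigned categories independently.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1:
        return 1.0  # both scorers used a single identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical essay scores (0, 1, or 2 points) from two scorers.
scorer_a = [2, 1, 0, 2, 1, 1, 0, 2]
scorer_b = [2, 1, 1, 2, 1, 0, 0, 2]

agreement = percent_agreement(scorer_a, scorer_b)
kappa = cohens_kappa(scorer_a, scorer_b)
print(agreement, round(kappa, 3))
```

Low agreement on a tryout sample would suggest that the scoring rubric needs to be tightened before the item is used operationally.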

Scaling: Response Formats and Item Writing

Scaling

Measurement: the assignment of numerals to objects or events according to rules
good research in psychology and social psychology depends on good measurement
understanding the concepts behind scaling helps us understand how numerical values can be assigned in psychological measurement

Fundamentals about Numbers

in psychological measurement, we use numerals to represent an individual’s level of a psychological attribute
it is therefore true that numerals can represent psychological attributes in many different ways
these different ways can be described in the 3 properties of numerals: identity, order, and quantity

The Property of Identity

the most fundamental form of measurement is the ability to reflect “sameness” vs “differentness”
all people within a category must satisfy the property of identity
- all people within a category must be “identical” with respect to the feature reflected by the category
- in this case, numerals have no actual mathematical value
when making categorical differentiations between people, the distinctions between people of different categories represent differences in quality rather than quantity

The Property of Order

indicates the rank of people relative to each other along some dimension
when numerals are used to indicate order, they still serve essentially as labels: they convey rank, but not the size of the differences between people

Property of Quantity

when numerals have the property of quantity, they also provide information about the magnitude of differences between people

Number “0”

“0” is a strange number as it has various meanings
- absolute zero
- arbitrary zero
knowing what 0 means is essential in psychological measurement

Response Formats

a scale’s response format refers to the way in which items are presented and responses are obtained
includes the Likert scale and semantic differential scales
must consider the number of response options available. A minimum of 2 is required but having more has pros and cons as well
the use of midpoints is a common consideration in scale construction
- Midpoints are presented with terms such as “neutral” or “neither agree nor disagree”, often achieved through an odd number of response options
- may have pros and cons also
Some researchers might want to accommodate respondents who have no opinion about the item or who don’t know what their true perspectives are
- avoid using neutral options as “I don’t know” responses
- These responses might mean more than just lack of knowledge or opinion
- It is better to focus on simplicity, clarity, and breadth of the psychological dimension
In constructing a psychological scale, one should consider at least 2 issues regarding the consistency of response options across items
- a scale’s items should have an equal number of response options
- the logical order of the response options should be consistent across items

Assembling and Writing Items

Relevant Content

Item content must reflect the intended psychological variables
the breadth of the variable must be reflected in the scale’s content

Number of Items

number of items should be considered for each construct to be measured, with each having its own set of items and receiving its own score; depends on several issues
- longer scales tend to have higher reliability, all else being equal
- broadly defined constructs may require more items than narrowly defined constructs
- consider the context of administration or time-sensitive contexts
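The link between scale length and reliability is often illustrated with the Spearman-Brown prophecy formula. The formula is not part of the notes above and is used here as an assumption; it also assumes that any added items are comparable in quality to the existing ones. The function name and the example values are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a scale is lengthened (or shortened) by length_factor.

    reliability: reliability of the current scale, between 0 and 1
    length_factor: new length divided by current length (2 means twice as many items)
    """
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical example: a 10-item scale with reliability .60.
doubled = spearman_brown(0.60, 2)    # lengthened to 20 items
halved = spearman_brown(0.60, 0.5)   # shortened to 5 items
print(round(doubled, 3), round(halved, 3))
```

Under these assumptions, doubling the scale raises the predicted reliability from .60 to .75, while each further doubling yields a smaller gain, which is why length has to be weighed against administration time.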

Clarity of Language

items and instructions should be relatively clear and easily understood
should entail little cognitive effort
- avoid psychological (technical) jargon, double negatives, and double-barreled items (items reflecting 2 separate ideas)

Balanced Scales

as a general rule, scales should be “balanced” by including positively-keyed and negatively-keyed items
negatively-keyed items must be reverse-scored before item scores are combined
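Reverse scoring can be sketched as below, assuming a numeric response scale with known lowest and highest values (1 to 5 in this example). The reversed score is the sum of the lowest and highest values minus the response. The function names, item positions, and responses are hypothetical.

```python
def reverse_score(response, low=1, high=5):
    """Flip a response on a low-to-high scale (e.g., 5 becomes 1 on a 1-5 scale)."""
    if not low <= response <= high:
        raise ValueError("response is outside the scale range")
    return low + high - response

def scale_score(responses, negatively_keyed, low=1, high=5):
    """Sum item responses after reverse-scoring the negatively-keyed items.

    negatively_keyed: set of 0-based positions of the negatively-keyed items
    """
    total = 0
    for index, response in enumerate(responses):
        if index in negatively_keyed:
            response = reverse_score(response, low, high)
        total += response
    return total

# Hypothetical 4-item scale; the 2nd and 4th items are negatively keyed.
total = scale_score([4, 2, 5, 1], negatively_keyed={1, 3})
print(total)
```

After reversal, a high score on every item points in the same direction, so the item scores can be summed into a single scale score.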
