Glossary of terms in survey research

A 

  • Achieved Response
  • The group of people who actually replied to a survey. The size and makeup of the group dictates the accuracy of any estimate we can make of the view of the population. See sampling.

  • Arithmetic mean 
  • A simple average calculated by adding up a group of values, and dividing the sum by the number of values. 

  • Artwork 
  • Typesetting quality original from which printers make plates for printing. Often now provided "on disk", meaning in a computer file of one kind or another.

B 

  • Benchmark
  • A fixed point with which to compare. 

    Comparing your survey results with those from other organisations

    In an effort to put their survey results in context, people sometimes try to compare them with those obtained by other organisations using similar questions. See Services: Benchmarking

    Choosing a subset from your own survey to use as a benchmark

    You might choose a benchmark subset and then compare all the other subsets with it.  

    When making comparisons between a current survey and previous ones, you might take 2004 as the benchmark and show results for 2005 and 2006 with the improvement / decline since the benchmark year. 

    The term derives from the distinctive marks (known as benchmarks) made by the U.K. Ordnance Survey on public buildings, bridges etc., whose height above sea level is shown on Ordnance Survey maps. These provide a fixed point from which surveyors can derive levels for other places they are surveying.

C 

  • Census 
  • A survey in which every member of a population is invited to respond. 

  • Class
  • A category within a classification system, see below.

  • Classification system
  • If we want to create subsets from our response based on people's characteristics such as gender, age range, location etc., we must ask informants to classify themselves, usually by ticking a box within a classification system. 

  • Closing date
  • The official, or published, closing date is the date by which we tell informants we want responses back. 

    Usually, responses keep trickling in after that date, though. The actual closing date is the date on which we decide that no more responses will be accepted, and we go ahead to analyse the data already received. We plan this date in advance, but usually agree it finally at the time and in consultation with you, based on 

    • the number of responses so far, and
    • where they have come from, as well as 
    • the rate at which they continue to arrive (see progress report),
    • how urgently you need the results and 
    • how important it is to you to include every last possible response.

  • Cluster
  • See topic

  • Commitment
  • Label for a group of employee satisfaction measures concerning the level of allegiance employees have with their employer. The individual items and their aggregated value correlate with certain individual and corporate performance measures so commitment measures have been regarded as important drivers of improved performance. See also Engagement.

  • Confidence interval
  • The range around a sample result, within which the actual value for the population is likely to lie, at a given level of probability.

    If we obtain a sample result of 62%, this is only an estimate of the true value we would have obtained if we had been able to ask every member of the population. We may calculate that the confidence interval associated with this result is plus or minus 3% at the 95% confidence level. This means we can be 95% confident that the population result would be somewhere between 59% and 65%.
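    As a rough illustration of how such an interval might be worked out for a percentage result, the sketch below uses the standard normal-approximation formula for the standard error of a proportion, together with the rule of thumb that two standard errors correspond to roughly the 95% confidence level. It assumes a simple random sample. The function name and the sample size of 1,047 are assumptions, chosen only so that the output matches the 62% example above; they do not come from a real survey.

```python
import math


def confidence_interval(proportion, sample_size, standard_errors=2):
    """Return (low, high) for a sample proportion expressed as a 0-1 fraction.

    Assumes a simple random sample and the normal approximation;
    standard_errors=2 corresponds roughly to the 95% confidence level.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    margin = standard_errors * standard_error
    return proportion - margin, proportion + margin


# Hypothetical survey: 62% agreement from 1,047 responses.
low, high = confidence_interval(0.62, 1047)
print(f"{low:.0%} to {high:.0%}")
```

    A smaller sample widens the interval: with the same 62% result from 250 responses, the same calculation gives a margin of about plus or minus 6%.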

  • Confidence level
  • When we assess the significance of any differences the survey appears to find, we do so at a given confidence level, usually the 95% confidence level. This means that we can be 95% confident that a difference which exceeds the sampling error we calculate really is significant. The other 5% of the seemingly significant findings will be due to exceptionally large sampling errors.

    The confidence levels usually offered are as follows:

    Confidence level    Sampling error based on
    68%                 1 standard error
    95%                 2 standard errors
    99%                 3 standard errors

  • Correlation
  • When two sets of data appear to vary in the same way, they are said to be correlated. If you visit a school and measure the height and weight of every pupil, those who are taller will tend to be heavier too. There will be exceptions to the rule, though, so although using the data collected you could make a good guess at the weight of an unknown child based only on knowing how tall they were, you would be caught out now and again by a very fat short one, or a very thin tall one. In this case height and weight would be positively correlated to a high degree.

    We measure correlation using a correlation coefficient. A correlation coefficient of 1 means that as one value increases so the other will increase in a completely predictable way. A correlation coefficient of -1 means that as one value increases so the other will decrease in a completely predictable way. A correlation coefficient of 0 means that there is no connection between the two sets of data. In the example above, height and weight might be correlated with a coefficient of 0.7.
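    The glossary does not say which coefficient is meant; the sketch below assumes the Pearson coefficient, which is the most commonly used one. The heights and weights are made-up figures for illustration only.

```python
import math


def correlation_coefficient(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the two means.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)


# Made-up heights (cm) and weights (kg) for six pupils.
heights = [120, 125, 131, 138, 142, 150]
weights = [24, 23, 30, 29, 38, 40]
r = correlation_coefficient(heights, weights)
print(round(r, 2))
```

    The result is positive but below 1, because the taller pupils tend to be heavier without the relationship being perfectly predictable.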

D 

  • Demographics 
  • A shorthand for the classifications referred to above. 

  • Design factor
  • When we assess the significance of any differences a survey appears to find, we compare the apparent difference with the difference which might have arisen as a result of the sampling process (sampling error). Only when the difference is greater than might have arisen through sampling error do we say the difference is significant.

    The instrument itself introduces further error, however, because different people interpret language differently, so their understanding of an item we included in the instrument may not be the meaning we intended. To provide for this extra error, the sampling error may be increased by the design factor. For example, if we decide that we should allow for a further 20% margin of error then we multiply the sampling error by a design factor of 1.2 and only regard as significant any difference which exceeds the new, bigger range of error.
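    A minimal sketch of that check is shown here. The function name and the figures (a 3.3 point difference against a 3.0 point sampling error) are hypothetical.

```python
def is_significant(difference, sampling_error, design_factor=1.0):
    """True when the difference exceeds the sampling error widened by the design factor."""
    return abs(difference) > sampling_error * design_factor


# Hypothetical figures, in percentage points.
without_factor = is_significant(3.3, 3.0)       # compared with 3.0
with_factor = is_significant(3.3, 3.0, 1.2)     # compared with 3.0 x 1.2 = 3.6
print(without_factor, with_factor)
```

    The same difference counts as significant against the plain sampling error but not once the design factor of 1.2 is applied.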

E

  • Engagement
  • Label for a group of measures concerning the extent to which employees are engaged with their employer organisation and its aims and objectives. The individual items and their aggregated value correlate with certain individual and corporate performance measures so engagement measures are regarded as important drivers of improved performance. See also Commitment.

F 

  • Focus Group 
  • A group of people brought together to provide their input to a particular issue or problem. When developing an instrument, we often use focus groups drawn from the target population to get their perspective on the issues to be measured. This ensures we cover the relevant issues, and avoid producing an instrument which asks everything except the one thing the target groups wish to tell you.

G

  • gsm
  • Grams per square metre - measure of the weight of a paper. Standard copier paper is 75 - 80 gsm, meaning that one square metre of the paper would weigh between 75 and 80 grams. More prestigious papers are usually about 100 gsm. At about 150 gsm, the material would begin to feel more like a card than paper.

I 

  • Index
  • A single figure representing a range of measures and comparing one thing with another. 

    A stock market index is arrived at by calculating the value of a given "basket" of shares and comparing the current value with the value at an earlier benchmark date. The result is usually presented as a current value compared with a base (the benchmark) of 100. So if the market has been rising and the value of the basket of shares has increased from £43,023 at the benchmark date, say 1 April 1998 to £69,415 now the index would be 69,415 divided by 43,023 and multiplied by 100 = 161. The index should be quoted as 161 base 1 April 1998 and it tells us that the particular basket of shares has increased in value by 61 percent since the base date.

    We sometimes use indices to summarise survey results and compare performance in one area with another, or to compare with an earlier measure, rather as a stock market index does.
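    The stock market calculation above can be sketched as follows. The function name is an assumption; the figures are the ones used in the example.

```python
def index_value(current_value, base_value, base=100):
    """Express current_value relative to base_value, where base_value maps to `base`."""
    return current_value / base_value * base


share_index = index_value(69_415, 43_023)
print(round(share_index))  # 161
```

    The same function could be applied to a survey score, taking the result from an earlier survey as the base.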

  • Informant 
  • One of the people completing and returning a questionnaire, or otherwise providing information about their characteristics, attitudes and opinions in a survey. 

  • Instrument 
  • A survey questionnaire. The purpose of the survey is to measure attitudes and opinions. A measuring tool (like a rule or a micrometer) is known as an instrument, and so is the questionnaire which is the tool for this kind of measuring. 

  • Item 
  • Each separate question in an instrument is called an item. It is called an item because it might not actually be a question. Often it will be a statement such as "I like my job" and informants will be asked to tick one of a series of boxes to show how strongly they agree or disagree with the statement. There are lots of other kinds of item which might be used, many of which are also not questions. In practice, the terms "item" and "question" tend to be used interchangeably.

K 

  • Keying 
  • Survey items may be positively or negatively keyed. The distinction concerns the wording of the item and the system adopted for converting responses on questionnaires into numerical scores for analysis. A five point scale from Unacceptable to Excellent might be represented by the numbers 1 to 5, so a respondent's tick next to Unacceptable would be recorded as a score of 1, and an Excellent response as a 5.

    In this case, higher scores are a good thing, and we refer to the item as positively keyed. If instead we had chosen to use 5 to represent Unacceptable, and 1 to represent Excellent, lower scores would mean more favourable responses, and the question would be a negatively keyed one.

    The distinction is equally relevant in a case where all responses are expressed as a level of agreement say from Totally Disagree, scored as 1, to Totally Agree, scored as 7.

    If we then offer a positive statement such as "I like working here" for the respondent to agree or disagree with, this is a positively keyed item. But a statement such as "ABC Company staff are offhand on the phone" would be a negatively keyed item, because a higher level of agreement with it, represented by a higher score, would be a bad result. 

  • Keystrokes 
  • The number of keyboard key depressions needed to input the data represented on one completed questionnaire. Most items can be input with a single numerical keystroke. Multiple choice items count as several keystrokes - as many as there are options to choose from. 

    Any demographic or classification items are coded and count as many keystrokes as there are characters in their codes. For example, a classification system with codes a, b, c .. z, or 0, 1, 2 .. 9 would take 1 keystroke, and one with codes aa, ab, ac .. zz or 00, 01, 02 .. 99 would need 2 keystrokes. 

L

  • Limits of accuracy
  • Another way of referring to confidence interval.

M 

  • Management Services 
  • Generic name for the application of a range of techniques for the study of work and organisations with a view to bringing about improvement. Defined in BS 3138: 1992 Glossary of terms used in management services as 

    The provision of advisory and information services to assist management in improving effective use of resources. This may embrace the use of work study, O & M, operational research, data processing, ergonomics, economic forecasting, and industrial engineering.

    Usually practised by independent or internal consultants without executive authority. Their conclusions are usually presented as recommendations for line management to consider. 

N

  • Norms
  • A collection of data comprising responses to survey items, allowing us to put your results in context with responses from other surveys.

P 

  • Panel
  • A permanent representative sample maintained by a market research agency from which information is obtained on more than one occasion either for continuous research or for ad hoc projects. (MRS Research Buyer's Guide)

  • Percentage
  • An easy way to compare proportions by saying how many each represents out of one hundred. 

    If we asked people in an office if they wanted a coffee machine which made real coffee instead of instant and we found that 38 of the 54 people in department A agreed, and 46 of the 63 in department B, it is hard to know which department is more enthusiastic. But if we say that those agreeing were 70.4% in department A and 73.0% in department B we can easily see that department B is more in favour than department A.

    Working them out: 

    38 divided by 54, multiplied by 100 = 70.37037, which we round off to 70.4 or 70 

    46 divided by 63, multiplied by 100 = 73.01587, which we round off to 73.0 or 73
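    The same working can be written as a short sketch, using the coffee machine figures above. The function name is an assumption.

```python
def percentage(part, whole, decimals=1):
    """Return part as a percentage of whole, rounded to the given decimal places."""
    return round(part / whole * 100, decimals)


department_a = percentage(38, 54)
department_b = percentage(46, 63)
print(department_a, department_b)  # 70.4 73.0
```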

  • Percentile 
  • A percentile (abbreviated to %ile) expresses the average response to a scale item as if the scale had been from 0 to 100. So it provides a satisfaction score or an agreement score, always out of 100.

    It provides a way of converting results measured using different scales (response frames) to a common scale of 100 points. Even when different scales have not been used, it can often be easier to understand a result expressed as a percentile than a raw score.

    Imagine, for example, that we want to compare results from one survey, or from two or more different surveys, where some results are on a scale from 1-5 and others on a scale from 1-7. The same answer can mean different things according to which scale applies. Say two questions had the answer 3. On the first scale (1-5), this is exactly the midpoint, but on the second (1-7) it is closer to the lowest possible score (1) than to the highest (7). 

    To work out a percentile, the scale is divided into 100 so-called percentile points. By working out how far along its possible scale each average result lies, and expressing it as a percentage of the way along, we can say at which percentile point the average lies, and make the results comparable one with the other. 

    Some examples: 

    Scale    Average Raw Score    Percentile
    1 - 5    3                    50
    1 - 7    3                    33.33
    1 - 5    2.3                  32.5
    1 - 7    2.3                  21.67
    1 - 7    4                    50
    0 - 1    0.45                 45
    1 - 5    1                    0

    To calculate a percentile from a raw score, calculate 

    (Raw Score - Min) / Range * 100 

    where Raw Score is the average raw score; Min is the minimum of the scale; Range is the maximum of the scale minus the minimum of the scale. 

    Taking as an example the fourth line in the table above, a Raw score of 2.3 on a scale from 1 to 7; 

    Raw Score = 2.3; Min = 1; Range = 7 - 1 = 6. 

    So %ile    = (Raw Score - Min) / Range * 100 

                    = (2.3 - 1 ) / 6 * 100 

                    = 1.3 / 6 *100 

                    = 21.67
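    The formula above translates directly into a small function. This is only a sketch; the function and parameter names are assumptions.

```python
def percentile(raw_score, scale_min, scale_max):
    """Convert an average raw score to a 0-100 score: (Raw Score - Min) / Range * 100."""
    scale_range = scale_max - scale_min
    return (raw_score - scale_min) / scale_range * 100


print(round(percentile(2.3, 1, 7), 2))  # 21.67
print(round(percentile(2.3, 1, 5), 2))  # 32.5
print(round(percentile(3, 1, 7), 2))    # 33.33
```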

  • Population 
  • Statistical term for the whole group about whose characteristics or views we are trying to learn, when we study only a sample chosen from within it. 

R 

  • Random sample
  • A sample selected using a technique which ensures that every member of the population has an equal chance of being selected. Choosing the first 1,000 names from a telephone directory (sorted alphabetically) would produce a sample but Mrs Aardvark and Mr Brown would have a better chance of being included than Mr Zziwa, so it would not be a random sample. Problematic non-random samples are most likely to arise from a sampling frame which has been sorted on a relevant characteristic (say postcode, which might correlate with household income) or has a pattern inherent in it which might coincide with a sampling interval you might use to select a systematic sample.

  • Raw score
  • When we capture the data from an instrument, we have to convert ticks in boxes to codes or numbers which the computer can handle. If an item is a statement with an agreement scale, there might be five boxes for the informant to tick, labeled as shown below. We key the score shown, according to the box ticked. This is known as a raw score, because it hasn't yet been subjected to any processing.

    Box label                     Score
    Strongly disagree             1
    Disagree                      2
    Neither agree nor disagree    3
    Agree                         4
    Strongly agree                5

    Having filtered out a subset, then for each item in the survey we can add all the scores we have recorded and divide by the number of them to arrive at an average raw score for this item, within this subset.

  • Representative sample
  • A sample chosen so that it fairly represents the make-up of the population. This means that the mix of relevant characteristics (Age, Gender, Product used, Region etc.) is the same as in the population. If the sample is a small one, it is very hard to choose a sample which comprises matching percentages of informants taking account of many different characteristics, say matching the percentage mix of the population on Gender, Ethnic origin, Age groups, Income and Disability. Even if it were possible to find a sample whose mix did mimic the population on all these characteristics, the achieved response might not. For this reason, we are usually working with samples which, while reasonably representative, are not wholly so. If an estimate of the population average view is required, this can be found by reweighting the results.

  • Respondent 
  • See informant

  • Response frame
  • The mechanism through which informants answer the item. It might be a range of tick-boxes labeled to represent a scale; an agreement scale, say, or a scale from Very dissatisfied to Very satisfied. In these cases, the informant would be asked to tick one box. For a multiple choice item it would be a series of options and the informant might be asked to tick only one, or as many as apply. For a free text comment, it is just an area in which the informant can write (or on the web, type) their response.

  • Response rate 
  • The number of responses received, usually expressed as a percentage of the total number of people invited to respond. See Response rate enhancement.

  • Responses 
  • The questionnaires actually returned. See Achieved response.

  • RSL
  • Registered Social Landlord. An organisation registered with the Housing Corporation and therefore qualified to provide housing and receive subsidy from the Housing Corporation. Most RSLs are Housing Associations.

S 

  • Sampling 
  • A technique by which we learn about the characteristics or views of a whole group (population) by gathering data about only some representative members of it. The result is an estimate of the characteristics or views of the whole group. The accuracy of the estimate depends on the size of the sample and the popularity of the characteristic or view we are trying to estimate. See Sampling error

    See also Random sample; Representative sample; Sampling interval; Sequential sample; Stratified sample; Systematic sample.

  • Sampling error
  • If the sample has produced the result 42% and we estimate the sampling error (or confidence interval) at plus or minus 3% we might express the result as 42% ±3%. This means that the population result would have been in the range 39% to 45%. Even this isn't quite specific enough, though, because in an extreme case the population result might be outside even this range. So we have to say how sure (how confident) we are that the population result would have been in the range stated. There are three commonly used confidence levels; roughly 68%, 95% and 99% confident, corresponding to sampling errors of plus or minus one, two and three standard errors respectively.

    The most popular confidence level is 95% and this is the one our reports use unless you ask us to do something different. This means that when we say that a difference shown on a report is significant there is only a one in twenty chance that it actually isn't (95% = 19 out of 20).

    Unfortunately, there are several ways these results can be expressed. Taking the example already used, and assuming that we are 95% confident of the result given, it might be expressed in any of the following ways. They all represent exactly the same result:

    • 42% ±3% at the 95% confidence level
    • 42% ±1.5% at the 68% confidence level
    • 42% ±4.5% at the 99% confidence level
    • Sample mean: 42% Confidence level 95% Confidence interval ±3%
    • 42% Limits of accuracy ±3% at the 95% confidence level
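    The equivalence between these statements can be sketched as below, using the approximate multipliers of one, two and three standard errors given above. The standard error of 1.5 percentage points is inferred from the ±3% at 95% figure; the names used are assumptions.

```python
# Approximate number of standard errors for each confidence level (%).
STANDARD_ERRORS_BY_LEVEL = {68: 1, 95: 2, 99: 3}


def express_result(sample_result, standard_error):
    """Return the same result stated at each of the common confidence levels."""
    return [
        f"{sample_result}% ±{multiplier * standard_error:g}% "
        f"at the {level}% confidence level"
        for level, multiplier in STANDARD_ERRORS_BY_LEVEL.items()
    ]


statements = express_result(42, 1.5)
for statement in statements:
    print(statement)
```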
     
  • Sampling Frame
  • A list comprising one record for each member of the population from which a sample can be chosen.

  • Sampling interval
  • The number (or average number) of steps through the sampling frame between records to be included in the sample. In a systematic sample, to take a 10% sample, you would select every 10th record, so the sampling interval is 10. For a truly random sample, the number of steps between selected records would be random numbers which average 10.

  • Self-administered
  • A survey instrument designed for the informant to complete unaided. The distinction is between this and an instrument which is intended for completion by a professional interviewer based on an interview with the informant.

  • Sequential sample

  • A sampling technique used when you can’t predict the response rate. If you know you need an achieved response of 200 and you have a mailing list of 20,000 to use as the sampling frame, how many will you mail? You might only get a 1% response rate, in which case, you would need to mail all 20,000 but if the response rate was 2% you would have spent twice as much as you needed to on the mailing. The trick is to send a small mailing first, to test the response rate, so you mail 1,000 and count the responses you get. Now that you know what response rate to expect, you can select a further sample big enough to provide the achieved response you need.
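    The arithmetic for the second mailing might look like the sketch below. It assumes that the response rate seen in the test mailing will hold for the rest of the sampling frame, which is not guaranteed. The function name and the figure of 15 test responses are hypothetical.

```python
def further_mailing_size(target_responses, pilot_mailed, pilot_responses):
    """Estimate how many more questionnaires to mail after a test mailing.

    Assumes the test mailing's response rate carries over to the rest of the frame.
    """
    if pilot_responses == 0:
        raise ValueError("No test responses, so the response rate cannot be estimated.")
    still_needed = max(target_responses - pilot_responses, 0)
    # Ceiling division: still_needed divided by the response rate, rounded up.
    return -(-still_needed * pilot_mailed // pilot_responses)


# Hypothetical test mailing: 1,000 mailed, 15 returned, 200 responses needed in total.
extra = further_mailing_size(200, 1000, 15)
print(extra)  # 12334
```

    With a 1.5% response rate, a further 12,334 questionnaires would be expected to bring in the remaining 185 responses, which is well within the 19,000 records left in the frame.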


  • Significance 
  • If we compare the results for the same question from two different groups of informants, they might appear to show a difference between the views of the two groups. Before drawing attention to it, and proposing action based on it, we need to be sure that the difference could not reasonably be explained simply as the result of sampling error. If the difference is greater than the sampling error we could reasonably expect, then we say the difference is significant. Our standard reports highlight significant differences at a given confidence level between subsets or occasions of running the survey.

    Generally, the smaller the sample size, the greater the sampling error. 

  • Stakeholder
  • A convenient jargon term which embraces an organisation's customers, employees, shareholders, suppliers, neighbours etc; in fact anyone who has any interest in what the organisation does. The term is popular lately in government circles and in local government, where "stakeholders" include Council tax payers; other residents; businesses and their employees; users of services like leisure facilities and libraries who may not be resident within the local authority area; shoppers and mere passers through.

  • Standard deviation 
  • A statistical measure of the variation in a set of data. We often use an average to summarise a number of data items, but an average tells you nothing about the extent of the spread or "scatter" of the individual values around it. That is the purpose of working out the standard deviation. 

    These two lists of values both average 100 but their standard deviations are very different.

                          List 1    List 2
                          110       150
                          98        90
                          102       110
                          90        50
                          95        75
                          105       125
                          102       110
                          99        95
                          98        90
                          100       100
                          100       100
                          101       105
    Average               100       100
    Standard deviation    4.7       23.6
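    The figures above can be checked with Python's standard library, as in this sketch. The values shown match the population standard deviation (pstdev, which divides by the number of values); the sample standard deviation (stdev, which divides by one fewer) would come out slightly larger, at about 4.9 and 24.7.

```python
from statistics import mean, pstdev

first_list = [110, 98, 102, 90, 95, 105, 102, 99, 98, 100, 100, 101]
second_list = [150, 90, 110, 50, 75, 125, 110, 95, 90, 100, 100, 105]

for values in (first_list, second_list):
    # Average, then population standard deviation rounded to one decimal place.
    print(mean(values), round(pstdev(values), 1))
```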

  • Standard error 
  • A statistical measure of the extent to which the average of a sample might differ from the population average. 

  • Stratified sample
  • If you plan to break the survey results down into subsets, you need to ensure that the resulting subsets will provide sample sizes big enough to draw useful conclusions from. So you may need to stratify the sampling frame by splitting it into the categories you will subsequently use to create the subsets. Then you select a sample in each stratum big enough to produce a useful sample in the achieved response. This probably means choosing a different percentage from each stratum, as below.

     

    Stratum      Clients    Achieved sample required    Predicted response rate (%)    Sample size    % sample
    Product 1    2,545      40                          50                             80             3.1
    Product 2    1,866      40                          50                             80             4.3
    Product 3    752        40                          50                             80             10.6
    Product 4    120        40                          50                             80             66.7
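    The sample size and percentage for each stratum can be derived as in the sketch below. It reads the predicted response rate of 50 as a percentage, which is an assumption consistent with a sample of 80 yielding 40 responses. The function name is hypothetical.

```python
def stratum_sample(clients, achieved_required, response_rate_percent):
    """Return (sample size, percentage of the stratum to sample)."""
    # Ceiling division, so that a fractional questionnaire rounds up.
    sample_size = -(-achieved_required * 100 // response_rate_percent)
    # A stratum cannot supply more records than it contains.
    sample_size = min(sample_size, clients)
    return sample_size, round(sample_size / clients * 100, 1)


strata = {"Product 1": 2545, "Product 2": 1866, "Product 3": 752, "Product 4": 120}
results = {name: stratum_sample(clients, 40, 50) for name, clients in strata.items()}
for name, (size, percent) in results.items():
    print(name, size, percent)
```

    Every stratum needs the same 80 questionnaires, but that is about 3% of Product 1 clients and two thirds of Product 4 clients.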


  • Subset 
  • Any group of informants defined in terms of their responses to questions in the instrument. The responses to a survey may be summarised and reported as a whole, but it is usually helpful to see separately the results obtained from groups of informants who have some features in common.

    A subset may include all female informants, say, or all clients in the South of England. We can set rules to control whether respondents are included in a subset via a class or a range of classes in any of the classification systems by which respondents have been classified, and / or by specifying responses to any question(s) in the survey.

    A subset definition may admit all members of a single class, (e.g. a group which includes all females); or a range of classes (e.g. those in departments c to e). Classification systems may be combined so if your survey includes codes for department, job type, and length of service we could create a subset which includes anyone who works in departments coded a to c, in jobs coded d or f and who has length of service coded c or higher.

    We can also define a subset in terms of responses to the questions in the body of the survey, so if there was a question about the frequency of meetings with a five point scale for responses from "never" through to "very frequent", we could create a subset comprising people who said they had meetings never or only occasionally. This would allow us to see how this group of people answer the other questions in the survey. We can do the same sort of thing by comparing one question with another, so if a survey asked people to rate various sources of information we could create a subset of those who say they get more information from the grapevine than from organised meetings.  

    We can also construct weighted subsets from a number of simple subsets, to produce results which estimate the results we might have obtained from an overall response in which the representation of classes was different from that which was actually received. This is valuable when the distribution of responses does not reflect the true mix of classes in the population whose views the survey is intended to estimate.
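    The combined rule described above (departments coded a to c, jobs coded d or f, length of service coded c or higher) might be expressed as in this sketch. The records, field names and single-letter codes are invented for illustration, and the sketch assumes the codes sort alphabetically.

```python
# Invented respondent records with single-letter classification codes.
respondents = [
    {"id": 1, "department": "a", "job": "d", "service": "c"},
    {"id": 2, "department": "b", "job": "e", "service": "d"},
    {"id": 3, "department": "c", "job": "f", "service": "e"},
    {"id": 4, "department": "d", "job": "d", "service": "c"},
    {"id": 5, "department": "a", "job": "f", "service": "b"},
]


def in_subset(person):
    """Departments a to c, jobs d or f, length of service coded c or higher."""
    return (
        "a" <= person["department"] <= "c"
        and person["job"] in {"d", "f"}
        and person["service"] >= "c"
    )


subset_ids = [person["id"] for person in respondents if in_subset(person)]
print(subset_ids)  # [1, 3]
```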

  • Survey fatigue 
  • The phenomenon whereby people get fed up with filling in survey questionnaires. It becomes more acute when surveys are repeated too often, or when they appear to be irrelevant, or pointless. Surveys which ask informants what they want changed, but after which no change occurs, will often lead through survey fatigue to a lower response rate next time the survey is run. 

  • Systematic sample

  • A sample selected by taking every Nth record in the sampling frame. This is fine, provided that there is no pattern in the sampling frame which would mean that you would be picking the same sort of informant every time.

    Say every block of flats on an estate has ten flats. There are three floors, each with three one-bedroom flats, then a top floor with a three bedroom flat. You want a 10% sample, so you choose every tenth property. You will get either all one bedroom, or all three bedroom properties and the sample will not be representative. In this case, you need to use a truly random sampling technique instead.
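    The problem with the blocks of flats can be demonstrated with a short sketch. The estate of 20 identical blocks is a hypothetical sampling frame, and the function name is an assumption.

```python
# One block: three floors of three one-bedroom flats, then a three-bedroom flat on top.
BLOCK = ["1-bed"] * 9 + ["3-bed"]
sampling_frame = BLOCK * 20  # hypothetical estate of 20 identical blocks


def systematic_sample(frame, interval, start=0):
    """Take every `interval`-th record, beginning at index `start`."""
    return frame[start::interval]


for start in (0, 9):
    sample = systematic_sample(sampling_frame, 10, start)
    # Each sample contains only one kind of property.
    print(start, len(sample), set(sample))
```

    Whichever starting point is chosen, the sampling interval of 10 coincides with the pattern in the frame, so the sample contains only one type of property.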

T 

  • Time used 
  • A method of fixing consultancy fees. We perform whatever parts of the project you have instructed us to do, we keep records of the time we devote to your work, and bill you for the time spent. Our daily rate for consulting work is currently £800 per day plus expenses and VAT. For part days, we bill at £100 per hour plus expenses and VAT. Our minimum billing period is 5 minutes, so we don't charge for an hour if the job takes only ten minutes.

  • Topic 
  • Items (questions) may be grouped into topics, either for reporting purposes or to allow topic averages to be calculated. Topics are sometimes called clusters.

  • Topic average 
  • Topic averages may be just the arithmetic mean of the results for the items which make up the topic. If we are calculating topic averages for you, the items in the topic must all use the same scale for responses but positively and negatively keyed questions can be combined to produce a measure of how favourably informants have responded to the topic.

    They can be weighted if you wish, so that some items are given more weight than others. Each item can be included in as many topics as you wish, and may have a different weighting assigned for use in each topic in which it features.
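    One way of combining positively and negatively keyed items in an unweighted topic average is sketched below. Reversing the scores of negatively keyed items before averaging is an assumed convention, not something the glossary spells out, and the item averages are hypothetical.

```python
def topic_average(item_averages, scale_min, scale_max):
    """Unweighted topic average in which higher always means more favourable.

    item_averages is a list of (average_score, positively_keyed) pairs.
    """
    favourable = [
        # Reverse negatively keyed scores, e.g. 2 becomes 4 on a 1-5 scale.
        score if positively_keyed else scale_min + scale_max - score
        for score, positively_keyed in item_averages
    ]
    return sum(favourable) / len(favourable)


# Hypothetical item averages on a 1-5 scale; the second item is negatively keyed.
items = [(4.0, True), (2.0, False), (3.5, True), (4.5, True)]
print(topic_average(items, 1, 5))  # 4.0
```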

  • Transfer of learning
  • Transfer of learning has occurred when knowledge and skills learned show themselves in the behaviour of the learner. It is the difference between knowing how a situation should be handled and actually doing it that way when it arises. 

    Many drivers would be able to tell you that the right way to deal with a rear wheel slide is to steer into it. Not so many would actually do the right thing when the skid happened. They are the ones for whom transfer of learning has occurred, usually as a result of having had the motivation and the opportunity to practise.

V 

  • Validity
  • A measure of the extent to which an instrument truly measures what it claims to measure. For example, if we are trying to construct a measure of customer loyalty, we might include an item which says Next time you need a widget, will you choose an ABC widget? The item is said to have face validity if, as in this case, it appears on the face of it that it would measure customer loyalty. (It is actually a measure of repurchase intention, which is one aspect of customer loyalty.) 

    If we offer a scale of responses from certainly not to certainly and administer the instrument to several different groups of people, we will get a good measure of the relative loyalty of the different groups. All we have measured so far, though, is what people say they will do. If, as part of the instrument development process, we can administer the instrument to a group of people whose widget purchasing we can then monitor, so we know who bought a widget, and whether the one they bought was indeed an ABC widget or some other manufacturer's, we call this data the criterion. It is the standard by which we are testing our instrument in the way that an instrument for measuring distance might be checked against a known official measure. We can then calculate the correlation between the responses to the question in the instrument and the criterion - people's real loyalty as demonstrated by their buying behaviour. We may be able to show a link between the results from the item in the instrument and people's future behaviour. The strength of this link is a measure of the predictive validity of the item in our instrument. 

    This is a costly and difficult process to go through and it is often impossible or impractical to obtain criterion data. For this reason, many employee and customer satisfaction measures depend on face validity alone.

  • VAT 
  • Value Added Tax. The European Union sales tax. In the U.K. the VAT rate is currently 17.5% of the cost of most goods or services, including ours. 

  • Verify 
  • Key data a second time, comparing the first and second versions to see that they agree. Provides greater confidence in the accuracy of the data to be analysed. 

W 

  • Weightings 
  • When we average several items to arrive at a measure of an overall concept, i.e. a topic or cluster, sometimes it isn't appropriate to give every item in the list equal importance. Or if we are trying to estimate the views of a population but the demographic mix in the responses we have received doesn't reflect the equivalent mix in the population, we need an average which gives the different classes weights which reflect their representation among the population rather than how many responded to our survey.

    In either case, the solution is to use weighted averages. In this example, we have decided that item 2 is twice as important as items 1 and 3, so its weighting is twice the weighting assigned to the others, and its value influences the result more than they do. A simple average of the three values is 35. The weighted average, the total of the Weight x Value column divided by the sum of the weights, is 190/4 = 47.5.

    Item        Value    Weight    Weight x Value
    1           20       1         20
    2           75       2         150
    3           10       1         10
    Totals      105      4         190
    Averages    35                 47.5

     

    We can calculate weighted topic average results for a topic, and we can calculate weighted subset results.