The "2017 Global Cybersecurity Assurance Report Card" reports on a web-based survey of 700 IT security people from mid to large organizations. The survey was sponsored by Tenable and conducted by CyberEdge. It does a reasonable job of measuring concerns and opinions within the narrow constraints of the specific questions posed. For example, here's one of the survey questions:
The 5 categories of response (1-5) and the 7 listed items (a-g) constitute a Likert scale, a commonplace opinion survey approach with both strengths (primarily low cost) and weaknesses. In respect of just that question, methodological concerns include:
- The question stem is a little ambiguous, referring to 5 being "highest" when it presumably means "of greatest concern" or whatever.
- The 1-5 values are ordinals and do not indicate absolute value or relative proportions. A value of 4 does not literally mean "twice as concerned" as a value of 2. A value of 1 is not "one fifth of the concern" indicated by a 5.
- The use of just a few discrete categories prevents respondents indicating intermediate or marginal responses (e.g. "Right at the top edge" of any category is indistinguishable from "Right at the lower edge" of the same category).
- There is a natural bias against selecting the 'extremes' (or rather, the 1 and 5 categories), on top of which the center category is often chosen when the respondent has no opinion or knowledge, especially if the survey is configured to demand a response to every item. This bias leads to a statistically significant preponderance of responses in the middle category.
- The question concerns "challenges facing IT security professionals" i.e. not the respondents' own challenges, nor those facing their organizations, but their opinions about an ill-defined/unspecified generic group. Who are these "IT security professionals" anyway? I suspect each survey respondent has their own interpretation.
- The specific a-g sequence of Likert items can influence the result of this kind of question. The sequence could simply have been randomized for each respondent to eliminate the bias, but the methodology statement does not say so.
- Calling the items "challenges" immediately frames the problem space in a particular way: these are problems, things of concern, difficulties, troublesome areas. With slight re-wording, the question could have referred to "interests" or "investments", or even "opportunities [for improvement]" which would probably have influenced the result.
- The Likert items - the particular "challenges" listed - narrowly constrain the responses, and they too are ambiguously worded. There appears to be no way for survey respondents to identify other "challenges", or to question the meaning of the items listed, or to provide explanatory comments amplifying or qualifying their responses. Item e, for instance, raises questions about the precise meaning of every word: what is 'low' (e.g. is that below some other thing, low on some sort of measurement scale, or lower now than at a previous or later time?), 'security' (IT security, information security, physical security, safety, protection, national security, or something else?), 'awareness' (a general appreciation, alertness, the motivation to behave differently, completion of a training course, self-study, or what?), and yes even 'employees' (meaning everyone on the payroll or just staff or IT users maybe?).
- The methodology was (partially) explained, and the survey questions were included in the report along with (some) other basic parameters ... but there is virtually no detail on how the respondents engaged with the study, aside from it being a web survey. Was it on the Tenable website, maybe? Did Tenable send out emails with links to their mailing list, their customers, or some other pre-selected audience? Why did respondents participate? Were they expecting to get something in return? If they self-selected, was it because they have an unusual level of concern about the subject matter, or because they are outspoken critics, or for some other reason?
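To make the point about item order concrete, here is a minimal sketch of per-respondent randomization. It is only an illustration: the labels a-g stand in for the report's items, and `presentation_order` and the use of a respondent ID as the seed are hypothetical choices, not anything described in the report's methodology.

```python
import random

# Placeholder labels for the seven Likert items (the actual wording is not reproduced here).
ITEMS = ["a", "b", "c", "d", "e", "f", "g"]

def presentation_order(respondent_id: int, items=ITEMS) -> list[str]:
    """Return a shuffled copy of the items for one respondent.

    Seeding with the respondent ID (an assumption for this sketch) keeps the
    order reproducible, so answers can be mapped back to the a-g labels later.
    """
    rng = random.Random(respondent_id)
    shuffled = list(items)  # copy, so the canonical order is left untouched
    rng.shuffle(shuffled)
    return shuffled

# Simulate 700 respondents and count how often each item is shown first.
orders = [presentation_order(rid) for rid in range(700)]
first_position_counts = {item: 0 for item in ITEMS}
for order in orders:
    first_position_counts[order[0]] += 1

print(first_position_counts)
```

With the order shuffled like this, no single item is systematically the first one each respondent sees, so any position effect should be spread roughly evenly across all seven.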
The report states the results of that question thus:
Although it is not actually stated in the report, I presume the numbers are means of the values selected by respondents, which immediately raises statistical concerns since (as stated above) the values are ordinal numbers. Are means even valid in this case? (I'm no statistician, but I suspect not).
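A small sketch shows why a mean alone can be uninformative for this kind of data. The two response sets below are invented purely for illustration (they are not from the report), and `summarize` is a hypothetical helper. A common suggestion for ordinal data is to report the median and the full distribution alongside, or instead of, the mean.

```python
from collections import Counter
from statistics import mean, median

# Two invented sets of ten responses on the 1-5 scale (not data from the report).
polarized = [1, 1, 1, 1, 3, 3, 5, 5, 5, 5]
neutral = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]

def summarize(responses):
    """Summarize ordinal responses without relying on the mean alone."""
    counts = Counter(responses)
    return {
        "mean": mean(responses),
        "median": median(responses),
        # Number of respondents choosing each of the categories 1..5
        "distribution": [counts.get(value, 0) for value in range(1, 6)],
        # Share of respondents choosing 4 or 5
        "top_two_share": (counts.get(4, 0) + counts.get(5, 0)) / len(responses),
    }

polarized_summary = summarize(polarized)
neutral_summary = summarize(neutral)
print(polarized_summary)
print(neutral_summary)
```

Both sets have a mean of 3, yet in one of them 40% of respondents chose the top category and in the other nobody did. A single averaged figure per item cannot distinguish between those situations.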
Notice that all the reported values are above 3.3, indicating an overall preponderance of 4s and 5s - in other words, respondents are generally concerned about the 'challenges' listed. That is not unexpected since respondents are described as "IT security professionals" who are generally risk-averse by nature, but nevertheless it is a systematic bias in the method.
Notice also that the scale on figure 10 ranges from 3.00 to 3.90, not 1 to 5. Without further information, we have no way of determining whether the differences between the items are significant. There are no confidence limits or other statistics concerning the range of responses to each item. We aren't even told how many responses there were to each item in the question. Giving two decimal places implies a degree of precision that I suspect may be misleading, but it does rather conveniently allow them to rank the "challenges" in numerical order with no ties.
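As a rough sketch of what is missing, the following computes approximate 95% confidence limits for two items. Everything here is an assumption for illustration: the response counts are invented (the report gives none), `mean_with_ci` and `expand` are hypothetical helpers, the interval uses a simple normal approximation, and it takes for granted that averaging ordinal values is meaningful at all.

```python
from math import sqrt
from statistics import mean, stdev

def mean_with_ci(responses, z=1.96):
    """Return (mean, lower, upper) using a normal approximation (~95% for z=1.96)."""
    m = mean(responses)
    half_width = z * stdev(responses) / sqrt(len(responses))
    return m, m - half_width, m + half_width

def expand(counts):
    """Turn counts for categories 1..5 into a flat list of individual responses."""
    return [value for value, n in zip(range(1, 6), counts) for _ in range(n)]

# Invented counts of 1, 2, 3, 4 and 5 responses from 700 respondents (not from the report).
item_x = expand([40, 90, 200, 230, 140])
item_y = expand([45, 95, 205, 225, 130])

x_mean, x_low, x_high = mean_with_ci(item_x)
y_mean, y_low, y_high = mean_with_ci(item_y)

print(f"item x: {x_mean:.2f} ({x_low:.2f} to {x_high:.2f})")
print(f"item y: {y_mean:.2f} ({y_low:.2f} to {y_high:.2f})")
print("intervals overlap:", x_low <= y_high and y_low <= x_high)
```

In this made-up case the two means differ in the second decimal place, so they can be ranked without a tie, yet their intervals overlap substantially. Without the equivalent figures from the actual survey, a ranking based on such small differences is hard to interpret.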
Oh, and notice the legend to figure 10: it and the following text refer glibly to "top challenges" whereas in fact the challenges were chosen by the surveyors, not by respondents. Speaking personally, my "top challenges" are not even listed as options. I am barely even interested in the items listed, except for security awareness of course, and in that I willingly admit extreme bias since I make my living from addressing that "challenge" (except I consider it an opportunity!!).
PS Despite several mentions of "2017" in the report, I believe we are still in 2016. Or have I been in a coma all year?