Ranking FBI Most Wanted cyber threat actors puts APT10 in the top slot. Or does it, really? Well, it all depends on how you look at it.
51 graduate students from 12 countries took part in ranking FBI Most Wanted cyber threat actors. They used a range of structured analytical techniques that included pair ranking and a set of 11 Long Matrices.
My name is Alexei Kuvshinnikov and I teach the use of Structured Analytical Techniques for strategic threat assessment and forecasting. I am a member of the International Association for Intelligence Education (IAFIE).
My core audience is unvetted graduate students majoring in foreign affairs. 51 graduate students from 12 countries – Brazil, China, France, Germany, Italy, Libya, Slovakia, Russia, South Korea, Switzerland, UK and the USA – attended my Fall 2020 course.
The course ended with an exam. And I took extra care to stun my class with a surprise tasking. Students got 130 minutes to rank seven organisations from the FBI Most Wanted list of cyber threat actors and determine which one posed the highest threat.
Candidates included APT41, APT10, GozNym, Boyusec, SamSam, Mabna Institute and JabberZeus.
None of the students had any particular previous knowledge of the cyber threats scene. So, I threw them into the water at the deep end. But the course had given them all the skills and tools they needed to do the job.
As a first step in ranking FBI Most Wanted cyber threat actors, students had to take a web delve to conduct basic OSINT research.
They needed to collect evidence that allowed forming an informed judgement about the candidates’ strengths and weaknesses. Since they were graduate students, I expected them to be skilled in unstructured OSINT research, including research on an unfamiliar subject. And they did show that they knew both what to look for and how to look for it.
Then students switched to pair ranking to reduce the field from seven to three actors.
Each student did pair ranking independently. At a group level, it translated into some ten combinations of three out of seven. These occurred with different frequencies, too. As an example, APT10 finished in the top three a total of 39 times. To compare, APT41 did so 35 times, GozNym 5 times, Boyusec 5 times, SamSam 14 times, Mabna Institute 26 times and JabberZeus 5 times.
And finally, students used a set of four different types of Long Matrices (11 variations in total) to do the actual threat ranking.
Thus, each student assigned values to 364 data points. Accordingly, the complete data set of my study included 18564 unique values. I hope you are halfway impressed.
When summing the 51 individual Mean Scores, APT10 firmly took the overall top-ranking slot with a commanding 8.37% lead.
- APT41 came out as the runner-up.
- And Mabna Institute ended in the third slot, trailing APT10 by a whopping 40.68%.
You see, here is why I find this result important.
- There were 51 research-savvy people with very different backgrounds and correspondingly different biases.
- They individually reviewed a significant number of different sources.
- And extracted very different evidence.
- Then each of them, using their own unique reasoning, assigned unique values to 364 data points.
And all that produced a clear group preference for APT10.
That´s the power of structured analytical techniques.
But here are some other interesting findings that I extracted from students´ analysis.
As a start, here are the rankings of actors with highest total scores by type of the Long Matrix.
APT10 consistently took the top-ranking slot in all. And with a comfortable lead.
APT41 trailed it by a mean of 9.60%. Between its highest and lowest margin of lag depending on the LM type there is what I call a “fork” of 4.10% (7.58-11.68).
Looking at the third-ranking Mabna Institute, it trails far, far behind APT10 in terms of the total score. The mean gap is 40.12%.
But hey, look again at the “fork” between its highest and lowest margin of lag: it’s only 3.94% (38.03-41.97). And compared to the runner-up, it shrank.
SamSam, in the fourth-ranking slot as measured by the total score, falls behind the horizon (66.65% mean gap). But the fork in its margin spread shrinks even more, to a meagre 1.99% (65.62-66.71).
Fifth-ranking GozNym has a margin fork of 3.95% (84.08-88.03), similar to that of the Mabna Institute.
Sixth-ranking Boyusec basically stays in the same range with a margin fork of 1.67% (86.10-87.77).
And the margin fork of the bottom-ranking JabberZeus covers only 0.72% (88.14-88.86).
Now, what could THAT mean? There is clearly some kind of pattern present here.
A key benefit of structured analytical techniques is that they facilitate collaboration. And team work is the best-known method of bias mitigation. Working in a team cancels out individual biases.
From this angle, 51 individual exam reports analysed together show some similarity with a group analytic product. That could be a reason for a notable consistency of ranking results across different types of Long Matrices.
Variance in lag margin forks in all matrix types fits into a rather tight range of 3.38%.
Furthermore, the Long Matrix in particular tends to impose greater rigor and discipline on individual reasoning. As a consequence, it can significantly reduce occurrence of random choices in the valuation of attributes.
And that can explain why lag margin forks are not only distributed within a fairly tight range BUT ALSO within a fairly tight AND fairly LOW one – 0.72%-4.10%.
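For readers who want to check the arithmetic, here is a minimal sketch of how a “fork” can be computed. It assumes the fork is simply the highest lag minus the lowest lag across matrix types, and it uses the bounds quoted above for five of the actors. The names `lag_bounds`, `fork` and `spread` are mine, made up for illustration.

```python
# Lowest and highest lag behind APT10 (in %) across Long Matrix types,
# copied from the figures quoted in the text for five of the actors.
lag_bounds = {
    "APT41": (7.58, 11.68),
    "Mabna Institute": (38.03, 41.97),
    "GozNym": (84.08, 88.03),
    "Boyusec": (86.10, 87.77),
    "JabberZeus": (88.14, 88.86),
}


def fork(bounds):
    """Assumed definition: highest lag minus lowest lag, rounded to 2 decimals."""
    low, high = bounds
    return round(high - low, 2)


forks = {name: fork(bounds) for name, bounds in lag_bounds.items()}

# Distance between the widest and the narrowest fork in this subset.
spread = round(max(forks.values()) - min(forks.values()), 2)

for name, value in forks.items():
    print(f"{name}: fork of {value}%")
print(f"Spread between widest and narrowest fork: {spread}%")
```

With these inputs the widest fork belongs to APT41 and the narrowest to JabberZeus, so the spread is the distance between those two.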
Somehow, all those hidden correlations feel kind of weird, don’t they?
Anyway, I delved into the data a bit deeper still – and the going got even more interesting.
Now I looked at the results of ranking FBI Most Wanted cyber threat actors from a different angle. And it quickly became obvious that APT10’s winning in the ranking by total score was winning by volume.
After all, this threat actor was kind of a champion of pair ranking. APT10 finished in the top three a total of 39 times. That is some 10% more often than APT41 (35), which is not a big deal. Mabna Institute, lagging by some 33.3% (26), already faces a clear disadvantage. SamSam, with a lag of 64.1% (14), is relegated to the lower league. And GozNym, Boyusec and JabberZeus (5 each) score a whopping 87.2% less often than APT10.
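The lag figures can be reproduced from the pair ranking counts with a few lines of Python. This is just a sketch: the counts are the ones reported here, while the function name `lag_behind_leader` is my own invention. The lag is taken as the shortfall relative to the leader's count.

```python
# Top-three finishes per actor in the pair ranking, as reported in the article.
top_three_counts = {
    "APT10": 39,
    "APT41": 35,
    "Mabna Institute": 26,
    "SamSam": 14,
    "GozNym": 5,
    "Boyusec": 5,
    "JabberZeus": 5,
}


def lag_behind_leader(counts):
    """Return each actor's shortfall relative to the leader, in percent."""
    leader = max(counts.values())
    return {
        name: round(100 * (leader - count) / leader, 1)
        for name, count in counts.items()
    }


lags = lag_behind_leader(top_three_counts)
for name, lag in lags.items():
    print(f"{name}: {lag}% behind the leader")
```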
Since more people included APT10 in their “shortlists” for threat ranking, its overall score turned out the highest. That´s elementary. But does this result correctly reflect students´ ACTUAL threat perception?
Now, the mere fact that APT10 took part in more rankings than any other threat actor may suggest a certain consensus on the severity of its threat. But it makes good sense to try and disentangle threat perception from the impact of volume.
So, I´ve calculated the second-level mean scores from the 51 first-level (individual) mean scores of all threat actors in each of the LM types plus overall.
And guess what? Ranking FBI Most Wanted cyber threat actors by mean scores in the place of total scores led to a serious reshuffling of the ranking hierarchy.
Now it was APT41 that clearly came out as the top-ranking threat actor. Admittedly, by a rather small margin – 2.3% – but still.
APT41 achieved that by taking the top-ranking slot in LM19, LM12 and LM2.0. APT10 fell back to the second position.
The third position in the overall ranking by mean score went to JabberZeus. And SamSam finished in the fourth place. True, students picked JabberZeus and SamSam in pair ranking less frequently. But whenever they did, they gave them pretty high threat scores.
Even more interestingly, when applying this method the bottom-ranking actor lagged the top-ranking one by only 12.22%. To compare, ranking by total score resulted in an 88.85% lag.
Let´s get real. I see no way how an organisation that figures on the FBI Most Wanted list can be 90% less of a threat than another organisation on the same list.
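To see how volume alone can flip a ranking, consider a toy sketch. The scores below are hypothetical numbers made up purely for illustration, not exam data, and I assume here that an actor's mean is taken only over the students who shortlisted it.

```python
from statistics import mean

# HYPOTHETICAL individual mean scores, one entry per student who shortlisted
# the actor. These are illustrative values, not data from the exam.
scores = {
    "Actor A": [60, 62, 58, 61, 59, 60, 63, 57],  # shortlisted often
    "Actor B": [70, 72],                          # shortlisted rarely, scored high
}

# Ranking by total score rewards being shortlisted often.
totals = {name: sum(values) for name, values in scores.items()}

# Ranking by mean score reflects how threatening the actor looked
# to those who actually assessed it.
means = {name: mean(values) for name, values in scores.items()}

rank_by_total = sorted(totals, key=totals.get, reverse=True)
rank_by_mean = sorted(means, key=means.get, reverse=True)

print("By total score:", rank_by_total)
print("By mean score: ", rank_by_mean)
```

The actor picked by many students wins on total score, while the rarely picked but highly scored one wins on mean score. That is the same kind of reshuffle the exam data showed.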
Here is a closer look at matrix-specific results.
- LM19-0,02: APT41-APT10-Boyusec-SamSam.
- LM12-0,02: APT41-APT10-JabberZeus-Boyusec.
- LM2.0: APT41-APT10-JabberZeus-SamSam.
But now, here comes the Real Big Surprise.
GozNym confidently takes the top-ranking slot with a 5.83% lead. I find that to be nothing short of sensational.
GozNym finished among the top three actors in only five (out of 51) pair rankings. Accordingly, ranking by volume pushed it down well below the radar. As a matter of fact, it steamrollered GozNym flat. In ranking by total score, GozNym trailed APT10 by 86.82%. But those few students who did pick it obviously thought GozNym posed a critical level of threat.
Why then could GozNym come out at the top when ranking FBI Most Wanted cyber threat actors only in this one type of matrix – LM2.0-CRI?
Apparently, students dug out some highly diagnostic evidence. And LM2.0-CRI served best to incorporate this evidence in their appraisal.
CRI stands for Compound Risk Index. This technique can most accurately translate evidence into valuation of harm and probability. Which means, risk.
LM2.0-CRI is the only long matrix type that allows risk to be treated at the level of each individual attribute. One result is a vastly enhanced granularity of risk assessment. Each threat attribute undergoes assessment for Harm and Probability separately and independently of every other. CRI calibrates threat with risk.
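A rough sketch may help to show why attribute-level risk can differ from a single overall risk index. It is not the actual LM2.0-CRI formula. It only assumes Risk = Harm x Probability for each attribute, and that attribute risks are then averaged, which is my assumption for illustration. The scales (Harm 0-4, Probability 0-1) and all values are hypothetical.

```python
from statistics import mean

# HYPOTHETICAL valuations for one threat actor. The attribute names are
# borrowed from the Long Matrix discussion; the numbers are made up.
harm = {"Sophistication": 4, "Expertise": 3, "Resources": 1}               # assumed scale 0-4
probability = {"Sophistication": 0.9, "Expertise": 0.5, "Resources": 0.2}  # assumed scale 0-1

# Attribute-level risk: Harm and Probability are multiplied per attribute.
per_attribute_risk = {name: harm[name] * probability[name] for name in harm}

# ASSUMPTION: the compound index is the mean of the attribute-level risks.
compound_index = mean(per_attribute_risk.values())

# For contrast: one overall risk index built from averaged Harm and Probability.
overall_index = mean(harm.values()) * mean(probability.values())

print("Per-attribute risk:", per_attribute_risk)
print("Compound index:", round(compound_index, 2))
print("Overall index: ", round(overall_index, 2))
```

In this toy case the two indices differ because high Harm and high Probability coincide on the same attribute. A single overall index averages that coincidence away.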
Use of the Compound Risk Index catapulted GozNym to the top-ranking position. I take this discovery as another proof of this technique´s considerable utility.
Another actionable benefit of CRI is that it triggers the “risk feedback loop”.
When using Long Matrices, analysts conduct the initial risk assessment of threat actors at the start of the project. They reflect these early risk indices in LM19, LM12 and LM2.0. Based on preliminary evidence, these risk valuations are not necessarily very accurate.
By the time analysts get down to LM2.0-CRI, they will have accumulated more evidence. Correspondingly, their risk assessment at this stage would often gain in accuracy. So, benefits involved in the very concept of calculating risk at attribute level are further enhanced by better evidence.
As a result, CRIs rather often deviate quite a bit from the initially calculated risk indices. CRIs can then replace the latter in LM19, LM12 and LM2.0 to produce more accurate results. Use of this “risk feedback loop” sometimes makes a significant difference in the final threat ranking.
Exam tasking did not include use of the “risk feedback loop”. After all, there is only so much that you can squeeze into 130 minutes. But look here at an example of impact of the “risk feedback loop” from another case study.
One last question that begs attention is why APT10 and APT41 finished among the top three in pair ranking contests seven to eight times more often than GozNym to start with.
Well, I guess you should never underestimate the power of biases. They are lurking at every turn of your reasoning.
Students did the pair ranking individually. The mitigating effect of team work did not apply. And it looks like primacy bias and confirmation bias reaped their usual harvest.
Serial position biases in combination with confirmation bias often distort analytic judgements.
Serial position bias is when the impact of evidence depends on whether you process it first or last.
When you succumb to Primacy bias, information that you process early on acquires a disproportionate influence on your judgement as compared to evidence you process later on. It has been empirically observed in 80-90% of 1200 tested persons who were professional UK intelligence analysts.
If you are vulnerable to Recency bias (which happens more rarely), it is information that you process towards the end of your project that has a disproportionate influence on your judgement.
Confirmation bias steps in when you form a leading hypothesis in accordance with the primacy bias. Since it connects best to your expectations, you’ll discard all later information that contradicts the inference so preferred by your mind-set. It’s basic human nature. And as it happens in your subconscious, you won’t even realise you are doing it.
In the tasking, the threat actors were presented as follows.
- Mabna Institute.
You see my point? There is an implicit suggestion of a hierarchy in this presentation.
Accordingly, it is rather probable that students conducted their research in this same sequence. And, as usual, a majority fell prey to primacy bias enhanced by confirmation bias. They subconsciously focused on APT41 and APT10.
And now, let me share with you an example of an unedited exam report. Naturally, I picked one that in my view is really good. But it´s not the only one that I found excellent.
The class average for 51 students was 92.92, and I’ve got a reputation for being “Uncle Scrooge’y” with points in the absence of good reasons.
Course of study: SciencesPo-MGIMO Dual Masters’ degree
Field of study: International Public Management (European concentration)
A French-Danish National, I joined this masters’ programme in September 2019 after four years at University College London, where I read Russian and History graduating with a first-class degree.
Core competencies obtained as part of my academic path range from Russo-Danish relations under Ivan the Terrible to modern Political Economy. That said, I have spent the most time researching European history and philosophy (especially 19th century) and now the European Union.
My professional experience includes working in an Oil and Gas supermajor, a financial advisory, a news agency as well as serving in the military. Upon graduation in 2021, I plan on joining a national administration.
My email is: 3cduhamel -AT- gmail.com
Addressed to: Matt Gorham, Executive Assistant Director, Cyber Group, Criminal, Cyber, Response, and Services Branch, Federal Bureau of Investigations (FBI)
Author: Constantin DUHAMEL
Cyber threats to American sovereignty have been rampant ever since the very development of this technological vector. Political and technological changes of the past 20 years – globalisation and digitalisation chief among them – have significantly dropped the transaction costs associated with operations in the cyber domain and greatly increased the incidence of malign activity conducted in cyberspace to the advantage of America’s adversaries. Russian entities’ interference in the 2016 US election is but the most public example.
Therefore, America’s place in the world and homeland safety is now more than ever challenged, requiring the FBI to be ever-more careful in allocation of its resources and thus maximise efficiency of its mechanisms and effectiveness in securing outcomes favourable to American society.
As such, this report has taken the initiative to measure which of the major cyber entities present on the FBI Wanted List ought to command the FBI Cyber Group’s attention and resources (capital and manpower) in terms of threat and risk. This report will define Threat as Intent x Capability and Risk as Harm x Probability.
While the author of this report does not represent American views as a European citizen and does not have expertise in the cyber world, he has operational experience in the intelligence sector of a NATO ally.
Six cyber threats were examined: APT41 (China), APT10 (China), GozNym (CIS), Boyusec (China), SamSam (CIS), Mabna Institute (Iran), JabberZeus (CIS).
This research has demonstrated that the greatest threat is posed (in descending order) by APT41, APT10 and GozNym with a weighted average of 70.08, 54.95 and 46.80 where 100 = most threatening. Utmost priority must be given to APT41 on account of its long-standing achievements and broad scope of operations.
These three groups are associated with Chinese external intelligence, military intelligence, and Russian government-influenced organised crime, respectively.
This result confirmed my a priori assumptions that the closer the unit to government entities and resources, the more dangerous its threat is likely to be to domestic American interests. That said, the chosen methodology allowed me to break from my a priori bias as to the importance of Russia’s capabilities, exposing the expedient if more risky and opportunistic nature of the GozNym threat.
Methodology: final average is the sum of four structured analytical techniques (weighted averages) computing a different threat analysis depending on the inputs – ways to look at threats – used in the chosen model.
Firstly, I eliminated all but 3 of the most important threats by measuring them against each other informed by basic OSINT research.
Secondly, I used the Long Matrix 19 (LM19) technique evaluating the threats in accordance with 19 criteria with a mark of 0-4, forming Intent and Capability. Key among them were Links to Government, Size of group, Sophistication, Expertise and Monopoly. If the first two are obviously important, it became necessary to distinguish Expertise from Sophistication. Our research found GozNym was more adept but drawn to smaller-scale financial fraud – unlike APT10 or APT41’s ambitions, which are of systemic nature.
Less important attributes such as Links to Organised Crime and Corruption actually had a crucial impact on the results. Perceived as advantages (freedom to be creative and untraceable) initially, they hindered the effectiveness and scope of GozNym in favour of its Chinese peers.
Thirdly, the Long Matrix 12 (LM12) was used in similar fashion, with weights introduced from LM19 to compensate for the smaller number of variables.
They increased the gap this time between APT41 and the rest (APT10/GozNym) – demonstrating the importance of the Resources attribute for the latter to perform at a level similar to those of the former. This is compelling evidence for devoting most resources to tackling the APT41 challenge.
Finally, the Long Matrix 2.0 (LM 2.0) enabled me to deal with compound risk – risk measured by attribute rather than overall (i.e. as a multiplication rather than addition) for greater accuracy. It confirmed LM12 results, creating an almost perfectly equidistant spectrum of threats with APT41, APT10 and GozNym separated by 30 points in descending order. As with LM12, it leaves no ambiguity as to the order in which these threats ought to be engaged.
Despite the political points likely to be gained from policymakers by dealing with GozNym and tackling ‘Russian threats’, its clearly APT10 that deserves the attention after APT41 from the point of view of utility.”
Thank you Al for your interesting and insightful course and for the tools that you gave us! Really glad that our exam reports did not go into oblivion and made a basis for such a fruitful and thoughtful research!