Count-Min Sketch (CMS) and HeavyKeeper (HK) are two realizations of a compact frequency estimator (CFE). These are a class of probabilistic data structures that maintain a compact summary of (typically) high-volume streaming data, and provides approximately correct estimates of the number of times any particular element has appeared. CFEs are often the base structure in systems looking for the highest-frequency elements (i.e., top-K elements, heavy hitters, elephant flows). Traditionally, probabilistic guarantees on the accuracy of frequency estimates are proved under the implicit assumption that stream elements do not depend upon the internal randomness of the structure. Said another way, they are proved in the presence of data streams that are created by non-adaptive adversaries. Yet in many practical use-cases, this assumption is not well-matched with reality; especially, in applications where malicious actors are incentivized to manipulate the data stream. We show that the CMS and HK structures can be forced to make significant estimation errors, by concrete attacks that exploit adaptivity. We analyze these attacks analytically and experimentally, with tight agreement between the two. Sadly, these negative results seem unavoidable for (at least) sketch-based CFEs with parameters that are reasonable in practice. On the positive side, we give a new CFE (Count-Keeper) that can be seen as a composition of the CMS and HK structures. Count-Keeper estimates are typically more accurate (by at least a factor of two) than CMS for “honest” streams; our attacks against CMS and HK are less effective (and more resource intensive) when used against Count-Keeper; and Count-Keeper has a native ability to flag estimates that are suspicious, which neither CMS or HK (or any other CFE, to our knowledge) admits.
2022
Adversarial Correctness and Privacy for Probabilistic Data Structures
Mia Filić, Kenneth Paterson, Anupama Unnikrishnan, and 1 more author
In CCS ’22: Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, Nov 2022
We study the security of Probabilistic Data Structures (PDS) for handling Approximate Membership Queries (AMQ); prominent examples of AMQ-PDS are Bloom and Cuckoo filters. AMQ-PDS are increasingly being deployed in environments where adversaries can gain benefit from carefully selecting inputs, for example to increase the false positive rate of an AMQ-PDS. They are also being used in settings where the inputs are sensitive and should remain private in the face of adversaries who can access an AMQ-PDS through an API or who can learn its internal state by compromising the system running the AMQ-PDS. We develop simulation-based security definitions that speak to correctness and privacy of AMQ-PDS. Our definitions are general and apply to a broad range of adversarial settings. We use our definitions to analyse the behaviour of both Bloom filters and insertion-only Cuckoo filters. We show that these AMQ-PDS can be provably protected through replacement or composition of hash functions with keyed pseudorandom functions in their construction. We also examine the practical impact on storage size and computation of providing secure instances of Bloom and insertion-only Cuckoo filters.
2018
Modelling the Connection between GNSS Positioning Performance Degradation, and Space Weather and Ionospheric Conditions using RReliefF Features Selection
The relationship between space weather and ionospheric conditions and GNSS position degradation has been recognized in numerous scientific studies. However, the relationship quantification remains a valuable scientific goal. In this manuscript, recent refinements in modelling of the level of GNSS positioning performance degradation caused by space weather and ionospheric dynamics are presented. The selected supervised machine learning (ML) method based on Linear Models (LM) and RReliefF variable selection process are used on experimentally collected data set in a quiet space-weather period.
On Development of the Forecasting Model of GNSS Positioning Performance Degradation due to Space Weather and Ionospheric Conditions
Mia Filić
In 2018 2nd URSI Atlantic Radio Science Meeting (AT-RASC), May 2018
Space weather and ionospheric conditions effects on the Global Satellite Navigation System (GNSS) positioning performance and operation have already been identified. However, the qualification of this relationship is still a subject of scientific activities. A model forecasting the level of GNSS positioning performance degradation caused by space weather and ionospheric dynamics represents a valuable scientific goal. This manuscript addresses the refinement in forecasting model development procedure achieved through utilisation of selected supervised machine learning method based on Linear Models (LM) and Component Analysis (PCA) on experimentally collected data set of the quiet space-weather period.