Static News

Geometric Search Trees (g-trees.github.io)

fanf2 | 6 comments

msy|next|[-][1 more]

If you're interested in how this is put together the repo is here: https://github.com/g-trees/g_trees/ and it uses https://github.com/worm-blossom/demo_macromania.

EdSchouten|prev|next|[-][3 more]

From "2 Related Work":

> None of these data structures can provide a non-probabilistic upper bound on the number of items per vertex. This hampers efficient implementation; and adversarial data suppliers can trivially produce n items in O(n) expected time that must all be stored in the same vertex.

And "4.3 Novel G-Trees":

> Our analysis confirmes that — with high probability — the resulting trees are of logarithmic height and their G-nodes store O(k) items. > > The second key insight toward an efficient k-ary data structure is, paradoxically, that there is no need to be clever about it. Sorted linked lists are naïve, inefficient data structures, yet zip trees are efficient. We can be similarly naïve for our k-ary construction. > > We use a sorted linked list in which every node stores up to k items. We require all items to be stored as early in the list as possible; this is the simplemost way of achieving history-independence. In other words, the only node to store fewer than k items is the final node. We call such a list a k-list (see Figure 9).

To me it's not obvious why an adversarial data supplier can't produce items that all need to go into the same G-node. Sure, it can be transformed into a k-ary data structure by partitioning the items across n/k nodes, but I don't see how that's better than a linked list.

So this makes me wonder: how is this approach any better than a Prolly tree? I see it being mentioned twice in this article, but the comparison doesn't go into detail.

bminor13|parent|next|[-][2 more]

In order to be in the same G-node, they'd need to have the same rank and be close in value (such that they were not "broken up" by a value in the next highest rank), right?

Seems like brute-force search for adjacent values with the same rank is possible, but guaranteeing that intermediate higher-rank values dont also exist may not be (for an attacker). Maybe one mitigation on this sort of attack is to search for higher-rank extra values to insert to break up large G-nodes?

This also assumes they can know the hash function (if rank is chosen by cryptographically-secure hash); maybe also salting values before hashing could thwart these sorts of attacks?

EdSchouten|root|parent|next|[-][1 more]

Exactly. But my concern is that this is not any stronger/better than what Prolly trees already offer, which is why I'm disappointed that they are mentioned under "related work", but not discussed/compared in more detail.

zermelo44|prev|[-][1 more]

The presentation of the webpage is really really nice. Especially the highlighting/linking of mathematical definitions and bound variables.

How did you achieve this?