The application of graph analysis has proven to be a useful abstraction for solving many important problems accross a variety of disciplines. In graph analysis, physical objects, abstract concepts, and other entities are represented as a set of vertices and the relationships between them are represented as edges that connect two vertices together. For example, vertices might include people, Twitter usernames, computers, cars, words, emotions, countries, events like Facebook wall posts - nearly anything. Edges might include relationships like "talks to", "lives in", "friends with", "contains", "retweets", or "posts on Facebook wall" - again, nearly any relationship. You may have noticed that the wall post example was given as both a vertex and an edge. This is because how you encode events and relationships is up to you. You could decide that these posts will be vertices and that relationships to each event might include "authored by", "commented on by", "liked by", etc. Similarly, you could decide that you would like to store a graph containing who posts on whose wall. As an trivial and somewhat frivolous example, consider the graph below consisting of the vertex types "people" and "things that are eaten" and the edge types "friends with" and "eats".
Many problems today can be formulated as dynamic spatio-temporal graph problems. For example, one may wish to track communities within social networks on Facebook as edges (friendship pairs) are added or removed. Or more interestingly, one may look for people who bridge between different social communities, or switch allegiances over time.
As the computer science community increases its development of algorithms and codes for large-scale graph problems, no canonical graph representation has yet to emerge. Without a standard graph representation, algorithms that are implemented for one framework may require substantial programming efforts to port to a different framework. Even worse, algorithms within a single framework may use different data structures for each graph kernel, requiring costly data transformations between each graph kernel subroutine. These inefficiencies in time, space, and productivity, could be reduced or eliminated through a canonical graph representation.
STINGER is a data structure for storing sparse dynamic graphs with temporal and semantic information encoded in the graph. This means that vertices in the graph have types (people, usernames, events, words, etc.) and weights, and that edges in the graph have types (friends with, sends email to, near to), weights, and timestamps (which we treat as created and modified). What the types, weights, and timestamps really mean is up to the user. As a data structure, STINGER can be extended to store more or less information with each vertex and edge as the user sees fit. A basic diagram from the original technical report defining STINGER can be seen below; however, the API for STINGER is designed to abstract away the details of the structure and simply present an interface to vertices with a set of adjacent vertices stored in edges, all having different properties that can be retrieved and set.
For the detail-oriented users, the data structure is similar to an adjacency list except that the adjacencies of a vertex are stored in a linked list of semi-dense blocks of edges rather than a linked list of individual edges. Also, the vertex array contains more metadata per each vertex than just a pointer to the linked of edge blocks. As you can see, vertices have physical identifiers (an arbitrary byte string that maps to that vertex), types (an arbitraty 64-bit integer value), weights (also an arbitrary 64-bit integer value), in and out degrees (set by the structure when edges are inserted and removed), and a list of adjacencies. Edge blocks contain meta data about all edges in the block including the type (so all edges in a block are of the same type).
The fact is that STINGER is also more than just this data structure. STINGER is also software and a collection of tools and algorithms that can process on these dynamic graphs to help us understand more about the data that we have.
Check out the Getting Started guide for information on how to download STINGER, build it, try it out, and get started using it to power your software.