Graph and GNN

Graph structure is everywhere. It can be used for medicine/pharmacy, recommendation systems, social networks, and 3D games or meshes. The graph (G, G = (V, E)) contains nodes/vertices (V) and edges/connections (E), which are presented as features to show properties. Adjacency matrix (V x V) are usually used to represent connection information between nodes.

3 kinds of ML problems can be solved with Graph:

Node-level predictions: predict unlabeled node based on the information of other nodes
Edge-level (Link) predictions: predict whether the two nodes in a graph will be connected or not
Graph-level predictions: classify or predict the attribute of the input graph

GNN

Deep learning and Neural networks are widely used in speech, image and text. While graph data is more complex than them, and Graph Neural Network (GNN) is a network which is used to handle graph data.

A node in the graph is represented as features and can send and receive message along its connections with its neighbors, to update itself and understand the environment.

The core component of GCN is the message passing layer (MP layer). Because the image fusion process is between direct neighbors, the upper bound of how long a message can travel is defined by the number of iterations/MP layers, called hops. Too many MP layers can lead to over-smoothing, so that it will learn nothing new but make nodes indistinguishable from others.

For each node, it first aggregates the features of its direct neighbors and then update its own feature. Repeat this step, every node will iteratively contain the feature-based and structural information about the other nodes. This local and iterative aggregation is similar to learnable CNN kernels.

As shown in the illustration, at each MP layer, a node first aggregate/gather the features of neighbor nodes, combine it in a certain way, and then use it to update the node feature. The sharing mechanism is also called message passing.

Small Tricks

For node-level predictions, the binary mask [0, 1] are usually added to input to distinguish unknown nodes.
The Graph pooling (e.g. Mean-Max pooling) are usually used for global-level over all the nodes.
Batch input: concatenate all node features in a large matrix, and make a large adjacency matrix, where each graph is disconnected with others (set 0).

Graph-based Layout Generation

Graph2Plan

Graphic design is everywhere in our daily life: website page design, magazine layout, floor plan design for architecture, and so on. Therefore, automatic graph-based layout generation is helpful and important for designers to create new designs by the insight of previous ones. Graph2Plan [3] is an efficient deep learning-based approach to implicitly learn the constraints for floorplan generation, which automatically generate floorplan with retrieve-and-adjust paradigm and user in-the-loop designs. This strategy includes human design principles and user preferences and makes a UI system where user can modify the input to each step by editing the images.

The Graph2Plan [3] baseline model automatically generate floorplan with retrieve-and-adjust paradigm and user in-the-loop designs, but it has mainly two drawbacks: (1). User constrains are very simple, important information like room size, functional adjacency is not considered; (2). It relies on retrieval, if the input boundary is not similar to the database, the retrieval and generation will fail because it relies on retrieval and no learning used in graph generation.

Please refer to this slides for a more detailed introduction of Graph2Plan.

Neural Design Network

Neural Design Network (NDN) [4] can generate a graphic design layout given a set of components with user-specified attributes and constraints. It first takes as input a graph with partial edges from user input, completes this graph by filling in unknown edges, and infers these unknown edges to build a graph with complete relationships among components. Next, NDN iteratively predicts bounding boxes for components in the complete graph and fine-tunes the bounding boxes to improve the alignment and visual quality. Please refer to this slides for a more detailed introduction of NDN.