Multivariate Network Layout Using Force-Directed Method with Attribute Constraints

Abstract: Graph visualization with a proper layout is widely applied to understand the relationships between entities in a complex system, and most layout methods rely mainly on topological structure information. Real-world graphs often exhibit community structure, which many existing graph layout methods ignore. Thus, we propose a multivariate network layout method using the force-directed method with attribute constraints. This method can effectively take into account the hierarchical structure, connection strength, and quantitative comparison between communities. First, the layout of community centers is generated by a force-directed algorithm in which the node count of each community is taken as a constraint to balance the areas of the communities. Second, a community force based on node attributes is added to the force-directed algorithm to maintain community clarity. A visualization system is also developed to allow users to interactively generate community structure-aware layout results. Qualitative and quantitative evaluation of the results verifies the usability and effectiveness of the proposed method by comparing it with other methods.


Introduction
As an effective way to show the relationship between entities, network layout is widely used in a large number of fields, such as social networks, logistics networks, and paper citation networks [1]. In practical scenarios, networks often contain a lot of additional attribute information, such as the age and hobbies of user nodes in a social network [2]; such networks are called multivariate networks. The layout of multivariate networks requires full consideration of network attribute information while expressing network topology information [3,4].
In the past few decades, many network layout methods have been proposed, and a wealth of multivariate network visualization methods have been realized on this basis, which can be grouped into three categories: node-link, matrix, and attribute. As the most important layout structure, node-link structures are used to display network information in many network visual analysis systems, e.g., Vister [5], Polaris [6], and Scalable Framework [7]. The latest research on node-link structure pays more attention to time complexity, and many acceleration methods have been proposed [8,9]. Commonly used node-link layout algorithms fall into two categories: the force-directed method [10][11][12][13][14], which simulates the repulsive and attractive forces between nodes in a physical system; and the dimensionality reduction method [14][15][16], which uses the graph distance matrix to project the graph to a high-dimensional space and then reduces it to a low-dimensional space. The force-directed method is simple, easy to implement, and extensible, but does not use attribute information. Similarly, the dimensionality reduction method can maintain structural consistency, but it has poor scalability and also does not use attribute information.
The matrix structure [17] displays the connection relationships of the network with an adjacency matrix. The hybrid structure combines the node-link structure with the matrix structure [18]. For multivariate networks, although these structures clearly display the connection relationships of nodes and the attribute information, the resulting display is not visually intuitive.
The attribute structure ignores the structural information of the multivariate network and directly uses the attribute information for layout. It is possible to convert the attributes into coordinates and then draw the position of each node in the layout space [19]. It is also possible to lay out a network with time attributes along a time axis [20]. The advantage is that the attribute information of the network can be displayed without restriction, but the disadvantage is that the visualization lacks structural information, which is not conducive to exploration.
To tackle the above problems, in this paper, we propose a multivariate network layout method using the force-directed method with attribute constraints (FDAC). We utilize the network topology and node attribute information, based on the user's choice, to realize the visual exploration of multivariate networks. First, through interactive selection by users, the attributes of the nodes are obtained and the communities are allocated, which completes the data preprocessing. Then the attribute community layout is performed to obtain the position of the attribute community of each node. Finally, the node layout is carried out to realize the visualization of multivariate networks. In the layout process, the attribute information of the nodes is fully utilized, and the layout adapts to the user selection. We have implemented an interactive visual system to complete the layout task of the entire multivariate network. The source code of FDAC is available at https://github.com/tddsa/FDAC (accessed on 27 April 2022). Overall, our contributions are as follows:
• We propose a multivariate network layout method, FDAC, based on attribute constraints. It mainly includes attribute community layout and node layout. Based on the force-directed method, multiple constraints are constructed using attribute information.
• We construct an interactive visualization system. The system allows users to select the focus and attributes of interest, generates the corresponding layout adaptively, and supports the visualization of numerical and non-numerical attributes.
• We propose two evaluation metrics to evaluate our layout results. Through qualitative and quantitative evaluation, we verify the usability and effectiveness of the proposed method compared with other state-of-the-art methods.
The structure of this paper is as follows: Section 2 introduces related work, Section 3 explains the proposed FDAC method, Section 4 illustrates the visualization system, Section 5 presents the experimental results, and Sections 6 and 7 present the discussion and conclusion.

Force-Directed Method
The node-link structure draws nodes and edges in the layout space according to certain rules, allowing users to select nodes and edges to achieve the ideal layout [21]. The most basic requirement is that nodes are displayed clearly and without overlap, and that edges overlap and cross as little as possible [22]. The force-directed method is the most popular approach to node-link layout. The pioneering force-directed layout algorithm is that of Tutte [23] in 1963, which is only suitable for 3-connected planar graphs. In 1984, Eades [24] proposed a spring embedding algorithm, whose core idea is to represent the nodes in the network as rings and the edges as springs. Fruchterman and Reingold [10] proposed the FR algorithm based on the Eades algorithm, which controls the displacement scale during iteration to generate a uniformly distributed layout of nodes. Kamada and Kawai proposed the KK algorithm [25], which generates the layout by minimizing the energy function of a spring model of the network topology. The LinLog [26] algorithm also uses an energy function to realize network visualization, but it pulls densely connected nodes together so that the layout reveals clusters of nodes. Jacomy et al. proposed the ForceAtlas2 [27] algorithm, which achieves speed and accuracy improvements. The stress optimization model [28][29][30][31][32] obtains the optimal solution of the energy function through optimization: it first constructs the energy function of the network layout, then differentiates it, and obtains the optimal solution through iteration. This process is strictly convergent [33]. Davidson and Harel proposed the DH algorithm [34], which uses a simulated annealing algorithm to reduce the moving speed of nodes and prevent nodes from approaching unconnected edges, thereby reducing edge crossings. The DRgraph [9] method approximates the graph distance through a sparse distance matrix, uses a negative sampling technique to estimate the gradient, and uses a multi-level layout scheme to accelerate the optimization process. PIM [8] speeds up the force-directed layout by using a processing-in-memory architecture.

Network Visualization
There are many methods for visual analysis of networks using topology and attributes. Some methods analyze the network from a global view and show the overall structure to users. For example, Li et al. [35] use a topology-based approach to detect closely connected subgraphs in a network and represent them as abstract nodes to construct multi-level structures. Gemici et al. [36] analyzed the structure of social networks using information from underlying networks and hierarchies. Ge Huang et al. [37] used a radial layout algorithm based on tree layout to explore the hierarchy of the network. Stef et al. [38] used an approach based on attribute information to aggregate nodes with the same attribute value and provide a coarse network representation. Shen et al. [39] proposed a method that simultaneously utilizes network topology information and attribute information. OnionGraph [40] creates a five-level layout through attributes and topologies and allows users to access each level through context. Edge bundling [41,42] helps users tease out connections in a graph layout by bundling edges that are close together.
Other methods analyze the network from a local view, focusing on nodes that users are interested in. FACETS [43] computes scores according to topology and attributes to find neighbors of the nodes that users are interested in. Crnovrsanin et al. [44] recommended nodes associated with the focus through filtering. After users select nodes, Refinery [45] obtains correlation information through a random walk algorithm. Laumond et al. [46] allowed users to select multiple nodes continuously to help them obtain the regional structure they are interested in.

Methods
In this section, we first introduce the two key components of the FDAC method, i.e., attribute community layout and node layout, and then introduce the two evaluation metrics.

Attribute Community Layout
We utilize the force-directed method with constraints to calculate the location of each attribute community. The forces received by each community include spring force, repulsive force, collision force, and central force. The simulated annealing method is used to obtain the community location coordinates. We first formalize the problem.
We define the entire multivariate network as $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_n\}$ represents the $n$ nodes in the network and $E = \{e_1, e_2, \ldots, e_m\}$ represents the $m$ edges in the network, where each edge $e_k = (v_i, v_j)$ connects two different nodes. On this basis, we obtain the attribute community structure of the multivariate network, $G_C = (V_C, E_C)$, where $V_C = \{v^c_1, v^c_2, \ldots, v^c_{c_n}\}$ represents the locations of the $c_n$ attribute community centers, and $E_C = \{e^c_1, e^c_2, \ldots, e^c_{c_m}\}$ represents the connection relationships between communities. Each edge $e^c_k = \{v^c_i, v^c_j\}$ connects two different attribute communities. Note that $V_C$ and $E_C$ do not exist in the original multivariate network. For each attribute community center $v^c_i$ in $V_C$, its coordinates $v^c_i = (x^c_i, y^c_i)$ are randomly initialized. If there is an edge $e_k$ between nodes of two attribute communities, then there is an edge $e^c_k$ between the two communities.
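The construction of the community-level graph described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `build_community_graph` and the dictionary `community_of` (node to attribute community label) are hypothetical, and the random initialization of center coordinates is omitted.

```python
from collections import Counter

def build_community_graph(edges, community_of):
    """Return (community sizes, set of community-level edges).

    edges: iterable of (u, v) node pairs from the original network.
    community_of: dict mapping each node to its attribute community label.
    """
    sizes = Counter(community_of.values())
    community_edges = set()
    for u, v in edges:
        p, q = community_of[u], community_of[v]
        if p != q:
            # An edge between nodes of two different communities
            # induces one (undirected) edge between those communities.
            community_edges.add(tuple(sorted((p, q))))
    return sizes, community_edges

edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
community_of = {1: "A", 2: "A", 3: "B", 4: "B", 5: "C"}
sizes, community_edges = build_community_graph(edges, community_of)
```

The community sizes returned here correspond to the node counts that the layout later uses to scale the ideal distance between communities.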

Spring Force
Spring force acts on two attribute communities connected by an edge, and its calculation formula is as follows:
where $e^c_k$ is the edge for which the spring force is calculated, and $d_{pq}$ is the Euclidean distance between the centers $v^c_p$ and $v^c_q$ of the two attribute communities $p$ and $q$ connected by $e^c_k$. $L_{pq}$ is the ideal distance between the two attribute communities, i.e., the ideal length of edge $e^c_k$, and $t$ represents time; in our method, it indicates the current iteration number of the algorithm. We set $\alpha$ as the global temperature at time $t$, which implements the simulated annealing algorithm: it reduces the force on each node as the iteration proceeds, limits how sharply node positions change, and finally lets the algorithm converge. $\mathit{strength}_a$ is the elastic strength of the edge, analogous to an elastic coefficient. The formulas of $\alpha$ and $\mathit{strength}_a$ are as follows:
where $T$ is the target value of simulated annealing, and $deg_p$ and $deg_q$ are the degrees of the centers $v^c_p$ and $v^c_q$ of the two attribute communities. The strength of edge $e^c_k$ depends on the two connected nodes. $\alpha_{decay}$ is the decay rate of the simulated annealing algorithm. When $\alpha$ decreases below the threshold, the iteration is terminated.
Communities with different attributes differ in their number of nodes. Therefore, the ideal distance $L_{pq}$ between communities is not static, and the community radius should be determined according to the number of nodes in the community. The formula of $L_{pq}$ is as follows:
where $L_0$ is the basic distance between communities, $n$ is the number of all nodes, and $n^c_p$ and $n^c_q$ are the numbers of nodes contained in the two attribute communities $p$ and $q$. $L_{max}$ is the maximum length of an edge, which depends on the size of the layout area. The calculation formula is as follows:
where $W$ and $H$ are the width and height of the layout area, respectively.
According to the Verlet [1] integration algorithm of particle motion simulation, the velocity changes $\Delta v_{ax}$ and $\Delta v_{ay}$ of the attribute communities $p$ and $q$ in the horizontal and vertical directions under the action of the attractive force $F_a$ can be calculated. The formulas are as follows:
$$b_{pq} = \frac{deg_p}{deg_p + deg_q} \qquad (10)$$
where $x_{pq}$ and $y_{pq}$ are the distances between the attribute communities $p$ and $q$ in the horizontal and vertical directions, respectively. $b_{pq}$ is determined by the degrees of the centers $v^c_p$ and $v^c_q$ of the two attribute communities. As a result, of the two ends of an edge, the one with the smaller degree moves more.
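To make the role of $b_{pq}$ concrete, the following sketch computes the velocity changes of two communities joined by one edge. Only the weight $b_{pq}$ (Equation (10)) is taken from the text; the force magnitude `(d - ideal_len) / d * alpha * strength` and the strength `1 / min(deg_p, deg_q)` are assumptions borrowed from a common force-simulation convention, and the function name `spring_step` is hypothetical.

```python
import math

def spring_step(pos_p, pos_q, deg_p, deg_q, ideal_len, alpha):
    """Velocity changes (dv_p, dv_q) for communities p and q joined by one edge."""
    dx = pos_q[0] - pos_p[0]
    dy = pos_q[1] - pos_p[1]
    d = math.hypot(dx, dy) or 1e-9           # avoid division by zero
    strength = 1.0 / min(deg_p, deg_q)       # ASSUMED form of the edge strength
    scale = (d - ideal_len) / d * alpha * strength  # ASSUMED force magnitude
    b_pq = deg_p / (deg_p + deg_q)           # Equation (10)
    # q takes the share b_pq of the correction, p takes the share 1 - b_pq,
    # so the end point with the smaller degree moves more.
    dv_q = (-dx * scale * b_pq, -dy * scale * b_pq)
    dv_p = (dx * scale * (1.0 - b_pq), dy * scale * (1.0 - b_pq))
    return dv_p, dv_q

# p has degree 1, q has degree 3; the edge is stretched to twice its ideal length.
dv_p, dv_q = spring_step((0.0, 0.0), (10.0, 0.0), 1, 3, ideal_len=5.0, alpha=1.0)
```

In this example the low-degree community $p$ is pulled toward $q$ three times as strongly as $q$ is pulled toward $p$.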

Repulsive Force
The repulsive force is the force with which an attribute community is repelled by the others, and its function is to spread the entire network layout over the plane. For community $p$, the formula for the repulsive force exerted by community $q$ is as follows:
where $\mathit{strength}_r(p, q)$ is the repulsive strength exerted on community $p$ by community $q$, and $\delta$ is a piecewise function. The formula is as follows:
where $S_{max}$ is the maximum repulsive strength, with a default value of 600, and $k$ is a coefficient between 0 and 1. When the distance between two communities is less than the target distance, the community is subject to greater repulsion. Finally, the horizontal and vertical velocity changes of the attribute community $p$ under the action of the repulsive force are as follows:

Collision Force
Collision force is the force that two attribute communities receive when they overlap. It pushes the two communities in opposite directions along the line connecting their centers, which is similar to the repulsive force. When the actual distance is greater than the sum of the community radii, the collision force disappears. The formula is as follows:
where $\mathit{strength}_c$ is the default collision strength. The closer the two communities are, the greater the collision force they receive. Under the effect of the collision force, the velocity changes of the two overlapping attribute communities $p$ and $q$ in the horizontal and vertical directions are as follows:
The purpose is to make an attribute community with many nodes move slowly and a community with few nodes move fast, which keeps the network visualization stable.

Central Force
The central force refers to the force received by each attribute community from the center of the layout area. It makes the center point of the entire network layout close to, or even coincident with, the center point of the layout area, ensuring that the network is drawn in the center of the layout area. Define the coordinates of the center point of the layout area as $(x_o, y_o)$ and the coordinates of the center point of the network as $(x'_o, y'_o)$; the formula is as follows:
where $(x_i, y_i)$ are the coordinates of the center of the $i$-th attribute community and $n^c_i$ is the number of nodes in the $i$-th attribute community. In this way, communities with more nodes have more influence on the computed center of the network. The formula of the central force is as follows:
where $\mathit{strength}_o$ is the strength of the central force. Therefore, the velocity change of the attribute community $p$ in the horizontal and vertical directions under the action of the central force is as follows:
Since the effect of each force on an attribute community is expressed as a velocity in the horizontal and vertical directions, these velocities can be accumulated during each iteration, and the location of the attribute community is moved at the end of each iteration. Therefore, combining the effects of the above four forces on the community center, the total velocity change is as follows:
where $\sum_a \Delta v_{ax}(p)$ is the total velocity change in the horizontal direction of the attribute community $p$ under the attractive force, and the terms for the repulsive and collision forces are similar. Finally, the update formula for the location of the center $(x^c_p, y^c_p)$ of the attribute community $p$ is as follows:
where $(x^c_p, y^c_p)$ on the right-hand side is the position of the center of the attribute community $p$ in the previous iteration, $v_x(p)$ and $v_y(p)$ are the velocities of the previous iteration, and $\beta$ is the decay rate of the velocity, with a default value of 0.5, which controls how sharply node positions change. Therefore, as the algorithm runs and the number of iterations increases, the global temperature $\alpha$ of the simulated annealing algorithm continues to decrease, so that the algorithm gradually converges and finally yields the ideal location of the center of each attribute community.
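The iteration scheme described above (size-weighted network center, accumulated velocity changes, velocity decay, and a cooling temperature) can be sketched as follows. This is a hedged illustration rather than the paper's exact update rules: the cooling schedule `alpha += (target - alpha) * alpha_decay`, the use of `1 - beta` as the retained fraction of the velocity, the value `alpha_decay=0.05`, the threshold `0.001`, and all function names are assumptions.

```python
def cool(alpha, target=0.0, alpha_decay=0.05):
    """Move the global temperature one step toward its target (ASSUMED schedule)."""
    return alpha + (target - alpha) * alpha_decay

def network_center(centers, sizes):
    """Center of the network, weighting each community center by its node count."""
    total = sum(sizes)
    cx = sum(n * x for (x, _), n in zip(centers, sizes)) / total
    cy = sum(n * y for (_, y), n in zip(centers, sizes)) / total
    return cx, cy

def integrate(pos, vel, dv, beta=0.5):
    """Add the accumulated velocity change, damp the velocity, then move.

    ASSUMPTION: the velocity keeps the fraction (1 - beta) each iteration.
    """
    vx = (vel[0] + dv[0]) * (1.0 - beta)
    vy = (vel[1] + dv[1]) * (1.0 - beta)
    return (pos[0] + vx, pos[1] + vy), (vx, vy)

# A community of 3 nodes pulls the network center toward itself.
center = network_center([(0.0, 0.0), (10.0, 0.0)], [1, 3])

# One position update for a single community center.
new_pos, new_vel = integrate((0.0, 0.0), (2.0, 0.0), (2.0, 0.0))

# Iterate until the temperature drops below the (hypothetical) threshold.
alpha, steps = 1.0, 0
while alpha >= 0.001:
    alpha = cool(alpha)
    steps += 1
```

Because every force is scaled by the temperature, the displacements shrink as `alpha` decreases, which is what lets the layout settle.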

Node Layout
Similar to the attribute community layout method in Section 3.1, in the node layout process each node is also subject to attraction, repulsion, collision, and central force. In addition, each node is subject to the community force exerted by its corresponding attribute community. If each attribute community is regarded as a circular area, then the role of the community force is to draw nodes with the same attribute into that circular area. However, relying only on these five forces cannot achieve satisfactory visualization results, so additional constraints must be introduced to these forces to achieve multivariate network visualization based on attribute constraints.

Spring Force Constraint
Spring force acts on nodes connected by an edge. The difference from the spring force in the attribute community layout lies in the ideal distance $L_{ij}$ between the two nodes $i$ and $j$.
When the attributes of two nodes are the same, the two nodes belong to the same attribute community. In this case, the formula for calculating $L_{ij}$ is as follows:
where $r$ is the default radius of a node, $deg_i$ is the degree of node $i$, and $deg_{max}$ is the maximum degree of all nodes.
When the attributes of two nodes are different, the two nodes belong to different attribute communities $p$ and $q$. In this case, the formula for calculating $L_{ij}$ is as follows:

Repulsive Force Constraint
For each node, the repulsive force exerted by a node with the same attribute and by a node with a different attribute should differ, so the strength of the repulsive force exerted on node $i$ by node $j$ is modified. The formula is as follows:
where $a_i$ is the attribute of node $i$, $S$ is the default repulsive force strength, with a default value of −30, and $\mu$ is an adjustable parameter.

Community Force
Community force refers to the attraction that the center of an attribute community exerts on the nodes with the same attribute. Its function is to draw the nodes with the same attribute into the same attribute community. The formula for the attraction of node $i$ in the attribute community by the community center $v^c_p = (x^c_p, y^c_p)$ is as follows:
where $d_{ip}$ is the Euclidean distance between node $i$ and the community center $v^c_p$, $\sigma$ is a parameter that controls the ideal distance between the node and the community center, and $\mathit{strength}_k(i, p)$ is the strength of the community force. Its formula is as follows:
where $S$ is the default value of the community force strength, $n_{i-d}$ is the number of nodes with different attributes connected to node $i$, and $n_{i-s}$ is the number of nodes with the same attribute connected to node $i$. In this way, nodes whose neighborhood contains more nodes of different types than of the same type are drawn closer to their own attribute communities, which maintains discrimination. Further, under the influence of the community force, the velocity change of node $i$ in the horizontal and vertical directions is:
where $x_{ip}$ and $y_{ip}$ are the distances between node $i$ and the center of the attribute community $v^c_p$ in the horizontal and vertical directions, respectively, and $deg_{p-max}$ is the maximum node degree in the attribute community $p$.
Finally, under the action of attractive force, repulsive force, collision force, central force, and attribute community force, the position of each node is calculated. The formula is as follows:
Algorithm 1 summarizes the entire procedure, i.e., the overall layout algorithm of the multivariate network with attribute constraints.

Evaluation Metrics
This paper proposes two evaluation metrics to measure the clarity and discrimination of the attribute community in the layout results from two aspects.
The first metric is the average distance between nodes in attribute communities (ADIAC). This metric measures the degree of cohesion of the attribute communities. On the one hand, it reflects how tightly nodes cluster within an attribute community: the smaller its value, the more distinguishable the community. On the other hand, when ADIAC tends to 0, nodes are stacked towards the community center, which causes serious node overlap and makes it difficult to show the connection relationships between nodes. The formula of ADIAC is as follows:
where $m$ is the number of all communities under the constraint of the attribute, $C_k$ is the $k$-th attribute community, $n_k$ is the number of nodes in $C_k$, $i$ and $j$ are two different nodes in $C_k$, $x^*_i$ is the abscissa of node $i$ after min-max normalization, and $y^*_i$ is the corresponding ordinate. The formula for min-max normalization is as follows:
$$x^* = \frac{x - x_{min}}{x_{max} - x_{min}}$$
where $x$ is the abscissa of the node in the layout result, $x_{min}$ is the minimum abscissa of all points, $x_{max}$ is the maximum abscissa of all points, and $x^*$ is the normalized result. Min-max normalization removes the influence of the size of the layout area on the evaluation result.
The second metric is the average distance between attribute communities (ADBAC), which measures how distinguishable the attribute communities are. The larger the distance between attribute communities, the greater the degree of distinction between them, and the more clearly the layout results are presented. When the value of the metric tends to 0, the different attribute communities are very close to each other; they then overlap, and it is difficult to identify the attribute communities in the layout results. The formula of ADBAC is as follows:
where $i$ and $j$ denote the $i$-th and $j$-th attribute communities, respectively, and $\bar{x}_i$ and $\bar{y}_i$ are the abscissa and ordinate of the centroid of the $i$-th attribute community. The formula of $(\bar{x}, \bar{y})$ is as follows:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x^*_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y^*_i$$
where $(x^*_i, y^*_i)$ are the coordinates of the $i$-th node of a given attribute community after min-max normalization, and $n$ is the number of nodes in this attribute community.
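The two metrics can be sketched in code as follows. This is an illustrative reading of the description above, with two assumptions stated openly: ADIAC is taken as the mean, over communities with at least two nodes, of the average pairwise node distance, and ADBAC as the average distance over all unordered pairs of community centroids. The function names and the input format (`points`: node to coordinates, `community_of`: node to community label) are hypothetical.

```python
import math
from itertools import combinations

def min_max(values):
    """Min-max normalize a list of numbers to [0, 1]."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # guard against a degenerate layout
    return [(v - lo) / span for v in values]

def normalize(points):
    xs = min_max([p[0] for p in points.values()])
    ys = min_max([p[1] for p in points.values()])
    return {node: (x, y) for node, x, y in zip(points, xs, ys)}

def group_by_community(points, community_of):
    norm = normalize(points)
    groups = {}
    for node, label in community_of.items():
        groups.setdefault(label, []).append(norm[node])
    return groups

def adiac(points, community_of):
    """Average intra-community node distance (ASSUMED averaging scheme)."""
    per_community = []
    for members in group_by_community(points, community_of).values():
        pairs = list(combinations(members, 2))
        if pairs:
            per_community.append(sum(math.dist(a, b) for a, b in pairs) / len(pairs))
    return sum(per_community) / len(per_community) if per_community else 0.0

def adbac(points, community_of):
    """Average distance between community centroids (ASSUMED averaging scheme)."""
    centroids = [
        (sum(x for x, _ in ms) / len(ms), sum(y for _, y in ms) / len(ms))
        for ms in group_by_community(points, community_of).values()
    ]
    pairs = list(combinations(centroids, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0

points = {1: (0, 0), 2: (0, 10), 3: (100, 0), 4: (100, 10)}
community_of = {1: "A", 2: "A", 3: "B", 4: "B"}
intra = adiac(points, community_of)
inter = adbac(points, community_of)
```

Because both metrics are computed on normalized coordinates, stretching the layout area along one axis, as in this example, does not change their values.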

System
Figure 1 shows the general framework of our method. The input is a multivariate network $G_{in} = (V, E, A)$, where $V$ is a node set, $E$ is an edge set, and $A = \{attr_1, \ldots, attr_m\}$ is the set of attributes of every node. The output is the layout of the whole multivariate network. The general framework consists of two major modules: data preprocessing and network layout. After data preprocessing, every node has a label and belongs to an attribute community, and FDAC can then be used to lay out the network. In our method, the data preprocessing module mainly consists of three steps: the user selects the nodes of interest, the user selects the attribute of interest, and the nodes are distributed into attribute communities. On this basis, we developed a visualization system for applying the FDAC method. As shown in Figure 2, our system consists of three parts: (1) the focus selection view, in which the user selects the nodes of interest; (2) the attribute selection view, in which the user selects the attribute of interest and the nodes are distributed into attribute communities; and (3) the layout result view, which displays the multivariate network layout.

Focus Selection View
The main task is to select the nodes of interest from the specified dataset. As shown in Figure 2a, we provide two options.
Automatic screening by the system. As shown in a1 in Figure 2a, this option shows the top 100 results after sorting by a certain metric. The ranking metrics include PageRank [47], the degree of the node, and some attribute values of the node. PageRank is automatically calculated by the system after the dataset is selected. This option is mainly provided for users who are not familiar with the dataset.
Search view. As shown in a2 in Figure 2a, this option shows the search results based on the attributes and attribute values, including the total number of nodes and the corresponding information of all nodes. After the dataset is selected, the system collects the attributes of the nodes and the corresponding attribute values for the user to select in the search box.
Through the above two options, the user can select the nodes of interest for subsequent network layout tasks. The result of the selection is displayed in Current Focus, and the user also needs to select the hop count, which determines the size of the subgraph used for the layout.
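How the hop count determines the subgraph can be illustrated with a breadth-first search from the focus nodes. This is a minimal sketch under the assumption that the subgraph contains every node within the selected number of hops of any focus node, together with the edges among those nodes; the function name `k_hop_subgraph` is hypothetical.

```python
from collections import deque

def k_hop_subgraph(edges, focus_nodes, hops):
    """Nodes within `hops` edges of any focus node, plus the edges among them."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    depth = {node: 0 for node in focus_nodes}
    queue = deque(focus_nodes)
    while queue:
        node = queue.popleft()
        if depth[node] == hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)

    kept = set(depth)
    sub_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, sub_edges

# A path 1-2-3-4-5 with focus node 1 and hop count 2.
path_edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
kept_nodes, kept_edges = k_hop_subgraph(path_edges, [1], hops=2)
```

A larger hop count therefore trades a more complete neighborhood for a larger and more crowded layout.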

Attribute Selection View
As shown in Figure 2b, the user needs to select the attribute used for the layout and the attribute values under that attribute. For numerical and non-numerical attributes, we have designed different attribute community distribution methods.
Numerical attributes. Taking the age attribute of each node in a social network as an example, users are usually not interested in one specific age. Therefore, we provide an interval division method for numerical attributes. The user can select the expected number $m$ of attribute communities. Given that $n$ nodes are waiting to be laid out in the network, the expected number of nodes $n_c$ in each community is $n_c = n/m$. The numerical attribute values of all nodes are sorted, and the sequence is divided with $n_c$ as the length: when the number of nodes corresponding to consecutive attribute values reaches $n_c$, these nodes are classified as one attribute community. The label of the attribute community is the range from the minimum to the maximum value of the numerical attribute in it, and each node in the attribute community is assigned this label.
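The interval division for numerical attributes can be sketched as follows. The sketch assumes that $n_c$ is rounded up to an integer, that nodes sharing the same attribute value always stay in the same community, and that leftover values form a final community; these details, the label format `"min-max"`, and the function name are illustrative choices rather than the system's confirmed behavior.

```python
import math
from collections import Counter

def interval_communities(values, m):
    """Assign each node a range label so communities hold roughly n/m nodes.

    values: dict mapping node -> numerical attribute value.
    m: expected number of attribute communities chosen by the user.
    """
    n_c = math.ceil(len(values) / m)  # ASSUMED rounding of n / m
    counts = sorted(Counter(values.values()).items())

    bins, current, size = [], [], 0
    for value, count in counts:
        current.append(value)
        size += count
        if size >= n_c:               # enough nodes collected: close the interval
            bins.append((current[0], current[-1]))
            current, size = [], 0
    if current:                       # leftover values form the last community
        bins.append((current[0], current[-1]))

    labels = {}
    for node, value in values.items():
        for lo, hi in bins:
            if lo <= value <= hi:
                labels[node] = f"{lo}-{hi}"
                break
    return labels

ages = {"a": 20, "b": 21, "c": 21, "d": 30, "e": 35, "f": 40}
age_labels = interval_communities(ages, m=3)
```

Because equal values are never split, a community can exceed $n_c$ nodes, as the first interval does in this example.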
Non-numerical attributes. Take the keyword attribute of each node in the paper citation network as an example. A paper often contains multiple keywords, and when users explore the keyword attribute, they often choose multiple keywords as the focus of attention. Using a method based on exact matching, the dataset is traversed, and the nodes whose attribute values exactly match the user's selection are labeled. Nodes that match multiple selected values at the same time carry multiple tags and are assigned to a new attribute community.
Node attribute community distribution. Based on the user's choices, we need to assign attribute labels to nodes. If the attribute value of a node matches only one of the choices, the label of this node is that attribute value. If the attribute value of a node matches multiple choices, the label of this node is the combination of the matched attribute values, and the node is assigned to a new attribute community together with all other nodes that have the same label. For example, the keywords attribute may contain multiple values, such as graph layout, visualization, and data mining; for a node that matches the first two, the combined label is graph layout+visualization.
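The label assignment for non-numerical attributes can be sketched as follows. The sketch assumes that each node carries a set of attribute values, that matched values are joined with `+` in the order of the user's selection, and that nodes matching none of the selected values are left unlabeled; the function name `assign_labels` is hypothetical.

```python
def assign_labels(node_keywords, selected):
    """Label each node by the selected attribute values it matches exactly.

    node_keywords: dict mapping node -> set of attribute values (e.g., keywords).
    selected: list of attribute values chosen by the user.
    """
    labels = {}
    for node, keywords in node_keywords.items():
        matched = [value for value in selected if value in keywords]
        if matched:
            # One match gives the plain value; several matches give a
            # combined label that defines a new attribute community.
            labels[node] = "+".join(matched)
    return labels

papers = {
    "p1": {"graph layout", "visualization"},
    "p2": {"visualization"},
    "p3": {"data mining"},
    "p4": {"databases"},
}
selected = ["graph layout", "visualization", "data mining"]
paper_labels = assign_labels(papers, selected)
```

Joining in selection order keeps the combined label deterministic, so two nodes matching the same values always land in the same community.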

Layout Result View
Based on the user's selection result, the data are obtained from the database and preprocessed, and then the force-directed layout algorithm based on attribute constraints is used for layout. As shown in Figure 2c, the layout result is drawn on the view.
When users are faced with a crowded network layout, the complex relationships formed by nodes and edges interfere with their exploration of the network structure. Therefore, the visual system provides functions that assist users in exploration. As shown in Figure 3a, users first see the structure of the entire network when exploring the nodes they are interested in. As shown in Figure 3b, users can focus on a local region by zooming and explore the connection relationships between nodes in that region. As shown in Figure 3c, users can also select and drag nodes to make temporary changes to the layout of the network and better identify the connections between nodes.

Evaluation
We compare our method with three layout methods in terms of the evaluation metrics (ADIAC and ADBAC), layout visualization, and edge length. The complexity of our method is also analyzed, and a user experiment is designed to verify the effectiveness of the method. We conduct all experiments on a laptop computer with an Intel(R) Core(TM) i7-9750H CPU, 16 GB of memory, and Windows 10 installed.

Dataset
This paper uses three graph datasets. The first is the paper citation network in the field of visualization (VisPCNet), and the second is the Researcher Collaboration Network (RCNet), composed of the paper collaborations of researchers in the four fields of visualization (Vis), data mining (DM), human-computer interaction (HCI), and machine learning (ML). Both are based on the open-source AMiner dataset. The last is an office message network (OMNet) dataset based on the VAST 2012 mini-challenge II network [48]. However, the compared methods in this paper, i.e., the traditional force-directed layout methods, perform very poorly on large-scale datasets. Therefore, this paper extracts moderate-scale subgraphs from the relatively large-scale networks: four from the researcher collaboration network, named RCNet1, RCNet2, RCNet3, and RCNet4, and one from the office message network, named OMNet1. The evaluation mainly uses these six datasets, and their specific information is shown in Table 1.
VisPCNet. The network consists of 513 visualization papers (nodes) and the citation relationships (edges) between them. Each paper has five attributes: title, keywords, journal, year, and number of citations.
RCNet1, RCNet2, RCNet3 and RCNet4. These four networks are composed of researchers in the fields of visualization, data mining, human-computer interaction, and machine learning, respectively, as well as their collaborations in the field. Each node represents a researcher. If two researchers have published a paper together, there is an edge between them. In each network, researchers have five attributes: name, interests, number of papers, number of citations, and H index.
OMNet1. The network consists of 52 devices (nodes) and 140 communication links (edges) between nodes. Each node has six attributes: IP, user name, working years of the user, message type, message number, and device type.

Compared Methods
We compare our method with three layout methods: the FR layout [10], the LinLog layout [26], and the ForceAtlas2 layout [27]. FR is the most classic force-directed model, and the various subsequent force-directed methods are improved versions of it. The LinLog model is a force-directed layout method based on an energy function, and its layout can present the structural communities in the graph data. The nodes in a structural community may share the same attribute information; in that case, the LinLog model also presents the attribute community, so we use it as a comparison method as well. ForceAtlas2 is a force-directed method integrated into the well-known graph visualization tool Gephi. It uses various constraints to optimize the graph layout and reveals the structural information of the graph data well.

Quantitative Results
Tables 2 and 3 compare the experimental results under the ADIAC and ADBAC evaluation metrics, respectively. We compare the four methods on four attributes of each of the six datasets above. For VisPCNet, attribute 1 is the keywords, attribute 2 is the journal, attribute 3 is the year, and attribute 4 is the number of citations. For RCNet1, RCNet2, RCNet3, and RCNet4, attribute 1 is the interests, attribute 2 is the number of papers, attribute 3 is the number of citations, and attribute 4 is the H index. For OMNet1, attribute 1 is the working years, attribute 2 is the message type, attribute 3 is the message number, and attribute 4 is the device type.
It can be seen from Table 2 that, under each attribute of each dataset, the ADIAC value of our method is much smaller than that of the other three methods; it is the lowest in every case, at about 0.1. This shows that our method has good layout stability, clearly gathers nodes with the same attribute together, and keeps the nodes distinguishable. It can be seen from Table 3 that, under each attribute of each dataset, the ADBAC value of our method is much larger than that of the other methods. Taken together, the two metrics show that our method obtains clearer and more distinguishable attribute community layouts than the other methods.
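To make the two metrics concrete, the following is a minimal sketch of how ADIAC and ADBAC could be computed from a finished layout. The exact formulas are not restated in this section, so the definitions here are assumptions: ADIAC is taken as the mean pairwise Euclidean distance between nodes of the same attribute community, averaged over communities, and ADBAC as the mean pairwise distance between community centroids. Any normalization of the layout coordinates is omitted, and the function names are hypothetical.

```python
import math
from itertools import combinations


def mean_pairwise_distance(points):
    """Mean Euclidean distance over all unordered pairs (0.0 for fewer than two points)."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


def group_by_community(positions, community):
    """Collect node positions per attribute community."""
    groups = {}
    for node, pos in positions.items():
        groups.setdefault(community[node], []).append(pos)
    return groups


def centroid(points):
    """Arithmetic mean of a list of 2D points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def adiac(positions, community):
    """Assumed definition: intra-community mean pairwise distance, averaged over communities."""
    groups = group_by_community(positions, community)
    return sum(mean_pairwise_distance(pts) for pts in groups.values()) / len(groups)


def adbac(positions, community):
    """Assumed definition: mean pairwise distance between community centroids."""
    groups = group_by_community(positions, community)
    return mean_pairwise_distance([centroid(pts) for pts in groups.values()])
```

Under these assumed definitions, a layout with compact, well-separated attribute communities yields a small `adiac` value and a large `adbac` value, which matches the direction of the comparison reported in Tables 2 and 3.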

Qualitative Results
Visualizing the layout results allows the differences between the methods to be compared qualitatively by eye. We selected four sets of layout results for VisPCNet and one set of layout results for OMNet1 to visualize. In these figures, different colors represent different attribute communities. Figures 4-8 show the layout comparison under each of the selected attributes. The results indicate that our method draws the attribute communities and displays the attribute information well according to the attributes of the nodes.

Edge Length
We use two metrics, uniform edge length and maximum edge length, to measure the edges in the layout result. For uniform edge length, we calculate the standard deviation of the lengths of all edges. The experimental results are shown in Table 4. The smaller the standard deviation and the maximum value, the more balanced the network layout. Although our method does not achieve the best values, the cost is acceptable given the attribute information representation it brings.

The time complexity of our algorithm comprises the attribute community layout, $O(T|V_C|^2)$, and the node layout, $O(T|V|^2)$, where $|V|$ is the number of nodes, $|V_C|$ is the number of communities, and $T$ is the number of iterations. Because the number of communities in the network is much smaller than the number of nodes, the total computational complexity of FDAC is $O(T|V|^2)$. Table 5 shows the running times of our method on the different datasets.
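The two edge-length metrics from this section can be sketched in a few lines. This is an illustrative helper rather than the evaluation code behind Table 4: the function name is hypothetical, and the use of the population standard deviation (rather than the sample one) is an assumption.

```python
import math
import statistics


def edge_length_metrics(positions, edges):
    """Return (standard deviation of edge lengths, maximum edge length).

    positions: dict mapping node -> (x, y) layout coordinates.
    edges: iterable of (u, v) node pairs.
    """
    lengths = [math.dist(positions[u], positions[v]) for u, v in edges]
    # Population standard deviation; an assumption, since the text does not say which is used.
    return statistics.pstdev(lengths), max(lengths)
```

Lower values of both numbers indicate more uniform edges, which is the sense in which a layout is called balanced above.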

User Study
To verify the effectiveness of our method in the representation of attribute information, we designed a user study. We defined three tasks:

• Community task: can users intuitively distinguish the number of attribute communities?
• Node task: for a node that the user is interested in, can users intuitively judge the proportion of its neighbor nodes that have the same attribute versus a different attribute?
• Edge task: for a node that the user is interested in, can users intuitively explore, along the edges, other connected nodes with the same attribute?
We compared our method with FR, LinLog, and ForceAtlas2 (FA2). We invited five graduate students interested in the field of network layout as users. Users scored the results of the four methods on multiple datasets according to the above tasks. The highest score is 5, which means the task is easy to complete, and the lowest score is 1, which means the task is difficult to complete. We averaged all the user scores. The experimental results are shown in Table 6. They show that our method has clear advantages in helping users distinguish the number of attribute communities and judge node neighborhood relations. In terms of edge exploration, the attribute community force causes more interference between edges in larger graphs, resulting in a lower overall score for our method. In conclusion, our method provides effective help for users exploring attribute information in the network.

Discussion
In this paper, we propose a graph layout method that takes the attribute information of nodes into account. Compared with traditional graph layout methods, our method reflects the attribute information of community graph data well. As in the examples above, the clusters in the layout result intuitively reflect the information implicit in each attribute of the graph data. In a layout method that does not consider attributes, some nodes may end up far away from most nodes with the same attribute because of their structural connections. Even if such a layout marks attributes with colors and shapes, the result still looks mixed. In contrast, introducing additional strong constraints based on attribute similarity enables people to better identify the attribute information of nodes.
In our method, the attribute community force is applied to nodes to gather nodes with the same attribute into the same area. For some nodes, the attribute constraints move them away from neighboring nodes with different attributes, which creates additional edge crossings in the layout. This makes it difficult for users to explore larger attribute communities. We address this in two ways. On the one hand, we strengthen users' exploration ability through the interaction design of the system. On the other hand, by defining the community force strength, nodes closely connected with other attributes are more likely to be placed on the periphery of their community, which reduces crossings.
Our method automatically calculates the locations of the attribute communities in space, allowing for a flexible layout. The idea of tree layout could be used in future work to predetermine the location of each community based on the distribution of leaf nodes in the tree. Some nodes in the network may be connected to a large number of edges. For these, we could also borrow PLANET's [37] idea of using polar coordinates to increase the angles between different edges and enhance resolution.
Our method allows users to generate the graph layout by selecting the node attributes they are interested in, and thus to explore the information the graph data carries under that attribute. This interest-driven model makes our method flexible. If the selected attribute is non-numeric, each value of the attribute is regarded as one attribute community. If the selected attribute is numerical, users can discretize the attribute values according to their preferences, and each interval is one attribute community.
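The mapping from attribute values to attribute communities described here can be sketched as follows. The function name and the `boundaries` parameter are hypothetical; the choice of half-open intervals (a value equal to a cut point falls into the upper interval) is an assumption made for the illustration.

```python
from bisect import bisect_right


def assign_communities(values, boundaries=None):
    """Map each node to an attribute community label.

    values: dict mapping node -> attribute value.
    boundaries: sorted list of user-chosen cut points for a numerical attribute,
        or None for a non-numeric attribute.
    """
    if boundaries is None:
        # Non-numeric attribute: every distinct value is its own community.
        return dict(values)
    # Numerical attribute: the community label is the index of the interval
    # the value falls into; a value equal to a cut point goes to the upper interval.
    return {node: bisect_right(boundaries, value) for node, value in values.items()}
```

For example, with the hypothetical cut points `[2000, 2010]` on the attribute 'Year', papers are split into three communities: before 2000, from 2000 up to but not including 2010, and 2010 onward.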
Adjusting the strength of the attribute constraints has a significant impact on the layout results. When the attribute constraints are too tight, the nodes closely surround the community center, resulting in a large number of overlapping nodes within the community, and the connections between nodes of the same community can no longer be distinguished. When the attribute constraints are too loose, the structural information between nodes of the same community is displayed well, but the degree of discrimination between attribute communities decreases.
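The trade-off can be illustrated with a toy model. This is not the force definition used by FDAC: it assumes an FR-style repulsion of magnitude k²/d between two nodes of the same community and a hypothetical linear pull of magnitude `strength` times the distance to the community center. Two nodes sit symmetrically at +x and -x around a center at 0, and the sketch iterates until the forces balance.

```python
import math


def equilibrium_separation(strength, k=1.0, steps=2000, step_size=0.01):
    """Distance between two same-community nodes once repulsion and pull balance.

    Assumed forces: repulsion k^2 / d between the nodes (d = 2x) and a linear
    pull strength * x toward the community center at 0.
    """
    x = 1.0  # the nodes sit at +x and -x
    for _ in range(steps):
        repulsion = k * k / (2.0 * x)  # pushes each node away from the center
        pull = strength * x            # draws each node toward the center
        x += step_size * (repulsion - pull)
    return 2.0 * x
```

In this toy setting the balance point is x = k / sqrt(2 · strength), so the separation shrinks as the constraint strength grows: quadrupling the strength halves the distance between the two nodes. This mirrors the behaviour described above, where tight constraints crowd nodes around the community center and loose ones let them spread out.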

Conclusions
In this paper, we propose a multivariate network layout method using the force-directed method with attribute constraints (FDAC), which generates layout results that convey attribute information. We adopt a hierarchical layout strategy. By introducing the attribute community force and multiple layout constraints into the force-directed layout, nodes with the same or similar attribute values are brought close to each other to form clusters. In this way, the attribute information contained in the network data is intuitively conveyed to the user.
To assess layout quality, this paper proposes two evaluation metrics: the average distance between nodes in attribute communities and the average distance between attribute communities. We compare our method with three representative force-directed layout methods on six datasets to verify its effectiveness.
Our method has several limitations that we plan to address in future work. First, it generates large attribute communities when dealing with large networks, which is not conducive to users' exploration; we consider designing a multi-level attribute community layout to reduce community size. Second, there are a large number of edges between communities with different attributes, and these edges interfere with the exploration of the network; we consider using edge bundling to simplify the structure. Third, the time complexity of the algorithm can be optimized; we consider completing both the attribute community layout and the node layout in one iteration. Finally, our method automatically calculates the locations of the attribute communities in space; we consider allowing users to decide where to place the attribute communities. We also consider using the tree layout structure to predetermine the location of each community and increasing the angles between different edges to enhance resolution.

Algorithm 1: Attribute Constraint Layout
Input: G = (V, E): a multivariate network; T: the target temperature; α: the current temperature
Output: V = {v_1, v_2, ..., v_n}: the positions of nodes
Randomly initialize the attribute community centers V_C = {v^c_1, v^c_2, ..., v^c_n};
for every pair of nodes in G do
    Generate edges E_C = (e^c_1, e^c_2, ..., e^c_m) between communities according to the connection relationship between nodes;
while α > T do
    Update current temperature α;
    for each node v^c_i in V_C do
        Get the velocity changes ∆v_a, ∆v_r, ∆v_c, and ∆v_o;
        Get the total velocity change ∆v;
        Update the position of v^c_i;
while α > T do
    Update current temperature α;
    for each node v_i in V do
        Get the velocity changes ∆v_a, ∆v_r, ∆v_c, ∆v_o, and ∆v_k;
        Get the total velocity change ∆v;
        Update the position of v_i;
return V;
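The two-phase structure of Algorithm 1 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the FDAC implementation. Only three forces are modeled: FR-style repulsion (k²/d), FR-style attraction along edges (d²/k), and a hypothetical linear community pull toward the community center. The remaining velocity terms of Algorithm 1 (∆v_o and ∆v_k) are omitted. The constants, the geometric cooling schedule, and the names `force_pass` and `fdac_sketch` are all hypothetical.

```python
import math
import random


def force_pass(pos, edges, anchors=None, strength=0.0, k=1.0,
               temp=1.0, target=0.01, cooling=0.95):
    """Run one cooling loop; pos maps node -> [x, y] and is updated in place."""
    nodes = list(pos)
    while temp > target:
        temp *= cooling  # update the current temperature
        disp = {v: [0.0, 0.0] for v in nodes}
        # Repulsion between every pair of nodes (FR-style, k^2 / d).
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                disp[u][0] += dx / d * f
                disp[u][1] += dy / d * f
                disp[v][0] -= dx / d * f
                disp[v][1] -= dy / d * f
        # Attraction along edges (FR-style, d^2 / k).
        for u, v in edges:
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            disp[u][0] -= dx / d * f
            disp[u][1] -= dy / d * f
            disp[v][0] += dx / d * f
            disp[v][1] += dy / d * f
        # Community force: linear pull toward the community center (assumed form).
        if anchors is not None:
            for v in nodes:
                disp[v][0] += strength * (anchors[v][0] - pos[v][0])
                disp[v][1] += strength * (anchors[v][1] - pos[v][1])
        # Move each node along its total displacement, capped by the temperature.
        for v in nodes:
            dx, dy = disp[v]
            d = math.hypot(dx, dy) or 1e-9
            step = min(d, temp)
            pos[v][0] += dx / d * step
            pos[v][1] += dy / d * step
    return pos


def fdac_sketch(nodes, edges, community, seed=0):
    """Phase 1 lays out community centers; phase 2 lays out nodes around them."""
    rng = random.Random(seed)
    labels = sorted(set(community.values()), key=str)
    centers = {c: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for c in labels}
    # Edges between communities, derived from node connections.
    comm_edges = {tuple(sorted((community[u], community[v]), key=str))
                  for u, v in edges if community[u] != community[v]}
    force_pass(centers, comm_edges, k=3.0)
    # Start each node near its community center, then pull it toward that center.
    pos = {v: [centers[community[v]][0] + rng.uniform(-0.5, 0.5),
               centers[community[v]][1] + rng.uniform(-0.5, 0.5)] for v in nodes}
    anchors = {v: tuple(centers[community[v]]) for v in nodes}
    force_pass(pos, edges, anchors=anchors, strength=2.0, k=0.3)
    return pos, centers
```

Each cooling loop restarts from its own initial temperature, which is one way to read the two `while α > T` loops in Algorithm 1. Both phases compute all-pairs repulsion, so the cost per iteration is quadratic in the number of items being laid out, consistent with the complexity discussed in the Edge Length section.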

Figure 1. The general framework of our approach.

Figure 2. Visual system for user interaction with FDAC. (a) The focus selection view with (a1) the automatic screening by the system and (a2) the search view. (b) The attribute selection view. (c) The layout result view.

Figure 3. Illustration of user detail exploration. (a) Global view; (b) local view; (c) local view after interaction.

Figure 6. The result of attribute community layout under the attribute 'Year'. (a) FR layout; (b) LinLog layout; (c) ForceAtlas2 layout; (d) layout of our method.

Figure 7. The result of attribute community layout under the attribute 'Number of Citations'. (a) FR layout; (b) LinLog layout; (c) ForceAtlas2 layout; (d) layout of our method.

Figure 8. The result of attribute community layout under the attribute 'working years'. (a) FR layout; (b) LinLog layout; (c) ForceAtlas2 layout; (d) layout of our method.

Table 1. Illustration of the datasets.

Table 4. Metric comparison of edge lengths.

Table 5. Running times of our method.

The data in the network consist of nodes and edges. Hence, the space complexity of our method is $O(|V| + |E| + |V_C| + |E_C|)$, where $|E|$ is the number of edges and $|E_C|$ is the number of edges between communities.

Table 6. Average scores of the user study.