{"componentChunkName":"component---src-templates-post-template-js","path":"/posts/dyadic-relational-graph-convolutional-networks-for-skeleton-based-human-interaction-recognition","result":{"data":{"markdownRemark":{"id":"53a6091c-39cf-564d-b679-d5cdad22ba6d","html":"<ul>\n<li><a href=\"#highlights\">Highlights</a></li>\n<li><a href=\"#abstract\">Abstract</a></li>\n<li><a href=\"#method\">Method</a></li>\n<li><a href=\"#results\">Results</a></li>\n<li><a href=\"#conclusion\">Conclusion</a></li>\n</ul>\n<p><strong>Here, I briefly introduce our work. Some contents are extracted from the accepted version of our paper. For more information please see <a href=\"https://www.sciencedirect.com/science/article/pii/S0031320321001072\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">our paper</a>.</strong> Code is available at <a href=\"https://github.com/GlenGGG/DR-GCN\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">github</a>.</p>\n<center>\n    <img style=\"border-radius: 0.3125em;\n    box-shadow: 0 2px 4px 0 rgba(34,36,38,.12),0 2px 10px 0 rgba(34,36,38,.08);\" \n    src=\"/media/paper-images/DR-GCN/structure.jpg\">\n    <br>\n    <div style=\"color:orange; border-bottom: 1px solid #d9d9d9;\n    display: inline-block;\n    color: #999;\n    padding: 2px;\">\n\tOverall architecture of DR-GCN.\n\t</div>\n</center>\n<h2 id=\"highlights\" style=\"position:relative;\"><a href=\"#highlights\" aria-label=\"highlights permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 
6z\"></path></svg></a>Highlights</h2>\n<ul>\n<li>We are the first to construct dynamic graphs on skeleton sequences that capture discriminative relations between skeletons.</li>\n<li>Relational Adjacency Matrix is proposed to present relational graphs using geometric features and relative attention.</li>\n<li>Proposed Dyadic Relational Graph Convolutional Network achieves state-of-the-art accuracy on three challenging datasets and improvements of 6.63% on NTU-RGB+D and 5.47% on NTU-RGB+D 120 over the baseline model.</li>\n<li>Our methods consistently help advanced models achieve higher accuracy of 1.26% on NTU-RGB+D and 2.86% on NTU-RGB+D 120.</li>\n</ul>\n<h2 id=\"abstract\" style=\"position:relative;\"><a href=\"#abstract\" aria-label=\"abstract permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Abstract</h2>\n<p><em>Skeleton-based human interaction recognition is a challenging task requiring all abilities to recognize spatial, temporal, and interactive features. These abilities rarely co-exist in existing methods. Graph convolutional network (GCN) based methods fail to extract interactive features. Traditional interaction recognition methods cannot effectively capture spatial features from skeletons. Toward this end, we propose a novel Dyadic Relational Graph Convolutional Network (DR-GCN) for interaction recognition. Specifically, we make four contributions: (i) we design a Relational Adjacency Matrix (RAM) that represents dynamic relational graphs. 
These graphs are constructed combining both geometric features and relative attention from the two skeleton sequences; (ii) we propose a Dyadic Relational Graph Convolution Block (DR-GCB) that extracts spatial-temporal interactive features; (iii) we stack the proposed DR-GCBs to build DR-GCN and integrate our methods with an advanced model. (iv) Our models achieve state-of-the-art results on SBU and significant improvements on the mutual action sub-datasets of NTU-RGB+D and NTU-RGB+D 120.</em></p>\n<h2 id=\"method\" style=\"position:relative;\"><a href=\"#method\" aria-label=\"method permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Method</h2>\n<center>\n    <img style=\"border-radius: 0.3125em;\n    box-shadow: 0 2px 4px 0 rgba(34,36,38,.12),0 2px 10px 0 rgba(34,36,38,.08);\" \n    src=\"/media/paper-images/DR-GCN/FIG3c.jpg\">\n    <br>\n    <div style=\"color:orange; border-bottom: 1px solid #d9d9d9;\n    display: inline-block;\n    color: #999;\n    padding: 2px;\">An illustration of the relational graph. Green dots in this image represent\nbody joints. The orange links represent relational links denoting the strong relation between joints of the two actors.</div>\n</center>\n<p>The image above shows one relational graph at a single frame, represented by the proposed Relational Adjacency Matrix. A separate graph is generated for each frame of the skeleton sequence. 
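As a rough illustration of how one such per-frame matrix could be assembled, here is a minimal sketch. All names in it are illustrative assumptions, not the paper's code: a Gaussian of pairwise joint distances stands in for the geometric component, a softmax over feature similarities stands in for the relative attention component, and a fixed alpha stands in for the network-learned mixing parameter.

```python
import numpy as np

def relational_adjacency(joints_a, joints_b, feats_a, feats_b,
                         alpha=0.5, sigma=1.0):
    """Toy sketch of a per-frame Relational Adjacency Matrix (RAM).

    joints_a, joints_b: (V, 3) 3D joint coordinates of the two actors.
    feats_a, feats_b:   (V, C) per-joint embeddings (stand-ins for the
                        encodings spatial-temporal GCN layers would learn).
    alpha:              stand-in for the network-learned mixing parameter.
    """
    # Geometric component: joints of the two actors that are spatially
    # close get a strong link (Gaussian of the pairwise distance).
    dists = np.linalg.norm(joints_a[:, None, :] - joints_b[None, :, :],
                           axis=-1)
    geometric = np.exp(-(dists ** 2) / (2 * sigma ** 2))

    # Relative attention component: cosine similarity between joint
    # embeddings, softmax-normalised over each row.
    sim = feats_a @ feats_b.T
    sim /= (np.linalg.norm(feats_a, axis=1, keepdims=True)
            * np.linalg.norm(feats_b, axis=1, keepdims=True).T + 1e-8)
    attention = np.exp(sim) / np.exp(sim).sum(axis=1, keepdims=True)

    # Mix the two components with the learned weight to form the RAM.
    return alpha * geometric + (1 - alpha) * attention
```

Applying this to every frame of a two-actor sequence would yield one V-by-V relational graph per frame, as in the figure.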
</p>\n<center>\n    <img style=\"border-radius: 0.3125em;\n    box-shadow: 0 2px 4px 0 rgba(34,36,38,.12),0 2px 10px 0 rgba(34,36,38,.08);\" \n    src=\"/media/paper-images/DR-GCN/FIG5.jpg\">\n    <br>\n    <div style=\"color:orange; border-bottom: 1px solid #d9d9d9;\n    display: inline-block;\n    color: #999;\n    padding: 2px;\">\n\tAn illustration of the Relational Adjacency Matrix (RAM) generation procedure.\n\t</div>\n</center>\n<p>The generation and utilization are the key components of our paper. Briefly speaking, we generate the relational links, or the RAM, which represents the relational links by considering two components. They are the geometric component and the relative attention component. The geometric component is straitforward. If two joints each from one actor are close, then we consider them to be correlated. This simple assumption turns out to be very effective. For the relative attention component, we hope it can capture semantic information and connect joints that are semantically similar. We do this by first encode each joint with spatial-temporal graph convolutional layers and then calculate similarity between each joint pairs. Basing on above two component, we combine them using network-learned param and then we have the RAM.</p>\n<center>\n    <img style=\"border-radius: 0.3125em;\n    box-shadow: 0 2px 4px 0 rgba(34,36,38,.12),0 2px 10px 0 rgba(34,36,38,.08);\" \n    src=\"/media/paper-images/DR-GCN/FIG6.jpg\">\n    <br>\n    <div style=\"color:orange; border-bottom: 1px solid #d9d9d9;\n    display: inline-block;\n    color: #999;\n    padding: 2px;\">\n\tAn illustration of Dyadic Relational Graph Convolution Block (DR-GCB). DR-GC refers to dyadic relational graph convolution.\n\t</div>\n</center>\n<p>With the RAM, we propose Dyadic Relational Graph Convolution Block (DR-GCB) that apply dyadic relational graph convolution on the two skeletons to learn relational features. 
DR-GCB is highly extensible and can be plugged into other networks to improve their performance.</p>\n<h2 id=\"results\" style=\"position:relative;\"><a href=\"#results\" aria-label=\"results permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Results</h2>\n<p>We have conducted extensive experiments. The results show that our network and methods achieve significantly higher accuracy than other state-of-the-art methods, and they also demonstrate the extensibility of our methods. For the detailed data, please read <a href=\"https://www.sciencedirect.com/science/article/pii/S0031320321001072\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">our paper</a>.</p>\n<p>Below we show some generated relational graphs.\n<img src=\"/54ad5b0f655fc3f3ca559d7a90c70bdc/demo1.gif\" alt=\"demo-1\">\n<img src=\"/13d03f15fc5253a5b2a85f535ee964c6/demo2.gif\" alt=\"demo-2\">\n<img src=\"/b469fd1d61965213c7295b25085039e2/demo3.gif\" alt=\"demo-3\">\n<img src=\"/4da7a4b137cf02fcc8c7f4b7a8689ce0/demo4.gif\" alt=\"demo-4\"></p>\n<center>\n\t<div style=\"color:orange; border-bottom: 1px solid #d9d9d9;\n\tdisplay: inline-block;\n\tcolor: #999;\n\tpadding: 2px;\">\n\tSome demos of the generated relational graphs.\n\t</div>\n</center>\n<h2 id=\"conclusion\" style=\"position:relative;\"><a href=\"#conclusion\" aria-label=\"conclusion permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 
0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Conclusion</h2>\n<p>This article is only meant as a brief introduction; if interested, please read <a href=\"https://www.sciencedirect.com/science/article/pii/S0031320321001072\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">our paper</a>.</p>\n<p>Our paper presents a novel Dyadic Relational Graph Convolutional Network (DR-GCN) for skeleton-based interaction recognition. We devise a Relational Adjacency Matrix (RAM) that denotes relational graphs. It combines both the geometric features and the relative attention of the two skeletons in interaction. A Dyadic Relational Graph Convolution Block (DR-GCB) is further proposed to extract spatial-temporal interactive features with the RAM. We stack multiple layers of DR-GCBs to build the backbone of our network. We further propose the Two-Stream Dyadic Relational AGCN (2S-DRAGCN), which demonstrates our methods’ compatibility with ST-GCN based models. Our proposed models show superior abilities in interaction recognition. They achieve the highest accuracy on the mutual action sub-datasets of NTU-RGB+D and NTU-RGB+D 120 and on the interaction dataset SBU.</p>\n<p><em>© 2021. 
Content from the accepted version is made available under the CC-BY-NC-ND 4.0 license <a href=\"http://creativecommons.org/licenses/by-nc-nd/4.0/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">http://creativecommons.org/licenses/by-nc-nd/4.0/</a>.</em></p>","fields":{"slug":"/posts/dyadic-relational-graph-convolutional-networks-for-skeleton-based-human-interaction-recognition","tagSlugs":["/tag/computer-vision/","/tag/action-recognition/","/tag/skeleton-based-action-recognition/","/tag/graph-convolutional-networks/","/tag/deep-learning/","/tag/research/"]},"frontmatter":{"date":"2021-02-19T14:13:40.121Z","description":"<p>We apply Graph Convolutional Networks to skeleton-based human-human interaction recognition. We design a Relational Adjacency Matrix (RAM) to represent dynamic relational graphs over the two actors' skeletons.</p> <p style=\"font-style: italic;\">Liping Zhu*, <span style=\"font-weight: bold\">Bohua Wan*</span>, Chengyang Li, Gangyi Tian, Yi Hou, Kun Yuan</p> <p style=\"font-style: italic;\">Pattern Recognition 115 (2021): 107920.</p>","tags":["Computer Vision","Action Recognition","Skeleton-based Action Recognition","Graph Convolutional Networks","Deep Learning","Research"],"title":"Dyadic Relational Graph Convolutional Networks for Skeleton-based Human Interaction Recognition","socialImage":{"publicURL":"/static/99d2a147d44057e4dd6664a84a9d8f20/structure.jpg"}}}},"pageContext":{"slug":"/posts/dyadic-relational-graph-convolutional-networks-for-skeleton-based-human-interaction-recognition"}},"staticQueryHashes":["251939775","401334301","41472230"]}