
Vectorizing Graph Neural Networks (2020)

74 points by brilee, almost 2 years ago

1 comment

VHRanger, almost 2 years ago

Yes, people working on graph-based ML quickly realize that the underlying data structures most originally-academic libraries (NetworkX, PyG, etc.) use are bad.

I wrote about this before [1] and based a node embedding library around the concept [2].

NetworkX-style graphs are laid out as a bunch of items in a heap with pointers to each other. That works at extreme scales, because everything is spread across a cluster's RAM and you don't mind paying the latency costs of fetch operations. But it makes little sense for graphs with < 5B nodes, to be honest.

Here's the dream (it remains to be implemented by someone):

Laying out the graph as a CSR sparse matrix makes way more sense because of data locality. You have an array of edges per node, an index pointer array, then one matrix with a row per edge for edge data, and one matrix with a row per node for node data. Ideally you code the entire thing on Apache Arrow memory to ease access from other libraries and languages.

At larger scales, you could just leave the CSR array data on NVMe drives, and you'd still operate at around 500 MB/s random query throughput with hand-coded access, ~150 MB/s with mmap.

[1] https://www.singlelunch.com/2019/08/01/700x-faster-node2vec-models-fastest-random-walks-on-a-graph/

[2] https://github.com/VHRanger/nodevectors
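To make the CSR layout described above concrete, here is a minimal NumPy sketch; the array names, toy edge list, and feature widths are illustrative only and are not taken from nodevectors or the linked post:

    # Minimal CSR graph layout sketch (illustrative; names are made up).
    # Assumes nodes are labeled 0..n_nodes-1.
    import numpy as np

    # Toy directed edge list: (src, dst)
    edges = np.array([(0, 1), (0, 2), (1, 2), (2, 0), (2, 3)], dtype=np.int64)
    n_nodes = 4

    # Sort edges by source so each node's neighbors sit contiguously in memory.
    order = np.argsort(edges[:, 0], kind="stable")
    dst = edges[order, 1]                      # "array of edges per node"

    # Index pointer array: indptr[v]..indptr[v+1] spans node v's out-edges.
    indptr = np.zeros(n_nodes + 1, dtype=np.int64)
    np.add.at(indptr, edges[:, 0] + 1, 1)
    indptr = np.cumsum(indptr)

    # Side tables: one row per edge and one row per node (features made up).
    edge_data = np.random.rand(len(dst), 8)[order]
    node_data = np.random.rand(n_nodes, 16)

    def neighbors(v):
        # Contiguous slice -> good data locality, no pointer chasing.
        return dst[indptr[v]:indptr[v + 1]]

    print(neighbors(2))   # -> [0 3]

Neighbor lookups become a single contiguous slice of one flat array, which is what makes the memory-mapped / on-NVMe variant mentioned above plausible.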