BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210808T235336Z
LOCATION:Room A
DTSTART;TZID=America/Chicago:20210809T093000
DTEND;TZID=America/Chicago:20210809T094500
UID:icpp_ICPP 2021_sess134_duac_pap105@linklings.com
SUMMARY:TurboBC: A Memory Efficient and Scalable GPU Based Betweenness Cen
trality(BC) Algorithm in the Language of Linear Algebra
DESCRIPTION:DUAC Workshop\n\nTurboBC: A Memory Efficient and Scalable GPU
Based Betweenness Centrality(BC) Algorithm in the Language of Linear Algeb
ra\n\nArtiles, Saeed\n\nBetweenness centrality (BC) is a shortest path cen
trality metric used to measure the influence of individual vertices or edg
es on huge graphs that are used for modeling and analysis of human brain,
omics data, or social networks. The application of the BC algorithm to mod
ern graphs must deal with the size of the graphs, as well with highly irre
gular data-access patterns. These challenges are particularly important wh
en the BC algorithm is implemented on Graphics Processing Units (GPU), due
to the limited global memory of these processors, as well as the decrease
in performance due to the load unbalance resulting from processing irregu
lar data structures. In this paper, we present the first GPU based linear-
algebraic formulation and implementation of BC, called TurboBC, a set of m
emory efficient BC algorithms that exhibits good performance and high scal
ability on unweighted, undirected or directed sparse graphs of arbitrary s
tructure. Our experiments demonstrate that our TurboBC algorithms obtain m
ore than 18 GTEPs and an average speedup of 31.9x over the sequential vers
ion of the BC algorithm, and are on average 1.7x and 2.2x faster than the
state-of-the-art algorithms implemented on the high performance, GPU-based
, gunrock, and CPU-based, ligra libraries, respectively. These experiments
also show that by minimizing their memory footprint, the TurboBC algorith
ms are able to compute the BC of relatively big graphs, for which the gunr
ock algorithms ran out of memory.
END:VEVENT
END:VCALENDAR