
Mapping Motor Vehicle Collisions in New York City

72 points by lil_tee over 6 years ago

12 comments

takk309 over 6 years ago
Nice work. I am glad that there is a paragraph talking about exposure. Crash trends based strictly on the total number of crashes are easy to predict from where there is more traffic. Using crashes per vehicle mile traveled for road segments, or crashes per entering vehicle for intersections, can help tease out trends. Controlling for severity is also important.

When I do a crash analysis for a city, one of the tasks I do regularly for my job, I generate a crash rate and severity index for each intersection. The severity index is basically a weighted average based on severity: non-injury = 1, minor injury = 3, and severe injury or fatality = 8. The crash rate and severity index are divided to create a Severity Rate. While not perfect or statistically valid, it does help identify trends. Also, I am in a rural state, so it is rare that there are enough crashes to draw any statistically valid conclusions.
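As a rough illustration of the two quantities described in the comment above, here is a minimal Python sketch. The severity weights (1, 3, 8) come from the comment; the function names, the severity labels, and the choice of "crashes per million entering vehicles" as the crash rate unit are assumptions made for the example. The comment does not say which quantity is the numerator when the two are combined into a Severity Rate, so that step is left out.

```python
from collections import Counter

# Weights as stated in the comment; the label names are hypothetical.
SEVERITY_WEIGHTS = {"non_injury": 1, "minor_injury": 3, "severe_or_fatal": 8}


def severity_index(crash_severities):
    """Weighted average severity over a list of crash severity labels."""
    if not crash_severities:
        raise ValueError("need at least one crash")
    counts = Counter(crash_severities)
    weighted = sum(SEVERITY_WEIGHTS[label] * n for label, n in counts.items())
    return weighted / len(crash_severities)


def crash_rate_per_mev(crash_count, entering_vehicles_per_day, years):
    """Crashes per million entering vehicles (assumed unit, not from the comment)."""
    exposure = entering_vehicles_per_day * 365 * years
    return crash_count * 1_000_000 / exposure


# Made-up intersection: 10 crashes over 5 years, 10,000 entering vehicles per day.
crashes = ["non_injury"] * 6 + ["minor_injury"] * 3 + ["severe_or_fatal"]
si = severity_index(crashes)
rate = crash_rate_per_mev(len(crashes), entering_vehicles_per_day=10_000, years=5)
print(f"severity index = {si:.2f}, crash rate = {rate:.3f} per million entering vehicles")
```

Normalizing by entering vehicles is what keeps a busy intersection from ranking high on traffic volume alone.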
clhenrick over 6 years ago
I've worked extensively with this dataset on a similar project, http://crashmapper.org, and through that process found that the data is extremely error prone. Perhaps 20% of the collisions recorded are not geocoded (i.e., they lack latitude and longitude coordinates) and don't contain other location information such as street, cross street, and zip code that could be used to geocode them. It appears that some precincts of the NYPD do a better job of recording a crash location than others. Even more of the data lacks values for "contributing factors", so that field seems difficult to use as a metric for analysis. Often there is a mismatch between the total number of persons injured or killed and the number of pedestrians, cyclists, or motorists injured or killed. Furthermore, whoever maintains this dataset will periodically go back and update it, seemingly at random, editing existing data or adding new data for dates potentially months or years in the past. Often the maintainer appears to be changing values for fields such as the number of pedestrians, cyclists, or motorists injured or killed. Presumably this is because more information surfaced about an incident at a later point in time and the city must go back and update it. However, this can result in stats from the data not aligning with the NYPD's or DOT's official stats from a previous year. I would advise anyone to keep these facts in mind if trying to use the data for analysis and policy recommendations; such is open data.
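Two of the problems described in the comment above (records that cannot be located, and injury totals that do not match their parts) can be checked mechanically before any analysis. The sketch below assumes each record is a plain dictionary; all field names are hypothetical and will not match the real dataset's column names exactly.

```python
# Hypothetical field names; adapt them to the actual dataset columns.
ADDRESS_FIELDS = ("on_street", "cross_street", "zip_code")
INJURY_PARTS = ("pedestrians_injured", "cyclists_injured", "motorists_injured")


def audit(records):
    """Count records with no usable location and records whose injury totals disagree."""
    unlocatable = 0
    injury_mismatch = 0
    for r in records:
        has_coords = r.get("latitude") is not None and r.get("longitude") is not None
        has_address = any(r.get(field) for field in ADDRESS_FIELDS)
        if not has_coords and not has_address:
            unlocatable += 1
        if sum(r[field] for field in INJURY_PARTS) != r["persons_injured"]:
            injury_mismatch += 1
    return {
        "total": len(records),
        "unlocatable": unlocatable,
        "injury_mismatch": injury_mismatch,
    }


# Three made-up records: one clean, one unlocatable, one with mismatched totals.
records = [
    {"latitude": 40.76, "longitude": -73.99, "on_street": "9 AVENUE",
     "persons_injured": 2, "pedestrians_injured": 1,
     "cyclists_injured": 0, "motorists_injured": 1},
    {"latitude": None, "longitude": None,
     "persons_injured": 0, "pedestrians_injured": 0,
     "cyclists_injured": 0, "motorists_injured": 0},
    {"latitude": 40.75, "longitude": -73.98,
     "persons_injured": 3, "pedestrians_injured": 1,
     "cyclists_injured": 0, "motorists_injured": 1},
]
report = audit(records)
print(report)
```

Because the data can be revised retroactively, it may also be worth recording the download date alongside any such report, so that numbers from different snapshots are not compared by accident.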
xyzwave over 6 years ago
Having done something similar for the Long Beach, CA area in college, one of the most interesting takeaways for me was the difference in spatial distribution between fatal and non-fatal accidents.

Non-fatal accidents clearly clustered around high-traffic areas, but fatal accidents didn't show the same clustering. Instead, they appeared to be uniformly distributed across the city.

I'm sure there is an explanation for this, and this was only 10 years of data for a single city, but it always felt a little spooky that these accidents were equally likely to happen anywhere (though most likely later in the night).
jermaustin1 over 6 years ago
I'm not sure what constitutes a "collision", but in 2015 I lived on Lexington between 121 and 122 and saw the investigation of a hit and run of a homeless man. I talked to a couple of the witnesses who saw it happen.

This incident was at Lexington and 123rd. In the data, I do not see this incident.
karussell over 6 years ago
The question is whether the highlighted areas are really more dangerous or whether there are just more visitors. Shouldn't one take the traffic counts into account?

BTW: there is similar (open) data for Germany: https://unfallatlas.statistikportal.de/ (it clearly shows the problem I mentioned).

Update: sorry, it seems that this issue is already discussed in this thread.
jdlyga over 6 years ago
Lots of crashes in Hell's Kitchen. That area is full of people going out to bars and restaurants, tiny sidewalks, and lots of impatient drivers trying to get through Manhattan to New Jersey.
bonyt over 6 years ago
The map of total deaths includes a significant blip on the west side near Pier 40 and the Holland Tunnel, which I think is from the 2017 truck attack.

https://en.wikipedia.org/wiki/2017_New_York_City_truck_attack

Map: https://imgur.com/a/jNbOv7W
ryeguy_24 over 6 years ago
I would bet that the shadow/light patterns on Roosevelt Avenue & 94th Street, Queens cause significant visual distractions to drivers and pedestrians.
dsfyu404ed over 6 years ago
Drivers mostly hit other things when there are too many things demanding their attention (poor visibility + difficult left turn + busy traffic + bikes + pedestrians = high risk of accidents), so this is probably just a heat map of the intersections that are the busiest (in terms of things going on, not necessarily throughput).

I'd like to see a month-by-month heat map.
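A month-by-month heat map like the one requested above starts with counting collisions per month and per map cell. The following sketch is one simple way to do that binning and is not the article author's method; the input format (date, latitude, longitude tuples), the function name, and the 0.005-degree cell size are all assumptions.

```python
import math
from collections import defaultdict
from datetime import date


def monthly_grid_counts(collisions, cell_deg=0.005):
    """Count collisions per (year, month) and per lat/lon grid cell.

    `collisions` is an iterable of (date, latitude, longitude) tuples.
    `cell_deg` is the grid cell size in degrees (an arbitrary choice here).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for day, lat, lon in collisions:
        month = (day.year, day.month)
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[month][cell] += 1
    return counts


# Made-up points: two January collisions in the same cell, one in February.
sample = [
    (date(2018, 1, 5), 40.7601, -73.9902),
    (date(2018, 1, 20), 40.7603, -73.9904),
    (date(2018, 2, 11), 40.7601, -73.9902),
]
counts = monthly_grid_counts(sample)
for month in sorted(counts):
    print(month, dict(counts[month]))
```

Each month's cell counts can then be drawn as a separate heat map layer or animation frame.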
brianbreslin over 6 years ago
I would love to pay the author to do this for my city. I'm fairly certain I could get the local govt to pay up for this.
slowhand09 over 6 years ago
Very impressive!
skizm over 6 years ago
https://xkcd.com/1138/