
YOLOv5: State-of-the-art object detection at 140 FPS

391 points by rocauc almost 5 years ago

24 comments

bArray almost 5 years ago
I'm just going to call this out as bullshit. This isn't YOLOv5. I doubt they even did a proper comparison between their model and YOLOv4.

Someone asked for it not to be called YOLOv5 and their response was just awful [1]. They also blew off a request to publish a blog/paper detailing the network [2].

I filed a ticket to get to the bottom of this with the creators of YOLOv4: https://github.com/AlexeyAB/darknet/issues/5920

[1] https://github.com/ultralytics/yolov5/issues/2

[2] https://github.com/ultralytics/yolov5/issues/4
nharada almost 5 years ago
I welcome forward progress in the field, but something about this doesn't sit right with me. The authors have an unpublished/unreviewed set of results and they're already co-opting the YOLO name (without the original author) for it, and all of this to promote a company? I guess this was inevitable when there's so much money in ML, but it definitely feels against the spirit of the academic research community that they're building upon.
sillysaurusx almost 5 years ago
We made a site that lets you collaboratively tag a bunch of images, called tagpls.com. For example, users decided to re-tag ImageNet for fun: https://twitter.com/theshawwn/status/1262535747975868418

And the tags ended up being hilarious: https://pbs.twimg.com/media/EYXRzDAUwAMjXIG?format=jpg&name=large

(I'm particularly fond of https://i.imgur.com/ZMz2yUc.png)

The data is freely available via API: https://www.tagpls.com/tags/imagenet2012validation.json

It exports the data in YOLO format (e.g. it has coordinates in YOLO's [0..1] range), so it's straightforward to spit it out to disk and start a YOLO training run on it.

Gwern recently used tagpls to train an anime hand detector model: https://www.reddit.com/r/AnimeResearch/comments/gmcdkw/help_build_an_anime_hand_detector_by_tagging/

People seem willing to tag things for free, mostly for the novelty of it.

The NSFW tags ended up being shockingly high quality, especially in certain niches: https://twitter.com/theshawwn/status/1270624312769130498

I don't think we could've paid human labelers to create tags that thorough or accurate.

All the tags for all experiments can be grabbed via https://www.tagpls.com/tags.json, so over time we hope the site will become more and more valuable to the ML community.

tagpls went from 50 users to 2,096 in the past three weeks. The database size also went from 200KB a few weeks ago to 1MB a week ago and 2MB today. I don't know why it's becoming popular, but it seems to be.
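A minimal sketch of what "spit it out to disk" could look like, assuming the export maps image names to tag records with a label and normalized box fields. The "label"/"x"/"y"/"w"/"h" keys here are assumptions about the schema, not taken from tagpls docs; inspect the real JSON and adjust the keys.

    # Hypothetical converter from a tagpls-style JSON export to per-image YOLO .txt
    # label files. Field names are assumed, not confirmed against the actual export.
    import json
    from pathlib import Path

    with open("imagenet2012validation.json") as f:
        tags = json.load(f)  # assumed shape: {image_name: [{"label", "x", "y", "w", "h"}, ...]}

    # Build a stable class index from whatever labels appear in the export.
    labels = sorted({t["label"] for boxes in tags.values() for t in boxes})
    class_id = {name: i for i, name in enumerate(labels)}

    out_dir = Path("labels")
    out_dir.mkdir(exist_ok=True)
    for image_name, boxes in tags.items():
        # YOLO format: one line per box -- "class x_center y_center width height", all in [0, 1].
        lines = [f'{class_id[t["label"]]} {t["x"]} {t["y"]} {t["w"]} {t["h"]}' for t in boxes]
        (out_dir / f"{Path(image_name).stem}.txt").write_text("\n".join(lines))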
jcims almost 5 years ago
Has anyone (beyond maybe self-driving software) tried using object tagging as a way to start introducing physics into a scene? E.g. human and bicycle have the same motion vector, which increases the likelihood that the human is riding the bicycle. Bicycle and human have size and weight ranges that could be used to plot trajectory. Bicycles riding in a straight line and trees both provide some cues as to the gravity vector in the scene. Etc. etc.

Seems like the camera motion is probably already solved with optical flow/photogrammetry stuff, but you might be able to use that to help scale the scene and start filtering your tagging based on geometric likelihood.

The idea of hierarchical reference frames (outlined a bit by Jeff Hawkins here: https://www.youtube.com/watch?v=-EVqrDlAqYo&t=3025) seems pretty compelling to me for contextualizing scenes to gain comprehension. Particularly if you build a graph from those reference frames and situate models tuned to the type of object at the root of each frame (vertex). You could use that to help each model learn, too. So if a bike model projects a 'riding' edge towards the 'person' model, there wouldn't likely be much learning, e.g. [Person]-(rides)->[Bike] would have likely been encountered already.

However, if the [Bike] projects the (rides) edge towards the [Capuchin] sitting in the seat, the [Capuchin] model might learn that capuchins can (ride) and, furthermore, that they can (ride) a [Bike].
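A toy sketch of the first idea above, with made-up motion vectors and an ad hoc scoring rule (nothing from an actual system): two detections whose motion vectors agree get a boosted "rides" likelihood.

    # Toy illustration: boost the prior that a detected person is riding a detected
    # bicycle when their motion vectors agree. Numbers and weights are arbitrary.
    import math

    def cosine_similarity(v1, v2):
        dot = sum(a * b for a, b in zip(v1, v2))
        norm = math.hypot(*v1) * math.hypot(*v2)
        return dot / norm if norm else 0.0

    def rides_likelihood(person_motion, bike_motion, base_prior=0.1):
        # Similar motion raises the likelihood; opposed motion lowers it toward zero.
        agreement = cosine_similarity(person_motion, bike_motion)
        return min(1.0, max(0.0, base_prior + 0.5 * agreement))

    print(rides_likelihood((1.0, 0.2), (1.1, 0.15)))   # ~0.6, plausibly riding
    print(rides_likelihood((1.0, 0.2), (-0.9, 0.1)))   # 0.0, probably not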
ely-s almost 5 years ago
There seems to be an unfair comparison between the various network architectures. The reported speed and accuracy improvements should be taken with a bit of scepticism for two reasons.

* This is the first YOLO implemented in PyTorch. PyTorch is the fastest ML framework around, so some of YOLOv5's speed improvements may be attributed to the platform it was implemented on rather than actual scientific advances. Previous YOLOs were implemented using Darknet, and EfficientDet is implemented in TensorFlow. It would be necessary to train them all on the same platform for a fair speed comparison.

* EfficientDet was trained on the 90-class COCO challenge [2], while YOLOv5 was trained on 80 classes [1].

[1] https://github.com/ultralytics/yolov5/blob/master/data/coco.yaml

[2] https://github.com/google/automl/blob/master/efficientdet/inference.py#L42
rocauc almost 5 years ago
EfficientDet was open sourced March 18 [1], YOLOv4 came out April 23 [2], and now YOLOv5 is out only 48 days later.

In our initial look, YOLOv5 is 180% faster, 88% smaller, similarly accurate, and easier to use (native to PyTorch rather than Darknet) than YOLOv4.

[1] https://venturebeat.com/2020/03/18/google-ai-open-sources-efficientdet-for-state-of-the-art-object-detection/

[2] https://arxiv.org/abs/2004.10934
ma2rten almost 5 years ago
*In February 2020, PJ Reddie noted he would discontinue research in computer vision.*

He actually stopped working on it because of ethical concerns. I'm inspired that he made this principled choice despite being quite successful in this field.

https://syncedreview.com/2020/02/24/yolo-creator-says-he-stopped-cv-research-due-to-ethical-concerns/
gok almost 5 years ago
Er, so this "Ultralytics" consulting firm just borrowed the name YOLO for this model and didn't actually publish their results yet?
david_draco almost 5 years ago
> In February 2020, PJ Reddie noted he would discontinue research in computer vision.

It would be fair to also state why he chose to discontinue developing YOLO, as it is relevant.
rememberlenny almost 5 years ago
Two interesting links from the article:

1. How to train YOLOv5: https://blog.roboflow.ai/how-to-train-yolov5-on-a-custom-dataset/

2. Comparing various YOLO versions: https://yolov5.com/
boscon almost 5 years ago
Latency is measured for batch=32 and then divided by 32? That means a single batch actually takes about 500 milliseconds to process. I have never seen a more fake comparison.
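For concreteness, the arithmetic behind this complaint looks roughly like the following; the ~15.6 ms per-image figure is inferred from the commenter's 500 ms, not quoted from the benchmark table.

    # Reported "per-image" latency hides the fact that you wait for the whole batch.
    batch_size = 32
    per_image_ms = 500 / batch_size                   # ~15.6 ms "per image" (inferred)
    wall_clock_batch_ms = per_image_ms * batch_size   # ~500 ms before any result comes back
    print(per_image_ms, wall_clock_batch_ms)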
bcatanzaro almost 5 years ago
Why benchmark using 32-bit FP on a V100? That means it’s not using tensor cores, which is a shame since they were built for this purpose. There’s no reason not to benchmark using FP16 here.
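A minimal sketch of the FP32 vs FP16 timing comparison being asked for. A torchvision ResNet-50 stands in for the detector so the snippet runs anywhere, and the batch and image sizes are arbitrary; swap in the model you actually care about.

    # Rough FP32 vs FP16 timing sketch on a CUDA GPU (not the repo's benchmark script).
    import time
    import torch
    import torchvision

    model = torchvision.models.resnet50(weights=None).cuda().eval()
    x = torch.randn(32, 3, 224, 224, device="cuda")

    def bench(m, inp, iters=10):
        with torch.no_grad():
            for _ in range(3):                 # warm-up iterations
                m(inp)
            torch.cuda.synchronize()
            start = time.time()
            for _ in range(iters):
                m(inp)
            torch.cuda.synchronize()
        return (time.time() - start) / iters

    fp32 = bench(model.float(), x.float())
    fp16 = bench(model.half(), x.half())       # FP16 lets the V100's tensor cores kick in
    print(f"FP32: {fp32 * 1000:.1f} ms/batch, FP16: {fp16 * 1000:.1f} ms/batch")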
hnarayanan almost 5 years ago
What does it take to now use this name?
darknet-rider almost 5 years ago
I really like the work done by AlexeyAB on Darknet YOLOv4 and by the original author Joseph Redmon on YOLOv3. These guys deserve a lot more respect than the people behind any other version of YOLO.
heisenburgzero almost 5 years ago
This is not the first time something has been fishy. Back in the early stages of the repo, they were advertising on the front page that they were achieving similar mAP to the original C++ version, only for it to turn out that they hadn't trained and tested it on the COCO dataset.
0xcoffee almost 5 years ago
Is it possible to run these models in the browser, something similar to tensorflow.js?
ebg13 almost 5 years ago
It looks like this is YOLOv4 implemented on PyTorch, not actually a new YOLO?
franciscop almost 5 years ago
I am very interested in loading YOLO onto a Raspberry Pi + Coral.ai; does anyone know a good tutorial on how to get started? I tried before, and with Darknet it was not easy at all, but now with PyTorch there seem to be ways of loading that onto Coral. I am familiar with Raspberry Pi dev, but not much with ML or TPUs, so I think it'd be mostly a tutorial on bridging the different technologies.

(might need to wait a couple of months since this was just released)
kuzee almost 5 years ago
Just read this. Nice overview of the history of the "YOLO" family, and summary of what YOLOv5 is/does.
DEDLINE almost 5 years ago
Does anyone know of an open-source equivalent to YOLOv5 in the sound recognition / classification domain? Paid?
qchris almost 5 years ago
If anyone's interested in the direct GitHub link to the repository: https://github.com/ultralytics/yolov5
heavyset_go almost 5 years ago
I like to think that the name is also a reference to the fact that this will inevitably be used in some autonomous driving systems.
tapatio almost 5 years ago
Fewer weights, more accuracy. Magic :)
osipov almost 5 years ago
Just recently IBM announced with a loud PR move that the company is getting out of the face recognition business. Guess what? Wall Street doesn&#x27;t want to keep subsidizing IBM&#x27;s subpar face recognition technology when open source and Google solutions are pushing the state of the art.