TechEcho
Topological methods for unsupervised learning problems [video]

57 points by lmcinnes about 6 years ago

5 comments

nabla9 about 6 years ago
Uniform Manifold Approximation and Projection

https://arxiv.org/abs/1802.03426

https://github.com/lmcinnes/umap
ham_sandwich about 6 years ago
I've been interested in learning about topological data analysis. I haven't dug in too deep yet, but it definitely looks like an interesting direction to zig in while the field at large zags with ever larger deep learning architectures.

UMAP has already demonstrated its efficacy as a tool in any data scientist's belt. The work of Ayasdi and Gunnar Carlsson is certainly interesting, but I'm unsure how much business value it can actually unlock. There also seems to be an opportunity to draw inspiration from the applied category theory crew (Spivak, Fong, etc.) and use some CT tools to approach data science from a fresh perspective.

Some of the research coming out is interesting, but as a practitioner I'm more interested in seeing how TDA can add differentiated value in a business context. Interested to hear where people see the field moving next.
notthingnill about 6 years ago
Five minutes of reading: https://johnhw.github.io/umap_primes/index.md.html

Without using any category theory, topology, or sheaf theory, this is what I believe is in the paper:

(1) The prior hypothesis is that the data points in R^n are a sample from a uniform distribution on a Riemannian manifold.

(2) Try to define a Riemannian metric such that the number of sample points in any ball B is proportional to the volume of B.

(3) Since (2) doesn't define a global Riemannian metric, they define a fuzzy membership relation. I suppose the role of the fuzzy tool is that local distance information is weighted according to the variance of the local distance estimates.

Disclaimer: I could be completely wrong.
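
Steps (2) and (3) can be made a little more concrete with a toy sketch. This is not the umap library's code. It assumes, as I understand the paper, that each point rescales distances by its own nearest-neighbour distance (rho) and a per-point bandwidth (sigma) chosen so the membership strengths sum to log2(k), and that the directed strengths are merged with a fuzzy union a + b - a*b. The function names and the bisection bounds are made up for illustration.

```python
import math

def knn(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return dists[:k]

def local_memberships(points, i, k):
    """Directed fuzzy membership strengths from point i to its k neighbours."""
    neighbours = knn(points, i, k)
    rho = neighbours[0][0]      # distance to the nearest neighbour
    target = math.log2(k)       # assumed normalisation target
    lo, hi = 1e-6, 1e6          # illustrative search bounds for sigma
    for _ in range(100):        # bisection: the sum grows with sigma
        sigma = (lo + hi) / 2
        total = sum(math.exp(-max(0.0, d - rho) / sigma) for d, _ in neighbours)
        if total > target:
            hi = sigma
        else:
            lo = sigma
    return {j: math.exp(-max(0.0, d - rho) / sigma) for d, j in neighbours}

def fuzzy_union(points, k):
    """Symmetrise the directed strengths into undirected edge weights."""
    directed = [local_memberships(points, i, k) for i in range(len(points))]
    edges = {}
    for i, row in enumerate(directed):
        for j, a in row.items():
            b = directed[j].get(i, 0.0)
            edges[tuple(sorted((i, j)))] = a + b - a * b
    return edges

# Unevenly sampled points on a line: dense on the left, sparse on the right.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (20.0,), (30.0,)]
edges = fuzzy_union(points, k=3)
```

Because every point measures distance in its own local units, the nearest neighbour always gets strength 1 whether it is 1 or 10 units away, which is the "uniform sample" assumption in (1) doing its work.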
starchild_3001 about 6 years ago
I do a lot of clustering professionally, yet TDA feels very academic. Does finding locally connected components have practical value with "noisy" data sets? If what you're after is locally connected components, why can't you use density clustering? Also, my general feeling: if you have such weird shapes in R^n, maybe you should try to develop a better distance metric (vs. finding connected components)? Just saying.
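
To illustrate the comparison being asked about, here is a minimal sketch (not DBSCAN itself, and not anything from the talk) that finds connected components of an eps-neighbourhood graph restricted to "core" points, meaning points with at least min_pts neighbours within eps, counting themselves. With min_pts=1 it reduces to plain connected components. Border points are simply labelled -1 rather than attached to a cluster, which is a simplification. The function name and the toy data are made up.

```python
import math

def density_components(points, eps, min_pts):
    """Label core points by connected component; non-core points get -1."""
    n = len(points)
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = [len(nb) >= min_pts for nb in neighbours]
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of core points that lie within eps of each other.
    for i in range(n):
        if core[i]:
            for j in neighbours[i]:
                if core[j]:
                    parent[find(i)] = find(j)

    labels, ids = [], {}
    for i in range(n):
        if not core[i]:
            labels.append(-1)
        else:
            labels.append(ids.setdefault(find(i), len(ids)))
    return labels

# Two dense groups joined by a thin chain of "noise" points.
xs = [0, 1, 2, 3, 4, 9, 14, 19, 24, 25, 26, 27, 28]
points = [(float(x),) for x in xs]

plain = density_components(points, eps=5, min_pts=1)   # everything in one component
dense = density_components(points, eps=5, min_pts=4)   # two clusters, chain marked -1
```

On this toy data the plain components chain the two groups together through the noise, while the density threshold separates them, so the two approaches differ mainly in how they treat sparse bridges.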
mikhailfranco about 6 years ago
Great introductory talk. I have a physics and 3D graphics background, with just enough topology and sketchy CT to understand all the words, so now I have some idea about TDA. Thanks.