PyWhat: Identify Anything

290 points by trueduke almost 4 years ago

14 comments

acidbaseextract almost 4 years ago
Some more great probabilistic python libraries:

https://github.com/datamade/usaddress - "usaddress is a Python library for parsing unstructured address strings into address components, using advanced NLP methods."

https://github.com/datamade/probablepeople - "probablepeople is a python library for parsing unstructured romanized name or company strings into components, using advanced NLP methods."
cosmic_quanta almost 4 years ago
In the same vague theme of "I don't know what I'm dealing with": https://github.com/ajalt/fuckitpy
lettergram almost 4 years ago
We built a similar tool, utilizing a CNN. It works on structured (and unstructured) data and provides additional info.

https://github.com/capitalone/DataProfiler

The cool part is that you can "extend" the internal named-entity recognition model by refitting it with new data.

Out of the box, the DataProfiler detects something like 18 entities, including most of the PII data.
cecilpl2 almost 4 years ago
Cool, but it seems like 80% of the results in your example demos are YouTube video IDs.
lapp0 almost 4 years ago
Why would I need this when I already have a full Tome of Identify with 50 charges?
mkl almost 4 years ago
Why are these screenshots animated? The command is still visible in the final frame, and the final frame shows the output we're interested in, but not long enough to read and understand it.
mgraczyk almost 4 years ago
Somewhat odd result for s3.amazonaws.com:

    ~> python3 -m pywhat s3.amazonaws.com
    Possible Identification
    ┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
    ┃ Matched Text     ┃ Identified as        ┃ Description ┃
    ┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
    │ s3.amazonaws.com │ JSON Web Token (JWT) │ None        │
    └──────────────────┴──────────────────────┴─────────────┘
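
One plausible reason for a result like this, offered as an assumption and not as a description of PyWhat's actual rules: a JWT is three dot-separated base64url segments, and a hostname such as s3.amazonaws.com has the same surface shape. The Python sketch below uses a hypothetical loose pattern (`LOOSE_JWT`) next to a hypothetical stricter check (`looks_like_jwt_strict`) that also requires the first segment to decode to a JSON object with an `alg` field.

```python
import base64
import json
import re

# Hypothetical loose rule: three dot-separated base64url-looking segments.
# NOT taken from PyWhat; it only shows how a shape-only rule can over-match.
LOOSE_JWT = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*")


def looks_like_jwt_loose(text: str) -> bool:
    """Return True if the whole string has the dotted three-segment shape."""
    return LOOSE_JWT.fullmatch(text) is not None


def _decode_json_segment(segment: str):
    """Base64url-decode a segment and parse it as JSON; None on any failure."""
    padded = segment + "=" * (-len(segment) % 4)  # restore stripped padding
    try:
        return json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:  # covers binascii.Error, JSON and Unicode decode errors
        return None


def looks_like_jwt_strict(text: str) -> bool:
    """Shape check plus: the header segment must be a JSON object with 'alg'."""
    if not looks_like_jwt_loose(text):
        return False
    header = _decode_json_segment(text.split(".")[0])
    return isinstance(header, dict) and "alg" in header
```

With these definitions, `looks_like_jwt_loose("s3.amazonaws.com")` is true while `looks_like_jwt_strict("s3.amazonaws.com")` is false, because `s3` does not decode to a JSON header.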
vitus almost 4 years ago
I'm admittedly not impressed by the pcap processing.

It identifies a bunch of fragments of HTTP headers as "YouTube Video ID".

Meanwhile, I can get the same info and more by running

    $ strings FollowTheLeader.pcap
    *]?>
    GET / HTTP/1.1
    Host: 10.0.2.5
    User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-US,en;q=0.5
    Accept-Encoding: gzip, deflate
    Connection: keep-alive
    Upgrade-Insecure-Requests: 1
    Pragma: no-cache
    Cache-Control: no-cache
    HTTP/1.0 200 OK
    Server: SimpleHTTP/0.6 Python/3.7.3rc1
    Date: Sun, 14 Jul 2019 02:42:13 GMT
    Content-type: text/html
    Content-Length: 105
    Last-Modified: Sun, 14 Jul 2019 02:41:10 GMT
    <h1>My Flag Web Page</h1>
    <p>Hi there! Have a flag!</p>
    <p>Here is your flag: ctfa{terrific_traffic}</p>
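
For readers without `strings` at hand, the same idea can be approximated in a few lines of Python. This is a minimal sketch, not a reimplementation of the GNU tool: it only extracts runs of printable ASCII, and the function name `printable_strings` and the default minimum length of 4 are choices made for this example.

```python
import re
from typing import List


def printable_strings(data: bytes, min_length: int = 4) -> List[str]:
    """Return runs of printable ASCII (space through '~') of at least min_length bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Example use on a capture file (the path is whatever file you want to inspect):
#     with open("FollowTheLeader.pcap", "rb") as handle:
#         for line in printable_strings(handle.read()):
#             print(line)
```

Because carriage returns and newlines fall outside the printable range used here, each HTTP header line comes back as its own string, much like the output quoted above.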
dec0dedab0de almost 4 years ago
At first I thought this was going to be like Google Lens. It's instead a way to probabilistically identify things in strings. I have wished for this to exist, and made my own dumbed-down version of it before. This could be very useful for less fragile screen scraping.
Daneil almost 4 years ago
Good program! I think it can be useful in OSINT, and in many more things!
iab almost 4 years ago
Has anyone tried using this on the GIMBAL/GOFAST UAP videos?
MrYellowP almost 4 years ago
I am confused and amazed at the same time.

What is this sorcery?
gigatexal almost 4 years ago
There really is a Python module for everything.
rainonmoon almost 4 years ago
Bee is a really tremendous and generous developer. I use a few of their other projects near-daily (RustScan especially has changed my life). Definitely one of those open source devs you follow just to see whatever they come up with next.