GPT-4V(ision) system card [pdf]

46 points by juunge over 1 year ago

5 comments

simonw over 1 year ago
Genuine question: why is this only published as a PDF?

OpenAI have the resources to also publish this as HTML. They chose not to.

They're not alone in this - most of the academic and research world, plus the concept of a "whitepaper" seems predicated on the idea of publishing PDFs.

Is this some stupid thing where human beings are expected to attach more prestige to information published in this way?

PDFs are a terrible way of publishing information in 2023:

- they render poorly on mobile devices, where many (most?) people do their reading
- they're hard to copy and paste information out of
- you can't link to headings within them (like HTML fragment links)
- you can't easily run them through translation tools like the one built into Chrome

The benefits of PDF I can see are:

1. Easier to print and get the exact expected output
2. You can save one file offline
3. Easier to author

I'm not arguing to replace PDFs with HTML (though I wouldn't miss them personally) - I'm saying publish documents as both!

Provide an HTML version and a PDF alternative for people who want it.

Am I missing something here? Why does the academic and research world stubbornly stick to such a hostile way of publishing their results?
tmaly over 1 year ago
Looking at this, it gave me this other idea.

I was looking over older State building codes from early 90s for a homeowners association issue.

Most of these older codes are scanned pictures of the text.

It would be interesting if they have some type of OCR extension for ChatGPT where you could upload the image of the pages and it could OCR and work with the text.

This same situation happens with the city council agendas current day. They make these 300 page pdf documents all of scanned images of the text. It is really hard to search them and figure out what is going on.
hsdropout over 1 year ago
In this PDF there is an example of a controversial output in response to an image for job applicants. The "solution" was to decline to answer that category of question. This doesn't feel like a reasonable approach, as it will become a game of whack-a-mole.

This also seems to acknowledge that the model has deep bias-related flaws and instead of treating the causes, they are going after symptoms.
stoicbatman over 1 year ago
An interesting perspective on the use of PDFs in the academic and research world. What I find striking is how PDFs have remained so prevalent despite the rapid digital transformation in recent years. While the static nature of PDFs lends itself to easy citation, it's time we reconsidered the emphasis on format over functionality.
swyx over 1 year ago
my notes:

- ramped up to 16k BeMyEyes + 1k developer alpha testers over 6 months
- reduced frequency and severity of hallucinations
- improved OCR and quality of descriptions
- great demand for describing people without affecting privacy/bias - intentionally refusing person identification 98% of the time and lowering accuracy to 0%. also declining a whole lot of problematic queries, per fig 8
- converting known jailbreaks to images to defend against multimodal jailbreaks. ironic how jailbreak collection websites probably made it a lot easier to break the jailbreaks
- interesting descriptions of mitigation process in 2.4.2.

discussion linked https://twitter.com/swyx/status/1706359912283152556