TechEcho

5 comments

simonw over 1 year ago

Genuine question: why is this only published as a PDF?

OpenAI have the resources to also publish this as HTML. They chose not to.

They're not alone in this - most of the academic and research world, plus the concept of a "whitepaper", seems predicated on the idea of publishing PDFs.

Is this some stupid thing where human beings are expected to attach more prestige to information published in this way?

PDFs are a terrible way of publishing information in 2023:

- they render poorly on mobile devices, where many (most?) people do their reading
- they're hard to copy and paste information out of
- you can't link to headings within them (like HTML fragment links)
- you can't easily run them through translation tools like the one built into Chrome

The benefits of PDF I can see are:

1. Easier to print and get the exact expected output
2. You can save one file offline
3. Easier to author

I'm not arguing to replace PDFs with HTML (though I wouldn't miss them personally) - I'm saying publish documents as both!

Provide an HTML version and a PDF alternative for people who want it.

Am I missing something here? Why does the academic and research world stubbornly stick to such a hostile way of publishing their results?

tmaly over 1 year ago

Looking at this gave me another idea.

I was looking over older State building codes from the early 90s for a homeowners association issue. Most of these older codes are scanned pictures of the text.

It would be interesting if they had some type of OCR extension for ChatGPT where you could upload images of the pages and it could OCR them and work with the text.

The same situation happens with city council agendas today. They publish these 300-page PDF documents made up entirely of scanned images of the text. It is really hard to search them and figure out what is going on.

hsdropout over 1 year ago

In this PDF there is an example of a controversial output in response to an image for job applicants. The "solution" was to decline to answer that category of question. This doesn't feel like a reasonable approach, as it will become a game of whack-a-mole.

This also seems to acknowledge that the model has deep bias-related flaws and instead of treating the causes, they are going after symptoms.

stoicbatman over 1 year ago

An interesting perspective on the use of PDFs in the academic and research world. What I find striking is how PDFs have remained so prevalent despite the rapid digital transformation in recent years. While the static nature of PDFs lends itself to easy citation, it's time we reconsidered the emphasis on format over functionality.

swyx over 1 year ago

my notes:

- ramped up to 16k BeMyEyes + 1k developer alpha testers over 6 months
- reduced frequency and severity of hallucinations
- improved OCR and quality of descriptions
- great demand for describing people without affecting privacy/bias - intentionally refusing person identification 98% of the time and lowering accuracy to 0%. also declining a whole lot of problematic queries, per fig 8
- converting known jailbreaks to images to defend against multimodal jailbreaks. ironic how jailbreak collection websites probably made it a lot easier to break the jailbreaks
- interesting descriptions of mitigation process in 2.4.2.

discussion linked https://twitter.com/swyx/status/1706359912283152556

GPT-4V(ision) system card [pdf]
