Waiting lists and signup pages can't be Show HNs. Please see <a href="https://news.ycombinator.com/showhn.html" rel="nofollow">https://news.ycombinator.com/showhn.html</a>.<p>I've taken Show HN out of the title now.
I think the actual interface with OpenAI’s platform is the easy part. Everybody and their dog will have a version of this. Just look at the comments so far - many of us have already been playing with it.<p>If you want a real moat, figure out how to parse existing PDF documentation that is really badly formatted. Think diagrams and tables with text floating in various places, etc. Documentation of this style is very common in industries where physical things are being built in the real world. The standards documentation (IEEE, ANSI, NFPA, etc) doesn’t usually parse cleanly, much less the messier internal documentation within the businesses.<p>Grobid is the best example of such a documentation parser, but it is so laser focused on academic papers that it fails to properly process industry-style standards and SOP documentation. What the world needs right now is a Grobid that works for other kinds of messy documentation.
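To make it concrete, here's the naive extraction baseline on a hypothetical standards PDF ("ansi_spec.pdf" is made up); on documents like the ones described above it interleaves table cells and floating labels, which is exactly the problem a better parser would have to solve:<p><pre><code># Naive PDF extraction with pdfplumber; "ansi_spec.pdf" is a hypothetical
# standards document. extract_text()/extract_tables() are best-effort and
# tend to scramble tables that have text floating around them.
import pdfplumber

with pdfplumber.open("ansi_spec.pdf") as pdf:
    for page in pdf.pages:
        print((page.extract_text() or "")[:200])
        print(f"{len(page.extract_tables())} table(s) detected on this page")
</code></pre>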
The problem with GPT3 is it creates BS answers. The function "Interval" does not seem to exist.<p>Q: how would I do a SQL query with questdb to grab every other second of data?<p>> This is a common use case for us. We have built SQL functions to easily do this task. The function below will return every other second of data: SELECT * FROM `time_series` WHERE (INTERVAL '2s'). This query can be run directly from the web console or via Postgres wire protocol using Postgres client libraries such as pg4ce, postgres-java and others. The above query will return all records in the table with timestamp greater than 12
Was thinking of building something very similar - upload your technical docs - and create a chat bot help system.<p>One of the tricky things to overcome is the length limits of the prompt you can feed into things like GPT-3. There are some suggestions on the OpenAI website on how to overcome this. The main one seems to be to filter the content using embeddings and then only feed "relevant" sections into the prompt.<p>Would be interested to know what your approach is.
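For what it's worth, that embeddings-and-retrieval approach is simple enough to sketch. Something like the following, using the openai 0.x SDK; the model names and chunking here are my own assumptions, not necessarily what Twig does:<p><pre><code># Sketch: work around GPT-3's prompt length limit by embedding the doc
# chunks, retrieving only the most relevant ones, and stuffing those into
# the prompt. Requires openai.api_key to be set.
import numpy as np
import openai

def embed(texts):
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return np.array([d["embedding"] for d in resp["data"]])

def answer(question, doc_chunks, top_k=3):
    chunk_vecs = embed(doc_chunks)
    q_vec = embed([question])[0]
    # cosine similarity of the question against every chunk
    sims = chunk_vecs @ q_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q_vec))
    context = "\n---\n".join(doc_chunks[i] for i in sims.argsort()[::-1][:top_k])
    prompt = (f"Answer using only this documentation:\n{context}\n\n"
              f"Q: {question}\nA:")
    resp = openai.Completion.create(model="text-davinci-003", prompt=prompt,
                                    max_tokens=256, temperature=0)
    return resp["choices"][0]["text"].strip()
</code></pre>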
While I've been looking for this, the gap in most product documentation I have seen is that it's written by architects who want you to understand the whole framework and demonstrate its myriad parts, instead of by users (e.g. product managers) who just need to solve a problem and will learn from that.<p>On this GPT chat, I asked, "provide an example oauth2 config for QuestDB" and "how do I configure oauth2 for signaldot?" and it responded like a search engine with paragraphs from documentation that matched the keywords, perhaps with an extra degree of reasoning.<p>As someone who uses fantastically (farcically?) complex security technologies, I'd tell architects, PMs, documentation writers and developers: Your Cookbook Is Your Product.<p>LLMs have a GIGO problem, and until you can train them on example configs, they're going to be a search engine over someone's architectural grandstanding.
Some Highlights
- Twig takes static documentation sites and lets users ask questions about them<p>- Custom/fine-tuned GPT3 models are trained on your data (generic flow sketched below)<p>- Can be embedded within your app or docs site<p>Seeking Feedback<p>- What do you think about it?<p>- Would you add this to your docs site or within your app when it's available?<p>- What can I add/change to make this better?
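For those asking how the fine-tuned models fit in: the generic OpenAI flow is prompt/completion pairs in a JSONL file, uploaded and trained against a base model. The sketch below is that generic flow rather than our exact pipeline, and the example pair is made up:<p><pre><code># Generic shape of fine-tuning GPT-3 on doc-derived Q&A pairs (openai 0.x SDK).
# File name, separators and the example pair are illustrative only.
import json
import openai

pairs = [
    {"prompt": "How do I reset my API key?\n\n###\n\n",
     "completion": " Go to Settings > API Keys and click Regenerate. END"},
    # ... one entry per question/answer mined from the docs
]

with open("docs_qa.jsonl", "w") as f:
    for pair in pairs:
        f.write(json.dumps(pair) + "\n")

uploaded = openai.File.create(file=open("docs_qa.jsonl", "rb"), purpose="fine-tune")
job = openai.FineTune.create(training_file=uploaded["id"], model="davinci")
print(job["id"])  # poll this job until the fine-tuned model is ready
</code></pre>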
Looks interesting, but I'd need more details about setup and pricing. This is another approach that doesn't require fine-tuning: <a href="https://github.com/jerryjliu/gpt_index">https://github.com/jerryjliu/gpt_index</a>
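For reference, the gpt_index quickstart from its early README is only a few lines; the library is moving fast, so treat this as approximate and check the repo for the current API:<p><pre><code># Early gpt_index usage per its README; exact class names may have changed.
from gpt_index import GPTSimpleVectorIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("docs").load_data()  # a folder of .txt/.md files
index = GPTSimpleVectorIndex(documents)                # builds embeddings under the hood
print(index.query("How do I configure authentication?"))
</code></pre>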
It seems to do open domain question answering without restricting to the topic.<p>> Is the word cat made of 4 or 5 letters?<p>>> The word cat is made of 4 letters, 3 of which are in the stem.
I had this idea a few years back, just when ML was heating up: a service that would crawl all of your documentation/data, which you could then ask questions and get answers back from. A domain-specific search engine/knowledge base.<p>I started reading some of the papers, realized that even building a model required a huge amount of resources, and just dropped it.
Some great stuff here. Been thinking about doing this for enterprise-grade software at work. What I want to do is feed it docs (PDF), text from support queries with answers, and videos. What types of approaches should I be considering here? Have just started using Whisper to do ASR on the videos.
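If it helps, the Whisper piece at least is only a few lines; the model size and file name below are just placeholders:<p><pre><code># Transcribe a support/demo video with openai-whisper so the text can be
# indexed alongside the docs. Model size and file name are placeholders.
import whisper

model = whisper.load_model("base")
result = model.transcribe("product_demo.mp4")
print(result["text"])
</code></pre>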
It seems that technical writers will be even more important, as their work will be the input for these models.<p>Unless GPT can learn directly from code and explain it properly.<p>With ChatGPT I've had some problems getting it to deduce things from code prompts, but this seems promising.
Just saw someone try to hack/denial-of-service the bot by putting too many words in the prompt (this trips the limits quickly since usage is token-based).<p>I've set a prompt limit of 100 characters, which should be plenty. The bot is still down until OpenAI resets the limits. <a href="https://twitter.com/chandanmaruthi/status/1612134496035569664?s=20&t=JvOWZPvxNGXwvLqKG_a3KQ" rel="nofollow">https://twitter.com/chandanmaruthi/status/161213449603556966...</a>
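The guard itself is trivial, roughly this (the 100-character cap mirrors the limit mentioned above):<p><pre><code># Reject oversized prompts before they reach the OpenAI API, since quota is
# consumed per token.
MAX_PROMPT_CHARS = 100

def check_prompt(user_input: str) -> str:
    cleaned = user_input.strip()
    if len(cleaned) > MAX_PROMPT_CHARS:
        raise ValueError("Prompt too long, please ask a shorter question.")
    return cleaned
</code></pre>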
Hi all, we had over 6,000 interactions and have hit OpenAI's hard API usage limits :-). I have asked OpenAI to increase the limits; until then, live chat with the bot may be restricted. Remember to get on the waitlist with "get access" if you want to give it a try when the live chat starts again.
<p><pre><code> QuestDb; Columnar time-series database with high performance ingestion and SQL analytics you know and love from QuestDB open source, now on the cloud.
Q: How high is the performance?
GPT3: We have tested QuestDB against other databases using benchmarking frameworks and benchmarking tools. We have also developed our own performance tests to simulate real-world use cases involving large data sets, multiple ingestion streams, and complex SQL queries. The results of these efforts are summarized in the table below: QuestDB is the fastest open source time series database available today. It powers real-life applications such as Yahoo! which stores 1 billion records per day in a single instance of QuestDB with
</code></pre>
GPT3 says QuestDb is web scale and I for one trust it implicitly.
Let's take a language model that was trained on hundreds of gigabytes of god-knows-what, has a tendency to hallucinate, has literally never used your product, and is known as a consummate bullshitter, and unleash it unsupervised on your confused users.<p>Sorry for the snark, but what is your plan for quality assurance, so that this meets users' needs instead of just BS'ing them out of your hair?
Trying something similar. Using a mix of embeddings and generative AI (davinci) to answer questions from scraped website data. Scraped data from our website (Ozonetel.com) and created this site.<p>1. Scraped the website. Used a default node scraper. 5 mins.<p>2. Generated huggingface embeddings. 10 mins.<p>3. Used code AI to generate a basic website. 5 mins.<p>4. Created a prompt to limit answers to what it knows. 1 min.<p>So in about 30 mins we were able to create a site search powered by generative AI.<p>Disclaimer: this is still a work in progress.<p><a href="http://speech-kws.ozonetel.com/ozosearch" rel="nofollow">http://speech-kws.ozonetel.com/ozosearch</a>
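The retrieval-plus-davinci step (steps 2 and 4 above) looks roughly like this; the model names, top_k and chunking are illustrative rather than our exact code:<p><pre><code># Huggingface embeddings for retrieval, then a davinci prompt constrained to
# the retrieved text. Model names, top_k and chunking are illustrative.
import openai
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def answer(question, scraped_chunks):
    chunk_vecs = encoder.encode(scraped_chunks, convert_to_tensor=True)
    q_vec = encoder.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(q_vec, chunk_vecs, top_k=3)[0]
    context = "\n".join(scraped_chunks[h["corpus_id"]] for h in hits)
    prompt = ("Answer the question using only the context below. "
              "If the answer is not in the context, say you don't know.\n\n"
              f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
    resp = openai.Completion.create(model="text-davinci-003", prompt=prompt,
                                    max_tokens=200, temperature=0)
    return resp["choices"][0]["text"].strip()
</code></pre>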
Q: can I self-host questdb in my own EKS?<p>> QuestDB is a popular time series database used by many companies for their application monitoring and real-time analytics. We have built QuestDB as a system to track the progress of our own application, which has evolved from an experiment into production at scale. As we grow, so does our use of QuestDB; it's now the core data store for all things InfluxData. We are looking to hire talented engineers who share our mission of empowering data producers with tools to<p>Not super impressed