I used to do a lot of work with spam filtering. I once worked at a company that had set up hundreds of marketing websites, all of which were the start of a sales funnel that fed into Marketo, then into LeadSpace, and then into Salesforce. A good response from potential customers looked like this:<p>"I am interested in the pricing for MaxaMegaAI. Do you have a free tier for a startup with fewer than 10 developers?"<p>or:<p>"Can your ETL tool handle different systems for geospatial calculations?"<p>Bad responses looked like:<p>"None"<p>or:<p>"Damn"<p>or:<p>"sdefedflkjlkjsdfsdlkfjlskdfj"<p>I wrote simple machine learning scripts to automate some of our spam filtering.<p>I have the impression this area has come a long way since then. I think this category of machine learning is sometimes called "essay scoring."<p>I've been away from this kind of work for 7 years. I assume that nowadays, with LLMs, there are advanced techniques that can be easily implemented?<p>Can someone point me towards a good resource?
I process text through<p><a href="https://www.sbert.net/" rel="nofollow">https://www.sbert.net/</a><p>and apply a classical machine learning algorithm such as a probability-calibrated SVM. This usually beats bag-of-words classifiers because it is able to suss out some of the meaning of the words. The advantage of this approach is that it is very fast (maybe 30 seconds to reliably train a model).<p>It is also possible to "fine tune" a BERT-family model using tools from Huggingface, like so:<p><a href="https://huggingface.co/docs/transformers/training" rel="nofollow">https://huggingface.co/docs/transformers/training</a><p>My experience is that this takes more like 30 minutes to train a model, and the process is not as reliable. For some tasks it performs better than the first approach, but I haven't gotten it to reliably improve on my current models for my tasks.<p>I am planning to fine-tune a T5 model when I have a problem that I think it will do well on.
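A minimal sketch of the first approach, the probability-calibrated SVM over sentence embeddings, using scikit-learn. The real embedding step (shown in a comment; the model name "all-MiniLM-L6-v2" is just one common choice, not something the comment above specifies) is replaced here with toy 2-D vectors so the sketch runs without downloading a model:

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.calibration import CalibratedClassifierCV

# In the real pipeline the feature vectors come from sentence-transformers, e.g.:
#   from sentence_transformers import SentenceTransformer
#   X = SentenceTransformer("all-MiniLM-L6-v2").encode(texts)
# Here, toy 2-D vectors stand in for the embeddings: imagine axis 0
# loosely encoding "reads like a real product inquiry".

rng = np.random.default_rng(0)
n = 40
X_good = rng.normal(loc=[1.0, 0.0], scale=0.3, size=(n, 2))   # genuine leads
X_spam = rng.normal(loc=[-1.0, 0.0], scale=0.3, size=(n, 2))  # junk responses
X = np.vstack([X_good, X_spam])
y = np.array([1] * n + [0] * n)  # 1 = genuine inquiry, 0 = spam

# Wrap a linear SVM in CalibratedClassifierCV: the SVM alone only gives
# decision scores; the calibration layer turns them into probabilities.
clf = CalibratedClassifierCV(LinearSVC(), cv=5)
clf.fit(X, y)

# predict_proba is available because of the calibration wrapper
proba_good = clf.predict_proba([[1.1, 0.1]])[0, 1]
proba_spam = clf.predict_proba([[-1.0, -0.1]])[0, 1]
print(proba_good, proba_spam)
```

Training the SVM on precomputed embeddings is what makes the full cycle so fast: the embedding model is frozen, so only the small linear classifier is fit, and the calibrated probabilities let you set a spam threshold instead of taking a hard yes/no label.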