This is a digression, but I really wish Amazon would be more normal in their product descriptions.<p>Amazon is rapidly developing its own jargon such that you need to understand how Amazon talks about things (and its existing product lineup) before you can understand half of what they're saying about a new thing. The way they describe their products seems almost designed to obfuscate what they <i>really</i> do.<p>Every time they introduce something new, you have to click through several pages of announcements and docs just to ascertain what something <i>actually is</i> (an API, a new type of compute platform, a managed SaaS product?)
No audio support: The models are currently trained to process and understand video content solely based on the visual information in the video. They do not possess the capability to analyze or comprehend any audio components that are present in the video.<p>This is blowing my mind. gemini-1.5-flash accidentally knows how to transcribe amazingly well, but it is -very- hard to figure out how to use it well, and now Amazon comes out with a Gemini-Flash-like model that explicitly ignores audio. It is so clear that multi-modal audio would be easy for these models, but it is like they are purposefully holding back from releasing or supporting it. This has to be a strategic decision not to attach audio, probably because the margins on ASR are too high to strip with a cheap LLM. I can only hope Meta will drop a multi-modal audio model soon to force this.
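For anyone who wants to try the Gemini route, here is a minimal sketch of audio transcription with gemini-1.5-flash through the google-generativeai Python SDK. The file name and prompt wording are my own placeholders, not anything from the docs:

    # Minimal sketch: transcribing audio with gemini-1.5-flash.
    # The file name and prompt are placeholders.
    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")

    # Upload the audio through the Files API, then pass it with a text prompt.
    audio = genai.upload_file("meeting.mp3")
    model = genai.GenerativeModel("gemini-1.5-flash")
    response = model.generate_content([audio, "Transcribe this audio verbatim."])
    print(response.text)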
Setting up AWS so you can try it via the Amazon Bedrock API is a hassle, so I made a step-by-step guide: <a href="https://ndurner.github.io/amazon-nova" rel="nofollow">https://ndurner.github.io/amazon-nova</a>. It's 14+ steps!
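Once model access is enabled, a minimal call through boto3's Converse API looks roughly like this. The model ID and region are assumptions; check the Bedrock console for what's enabled in your account (some regions require a cross-region inference profile ID instead):

    # Minimal sketch: calling a Nova model via the Bedrock Converse API.
    # Model ID and region are assumptions -- verify in your Bedrock console.
    import boto3

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.converse(
        modelId="amazon.nova-lite-v1:0",
        messages=[{"role": "user", "content": [{"text": "Hello, Nova!"}]}],
    )
    print(response["output"]["message"]["content"][0]["text"])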
The technical report is available here:
<a href="https://www.amazon.science/publications/the-amazon-nova-family-of-models-technical-report-and-model-card" rel="nofollow">https://www.amazon.science/publications/the-amazon-nova-fami...</a>
They missed a big opportunity by not offering EU-hosted versions.<p>That's a big thing for compliance. All LLM providers reserve the right to save prompts (for up to 30 days) and inspect/check them for their own compliance.<p>However, this means that company data is potentially stored out-of-cloud. This is already problematic, and even more so when the storage location is outside the EU.
More options/competition is good. When will we see it on <a href="https://lmarena.ai/" rel="nofollow">https://lmarena.ai/</a> ?
I really wish they would left-justify instead of center-justify the pricing information so I'm not sitting here counting zeroes and trying to figure out how they all line up.
> The Nova family of models were trained on Amazon’s custom Trainium1 (TRN1) chips, NVidia A100 (P4d instances), and H100 (P5 instances) accelerators. Working with AWS SageMaker, we stood up NVidia GPU and TRN1 clusters and ran parallel trainings to ensure model performance parity<p>Does this mean they trained multiple copies of the models?
Some independent latency and quality evaluations are already available at
<a href="https://artificialanalysis.ai/" rel="nofollow">https://artificialanalysis.ai/</a>
Looks to be cheap and fast.
It would be nice if this were a truly open source model like OLMo:
<a href="https://venturebeat.com/ai/truly-open-source-llm-from-ai2-to-drive-critical-shift-in-ai-development/" rel="nofollow">https://venturebeat.com/ai/truly-open-source-llm-from-ai2-to...</a>
> The model processes inputs up to 300K tokens in length [...] up to 30 minutes of video in a single request.<p>I wonder how quickly it "glances" through an entire 30-minute video, i.e. how long it takes until the first returned token. Anyone wager a guess?
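If someone wants to measure it, one approach is timing the first chunk from the streaming Converse API. A rough sketch, where the model ID, S3 URI, and exact video payload shape are my assumptions rather than confirmed schema:

    # Rough sketch: timing the first streamed token for a video request.
    # Model ID, S3 URI, and video payload shape are assumptions.
    import time
    import boto3

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    start = time.monotonic()
    stream = client.converse_stream(
        modelId="amazon.nova-lite-v1:0",
        messages=[{
            "role": "user",
            "content": [
                {"video": {"format": "mp4",
                           "source": {"s3Location": {"uri": "s3://my-bucket/clip.mp4"}}}},
                {"text": "Summarize this video."},
            ],
        }],
    )
    for event in stream["stream"]:
        if "contentBlockDelta" in event:  # first content chunk arrives here
            print(f"time to first token: {time.monotonic() - start:.2f}s")
            break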
They really should've tried to generate better video examples, those two videos that they show don't seem that impressive when you consider the amount of resources available to AWS. Like what even is the point of this? It's just generating more filler content without any substance. Maybe we'll reach the point where video generation gets outrageously good and I'll be proven wrong, but right now it seems really disappointing.<p>Right now when I see obviously AI generated images for book covers I take that as a signal of low quality. If AI generated videos continue to look this bad I think that'll also be a clear signal of low quality products.
DOA<p>When marketing talks about the price delta and not the quality of the output, it is DOA. For LLMs, quality is the more important metric, and Nova will be playing catch-up with the leaderboard forever.
It's really amusing how bad Amazon is at writing and designing UI. For a company of their size and scope it's practically unforgivable. But they always get away with it.