GDELT 2.0: new release of open Global Event Database updated every 15 min

1 point by bradleysmith, about 8 years ago

1 comment

bradleysmith, about 8 years ago
New features:

15 Minute Updates. Access the world’s breaking events and reaction in near-realtime as both the GDELT Event and Global Knowledge Graph now update every 15 minutes.

Realtime Translation of 65 Languages. GDELT 2.0 brings with it the public debut of GDELT Translingual, representing what we believe is the largest realtime streaming news machine translation deployment in the world: all global news that GDELT monitors in 65 languages, representing 98.4% of its daily non-English monitoring volume, is translated in realtime into English for processing through the entire GDELT Event and GKG/GCAM pipelines. GDELT Translingual is designed to allow GDELT to monitor the entire planet at full volume, creating the very first glimpses of a world without language barriers. A special emphasis on locations and names makes GDELT 2.0 likely the largest multilingual geocoding system in the world.

Realtime Measurement of 2,300 Emotions and Themes. GDELT 2.0 also brings with it the debut of GDELT Global Content Analysis Measures (GCAM), representing what we believe is the largest deployment of sentiment analysis in the world: it brings together 24 emotional measurement packages that together assess more than 2,300 emotions and themes from every article in realtime, with multilingual dimensions natively assessing the emotions of 15 languages (Arabic, Basque, Catalan, Chinese, French, Galician, German, Hindi, Indonesian, Korean, Pashto, Portuguese, Russian, Spanish, and Urdu). GCAM is designed to enable unparalleled assessment of the emotional undercurrents and reaction at a planetary scale by bringing together an incredible array of dimensions, from LIWC’s “Anxiety” to Lexicoder’s “Positivity” to WordNet Affect’s “Smugness” to RID’s “Passivity”.

High Resolution View of the Non-Western World. Over the last few months we’ve embarked upon an ambitious initiative to vastly expand GDELT’s knowledge of the media systems of the non-Western world.
Working closely with governments, think tanks, academics, NGOs, and citizens on the ground throughout the world, we have been working country-by-country to try to build the highest resolution inventory possible of the media systems of the non-Western world. While we still have a long way to go and the fluidity of the world’s media ensures that this will be a perpetual task, we are incredibly excited by the ability of this high resolution inventory, coupled with GDELT Translingual’s ability to translate 98.4% of this material in realtime, to give voice to the most remote corners of the world in near-realtime.

Relevant Imagery, Videos, and Social Embeds. A large fraction of the world’s news outlets now specify a hand-selected image for each article, representing the core focus of the article, to appear when it is shared via social media. GDELT identifies this imagery in a wide array of formats including Open Graph, Twitter Cards, Google+, IMAGE_SRC, and SailThru formats, among others. In addition, GDELT also uses a set of highly specialized algorithms to analyze the article content itself to identify inline imagery likely to be highly relevant to the story, along with videos and embedded social media posts (such as embedded Tweets or YouTube or Vine videos), a list of which is compiled. This makes it possible to gain a unique ground-level view into emerging situations anywhere in the world, even in those areas with very little social media penetration, and to act as a kind of curated list of social posts in those areas with strong social use.

Quotes, Names, and Amounts. The world’s news contains a wealth of information on food prices, aid promises, numbers of troops, tanks, and protesters, and nearly any other countable item. GDELT 2.0 now attempts to compile a list of all “amounts” expressed in each article to offer numeric context to global events.
In parallel, a new Names engine augments the existing Person and Organization names engines by identifying an array of other kinds of proper names, such as named events (Orange Revolution / Umbrella Movement), occurrences like the World Cup, named dates like Holocaust Remembrance Day, on through named legislation like Iran Nuclear Weapon Free Act, Affordable Care Act and Rouge National Urban Park Initiative. Finally, GDELT also identifies attributable quotes from each article, making it possible to see the evolving language used by political leadership across the world.

Tracking Event Discussion Progression. Under the previous version of GDELT, only the first URL mentioning a given event was recorded, even if the event was mentioned in a hundred separate articles. GDELT 2.0 adds a new “Mentions” table that records every mention of an event over time, along with the timestamp the article was published. This allows the progression of an event through the global media to be tracked, identifying outlets that tend to break certain kinds of events the earliest or which may break stories later but are more accurate in their reporting on those events. Combined with the 15 minute update resolution and GCAM, this also allows the emotional reaction and resonance of an event to be assessed as it sweeps through the world’s media.

Over 100 New GKG Themes. There are more than 100 new themes in the GDELT Global Knowledge Graph, ranging from economic indicators like price gouging and the price of heating oil to infrastructure topics like the construction of new power generation capacity to social issues like marginalization and burning in effigy.
The list of recognized infectious diseases, ethnic groups, and terrorism organizations has been considerably expanded, and more than 600 global humanitarian and development aid organizations have been added, along with global currencies and massive new taxonomies capturing global animals and plants to aid with tracking species migration and poaching.

Source Geographic Background Knowledge. GDELT now assesses the geography of every outlet it monitors over time and estimates its physical location on earth, incorporating that information back into the geocoding process to maximize its ability to recognize the geography of local media (a small rural radio station likely assumes its listeners know what country it is based in and thus does not clarify every mention of a local location with the corresponding country name).

Global Knowledge Graph Now in BigQuery. The GDELT Global Knowledge Graph is now available in Google BigQuery, allowing you to query and explore the GKG in realtime and to integrate it into queries of the Event dataset. In fact, the Event, Mentions, and GKG tables are now all in BigQuery and updated every 15 minutes, allowing you to leverage BigQuery’s enormous power to perform mass-scale analytics in near-realtime on our changing planet.
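
To make the "Mentions" idea concrete, here is a minimal Python sketch of tracking how an event spreads across outlets. It is only an illustration: the record layout (event ID, 14-digit timestamp, outlet name), the sample rows, and the helper names `event_timelines`, `first_breakers`, and `spread_minutes` are all assumptions made up for this example, not GDELT's actual schema or API.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical mention records: (event_id, article_timestamp, outlet).
# The layout and values are illustrative only, not GDELT's real schema.
MENTIONS = [
    ("E1", "20150218121500", "outlet-b.example"),
    ("E1", "20150218120000", "outlet-a.example"),
    ("E1", "20150218123000", "outlet-c.example"),
    ("E2", "20150218130000", "outlet-c.example"),
    ("E2", "20150218131500", "outlet-a.example"),
]


def parse_ts(ts):
    """Parse a YYYYMMDDHHMMSS string into a datetime."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S")


def event_timelines(mentions):
    """Group mentions by event and sort each group by publication time."""
    timelines = defaultdict(list)
    for event_id, ts, outlet in mentions:
        timelines[event_id].append((parse_ts(ts), outlet))
    for entries in timelines.values():
        entries.sort()  # earliest mention first
    return dict(timelines)


def first_breakers(timelines):
    """Return the outlet with the earliest mention of each event."""
    return {eid: entries[0][1] for eid, entries in timelines.items()}


def spread_minutes(timelines):
    """Return minutes between the first and last mention of each event."""
    return {
        eid: (entries[-1][0] - entries[0][0]).total_seconds() / 60
        for eid, entries in timelines.items()
    }


if __name__ == "__main__":
    timelines = event_timelines(MENTIONS)
    print(first_breakers(timelines))
    print(spread_minutes(timelines))
```

With every mention recorded rather than only the first URL, the same grouping could be extended to count how often each outlet is the earliest source for a given kind of event.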