The New York Times and other news organizations have long been hampered by the shortcuts of HTML and hyperlinks, but they are now reclassifying their content to provide more structured, fluid data, a major development with massive implications, notes New York Times VP of Research Operations Michael Zimbalist, who keynoted Day 2 at ILM East. The benefits are immediate in terms of SEO, but over the longer term the shift will provide a richer product for consumers, Zimbalist says.
“Information has become increasingly granular or structured,” notes Zimbalist. “Each unit of content has extensive machine-readable metadata about itself.” Fluid information can move more easily among machines and people.
The New York Times, for example, can now process the 300 pieces of professional content that it produces every day (a brick of compiled information) into multiple formats, including personal editions and slide shows. “You are reaching underneath the databases that power the Web to do new things,” says Zimbalist.
The key is to move the surplus of names to strong identifiers that link to a data cloud driven by metadata. The Times, for instance, is embarking on connecting all its data to DBpedia, which is drawn from Wikipedia; Freebase, which is owned by Google; and GeoNames.
To date, 29,000 names have been recontextualized for a new semantic platform, a “super librarian,” which includes 39 percent of people (“Edgar Allan Poe”), 31 percent of organizations, 76 percent of locations (“Park Slope”) and 14 percent of descriptions. “The future is bright for librarians,” jokes Zimbalist.