Lynda Moulton is a Senior Analyst and Consultant in enterprise search at Outsell's Gilbane Group. She has over 30 years as a professional search architect. She also consults on metadata management and taxonomies for content behind the firewall. Lynda is a DZone MVB and is not an employee of DZone and has posted 15 posts at DZone. You can read more from them at their website. View Full User Profile

Why is it so Hard to "Get" Semantics Inside the Enterprise?

11.15.2011
| 6135 views |
  • submit to reddit

Semantic Software Technologies: Landscape of High Value Applications for the Enterprise was published just over a year ago. Since then the marketplace has been increasingly active; new products emerge and discussion about what semantics might mean for the enterprise is constant. One thing that continues to strike me is the difficulty of explaining the meaning of, applications for, and context of semantic technologies.

Browsing through the topics in this excellent blog site, http://semanticweb.com , it struck me as the proverbial case of the blind men describing an elephant. A blog, any blog, is linear. While there are tools to give a blog dimension by clustering topics or presenting related information, it is difficult to understand the full relationships of any one blog post to another. Without a photographic memory, an individual does not easily connect ideas across a multi-year domain of blog entries. Semantic technologies can facilitate that process.

Those who embrace some concept of semantics are believers that search will benefit from "semantic technologies." What is less clear is how evangelists, developers, searchers and the average technology user can coalesce around the applications that will semantically enable enterprise search.

On the Internet content that successfully drives interest, sales, opinion and individual promotion does so through a combination of expert crafting of metadata, search engine technology that "understands" the language of the inquirer and the content that can satisfy the inquiry. Good answers are reached when questions are understood first and then the right content is selected to meet expectations.

In the enterprise, the same care must be given to metadata, search engine "meaning" analysis tools and query interpretation for successful outcomes. Magic does not happen without people behind the scenes to meet these three criteria executing linguistic curation, content enhancement and computational linguistic programming.

Three recent meeting events illustrate various states of semantic development and adoption, even as the next conference, Semantic Tech & Business Conference - Washington, D.C. on November 29 - is upon us:

Event 1 - A relatively new group, the IKS-Community funded by the EU has been supporting open source software developers since 2009. In July they held a workshop in Paris just past the mid-point of their life cycle. Attendees were primarily entrepreneurs and independent open source developers seeking pathways for their semantically "tuned" content management solutions. I was asked to suggest where opportunities and needs exist in US markets. They were an enthusiastic audience and are poised to meet the tough market realities of packaging highly sophisticated software for audiences that will rarely understand how complex the stuff "under the hood" really is. My principal charge to them was to create tools that "make it really easy" to work with vocabulary management and content metadata capture, updates, and enhancements.

Event 2. - On this side of the pond, UK firm Linguamatics hosted its user group meeting in Boston in October. Having interviewed a number of their customers last year to better understand their I2E product line, I was happy to meet people I had spoken with and see the enthusiasm of a user community vested in such complex technology. Most impressive is the respectful tone and thoughtful sharing between Linguamatics principals and their customers. They share the knowledge of how hard it is to continually improve search technology that delivers answers to semantically complex questions using highly specialized language. Content contributors and inquirers are all highly educated specialists seeking answers to questions that have never been asked before. Think about it, search engines designed to deliver results for frequently asked questions or to find content on popular topics is hard enough, but finding the answer to a brand new question is a quantum leap of difficulty in comparison.

To make matters even more complicated, answers to semantic (natural language) questions may be found in internal content, in published licensed content or some combination of both. In the latter case, only the seeker may be able to put the two together to derive or infer an answer.

Publishers of content for licensing play a convoluted game of how they will license their content to enterprises for semantic indexing in combination with internal content. The Linguamatics user community is primarily in life sciences; this is one more hurdle for them to overcome to effectively leverage the vast published repositories of biological and medical literature. Rigorous pricing may be good business strategy, but research using semantic search could make more headway with more reasonable royalties that reflect the need for collaborative use across teams and partners.

Content wants to be found and knowledge requires outlets to enable innovation to flourish. In too many cases technology is impaired by lack of business resources by buyers or arcane pricing models of sellers that hold vital information captive for a well-funded few. Semantically excellent retrieval depends on an engine's indexing access to all contextually relevant content.

Event 3. - Leslie Owens of Forrester Research, at the Fall 2011 Enterprise Search Summit conducted a very interesting interactive session that further affirms the elephant and blind men metaphor. Leslie is a champion of metadata best practices and writes about the competencies and expertise needed to make valuable content accessible. She engaged the audience with a series of questions about its wants, needs, beliefs and plans for semantic technologies. As described in an earlier paragraph about how well semantics serves us on the Web, most of the audience puts its faith in that model but is doubtful of how or when similar benefits will accrue to enterprise search. Leslie and a couple of others made the point that a lot more work has to be done on the back-end on content in the enterprise to get these high-value outcomes.

We'll keep making the point until more adopters of semantic technologies get serious and pay attention to content, content enhancement, expert vocabulary management and metadata. If it is automatic understanding of your content that you are seeking, the vocabulary you need is one that you build out and enhance for your enterprise's relevance. Semantic tools need to know the special language you use to give the answers you need.


Source: http://gilbane.com/search_blog/2011/11/why_is_it_so_hard_to_get_semantics_inside_the_enterprise.html

Published at DZone with permission of Lynda Moulton, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)