A KNOWLEDGE SHARING VIEW OF COMMUNICATIONS BETWEEN ORGANIZATIONS

Robert Neches
USC / Information Sciences Institute
4676 Admiralty Way
Marina del Rey, CA 90292
310/822-1511, Fax 310/823-6714
Neches@ISI.edu

OVERVIEW

The Integrated User Support Environments Group at ISI has been working for several years on a prototypical example of integration across organizational boundaries. The results of our work are embodied in a system called DRAMA, the Data Review, Analysis, and Monitoring Aid. DRAMA is intended to act as an ``intelligent conduit'', linking triads of organizations: the enterprises which design and build large systems (e.g., aircraft), the enterprises which buy and use those systems, and the enterprises which supply the end users with spare and replacement parts. The approach taken in building DRAMA is driven by a philosophical bias toward incremental development and integration of models.

This paper will give an overview of the DRAMA system. It will describe the methodology and tools supporting its construction. Finally, it will consider the implications for enterprise integration efforts like DRAMA of a broader approach to model-based systems: the DARPA Knowledge Sharing Effort.

THE DRAMA SYSTEM: AN EI CASE STUDY

Although intended to act as a conduit among a triad of enterprises, the design of DRAMA is driven by the needs of one organization in the triad: the Defense Logistics Agency of the U.S. Department of Defense. A little-known agency with a $15-billion annual budget, DLA is responsible for maintaining an inventory of spares and replacements for components of a broad range of systems utilized by military services. The commodities handled by DLA run the gamut from mundane items such as fuel, food, medical supplies and clothing, to fairly esoteric electronic and mechanical components. These commodities are components of systems ranging in size and complexity from pistols to forklifts to helicopters, airplanes, ships, and submarines.
DLA needs to work closely with the military services which buy these systems and the industrial organizations which design and manufacture them. Coordination between these organizations is a non-trivial problem for DLA; operations research studies show that 36% of new additions to its inventory currently turn out to be purchased unnecessarily or prematurely. DRAMA's function is to save DLA substantial amounts of money by maintaining synchronization between DLA's purchasing and inventory plans, the builders' design and manufacturing plans, and the users' deployment plans.

To achieve this goal, DRAMA must satisfy a number of subgoals. It must recognize new components described in the design database which appear likely to require stocking by DLA, and must then trigger technical analyses of those components. In doing so, it must balance the benefits of spreading workload (analyzing parts as they come in rather than all at once) vs. the cost of error (triggering analyses of parts that are later deleted from the design). It must act as a conduit for feedback to the designer about supportability issues (e.g., the availability of a more commonly available alternative for a proposed part), ideally doing so early enough in the design process to have an impact. It must recognize when design changes (e.g., adding vibration dampers) directly or indirectly impact supply considerations (e.g., changes in predicted reliability can affect what to buy, when to buy it, how many to buy and where to keep the inventory). It must similarly evaluate support requests from DLA's customers, the military services, to determine if they are consistent with its knowledge about the design and deployment plans. (This is an example of a practical issue with profound implications for model integration: organizations can have very different goals that affect both the models they maintain and the information they wish to share.
In this case, the military services are interested in maximizing the availability of the parts they need, should they need them, regardless of cost to DLA; DLA is concerned with breaking even on inventory costs, which it cannot recoup until orders are actually placed.)

TOOLS AND METHODOLOGY: EVOLUTIONARY MODEL DEVELOPMENT IN DRAMA

Not surprisingly, to perform its task DRAMA needs to bridge differences in models between organizations. Interaction with legacy systems is a non-trivial component of this problem. The design is described in terms of American Military Standard 1388, also known as Logistical Support Analysis Records (LSAR). This specification defines, in effect, a model which describes components of a design from a logistical point of view, but a view which is centered around that one design. LSAR is intended to be a bridge between CAD models and logistical models. In fact, it is more of a bridge to the middle of an ocean; additional bridges are needed to contact anything solid.

The problem is that a complete logistical view is almost an inversion of a design view. For example, if a particular kind of bolt is used in four different places in an airplane design, the LSAR model will represent each of those usages, but not explicitly represent that the type of bolt is the same in those four places, nor even the total number of that kind of bolt used on the plane. The model is a hierarchical decomposition of assemblies into sub-assemblies, coupled with a detailed description of components. DLA, on the other hand, needs to view the world from the perspective of relations between parts, systems and customers. It would like to represent the world in terms of aggregations of usage for a type of part: by assembly, by system, by customer, by total. Thus, the standard model for the ``objects of discourse'' in the communications between organizations is inadequate; DRAMA must augment and mediate between the models.
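The inversion from a design view to a logistical view can be illustrated with a minimal Python sketch. It is not the LSAR record layout, nor DRAMA's representation: the `Usage` record, its field names, and the sample part identifiers are all hypothetical, chosen only to show how per-usage design records might be aggregated by part type.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """One design-centered record: a part used at one place in one system.

    All field names are hypothetical; they are not taken from LSAR.
    """
    system: str
    assembly: str
    part_type: str
    quantity: int
    customer: str


def aggregate(usages, by=None):
    """Sum quantities per part type, optionally split by another attribute.

    `by` may be "assembly", "system", or "customer"; None gives the total.
    """
    totals = defaultdict(int)
    for u in usages:
        group = u.part_type if by is None else (u.part_type, getattr(u, by))
        totals[group] += u.quantity
    return dict(totals)


# The same kind of bolt appears in four places in one (made-up) design.
usages = [
    Usage("plane-A", "wing/left", "bolt-X", 12, "service-1"),
    Usage("plane-A", "wing/right", "bolt-X", 12, "service-1"),
    Usage("plane-A", "tail", "bolt-X", 6, "service-1"),
    Usage("plane-A", "landing-gear", "bolt-X", 8, "service-1"),
    Usage("plane-A", "tail", "damper-Y", 2, "service-1"),
]

print(aggregate(usages))               # totals per part type
print(aggregate(usages, "assembly"))   # totals per part type and assembly
```

The design view stores only the individual `Usage` rows; the totals by assembly, system, customer, or overall exist only once something computes them, which is the kind of augmentation the text attributes to DRAMA.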
Even so, the situation for object models is better than that for models of the relevant processes within and across organizations; no such process model exists outside of DRAMA. When is the design organization sufficiently committed to a proposed component that it makes sense for the logistics organization to look at it? What are the steps in verifying a customer's requirements? What factors need to be considered before deferring a procurement?

DRAMA's models are built from a framework that is intended to support evolution of these models. There are three major components to that framework which are relevant to model integration: (1) a formal knowledge representation language for modelling objects and relations; (2) a planner-like language for capturing process knowledge; and (3) an informal, ``semi-structured'' notes system allowing users to capture information not anticipated at design-time.

We believe that a key piece of enabling technology which makes the model-driven approach possible is the presence of a representation language with sufficiently principled semantics to allow the operation of an automatic classifier. This tool efficiently reasons about implications of additions/changes to the knowledge base. The classifier enables a single representation of information to be used for multiple purposes. In particular, its reasoning about the relationship between a given description of a concept and other concepts in its lattice can be used to drive pattern matching, perform retrieval, support truth maintenance, trigger event-driven processing, and ensure the completeness and consistency of extensions as a system is maintained and extended.

Our planner-like language, Scenarios/Agendas, is concerned with multiple activities that extend over long periods of time. A scenario is a plan-like specification of the tasks and sub-tasks which comprise an activity; an agenda is a collected set of such tasks which fit a common description.
The tool provides mechanisms for helping users work through activities defined by scenarios. These mechanisms deal with: dividing labor between the user and system, presenting user tasks, maintaining and presenting agendas of tasks, filtering agendas based on priorities, etc. The subtasks described in a scenario and managed by Scenarios/Agendas can be described in the machine-interpretable planning language, or specified as semi-structured notes. To the extent that a scenario is machine-interpretable, the system will break it down into sub-tasks and execute as much as possible. When sub-tasks are encountered during execution of a scenario which are defined to require user intervention, or which are expressed in notes that the system cannot interpret, Scenarios/Agendas will present them to a user.

Semi-structured notes provide a fallback for users of the system to extend the model. Notes are like formally-represented concepts, in that they have a type and the specification method for their internal structure defines attributes (fields) for each note type. However, a note may -- or may not -- require that all values of those attributes be represented in machine-interpretable form. Thus, the computer can still operate with at least partial understanding, because the notes have some structure to them and are associated with knowledge base entries that the computer can be programmed to understand. This gives the computer a way of knowing about things it can't handle and asking the user for help, thus reducing the brittleness of the model.

What enables that capability is the presence of a hierarchy of note types. Our framework provides a domain-independent model of note categories, which is extended to provide a model for the DRAMA application. An end user can extend that model to describe more specific categories, allowing them to carry the system manually through problems that the formal model doesn't fully cover.
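The division of labor between system and user can be sketched in Python as follows. This is an illustration under stated assumptions, not the Scenarios/Agendas language itself: the `Task` class, its fields, and the functions `run_scenario` and `filter_agenda` are hypothetical names. A task with an `action` stands for a machine-interpretable sub-task; a task with only a `note` stands for one the system cannot interpret and must hand to a user.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Task:
    name: str
    action: Optional[Callable[[], None]] = None  # None: needs a user
    note: str = ""                                # free text shown to the user
    priority: int = 0
    subtasks: list["Task"] = field(default_factory=list)


def run_scenario(task, agenda, executed):
    """Break a scenario into sub-tasks, run what can be run, queue the rest."""
    if task.subtasks:
        for sub in task.subtasks:
            run_scenario(sub, agenda, executed)
    elif task.action is not None:
        task.action()                 # machine-interpretable: execute it
        executed.append(task.name)
    else:
        agenda.append(task)           # not interpretable: present to a user


def filter_agenda(agenda, min_priority):
    """Return the user's pending tasks at or above a priority, highest first."""
    pending = [t for t in agenda if t.priority >= min_priority]
    return sorted(pending, key=lambda t: -t.priority)


# A made-up scenario mixing executable steps and notes.
scenario = Task("review new component", subtasks=[
    Task("look up part in design database", action=lambda: None),
    Task("check for common alternative", note="Ask the supply analyst", priority=2),
    Task("confirm design commitment", note="Phone the design team", priority=1),
])

executed, agenda = [], []
run_scenario(scenario, agenda, executed)
print(executed)
print([t.name for t in filter_agenda(agenda, min_priority=2)])
```

Because a note still has a name, a priority, and a place in the scenario, the system can schedule and present it even though it cannot carry it out; that partial structure is what the sketch is meant to convey.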
The notes that users write in this way can then be collected and considered for guidance in revising the formal model.

RELATIONSHIP TO BROADER APPROACHES: THE KNOWLEDGE SHARING EFFORT

DRAMA's long-term payoffs are maximized to the extent that it can share and reuse standard models of design and logistical management, and provide its own models for sharing and reuse. At issue is not the language in which the models are expressed, but the abstract content of the models: the parts of the knowledge base that define the ``rules of the game'' for describing objects, relations, and constraints in this application domain. DARPA's Knowledge Sharing Effort is attempting to develop technology and infrastructure for sharing and reuse at this level.

The Knowledge Sharing Effort doesn't start with a commitment to a language. We assume that a system developer will pick a language from a number of choices. Our architecture is actually a framework for composing a system. As a result, it is hard to describe the architecture without intertwining a discussion of the development methodology. That development methodology can loosely be characterized as ``first borrow, then build''.

More specifically, we believe that libraries of reusable knowledge-based software components can be established. These components will include knowledge representation systems: i.e., languages, their interpreters/compilers, and eventually management capabilities akin to DBMSs. The libraries will also include ontologies: i.e., top-level representations of abstractions containing sufficient information to lay down the ground rules for modeling a domain. Also in the libraries will be specialized reasoning modules: components that implement particular functions. This last set includes generic reasoners like truth maintenance systems, task-specific reasoners like diagnosers or schedulers, and domain-specific reasoners like circuit simulators.
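As a rough illustration of ``first borrow, then build'', the following Python sketch models a component library holding the three kinds of components named above. It is a sketch under assumptions, not the Knowledge Sharing Effort's actual tooling: the classes `Kind`, `Component`, and `Library`, the function `plan_shell`, and every component name are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    REPRESENTATION_SYSTEM = "knowledge representation system"
    ONTOLOGY = "ontology"
    REASONER = "specialized reasoning module"


@dataclass(frozen=True)
class Component:
    name: str
    kind: Kind
    scope: str = ""  # reasoners only: generic, task-specific, domain-specific


class Library:
    """A catalog of reusable components, indexed by name."""

    def __init__(self):
        self._components = {}

    def register(self, component):
        self._components[component.name] = component

    def borrow(self, name):
        # Returns None when the library has nothing under this name.
        return self._components.get(name)


def plan_shell(library, needs):
    """Split an application's needs into borrowed components and gaps to build."""
    borrowed, to_build = [], []
    for name in needs:
        component = library.borrow(name)
        if component is None:
            to_build.append(name)
        else:
            borrowed.append(component)
    return borrowed, to_build


# A made-up library and a made-up application's list of needs.
library = Library()
library.register(Component("frame-language", Kind.REPRESENTATION_SYSTEM))
library.register(Component("logistics-ontology", Kind.ONTOLOGY))
library.register(Component("truth-maintenance", Kind.REASONER, "generic"))
library.register(Component("scheduler", Kind.REASONER, "task-specific"))

borrowed, to_build = plan_shell(
    library,
    ["frame-language", "logistics-ontology", "scheduler", "inventory-forecaster"],
)
print([c.name for c in borrowed])
print(to_build)
```

Whatever ends up in `to_build` corresponds to the custom knowledge and application code a developer would still have to write after borrowing everything the library offers.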
Our application development methodology calls for a developer to build a specialized shell for their application by taking out components from the library, fleshing out the knowledge base, adding custom application code as needed, and writing software interfaces to any other systems required by the new application.

The generic architecture for which this methodology is intended assumes that an application consists of one or more knowledge-based modules. In the multiple-module case, there is a protocol for encoding messages between modules. (Currently, we assume that messages travel only between modules designed in advance to share ontologies, although we are exploring how to allow recovery in situations where that doesn't hold.) The communications protocol is neutral as to the control structure between modules.

Within any individual knowledge-based module, the architecture calls for a user interface management system to stand between end users and the body of the application. That body is glued together by application-specific software. This links specialized reasoning modules, knowledge representation systems, and conventional software packages. These, in turn, share access to a local knowledge base. That knowledge base is partitioned. At the top level is the module's ontology, composed of one or more ontologies obtained from the library. Below that are one or more components which we call knowledge bases, but which are intended to combine information normally held in knowledge bases with information normally held in databases. Also below the module's ontology in our scheme are things we call ``knowledge agents'': essentially virtual knowledge bases that act as clients and use our communication protocol to treat other modules as servers.

Why do things this way? The Knowledge Sharing Effort's approach is motivated partly by a desire to facilitate sharing and reuse. The systems that need to be built are bigger than any group has the resources to build on its own.
The approach is intended to enable developers to build applications without starting from scratch, but assumes that no single system platform can serve all purposes. Hence, we want an architecture and development model that lets developers configure alternative system architectures that are pre-loaded with some knowledge. However, this is only part of the motivation, and is technically some way down the road. Another motivation is the belief that it is useful and necessary to provide a platform which facilitates experimentation by allowing researchers to explore alternative architectures without having to build whole architectures themselves. The Knowledge Sharing Effort's architecture is an attempt to provide a basis for cumulative research in the design of very large knowledge-based systems. It attempts to do so by trying to be a meta-architecture: it partitions the architecture of knowledge-based systems into parts and talks about how to configure particular knowledge-based system architectures.

Another way of viewing the Knowledge Sharing Effort is that it deals with infrastructure. Our effort seeks to provide a mechanism by which different approaches to knowledge-based system architectures could compete with each other in the marketplace of ideas and, perhaps someday, in a commercial marketplace. To do so, we want to break out different pieces of an overall architecture so they can compete separately.

There are two things that frighten off potential consumers of any technology and thereby shrink its market. One is absence of competition. The other is the cost of commitment. Absence of competition raises the likelihood that undiscovered flaws exist, as well as making consumers dependent on services out of their control. High costs of commitment, inevitable with large comprehensive systems, make it expensive to get in and -- even more frighteningly -- expensive to get out if the choice proves wrong.
In such circumstances, potential consumers hold back from adopting a new technology.

Breaking things up into pieces in the manner advocated by the Knowledge Sharing Effort makes competition easier because competitors have less to do before they can join the game, which allays the first concern. It also makes the cost of commitment or withdrawal more incremental, which allays the second. Ironically, then, a monolithic system broken into pieces might be more attractive to more consumers than the same system presented as a unit -- even if the consumers ended up buying all the individual pieces needed to reconstitute the original.