From sumpter@llnl.gov Thu Feb 24 14:44:54 1994 Status: RO X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["2058" "Thu" "24" "February" "1994" "11:14:12" "PST" "sumpter@llnl.gov" "sumpter@llnl.gov" "<9402241914.AA05980@ocfmail.ocf.llnl.gov>" "44" "" "^From:" nil nil "2" "1994022419:14:12" "" nil nil] nil) Return-Path: Received: from cv3.cv.nrao.edu by fits.cv.nrao.edu (4.1/DDN-DLB/1.5) id AA05619; Thu, 24 Feb 94 14:44:51 EST Received: from ocfmail.ocf.llnl.gov by cv3.cv.nrao.edu (4.1/DDN-DLB/1.13) id AA05613; Thu, 24 Feb 94 14:44:47 EST Received: from [134.9.50.11] (sumpter-mac.ocf.llnl.gov) by ocfmail.ocf.llnl.gov (4.1/SMI-4.0) id AA05980; Thu, 24 Feb 94 11:14:12 PST Message-Id: <9402241914.AA05980@ocfmail.ocf.llnl.gov> From: sumpter@llnl.gov To: 00R0RAVICHAN@BSUVAX1.bitnet, ABEZA.IRSS@mhs.unc.edu, aboulanger@bbn.com, ACD@STARLINK.LEICESTER.AC.UK, AHERBST@mazvm01.vnet.ibm.com, alessio@cs.columbia.edu, anamaria@garnet.berkeley.edu, andy@cyclone.jpl.nasa.gov, arai@is.saga-u.ac.jp, arots@xebec.gsfc.nasa.gov, arun@almaden.ibm.com, asl@nets.lanl.gov, asood@gmuvax.gmu.edu, babell@santafe.edu, bache@esosun.css.gov, badri@rags.rutgers.edu, barry@ipac.caltech.edu, beach@herbarium.bpp.msu.edu, belkin@zodiac.rutgers.edu, bfm@ipac.caltech.edu, bhs@cs.brown.edu, bib@uplvax.jhuapl.edu, bic@ics.uci.edu, biliris@cs.bu.edu, bjacobs@nssdca.gsfc.nasa.gov, BJR@IB.RL.AC.UK, BJR@ibm-b.rutherford.ac.uk, blakeley@csc.ti.com, bmihalas@ncsa.uiuc.edu, bnash@csn.org, bobbie@cs.umd.edu, bobc@eos.hac.com, bober@cs.wisc.edu, bohannon@paul.rutgers.edu, borning@cs.washington.edu, boyd@jpl.nasa.gov, boyno@apollo.montclair.edu, bph@envsci.evsc.virginia.edu, bradshaw@fsl.orst.edu, brown@larix.geo.msu.edu, buckland@otlet.berkeley.edu, buzbee@ncar.ucar.edu, cal@bierstadt.scd.ucar.edu, cal@ncar.ucar.edu, campbell@nssdca.gsfc.nasa.gov, campbell@nssdcb.gsfc.nasa.gov, carey@cs.wisc.edu, cfields@loglady.ninds.nih.gov, chang@cs.pitt.edu, chang@nucsrl.edu, chb@eecs.umich.edu, Cheng_Hsu@MTS.RPI.edu, 
choi@dblvax.umd.edu, chrisman@u.washington.edu, chrisp@gaia.arc.nasa.gov., chrys@cs.umass.edu, churgin@nodc.dnet.nasa.gov, cjoslyn@bingvaxu.cc.binghamton.edu, cjt@cs.arizona.edu, cl3e@andrew.cmu.edu, cohen@fsl.orst.edu, consens@db.toronto.edu, corcoran@heasrc.gsfc.nasa.gov, cova@princeton.edu, csfreds@umcvmb.bitnet, curator@lamont.ldgo.columbia.edu, cwc@princeton.edu, cwinton@william.Unf.edu, daniel@princeton.edu, Daniel_Rehak@KIEL.EDRC.CMU.edu, dave@natasha.jpl.nasa.gov, davids@stsci.edu, dbeech@oracle.com, dbouwer@selvax.sel.bdrdoc.gov, dbrady@ncsa.uiuc.edu, dbworld@cs.wisc.edu, dchilds@jplpds.jpl.nasa.gov, deb@tweety.ipc.Virginia.edu, denny@cray.com, dewitt@cs.wisc.edu, djy@inel.gov, dkingsbu@chablis.gwu.edu, dlittman@gmuvax2.gmu.edu, DLWilliams@GSFCmail.NASA.gov, dman@hugo.geol.scarolina.edu, dmanplus@lternet.edu, dmyers@pldsg3.gsfc.nasa.gov, dozier%crseo@hub.ucsb.edu, drew@objy.com, dsb@cs.utexas.edu, dwells@NRAO.EDU, dwilson@slate.Mines.Colorado.edu, dxp6129@tesla.njit.edu, D_Rotem@lbl.gov, eas@atlantic.jpl.nasa.gov, eichmann@a.cs.wvu.wvnet.edu, elhaddi@cdr.lter.umn.edu, elkan@cs.UCSD.edu, elmasri@cse.uta.edu, elton@astro.ufl.edu, embley@bunsen.cs.byu.edu, emv@msen.com, erccil@vegas1.las.epa.gov, es4e@a.ocfmail.ocf.llnl.gov Date: Thu, 24 Feb 94 11:14:12 PST ndrew.cmu.edu, eversole@voyager.jpl.nasa.gov, faloutsos@cs.umd.edu, farris@stsci.edu, fayyad@aig.jpl.nasa.gov, fbretherton@vms.macc.wisc.edu, FCSILLAG@SUNRISE.ACS.SYR.edu, ferris@delocn.udel.edu, fertig@cs.yale.edu, fgiovane@nasamail.nasa.gov, fiorino@typhoon.gsfc.nasa.gov, fischler@ai.sri.com, fm02@gte.com, fox@vtopus.cs.vt.edu, frank@magnus.acs.ohio-state.edu, freeston@ecrc.de, french@virginia.edu, frew@cs.berkeley.edu, frew@hub.ucsb.edu, frew@ucsb.edu, frithsen.jeff@epamail.epa.gov, fung@ccrs.emr.ca, futrell@corwin.CCS.Northeastern.edu, gary@darwin.life.uiuc.edu, gary@pldsa1.arc.nasa.gov, geller@vienna.njit.edu, geohal+@osu.edu, george@lenti.med.umn.edu, gharrison@BBN.com, gio@earth.stanford.edu, 
glenn@loch.mit.edu, Glenn@pimms.mit.edu, goodrich@cs.jhu.edu, graefe@cs.colorado.edu, grbr@hp850.mbari.org, green@nssdca.gsfc.nasa.gov, grosky@cs.wayne.edu, grossman@lac.math.uic.edu
From: sumpter@llnl.gov (Robyne Sumpter)
X-Sender: sumpter@ocfmail.ocf.llnl.gov
Subject: Metadata Maillist

Greetings,

Your participation in other news groups suggests you might be
interested in participating in a maillist to discuss metadata related
issues. Your name was provided to me by Jim French.

metadata@llnl.gov (soon to be changed to ieee+metadata@llnl.gov)
provides a forum to discuss work from a series of IEEE Mass Storage
Systems and Technology Committee sponsored workshops on metadata and
data management issues. We welcome participation by others involved
in related activities.

If you would like to be included on the metadata maillist, please send
a request to metadata-request@llnl.gov.

Regards
Robyne Sumpter

===========================================================
Robyne M. Sumpter                 sumpter@llnl.gov
Lawrence Livermore Laboratory     Phone: (510) 423-5054
P.O. Box 808 L-60                 Fax: (510) 423-8715
Livermore, CA 94550
===========================================================

From sumpter@llnl.gov Thu Feb 24 19:17:31 1994
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["646" "Thu" "24" "February" "1994" "16:17:24" "PST" "Robyne Sumpter" "sumpter@llnl.gov" "<9402250017.AA11542@ocfmail.ocf.llnl.gov>" "21" "Re: Subscribe" "^From:" nil nil "2" "1994022500:17:24" "Subscribe" nil nil] nil)
Return-Path:
Received: from ocfmail.ocf.llnl.gov by fits.cv.nrao.edu (4.1/DDN-DLB/1.5) id AA07023; Thu, 24 Feb 94 19:17:29 EST
Received: from [134.9.50.11] (sumpter-mac.ocf.llnl.gov) by ocfmail.ocf.llnl.gov (4.1/SMI-4.0) id AA11542; Thu, 24 Feb 94 16:17:24 PST
Message-Id: <9402250017.AA11542@ocfmail.ocf.llnl.gov>
X-Sender: sumpter@ocfmail.ocf.llnl.gov
From: sumpter@llnl.gov (Robyne Sumpter)
To: dwells@fits.CV.NRAO.EDU (Don Wells)
Subject: Re: Subscribe
Date: Thu, 24 Feb 94 16:17:24 PST

Welcome to the metadata maillist.

If you have any questions or problems please send mail to:
    metadata-request@llnl.gov

All other mail should be sent to:
    metadata@llnl.gov
for distribution to the entire list.

Regards
Robyne

===========================================================
Robyne M. Sumpter                 sumpter@llnl.gov
Lawrence Livermore Laboratory     Phone: (510) 423-5054
P.O. Box 808 L-60                 Fax: (510) 423-8715
Livermore, CA 94550
===========================================================

From dale@convex1.convex.com Mon Feb 28 14:04:29 1994
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["4792" "Mon" "28" "February" "1994" "12:42:46" "-0600" "Dale Lancaster" "dale@convex1.convex.com" "<9402281842.AA00497@convex1.convex.com>" "113" "DMIG information" "^From:" nil nil "2" "1994022818:42:46" "DMIG information" nil nil] nil)
Received: from cv3.cv.nrao.edu by fits.cv.nrao.edu (4.1/DDN-DLB/1.5) id AA24991; Mon, 28 Feb 94 14:04:28 EST
Received: from ocfmail.ocf.llnl.gov by cv3.cv.nrao.edu (4.1/DDN-DLB/1.13) id AA06639; Mon, 28 Feb 94 14:04:24 EST
Received: from pierce.llnl.gov by ocfmail.ocf.llnl.gov (4.1/SMI-4.0) id AA13487; Mon, 28 Feb 94 10:42:39 PST
Received: by pierce.llnl.gov (4.1/LLNL-1.18/llnl.gov-05.92) id AA01242; Mon, 28 Feb 94 10:43:56 PST
Return-Path:
Received: from convex.convex.com (convex-inet.convex.com) by pierce.llnl.gov (4.1/LLNL-1.18/llnl.gov-05.92) id AA01216; Mon, 28 Feb 94 10:43:51 PST
Received: from convex1.convex.com by convex.convex.com (5.64/1.35) id AA26098; Mon, 28 Feb 94 12:40:02 -0600
Received: by convex1.convex.com (5.64/1.28) id AA00497; Mon, 28 Feb 94 12:42:46 -0600
Message-Id: <9402281842.AA00497@convex1.convex.com>
From: dale@convex1.convex.com (Dale Lancaster)
To: metadata@llnl.gov
Subject: DMIG information
Date: Mon, 28 Feb 94 12:42:46 -0600

As promised at the workshop, below is some information about the DMIG
work. The first is an overview (somewhat dated) and below that is a
list of the directories stored at acsc.com/pub/dmig where you can
retrieve anything you want to know about this group and its work.

regards,
dale

------- Start of forwarded message -------
From: dale@hydra.convex.com (Dale Lancaster)
To: dale@hydra.convex.com
Subject: DMIG info at acsc.com /pub/dmig
Date: Mon, 28 Feb 94 11:50:57 -0600

The Uniform Resource Locator for this document is:
file://acsc.com/pub/dmig/admin_doc/overview.doc

                Data Management Interfaces Group
                        An Introduction
                           July 1993
                          Revision 1.1

The Data Management Interfaces Group (DMIG) is an ad-hoc initiative
representing over 30 operating system vendors, data storage vendors,
and data management software vendors who are cooperatively developing
and promoting extensions to the UNIX operating system to better enable
data management applications.

The goal of the DMIG is to design a new operating system interface to
provide better support for a well defined set of filesystem management
applications: file migration, file backup and recovery, file
compression, file encryption, and directory browsing. These
applications are all characterized by their need for a set of
primitives (to monitor and control the usage of files) that are not
adequately provided by UNIX computing platforms today. Many file
management products must provide OS modifications to operate, or must
live with undesirable limitations. The goal of the DMIG specification
is to permit file-based data management applications to be developed
by software vendors and installed by customers just like ordinary
applications, without requiring modifications to the underlying
operating system.

A successful DMIG effort will provide positive results for end users
and vendors alike. End users will be able to deploy data management
products in a tight and seamless fashion, without concern for
compatibility with operating system revisions or other operating
system level software packages. End users will further find a wider
variety of products to choose from, each with a larger set of features
and functions.
Vendors will be able to refocus their product development efforts on
value-added technology, investing in new product functionality rather
than operating system or file system level technology. Vendors will
further find the cost of development lower, with improved time to
market capabilities and lower support costs. In all, the market should
expand by being able to reach more customers with better products, at
lower cost.

The group intends to reach agreement on a common specification by Q4
1993. This specification will then be immediately available to UNIX
vendors to implement in their next releases. Initial DMIG-enabled
operating systems could appear as early as late 1994 and into early
1995, with the majority of releases in 1995 and into 1996.

Participating vendors in the DMIG currently include ACSC, Advanced
Software Concepts, Amdahl, Auspex, Advanced Archival Products, AT&T
Comvault, Banyan, Bull, Cray Research, Delta Microsystems, Digital,
Epoch Systems, E-Systems, Fujitsu, Hewlett Packard, Hitachi Computer
Products, IBM, Lachman Technologies, Legato Systems, Motorola, NEC,
NETstor, Novell (USL), OpenVision, QStar, Raxco-UIS, SCO, Silicon
Graphics, StorageTek, SunSoft, Transarc and Veritas.

The DMIG encourages end users and vendors to get involved. End users
can best support the DMIG efforts by letting their vendors-of-choice
know that the DMIG interface is important, and by requesting a
statement of direction regarding the DMIG. Vendors are welcome to
participate in the DMIG. Contact the DMIG via electronic mail at
dmig-request@epoch.com.

===============================================================================
From: dale@hydra.convex.com (Dale Lancaster)
To: dale@hydra.convex.com
Subject: DMIG info at acsc.com /pub/dmig
Date: Mon, 28 Feb 94 11:50:15 -0600

The Uniform Resource Locator for this document is:
file://acsc.com/pub/dmig/README

The following directories exist at the top level:

admin_doc       High level dmig administrative documents
auspex-epoch    Contains various documents describing an interface
                proposal designed to meet DMIG requirements.
if_doc          Versions of the interface docs
mail_archive    Archive for all dmig mail, includes requests for info
                and reflector traffic
meeting_min     Minutes of various DMIG meetings
req_doc         Versions of the requirements documents
vnode_stacking  Dir containing background material on vnode stacking.

From dale@convex1.convex.com Mon Feb 28 15:10:20 1994
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["2993" "Mon" "28" "February" "1994" "13:51:57" "-0600" "Dale Lancaster" "dale@convex1.convex.com" "<9402281951.AA02296@convex1.convex.com>" "79" "MDDS workshop notes" "^From:" nil nil "2" "1994022819:51:57" "MDDS workshop notes" nil nil] nil)
Received: from cv3.cv.nrao.edu by fits.cv.nrao.edu (4.1/DDN-DLB/1.5) id AA25111; Mon, 28 Feb 94 15:10:19 EST
Received: from ocfmail.ocf.llnl.gov by cv3.cv.nrao.edu (4.1/DDN-DLB/1.13) id AA07170; Mon, 28 Feb 94 15:10:17 EST
Received: from pierce.llnl.gov by ocfmail.ocf.llnl.gov (4.1/SMI-4.0) id AA15028; Mon, 28 Feb 94 11:51:38 PST
Received: by pierce.llnl.gov (4.1/LLNL-1.18/llnl.gov-05.92) id AA10487; Mon, 28 Feb 94 11:52:55 PST
Return-Path:
Received: from convex.convex.com (convex-inet.convex.com) by pierce.llnl.gov (4.1/LLNL-1.18/llnl.gov-05.92) id AA10475; Mon, 28 Feb 94 11:52:52 PST
Received: from convex1.convex.com by convex.convex.com (5.64/1.35) id AA29776; Mon, 28 Feb 94 13:49:01 -0600
Received: by convex1.convex.com (5.64/1.28) id AA02296; Mon, 28 Feb 94 13:51:57 -0600
Message-Id: <9402281951.AA02296@convex1.convex.com>
From: dale@convex1.convex.com (Dale Lancaster)
To: metadata@llnl.gov
Subject: MDDS workshop notes
Date: Mon, 28 Feb 94 13:51:57 -0600

I had sent these notes out once before, but evidently our internal
mailer hosed it up when we had a mailer problem a couple weeks back.
Anyway, these are my notes from the Massive Digital Data Workshop for
the Metadata subgroup.

regards,
dale

Below are some notes I took from the Massive Digital Data Systems
Workshop held recently in Reston, VA. This workshop was sponsored by
the Intelligence Community of the DoD and focused primarily on issues
related to access to data. There were several sub-groups. Many of us
that were involved in the IEEE Workshop were in the Metadata Working
Group at this workshop. Below are some notes from that workshop.

First, it started out being much like our first workshop with the IEEE
in that there were lots of different needs and opinions. There never
really was agreement on what metadata was/is/will be. The net result
of the workshop was a list of "problems" that need to be addressed in
some fashion. This was fed into the main MDDS work session, which
helps the sponsors of the workshop determine what projects
should/could be funded to help get them solved.

The first area covered was the types of uses for metadata (in hopes of
trying to figure out how to define metadata :-):

- Optimize access to data
- Manage data
- Process data
- Interpret data
- Classify data
- Store/retrieve data
- Scalability (not sure what we meant by this in terms of usage)
- Complexity of data (relationships, users, elements, etc)

The problems that need to be solved for metadata/access to data
include those listed below. This list was prioritized, but I missed
out on that part of the meeting. Maybe Robyne Sumpter, Anne Wheeler or
Carol Hunter has the prioritized list.

- Automation of the derivation of relationships in data and metadata
- Defining a structure/method/standard for how metadata is created and
  defined (I like the MIB approach suggested by Robyne in the
  whitepaper).
- Defining an API to applications and storage (and maybe other major
  functional areas).
- Presentation of metadata. It was emphasized that it be based on how
  users naturally would access the data based on the application they
  are using.
- Tools for metadata creation and management
- How to evolve a metadata system (especially from legacy systems to
  state-of-the-art systems)
- Automatic generation of metadata
- Optimization (some wanted to say parallelization, but the real issue
  is how to make it fast and efficient)

In summary, it was clear that there is lots of interest in solving
major data access problems. There is still the natural split that we
saw in our first workshop between applications and storage. One group
of users is interested in how applications use/manage metadata from
an application perspective; the other group is interested in how a
storage system would use/manage metadata. A Metadata Reference Model
should help clear this up and show how these two areas need to be
married.

Regards,
Dale Lancaster

From dale@convex1.convex.com Mon Feb 28 17:51:22 1994
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["20130" "Mon" "28" "February" "1994" "16:28:54" "-0600" "Dale Lancaster" "dale@convex1.convex.com" "<9402282228.AA05418@convex1.convex.com>" "473" "Metadata Workshop Meeting Meetings for 17-18 Febuary 1994" "^From:" nil nil "2" "1994022822:28:54" "Metadata Workshop Meeting Meetings for 17-18 Febuary 1994" nil nil] nil)
Received: from cv3.cv.nrao.edu by fits.cv.nrao.edu (4.1/DDN-DLB/1.5) id AA25512; Mon, 28 Feb 94 17:51:20 EST
Received: from ocfmail.ocf.llnl.gov by cv3.cv.nrao.edu (4.1/DDN-DLB/1.13) id AA08748; Mon, 28 Feb 94 17:51:16 EST
Received: from pierce.llnl.gov by ocfmail.ocf.llnl.gov (4.1/SMI-4.0) id AA17426; Mon, 28 Feb 94 14:29:05 PST
Received: by pierce.llnl.gov (4.1/LLNL-1.18/llnl.gov-05.92) id AA26173; Mon, 28 Feb 94 14:30:22 PST
Return-Path:
Received: from convex.convex.com (convex-inet.convex.com) by pierce.llnl.gov (4.1/LLNL-1.18/llnl.gov-05.92) id AA26130; Mon, 28 Feb 94 14:30:14 PST
Received: from convex1.convex.com by convex.convex.com (5.64/1.35) id AA06180; Mon, 28 Feb 94 16:26:22 -0600
Received: by convex1.convex.com (5.64/1.28) id AA05418; Mon, 28 Feb 94 16:28:54 -0600
Message-Id: <9402282228.AA05418@convex1.convex.com>
From: dale@convex1.convex.com (Dale Lancaster)
To: metadata@llnl.gov
Subject: Metadata Workshop Meeting Meetings for 17-18 Febuary 1994
Date: Mon, 28 Feb 94 16:28:54 -0600

Greetings to the reflector folks. Below are the minutes from our last
meeting. My PC could not recover a portion of the file I had stored
the minutes in. But I did take some written notes that I think may
have filled in the gaps. Please review and let me know what I missed
or should be corrected. After a few days of collecting changes, I will
repost to the reflector and to other internet places that make sense.

regards,
dale

              Metadata Workshop Meeting Minutes
                   17-18 February 1994

This workshop was the second in a series focused on understanding and
defining the "metadata" or data access problem. The goal is to create
awareness of the problem and/or standards for how to deal with it.
The workshop was held at the Center for High Performance Computing
(CHPC) in Austin, Texas on 17 and 18 February 1994. Jim Almond (CHPC),
Dale Lancaster (Convex) and Otis Graf (IBM FSC) co-led the workshop.
The workshop was sponsored by the IEEE Mass Storage Systems and
Technology Committee (MSS&TC) as part of the Data Management
Technology sub-committee.

The minutes include the following categories:

A) Action Items from the meeting
B) Attendance List
C) IEEE Structure
D) Metadata White Paper
E) Metadata Types and Taxonomy
F) Metadata Reference Model
G) Next Workshop in May on the Reference Model
H) Other related metadata projects/items
I) Miscellaneous Notes
J) Who are the attendees and their reasons for attending

A) Action Items from the meeting

* Kick off an email discussion on the metadata@llnl.gov reflector to:
  - Suggest agenda items/structure for the upcoming Metadata Reference
    Model workshop in May.
  - Define the scope of a Metadata Reference Model
  - Define the functional requirements of a Metadata Reference Model
  The intent is to have various members of the metadata reflector
  propose various options and definitions for the above and have them
  debated electronically. There will be a need at some point to
  "moderate" and/or conclude the discussions before the workshop on
  the Metadata Reference Model to be held in May.

* Update the Metadata whitepaper to include the various thoughts and
  conclusions from this workshop. Robyne Sumpter will do this.

* Set up an Xmosaic master home page for Metadata. This home page
  can/will point to other sources of information relating to the
  Metadata effort.
  The intent is to create a clearing house for tracking all the
  various technologies and efforts related to data access and the
  metadata problem. Robyne Sumpter and Ron Pfaff will work on this.
  It was also suggested that a general message be sent to the internet
  inviting other organizations to feed their information/work on
  metadata into this system.

* Several people wanted information on the DMIG (Data Management
  Interface Group) effort, which is defining a standard system call
  interface for handling low level file migration requirements. Dale
  Lancaster will post this information to the reflector.

* Create a "real" effort to attempt to solve the metadata access
  problem in a practical way. Several people were interested in
  getting together to do this. This would not be part of the IEEE, but
  would be consistent with the metadata effort under it.

B) Attendance List

Name              Organization       Email                      Phone
===============================================================================
Jim Almond        CHPC               j.almond@chpc.utexas.edu   512-471-2442
Dale Lancaster    Convex             dale@convex.com            214-497-4581
Otis Graf         IBM FSC            ofgraf@clearlake.ibm.com   713-282-8216
Roy Skruggs       Triada             triada@middlec.convex.com  404-951-5493
Mike Daily        Mobil R&D          midaily@dal.mobil.com      214-851-8836
Robyne Sumpter    LLNL               sumpter@llnl.gov           510-423-5054
Carol Hunter      LLNL               chunter@llnl.gov           510-422-1657
Becky Springmeyer LLNL               springme@llnl.gov          510-423-0794
Ann Wheeler       Britton Lee        ann@bli.com                408-370-1400
Lynn Wheeler      Britton Lee        lynn@bli.com               408-370-1400
Tony Baraghimian  Hughes             gab@mitchell.hitc.com      703-759-1392
Parris Caulk      Loral              pcaulk@eos.hitc.com        301-925-0610
Charles Dollar    National Archives  71162.3600@CompuServe.COM  301-713-7076
William Farrell   SAIC               farrell@gso.saic.com       619-458-2645
Gregory Jirak     Xidak              greg@xidak.com             415-855-9271
Maria Zemankova   MITRE              mzemanko@mitre.org         703-883-6217
Michael Josephs   MITRE              mjosephs@ciis.mitre.org    703-883-6567
Ron Pfaff         LANL               rtp@lanl.gov               505-667-8182

C) IEEE Structure

Bob Coyne of IBM Federal Systems, representing the IEEE Mass Storage
Systems and Technology Committee (MSS&TC), reviewed how this workshop
and its participants fit into the IEEE MSS&TC structure. This workshop
is known as the "Metadata Project", which is coordinated by Jim, Dale
and Otis. This project comes under the "Data Management Project
Committee", made up of Bob Coyne (Chair), Otis Graf (Co-chair) and
Dale Lancaster. This committee is responsible for coordinating and
sponsoring workshops and projects under the IEEE MSS&TC that deal with
data access and management, with particular interest from a storage
systems perspective.

This committee had, at one time, created another workshop effort
called "Data Management Workshop" that was originally to deal with
large scientific databases and how to handle them in regards to a
storage system. The "Data Management Workshop" coordination committee
met and decided that their interests were identical to those of this
metadata workshop series. As a result they have decided to hold a
workshop in May to deal with a "Metadata Reference Model". It is not
clear exactly how these two efforts should merge. According to
discussion at this workshop, the participants felt that they should
play a major role in deciding the agenda and structure of such a
workshop.

Bob noted that the IEEE MSS&TC wants a common set of terms to talk
about metadata and an intellectual framework in which to discuss such
solutions. Such models take on four forms: whitepaper, guide,
recommended practice or standard. The MSS&TC wants to provide terms
and a framework and then recommend where standards work should be
done. (The MSS&TC cannot actually work on, or sponsor, standards.)
Standards may then go over to ANSI and/or the Storage Systems
Standards Working Group (P1244 effort) or another place in IEEE. Bob
stated that it would be desirable to avoid the problem of creating an
implied standard such as was done when the Mass Storage Reference
Model was created.
He also suggested that we not concentrate on metadata just in terms of
a filesystem, since other storage methods and structures could exist
that will need a metadata access solution.

Washington D.C. Workshop - Francis Bretherton will chair this
workshop. He has a reference model for metadata and wants to review
and change it. Maybe have alternatives. Who should be invited to this?
Will cost $300 to get in, 2-3 days, meals, small groups, assignments.
Let Otis or Dale know if interested.

D) Metadata White Paper

Robyne Sumpter outlined her paper to the group. Major items:

* At least two types of metadata: application and filesystem.
  It was noted that other major types of metadata exist. It was
  suggested that another way to split it would be "system" and
  "non-system (application, user, etc)" metadata.

* Robyne showed a block diagram of interfaces to a storage system.
  It was recommended that the "filesystem" and "data management"
  logical blocks shown be interlaced rather than layered on top of
  each other.

* Need to define the type and structure of metadata for applications
  - promote portability
  (some discussion on abstractions)
  (more discussion on metadata is data as well and how to handle this)
  (be concerned about relationships between data)

* SNMP model for metadata
  - may need meta agent (things that describe the agent)
  - how to handle relationships between metadata
  - applications give meaning to data (can this be determined from a
    bottom up approach?).

Robyne will update this whitepaper based on input received from the
workshop.

E) Metadata Types and Taxonomy

Jim Almond reviewed his whitepaper on Collection of Metadata Types and
Taxonomy.

* Two types of metadata
  - metadata describing the informational entity itself
    - abstract data type (What is this? book, image, matrix)
    - informational essence (What basic information does it give me?)
    - relationships between the entity and other entities
  - metadata pertaining to the storage and use of the informational
    entity
    - pertaining to storage and replication of entities on physical
      media (computer and non-computer)
    - metadata attributes recording or controlling the use of
      entities.

* Went through an example of the above, using the definition of what a
  matrix of numbers is and how it is represented.

F) Metadata Reference Model

The group discussed the merits and possible structure of a Metadata
Reference Model. There was agreement that there is a need for some
type of Metadata Reference Model. At a minimum it would help develop
an agreed-upon terminology to describe the problem of metadata and
intelligent access to data. It would also establish an intellectual
framework within which new ideas could be developed.

Several graphical diagrams were proposed during the meeting about how
this model could appear and/or what its boundaries are in a computer
system. There was some agreement that a common set of APIs that
define the interaction of applications and system software with a
metadata system is needed. One of the major concerns is how to handle
"multiple views" of data. That is, the ability for a metadata and
storage system to easily provide for different users to define and
access the same data from different perspectives and requirements.

There was no clear agreement about the scope or definition of the
model. It was agreed to debate this electronically via the metadata
reflector (metadata@llnl.gov) and to feed the results into the May
workshop.

G) Next Workshop in May on the Reference Model

Bob Coyne announced this workshop will take place in May (see
announcement later on in this section). He has suggested that a
primary theme is to review a version of a Metadata Model from Francis
Bretherton (who will be chairing this meeting).
Everyone agreed that this workshop and its participants should be
heavily involved and provide as much input as possible. The group
agreed to debate/gen up a proposed agenda for the workshop and submit
that to Francis. We agreed that it is desirable not to make one
person's view of the model the theme or focus of the workshop. It is
hoped that debate of various models can be done electronically on the
reflector and the results of that be discussed at the May workshop.

The announcement for this workshop is:

From: coyne@vnet.ibm.com
To: metadata@llnl.gov
Subject: IEEE Metadata Workshop, May 16-18, Wash DC
Date: Mon, 21 Feb 94 15:43:30 CST

IEEE MSS&TC plans to hold a Metadata workshop in the Washington area
May 16-18, 1994. Francis Bretherton is the program chair. We are
soliciting input for the workshop and inviting folks to volunteer to
support the workshop. Attendance may be limited to 50 persons
(combination of invitation and open registration for IEEE members).
Registration fee target is $300.00.

One topic of the workshop will be to discuss and critique Francis'
Strawman Reference Document for Metadata. Another topic of interest is
agreement on a suite of standards and public protocols required to
ensure stewardship of data over a long period (decades). Other topics
are solicited. We will probably have 4-5 separate working groups of
8-10 folks. We are looking for recommendations for the invitation
list.

It is not clear what role the folks attending the UT Austin Workshop
and using this reflector desire through the IEEE MSS&TC Data
Management committee efforts. Hopefully that will sort itself out over
the next couple of workshops. Participants of the UT Austin Workshop
and this reflector are welcome to participate in the May 16th
workshop. Send comments and questions to me or Otis Graf.

Regards,
Bob
Chair, IEEE MSS&TC Data Management Committee

H) Other activities:

- Massive Digital Data Workshop held in Reston, VA this past January.
  Dale Lancaster gave out copies of the notes from the subgroup at
  this workshop that dealt with Metadata. (These notes were also sent
  to the reflector.) The prioritization of the metadata problems
  identified by the MDDS group is:

  1. Presentation of metadata. It was emphasized that it be based on
     how users naturally would access the data based on the
     application they are using.
  1. Defining a structure/method/standard for how metadata is created
     and defined (I like the MIB approach suggested by Robyne in the
     whitepaper).
  2. Automation of the derivation of relationships in data and
     metadata
  2. Defining an API to applications and storage (and maybe other
     major functional areas).
  3. Automatic generation of metadata
  4. How to evolve a metadata system (especially from legacy systems
     to state-of-the-art systems)
  5. Tools for metadata creation and management
  6. Optimization (some wanted to say parallelization, but the real
     issue is how to make it fast and efficient)

- POSC - PetroTechnical Open Software Corp. Mike Daily gave an
  overview of this effort in the petroleum industry. Notes from his
  presentation:

  AAPG - American Association of Petroleum Geologists. They establish
  recommended practices for petroleum geologists. One of those is a
  standard data format called RP-66, a self-describing data format.

  SEG - Society of Exploration Geophysicists.
  SEG has established two types of tape formats:
    SEG-Y   - current tape format which is header+data
    SEG-DEF - RP-66 flavored format
    RODE    - Record Oriented Data Encapsulation

  POSC - PetroTechnical Open Software Corporation
  * 4 technical tracks:
    - base standards (C, POSIX, MOTIF, etc)
    - GUI Style Guide (superset of Motif)
    - Data Access
      - is hybrid object-relational
      - pre-standard SQL3 (UniSQL, HP OpenODB)
      - PEF POSC Exchange Format extended RP-66
    - Data Model
      - is object oriented
      - maintained in Express language
      - 100's of entities with multiple inheritance and complex data
        types (vector, array, etc)
      - relational projection > 900 tables
      - lots of reference entities - 'legal values'
  * 2 implementation tracks
    - Migration: tools, methods, policies, procedures for data
      load/validation, application (re)development, etc
    - Application Views: process-specific application "packages".

  A graphical view of the above was presented. Roughly drawn:

    User        DBA        Programmer         Migration
    A p p l i c a t i o n   V i e w s         Migration
    U s e r   I n t e r f a c e               Migration
    D a t a   A c c e s s                     Migration
    API and Exchange Format                   Migration
    Logical Model and Base Standards

- NEONS (Naval Environ. Operation Nowcasting system)

- Sequoia - William Farrell gave an overview of the effort.
  - goal is to develop information management and presentation
    applications to handle global change research issues.
  - Mainly sponsored by DEC and several researchers.
  - systems approach to a real world problem.
    - storage systems
    - filesystems
    - database systems
    - WAN protocols
    - visualization of data
  - goal to have 10 TB of accessible data by Jun 1994; not going to
    meet this goal
  - continuing series of technical reports available from Berkeley

- Visualization '93 subgroup on metadata

- MetaStore - CHPC effort reviewed by Jim Almond. Access to unix files
  based on descriptive metadata. Files in a HSM. User interface is
  GUI. Gain experience with Informational Entities. Currently used
  only as an experiment.
  No real data is used in this system. Uses Abstract Data Types (ADTs) to classify the different types of data used.

- DMIG - Data Management Interface Group - Dale Lancaster gave a brief intro and stated he would post detailed information to the reflector. This group is defining a common operating system interface for system applications that implement file migration.

I) Miscellaneous Notes

  * There was the usual discussion about "data is metadata is data is metadata". I think we all agreed that someone's data is someone else's metadata and vice versa. [So maybe we should just say that all data is metadata?? :-].

  * There was a small push to simply pay a person/grad student to go off and create a "standard" model, which we would all hash out afterwards. This has some merit, but it was agreed that it wasn't really worthwhile.

  * Several people expressed an interest in getting together to "just do it". That is, to go ahead and create some kind of practical system from various parts, in order to flesh out the problems and ideas of a metadata-based data access and storage system.

  * Another issue discussed was how to get metadata forms filled in properly by users. Users generally don't like to fill in anything. Solutions ranged from automatically generated metadata to "if no metadata, then the data is deleted/not stored".

  * There still seemed to be a need to define broad classes of metadata. This will have to be discussed/addressed in the metadata reference model.

  * There was a discussion about having classes instead of objects, and the question of "Why not objects to define data types?", along with the associated question of where to store the "algorithms and methods" associated with the data. This was not really resolved.

J) Who are the attendees and their reasons for attending

  Bob Coyne - IBM Federal Systems - Representing IEEE MSS&TC and how it sees this workshop.
  Otis Graf - IBM Federal Systems - Representing IEEE MSS&TC Data Management Project Committee.
    Also interested in how to link a metadata RDBMS to a mass storage system, in beginning to classify information entities, and in research on tools for users and applications.
  Dale Lancaster - Interested in gathering requirements for metadata-based products. Also interested in helping move things along in this area so that these products can be built with standard interfaces.
  Jim Almond - CHPC - Information Density and Abstract Data Types.
  Ron Pfaff - LLNL - Tying different user groups together for virtual teaming.
  Greg Jirak - Aurora Data management system - structure of scientific data. Interested in self-defining/extensible data structures.
  Tony Baraghimian - Hughes - interested in machine learning. Accessing petabytes of data.
  Carol Hunter - LLNL - Intelligent Archive - how to search and browse large amounts of data.
  Becky Spinmeyer - LLNL - system level metadata, application metadata and a "middle level".
  Robyn Sumpter - LLNL - Interested in metadata from a storage perspective.
  Anne Wheeler - BLI - new DBMS to help deal with metadata access. Using a top down approach to handling the metadata problem.
  Roy Skruggs - Triada - N-Gram transform for data and intelligent pattern matching.
  Parris Caulk - Loral - NASA EOS - 3 TB data/day - have lots of data with different formats - "metadata" is a bad term - scientists don't want to be restricted in their search, and metadata seems to imply that. Interested in content based searching.
  William Farrell - SAIC - Global Change Data Sets - Sequoia 2000 project.
  Mike Daily - Mobil - Petabyte of data - involved in POSC - similar to OSF. POSC is trying to set standards for data models and access.
  Maria Zemankova - MITRE - Database people need to work hand in hand with scientific people to solve this problem. (Our customer is the end user, not the standard itself.) Don't want to be worse off than we are now because of standards.
  Charles Dollar - National Archives - Guidance to federal agencies to ensure information is stored, used, and processed in such a way that it will be accessible for the long term.

Respectfully Submitted,
Dale Lancaster

From coyne@vnet.ibm.com Mon Feb 28 21:43:04 1994
Status: RO
X-VM-v5-Data: ([nil nil nil nil nil nil nil nil nil] ["2875" "Mon" "28" "February" "1994" "19:30:44" "CST" "coyne@vnet.ibm.com" "coyne@vnet.ibm.com" "<9403010233.AA12105@pierce.llnl.gov>" "57" "IEEE Metadata Workshop" "^From:" nil nil "2" "1994030101:30:44" "IEEE Metadata Workshop" nil nil] nil)
Received: from cv3.cv.nrao.edu by fits.cv.nrao.edu (4.1/DDN-DLB/1.5) id AA25728; Mon, 28 Feb 94 21:43:03 EST
Received: from ocfmail.ocf.llnl.gov by cv3.cv.nrao.edu (4.1/DDN-DLB/1.13) id AA10010; Mon, 28 Feb 94 21:43:01 EST
Received: from pierce.llnl.gov by ocfmail.ocf.llnl.gov (4.1/SMI-4.0) id AA20433; Mon, 28 Feb 94 18:32:28 PST
Received: by pierce.llnl.gov (4.1/LLNL-1.18/llnl.gov-05.92) id AA12123; Mon, 28 Feb 94 18:33:45 PST
Return-Path:
Received: from vnet.IBM.COM by pierce.llnl.gov (4.1/LLNL-1.18/llnl.gov-05.92) id AA12105; Mon, 28 Feb 94 18:33:40 PST
Message-Id: <9403010233.AA12105@pierce.llnl.gov>
Received: from HOUVMSCC by vnet.IBM.COM (IBM VM SMTP V2R2) with BSMTP id 8615; Mon, 28 Feb 94 21:29:18 EST
From: coyne@vnet.ibm.com
To: dale@convex1.convex.com, metadata@llnl.gov
Subject: IEEE Metadata Workshop
Date: Mon, 28 Feb 94 19:30:44 CST

Re: Dale's IEEE Metadata Workshop note

I think it would be very useful for the folks on this reflector to debate and discuss what should be on the agenda for the May 16-18 meeting. I am not on the reflector yet, so I'll have to wait for the results of the debate. Please send them to me asap for consideration by the program committee. I have not seen the minutes from the Austin meeting either; I need a copy.

We would like to have continuity between the two Austin meetings and the other planned workshops. So I expect that some folks would volunteer to provide that continuity.
Volunteers, give me a call! We are still looking for volunteers to be on the program committee for the May 16-18 meeting. NASA GSFC has volunteered to help with logistics. Thank you, NASA GSFC.

It seemed to me that some attendees of the Austin meeting didn't know the IEEE MSS&TC planned to hold a series of workshops with co-hosts around the country. This decision was made in January 1993 after an organizing meeting was held at Scripps. A second organizing meeting was held in April 1993 in Monterey; we committed to a workshop with ORNL (and later Francis Bretherton). At that time, Jim Almond (UT), Bob Grossman (UIC), Manuel Vigil (LANL), and Dick Watson (LLNL/NERSC) volunteered to host meetings. The ORNL Workshop was to be the first, but Jim Almond et al. jumped in with welcome (and less formal) workshops.

From some of the recent comments, some folks didn't know that IEEE MSS&TC planned to have workshops with various groups. Some also felt we (IEEE and/or me) were not considerate of what the Austin group was trying to accomplish. Frankly, I was just implementing the plan approved by MSS&TC last year. Sorry if folks were not informed of the plan or if IEEE plans were different from what was expected.

Any group that desires to have an IEEE MSS&TC sponsored "standing specialist workshop" or "standing project committee" should send the following information to me:

  1) Scope of project
  2) Purpose of project
  3) Charter statement

It seems the disconnect is that we (IEEE MSS&TC) don't see the two meetings in Austin as being a "standing committee". Not that it would be a bad idea; no one has requested it. It never hurts to ask for something. If you want to form a "standing committee", then request one! The only request that we have received with respect to the Austin meetings is for permission to use the IEEE MSS&TC sponsorship. It was granted.

I think that the Austin meeting agenda item titled "our future as an IEEE MSS&TC project" was appropriate.
However, I haven't heard if that discussion produced a formal request to the IEEE MSS&TC or other tangible result.

I won't apologize for implementing an approved IEEE workshop plan. I will apologize for Dale, Otis and me not providing more information about our plans to you sooner.

Regards,
Bob
Chair, IEEE Data Mgt Committee