Innovation

The main technical novelty of the project is in its synergy of different scientific communities and research topics: digital libraries, Web publishing, logical environments.

From the point of view of Web publishing, our project is the first attempt to provide a comprehensive description, from content to metadata, of a given field of knowledge (in our case mathematics), in order to enhance its accessibility, exchange and elaboration via the World Wide Web. To this aim, we shall put to use most of the technologies recently introduced by the W3C: XML, DOM, XSL, XLL, Namespaces, MathML, RDF, etc. From this respect, the project is first of all a complex test for all these technologies, and should hopefully become an example of ``best practice'' in their use. Note that the final architecture is likely to be extendible to other fields of structured information: the emphasis on mathematics is motivated by the fact that, due to its complex interplay between content, structure and notation, it provides a major case study for Web-based information systems (it is not a case that MathML has been one of few instances of XML completely developed under the aegis of the World Wide Web Consortium).

From the point of view of digital libraries, our work is aimed at exploiting all the potential functionalities offered by the Web, and in particular a more integrated use of its browsing and searching facilities. The library is not merely seen as a more or less structured collection of texts, but as a virtual structure inside which we can freely navigate, jumping for instance from an entity to its definition, or peeping inside some information at deeper and deeper levels of details (such as different levels of detail of a proof). This is similar to what we currently do with HTML texts, but in order to enhance the effectiveness of the consultation, we clearly need a good metadata model of the information. Moreover, in such an integrated view, it is hardly conceivable to just apply some ``general purpose'' metadata model (like the Dublin Core system, say): the metadata model must be eventually specialised to the actual structure of the information it is supposed to model (and more structure we have on the information, more relevant metadata we can usually infer on the document). For instance, metadata could contain the whole signature of a given module of mathematical knowledge. The usual motivation for keeping metadata simple and general is that it is usually difficult to add this information by hand; but in our case a large part of the metadata is supposed to be extracted automatically by the (structured) text itself, allowing for pretty complex metadata models.

Finally, a main aspect of our project is the integration with current tools for the automation of formal reasoning and mechanisation of mathematics (proof assistants and logical frameworks). This integration has a mutual benefit. From the point of view of the mathematical library, the first and fundamental role of these systems is that of providing friendly authoring tools (for instance, our ``core'' library will be automatically extracted from existing libraries of these systems). The relevance of this point should not be underestimated: as a matter of fact, the main reason for the failure of complex markup modellings is usually the lack of suitable authoring tools (it is often painful to add the markup by hand). Of course, they can also provide other functionalities (like automatic proof checking) on fragments of the library (typically, the fragments generated by the tool itself, in its specific logical dialect). These additional functionalities may be especially relevant for industrial applications, e.g. in the context of IT security evaluation standards like the Common Criteria standard (see others/cc). In its highest assurance level, this standard requires the development of formal models of the IT product under evaluation, as well as mechanized proofs that it meets its security objectives. Such models and proofs must be published in a format that can be easily readable and understood by security evaluators. Hence, there is a strong need from software industry to be able to produce such documentation directly from the models introduced in the proof assistant, and to link it with documents describing the IT product, etc.

On the other side, there is a compelling need of integration between the current tools for automation of formal reasoning and mechanisation of mathematics and the most recent technologies for the development of Web applications and electronic publishing. XML, which is rapidly imposing as a pivotal technology in the future development of all Internet applications, and the main tool for representation, manipulation, and exchange of structured information in the networked age, looks as a natural, almost mandatory, choice for modelling the information.

In this way, we just obey to the very primitive commandment of the Web: make your information available. Currently, libraries in logical frameworks are usually saved in two formats: a textual one, in the specific tactical language of the proof assistant, and a compiled (proof checked) one in some internal, concrete representation language. Both representations are obviously unsatisfactory, since they are too oriented to the specific application: they restrict the access of the libraries to the users of the given application, and at the same time they are too sensible to the evolution and the maintenance of the application itself. On the other side, as soon as the information is put in a standard format on the Web, any kind of research becomes virtually possible, and anybody could start developing his own spider for implementing his own searching requirements. This is clearly a major improvement w.r.t. the present situation. Currently, you must not only rely on the searching facilities offered by the specific applications, but even if you would wish to implement your own searching algorithm, you would be prevented by the simple reason that the information is not accessible (in any reasonable sense of the word).

The project builds on the solid ground provided by several existing XML-based languages for the management of mathematical documents such as MathML, OpenMath and OMDoc. Each of these markup languages covers a different aspect of the information. Our aim is not to propose a new language, but to study and to develop the technological infrastructure required to integrate all these languages together, in order to take advantage of the specific features of each of them.

Index

This page is hosted by the Department of Computer Science, University of Bologna.
Last updated Fri Feb 2 17:12:26 CET 2007.