International Journal of Knowledge Content Development & Technology - Vol. 5, No. 2, pp. 75-102
ISSN: 2234-0068 (Print) 2287-187X (Online)
Print publication date Dec 2015
Received 03 Jul 2015 Revised 01 Sep 2015 Accepted 14 Sep 2015
DOI: https://doi.org/10.5865/IJKCT.2015.5.2.075
Features of an Error Correction Memory to Enhance Technical Texts Authoring in LELIE
Patrick Saint-Dizier*
*Research Director, IRIT-CNRS, Toulouse, France (stdizier@irit.fr)
In this paper, we investigate the notion of error correction memory applied to technical texts. The main purpose is to introduce flexibility and context sensitivity in the detection and the correction of errors related to Constrained Natural Language (CNL) principles. This is realized by enhancing error detection paired with relatively generic correction patterns and contextual correction recommendations. Patterns are induced from previous corrections made by technical writers for a given type of text. The impact of such an error correction memory is also investigated from the point of view of the technical writer's cognitive activity. The notion of error correction memory is developed within the framework of the LELIE project. An experiment is carried out on the case of fuzzy lexical items and negation, which are both major problems in technical writing. Language processing and knowledge representation aspects are developed together with evaluation directions.
Keywords: error correction memory, controlled natural languages, natural language processing, logic programming
Technical documents form a linguistic genre with specific linguistic constraints in terms of lexical realization, syntax, typography and overall document organization, including business or domain dependent aspects. Technical documents cover a large variety of types of documents: procedures, equipment and product manuals, various notices such as security notices, regulations of various types (security, management), requirements and specifications. These documents are designed to be easy to read and as efficient and unambiguous as possible for their users and readers. They must leave little space for personal interpretations. For that purpose, they tend to follow relatively strict controlled natural language (CNL hereafter) principles concerning both their form and contents. These principles are described in documents called authoring guidelines. There are general purpose guidelines, among which various norms in e.g. aeronautics, chemistry, or guidelines proper to a company. When these latter are not coherent with the former, they tend to be preferred to the general guidelines. There are several guidelines but no general consensus on what they should contain. Finally, depending on the industrial domain, the traditions of a company, the required security level, and the target user(s), major differences in the writing and overall document quality are observed.
Guidelines may be purely textual, i.e. their application is manually controlled by technical writers. Guidelines may also be implemented via templates, also called boilerplates, that authors must use. These are primarily designed for inexperienced authors, in particular for producing simple texts and requirements.
Guidelines are extremely useful to produce texts which are easy to interpret by users: technical texts are oriented towards action. However, guidelines are often felt to be too rigid: they lack the flexibility and the context sensitivity that technical writers need in a number of situations. For example, uses of fuzzy lexical items may be acceptable in certain contexts (progressively close the tap) since 'close' is a punctual action in this context. Next, a strict observation of CNL principles may lead to very complex formulations: correcting some errors may be counterintuitive and complex from a cognitive point of view. For example, negation may be permitted when there is no simple alternative (do not throw in the sewer) since enumerating the possibilities may be long and context dependent. Similarly, complex sentences may be left as they are when their reformulation into several sentences makes the understanding even more difficult, e.g. with the use of various forms of references. Finally, besides their lack of flexibility, authoring principles and guidelines, in the everyday life of technical writers, are often only partly observed, for several reasons including workload, authoring habits and the large number of more or less consistent revisions made by several actors on a given text. As a result, and in spite of several proofreadings made by technical writers and validators, it turns out that most technical texts still contain major authoring problems that must be corrected.
These considerations motivated the development of the LELIE project (Barcellini, Albert & Saint-Dizier, 2012), (Saint-Dizier, 2014), which is a system that detects several types of errors in technical documents, whatever their authoring and revision stages are. LELIE produces alerts related to these errors on terms, expressions or constructions that need various forms of improvements. By error, we mean the non-observation of an authoring rule in the guidelines that is applicable to the situation, for example the use of passives, modals or negation which must be avoided in instructions. LELIE also allows business constraints to be specified, such as controls on style and the use of business terms. The errors detected by LELIE are typical errors of technical texts (e.g. Table 1); they are not errors in ordinary language. To overcome difficulties such as the lack of flexibility and context sensitivity in LELIE (error detection is rigid and realized on a strict word occurrence basis), our aim is to pair LELIE with a mechanism that introduces flexibility and context sensitivity in error detection and correction, using techniques that complement LELIE's rule-based approach. We show in this paper that this can be realized via an error correction memory.
Error detection in LELIE depends on the discourse structure: for example, modals are the norm in requirements (Grady, 2006) but not in instructions. Titles allow deverbals, which are not frequently admitted in instructions or warnings. LELIE is parameterized and offers several levels of alerts depending on the a priori error severity. LELIE and the experiments reported below have been developed on the logic-based TextCoop platform (Saint-Dizier, 2012). LELIE is fully implemented in Prolog; its kernel is freely available for French and English. The output of LELIE is the original text with annotations.
Table 1 below shows some major errors found by LELIE. Statistics have been computed on 300 pages of proofread technical documents from companies A, B and C (kept anonymous). The results presented in this table show that there are still many errors of various types that need to be fixed, and therefore that there is room for tools that help writers to improve their texts. Error rates are given for 30 pages, which corresponds to an average size technical document in our corpus.
error type | number of errors for 30 pages | A | B | C |
---|---|---|---|---|
fuzzy lexical items | 66 | 44 | 89 | 49 |
deverbals | 29 | 24 | 14 | 42 |
modals in instructions | 5 | 0 | 12 | 1 |
light verb constructions | 2 | 2 | 2 | 3 |
pronouns with unclear reference | 22 | 4 | 48 | 2 |
negation | 52 | 8 | 109 | 9 |
complex discourse structures | 43 | 12 | 65 | 50 |
complex coordinations | 19 | 30 | 10 | 17 |
heavy N+N or noun complements | 46 | 58 | 62 | 15 |
passives | 34 | 16 | 72 | 4 |
future tense | 2 | 2 | 4 | 1 |
sentences too complex | 108 | 16 | 221 | 24 |
irregular enumerative construction rate | average | low | high | average |
incorrect references to sections or s | 13 | 33 | 22 | 2 |
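
The counts in Table 1 can be totalled to relate them to an error density per line of text. The following is a minimal sketch in Python (not the Prolog of the LELIE system). The whole-corpus column is copied from the table, the qualitative enumeration row is left out, and the figure of 50 lines per page is an assumption that the paper does not state.

```python
# Whole-corpus counts per 30 pages, copied from Table 1.
# The qualitative "irregular enumerative construction rate" row is omitted.
ERRORS_PER_30_PAGES = {
    "fuzzy lexical items": 66,
    "deverbals": 29,
    "modals in instructions": 5,
    "light verb constructions": 2,
    "pronouns with unclear reference": 22,
    "negation": 52,
    "complex discourse structures": 43,
    "complex coordinations": 19,
    "heavy N+N or noun complements": 46,
    "passives": 34,
    "future tense": 2,
    "sentences too complex": 108,
    "incorrect references": 13,
}

PAGES = 30
LINES_PER_PAGE = 50  # ASSUMPTION: page length is not given in the paper


def lines_per_error(counts, pages=PAGES, lines_per_page=LINES_PER_PAGE):
    """Average number of text lines between two consecutive alerts."""
    total = sum(counts.values())
    return (pages * lines_per_page) / total


total_errors = sum(ERRORS_PER_30_PAGES.values())
density = lines_per_error(ERRORS_PER_30_PAGES)
print(total_errors, round(total_errors / PAGES, 1), round(density, 1))
```

Under the assumed page length, the total of 441 alerts for 30 pages corresponds to roughly one alert every 3.4 lines. A different page length would shift this figure proportionally.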
These results show that there is an average of about one error every 3 to 4 lines of text. The alerts produced by the LELIE system have been found useful by most technical writers that tested the system. The main reactions and comments of technical writers indicate that:
The present paper aims at specifying, developing and testing several facets of an error correction memory system that would, after a period of observation of technical writers making corrections from the LELIE alerts, develop flexibility and context sensitivity in error detection and correction. This includes the two following operations:
An error correction memory is a necessary complement to LELIE to produce technical documents with an efficient and contextually accurate controlled natural language level. Indeed, systems such as LELIE are rule-based. Rules can detect the main situations. Filters can be paired with these rules; however, these cannot be multiplied indefinitely. A hybrid approach that pairs a generic rule-based system with a learning procedure that accurately observes the technical writer's correction activity, in order to adjust the treatment of errors, seems to be relevant and operational.
In this paper, we develop our analysis and the features of an error correction memory. Our approach is based on a two level organization:
For example a fuzzy adverb such as about in 'about 5 minutes' may possibly be replaced by a pattern such as: 'between X and Y minutes', but the boundaries of this interval, X and Y, depend on the context. Their instantiations will be realized via recommendations.
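
The two-level idea in this example can be sketched as follows, in Python rather than in the Prolog of the actual system. A generic pattern carries the underspecified fields X and Y, and a recommendation fills them for a given context. The context key and the values 4 and 6 are hypothetical and serve only as illustration.

```python
from string import Template

# Generic correction pattern for the fuzzy adverb "about" before a duration.
GENERIC_PATTERN = Template("between $X and $Y minutes")

# HYPOTHETICAL recommendations, keyed by a context (head word, context term).
# The paper does not give these values.
RECOMMENDATIONS = {
    ("heat", "probe"): {"X": 4, "Y": 6},
}


def instantiate(pattern, context, recommendations):
    """Fill the pattern's fields from the recommendation for this context.

    When no recommendation is known, the fields are left as placeholders
    so that the technical writer can supply the boundaries.
    """
    values = recommendations.get(context)
    if values is None:
        return pattern.safe_substitute()
    return pattern.substitute(values)


print(instantiate(GENERIC_PATTERN, ("heat", "probe"), RECOMMENDATIONS))
print(instantiate(GENERIC_PATTERN, ("close", "valve"), RECOMMENDATIONS))
```

The first call yields a fully instantiated correction, while the second leaves `$X` and `$Y` unresolved, which mirrors a generic pattern proposed without a contextual recommendation.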
Generic patterns as well as recommendations are induced from a collection of correction situations which have been previously observed. However, correction divergences between technical writers often arise; therefore, a strict automatic learning process is not totally satisfactory. The objective is rather to propose to a team of technical writers several possible corrections (via simple generalizations on coherent subsets of corrections) and to let them decide on the best solution, via discussion, mediation, or via a decision made by an administrator. For errors with a straightforward correction, general correction patterns may be proposed a priori, and possibly tuned by a validator.
This paper is organized as follows. Section 2 develops related work aspects, and elaborates on the differences between our approach and existing projects. Section 3 is twofold: it first discusses the impact of such a system on the technical writer's cognitive activity and then develops the experiments that were made and the resulting formalism. Sections 4 and 5 develop two major typical cases of CNL which are quite different: correcting fuzzy lexical items, which is essentially of a lexical nature, and the correction of negation and negative expressions, which is much more grammatical and involves access to knowledge. Our approach is applicable to other types of errors such as passives, deverbals, modals, light verb constructions, complex sentences, etc. Section 6 develops some aspects of the implementation and the various facets of an evaluation. Section 7 concludes with perspectives.
The approach of an error correction memory that (1) includes flexibility and context sensitivity in the detection of errors and that (2) helps technical writers by providing them with error corrections validated and made homogeneous over a whole team of technical writers, via discussion and mediation, seems to be new to the best of our knowledge. It is a crucial tool, paired with a rule-based system such as LELIE, for improving the language quality of technical texts following CNL principles, which are felt to be too rigid. This tool is useful since correcting errors in texts is a time consuming, painful and costly task. It should also make corrections more homogeneous over large texts and should avoid introducing new errors when making corrections.
This notion of error correction memory originates from the notion of translation memory; it is however substantially different in its principles and implementation. An in-depth analysis of memory-based language processing is developed in (Daelemans & van den Bosch, 2005) and implemented in the TiMBL software. These investigations develop several forms of statistical means to produce generalizations in syntax, semantics and morphology. They also warn against excessive forms of generalizations. (Buchholz, 2002) develops an insightful memory-based analysis on how grammatical constructions can be induced from samples. Memory-based systems are also used to resolve ambiguities, using notions such as analogies (Schriver, 1989). Finally, memory-based techniques are used in programming language support systems to help programmers to resolve frequent errors.
Guidelines for writing documents following controlled languages have been elaborated in various sectors by user consortia and companies, resulting in a large diversity of specifications (e.g. AECMA, ASD-STE, SLANG, Attempto simplified English, see also (Wyner et al., 2010) for some implementations). Norms tend to emerge, such as SBVR for business rules and OMG-INCOSE for requirements (Hull et al., 2011). The reader is invited to consult a detailed synthesis and classification of CNL principles and projects given in (Kuhn, 2014). Furthermore, grammars for CNL are investigated in (Kuhn, 2013). More specialized analyses for critical systems are given in e.g. (Tommila et al., 2013). Concrete examples and training for writing in CNL are summarized in e.g. (Alred, 2012), (Unwalla, 2004), (O'Brien, 2003) and (Weiss, 2000).
A number of systems have been developed in the past years to help technical writers to produce documents that follow CNL guidelines. These systems include facilities to introduce domain considerations via a user interface. This allows, for example, an author to indicate preferred terms, fuzzy lexical items which can be allowed, or specific lexical data that must be avoided, buzz words, etc. This view is developed e.g. in (Ganier & Barcenilla, 2007), where the need for a flexible approach to CNL is advocated, from an ergonomics and conceptual point of view. A number of these systems are briefly presented below, and their features w.r.t. the present work are outlined.
ACE (Fuchs, 2012; Fuchs, Kaljurand, & Kuhn, 2008) stands for Attempto Controlled English. This system was initially designed to control software specifications, and has been used more recently in the semantic web. ACE contains a powerful parser, with a large lexicon of more than 100,000 entries, which produces Discourse Representation Structures (DRS). CNL principles are introduced via an automatic and unambiguous translation into first-order logic. The most notable features of ACE in terms of analysis include complex noun phrases, plurals, anaphoric references, subordinated clauses, modality, and questions. ACE is paired with RACE, a reasoning component that carries out consistency checking, proving and query answering. ACE has an editor that allows users to specify data and make reference to previous texts. It is applied to English. It does not propose corrections when errors are detected, but examples are provided to help the technical writer.
PENG (Processable English (White & Schwitter, 2009)) is a computer-processable controlled natural language system designed for writing unambiguous and precise specifications. It covers a strict subset of standard English and is precisely defined by a controlled grammar and a controlled lexicon. An intelligent authoring tool indicates the restrictions while the specification of the grammar is written, which makes the system flexible and adaptable to several contexts. The controlled lexicon consists of domain-specific content words and predefined function words that can be defined by the author on the fly. Texts written in PENG can be deterministically parsed and translated into discourse representation structures and also into first-order predicate logic for theorem proving. During parsing of a PENG specification, several operations are performed: anaphoric references are resolved, ellipses are reconstructed, synonyms, acronyms, and abbreviations are checked and replaced, a discourse representation structure is constructed, etc., and a paraphrase is generated that shows how the machine interpreted the input. PENG has a power globally comparable to ACE, with some flexibility and some facilities for users, including the treatment of exceptional cases. However, it does not have any correction facilities.
RUBRIC, a Flexible Tool for Automated Checking of Conformance to Requirement Boilerplates (Chetan et al., 2013), is dedicated to requirement control. It is based on a kind of grammar that describes the structure of the various boilerplates the users want to use, based on Rupp's boilerplate syntax. The parsing allows the recognition of ill-formed structures. Besides boilerplate recognition, this system has rather simple controls on CNL constraints (called Natural Language Best Practices Checking) such as absence of negation, no passives, conjunctions, controls on fuzzy and vague terms, etc., which are potentially problematic constructs in requirements. This tool does not propose any correction help. The RAT-RQA system, developed by The Reuse Company for authoring requirements, has similar properties. It is developed for a number of languages and is connected to the IBM Doors requirement management system. Finally, let us mention RABBIT, designed for developing a controlled natural language for authoring ontologies in a form understandable to people, in opposition to e.g. OWL.
In this section, the features and the impact of an error correction memory are analyzed w.r.t. the language performance and cognitive activity of technical writers: how this system affects and improves their activity and what could be the real benefits when it is operational and accepted. Then we develop the experiments that led to the system: the corpus and the memorization of corrections. Finally, we develop more formal aspects concerning the formalism of correction rules, their associated contexts of use, and the way error correction rules are induced from previously made corrections.
The first step in the definition of an error correction memory is to observe how technical writers work from a linguistic and cognitive point of view with the alerts produced by LELIE. Alerts are inserted in the original document in various ways, depending on the authoring tool that is used. For example, alerts are displayed as specific comments in MS Word. In editorial suites such as Scenari, which has been connected to LELIE, alerts appear as text bubbles or paper pads with links to explanations and definitions. Corrections can therefore be made on the original document.
In our experimentation, the following questions, crucial for the design of a controlled natural language system, have been considered:
Two small groups of technical writers, all native speakers, respectively composed of 3 and 4 persons with different ages (from novices of about 23 years old to senior writers with more than 20 years of experience), technical qualifications (from novice in the area to expert) and authoring experience, have been observed in the areas of energy and transportation. These writers work full time in technical writing and have received theoretical and practical training in the area; they hold a master's degree in technical writing. Our main observations are the following:
This analysis shows that only about 60% of the alerts are processed, while about 15% are left pending. It also shows that correcting an error is, by and large, not only a language problem, but that business aspects are involved. These results show some useful features that an error correction memory should have:
Corrections are made directly accessible to technical writers: a lot of time is saved and corrections become homogeneous over the various documents of the company. Corrections reflect a certain know-how of the authoring habits and guidelines of a company: they can be used to train novices.
The goal is to evaluate the nature and form of corrections: what kind of prototypical forms emerge, their dependence on the utterance context and the type of lexical and grammatical resources which are needed to model and implement an error correction memory. For that purpose, we built a corpus of technical texts at different production stages and submitted it to LELIE.
Our experiments are based on a corpus of technical texts coming from seven companies, kept anonymous at their request. Our corpus contains about 120 pages extracted from 27 documents. The main features considered to validate our corpus are:
Texts are in French or English in almost equal proportions. Technical writers were then asked to make corrections in these texts for as many alerts as possible.
An observation of how technical writers proceed was then carried out from our corpus. The tests we made do not include any temporal or planning consideration (how much time it takes to make a correction, or how they organize the corrections) or any consideration concerning the means and the strategies used by technical writers. At this stage, we simply examine the correction results, which are stored in a database. At the moment, since no specific interface has been designed, the initial and the corrected texts are compared once all the corrections have been made. The only exception is requirements, since the Doors requirement management system keeps track of all modifications made by authors. The global process for memorizing corrections is the following in the context of fuzzy lexical items and negation:
The database is realized in Prolog as follows:
alert ( [ type of alert ], [ tagged term(s) ], [ lexical category ],
[ severity level (1 to 3) ],
[ [ sentence with alert, sentence after correction with tags,
ID of writer ], ....]).
For example:
alert([fuzzy], [progressively], [adverb], [3],
      [ [ ['<fuzzy>', progressively, '</fuzzy>', heat, the, probe],
          ['<revised>', heat, the, probe, progressively,
           in, 5, seconds, '</revised>'],
          ['John'] ]
        % .... further corrections recorded for the same alert
      ]).
The determination of the scope of the error (the word it applies to) and its context (defined in sections 3.4 and 3.5) is realized during the induction process (section 3.6) via lexical look-up or the use of a local grammar. This allows the induction parameters on the form of the scope and the context to be explored in a later stage without affecting the database.
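
As an illustration of how a stored record can be exploited, the sketch below mirrors one database entry in Python (the actual database is in Prolog) and recovers the material that the writer added, by comparing the sentence with the alert and the revised sentence. The class and function names are hypothetical, and the comparison method is an assumption: the paper only states that the initial and corrected texts are compared.

```python
from collections import Counter
from dataclasses import dataclass

# Annotation tags used in the stored sentences.
TAGS = {"<fuzzy>", "</fuzzy>", "<revised>", "</revised>"}


@dataclass
class Correction:
    original: list  # tokens of the sentence with the alert, tags included
    revised: list   # tokens of the sentence after correction, tags included
    writer: str     # ID of the technical writer


def strip_tags(tokens):
    """Remove annotation tags, keeping only the words."""
    return [tok for tok in tokens if tok not in TAGS]


def added_tokens(correction):
    """Words of the revised sentence with no counterpart in the original.

    Words that were merely moved are not reported, since each original
    word can account for one occurrence in the revised sentence.
    """
    remaining = Counter(strip_tags(correction.original))
    added = []
    for tok in strip_tags(correction.revised):
        if remaining[tok] > 0:
            remaining[tok] -= 1
        else:
            added.append(tok)
    return added


example = Correction(
    original=["<fuzzy>", "progressively", "</fuzzy>", "heat", "the", "probe"],
    revised=["<revised>", "heat", "the", "probe", "progressively",
             "in", "5", "seconds", "</revised>"],
    writer="John",
)
print(added_tokens(example))
```

For the example record, the added material is the temporal complement (`in 5 seconds`). The adverb itself has only been moved, so it is not reported as new.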
Let us now define the parameters that account for the context sensitivity of an alert, so that corrections can be made flexible and parameterized depending on the context of the alert.
In our first experiment, an Error Context is a set of words which appears either before or after the alert (i.e. the fuzzy lexical item or the negation). In the case of fuzzy lexical items and negation, a context is composed of nouns or noun compounds (frequent in technical texts) Ni, adjectives Ak and action verbs Vj. Our strategy is to first explore the simple case of the use of a number of such terms to unambiguously characterize a context, independently of the alert category. In our experiment, the context is composed of (1) a main or head word (or expression) which is the word to which the fuzzy lexical item or the negation applies (e.g. 'fire alarms' in 'minimize fire alarms', or 'close' in 'do not close the door while the system is in operation') and (2) additional words that appear either before or after the main one. The closest words in terms of word distance are considered. In case of ambiguity, the words in the same clause are preferred.
This approach has the advantage of not including any complex syntactic consideration. To evaluate the number of additional words which are needed in the context besides the head word, we constructed 42 contexts, taken from 3 different texts, with 2, 3, 4 and 5 additional words (including technical compound terms, e.g. 'trim management', which each count as a single word). We then asked technical writers to indicate from what number of additional words each context was stable, i.e. adding a new word does not change what it basically means or refers to. Over our small sample, results are the following:
number of additional words | stability from previous set |
---|---|
3 | 83% |
4 | 92% |
5 | 94% |
From these observations, a context of 4 additional words (if these can be found in the utterance) in addition to the main word is adopted. This is probably a little bit vague, but sufficient for our present aim. This number of words is implemented as a parameter so that it can be revised in the future or depending on the type of document.
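
The context construction described above can be sketched as follows, in Python rather than in the Prolog of the system. The sketch assumes that the sentence is already tokenized and tagged, with compound terms counted as single tokens. The tag names and the tie-breaking rule are assumptions, and the preference for words of the same clause is not modelled.

```python
# Categories that may enter a context: nouns or noun compounds,
# adjectives and action verbs. Tag names are ASSUMED for this sketch.
CONTENT_POS = {"noun", "adjective", "verb"}


def error_context(tagged_tokens, head_index, size=4):
    """Return up to `size` content words closest to the head word.

    `size` is the parameter discussed in the text (4 additional words).
    Ties in distance are resolved in favour of the leftmost word, which
    is an arbitrary choice made for this illustration.
    """
    candidates = [
        (abs(i - head_index), i, word)
        for i, (word, pos) in enumerate(tagged_tokens)
        if i != head_index and pos in CONTENT_POS
    ]
    candidates.sort()  # nearest first, then leftmost
    nearest = sorted(candidates[:size], key=lambda c: c[1])  # text order
    return [word for _, _, word in nearest]


# 'do not close the door while the system is in operation'
SENTENCE = [
    ("do", "aux"), ("not", "neg"), ("close", "verb"), ("the", "det"),
    ("door", "noun"), ("while", "conj"), ("the", "det"), ("system", "noun"),
    ("is", "aux"), ("in", "prep"), ("operation", "noun"),
]
print(error_context(SENTENCE, head_index=2))
print(error_context(SENTENCE, head_index=2, size=2))
```

With the head word 'close', only three content words are available in the utterance, so the context contains fewer than four additional words, as the text allows.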
As indicated in 1.3, the definition of correction rules is based on a two level approach:
(1) the development of relatively generic correction rules, that reflect correction practices for a domain, a company or a type of text. These rules may be induced from the technical writer correction activity or may be defined a priori when they are linguistically or cognitively straightforward. They often contain underspecified fields.
(2) the development of accurate contextual correction recommendations, based on previously memorized and analyzed corrections made on a small set of closely related terms and situations in context. These recommendations correspond to the underspecified fields of the generic correction rules. They add flexibility and context sensitivity to the rules.
Correction rules are based on patterns that identify structures via unification. The first pattern identifies the error while the second one, based on the first, proposes a correction: [error pattern] → [correction pattern].
These patterns are based on the syntax of Dislog (Discourse in Logic, (Saint-Dizier, 2012)), a logic-based language that runs on the TextCoop platform. Patterns include specific variables that represent contextual recommendations. Besides its logic-based character, Dislog extends the expressive power of regular expressions, in particular via (1) the introduction of feature structures which may be typed, (2) negation on symbols which must not occur in the expression, (3) reasoning procedures and calls to knowledge for various controls and computations and (4) the construction of representations or results (e.g. corrections). These properties are necessary for the development of patterns and correction rules.
Patterns are finite ordered sequences of the following elements (external form):
Pattern samples are:
[ Neg VP before VP ] where Neg is a preterminal element that stands for 'do not' or 'never', 'before' is a terminal element and VP is a verb phrase,
[ in order not to V NP ] similarly, 'in order not to' is a terminal element and VP and NP are non-terminal elements,
[ Verb antonym (Adjective) ] where antonym is a function, Verb and Adjective are pre-terminal elements,
[ more than X Noun ] where X is a variable representing a recommendation.
A correction rule is a rewrite rule that, given an error identified by a pattern (the error pattern), rewrites the text segment into a correct construction (the correction pattern), represented also by a pattern partly based on material from the error pattern. Such a correction rule used for fuzzy manner adverbs is e.g.:
[ progressively VP (durative) ] → [ progressively VP (durative) in X (time) ],
e.g. progressively heat the probe X37 → progressively heat the probe X37 in 10 minutes.
X (time) (a variable of type time, 10 minutes in the example) is suggested by the correction recommendation. The adverb is kept in order to preserve the manner facet, which is not fuzzy: it is the temporal dimension that is fuzzy in this expression.
A contextual correction rule is a correction rule associated with a context. This allows the specification of a precise recommendation or possibly set of recommendations:
error pattern → correction pattern [ [ head term ], [ terms of Context ]
[ Values for contextual variables ] ],
e.g.: [ progressively VP (durative) ] → [ progressively VP (durative) in X (time) ] [ [ heat ],
[ probe X37, Airbus A320, pitot tube, icing conditions ] [ X (time) = 9 sec ] ].
In this example a recommendation of 9 seconds is provided for the time needed for 'heating' (head term, type durative) 'the probe X37 of the pitot tube of an airbus A320 in icing conditions' (context of 4 relevant terms). More examples are developed below in the sections dedicated to fuzzy lexical items and to negation.
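
The way a contextual correction rule instantiates its variable can be sketched as follows, in Python rather than in Dislog. The names are hypothetical, and the matching criterion (all context terms of the rule must be present in the utterance context) is an assumption: the paper does not specify how contexts are compared at this point.

```python
from dataclasses import dataclass


@dataclass
class ContextualRule:
    adverb: str         # fuzzy item matched by the error pattern
    head: str           # head term the fuzzy item applies to
    context: frozenset  # terms of the context
    values: dict        # values for contextual variables, e.g. {"X": "9 sec"}


RULES = [
    ContextualRule(
        adverb="progressively",
        head="heat",
        context=frozenset({"probe X37", "Airbus A320",
                           "pitot tube", "icing conditions"}),
        values={"X": "9 sec"},
    ),
]


def correct(adverb, verb_phrase, head, context_terms, rules):
    """Apply [adverb VP] -> [adverb VP in X (time)].

    X is instantiated when a rule with the same head term has its whole
    context included in the utterance context (ASSUMED criterion).
    Otherwise the variable X is left for the technical writer to fill.
    """
    for rule in rules:
        if (rule.adverb == adverb and rule.head == head
                and rule.context <= set(context_terms)):
            return f"{adverb} {verb_phrase} in {rule.values['X']}"
    return f"{adverb} {verb_phrase} in X"


a320 = {"probe X37", "Airbus A320", "pitot tube", "icing conditions"}
print(correct("progressively", "heat the probe X37", "heat", a320, RULES))
print(correct("progressively", "heat the probe X37", "heat", {"probe X37"}, RULES))
```

The first call reproduces the example of the text with its recommendation of 9 seconds. The second call falls back on the generic rule, since the context is not sufficient to select a recommendation.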
The induction method appropriate for our objective combines formal forms of generalizations with interactions with technical writers. We view it as a tool for technical writers so that they can develop correction rules in a simpler and more reliable way. This tool induces generic correction proposals from corrections already made in similar contexts. Of interest is the identification of those parameters needed by writers to adjust the tool to the way they conceive the correction process.
From a formal point of view, the induction mechanism is based on unification theory, in particular on the notion of least upper bound (lub) (Lloyd, 2013). In unification theory, lubs are in general computed from a type lattice following the formal model developed in e.g. (Pfenning, 1992). A relatively similar approach is called Order-Sorted Unification, which is often paired with forms of type inference (Baader & Nipkow, 1998). In our case, linguistic structures and terminologies are considered instead of a strict type lattice; however, lexical entry features and terminologies can be interpreted as typed constructions.
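
To make the notion of least upper bound concrete, the sketch below computes it over a small terminology organized as a tree, in Python rather than in Prolog. The terminology is entirely hypothetical, and a tree is a simplification: in a general lattice a term may have several parents.

```python
# HYPOTHETICAL terminology fragment: each term points to its parent.
PARENT = {
    "probe X37": "probe",
    "probe X42": "probe",
    "probe": "sensor",
    "thermocouple": "sensor",
    "sensor": "equipment",
}


def ancestors(term):
    """The term itself followed by its successive parents, most specific first."""
    chain = [term]
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def lub(a, b):
    """Most specific term that subsumes both a and b, or None if there is none."""
    above_a = set(ancestors(a))
    for term in ancestors(b):  # walks upwards, so the first hit is the least
        if term in above_a:
            return term
    return None


print(lub("probe X37", "probe X42"))
print(lub("probe X37", "thermocouple"))
```

Generalizing two corrections observed on 'probe X37' and on 'probe X42' would thus yield a rule stated on 'probe', whereas adding a correction observed on 'thermocouple' would lift the rule to 'sensor'.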
Let us now develop different steps of the induction mechanism. In patterns and contexts let us consider the following elements:
The induction mechanism is organized around the idea that the correction pattern is the reference or the invariant, whereas generalizations can be made on the other structures: W, Scope and Cont to reach more abstract contextual error correction rules. Steps 1a,b,c are designed to produce a consistent set of corrections that has no duplicates. Steps 2a,b,c develop the three dimensions of generalizations. Step 3 includes two generalizations.
The generalization steps are the following:
The overall organization of the system is rather simple: it is basically an extension to LELIE connected to the text editor used by the company concerned. The main components of the error correction system include resources and processing modules which are specific to this task.
The main resources are:
The main processing components are:
All these modules are written in SWI-Prolog, using the kernel syntax so that they can be used in various implementation contexts.
A first experimentation of the principles developed above is devoted to the case of fuzzy lexical items, which is a major type of error, very representative of the use of an error correction memory. Roughly, a fuzzy lexical item denotes a concept whose meaning, interpretation, or boundaries can vary considerably according to context, readers or conditions, instead of being fixed once and for all. A fuzzy lexical item must be contrasted with underspecified expressions, which involve different forms of corrections. For example, a verb such as damaged in 'the mother card risks to be damaged' is not fuzzy but underspecified because the importance and the nature of the damage are unknown. The same holds for 'heat the probe to reach 500 degrees', because the means to heat the probe are not given but are in fact required to realize the action. In terms of corrections, an underspecified expression often requires a complement or an adjunct to be added, e.g. 'using a voltage of 50 Volts'. This adjunct is highly dependent on the domain and on the user knowledge.
There are several categories of fuzzy lexical items which involve different correction strategies. They include a number of adverbs (manner, temporal, location, and modal adverbs), adjectives (adapted, appropriate) determiners (some, a few), prepositions (near, around ), a few verbs (minimize, increase) and nouns. These categories are not homogeneous in terms of fuzziness, e.g. fuzzy determiners and fuzzy prepositions are always fuzzy whereas e.g. fuzzy adverbs may be fuzzy only in certain contexts. The degree of fuzziness is also quite different from one term to another in a category.
The context in which a fuzzy lexical item is uttered may influence its severity level. For example, 'progressively' used in a short action (progressively close the water pipe) or in an action of substantial length (progressively heat the probe till 300 degrees Celsius are reached) may entail different severity levels, because 'progressively' may be more difficult to apply in the second case. For this adverb, it is not the manner but the underlying temporal dimension that is fuzzy. Finally, some usages of fuzzy lexical items are allowed. This is the case of business terms that contain fuzzy lexical items, which should not trigger any alert. For example, low visibility landing procedure in aeronautics corresponds to a precise notion, so 'low' must not trigger an alert in this case. The equivalent non-business expression, landing procedure with low visibility, should probably trigger an alert on 'low', but there is no consensus among technical writers.
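The treatment of business terms can be sketched as follows. This is a minimal Python illustration, not the Lelie implementation: the function name `fuzzy_alerts`, the tiny sample lexicon and the simple string matching are all assumptions made for the example, and the second case below reflects only one of the positions held by technical writers.

```python
import re

# Hypothetical sample resources; a real lexicon is much larger.
FUZZY_TERMS = {"low", "progressively", "near"}
BUSINESS_TERMS = ["low visibility landing procedure"]


def fuzzy_alerts(sentence, fuzzy_terms=FUZZY_TERMS, business_terms=BUSINESS_TERMS):
    """Return the fuzzy terms of `sentence` that should raise an alert.

    Business terms are blanked out first, so that the fuzzy words they
    contain do not trigger any alert.
    """
    text = sentence.lower()
    for term in business_terms:
        text = text.replace(term.lower(), " ")
    tokens = re.findall(r"[a-z]+", text)
    return [token for token in tokens if token in fuzzy_terms]


print(fuzzy_alerts("Apply the low visibility landing procedure."))       # []
print(fuzzy_alerts("Apply the landing procedure with low visibility."))  # ['low']
```

Blanking out the business terms before tokenization keeps the fuzzy lexicon itself unchanged, so the same word can still raise an alert outside the protected expression.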
On average, 2 to 4 fuzzy lexical items are found per page in our corpus. In a small experiment with two technical writers from the 'B' company, covering 120 alerts on fuzzy lexical items in different contexts, 36 alerts were judged not to be errors (rate: 30%). Of the remaining 84 errors, only 62 were corrected; the other 22 were judged problematic and very difficult to correct. Correcting each of the 62 errors took between 2 and 15 minutes, with an average of about 8 minutes per error. Correcting fuzzy lexical items indeed often requires domain expertise.
In the Lelie system, a lexicon has been implemented that contains the most common fuzzy lexical items and fuzzy expressions found in our corpus (about 450 terms). Since some items are a priori fuzzier than others, they have been categorized by means of semantic features, and a mark between 1 and 3 (3 being the worst case) has been assigned a priori to each subcategory. This mark is not fixed, however: it may evolve depending on technical writers' behavior. It has been defined according to the same methodology as developed in the above section, and with the same group of technical writers, which ensures a certain homogeneity in the judgements.
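The figures of this small experiment can be checked with a few lines of arithmetic. The Python sketch below only restates the numbers given above; the total correction time is an estimate derived from the reported average of about 8 minutes, not a measured value.

```python
alerts = 120          # alerts examined by the two technical writers
not_errors = 36       # alerts judged not to be errors
corrected = 62        # errors actually corrected
avg_minutes = 8       # reported average correction time (approximate)

errors = alerts - not_errors            # 84 genuine errors
uncorrected = errors - corrected        # 22 errors left uncorrected
false_alert_rate = not_errors / alerts  # 0.30, i.e. the 30% rate
corrected_share = corrected / errors    # about 0.74 of the genuine errors
total_minutes = corrected * avg_minutes # about 496 minutes overall

print(errors, uncorrected, round(100 * false_alert_rate),
      round(100 * corrected_share), total_minutes)
```

Under these figures, roughly three quarters of the genuine errors were corrected, for an estimated total of a little over eight hours of correction work.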
Such a lexicon is necessary so that the fuzzy term identification process can start. For illustrative purposes, Table 3 below gives figures about some types of entries of our lexicon for English.
category | number of entries | a priori severity level |
---|---|---|
manner adverbs | 130 | 2 to 3 |
temporal and location adverbs | 107 | in general 2 |
determiners | 24 | 3 |
prepositions | 31 | 2 to 3 |
verbs and modals | 73 | 1 to 2 |
adjectives | 87 | in general 1 |
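A lexicon entry carrying an a priori severity mark that may later evolve could be represented as in the following Python sketch. The `LexiconEntry` structure and, above all, the `adjust_severity` update rule with its thresholds are hypothetical: the text only states that the mark may evolve with technical writers' behavior, not how.

```python
from dataclasses import dataclass


@dataclass
class LexiconEntry:
    term: str
    category: str
    severity: int  # a priori mark: 1 (mild) to 3 (worst case)


def adjust_severity(entry, times_alerted, times_corrected, low=0.2, high=0.8):
    """Return a revised severity mark (assumed rule, for illustration only).

    If writers correct the term in most alerts, the mark is raised by one;
    if they almost never correct it, the mark is lowered by one. The result
    always stays within the 1 to 3 range.
    """
    if times_alerted == 0:
        return entry.severity
    ratio = times_corrected / times_alerted
    if ratio >= high:
        return min(3, entry.severity + 1)
    if ratio <= low:
        return max(1, entry.severity - 1)
    return entry.severity


# Sample entries using the a priori levels of the table above.
some = LexiconEntry("some", "determiner", 3)
appropriate = LexiconEntry("appropriate", "adjective", 1)

print(adjust_severity(some, 10, 9))         # 3 (already at the maximum)
print(adjust_severity(appropriate, 10, 9))  # 2
```

Keeping the a priori mark in the entry and computing the revised mark separately makes it easy to compare the initial judgement with observed behavior.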
Error correction memory scenarios include the following main situations, which we have observed in various texts of our corpus:
A rough frequency indication for each of these situations is given below, based on 52 different fuzzy lexical items with 332 observed situations:
case nb. | number of cases | rate (%) |
---|---|---|
1 | 60 | 18 |
2 | 154 | 46 |
3 | 44 | 13 |
4 | 46 | 14 |
5 | 28 | 9 |
It is important to note that, in a given domain, errors are highly recurrent: they concern a small number of fuzzy terms, but in a large diversity of contexts. For example, a text about 50 pages long contains between 5 and 8 fuzzy manner adverbs, which is very few and allows accurate control in context.
Correction patterns have been categorized considering (1) the syntactic category of the fuzzy item and (2) the correction samples collected in the database. Organized by syntactic category, here are a few relevant and illustrative types of patterns which have been induced:
At the moment, 27 non-overlapping patterns have been induced from the corpus. Error correction recommendations are more difficult to stabilize because contexts may be very diverse and related to complex business aspects. At the moment, either (1) a precise recommendation has emerged or has been found in non-fuzzy text counterparts and has been validated, or (2) the system simply keeps track of all the corrections made and displays them by decreasing frequency. In any case, the correction decision is always the responsibility of the technical writer.
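The second strategy, keeping track of all corrections and displaying them by decreasing frequency, can be sketched in a few lines of Python. The `CorrectionMemory` class and the sample corrections are hypothetical and only illustrate the bookkeeping; the actual system stores richer contextual information.

```python
from collections import Counter, defaultdict


class CorrectionMemory:
    """Counts the corrections observed for each fuzzy term."""

    def __init__(self):
        self._memory = defaultdict(Counter)

    def record(self, fuzzy_term, correction):
        """Store one correction made by a technical writer."""
        self._memory[fuzzy_term][correction] += 1

    def suggestions(self, fuzzy_term):
        """Return (correction, count) pairs, most frequent first."""
        counts = self._memory.get(fuzzy_term, Counter())
        return counts.most_common()


memory = CorrectionMemory()
memory.record("progressively", "in 10 minutes")
memory.record("progressively", "in 5 minutes")
memory.record("progressively", "in 5 minutes")

print(memory.suggestions("progressively"))
# [('in 5 minutes', 2), ('in 10 minutes', 1)]
```

The list is only a ranking aid: as stated above, the choice among the displayed corrections remains with the technical writer.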
Negation is a complex phenomenon, both from a semantic and pragmatic point of view (Horn, 2001), with cognitive aspects related to e.g. presupposition and implicature, whose control is important in technical documents to avoid any misconceptions. Negation is linguistically realized in technical texts in several ways: first by the use of the adverbs not, never, no longer etc. but also by the use of terms with a strong negative dimension: verbs (e.g. avoid and forbid verb classes), quantifiers (no, none), prepositions (without), prefixes (un-) or suffixes (-less), etc. A number of forms of negative expressions must not be corrected. For example, expressions describing system states must not be altered, e.g. non available, invalid, failed, degraded, etc.
Although most of the above categories are important aspects of negation, CNL mainly deals with the adverbial forms of negation (not, never, etc.). In most CNL recommendations, negation must be avoided. This rule is felt to be too rigid by technical writers. Indeed, in warnings and in some types of requirements, negation is appropriate, clear and unambiguous: never close the door when the system is in operation otherwise... Negation may also be acceptable in conditional, goal or causal expressions (acting e.g. as warning supports): in order not to damage its connectors. The LELIE system, paired with TextCoop, allows the recognition of the main discourse structures found in technical texts (Saint-Dizier, 2014); it is therefore possible to select those structures where negation must or must not trigger an alert.
Negation is much more difficult to correct than fuzzy lexical items. On the same corpus, our observations show that only about 50% of the alerts on negation produced by Lelie are judged relevant by technical writers; the remaining 50% or so is noise, which is very high. This is not surprising, however, because technical writers tend to say that an alert is not relevant when they do not know how to correct it without altering the text too much or producing a correction that is worse than the error. Of all alerts, about 35% are relevant and can be easily corrected, while the remaining 15% must be corrected but are difficult to correct from a language or knowledge point of view.
In contrast with fuzzy terms, negation receives corrections which may be quite different in French and in English. Corrections may also be quite subtle and require a good command of the language. English glosses are given for French examples. Considering technical writers' corrections, error correction memory scenarios include the following main situations:
Errors concerning negation are more difficult to generalize because they involve more aspects of syntax than fuzzy terms, which are essentially lexical. Here are a few correction patterns induced from the technical writers' activity in our corpus.
At the moment, 16 patterns have been identified, but this task is ongoing. In the case of negation, recommendations mainly concern functions that produce antonyms or hyperonyms. In general, these are not unique and are largely contextual, as outlined in the previous section, so a recommendation mainly amounts to choosing the most relevant antonym or hyperonym.
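Selecting the structures in which adverbial negation should raise an alert can be sketched as below. This Python illustration assumes that each text segment already carries a discourse structure label produced by an earlier analysis step; the label names, the `negation_alert` function and the short marker list are hypothetical and do not reflect the actual TextCoop output.

```python
import re

# Adverbial negation markers mentioned in the text (non-exhaustive).
NEGATION = re.compile(r"\b(?:not|never|no longer)\b", re.IGNORECASE)

# Assumed labels for structures where negation is tolerated.
TOLERATED_STRUCTURES = {"warning", "condition", "goal", "cause"}


def negation_alert(segment, structure):
    """Return True if the segment should raise a negation alert."""
    has_negation = NEGATION.search(segment) is not None
    return has_negation and structure not in TOLERATED_STRUCTURES


print(negation_alert("never close the door when the system is in operation", "warning"))  # False
print(negation_alert("in order not to damage its connectors", "goal"))                    # False
print(negation_alert("do not open the valve", "instruction"))                             # True
```

Which structures belong in the tolerated set is a policy decision; the set used here follows the cases listed above and would need to be adjusted to each company's guidelines.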
The main modules of this error correction memory system have been implemented in Prolog and connected to Lelie: the database, the induction algorithm and the automatic production of correction patterns with recommendations. These modules must now be integrated into each company's authoring system to be operational. Companies use a large diversity of editorial environments, into which the modules we implemented can be integrated at the cost of some substantial interface work. Systems like Doors (for requirement authoring and management) or Scenari (editorial suite) keep track of all changes, including corrections made on texts; these traces can be re-used to construct the error correction database (section 3.3). The internal representations of most editorial systems are XML structures: it is therefore a priori possible to generate correction patterns and to use the man-machine interfaces of the editorial system concerned to display them in a readable form. These technical matters are out of the scope of this paper.
The integration of the error correction memory system into a company environment and information system is ongoing: it is a long and very technical task. At this stage, we can only provide a global evaluation of the improvement of the system in terms of noise reduction, carried out on a test corpus of 50 pages from two companies. These evaluations are carried out with the collaboration of the two groups of 3 and 4 technical writers mentioned at the beginning of this article. Since they have very different skills, this allows us to collect a number of useful reactions of different types.
The results are still preliminary; they are indicative estimates rather than final figures:
From a more global perspective, the evaluation of an error correction memory must be realized in a company's context from a functional point of view. The main areas we think are essential to be considered and evaluated are:
Except for the first two items, which are slightly more standard in evaluation technology, the other items involve major elaborations and require the development of dedicated protocols, once the system is operational in a company. This is obviously necessary to make sure the system is used on a large scale, but also to collect remarks in order to improve it and extend it to other areas of technical writing.
In this paper, we have explored the notion of error correction memory, which, paired with the LELIE system that detects specific errors of technical writing, allows a flexible and context sensitive detection and correction of errors. Correction scenarios are based on an architecture that develops an error correction memory based on (1) generic correction patterns and (2) contextual correction recommendations for elements in those patterns which are more contextual. Both levels are acquired from the observation of already realized corrections and correct texts.
This approach is quite new and needs an in-depth evaluation in terms of linguistic adequacy and usability for technical writers. It is still at an early research stage, however: evaluation is designed to identify directions for improvement rather than to give definitive performance figures for the approach.
Besides negation and fuzzy lexical items, we are exploring additional facets of an error correction memory for the other major types of errors such as passives, future forms, deverbals, N+N constructions (Garnier, 2011), misplaced discourse structures, and complex sentences. Integration into a real company information system is ongoing and would permit a more suitable evaluation of the service and of its evolution.
1. | Alred, G. J., Brusaw, C. T., & Oliu, W. E., (2012), Handbook of Technical Writing, NY: St Martin's Press. |
2. | Baader, F., & Nipkow, T., (1998), Term rewriting and all that, London, England: Cambridge University Press. |
3. | Barcellini, F., Albert, C., & Saint-Dizier, P., (2012), Risk analysis and prevention: LELIE, a tool dedicated to procedure and requirement authoring, LREC, p698-705. |
4. | Boulle, L., & Mesic, M., (2005), Mediation: Principles Processes Practice, Australia: Butterworths. |
5. | Buchholz, S., (2002), Memory-based grammatical relation finding, Doctoral dissertation, Retrieved from http://ilk.uvt.nl/team/sabine/diss/buchholz_diss.pdf. |
6. | Croce, D., Moschitti, A., Basili, R., & Palmer, M., (2012), Verb classification using distributional similarity in syntactic and semantic structures, Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, 1, p263-272. |
7. | Cruse, A., (1986), Lexical semantics, London, England: Cambridge University Press. |
8. | Daelemans, W., & van den Bosch, A., (2005), Memory-based language processing, London, England: Cambridge University Press. |
9. | Fuchs, N. E., Kaljurand, K., & Kuhn, T., (2008), Attempto Controlled English for knowledge representation, In C. Baroglio, P. A. Bonatti, J. Maluszynski, M. Marchiori, A. Polleres, & S. Schaffert (Eds.), Reasoning Web, Fourth International Summer School 2008, Lecture Notes in Computer Science 5224, p104-124, Springer. |
10. | Fuchs, N. E., (2012), First-Order Reasoning for Attempto Controlled English, In Proceedings of the Second International Workshop on Controlled Natural Language (CNL 2010), Springer. |
11. | Ganier, F., & Barcenilla, J., (2007), Considering users and the way they use procedural texts: some prerequisites for the design of appropriate documents, In D. Alamargot, P. Terrier, & J.-M. Cellier (Eds.), Improving the production and understanding of written documents in the workplace, p49-60, US: Elsevier Publishers. |
12. | Garnier, M., (2011), Correcting errors in N+N structures in the production of French users of English, EuroCall, p59-63. |
13. | Grady, J. O., (2006), System Requirements Analysis, US: Academic Press. |
14. | Hong, Y., & Zhang, J., (2015), Investigation of terminology coverage in radiology reporting templates and Free‐text Reports, International Journal of Knowledge Content Development & Technology, 5(1), p5-14. |
15. | Horn, L., (2001), A natural history of negation, D. Hume series, US: University of Chicago Press. |
16. | Hull, E., Jackson, K., & Dick, J., (2011), Requirements Engineering, US: Elsevier Publishers. |
17. | Kim, K. Y., Kim, S. Y., & Kim, H. M., (2012), Study of analyzing outcome of building and introducing system for preserving Full-Text of e-Journal, International Journal of Knowledge Content Development & Technology, 2(2), p5-16. |
18. | Kuhn, T., (2013), A Principled Approach to Grammars for Controlled Natural Languages and Predictive Editors, Journal of Logic, Language and Information, 22(1), p33-70. |
19. | Kuhn, T., (2014), A survey and classification of controlled natural languages, Computational Linguistics, 40(1), p121-170. |
20. | O'Brien, S., (2003), Controlling Controlled English. An Analysis of Several Controlled Language Rule Sets, Dublin City University report. |
21. | Pfenning, F. (Ed.), (1992), Types in logic programming, p215-223, Cambridge: MIT Press. |
22. | Saint-Dizier, P., (2012), Processing natural language arguments with the TextCoop platform, Argument & Computation, 3(1), p49-82. |
23. | Saint-Dizier, P. (Ed.), (2014), Challenges of discourse processing: the case of technical documents, London, England: Cambridge University Press. |
24. | Schriver, K. A., (1989), Evaluating text quality: The continuum from text-focused to reader-focused methods, IEEE Transactions on Professional Communication, 32, p238-255. |
25. | Unwalla, M., (2004), AECMA Simplified English, Communicator, Winter, p34-35, Retrieved from http://www.techscribe.co.uk/ta/aecma-simplified-english.pdf. |
26. | Van der Linden, K., (1993), Speaking of Actions: choosing Rhetorical Status and Grammatical Form in Instructional Text Generation. |
27. | Weiss, E. H., (2000), Writing remedies, Practical exercises for technical writing, Oryx Press. |
28. | White, C., & Schwitter, R., (2009), An Update on PENG Light, In L. Pizzato & R. Schwitter (Eds.), Proceedings of ALTA 2009, Sydney, Australia, p80-88. |
29. | Wyner, A., Angelov, K., Barzdins, G., Damljanovic, D., Davis, B., Fuchs, N., & Luts, M., (2010), On controlled natural languages: Properties and prospects. In Controlled Natural Language, p281-289, Germany, Berlin Heidelberg: Springer. |
E-mail: stdizier@irit.fr
Patrick SAINT-DIZIER, PhD, is a senior researcher in Computational Linguistics and Artificial Intelligence at CNRS, IRIT, Toulouse, France. He specializes in discourse and semantic analysis. He has developed several national and European projects dedicated to logic programming, argumentation and technical text analysis. He is the author of more than 80 conference and journal articles and of 11 books. Besides foundational research, he has long experience of research and development activities.
His research merges foundational aspects with empirical studies and prototype development. A few generic ideas guide my team's work: