The action is more properly defined as a function, which is a transformation performed on the entity, defined as an object, to modify it into a so-called product. The device that performs the function is defined more broadly and comprehensively: it is not limited to its structure, and is referred to as a technical system. The latter consists of four main parts. The tool is the part that is in direct contact with the object during the performance of the function, through a mechanical, acoustical, thermal, chemical, electrical, magnetic, intermolecular, or biological interaction. The supply is the part that generates the energy for performing the function, the transmission is the part that transmits energy from the supply to the tool, and the control is the part that regulates the operation of the technical system by interacting with the other parts.
The definition of the technical system is sufficiently broad to model devices that are significantly different from each other. For example, in the case of a hammer, the technical system includes both the device and the user: the tool is the head of the hammer, and the transmission is the set formed by the hammer handle and the hand and arm of the user. The supply is the user’s muscles, and the control is the sight and hand of the user when they direct the hammer head exactly onto the nail (object) to hit it (function). In the case of a computer numerical control (CNC) machine, the technical system coincides with the device, since the tool is the utensil, the transmission consists of the axes and kinematics that move it, the supply is the set of electric motors that rotate and move the utensil, and the control is the set of automatic sensors.
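The four-part model above can be written down as a small data structure. The following is only a minimal sketch: the class and field names (`TechnicalSystem`, `FunctionModel`, `describe`) are assumptions introduced for illustration and are not part of the ontology itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TechnicalSystem:
    """The four parts of a technical system, as described in the text."""
    tool: str          # part in direct contact with the object
    transmission: str  # carries energy from the supply to the tool
    supply: str        # generates the energy for the function
    control: str       # regulates the operation of the other parts


@dataclass(frozen=True)
class FunctionModel:
    """A technical system performing a function on an object."""
    system: TechnicalSystem
    function: str  # verb, e.g. "hit"
    obj: str       # entity acted upon, e.g. "nail"

    def describe(self) -> str:
        return f"{self.system.tool} -> {self.function} -> {self.obj}"


# The hammer example from the text
hammer = FunctionModel(
    system=TechnicalSystem(
        tool="hammer head",
        transmission="hammer handle, hand and arm of the user",
        supply="user's muscles",
        control="sight and hand of the user",
    ),
    function="hit",
    obj="nail",
)
print(hammer.describe())  # hammer head -> hit -> nail
```

The CNC machine can be modelled with the same structure, the only difference being that all four parts belong to the device.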
The objective of this study can therefore be reformulated, in line with this ontology, as the automatic determination from patents of the technical systems that perform a function on an object, both assigned by the user.
To fulfil this objective, the proposed method consists of two main steps, presented in detail in the following. The first one consists of the determination of the document pool to be analysed, which is strictly dependent on the definition of the problem space, that is, how the initial problem to be solved is defined. The second step consists of extracting the design elements from the pool of documents.
3.1. Problem Space and Pool Definition
In design, the definition of the starting problem is a fundamental task, since it can greatly influence the search for solutions. This is because, in one of its most established definitions, design is considered a problem-solving activity, where the starting problem coincides with a problem space. The latter contains all the details that define the problem, together with the constraints and boundary conditions, while the alternative solutions are contained in the solution space [28]. In this regard, the extent of the solution space depends on the extent of the problem space, as well as on the design methods, knowledge, and creativity of the designer.
In the proposed approach, the design knowledge extracted from patents is strictly dependent on the definition of the problem space, through the definition of the search query to be used in the patent search. Assuming that the starting problem has been correctly formulated, which is beyond the scope of this study, the first tedious task that arises in the application of the proposed method therefore concerns the definition of the search query.
One of the best ways to define a search query is to include a verb and a direct object. They are, respectively, the function and the object described in Table 1. This strategy is in line with the correct definition of the design problem according to the TRIZ method, because it avoids any reference to the tool and the technical system, which are instead the design goal, i.e., the solution, and thereby any inertia that may affect the design activity. The same reasoning is also valid at the level of patent search, as it increases the recall of the identified documents and therefore enlarges the solution space [24]. In many cases, however, the problem may be far from trivial, given the many possible ways of understanding the same function in relation to the level of detail, and the repercussions that its definition may have on the retrieved results [29].
Knowledge of the problem space is therefore also fundamental in order to allow the expansion of the query, including in it synonyms of both the verb used for the function and the noun used as object. This option makes it possible to greatly expand the results obtained in a database search, and the possible solutions, but only if the synonyms are carefully selected. There are several supporting tools, such as the Oxford Dictionary, for expanding query terms by looking for synonyms and modifiers.
Finally, the effectiveness of such a query also depends on the possibility of syntactically relating function and object as verbal predicate and direct object. Many document databases allow the query to be defined in this way, while others allow a semantic relation to be established, for instance by requiring the two terms to appear in the same sentence or within a given distance of each other through a proximity operator. To retrieve design solutions, patent databases are suggested, because they contain most of the world’s patents, i.e., over 110 million. In addition, the search can be streamlined when conducted in subsets, such as the WIPO, the USPTO, or the entire EP collection, each of which counts a smaller number of patents.
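The construction of such a query can be sketched in a few lines. The example below reproduces the string format of the query reported in Section 3.2.1; the function name `build_query` and its parameters are hypothetical. The reading of `+` as truncation and of `s` as a same-sentence operator is an assumption based on the surrounding text, so the syntax should be checked against the database actually used.

```python
def build_query(function_terms, object_terms, operator="s",
                fields=("TI", "AB", "TX")):
    """Compose a function-object query string.

    Each term is truncated with '+' and the two groups are joined by an
    operator (assumed here to mean "in the same sentence").
    """
    if not function_terms or not object_terms:
        raise ValueError("both a function and an object are required")
    verbs = " or ".join(f"{term}+" for term in function_terms)
    nouns = " or ".join(f"{term}+" for term in object_terms)
    suffix = "".join(f"/{field}" for field in fields)
    return f"(({verbs}) {operator} ({nouns})){suffix}"


# Function "cut" with two synonyms, object "metal"
query = build_query(["cut", "divid", "slit"], ["metal"])
print(query)  # ((cut+ or divid+ or slit+) s (metal+))/TI/AB/TX
```

Keeping the synonyms in plain lists makes it easy to review them by hand before launching the search, which matters because the expansion helps only if the synonyms are carefully selected.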
3.2. Technical Systems Extraction
The objective of this step is to extract the technical systems from the patents collected with the query formulated in Section 3.1.
The starting hypothesis of this study is that all the design elements considered, both those that are to be obtained (i.e., the technical system) and those used to obtain them (i.e., function and object), have precise syntactic roles within the sentences of the analysed documents. Furthermore, it is also assumed that there is a finite number of lexical forms (e.g., by, in order to) and nouns (e.g., method, mean, system), which are used recurrently to link design elements at the grammatical level.
Based on this assumption, the proposed method uses a semantic and syntactic analysis of patents to determine the technical systems, performed automatically using NLP (natural language processing) tools. To use these tools correctly for this analysis, however, it is necessary to define a rule, the purpose of which is to explain how the technical system can be related to the elements of the sentence. For instance, in the phrase “A laser cuts a metal by evaporation”, “laser”, which is the technical system, is also the subject, “cut” is the function and the verbal predicate of “laser”, and “metal” is the object and the direct object. If all sentences were formulated in this way, a single rule would suffice for our analysis. However, the design elements do not always appear in the same sentence; they may be scattered over several sentences and still be logically connected. Consider, for example, the following two hypothetical sentences from the same patent text: “The claimed device cuts a metal” and “Said device is a laser”.
For this reason, three rules have been developed in this study to recover more technical systems, even in cases such as the second example.
Each rule provides three types of directions to be followed:
The identification of which sentences the rules work on to extract technical systems, depending on the other design elements contained therein, i.e., other technical systems, functions, objects, and other lexical elements.
What syntactic roles the design and the other lexical elements can have in the various sentences, with the aim of isolating the possible technical systems with automatic logical analysis.
Which dependency patterns can be used in sentences to express design and other lexical elements to automatically refine the results of the logical analysis, improving the precision of the results.
In the following paragraphs, the three rules are explained in detail.
3.2.1. Rule 1
Rule 1 explains how to extract the technical system from a single sentence (called the main sentence) in which the function and object also appear (see Table 2). This case is one of the most common, and is typically considered by most automatic text analysis methods.
At the syntactic level, the hypothesis on which Rule 1 is based is that, in a sentence, the technical system is the subject, the function is the verbal predicate, the object is the direct object, and the behaviour is an indirect expansion sustained by certain lexical forms. With this logic, the extraction of the technical system and behaviour can then be performed automatically by tools that are able to perform logical sentence analysis.
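The syntactic hypothesis of Rule 1 can be sketched as a lookup over a dependency parse. In the example below the parse is written by hand rather than produced by an NLP tool, and the labels `nsubj`, `dobj` and `ROOT` follow the naming commonly used by dependency parsers; the `Token` type and the function `extract_sao` are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Token(NamedTuple):
    text: str
    lemma: str
    dep: str   # dependency label, e.g. "nsubj", "dobj", "ROOT"
    head: int  # index of the syntactic head within the sentence


def extract_sao(tokens, function_lemma, object_lemma):
    """Return (subject, action, object) if the sentence matches Rule 1."""
    for i, tok in enumerate(tokens):
        if tok.lemma != function_lemma:
            continue
        # Subject and direct object must both depend on the function verb
        subj = next((t for t in tokens if t.head == i and t.dep == "nsubj"), None)
        dobj = next((t for t in tokens if t.head == i and t.dep == "dobj"), None)
        if subj is not None and dobj is not None and dobj.lemma == object_lemma:
            return (subj.lemma, tok.lemma, dobj.lemma)
    return None


# Hand-written parse of "A laser cuts a metal by evaporation"
sentence = [
    Token("A", "a", "det", 1),
    Token("laser", "laser", "nsubj", 2),
    Token("cuts", "cut", "ROOT", 2),
    Token("a", "a", "det", 4),
    Token("metal", "metal", "dobj", 2),
    Token("by", "by", "prep", 2),
    Token("evaporation", "evaporation", "pobj", 5),
]
print(extract_sao(sentence, "cut", "metal"))  # ('laser', 'cut', 'metal')
```

The sketch returns only the SAO triad; the behaviour ("by evaporation") would require a further step on the prepositional attachment of the verb.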
Most NLP tools that can process a sentence like the one considered by Rule 1 require a lemma (i.e., noun, verb, adjective) as input from the user, and use it to search for terms related to it syntactically or semantically. Concerning the choice of the lemma, the function-oriented approach [24] suggests directly using the verb expressing the function, as it is more effective in determining the SAO (Subject–Action–Object) triads to be considered according to Rule 1, i.e., those in which the subject is the technical system and the direct object is the object.
This is because not all SAO triads containing the function and object also contain the technical system. According to the research carried out for this study, the subject may fail to be the correct one in several different cases.
The technical system may not be recognised correctly when it is expressed by two terms: e.g., in the sentence “The cutting wheel cuts a metal layer”, an NLP tool based only on syntactic analysis can only return “cutting” as the subject, but not “wheel”, effectively losing the information that is most needed to define the technical system. Similarly, when two technical systems execute the function, e.g., “A mold and a counter-mold cut the metal”, only one of the two may be automatically recognised as the subject.
Finally, there are all the sentences in which the subject is not a technical system, e.g., “the table (subject) for cutting metal includes a circular saw (technical system)”.
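The first two failure cases, multi-word and coordinated subjects, can be mitigated when a dependency parse is available, by following the modifier and conjunction links of the subject. The sketch below assumes that the parser labels the sentences as shown in the hand-written parses, which is not guaranteed for every tool; `subject_phrases` and the `Token` type are hypothetical names.

```python
from typing import NamedTuple


class Token(NamedTuple):
    text: str
    dep: str   # dependency label
    head: int  # index of the syntactic head


def subject_phrases(tokens, verb_index):
    """Collect every subject of the verb, with its modifiers and conjuncts."""
    def phrase(idx):
        # Prepend modifiers such as "cutting" in "cutting wheel"
        mods = [j for j, t in enumerate(tokens)
                if t.head == idx and t.dep in {"compound", "amod"}]
        return " ".join(tokens[j].text for j in sorted(mods + [idx]))

    heads = [i for i, t in enumerate(tokens)
             if t.head == verb_index and t.dep == "nsubj"]
    # Coordinated subjects are attached to the first subject as "conj"
    for h in list(heads):
        heads += [i for i, t in enumerate(tokens)
                  if t.head == h and t.dep == "conj"]
    return [phrase(h) for h in heads]


# Hand-written parse of "The cutting wheel cuts a metal layer"
wheel = [
    Token("The", "det", 2), Token("cutting", "amod", 2),
    Token("wheel", "nsubj", 3), Token("cuts", "ROOT", 3),
    Token("a", "det", 6), Token("metal", "compound", 6),
    Token("layer", "dobj", 3),
]
# Hand-written parse of "A mold and a counter-mold cut the metal"
molds = [
    Token("A", "det", 1), Token("mold", "nsubj", 5), Token("and", "cc", 1),
    Token("a", "det", 4), Token("counter-mold", "conj", 1),
    Token("cut", "ROOT", 5), Token("the", "det", 7), Token("metal", "dobj", 5),
]
print(subject_phrases(wheel, 3))  # ['cutting wheel']
print(subject_phrases(molds, 5))  # ['mold', 'counter-mold']
```

The third case, in which the subject is not a technical system at all, cannot be handled in this way, since the parse of that sentence is correct and only the role of the subject differs.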
All these errors are strictly dependent on the software being used. For instance, all those reported were found experimentally with the Sketch Engine software (https://www.sketchengine.eu/, accessed on 23 July 2023), while spaCy (the open-source library for NLP in Python) is able to identify subjects defined in a much more complex way, including structural characteristics. For instance, in the sentence “the wear-resistant silicide-based cermet tool to cut a metal is made from the following raw materials in parts by weight: 50 parts of titanium silicide, 15 parts of zirconium silicide”, spaCy identifies, as subject, “the wear-resistant silicide-based cermet tool is”, rightly attaching to it also “50 parts of titanium silicide, 15 parts of zirconium silicide” as an appositional modifier.
In any case, to facilitate the identification of the technical systems within Rule 1, our proposal is therefore to also perform a semantic processing of the sentences obtained with the syntactic analysis, by launching an automatic sub-routine. The latter exploits certain dependency patterns (see
Table 3) that typically introduce a technical system. They consist of the union of the function and some lexical forms.
The starting point for the collection of these dependency patterns was the analysis of the patterns collected for other purposes, i.e., to automatically catalogue risk assessment methods from scientific papers [
30] and to extract functions and application fields from patents [
3]. In this work, a new analysis was carried out ad hoc to confirm the validity of the dependency patterns retrieved from the literature, through the following procedure.
First, a patent pool related to metal cutting was analysed. The query used was “((cut+ or divid+ or slit+ or chop+ or separat+ or lanc+ or hash+ or sever+ or cleav+ or rend+ or sunder+ or dissever+) s (metal+))/TI/AB/TX”, which was launched in the Fampat patent database. Just over three million patents were retrieved through this query. Patents with priority dates in 2016 were arbitrarily excluded from this pool. To simplify, only the queries of these patents whose sentences correspond to about 20% of the total sentences in the pool were processed using an NLP tool.
A search on the web identified 314 technical systems for metal cutting. This set includes both generic technical systems, e.g., lathe and milling machine, and more specific ones, e.g., CO2 laser. All 314 technical systems were then used as input for the automatic syntactic processing of the documents. A total of 185 were identified in the analysed pool.
All the sentences in which these 185 technical systems appear, syntactically linked to the function “cut” and the object “metal”, were manually analysed. From all these sentences, only those containing dependency patterns linking the technical system to the function were isolated. This resulted in 127 technical systems. The remaining sentences contain SAO triads of technical system, function and object without dependency patterns, e.g., “A laser cuts metal”.
Finally, all the identified dependency patterns, i.e., those used in conjunction with the 127 technical systems, were collected.
Table 4 shows all the analysed dependency patterns, from which the most common ones were extracted and reported in
Table 3.
From this analysis, some alternative dependency patterns were collected (see
Table 5).
3.2.2. Rule 2
Rule 2 explains how to extract a technical system from an incomplete sentence in which it appears without the function and/or object, using another auxiliary sentence in which these two elements appear. The auxiliary sentence generally appears before the sentence used, within the patent text. According to this rule, the mode used to link the technical system of the used sentence to the function and object of the auxiliary sentence is the presence of a generic technical system, i.e., a generic term (e.g., “method”, “mean”, “system”) commonly used in patents to indicate a technical system. A sub-variation of Rule 2 concerns the extraction of a new technical system from the incomplete sentence, using an auxiliary sentence specifying which technical system is referred to in the generic technical system.
Table 6 summarises the criteria for selecting the sentences considered in Rule 2.
At the syntactic level, in the auxiliary sentence, the generic technical system is the subject, the function is the verb, and the object is the direct object, while in the main sentence there are two possibilities: the generic technical system is the subject and the technical system to be retrieved is the direct object, or vice versa.
Rule 2 was developed to exploit a recurring stylistic pattern in patent writing, built on the use of a few sentences in which the technical system, function and object are stated, and of other sentences that provide further specifications. The aim is usually to broaden the scope of protection that a patent can offer at the legal level by claiming as many design variants as possible. This mechanism is very common, especially in the claims, although it is also found in the description. An independent claim defines the essential characteristics of the invention for which protection is sought, and serves to identify the invention. A dependent claim contains all the features of the independent claim to which it is linked, and indicates further features or variants for which protection is sought. It is not uncommon for a patent to have several independent claims and multiple dependent claims linked to them. Such alternative technical systems can be very useful during design, as they include real alternatives (e.g., lathe vs. milling machine), models and variations of the same device (e.g., numerically controlled lathe), and characteristics of its parts, such as materials and geometries (e.g., conical tip).
For our purposes, the analysis of all these “secondary” sentences (here called main sentences), whether they are part of the description or dependent claims, is very useful, as many alternative technical systems are included, often even in a single sentence. However, in these sentences, alternative technical systems do not always appear together with the main technical system, i.e., the one that is syntactically and/or semantically linked to the function and object in another sentence. Many times, patent writers prefer to use generic substitutes for both the technical system and the object, instead. The former is used by Rule 2 to identify technical systems.
The following procedure was used to identify these generic substitutes, which have been renamed as the “generic technical system”.
In the same patent pool used to determine the dependency patterns of Rule 1, only patents containing a limited set of technical systems, defined randomly, were automatically isolated, to limit the analysis. They are CNC, drill, laser, oxyfuel, plasma, and torch.
All the sentences of each patent that contain one of the selected technical systems (i.e., the main sentences in Table 2), and those containing the function, expressed with the root “cut” (e.g., cut, cuts, cutting), and the object, described by the root “metal” (e.g., metal, metals, metallic plate) (i.e., the auxiliary sentences in Table 6), were automatically collected.
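This sentence selection step can be sketched with plain regular expressions on the roots given above. The following is only an illustration under simplifying assumptions: matching on roots is crude (for example, “cut” also matches unrelated words starting with the same letters), and the function name `classify` is hypothetical.

```python
import re

# Roots of the function and the object, as given in the text
FUNCTION_ROOT = re.compile(r"\bcut\w*", re.IGNORECASE)
OBJECT_ROOT = re.compile(r"\bmetal\w*", re.IGNORECASE)

# The limited set of technical systems used to restrict the analysis
TECHNICAL_SYSTEMS = ["CNC", "drill", "laser", "oxyfuel", "plasma", "torch"]
SYSTEM_RE = re.compile(
    r"\b(?:" + "|".join(TECHNICAL_SYSTEMS) + r")\w*", re.IGNORECASE
)


def classify(sentence):
    """Return the set of sentence types ('main', 'auxiliary') that apply."""
    labels = set()
    if SYSTEM_RE.search(sentence):
        labels.add("main")       # mentions one of the technical systems
    if FUNCTION_ROOT.search(sentence) and OBJECT_ROOT.search(sentence):
        labels.add("auxiliary")  # mentions both the function and the object
    return labels


print(classify("A method for cutting sheet metals"))  # {'auxiliary'}
```

A sentence such as “A laser cuts metal” receives both labels, since it contains all three design elements at once.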
These sentences were then processed individually with the spaCy software [4], used as a syntactic dependency parser. The collected results were then manually analysed.
A generic technical system was collected when three conditions held simultaneously, as established by Rule 2:
1. In the same patent, at least two sentences have been identified, one of the main-sentence type and one of the auxiliary-sentence type.
2. From spaCy’s analysis of the main sentence, the generic technical system is the subject. The verbal predicate is a verb, which may be combined with a syntactic particle, such as “is, can be, consist of, is made of, comprises, etc.”. The direct object is one of the considered technical systems.
3. From spaCy’s analysis of the auxiliary sentence, the same generic technical system is the subject, the function “cut” is the verb, and the object “metal” is the direct object.
Consider, for example, the two sentences taken from the patent CN107009098A, in which the technical system is “CNC milling machine”, which was used to extract the generic technical system “method”. The main sentence is “The method according to claim 1, wherein the processing tool comprises a CNC milling machine and performing the wire cutting process”. The auxiliary sentence used is claim 1: “A method for cutting sheet metals”. Together, the two sentences satisfy the requirements of point 1. As can be seen from Figure 2, taken from the main sentence analysis carried out with spaCy, the term “method” was recognised as the subject, “comprises” as its verbal predicate and “CNC milling machine” as its object, in line with point 2. Meanwhile, from Figure 2, taken from spaCy’s analysis of the auxiliary sentence, it can be seen that “method” has been recognised as the subject, “cutting” as the verbal predicate, and “sheet metal” as the object, as established by point 3.
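The three conditions and the CN107009098A example can be condensed into a short check on subject, predicate and object triples. The sketch assumes that the triples have already been obtained from a dependency parser and lemmatised; the `Triple` type, the function `rule2` and the content of `LINK_VERBS` (a lemmatised reading of the particles quoted above) are assumptions made for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str  # lemmatised verbal predicate
    obj: str


# Assumed lemmatised forms of "is, can be, consist of, is made of, comprises"
LINK_VERBS = {"be", "comprise", "consist of", "be made of"}


def rule2(main, auxiliary, function, obj_root, known_systems):
    """Return (generic technical system, technical system) or None.

    Condition 1 (both sentences come from the same patent) is assumed to be
    guaranteed by the caller, who passes the two sentences as a pair.
    """
    # Condition 2: generic subject, linking verb, known technical system
    cond2 = main.predicate in LINK_VERBS and main.obj in known_systems
    # Condition 3: same generic subject, searched function and object
    cond3 = (auxiliary.subject == main.subject
             and auxiliary.predicate == function
             and obj_root in auxiliary.obj)
    return (main.subject, main.obj) if cond2 and cond3 else None


main = Triple("method", "comprise", "CNC milling machine")
aux = Triple("method", "cut", "sheet metal")
print(rule2(main, aux, "cut", "metal", {"CNC milling machine"}))
# ('method', 'CNC milling machine')
```

Once a list of generic technical systems is available, the same check can be run in the opposite direction, treating the generic term as known and the direct object of the main sentence as the technical system to be retrieved.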
Following the same procedure for all the collected sentences, the generic technical systems shown in
Table 7 were identified. They are generic nouns (e.g., method) that can be preceded by an introductory particle (e.g., said).
At the same time, from the automatic analysis of the main sentences (point 2), it was also possible to isolate the verbal predicates that are most used to link the technical system with the generic technical system (see Table 8). They can be used both when the generic technical system is the subject and the technical system is the direct object, and vice versa. In general, all these dependency patterns are valid to help automatically extract technical systems, except for those in the last three lines, categorised as “additional”. The latter are, in fact, aimed at introducing constitutive elements of the claimed device, which may or may not be able to perform the function, and therefore may or may not be definable as a technical system, e.g., “fluid-cooling device”. Such elements may, however, be useful for design purposes if they are associated with the cutting device of which they are part, and not considered as its substitutes. With a view to automating the method, this fact should be duly considered to avoid errors when cataloguing the results.
From the analysis of the auxiliary sentence (point 3), instead, the dependency patterns of the verbal predicate (function) used to link the generic technical system to the object were determined (see Table 8 and Table 9). The completeness of these dependency patterns is supported by the work of [3], who, analysing a larger and more heterogeneous pool of documents, proposed the same list.
Finally, the analysis of the dependency patterns of the object in the auxiliary sentence did not reveal any new results compared to those shown in
Table 4.
3.2.3. Rule 3
Rule 3 explains how to obtain a new technical system from a sentence in which it appears together with another technical system that has already been identified by analysing other sentences with one of the rules already presented. The auxiliary sentence used in this case is the main sentence used in Rule 1. For this reason, the technical system is already known and, through the application of Rule 1, has been identified as the subject of the searched function executed on the object. The particularity of the main sentence is that in it the new technical system is defined as an alternative to, or a constituent part of (e.g., supply, transmission or tool), the known technical system.
Table 10 summarises the criteria for selecting the sentences considered by Rule 3.
The justification for adding Rule 3 is the same as that for Rule 2, i.e., the habit of adding several alternative technical systems to the main one, i.e., the subject matter of the invention of the patent. The way of working is different, however, since Rule 3 concerns those sentences in which the main technical system appears together with alternative ones.
In order to better understand how these sentences are constructed, i.e., which dependency pattern they use, the same experimental analysis carried out to define Rule 1 was performed, since in that case we had already isolated those which for Rule 1 were the main sentences and for Rule 3 become the auxiliary sentences. In addition to these, the main sentences of Rule 3, i.e., the sentences in which the main technical system is expressed, together with other alternative technical systems, were also collected.
In order to isolate the sentences to be analysed, we simply extracted all the sentences containing the 185 technical systems which appear in the auxiliary sentences (i.e., the main sentences of Rule 1). These technical systems are the main technical systems of the main sentences. Then, by manually analysing all these sentences, only those containing the alternative technical systems were extracted, and were automatically analysed with spaCy, to identify the dependency patterns present.
The result of this analysis is that the dependency patterns of the verbal predicate that are used to link alternative technical systems to the main ones are exactly all those already expressed in
Table 8.
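The logic of Rule 3 can be sketched as a filter on triples in which an already known technical system is linked to a new one. The triples below are hypothetical sentences built from the examples of alternatives mentioned for Rule 2, and the set of linking verbs is an assumption, since the content of Table 8 is not reproduced here.

```python
def rule3(triples, known_systems, link_verbs):
    """Return (new technical system, known technical system) pairs."""
    new = []
    for subject, predicate, obj in triples:
        if predicate not in link_verbs:
            continue
        # The known system may be either the subject or the direct object
        if subject in known_systems and obj not in known_systems:
            new.append((obj, subject))
        elif obj in known_systems and subject not in known_systems:
            new.append((subject, obj))
    return new


# Hypothetical lemmatised triples from the main sentences of Rule 3
triples = [
    ("lathe", "be", "numerically controlled lathe"),
    ("lathe", "comprise", "conical tip"),
    ("operator", "wear", "glove"),
]
found = rule3(triples, known_systems={"lathe"}, link_verbs={"be", "comprise"})
print(found)
# [('numerically controlled lathe', 'lathe'), ('conical tip', 'lathe')]
```

Keeping the known system next to each new result makes it possible to record a constituent part, such as the conical tip, as belonging to the lathe rather than as a substitute for it.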
3.2.4. Rule Application
The discussion of the application of rules must take into account two aspects: how many of them are used, i.e., some or all of them together, and in what order. Regarding the number, we must consider that Rule 1 and Rule 2 are independent, while Rule 3 presupposes the use of Rule 1, using its main sentences as auxiliary sentences. Therefore, the way the rules are constructed means that their order of use is rather constrained, leaving few combinations open, especially if all three are used. Another aspect to consider is the available resources, since each rule requires additional analysis. However, the more rules one considers, the more technical systems can be identified. Assuming that all the rules are used, and by virtue of this reasoning, a possible sequence of use is the one in which the rules have been presented. From this perspective, considering a patent pool, the application of Rule 1 allows us to collect a first set of technical systems. Rule 2 and Rule 3 can then retrieve new ones, reported in sentences that, unlike those analysed with Rule 1, do not contain all the design elements, thus expanding the set of results obtained.
Figure 3 offers a graphic schematisation of this application of the rules.
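The constraint on the order of the rules can also be expressed as a data flow, in which Rule 3 receives the output of Rule 1. The following is a minimal sketch: the function `apply_rules` is a hypothetical name, and the three toy rules are stand-ins that return fixed results only to show how the sets are combined.

```python
def apply_rules(sentences, rule1, rule2, rule3):
    """Run the three rules in the order in which they are presented."""
    sentences = list(sentences)
    from_rule1 = set(rule1(sentences))              # complete sentences
    from_rule2 = set(rule2(sentences))              # independent of Rule 1
    from_rule3 = set(rule3(sentences, from_rule1))  # needs the Rule 1 output
    return from_rule1 | from_rule2 | from_rule3


# Toy stand-ins for the real rules, used only to show the data flow
def toy_rule1(sentences):
    return {"laser"}


def toy_rule2(sentences):
    return {"CNC milling machine"}


def toy_rule3(sentences, known):
    return {"CO2 laser"} if "laser" in known else set()


result = apply_rules(["A laser cuts metal"], toy_rule1, toy_rule2, toy_rule3)
print(sorted(result))  # ['CNC milling machine', 'CO2 laser', 'laser']
```

Dropping Rule 2 leaves the other two rules unaffected, whereas dropping Rule 1 leaves Rule 3 without the known technical systems it starts from.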
The application of the rules, and the help they can provide to the design activity, depend on the logic with which the search for technical systems is carried out, i.e., by specifying the function and object, in response to which the rules retrieve the related technical systems. In this type of search, narrowing down the field by using a more specific function and/or object yields more specific technical systems, and vice versa. Another possibility is to set up a search to better explore the characteristics and constituent parts of one of the technical systems that have been identified. For example, instead of searching for technical systems related to the function “cut” and the object “metal”, it is possible to construct a pool based on the function “move” and the object “cutting tip”, to retrieve which transmission and supply can be used. The iterative application of the proposed method according to this multiple search logic can support the designer in building a knowledge base in which to search for solutions. Such a way of proceeding could best be grafted onto a sequential design approach (e.g., [31]), where a problem is broken down into sub-problems that are addressed sequentially.