Data mining, Bongard problems, and the concept of pattern conception


A. Linhares

Getulio Vargas Foundation, Brazil

Abstract

One of the major problems of data mining systems is the identification of classes, categories, and concepts. We introduce a new framework for categorization which is based on the concept of “pattern conception” (a term that may be contrasted with “pattern recognition”, “pattern matching”, “pattern perception”, etc.). There are important distinctions between pattern conception and the mainstream pattern recognition models; furthermore, these distinctions lead us to new information-processing architectures for categorization. The first major distinction is that there is more than one correct conception for each individual pattern. Each pattern may have numerous segmentations and descriptions which are fundamentally distinct but equally correct in a deep sense. Another striking distinction of pattern conception is the capability to “see as”, in which context guides the interpretation of data such that one object may be seen as if it were another type of object, or as if it were occupying the position or role of other objects. A final and related distinction is that there should be a ‘relativity theory’ view of concepts and categories, in which concepts are both defined by their relations to other concepts and activated by the spread of activation from other concepts. In this work, we analyze how these distinctions appear in three distinct application domains: (i) the notorious case of Bongard problems; (ii) letter-string analogies; and (iii) the game of chess (viewed as a pattern analysis problem). It may be concluded that data mining methods must be able to handle these distinctions if they are to be effective at pattern conception, and thus applicable to a wide class of information categorization problems.


1 Introduction

An outstanding problem for data mining arises from the fact that raw data does not come pre-classified into clear categories, clusters, or concepts. Consider a security system that has to mine a massive set of images to identify a specific person: precisely the type of hard, multidisciplinary AI problem that joins data mining and pattern classification. The major information processing paradigms used to classify and categorize raw patterns lack some important capabilities. This paper analyzes these missing capabilities and introduces a new framework for categorization which is based on the concept of pattern conception, a term that may be contrasted with “pattern recognition”, “pattern matching”, “pattern perception”, etc. Some important distinctions between pattern conception and the mainstream pattern recognition models are identified; furthermore, these distinctions lead us to new information-processing architectures for categorization.

Three distinct application domains reveal the mechanisms involved in pattern conception: (i) the notorious case of Bongard problems; (ii) the game of chess, viewed as a pattern analysis problem; and, finally, (iii) the much simpler (and now solved) case of letter-string analogies, which provides a pathway to the framework for pattern conception briefly sketched in Section 5. Let us start our analysis with Bongard problems.

2 The case of Bongard problems

Bongard problems (Fig. 1) are a set of pattern understanding problems proposed by the Russian scientist Michail Bongard [1]. Each problem holds two sets of boxed images, and the task is to find the aspect that distinguishes set A from set B. Data mining research has a lot to gain from Bongard problems, as they are related to some of the major challenges presented by Usama Fayyad [4], such as “ill-defined queries such as [to] find records similar to records in table A but not similar to table B”, or tasks such as search by example in multimedia databases.

An initial point that can be made up front is that in these problems, patterns are perceived in terms of concepts. That means that raw data is perceived in terms of specific ‘higher-level’ structures (and in terms of the relation of each structure to other structures). Thus, in BP 10, information processing must go from the particular visual idiosyncrasy of each box to the clear-cut, abstract, and conceptual answer: “triangles versus rectangles”, discarding, at some point in the process, the visual information itself in exchange for the symbolic, conceptual information.
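The final, purely conceptual step of a problem like BP 10 can be sketched as follows. This is a minimal illustration, not a solver: it assumes that each box has already been conceived as a small symbolic description, which is exactly the hard part discussed in this paper. The attribute names and the box descriptions are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical symbolic descriptions of the boxes of a BP 10-like problem.
# Obtaining such descriptions from raw images is assumed, not solved, here.
Box = Dict[str, object]

set_a: List[Box] = [
    {"shape": "triangle", "filled": True, "size": "large"},
    {"shape": "triangle", "filled": False, "size": "small"},
    {"shape": "triangle", "filled": True, "size": "small"},
]
set_b: List[Box] = [
    {"shape": "rectangle", "filled": True, "size": "small"},
    {"shape": "rectangle", "filled": False, "size": "large"},
    {"shape": "rectangle", "filled": False, "size": "small"},
]


def separating_attributes(a: List[Box], b: List[Box]) -> Dict[str, Tuple[object, object]]:
    """Return attributes that are constant inside each set but differ between sets."""
    result = {}
    for attr in sorted(a[0].keys()):
        values_a = {box[attr] for box in a}
        values_b = {box[attr] for box in b}
        # An attribute separates the sets if each set agrees on one value
        # and the two values are different.
        if len(values_a) == 1 and len(values_b) == 1 and values_a != values_b:
            result[attr] = (values_a.pop(), values_b.pop())
    return result


print(separating_attributes(set_a, set_b))
```

Once the visual information has been discarded in favor of symbols, finding the answer is trivial; the difficulty lies in choosing which description of each box to build in the first place.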

Figure 1. Bongard problems 10, 19, 52, and 91 [From M. M. Bongard, 1970]

In BP91, something interesting happens: note how the conception of the squares differs in classes A and B: the square in class B is not taken as a unit; it is not conceived as a square at all. It is interpreted as four individually distinct line segments that meet at endpoints. At the same time, in class A, each of the squares is taken as a single unit. Taken as a whole, they are not broken down into their composing line segments; a direct conclusion is that there are multiple ways of conceiving the same original data (see Linhares [6]).

The term “recognition” implies the existence of things to be recognized and also of a set of categories to recognize them into, leading easily to the idea that each thing to be recognized has a corresponding predefined category of which it is to be recognized as an instance. Pattern conception, on the other hand, implies that a specific pattern can be conceived as some particular concept (as opposed to must be conceived). Under this view, a peculiar pattern is not such a ‘thing to be recognized’ at all, as there may be many competing conceptions. Thus we may conceive a triangular arrangement as a unique object, as three separate objects, or as something else entirely (as Bongard problems show). By doing this we are applying the process of “seeing as”, conceiving a pattern as some abstract concept, and that process is unlike merely labeling or “recognizing” an instance of something.

Figure 2 provides examples of visual arrangements that evoke a triangle; under the proper context a Bongard problem could include all of them as “triangles”. On the surface none of the boxes contains a triangle; however, each one evokes a triangle, and they reinforce each other in the sense that a clear idea of a triangle emerges. A triangle, however, is a three-sided polygon (formed by line segments) with a precise mathematical definition; either something is a triangle or it is not. The point is, these figures cannot be recognized as triangles, but they can be conceived as triangles in an abstract sense. So what is involved in this conception process? Let us look closer at this process in the form of pattern-analysis chess.


Figure 2. Conception versus recognition: we can conceive triangles from these figures, but we cannot recognize any triangles in them.

3 Pattern-analysis chess

Chess has traditionally been viewed in AI as a classic symbolic-paradigm search problem: given an initial state and an overarching goal, the system searches through a myriad of candidate options in search of an optimum. However, as psychologists have repeatedly pointed out, humans do not play chess by brute-force search [2,3,9]. Skilled human players hardly ‘search through’ the great number of game-tree nodes explored by computer programs. Human masters, however, are exceptionally skilled at reconstructing board positions after having seen them for only a few seconds. It is well known that humans perceive a chessboard in light of their previous experience of the game, rapidly structuring a position into familiar terms; embedded in this perception lies the intelligence of the human chess player. It is a great scientific challenge to design a chess program which would simulate the cognitive processing of a skilled human chess player.

It is known that humans carve board positions into chunks, and eye saccade studies have shown that effort is concentrated on the most important regions of the board, with special emphasis on pieces under “chess relations” (attacks and defenses). Let us consider the board position of Fig. 3. Consider the relation between the white knight and the black queen: the first can reach (attack) the queen in 1 move, while the queen can attack it in 2 moves. This kind of relationship can be described by a graph in which each (directed, weighted) edge denotes the minimum number of moves between two pieces (represented by nodes). This representational scheme seems to be in line with the view that “chess players characterize the board spatially” [3, p.7].
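One way to build such a graph is sketched below. It is only an illustration under strong simplifying assumptions: distances are computed by breadth-first search on an otherwise empty board (blocking pieces, checks, and turn order are ignored), only knights and queens are handled, and the piece placement is hypothetical rather than the position of Fig. 3.

```python
from collections import deque

KNIGHT_STEPS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
QUEEN_DIRS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]


def on_board(f, r):
    return 0 <= f < 8 and 0 <= r < 8


def moves(piece, square):
    """Squares reachable in one move on an empty board (assumption: no blockers)."""
    f, r = square
    if piece == "N":
        return [(f + df, r + dr) for df, dr in KNIGHT_STEPS if on_board(f + df, r + dr)]
    if piece == "Q":
        reachable = []
        for df, dr in QUEEN_DIRS:
            nf, nr = f + df, r + dr
            while on_board(nf, nr):
                reachable.append((nf, nr))
                nf, nr = nf + df, nr + dr
        return reachable
    raise ValueError(f"unsupported piece type: {piece}")


def min_moves(piece, start, target):
    """Breadth-first search for the minimum number of moves from start to target."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        square, dist = queue.popleft()
        if square == target:
            return dist
        for nxt in moves(piece, square):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def distance_graph(pieces):
    """Map each ordered pair of pieces to the minimum number of moves between them."""
    graph = {}
    for src, (ptype, square) in pieces.items():
        for dst, (_, target) in pieces.items():
            if src != dst:
                graph[(src, dst)] = min_moves(ptype, square, target)
    return graph


# Hypothetical placement: knight on e4, queen on f6 (files and ranks counted from 0).
pieces = {"white_knight": ("N", (4, 3)), "black_queen": ("Q", (5, 5))}
graph = distance_graph(pieces)
print(graph)
```

With this placement the knight reaches the queen in 1 move and the queen reaches the knight in 2, the same asymmetric relation described in the text.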

Note that many abstract relations are described in the graph. For instance, we can perceive a fork as displayed in the graph; it is not the only possible fork, but it is the most relevant one for the interpretation of the situation at hand. This is once again a process of pattern conception. There are multiple candidate conceptions of how the pieces combine to defend and to attack the opposing camp. Each piece threatens to occupy some board squares in the following several moves; it is how these threats are conceived as a whole that differentiates skilled from unskilled human players, and enabling a machine to conceive the board in an insightful way is a major challenge for data mining research. This multiplicity of interpretations (conceptions) that change according to context is a first similarity to Bongard problems.


A second similarity is the view of analogy between chunks (in the case of chess, it is more appropriate to use the term chunk instead of concept, as the term concept evokes the possibility of being expressible). If we look at the graph and discard the identity of the pieces, we can obtain a set of positions that are analogous to this one in an abstract and profound sense: positions in which the very same pressures emanate from a radically distinct variety of positions (as a trivial example, the fork could be emanating from a black queen, and nothing essential would be changed in the graph structure).
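The idea of discarding piece identities can be illustrated with a small structural comparison. The sketch below treats a chunk as a dictionary of directed, weighted edges and tests whether two chunks are the same up to a renaming of the pieces. The brute-force search over renamings is an assumption made for brevity; it is only practical for chunks with a handful of pieces, and the example chunks are hypothetical.

```python
from itertools import permutations


def nodes_of(graph):
    """Collect the piece names that appear in any edge."""
    return sorted({name for edge in graph for name in edge})


def same_structure(g1, g2):
    """True if g1 can be relabeled into g2, i.e. the graphs match once identities are discarded."""
    n1, n2 = nodes_of(g1), nodes_of(g2)
    if len(n1) != len(n2) or len(g1) != len(g2):
        return False
    for perm in permutations(n2):
        mapping = dict(zip(n1, perm))
        relabeled = {(mapping[a], mapping[b]): w for (a, b), w in g1.items()}
        if relabeled == g2:
            return True
    return False


# A fork: one piece attacks two others in one move each.
knight_fork = {("white_knight", "black_king"): 1, ("white_knight", "black_rook"): 1}
queen_fork = {("black_queen", "white_king"): 1, ("black_queen", "white_rook"): 1}
# Not a fork: a chain of two single attacks.
attack_chain = {("white_knight", "black_king"): 1, ("black_rook", "white_knight"): 1}

print(same_structure(knight_fork, queen_fork))    # True
print(same_structure(knight_fork, attack_chain))  # False
```

The two forks are analogous even though different pieces of different colors play the roles, while the chain of attacks has a different structure despite involving the same pieces as the first fork.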

A third similarity with Bongard problems is that each particular board position is concrete, rich in all its peculiarity and its unstructured idiosyncrasy. But the conception in the graph representation is, the moment one discards the identity of the pieces, purely abstract and structured.

Figure 3. White to move: mate in one. Spatial graph representations may be used for chunks to be mined in KDD.

If we consider the game of chess as a data mining problem, it can be said that there is a “language of chess”, which is gradually acquired by human players over the course of their lives. It enables one to distinguish certain pressures on the board and to prioritize them over other pressures; in fact, it enables skilled players to see the dangers and opportunities overlooked by less skilled players. This ‘language’ is composed of chunks, which denote threats, blocks, defenses, forks, and a multitude of specific configurations. It is an outstanding challenge for data mining systems to identify such a ‘language’.

These points having been raised, let us look at a third ill-structured problem type with these characteristics: letter string analogies. From this simpler problem, we may sketch an information processing framework for pattern conception.


4 Letter-string analogies

Consider the problem of letter string analogies, such as: if ABC changes to ABD, how can we change IJK in a similar way? Though these problems are much simpler than either Bongard problems or pattern-analysis chess, they still require a considerable effort of pattern conception.

First of all, there is a multiplicity of possibilities of conceiving each particular string. Let us analyze a simple example: if AABC changes to AABD, how can we change IJKK in a similar way?

One may be tempted at first to answer IJKL, by interpreting the transformation as a simple “change last letter to its successor”. This is one possibility, but it ignores the fact that in the original strings the AAs appear together as a group, just as the KKs do in IJKK; one way to account for this in the answer would be to use IJLL, by applying a rule such as “change last group of letters to its successor”. Other possibilities abound, such as JJKK (change the first letter to its successor, nearly inverting the original rule and obtaining an awkward-sounding configuration, given its asymmetry with the originals).

Two possibilities preserve symmetry with the original transformation: IJLK and HJKK. The first one, IJLK, comes from the conception of the original transformation as A-ABC=>A-ABD. This implies the rule “change last letter of the increasing sequence to successor”, which, when applied to IJK-K, yields IJLK. This is one way to conceive the strings and the transformation. Another way is AA-B-C=>AA-B-D, with the rule “change last letter to successor” being transformed into “change FIRST letter to PREDECESSOR”, in order to account for a symmetric relation between AA-B-C and I-J-KK, and also between AA-B-D and H-J-KK. In this manner, the KK stays fixed (just as the AA does) and the relation B-D is symmetric to H-J, despite the inversion of order.

So once again we have multiple interpretations: AABC interpreted as AA-B-C, as A-ABC, or as something else entirely. Each of these interpretations will lead to a specific conceptual (chunk) activation: AA-B-C may activate the concept “same letter group”, while A-ABC activates “increasing sequence”; each conceptual activation in turn yields distinct analogous solutions to the problem.
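The competing rules discussed above can be written down directly, which makes the multiplicity of answers concrete. The sketch below is only an illustration of the rules as stated in the text, not the model of [8]; the function names are hypothetical, and letters at the ends of the alphabet (the successor of Z, the predecessor of A) are deliberately not handled.

```python
def succ(letter):
    return chr(ord(letter) + 1)


def pred(letter):
    return chr(ord(letter) - 1)


def last_letter_to_successor(s):
    return s[:-1] + succ(s[-1])


def last_group_to_successor(s):
    # The last group is the trailing run of identical letters.
    start = len(s) - 1
    while start > 0 and s[start - 1] == s[-1]:
        start -= 1
    return s[:start] + succ(s[-1]) * (len(s) - start)


def last_of_increasing_run_to_successor(s):
    # Find the longest run in which each letter is the successor of the previous one,
    # then change only the last letter of that run.
    best_start, best_end = 0, 0
    start = 0
    for i in range(1, len(s)):
        if s[i] != succ(s[i - 1]):
            start = i
        if i - start > best_end - best_start:
            best_start, best_end = start, i
    return s[:best_end] + succ(s[best_end]) + s[best_end + 1:]


def first_letter_to_successor(s):
    return succ(s[0]) + s[1:]


def first_letter_to_predecessor(s):
    # Mirrored version of "change last letter to successor".
    return pred(s[0]) + s[1:]


conceptions = {
    "change last letter to its successor": last_letter_to_successor,
    "change last group of letters to its successor": last_group_to_successor,
    "change last letter of the increasing sequence to successor": last_of_increasing_run_to_successor,
    "change first letter to its successor": first_letter_to_successor,
    "change FIRST letter to PREDECESSOR": first_letter_to_predecessor,
}

answers = {rule: apply(("IJKK")) for rule, apply in conceptions.items()}
for rule, answer in answers.items():
    print(f"{rule}: IJKK => {answer}")
```

The first three rules all turn AABC into AABD, so the source transformation alone cannot decide among them; which rule is applied to IJKK depends on how the strings are conceived.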

Letter-string analogies are some of the few problems of pattern conception that have been solved satisfactorily (see [5,8]); hence they point us to promising pathways towards the solution of Bongard problems, pattern-analysis chess, “ill-defined queries such as to find records similar to those in table A but not similar to those in table B”, and search by example in multimedia databases. In the following section, we abstract from the particular model used in letter-string analogies by Mitchell [8] to sketch a framework for pattern conception.

5 A framework for pattern conception

Let us consider the information processing involved in such a pattern conception process. The system must start with patterns composed of raw uncategorized data (such as visual images). At this point there is neither identity nor individuality, as no objects have been identified. Thus, in Bongard problems, the patterns are two-dimensional binary images; no objects and structures have been identified (and thus no object has been individualized). In pattern-analysis chess, no blocking pieces, candidate moves, threats, or opportunities are identified, and, though the pieces are placed in their positions, the situation still has no structure and is not carved into chunks. In letter-string analogies, each particular string has yet to be carved into blocks, such that once again its structure remains to be discovered. What each of the patterns does present is its peculiarity, the specific idiosyncratic arrangements that form that particular pattern [10].

Figure 4. A framework for pattern conception. [Diagram not recoverable; legible labels: patterns (peculiarity, particularity, idiosyncrasy, concrete); concepts (identity); concepts are activated; activation spreads through a conceptual network; active concepts attempt to ‘fit’ the patterns.]

Then information processing starts: at this point there are no expectations about the pattern, so processing goes in a bottom-up fashion, driven by the data. The objective is to obtain a coherent and clear conception of what is in the pattern, and thus to understand what makes set A different from set B, or how to proceed given a specific chess position. Gradually the data activate chunks or concepts (which are defined by their specific relations to proximate concepts in a network). So a small corner in a Bongard figure triggers activation of the concept ‘rectangle’; a fork in a chess position activates a particular chunk (to describe the attack); and a ‘JJ’ sequence in a letter string triggers a concept such as “same letter group”. Then the activation of concepts spreads. Activation may spread from the concept (node) rectangle to the concept triangle or maybe polygon. It may spread from our specific fork chunk to the related chunks that better describe the ‘mate in one’ situation; it may spread from “same letter group” to the broader “group” (see Mitchell [8]). This is an important step, because these newly activated concepts create top-down pressure to focus the search on new possibilities, so this is the point where expectations enter into the game. From now on the processing has both bottom-up and top-down pressures co-existing.
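The spreading step can be sketched with a toy conceptual network. This is a deliberately simplified illustration and not the mechanism of [8]: the link weights, the spreading rate, and the update rule (each concept passes a fixed fraction of its activation along each outgoing link, capped at 1.0) are all assumptions chosen for readability.

```python
def spread(activation, links, rate=0.5, steps=1):
    """Propagate activation along weighted directed links for a number of steps."""
    current = dict(activation)
    for _ in range(steps):
        updated = dict(current)
        for (src, dst), weight in links.items():
            # Each concept passes a fraction of its current activation to its neighbor.
            gain = rate * weight * current.get(src, 0.0)
            updated[dst] = min(1.0, updated.get(dst, 0.0) + gain)
        current = updated
    return current


# Hypothetical fragment of a conceptual network for Bongard-like figures.
links = {
    ("rectangle", "polygon"): 0.8,
    ("rectangle", "triangle"): 0.5,
    ("polygon", "triangle"): 0.8,
}

# Bottom-up processing: a corner in the image has activated 'rectangle'.
bottom_up = {"rectangle": 1.0}

after_one = spread(bottom_up, links, steps=1)
after_two = spread(bottom_up, links, steps=2)
print(after_one)
print(after_two)
```

After one step, polygon and triangle are weakly active although nothing in the data has triggered them directly; these are the expectations that create top-down pressure on the subsequent search.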

Each evoked concept has specific methods for identifying its candidate instances in the patterns. This is not a process of recognition, because the process takes into consideration not only the quality of the match, but also the expectations given by the level of activation of each concept. Thus, each triangle of Figure 2 does not individually ‘satisfy’ the concept triangle (which cannot be pattern-matched), but still, since the conceptual activation is so high, reinforced by the other boxes, the patterns are satisfactorily conceived as triangles.
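The difference between matching and conceiving can be made concrete with a small scoring sketch. The paper does not specify how match quality and activation are combined; the weighted average and the threshold used below are assumptions introduced purely for illustration, as are the numeric values.

```python
def conception_score(match_quality, activation, weight=0.5):
    """Blend bottom-up match quality with top-down activation (assumed linear blend)."""
    return (1.0 - weight) * match_quality + weight * activation


def is_conceived_as(match_quality, activation, threshold=0.6, weight=0.5):
    """A pattern is accepted as an instance if the blended score reaches the threshold."""
    return conception_score(match_quality, activation, weight) >= threshold


# Hypothetical values: a figure that only partially matches 'triangle'.
partial_match = 0.4

# With weight=0.0 the decision reduces to plain pattern matching, which rejects it.
print(is_conceived_as(partial_match, activation=0.9, weight=0.0))  # False

# In isolation, with low expectation, the figure is still rejected.
print(is_conceived_as(partial_match, activation=0.2))  # False

# When the other boxes have driven 'triangle' to high activation, it is accepted.
print(is_conceived_as(partial_match, activation=0.9))  # True
```

The same figure with the same match quality is rejected or accepted depending on context, which is the behavior that distinguishes conception from recognition in this framework.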

This framework thus produces an activation of concepts and a measure of how well they fit the patterns, pointing towards solutions to the problem of ill-defined queries such as search by example. The model is flexible enough to account for the contextual pressures that make a specific record similar or dissimilar to other records. The reader is referred to [8] for a detailed discussion of this framework.

6 Conclusion

Bongard problems and pattern-analysis chess are deeply related to tasks such as search by example; these ill-structured problems are crucial for data mining research. However, a glimpse at recent books on data mining shows the whole range of methods, which include statistical reasoning, machine learning, inductive logic programming, K-nearest neighbors, decision trees, association rules, neural networks, genetic algorithms, belief networks, classification rules, numeric prediction, clustering, inferring rules, covering algorithms, linear models, support vector machines, and others. These methods are inadequate for Bongard problems, pattern-analysis chess, and such ill-defined data queries, because they fail to account for one or more of the basic distinctions which establish the need for a new class of pattern conception methods.

Pattern conception stands in contrast to the widespread methods of pattern matching and pattern recognition, in that patterns are perceived by an activation of symbolic concepts; there are multiple candidate activations for each case; activation spreads to related concepts, which constrain the further search; and finally the activation itself becomes part of the solution (as the conceived concepts).

Three classes of problems demonstrate some possible applications of such methods: Bongard problems, letter-string analogies, and pattern-analysis chess. In each of these problems, patterns are perceived in terms of concepts.

Pattern conception interprets particular idiosyncratic patterns as evocations of abstract symbolic concepts. Instead of the preciseness evoked by the term pattern recognition, in pattern conception nothing ever is or is not absolutely recognized; things are conceived by indirectly evoking concepts. If patterns evoke concepts, a triangle may sometimes evoke a set of separate line segments instead of being recognized as a unique object. An attack may be conceived as a chess fork, but the fork may be ignored if a competing conception for a checkmate becomes active. A string sequence “JKL” may evoke a concept “increasing sequence”. Implied in this view are the multiple possible interpretations for even the simplest clear-cut pattern.

Each peculiar pattern is conceived as an abstract concept. Here analogies play a key part, as things may be “like” each other in an ill-defined sense without being equal. Thus we may see the images of Figure 2 as being like each other; we may imagine new chess positions with the same structure as the graph proposed in Figure 3, but with distinct pieces “playing those roles”. Concepts are relative to one another in two ways: first, the activation of a concept stored in long-term memory propagates activation to related concepts and determines the further search; second, each concept is defined by the very specific relations it holds to other concepts.

Though this paper abstracts the solution framework proposed in [8], our knowledge is still at too early a stage to propose specific solution methods, and that is not the intended goal here. Our main contribution is one of problem analysis, regarding the outstanding challenge posed by pattern conception, a problem of overwhelming importance to Bongard problems, pattern-analysis chess, “ill-defined queries such as to find records similar to those in table A but not similar to those in table B”, and search by example in multimedia databases. These are critical issues for data mining research, and a further understanding of the problem of pattern conception seems central to this undertaking.

References

[1] Bongard, M. M., Pattern Recognition, Spartan Books, New York, 1970.

[2] Chase, W. G., and Simon, H. A., Perception in Chess, Cognitive Psychology 4, 55-81, 1973.

[3] de Groot, A. D., Thought and Choice in Chess, Mouton, New York, 1965.

[4] Fayyad, U., Editorial, Data Mining and Knowledge Discovery 2, 115-119, 1998.

[5] Hofstadter, D., Fluid Concepts and Creative Analogies, Basic Books, New York, 1995.

[6] Linhares, A., A glimpse at the metaphysics of Bongard problems, Artificial Intelligence 121, 251-270, 2000.

[7] Linhares, A., Evaluating the FARG architecture for pattern analysis chess, manuscript in preparation, Getulio Vargas Foundation, 2002.

[8] Mitchell, M., Analogy-Making as Perception, MIT Press, Cambridge, MA, 1993.

[9] Simon, H. A., and Chase, W. G., Skill in Chess, American Scientist 61, 394-403, 1973.