
  • Article (No Access)

    Counting in Visual Question Answering: Methods, Datasets, and Future Work

    Visual Question Answering (VQA) is a language-based method for analyzing images that is highly helpful in assisting people with visual impairment. A VQA system requires holistic image understanding and must perform basic reasoning about the image, in contrast to task-specific models that simply classify objects into categories. VQA systems thus contribute to the growth of Artificial Intelligence (AI) technology by answering open-ended, arbitrary questions about a given image. In addition, VQA is used to assess a system’s ability through the Visual Turing Test (VTT). However, because suitable datasets are difficult to construct and existing evaluations suffer from bias and other flaws, current benchmarks cannot assess a system’s overall efficiency. This is a significant limitation of VQA and has slowed the progress of VQA algorithms. Current research therefore addresses more specific sub-problems, including counting. Counting is one of the more sophisticated sub-problems, riddled with challenges, especially for complex counting questions that demand object identification together with detection of object attributes and positional reasoning. The pooling operation commonly used to implement attention in VQA has been found to degrade counting performance, and a number of algorithms have been developed to address this issue. In this paper, we provide a comprehensive survey of counting techniques in VQA systems developed especially for answering questions such as “How many?”. The performance achieved so far is still not satisfactory, owing to dataset bias introduced by the way questions are phrased and to weak evaluation metrics. Future work should deliver fully-fledged architectures, large datasets with complex counting questions and detailed category breakdowns, and strong evaluation metrics for assessing a system’s ability to answer complex counting questions, such as those requiring positional and comparative reasoning.

  • Article (No Access)

    A Multi-Scale Cascaded Hierarchical Model for Image Labeling

    Image labeling is an important and challenging task in the area of graphics and visual computing, where datasets with high-quality labeling are critically needed. In this paper, based on the commonly accepted observation that the same semantic object in images with different resolutions may have different representations, we propose a novel multi-scale cascaded hierarchical model (MCHM) to enhance general image labeling methods. Our approach first creates multi-resolution images from the original one to form an image pyramid and labels each image at a different scale individually. Next, it constructs a cascaded hierarchical model and a feedback circle between the image pyramid and the labeling methods. The original image labeling result is used to adjust the labeling parameters of the scaled images, and the labeling results from the scaled images are then fed back to enhance the original labeling result. These steps naturally form a global optimization problem under a scale-space condition. We further propose an iterative algorithm for running the model, and prove its global convergence through iterative approximation with latent optimization constraints. We have conducted extensive experiments with five widely used labeling methods on five popular image datasets. Experimental results indicate that MCHM significantly improves the labeling accuracy of state-of-the-art image labeling approaches.
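    As a rough illustration (not the authors' implementation), the pyramid-then-fuse loop described in the abstract might look like the sketch below; the striding downsampler and the per-pixel majority vote are simplifying assumptions standing in for the paper's labeling methods and feedback circle:

```python
from collections import Counter

def build_pyramid(image, scales=(1.0, 0.5, 0.25)):
    """Downsample a 2-D grid (list of rows) by striding to form a pyramid."""
    pyramid = []
    for s in scales:
        step = int(round(1 / s))
        pyramid.append([row[::step] for row in image[::step]])
    return pyramid

def upsample(label_map, h, w):
    """Nearest-neighbour upsampling of a coarse label map back to h x w."""
    lh, lw = len(label_map), len(label_map[0])
    return [[label_map[y * lh // h][x * lw // w] for x in range(w)]
            for y in range(h)]

def fuse_labels(label_maps, h, w):
    """Per-pixel majority vote over the per-scale labelings -- a crude
    stand-in for the feedback circle between scales."""
    resized = [upsample(m, h, w) for m in label_maps]
    return [[Counter(m[y][x] for m in resized).most_common(1)[0][0]
             for x in range(w)] for y in range(h)]
```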

  • Article (No Access)

    USING EXPERT SYSTEMS FOR IMAGE UNDERSTANDING

    Expert systems and image understanding have traditionally been considered as two separate application fields of artificial intelligence (AI). In this paper it is shown, however, that the idea of building an expert system for image understanding may be fruitful. Although this paper may serve as a framework for situating existing works on knowledge-based vision, it is not a review paper. The interested reader will therefore be referred to some recommended survey papers in the literature.

  • Article (No Access)

    CONTROL STRATEGIES IN A HIERARCHICAL KNOWLEDGE STRUCTURE

    Two control strategies are presented that operate on a hierarchical knowledge structure based on a semantic network. The control algorithms cover strict top-down control and a bidirectional control that mixes top-down (model-driven) and bottom-up (data-driven) analysis. The knowledge used by the algorithms is represented in the semantic network. In addition to the network, other knowledge sources may be generated automatically to direct the analysis and limit the search space. The approach has been used successfully in image and speech understanding.

  • Article (No Access)

    LEARNING BLACKBOARD-BASED SCHEDULING ALGORITHMS FOR COMPUTER VISION

    The goal of image understanding by computer is to identify objects in visual images and (if necessary) to determine their location and orientation. Objects are identified by comparing data extracted from images to an a priori description of the object or object class in memory. It is a generally accepted premise that, in many domains, the timely and appropriate use of knowledge can substantially reduce the complexity of matching image data to object descriptions. Because of the variety and scope of knowledge relevant to different object classes, contexts and viewing conditions, blackboard architectures are well suited to the task of selecting and applying the relevant knowledge to each situation as it is encountered.

    This paper reviews ten years of work on the UMass VISIONS system and its blackboard-based high-level component, the schema system. The schema system could interpret complex natural scenes when given carefully crafted knowledge bases describing the domain, but its application in practice was limited by the problem of model (knowledge base) acquisition. Experience with the schema system convinced us that learning techniques must be embedded in vision systems of the future to reduce or eliminate the knowledge engineering aspects of system construction.

    The Schema Learning System (SLS) is a supervised learning system for acquiring knowledge-directed object recognition (control) strategies from training images. The recognition strategies are precompiled reactive sequences of knowledge source invocations that replace the dynamic scheduler found in most blackboard systems. Each strategy is specialized to recognize instances of a specific object class within a specific context. Since the strategies are learned automatically, the knowledge base contains only general-purpose knowledge sources rather than problem-specific control heuristics or sequencing information.
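    To make the idea concrete, a precompiled strategy can be modeled as a fixed sequence of knowledge-source functions applied to a blackboard, replacing a dynamic scheduler. The knowledge sources below are hypothetical illustrations, not taken from SLS itself:

```python
def run_strategy(strategy, blackboard):
    """Apply a precompiled sequence of knowledge sources to the blackboard.
    There is no dynamic scheduling: the order was fixed at learning time."""
    for knowledge_source in strategy:
        blackboard = knowledge_source(blackboard)
    return blackboard

# Illustrative knowledge sources (hypothetical names).
def extract_edges(bb):
    return {**bb, "edges": "edge-map"}

def match_roof_model(bb):
    # A specialized step: only meaningful once edges exist on the blackboard.
    return {**bb, "roof": "edges" in bb}

# A strategy specialized for one (object class, context) pair.
house_in_aerial_context = [extract_edges, match_roof_model]
```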

  • Article (No Access)

    Cooperative Spatial Reasoning for Image Understanding

    Spatial Reasoning, reasoning about spatial information (i.e. shape and spatial relations), is a crucial function of image understanding and computer vision systems. This paper proposes a novel spatial reasoning scheme for image understanding and demonstrates its utility and effectiveness in two different systems: region segmentation and aerial image understanding systems. The scheme is designed based on a so-called Multi-Agent/Cooperative Distributed Problem Solving Paradigm, where a group of intelligent agents cooperate with each other to fulfill a complicated task. The first part of the paper describes a cooperative distributed region segmentation system, where each region in an image is regarded as an agent. Starting from seed regions given at the initial stage, region agents deform their shapes dynamically so that the image is partitioned into mutually disjoint regions. The deformation of each individual region agent is realized by the snake algorithm [14], and neighboring region agents cooperate with each other to find common region boundaries between them. In the latter part of the paper, we first give a brief description of the cooperative spatial reasoning method used in our aerial image understanding system SIGMA. In SIGMA, each recognized object such as a house or a road is regarded as an agent. Each agent generates hypotheses about its neighboring objects to establish spatial relations and to detect missing objects. Then, we compare its reasoning method with that used in the region segmentation system. We conclude the paper by showing further utilities of the Multi-Agent/Cooperative Distributed Problem Solving Paradigm for image understanding.

  • Article (No Access)

    Situated Image Understanding in a Multiagent Framework

    The paper addresses the problem of controlling situated image understanding processes. Two complementary control styles are considered and applied cooperatively, a deliberative one and a reactive one. The role of deliberative control is to account for the unpredictability of situations, by dynamically determining which strategies to pursue, based on the results obtained so far and more generally on the state of the understanding process. The role of reactive control is to account for the variability of local properties of the image by tuning operations to subimages, each one being homogeneous with respect to a given operation. A variable organization of agents is studied to face this variability. The two control modes are integrated into a unified formalism describing segmentation and interpretation activities. A feedback from high level interpretation tasks to low level segmentation tasks thus becomes possible and is exploited to recover wrong segmentations. Preliminary results in the field of liver biopsy image understanding are shown to demonstrate the potential of the approach.

  • Article (No Access)

    Overcoming the Limitations of Learning-Based VQA for Counting Questions with Zero-Shot Learning

    Visual question answering (VQA) research has garnered increasing attention in recent years. It is considered a visual Turing test because it requires a computer to respond to textual questions based on an image. Expertise in computer vision, natural language processing, knowledge understanding, and reasoning is required to solve the problem of VQA. Most techniques employed for VQA consist of models that learn the combination of image and question features along with the expected answer; the techniques chosen for extracting and combining the image and question features vary from model to model. This approach of teaching a model the question–answer pattern is ineffective for queries that involve counting and reasoning, and it requires considerable resources and large datasets for training. General VQA datasets feature a restricted number of items as answers to counting questions (<10), and the distribution of the answers is not uniform. To investigate these issues, we created synthetic datasets in which the number of objects in the image and the amount of occlusion can be adjusted. Specifically, a zero-shot learning VQA system was devised for counting-related questions; it produces answers by analyzing the output of an object detector together with the query keywords. On our synthetic datasets, the model generated 100% correct results. Testing on the benchmark datasets Task Driven Image Understanding Challenge (TDIUC) and TallyQA-Simple indicated that the proposed model matched the performance of the learning-based baseline models. This methodology can be used efficiently for counting VQA questions confined to certain domains, even when the number of items to be counted is large.
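    A minimal sketch of the detector-plus-keywords idea follows; the function names, the detection format, and the naive trailing-"s" singularization are illustrative assumptions, not the paper's actual pipeline (which uses a trained object detector):

```python
def extract_count_target(question, detector_vocabulary):
    """Find the first detector label mentioned in a 'How many ...?' question.
    A naive trailing-'s' strip lets 'dogs' match the detector label 'dog'."""
    for word in question.lower().rstrip("?").split():
        if word in detector_vocabulary:
            return word
        if word.endswith("s") and word[:-1] in detector_vocabulary:
            return word[:-1]
    return None

def answer_count_question(question, detections, detector_vocabulary,
                          score_threshold=0.5):
    """Count detector outputs whose label matches the question keyword."""
    target = extract_count_target(question, detector_vocabulary)
    if target is None:
        return 0
    return sum(1 for d in detections
               if d["label"] == target and d["score"] >= score_threshold)
```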

  • Article (No Access)

    KYDON VISION SYSTEM: THE ADAPTIVE LEARNING MODEL

    In this paper, an adaptive learning model for Kydon, an autonomous vision system with a multi-layer architecture, is presented, modeled, and analyzed. In particular, two critical points on the learning curve (deletion and saturation) are evaluated; these points represent two extreme states of the learning process. The Kydon architecture consists of k layers of array processors: the lowest layers perform lower-level processing, and the rest perform higher-level processing. The interconnectivity of the processing elements (PEs) in each array is based on a full hexagonal mesh structure. Kydon uses graph models to represent and process the knowledge extracted from the image, and its knowledge base is distributed among its PEs. A unique model for an evolving knowledge base has been developed especially for Kydon in order to provide it with some intelligence properties.

  • Article (No Access)

    IMAGE ENGINEERING AND RELATED PUBLICATIONS

    Image engineering is a discipline that includes image processing, image analysis, image understanding, and the applications of these techniques. To promote its development and evolution, this paper provides a well-regulated explanation of the definition of image engineering, as well as its intension and extension. It also introduces a new classification of the theories of image engineering and the applications of image technology. A thorough statistical survey of the publications in this discipline is carried out, and an analysis and discussion of the statistics from the classification results are presented. This work shows a general and up-to-date picture of the status, progress, trends, and application areas of image engineering.

  • Article (No Access)

    CLASSIFIER COMBINATION APPLIED FOR UNDERSTANDING OF EYES IMAGES

    The article presents the development of a classifier combination based on machine learning techniques (Artificial Neural Networks, Support Vector Machines, and the C4.5 algorithm) that increases the performance achieved by the Refractive Errors Measurement System (REMS), which analyzes Hartmann-Shack (HS) images of human eyes. The HS images are analyzed to extract data relevant to the identification of refractive errors (myopia, hypermetropia, and astigmatism). These data are extracted using the Gabor wavelet transform; afterwards, machine learning techniques are employed to carry out the image analysis.
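    One common way to combine such classifiers is majority voting over their per-sample predictions. The sketch below is a generic illustration with hypothetical prediction labels, not necessarily the fusion rule used by REMS:

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine predictions from several classifiers by per-sample majority
    vote; on a full tie, fall back to the first classifier's prediction."""
    n_samples = len(predictions_per_model[0])
    combined = []
    for i in range(n_samples):
        votes = [preds[i] for preds in predictions_per_model]
        winner, count = Counter(votes).most_common(1)[0]
        combined.append(winner if count > 1 else votes[0])
    return combined
```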

  • Chapter (Free Access)

    CONTEXT RELATED ISSUES IN IMAGE UNDERSTANDING

    This chapter gives a formal model for scene understanding, as well as for context information; it helps in adapting image understanding procedures and software to varying contexts when certain formal assumptions are satisfied. We have defined and formalized context separation and context adaptation, which are essential for many applications, including achieving robustness of the understanding results in changing sensing environments. This model uses constraint logic programming and specialized models for the various interactions between the objects in the scene and the context. A comparison is made with, and examples are given of, context models in more classical frameworks such as multilevel understanding structures, object-based scene design, knowledge-based approaches, and perceptual context separation.

  • Chapter (No Access)

    COOPERATIVE SPATIAL REASONING FOR IMAGE UNDERSTANDING

    Spatial Reasoning, reasoning about spatial information (i.e. shape and spatial relations), is a crucial function of image understanding and computer vision systems. This paper proposes a novel spatial reasoning scheme for image understanding and demonstrates its utility and effectiveness in two different systems: region segmentation and aerial image understanding systems. The scheme is designed based on a so-called Multi-Agent/Cooperative Distributed Problem Solving Paradigm, where a group of intelligent agents cooperate with each other to fulfill a complicated task. The first part of the paper describes a cooperative distributed region segmentation system, where each region in an image is regarded as an agent. Starting from seed regions given at the initial stage, region agents deform their shapes dynamically so that the image is partitioned into mutually disjoint regions. The deformation of each individual region agent is realized by the snake algorithm [14], and neighboring region agents cooperate with each other to find common region boundaries between them. In the latter part of the paper, we first give a brief description of the cooperative spatial reasoning method used in our aerial image understanding system SIGMA. In SIGMA, each recognized object such as a house or a road is regarded as an agent. Each agent generates hypotheses about its neighboring objects to establish spatial relations and to detect missing objects. Then, we compare its reasoning method with that used in the region segmentation system. We conclude the paper by showing further utilities of the Multi-Agent/Cooperative Distributed Problem Solving Paradigm for image understanding.
