Abstract:The problem of spatiotemporal event visualization based on reports entails subtasks ranging from named entity recognition to relationship extraction and mapping of events. We present an approach to event extraction that is driven by data mining and visualization goals, particularly thematic mapping and trend analysis. This paper focuses on bridging the information extraction and visualization tasks and investigates topic modeling approaches. We develop a static, finite topic model and examine the potential benefits and feasibility of extending this to dynamic topic modeling with a large number of topics and continuous time. We describe an experimental test bed for event mapping that uses this end-to-end information retrieval system, and report preliminary results on a geoinformatics problem: tracking of methamphetamine lab seizure events across time and space.
Abstract:This paper examines the effect, in Evolutionary Multi-Objective Optimization algorithms, of mimicking the discontinuous heredity that results from carrying more than one chromosome in the cells of some living organisms. In this representation, the phenotype may not fully reflect the genotype. This mimics the inheritance mechanism of living organisms, in which traits may be carried silently for many generations and reappear later. Representations with different numbers of chromosomes in each solution vector are tested on several benchmark problems with large numbers of decision variables and objectives. A comparison with the Non-Dominated Sorting Genetic Algorithm-II is made on all problems.
Abstract:This paper introduces a new dynamic neighborhood network for particle swarm optimization. In the proposed Clubs-based Particle Swarm Optimization (C-PSO) algorithm, each particle initially joins a default number of what we call 'clubs'. Each particle is influenced by its own experience and by the experience of the best-performing member of the clubs to which it belongs. Club membership is dynamic: the worst-performing particles socialize more by joining more clubs to learn from other particles, while the best-performing particles are made to socialize less by leaving clubs, which reduces their strong influence on other members. Particles gradually return to the default membership level when they stop showing extreme performance. The inertia weights of swarm members are drawn at random from a predefined range. The proposed dynamic neighborhood algorithm is compared with two other algorithms that have static neighborhood topologies on a set of classic benchmark problems. The results show superior performance for C-PSO in escaping local optima and in convergence speed.
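The club membership rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the `min_level` and `max_level` bounds, the random choice of which club to join or leave, and the one-club-per-update drift back to the default level are all assumptions made for illustration. The abstract states only that the worst performers join clubs, the best performers leave clubs, and the others return gradually to the default level. Fitness is assumed to be minimized.

```python
import random


def init_membership(n_particles, n_clubs, default_level, rng):
    """Each particle joins `default_level` distinct clubs chosen at random."""
    return [set(rng.sample(range(n_clubs), default_level))
            for _ in range(n_particles)]


def update_membership(membership, fitness, n_clubs, default_level,
                      min_level, max_level, rng):
    """Apply one membership update in place (lower fitness is better).

    Assumes 1 <= min_level <= default_level <= max_level <= n_clubs.
    """
    indices = range(len(fitness))
    best = min(indices, key=fitness.__getitem__)
    worst = max(indices, key=fitness.__getitem__)
    for i, clubs in enumerate(membership):
        if i == best:
            # The best particle leaves a club to reduce its influence.
            if len(clubs) > min_level:
                clubs.remove(rng.choice(sorted(clubs)))
        elif i == worst:
            # The worst particle joins a club to learn from more peers.
            if len(clubs) < max_level:
                clubs.add(rng.choice(sorted(set(range(n_clubs)) - clubs)))
        elif len(clubs) > default_level:
            # Non-extreme particles drift back toward the default level.
            clubs.remove(rng.choice(sorted(clubs)))
        elif len(clubs) < default_level:
            clubs.add(rng.choice(sorted(set(range(n_clubs)) - clubs)))


def neighborhood_best(i, membership, pbest_fitness):
    """Index of the best personal best among particles sharing a club with i.

    Particle i always counts as its own peer as long as it belongs to at
    least one club, which min_level >= 1 guarantees.
    """
    peers = [j for j, clubs in enumerate(membership) if clubs & membership[i]]
    return min(peers, key=pbest_fitness.__getitem__)
```

In a full swarm loop, `neighborhood_best` would supply the social term of the velocity update, and `update_membership` would be called once per iteration after fitness evaluation.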
Abstract:Artificial Intelligence (AI) techniques are known for their ability to tackle problems that have proved unyielding to traditional mathematical methods. A recent addition to these techniques is the family of Computational Intelligence (CI) techniques, which in most cases are inspired by nature or biology. Different CI techniques have found their way into many control engineering applications, including system identification, and the results obtained by many researchers have been encouraging. However, most control engineers and researchers used the basic CI models as is, or modified them only slightly to match their needs. Hence, the merits of one model over another were not clear, and the full potential of these models was not exploited. In this research, Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) methods, which are different CI techniques, are modified to best suit the multimodal problem of system identification. In the first case, GA, the basic algorithm was extended by introducing redundant genetic material, an extension that is itself inspired by nature. This extension, which comes in handy in living organisms, did not result in a significant performance improvement over the basic algorithm. In the second case, the Clubs-based PSO (C-PSO) dynamic neighborhood structure was introduced to replace the basic static structure used in canonical PSO algorithms. This modification of the neighborhood structure resulted in a significant improvement in the algorithm's convergence speed and equipped it with a tool for handling multimodal problems. To understand the suitability of different GA and PSO techniques for system identification, they were applied to an induction motor parameter identification problem. The results reinforced the previous conclusions and showed the general superiority of PSO over GA in such a multimodal problem.
Abstract:Topic models are probabilistic models for discovering topical themes in collections of documents. In real-world applications, these models provide us with the means of organizing what would otherwise be unstructured collections. They can help us cluster a huge collection into different topics or find a subset of the collection that resembles the topical theme found in an article at hand. The first wave of topic models could discover the prevailing topics in a large collection of documents spanning a period of time. It was later realized that these time-invariant models were not capable of modeling 1) the time-varying number of topics they discover and 2) the time-changing structure of these topics. A few models were developed to address these two deficiencies. The online hierarchical Dirichlet process models the documents with a time-varying number of topics. It varies the structure of the topics over time as well. However, it relies on document order, not timestamps, to evolve the model over time. The continuous-time dynamic topic model evolves topic structure in continuous time. However, it uses a fixed number of topics over time. In this dissertation, I present a model, the continuous-time infinite dynamic topic model, that combines the advantages of these two models: 1) the online hierarchical Dirichlet process, and 2) the continuous-time dynamic topic model. More specifically, the model I present is a probabilistic topic model that 1) changes the number of topics over continuous time, and 2) changes the topic structure over continuous time. I compared the model I developed with the two other models under different parameter settings. The results were favorable to my model and showed the need for a model whose number of topics and topic structure both vary in continuous time.
Abstract:This paper presents a new technique for induction motor parameter identification. The proposed technique is based on a simple startup test using a standard V/F inverter. The recorded startup currents are compared to those obtained by simulating an induction motor model. A modified PSO algorithm is used to find the model parameters that minimize the sum of squared errors between the measured and the simulated currents. The performance of the modified PSO is compared with that of other optimization methods, including line search, conventional PSO and Genetic Algorithms. Simulation results demonstrate the ability of the proposed technique to capture the true values of the machine parameters, and the superiority of the results obtained using the modified PSO over those of the other optimization techniques.
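The objective described in this abstract, the sum of squared errors between measured and simulated currents, can be written as a short Python sketch. The helper names `sum_squared_error`, `make_cost` and `toy_model` are hypothetical. `toy_model` is a decaying exponential used only as a stand-in so that the example runs. It is not an induction motor model, and the abstract does not specify the model equations.

```python
import math


def sum_squared_error(measured, simulated):
    """Sum of squared differences between two equally long sample lists."""
    if len(measured) != len(simulated):
        raise ValueError("measured and simulated must have the same length")
    return sum((m - s) ** 2 for m, s in zip(measured, simulated))


def make_cost(measured, simulate):
    """Build the cost function that an optimizer such as PSO would minimize.

    `simulate` maps a parameter tuple to a list of simulated current samples.
    """
    def cost(params):
        return sum_squared_error(measured, simulate(params))
    return cost


def toy_model(params, n_samples=50, dt=0.01):
    """Stand-in model (assumption): amplitude * exp(-t / tau) sampled at dt."""
    amplitude, tau = params
    return [amplitude * math.exp(-k * dt / tau) for k in range(n_samples)]
```

Given a recorded current trace `measured`, calling `make_cost(measured, toy_model)` returns a function of the parameters alone. That is the form most optimizers expect, whether line search, PSO or a genetic algorithm.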
Abstract:We describe our language-independent unsupervised word sense induction system. The system uses only topic features to cluster different word senses in their global-context topic space. Using unlabeled data, the system trains a latent Dirichlet allocation (LDA) topic model and then uses it to infer the topic distributions of the test instances. Clustering these topic distributions in topic space groups the instances into different senses. Our hypothesis is that closeness in topic space reflects similarity between word senses. The system participated in the SemEval-2 word sense induction and disambiguation task and achieved the second-highest V-measure score among all participating systems.