World Scientific

Efficient Online Big Data Stream Clustering Using Dual Interactive Wasserstein Generative Adversarial Network

    https://doi.org/10.1142/S021821302450009X

    Numerous real-world applications, such as online gaming, video streaming, and internet calls, stream enormous volumes of data, so it is important to process data streams quickly and in real time. Data clustering methods have historically been effective and efficient at extracting information from large datasets, but they are typically ineffective for online data stream clustering. Therefore, this paper proposes an efficient online big data stream clustering method using a dual interactive Wasserstein generative adversarial network (OBDSC-DI-WGAN). The proposed method consists of three phases: data initialization, online clustering, and offline clustering. The input data are taken from the Forest Cover Type dataset. During the initialization phase, the dimensionality of the input data is reduced using a kernel correlation approach. The dimension-reduced data are then fed to the dual interactive Wasserstein generative adversarial network (DI-WGAN) to accomplish efficient data stream clustering. During the online clustering stage, each incoming data point enters its selected grid. The online stage runs as the data stream arrives, whereas the offline stage is run upon user request. During the offline phase, each grid is regarded as a virtual data point located at its geometric center. The density radius and the cluster centers are determined using the Billiards-inspired optimization algorithm. Finally, the clustering outcome is derived from the optimum density radius. The proposed technique is implemented in MATLAB, and its efficiency is analyzed using performance metrics such as accuracy, Dice coefficient, purity, sensitivity, specificity, precision, processing time, and Jaccard coefficient.
The proposed method achieves 27.5%, 10.32% and 16.65% higher accuracy and 30.93%, 11.14% and 15.3% higher precision than the existing methods: the fast grid-based clustering approach for hybrid data streams (FGCH-CCFD-OBDSC), the optimized deep autoencoder with CNN for surveillance data streams in non-stationary environments (DAE-CNN-OBDSC), and the asynchronous dual-pipeline deep learning framework for online data stream classification (1D-CNN-OBDSC), respectively.
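
The online/offline split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the class name `GridStreamClusterer`, the helper `grid_key`, and the parameters `cell_size`, `density_radius` and `min_points` are hypothetical, the density radius is a fixed input rather than the output of the Billiards-inspired optimization algorithm, and the kernel-based dimension reduction and the DI-WGAN are omitted entirely. The online phase only increments the count of the grid cell each point falls into; the offline phase treats each sufficiently dense cell as a virtual point at its geometric center and links centers that lie within the density radius.

```python
from collections import defaultdict
import math


def grid_key(point, cell_size):
    """Map a point to the integer index of the grid cell that contains it."""
    return tuple(math.floor(x / cell_size) for x in point)


class GridStreamClusterer:
    """Illustrative grid-based stream clusterer (hypothetical, not the paper's code)."""

    def __init__(self, cell_size, density_radius, min_points=1):
        self.cell_size = cell_size
        # Assumption: fixed radius; the paper selects it by optimization.
        self.density_radius = density_radius
        self.min_points = min_points
        self.counts = defaultdict(int)

    def insert(self, point):
        """Online phase: update only the count of the cell the point falls in."""
        self.counts[grid_key(point, self.cell_size)] += 1

    def cell_center(self, key):
        """Geometric center of a cell, used as its virtual data point."""
        return tuple((k + 0.5) * self.cell_size for k in key)

    def cluster(self):
        """Offline phase: group dense cells whose centers are within the radius."""
        dense = sorted(k for k, c in self.counts.items() if c >= self.min_points)
        centers = {k: self.cell_center(k) for k in dense}
        labels = {}
        next_label = 0
        for start in dense:
            if start in labels:
                continue
            labels[start] = next_label
            stack = [start]
            while stack:  # depth-first expansion over neighbouring dense cells
                cur = stack.pop()
                for other in dense:
                    if other in labels:
                        continue
                    if math.dist(centers[cur], centers[other]) <= self.density_radius:
                        labels[other] = next_label
                        stack.append(other)
            next_label += 1
        return labels
```

Because the online step touches a single counter per point, its cost does not depend on how many points have already arrived; the more expensive grouping runs only when `cluster()` is called, which mirrors the on-request offline stage.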