The deployment of fog computing has not only helped end-users offload delay-sensitive tasks but has also reduced the burden on cloud back-end systems of processing the variable workloads arriving from user equipment. However, due to the constraints on the resources and computational capabilities of fog nodes, processing computation-intensive tasks within the defined timelines is highly challenging. In this scenario, offloading tasks to the cloud also burdens the uplink, resulting in high resource costs and delays in task processing. Existing studies have made considerable attempts to handle the task allocation problem in fog–cloud networks, but the majority of methods are computationally expensive and incur high resource costs under execution-time constraints. The proposed work aims to balance resource cost and time complexity by exploring collaboration among host machines across fog nodes. It introduces task scheduling and optimal resource allocation using the coalition formation methods of game theory and pay-off computation. The work also encourages the formation of coalitions among host machines to handle variable traffic efficiently. Experimental results show that the proposed approach for task scheduling and optimal resource allocation in fog computing outperforms the existing system by 56.71% in task processing time, 47.56% in unused computing resources, 8.33% in resource cost, and 37.2% in unused storage.
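As a loose illustration of the coalition idea (a toy model: the capacities, deadline, and payoff function below are our assumptions, not the paper's formulation):

```python
# Illustrative coalition formation among fog hosts (assumed toy model).
# A host joins a coalition when the pooled capacity helps meet the task
# deadline and the payoff (time saved minus a cooperation cost) is positive.
from dataclasses import dataclass

@dataclass
class Host:
    name: str
    capacity: float  # processing capacity (assumed unit)

def processing_time(task_size: float, hosts: list[Host]) -> float:
    """Task runs on the pooled capacity of a coalition (idealized)."""
    return task_size / sum(h.capacity for h in hosts)

def payoff(task_size: float, solo: Host, coalition: list[Host],
           coop_cost: float = 0.1) -> float:
    """Assumed payoff: time the candidate saves by joining, minus a fixed cost."""
    return (processing_time(task_size, [solo])
            - processing_time(task_size, coalition) - coop_cost)

def form_coalition(task_size: float, deadline: float, hosts: list[Host]) -> list[Host]:
    """Greedily grow a coalition until the deadline is met or no host gains."""
    pool = sorted(hosts, key=lambda h: h.capacity, reverse=True)
    coalition = [pool.pop(0)]
    while pool and processing_time(task_size, coalition) > deadline:
        candidate = pool.pop(0)
        if payoff(task_size, candidate, coalition + [candidate]) > 0:
            coalition.append(candidate)
        else:
            break
    return coalition

hosts = [Host("h1", 4.0), Host("h2", 3.0), Host("h3", 1.5)]
print([h.name for h in form_coalition(task_size=30.0, deadline=5.0, hosts=hosts)])
```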
Automatic white balance (AWB) is an important module in cameras, and classification of color-distorted images is critical to realizing intelligent AWB. Although accurate classifiers can usually be achieved with deep neural network models, such models do not fit into embedded hardware due to their complexity. To increase classification accuracy and decrease latency, a lightweight convolutional neural network (CNN) with a histogram layer for AWB (AWBHNet) is constructed, consisting of one histogram layer, one regular convolutional layer, three depthwise separable convolutional layers, four pooling layers, two fully connected layers, and two dropout layers. One-tenth of ImageNet is utilized as the normal image dataset. To generate a variety of color distortions, histogram shifting and histogram matching are proposed to randomly adjust the histogram position or shape. Furthermore, the extent of shifting or matching is randomly generated to ensure the diversity of color distortion. The proposed AWBHNet and other CNNs are then trained in turn. Experiments show that the accuracy of the classifier trained with AWBHNet is 0.9150, at least 1.33% higher than that of each regular or lightweight network. Finally, intelligent AWB is realized on smartphones, where the inference latency of AWBHNet is 47.6% lower than that of the best existing network.
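As an illustration of what histogram shifting can look like as a color-distortion augmentation (the per-channel independent shifts and the shift range are our assumptions):

```python
# Assumed histogram-shifting augmentation: translate each channel's
# histogram by a random offset, which simulates a color cast.
import numpy as np

def histogram_shift(img: np.ndarray, max_shift: int = 40,
                    rng: np.random.Generator | None = None) -> np.ndarray:
    """img: HxWx3 uint8. Shift each channel by an independent random offset."""
    rng = rng or np.random.default_rng()
    shifts = rng.integers(-max_shift, max_shift + 1, size=3)
    out = img.astype(np.int16) + shifts  # broadcast one offset per channel
    return np.clip(out, 0, 255).astype(np.uint8)

img = np.random.default_rng(0).integers(0, 256, (64, 64, 3), dtype=np.uint8)
distorted = histogram_shift(img)
```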
For bidirectional rings, self-stabilizing mutual exclusion protocols have been proposed that are either time-adaptive (i.e., efficient in recovery) or 1-latent (i.e., efficient in legal execution), but not both. This paper proposes a randomized self-stabilizing mutual exclusion protocol that inherits both advantages: it is 1-latent in the sense that the privilege is circulated in a linear number of rounds (i.e., intuitively, the privilege is transferred from one process to another in a single "step"), provided that the system always stays in legitimate configurations, and it is weakly time-adaptive in the sense that the system stabilizes from any configuration c in O(f) steps with high probability, where f is the number of corrupted processes in c. It also guarantees with high probability that there is at most one privilege even while converging to a legitimate configuration.
In this paper, we explore the problem of mapping linear chain applications onto large-scale heterogeneous platforms. A series of data sets enter the input stage and progress from stage to stage until the final result is computed. An important optimization criterion in such a framework is the latency, or makespan, which measures the response time of the system to process a single data set entirely. For such applications, which are representative of a broad class of real-life applications, we can consider one-to-one mappings, in which each stage is mapped onto a single processor. However, to reduce the communication cost, it is natural to group stages into intervals. The interval mapping problem can be solved in a straightforward way if the platform has homogeneous communications: the whole chain is grouped into a single interval, which in turn is mapped onto the fastest processor. But the problem becomes harder on a fully heterogeneous platform. Indeed, we prove the NP-completeness of this problem. Furthermore, we prove that neither the interval mapping problem nor the similar one-to-one mapping problem can be approximated in polynomial time to within any constant factor (unless P=NP).
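To make the cost terminology concrete, a toy evaluator for the latency of an interval mapping (the additive compute-plus-communication model and the example numbers are our assumptions):

```python
# Toy latency model for an interval mapping (assumed cost model):
# latency = sum over intervals of (interval work / processor speed)
#           + (data passed between consecutive intervals / link bandwidth).
def interval_latency(work, data, intervals, speeds, bw):
    """
    work[i]: computation weight of stage i
    data[i]: size of the data passed from stage i to stage i+1
    intervals: list of (start, end) stage-index pairs covering the chain
    speeds[j]: speed of the processor assigned to interval j
    bw: bandwidth of every link (homogeneous communications assumed)
    """
    latency = 0.0
    for j, (s, e) in enumerate(intervals):
        latency += sum(work[s:e + 1]) / speeds[j]
        if e + 1 < len(work):                 # communication to next interval
            latency += data[e] / bw
    return latency

work = [4, 2, 6, 3]
data = [1, 1, 1]
# Whole chain as one interval on the fastest processor (homogeneous case):
print(interval_latency(work, data, [(0, 3)], speeds=[5.0], bw=2.0))
```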
As CMOS technology continues to scale to achieve higher performance, power dissipation and robustness to leakage and process variations are becoming major obstacles to circuit design in nanoscale technologies. Due to the increased density of transistors in integrated circuits and higher operating frequencies, power consumption, propagation delay, power-delay product (PDP), and area are approaching their lower limits. We have designed a 16-bit Carry-Select Adder (CSA) circuit using different pass-transistor logic styles. The adder cells are designed with the DSCH3 CAD tool, and layouts are generated with the Microwind 3 VLSI CAD tool. Using the CSA technique, the power dissipation, PDP, area, and transistor count are calculated from the layout cell of the proposed 16-bit adder for ultra-deep-submicron (UDSM) feature sizes of 120, 90, 70, and 50 nm. The UDSM signal parameters, such as signal-to-noise ratio (SNR), energy per instruction (EPI), latency, and throughput, are calculated using the layout parameter analysis of BSIM 4. The simulation results show that complementary pass-transistor logic (CPL) is dominant in terms of power dissipation, propagation delay, PDP, and area among the other pass-gate logic styles. Our CPL circuit also dominates in terms of EPI, SNR, throughput, and latency in the signal parameter analysis. The proposed CPL adder circuit is compared with reported results, and our CPL circuit is found to give better performance.
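As a behavioral reference, a bit-level Python model of the carry-select principle used by the 16-bit adder (the 4-bit block size is our assumption; the paper's work is at the transistor and layout level):

```python
# Behavioral model of a 16-bit carry-select adder: each block computes
# its sum twice (carry-in 0 and carry-in 1) and a multiplexer picks the
# right result once the real carry arrives, shortening the carry chain.
def ripple_add(a: int, b: int, cin: int, width: int):
    """Plain ripple-carry addition of two width-bit values."""
    s = a + b + cin
    return s & ((1 << width) - 1), (s >> width) & 1

def carry_select_add16(a: int, b: int, block: int = 4):
    carry, result = 0, 0
    for i in range(0, 16, block):
        mask = (1 << block) - 1
        ab, bb = (a >> i) & mask, (b >> i) & mask
        s0, c0 = ripple_add(ab, bb, 0, block)  # precomputed with cin = 0
        s1, c1 = ripple_add(ab, bb, 1, block)  # precomputed with cin = 1
        s, carry = (s1, c1) if carry else (s0, c0)  # mux on incoming carry
        result |= s << i
    return result, carry

assert carry_select_add16(0xFFFF, 0x0001) == (0x0000, 1)
assert carry_select_add16(0x1234, 0x4321) == (0x5555, 0)
```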
This paper focuses on the synthesis design of a Δ–Σ digital-to-analog converter (DAC) algorithm intended for professional digital audio. A rapid register-transfer-level (RTL) design using a top-down method with the VHSIC hardware description language (VHDL) is practiced. All of the RTL design simulation, VHDL implementation, and field-programmable gate array (FPGA) verification are rapidly and systematically performed through this methodology. Distributed pipelining, streaming, and resource-sharing designs are considered for area and speed optimization while maintaining the original precision of the audio DAC. The features of the design are high precision, fast processing, and low cost. The related work is done with the MATLAB and QUARTUS II simulators.
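For intuition about the Δ–Σ principle, a first-order modulator in a few lines (the audio DAC in the paper is a higher-order production design; the rates and amplitude below are assumed):

```python
# First-order delta-sigma modulator: the integrator accumulates the
# error between the input and the 1-bit feedback, so the bitstream's
# average tracks the input while quantization noise is pushed to
# high frequencies.
import math

def delta_sigma(samples):
    integrator, bits = 0.0, []
    for x in samples:          # x in [-1.0, 1.0]
        integrator += x - (1.0 if bits and bits[-1] else -1.0)
        bits.append(1 if integrator >= 0 else 0)
    return bits

fs, f = 48_000 * 64, 1_000    # 64x-oversampled 1 kHz tone (assumed rates)
tone = [0.5 * math.sin(2 * math.pi * f * n / fs) for n in range(4096)]
bits = delta_sigma(tone)
print(sum(bits) / len(bits))  # close to 0.5: bitstream mean tracks signal mean
```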
Instantaneous observability is used to watch a system output with very fast signals; it is a system property that makes it possible to estimate the system's internal states. This property depends on the pair of discrete matrices {A(k), C(k)} and assumes that the system state equations are known. The problem is that the system states are internal and not always directly accessible. The system under investigation here is composed of a process, namely a time-varying running program in four parts. It is shown that it is possible to apply Kalman filtering to a digital personal computer system with, in particular, four parts like the ones under investigation. A computing process is performed during a period of time called latency. The calculation of latency treats it as a random variable with a Gaussian distribution. A potential application of the results attained is forecasting data traffic jams on a digital personal computer, which has very fast signals inside. In a broader perspective, this method of calculating latency can be applied to other digital personal computer processes, such as processes on random access memory. It is also possible to apply this method to local area networks and mainframes.
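A minimal scalar Kalman filter for a latency modeled as a Gaussian random variable (the random-walk process model and the variance values are our assumptions):

```python
# Scalar Kalman filter tracking a latency modeled as a Gaussian random
# variable (random-walk process model; q and r below are assumed variances).
def kalman_latency(measurements, q=1e-4, r=0.05, x0=0.0, p0=1.0):
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p += q                      # predict: state is a random walk
        k = p / (p + r)             # Kalman gain
        x += k * (z - x)            # update with the measured latency
        p *= (1 - k)
        estimates.append(x)
    return estimates

measured = [1.02, 0.97, 1.05, 1.00, 0.95, 1.03]   # seconds (made-up data)
print(kalman_latency(measured)[-1])
```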
Today’s vehicles have become increasingly complex, as consumers demand more features and better quality in their cars. Most of these new features require additional electronic control units (ECUs) and software control, constantly pushing back the limits of existing architectures and design methodologies. Indeed, modern automobiles have a large number of time-critical functions distributed and running simultaneously on each ECU. Data Distribution Service (DDS) is a publish/subscribe middleware specified by the international consortium Object Management Group (OMG); it makes information available in real time while offering a rich range of quality of service (QoS) policies. In this paper, we propose a new methodology to integrate DDS into automotive applications. We evaluate the performance of our new design by testing the fulfillment of real-time QoS requirements. We also compare the performance of the vehicle application when using FlexRay and Ethernet networks. Computations prove that the use of DDS over Gigabit Ethernet (GBE) is promising in the automotive field.
In this work, the architecture of a dual-coupled linear congruential generator (dual-CLCG) for pseudo-random bit generation is proposed to improve the speed of the generator and minimize power dissipation with optimum chip area. To improve its performance, a new pseudo-random bit generator (PRBG) is proposed that employs a dual-CLCG architecture based on a two-operand modulo adder and free of shifting operations. The novelty of the proposed dual-CLCG architecture lies in designing the LCG around a two-operand modulo adder rather than a three-operand one, without the shifting operation used in the existing LCG architecture. The aim of the work is to generate pseudo-random bits at a uniform clock rate at the maximum clock frequency and to achieve the maximum length of the random bit sequence. The power dissipation and chip area of the proposed PRBG architecture are also evaluated. The generated sequence passes all 15 tests of the National Institute of Standards and Technology (NIST) standard. Verilog HDL code is used for the design of the proposed architecture. Its simulation is performed on a commercially available Spartan-3E FPGA (ISE Design Suite by Xilinx) as well as in 90-nm CMOS technology (Cadence tool).
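As background, a software model of comparator-coupled LCGs with XORed outputs (one common dual-CLCG construction in the literature; the moduli and multiplier constants here are illustrative, not the paper's):

```python
# Software model of a dual coupled-LCG bit generator (assumed variant:
# two comparator-coupled LCG pairs whose output bits are XORed).
def lcg(a, b, m, x):
    while True:
        x = (a * x + b) % m
        yield x

def dual_clcg_bits(n_bits, m=2**16):
    # Constants are illustrative; a = 2^k + 1 forms keep hardware simple.
    g1, g2 = lcg(2**5 + 1, 13, m, 1), lcg(2**7 + 1, 29, m, 2)
    g3, g4 = lcg(2**9 + 1, 47, m, 3), lcg(2**11 + 1, 71, m, 4)
    for _ in range(n_bits):
        b1 = 1 if next(g1) > next(g2) else 0   # first CLCG comparator
        b2 = 1 if next(g3) > next(g4) else 0   # second CLCG comparator
        yield b1 ^ b2                          # dual coupling via XOR

print("".join(str(b) for b in dual_clcg_bits(32)))
```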
This paper introduces an FPGA implementation of a pseudo-random number generator (PRNG) using Chen’s chaotic system. The paper mainly focuses on the development of an efficient VLSI architecture for the PRNG in terms of bit rate, area resources, latency, maximum-length sequence, and randomness. First, we analyze the dynamic behavior of the chaotic trajectories of Chen’s system and set the parameters’ values to maintain low hardware design complexity. A circuit realization of the proposed PRNG is presented using hardwired shifting, additions, subtractions, and multiplexing schemes. The benefit of this architecture is that all binary multiplication operations (except Xi·Yi and Xi·Zi) are performed using hardwired shifting. Moreover, the generated sequences pass all 15 statistical tests of NIST, while the generator produces pseudo-random numbers at a uniform clock rate with minimum hardware complexity. The proposed PRNG architecture is realized using Verilog HDL, prototyped on the Virtex-5 FPGA (XC5VLX50T) device, and analyzed using the Matlab tool. Performance analysis confirms that the proposed Chen chaotic attractor-based PRNG scheme is simple, secure, and hardware efficient, with high potential to be adopted in cryptographic applications.
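For intuition, a floating-point Euler discretization of Chen's system (the paper's design is fixed-point with shift-based multipliers; the step size, initial state, and bit-extraction rule below are our assumptions):

```python
# Euler discretization of Chen's chaotic system (a=35, b=3, c=28 give
# chaos); a power-of-two step dt = 2^-9 mirrors hardware, where the
# multiplication by dt becomes a hardwired shift.
import struct

def chen_states(n, x=0.1, y=0.3, z=0.5, dt=2**-9, a=35.0, b=3.0, c=28.0):
    for _ in range(n):
        x, y, z = (x + dt * a * (y - x),
                   y + dt * ((c - a) * x - x * z + c * y),
                   z + dt * (x * y - b * z))
        yield x

def bits_from_state(v: float, k: int = 8) -> int:
    """Assumed extraction: k low mantissa bits of the double, which vary fastest."""
    return struct.unpack("<Q", struct.pack("<d", v))[0] & ((1 << k) - 1)

stream = [bits_from_state(s) for s in chen_states(1000)]
print(stream[:8])
```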
With the rapid development of smart mobile devices, mobile applications are becoming more and more popular. Since mobile devices usually have constrained computing capacity, offloading computation to mobile edge computing (MEC) to achieve lower latency is a promising paradigm. In this paper, we focus on the optimal offloading problem for streaming applications in MEC. We present solutions to find offloading policies of streaming applications that achieve optimal latency. Streaming applications are modeled with synchronous data flow graphs. Two architecture assumptions are considered: sufficient processors on both the local device and the MEC server, and a limited number of processors on both sides. The problem is NP-complete in general. We present an exact algorithm and a heuristic algorithm for the former architecture assumption and a heuristic method for the latter. We carry out experiments on a practical application and on thousands of synthetic graphs to comprehensively evaluate our methods. The experimental results show that our methods are effective and computationally efficient.
A new method for the generation of pseudo-random bits, based on a coupled linear congruential generator (CLCG) and two multistage variable-seed linear feedback shift registers (LFSRs), is presented. The proposed algorithm dynamically changes the seed value of each linear congruential generator (LCG) by utilizing the multistage variable-seed LFSRs. The proposed approach exhibits several advantages over the pseudo-random bit generator (PRBG) methods presented in the literature: it provides low hardware complexity and high security strength while maintaining a minimum critical path delay. Moreover, the design generates the maximum-length pseudo-random bit sequence with uniform clock latency. Furthermore, to improve the critical path delay, one more PRBG architecture is proposed in this work, based on the combination of a coupled modified LCG with two variable-seed multistage LFSRs. The modified LCG block is designed with a two-operand modulo adder and an XOR gate, rather than a three-operand modulo adder and a shifting operation, while maintaining the same security strength. A clock gating network (CGN) is also used to minimize the dynamic power dissipation of the overall PRBG architecture. The proposed architectures are implemented in Verilog HDL and prototyped on commercially available field-programmable gate array (FPGA) devices, Virtex-5 and Virtex-7. The realization of the proposed architecture on these FPGA devices accomplishes an improved PRBG speed with low power consumption and high randomness compared to existing techniques. The binary sequences generated by the proposed algorithms have been verified for randomness using the NIST statistical test suites.
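A rough sketch of the seed-variation idea, with a maximal-length 16-bit Fibonacci LFSR periodically refreshing the LCG states (the taps, constants, and reseed period are our assumptions, not the paper's design):

```python
# Sketch: a Fibonacci LFSR periodically supplies fresh seeds to a pair
# of coupled LCGs, so the bit sequence does not settle into one LCG cycle.
def lfsr16(state, taps=(16, 14, 13, 11)):   # x^16+x^14+x^13+x^11+1 (maximal)
    while True:
        fb = 0
        for t in taps:
            fb ^= (state >> (t - 1)) & 1
        state = ((state << 1) | fb) & 0xFFFF
        yield state

def clcg_with_variable_seeds(n_bits, reseed_every=64, m=2**16):
    seeds = lfsr16(0xACE1)
    x, y = next(seeds), next(seeds)
    for i in range(n_bits):
        if i and i % reseed_every == 0:     # inject LFSR-derived seeds
            x, y = x ^ next(seeds), y ^ next(seeds)
        x = (33 * x + 13) % m               # LCG 1 (illustrative constants)
        y = (129 * y + 29) % m              # LCG 2
        yield 1 if x > y else 0             # coupled-LCG comparator bit

print("".join(str(b) for b in clcg_with_variable_seeds(64)))
```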
Software Defined Networking (SDN) is a promising new network architecture that decouples the data plane from the control plane and logically centralizes the network topology, making the network more agile than traditional networks. However, with the continuous expansion of network scales, the single-controller SDN architecture is unable to meet the performance requirements of the network. As a result, the logically centralized but physically separated SDN multi-controller architecture has emerged, and with it the Controller Placement Problem (CPP). In order to minimize the propagation latency in Wide Area Networks (WANs), we propose the Greedy Optimized K-means Algorithm (GOKA), which combines K-means with a greedy algorithm. The main idea is to divide the network into multiple clusters, merge them greedily and iteratively until the given number of controllers is reached, and place a controller in each cluster through the K-means algorithm. To demonstrate the effectiveness of GOKA, we conduct experiments comparing it with Pareto Simulated Annealing (PSA), Adaptive Bacterial Foraging Optimization (ABFO), K-means, and K-means++ on six real topologies from the Internet Topology Zoo and Internet2 OS3E. The results demonstrate that GOKA finds better and more stable solutions than the other four heuristic algorithms, and can decrease the propagation latency by up to 83.3%, 70.7%, 88.6%, and 64.5% compared to PSA, ABFO, K-means, and K-means++, respectively. Moreover, the error rate between GOKA and the best solution is always less than 10%, which attests to the precision of our proposed algorithm.
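A simplified sketch of the greedy-merge-plus-K-means idea on planar coordinates (GOKA proper works with propagation latencies on the topology graph; Euclidean distance and centroid-based placement here are simplifications):

```python
# Simplified GOKA-style placement: greedily merge the two closest
# clusters until k remain, then place each controller at the node
# closest to its cluster centroid (a K-means-style placement step).
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def centroid(cluster):
    return (sum(p[0] for p in cluster) / len(cluster),
            sum(p[1] for p in cluster) / len(cluster))

def place_controllers(nodes, k):
    clusters = [[p] for p in nodes]
    while len(clusters) > k:                      # greedy merging phase
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: dist(centroid(clusters[ij[0]]),
                                       centroid(clusters[ij[1]])))
        clusters[i] += clusters.pop(j)
    # controller = cluster member nearest the centroid (latency proxy)
    return [min(c, key=lambda p: dist(p, centroid(c))) for c in clusters]

nodes = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11), (5, 5)]
print(place_controllers(nodes, k=2))
```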
This paper presents a reconfigurable image confusion scheme that uses a Linear Congruential Generator (LCG)-based Pseudorandom Bit Generator (PRBG). The PRBG is based on a variable input-coupled LCG with a reconfigurable clock divider. The proposed algorithm encrypts the input image up to four times successively, using different random sequences in every pass. This new scheme aims to efficiently extract statistically strong pseudorandom sequences from the proposed PRBG with a large keyspace and simultaneously increase the security level of the encrypted image. The PRBG was initially designed on Virtex-5 (XC5VLX110T), Virtex-7 (XC7VX330T), and Artix-7 (XC7A100T) Field Programmable Gate Arrays (FPGAs). The statistical properties of the proposed PRBG for four different configurations are verified by the National Institute of Standards and Technology (NIST) tests. Thereafter, a reconfigurable encryption/decryption algorithm that uses the proposed PRBG is developed for secure image encryption. The encryption process is accomplished in MATLAB after obtaining the PRBG keys from the FPGA. To show the quality and strength of the encryption process, security analysis [correlations and Number of Pixels Change Rate (NPCR)] is performed. The security analysis results are compared with a conventional encryption algorithm to show that the developed reconfigurable encryption scheme provides better results in security tests.
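To illustrate the confusion step alone, a minimal XOR encryption with an LCG keystream (the paper's reconfigurable, multi-round PRBG scheme is considerably more elaborate; the constants below are the familiar glibc LCG values, used only for illustration):

```python
# Minimal image confusion: XOR every pixel with an LCG-derived keystream
# byte; decryption is the same operation with the same key (seed).
import numpy as np

def lcg_keystream(n, seed, a=1103515245, c=12345, m=2**31):
    out, x = np.empty(n, dtype=np.uint8), seed
    for i in range(n):
        x = (a * x + c) % m
        out[i] = (x >> 16) & 0xFF     # take higher bits; they mix better
    return out

def xor_encrypt(img: np.ndarray, seed: int) -> np.ndarray:
    ks = lcg_keystream(img.size, seed).reshape(img.shape)
    return img ^ ks                   # involutive: encrypt == decrypt

img = np.random.default_rng(1).integers(0, 256, (8, 8), dtype=np.uint8)
enc = xor_encrypt(img, seed=0xBEEF)
assert np.array_equal(xor_encrypt(enc, seed=0xBEEF), img)
```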
Latency (i.e., time delay) in electronic markets affects the efficacy of liquidity-taking strategies. During the time liquidity takers process information and send marketable limit orders (MLOs) to the exchange, the limit order book (LOB) might undergo updates, so there is no guarantee that MLOs are filled. We develop a latency-optimal trading strategy that improves the marksmanship of liquidity takers. The interaction between the LOB and MLOs is modeled as a marked point process. Each MLO specifies a price limit, so the order can receive worse prices and quantities than those the liquidity taker targets if the updates in the LOB are against the interest of the trader. In our model, the liquidity taker balances the tradeoff between the cost of missing trades and the cost of walking the book. In particular, we show how to build cost-neutral strategies that, on average, trade price improvements for fewer misses. We employ techniques of variational analysis to obtain the price limit of each MLO the agent sends. The price limit of an MLO is characterized as the solution to a class of forward–backward stochastic differential equations (FBSDEs) driven by random measures. We prove the existence and uniqueness of the solution to the FBSDE and solve it numerically to illustrate the performance of the latency-optimal strategies.
The objective is to develop a general stochastic approach to delays in financial markets. We suggest such a concept in the context of large Platonic markets, which allow infinitely many assets and incorporate a restricted information setting. The discussion is divided into information delays and order execution delays. The former enables the modeling of markets where the observed information is delayed, while the latter provides the opportunity to defer the indexed time of a received asset price. Both delays may be designed randomly and inhomogeneously over time. We show that delayed markets are equipped with a fundamental theorem of asset pricing, and our main result is the inheritance of the no asymptotic Lp-free lunch condition under both delay types. Finally, we suggest an approach to verifying the absence of Lp-free lunch in markets with multiple brokers endowed with differing trading speeds.
In this paper, we propose a method to organize a tree-based Peer-to-Peer (P2P) overlay for video streaming that is resilient to a temporary reduction of the upload capacity of a node. The basic idea of the proposed method is: (1) to introduce redundancy into a given tree-structured overlay, in such a way that part of the upload capacity of each node is proactively used for connecting to a sibling node, and (2) to use those links connecting to the siblings to forward the video stream to the siblings. More specifically, we prove that even if the maximum number of children of a node temporarily reduces from m to m − k for some 1 ≤ k ≤ m − 1, the proposed method continues forwarding the video stream to all m children within at most 2x hops, where x is the smallest integer satisfying m − k ≥ m/2^x. We also derive a sufficient condition to bound the increase of the latency by an additive constant. The derived sufficient condition indicates that if each node can have at least six children in the overlay, the proposed method increases the latency by at most one, provided that the number of nodes in the overlay is at most 9331; thus, the proposed method guarantees the delivery of the video stream with nearly optimal latency.
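The hop bound is easy to evaluate numerically; a small check for m = 6 children (matching the six-children condition above):

```python
# Smallest x with m - k >= m / 2**x; per the stated bound, the stream
# then reaches all m children within at most 2*x hops.
import math

def hop_bound(m: int, k: int) -> int:
    x = math.ceil(math.log2(m / (m - k)))   # smallest x: m - k >= m / 2^x
    return 2 * x

for k in range(1, 6):
    print(f"m=6, k={k}: at most {hop_bound(6, k)} hops")
```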
Software-Defined Networking dissociates the control plane from the data plane. The problem of deciding on the number and locations of controllers and assigning switches to them has attracted the attention of researchers. Foreseeing the possibility of a controller failure, a backup controller has to be maintained for each switch so that the switches assigned to the failed controller can immediately be connected to their backup controllers. Hence, the switches do not experience disconnection when their controller fails. In this paper, two mathematical models are proposed. The first model focuses on minimizing the average latency from all switches to their backup controllers while considering controller failures. The second model aims at minimizing both the average and the worst case of the latencies from all switches to their corresponding backup controllers. Both models are evaluated on three networks and compared, in terms of two metrics (average and worst-case latency), with an existing model that focuses on minimizing only the worst-case latency. The first model gives better average latency than the reference model. The second model also gives better average latency and almost equal worst-case latency compared to the reference model.
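To make the first model concrete, a plausible integer-programming reading (our notation, not necessarily the paper's): let S be the switch set, C the candidate locations, d_{sc} the switch-to-controller latency, y_{sc} = 1 if switch s is assigned backup c, z_c = 1 if a controller is placed at c, and K the number of controllers; a constraint forcing each backup to differ from the primary would be added on top:

```latex
\min_{y,\,z}\ \frac{1}{|S|}\sum_{s\in S}\sum_{c\in C} d_{sc}\,y_{sc}
\quad\text{s.t.}\quad
\sum_{c\in C} y_{sc}=1\ \ \forall s\in S,\qquad
y_{sc}\le z_c\ \ \forall s\in S,\ c\in C,\qquad
\sum_{c\in C} z_c = K,\qquad
y_{sc},\,z_c\in\{0,1\}.
```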
In this paper, we perform a comparative study of three data replication strategies in the cloud environment: the Adaptive Data Replication Strategy (ADRS), the Dynamic Cost Aware Re-Replication and Rebalancing Strategy (DCR2S), and the Efficient Placement Algorithm (EPA). These three techniques are implemented in Java, and a performance analysis is conducted to study them across various parameters: load variance, response time, probability of file availability, System Byte Effective Rate (SBER), latency, and fault ratio. The analysis shows that varying the number of file replicas produces deviations in the outcomes of these parameters. The comparative results are also analyzed.
The control plane plays an essential role in the implementation of the Software Defined Network (SDN) architecture. Basically, the control plane is an isolated process that operates on the control layer. The control layer encompasses controllers that provide a global view of the entire SDN. Controller selection is crucial for the network administrator to meet the specific use case. This research work mainly focuses on obtaining a better SDN controller. Initially, the SDN controllers are ranked using an integrated Analytic Hierarchy Process and Technique for Order Preference by Similarity to Ideal Solution (AHP and TOPSIS) method. This facilitates selecting a minimal number of controllers based on their features for the SDN application. Then, a performance evaluation is carried out using the CBENCH tool for the four best-ranked controllers obtained from the previous step. In addition, the results are validated on real-world Internet topologies, such as Abilene and ERNET, considering the delay factor. The results show that the Floodlight controller responds best in terms of latency and throughput. The selected optimal controller, Floodlight, evaluated on the real-world Internet topologies, outperforms the others by obtaining paths with a 28.57% decrease in delay on Abilene and 16.94% on ERNET. The proposed work can be applied to high-traffic SDN applications.
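For reference, the TOPSIS ranking step in brief (the criteria, weights, and scores below are invented for illustration and are not the paper's evaluation data):

```python
# TOPSIS: rank alternatives by closeness to the ideal solution.
import numpy as np

def topsis(scores: np.ndarray, weights: np.ndarray, benefit: np.ndarray):
    """scores: alternatives x criteria; benefit[j] True if higher is better."""
    norm = scores / np.linalg.norm(scores, axis=0)      # vector normalization
    v = norm * weights                                  # weighted matrix
    ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))
    anti = np.where(benefit, v.min(axis=0), v.max(axis=0))
    d_pos = np.linalg.norm(v - ideal, axis=1)
    d_neg = np.linalg.norm(v - anti, axis=1)
    return d_neg / (d_pos + d_neg)                      # closeness in [0, 1]

# Made-up example: 3 controllers x 3 criteria (throughput, latency, features)
scores = np.array([[9.0, 2.0, 7.0], [7.0, 1.5, 8.0], [8.0, 3.0, 6.0]])
weights = np.array([0.5, 0.3, 0.2])
benefit = np.array([True, False, True])   # latency: lower is better
print(topsis(scores, weights, benefit))   # higher closeness = better rank
```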