Performance

  1. A Note on Disk Drag Dynamics.

    Authors: Neil J. Gunther
    Subjects: Performance
    Abstract

    The electrical power consumed by typical magnetic hard disk drives (HDD) not
    only increases linearly with the number of spindles but, more significantly, it
    increases as very fast power-laws of speed (RPM) and diameter. Since the
    theoretical basis for this relationship is neither well-known nor readily
    accessible in the literature, we show how these exponents arise from
    aerodynamic disk drag and discuss their import for green storage capacity
    planning.

  2. Cross-entropy optimisation of importance sampling parameters for statistical model checking.

    Authors: Axel Legay, Cyrille Jégourel, Sean Sedwards
    Subjects: Performance
    Abstract

    Statistical model checking avoids the exponential growth of states associated
    with probabilistic model checking by estimating properties from multiple
    executions of a system and by giving results within confidence bounds. Rare
    properties are often very important but pose a particular challenge for
    simulation-based approaches, hence a key objective under these circumstances is
    to reduce the number and length of simulations necessary to produce a given
    level of confidence.

  3. Mathematical Model for the Optimal Utilization Percentile in M/M/1 Systems: A Contribution about Knees in Performance Curves.

    Authors: Francisco A. Gonzalez-Horta, Rogerio A. Enriquez-Caldera, Juan M. Ramirez-Cortes, Jorge Martinez-Carballido, Eldamira Buenfil-Alpuche
    Subjects: Performance
    Abstract

    Performance curves of queueing systems can be analyzed by separating them
    into three regions: the flat region, the knee region, and the exponential
    region. Practical considerations usually locate the knee region between 70% and
    90% of the theoretical maximum utilization. However, there is no clear agreement
    about where the boundaries between regions are, and where exactly the
    utilization knee is located. An open debate about knees in performance curves
    was undertaken at least 20 years ago.
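
    As a rough illustration of the curve shape under discussion, the sketch below
    tabulates the textbook M/M/1 mean response time R = S / (1 - rho), where S is
    the mean service time and rho the utilization. This is the standard formula,
    not the optimal-percentile model proposed in the paper; the function name
    mm1_response_time and the sample utilizations are assumptions chosen for
    illustration.

```python
def mm1_response_time(utilization: float, service_time: float = 1.0) -> float:
    """Mean response time R = S / (1 - rho) of an M/M/1 queue.

    `utilization` is rho (dimensionless, 0 <= rho < 1) and `service_time`
    is the mean service time S in arbitrary time units.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must satisfy 0 <= rho < 1")
    return service_time / (1.0 - utilization)


if __name__ == "__main__":
    # Response time stays nearly flat at low load and grows steeply as rho -> 1.
    for rho in (0.10, 0.50, 0.70, 0.80, 0.90, 0.95, 0.99):
        print(f"rho = {rho:.2f}   R/S = {mm1_response_time(rho):7.2f}")
```

    The table shows why the 70-90% range is commonly cited: the stretch factor
    R/S is still small at rho = 0.5 but grows quickly beyond rho = 0.9.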

  4. Pushing the limits for medical image reconstruction on recent standard multicore processors.

    Authors: Jan Treibig, Gerhard Wellein, Georg Hager, Hannes G. Hofmann, Joachim Hornegger
    Subjects: Performance
    Abstract

    Volume reconstruction by backprojection is the computational bottleneck in
    many interventional clinical computed tomography (CT) applications. Today,
    vendors in this field are replacing special-purpose hardware accelerators with
    standard hardware like multicore chips and GPGPUs. This paper presents low-level
    optimizations for the backprojection algorithm, guided by a thorough
    performance analysis on four generations of Intel multicore processors
    (Harpertown, Westmere, Nehalem EX, and Sandy Bridge).

  5. Allocation and Admission Policies for Service Streams.

    Authors: Michele Mazzucco, Isi Mitrani, Mike Fisher, Paul McKee
    Subjects: Performance
    Abstract

    A service provisioning system is examined, where a number of servers are used
    to offer different types of services to paying customers. A customer is charged
    for the execution of a stream of jobs; the number of jobs in the stream and the
    rate of their submission are specified. On the other hand, the provider promises
    a certain quality of service (QoS), measured by the average waiting time of the
    jobs in the stream. A penalty is paid if the agreed QoS requirement is not met.
    The objective is to maximize the total average revenue per unit time.

  6. Profit-Aware Server Allocation for Green Internet Services.

    Authors: Michele Mazzucco, Dmytro Dyachuk, Marios Dikaiakos
    Subjects: Performance
    Abstract

    A server farm is examined, where a number of servers are used to offer a
    service to impatient customers. Every completed request generates a certain
    amount of profit, running servers consume electricity for power and cooling,
    while waiting customers might leave the system before receiving service if they
    experience excessive delays. A dynamic allocation policy aiming at satisfying
    the conflicting goals of maximizing the quality of users' experience while
    minimizing the cost for the provider is introduced and evaluated.

  7. Scheduling in a random environment: stability and asymptotic optimality.

    Authors: M. Jonckheere, U. Ayesta, M. Erausquin, I.M. Verloop
    Subjects: Performance
    Abstract

    We investigate the scheduling of a common resource between several concurrent
    users when the feasible transmission rate of each user varies randomly over
    time. Time is slotted and users arrive and depart upon service completion. This
    may model, for example, the flow-level behavior of end-users in a narrowband HDR
    wireless channel (CDMA 1xEV-DO). As performance criteria we consider the
    stability of the system and the mean delay experienced by the users.

  8. Computationally Efficient Modulation Level Classification Based on Probability Distribution Distance Functions.

    Authors: Paulo Urriza, Eric Rebeiz, Przemysław Pawełczak, Danijela Čabrić
    Subjects: Performance
    Abstract

    We present a novel modulation level classification (MLC) method based on
    probability distribution distance functions. The proposed method uses modified
    Kuiper and Kolmogorov-Smirnov (KS) distances to achieve low computational
    complexity and outperforms the state-of-the-art methods based on cumulants and
    goodness-of-fit (GoF) tests. We derive the theoretical performance of the
    proposed MLC method and verify it via simulations. The best classification
    accuracy under AWGN with SNR mismatch and phase jitter is achieved with the
    proposed MLC method using Kuiper distances.
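
    For readers unfamiliar with the two distances named above, the following
    sketch computes the standard two-sample Kolmogorov-Smirnov and Kuiper
    distances between empirical CDFs. The paper uses modified variants that are
    not described in this abstract, so this is only the unmodified textbook form;
    the helper names ecdf_differences, ks_distance and kuiper_distance are
    assumptions made for illustration.

```python
from bisect import bisect_right


def ecdf_differences(sample_a, sample_b):
    """Return F_a(x) - F_b(x) at every observed point x.

    F_a and F_b are the empirical CDFs of the two samples. Because both are
    right-continuous step functions, the extreme differences are attained at
    the observed points.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))
    return [bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b)
            for x in points]


def ks_distance(sample_a, sample_b):
    """Two-sample KS distance: sup |F_a - F_b|."""
    return max(abs(d) for d in ecdf_differences(sample_a, sample_b))


def kuiper_distance(sample_a, sample_b):
    """Two-sample Kuiper distance: sup (F_a - F_b) + sup (F_b - F_a)."""
    diffs = ecdf_differences(sample_a, sample_b)
    d_plus = max(max(diffs), 0.0)                 # largest excess of F_a over F_b
    d_minus = max(max(-d for d in diffs), 0.0)    # largest excess of F_b over F_a
    return d_plus + d_minus


if __name__ == "__main__":
    a, b = [1, 4], [2, 3]
    print(ks_distance(a, b), kuiper_distance(a, b))
```

    The Kuiper distance adds the largest positive and negative deviations, so it
    is at least as large as the KS distance for the same pair of samples.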

  9. A framework to experiment optimizations for real-time and embedded software.

    Authors: Hugues Cassé, Karine Heydemann, Haluk Ozaktas, Jonathan Ponroy, Christine Rochange, Olivier Zendra
    Subjects: Performance
    Abstract

    Typical constraints on embedded systems include code size limits, upper
    bounds on energy consumption and hard or soft deadlines. To meet these
    requirements, it may be necessary to improve the software by applying various
    kinds of transformations like compiler optimizations, specific mapping of code
    and data in the available memories, code compression, etc. However, a
    transformation that aims at improving the software with respect to a given
    criterion might engender side effects on other criteria and these effects must
    be carefully analyzed.

  10. Performance of Multi-Channel Multi-Stage Spectrum Sensing.

    Authors: Przemysław Pawełczak, Danijela Čabrić, Wesam Gabran
    Subjects: Performance
    Abstract

    We present an analytical framework which enables performance evaluation of
    different multi-channel multi-stage spectrum sensing protocols for
    Opportunistic Spectrum Access networks. Analyzed performance metrics include
    the average secondary user throughput and the average collision probability
    between the primary and secondary users. The analysis framework takes into
    account buffering of incoming secondary user traffic, parallel and single
    channel access, as well as prolonged channel observation periods at the first
    and last stage of sensing.

  11. Mantis: Predicting System Performance through Program Analysis and Modeling.

    Authors: Ling Huang, Petros Maniatis, Byung-Gon Chun, Mayur Naik, Sangmin Lee
    Subjects: Performance
    Abstract

    We present Mantis, a new framework that automatically predicts program
    performance with high accuracy. Mantis integrates techniques from programming
    languages and machine learning for performance modeling, and is a radical
    departure from traditional approaches. Mantis extracts program features, which
    are information about program execution runs, through program instrumentation.
    It uses machine learning techniques to select features relevant to performance
    and creates prediction models as a function of the selected features.

  12. Low Power Reversible Parallel Binary Adder/Subtractor.

    Authors: H G Rangaraju, U. Venugopal, K N Muralidhara, K B Raja
    Subjects: Performance
    Abstract

    In recent years, reversible logic has become an increasingly prominent
    technology, with applications in low-power CMOS, quantum computing,
    nanotechnology, and optical computing. Reversibility plays an important role
    when energy-efficient computations are considered. In this paper, three designs
    (Design I, Design II and Design III) of a reversible eight-bit parallel binary
    adder/subtractor are proposed. In all three design approaches, the full adder
    and subtractor are realized in a single unit, as compared to only a full
    subtractor in the existing design.

  13. Forever Young: Aging Control In DTNs.

    Authors: Eitan Altman, Rachid El-Azouzi, Daniel Sadoc Menasche, Yuedong Xu
    Subjects: Performance
    Abstract

    The demand for Internet services that require frequent updates through small
    messages, also known as microblogging, has grown tremendously in the past few
    years. Although the use of such applications by domestic users is usually free,
    their access from mobile devices is subject to fees and consumes energy from
    limited batteries. If a user activates his mobile device and is in range of a
    service provider, a content update is received at the expense of monetary and
    energy costs. Thus, users face a tradeoff between such costs and the aging of
    their messages.

  14. Performance bounds in wormhole routing, a network calculus approach.

    Authors: Nadir Farhi, Bruno Gaujal
    Subjects: Performance
    Abstract

    We present a model of performance bound calculus on feedforward networks
    where data packets are routed under the wormhole routing discipline. We are
    interested in determining maximum end-to-end delays and backlogs of messages or
    packets going from a source node to a destination node, through a given virtual
    path in the network. Our objective here is to give a network calculus approach
    for calculating the performance bounds. First we propose a new concept of
    curves that we call packet curves.

  15. A new tool for the performance analysis of massively parallel computer systems.

    Authors: Anton Stefanek, Richard Hayden, Jeremy Bradley
    Subjects: Performance
    Abstract

    We present a new tool, GPA, that can generate key performance measures for
    very large systems. Based on solving systems of ordinary differential equations
    (ODEs), this method of performance analysis is far more scalable than
    stochastic simulation. The GPA tool is the first to produce higher moment
    analysis from differential equation approximation, which is essential, in many
    cases, to obtain an accurate performance prediction. We identify so-called
    switch points as the source of error in the ODE approximation.

  16. Performance Evaluation of Components Using a Granularity-based Interface Between Real-Time Calculus and Timed Automata.

    Authors: Karine Altisen, Yanhong Liu, Matthieu Moy
    Subjects: Performance
    Abstract

    To analyze complex and heterogeneous real-time embedded systems, recent works
    have proposed interface techniques between real-time calculus (RTC) and timed
    automata (TA), in order to take advantage of the strengths of each technique
    for analyzing various components. But the time to analyze a state-based
    component modeled by TA may be prohibitively high, due to the state space
    explosion problem. In this paper, we propose a framework of granularity-based
    interfacing to speed up the analysis of a TA modeled component.

  17. Decentralized Fair Scheduling in Two-Hop Relay-Assisted Cognitive OFDMA Systems.

    Authors: Vincent K. N. Lau, Ying Cui, Rui Wang
    Subjects: Performance
    Abstract

    In this paper, we consider a two-hop relay-assisted cognitive downlink OFDMA
    system (referred to as the secondary system) dynamically accessing a spectrum licensed to
    a primary network, thereby improving the efficiency of spectrum usage. A
    cluster-based relay-assisted architecture is proposed for the secondary system,
    where relay stations are employed for minimizing the interference to the users
    in the primary network and achieving fairness for cell-edge users.

  19. Magnetohydrodynamics on Heterogeneous architectures: a performance comparison.

    Authors: Bijia Pang, Ue-li Pen, Michael Perrone
    Subjects: Performance
    Abstract

    We present magneto-hydrodynamic simulation results for heterogeneous systems.
    Heterogeneous architectures comprise many-core units with high floating-point
    performance, hosted in conventional server nodes. Examples include Graphics
    Processing Units (GPUs) and Cell. They offer potentially large gains in
    performance at modest power and monetary cost. We implemented a
    magneto-hydrodynamic (MHD) simulation code on a variety of heterogeneous and
    multi-core architectures (multi-core x86, Cell, Nvidia and ATI GPU) in
    different languages: FORTRAN, C, Cell, CUDA and OpenCL.

  20. Efficient multicore-aware parallelization strategies for iterative stencil computations.

    Authors: Jan Treibig, Gerhard Wellein, Georg Hager
    Subjects: Performance
    Abstract

    Stencil computations consume a major part of runtime in many scientific
    simulation codes. As prototypes for this class of algorithms we consider the
    iterative Jacobi and Gauss-Seidel smoothers and aim at highly efficient
    parallel implementations for cache-based multicore architectures. Temporal
    cache blocking is a known advanced optimization technique, which can reduce the
    pressure on the memory bus significantly. We apply and refine this optimization
    for a recently presented temporal blocking strategy designed to explicitly
    utilize multicore characteristics.

  21. On the stability of flow-aware CSMA.

    Authors: T. Bonald, M. Feuillet
    Subjects: Performance
    Abstract

    We consider a wireless network where each flow (instead of each link) runs
    its own CSMA (Carrier Sense Multiple Access) algorithm. Specifically, each flow
    attempts to access the radio channel after some random time and transmits a
    packet if the channel is sensed idle. We prove that, unlike the standard CSMA
    algorithm, this simple distributed access scheme is optimal in the sense that
    the network is stable for all traffic intensities in the capacity region of the
    network.

  22. A Rank Based Replacement Policy for Multimedia Server Cache Using Zipf-Like Law.

    Authors: T R GopalaKrishnan Nair, P Jayarekha
    Subjects: Performance
    Abstract

    The cache replacement algorithm plays an important role in the overall
    performance of a proxy-server system. In this paper we propose a VoD cache
    memory replacement algorithm for a multimedia server system: a rank-based
    cache replacement policy that manages the cache space in each individual
    proxy server cache.

  23. Measuring Bandwidth for Super Computer Workloads.

    Authors: A. Neela Madheswari, R. S. D. Wahida Banu
    Subjects: Performance
    Abstract

    Parallel computing plays a major role in almost all fields, from research to
    the solving of problems of major concern. Much research still focuses on
    parallel processing. Nowadays its use extends to end-user applications, such
    as GPUs, as well as to multi-core processor development.

  24. The Missing Piece Syndrome in Peer-to-Peer Communication.

    Authors: Ji Zhu, Bruce Hajek
    Subjects: Performance
    Abstract

    Typical protocols for peer-to-peer file sharing over the Internet divide
    files to be shared into pieces. New peers strive to obtain a complete
    collection of pieces from other peers and from a seed. In this paper we
    identify a problem that can occur if the seeding rate is not large enough. The
    problem is that, even if the statistics of the system are symmetric in the
    pieces, there can be symmetry breaking, with one piece becoming very rare. If
    peers depart after obtaining a complete collection, they can tend to leave
    before helping other peers receive the rare piece.

  25. Modeling the Probability of Failure on LDAP Binding Operations in Iplanet Web Proxy 3.6 Server.

    Authors: Alejandro Chinea Manrique de Lara
    Subjects: Performance
    Abstract

    This paper is devoted to the theoretical analysis of a problem derived from
    the interaction between two Iplanet products: the Web Proxy Server and the Directory
    Server. In particular, a probabilistic and stochastic-approximation model is
    proposed to minimize the occurrence of LDAP connection failures in Iplanet Web
    Proxy 3.6 Server. The proposed model serves not only to provide a
    parameterization of the aforementioned phenomena, but also to provide
    meaningful insights illustrating and supporting these theoretical results.

  26. A Performance Study of GA and LSH in Multiprocessor Job Scheduling.

    Authors: G. Padmavathi, S. R. Vijayalakshmi
    Subjects: Performance
    Abstract

    Multiprocessor task scheduling is an important and computationally difficult
    problem. This paper presents a comparison study of the genetic algorithm and the
    list scheduling algorithm. Both algorithms are naturally parallelizable but have
    heavy data dependencies. Based on experimental results, this paper presents a
    detailed analysis of the scalability, advantages and disadvantages of each
    algorithm. Multiprocessors have emerged as a powerful computing means for
    running real-time applications, especially where a uni-processor system would
    not be sufficient to execute all the tasks.

  27. Performance Analysis of Software to Hardware Task Migration in Codesign.

    Authors: Dorsaf Sebai, Abderrazak Jemai, Imed Bennour
    Subjects: Performance
    Abstract

    The complexity of multimedia applications, in terms of computational intensity
    and heterogeneity of the data processed, has led designers to embed them in
    multiprocessor systems-on-chip. The complexity of these systems on the one hand
    and the expectations of consumers on the other complicate the designers' job of
    conceiving and delivering robust and successful systems within the shortest
    deadlines. They have to explore the different solutions of the design space
    and estimate their performance in order to identify the solution that
    satisfies their design constraints.

  28. Fault Tolerant Real Time Systems.

    Authors: T.R.Gopalakrishnan Nair, A. Christy Persya
    Subjects: Performance
    Abstract

    Real time systems are systems in which there is a commitment for timely
    response by the computer to external stimuli. Real time applications have to
    function correctly even in the presence of faults. Fault tolerance can be
    achieved by hardware, software, or time redundancy. Safety-critical applications
    have strict time and cost constraints, which means that not only must faults be
    tolerated, but the constraints must also be satisfied. Deadline scheduling
    means that the task with the earliest required response time is processed first.

  29. On the Model Transform in Stochastic Network Calculus.

    Authors: Kui Wu, Yuming Jiang, Jie Li
    Subjects: Performance
    Abstract

    Stochastic network calculus requires special care in the search for proper
    stochastic traffic arrival models and stochastic service models. A tradeoff must
    be considered among the feasibility of the analysis of performance bounds,
    the usefulness of performance bounds, and the ease of their numerical
    calculation. In theory, transforms between different traffic arrival models and
    transforms between different service models are possible. Nevertheless, the
    impact of the model transform on performance bounds has not been thoroughly
    investigated.

  30. RapidMind: Portability across Architectures and its Limitations.

    Authors: Iris Christadler, Volker Weinberg
    Subjects: Performance
    Abstract

    Recently, hybrid architectures using accelerators like GPGPUs or the Cell
    processor have gained much interest in the HPC community. The RapidMind
    Multi-Core Development Platform is a programming environment that allows
    generating code which is able to seamlessly run on hardware accelerators like
    GPUs or the Cell processor and multicore CPUs both from AMD and Intel.

  31. Sharp utilization thresholds for some real-time scheduling problems.

    Authors: Sathish Gopalakrishnan
    Subjects: Performance
    Abstract

    Scheduling policies for real-time systems exhibit threshold behavior that is
    related to the utilization of the task set they schedule, and in some cases
    this threshold is sharp. For the rate monotonic scheduling policy, we show that
    periodic workload with utilization less than a threshold $U_{RM}^{*}$ can be
    scheduled almost surely and that all workload with utilization greater than
    $U_{RM}^{*}$ is almost surely not schedulable.
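
    For context only, the sketch below evaluates the classical Liu and Layland
    sufficient utilization bound for rate monotonic scheduling, n(2^{1/n} - 1).
    This worst-case bound is a different quantity from the sharp threshold
    $U_{RM}^{*}$ discussed in the abstract, whose value is not given here. The
    function names and the example task sets are assumptions chosen for
    illustration.

```python
def liu_layland_bound(n: int) -> float:
    """Classical sufficient utilization bound n * (2**(1/n) - 1) for n tasks."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n * (2.0 ** (1.0 / n) - 1.0)


def passes_sufficient_test(tasks) -> bool:
    """Check a periodic task set against the Liu and Layland bound.

    `tasks` is a list of (execution_time, period) pairs. Returning True
    guarantees rate monotonic schedulability; returning False is inconclusive,
    because the bound is sufficient but not necessary.
    """
    utilization = sum(c / t for c, t in tasks)
    return utilization <= liu_layland_bound(len(tasks))


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 100):
        print(f"n = {n:3d}   bound = {liu_layland_bound(n):.4f}")
    print(passes_sufficient_test([(1, 4), (1, 5), (2, 10)]))
```

    The bound decreases toward ln 2 (about 0.693) as the number of tasks grows,
    which is why task sets above that utilization need a more precise analysis.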

  32. Performance Evaluation of Wimax Physical Layer under Adaptive Modulation Techniques and Communication Channels.

    Authors: Md. Ashraful Islam, Riaz Uddin Mondal, Md. Zahid Hasan
    Subjects: Performance
    Abstract

    Wimax (Worldwide Interoperability for Microwave Access) is a promising
    technology which can offer high speed voice, video and data service up to the
    customer end. The aim of this paper is the performance evaluation of a Wimax
    system under different combinations of digital modulation (BPSK, QPSK, 4 QAM
    and 16 QAM) and different communication channels: AWGN and fading channels
    (Rayleigh and Rician). The Wimax system incorporates a Reed-Solomon (RS)
    encoder combined with a convolutional encoder, with rate one-half and rate
    two-thirds codes, in FEC channel coding.

  33. uFLIP: Understanding Flash IO Patterns.

    Authors: Luc Bouganim, Björn Jónsson, Philippe Bonnet
    Subjects: Performance
    Abstract

    Does the advent of flash devices constitute a radical change for secondary
    storage? How should database systems adapt to this new form of secondary
    storage? Before we can answer these questions, we need to fully understand the
    performance characteristics of flash devices. More specifically, we want to
    establish what kind of IOs should be favored (or avoided) when designing
    algorithms and architectures for flash-based systems. In this paper, we focus
    on flash IO patterns, which capture relevant distributions of IOs in time and
    space, and our goal is to quantify their performance.

  34. Experimental Performances Analysis of Load Balancing Algorithms in IEEE 802.11.

    Authors: Hamdi Salah, Soudani Adel, Tourki Rached
    Subjects: Performance
    Abstract

    In IEEE 802.11, load balancing algorithms (LBA) consider only the associated
    stations to balance the load of the available access points (APs). However,
    even when the APs are balanced, a poor situation arises if an AP has a lower
    signal-to-noise ratio (SNR) than its neighboring APs. Balancing the load and
    associating a mobile station with an access point without regard to the SNR of
    the AP may therefore lead to unforeseen QoS in terms of bit rate, end-to-end
    delay, and packet loss.
