Architecture

  1. An efficient FPGA implementation of MRI image filtering and tumor characterization using Xilinx system generator.

    Authors: S. Allin Christe, M. Vignesh, A. Kandaswamy
    Subjects: Architecture
    Abstract

    This paper presents an efficient architecture for various image filtering
    algorithms and tumor characterization using Xilinx System Generator (XSG). This
    architecture offers an alternative through a graphical user interface that
    combines MATLAB, Simulink and XSG, and explores important aspects of
    hardware implementation. The performance of this architecture, implemented on
    a Spartan-3E Starter Kit (XC3S500E-FG320), exceeds that of architectures
    using similar or greater resources. The proposed architecture reduces the
    use of the resources available on the target device by 50%.

  2. Low-Latency SC Decoder Architectures for Polar Codes.

    Authors: Chuan Zhang, Keshab K. Parhi, Bo Yuan
    Subjects: Architecture
    Abstract

    Polar codes are becoming one of the most favorable capacity-achieving
    error correction codes because of their low encoding and decoding
    complexity. However, due to the large code length required by practical
    applications, the few existing successive cancellation (SC) decoder
    implementations still suffer not only from high hardware cost but also from
    long decoding latency. This paper presents several novel approaches to
    designing low-latency decoders for polar codes based on look-ahead techniques.

  3. Reduced-Latency SC Polar Decoder Architectures.

    Authors: Chuan Zhang, Keshab K. Parhi, Bo Yuan
    Subjects: Architecture
    Abstract

    Polar codes have become one of the most favorable capacity-achieving error
    correction codes (ECC), along with their simple encoding method. However, among
    the very few prior successive cancellation (SC) polar decoder designs, the
    required long code length makes the decoding latency high. In this paper, the
    conventional decoding algorithm is transformed with look-ahead techniques. This
    reduces the decoding latency by 50%. With pipelining and parallel processing
    schemes, a parallel SC polar decoder is proposed.

  4. Efficient Network for Non-Binary QC-LDPC Decoder.

    Authors: Chuan Zhang, Keshab K. Parhi
    Subjects: Architecture
    Abstract

    This paper presents approaches to developing efficient networks for non-binary
    quasi-cyclic LDPC (QC-LDPC) decoders. By exploiting the intrinsic shifting and
    symmetry properties of the check matrices, a significant reduction in memory
    size and routing complexity can be achieved. Two efficient network
    architectures are proposed, one for Class-I and one for Class-II non-binary
    QC-LDPC decoders.

  5. A Design Methodology for Folded, Pipelined Architectures in VLSI Applications using Projective Space Lattices.

    Authors: Hrishikesh Sharma, Sachin Patkar
    Subjects: Architecture
    Abstract

    Semi-parallel, or folded, VLSI architectures are used whenever hardware
    resources need to be saved at design time. Most recent applications that are
    based on Projective Geometry (PG) based balanced bipartite graphs also fall in
    this category. In this paper, we provide a high-level, top-down design
    methodology for designing optimal semi-parallel architectures for applications
    whose Data Flow Graph (DFG) is based on a PG bipartite graph. Such applications
    have been found, e.g., in error-control coding and matrix computations.

  6. Reversible arithmetic logic unit.

    Authors: Rigui Zhou, Yang Shi, Manqun Zhang
    Subjects: Architecture
    Abstract

    Quantum computers require quantum arithmetic. The design of a reversible
    arithmetic logic unit (reversible ALU) for quantum arithmetic is
    investigated in this letter. We provide an explicit construction of a
    reversible ALU that performs basic arithmetic operations. By providing the
    corresponding control unit, the proposed reversible ALU can combine
    classical arithmetic and logic operations in a reversible integrated system.
    This letter provides concrete evidence that a reversible Programmable Logic
    Device (RPLD) can be realized using the reversible ALU.

  7. HMTT: A Hybrid Hardware/Software Tracing System for Bridging Memory Trace's Semantic Gap.

    Authors: Yan Zhu, Yungang Bao, Jinyong Zhang, Dan Tang, Yuan Ruan, Mingyu Chen, Jianping Fan
    Subjects: Architecture
    Abstract

    Memory trace analysis is an important technology for architecture research,
    system software (e.g., OS, compiler) optimization, and application performance
    improvement. Hardware snooping is an effective and efficient approach to
    monitoring and collecting memory traces. Compared with traces from
    software-based approaches, memory traces collected by hardware-based approaches
    usually lack semantic information, such as process/function/loop identifiers,
    virtual addresses and I/O accesses.

  8. Multi-Amdahl: Optimal Resource Sharing with Multiple Program Execution Segments.

    Authors: Tsahee Zidenberg, Isaac Keslassy, Uri Weiser
    Subjects: Architecture
    Abstract

    This paper presents Multi-Amdahl, a resource allocation analytical tool for
    heterogeneous systems. Our model includes multiple program execution segments,
    where each one is accelerated by a specific hardware unit. The acceleration
    speedup of the specific hardware unit is a function of a limited resource, such
    as the unit area, power, or energy. Using the Lagrange theorem we discover the
    optimal resource distribution between all specific units. We then illustrate
    this general Multi-Amdahl technique using several examples of area and power
    allocation among several cores and accelerators.
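
    The Lagrange condition mentioned in this abstract can be sketched for one
    concrete case. The sketch below is not taken from the paper: it assumes,
    purely for illustration, that every accelerator follows a square-root
    speedup law (speedup = sqrt(area), in the spirit of Pollack's rule) and that
    the only constraint is a fixed total area. Under that assumption, minimizing
    sum(t_i / sqrt(a_i)) subject to sum(a_i) = A gives a_i proportional to
    t_i^(2/3). The function names are hypothetical.

```python
def optimal_areas(segment_times, total_area):
    """Split total_area among accelerators, one per execution segment.

    Assumption (not from the abstract): segment i runs in
    segment_times[i] / sqrt(area_i). Setting the derivative of the
    Lagrangian to zero gives t_i / (2 * a_i**1.5) = lambda for every i,
    hence a_i is proportional to t_i ** (2/3).
    """
    weights = [t ** (2.0 / 3.0) for t in segment_times]
    scale = total_area / sum(weights)
    return [w * scale for w in weights]


def total_time(segment_times, areas):
    """Total execution time under the assumed square-root speedup law."""
    return sum(t / a ** 0.5 for t, a in zip(segment_times, areas))


if __name__ == "__main__":
    times = [8.0, 1.0]          # unaccelerated time of each segment
    areas = optimal_areas(times, total_area=5.0)
    print("areas:", areas)
    print("optimal time:", total_time(times, areas))
    print("equal split :", total_time(times, [2.5, 2.5]))
```

    With segment times 8 and 1 and a total area of 5, the closed form assigns
    roughly 4 area units to the first accelerator and 1 to the second; any
    other split of the same area gives a longer total time under this model.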

  9. Improving Network-on-Chip-based turbo decoder architectures.

    Authors: Maurizio Martina, Guido Masera
    Subjects: Architecture
    Abstract

    In this work, novel results concerning Network-on-Chip-based turbo decoder
    architectures are presented. Stemming from previous publications, this work
    concentrates first on improving the throughput by exploiting adaptive-bandwidth
    reduction techniques. In the best case, these techniques show an improvement of
    more than 60 Mb/s. Moreover, it is known that double-binary turbo decoders
    require higher area than binary ones. This characteristic has the negative
    effect of increasing the data width of the network nodes.

  10. High Speed Multiple Valued Logic Full Adder Using Carbon Nano Tube Field Effect Transistor.

    Authors: Ashkan Khatir, Shaghayegh Abdolahzadegan, Iman Mahmoudi
    Subjects: Architecture
    Abstract

    A high-speed Full-Adder (FA) module is a critical element in designing
    high-performance arithmetic circuits. In this paper, we propose a new high-speed
    multiple-valued logic FA module. The proposed FA is constructed from 14
    transistors and 3 capacitors, using carbon nanotube field effect transistor
    (CNFET) technology. Furthermore, our proposed technique has been examined at
    different voltages (i.e., 0.65 V and 0.9 V). The observed results reveal power
    consumption and power delay product (PDP) improvements compared to existing FA
    counterparts.

  11. No-Break Dynamic Defragmentation of Reconfigurable.

    Authors: Josef Angermeier, Tom Kamphans, Nils Schweer, Juergen Teich, Sandor Fekete, Christopher Tessars, Jan C. van der Veen, Dirk Koch
    Subjects: Architecture
    Abstract

    We propose a new method for defragmenting the module layout of a
    reconfigurable device, enabled by a novel approach for dealing with
    communication needs between relocated modules and with inhomogeneities found in
    commonly used FPGAs. Our method is based on dynamic relocation of module
    positions during runtime, with only very little reconfiguration overhead; the
    objective is to maximize the length of contiguous free space that is available
    for new modules.

  12. Multiplierless Modules for Forward and Backward Integer Wavelet Transform.

    Authors: Vasil Kolev
    Subjects: Architecture
    Abstract

    This article is about the architecture of a lossless wavelet filter bank with
    reprogrammable logic. It is based on second-generation wavelets with a
    reduced number of operations. A new basic structure for a parallel
    architecture and modules for the forward and backward integer discrete wavelet
    transform are proposed.

  13. Heuristic approach to optimize the number of test cases for simple circuits.

    Authors: SM. Thamarai, K. Kuppusamy, T. Meyyappan
    Subjects: Architecture
    Abstract

    In this paper, a new solution is proposed for testing simple two-stage
    electronic circuits. It minimizes the number of tests to be performed to
    determine the genuineness of the circuit. The main idea behind the present
    research work is to identify the maximum number of indistinguishable faults
    present in the given circuit and minimize the number of test cases based on the
    number of faults that have been detected. A heuristic approach is used for the
    test minimization part, which identifies the essential tests from the overall
    test cases.

  14. Multi-standard programmable baseband modulator for next generation wireless communication.

    Authors: Indranil Hatai, Indrajit Chakrabarti
    Subjects: Architecture
    Abstract

    Considerable research has taken place in recent times in the area of
    parameterization of software defined radio (SDR) architecture. Parameterization
    decreases the size of the software to be downloaded and also limits the
    hardware reconfiguration time. The present paper is based on the design and
    development of a programmable baseband modulator that performs the QPSK
    modulation scheme as well as its three other commonly used variants to
    satisfy the requirements of several established 2G and 3G wireless
    communication standards.

  15. Universal Numeric Segmented Display.

    Authors: S. M. Kamruzzaman, Md. Abul Kalam Azad, Rezwana Sharmeen
    Subjects: Architecture
    Abstract

    Segmented displays play a vital role in displaying numerals. In today's
    world, matrix displays are also used to display numerals, because numerals have
    many curved edges, which are better supported by matrix displays. However,
    because matrix displays are costly and complex to implement and also need more
    memory, segmented displays are generally used to display numerals.

  16. A Unique 10 Segment Display for Bengali Numerals.

    Authors: S. M. Kamruzzaman, Md. Abul Kalam Azad, Rezwana Sharmeen, Shabbir Ahmad
    Subjects: Architecture
    Abstract

    Segmented displays are widely used for efficient display of alphanumeric
    characters. English numerals are displayed by 7-segment and 16-segment displays.
    The segment size is uniform in these two display architectures. Display
    architectures using 8, 10, 11, and 18 segments have been proposed for Bengali
    numerals 0...9, yet no display architecture has been designed using segments of
    uniform size and uniform power consumption. In this paper we propose a
    uniform 10-segment architecture for Bengali numerals. This segment architecture
    uses segments of uniform size, and no bent segment is used.

  17. On the Design and Analysis of Quaternary Serial and Parallel Adders.

    Authors: Masud Hasan, Anindya Das, Ifat Jahangir
    Subjects: Architecture
    Abstract

    Optimization techniques for decreasing the time and area of adder circuits
    have been extensively studied for years, mostly in the binary logic system. In
    this paper, we provide the equations required to design a full adder in the
    quaternary logic system. We develop the equations for a single-stage parallel
    adder which works as a carry look-ahead adder. We also provide the design of a
    logarithmic-stage parallel adder which can compute the carries within log2(n)
    time delay for n qudits.
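
    To make the log2(n) carry computation concrete, here is a minimal software
    sketch. It is not the authors' set of equations: it uses the standard
    generate/propagate formulation with a Kogge-Stone style parallel prefix,
    adapted to base-4 digits, and all function names are hypothetical. Digits
    are stored least significant first, and a digit pair generates a carry when
    a + b >= 4 and propagates one when a + b == 3.

```python
def quaternary_full_adder(a, b, carry_in):
    """Add two base-4 digits plus a carry bit; return (sum_digit, carry_out)."""
    total = a + b + carry_in
    return total % 4, total // 4


def ripple_add(xs, ys):
    """Reference serial adder: n full adders chained through the carry."""
    carry, sums = 0, []
    for a, b in zip(xs, ys):
        digit, carry = quaternary_full_adder(a, b, carry)
        sums.append(digit)
    return sums, carry


def prefix_add(xs, ys):
    """Parallel-prefix adder sketch; returns (sums, carry_out, levels).

    Requires at least one digit. Each pass of the while loop models one
    logic level, so `levels` grows as ceil(log2(n)).
    """
    n = len(xs)
    gen = [1 if a + b >= 4 else 0 for a, b in zip(xs, ys)]
    prop = [1 if a + b == 3 else 0 for a, b in zip(xs, ys)]
    distance, levels = 1, 0
    while distance < n:
        new_gen, new_prop = gen[:], prop[:]
        for i in range(distance, n):
            # Combine the span ending at i with the span ending at i - distance.
            new_gen[i] = gen[i] | (prop[i] & gen[i - distance])
            new_prop[i] = prop[i] & prop[i - distance]
        gen, prop = new_gen, new_prop
        distance *= 2
        levels += 1
    carries = [0] + gen[:-1]  # carry into digit i is the carry out of digit i-1
    sums = [(a + b + c) % 4 for a, b, c in zip(xs, ys, carries)]
    return sums, gen[-1], levels


if __name__ == "__main__":
    print(ripple_add([3, 3, 3, 3], [1, 0, 0, 0]))
    print(prefix_add([3, 3, 3, 3], [1, 0, 0, 0]))
```

    For the four-digit example in the script (3333 + 1 in base 4), both adders
    return sum digits of all zeros with a carry out of 1, and the prefix version
    reports two combining levels.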

  18. Wideband Spectrum Sensing at Sub-Nyquist Rates.

    Authors: Moshe Mishali, Yonina C. Eldar
    Subjects: Architecture
    Abstract

    We present a mixed analog-digital spectrum sensing method that is especially
    suited to the typical wideband setting of cognitive radio (CR). The advantages
    of our system with respect to current architectures are threefold. First, our
    analog front-end is fixed and does not involve scanning hardware. Second, both
    the analog-to-digital conversion (ADC) and the digital signal processing (DSP)
    rates are substantially below Nyquist.

  19. BSSSN: Bit String Swapping Sorting Network for Reversible Logic Synthesis.

    Authors: Md. Saiful Islam
    Subjects: Architecture
    Abstract

    In this paper, we have introduced the notion of UselessGate and
    ReverseOperation. We have also given an algorithm to implement a sorting
    network for reversible logic synthesis based on swapping bit strings. The
    network is constructed in terms of n*n Toffoli Gates read from left to right,
    and it is shown that there will be no more gates than the number of swappings
    the algorithm requires. The gate complexity of the network is O(n^2). The
    number of gates in the network can be further reduced by a template reduction
    technique and by removing UselessGate from the network.
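
    As background for the gate-level description above, the following sketch
    shows why any cascade of Toffoli gates is reversible. It does not implement
    the authors' bit-string swapping algorithm; it only models a generalized
    Toffoli gate (flip the target bit when all control bits are 1) and checks
    that a cascade permutes the input patterns. The function names and the
    example cascade are hypothetical.

```python
from itertools import product


def toffoli(bits, controls, target):
    """Generalized Toffoli gate: flip bits[target] if every control bit is 1.

    An empty control tuple gives a NOT gate, one control gives a CNOT.
    """
    bits = list(bits)
    if all(bits[c] for c in controls):
        bits[target] ^= 1
    return tuple(bits)


def run_cascade(bits, gates):
    """Apply gates left to right; each gate is a (controls, target) pair."""
    for controls, target in gates:
        bits = toffoli(bits, controls, target)
    return bits


def truth_table(n, gates):
    """Outputs of the cascade for all 2**n input patterns, in input order."""
    return [run_cascade(b, gates) for b in product((0, 1), repeat=n)]


if __name__ == "__main__":
    cascade = [((0, 1), 2), ((0,), 1), ((), 0)]   # illustrative 3-line circuit
    table = truth_table(3, cascade)
    print("is a permutation:", len(set(table)) == 8)
    # Every Toffoli gate is its own inverse, so the reversed cascade undoes it.
    undo = list(reversed(cascade))
    print("inverse works:", all(
        run_cascade(run_cascade(b, cascade), undo) == b
        for b in product((0, 1), repeat=3)))
```

    Because each gate is self-inverse, running the same gates in reverse order
    recovers every input, which is the property reversible logic synthesis
    relies on.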

  20. Sorting Network for Reversible Logic Synthesis.

    Authors: Md. Saiful Islam, Md. Rafiqul Islam, Abdullah Al Mahmud, Muhammad Rezaul Karim
    Subjects: Architecture
    Abstract

    In this paper, we have introduced an algorithm to implement a sorting network
    for reversible logic synthesis based on swapping bit strings. The algorithm
    first constructs a network in terms of n*n Toffoli gates read from left to
    right. The number of gates in the circuit produced by our algorithm is then
    reduced by template matching and removing useless gates from the network. We
    have also compared the efficiency of the proposed method with the existing
    ones.

  21. Low Power Shift and Add Multiplier Design.

    Authors: C. N. Marimuthu, P. Thangaraj, Aswathy Ramesan
    Subjects: Architecture
    Abstract

    Today every circuit has to face the power consumption issue, both in portable
    devices aiming at long battery life and in high-end circuits that must avoid
    cooling packages and overly complex reliability issues. It is generally accepted
    that during logic synthesis power tracks well with area. This means that a
    larger design will generally consume more power. The multiplier is an important
    kernel of digital signal processors. Because of the circuit complexity, the
    power consumption and area are the two important design considerations of the
    multiplier.

  22. Distributed Fault-Tolerant Avionic Systems - A Real-Time Perspective.

    Authors: Michael Burke, Neil Audsley
    Subjects: Architecture
    Abstract

    This paper examines the problem of introducing advanced forms of
    fault-tolerance via reconfiguration into safety-critical avionic systems. This
    is required to enable increased availability after fault occurrence in
    distributed integrated avionic systems (compared to static federated systems).
    The approach taken is to identify a migration path from current architectures
    to those that incorporate re-configuration to a lesser or greater degree.

  23. Static Address Generation Easing: a Design Methodology for Parallel Interleaver Architectures.

    Authors: Cyrille Chavet, Philippe Coussy, Eric Martin, Pascal Urard
    Subjects: Architecture
    Abstract

    For high throughput applications, turbo-like iterative decoders are
    implemented with parallel architectures. However, to be efficient, parallel
    architectures must avoid access collisions, i.e., concurrent read/write
    accesses should not target the same memory block. This consideration applies to
    the two main classes of turbo-like codes which are Low Density Parity Check
    (LDPC) and Turbo-Codes. In this paper we propose a methodology which finds a
    collision-free mapping of the variables in the memory banks and which optimizes
    the resulting interleaving architecture.

  24. AHB Compatible DDR SDRAM Controller IP Core for ARM Based SoC.

    Authors: Dr. R. Shashikumar, C. N. Vijay Kumar, M. Nagendrakumar, C. S. Hemanthkumar
    Subjects: Architecture
    Abstract

    DDR SDRAM is similar in function to the regular SDRAM but doubles the
    bandwidth of the memory by transferring data on both edges of the clock.
    DDR SDRAM is most commonly used in various embedded applications such as
    networking, image or video processing, laptops, etc. Nowadays many applications
    need more and more cheap and fast memory. Signal processing in particular
    requires a significant amount of memory. The most used type of dynamic memory
    for that purpose is DDR SDRAM.

  25. Evaluation and Design Space Exploration of a Time-Division Multiplexed NoC on FPGA for Image Analysis Applications.

    Authors: Linlin Zhang, Virginie Fresse, Mohammed Khalid, Dominique Houzet, Anne-Claire Legrand
    Subjects: Architecture
    Abstract

    The aim of this paper is to present an adaptable Fat Tree NoC architecture
    for Field Programmable Gate Array (FPGA) designed for image analysis
    applications. Traditional NoCs (Networks on Chip) are not optimal for dataflow
    applications with large amounts of data. Conversely, point-to-point
    communications are designed from the algorithm requirements, but they are
    expensive in terms of resources and wiring. We propose a dedicated communication
    architecture for image analysis algorithms.

  26. VLSI Architectures for WIMAX Channel Decoders.

    Authors: Maurizio Martina, Guido Masera
    Subjects: Architecture
    Abstract

    This chapter describes the main architectures proposed in the literature to
    implement the channel decoders required by the WiMax standard, namely
    convolutional codes, turbo codes (both block and convolutional) and LDPC. Then
    it shows a complete design of a convolutional turbo code encoder/decoder system
    for WiMax.

  27. Maintaining Virtual Areas on FPGAs using Strip Packing with Delays.

    Authors: Josef Angermeier, Sandor P. Fekete, Tom Kamphans, Nils Schweer, Juergen Teich
    Subjects: Architecture
    Abstract

    Every year, the computing resources available on dynamically partially
    reconfigurable devices increase enormously. In the near future, we expect many
    applications to run on a single reconfigurable device. In this paper, we
    present a concept for multitasking on dynamically partially reconfigurable
    systems called virtual area management. We explain its advantages, show its
    challenges, and discuss possible solutions.

  28. An Architectural Approach for Decoding and Distributing Functions in FPUs in a Functional Processor System.

    Authors: R. Selva rani, T.R. Gopalakrishnan Nair, H.K. Krutthika
    Subjects: Architecture
    Abstract

    The main goal of this research is to develop the concepts of a revolutionary
    processor system called Functional Processor System. The fairly novel work
    carried out in this proposal concentrates on decoding function pipelines and
    distributing them among FPUs as part of a scheduling approach. As functional
    programs are super-level programs that entail requirements only at the
    functional level, decoding functions and distributing them among the
    heterogeneous functional processor units is a challenge.

  29. A Multicore Processor based Real-Time System for Automobile management application.

    Authors: T.R. Gopalakrishnan Nair, Vaidehi. M.
    Subjects: Architecture
    Abstract

    In this paper we propose an Intelligent Management System which is capable of
    managing the automobile functions using the rigorous real-time principles and a
    multicore processor in order to realize higher efficiency and safety for the
    vehicle. It depicts how various automobile functionalities can be made
    fine-grained and treated to fit real-time concepts. It also shows how modern
    multicore processors can be of good use in organizing vast amounts of
    correlated functions to be executed in real-time with excellent time
    commitments.

  30. Virtual-Threading: Advanced General Purpose Processors Architecture.

    Authors: Andrei I. Yafimau
    Subjects: Architecture
    Abstract

    The paper describes a new computer architecture, the main features of
    which have been claimed in Russian Federation patent 2312388 and in US
    patent application 11/991331. This architecture is intended to effectively
    support General Purpose Parallel Computing (GPPC), the essence of which
    is extremely frequent switching of threads between states of activity and
    states of what the paper terms algorithmic latency.

  31. A Fault-tolerant Structure for Reliable Multi-core Systems Based on Hardware-Software Co-design.

    Authors: Huazhong Yang, Bingbing Xia, Fei Qiao, Hui Wang
    Subjects: Architecture
    Abstract

    To cope with the soft errors and make full use of the multi-core system, this
    paper gives an efficient fault-tolerant hardware and software co-designed
    architecture for multi-core systems. With a modest number of test
    patterns, it uses less than 33% of the hardware resources of
    traditional hardware redundancy (TMR) and takes less than 50% of the time
    of traditional software redundancy (time redundancy). It is therefore
    a good choice of fault-tolerant architecture for future
    highly reliable multi-core systems.

  32. A Scalable VLSI Architecture for Soft-Input Soft-Output Depth-First Sphere Decoding.

    Authors: Ernst Martin Witte, Filippo Borlenghi, Gerd Ascheid, Heinrich Meyr
    Subjects: Architecture
    Abstract

    Multiple-input multiple-output (MIMO) wireless transmission imposes huge
    challenges on the design of efficient hardware architectures for iterative
    receivers. A major challenge is soft-input soft-output (SISO) MIMO demapping,
    often approached by sphere decoding (SD). In this paper, we introduce what is,
    to the best of our knowledge, the first VLSI architecture for SISO SD applying a single
    tree-search approach. Compared with a soft-output-only base architecture
    similar to the one proposed by Studer et al.

  33. Hard Data on Soft Errors: A Large-Scale Assessment of Real-World Error Rates in GPGPU.

    Authors: Imran S. Haque, Vijay S. Pande
    Subjects: Architecture
    Abstract

    Graphics processing units (GPUs) are gaining widespread use in computational
    chemistry and other scientific simulation contexts because of their huge
    performance advantages relative to conventional CPUs. However, the reliability
    of GPUs in error-intolerant applications is largely unproven. In particular, a
    lack of error checking and correcting (ECC) capability in the memory subsystems
    of graphics cards has been cited as a hindrance to the acceptance of GPUs as
    high-performance coprocessors, but the impact of this design has not been
    previously quantified.

  34. Turbo NOC: a framework for the design of Network On Chip based turbo decoder architectures.

    Authors: Maurizio Martina, Guido Masera
    Subjects: Architecture
    Abstract

    This work proposes a general framework for the design and simulation of
    network on chip based turbo decoder architectures. Several parameters in the
    design space are investigated, namely the network topology, the parallelism
    degree, the rate at which messages are sent by processing nodes over the
    network and the routing strategy.

  35. Boosting XML Filtering with a Scalable FPGA-based Architecture.

    Authors: Abhishek Mitra, Marcos Vieira, Petko Bakalov, Walid Najjar, Vassilis Tsotras
    Subjects: Architecture
    Abstract

    The growing amount of XML encoded data exchanged over the Internet increases
    the importance of XML based publish-subscribe (pub-sub) and content based
    routing systems. The input in such systems typically consists of a stream of
    XML documents and a set of user subscriptions expressed as XML queries. The
    pub-sub system then filters the published documents and passes them to the
    subscribers. Pub-sub systems are characterized by very high input ratios,
    therefore the processing time is critical.

  36. Hardware Virtualization Support In INTEL, AMD And IBM Power Processors.

    Authors: Kamanashis Biswas, Md. Ashraful Islam
    Subjects: Architecture
    Abstract

    At present, the most widely used and developed mechanism is hardware
    virtualization which provides a common platform to run multiple operating
    systems and applications in independent partitions. More precisely, it is all
    about resource virtualization as the term hardware virtualization is
    emphasized. In this paper, the aim is to find out the advantages and
    limitations of current virtualization techniques, analyze their cost and
    performance and also depict which forthcoming hardware virtualization
    techniques will be able to provide efficient solutions for multiprocessor
    operating systems.
