- Multiple Identities in BitTorrent Networks -- It deserves a separate page :-)
Peer-to-peer (P2P) file sharing systems have become ubiquitous, and at present BitTorrent (BT) based P2P systems are among the most popular and successful. It has been argued that this success is largely due to the Tit-For-Tat (TFT) strategy BT uses to discourage free-riding. However, Hale and Patarin identify weaknesses in TFT and hypothesize that it is possible to cheat using multiple identities, and that BT's success stems from other reasons, notably the lack of meta-data search. To test this hypothesis and to better understand why BT systems are so successful, we modify the official BT source code to allow one BT client to create multiple processes. These processes use different identities and cooperatively download the same file simultaneously, fetching disjoint pieces of the file under several piece selection and sharing algorithms.
Experimental results show that BT is fairly robust to the exploitation of multiple identities, with one exception. In most cases, the use of multiple identities does not yield a consistent or significant speedup. Moreover, clients with multiple identities are still punished if they do not maintain an upload rate comparable to that of normal clients. We attribute this to the robustness of the Tit-For-Tat policy.
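The non-overlapping download described above can be sketched as a simple piece-partitioning scheme. This is an illustrative sketch, not the actual modification to the official BT client; the function name and the round-robin policy are assumptions for exposition (the project tried several piece selection algorithms).

```python
def assign_pieces(num_pieces, num_identities):
    """Illustrative round-robin partition of piece indices among
    cooperating identities, so no two identities request the same piece.
    (Hypothetical helper; not from the modified BT client itself.)"""
    assignment = {i: [] for i in range(num_identities)}
    for piece in range(num_pieces):
        assignment[piece % num_identities].append(piece)
    return assignment

# Example: 10 pieces split across 3 cooperating identities.
parts = assign_pieces(10, 3)
```

Any partition works as long as the identity processes agree on it in advance; the round-robin form simply guarantees disjoint, exhaustive coverage of the file.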
- Direct File Transfer in BitTorrent Networks
Peer-to-peer (P2P) file sharing systems have become ubiquitous, and so far BitTorrent (BT) based P2P systems seem to be the most popular and successful. In this work, we develop and implement a modified BitTorrent client that supports direct file transfer between users as additional functionality. We explain our approach in detail and elaborate on our modifications to the original BitTorrent source code. Our experiments validate the methodology. We then outline directions for future work, including unique usernames across the BitTorrent network and secure file transfer.
- A Delay Aware Multi-level Clustering Protocol for Wireless Ad Hoc Networks
PDF file: Class report for CS257 Wireless and Mobile Networking
The proliferation of small wireless devices has made the development of scalable ad hoc wireless protocols critical to their success. This article presents the design and evaluation of a distributed, delay-aware clustering protocol, which forms clusters based on each node's average medium access delay. We also develop intra-/inter-cluster routing to utilize the cluster structure, in which cluster heads do not perform any special data-forwarding functions for their members but take responsibility for disseminating control and routing information. Simulation results show our protocol outperforms AODV in terms of routing control overhead and network throughput.
- Characteristic of Internet Traffic with Respect to Long-Range Dependence
PDF file: Class report for CS204 Advanced Networks
Network traffic has been modeled as a self-similar process and identified as exhibiting long-range dependence (LRD) at large time scales, in contrast with the Poisson models in use around ten years ago. In this project, we employ the WIDE backbone traces and try to demonstrate the existence of LRD at large time scales. The data we use include two-day trace files as well as traffic aggregated over the last four years. We examine various statistical characteristics of the original network traces, including the auto-correlation function, the cross-correlation function, and the output of different Hurst estimators. To eliminate the effect of short-range dependence, we also apply internal bucket shuffling to the trace files. Extensive statistical analysis of the experimental results shows that LRD does exist in the WIDE daily traces at large time scales. Since different estimation methods yield inconsistent Hurst exponents in our results, we believe it is unreliable to use only one or a few estimators to verify the existence of long-range memory. At the end of the report, we propose some future work.
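To make the role of the Hurst estimators concrete, here is a minimal sketch of one classical estimator, the aggregated-variance method: for a self-similar series the variance of the block-averaged series scales as m^(2H-2) with block size m, so H is recovered from a log-log slope. This is one generic estimator, not necessarily among the specific estimators the report evaluates; all names are illustrative.

```python
import math
import random

def hurst_aggvar(series, block_sizes=(1, 2, 4, 8, 16)):
    """Aggregated-variance Hurst estimator: average the series over
    non-overlapping blocks of size m; the sample variance of the block
    means scales as m**(2H - 2), so H comes from the log-log slope."""
    xs, ys = [], []
    for m in block_sizes:
        blocks = [sum(series[i:i + m]) / m
                  for i in range(0, len(series) - m + 1, m)]
        mean = sum(blocks) / len(blocks)
        var = sum((b - mean) ** 2 for b in blocks) / (len(blocks) - 1)
        xs.append(math.log(m))
        ys.append(math.log(var))
    # Least-squares slope of log(var) against log(m); H = 1 + slope/2.
    n = len(xs)
    slope = ((n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys))
             / (n * sum(x * x for x in xs) - sum(xs) ** 2))
    return 1 + slope / 2

random.seed(0)
iid = [random.gauss(0, 1) for _ in range(4096)]
h = hurst_aggvar(iid)  # short-memory (i.i.d.) data: estimate near 0.5
```

For LRD traffic the estimate would exceed 0.5; the report's point stands, though: different estimators disagree, so no single method should be trusted alone.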
- Intelligent Icons for Text Classification
Intelligent icons are a recently introduced visualization tool for classification. Standard file icons are replaced by automatically generated icons that reflect the content of files. Our project generates intelligent icons automatically for text classification.
In automatic text classification, for example classifying articles into predefined groups such as economics and technology, the most important step is to choose effective keywords. In some traditional classification methods, such as frequency analysis, high-frequency words are preferred and low-frequency words tend to be disregarded. However, there are often low-frequency words that are well suited to text classification. For instance, words that appear only in specific categories tend to have relatively low frequencies, yet they are effective keywords. Therefore, we need to dynamically select such effective keywords from a collection of documents.
After automatically generating effective keywords, we classify text files into categories based on the keyword frequencies in each file by computing the Euclidean distance. We then generate intelligent icons for these test files for better presentation and classification.
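The keyword-frequency classification step can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the project's actual code: it assumes per-category centroid vectors of keyword frequencies and a nearest-centroid rule by Euclidean distance, and all keywords and numbers are made up.

```python
import math

def keyword_vector(text, keywords):
    """Represent a document as its keyword-frequency vector."""
    words = text.lower().split()
    return [words.count(k) for k in keywords]

def classify(text, centroids, keywords):
    """Assign the document to the category whose keyword-frequency
    centroid is nearest in Euclidean distance. (Hypothetical sketch.)"""
    v = keyword_vector(text, keywords)
    def dist(label):
        c = centroids[label]
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, c)))
    return min(centroids, key=dist)

# Toy setup: two categories with hand-picked keyword centroids.
keywords = ["market", "stock", "chip", "software"]
centroids = {"economics": [3, 2, 0, 0], "technology": [0, 0, 2, 3]}
label = classify("the chip maker shipped new software", centroids, keywords)
```

In practice the centroids would be computed from the training documents in each category, and the resulting label would drive which intelligent icon is rendered for the file.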
- Computer Science: Craft, Science or Engineering?
PDF file: Class report for CS245 Software Evolution
There has been a lot of philosophical and pragmatic debate on whether Computer Science is a craft, a science, or an engineering discipline. This article examines both Paul Graham's claim that hackers and painters have a lot in common and Dr. Tolbert's claim that Computer Science is a craft on its way to becoming an engineering discipline. We conclude that both claims have truth in them and argue that Computer Science is becoming a rigorous engineering discipline while still leaving plenty of room for artistic creativity that demands great craft skill.
- Survey on Power-Aware Design for High-performance Processors
Today, power and energy consumption are critical issues because of cooling and packaging problems as well as battery life in portable and handheld devices. Continuing growth in complexity, frequency, and speculative execution outstrips the benefits of technology scaling and voltage reduction. This paper presents a survey of existing methods to reduce the power and energy consumption of microprocessors, with an emphasis on the architectural modifications these solutions impose. Several topics are covered in this paper:
- Energy-efficient Front End: we will focus on power-aware design in branch prediction and register renaming
- Data Compression: The redundancy in the information (addresses, instructions, and data) stored and exchanged between the processor and the memory system can be removed by compression. Compression can reduce the number of bus lines while maintaining the same effective bandwidth.
- Dynamic Voltage Scaling (DVS): DVS is recognized as one of the most effective power reduction techniques. It exploits the fact that a major portion of power of CMOS circuitry scales quadratically with the supply voltage. As a result, lowering the supply voltage can significantly reduce power dissipation.
- Multi Clock Domain (MCD) Architectures: In MCD, a chip is divided into several (coarse-grained) clock domains, each of which can run at its own frequency and voltage; power consumption is reduced by lowering frequency and voltage in domains that are not currently on the application's critical path.
- Static Power Management (SPM) techniques: SPM techniques are applied at design time, together with a variety of simulation and measurement tools, targeting different levels of the system's hardware and software components.
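The quadratic voltage dependence behind DVS can be made concrete with the standard dynamic-power relation P = C_eff * Vdd^2 * f for CMOS switching power. The numbers below are purely illustrative, not measurements from any surveyed processor.

```python
def dynamic_power(c_eff, vdd, freq):
    """Dynamic CMOS switching power: P = C_eff * Vdd**2 * f.
    c_eff: effective switched capacitance (F), vdd: supply voltage (V),
    freq: clock frequency (Hz). Illustrative model only."""
    return c_eff * vdd ** 2 * freq

# Hypothetical operating points: halving Vdd cuts dynamic power 4x;
# since frequency typically scales down with voltage too, the combined
# effect here is roughly cubic (an 8x reduction).
p_full = dynamic_power(1e-9, 1.2, 2e9)   # full speed: 1.2 V at 2 GHz
p_dvs  = dynamic_power(1e-9, 0.6, 1e9)   # scaled:     0.6 V at 1 GHz
ratio = p_dvs / p_full
```

This is why DVS is so effective: energy per operation falls with the square of the voltage even when the lower frequency lengthens execution time.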
- A Survey of Load-Use Latency Hiding Techniques
As the gap between main memory and processor clock speeds keeps increasing, load-to-use delay, i.e., the time required for a load instruction to fetch data from the memory hierarchy and deliver it to dependent instructions, becomes one of the key factors impairing microprocessor performance. Despite the use of multi-level on-chip caches, higher frequencies and larger caches negatively affect the load-to-use latency of the first-level cache. This paper surveys several techniques that hide load-to-use latency:
- Address prediction: predict the load address ahead of time, allowing speculative access to the lower levels of the memory hierarchy in parallel with address calculation and verification in the front end.
- Load value prediction: predict the result of load instructions at dispatch to allow speculative execution of the dependent instructions.
- Cache hit-miss prediction: A mechanism that predicts cache hit and miss with high accuracy, which will help scheduling of the load-dependent instructions.
- Out-of-order load and store: allow loads and stores to execute and retire out of order from the Load and Store Queue (LSQ) and place long-latency pending loads in a separate load wait buffer, so as to free the LSQ to accommodate new load and store instructions and boost overall performance.
- Signature Buffer: a new storage medium together with a novel addressing mechanism that avoids address calculation and eliminates load-to-use stalls.
- Data prefetching: bring the required data in earlier so that miss latency can be overlapped with prior memory operations or other computations.
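As a concrete illustration of one technique in the list, here is a minimal last-value load predictor: for each load PC it predicts the value produced the previous time, letting dependent instructions start speculatively (a misprediction would force a replay, which is not modeled here). This is a generic textbook scheme sketched in Python for clarity, not the specific mechanism of any surveyed paper.

```python
class LastValueLoadPredictor:
    """Toy last-value predictor: one table entry per load PC holding
    the value that load produced most recently. (Illustrative only;
    real predictors add confidence counters and replay logic.)"""
    def __init__(self):
        self.table = {}

    def predict(self, pc):
        # Returns the previously seen value, or None if never seen.
        return self.table.get(pc)

    def update(self, pc, actual_value):
        correct = self.table.get(pc) == actual_value
        self.table[pc] = actual_value
        return correct

pred = LastValueLoadPredictor()
hits = 0
for value in [7, 7, 7, 9]:          # value stream from one load PC
    if pred.predict(0x400) == value:
        hits += 1                   # dependents ran speculatively, correctly
    pred.update(0x400, value)
```

The same table-indexed-by-PC structure underlies several of the techniques above; address predictors and stride prefetchers differ mainly in what they store per entry (last address and stride rather than last value).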