Distributed Systems
After completing this chapter you will understand:
- Why we have distributed systems
- What a distributed system is
- Introduction to distributed systems
- Examples of distributed systems
- Trends in distributed systems
- The focus on resource sharing
- Design issues of distributed systems
Why Have Distributed Systems?
Cost of hardware and management
A collection of cheap computers may be less expensive than one large supercomputer.
Small, simple computers may be easier to manage than one large one.
Resilience to failures
If one component fails, others can proceed with work on the task.
Scalability
The system can be extended by adding more components (e.g., the World Wide Web).
Introduction:
Computer hardware prices are falling while computing power is increasing.
Network connectivity is increasing: everyone is connected with fat pipes (high-bandwidth links).
It is easy to connect hardware together.
Definition
A distributed system is one in which components located at networked computers communicate and coordinate their actions only by passing messages.
According to Andrew Tanenbaum, a distributed system is a collection of independent computers that appears to its users as a single coherent system.
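The first definition can be illustrated with a minimal sketch. It assumes two threads on one machine stand in for components on networked computers, and that Python queues stand in for the network. The names `worker` and `run_demo` and the message format are hypothetical. The point is that the worker's state is never touched directly; every interaction is a message.

```python
import queue
import threading

def worker(inbox, outbox):
    """A component with private state that interacts only through messages."""
    total = 0  # local state, never accessed directly by the other component
    while True:
        msg = inbox.get()
        if msg["type"] == "add":
            total += msg["value"]
        elif msg["type"] == "get":
            outbox.put({"type": "total", "value": total})
        elif msg["type"] == "stop":
            break

def run_demo():
    """Coordinate with the worker purely by sending and receiving messages."""
    to_worker = queue.Queue()    # stands in for the network link to the worker
    from_worker = queue.Queue()  # stands in for the link back to the caller
    thread = threading.Thread(target=worker, args=(to_worker, from_worker))
    thread.start()
    for value in (1, 2, 3):
        to_worker.put({"type": "add", "value": value})
    to_worker.put({"type": "get"})
    reply = from_worker.get(timeout=5)
    to_worker.put({"type": "stop"})
    thread.join()
    return reply["value"]

if __name__ == "__main__":
    print(run_demo())
```

In a real distributed system the queues would be replaced by network communication, where messages can be delayed or lost; this sketch ignores those failure modes.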
Typical Layering in a Distributed System
The term middleware applies to a software layer that provides a
programming abstraction as well as masking the heterogeneity of the underlying
networks, hardware, operating systems and programming languages.
Examples: RMI, CORBA, DCOM
When building a distributed system, we must deal with the following consequences:
- Concurrency
- No global clock
- Independent failures
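One common way to order events without a global clock is a logical clock. The sketch below uses Lamport's logical clock, a technique these notes do not name, so treat it as an illustrative assumption rather than part of the original material. The class and method names are hypothetical.

```python
class LamportClock:
    """Logical clock: orders events without relying on a shared physical clock."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self):
        """Advance the clock and return the timestamp to attach to a message."""
        return self.tick()

    def receive(self, message_time):
        """Merge the sender's timestamp so the receive is ordered after the send."""
        self.time = max(self.time, message_time) + 1
        return self.time


if __name__ == "__main__":
    a, b = LamportClock(), LamportClock()
    a.tick()                  # local event on process A
    stamp = a.send()          # A sends a message carrying its timestamp
    print(b.receive(stamp))   # B's clock jumps past the timestamp it received
```

Each process only ever reads its own counter, so no process needs to agree with any other about the actual time of day.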
The prime motivation for constructing and using distributed systems stems from a desire to share resources.
Layers of a distributed system, from top to bottom:
- Applications, Services
- Middleware Services (RPC, RMI)
- Operating System
- Computer & Network Hardware
Examples of Distributed Systems
- Web Search
- Massively Multiplayer Online Games
- Financial Trading
Web Search:
Nearly 6 billion search requests hit Google's servers every day.
About 276 million search requests hit Yahoo's servers every day.
Web search is one of the largest and most complex distributed system installations in the history of computing. Its infrastructure comprises:
- An underlying physical infrastructure with very large numbers of networked computers located at data centres around the world
- A distributed file system designed to support very large files
- An associated structured distributed storage system that offers fast access to very large datasets
- A lock service that offers distributed system functions such as distributed locking and agreement
- A programming model supporting very large parallel and distributed computations
Google, the market leader in web search technology, has put significant effort into the design of a sophisticated distributed system infrastructure to support search (and indeed other Google applications and services such as Google Earth).
This represents one of the largest and most complex distributed system installations in the history of computing.
Highlights of this infrastructure include:
- A massive, highly scalable networked pool of computers forming the underlying physical infrastructure
- A distributed file system designed to support very large files
- A distributed storage system that offers fast access to very large datasets
- Distributed locking and agreement
- A programming model that supports the management of very large parallel and distributed computations
Massively Multiplayer Online Games
The engineering of massively multiplayer online games (MMOGs) represents a major challenge, particularly because of the need for fast response times to preserve the user experience of the game.
Eve II is one of the largest MMOGs played in 2013.
World of Warcraft is one of the most played games, with over 10 million customers.
According to one prediction, the MMOG market was estimated to rise by 60% to $3.1 billion by 2015.
Challenges include:
- Fast response times
- Real-time propagation of events
Solutions for the design of an MMOG:
- Client-server model, where the server is formed by a dedicated cluster
- Distributed architectures spread across the globe (example: EverQuest)
- Completely decentralized approach based on peer-to-peer technology
Financial Trading:
Foreign exchange markets averaged $5.3 trillion per day in April 2013.
Events in distributed systems involve the communication and processing of items of interest.
Such systems are typically built as distributed event-based systems.
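The event-based style can be sketched with a small publish-subscribe broker. This is a single-process illustration under stated assumptions: the `EventBroker` class, the topic name, and the event fields are hypothetical, and a real trading system would deliver events across the network.

```python
from collections import defaultdict

class EventBroker:
    """Routes each published event to every handler subscribed to its topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a callable to be invoked for each event on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        """Deliver the event to all subscribers; return how many received it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


if __name__ == "__main__":
    broker = EventBroker()
    received = []
    broker.subscribe("ACME", received.append)  # hypothetical stock symbol
    broker.publish("ACME", {"price": 10.5})
    print(received)
```

The publisher does not know who consumes its events, which lets new consumers of a price feed be added without changing the source.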
The data generated on any given day across all publicly traded stocks can be upwards of one terabyte.
This massive amount of data must be analyzed rapidly and efficiently in order to derive predictions faster than the competition.
Analysis of data this large can be managed with a distributed system that processes the various tasks in parallel.
This concept is called MapReduce; since the publication of Google's paper describing it, it has been steadily gaining popularity among companies with massive amounts of data.
Google's implementation has been kept proprietary, but an open source project called Hadoop was created to satisfy the needs of other companies with massive amounts of data, such as EMC, Facebook, and IBM, to name a few.
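The MapReduce idea can be sketched in a single process. This is not Google's or Hadoop's API: the function names, the trade records, and the symbols are hypothetical. The sketch only shows the three phases (map, group by key, reduce) that a real framework would spread across many machines.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to every record and collect the (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs

def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}

def map_reduce(records, mapper, reducer):
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)

def trade_mapper(trade):
    """Emit (symbol, shares) for one hypothetical trade record."""
    symbol, shares = trade
    yield symbol, shares

def volume_reducer(symbol, values):
    """Total the shares traded for one symbol."""
    return sum(values)


if __name__ == "__main__":
    trades = [("ACME", 100), ("INIT", 50), ("ACME", 25)]
    print(map_reduce(trades, trade_mapper, volume_reducer))
```

Because each mapper call looks at one record and each reducer call looks at one key, both phases can run in parallel on separate machines, which is what makes the model suit very large datasets.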
P2P file sharing
These are the current-generation terms for information generation and storage needs.
How should file sharing scale to meet such requirements?
- Peer-to-peer systems are generally associated with the sharing of data resources and the bandwidth required to access those shared data resources, but other resources can be shared as well
- Information these days has three metrics: velocity, variety, and volume
- The next-generation storage requirements are as follows; they also explain what we are aiming at with respect to file sharing
- Velocity of information: the rate at which information is being produced
- Variety of information: the types of information generated. For example, companies these days focus on learning the interests of their customers in order to offer them relevant promotions. This way, users are happy with the suggestions and the companies are able to increase their sales.
If 1 Byte = one grain of sand
We can identify three generations of peer-to-peer systems:
- First generation: the Napster music exchange, which relied in part on a central server
- Second generation: file-sharing systems with greater fault tolerance and no reliance on a central server. Examples: Gnutella, Direct Connect, isoHunt, BitTorrent, The Pirate Bay
- Third generation: the emergence of middleware layers for peer-to-peer systems, making possible the application-independent provision of resources
SETI@Home
SETI@Home ("SETI at home") is an Internet-based public volunteer computing project employing the BOINC software platform, hosted by the Space Sciences Laboratory at the University of California, Berkeley, in the United States. SETI is an acronym for the Search for Extra-Terrestrial Intelligence.
Its purpose is to analyze radio signals, searching for signs of extraterrestrial intelligence, and it is one of many activities undertaken as part of SETI.
www.seti.org